Based on previous studies and the research question of this paper, a model has been constructed. This chapter discusses why the model is constructed the way it is, why each variable is included, and what effect the inclusion of each variable might have.
The model is estimated for 23 Sub-Saharan countries over a twenty-year period, from 1998 to 2017. The Sub-Saharan countries were ranked by GDP per capita, and the 23 highest-ranked countries were selected. Countries were omitted when large amounts of data were missing or when the data were biased, as is the case for small island countries that receive large numbers of tourists. The Seychelles, for example, is a group of small islands with an area of only 455 km². The data are collected from the World Bank, the UN, and the OECD. The variables are selected based on what previous studies have concluded regarding FDI.
The dependent variable in the model is FDI for the selected countries. FDI is defined as net inflow in absolute numbers, in thousands of USD. Net FDI inflow is used instead of FDI per capita or FDI as a percentage of GDP because the purpose is to find the determinants of the total FDI inflow into the Sub-Saharan countries.
GDP, Gross Domestic Product, is denominated in absolute numbers and collected from the World Bank. It is one of the most researched and widely accepted economic indicators of the total market size of a country. It is calculated as the sum of the gross value added by all resident producers, plus product taxes, minus any subsidies not included in the value of the products. All figures are presented in thousands of dollars. The variable is a proxy for market size.
HDI, the Human Development Index, tracks the development of well-being in a country. It summarizes average achievement in three key dimensions of human development: a long and healthy life, knowledge, and a decent standard of living. These dimensions are represented by three indices, of which HDI is the mean: a life expectancy index, an education index, and a GNI (Gross National Income) index. The data are collected from the United Nations Development Programme. The index is reported on a scale between 0 and 1, but for our estimation it has been rescaled to run from 0 to 100 so that the coefficients are on a comparable scale. HDI represents market size in terms of purchasing power and well-being.
INFLATION is a measure of economic stability in a country. High inflation causes instability; hence low and, above all, predictable inflation is appreciated by investors. Negative inflation, known as deflation, is not desirable for investors either. It is included as an explanatory variable to see how inflation relates to changes in FDI inflow over time. Collected from the World Bank, it is presented as a yearly change, expressed in decimals instead of percentages. The variable is a proxy for economic stability.
GDPGROWTH is a commonly applied variable when discussing economic growth and prosperity. Collected from the World Bank, it measures the growth rate of a country's GDP (Gross Domestic Product). It is presented as a yearly change, expressed in decimals instead of percentages. It is included to capture the effect of the short-term trend on FDI in the selected countries.
CORRINDX is an index of corruption published by Transparency International. It ranks the countries of the world and assigns each a score between 0 and 100, where 0 is highly corrupt and 100 is not corrupt. The scores thus represent how corrupt the selected countries are. This variable is selected to represent the political risk of a country, as it reflects corporate governance and political stability.
EASEBUS (ease of doing business) is collected from the Doing Business project of the World Bank. Economies are ranked on a scale of 1 to 190; a high ranking means that the regulatory environment is friendlier to starting or expanding operations in that country. The ranking is based on the scores for 10 topics: starting a business, dealing with construction permits, getting electricity, registering property, getting credit, protecting minority investors, paying taxes, trading across borders, enforcing contracts and resolving insolvency. This variable mainly represents the openness of a country, but it also represents political risk and stability, since the topics on which the index is based are related to government policies and regulations.
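Rescaling HDI from a 0 to 1 scale to a 0 to 100 scale changes only the size of the estimated coefficient, not the fit: multiplying a regressor by 100 divides its coefficient by 100. A minimal sketch of this, using a one-variable regression and made-up numbers (the helper `ols_slope` and all values are illustrative assumptions, not the thesis data):

```python
# Illustrative only: the numbers below are made up, not taken from the thesis data.

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

hdi_raw = [0.40, 0.45, 0.52, 0.58, 0.63]      # hypothetical HDI on the 0-1 scale
fdi = [120.0, 150.0, 210.0, 240.0, 300.0]     # hypothetical FDI values
hdi_scaled = [100 * h for h in hdi_raw]       # the same HDI on the 0-100 scale

slope_raw = ols_slope(hdi_raw, fdi)           # effect of a move from 0 to 1
slope_scaled = ols_slope(hdi_scaled, fdi)     # effect of one index point out of 100

print(slope_raw, slope_scaled)
```

On the 0 to 100 scale, the coefficient reads as the change in FDI associated with a one-point change in HDI, which is easier to compare with the other coefficients.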
Table 1 below illustrates how we believe each variable in our model will affect FDI and which category each variable is a proxy for.
Like previous studies on the subject of FDI, this study has limitations for practical reasons. Froot and Stein (1991) concluded that weak exchange rates are positively related to FDI inflow. Although we find this interesting, we have chosen to exclude exchange rates from the study because of the restricted amount of data available.
Investigating the individual foreign investment policy of each country in the sample, as in the study conducted by Reiter and Steensma (2010), might well give further insight. However, we have decided to exclude specific investment policies from the research. Such a variable is hard to quantify in an empirical model like the one in this thesis, as laws and regulations are rarely binary. If the policies are interpreted without a full understanding of their interrelations and consequences, the risk of faulty conclusions is too great. Also, powerful corporations have been able to deviate from existing policies, as discussed in the previous section regarding FDI.
In this section, the empirical model and the econometric method are presented.
Furthermore, tests for identifying the most suitable model are carried out, as are statistical tests with the purpose of validating the model. By avoiding violations of vital assumptions and other key factors that may invalidate the model, we hope to generate a robust model.
To find the determinants of FDI, the collected data are tested with econometric methods to see how each variable has affected FDI in the selected countries throughout the 20-year period. This is done with a panel data estimation, that is, an estimation with data points for each country and each year, which identifies coefficients that represent the effect these variables have on FDI.
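$$\mathit{FDI}_{it} = \beta_0 + \beta_1 \mathit{GDP}_{it} + \beta_2 \mathit{HDI}_{it} + \beta_3 \mathit{GDPGROWTH}_{it} + \beta_4 \mathit{CORRINDX}_{it} + \beta_5 \mathit{EASEBUS}_{it} + \beta_6 \mathit{INFLATION}_{it} + \varepsilon_{it} \tag{1}$$
Here $i$ denotes the country and $t$ the year. An error term, $\varepsilon_{it}$, is also included in the model.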
To test the validity of the model, it is vital to examine the correlation between the independent variables, so that important assumptions of the regression are not violated. Multicollinearity needs to be avoided in order to get trustworthy and interpretable results. It exists when one or several independent variables explain each other, which makes the coefficient estimates imprecise and hard to interpret. Therefore, a correlation matrix has been generated to check whether the independent variables explain each other (Gujarati, 2003).
Table 2 shows how strongly the variables are correlated with each other. The matrix reveals a substantial correlation between two of our independent variables, CORRINDX and EASEBUS (0.575284). A separate estimation is performed with each of the two variables to see which model explains FDI better. Based on the outcome, we decided to keep EASEBUS, as that model provides a higher R-square and a significant coefficient, whereas the coefficient for CORRINDX was not significant. See FEM(2) in Table 5 for the outcome of the estimation with CORRINDX.
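The correlation check can be sketched as follows. This is a minimal illustration in plain Python on made-up numbers; the helper names, the data, and the 0.5 flagging threshold are assumptions for the example and are not taken from the thesis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping variable name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

def high_pairs(matrix, threshold=0.5):
    """Variable pairs whose absolute correlation reaches the threshold."""
    names = list(matrix)
    return [
        (a, b, matrix[a][b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b]) >= threshold
    ]

# Hypothetical observations, for illustration only.
data = {
    "CORRINDX": [20, 25, 31, 38, 44, 52],
    "EASEBUS": [150, 140, 128, 120, 101, 95],
    "INFLATION": [0.08, 0.03, 0.12, 0.05, 0.09, 0.04],
}

matrix = correlation_matrix(data)
flagged = high_pairs(matrix)   # candidate pairs for dropping one variable
print(flagged)
```

A flagged pair does not by itself say which variable to drop; as in the text, that choice is made by estimating the model with each variable in turn and comparing the results.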
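The resulting empirical model is as follows:
$$\mathit{FDI}_{it} = \beta_0 + \beta_1 \mathit{GDP}_{it} + \beta_2 \mathit{HDI}_{it} + \beta_3 \mathit{GDPGROWTH}_{it} + \beta_4 \mathit{EASEBUS}_{it} + \beta_5 \mathit{INFLATION}_{it} + \varepsilon_{it} \tag{2}$$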
After correcting the model for multicollinearity, the next step is to select which estimation model to use.
First, we estimated a Pooled OLS model to use for comparison. It is a simple technique in which Ordinary Least Squares (OLS) is run on panel data. The model assumes that the intercept (β0) is identical for all units in the panel, which means that it does not take variation over time or between countries into account. Time-variant variables may have different characteristics at different points in time, whereas time-invariant variables do not. Constants and individual-specific effects are left in the data in this estimation. As a consequence, the model might give a skewed picture of the actual relationship (Gujarati, 2003).
Then, a fixed effects model was estimated. The Fixed Effects Model, commonly abbreviated FEM, is a regression model suited to panel data. The term "fixed effects" refers to the assumption that, although the intercept may differ between individuals, each individual's intercept is time-invariant. The model also assumes that the slope coefficients do not vary between individuals or over time. The estimation cancels out constants and individual-specific effects. This allows special characteristics, such as the time variation, to be taken into account through the presence of dummies. The use of dummies decreases the degrees of freedom when estimating the model. This may cause a problem, since with fewer degrees of freedom the distribution used for testing becomes flatter with heavier tails, which increases the risk of faulty conclusions based on the tests (Gujarati, 2003).
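The difference between Pooled OLS and the fixed effects estimator can be illustrated with a small sketch. It uses a single regressor and a made-up three-country panel in which every country has its own intercept but the same slope of 2. The fixed effects estimate is obtained here by subtracting country means (the within transformation), which yields the same slope as including a dummy for each country. All names and numbers are assumptions for the example, not the thesis data.

```python
def pooled_slope(panel):
    """OLS slope with one common intercept, ignoring country membership."""
    obs = [pair for rows in panel.values() for pair in rows]
    n = len(obs)
    mean_x = sum(x for x, _ in obs) / n
    mean_y = sum(y for _, y in obs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in obs)
    sxx = sum((x - mean_x) ** 2 for x, _ in obs)
    return sxy / sxx

def within_slope(panel):
    """Fixed effects slope: demean x and y per country, then run OLS."""
    x_dm, y_dm = [], []
    for rows in panel.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        x_dm += [x - mean_x for x, _ in rows]
        y_dm += [y - mean_y for _, y in rows]
    sxy = sum(a * b for a, b in zip(x_dm, y_dm))
    sxx = sum(a * a for a in x_dm)
    return sxy / sxx

# Hypothetical panel: y = alpha_i + 2 * x, with alpha_i = 10, 0, -10.
panel = {
    "A": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],
    "B": [(4.0, 8.0), (5.0, 10.0), (6.0, 12.0)],
    "C": [(7.0, 4.0), (8.0, 6.0), (9.0, 8.0)],
}

b_pooled = pooled_slope(panel)   # distorted by the country-specific intercepts
b_within = within_slope(panel)   # recovers the common slope of 2
print(b_pooled, b_within)
```

In this constructed example the country intercepts are correlated with the regressor, so the pooled slope even has the wrong sign, while the within estimator recovers the true slope.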
Lastly, a Random Effects Model (REM) was estimated as well. It is also a model for panel data. Instead of treating the intercept of each individual as fixed, the REM assumes that the individual intercepts are random draws from a larger population with a common mean value (Gujarati, 2003).
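In the standard textbook notation, the random effects specification can be sketched as follows. Here $X_{k,it}$ stands for the explanatory variables of the empirical model, and $u_i$ and $w_{it}$ are symbols introduced for this illustration.

```latex
\begin{aligned}
\mathit{FDI}_{it} &= \beta_{0i} + \sum_{k} \beta_k X_{k,it} + \varepsilon_{it},
  \qquad \beta_{0i} = \beta_0 + u_i, \quad E(u_i) = 0 \\
\mathit{FDI}_{it} &= \beta_0 + \sum_{k} \beta_k X_{k,it} + w_{it},
  \qquad w_{it} = u_i + \varepsilon_{it}
\end{aligned}
```

The country-specific deviation $u_i$ becomes part of a composite error term $w_{it}$, which is why the REM is only appropriate if $u_i$ is uncorrelated with the explanatory variables.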
To select between the FEM and the REM, we performed a Hausman test, a statistical test for deciding which of the two models is most suitable. The null hypothesis of the test is that the REM is appropriate; the alternative hypothesis is that the FEM is preferred.
Since the p-value for the test is lower than the selected level of significance (5%), it is possible to reject the null hypothesis. Hence, the FEM is preferred over the REM and is chosen as the primary model for the regression. As previously noted, a Pooled OLS model is included for comparison.
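For reference, the Hausman statistic behind this decision is usually written as below. This is the general textbook form, not output from the thesis; $\hat{\beta}_{FEM}$ and $\hat{\beta}_{REM}$ denote the coefficient vectors estimated by the two models, and $k$ is the number of coefficients compared.

```latex
H = \left(\hat{\beta}_{FEM} - \hat{\beta}_{REM}\right)'
    \left[\widehat{\mathrm{Var}}\left(\hat{\beta}_{FEM}\right)
        - \widehat{\mathrm{Var}}\left(\hat{\beta}_{REM}\right)\right]^{-1}
    \left(\hat{\beta}_{FEM} - \hat{\beta}_{REM}\right)
  \;\sim\; \chi^2_k \quad \text{(asymptotically, under } H_0\text{)}
```

A large value of $H$, and hence a small p-value, indicates that the two sets of estimates differ systematically, which speaks against the REM.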
The following section presents the descriptive statistics and the statistical output from running the model. The key statistics of the output are also presented and related to the established assumptions and rules for statistical testing.
In Table 1, the descriptive statistics for the dependent variable and all the independent variables are presented. There are a total of 460 observations for the 23 selected countries over the 20-year period. The table displays the range of each variable and its summary statistics.
The coefficients for the Pooled OLS model, the FEM(1) model, in which EASEBUS is used, and the FEM(2) model, in which CORRINDX is used, are presented below. The results of the Pooled OLS estimation are displayed because it is interesting to see the results when the individual effects and the constants are kept in the data. The results of the Pooled OLS estimation and the FEM(2) model are presented only to show the differences and will not be used when discussing our findings and drawing conclusions.
As illustrated in Table 5, R-square (overall), the coefficient of determination, for the FEM(1) model is 0.576833. This means that approximately 57.7% of the variation in the dependent variable, FDI, can be explained by the independent variables.
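As a reminder of what the coefficient of determination measures, the generic definition, one minus the ratio of the residual sum of squares to the total sum of squares, can be sketched as below. The numbers are made up, and the "overall" R-square reported by panel data software may be computed somewhat differently from this generic formula.

```python
def r_squared(actual, fitted):
    """Share of the variation in `actual` that is explained by `fitted`."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))   # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in actual)              # total
    return 1 - ss_res / ss_tot

# Hypothetical observed and fitted values, for illustration only.
actual = [10.0, 12.0, 15.0, 19.0, 24.0]
fitted = [9.0, 13.0, 16.0, 18.0, 24.0]

r2 = r_squared(actual, fitted)
print(round(r2, 4))
```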
The coefficients for GDP and HDI both have positive signs, meaning that they have a positive relationship with FDI inflow. EASEBUS has a positive coefficient as well. The coefficients for INFLATION and GDPGROWTH are insignificant, so no conclusions about their relationship with FDI can be drawn from this estimation.
For deeper insight, two additional regressions were estimated, each on the sample divided into two groups, to test whether there are differences within the sample that could further explain the relationship between FDI and the variables. These regressions were estimated using the FEM.
For the first regression of the sub-section, the sample was divided between the richest and the poorest countries based on their total GDP.
The signs of the coefficients are consistent across both sub-samples. The only notable difference between the groups is that the coefficient for EASEBUS is significant only for the poor countries.
For the second regression in the sub-section, the sample was divided based on market size. In practice, this was done by ranking the countries in the sample by total population in 2017.
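Dividing a sample into two groups by rank can be sketched as follows. The country names and population figures are placeholders, and the rule that the upper group receives the extra country when the sample size is odd is an assumption; the exact cut-off used for the 23 countries is not stated here.

```python
def split_by_rank(values):
    """Rank units from largest to smallest and split them into two groups.

    `values` maps a unit name to the ranking variable (e.g. population).
    Assumption: with an odd number of units, the upper group gets one more.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    cut = (len(ranked) + 1) // 2
    return ranked[:cut], ranked[cut:]

# Placeholder populations in millions, for illustration only.
population_2017 = {
    "Country A": 190.0,
    "Country B": 57.0,
    "Country C": 29.0,
    "Country D": 2.3,
    "Country E": 1.3,
}

large_pop, small_pop = split_by_rank(population_2017)
print(large_pop, small_pop)
```

The same helper applies to the first split if total GDP is passed instead of population.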
The most noticeable finding of this regression is the difference in the sign of the coefficient for INFLATION. For Large pop. the sign is positive, whereas it is negative for Small pop. HDI is significant only for Large pop., while GDP is significant only for Small pop.