Outline of definitions
To measure ethnic diversity, one first has to define what ethnicity is. One could define ethnicity by a single parameter, such as genetic differences or place of birth, or combine several such characteristics and declare the combination to be the correct definition. Another solution is to base ethnicity on surveys, that is, to ask people which ethnic group they perceive themselves to belong to. Even then, it is unclear what exactly should be asked and what an ethnic group is: a caste, a clan, a common language, or a common set of traditions. There is no universal definition of ethnicity that would be accepted as entirely correct. This poses a problem when working with variables such as ethnic diversity: a decision has to be made on a definition of ethnicity, even though any such definition is a priori imperfect, so that diversity can subsequently be measured with it.
A common way of measuring ethnic diversity is to calculate a fractionalization index (Alesina et al. 2003; Montalvo & Reynal-Querol, 2005). The index is discussed in detail in the following sections. The definition of ethnicity matters because the diversity indices are calculated from it, and those indices are then included in the econometric analysis. Ultimately, the definition determines the results to a large extent.
The generalized formula for computing the fractionalization index is:
$$\mathrm{FRAC} = 1 - \sum_{i=1}^{N} s_i^{2}$$
where $s_i$ is the proportion of people who belong to group $i$ in the total population, $i = 1, 2, 3, \ldots, N$, and $N$ is the number of ethnic groups.
The ethnic fractionalization index (FRAC) gives the probability that two randomly drawn individuals do not belong to the same ethnic group. The index ranges from 0 to 1, where 0 means that the population is completely homogeneous and a value approaching 1 means that each person belongs to a different group. The example below provides some intuition.
In Table 3.2, “structure of population” denotes the composition of the population, for example its different ethnicities. Consider Country B. The respective shares of the ethnic groups are given in parentheses: (0.5, 0.5) means that the population is split evenly between two ethnic groups. The fractionalization index (FRAC) for Country B is then 0.5, which means that there is a 50% chance that two randomly drawn people from the population of Country B will have different ethnicities. The same procedure can be applied to calculate the fractionalization index for the remaining countries.
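The Country B calculation can be reproduced with a minimal Python sketch that computes the index as one minus the sum of squared group shares. The function name `fractionalization` and the two extra cases are illustrative assumptions and are not taken from Table 3.2.

```python
def fractionalization(shares):
    """Return one minus the sum of squared group shares."""
    shares = list(shares)
    # The shares are population proportions, so they should add up to 1.
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("group shares must sum to 1")
    return 1.0 - sum(s * s for s in shares)


# Country B from the text: two ethnic groups of equal size.
frac_b = fractionalization([0.5, 0.5])
print(frac_b)  # 0.5

# Hypothetical cases for comparison (not taken from Table 3.2).
frac_homogeneous = fractionalization([1.0])    # a single group
frac_four_equal = fractionalization([0.25] * 4)  # four equal groups
print(frac_homogeneous, frac_four_equal)  # 0.0 0.75
```

With equally sized groups the index equals 1 - 1/N, so it rises toward 1 as the number of groups grows.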
Brief history on development of diversity indices
Ethno-Linguistic Fractionalization Index (ELF)
Even though this index is not used in the empirical analysis, it is necessary to mention it because of its importance in the development of the new indices by Alesina et al. (2003) and, subsequently, in our update of the index. The Ethno-Linguistic Fractionalization Index is the best known and, in the past, most widely used index of national heterogeneity (Fearon, 2003; Montalvo & Reynal-Querol, 2005). The ELF index was calculated from the data in the Atlas Narodov Mira, compiled in 1964 by Soviet cartographers, and was used by a generation of economists (e.g. Mauro, 1995; Easterly & Levine, 1997; La Porta et al. 1999). However, the index was criticized by other researchers, mainly because it is largely based on linguistic differences (e.g. Alesina et al. 2003; Posner 2004; Cederman & Girardin, 2007). For example, if one uses this index to assess how ethnically diverse the U.S. is, white and African Americans end up in the same ethnic group only because they speak the same language.
As mentioned previously, the values of diversity indices are very sensitive to specification: if the specification of an ethnic group changes, the calculated value of the index changes as well. Since the indices are then used to test various relationships empirically, the specification can greatly affect the results. Researchers therefore went beyond the simple assumption underlying the ELF index. One of the most ambitious attempts to do so was made by Alesina et al. (2003), who constructed three new indices that capture a wider array of cultural dimensions (Patsiurko et al. 2012).
Ethnic, Linguistic and Religious Fractionalization (Alesina et al. 2003)
In their paper, Alesina et al. (2003) construct three new indices: one based on a broad measure of ethnicity, one based strictly on language, and one based on religion (Montalvo & Reynal-Querol, 2005). In doing so, they separated language from ethnicity, which the ethno-linguistic index had combined. Isolating ethnic diversity made it possible to account for important physical factors such as racial origin or skin color, and this caused the indices for some world regions to change significantly. For example, if one measures national heterogeneity in Latin America with the “old” ethno-linguistic index and with the relatively new ethnic fractionalization index of Alesina et al. (2003), the new index shows, on average, much higher diversity across the region. The reason is that, because of the region's colonial history, the majority of people in Latin America speak the same few languages, such as Spanish, Portuguese and English, which makes this part of the world appear almost homogeneous under the ELF index, which relies mostly on language distinctions. The new index, by contrast, captures the important physical aspects of ethnicity that were obscured in the commonly used ELF index. Additionally, Alesina et al. (2003) included religion as a measure of diversity, which was not included at all in the ELF index. Another advantage over the ELF index is that Alesina et al. (2003) used nearly twice as many countries in their sample. More precisely, between 190 and 215 countries were used to calculate the new indices, depending on the type of index, versus 112 countries for the ELF index.
A main drawback is that, even though the majority of the data is collected from Encyclopaedia Britannica, other sources were also used, such as the CIA World Factbook, Levinson and Minority Rights Group International. These sources report their data for periods that differ from those of Encyclopaedia Britannica and from each other. Using data from different sources and periods at the same time could negatively affect the robustness of the index. This drawback is, however, unavoidable, since there is no unified survey of diversity data that takes place in all countries at the same time.
Updated Ethnic Fractionalization Index
We also base our index solely on ethnic data. This approach allows us to isolate the effect of ethnicity on growth and provides a possibility to do a robustness check using the ethnic fractionalization index of Alesina et al. (2003) that is constructed in the same vein.
As mentioned in the introductory part, the beginning of the 21st century brought about big changes in the ethnic composition of the world's nations. Therefore, it makes more sense to update the index than to use pre-existing ones. For example, the indices of Alesina et al. (2003) were calculated using data on ethnic groups from 1979-2001. Hence, if we used their index to examine the relationship between ethnic diversity and GDP per capita growth for a period beyond 2001, the result could be misleading. With an updated index we are able to capture the increased level of diversity in the country sample properly and see how it affects economic growth.
As we argued in the outline of definitions section, calculating a new index of diversity requires a definition of what an ethnic group is. To construct our index, we take the definitions of ethnicity as given by the sources. That is, a source decides how it separates ethnic groups from one another and reports the share of each ethnic group as a percentage of the population. We then take these percentages and use them in the formula to calculate the index of ethnic fractionalization. The main difference between our index and the index of Alesina et al. (2003) is that the sources have updated their data up to 2017, which in turn allows us to update the index. Our main source is Encyclopaedia Britannica, as in many instances it provides more detailed ethnic group definitions. For example, it lists Arab and Berber as separate ethnic groups, whereas the CIA World Factbook lists them as a single group. Taking the ethnic groups as given is a viable approach if the data for all countries is taken from a single source, so that the ethnic groups are defined “on the same level”: Arab and Berber groups, for example, must be listed as separate for all countries, not as separate in some and as one group in others. For this reason, the countries for which Encyclopaedia Britannica had no data on ethnic groups were studied on a case-by-case basis using other sources, to eliminate any discrepancies in the definitions of ethnic groups.
As mentioned above, we take the definitions of what an ethnic group is directly from the sources. The reason is that creating our own definition of an ethnic group and subsequently classifying the population of every country in the world according to it is the most time-consuming step in calculating an index of ethnic diversity, and this step alone can take months to accomplish.
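The reason the level of group definition matters can be illustrated with a short Python sketch. The population shares below are hypothetical numbers chosen for illustration and do not come from Encyclopaedia Britannica, the CIA World Factbook, or any other source. The index is computed as one minus the sum of squared shares.

```python
def fractionalization(shares):
    """Return one minus the sum of squared group shares."""
    return 1.0 - sum(s * s for s in shares)


# Hypothetical country in which a source lists Arab and Berber separately.
split = {"Arab": 0.6, "Berber": 0.3, "Other": 0.1}

# The same hypothetical country when a source reports them as one group.
merged = {
    "Arab-Berber": split["Arab"] + split["Berber"],
    "Other": split["Other"],
}

frac_split = fractionalization(split.values())
frac_merged = fractionalization(merged.values())

print(round(frac_split, 2))   # 0.54
print(round(frac_merged, 2))  # 0.18
```

In this made-up case the same population yields an index three times as large under the finer classification, which is why mixing classification levels across countries would distort comparisons.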
At first, we considered a cross-section with the change in GDP as the dependent variable and initial conditions as independent variables. However, a panel approach was chosen instead because it provides a larger sample size and increased variability of the data, and therefore improves the overall efficiency of our estimates (Hsiao, 2014). The larger number of observations is especially beneficial given our limited time period.
The dataset is a country-level panel of 217 countries, regardless of region or development status, for the period 2003-2015. Even though the timeframe is short, this particular period was chosen because we want to test how previous results for the 1960s-1980s hold under modern conditions without capturing major events that could significantly affect the results, such as the aftermath of the breakup of the USSR, 9/11 and the large-scale wars of the early 2000s.
All of the data, except for the indices, was collected from the World Bank. For the purpose of this thesis we have calculated an index of ethnic fractionalization using updated data on ethnic groups in 217 countries. The majority of the data for the index was collected from Encyclopaedia Britannica (194 countries); other sources include the CIA World Factbook (20 countries) and census bureaus (3 countries). The ethnic group data used ranges from 2000 to 2017. Note that the index of ethnic diversity by Alesina et al. (2003) is used as a robustness check and is included as given by its authors. All of the variables were adjusted for inflation where applicable.
Since we use the neoclassical model as a base, the left side of the equation is income. This model does not specify exactly by which economic measure income should be represented. However, previous studies that have dealt with the effect of ethnic diversity on growth have employed GDP per capita (Easterly & Levine, 1997; Alesina et al. 2003; Patsiurko et al. 2012). Following the lines of previous research, we have chosen to use GDP per capita, adjusted for inflation, as the dependent variable in this paper.
The independent variables according to the neoclassical growth model are capital, labor and human capital. Capital is included as gross capital formation. Labor is included as the workforce (the number of people eligible for work) in a country. Capital and labor are taken in per capita form. Human capital is included in the model in the form of average years of education. After the basic control variables, two types of dummies are included. Time dummies for each year are added to take care of the time trend inherent in growth variables such as GDP per capita and capital. Regional dummies are added to account for differences in development across world regions. The regions are: America and Caribbean, East-Asia and Pacific, Europe and Central Asia, Middle East and North Africa, and Sub-Saharan Africa. Only four of the five regional dummies are included in the regression to avoid perfect multicollinearity. The independent variable we are most interested in is the updated index of ethnic diversity.
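The multicollinearity point can be sketched in a few lines of Python. With all five regional dummies, every row of indicators sums to one and therefore duplicates the regression intercept, so one region has to serve as the reference category. Which region is omitted is not stated in the text; using Sub-Saharan Africa as the reference here is purely an assumption, as is the helper name `region_dummies`.

```python
REGIONS = [
    "America and Caribbean",
    "East-Asia and Pacific",
    "Europe and Central Asia",
    "Middle East and North Africa",
    "Sub-Saharan Africa",
]


def region_dummies(region, reference=REGIONS[-1]):
    """Return 0/1 indicators for every region except the reference one."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    return [1 if region == r else 0 for r in REGIONS if r != reference]


# Full coding: one indicator per region, for one country in each region.
full = [[1 if region == r else 0 for r in REGIONS] for region in REGIONS]
# Every row sums to 1, which is exactly the intercept column.
row_sums = [sum(row) for row in full]

# Reduced coding: four indicators, the reference region is all zeros.
reduced = [region_dummies(region) for region in REGIONS]

print(row_sums)     # [1, 1, 1, 1, 1]
print(reduced[0])   # [1, 0, 0, 0]
print(reduced[-1])  # [0, 0, 0, 0]
```

Under the reduced coding, each regional coefficient is read as a difference relative to the omitted region.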
In addition to the basic growth variables, variable of interest and necessary time and regional dummies, we will introduce some additional control variables. First, Foreign Direct Investment (FDI) net inflows is introduced as investment. Investment is a variable commonly used as a control by previous researchers (see Montalvo & Reynal-Querol, 2005; Gören, 2014; Bove & Elia, 2016). Another variable is urban population measured in terms of percent of a country’s population living in urban areas. This is added to control for differences in development between individual countries, as opposed to regional dummies, which control for differences in development between world regions. Urban population is chosen because a country’s development and income are positively correlated with the proportion of population living in urban areas (Bloom et al. 2008).
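To summarize the variables described above, one possible way to write the panel specification is sketched below. This is an illustration under stated assumptions, not the estimated model of the thesis: all symbol names are introduced here, the fractionalization index is assumed to be constant over time for each country, and the functional form (levels or logarithms) is left open because the text does not state it.

```latex
% Hypothetical notation: i indexes countries, t indexes years (2003-2015).
% y: GDP per capita; k: gross capital formation per capita;
% l: workforce per capita; h: average years of education;
% FDI: foreign direct investment net inflows; URB: urban population share;
% D: year dummies; R: the four included regional dummies.
\begin{equation*}
y_{it} = \alpha + \beta\,\mathrm{FRAC}_{i}
       + \gamma_1 k_{it} + \gamma_2 l_{it} + \gamma_3 h_{it}
       + \gamma_4 \mathrm{FDI}_{it} + \gamma_5 \mathrm{URB}_{it}
       + \sum_{\tau} \delta_{\tau} D_{\tau,t}
       + \sum_{r=1}^{4} \rho_r R_{r,i}
       + \varepsilon_{it}
\end{equation*}
```

In this notation the coefficient of interest is $\beta$, the association between the updated ethnic fractionalization index and income.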
The role of ethnic diversity in influencing economic growth