The WLS regression model – Project topics materials


Theoretical framework

Murphy, Shleifer and Vishny (1991, 1993) presented a theoretical framework for how corruption affects income inequality and growth. Li et al. (2000) modified this framework in their paper “Corruption, income distribution and growth”. We make additional modifications to the framework and show how it is relevant for our thesis. The main difference between our model and the one presented by Li et al. lies in the assumptions regarding the Gini coefficients. Three different cases are analyzed and described below. The resulting framework helps to explain the effects of corruption on income inequality.
Assume an economy in which a person can engage in one of three economic activities. First, a person can produce a good for the market, with output α. Second, the person can produce a subsistence good, with output γ<α. The subsistence good cannot be subjected to rent-seeking behavior, that is, it cannot be stolen or confiscated. Market output, by contrast, can be subjected to rent-seeking. The third activity a person can engage in is rent-seeking itself, where β denotes the maximum amount a rent-seeker can extract. The overall return from production (including rent-seeking) falls as rent-seeking activity increases.
The ratio of people engaged in rent-seeking to people engaged in market production is denoted by n, and income per capita by y. The equilibrium in this economy is determined by how the population allocates itself among market production (α), subsistence production (γ<α) and rent-seeking (β). The allocation of labor therefore depends on α, γ, and β.
In Case 1, β<γ, which corresponds to figure 2. Under these circumstances property rights are well preserved and the society does not suffer from any corruption. The return to market producers is higher than the return to rent-seekers, and the return to subsistence producers is also higher than the return to rent-seekers. As we assume that individuals want to maximize their own output, each individual will produce goods for the market, and there are no subsistence producers or rent-seekers. The ratio of people engaging in rent-seeking activities is n=0. If n were to rise above zero, the return to market production would be α-nβ, which diminishes as n increases.
We assume the Gini coefficient to be zero. No corruption and well preserved property rights lead to the highest possible output (per capita), denoted by α.
Case 2 corresponds to figure 3, where β>α. Property rights are poorly preserved and, because of the greater return to rent-seeking, people would rather engage in rent-seeking than in anything else. This is an extremely corrupt society. Figure 3 shows that there can be only one equilibrium, at the point where the return from production has fallen to γ and equals the return to rent-seekers. This is where rent-seekers crowd themselves out, i.e. γ = (α-γ)/n, which in equilibrium gives n’’ = (α-γ)/γ, given that n’’>n’, where n’ is the ratio at which the return from production first reaches γ (α-n’β=γ). In equilibrium every individual’s income equals the subsistence output γ; hence the equilibrium does not lie at the market productivity α.
The number of rent-seekers increases over time and the number of producers decreases accordingly, until α-nβ=γ. As more individuals engage in rent-seeking, market output decreases. The Gini coefficient is likely to be high; the higher the Gini coefficient, the closer the income distribution is to complete inequality.
In Case 3, γ<β<α. This last case represents an intermediate level of corruption and has three equilibria, as shown in figure 4. (a) The first equilibrium corresponds to Case 1, where all people choose to produce for the market with output α. (b) The second equilibrium corresponds to Case 2, where people choose among market production (α), subsistence production (γ), and rent-seeking (β). It arises when income per capita is pushed down to γ, and the equilibrium is given by n’’ = α/γ - 1. (c) In the third equilibrium people engage in either market production or rent-seeking, and the return is β. Observe that in this equilibrium nobody engages in subsistence production. The equilibrium is given by α-βn=β, or n’’’ = (α-β)/β, given that n’’’<n’. It occurs because new entrants into rent-seeking force the return to market producers down to the return to rent-seekers before any crowding out begins. However, this last equilibrium is neither stable nor desirable: once n is pushed beyond n’’’, the relative return to rent-seeking rises. There are therefore only two stable equilibria, one where n=0 and another where n=n’’. In the third equilibrium, fewer people engage in rent-seeking than in Case 2, and the income level β is higher than in Case 2 but still lower than in Case 1.
To conclude Case 3, income varies more than in Case 1 but not as much as in Case 2. Countries with a low level of corruption will have less income variation than countries with an intermediate or high level of corruption. The Gini coefficient is higher than in Case 1 but not as high as in Case 2.
Turning to the empirical implications of the modified model, we find that the best situation is Case 1, where property rights are well preserved and there is no corruption. This holds when β lies below γ, and it leads to the highest possible per capita output, denoted by α. Anti-corruption beliefs, shaped for example by the legal system or by culture, may also play an important role (North, 1981).
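The thresholds used in the three cases can be checked numerically. The following is a minimal sketch, not the authors' own procedure: the function names and the parameter values are hypothetical, and the piecewise return function assumes that each rent-seeker extracts β until the return from production has fallen to γ, after which rent-seekers share the remaining surplus α-γ.

```python
def thresholds(alpha, beta, gamma):
    """Rent-seeker ratios at which the returns in the model cross.

    Assumption: formulas follow the text (n' from alpha - n*beta = gamma,
    n'' from gamma = (alpha - gamma)/n, n''' from alpha - n*beta = beta).
    """
    return {
        "n_prime": (alpha - gamma) / beta,
        "n_double_prime": (alpha - gamma) / gamma,
        "n_triple_prime": (alpha - beta) / beta,
    }


def returns(n, alpha, beta, gamma):
    """Return (producer_return, rent_seeker_return) at rent-seeker ratio n."""
    n_prime = (alpha - gamma) / beta
    if n <= n_prime:
        # Each rent-seeker still extracts the full amount beta.
        return alpha - n * beta, beta
    # Beyond n', producers earn gamma and rent-seekers crowd each other out.
    return gamma, (alpha - gamma) / n


# Illustrative Case 3 parameters (gamma < beta < alpha); values are made up.
alpha, beta, gamma = 10, 4, 2
t = thresholds(alpha, beta, gamma)
print(t)
print(returns(t["n_triple_prime"], alpha, beta, gamma))  # both returns equal beta
print(returns(t["n_double_prime"], alpha, beta, gamma))  # both returns equal gamma
```

With these illustrative values the ordering n’’’<n’<n’’ stated in the text holds.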
Our hypotheses state:
Property rights and corruption are negatively correlated
A high level of corruption implies high income inequality
The correlation between corruption and income inequality is negative
Corruption should also have a negative correlation with income level

Data

The dataset has been compiled from several sources. It includes 99 countries. The countries were not chosen by random selection; we simply used all countries that had sufficient data for both the Gini coefficient and the corruption index. The countries included in the dataset can be found in Appendix 1. The variables were collected from several different sources, as there is currently no single provider of all the variables required.
Collecting accurate, high-quality data is fairly difficult, since corruption is illegal and its operations are mostly hidden, as discussed earlier. Moreover, there is no unified or generally accepted method used by institutions. Historically, indices have been used to measure corruption in empirical studies (Dahlström, 2009). We use the Corruption Perceptions Index (CPI), which is compiled and published by Transparency International. It is by far the broadest index available, and it matches the intentions of this paper, as we are interested only in the perceived level of corruption in a country. We are not targeting any specific form or measure of corruption. The CPI currently contains data for 180 countries and has been recorded since 1995. The CPI ranks countries from 0 (most corrupt) to 10 (least corrupt). For the sake of simplicity the index has been inverted, so that 10 now indicates the highest level of corruption.
The Gini index is as explained in section 2: it is based on the area between the line of equality and the Lorenz curve, expressed as a share of the total area under the line of equality. Income inequality can be measured with different metrics. This paper uses the Gini index, which is the most common metric for income inequality. The Gini index fits our model best, as we are running regressions on a cross-country sample. Gini coefficients are taken from the World Income Inequality Database 2 (WIID2), which is compiled by UNU-WIDER, the World Institute for Development Economics Research of the United Nations University. UNU-WIDER has updated and checked WIID1, which is a newer version of the Deininger & Squire database from the World Bank. Deininger & Squire is a source commonly seen in many other works. WIID2 also contains more data points, and the fact that it is the world’s largest database of Gini coefficients makes it an appropriate source.
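To make the Lorenz-curve definition concrete, the following is a minimal sketch that computes a Gini coefficient from a list of individual incomes. It is only an illustration: the thesis takes ready-made coefficients from WIID2 and does not compute them, the function name `gini` is hypothetical, and the result is on a 0 to 1 scale rather than the 0 to 100 scale often used in published data.

```python
def gini(incomes):
    """Gini coefficient (0 = perfect equality) from individual incomes.

    The area under the Lorenz curve is approximated with trapezoids;
    the Gini coefficient is one minus twice that area.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    cumulative = 0.0
    area = 0.0
    for v in values:
        previous = cumulative
        cumulative += v
        # Trapezoid of width 1/n between two consecutive Lorenz points.
        area += (previous + cumulative) / (2 * total * n)
    return 1 - 2 * area


print(gini([1, 1, 1, 1]))  # everyone earns the same
print(gini([0, 0, 0, 1]))  # one person earns everything
```

For a finite sample of size n the maximum value is (n-1)/n rather than exactly 1, which is why the second example stays below 1.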
The regressions and the empirical analysis are based on a balanced dataset of variables averaged over the years 2002 to 2006. Note that we use only averages. We sought to use data as recent as possible without giving up a relatively good sample size. Most countries have complete variables; however, Gini coefficients are missing for some years. This does not represent a significant problem, as the Gini coefficient rarely undergoes radical changes over a span of five years. The five-year average is used to eliminate short-run variation, so that the variables give a more general representation of the underlying relationship.
Figure 5 plots the average Gini coefficient against the average corruption index. The Gini coefficient is positively correlated with corruption, implying that if corruption increases, so does the Gini coefficient.
Plotting the variables in figure 5 gives an indication of heteroscedasticity. However, a visual approach, simply reviewing the scatter plot or the histogram of our dependent variable, the Gini coefficient, does not statistically prove heteroscedasticity. Therefore, in Appendix 5 we present the full test for heteroscedasticity, which shows that we can reject the null hypothesis of homoscedasticity; hence heteroscedasticity exists.
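The data preparation described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the inversion is assumed to be 10 minus the published CPI score (the exact transformation is not given in this excerpt), and missing years are assumed to be skipped when averaging.

```python
def invert_cpi(score):
    """Invert a 0-10 CPI score so that 10 means most corrupt (assumed: 10 - score)."""
    return 10 - score


def period_average(values_by_year, start=2002, end=2006):
    """Average the observations available between start and end (inclusive).

    Years with a missing value (None) are skipped; returns None when no
    observation is available in the window.
    """
    observed = [
        value
        for year, value in values_by_year.items()
        if start <= year <= end and value is not None
    ]
    if not observed:
        return None
    return sum(observed) / len(observed)


# Made-up Gini series with one missing year and one year outside the window.
gini_series = {2001: 99.0, 2002: 40.0, 2003: None, 2004: 42.0}
print(period_average(gini_series))
print(invert_cpi(2.5))
```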
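The test reported in Appendix 5 is not reproduced in this excerpt. As a minimal sketch of what such a check can look like, assuming a Breusch-Pagan style test in its n·R² form with corruption as the only regressor, one can regress the squared OLS residuals on the regressor and compare n·R² with a chi-square distribution with one degree of freedom. All function names and the data below are hypothetical.

```python
import math


def ols_fit(x, y):
    """Intercept and slope of a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """Coefficient of determination of the simple OLS regression of y on x."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def breusch_pagan(x, y):
    """LM statistic (n * R^2 of the auxiliary regression) and its p-value."""
    intercept, slope = ols_fit(x, y)
    squared_residuals = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = max(len(x) * r_squared(x, squared_residuals), 0.0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Made-up data whose error spread grows with x.
x = list(range(1, 21))
y = [2 + 3 * xi + xi * (-1) ** xi for xi in x]
lm, p = breusch_pagan(x, y)
print(lm, p)
```

A small p-value leads to rejecting the null hypothesis of homoscedasticity, which is the conclusion the text reports for the actual dataset.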


Model

To examine how corruption affects income inequality we use the following linear regression model, estimated by the Weighted Least Squares (WLS) method. One of the reasons for using the WLS regression model is the heteroscedasticity discussed previously.
These models take income inequality (GINI) as the dependent variable and the corruption index (CPI) as the independent variable. In line with our purpose, we look for patterns across different groupings based on legal system. Model 1 includes dummy variables following the La Porta grouping, identifying legal systems.
Y : An average of each country’s Gini coefficient over 2002-2006
X1 : An average of each country’s corruption perception index over 2002-2006
DSocialist : Dummy indicating if a country’s legal system is identified as Socialist
DEnglish : Dummy indicating if a country’s legal system is identified as English
DFrench : Dummy indicating if a country’s legal system is identified as French
DGerman : Dummy indicating if a country’s legal system is identified as German
DScandinavia: Dummy indicating if a country’s legal system is identified as Scandinavian
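The regression equation itself does not survive in this excerpt. Based on the variable list above, Model 1 plausibly takes a form like the following; this is an assumed reconstruction, and the coefficient symbols b and d and the weights w are placeholders chosen to avoid clashing with α, β and γ in the theoretical framework.

```latex
\[
Y_i = b_0 + b_1 X_{1i} + \sum_{j} d_j D_{ji} + \varepsilon_i ,
\qquad j \in \{\text{Socialist}, \text{English}, \text{French}, \text{German}, \text{Scandinavia}\}
\]
\[
\min_{b_0,\, b_1,\, d_j} \; \sum_{i} w_i \Bigl( Y_i - b_0 - b_1 X_{1i} - \sum_{j} d_j D_{ji} \Bigr)^{2}
\]
```

If an intercept is included and the five groups cover every country, one dummy has to be omitted as the reference category to avoid perfect collinearity; which one the authors drop is not stated here.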

WLS vs OLS

The ordinary least squares (OLS) estimator is the best method to use if the random errors in a linear regression model have the same variance, that is, if the standard deviation of the error term is constant (Shao, 1990). However, in many regressions, as in our case, the opposite holds: the variances of the random errors differ from each other. This implies that the OLS estimates will not be efficient, and OLS is not the appropriate choice of model.
Instead of the OLS estimator one can use a WLS regression model. WLS has some advantages. One of the greatest is that the model fits small data sets (Shao, 1990); WLS is also a good choice when the variances of the random errors are not the same (Carrol, 1982).
Furthermore, we proceeded with the WLS method as it gave us both a higher R² and a lower standard error (Se).

The WLS regression model

In a linear regression, one of the common assumptions is that each data point provides equally precise information about its part of the total variation, that is, that the standard deviation of the error term is constant. This will not always be the case, so one cannot always assume that all data points should be treated equally. Analysing Figure 5, the scatter plot of the Gini coefficient against the corruption index, we can see some distant data points, where countries deviate from the hypothesis, i.e. low Gini but high corruption or the other way around (e.g. Chile or Slovakia).
In our WLS regression we use corruption as the weighting variable, which can produce a more efficient estimate of the parameters. Each observation is weighted relative to the weights of the other observations.
With the WLS regression we obtained a good estimate of the relationship between corruption and income inequality. However, WLS also has drawbacks worth mentioning. First, WLS theory assumes that the weights are known, which is almost never the case. The weights in a WLS regression are therefore estimated. The effect of using estimated weights instead of the true weights is argued to have only a minor impact on the interpretation of the regression. Second, outliers still affect the regression result, as in any other least squares regression; nevertheless, dismissing WLS as an appropriate method would result in an analysis of lower quality and less significance.
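The weighting idea can be illustrated with a minimal sketch of WLS for a single regressor. The function name `wls_fit` and the data are hypothetical, and the weights 1/x² are only an assumption (error standard deviation proportional to the corruption index); the text does not state how the authors derive their weights from corruption.

```python
def wls_fit(x, y, weights):
    """Intercept and slope minimising sum(w * (y - a - b*x)**2)."""
    total_weight = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total_weight
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total_weight
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up corruption scores and Gini values.
corruption = [2.0, 3.5, 5.0, 6.5, 8.0]
gini_values = [28.0, 33.0, 36.0, 47.0, 41.0]

# Assumed weights: observations with a larger error variance count less.
weights = [1 / c ** 2 for c in corruption]

print(wls_fit(corruption, gini_values, weights))
print(wls_fit(corruption, gini_values, [1.0] * len(corruption)))  # equal weights give OLS
```

With equal weights the estimator reduces to OLS, so comparing the two printed results shows how much the weighting shifts the fitted line.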

Results

Our thesis is based on cross-sectional data covering 99 countries. These 99 countries were the only ones with sufficient data between 2002 and 2006 on both corruption and income inequality. A full review of the dataset can be found in Appendix 1.
The countries are divided into subgroups according to the origin of their legal system: English (19), Socialist (32), French (38), Scandinavian (5), and German (5) (a full specification can be found in Appendix 2).
The tables in Appendix 3 show the full results for the WLS regression model (1), including dummies for legal origin. The R² reported for the regression is .500, which is reasonably good for a model with only one independent variable besides the dummies. Consequently, about 50% of the variation in income inequality is explained by the model, holding all other unknown variables the same.
Comparing the regression results from the WLS regression with dummies and the WLS regression without dummies we can see that our results are stronger in the case with dummies included.

Table of Contents
1 Introduction
1.1 Purpose
1.2 Disposition
2 Background
2.1 Corruption
2.2 Income inequality
3 Theoretical framework
4 Data
4.1 Model
5 Results
6 Conclusion