Direct and Reverse Estimates of Returns to Scale
Increasing returns to scale are of great importance for numerous macroeconomic models (see, for example, Farmer and Guo, 1994, and Jones, 2005). However, there is a lack of consensus on whether the assumption of increasing returns to scale is empirically plausible. Different methods of estimating returns to scale have been used in the literature and have produced diverging results. Therefore, it is important to understand exactly what each method does and when it might be preferable to use one over the others.
This chapter contributes to the existing literature in several ways. First, I conduct a comparative study of the direct and reverse regression models for various estimators, namely Ordinary Least Squares (OLS), Two-Stage Least Squares (2SLS), Limited-Information Maximum Likelihood (LIML) and Fuller-modified LIML (F-LIML), in the context of estimating returns to scale and technical progress. Second, I point out that the weak instrument problem is an important source of bias, which causes divergence in the estimated value of returns to scale. The empirical results are produced using estimation methods that are robust to weak instruments.
An important part of the literature on the estimation of returns to scale relates the output growth index linearly to the input growth index; see, for example, Diewert and Fox (2008). The intercept and the slope of the linear equation measure technical progress and returns to scale, respectively. Whereas the theory provides a deterministic relationship between the two variables of interest (output and input growth rates), from an empirical perspective it is necessary to decide which variable, the input or the output, is stochastic and therefore deviates from the true population regression line. Suppose that the data are represented in a coordinate system, where the x-axis is the input variable and the y-axis is the output variable. The direct regression model assumes that the output variable is stochastic and fits the line that minimizes the squared vertical distance between the data and the regression line, that is, the distance in the direction of the y-axis. By contrast, the reverse regression model considers the reversed situation, where the input variable becomes the dependent variable. Both regression models have theoretical foundations and have been applied in empirical work. Depending on which regression model is chosen, the estimation results are often very different.
Apart from the lack of consensus on regression models, researchers are also debating the choice of estimators. Two types of estimator are considered in the literature, namely the OLS and Instrumental Variables (IV) estimators, especially the 2SLS estimator. It is easy to show that the direct and reverse OLS estimators are biased when both the dependent variable and the regressor are measured with errors. In contrast, I show in this chapter that the direct 2SLS estimator and its reverse counterpart are consistent and asymptotically equivalent. However, in practice the gap between the direct and reverse 2SLS regressions is often very large. For instance, Hall (1988) presented 2SLS estimates of returns to scale and price-cost markup coefficients using annual two-digit sectoral data for 1953-1984. His estimated returns to scale are unreasonably large for the reverse regression model and even negative for the direct regression model. Bartelsman (1995) was one of the first authors to question the 2SLS estimator used by Hall (1988). Bartelsman provided a series of Monte Carlo experiments to illustrate that the bias is likely to be large when estimating coefficients with the reverse 2SLS approach. An influential article by Basu and Fernald (1997) compared the OLS and 2SLS estimation strategies for the direct regression model. Their OLS estimation results for thirty-four U.S. private business industries (1959-1989) show that the estimated coefficients are often much smaller than one (decreasing returns). Returns to scale are larger in the 2SLS estimation, but their average value is still close to one and cannot confirm the increasing returns to scale hypothesis. By applying OLS estimation to a larger database (for 1949-2000) of two-digit U.S. manufacturing industries, Diewert and Fox (2008) also obtained contradictory results between the direct and reverse approaches.
One of the reasons for the failure of 2SLS in estimating returns to scale is the weakness of the instruments. Studies by Staiger and Stock (1997) and Shea (1997) show that weak instruments can generate large bias in 2SLS estimation. The implausible results of Hall (1988) are a typical consequence of using weak instruments. Formal diagnostics and remedies for weak instruments are now available in the literature; see, for example, Hahn and Hausman (2002), Stock et al. (2002) and Stock and Yogo (2002). In this chapter, I extend the work of Bartelsman (1995) by taking weak instruments into account. Compared to Hall (1988), Basu and Fernald (1997) and Diewert and Fox (2008), this chapter goes a step further by testing the quality of the instruments and estimating returns to scale with robust estimators, such as the LIML and F-LIML estimators.
Working with data for twenty-one sectors of U.S. manufacturing industries over the last half-century, I find strong evidence of increasing returns to scale. On the other hand, technical progress has made little contribution to U.S. economic growth. Compared with prior empirical results on the estimation of returns to scale, these results are more robust to weak instruments and support a growing body of theoretical models emphasizing the importance of increasing returns to scale for explaining productivity growth.
The remainder of this chapter is organized as follows. I first present the econometric model and the identification issues in Section 2.2. IV estimation and the weak instrument problem are discussed in more detail in Section 2.3. The empirical application to the U.S. manufacturing industries data is reported in Section 2.4. Section 2.5 concludes.
Based on the prior works of Diewert (1976) and Diewert and Fox (2008), this chapter follows the Diewert-Fox method of measuring technical progress and returns to scale, where the technology of a firm with multiple inputs and multiple outputs is represented by a non-constant returns to scale Translog cost function. The framework proposed by Diewert and Fox (2008) not only relaxes a series of simplifying restrictions, i.e., single output, constant returns to scale and perfect competition, but also establishes a very practical relationship between the aggregate input index and the aggregate output index. The measures of technical progress and of returns to scale appear in this equation as the intercept and the slope, respectively, which seem easy to identify. However, I will demonstrate that identification is not straightforward once a broadly acceptable stochastic specification is chosen. Under a series of restrictions on the Translog cost function and the neutral technical change assumption, the deterministic relationship between the log-Törnqvist input growth index and the log-Törnqvist output growth index is:
$$ y^{*} = \alpha + \beta x^{*}, \tag{2.1} $$
where $y^{*}$ and $x^{*}$ are $T \times 1$ vectors of latent values of the output growth index and the input growth index, respectively. The parameter $\alpha$ is the constant rate of cost reduction and $\beta$ is the degree of returns to scale. When inputs are increased, if output increases at the same rate, i.e., $\beta = 1$, then the technology exhibits constant returns to scale. If output increases by less than that rate, i.e., $\beta < 1$, the technology exhibits decreasing returns to scale. If output increases by more than that rate, i.e., $\beta > 1$, the technology exhibits increasing returns to scale.
The intercept and the slope of equation (2.1) are the two parameters of interest.
Since these variables can never be measured or observed perfectly in the real world, the common practice is to introduce additive error terms. Suppose that there are $T$ observations in the sample, where the observed values are denoted by $(x, y)$. They are measured with additive errors, $u$ and $v$. Let
$$ x = x^{*} + u \quad \text{and} \quad y = y^{*} + v, \tag{2.2} $$
where $y$ and $x$ are $T \times 1$ vectors of observations. The model (2.1)-(2.2) is a linear Errors-in-Variables (EIV) model, which can also be rewritten in a more compact form with only the observable variables, $y = \alpha + \beta x + \varepsilon$, where $\varepsilon \equiv v - \beta u$. We make some statistical assumptions to restrict our stochastic framework.
Assumption 2.1. Suppose $u$ and $v$ are two zero-mean i.i.d. normally distributed variables. Formally, let $V[u] \equiv \sigma_u^{2}$, $V[v] \equiv \sigma_v^{2}$ and $V[\varepsilon] = E[(v - \beta u)^{2}] \equiv \sigma_\varepsilon^{2}$. The latent variables $(x^{*}, y^{*})$ are uncorrelated with the error terms; suppose that their first and second moments exist. Let $E[x^{*}] \equiv \mu$ and $V[x^{*}] \equiv \sigma^{2}$.
The set of parameters that we want to estimate in this model is $\theta \equiv (\alpha, \beta, \mu, \sigma, \sigma_u, \sigma_v)$.
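To see why the compact form cannot simply be estimated by least squares, it helps to write out the covariance between the observed regressor and the composite error. The derivation below is a sketch: it writes $\alpha$ and $\beta$ for the intercept and slope of (2.1) and $\varepsilon$ for the composite error, and it adds one assumption that is not stated explicitly above, namely that $u$ and $v$ are uncorrelated with each other.

```latex
\begin{align*}
y &= y^{*} + v = \alpha + \beta x^{*} + v
   = \alpha + \beta (x - u) + v
   = \alpha + \beta x + \varepsilon,
   \qquad \varepsilon \equiv v - \beta u, \\
\operatorname{Cov}(x, \varepsilon)
  &= \operatorname{Cov}(x^{*} + u,\; v - \beta u)
   = -\beta \sigma_u^{2}, \\
\sigma_\varepsilon^{2}
  &= E[(v - \beta u)^{2}]
   = \sigma_v^{2} + \beta^{2} \sigma_u^{2}.
\end{align*}
```

Under these assumptions the regressor is correlated with the error whenever $\beta \neq 0$ and $\sigma_u^{2} > 0$, which is the source of the OLS bias discussed in this chapter.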
The symmetric treatment of $x$ and $y$ seems to be a simple extension of the classical stochastic specification, where only one variable is assumed to be subject to error. However, the introduction of the second error term dramatically increases the difficulty of estimation. A surprising consequence is that the unique intercept and slope of the fitted line cannot be identified from the bivariate data set $(x, y)$ alone. This is the well-known identification problem of the EIV model, which was first highlighted by Adcock (1878), who tried to handle it by using orthogonal regression (this estimation method is consistent only if both variables are subject to errors that have the same variance, i.e., $\sigma_u^{2} = \sigma_v^{2}$). Adcock's intuition is the origin of Total Least Squares (TLS) estimation, which was generalized one hundred years later by Golub and Van Loan (1980). Another idea for the estimation of measurement error models was introduced by Wald (1940), with the objective of proposing a method in which strong assumptions regarding the error structure are not required. Unfortunately, this class of estimators is not feasible. Since the publication of Wald's method, the problem of estimating EIV models has received increasing attention from researchers. There have been several surveys, including Madansky (1959), Stefanski (2000) and Gillard (2006).
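The identification problem can be made concrete by counting what the first two moments of the observed data reveal. The sketch below uses only first and second moments and assumes, beyond what is stated in the text, that $u$ and $v$ are mutually uncorrelated; the symbols follow the parameter set $(\alpha, \beta, \mu, \sigma, \sigma_u, \sigma_v)$.

```latex
\begin{align*}
E[x] &= \mu, &
E[y] &= \alpha + \beta \mu, \\
V[x] &= \sigma^{2} + \sigma_u^{2}, &
V[y] &= \beta^{2} \sigma^{2} + \sigma_v^{2}, \\
\operatorname{Cov}(x, y) &= \beta \sigma^{2}.
\end{align*}
```

These are five equations in six unknowns, so at least one additional restriction or source of information is needed. The equal-variance condition $\sigma_u^{2} = \sigma_v^{2}$ of orthogonal regression is one such restriction; instruments are another.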
Direct and reverse regressions
Two types of estimator are commonly used in empirical studies. Some authors, such as Hall (1988, 1990), suggested using the 2SLS estimator. Others, like Basu and Fernald (1997) and Diewert and Fox (2008), emphasized the OLS approach. It is easy to show that OLS is biased and that the direct and reverse OLS regressions produce lower and upper bounds, respectively, on the true value of returns to scale.
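The bounding property of the direct and reverse OLS regressions can be illustrated with a small simulation. The sketch below is not the chapter's empirical exercise: the sample size, parameter values and error variances are hypothetical, and the errors are assumed to be mutually independent.

```python
import numpy as np

def direct_and_reverse_ols(x, y):
    """Return (direct slope, slope implied by the reverse regression)."""
    cov_xy = np.cov(x, y, ddof=1)[0, 1]
    b_direct = cov_xy / np.var(x, ddof=1)    # regress y on x
    b_reverse = np.var(y, ddof=1) / cov_xy   # regress x on y, then invert the slope
    return b_direct, b_reverse

# Hypothetical simulation settings (not taken from the chapter's data).
rng = np.random.default_rng(0)
T, alpha, beta = 5000, 0.01, 1.2
x_star = rng.normal(0.03, 0.05, T)           # latent input growth
y_star = alpha + beta * x_star               # latent output growth, eq. (2.1)
x = x_star + rng.normal(0.0, 0.03, T)        # observed input growth (error u)
y = y_star + rng.normal(0.0, 0.03, T)        # observed output growth (error v)

b_direct, b_reverse = direct_and_reverse_ols(x, y)
print(f"direct OLS: {b_direct:.3f}  true beta: {beta}  reverse OLS: {b_reverse:.3f}")
```

With these settings the direct estimate falls below the true slope and the reverse estimate lies above it, so the two regressions bracket the true value without pinning it down.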
Robust estimation inference with weak instruments
A simple response to the finite-sample bias problem is to limit the number of instruments used. One can, for example, drop the weakest instruments, selected with the help of the first-stage regression, or combine them. Given a regression model, the literature suggests that there is a class of alternative estimators that are asymptotically equivalent to 2SLS but have better finite-sample properties; see Davidson and MacKinnon (1993). One of them is the LIML estimator due to Anderson and Rubin (1949). The LIML estimator is rarely used in applied studies, mainly because the additional assumption of joint normality of the error terms is required. The LIML estimator belongs to the family of k-class estimators proposed by Nagar (1959).
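The k-class family can be sketched in a few lines for the single-regressor model with a constant. The code below uses the standard textbook formulas rather than anything specific to this chapter: k = 0 gives OLS, k = 1 gives 2SLS, LIML sets k to the smallest eigenvalue of $(W' M_Z W)^{-1} W' W$ with $W = [y, x]$ demeaned, and the Fuller modification subtracts $a / (T - L - 1)$. The function names and the simulated data are hypothetical.

```python
import numpy as np

def _resid(Z, V):
    """Residuals of V (vector or matrix) after least-squares projection on Z."""
    return V - Z @ np.linalg.lstsq(Z, V, rcond=None)[0]

def k_class(y, x, Z, k):
    """k-class slope and intercept for y = a + b*x with instruments Z (T x L).

    The constant is partialled out by demeaning y, x and Z.
    """
    yd, xd, Zd = y - y.mean(), x - x.mean(), Z - Z.mean(axis=0)
    mx, my = _resid(Zd, xd), _resid(Zd, yd)          # M_Z x and M_Z y
    b = (xd @ yd - k * (xd @ my)) / (xd @ xd - k * (xd @ mx))
    return b, y.mean() - b * x.mean()

def liml_kappa(y, x, Z):
    """Smallest eigenvalue of (W' M_Z W)^{-1} W'W, with W = [y, x] demeaned."""
    W = np.column_stack([y - y.mean(), x - x.mean()])
    MW = _resid(Z - Z.mean(axis=0), W)
    eig = np.linalg.eigvals(np.linalg.solve(MW.T @ MW, W.T @ W))
    return float(np.min(eig.real))

def fuller_k(kappa, T, L, a=1.0):
    """Fuller-modified k: kappa - a / (T - L - 1); the -1 counts the constant."""
    return kappa - a / (T - L - 1)

# Hypothetical data: strong instruments, measurement error in both variables.
rng = np.random.default_rng(1)
T, L, alpha, beta = 2000, 4, 0.5, 1.2
Z = rng.normal(size=(T, L))
x_star = Z @ np.full(L, 0.5) + rng.normal(size=T)    # latent regressor
x = x_star + rng.normal(size=T)                      # observed with error u
y = alpha + beta * x_star + rng.normal(size=T)       # observed with error v

kappa = liml_kappa(y, x, Z)
b_ols, _ = k_class(y, x, Z, 0.0)
b_2sls, _ = k_class(y, x, Z, 1.0)
b_liml, _ = k_class(y, x, Z, kappa)
b_fuller, _ = k_class(y, x, Z, fuller_k(kappa, T, L))
print(f"OLS {b_ols:.3f}  2SLS {b_2sls:.3f}  LIML {b_liml:.3f}  F-LIML {b_fuller:.3f}")
```

With strong instruments, as simulated here, the three IV estimates are close to each other and to the true slope, while OLS is attenuated. The differences among the IV estimators matter mainly when the instruments are weak.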
Weak instrument test and specification test
A valid instrument set must satisfy two conditions, exogeneity and the rank condition, which are formally stated in Assumption 2.2. In addition to these two classical requirements, a useful instrument set must also be highly correlated with the endogenous regressors. As mentioned previously, instruments that do not have a high degree of explanatory power magnify the finite-sample bias of 2SLS. Therefore, a careful diagnosis of weak instruments is important before interpreting the estimation results. Several approaches for testing weak instruments are available. These include the test based on the partial R2 and the F statistic of the first-stage regression, the test based on the pairwise correlations between the endogenous regressors and the instruments, and the Hahn and Hausman (2002) specification test. In this study, I follow the Stock and Yogo (2002) weak instrument diagnostic, which uses the first-stage F statistic.
Before going into the econometric analysis of testing for weak instruments, it is advisable to define the weak instrument set formally. Stock and Yogo (2002) proposed two definitions based on the type of consequences induced by weak instruments. In general, weak instruments can lead to bias in the estimator, as shown previously; they can also lead to large size distortions in statistical tests. Thus, the first definition of a weak instrument set is given in terms of the maximum relative bias, where the relative bias is defined as the ratio of the IV bias to the OLS bias, i.e., $(\beta_{IV} - \beta)/(\beta_{OLS} - \beta)$. The second definition is based on the maximum size of the Wald test of the null hypothesis $\hat{\beta} = \beta$, where the size of the test is defined as Pr[reject H0 | H0 is true]. Hence, a set of instruments is weak if the first-stage F statistic is sufficiently small to cause a large relative bias or size distortion.
Given the number of instruments and the definition of the weak instrument set, Stock and Yogo (2002, Tables 2.1-2.4) provided critical values of the F statistic for the 2SLS, LIML and F-LIML estimators. For example, when four instruments are used in a regression model with one endogenous regressor, the F statistic must exceed 6.7 to reject the null hypothesis that the relative bias of 2SLS is greater than or equal to 20%. The threshold value of the F statistic increases to 10.3 for a more rigorous test with the null hypothesis that the relative bias is greater than or equal to 10%. On the other hand, the 2SLS estimator with an F statistic below 5 will likely produce biased estimation results. Alternatively, from the size distortion perspective, the F statistic must exceed 24.6 to reject the null hypothesis that the actual size of the 2SLS Wald test at the 5% significance level can be greater than 10%. Similar critical values are also available for the F-LIML estimator. However, because the LIML estimator does not have moments in finite samples and its relative bias is not well defined, its critical values for testing weak instruments are given only in terms of the maximum size distortion. Since the LIML and F-LIML estimators perform better in finite samples, their critical values are lower, which suggests that the LIML and F-LIML estimators are superior to 2SLS when the instruments are weak. Table 2.1 summarizes the critical values for the weak instrument test with L = 4; the complete tables can be found in Stock and Yogo (2002).
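The diagnostic described above amounts to computing the first-stage F statistic and comparing it with the tabulated thresholds. The sketch below hard-codes only the three 2SLS critical values quoted in the text for four instruments and one endogenous regressor; the function names and the simulated first stage are hypothetical, and the F statistic is the standard homoskedastic one.

```python
import numpy as np

# 2SLS critical values quoted in the text (L = 4 instruments, one endogenous regressor).
CV_BIAS_20 = 6.7    # H0: relative bias >= 20%
CV_BIAS_10 = 10.3   # H0: relative bias >= 10%
CV_SIZE_10 = 24.6   # H0: actual size of the 5% Wald test can exceed 10%

def first_stage_F(x, Z):
    """F statistic for H0: all instrument coefficients are zero in x = c + Z*pi + e."""
    T, L = Z.shape
    X1 = np.column_stack([np.ones(T), Z])
    resid = x - X1 @ np.linalg.lstsq(X1, x, rcond=None)[0]
    rss_u = resid @ resid                     # unrestricted residual sum of squares
    rss_r = np.sum((x - x.mean()) ** 2)       # restricted model: constant only
    return ((rss_r - rss_u) / L) / (rss_u / (T - L - 1))

def diagnose_2sls(F):
    """Report the most demanding quoted null hypothesis that F rejects (L = 4 only)."""
    if F > CV_SIZE_10:
        return "rejects Wald size > 10%"
    if F > CV_BIAS_10:
        return "rejects relative bias >= 10%"
    if F > CV_BIAS_20:
        return "rejects relative bias >= 20%"
    return "cannot reject relative bias >= 20%"

# Hypothetical first stage with fairly strong instruments.
rng = np.random.default_rng(2)
T, L = 500, 4
Z = rng.normal(size=(T, L))
x = Z @ np.full(L, 0.5) + rng.normal(size=T)

F = first_stage_F(x, Z)
print(f"first-stage F = {F:.1f}: {diagnose_2sls(F)}")
```

For a different number of instruments, or for the LIML and F-LIML estimators, the thresholds would have to be replaced by the corresponding entries of the Stock and Yogo (2002) tables.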
Table of contents:
1 General Introduction
1.2 An overview of the thesis
2 Direct and Reverse Estimates of Returns to Scale
2.2 Econometric model
2.3 Direct and reverse regressions
2.3.1 Instrumental variable estimator
2.3.2 Robust estimation inference with weak instruments
2.3.3 Weak instrument test and specification test
2.4 Evidence of increasing returns to scale
3 Hicks-neutral and Non-neutral Productivity
3.2 A CES production function with biased technical change
3.2.1 CES specification
3.2.2 Factors-augmenting technical change
3.3 Identification and estimation via control functions
3.3.1 Control function approach
3.3.2 Identification conditions
3.3.3 Estimation procedures
3.3.4 Bias of the Cobb-Douglas specification
3.4 Empirical investigation
3.4.1 Estimation results
3.4.2 Recovering the Hicks-neutral and factor-augmenting productivity
4 Fixed and Variable Cost
4.2 Defining fixed costs and fixed inputs
4.3 A microeconomic framework for fixed costs
4.3.1 On the limitations of traditional production analysis
4.3.2 Another view of the traditional production function
4.3.3 An extended production function
4.4 Some consequences of neglecting fixed costs
4.5 On flexible functional forms
4.6 Econometric treatment of cost heterogeneity
4.7 Empirical investigation
4.7.1 Empirical models and estimation strategies
4.7.2 Data and empirical results
4.7.3 Estimation with industry specific dummies
5 Productivity, Fixed Cost and Export
5.2 Theoretical model with heterogeneous entry costs
5.2.1 Heterogeneous export entry costs
5.2.2 Comparative statics
5.3 Empirical methodology for measuring entry barriers
5.3.1 Overview of our empirical model
5.3.2 Treatment evaluation model
5.3.3 Estimating the productivity
5.4 Empirical investigation
5.4.1 Estimation of productivity
5.4.2 Entry costs and productivity
6 General Conclusion
6.1 Main findings and implications of the analysis
6.2 Limitations and extensions
7 Résumé de thèse
7.3 Résumé de chapitres
7.4 Principaux résultats et les implications de l’étude
8 General references