Economics Everywhere -- Huanren Zhang's Blog: Panel Data Analysis -- Random Effects vs. Fixed Effects

y_it=beta x_it + a_i + u_it

The unobserved factors that affect the dependent variable are of two types: a_i (constant over time) and u_it (varying over time). a_i is called the unobserved effect or fixed effect; u_it is called the idiosyncratic error or time-varying error.

Fixed effects estimation uses a transformation (time-demeaning the data) to remove the unobserved effect a_i. The fixed effects transformation is also called the within transformation. Pooled OLS applied to the time-demeaned variables is called the fixed effects estimator or the within estimator, because it uses the time variation in y and x within each cross-sectional unit.
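
The within transformation can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a single regressor, a simulated panel in which x_it is deliberately correlated with a_i, and a true beta of 2. The function names (within_estimate, pooled_estimate, simulate) and the data-generating design are made up for this example.

```python
import random

def within_estimate(panel):
    """Slope from pooled OLS on time-demeaned data (one regressor).

    panel maps a unit id to a list of (x, y) pairs, one per period.
    """
    num = 0.0
    den = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # Demeaning removes a_i, because a_i equals its own time average.
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den

def pooled_estimate(panel):
    """Slope from pooled OLS with a single common intercept."""
    pairs = [p for obs in panel.values() for p in obs]
    x_bar = sum(x for x, _ in pairs) / len(pairs)
    y_bar = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    den = sum((x - x_bar) ** 2 for x, _ in pairs)
    return num / den

def simulate(n_units=500, n_periods=5, beta=2.0, seed=0):
    """Simulated panel where x_it is correlated with a_i (assumed design)."""
    rng = random.Random(seed)
    panel = {}
    for i in range(n_units):
        a_i = rng.gauss(0, 1)                      # unobserved effect
        obs = []
        for _ in range(n_periods):
            x = a_i + rng.gauss(0, 1)              # x depends on a_i
            y = beta * x + a_i + rng.gauss(0, 1)   # y_it = beta x_it + a_i + u_it
            obs.append((x, y))
        panel[i] = obs
    return panel

panel = simulate()
beta_fe = within_estimate(panel)
beta_pooled = pooled_estimate(panel)
print(f"within: {beta_fe:.3f}  pooled OLS: {beta_pooled:.3f}")
```

Because x_it is built to depend on a_i, the pooled OLS slope should come out above the true value of 2, while the within slope should stay close to it.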

In the fixed effects model, any explanatory variable that is constant over time for all i is swept away by the fixed effects transformation. We therefore cannot include variables such as gender or a city's distance from a river. Fixed effects estimation is adequate if we want to draw inferences only about the examined individuals.

When we assume that the unobserved effect a_i in the above model is uncorrelated with each explanatory variable in all periods, that is, Cov(x_it, a_i) = 0, we have a random effects model. This model is adequate if we want to draw inferences about the whole population, not only the examined sample.

The fixed effects estimator subtracts the time averages from the corresponding variable. The random effects transformation subtracts a fraction of that time average, where the fraction depends on sigma_u^2, sigma_a^2, and the number of time periods, T.
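
The fraction in question is usually written theta = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T sigma_a^2)), the standard textbook expression. The sketch below evaluates it for a few made-up variance values and applies the quasi-demeaning to one unit's outcomes. The function names re_theta and quasi_demean are hypothetical, and in real applications the variances have to be estimated rather than assumed.

```python
import math

def re_theta(sigma_u2, sigma_a2, n_periods):
    """Fraction of the time average removed by the RE transformation."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + n_periods * sigma_a2))

def quasi_demean(values, theta):
    """Subtract theta times the unit's time average from each observation."""
    mean = sum(values) / len(values)
    return [v - theta * mean for v in values]

# Illustrative variance values (assumptions, not estimates from real data)
theta_no_effect = re_theta(1.0, 0.0, 5)   # sigma_a^2 = 0: nothing is removed
theta_typical = re_theta(1.0, 1.0, 5)     # strictly between 0 and 1
theta_long = re_theta(1.0, 1.0, 500)      # large T: close to full demeaning

y_i = [3.0, 5.0, 4.0]                     # one unit's outcomes over three periods
same_as_pooled = quasi_demean(y_i, 0.0)   # data unchanged, as in pooled OLS
same_as_fe = quasi_demean(y_i, 1.0)       # fully demeaned, as in FE

print(theta_no_effect, round(theta_typical, 4), round(theta_long, 4))
print(same_as_pooled, same_as_fe)
```

When theta is 0 the transformation leaves the data untouched and RE coincides with pooled OLS; as theta approaches 1 (large T, or sigma_a^2 large relative to sigma_u^2) the RE estimates move toward the FE estimates.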

In practice, it is usually informative to compute the pooled OLS estimates as well. Comparing the three sets of estimates (pooled OLS, RE, and FE) can help us determine the nature of the biases caused by leaving the unobserved effect a_i entirely in the error term (as pooled OLS does) or partially in the error term (as the RE transformation does). Remember, however, that the pooled OLS standard errors and test statistics are generally invalid: they ignore the often substantial serial correlation in the composite errors, v_it = a_i + u_it.

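One can use the Hausman test to choose between RE and FE. It compares the two sets of estimates, and a rejection is taken as evidence against the RE assumption, so FE is used. A failure to reject means either that the RE and FE estimates are sufficiently close that it does not matter which is used, or that the sampling variation in the FE estimates is so large that one cannot conclude practically significant differences are statistically significant.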
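
For a single coefficient, the Hausman statistic reduces to H = (b_FE - b_RE)^2 / (Var(b_FE) - Var(b_RE)), which is asymptotically chi-square with one degree of freedom under the null. The sketch below computes it with the standard library only. The function name hausman_one_coef is hypothetical and all input numbers are invented for illustration.

```python
import math

def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for a single coefficient.

    Under the null that RE is consistent, H is asymptotically chi-square(1).
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("Var(FE) - Var(RE) must be positive")
    h = (b_fe - b_re) ** 2 / var_diff
    # Chi-square(1) upper tail: P(Z^2 > h) = erfc(sqrt(h / 2))
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value

# Invented numbers: the estimates differ and FE is precisely estimated
h_precise, p_precise = hausman_one_coef(b_fe=2.00, b_re=2.30,
                                        var_fe=0.010, var_re=0.004)

# Same point estimates, but a much noisier FE estimate
h_noisy, p_noisy = hausman_one_coef(b_fe=2.00, b_re=2.30,
                                    var_fe=0.100, var_re=0.004)

print(f"precise FE: H = {h_precise:.2f}, p = {p_precise:.5f}")
print(f"noisy FE:   H = {h_noisy:.2f}, p = {p_noisy:.3f}")
```

The second call illustrates the second reading of a failure to reject: the gap between the FE and RE estimates is unchanged, but the FE estimate is too imprecise for the test to detect it.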

Using FE is mechanically the same as allowing a different intercept for each cross-sectional unit. FE is almost always much more convincing than RE for policy analysis using aggregated data.
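
The equivalence with unit-specific intercepts can be checked by hand on a tiny panel. The sketch below (one regressor, hand-made data, hypothetical function name lsdv_from_within) takes the within slope, backs out the intercept it implies for each unit, and verifies that the pair satisfies the first-order conditions of a regression with one dummy per unit.

```python
def lsdv_from_within(panel):
    """Within slope plus the unit intercepts it implies (one regressor)."""
    means = {i: (sum(x for x, _ in obs) / len(obs),
                 sum(y for _, y in obs) / len(obs))
             for i, obs in panel.items()}
    num = sum((x - means[i][0]) * (y - means[i][1])
              for i, obs in panel.items() for x, y in obs)
    den = sum((x - means[i][0]) ** 2
              for i, obs in panel.items() for x, _ in obs)
    beta = num / den
    # Each unit's intercept is its mean outcome minus beta times its mean x.
    intercepts = {i: y_bar - beta * x_bar for i, (x_bar, y_bar) in means.items()}
    return beta, intercepts

# Hand-made panel: two units, three periods each (illustrative data)
panel = {
    "A": [(1.0, 3.0), (2.0, 6.0), (3.0, 7.0)],
    "B": [(1.0, 13.0), (2.0, 14.0), (3.0, 18.0)],
}
beta, intercepts = lsdv_from_within(panel)

# First-order conditions of the dummy-variable regression:
# residuals sum to zero within each unit and are orthogonal to x overall.
resid = {i: [y - intercepts[i] - beta * x for x, y in obs]
         for i, obs in panel.items()}
unit_sums = {i: sum(r) for i, r in resid.items()}
slope_foc = sum(x * r
                for i, obs in panel.items()
                for (x, _), r in zip(obs, resid[i]))

print(beta, intercepts)
print(unit_sums, slope_foc)
```

Since least squares with a dummy for every unit has a unique minimizer when x varies within units, satisfying these conditions means the within slope and the implied intercepts are the dummy-variable estimates.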

Panel data analysis can also be used to analyze clustered data. Depending on the nature of the clustered data, FE or RE can be used.

Sunday, April 6, 2014
