Mincer earnings regression in the form of the double Pareto-lognormal model

Working Paper 2016-407


This study generalizes the standard Mincer earnings regression, which takes the form of a lognormal (LN) model, into a double Pareto-lognormal (dPLN) model, substantially improving the goodness of fit to wage data. The empirical analysis contrasts the new and traditional models in how they relate the wage to its determinants beyond the primary equation for the conditional mean of log-wage given potential work experience and education. The wage distributions predicted by the dPLN regression model faithfully reproduce the log-wage quantile regression results obtained from the original data, whereas those predicted by the LN regression model do not. Furthermore, the dPLN regression model predicts that higher education has statistically significant positive effects on wage dispersion, particularly at the upper end of the distribution, whereas the LN regression model predicts insignificant negative effects even when heteroskedasticity in the error term is incorporated into the model. The new model is therefore expected to be useful not only for accurately estimating the contributions of wage determinants to wage dispersion and to the share of low-wage workers, but also for improving existing analysis methods based on earnings equations, such as the Oaxaca-Blinder decomposition and the estimation of returns to education, by utilizing the dispersion regression equations.
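
As a rough illustration of the distributional idea only, and not the paper's estimation procedure, the sketch below simulates log-wages from a dPLN-type error structure. It relies on the standard representation of a dPLN variable, whose logarithm is a normal term plus the difference of two independent exponential terms. The function name `simulate_log_wage`, the Mincer coefficients `b`, and the values of `tau`, `alpha`, and `beta` are all hypothetical choices made for this example; the dispersion parameters are held constant here, which is a simplifying assumption.

```python
import numpy as np

def simulate_log_wage(educ, exper, rng,
                      b=(1.0, 0.08, 0.04, -0.0007),  # hypothetical Mincer coefficients
                      tau=0.3, alpha=3.0, beta=5.0):  # hypothetical dPLN shape parameters
    """Draw log-wages whose error term follows a normal-Laplace law,
    so that the wage itself is double Pareto-lognormal (dPLN)."""
    educ = np.asarray(educ, dtype=float)
    exper = np.asarray(exper, dtype=float)
    # Location term in Mincer form: schooling, experience, experience squared.
    nu = b[0] + b[1] * educ + b[2] * exper + b[3] * exper ** 2
    shape = nu.shape
    normal_part = tau * rng.standard_normal(shape)   # lognormal body
    upper = rng.exponential(1.0, shape) / alpha      # Pareto upper tail, index alpha
    lower = rng.exponential(1.0, shape) / beta       # power-law lower tail, index beta
    return nu + normal_part + upper - lower

rng = np.random.default_rng(0)
n = 200_000
log_wage = simulate_log_wage(np.full(n, 12.0), np.full(n, 10.0), rng)
# Sample mean and variance of the simulated log-wages.
print(log_wage.mean(), log_wage.var())
```

Under this representation the conditional mean of log-wage is the location term plus 1/alpha - 1/beta, and its variance is tau^2 + 1/alpha^2 + 1/beta^2. Letting alpha and beta grow without bound removes both exponential terms and recovers the LN model.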

Author: Masato Okamoto.

Keywords: Distributional regression, heteroskedasticity, mixture distribution, quantile regression, wage dispersion.
JEL: D31, D63, J31.