Matching and semi-parametric IV estimation, a distance-based measure of migration, and the wages of young men

John Ham, Xianghong Li, Patricia B. Reagan

Research output: Contribution to journal › Article

Abstract

Our paper estimates the effect of US internal migration on wage growth for young men between their first and second job. Our analysis of migration extends previous research by: (i) exploiting the distance-based measures of migration in the National Longitudinal Surveys of Youth 1979 (NLSY79); (ii) allowing the effect of migration to differ by schooling level; (iii) using propensity score matching to estimate the average treatment effect on the treated (ATET) for movers; and (iv) using local average treatment effect (LATE) estimators with covariates to estimate the average treatment effect (ATE) and ATET for compliers. We believe the Conditional Independence Assumption (CIA) is reasonable for our matching estimators since the NLSY79 provides a relatively rich array of variables on which to match. Our matching methods are based on local linear, local cubic, and local linear ridge regressions. Local linear and local ridge regression matching produce relatively similar point estimates and standard errors, while local cubic regression matching badly over-fits the data and provides very noisy estimates. We use the bootstrap to calculate standard errors. Since the validity of the bootstrap has not been investigated for the matching estimators we use, and has been shown to be invalid for nearest neighbor matching estimators, we conduct a Monte Carlo study on the appropriateness of using the bootstrap to calculate standard errors for local linear regression matching. The data generating processes in our Monte Carlo study are relatively rich and calibrated to match our empirical models or to test the sensitivity of our results to the choice of parameter values. The estimated standard errors from the bootstrap are very close to those from the Monte Carlo experiments, which lends support to our using the bootstrap to calculate standard errors in our setting.
From the matching estimators we find a significant positive effect of migration on the wage growth of college graduates, and a marginally significant negative effect for high school dropouts. We do not find any significant effects for other educational groups or for the overall sample. Our results are generally robust to changes in the model specification and changes in our distance-based measure of migration. We find that better data matters; if we use a measure of migration based on moving across county lines, we overstate the number of moves, while if we use a measure based on moving across state lines, we understate the number of moves. Further, using either the county or state measures leads to much less precise estimates. We also consider semi-parametric LATE estimators with covariates (Frölich 2007), using two sets of instrumental variables. We precisely estimate the proportion of compliers in our data, but because we have a small number of compliers, we cannot obtain precise LATE estimates.
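
As a rough illustration of the kind of estimator the abstract describes, the sketch below computes an ATET by local linear regression matching on the propensity score and a bootstrap standard error. It is not the authors' implementation. The simulated data, the Gaussian kernel, the bandwidth `h`, the function names, and the use of a known propensity score `p` are all assumptions made for the example; in applied work the propensity score would be estimated, and re-estimated within each bootstrap draw.

```python
import numpy as np

def local_linear_fit(x0, x, y, h):
    """Local linear estimate of E[y | x = x0] using a Gaussian kernel (assumed)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    sw = np.sqrt(w)
    # Weighted least squares of y on (1, x - x0); the intercept is the fit at x0.
    X = np.column_stack([np.ones_like(x), x - x0])
    beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta[0]

def atet_llr(y, d, p, h=0.1):
    """ATET: average over treated units of own outcome minus the fitted
    control outcome at the same propensity score."""
    yc, pc = y[d == 0], p[d == 0]
    counterfactual = np.array(
        [local_linear_fit(pi, pc, yc, h) for pi in p[d == 1]]
    )
    return float(np.mean(y[d == 1] - counterfactual))

def bootstrap_se(y, d, p, n_boot=100, h=0.1, seed=0):
    """Nonparametric bootstrap: resample units with replacement, re-estimate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws.append(atet_llr(y[idx], d[idx], p[idx], h))
    return float(np.std(draws, ddof=1))

# Hypothetical data generating process with a true effect of 0.5.
rng = np.random.default_rng(42)
n = 400
p = 0.2 + 0.6 * rng.random(n)            # propensity score, treated as known here
d = (rng.random(n) < p).astype(int)      # treatment indicator (e.g. mover)
y = 1.0 + 2.0 * p + 0.5 * d + rng.normal(0.0, 0.5, n)

est = atet_llr(y, d, p)
se = bootstrap_se(y, d, p)
print(f"ATET estimate: {est:.3f}, bootstrap SE: {se:.3f}")
```

Comparing the bootstrap standard error with the spread of `est` across many independently simulated samples would mimic, in miniature, the Monte Carlo check the abstract mentions.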

Original language: English (US)
Pages (from-to): 208-227
Number of pages: 20
Journal: Journal of Econometrics
Volume: 161
Issue number: 2
DOI: 10.1016/j.jeconom.2010.12.004
State: Published - Apr 1 2011

Fingerprint

Wages
Average Treatment Effect
Migration
Bootstrap
Standard error
Estimator
Estimate
Ridge Regression
Monte Carlo Study
Calculate
Covariates
Linear regression
Estimated standard error
Local Linear Regression
Local Regression
Propensity Score
Instrumental Variables
Conditional Independence
Point Estimate
Young men

Keywords

  • LATE
  • Propensity score matching
  • US internal migration

ASJC Scopus subject areas

  • Economics and Econometrics
  • Applied Mathematics

Cite this

Matching and semi-parametric IV estimation, a distance-based measure of migration, and the wages of young men. / Ham, John; Li, Xianghong; Reagan, Patricia B.

In: Journal of Econometrics, Vol. 161, No. 2, 01.04.2011, p. 208-227.

@article{7c9e8f147e614233a280caf9b7d0f389,
title = "Matching and semi-parametric IV estimation, a distance-based measure of migration, and the wages of young men",
abstract = "Our paper estimates the effect of US internal migration on wage growth for young men between their first and second job. Our analysis of migration extends previous research by: (i) exploiting the distance-based measures of migration in the National Longitudinal Surveys of Youth 1979 (NLSY79); (ii) allowing the effect of migration to differ by schooling level and (iii) using propensity score matching to estimate the average treatment effect on the treated (ATET) for movers and (iv) using local average treatment effect (LATE) estimators with covariates to estimate the average treatment effect (ATE) and ATET for compliers. We believe the Conditional Independence Assumption (CIA) is reasonable for our matching estimators since the NLSY79 provides a relatively rich array of variables on which to match. Our matching methods are based on local linear, local cubic, and local linear ridge regressions. Local linear and local ridge regression matching produce relatively similar point estimates and standard errors, while local cubic regression matching badly over-fits the data and provides very noisy estimates. We use the bootstrap to calculate standard errors. Since the validity of the bootstrap has not been investigated for the matching estimators we use, and has been shown to be invalid for nearest neighbor matching estimators, we conduct a Monte Carlo study on the appropriateness of using the bootstrap to calculate standard errors for local linear regression matching. The data generating processes in our Monte Carlo study are relatively rich and calibrated to match our empirical models or to test the sensitivity of our results to the choice of parameter values. The estimated standard errors from the bootstrap are very close to those from the Monte Carlo experiments, which lends support to our using the bootstrap to calculate standard errors in our setting. 
From the matching estimators we find a significant positive effect of migration on the wage growth of college graduates, and a marginally significant negative effect for high school dropouts. We do not find any significant effects for other educational groups or for the overall sample. Our results are generally robust to changes in the model specification and changes in our distance-based measure of migration. We find that better data matters; if we use a measure of migration based on moving across county lines, we overstate the number of moves, while if we use a measure based on moving across state lines, we understate the number of moves. Further, using either the county or state measures leads to much less precise estimates. We also consider semi-parametric LATE estimators with covariates (Frölich 2007), using two sets of instrumental variables. We precisely estimate the proportion of compliers in our data, but because we have a small number of compliers, we cannot obtain precise LATE estimates.",
keywords = "LATE, Propensity score matching, US internal migration",
author = "John Ham and Xianghong Li and Reagan, {Patricia B.}",
year = "2011",
month = "4",
day = "1",
doi = "10.1016/j.jeconom.2010.12.004",
language = "English (US)",
volume = "161",
pages = "208--227",
journal = "Journal of Econometrics",
issn = "0304-4076",
publisher = "Elsevier BV",
number = "2",

}

TY - JOUR

T1 - Matching and semi-parametric IV estimation, a distance-based measure of migration, and the wages of young men

AU - Ham, John

AU - Li, Xianghong

AU - Reagan, Patricia B.

PY - 2011/4/1

Y1 - 2011/4/1

N2 - Our paper estimates the effect of US internal migration on wage growth for young men between their first and second job. Our analysis of migration extends previous research by: (i) exploiting the distance-based measures of migration in the National Longitudinal Surveys of Youth 1979 (NLSY79); (ii) allowing the effect of migration to differ by schooling level and (iii) using propensity score matching to estimate the average treatment effect on the treated (ATET) for movers and (iv) using local average treatment effect (LATE) estimators with covariates to estimate the average treatment effect (ATE) and ATET for compliers. We believe the Conditional Independence Assumption (CIA) is reasonable for our matching estimators since the NLSY79 provides a relatively rich array of variables on which to match. Our matching methods are based on local linear, local cubic, and local linear ridge regressions. Local linear and local ridge regression matching produce relatively similar point estimates and standard errors, while local cubic regression matching badly over-fits the data and provides very noisy estimates. We use the bootstrap to calculate standard errors. Since the validity of the bootstrap has not been investigated for the matching estimators we use, and has been shown to be invalid for nearest neighbor matching estimators, we conduct a Monte Carlo study on the appropriateness of using the bootstrap to calculate standard errors for local linear regression matching. The data generating processes in our Monte Carlo study are relatively rich and calibrated to match our empirical models or to test the sensitivity of our results to the choice of parameter values. The estimated standard errors from the bootstrap are very close to those from the Monte Carlo experiments, which lends support to our using the bootstrap to calculate standard errors in our setting. 
From the matching estimators we find a significant positive effect of migration on the wage growth of college graduates, and a marginally significant negative effect for high school dropouts. We do not find any significant effects for other educational groups or for the overall sample. Our results are generally robust to changes in the model specification and changes in our distance-based measure of migration. We find that better data matters; if we use a measure of migration based on moving across county lines, we overstate the number of moves, while if we use a measure based on moving across state lines, we understate the number of moves. Further, using either the county or state measures leads to much less precise estimates. We also consider semi-parametric LATE estimators with covariates (Frölich 2007), using two sets of instrumental variables. We precisely estimate the proportion of compliers in our data, but because we have a small number of compliers, we cannot obtain precise LATE estimates.

KW - LATE

KW - Propensity score matching

KW - US internal migration

UR - http://www.scopus.com/inward/record.url?scp=79952453863&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952453863&partnerID=8YFLogxK

U2 - 10.1016/j.jeconom.2010.12.004

DO - 10.1016/j.jeconom.2010.12.004

M3 - Article

AN - SCOPUS:79952453863

VL - 161

SP - 208

EP - 227

JO - Journal of Econometrics

JF - Journal of Econometrics

SN - 0304-4076

IS - 2

ER -