DiD regression with imputation method
didImputation(
  y0,
  data,
  cohort,
  nevertreated.value = Inf,
  unit = NULL,
  time = NULL,
  het = NULL,
  w = NULL,
  coef = -Inf:Inf,
  OATT = FALSE,
  with.se = TRUE,
  tol = 1e-06,
  maxit = 100,
  mutatedata = FALSE,
  verbose = 0,
  effective.sample = 30
)
Argument | Description |
---|---|
y0 | Formula. Model for Y(0): the full model, without leads and lags, used to predict the counterfactual outcome. |
data | A data frame or data.table. |
cohort | Character. Time of treatment identifier. By default it expects Inf for never-treated units. |
nevertreated.value | Any, default is Inf. |
unit | Character, default is NULL. |
time | Character, default is NULL. |
het | Character, default is NULL. |
w | Numerical vector or variable name. Default is NULL. |
coef | R expression of leads and lags of the form -{leads}:{lags}. Default is -Inf:Inf. |
OATT | Logical, default is FALSE. |
with.se | Logical, default is TRUE. |
tol | Numeric, default is 1e-06. |
maxit | Numeric, default is 100. |
mutatedata | Logical, default is FALSE. |
verbose | Numeric, default is 0. |
effective.sample | Numeric, default is 30. |
A didImputation object with the results of the imputation estimation.
Data used for estimation.
User and CPU time for weights convergence.
p-value for the positive horizon estimates.
Table of regression coefficients.
Wald statistic of the pre-trend regression.
Average treatment effects from the imputation procedure.
Number of iterations it took to compute the weights.
fixest regression object for the pre-trends estimation.
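To make the "imputation procedure" behind these results concrete, here is a minimal base R sketch of the general idea: fit the Y(0) model on untreated observations, predict the counterfactual for treated observations, and average the differences. It is an illustration under stated assumptions, not the package's internal code. The simulated panel, the use of lm, and all object names (panel, fit0, att_hat) are hypothetical.

```r
set.seed(1)

# Hypothetical balanced panel: 30 units observed over 6 periods.
n_units <- 30
n_periods <- 6
panel <- expand.grid(i = seq_len(n_units), t = seq_len(n_periods))

# Cohort g: units 1-10 treated from t = 3, units 11-20 from t = 5,
# units 21-30 never treated (coded Inf).
cohort_by_unit <- rep(c(3, 5, Inf), each = 10)
panel$g <- cohort_by_unit[panel$i]
panel$treated <- panel$t >= panel$g

# Outcome with unit and time effects and a true treatment effect of 2.
unit_fe <- rnorm(n_units)
time_fe <- rnorm(n_periods)
panel$y <- unit_fe[panel$i] + time_fe[panel$t] +
  2 * panel$treated + rnorm(nrow(panel), sd = 0.1)

# Step 1: fit the Y(0) model on untreated observations only.
fit0 <- lm(y ~ factor(i) + factor(t), data = panel[!panel$treated, ])

# Step 2: impute the counterfactual Y(0) for treated observations.
treated_obs <- panel[panel$treated, ]
y0_hat <- predict(fit0, newdata = treated_obs)

# Step 3: average the observation-level effects Y - Y0_hat.
att_hat <- mean(treated_obs$y - y0_hat)
att_hat
```

In this simulation att_hat should land close to the true effect of 2. The sketch does not compute standard errors or pre-trend tests.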
See below for additional details on some arguments.
By default, the function estimates all available coefficients. You can customize this behavior with the coef option, which takes an R expression of the form -{leads}:{lags}. The default, coef = -Inf:Inf, estimates all available coefficients.
For pre-trend coefficients, the function sets the greatest lead as the reference group when -Inf is used for the leads.
Borusyak, K., Jaravel, X., & Spiess, J. (2021) give a condition on the weights under which the estimators are consistent: the Herfindahl index of the weights must converge to zero. Equivalently, the inverse of this index can be read as a measure of 'effective sample size'. The authors recommend an effective sample size of at least 30. If the effective sample size is lower, a warning is thrown.
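The relationship between the Herfindahl index and the effective sample size can be illustrated with a few lines of base R. This is a sketch only: it assumes a vector of weights that sums to one, and the helper name effective_sample_size is hypothetical rather than part of the package.

```r
# Inverse Herfindahl index of a weight vector (assumed to sum to one).
effective_sample_size <- function(w) {
  herfindahl <- sum(w^2)
  1 / herfindahl
}

# Equal weights over 50 observations: the effective sample size is 50.
w_equal <- rep(1 / 50, 50)
ess_equal <- effective_sample_size(w_equal)

# Half of the total weight on one observation: the effective sample
# size falls far below the recommended minimum of 30.
w_concentrated <- c(0.5, rep(0.5 / 49, 49))
ess_concentrated <- effective_sample_size(w_concentrated)

ess_equal
ess_concentrated
```

The more the weights concentrate on a few observations, the smaller the effective sample size becomes, even though the number of observations is unchanged.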
The cohort argument is the date of treatment of the observation. By default it expects Inf for never-treated individuals. You can override this behavior by passing another value to nevertreated.value.
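As an illustration, suppose a data set codes never-treated units with 0 in the cohort column. One option is to recode that column to Inf before estimation, as in the base R sketch below. The data frame and its column names (i, t, g) are hypothetical. According to the description above, passing nevertreated.value = 0 would be the alternative to recoding.

```r
# Hypothetical panel: 3 units, 2 periods; unit 2 is never treated
# and is coded with g = 0 in the raw data.
df <- data.frame(
  i = rep(1:3, each = 2),
  t = rep(1:2, times = 3),
  g = rep(c(2, 0, 2), each = 2)
)

# Recode the never-treated marker to Inf, the value expected by default.
df$g[df$g == 0] <- Inf

df
```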
The standard errors are computed using an alternating projection method. You can tweak its meta-parameters by changing tol and maxit.
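To show how a tolerance and an iteration cap typically govern an alternating projection scheme, the sketch below removes unit and time means from a vector by projecting on each in turn until the update is smaller than tol or maxit is reached. It is a generic illustration and not the package's implementation. The function demean_two_way and the simulated data are hypothetical.

```r
# Alternately project out unit means and time means until the largest
# change between two sweeps drops below tol, or maxit sweeps are done.
demean_two_way <- function(x, unit, time, tol = 1e-06, maxit = 100) {
  for (iter in seq_len(maxit)) {
    x_old <- x
    x <- x - ave(x, unit)  # remove unit means
    x <- x - ave(x, time)  # remove time means
    if (max(abs(x - x_old)) < tol) {
      return(list(x = x, iterations = iter, converged = TRUE))
    }
  }
  list(x = x, iterations = maxit, converged = FALSE)
}

# Hypothetical unbalanced panel: 20 units, 8 periods, 40 rows dropped.
set.seed(42)
panel <- expand.grid(unit = 1:20, time = 1:8)
panel <- panel[-sample(nrow(panel), 40), ]
x <- rnorm(nrow(panel))

res <- demean_two_way(x, panel$unit, panel$time)
res$iterations
res$converged
```

A smaller tol gives a more precise result at the cost of more sweeps, while maxit bounds the run time when convergence is slow.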
Borusyak, K., Jaravel, X., & Spiess, J. (2021). Revisiting event study designs: Robust and efficient estimation. Working paper.
Stata implementation from the authors: did_imputation (https://github.com/borusyak/did_imputation)
Another R implementation using sparse matrix inversion: didimputation (https://github.com/kylebutts/didimputation)
Antoine Mayerowitz
Maxime Gravoueille
    # Load example data
    data(did_simulated)

    # Estimate the overall average treatment effect on treated and all available pre-trends
    didImputation(y ~ 0 | i + t, cohort = 'g', OATT = TRUE, data = did_simulated)
    #> Event Study: imputation method. Dep. Var.: y
    #> Counterfactual model: y ~ 0 | i + t
    #> Observations: 1500
    #>         Estimate Std. Error    t value      Pr(>|t|)
    #> k::-4 0.05837981  0.2279555  0.2561018  7.980837e-01
    #> k::-3 0.12351493  0.2384032  0.5180927  6.048537e-01
    #> k::-2 0.20413885  0.2568542  0.7947654  4.275069e-01
    #> k::-1 0.05923100  0.2842246  0.2083950  8.350909e-01
    #> k::0  2.32785342  0.1031787 22.5613832 1.038251e-112
    #> ...... 0 rows not shown.

    # Estimate the full dynamic model
    didImputation(y ~ 0 | i + t, cohort = 'g', data = did_simulated)
    #> Event Study: imputation method. Dep. Var.: y
    #> Counterfactual model: y ~ 0 | i + t
    #> Observations: 1500
    #>      Estimate Std. Error  t value     Pr(>|t|)
    #> k::0 0.973520 0.09719356 10.01630 1.292486e-23
    #> k::1 2.085544 0.10992597 18.97226 2.892152e-80
    #> k::2 2.990625 0.14310200 20.89856 5.518751e-97
    #> k::3 3.980677 0.18913808 21.04641 2.466738e-98
    #> k::4 4.774795 0.25861218 18.46315 4.087941e-76
    #> ...... 4 rows not shown.

    # Estimate positive (lags) coefficients
    didImputation(y ~ 0 | i + t, cohort = 'g', coef = 0:Inf, data = did_simulated)
    #> Event Study: imputation method. Dep. Var.: y
    #> Counterfactual model: y ~ 0 | i + t
    #> Observations: 1500
    #>      Estimate Std. Error  t value     Pr(>|t|)
    #> k::0 0.973520 0.09719356 10.01630 1.292486e-23
    #> k::1 2.085544 0.10992597 18.97226 2.892152e-80
    #> k::2 2.990625 0.14310200 20.89856 5.518751e-97
    #> k::3 3.980677 0.18913808 21.04641 2.466738e-98
    #> k::4 4.774795 0.25861218 18.46315 4.087941e-76
    #> ...... 0 rows not shown.

    # Estimate first 3 post treatment coefficients
    didImputation(y ~ 0 | i + t, cohort = 'g', coef = 0:2, data = did_simulated)
    #> Event Study: imputation method. Dep. Var.: y
    #> Counterfactual model: y ~ 0 | i + t
    #> Observations: 1500
    #>      Estimate Std. Error  t value     Pr(>|t|)
    #> k::0 0.973520 0.09719356 10.01630 1.292486e-23
    #> k::1 2.085544 0.10992597 18.97226 2.892152e-80
    #> k::2 2.990625 0.14310200 20.89856 5.518751e-97
    #> ...... -2 rows not shown.

    # Return only point estimates
    didImputation(y ~ 0 | i + t, cohort = 'g', with.se = FALSE, data = did_simulated)
    #> Event Study: imputation method. Dep. Var.: y
    #> Counterfactual model: y ~ 0 | i + t
    #> Observations: 1500
    #>      Estimate Std. Error t value Pr(>|t|)
    #> k::0 0.973520         NA      NA       NA
    #> k::1 2.085544         NA      NA       NA
    #> k::2 2.990625         NA      NA       NA
    #> k::3 3.980677         NA      NA       NA
    #> k::4 4.774795         NA      NA       NA
    #> ...... 4 rows not shown.