Two-way clustering in Stata

How does one cluster standard errors two ways in Stata? This question comes up frequently in time series panel data (i.e. where data are organized by unit ID and time period) but can come up in other data with panel structure as well (e.g. firms by industry and region).

I’ll first show how two-way clustering does not work in Stata. I have seen this occasionally in practice, so I think it’s important to get it out of the way.

The standard regress command in Stata only allows one-way clustering. Getting around that restriction, one might be tempted to

Create a group identifier for the interaction of your two levels of clustering
Run regress and cluster by the newly created group identifier

## 
## 
## . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . egen double_cluster=group(idcode year)
## 
## . regress ln_wage age i.race, vce(cluster double_cluster)
## 
## Linear regression                               Number of obs     =     28,510
##                                                 F(3, 28509)       =     905.75
##                                                 Prob > F          =     0.0000
##                                                 R-squared         =     0.0946
##                                                 Root MSE          =     .45494
## 
##                     (Std. Err. adjusted for 28,510 clusters in double_cluster)
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          age |   .0196731   .0004233    46.48   0.000     .0188435    .0205028
##              |
##         race |
##       black  |  -.1377638   .0059505   -23.15   0.000    -.1494271   -.1261006
##       other  |   .0666999   .0284081     2.35   0.019     .0110187    .1223812
##              |
##        _cons |   1.141686    .012024    94.95   0.000     1.118119    1.165254
## ------------------------------------------------------------------------------

What goes wrong here? You can see already that something is off because the number of clusters is the same as the number of observations. Since, in this dataset, the combination of idcode and year uniquely identifies each observations, the above approach effectively does not cluster at all. Instead, it gives you heteroskedasticity-robust standard errors, which are typically too small. You can check this by comparing to the output the same regression as above but with the robust option.

## 
## 
## . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . regress ln_wage age i.race, robust
## 
## Linear regression                               Number of obs     =     28,510
##                                                 F(3, 28506)       =     905.75
##                                                 Prob > F          =     0.0000
##                                                 R-squared         =     0.0946
##                                                 Root MSE          =     .45494
## 
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          age |   .0196731   .0004233    46.48   0.000     .0188435    .0205028
##              |
##         race |
##       black  |  -.1377638   .0059505   -23.15   0.000    -.1494271   -.1261006
##       other  |   .0666999   .0284081     2.35   0.019     .0110187    .1223812
##              |
##        _cons |   1.141686    .012024    94.95   0.000     1.118119    1.165254
## ------------------------------------------------------------------------------

So how does two-way clustering in Stata work then? There are a couple of user-written commands that one can use. I recommend reghdfe by Sergio Correia because it is extremely versatile. Give him credit for it if you use the command! Other good options are ivreg2 by Baum, Schaffer and Stillman or cgmreg by Cameron, Gelbach and Miller.

One issue with reghdfe is that the inclusion of fixed effects is a required option. Sometimes you want to explore how results change with and without fixed effects, while still maintaining two-way clustered standard errors. A shortcut to make it work in reghdfe is to absorb a constant. In the example above:

## 
## 
## . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . gen temp = 1
## 
## . reghdfe ln_wage age i.race, absorb(temp) cluster(idcode year)
## (converged in 1 iterations)
## 
## HDFE Linear regression                            Number of obs   =     28,510
## Absorbing 1 HDFE group                            F(   3,     14) =      99.06
## Statistics robust to heteroskedasticity           Prob > F        =     0.0000
##                                                   R-squared       =     0.0946
##                                                   Adj R-squared   =     0.0945
## Number of clusters (idcode)  =      4,710         Within R-sq.    =     0.0946
## Number of clusters (year)    =         15         Root MSE        =     0.4549
## 
##                            (Std. Err. adjusted for 15 clusters in idcode year)
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          age |   .0196731   .0014594    13.48   0.000     .0165431    .0228032
##              |
##         race |
##       black  |  -.1377638   .0133762   -10.30   0.000     -.166453   -.1090747
##       other  |   .0666999   .0664563     1.00   0.333    -.0758347    .2092346
## ------------------------------------------------------------------------------
## 
## Absorbed degrees of freedom:
## ---------------------------------------------------------------+
##  Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
## -------------+-------------------------------------------------|
##         temp |            1               1              0     | 
## ---------------------------------------------------------------+

Compared to the initial incorrect approach, correctly two-way clustered standard errors differ substantially in this example.

What goes on at a more technical level is that two-way clustering amounts to adding up standard errors from clustering by each variable separately and then subtracting standard errors from clustering by the interaction of the two levels, see Cameron, Gelbach and Miller for details. The incorrect group ID approach only computes the interaction part.

6 thoughts on “Two-way clustering in Stata”

Luis Schmidt 1. November 2018 at 1:48

Hallo Tom! Vielen Dank fuer den Text, es hat mich sehr geholfen. Ich habe aber gemerkt, dass man kann auch noabsorb schreiben anstatt temp=1 zu erstellen. Gruss aus Brasilien. Luis
1. Tom Zimmermann Post author13. November 2018 at 9:42
  
  Danke fuer den Tipp, die Option kannte ich nicht!
Kevin 24. November 2018 at 23:49

Hi Tom – Thanks for this illustration. Quick follow up: do we still need to worry about the small number of clusters (in this case the small number of clusters for years)? Would we still need to do Wild bootstrap (or something similar) as Cameron, Gelbach, Miller recommend in their other work?
Thanks!!!
1. Tom Zimmermann Post author26. November 2018 at 10:17
  
  Hi Kevin,
  yes, with a small number of clusters (here: years), you still need to worry about consistency of standard error estimates. Thanks for pointing that out. I’m not sure reghdfe provides an option for the Wild Bootstrap. Other commands might. In any case, the Wild Bootstrap is not your only option, Angrist and Pischke discuss alternatives in chapter 8 of Mostly Harmless Econometrics.
  Best,
  Tom
Sergio 8. February 2019 at 6:13

Hi Tom!

Just wanted to point out that newer versions of reghdfe include the “noabsorb” (noa) option that will just add a normal constant.

Also, to run wild bootstraps you can use the boottest Stata package [1] that David Roodman and coauthors have recently published.

Cheers from DC!
Sergio

[1] http://qed.econ.queensu.ca/working_papers/papers/qed_wp_1406.pdf
1. Tom Zimmermann Post author8. February 2019 at 9:20
  
  Hi Sergio,
  
  great, thanks for letting me know! I just updated my reghdfe version and it works like a charm 🙂
  
  Best,
  Tom

Comments are closed.