Supplementary Material to “The Effect of a Reduction in the Opening Hours of Polling Stations on Turnout”

As indicated in the main text, this appendix describes the robustness checks in more detail.

First, a concern in difference-in-differences designs is that the control group may fail to reproduce the counterfactual evolution of the outcome that the treated units would have followed in the absence of the intervention. To address this issue, data-driven procedures to construct a synthetic control group have recently been developed. Although the common time trend assumption looks plausible in the present setting, I use a synthetic control group here to evaluate whether my results are robust to variations in the control group. Specifically, I use entropy balancing (Hainmueller, 2012), which re-weights the data to achieve balance in the means of pre-treatment characteristics across the treatment and control groups. As pre-treatment characteristics, I choose turnout in the three pre-treatment periods 1979, 1984, and 1989. Thus, the weights are chosen such that the treatment and control groups have exactly the same turnout level (and thus the same time trend in turnout) before the policy change in 1994. I then run the following weighted first-difference regression for the year 1994 (see Marcus, 2013 for a similar application of entropy balancing in combination with difference-in-differences):

$\Delta Turnout_{i,1994} = \alpha + \beta D_i + \varepsilon_i$   (4)

with weights corresponding to those generated by the entropy balancing approach. Thus, the resulting weighted OLS estimate is given by $(X' \mathrm{diag}(w) X)^{-1} X' \mathrm{diag}(w) \Delta y$, where $X$ is an (N×2) matrix in which the first column is a vector of 1s and the second column consists of the treatment group indicator $D_i$, and $\Delta y$ is the vector of first-differenced turnout. Finally, $w$ is the vector of weights generated by entropy balancing. Table A1 shows that entropy balancing indeed leads to the same pre-treatment turnout for the treatment group and the synthetic control group. Column 1 of Table A2 shows the results of the entropy balancing approach. The point estimate is even slightly larger in magnitude than before and still highly significant. Column 2 shows a placebo estimate from the entropy balancing approach, i.e., I assume the year 1989 to be the treatment year. In this case, the estimate of interest is close to zero and insignificant, as it should be.[1]
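The two steps described above (re-weighting the control group, then running a weighted regression on a constant and the treatment indicator) can be sketched as follows. This is a minimal illustration, not the estimation code behind Table A2: all numbers are invented, the function names `entropy_balance` and `wls_constant_and_dummy` are hypothetical, and the sketch balances a single pre-treatment turnout variable, whereas the analysis above balances three.

```python
import math

def entropy_balance(x_control, target_mean, tol=1e-12, max_iter=100):
    """Control-unit weights (summing to 1) whose weighted mean of x equals
    target_mean while staying as close as possible to uniform weights in the
    entropy sense. One covariate only; Newton's method on the dual problem."""
    lam = 0.0
    for _ in range(max_iter):
        raw = [math.exp(lam * (x - target_mean)) for x in x_control]
        total = sum(raw)
        w = [r / total for r in raw]
        # Gradient of the dual: weighted control mean minus the target.
        grad = sum(wi * (x - target_mean) for wi, x in zip(w, x_control))
        if abs(grad) < tol:
            break
        # Hessian of the dual: weighted variance of x under the current weights.
        hess = sum(wi * (x - target_mean) ** 2 for wi, x in zip(w, x_control)) - grad ** 2
        lam -= grad / hess
    return w

def wls_constant_and_dummy(y, d, w):
    """Weighted OLS of y on a constant and a 0/1 indicator d, i.e. the
    closed-form solution of the 2x2 weighted normal equations."""
    sw = sum(w)
    swd = sum(wi * di for wi, di in zip(w, d))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swdy = sum(wi * di * yi for wi, di, yi in zip(w, d, y))
    det = sw * swd - swd ** 2
    slope = (sw * swdy - swd * swy) / det
    intercept = (swy - slope * swd) / sw
    return intercept, slope

# Invented data: pre-treatment turnout and the change in turnout per county.
pre_control = [0.74, 0.76, 0.78, 0.80, 0.82, 0.84]
pre_treated = [0.80, 0.82]
d_turnout_control = [-0.01, -0.02, -0.01, -0.03, -0.02, -0.02]
d_turnout_treated = [-0.05, -0.04]

target = sum(pre_treated) / len(pre_treated)
w_control = entropy_balance(pre_control, target)

y = d_turnout_treated + d_turnout_control
d = [1] * len(d_turnout_treated) + [0] * len(d_turnout_control)
w = [1.0] * len(d_turnout_treated) + w_control  # treated units keep weight 1
intercept, slope = wls_constant_and_dummy(y, d, w)
print("balanced control mean:", sum(wi * x for wi, x in zip(w_control, pre_control)))
print("weighted estimate:", slope)
```

With only a constant and a binary indicator on the right-hand side, the weighted slope is simply the mean change among treated units minus the weighted mean change among control units.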

In column 3 of Table A2, I show the results of a weighted regression. Specifically, this regression weights the observations by the number of eligible voters in a county; it thus gives counties with a larger number of eligible voters more weight. This makes sense in the present setting for two reasons. First, non-weighted regressions give causal effects averaged over political units, but not over the number of voters involved. Second, and more importantly, weighting might increase the precision of the regression estimates. This is because turnout is calculated as the number of votes divided by the number of eligible voters. When the number of eligible voters is high, these rates might be more precisely measured and thus more reliable. Column 3 shows that the point estimate and inferences are essentially unchanged.
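Both reasons can be illustrated with a small numerical sketch. The three counties below are invented, and the standard error formula rests on a strong simplifying assumption that is not made in the text, namely that each eligible voter's participation is an independent draw with the same probability.

```python
import math

# Hypothetical counties: (votes cast, eligible voters)
counties = [(8_000, 10_000), (30_000, 50_000), (140_000, 200_000)]

turnout = [votes / eligible for votes, eligible in counties]
eligible = [e for _, e in counties]

# Average over political units: every county counts equally.
unweighted_mean = sum(turnout) / len(turnout)

# Average over voters: counties weighted by their number of eligible voters.
weighted_mean = sum(t * e for t, e in zip(turnout, eligible)) / sum(eligible)

# Weighting by eligible voters reproduces turnout in the pooled electorate.
aggregate_turnout = sum(v for v, _ in counties) / sum(eligible)

# Binomial standard error of each county's turnout rate (assumption: i.i.d. voters).
rate_se = [math.sqrt(t * (1 - t) / e) for t, e in zip(turnout, eligible)]

print(unweighted_mean, weighted_mean, aggregate_turnout)
print(rate_se)
```

In this toy example the rate of the largest county has the smallest standard error, which is the sense in which rates based on many eligible voters are more reliable.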

Column 4 runs the baseline regression (2) with municipality-level data. As explained, doing so attenuates concerns about potential aggregation bias in county-level data. Indeed, the results show that neither the point estimate nor the inferences change to any significant extent. Moreover, when I compare, with the municipality-level data at hand, clustering at the municipality level with clustering at the county level, standard errors are only slightly larger in the latter case (0.0042 compared to 0.0038). Thus, clustering at relatively low levels of aggregation appears to be unproblematic.

Another robustness check is to restrict the sample to those counties that are situated at the border between Saarland and Rhineland-Palatinate. The idea is that these counties are more likely to be subject to the same shocks, especially because the state border between Saarland and Rhineland-Palatinate is highly non-linear (as can be seen in Figure 1). For example, there might be weather shocks that lead to turnout differences between Rhineland-Palatinate and Saarland. Focusing on border counties makes this concern less relevant. It also reduces the concern about unobserved, group-specific effects in general. Column 5 of Table A2 shows the results. The point estimate stays almost unchanged, which suggests that restricting the sample to border counties does not change the baseline results. The estimate is, however, no longer significant at conventional levels (p-value: 0.178): because the number of observations falls sharply (to less than one fourth of the total observations), the standard errors more than double.

Column 6 investigates whether the results are robust to outliers. Specifically, I exclude all observations identified as outliers by Hadi’s (1992) multivariate procedure. As shown, applying this procedure excludes 70 observations from the baseline sample. Nonetheless, the results remain unchanged.

Column 7 of Table A2 includes the additional control variables that are not available for the entire pre-treatment period. Of the additional control variables, only the vote share of the SPD (obtained at the last local election) and the proportion of people above the age of 65 are significantly associated with turnout. The main point estimate of interest is even much larger than in the baseline specification, but marginally insignificant (p-value: 0.117), presumably because the smaller number of observations causes an increase in the standard errors (standard errors are about four times larger than in the baseline specification).

Because turnout is a fractional variable that must lie between zero and one, employing turnout as the dependent variable might cause problems if turnout is close to one of the two boundaries. Therefore, I use a log-odds transformation of turnout in column 8. The inferences stay unchanged. However, with this log-of-the-odds transformation, the marginal effect of opening hours on turnout varies with the level of the dependent variable (Revelli, 2016). Specifically, the effect of reduced opening hours on turnout is -0.031 when turnout is 50%, -0.027 when turnout is 70%, and -0.012 when turnout is 90%. In column 9, I include county-specific time trends instead of state-specific time trends. In this case, too, the results remain unchanged.
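To see how a single coefficient on the log-odds scale translates into level-dependent effects on turnout, the following sketch shifts the log-odds of a baseline turnout rate by the column 8 coefficient from Table A2 (-0.124) and converts back. Treating the reported effects as this kind of discrete change is an assumption on my part about how they can be reproduced; the helper names are hypothetical, and the coefficient is used as rounded in the table.

```python
import math

def logit(p):
    """Log of the odds of a rate p in (0, 1)."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Map log-odds back to a rate in (0, 1)."""
    return 1 / (1 + math.exp(-z))

BETA = -0.124  # coefficient on reduced opening hours, column 8 of Table A2

def effect_on_turnout(baseline, beta=BETA):
    """Change in turnout when the log-odds of `baseline` shift by `beta`."""
    return inv_logit(logit(baseline) + beta) - baseline

effects = {p: round(effect_on_turnout(p), 3) for p in (0.5, 0.7, 0.9)}
print(effects)
```

The effect shrinks as baseline turnout approaches one because the inverse log-odds function flattens near the boundaries.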

Column 10 performs another placebo exercise. Specifically, I randomly draw 18 of the 36 counties in Rhineland-Palatinate to be treated from 1994 onward, while the other 18 counties of Rhineland-Palatinate serve as placebo controls. This robustness check addresses the possibility that the large effect found in the baseline analysis is a statistical fluke that would also show up when there is no change in opening hours. If such large effects do not arise simply by chance, then one would expect this placebo treatment effect to be virtually zero. Indeed, this is the case.
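The random assignment behind this placebo exercise can be sketched as follows. The county labels, the seed, and the helper `placebo_indicator` are hypothetical; the sketch only shows how a placebo treatment indicator of this kind could be constructed, not the draw actually used for column 10.

```python
import random

rng = random.Random(0)  # arbitrary seed, fixed only to make the draw reproducible

# Hypothetical labels for the 36 counties of Rhineland-Palatinate.
counties = [f"RLP_{i:02d}" for i in range(1, 37)]

# Draw 18 counties without replacement as placebo-treated; the rest are controls.
placebo_treated = set(rng.sample(counties, 18))
placebo_control = [c for c in counties if c not in placebo_treated]

def placebo_indicator(county, year):
    """1 for placebo-treated counties in elections from 1994 onward, else 0."""
    return int(county in placebo_treated and year >= 1994)

election_years = [1979, 1984, 1989, 1994, 1999]
panel = [(c, y, placebo_indicator(c, y)) for c in counties for y in election_years]
print(len(panel), sum(flag for _, _, flag in panel))
```

The resulting panel has 36 × 5 = 180 county-year observations, which matches the number of observations reported for column 10.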

Finally, I have implemented a jackknife analysis in which I exclude each county in turn from the estimation to probe whether the results are driven by idiosyncratic circumstances in single counties. I find this not to be the case. No matter which county is excluded from the sample, the effect of reduced opening hours is always significantly different from zero. P-values are always below 0.02. Results are available upon request.
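The leave-one-out logic of this jackknife can be sketched with a deliberately simplified estimator. The data are invented, and `did_estimate` is a hypothetical two-period difference in mean changes, which stands in for the full regression specification used in the paper.

```python
def did_estimate(rows):
    """Difference-in-differences from two periods: mean change in turnout among
    treated counties minus mean change among control counties."""
    treated = [post - pre for _, is_treated, pre, post in rows if is_treated]
    control = [post - pre for _, is_treated, pre, post in rows if not is_treated]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Invented data: (county, treated?, turnout before, turnout after)
rows = [
    ("A", True, 0.80, 0.76),
    ("B", True, 0.82, 0.79),
    ("C", True, 0.81, 0.78),
    ("D", False, 0.77, 0.76),
    ("E", False, 0.78, 0.77),
    ("F", False, 0.75, 0.75),
]

full_sample = did_estimate(rows)

# Re-estimate the effect once per county, each time leaving that county out.
jackknife = {
    name: did_estimate([r for r in rows if r[0] != name])
    for name, *_ in rows
}

print(full_sample)
for name, estimate in jackknife.items():
    print(name, estimate)
```

If a single county drove the result, the estimate obtained without that county would differ markedly from the others.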

References

Hadi, A. S. (1992). Identifying Multiple Outliers in Multivariate Data, Journal of the Royal Statistical Society: Series B, 54(3), 761-771.

Hainmueller, J. (2012). Entropy Balancing: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies, Political Analysis, 20(1), 25-46.

Marcus, J. (2013). The Effect of Unemployment on the Mental Health of Spouses – Evidence from Plant Closures in Germany, Journal of Health Economics, 32(3), 546-558.

Revelli, F. (2016). Tax Limits and Local Elections, Public Choice, 166(1), 53-68.

Table A1: Results from entropy balancing

Rhineland-Palatinate / Saarland
Before weighting
Turnout in 1979 / 0.7811 / 0.8298
Turnout in 1984 / 0.7603 / 0.8026
Turnout in 1989 / 0.7702 / 0.8089
After weighting
Turnout in 1979 / 0.8298 / 0.8298
Turnout in 1984 / 0.8027 / 0.8026
Turnout in 1989 / 0.8089 / 0.8089

Table A2: Robustness checks

(1) / (2) / (3) / (4) / (5) / (6) / (7) / (8) / (9) / (10)
VARIABLES / / / Turnout / Turnout / Turnout / Turnout / Turnout / Transformed Turnout / Turnout / Turnout
Reduced Opening Hours / -0.026*** / -0.004 / -0.018** / -0.016*** / -0.014 / -0.023*** / -0.031 / -0.124*** / -0.024*** / 0.001
(0.003) / (0.006) / (0.008) / (0.004) / (0.010) / (0.005) / (0.019) / (0.043) / (0.008) / (0.005)
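Proportion of young, 0-15 / / / / / / / 0.002 / / /
/ / / / / / (0.039) / / /
Proportion of old, 65+ / / / / / / / 1.543** / / /
/ / / / / / (0.616) / / /
Vote share CDU / / / / / / / 0.026 / / /
/ / / / / / (0.079) / / /
Vote share SPD / / / / / / / 0.339*** / / /
/ / / / / / (0.116) / / /
Vote share Green / / / / / / / -0.193 / / /
/ / / / / / (0.243) / / /
Vote share FDP / / / / / / / 0.253 / / /
/ / / / / / (0.468) / / /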
Observations / 42 / 42 / 210 / 1,318 / 50 / 140 / 126 / 210 / 210 / 180
R-squared / 0.914 / 0.431 / 0.868 / 0.892 / 0.966 / 0.968 / 0.967 / 0.966 / 0.976 / 0.938
Period / 1994 / 1989 / 1979-1999 / 1979-1999 / 1979-1999 / 1979-1999 / 1989-1999 / 1979-1999 / 1979-1999 / 1979-1999
State-specific trend / NO / NO / YES / YES / YES / YES / YES / YES / NO / YES
Robustness check / Entropy Balancing / Entropy Balancing Placebo / Weighted Regression / Municipality Level / Neighbor Sample / Outliers Removed / Additional Controls / Turnout Transf. / County-specific trend / Placebo Reform in RLP

Except for column 4, all regressions include the control variables from Table 2. Columns 1 and 2 use conventional, unclustered standard errors. Column 4 uses standard errors clustered at the municipality level. In all other columns, standard errors are clustered at the county level. *Significant at the 10 percent level, **Significant at the 5 percent level, ***Significant at the 1 percent level.

Table A3: Alternative ways to deal with standard errors in the Difference-in-Differences design

Baseline / Driscoll-Kraay standard errors / Collapsed data / FGLS with common autocorrelation / FGLS with panel-specific autocorrelation / Clustering at state-year level
Reduced Opening Hours / -0.020*** / -0.020*** / -0.034*** / -0.014*** / -0.016*** / -0.020***
(0.005) / (0.005) / (0.009) / (0.005) / (0.006) / (0.003)
Observations / 210 / 210 / 84 / 210 / 210 / 210

All regressions include, but do not report, municipality- and year-fixed effects. All columns except column (3) also include the control variables shown in Table 3. Column (1) uses the standard errors from the baseline procedure. Column (2) uses Driscoll-Kraay standard errors. Column (3) uses conventional, unclustered standard errors. Column (4) implements FGLS estimation in which the autocorrelation coefficient in the error term is assumed to be identical across counties. Column (5) implements FGLS estimation with an autocorrelation coefficient in the error term that can differ across counties. Column (6) clusters at the state-year level. *Significant at the 10 percent level, **Significant at the 5 percent level, ***Significant at the 1 percent level.

[1] In this case, of course, the data are weighted in such a way that only the 1979 and 1984 turnouts are the same in the treatment and (synthetic) control groups.