Supplementary Material

References for the Regional Climate Models used in NARCCAP

CRCM:

Caya, D., and R. Laprise (1999) A semi-Lagrangian semi-implicit regional climate model: The Canadian RCM. Mon. Wea. Rev., 127: 341-362.

MM5:

Grell, G., Dudhia, J. and Stauffer, D. R. (1993) A description of the fifth generation Penn State/NCAR Mesoscale Model (MM5). NCAR Tech. Note NCAR/TN-398, 1A, 107 pp.

HadRM3:

Jones, R. G. et al. (2004) Workbook on Generating High Resolution Climate Change Scenarios using PRECIS. UNDP: New York.

RegCM3:

Giorgi, F., M. R. Marinucci, and G. T. Bates (1993a) Development of a second generation regional climate model (RegCM2): Boundary layer and radiative transfer processes, Mon. Wea. Rev. 121:2794–2813.

Giorgi, F., M. R. Marinucci, G. de Canio, and G. T. Bates (1993b) Development of a second generation regional climate model (RegCM2): Convective processes and assimilation of lateral boundary conditions, Mon. Wea. Rev. 121:2814–2832.

Pal, J. S. et al. (2007) Regional climate modeling for the developing world - The ICTP RegCM3 and RegCNET. Bulletin of the American Meteorological Society 88: 1395-.

ECPC RSM:

Juang, H., S. Hong, and M. Kanamitsu (1997) The NMC nested regional spectral model. An update. Bull. Amer. Meteor. Soc. 78:2125-2143.

WRF:

Skamarock, W.C. et al. (2005) A description of the Advanced Research WRF Version 2. NCAR Technical Note NCAR/TN-468+STR, 88 pp.

Sub-Regionalization of the Domain

This sub-regionalization was created to aid in the analysis of NARCCAP simulations within the North American domain (Bukovsky, 2011). It is, in essence, a simplification of the terrestrial ecoregions provided in Ricketts et al. (1999); over the U.S., it closely follows the regions used by NEON for consistency (National Ecological Observatory Network, 2010). Eco-climatic zones are sensitive to variations in temperature and precipitation, and were judged to be a good proxy for areas of similar regional climatology. See Figure S1 for region boundaries and names.

More Detailed Description of the ANOVA Model and Additional Statistical Comments

Our approach is based on a random effects analysis of variance (ANOVA) model. Letting Y_{ij} denote the climate change measurement for the ith RCM and jth GCM, the mathematical form of the ANOVA model is given by

Y_{ij} = m + a_i + b_j + e_{ij},

where m represents an overall mean, a_i represents the contribution from the RCM, b_j represents the contribution from the GCM, and e_{ij} represents an error or residual term. It is assumed that each a_i has a Gaussian distribution with zero mean and variance s_a^2, each b_j has a Gaussian distribution with zero mean and variance s_b^2, and each e_{ij} has a Gaussian distribution with zero mean and variance s^2. It is further assumed that the effects a_i, b_j, and e_{ij} are mutually independent.
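As an illustration of the model just defined, the following minimal sketch draws a complete RCM-by-GCM table from it. The function name, the table size, and all parameter values are hypothetical choices made for this example only and are not taken from the NARCCAP analysis.

```python
import numpy as np

def simulate_anova(n_rcm, n_gcm, m, s_a, s_b, s, rng):
    """Draw a full table Y[i, j] = m + a[i] + b[j] + e[i, j].

    s_a, s_b, and s are standard deviations (square roots of the
    RCM, GCM, and residual variances). All values are illustrative.
    """
    a = rng.normal(0.0, s_a, size=n_rcm)          # RCM (row) effects
    b = rng.normal(0.0, s_b, size=n_gcm)          # GCM (column) effects
    e = rng.normal(0.0, s, size=(n_rcm, n_gcm))   # residual terms
    # Broadcasting adds the row effect across columns and the column
    # effect down rows.
    return m + a[:, None] + b[None, :] + e

rng = np.random.default_rng(0)
# Hypothetical 6 x 4 table with made-up parameter values.
Y = simulate_anova(6, 4, m=2.0, s_a=0.5, s_b=1.0, s=0.2, rng=rng)
```

In this sketch every cell of the table is filled; in an incomplete design only a subset of the cells would be retained.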

Consider organizing the Y_{ij} in a table with rows representing the RCMs and columns representing the GCMs. Due to the NARCCAP design, some of the cells in the table will not have values. The a_i can be thought of as a row effect; that is, a_i represents something that is common to each of the values in a particular row, which holds the RCM-GCM combinations run with a common RCM. The b_j can be thought of similarly for the columns; that is, b_j represents something that is common to each of the values in a particular column, which holds the RCM-GCM combinations run with a common GCM. It is unlikely that the values in the table are determined exactly by the overall mean m and the row (a_i) and column (b_j) effects. The e_{ij} then represent any deviations.

The parameters s_a^2 and s_b^2 control how different the values are from row to row (RCM to RCM) and from column to column (GCM to GCM), respectively. As this model partitions the overall variability of the Y_{ij} in the table into three components (s_a^2 + s_b^2 + s^2), a large value of s_a^2 would suggest that the choice of RCM plays an important role in determining the values of the Y_{ij}. A similar interpretation can be given to s_b^2 for the GCM. A large value of s^2 relative to the other two could suggest either (1) the models are responding to the climate change scenario in very similar ways or (2) the natural variability of the climate system in the models is large enough to overwhelm the impact of any particular choice of RCM or GCM.
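One simple way to read the three-way partition described above is as fractions of the total variance. The sketch below is only an illustration of that arithmetic; the function name and the dictionary keys are hypothetical, and the inputs are the RCM, GCM, and residual variances.

```python
def variance_shares(var_a, var_b, var_e):
    """Return each variance component as a fraction of the total.

    var_a: RCM variance, var_b: GCM variance, var_e: residual variance.
    """
    total = var_a + var_b + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        "RCM": var_a / total,
        "GCM": var_b / total,
        "residual": var_e / total,
    }

# Made-up values: the GCM component accounts for half of the total.
shares = variance_shares(1.0, 2.0, 1.0)
```

A large "RCM" share would correspond to the case in which the choice of RCM plays an important role, and similarly for the other two components.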

The statistical model outlined above does not include a term for an interaction between RCM and GCM, because the model structure is limited by the experimental design and by the considerations that determined that design. If such interactions were suspected, a more in-depth analysis would be necessary, possibly including additional model runs.

The parameters in the ANOVA model are fit via maximum likelihood. The random effects ANOVA model implies that the joint distribution of the Y_{ij} is multivariate normal, with each Y_{ij} having a common mean m. The covariance matrix is more complicated. The variances are given by s_a^2 + s_b^2 + s^2, while the covariance between two distinct combinations is s_a^2 if an RCM is shared, s_b^2 if a GCM is shared, and 0 otherwise. In other words, RCM-GCM combinations are correlated if they use the same RCM or boundary conditions from the same GCM.
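The covariance structure described above can be written down directly. The following is a minimal sketch, assuming each run is identified by a hypothetical (RCM index, GCM index) pair; the function names are illustrative, and the log-likelihood is the standard multivariate normal one rather than the specific fitting code used for this work.

```python
import numpy as np

def anova_cov(pairs, var_a, var_b, var_e):
    """Covariance matrix of the runs listed in `pairs`.

    pairs: list of (rcm_index, gcm_index) tuples, one per run.
    Two runs share var_a if they use the same RCM and var_b if they
    use the same GCM; the diagonal is var_a + var_b + var_e.
    """
    rcm = np.array([p[0] for p in pairs])
    gcm = np.array([p[1] for p in pairs])
    same_rcm = rcm[:, None] == rcm[None, :]
    same_gcm = gcm[:, None] == gcm[None, :]
    return var_a * same_rcm + var_b * same_gcm + var_e * np.eye(len(pairs))

def log_likelihood(y, pairs, m, var_a, var_b, var_e):
    """Multivariate normal log-likelihood with common mean m."""
    cov = anova_cov(pairs, var_a, var_b, var_e)
    resid = np.asarray(y, dtype=float) - m
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (len(resid) * np.log(2.0 * np.pi) + logdet + quad)

# Hypothetical design: four runs over three RCMs and three GCMs.
pairs = [(0, 0), (0, 1), (1, 1), (2, 2)]
cov = anova_cov(pairs, var_a=1.0, var_b=2.0, var_e=0.5)
```

A maximum-likelihood fit would maximize `log_likelihood` over m and the three variances, for example with a general-purpose numerical optimizer.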

Uncertainty estimates for the variance components can be obtained and various hypothesis tests and other inferential tools can be examined. A full suite of comparisons would include testing for regions where both RCM and GCM are significant, RCM is significant but not GCM, GCM is significant but not RCM, or neither RCM nor GCM is significant. However, for our purposes in this work, these hypothesis tests would add little to the discussion and significantly increase the length of the article.

One issue is that the NARCCAP experiment is incomplete: one of the twelve pairs of runs is missing. We address this through an adaptation of the expectation-maximization algorithm. Namely, the missing values (i.e., the missing simulations) are “filled in” using the conditional expectation based on the joint distribution from the ANOVA model. To be more precise, the algorithm alternates between filling in the missing values and maximizing the likelihood to find estimates of the model parameters until convergence has been achieved.
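The "fill in" step can be sketched with the standard conditional mean of a multivariate normal distribution. The example below covers only that step for fixed parameter values; the maximization step and the convergence check are omitted, and the function names and the (RCM index, GCM index) encoding are hypothetical.

```python
import numpy as np

def anova_cov(pairs, var_a, var_b, var_e):
    """Covariance of runs given as (rcm_index, gcm_index) pairs."""
    rcm = np.array([p[0] for p in pairs])
    gcm = np.array([p[1] for p in pairs])
    same_rcm = rcm[:, None] == rcm[None, :]
    same_gcm = gcm[:, None] == gcm[None, :]
    return var_a * same_rcm + var_b * same_gcm + var_e * np.eye(len(pairs))

def fill_missing(y_obs, obs_pairs, miss_pairs, m, var_a, var_b, var_e):
    """Conditional expectation of the missing runs given the observed ones.

    Uses E[Y_miss | Y_obs] = m + C_mo C_oo^{-1} (y_obs - m), where C_oo is
    the covariance among observed runs and C_mo the covariance between
    missing and observed runs.
    """
    pairs = list(obs_pairs) + list(miss_pairs)
    cov = anova_cov(pairs, var_a, var_b, var_e)
    n_obs = len(obs_pairs)
    c_oo = cov[:n_obs, :n_obs]
    c_mo = cov[n_obs:, :n_obs]
    resid = np.asarray(y_obs, dtype=float) - m
    return m + c_mo @ np.linalg.solve(c_oo, resid)

# Toy case: one observed run, one missing run sharing the same RCM.
filled = fill_missing([3.0], [(0, 0)], [(0, 1)],
                      m=1.0, var_a=1.0, var_b=1.0, var_e=2.0)
```

In a full iteration, the parameters would be re-estimated after each fill-in and the two steps repeated until the estimates stop changing.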

When analyzing aggregate data, we also note that there is always the possibility of encountering Simpson’s paradox (i.e., the phenomenon of an association that disappears or even reverses at different levels of aggregation). Generally, this requires both the presence of a confounding variable as well as an unequal or uneven distribution of this confounding variable. The construction of the sub-regionalization of the NARCCAP domain should minimize the impact of any such confounding variables that might be present.
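A small numerical illustration of such a reversal, using entirely made-up numbers unrelated to the NARCCAP data: two hypothetical sub-areas each show a positive relationship between x and y, but pooling them yields a negative slope because the sub-areas occupy different ranges of x.

```python
import numpy as np

# Hypothetical data for two sub-areas; within each, y rises with x.
x1, y1 = np.array([1.0, 2.0, 3.0]), np.array([10.0, 11.0, 12.0])
x2, y2 = np.array([6.0, 7.0, 8.0]), np.array([3.0, 4.0, 5.0])

def slope(x, y):
    """Least-squares slope of y on x."""
    return np.polyfit(x, y, 1)[0]

slope_1 = slope(x1, y1)   # positive within sub-area 1
slope_2 = slope(x2, y2)   # positive within sub-area 2
# Pooling ignores sub-area membership (the confounding variable here).
slope_pooled = slope(np.concatenate([x1, x2]), np.concatenate([y1, y2]))
```

Here `slope_1` and `slope_2` are both positive while `slope_pooled` is negative, which is the kind of reversal that aggregation can produce.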

Further, it is an open research question at what spatial scales the RCM would dominate the GCM and vice versa. The fact that we see such results at the level of the regionalization used here is important, as understanding the impact of RCM versus GCM on such aggregate regions is of interest to climate modelers, impacts researchers, and others.

However, some caution is needed to avoid an “ecological fallacy.” That is, it is important not to ascribe the results for a region to every grid box that makes up that region. It is entirely possible that a regional result suggesting a dominant RCM component of the variability could include individual grid boxes where the GCM is the dominant component, and vice versa.

Finally, it is possible that a similar analysis could be performed on other spatial scales, including a fully spatial analysis at the grid-box level. We consider this work in progress and beyond the scope of this paper. However, we are currently developing methodology that builds upon the work of Kaufman and Sain (2010) and Sain, Nychka, and Mearns (2010) and incorporates more computationally efficient and effective spatial covariance structures (e.g., Nychka et al., 2013) for an ANOVA-style spatial analysis of the NARCCAP ensemble.

Additional Statistical References

Kaufman, C.G. and Sain, S.R. (2010) Bayesian functional ANOVA modeling using Gaussian process prior distributions. Bayesian Analysis, 5:123-150, doi:10.1214/10-BA505.

Nychka, D., Bandyopadhyay, S., Hammerling, D., Lindgren, F., and Sain, S. (2013) A multi-resolution Gaussian process model for the analysis of large spatial data sets. Journal of Computational and Graphical Statistics (submitted).

Figure S1

Figure Legend:

Figure S1. The domain for NARCCAP and the sub-regions into which the domain was divided according to Bukovsky (2011).

Sub-regionalization References

Bukovsky, M.S. (2011) Masks for the Bukovsky regionalization of North America, Regional Integrated Sciences Collective, Institute for Mathematics Applied to Geosciences, National Center for Atmospheric Research, Boulder, CO. Downloaded 2012-06-13. [http://www.narccap.ucar.edu/contrib/bukovsky/]

Kampe, T.U., B.R. Johnson, M. Kuester, M. Keller (2010) NEON: the first continental-scale ecological observatory with airborne remote sensing of vegetation canopy biochemistry and structure. J. Appl. Remote Sensing 4:043510.

Ricketts, T.H. et al. (1999) Terrestrial ecoregions of North America: A conservation assessment. Island Press, Washington, DC, 485 pp.
