July 10, 2014

Response to Review

“Estimating North American background ozone in U.S. surface air with two independent global models: Variability, uncertainties, and recommendations”

A.M. Fiore, J.T. Oberman, M. Lin, L. Zhang, O.E. Clifton, D.J. Jacob, V. Naik, L.W. Horowitz, J.P. Pinto, G.P. Milly

Reviewer’s comments are reproduced below in italics with our responses beneath.

The manuscript "Estimating North American background ozone in U.S. surface air with two independent global models: Variability, uncertainties, and recommendations" by Fiore et al. presents results of two independent model simulations of the North American O3 background (NAB) concentration, which is defined as the virtual concentration that would result if all North American anthropogenic emissions were switched off. The manuscript is well structured and written, and the scientific results are generally presented with the required care. However, I think that the current manuscript would benefit from some additional analysis of the model performance against observations and updated simulations (if possible) with improved process representations. In addition, a series of minor changes are required as outlined below.

We thank the reviewer for the thoughtful comments. Wherever possible, we have changed the manuscript to reflect these suggestions, and we believe the manuscript is now much stronger as a result of these revisions.

Major comments

1) Additional validation: On page 9 section 3.2 it is mentioned that ’evaluation with daily O3 sondes will be important to ascertain whether the models represent the vertical structure of O3’. This is a very critical point of the model evaluation, given the important contribution of stratospheric ozone to the western US ozone background. However, such a comparison is not presented in this study, even though it does not seem to be an extraordinary additional effort. Some of the speculation on vertical transport and mixing in the tropopause region and the troposphere that is given later in the document could be avoided if a thorough comparison with ozone sondes were included. The comparison with satellite-retrieved sub-columns cannot reveal all details of the vertical profiles, since the retrievals strongly smooth them.

Unfortunately we do not have the 3D daily fields archived that are needed for this comparison. We now explicitly discuss this (L344-351):

“The diagnostics necessary to determine whether AM3 actually simulates more stratosphere-to-troposphere exchange of O3, or whether it mixes free tropospheric air (including the stratospheric component) into the planetary boundary layer more efficiently were not archived from these simulations. Future multi-model efforts should archive daily O3 profiles at sonde launch locations in order to ascertain whether the models represent the observed vertical structure of O3 throughout the troposphere and lower stratosphere, as shown for AM3 during the 2010 CalNex field campaign (Lin et al., 2012a; Lin et al., 2012b).”

Why is the presented analysis of daily variability restricted to only 4 surface sites? It would be beneficial to extend this analysis to all sites and present results like those given in Tables 3 and 4 on a map.

Table 4 was intended simply to report on the time series in Figure 7, which we felt had too much information to include directly in the panels of the time series themselves. We have now fully adopted the reviewer’s suggestion, removing Table 4 and adding new Figures 7 and 8, which show the correlations across the entire CASTNet, excluding the high-altitude CA sites to be consistent with our analysis elsewhere. We have adapted the text accordingly in Section 3.4.2 (L574-597) to discuss the new correlation figures. The unfiltered NAB data from Table 3 are shown in map form in Figure 1.

2) Updated model simulations: The authors claim to focus on different processes in the models to understand the uncertainties in the model-estimated North American background O3. Of the four processes that are discussed in detail (vertical transport, biomass burning influence, lightning NOx, and biogenic hydrocarbon emissions), the authors already know that two are not well represented in the individual models. The treatment of biomass burning emissions in AM3 is deemed to be too crude (timing, release altitude, see also below), and the lightning NOx production in the presented GEOS-Chem simulation is known to be overestimated. Since O3 chemistry is not linear, it is difficult enough to distinguish the influence of a single process on the final O3 concentrations (background or total). Hence, it seems critical to me to use the best available model representations of the mentioned processes when uncertainties in NAB are discussed. For this specific study I am wondering why the authors did not (at least) use an updated GEOS-Chem simulation with corrected lightning emissions, and why biomass burning is treated so crudely in AM3. If it were possible for the authors to redo their simulations in a moderate amount of time, I am sure the conclusions of this study would benefit.

As we now address in the introduction, this paper documents versions of the models that were used in EPA’s Integrated Science Assessment, and our detailed discussion here of their strengths and limitations is thus relevant to ongoing policy discussions (L140-144):

“A comparison of the GEOS-Chem and GFDL AM3 year 2006 simulations described here, at a measurement site in Gothic, CO, U.S.A., is included in the ISA supplemental material. Here we extend that initial analysis by examining the AM3 and GEOS-Chem NAB estimates in a fully consistent and process-oriented manner for March through August of 2006.”

While future work should indeed improve the model process representation, re-doing the entire study is not possible for us, since the students who performed the simulations have moved on. As we recommend in the manuscript, we feel that targeted process studies in multiple models, during periods when field campaign data for multiple species are available, will be most productive in narrowing uncertainties in the specific processes controlling NAB levels.

Minor comments

3) Abstract: ’Differences in model representation of these processes contribute more to uncertainty in NAB estimates than the choice of horizontal resolution within a single model’. This conclusion seems to be too general, given that it relies on the results from a single model, and even for this model differences due to resolution seem to be considerable in summer.

We have revised to, “Differences in the representations of these processes within the GFDL AM3 and GEOS-Chem models contribute more to uncertainty in NAB estimates, particularly in spring when NAB is highest, than the choice of horizontal resolution within a single model (GEOS-Chem).”

4) P3 first paragraph: For which metric is the O3 NAAQS threshold defined? Please add this information here, even if it is given later in the document.

Good point. We now state in the first paragraph of the introduction (L71-76):

“Following these reviews, the level for the O3 NAAQS has been lowered over the past decade, from 0.08 ppm in 1997 to the current level of 0.075 ppm (75 ppb) in 2008, with proposals calling for even lower levels, within a range of 60-70 ppb, on the basis of the latest health evidence (Federal Register, 2010). A location is considered to be in violation of the O3 NAAQS when the three-year average of the fourth-highest MDA8 exceeds the current 75 ppb level.”
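For readers unfamiliar with this compliance calculation, the design-value computation described above can be sketched as follows. This is a minimal illustration with synthetic data; the function names and the simplified 8-hour windowing are ours, not EPA’s exact procedure, which restricts window start hours and truncates values to integer ppb.

```python
import numpy as np

def mda8(hourly_o3):
    """Maximum daily 8-hour average (MDA8) from 24 hourly values (ppb).
    Simplified: considers all 17 complete 8-hour windows in the day."""
    windows = [np.mean(hourly_o3[i:i + 8]) for i in range(len(hourly_o3) - 7)]
    return max(windows)

def design_value(annual_mda8_series):
    """Three-year average of each year's 4th-highest MDA8 value (ppb)."""
    fourth_highest = [sorted(year, reverse=True)[3] for year in annual_mda8_series]
    return float(np.mean(fourth_highest))

# Toy example: three years of synthetic daily MDA8 values (ppb)
rng = np.random.default_rng(0)
three_years = [rng.uniform(30.0, 80.0, 365) for _ in range(3)]
violation = design_value(three_years) > 75.0  # True if the 75 ppb NAAQS is violated
```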

5) P5 last paragraph: It is unclear at this point why the annual fourth highest maximum daily average 8-hour value is reported. The information that this is the current U.S. EPA metric for assessing compliance with the NAAQS is only given on page 8. See also previous comment.

We now state (L184-185):

“A few studies report the annual fourth highest maximum daily average 8-hour (MDA8) NAB value, the metric used to assess compliance with the O3 NAAQS.”

6) P8, paragraph on emissions: It would be interesting in this context to also give some numbers on the anthropogenic emissions used outside North America, especially those of north-eastern Asia. How do these numbers differ between the two models? Also give emissions of total non-methane hydrocarbons, not just NOx.

We now provide the requested numbers for global, North American, and East Asian NOx, CO, and propane (as an example NMVOC) in Table 2 and refer to this in the text L293-294: “Global, North American, and East Asian annual emissions for 2006 are provided in Table 2.”

7) P8 CASTNet: How many sites does this network comprise, and which sites did you actually use in your comparison? Do the sites indicated in Figure 8 represent the complete network?

We use 77 sites in our analysis, as shown in the new Figures 7 and 8. These include all sites with sufficient data except for a few high-elevation sites in California at which measured ozone does not generally vary coherently with that at the Intermountain West sites. Text has been revised (L304-306):

“Our evaluation focuses on the maximum daily 8-hour average (MDA8) O3 concentrations, the statistic currently used by the U.S. EPA to assess compliance with the O3 NAAQS, at 77 CASTNet sites.”

8) P8 last sentence: Is this equal to the grid box concentration in the lowest model layer? Or is some kind of transfer function used to get O3 concentrations at the sampling height above ground? Possibly repeat the height of the lowest model levels (as given in Table 2). Furthermore, how did you treat altitude misrepresentation of surface-level sites, given the coarse horizontal resolution of the models? Did you exclude sites with large differences between their real altitude and the surface altitude in the model?

Yes, we report the grid box concentration in the lowest model layer since we only have the hourly ozone data archived from the model surface level. We now state (L307-310):

“Both models use a terrain-following sigma coordinate for near-surface layers, with the lowest layer centered at 60 m and 70 m for a column at sea level in GFDL AM3 and GEOS-Chem, respectively.”
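As a rough illustration of how a sigma value maps onto the quoted lowest-layer heights, the hypsometric relation gives the height of the level where p = sigma * p_s. This is our own back-of-the-envelope estimate under an isothermal-layer assumption, not the models’ actual vertical coordinates.

```python
import math

R_D = 287.05   # gas constant for dry air, J kg^-1 K^-1
G = 9.80665    # gravitational acceleration, m s^-2

def sigma_level_height(sigma, temperature_k=288.0):
    """Approximate height (m) above the surface of the level where
    p = sigma * p_s, assuming an isothermal layer:
    z = (R_d * T / g) * ln(1 / sigma)."""
    return (R_D * temperature_k / G) * math.log(1.0 / sigma)

# A sigma of ~0.9928 places the layer center near 60 m for a sea-level column
z_lowest = sigma_level_height(0.9928)
```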

9) P11 first paragraph: It is not clear to me whether simulated NAB or total O3 is compared with the satellite products. How much do these two differ at 500 hPa anyway? Just briefly mention it in the text. Also: How broad are the averaging kernels of TES and OMI, and how much contribution of boundary-layer O3 can you expect to see in these retrievals?

Total simulated ozone is compared with the satellite products. We have clarified this in the text and addressed the other points raised. On L381-388:

“With the exception of O3 produced within the U.S. boundary layer from CH4 or natural NMVOC and natural NOx, NAB in surface air mixes downward from the free troposphere. We use 500 hPa products retrieved from both the OMI and TES instruments aboard the NASA Aura satellite to evaluate the potential for space-based constraints on simulated mid-tropospheric total O3 distributions. Our comparison thus evaluates the reservoir of mid-tropospheric ozone, of any origin, that can mix into the planetary boundary layer. For context, the ratio of NAB to total O3 over North America at 500 hPa in AM3 ranges spatially and seasonally from <75% to 98%, with the largest ratios occurring in spring.”

We address the averaging kernels in L333-338:

“We apply the appropriate satellite averaging kernels to the model daily ozone fields for direct comparison with the retrieved satellite O3 columns (Zhang et al., 2010). While the averaging kernels for the 500 hPa retrieved product from both the TES and OMI instruments are most sensitive to the mid-troposphere, they have a broad vertical sensitivity throughout the troposphere; generally very little information is retrieved from the boundary layer (e.g., see example averaging kernels in Figure 1 of Zhang et al. (2010)).”
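The kernel application described in the quoted text follows the standard retrieval relation x_hat = x_a + A (x_model - x_a). A minimal sketch is below; the profile and kernel values are invented for illustration (and TES in practice operates in ln(VMR) space rather than the linear space shown here).

```python
import numpy as np

def apply_averaging_kernel(x_model, x_apriori, A):
    """Smooth a model profile as the retrieval would see it:
    x_hat = x_a + A @ (x_model - x_a)."""
    return x_apriori + A @ (x_model - x_apriori)

# Toy 3-level example (surface, mid-troposphere, upper troposphere); values in ppb
x_a = np.array([30.0, 50.0, 80.0])   # a priori O3
x_m = np.array([40.0, 70.0, 90.0])   # model O3
A = np.array([[0.10, 0.05, 0.00],    # weak boundary-layer sensitivity
              [0.20, 0.60, 0.10],    # kernel row peaking at mid-troposphere
              [0.00, 0.10, 0.30]])
x_hat = apply_averaging_kernel(x_m, x_a, A)
```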

10) P18, first paragraph: I don’t think that this example is very conclusive. The ’stratospheric’ event is much less pronounced in the observations than in the simulations, but then it continues much longer than the simulated event. Additional observed parameters, if available, may actually be of help to identify the period of stratospheric influence from the observations (e.g. carbon monoxide or water vapor).

We have removed this example.

11) P18, section 3.4.2: Here the total amounts of biomass burning emissions need to be discussed. They are not given in your Table 2 or elsewhere. The timing is another important issue. Since AM3 is using a climatology, the emissions may simply take place at the wrong time and are probably smoothed out in time (and space). The smoothing will lead to an overestimation of the plume dispersion and most likely contribute to overestimates in O3 due to increased production efficiency in diluted plumes. The same could be said about GEOS-Chem. Why not use daily emission fields, which are available from GFED as well?

Biomass burning NOx emission totals are now provided in Table 2, and we refer to Table 2 in this paragraph (L716-719):

“There are several EUS events during spring and summer where AM3 simulates a localized spike in NAB that is not simulated by GEOS-Chem, which we attribute at least partially to the different treatment of wildfire emissions in the models (Table 2).”

As noted above, we are not able to re-do these simulations.

12) P19, first paragraph: Back trajectories are not a state-of-the-art tool for tracing plumes in or close to the planetary boundary layer, since they do not represent any turbulent or convective mixing. Please replace them with a backward Lagrangian dispersion simulation. In addition, you need to indicate the location of the ’Canadian’ fires. The country is relatively large, and while your green trajectory nicely cuts it in half, it remains unclear if the fires were anywhere close to it. Furthermore, you mention in the introduction that wildfires mainly contribute to O3 production when they intercept urban (NOx-rich) plumes. So where does the NOx come from in your initial plume to form PAN when anthropogenic North American emissions are switched off? Are there any indications that AeroCom NOx emissions are biased? Or could it again be a timing issue? You might be emitting wildfire emissions at the wrong time, for example into an atmosphere that experienced recent lightning NOx production. In conclusion, this discussion remains very speculative.

The intent here was to point out that precisely because of the problems noted by the reviewer (using climatological emissions; distributing the fire NOx emissions vertically where PAN is more stable and O3 production is more efficient), AM3 produces a high background event whereas GEOS-Chem, which uses year-specific emissions that are distributed only in the surface layer, does not. Given that the reviewer feels this discussion is speculative, we have chosen to remove Figure 10 and the associated discussion. The key point regarding the poor simulation from AM3 is evident already from the time series in Figure 9a where the NAB is as high as the observed value.

Technical comments

13) P17: ’contrast, GEOS-Chem NAB (thin blue line) decreases, as does total O3 (thick red line)’ This should be (thick blue line).

Thanks, changed.

14) P17, last sentence: Something is wrong in this sentence. Should it read ’... as captured by AM3 BUT the model overestimates the observed values...’?

Yes, thanks, changed.

15) Figure 2: The state borders between Utah and Colorado and between Arizona and New Mexico are missing in the upper left panel (at least in my pdf version of the manuscript).

Fixed.

16) Figure 5: What is the meaning of ’CONUS’? Not explained in the text or figure caption. CONtinental US?

Added to caption, also on Figure 6.

17) Figure 7: Please add a zoom into the event that is discussed in the text. Right now it is almost impossible to see the details that are described in the text.

Done. We now have Figure 9a as the full March-August time series at all 4 sites, and Figure 9b, a zoomed-in version showing only mid-April to mid-June, to better see the event attributed partially to stratospheric influence by AM3.

18) Figure 8: Do white areas in panel b indicate missing observations?

We have removed this figure for the reasons discussed above.

19) Figure 9: Do white areas in the model simulations indicate values outside the color range?

Figure removed as noted above.