Practical Recommendations
(Preliminary version after COST ES0601, Year 3)
General recommendations
Search intensively for metadata and use them when homogenizing.
Metadata are important for putting observations into a proper perspective, for understanding the biases that might be inherent in the observations and the changes in the biases over time. Therefore, metadata should be as complete as possible, as up to date as possible, and as readily available as possible. For putting current observations into an accurate historical perspective for climate change studies, the homogeneity of the data needs to be assessed.
- Metadata will give you a rough overview of station quality and ease your decision to start homogenizing a series or not;
- Metadata will help you in stating more precisely the exact date of inhomogeneities;
- Metadata will help you justify the adjustments applied at detected breaks.
Extensive guidance on metadata has been published by WMO (Aguilar et al., 2003) in the Guidelines on Climate Metadata and Homogenization, WCDMP-No. 53, WMO-TD No. 1186. The guide is available electronically (see References).
Archive (store) the original data
It is highly recommended to preserve your original (raw) data. This applies to the original paper sheets in the archives, but also to your digitized data. In the future there may be more advanced tools for homogenization, more and better data from reference stations, and additional sources of metadata. These, and a number of other good reasons, argue for keeping your raw data as they are.
Document your homogenization (metadata)
Documentation of homogenization is itself metadata. Somebody using the data might ask: who homogenized this series, which tools were used, when was the homogenization carried out, how many breaks were detected and corrected, what was the magnitude of the adjustments, and what were the reasons for the breaks?
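The questions above can be answered by keeping one small structured record per homogenized series. The following is only a sketch of how such a record might look; the class names, field names and example values are hypothetical and are not prescribed by COST ES0601.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
import json


@dataclass
class BreakRecord:
    """One detected break and its adjustment (hypothetical layout)."""
    break_date: str          # ISO date of the detected break
    adjustment: float        # size of the adjustment, in the unit of the series
    reason: str = "unknown"  # cause according to the station metadata, if known


@dataclass
class HomogenizationRecord:
    """Who homogenized which series, with which tool, when, and with what result."""
    station: str
    operator: str
    method: str
    date_processed: str
    breaks: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() also converts the nested BreakRecord entries to dictionaries.
        return json.dumps(asdict(self), indent=2)


# Invented example values, for illustration only.
record = HomogenizationRecord(
    station="Example station",
    operator="J. Doe",
    method="pair-wise comparison of annual and seasonal series",
    date_processed=date(2010, 5, 20).isoformat(),
    breaks=[BreakRecord("1987-06-01", -0.4, "relocation")],
)
print(record.to_json())
```

Storing the record next to the homogenized series, for example as a JSON file, keeps the documentation and the data together.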
COST HOME specific recommendations
- Apply the procedures recommended by COST-HOME ES0601 and make use of the software provided by COST ES0601.
COST Action HOME recommends using the homogenization methods which performed best in the “Benchmark Experiment” (Venema et al., 2009). The experiment allowed the performance of the most scientifically advanced and most widely used homogenization methods for break detection and correction to be compared.
It is recommended to apply at least two different examinations.
- Quality
Make sure that your series have passed adequate quality checks! It does not make sense to start with unreliable or erroneous data. Starting with quality-controlled data will produce more realistic correlations between station series and reduce the risk of outliers in the homogenized series.
- Homogenization
A homogenization algorithm is referred to as (fully) automatic, if it produces homogenized records out of a set of station records without any user interaction during homogenization. The operator may only set parameters such as thresholds before calling the algorithm. Its results must be reproducible and independent of the operator calling the algorithm. Semiautomatic methods support the operator by calculating statistics but require some user interaction during the homogenization process, e.g. manual selection of reference stations or manual acceptance of a breakpoint suggested by the software.
State-of-the-art automatic algorithms can produce quite good results and are recommended for large networks, where semiautomatic methods are too time-consuming and too costly. However, so far the best semiautomatic methods have performed better than the automatic ones.
- Break Detection
In general, the use of absolute homogeneity tests is not recommended by COST ES0601. Compared to relative tests they performed badly in the “Benchmark Experiment”. However, there will be situations in which the application of relative tests is impossible. This is conceivable, for example, for early instrumental series for which no sufficiently correlated comparison series are available. Other examples are breaks which have affected the whole station sample, such as changes in observing practices applied to the whole network at the same time (e.g. observing times or algorithms for calculating means). In this context it is strongly recommended to include as much metadata knowledge as possible.
Assume that each series may contain multiple breaks! Multiple breaks should be expected not only in the candidate series but also in any reference series used for comparison, even if the metadata indicate an undisturbed series.
Test together well-correlated series in climatologically coherent regions. You may either make use of pair-wise comparisons or of a weighted reference series.
Figure 1: Example of break detection in seasonal precipitation series with pair-wise comparisons. The study region GAR (Greater Alpine Region, 4-19°E, 43-49°N) has been divided into smaller subregions. The selection of regions and assigned stations took regionally different precipitation regimes into account. Regional precipitation patterns of the GAR are governed by Atlantic, Mediterranean and continental influences. (Figure source: Auer et al., 2005).
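The two options named above, pair-wise comparison and a weighted reference series, can be sketched as follows. This is a minimal illustration on invented numbers, not the procedure of any particular COST HOME software. Two choices are assumptions made here: neighbours are weighted by the squared correlation, and the correlation is computed on first differences so that it is less affected by breaks.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def first_differences(series):
    """Year-to-year changes; a single break affects only one of them."""
    return [b - a for a, b in zip(series, series[1:])]


def anomalies(series):
    """Deviations from the series mean."""
    m = mean(series)
    return [v - m for v in series]


def pairwise_difference(candidate, neighbour):
    """Difference series of the candidate and one single neighbour."""
    return [c - n for c, n in zip(anomalies(candidate), anomalies(neighbour))]


def weighted_reference(candidate, neighbours):
    """Weighted mean of neighbour anomalies (weights: squared correlation, an assumption)."""
    weights = [
        pearson(first_differences(candidate), first_differences(nb)) ** 2
        for nb in neighbours
    ]
    anoms = [anomalies(nb) for nb in neighbours]
    total = sum(weights)
    return [
        sum(w * a[i] for w, a in zip(weights, anoms)) / total
        for i in range(len(candidate))
    ]


# Invented data: a common regional signal, and a 1.0 break in the candidate at index 5.
signal = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.5, -0.2, 0.3, 0.0]
candidate = [10.0 + s + (1.0 if i >= 5 else 0.0) for i, s in enumerate(signal)]
neighbours = [[8.0 + s for s in signal], [12.0 + s for s in signal]]

reference = weighted_reference(candidate, neighbours)
difference = [c - r for c, r in zip(anomalies(candidate), reference)]
step = mean(difference[5:]) - mean(difference[:5])
print(round(step, 2))
```

In the difference series the common regional signal cancels, so the break in the candidate stands out as a step. With real data the neighbours may contain breaks of their own, which is one reason to inspect several pair-wise difference series as well.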
For extreme temperatures, testing minimum and maximum temperature together, via the diurnal temperature range (DTR), has turned out to be useful.
It is recommended to detect breaks by comparing annual and seasonal series. This is a compromise: annual series are less noisy than seasonal ones, but annual series may contain hidden breaks. Sometimes it may even be useful to use monthly series in order to find the precise position of a “large” break.
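As an illustration of why annual series may hide breaks, the sketch below aggregates monthly values into annual and seasonal means. The data are invented: a break raises the summer months by 1.0 and lowers the winter months by 1.0, so the two effects cancel in the annual mean. The function names and the dictionary layout keyed by (year, month) are assumptions made for this example.

```python
from statistics import mean

SEASONS = {
    "DJF": (12, 1, 2),
    "MAM": (3, 4, 5),
    "JJA": (6, 7, 8),
    "SON": (9, 10, 11),
}


def annual_means(monthly):
    """monthly: {(year, month): value}. Returns {year: mean} for complete years only."""
    out = {}
    for year in sorted({y for y, _ in monthly}):
        values = [monthly.get((year, m)) for m in range(1, 13)]
        if None not in values:
            out[year] = mean(values)
    return out


def seasonal_means(monthly, season):
    """Seasonal means per year; for DJF, December is taken from the previous year."""
    out = {}
    for year in sorted({y for y, _ in monthly}):
        values = [
            monthly.get((year - 1 if (season == "DJF" and m == 12) else year, m))
            for m in SEASONS[season]
        ]
        if None not in values:
            out[year] = mean(values)
    return out


def seasonal_break(year, month):
    """Invented break from 2000 onwards: +1.0 in summer, -1.0 in winter."""
    if year < 2000:
        return 0.0
    if month in SEASONS["JJA"]:
        return 1.0
    if month in SEASONS["DJF"]:
        return -1.0
    return 0.0


monthly = {(y, m): seasonal_break(y, m) for y in range(1998, 2002) for m in range(1, 13)}

annual = annual_means(monthly)
summer = seasonal_means(monthly, "JJA")
print(annual[1999], annual[2000])  # no change visible in the annual means
print(summer[1999], summer[2000])  # the break is visible in the summer means
```

Real breaks rarely cancel this exactly, but a seasonal cycle in the size of a break is enough to weaken its signature in the annual series.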
- Data correction
A simple and direct method for data correction is to use reference series, i.e. homogeneous parts of one or several well-correlated neighbor series.
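A minimal sketch of such a correction for a temperature-like variable is given below. It assumes an additive break, a single known break position, and the convention that the earlier segment is adjusted to the level of the most recent one; all three are assumptions made for this example. For precipitation a ratio rather than a difference would usually be more appropriate. The function names and the numbers are invented.

```python
from statistics import mean


def mean_adjustment(candidate, reference, break_index):
    """Shift to add to candidate values before break_index.

    The shift is the change in the mean of (candidate - reference) across
    the break, so that the earlier segment is aligned with the later one.
    """
    diff = [c - r for c, r in zip(candidate, reference)]
    return mean(diff[break_index:]) - mean(diff[:break_index])


def apply_adjustment(candidate, break_index, adjustment):
    """Return a corrected copy; values from break_index onwards are left unchanged."""
    return [v + adjustment if i < break_index else v for i, v in enumerate(candidate)]


# Invented data: the candidate is 0.5 above the reference before the break
# and 1.3 above it afterwards, i.e. the break has a size of 0.8.
reference = [9.8, 10.1, 10.4, 9.9, 10.2, 10.0, 10.3, 9.7]
candidate = [r + (0.5 if i < 4 else 1.3) for i, r in enumerate(reference)]

adjustment = mean_adjustment(candidate, reference, break_index=4)
adjusted = apply_adjustment(candidate, break_index=4, adjustment=adjustment)
print(round(adjustment, 2))
```

The reference must itself be homogeneous over the compared period; otherwise its own breaks are transferred to the candidate.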
Whenever parallel measurements are available, it is recommended to make use of them. First of all, they give an impression of the structure of the expected break. Some breaks depend on the dominating weather situation, while others do not (cf. Fig. 2). Twenty-one years of parallel measurements of daily maximum temperature at Kremsmünster (Austria, 14°07’44’’ E, 48°03’18’’ N, 382 m asl.) provide an example of a weather-dependent break (left graph). In such a case it is worth considering a correction method which takes the whole frequency distribution into account. However, not all detected breaks are necessarily weather dependent. The right graph displays parallel temperature measurements at Feuerkogel (Austria, 13°43’06’’ E, 47°49’04’’ N), a mountain station at 1618 m asl. The scatter plot of manually and automatically measured temperatures for the period 1990-2008 does not point towards a temperature-dependent break when manual measurements are replaced by automatic ones. In such a case, correcting only the mean will be as useful as methods which take the whole distribution into account. If adjustment factors can be calculated directly from sufficiently long series of parallel measurements, their magnitude should be taken into account.
Figure 2: Left: Differences of July daily maximum temperatures (automatic minus manual) as a function of automatically measured maximum temperature for the period 1988-2008. Right: Differences of summer (June to August) daily mean temperatures at Feuerkogel (automatic minus manual) as a function of automatically measured daily mean temperature for the period 1990-2008.
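One simple way to check whether the difference between parallel measurements depends on temperature is to regress the difference (automatic minus manual) on the automatically measured temperature. The sketch below uses invented numbers, not the Kremsmünster or Feuerkogel data, and a plain least-squares slope. The function name is hypothetical, and a real analysis would also need a significance test.

```python
from statistics import mean


def difference_slope(automatic, manual):
    """Least-squares slope of (automatic - manual) against automatic, and the mean difference."""
    diff = [a - m for a, m in zip(automatic, manual)]
    mean_auto, mean_diff = mean(automatic), mean(diff)
    slope = sum(
        (a - mean_auto) * (d - mean_diff) for a, d in zip(automatic, diff)
    ) / sum((a - mean_auto) ** 2 for a in automatic)
    return slope, mean_diff


automatic = [12.0, 15.0, 18.0, 21.0, 24.0, 27.0, 30.0]

# Invented case A: the difference grows with temperature (temperature-dependent break).
manual_a = [a - 0.05 * (a - 12.0) for a in automatic]
# Invented case B: the difference is a constant 0.2 (a shift of the mean only).
manual_b = [a - 0.2 for a in automatic]

slope_a, mean_a = difference_slope(automatic, manual_a)
slope_b, mean_b = difference_slope(automatic, manual_b)
print(round(slope_a, 3), round(mean_a, 3))
print(round(slope_b, 3), round(mean_b, 3))
```

A slope clearly different from zero, as in case A, suggests a correction which varies across the frequency distribution. A slope near zero, as in case B, suggests that adjusting the mean is sufficient.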
It is essential to assess the uncertainties of data correction. Build confidence in the calculated corrections by repeating the calculation with different samples and with varying reference stations.
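A rough way to follow this advice is to estimate the same adjustment from several reference stations and to look at the spread of the estimates. The sketch below does this for an additive break at a known position. The data and the function name are invented, and the standard deviation is only one possible summary of the spread.

```python
from statistics import mean, stdev


def adjustment_estimates(candidate, references, break_index):
    """One adjustment estimate per reference series, plus their mean and spread."""
    estimates = []
    for ref in references:
        diff = [c - r for c, r in zip(candidate, ref)]
        estimates.append(mean(diff[break_index:]) - mean(diff[:break_index]))
    return estimates, mean(estimates), stdev(estimates)


# Invented data: a common signal, a 1.0 break in the candidate at index 4,
# and three references which differ from the signal by small local noise.
signal = [0.0, 0.3, -0.2, 0.1, 0.4, -0.1, 0.2, 0.0]
candidate = [s + (1.0 if i >= 4 else 0.0) for i, s in enumerate(signal)]
noise = [
    [0.1, -0.1, 0.0, 0.0, 0.1, 0.0, -0.1, 0.0],
    [0.0, 0.1, 0.1, 0.0, -0.1, 0.0, 0.0, -0.1],
    [-0.1, 0.0, 0.0, -0.1, 0.0, 0.1, 0.1, 0.0],
]
references = [[s + n for s, n in zip(signal, ns)] for ns in noise]

estimates, centre, spread = adjustment_estimates(candidate, references, break_index=4)
print([round(e, 2) for e in estimates], round(centre, 2), round(spread, 2))
```

The same loop can be run over different subsamples of years instead of different references. A large spread relative to the adjustment itself is a warning that the correction is poorly constrained.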
For monthly temperature series, raw or smoothed monthly correction coefficients should be applied in most cases. For precipitation, seasonal correction is recommended.
For daily values, a correction has to be applied to each day. For daily temperatures the coefficients may be derived from monthly adjustments (e.g. Vincent et al., 2002) or from the whole frequency distribution (e.g. Mestre et al., 2009).
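The first option, deriving daily coefficients from monthly adjustments, can be illustrated with a plain linear interpolation between mid-month values. This is a simplified sketch and does not reproduce the procedure of Vincent et al. (2002). In particular, nothing here guarantees that the daily values of a month average back exactly to the monthly adjustment. The function name and the adjustment values are invented.

```python
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year


def daily_adjustments(monthly_adj):
    """Interpolate 12 monthly adjustments, placed at mid-month, to 365 daily values."""
    # Position of each mid-month on a 0-based day-of-year axis.
    mids, start = [], 0
    for length in MONTH_LENGTHS:
        mids.append(start + (length - 1) / 2)
        start += length
    # Wrap around the year so that December and January are connected.
    xs = [mids[-1] - 365] + mids + [mids[0] + 365]
    ys = [monthly_adj[-1]] + list(monthly_adj) + [monthly_adj[0]]

    daily, j = [], 0
    for day in range(365):
        # Move to the pair of mid-month points which encloses this day.
        while xs[j + 1] < day:
            j += 1
        frac = (day - xs[j]) / (xs[j + 1] - xs[j])
        daily.append(ys[j] + frac * (ys[j + 1] - ys[j]))
    return daily


# Invented monthly adjustments with an annual cycle (January to December).
monthly_adj = [0.5, 0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.1, 0.2, 0.3, 0.4, 0.5]
daily = daily_adjustments(monthly_adj)
print(len(daily), round(daily[0], 3), round(daily[196], 3))
```

The interpolation avoids artificial jumps in the adjusted daily series at the boundaries between months, which would appear if each monthly value were simply applied to all days of its month.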
- Data adjustment for urban series
Urban series carry a temperature surplus, the so-called urban heat island. However, the urbanization effect does not cause a sudden break; it may instead cause a gradual trend in the series. For such cases a special correction model for trends should be applied in order to remove, or to retain, the additional urban trend. Please note: not all urban stations exhibit an urban heat island trend. Stations originally established in a densely built-up area will behave differently from stations originally installed in a lightly urbanized environment that has since experienced growth.
Figure 3: Suburban mean temperature series of Wien-Laaerberg (Austria, 16°23’E, 48°10’N, 220 m asl.) corrected for the urban effect (solid), compared with the uncorrected series (+). Figure by courtesy of Olivier Mestre, 2010.
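A gradual urban effect can be illustrated with the simplest possible trend model: a straight line fitted to the difference between the urban series and a rural reference, which is then removed from the urban series. This sketch is not the correction model used for Figure 3. The linear form, the choice to leave the first year unchanged, the function names and the data are all assumptions made for this example.

```python
from statistics import mean


def linear_slope(x, y):
    """Least-squares slope of y against x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )


def remove_urban_trend(years, urban, rural_reference):
    """Remove the linear trend of (urban - rural) from the urban series.

    The correction is zero in the first year and grows linearly afterwards.
    """
    diff = [u - r for u, r in zip(urban, rural_reference)]
    slope = linear_slope(years, diff)
    corrected = [u - slope * (y - years[0]) for y, u in zip(years, urban)]
    return corrected, slope


# Invented data: the urban station is 1.0 warmer than the rural reference
# and warms by an additional 0.02 per year.
years = list(range(1950, 1960))
rural = [9.0, 9.3, 8.8, 9.1, 9.4, 9.0, 9.2, 8.9, 9.5, 9.1]
urban = [r + 1.0 + 0.02 * (y - years[0]) for y, r in zip(years, rural)]

corrected, slope = remove_urban_trend(years, urban, rural)
print(round(slope, 3))
```

After the correction the urban series keeps its constant offset from the rural reference but no longer drifts away from it. Retaining the urban trend instead simply means not applying this step.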
- After Homogenization
Document your homogenization (metadata)
Check the results of homogenization for outliers that might have been produced by the homogenization itself.
Homogenized data are of higher quality than the inhomogeneous original data. However, for climatological analyses be aware that you are working with homogenized data which still carry some remaining uncertainties.
- References
Aguilar E, Auer I, Brunet M, Peterson TC, Wieringa J. 2003. Guidelines on Climate Metadata and Homogenization. WCDMP-No. 53, WMO-TD No. 1186, 51 pp. To be downloaded from:
Auer I, Böhm R, Jurkovic A, Orlik A, Potzmann R, Schöner W, Ungersböck M, Brunetti M, Nanni T, Maugeri M, Briffa K, Jones P, Efthymiadis D, Mestre O, Moisselin JM, Begert M, Brazdil R, Bochnicek O, Cegnar T, Gajic-Capka M, Zaninovic K, Majstorovic Z, Szalai S, Szentimrey T, Mercalli L. 2005. A new instrumental precipitation dataset for the Greater Alpine Region for the period 1800-2002. International Journal of Climatology. 25: 139-166.
Della-Marta PM, Wanner H. 2006. A method for homogenising the extremes and mean of daily temperature measurements. Journal of Climate. 19: 4179-4197.
Mestre O, Gruber C, Prieur C, Caussinus H, Jourdain S. 2009. A method for homogenization of daily temperature observations. Submitted to JAMC.
Tan LS, Burton S, Crouthamel R, van Engelen A, Hutchinson R, Nicodemus L, Peterson TC, Rahimzadeh F. 2004. Guidelines on Climate Data Rescue. WMO/TD No. 1210, 11 pp. To be downloaded from:
Venema V, Mestre O, Aguilar E. 2009. Description of the COST-HOME monthly benchmark dataset with temperature and precipitation data for testing homogenisation algorithms. COST report, 14 pp. To be downloaded from ftp://ftp.meteo.uni-bonn.de/pub/victor/costhome/monthly_benchmark/description_monthly_benchmark_dataset.pdf
Vincent LA, Zhang X, Bonsal BR, Hogg WD. 2002. Homogenisation of daily temperatures over Canada. Journal of Climate. 15: 1322-1334.
Version May 2010, 19-20, agreed at MC Bucharest. This version will be finalized after COST HOME Year 4.