Elements of Validation:
Richard B. Rood
University of Michigan
(original draft in February 2003 at NASA Goddard Space Flight Center)
November 15, 2005
Validation is a term used broadly in the Earth-science community for the process of substantiating the accuracy and precision of a measurement or data product. This substantiation is obtained by comparison to a baseline of information that is accepted as a standard. In a more general sense, validation also substantiates whether or not a data product is of adequate quality to address the scientific questions for which the data were collected. In the absence of other sources of independent information, this sort of focused scientific investigation stands as the substantiation of data quality. The validation process is the basis for the quantitative application of a data product to geophysical problems. This section breaks down the term validation into a number of sub-processes, or elements, that are often used in the validation process.
The elements are numbered 1 through 5 based on how well they can be automated and how quickly they can be executed, with the most readily automated first. For example, in Monitoring, the aggregation of observations and the development of statistics can be automated, with routine analyses generated to identify anomalies that require investigation. Scientific Validation, on the other hand, requires the application of the data product to a scientific problem and a determination of the ability of the observation to contribute to the quantitative understanding of that problem. This is a process that could take years, and it could ultimately lead to the reanalysis of the data product or even to the determination that the product is not of adequate quality to be scientifically useful.
All of these elements contribute to validation. Each brings something unique to the process. It is the balance of information from these different elements that ultimately stands as the test of the quality of the product and of its ability to contribute to the community's science investigations. The validation process is dynamic, leading to changes in the data product, which in turn require further validation.
There are some general attributes of validation. First, more than one source of information about the measured parameter is needed. Ideally, these are two independent observations. Second, the validation process requires scrutiny of the measured parameter by independent investigators. While those developing instruments and algorithms to retrieve environmental information from measurements are an integral part of validation, robust validation requires formal separation of those who develop from those who validate, to assure objectivity.
1) Monitoring: Monitoring is the routine evaluation of a time series of observations. The goal of monitoring is to determine if there is some change apparent in the time series of the quantity being observed. If there is an independent estimate of the same quantity, a comparison can be made between the two sources of information. A comparison can also be made with a forecast estimate based on a model prediction initialized at some earlier time. The identification of a change in these comparisons often indicates a change in the instrument performance or in the treatment of the data stream from the instrument. However, this is not necessarily the case, and other sources of change in the comparison must be investigated. Monitoring of the behavior of a new observation with short-term forecasts is an important part of determining the suitability of a measurement for assimilation.
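The routine comparison described under Monitoring can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed method: the function name `monitor_differences`, its parameters, and the synthetic data are all hypothetical, and the test used (a trailing-window mean of observation-minus-reference differences compared against a baseline period) is only one of many possible change detectors. The reference series may be an independent estimate or a short-term forecast. A flag says only that the comparison has changed; the cause still has to be investigated.

```python
import numpy as np

def monitor_differences(obs, ref, baseline_len=30, window=10, threshold=3.0):
    """Flag times where recent obs-minus-ref differences depart from a baseline.

    obs, ref     : 1-D sequences of equal length (observation, reference estimate).
    baseline_len : number of initial samples taken to represent normal behavior.
    window       : length of the trailing window whose mean is tested.
    threshold    : departure, in standard errors of the window mean, that raises a flag.
    All names and default values are illustrative assumptions.
    """
    diff = np.asarray(obs, dtype=float) - np.asarray(ref, dtype=float)
    base = diff[:baseline_len]
    base_mean = base.mean()
    std_err = base.std(ddof=1) / np.sqrt(window)
    flags = np.zeros(diff.size, dtype=bool)
    # Test only windows that lie entirely after the baseline period.
    for end in range(baseline_len + window, diff.size + 1):
        recent_mean = diff[end - window:end].mean()
        flags[end - 1] = abs(recent_mean - base_mean) > threshold * std_err
    return diff, flags

# Synthetic example: a reference series, an observation that tracks it with
# small alternating errors, and an offset of 0.5 introduced at index 60.
t = np.arange(90)
ref = 280.0 + 5.0 * np.sin(2 * np.pi * t / 30)
obs = ref + 0.1 * (-1.0) ** t
obs[60:] += 0.5

diff, flags = monitor_differences(obs, ref)
print("flagged indices:", np.flatnonzero(flags))
```

In this synthetic case no flag is raised before the offset is introduced, and every window that lies wholly after it is flagged.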
2) Quality Assessment: Quality Assessment is the determination of whether or not a measurement is consistent, in a statistical sense, with other estimates of the same quantity. It involves the determination of mean differences, bias, and variability. The quality assessment can be made by comparison of an observation with (a) expected values from a previously established climatology, (b) a forecast value at the same spatial and temporal location, or (c) a collocated independent observation. The observation's quality is first characterized by determining if the observation lies within a range of expected values based on past experience. If the observation lies outside of the expected range, further tests, for instance comparison with nearest-neighbor observations made with the same instrument, provide information on whether or not the observation might be of geophysical significance even though it lies outside of the range of expected values. If there is an independent, calibrated source of information, then a quality assessment relative to this standard can be determined.
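The sequence of tests described under Quality Assessment can be sketched as follows. This is an illustrative sketch only: the function names, the three-standard-deviation range, the neighbor tolerance, and the labels `ok`, `possible_signal`, and `suspect` are assumptions introduced here, not part of any established procedure. The neighbor test is deliberately simple: an out-of-range value that agrees with at least one adjacent observation from the same instrument is kept as a candidate geophysical signal rather than rejected.

```python
import numpy as np

def range_check(obs, clim_mean, clim_std, n_sigma=3.0):
    """True where an observation lies inside the climatological expected range."""
    return np.abs(np.asarray(obs, dtype=float) - clim_mean) <= n_sigma * clim_std

def neighbor_check(obs, tolerance):
    """True where an observation agrees with at least one adjacent neighbor."""
    obs = np.asarray(obs, dtype=float)
    gap = np.abs(np.diff(obs))                    # |obs[i+1] - obs[i]|
    to_left = np.concatenate(([np.inf], gap))     # first point has no left neighbor
    to_right = np.concatenate((gap, [np.inf]))    # last point has no right neighbor
    return np.minimum(to_left, to_right) <= tolerance

def assess(obs, clim_mean, clim_std, tolerance, n_sigma=3.0):
    """Label each observation using the range test, then the neighbor test."""
    in_range = range_check(obs, clim_mean, clim_std, n_sigma)
    consistent = neighbor_check(obs, tolerance)
    return np.where(in_range, "ok",
                    np.where(consistent, "possible_signal", "suspect"))

def comparison_stats(obs, reference):
    """Bias and variability of obs relative to a collocated independent reference."""
    d = np.asarray(obs, dtype=float) - np.asarray(reference, dtype=float)
    return {"bias": d.mean(), "std": d.std(ddof=1), "rms": np.sqrt((d ** 2).mean())}

# Hypothetical along-track values: one isolated spike and one coherent anomaly.
obs = [10.0, 10.5, 9.8, 25.0, 10.2, 18.0, 18.5, 18.2, 10.1]
labels = assess(obs, clim_mean=10.0, clim_std=1.0, tolerance=1.5)
print(labels.tolist())

# Hypothetical collocated pairs: observation versus a calibrated standard.
stats = comparison_stats([10.2, 9.9, 10.4, 10.1], [10.0, 10.0, 10.0, 10.0])
print(stats)
```

The isolated value 25.0 is labeled `suspect`, while the run of values near 18 is labeled `possible_signal` because its members agree with one another even though they lie outside the expected range.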
3) Short-term Forecasts: Short-term forecasts provide a first-look evaluation of the consistency between models, hence theory, and observations. Short-term is defined by the expectation that the variability in the observed parameter can be directly traced to an earlier observation through the dynamics and physics; that is, the system is a deterministic initial value problem. A primary example of this is numerical weather prediction. The quality of weather forecasting has evolved to the point that, for many new observations that vary on the time scale of weather, e.g., ozone, routine forecasting can contribute to their validation. Spatial and temporal characteristics of the quality of the forecast provide information on observational and model quality. Once short-term forecasts have a measured level of expected quality, they become an important part of monitoring and quality assessment.
4) Systems Validation: Systems validation is required when a data product comprises several parameters that are expected to be utilized together on a single science problem. These science problems, themselves, often represent the integrated effect of a number of processes that may be measured or calculated separately. An example of such a problem is the change in outgoing longwave radiation between El Niño and La Niña conditions. Systems validation checks the consistency between the parameters, with the consistency often determined through utilization of models and theory. In systems validation, integral constraints such as mass conservation are often important. The separation between systems validation and scientific research is often unclear. Systems validation, however, often focuses on a set of problems that are of continuing interest and for which a baseline exists to define standards of how well the problem can be addressed. Validation requires a determination of these baseline problems as well as a set of quantitative metrics of the state of the art at addressing these problems.
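As one illustration of an integral constraint, the sketch below tests whether the area-weighted global mean of a gridded field stays constant over a sequence of times, which is a simple stand-in for a mass-conservation check. Everything in it is assumed for the example: the regular latitude-longitude grid, the cosine-of-latitude weights, the function names, the relative tolerance, and the synthetic fields. A real systems validation would involve several parameters and constraints chosen for the science problem at hand.

```python
import numpy as np

def global_mean(field, lat_deg):
    """Area-weighted global mean of a (lat, lon) field on a regular grid."""
    weights = np.cos(np.deg2rad(lat_deg))   # grid-cell area scales with cos(latitude)
    zonal_mean = field.mean(axis=1)         # average over longitude
    return np.sum(weights * zonal_mean) / np.sum(weights)

def conservation_check(fields, lat_deg, rel_tol=1e-3):
    """Compare the global mean at each time with the first time.

    fields  : array of shape (time, lat, lon).
    rel_tol : largest relative departure accepted as conserved (assumed value).
    Returns the global means and a boolean per time (True = within tolerance).
    """
    totals = np.array([global_mean(f, lat_deg) for f in fields])
    rel_change = (totals - totals[0]) / totals[0]
    return totals, np.abs(rel_change) <= rel_tol

# Synthetic fields: the second time only redistributes the quantity in
# longitude, while the third adds 1 percent everywhere.
lat = np.linspace(-87.5, 87.5, 36)
lon = np.linspace(0.0, 355.0, 72)
base = 1000.0 + 10.0 * np.cos(np.deg2rad(lat))[:, None] * np.sin(np.deg2rad(lon))[None, :]
fields = np.stack([base, np.roll(base, 5, axis=1), base * 1.01])

totals, conserved = conservation_check(fields, lat)
print(conserved.tolist())
```

The redistributed field passes the check and the field with the added 1 percent fails it. A failure of this kind indicates an inconsistency but does not by itself say which parameter or processing step is responsible.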
5) Scientific Validation: Scientific validation is the investigation of the ability of a measured parameter to address new scientific questions with unknown answers. Scientific validation is characterized by uncertainty; that is, discoveries are expected in the process. Therefore, scientific validation may be a long process. Further, broad use of the measured parameter by the community is necessary to assure adequate breadth of investigations. The result of scientific validation is often the identification of new standards that must be met in the estimation of a parameter for it to be useful in scientific research. Scientific validation also provides information on how investigators use the data products, which helps to motivate new and better ways to generate the data products.