Brief Notes on Building a Common Fire History Data Base

DRAFT For Discussion Only January 3, 2012


Danny C. Lee, USDA Forest Service

Introduction

A comprehensive database on historical wildland fire occurrence is essential to accurately portraying the magnitude and extent of wildland fire across the United States. Such data could also serve as the baseline from which to evaluate possible changes in fire occurrence and severity due to changes in management activities and investments. Current reporting systems maintained by individual federal, state, tribal, and local agencies collectively provide a wealth of information, but inconsistencies among the different systems have so far prevented compilation of a single, standardized data system that accurately reports all relevant fires in an easily accessible format.

The combination of historical inconsistencies in reporting and omission of key data on some incidents (or no record at all) likely means that a factually accurate and precise fire-by-fire listing of all wildfires in the US over the last decade or so is unattainable within the time frame and resources currently available. Thus, the challenge is to design and develop an alternative data set that provides statistically valid estimates of the frequency, location, and extent of wildfire in the US using a combination of available data sets. The purpose of these estimates would be to provide a reasonable and commonly accepted baseline for use in further comparative analyses by the National Science and Analysis Team (NSAT) and others working on the Cohesive Strategy.

General Approach

One approach for building a useable reference dataset is to construct a hybrid dataset that uses information from multiple data sets, not by simply consolidating the data, but rather by using each component data set to estimate the parameters best represented by those data. The basic process is as follows.

1. Establish the objective attributes of primary interest. For example, likely attributes include location, frequency, areal extent, temporal variation, and possibly relative severity.

2. For each attribute, identify the available data sets that provide the most accurate and precise characterization of each and note the degree of spatial resolution provided by those data sets. For example, there may be data sets that have a high degree of spatial resolution (thus providing precise estimates of location) but lack the full complement of fires (thus underestimating the frequency of occurrence). Similarly, there could be data sets that account for all fires within a broader geographic area (e.g., county), but lack specific coordinates for each fire.

3. Based on consideration of the data identified in (2) and ancillary GIS information, identify the finest level of spatial resolution (i.e., minimum mapping unit [MMU]) for which reasonable estimates of all attributes can be produced. The expectation is that this likely will be some combination of administrative boundaries (e.g., counties or management units), population density (e.g., census blocks or urban density), and biophysical characteristics (e.g., dominant vegetation types). Various statistical techniques can be used to identify strata that best explain spatial variation in measured attributes.

4. Statistically estimate descriptive parameters (e.g., mean and variance) for each attribute for each MMU. Again, a variety of estimation methods might be explored, depending on how the NSAT and others anticipate using these estimates and practical constraints imposed by existing data.
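As one illustration of step 4, the sketch below computes the mean and sample variance of annual fire counts and annual area burned for each MMU. It is only a minimal example under stated assumptions: the record layout (MMU identifier, year, acres burned), the MMU names, the reporting period, and the function name summarize_by_mmu are all hypothetical, and simple moment estimates stand in for whatever estimation method is eventually chosen.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical fire records: (MMU identifier, year, acres burned).
# These values are invented for illustration only.
FIRES = [
    ("county_A", 2008, 120.0),
    ("county_A", 2008, 30.0),
    ("county_A", 2010, 450.0),
    ("county_B", 2009, 10.0),
    ("county_B", 2010, 15.0),
    ("county_B", 2010, 5.0),
]
YEARS = range(2008, 2011)  # assumed reporting period: 2008 through 2010


def summarize_by_mmu(fires, years):
    """Return {mmu: {attribute: (mean, sample variance)}} across years."""
    counts = defaultdict(lambda: defaultdict(int))
    acres = defaultdict(lambda: defaultdict(float))
    for mmu, year, burned in fires:
        counts[mmu][year] += 1
        acres[mmu][year] += burned

    summary = {}
    for mmu in counts:
        # Years with no recorded fire enter as zeros, not as missing data.
        annual_counts = [counts[mmu].get(y, 0) for y in years]
        annual_acres = [acres[mmu].get(y, 0.0) for y in years]
        summary[mmu] = {
            "fires_per_year": (mean(annual_counts), variance(annual_counts)),
            "acres_per_year": (mean(annual_acres), variance(annual_acres)),
        }
    return summary


if __name__ == "__main__":
    for mmu, stats in sorted(summarize_by_mmu(FIRES, YEARS).items()):
        print(mmu, stats)
```

Two caveats apply to this sketch. An MMU with no recorded fires in any year does not appear in the output, so a real implementation would need the full list of MMUs from the GIS layer. The estimates also take the records at face value; correcting for fires missing from a given data set, as discussed in step 2, would require a more elaborate model.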

The steps above provide only a general structure to the process. Clearly, there will be more details to work out as the process evolves and data are examined. An overriding issue to consider is how much complexity to add to the estimation process and the resultant products. Wildland fire is affected by many interacting spatial and temporal factors. These interactions result in highly variable patterns of occurrence across the US from year to year. Thus any single year is generally quite different from a simple composite of the historical averages at all locations. Are these differences important enough that they need to be incorporated within the analyses conducted in Phase III? These and other questions will have to be resolved in due course.