Study Guide on Applications
(e.stat sections are noted in parentheses)
Applications here refer to summarizing data with index numbers, exploring the special problems in time series data, and using statistics for quality control.
Questions to ask at the beginning:
- Do I want to aggregate price or quantity data?
- Are the data observed at regular periods over time, for example, by month?
- Has the time series data been seasonally adjusted?
- Does the time series show a trend?
- Is a concern for quality focused on a mean, proportion, or standard deviation?
- May I explore sources of variation with a regression?
- Are production lots to be accepted or rejected?
Index Numbers
One way to aggregate detailed data into a summary is to construct an index. An index number is a weighted average of the detailed data. To aggregate prices, use quantities as weights. To aggregate quantities, use prices as weights.
For a price index, the quantity weights might be from a base period (a Laspeyres Index) or from the current period (a Paasche Index). The weights might shift over time, creating a chain index. (24-08) The indices for the same time period constructed with different weights might be averaged with the geometric mean. (24-04)
Laspeyres Price Index
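The two weighting choices for a price index can be sketched in a few lines. This is a minimal illustration, not the e.stat presentation: the two-good prices and quantities are made up, the function names are hypothetical, and the indices are scaled so that the base period equals 100. The geometric mean of the Laspeyres and Paasche indices is commonly called the Fisher index.

```python
import math

def value(prices, quantities):
    """Total expenditure: sum of price times quantity over all goods."""
    return sum(p * q for p, q in zip(prices, quantities))

def laspeyres_price(p0, p1, q0):
    """Price index that weights prices by BASE-period quantities."""
    return 100 * value(p1, q0) / value(p0, q0)

def paasche_price(p0, p1, q1):
    """Price index that weights prices by CURRENT-period quantities."""
    return 100 * value(p1, q1) / value(p0, q1)

# Illustrative (made-up) data for two goods.
p0, p1 = [2.0, 5.0], [3.0, 5.0]   # base and current prices
q0, q1 = [10, 4], [8, 5]          # base and current quantities

lasp = laspeyres_price(p0, p1, q0)    # 50 / 40 -> 125.0
paas = paasche_price(p0, p1, q1)      # 49 / 41 -> about 119.5
fisher = math.sqrt(lasp * paas)       # geometric mean of the two

print(lasp, round(paas, 2), round(fisher, 2))
```

In this example the Paasche index is lower than the Laspeyres index because buyers shifted away from the good whose price rose.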
For a quantity index, the price weights may be from a base period (a Laspeyres Index) or from the current period (a Paasche Index). Chaining and averaging may also be appropriate.
(24-07)
Laspeyres Quantity Index
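A quantity index swaps the roles of prices and quantities: quantities change between periods while the price weights are held fixed. The sketch below uses made-up data for two goods and a hypothetical helper named value; the base period is scaled to 100.

```python
def value(prices, quantities):
    """Total expenditure: sum of price times quantity over all goods."""
    return sum(p * q for p, q in zip(prices, quantities))

# Illustrative (made-up) data for two goods.
p0, p1 = [2.0, 5.0], [3.0, 5.0]   # base and current prices
q0, q1 = [10, 4], [8, 5]          # base and current quantities

# Base-period prices as weights.
laspeyres_quantity = 100 * value(p0, q1) / value(p0, q0)   # 41 / 40 -> 102.5

# Current-period prices as weights.
paasche_quantity = 100 * value(p1, q1) / value(p1, q0)     # 49 / 50 -> 98.0

print(laspeyres_quantity, paasche_quantity)
```

Here the two indices disagree even about the direction of change, which is one reason averaging the two may be appropriate.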
If it is appropriate to assume constant expenditure shares, a geometric index may apply. (24-09)
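One common form of a geometric price index is a weighted geometric mean of the price relatives, with the expenditure shares as weights. The sketch below assumes that form and assumes the shares are taken from the base period; the data and the function name are illustrative only.

```python
import math

def geometric_price_index(p0, p1, shares):
    """Weighted geometric mean of price relatives p1/p0.

    shares are expenditure shares that sum to 1 (assumed constant).
    """
    log_index = sum(s * math.log(new / old)
                    for s, new, old in zip(shares, p1, p0))
    return 100 * math.exp(log_index)

# Illustrative (made-up) data for two goods.
p0, p1 = [2.0, 5.0], [3.0, 5.0]
q0 = [10, 4]

# Base-period expenditure shares: each good accounts for 20 of 40.
spending = [p * q for p, q in zip(p0, q0)]
shares = [s / sum(spending) for s in spending]

geo_index = geometric_price_index(p0, p1, shares)
print(round(geo_index, 2))
```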
Trends
Estimate a trend by regressing time series data on time. Examine a scatterplot of the variables plotted against time and look for curvature. Choose a functional form to estimate the trend that will capture the curvature. (25-03)
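As a rough sketch of trend estimation, the code below fits a straight line to one series and a log-linear (constant growth rate) form to another. Both series are constructed for illustration, and the helper ols is a hand-written simple least-squares fit, not a function from any particular package. The choice between the two forms would come from looking at the plot for curvature.

```python
import math

def ols(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

time = list(range(6))                      # periods 0, 1, ..., 5

# Linear trend: the series rises by a constant amount each period.
linear_series = [3 + 2 * t for t in time]
intercept, slope = ols(time, linear_series)

# Curved (exponential) trend: the series grows 5 percent each period.
# Regressing the log of the series on time captures the curvature.
growth_series = [100 * 1.05 ** t for t in time]
_, log_slope = ols(time, [math.log(v) for v in growth_series])
growth_rate = math.exp(log_slope) - 1

print(intercept, slope, round(growth_rate, 4))
```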
Seasonal Adjustment
Economic time series data with observation periods shorter than annual are likely to vary systematically by season. We can either build time into our models or use seasonally adjusted data. A common method of seasonal adjustment is to compute a centered moving average of the series, take a ratio of the raw values to the moving average value for each period, and find an average of the ratio for each seasonal period (for example, each month). Dividing each raw data value by the average ratio for its seasonal period creates a seasonally adjusted series. (25-04)
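The ratio-to-moving-average steps described above can be sketched as follows. This is a minimal version under stated assumptions: the quarterly series is made up (a flat level of 100 with a fixed seasonal pattern), the function names are hypothetical, and the seasonal factors are not rescaled to average exactly one, a refinement some procedures add.

```python
def centered_moving_average(y, period):
    """Centered moving average; None where the window does not fit.

    For an even period (4 quarters, 12 months) the two end points of
    the window get half weight so the average is centered on period t.
    """
    half = period // 2
    out = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        out[t] = total / period
    return out

def seasonal_factors(y, period):
    """Average ratio of raw value to moving average for each season."""
    averages = centered_moving_average(y, period)
    ratios = [[] for _ in range(period)]
    for t, (raw, avg) in enumerate(zip(y, averages)):
        if avg is not None:
            ratios[t % period].append(raw / avg)
    return [sum(r) / len(r) for r in ratios]

def seasonally_adjust(y, period):
    """Divide each raw value by the factor for its seasonal period."""
    factors = seasonal_factors(y, period)
    return [value / factors[t % period] for t, value in enumerate(y)]

# Three years of made-up quarterly data with a repeating pattern.
raw = [80.0, 100.0, 120.0, 100.0] * 3
factors = seasonal_factors(raw, 4)
adjusted = seasonally_adjust(raw, 4)
print([round(f, 2) for f in factors])
print([round(a, 1) for a in adjusted])
```

Because the example series has no trend or noise, the adjusted series is flat; with real data the adjusted series keeps the trend and irregular movements.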
Autocorrelation
Movements in economic time series tend to persist through time. A boom in one quarter makes it more likely that the immediately succeeding quarter will boom as well. Recessions also persist. Economic time series, then, tend to move together. Persistence in a time series is called positive autocorrelation. Alternatively, a high in one period may be systematically associated with a low in the next; call this negative autocorrelation. If we are to estimate causal relationships, we need to take account of autocorrelation. (25-05 & 06)
To test residuals for autocorrelation in a model without a lagged dependent variable, compute a Durbin-Watson d statistic. (25-07)
The d-statistic varies between 0 and 4 with values near 2 indicating no autocorrelation.
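The d statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals. The sketch below computes it for two made-up residual series, one that persists and one that alternates; judging significance still requires the Durbin-Watson tables, which are not reproduced here.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals that persist: runs of the same sign.
persistent = [1, 1, 1, -1, -1, -1]
# Made-up residuals that alternate in sign every period.
alternating = [1, -1, 1, -1, 1, -1]

d_positive = durbin_watson(persistent)    # 4 / 6, well below 2
d_negative = durbin_watson(alternating)   # 20 / 6, well above 2
print(round(d_positive, 2), round(d_negative, 2))
```

Values well below 2 point toward positive autocorrelation and values well above 2 point toward negative autocorrelation.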
When autocorrelation is present in the residuals, one might reformulate the model using first differences (25-08) or by including lagged values of the dependent variable. (25-10)
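A first-difference reformulation replaces each variable with its change from the previous period and then estimates the relationship between the changes. The sketch below uses made-up series and hypothetical helper names; it shows only the differencing and the slope estimate, not a full treatment of the residuals.

```python
def first_differences(series):
    """Change from the previous period: y[t] - y[t-1]."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Made-up levels: y equals 10 + 2x in every period.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [12.0, 14.0, 18.0, 24.0, 32.0]

dx = first_differences(x)        # one fewer observation than x
dy = first_differences(y)
slope_in_differences = ols_slope(dx, dy)
print(dx, dy, slope_in_differences)
```

Differencing costs one observation and removes the constant term of the levels equation, so the slope is the quantity of interest.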
Quality Control Chart
To control a dimension of a product or service, measure it carefully with a benchmark sample. Then construct a control chart with the target value and a range within which the statistic from the monitoring sample is likely to fall with given probability. For example, to control a mean, set limits within which the sample mean from monitoring samples will fall with probability, say, 95 percent. (26-05)
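A control chart for a mean can be sketched as below. The sketch makes several assumptions that are not in the text: the benchmark measurements are made up, the benchmark standard deviation is treated as the known process value, monitoring sample means are taken to be approximately normal, and the function name is hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_control_limits(benchmark, sample_size, coverage=0.95):
    """Two-sided limits for the mean of a monitoring sample."""
    center = statistics.mean(benchmark)
    sigma = statistics.stdev(benchmark)          # assumed process sd
    z = NormalDist().inv_cdf(0.5 + coverage / 2) # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(sample_size)
    return center - half_width, center, center + half_width

# Made-up benchmark measurements of the controlled dimension.
benchmark = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lower, center, upper = mean_control_limits(benchmark, sample_size=4)

# Means of later monitoring samples of size 4 (also made up).
monitoring_means = [10.05, 9.95, 10.20]
out_of_control = [not (lower <= m <= upper) for m in monitoring_means]
print(round(lower, 3), round(upper, 3), out_of_control)
```

With 95 percent limits, about one monitoring sample in twenty will fall outside the limits even when the process is in control, so the coverage is a choice about how many false alarms to tolerate.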
Control charts may be used for individual observations, for sample means, proportions (26-07), and standard deviations (26-06). We can also monitor a count of defects using a control chart based on the Poisson distribution. (26-08) If the distribution of the statistic being tracked is unknown, we can define a control chart using a bootstrap method. (26-09) The control chart may be one- or two-sided, and it may have one threshold to trigger more frequent samples and a second threshold to trigger immediate action. (26-10) Plot values from monitoring samples as they occur over time and look for patterns and instances when a process may be out of control. When the process is out of control, make corrections.
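For a count of defects, limits can be read from the Poisson distribution directly. The sketch below assumes the benchmark mean count is known (4 defects per unit, a made-up figure) and finds the range of counts that covers at least 95 percent of the probability; the function names are hypothetical.

```python
import math

def poisson_cdf(k, mean_count):
    """P(X <= k) for a Poisson count with the given mean."""
    term = math.exp(-mean_count)   # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean_count / i     # P(X = i) from P(X = i - 1)
        total += term
    return total

def poisson_control_limits(mean_count, coverage=0.95):
    """Smallest and largest counts treated as in control."""
    tail = (1 - coverage) / 2
    lower = 0
    while poisson_cdf(lower, mean_count) < tail:
        lower += 1
    upper = lower
    while poisson_cdf(upper, mean_count) < 1 - tail:
        upper += 1
    return lower, upper

limits = poisson_control_limits(4.0)
print(limits)   # counts outside this range signal a possible problem
```

Because counts are whole numbers, the actual coverage is somewhat above the nominal 95 percent.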
Quality and Regression (26-11)
We can use a regression to identify sources of variation in the target variable. Attributes of the production process, such as time and temperature, and the manager on the scene might be coded and used as explanatory variables. By estimating relationships with causal variables, a quality control team might think of steps to take to reduce variance and so increase control.
Acceptance Samples
Use the binomial (26-13) or hypergeometric (26-14) distribution to find the sample size and the maximum number of defects in a sample that will be acceptable.
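One way to search for an acceptance plan with the binomial distribution is sketched below. The defect rates and risk levels are assumptions chosen for illustration: lots with 1 percent defects should be accepted at least 95 percent of the time, and lots with 10 percent defects should be accepted at most 10 percent of the time. The function names are hypothetical, and the binomial is used on the assumption that the lot is large relative to the sample; for small lots the hypergeometric would take its place.

```python
from math import comb

def binomial_cdf(c, n, p):
    """P(at most c defects in a sample of n) when the defect rate is p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(c + 1))

def acceptance_plan(good_rate, bad_rate,
                    producer_risk=0.05, consumer_risk=0.10, max_n=500):
    """Smallest sample size n and acceptance number c meeting both risks.

    Accept the lot when the sample has c or fewer defects.
    """
    for n in range(1, max_n + 1):
        for c in range(n + 1):
            # Smallest c that protects the producer at this n.
            if binomial_cdf(c, n, good_rate) >= 1 - producer_risk:
                # A larger c would only raise the chance of accepting
                # a bad lot, so test the consumer's risk here and move on.
                if binomial_cdf(c, n, bad_rate) <= consumer_risk:
                    return n, c
                break
    return None

plan = acceptance_plan(good_rate=0.01, bad_rate=0.10)
print(plan)
```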