May 2002

Quality Adjustment for PPP: Principles and an empirical study

Mick Silver* and Saeed Heravi*

*Cardiff University

Colum Drive

Cardiff CF10 3EU

U.K.

Email:

Draft Paper Prepared for a Conference on the International Comparisons Program, World Bank, Washington D.C., March 11-12, 2002 and extended to cover quality adjustment methods.

We acknowledge the support of the World Bank and of GfK Marketing Services Ltd. for supplying the data. Helpful comments were received from Bart van Ark (University of Groningen), Jim Cuthbert and James Mead, Yonas Biru and Ed Dean. The usual disclaimers apply.

Contents

Principles

1. Introduction: types of errors in quality adjustment
2. Imputation: assumptions and use

3. Explicit Methods

(a) Expert judgement

(b) Quantity adjustment

(c) Differences in production/option costs

(d) Hedonic approach

(i) Principles and method

(ii) On theory

(iii) Implementation

(iv) Need for caution

4. The analytical framework
5. Hedonic indices

(a) Hedonic functions with dummy variables on time

(b) Country-on-country hedonic indices

(c) Superlative and exact hedonic indices (SEHI)

Empirical study

6. Variables and data

7. Empirical results: coverage and tightness of the specifications.

8. Matched versus unmatched prices.

Principles

1. Introduction: types of errors in quality adjustment

Accurate price comparisons between countries require that like is compared with like. For consumer price indices (CPIs), price collectors monitor the prices of selected models over time; the problem of quality adjustment arises when a model is no longer sold in an outlet. For PPPs a detailed specification for each item is required and matched prices are collected in each country according to this specification. If, for a bilateral comparison, a matching item is unavailable, one procedure is to impute the price, that is, to assume the price comparison for the missing item, had it existed, is the same as that for a broader class of existing matched items. Another is to find an item that is close to the desired specification and quality adjust its price to take account of the quality difference.

A problem arises with tight specifications for the item descriptors. The more tightly defined the specifications, the greater the chance of the item not being matched and an imputation or quality adjustment being required. Loose specifications lead to price comparisons of dissimilar items, which are tainted by quality differences, that is, non-comparable items being wrongly treated as comparable. The first concern of this paper is to examine the nature and effects of the trade-off between tighter item descriptors and less quality differential bias on the one hand, and poorer coverage and bias arising from excluding unmatched observations with different price changes on the other. Excluding unmatched items using imputations results in sample degradation. There are therefore two types of bias. The first is in-sample bias, due to loose item specifications resulting in non-comparable items being compared. The second is out-of-sample bias, due to low coverage from tight item specifications and unrepresentative comparisons. It should be noted that out-of-sample bias requires both a degraded sample of matched comparisons, usually when item specifications are tight, and price differences within the sample that differ from those outside.

Figure 1 illustrates such comparisons with tightly specified items and low coverage being compared in case A, and loose specifications with higher coverage in B. The coverage of tightly specified items may of course be good for some product areas, though we stylise the problem since our analytical concern is with problems of choosing between tight specifications with low coverage, and loose specifications with higher coverage.

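Figure 1, Illustration of tight (A) and loose (B) item specifications

[Figure: items in Country 1 matched with items in Country 2; panel A shows tight specifications, panel B shows loose specifications]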

Figure 2, Item specification and bias

Specification / In-sample bias / Out-of-sample bias: similar price differences / Out-of-sample bias: dissimilar price differences
Tight specifications / Directly comparable: low bias / Imputation: low bias / Imputation: high bias with low coverage. [Increase coverage, hedonic indices]
Loose specifications / EQA: bias depends on effectiveness of EQA / EQA: bias depends on effectiveness of EQA / EQA: bias depends on effectiveness of EQA and coverage. High bias, low coverage. [Increase coverage, hedonic indices]
NB: EQA is an ‘Explicit Quality Adjustment’

Figure 2 considers in-sample and out-of-sample bias for tight and loose specifications. Consider the use of tight specifications resulting in relatively low coverage. The in-sample bias should be relatively low if prices are simply compared directly. If price differences outside the sample of specified items are similar to those directly compared, imputation leads to low bias. With imputation the price differences of the out-of-sample items are simply assumed to be the same as those of (targeted) in-sample comparisons. If out-of-sample price differences are dissimilar, then imputation leads to bias, the extent of the bias depending on the dissimilarity between price differences inside and outside the sample, and on the coverage of the sample. In such circumstances more (tight) item comparisons, or looser ones with appropriate quality adjustments, are required to cover more of the product universe of items.

Consider loosely specified items in Figure 2. When comparing loosely specified items bias may arise because like in one country is not being compared with like in another. It is therefore incumbent on the data collecting process to ensure data are collected not only on prices, but also on the characteristics of the items. Then any differences between the quality of the characteristics selected can be identified and an explicit quality adjustment (EQA) undertaken. It is stressed that if such an adjustment is not made, it is still implicit in the calculation. It is just that it is a poor adjustment, with all of any price difference due to quality differences being assumed to be part of the price comparison.

Figure 2 shows that with loose specifications the in-sample bias depends on the effectiveness of the explicit quality adjustment. Similarly, if out-of-sample price differences are similar to those in the sample, the out-of-sample bias also depends on the effectiveness of the explicit quality adjustment. It is thus stressed that any bias in the quality adjustment carries over from the representative items to the weight for the product area. With out-of-sample price differences dissimilar to those in the sample, the bias again depends on the effectiveness of the quality adjustments in sample. However, if the in-sample coverage is low, then the unrepresentative quality-adjusted price differences in sample will not reflect the out-of-sample price comparisons, leading to a high level of out-of-sample bias. Again there will be a need to increase the coverage of the product area by selecting more representative items, however loosely or tightly defined. If explicit quality adjustments are difficult and quality is believed to vary across countries, then more tightly specified comparisons will be required.

It is stressed that data on prices and characteristics are required for two purposes. The first is to quality adjust for in-sample bias when loose specifications are used. The second is to compile hedonic indices to extend the sample to cover out-of-sample unmatched items. This requires further explanation. In Figure 2 hedonic indices are proposed when prices of out-of-sample items are dissimilar. These indices can take a number of forms, as will be explained subsequently. One form is to regress the prices of the items comprising a product area on their characteristics, pooling the data for the two (or more) countries concerned in a single regression. Dummy variables for the countries are included, their coefficients being estimates of the parity adjustment at this level. The sample need not be matched and can thus extend out-of-sample. Hedonic indices are thus a valuable tool for minimising out-of-sample bias by not constraining sample selection to matched items. However, this requires the use of checklists to collect data on prices and characteristics.
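The pooled regression with a country dummy can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the characteristics (`screen`, `flat`) and the 10 percent parity are hypothetical, and a semi-logarithmic functional form is assumed, which the text above does not prescribe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, illustrative data: 100 unmatched models in each of two countries.
n = 100
country_b = np.repeat([0.0, 1.0], n)                 # 0 = country A, 1 = country B
screen = rng.uniform(14, 32, size=2 * n)             # hypothetical characteristic
flat = rng.integers(0, 2, size=2 * n).astype(float)  # hypothetical 0/1 feature

true_log_parity = np.log(1.10)                       # assumed parity, illustration only
log_price = (4.0 + 0.05 * screen + 0.20 * flat
             + true_log_parity * country_b
             + rng.normal(0.0, 0.05, size=2 * n))

# Regress log price on the characteristics and the country dummy (OLS).
X = np.column_stack([np.ones(2 * n), screen, flat, country_b])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)

# The exponentiated dummy coefficient estimates the quality-adjusted parity.
parity = float(np.exp(coef[3]))
print(f"estimated parity (B relative to A): {parity:.3f}")
```

Because the characteristics enter the regression directly, no model in country A needs a matched counterpart in country B for the parity to be estimated.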

The paper is composed of two parts. The first is a more rigorous formulation of approaches to quality adjustment. In section 2 the use of imputations is outlined. Section 3 covers explicit quality adjustment, including expert judgement, quantity adjustments, option costs and hedonic adjustments. Section 4 is an analytical framework for out-of-sample bias and section 5 examines the use of hedonic indices in this context.

The second part of the paper is an empirical study. In sections 6 to 8 scanner data on three countries for television sets are used to illustrate some of the issues discussed.

The empirical section of the paper is based on scanner data for television sets, compiled from the bar-code readers of outlets in three countries: the United Kingdom, France and the Netherlands. Heravi, Heston and Silver (2001) used such data to examine alternative formulae and methods for parity adjustments across countries. The use of scanner data to explore methodological issues is well developed for CPIs (Silver, 1995 and Silver and Heravi, 2002) and this paper represents a second example of its employment for PPP purposes. In section 6 a description of the data is provided. Section 7 provides the empirical results on how coverage varies with the tightness of the specifications. Coverage falls with the number of matches as the item descriptors become increasingly tight, and the extent of this is shown for bilateral comparisons between the three countries. Details on coverage are provided in terms of the number of matches and the proportion of expenditure covered. The number of matches may be relatively high because price collectors find matched models that meet the item specification but have relatively low sales. For many outlets the price collected may be the list price on display for a model, irrespective of whether it has had a relatively small number of sales, or possibly none. The analysis is thus replicated with sales volume cut-offs. It is as if the price collectors were asked to match items only if the sales are ‘substantial,’ in the sense of the quantity cut-offs used. The implications for coverage are discussed. The analysis is also of interest because, if it is assumed that the missing items in imputations are more likely to be the smaller-selling models, the differences in the results give some insight into the potential bias. This analysis is undertaken for a range of cut-off sizes.
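The coverage calculation described above can be illustrated with a short sketch. The records, the choice of characteristics and the function name `coverage` are all hypothetical; the sketch simply counts matched specification cells and the share of country 1 expenditure they cover, as the specification is tightened and as a sales cut-off is imposed.

```python
from collections import defaultdict

# Hypothetical records: (brand, screen_size, has_stereo, price, units_sold)
country_1 = [
    ("X", 14, False, 100.0, 500),
    ("X", 21, True, 250.0, 300),
    ("Y", 21, True, 240.0, 20),
    ("Z", 28, True, 400.0, 180),
]
country_2 = [
    ("X", 14, False, 110.0, 400),
    ("X", 21, False, 230.0, 350),
    ("Y", 21, True, 260.0, 250),
]


def coverage(items_1, items_2, n_keys, min_units=0):
    """Return (number of matched cells, share of country 1 expenditure matched).

    A specification uses the first n_keys characteristics; models selling
    fewer than min_units are not eligible for matching.
    """
    def spec_cells(items):
        cells = defaultdict(float)
        for item in items:
            price, units = item[3], item[4]
            if units >= min_units:
                cells[item[:n_keys]] += price * units
        return cells

    cells_1, cells_2 = spec_cells(items_1), spec_cells(items_2)
    matched = set(cells_1) & set(cells_2)
    total = sum(p * q for *_, p, q in items_1)  # all country 1 expenditure
    share = sum(cells_1[key] for key in matched) / total
    return len(matched), share


loose = coverage(country_1, country_2, n_keys=1)   # brand only
tight = coverage(country_1, country_2, n_keys=3)   # brand, size and stereo
tight_cutoff = coverage(country_1, country_2, n_keys=3, min_units=100)
print(loose, tight, tight_cutoff)
```

In this toy data the expenditure share covered falls as the specification tightens, and falls again once low-selling models are excluded from matching.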

If the fall in coverage is substantial, this does not in itself imply a bias. Section 8 follows the model in section 4 by examining the nature and extent of differences between matched and unmatched quality-adjusted prices. This is tested by the estimation of a hedonic regression over the whole sample, with dummy variables for unmatched items. Also identified is whether the unmatched observations have different residuals from the matched ones. The definition of ‘unmatched’ depends on the tightness of specifications, and the analysis is re-run according to different degrees of disaggregation. Summary statistics on residuals are presented for matched and unmatched comparisons for each level of tightness of specification. An analysis of such residuals is shown to be essential to an understanding of the extent of bias in such price comparisons.

The empirical section concludes in section 8 with comparisons of the parity adjustments using matched data, at different levels of aggregation, and using the whole data set with hedonic indices. The use of such indices allows price comparisons to be undertaken with unmatched data as long as information on the quality characteristics is also available for each item. With matching, the price collector attempts to ensure like is compared with like; with the hedonic regression, the differences in quality are partialled out as part of the estimation. The scanner data, with their extensive coverage of transactions and inclusion of variables on the quality characteristics of each model, allow the use of all of the data in each country and control for mismatches in the estimation. Scanner data are shown to be useful for providing insights into such methodological issues, which PPP data, being constrained to the sample collected as opposed to the universe of transactions, cannot identify.

2. Imputation: assumptions and use

This method uses the price changes of other items as estimates of the price changes of the missing items. It is the computationally most straightforward of methods in this form, since the item is simply dropped from the calculation. A targeted form of the method would use similar price movements of a cell or elementary aggregate of similar items, or be based on a higher level of aggregation if either the lower level had an insufficient sample size or price changes at the higher level were judged to be more representative of the price changes of the missing items. Any stratification system used in the selection of outlets would facilitate this. However, in practice the sample of items may be too limited or different from the item(s) for which the imputation is required. An appropriate stratum is required with a sufficiently large sample size. Stratification by product area and outlet-type may be preferred to just by product area, if differences in price changes are expected for outlet types (Silver and Webb, 2001). The stratum used for the target should be based on the analyst’s knowledge of the market and an understanding of similarities of price changes between and within strata.
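A targeted imputation with a fallback to a higher level of aggregation might be sketched as below. The strata, the price relatives and the minimum sample size `min_size` are hypothetical, and the geometric mean is assumed as the averaging formula.

```python
from math import exp, log


def geo_mean(values):
    """Geometric mean of a non-empty list of positive numbers."""
    return exp(sum(log(v) for v in values) / len(values))


def impute_relative(stratum_relatives, all_relatives, min_size=3):
    """Impute a missing price relative from its stratum if the stratum sample
    is large enough; otherwise fall back to the higher-level aggregate."""
    if len(stratum_relatives) >= min_size:
        return geo_mean(stratum_relatives)
    return geo_mean(all_relatives)


# Hypothetical matched price relatives, stratified by outlet type.
relatives = {
    "specialist": [1.10, 1.12, 1.08],
    "mass_merchant": [1.02, 1.04],
}
pooled = [r for stratum in relatives.values() for r in stratum]

imputed_specialist = impute_relative(relatives["specialist"], pooled)
imputed_mass = impute_relative(relatives["mass_merchant"], pooled)
print(imputed_specialist, imputed_mass)
```

Here the specialist stratum is large enough to be used on its own, while the imputation for the mass merchant stratum falls back to the pooled mean.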

The underlying assumptions of these methods require some analysis since, as discussed by Triplett (1999 and 2002), they are often misunderstood. Consider $i = 1, \ldots, m$ items where, as before, $P_{m,A}$ is the price of item $m$ in country A. Item $m$ is unavailable in country B, so a replacement item is selected; $P_{n,B}$ is the price of the replacement item $n$ in country B. Now $n$ replaces $m$, but is of a different quality. So let $A(z)$ be the quality adjustment to $P_{n,B}$ which equates its quality services or utility to those of $P_{m,A}$, such that the quality-adjusted price is $P^*_{m,B} = A(z)P_{n,B}$. For the imputation method to work, the average price change of the $i = 1, \ldots, m$ items, including the quality-adjusted price $P^*_{m,B}$, given on the left-hand side of equation (1), must equal the average price change from just using the overall mean of the remaining $i = 1, \ldots, m-1$ items, on the right-hand side of equation (1). The discrepancy or bias from the method is the balancing term $Q$. It is the implicit adjustment that allows the method to work. The geometric formulation for one unavailable item is given by:

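$$\left[\frac{P^*_{m,B}}{P_{m,A}} \prod_{i=1}^{m-1} \frac{P_{i,B}}{P_{i,A}}\right]^{1/m} = \left[\prod_{i=1}^{m-1} \frac{P_{i,B}}{P_{i,A}}\right]^{1/(m-1)} \times Q \qquad (1)$$

$$Q = \left[\frac{P^*_{m,B}}{P_{m,A}}\right]^{1/m} \left[\prod_{i=1}^{m-1} \frac{P_{i,B}}{P_{i,A}}\right]^{-1/[m(m-1)]} \qquad (2)$$

and for x unavailable items by:

$$Q = \left[\prod_{i=1}^{x} \frac{P^*_{i,B}}{P_{i,A}}\right]^{1/m} \left[\prod_{i=1}^{m-x} \frac{P_{i,B}}{P_{i,A}}\right]^{-x/[m(m-x)]} \qquad (3)$$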

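The relationships are readily visualized if $r_1$ and $r_2$ are defined as the respective geometric means of price changes of items that continue to be recorded and of quality-adjusted unavailable items, i.e., for the geometric case where

$$r_1 = \left[\prod_{i=1}^{m-x} \frac{P_{i,B}}{P_{i,A}}\right]^{1/(m-x)}, \qquad r_2 = \left[\prod_{i=1}^{x} \frac{P^*_{i,B}}{P_{i,A}}\right]^{1/x} \qquad (4)$$

then the ratio of geometric mean biases from substituting equation (4) in (3) is:

$$Q = \left(\frac{r_2}{r_1}\right)^{x/m} \qquad (5)$$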

which equals one, so that there is no bias, when $r_1 = r_2$. The bias depends on the ratio of unavailable values and the difference between the mean of price changes for existing items and the mean of quality-adjusted replacement to unavailable price changes. The bias decreases as either $x/m$ or the difference between $r_1$ and $r_2$ decreases. Furthermore, the method is reliant on a comparison between price changes for existing items and quality-adjusted price changes for the replacement/unavailable comparison. This is more likely to be justified than a comparison without the quality adjustment to prices. For example, suppose there are m = 3 items, each with a price of 100 in country A. Let the country B prices be 120 for two items, but assume the third is unavailable, i.e., x = 1, and is replaced by an item with a price of 140, 20 of which is due to quality differences. Then the geometric bias as given in equations (4) and (5), where x = 1 and m = 3, is:

$$Q = \left[\frac{120/100}{\left(\frac{120}{100} \times \frac{120}{100}\right)^{1/2}}\right]^{1/3} = 1$$

Table 1. Example of the bias from implicit quality adjustment for r2 = 1.00 (geometric mean)
Columns: ratio of missing items, x/m. Rows: r1.
r1 / 0.01 / 0.05 / 0.1 / 0.25 / 0.5
1 / 1 / 1 / 1 / 1 / 1
1.01 / 0.999901 / 0.999503 / 0.999005 / 0.997516 / 0.995037
1.02 / 0.999802 / 0.99901 / 0.998022 / 0.995062 / 0.990148
1.03 / 0.999704 / 0.998523 / 0.997048 / 0.992638 / 0.985329
1.04 / 0.999608 / 0.998041 / 0.996086 / 0.990243 / 0.980581
1.05 / 0.999512 / 0.997563 / 0.995133 / 0.987877 / 0.9759
1.1 / 0.999047 / 0.995246 / 0.990514 / 0.976454 / 0.953463
1.15 / 0.998603 / 0.993036 / 0.986121 / 0.965663 / 0.932505
1.2 / 0.998178 / 0.990925 / 0.981933 / 0.955443 / 0.912871
1.3 / 0.99738 / 0.986967 / 0.974105 / 0.936514 / 0.877058
1.5 / 0.995954 / 0.979931 / 0.960265 / 0.903602 / 0.816497

Had the bias depended on the unadjusted price of 140 compared with 100, the method would be prone to serious error. In this calculation the direction of the bias is given by whether the ratio $r_2/r_1$ lies above or below one, and does not depend on whether quality is improving or deteriorating, i.e., whether $A(z) > 1$ or $A(z) < 1$.
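The three-item example can be checked numerically. The sketch below takes the bias factor Q to be the ratio of the geometric mean over all items (including the quality-adjusted replacement) to the geometric mean over the matched items only, which is how the text describes the balancing term; the variable names are illustrative, and the quality adjustment is applied here as a simple subtraction of the 20 attributed to quality.

```python
from math import prod


def geo_mean(values):
    """Geometric mean of a non-empty list of positive numbers."""
    return prod(values) ** (1.0 / len(values))


price_a = [100.0, 100.0, 100.0]     # country A prices, m = 3
price_b_matched = [120.0, 120.0]    # country B prices of the two matched items
replacement_price = 140.0           # country B replacement for the third item
quality_premium = 20.0              # part of 140 attributed to quality

adjusted_price = replacement_price - quality_premium
matched_rel = [b / a for b, a in zip(price_b_matched, price_a)]

# Bias factor when the replacement price is quality adjusted.
q_adjusted = geo_mean(matched_rel + [adjusted_price / price_a[2]]) / geo_mean(matched_rel)

# For contrast: the factor if the unadjusted price of 140 were used instead.
q_unadjusted = geo_mean(matched_rel + [replacement_price / price_a[2]]) / geo_mean(matched_rel)

print(q_adjusted, q_unadjusted)
```

With the quality-adjusted price the factor is one, so dropping the item introduces no bias; with the unadjusted price of 140 the factor departs from one by roughly 5 percent.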

Table 1 provides an illustration in which the (mean) price change of items that continue to exist, r1, is allowed to vary between 1 and 1.5, that is, between no price change and a 50 percent increase. The (mean) price change of the quality-adjusted country B items compared with the country A items is assumed to be nil, i.e., r2 = 1.00. The bias is given for ratios of missing values of 0.01, 0.05, 0.1, 0.25 and 0.5, for geometric means. For example, if 50 percent of price quotes are missing and the missing quality-adjusted prices do not change, but the prices of matched items increase by 5 percent, then the bias for the geometric mean is represented by the proportional factor 0.9759; i.e., instead of 1.05, the index should be 0.9759 * 1.05 = 1.0247.

Equation (5) shows that the ratio x/m and the difference between r1 and r2 determine the bias. Table 1 shows that the bias can be quite substantial when x/m is relatively large. For example, for x/m = 0.25, an inflation rate of 5 percent for existing items translates to an index change of 3.73 percent for the geometric formulation when r2 = 1.00, i.e., when quality-adjusted prices of unavailable items are constant. Ignoring the unavailable items would give a result of 1.05 instead of 1.0373. Even with 10 percent missing (x/m = 0.1), an inflation rate of 5 percent for existing items translates to 4.49 percent for the geometric formulation when r2 = 1.00. However, consider a fairly low ratio of x/m, say 0.05; then even when r2 = 1.00 and r1 = 1.20, Table 1 gives a corrected rate of inflation of 18.9 percent for the geometric formulation. In competitive markets r1 and r2 are unlikely to differ by substantial amounts, since r2 is a price comparison between the new item and the old item after adjusting for quality differences. However, for non-tradable goods and services, price differentials may be quite substantial, even when quality adjusted. If r1 and r2 are the same, then there would be no bias from the method even if x/m = 0.9. There may, however, be more sampling error.
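The entries of Table 1 and the corrected index values can be reproduced with a few lines of code. The sketch assumes the geometric bias factor takes the form (r2/r1) raised to the power x/m, which matches the tabulated values; the function name `bias_factor` is illustrative.

```python
def bias_factor(r1, r2, missing_ratio):
    """Geometric bias factor, assumed to be (r2 / r1) ** (x / m)."""
    return (r2 / r1) ** missing_ratio


r2 = 1.00  # quality-adjusted prices of unavailable items assumed constant

# Corrected index = matched-item index (r1) multiplied by the bias factor.
cases = [(1.05, 0.5), (1.05, 0.25), (1.05, 0.1), (1.20, 0.05)]
corrected = {}
for r1, ratio in cases:
    factor = bias_factor(r1, r2, ratio)
    corrected[(r1, ratio)] = r1 * factor
    print(f"r1={r1:.2f} x/m={ratio:.2f} factor={factor:.6f} "
          f"corrected index={corrected[(r1, ratio)]:.4f}")
```

The factor shrinks towards one as either the share of missing items falls or r1 approaches r2, which is the pattern visible across the rows and columns of Table 1.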