Specimen Statistics Examination Paper (Consultancy and Research Methods)

This paper is an example of the sort of paper you should expect. The actual paper in the summer is likely to cover a slightly different range of topics.

If you wish, you will be able to take one side (A4 size) of notes into the examination.

Answer all questions. The time allowed is 1 hour.

Ten cars (of the same type) were selected at random from the adverts in a paper. The age, miles driven, and the price of these ten cars are given below.

Age (years) / Miles (000) / Price (£)
6 / 36 / 3500
6 / 36 / 3200
6 / 36 / 3700
2 / 22 / 7400
2 / 5 / 6200
5 / 31 / 4200
4 / 22 / 5400
5 / 39 / 4600
1 / 9 / 7400
4 / 27 / 4500

(36 in the Miles (000) column means that the car has done 36,000 miles.)

This data was then analysed using the Excel Regression Tool:

SUMMARY OUTPUT OF REGRESSION ANALYSIS

These results are an edited version of the output of the Regression Tool in Excel. Variable 1 is Age (years), Variable 2 is Miles (000s), and the dependent (Y) variable is Price (£).

R Square / 0.9565
Standard Error / 364.3
Observations / 10

Term / Coefficient / p-value / Lower 95% / Upper 95%
Intercept / 8179.3 / (not shown) / 7472.1 / 8886.5
X Variable 1 (slope) / -1077.0 / 0.0002 / -1445.0 / -708.9
X Variable 2 (slope) / 47.4 / 0.0934 / -10.3 / 105.1

(Lower 95% and Upper 95% are the limits of the 95% confidence interval for each coefficient.)

  1. Calculate the median and inter-quartile range of the prices of the cars in the sample. (15 marks)
  2. Sketch (draw roughly) a scatter diagram of Miles (000) against Age, and use it to estimate the approximate value of a correlation coefficient (either Kendall’s or Pearson’s). Answers within 0.5 of a correct answer are acceptable. (15 marks)
  3. Sketch a histogram to show the distribution of the prices, and discuss whether you would expect the distribution of a larger sample from a similar source to follow the normal distribution. (15 marks)
  4. Use the regression model to predict the price of a car that is 3 years old and has done 20,500 miles. (15 marks)
  5. Discuss what conclusions can be drawn from this model about the impact of Age and Miles on the price of a car. Your answer should include a discussion of how accurate your answer is likely to be. (40 marks)

Notes on answers

  1. The median is £4550. My answer for the lower quartile is £3600 and for the upper quartile £6800, so the inter-quartile range is £3200. (Different conventions give slightly different quartiles: reasonable answers for the lower quartile range from £3600 to £3950, and for the upper quartile from £5800 to £6800, so any answer for the inter-quartile range from £1850 to £3200 would be acceptable.)
  2. The scatter diagram should show the correlation is positive, so a good guess would be +0.5. (The Pearson correlation is 0.9, and Kendall’s is 0.7.)
  3. For the histogram the obvious class intervals are £3000-£3999, £4000-£4999, etc. There is insufficient data to be sure whether the distribution is normal or not: good answers would explain what the normal distribution is, and then point out that the pattern is not quite normal but the sample is very small …
  4. The prediction should be £5920.
  5. Age has a negative impact – every extra year leads to a reduction in the price of £1077, if other variables (like Miles) are unchanged. The impact of Miles seems to be positive – each extra thousand miles leads to an extra £47.40 on the price, if other variables are unchanged. Good answers would go on to discuss the implications of the p-values, confidence intervals and R squared for the accuracy of the results, and the fact that the two variables are closely related (see question 2), which makes the results difficult to interpret (cars that are older have almost certainly done more miles). Very, very good answers might include an estimate of the likely inaccuracy in a prediction.
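
The figures in notes 1, 2 and 4 can be checked with a short script. The following is a minimal sketch in Python (the paper itself uses Excel, so the language, the variable names and the helper functions pearson and predict_price are assumptions made for illustration). It uses two quartile conventions from Python's statistics module, neither of which is the one behind the £3600 and £6800 quoted in note 1, but both fall inside the acceptable ranges. The last step, taking about two standard errors as the likely inaccuracy of a prediction, is only a rough rule of thumb and is not stated in the paper.

```python
import math
import statistics

# Data copied from the table in the paper.
ages = [6, 6, 6, 2, 2, 5, 4, 5, 1, 4]                                   # years
miles = [36, 36, 36, 22, 5, 31, 22, 39, 9, 27]                          # thousands of miles
prices = [3500, 3200, 3700, 7400, 6200, 4200, 5400, 4600, 7400, 4500]   # pounds

# Question 1: median and inter-quartile range of the prices.
median_price = statistics.median(prices)
q_excl = statistics.quantiles(prices, n=4, method="exclusive")  # [Q1, median, Q3]
q_incl = statistics.quantiles(prices, n=4, method="inclusive")
iqr_excl = q_excl[2] - q_excl[0]
iqr_incl = q_incl[2] - q_incl[0]


def pearson(xs, ys):
    """Pearson correlation coefficient, written out from its definition."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Question 2: correlation between Age and Miles.
r_age_miles = pearson(ages, miles)

# Question 4: prediction from the regression coefficients in the summary output.
INTERCEPT = 8179.3
SLOPE_AGE = -1077.0    # pounds per extra year
SLOPE_MILES = 47.4     # pounds per extra thousand miles


def predict_price(age_years, miles_thousands):
    """Price predicted by the fitted model (hypothetical helper name)."""
    return INTERCEPT + SLOPE_AGE * age_years + SLOPE_MILES * miles_thousands


predicted = predict_price(3, 20.5)   # 20,500 miles is 20.5 thousand

# Assumption: a rough margin of about two standard errors either side.
STANDARD_ERROR = 364.3
rough_margin = 2 * STANDARD_ERROR

print(f"Median price: {median_price:.0f}")
print(f"Quartiles (exclusive): {q_excl[0]:.0f}, {q_excl[2]:.0f}; IQR {iqr_excl:.0f}")
print(f"Quartiles (inclusive): {q_incl[0]:.0f}, {q_incl[2]:.0f}; IQR {iqr_incl:.0f}")
print(f"Pearson correlation, Age vs Miles: {r_age_miles:.2f}")
print(f"Predicted price: {predicted:.0f}, give or take roughly {rough_margin:.0f}")
```

The prediction works out as 8179.3 - 1077.0 x 3 + 47.4 x 20.5 = 5920, matching note 4, and the Pearson correlation rounds to the 0.9 quoted in note 2.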
