Faculty of Business Administration

Review Questions

M248 Analyzing Data

Dear Students,

The review questions for M248 were prepared by Dr. Mansour Batainah (Kuwait Branch) and Mr. Nabil Rahal (Lebanon Branch). As the course team responsible for ensuring successful delivery of the course, we have compiled these questions to guide your preparation for the M248 final examination. We fully understand that the course materials are difficult to cover within a semester, and that this is the first time the course has been offered at AOU. It is our sincere intention to help you complete the course successfully, and it is your duty to put in the effort needed to understand how to answer these questions. I am confident that your tutors and branch course chair will spare no effort or time in helping you with any course-related problems you may encounter. Please note that the final examination will include four parts, each of which has five questions. Most of the questions are straightforward applications of the concepts you have studied, but some require more thought. Above all, bear in mind that the review questions can help you pass the course but do not guarantee high grades. High achievers need to allot time for solving the exercises and activities presented in the four textbooks of the course. Finally, I strongly recommend that you devote some time to studying the TMAs before the final exam.

I wish you the best of luck.

Abdulkarim Al-Eisa

M248 Course Chair

AOU Headquarters, Kuwait

1. The following data are the weights of cats in kg:

1.3 1.5 2.2 1.7 2.2 2.9 3.9 4.3

a) Find the mode, median and range of the data.

b) Calculate the lower quartile.

c) Calculate the mean and deduce the skewness of the data.
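As a way of checking a hand calculation for question 1, the following minimal sketch computes the summary statistics with Python's standard library. The quartile convention is an assumption: it places the lower quartile at position (n + 1)/4 with linear interpolation, which is what `method="exclusive"` does. Other textbooks use slightly different rules.

```python
import statistics

# Cat weights in kg, copied from question 1.
weights = [1.3, 1.5, 2.2, 1.7, 2.2, 2.9, 3.9, 4.3]

mode = statistics.mode(weights)
median = statistics.median(weights)
data_range = max(weights) - min(weights)
mean = statistics.mean(weights)

# Assumed convention: lower quartile at position (n + 1) / 4, linearly interpolated.
lower_quartile = statistics.quantiles(weights, n=4, method="exclusive")[0]

print("mode:", mode)
print("median:", median)
print("range:", round(data_range, 2))
print("lower quartile:", round(lower_quartile, 2))
print("mean:", round(mean, 2))

# A mean larger than the median is a common sign of right (positive) skew.
print("mean > median:", mean > median)
```

Comparing the mean with the median is only a rough indicator of skewness; a plot of the data is a useful cross-check.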

2. Compare the following two boxplots:

3. Compare the following two boxplots:

4. What is the best graphical representation of the following data sets? Justify your answer.

Category (by year) / 1990 / 1995 / 2000 / 2005 / 2010
Nurses / 23 / 26 / 30 / 35 / 37
Doctors / 13 / 15 / 20 / 25 / 27
Others / 30 / 28 / 33 / 37 / 50

5. State whether each of the given functions is a probability mass function or not. Justify your answer.

5. For what condition on p is the Bernoulli distribution with parameter p a discrete uniform distribution?

6. Explain why the probability mass function of a Geometric distribution with parameter p, is defined as

7. The probability distribution of the random variable X is given in the following table:

x / -1 / 0 / 2 / 4
P(X = x) / 0.2 / 0.1 / 0.3 / 0.4

Find the mean and the standard deviation of the random variable X.
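The mean and standard deviation of a discrete random variable follow directly from the definitions E(X) = Σ x p(x) and V(X) = Σ (x − E(X))² p(x). The sketch below applies them to the table in question 7, assuming the first row holds the values of X and the second row the corresponding probabilities.

```python
import math

# Values and probabilities read from the table in question 7
# (assumed layout: first row = values of X, second row = probabilities).
values = [-1, 0, 2, 4]
probs = [0.2, 0.1, 0.3, 0.4]

# A valid probability distribution must sum to 1.
total_probability = sum(probs)

mean = sum(x * p for x, p in zip(values, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
std_dev = math.sqrt(variance)

print("sum of probabilities:", round(total_probability, 10))
print("mean:", round(mean, 4))
print("variance:", round(variance, 4))
print("standard deviation:", round(std_dev, 4))
```

The same pattern works for any finite probability table, provided the probabilities sum to 1.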

8. The probability that a machine produces a defective item is 0.01. N items are selected at random.

A. Let X be the random variable that represents the number of defective items in a sequence of 10 selected items. What is the distribution of X? Calculate the probability that exactly two items are defective.

B. Let Y denote the number of items up to and including the first defective one.

a. What is the distribution of Y?

b. Calculate the probability that 5 items are selected before the first defective item is selected.

c. Calculate E(Y) and V(Y).
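Question 8 combines a binomial count with a geometric waiting time. The sketch below evaluates both, assuming X is binomial with n = 10 and p = 0.01, and Y is geometric on 1, 2, 3, ... with the same p. It also assumes that "5 items are selected before the first defective item" means the first defective item is the sixth one, so Y = 6. The helper names `binomial_pmf` and `geometric_pmf` are illustrative, not part of any library.

```python
import math

p = 0.01   # probability that an item is defective
n = 10     # number of items inspected in part A


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) random variable."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(y, p):
    """P(Y = y) for a geometric(p) random variable starting at 1."""
    return (1 - p) ** (y - 1) * p


# Part A: exactly two defective items out of ten.
p_two_defective = binomial_pmf(2, n, p)

# Part B: five good items, then the first defective one (assumed to mean Y = 6).
p_first_defective_sixth = geometric_pmf(6, p)

# Mean and variance of the geometric distribution.
mean_y = 1 / p
var_y = (1 - p) / p**2

print("P(X = 2):", round(p_two_defective, 6))
print("P(Y = 6):", round(p_first_defective_sixth, 6))
print("E(Y):", round(mean_y, 2), "V(Y):", round(var_y, 2))
```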

9. A perfect die is rolled. If the result is 6 we record 1, otherwise we record 0.

A. Let X be the random variable that represents the recorded number, write the range of X. What is the distribution of X?

B. Let Y denote the number of sixes obtained in a sequence of 12 rolls.

a. What is the probability distribution of Y?

b. Calculate the probability that exactly two rolls are required to obtain the face 6.

c. Find the mean and the variance of the number of sixes resulting from the 12 rolls.

10. A survey conducted for a department store showed that 60% of its customers are food department customers, and the remaining customers are clothing department customers.

  1. For a sample that consists of 10 customers, what is the probability that six of these customers will be food customers?
  2. What is the probability that the first clothing department customer will be customer number eight?

11. Answer the following multiple choice questions by circling the answer that you believe is the correct one.

  1. For a standard normal distribution:

a / The mean is 0
b / The standard deviation is 1
c / The variance is 1
d / All of the above.
  2. If the normal distribution is used to model the variation in the population, then:

a / Approximately 95% of the population is within 1.645 standard deviations of the mean.
b / Approximately 95% of the population is within 1.96 standard deviations of the mean.
c / Approximately 95% of the population is within 2.576 standard deviations of the mean.
d / Approximately 95% of the population is within 1.225 standard deviations of the mean.
  3. Assume that the grades of AOU students are normally distributed with a mean of 66 and a standard deviation of 2.5. What is the probability that a randomly drawn student will have a grade exceeding 70?

a / 0.0548
b / 0.5124
c / 0.00
d / 1.00
  4. In the standard normal distribution, what is the probability that the z value is more than 0.00 and less than 0.80?

a / 0.500
b / 0.7881
c / 0.2881
d / 0.4881
  5. For a binomial random variable X, the parameters are n = 25 and ρ = 0.5. The result of the normal approximation of the probability P(X < 10) is:

a / 0.000
b / 0.115
c / 0.120
d / 0.125
  6. If the random variable X follows a normal distribution with mean = 2.45 and standard deviation s = 2.03, then an approximate 95% confidence interval for the population mean of X is:

a / (2.61, 2.93)
b / (2.38, 2.74)
c / (2.29, 2.61)
d / None of the above
  7. The normal distribution is a

a / Discrete distribution
b / Continuous distribution.
c / Positively skewed distribution
d / None of the above.
  8. The total area under a normal curve is

a / 0
b / 2
c / 1
d / 4
  9. If Z is a random variable that follows a standard normal distribution, the probability that the value of Z is lower than zero is:

a / 0.00
b / 0.50
c / 1.00
d / 2.00
  10. The central limit theorem states that as sample size increases, the population distribution more closely approximates a normal distribution.

a / True
b / False

12. For a sequence of Bernoulli trials, the probability of success ρ is supposed to be 0.40. The results of a sequence of 20 Bernoulli trials are shown in the following table:

0 / 0 / 1 / 0 / 0 / 1 / 1 / 0 / 0 / 0
1 / 0 / 1 / 1 / 0 / 0 / 0 / 0 / 0 / 0

The null and alternative hypotheses are as follows

a) Find a 95% confidence interval for ρ.

b) What is the conclusion of the test at the 0.05 significance level?

13. The results of an accounting examination were normally distributed with a mean 45 and a standard deviation of 15.

a) What is the probability of getting a score above 90?

b) What is the probability of getting a score between 30 and 75?
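Normal probabilities such as those in question 13 are usually read from the standard normal table after standardising. As a cross-check, the sketch below uses `statistics.NormalDist` from Python's standard library with the stated mean of 45 and standard deviation of 15.

```python
from statistics import NormalDist

# Exam scores modelled as normal with mean 45 and standard deviation 15.
scores = NormalDist(mu=45, sigma=15)

# a) P(score > 90) = 1 - P(score <= 90); 90 is three standard deviations above the mean.
p_above_90 = 1 - scores.cdf(90)

# b) P(30 < score < 75) = P(score <= 75) - P(score <= 30), that is P(-1 < Z < 2).
p_between_30_and_75 = scores.cdf(75) - scores.cdf(30)

print("P(score > 90):", round(p_above_90, 5))
print("P(30 < score < 75):", round(p_between_30_and_75, 4))
```

Table values rounded to two decimal places in z can differ from these figures in the last digit.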

14. If X is a random variable that follows a normal distribution with mean 20 and variance 5, and Y is another random variable given by

Y = 5X – 10

what are the mean and variance of the random variable Y?

15. The random variable X is normally distributed with mean 6 and standard deviation 0.4.

a) What is the distribution of the random variable Y which is given by?

b) Calculate its mean and variance.

c) Calculate the probability

16. A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all similar textbooks in the market. The research department at the company took a sample of 36 similar textbooks and collected information on their prices. This information produced a mean price of $48.40 for this sample and a standard deviation of $4.50. Construct a 90% confidence interval for the mean price of all such college textbooks.

17. The mean age of all CEOs (chief executive officers) for major corporations in the United States was 48 years in 1991. A random sample of 25 CEOs taken recently from major corporations showed a mean age of 46 years with a standard deviation of 5 years. Assume that the ages of all CEOs have an approximate normal distribution.

Test at the 1% significance level whether the current mean age of all CEOs is different from that in 1991.

(Hint: )

18. The data given below are the differences in lengths of cuckoo eggs in the nests of two species of birds.

Subject / 1 / 2 / 3 / 4 / 5 / 6
Difference (in cm) / -2 / 1 / 2 / 3 / 4 / 7

Use a Wilcoxon signed rank test on the given data to investigate the hypothesis that there is no difference in the median lengths for the two species. When performing the test, use the normal approximation to the Wilcoxon test statistic to obtain an approximate p-value. Interpret the result.
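The steps of the Wilcoxon signed rank test with a normal approximation can be sketched as follows: rank the absolute differences (tied values share the average of their ranks), sum the ranks of the positive differences to get w+, then compare w+ with its null mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24. This sketch applies no continuity correction and no tie correction to the variance, which is an assumption; some texts include them. The function name `signed_rank_normal_approx` is illustrative.

```python
import math
from statistics import NormalDist


def signed_rank_normal_approx(differences):
    """Return (w_plus, z, two_sided_p) for the Wilcoxon signed rank test."""
    nonzero = [d for d in differences if d != 0]  # zero differences are dropped
    n = len(nonzero)
    ordered = sorted(abs(d) for d in nonzero)

    # Average rank for each distinct absolute difference (handles ties).
    rank_of = {}
    for value in set(ordered):
        positions = [i + 1 for i, v in enumerate(ordered) if v == value]
        rank_of[value] = sum(positions) / len(positions)

    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(rank_of[abs(d)] for d in nonzero if d > 0)

    # Null mean and variance of w_plus (no tie or continuity correction).
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p_value


# Differences from the cuckoo egg table.
w_plus, z, p_value = signed_rank_normal_approx([-2, 1, 2, 3, 4, 7])
print("w+ =", w_plus, " z =", round(z, 3), " p =", round(p_value, 3))
```

With only six observations the normal approximation is rough, so the resulting p-value should be interpreted with caution.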

19. For a given data set, the mean is equal to 22.5 and the mode is 25.2. What can you say about the skewness of the data?

20. An experiment was undertaken to examine the association between smoking and heart disease. A random sample of 1000 smokers was taken, and 750 of them were found to have heart disease. Use these data to determine a 99% confidence interval for the proportion of smokers who have heart disease.
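A large-sample confidence interval for a proportion has the form p̂ ± z √(p̂(1 − p̂)/n). The sketch below evaluates it for the figures in question 20, taking z as the 0.995 quantile of the standard normal distribution (about 2.576). It is the standard normal-approximation interval, which is assumed to be the method intended here.

```python
import math
from statistics import NormalDist

n = 1000          # smokers sampled
successes = 750   # smokers in the sample with heart disease

p_hat = successes / n

# Two-sided 99% interval: 0.5% in each tail, so use the 0.995 quantile.
z = NormalDist().inv_cdf(0.995)

standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

print("z:", round(z, 3))
print("99% CI for the proportion:", (round(lower, 4), round(upper, 4)))
```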

21. Given: . Suppose that X and Y are independent.

a) What is the distribution of Y? Calculate its parameters.

b) What is the distribution of Y-X? Calculate its parameters.

c) Find the probability

22. The probability distribution of the random variable X is given as follows:

X / 1 / 2 / 3 / 4 / 5
P(X) / 0.30 / 0.15 / 0.60 / 0.40 / 0.20

Find the mean and variance of X.

23. Compare the two box plots A and B.

24. The prices of all houses in New York State have a probability distribution with a mean of $157,000 and a standard deviation of $29,500. Let x̄ be the mean price of a sample of 400 houses selected from New York State.

a) What is the probability that the mean price obtained from this sample will be within $3000 of the population mean?

b) What is the probability that the mean price obtained from this sample will be lower than the population mean by $2500 or more?
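Because the sample in question 24 is large (n = 400), the central limit theorem suggests treating the sample mean as approximately normal with mean μ and standard deviation σ/√n. The sketch below works under that assumption; the population itself is not stated to be normal.

```python
import math
from statistics import NormalDist

mu = 157_000     # population mean price in dollars
sigma = 29_500   # population standard deviation in dollars
n = 400          # sample size

# Standard deviation of the sample mean (the standard error).
standard_error = sigma / math.sqrt(n)

# Approximate sampling distribution of the sample mean (central limit theorem).
sample_mean = NormalDist(mu=mu, sigma=standard_error)

# a) Sample mean within $3000 of the population mean.
p_within_3000 = sample_mean.cdf(mu + 3000) - sample_mean.cdf(mu - 3000)

# b) Sample mean lower than the population mean by $2500 or more.
p_lower_by_2500 = sample_mean.cdf(mu - 2500)

print("standard error:", standard_error)
print("P(within $3000):", round(p_within_3000, 4))
print("P(lower by $2500 or more):", round(p_lower_by_2500, 4))
```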

25. The U.S. Bureau of Labor conducts surveys quite often to collect information on the labor market. According to one such survey, the workers employed in manufacturing industries in the United States earned an average of $466.42 per week in 1992. Assume that this mean is based on a random sample of 1000 workers selected from the manufacturing industries and that the standard deviation of weekly earnings for this sample is $70.

Find a 99% confidence interval for the mean weekly earnings of all workers employed in manufacturing industries in 1992.

26. A researcher measured the heights of 15 students selected randomly from a normal population of M248 students. The sample mean height and the sample variance are 172 cm and 24, respectively. The purpose of this study is to test if the mean height of M248 students is 170 cm.

The hypothesis is given by

a) Identify a suitable test statistic for a fixed level test and state its distribution. Find the rejection region and state your conclusion.

b) Calculate the p-value of the test, and then re-write your conclusion.

27. The data given below are the percentages of gold content in rings from a set of 6 identical rings.

Ring gold % / 5.8 / 6.1 / 6.6 / 6.4 / 5.8 / 6

Suppose that we want to investigate whether it is plausible that the rings could come from a population where the median gold content is 6. Test the hypothesis using the Wilcoxon signed rank test. Use the normal approximation to find the p-value. Interpret your result.

28. The following sample of size four was drawn from a population which has a Bernoulli distribution with unknown parameter p

4 2 5 1

Write down the maximum likelihood estimate of p.

29. The following sample of size five was drawn from a population with a continuous uniform distribution where is an unknown parameter

1.29 1.92 2.19 0.29 1.029

Write down the maximum likelihood estimate of .

30. SAT tests are usually designed so that people's scores have a variance of 2.6. A new SAT test is given to a random sample of 20 people taken from a normal distribution. The sample variance of their scores is 1.75. Perform a two-sided fixed-level test of the hypothesis that the population variance is 2.6 using a significance level of 0.05.

31. The figure below represents data on the monthly income and the number of cars owned for 1000 people for 7 countries of the Middle East.

a) Write down the explanatory variable and the response variable.

b) What can you say about the relationship between the income and the number of cars owned?

32. A linear regression model is used to model the relationship between the two variables x (explanatory variable) and y (response variable).

Given that:

a) Calculate the equation of the least squares line for the data.

b) Calculate the Pearson correlation coefficient r between the two variables.

c) How strong is the relationship between them?

33. The equation of the least squares line for a given data set is

Given that:

Calculate a 95% confidence interval for the mean response for.

34. If X is a random variable that follows a normal distribution with mean 16 and variance 8, and Y is another random variable given by

Y = 3X + 22

find the mean and variance of the random variable Y.

35. A doctor wants to investigate whether there is a relationship between sex and eye color. He took a sample of 200 persons and classified each person as having black or green eyes and as male or female. The results are summarized in the following table:

Sex / Black eyes / Green eyes / Total
Male / 70 / 20 / 90
Female / 80 / 30 / 110
Total / 150 / 50 / 200

Carry out a chi-squared test for no association between the sex and eye color. Report your conclusion.
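For a two-by-two table such as the one in question 35, the chi-squared statistic compares each observed count O with its expected count E = (row total × column total)/grand total and sums (O − E)²/E. The sketch below uses only the four inner cell counts from the table. With one degree of freedom the p-value can be obtained from the normal distribution, because a chi-squared variable with 1 degree of freedom is the square of a standard normal variable; that shortcut does not extend to larger tables.

```python
import math

# Observed counts: rows = male, female; columns = black eyes, green eyes.
observed = [[70, 20],
            [80, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under the hypothesis of no association.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (count - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1) = 1 for a 2 x 2 table.
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# For 1 degree of freedom only: P(chi-squared > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_squared / 2))

print("chi-squared:", round(chi_squared, 4))
print("degrees of freedom:", degrees_of_freedom)
print("p-value:", round(p_value, 4))
```

The statistic can also be compared with the tabulated 5% critical value for 1 degree of freedom, 3.841.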

36. An economist launched a study to examine the relationship between income level and consumption expenditures. A random sample of 10 people was selected, and the data were recorded as in the following table.

Person / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9 / 10
Income / 200 / 300 / 400 / 500 / 600 / 700 / 800 / 900 / 1000 / 1500
Consumption / 180 / 260 / 350 / 420 / 500 / 560 / 600 / 650 / 800 / 1000

Calculate the two parameters of the simple regression model. Find the regression function, draw the regression line, and interpret your answer.
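The two parameters of the simple regression model are the slope, Sxy/Sxx, and the intercept, ȳ − slope × x̄. The sketch below computes them for the income and consumption data, treating income as the explanatory variable (an assumption that matches the wording of the question). Drawing the line is left out to keep the example dependency-free.

```python
# Data from the table in question 36.
income = [200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500]
consumption = [180, 260, 350, 420, 500, 560, 600, 650, 800, 1000]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(consumption) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, consumption))
s_xx = sum((x - x_bar) ** 2 for x in income)

slope = s_xy / s_xx                 # change in consumption per unit of income
intercept = y_bar - slope * x_bar   # fitted consumption at zero income

print("x_bar:", x_bar, "y_bar:", y_bar)
print("slope:", round(slope, 4))
print("intercept:", round(intercept, 2))


def predict(x):
    """Fitted consumption for a given income under the least squares line."""
    return intercept + slope * x


print("fitted consumption at income 1000:", round(predict(1000), 1))
```

The slope can be read as the estimated increase in consumption for each additional unit of income within the observed range of incomes.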

37. Use the following set of data to estimate the simple linear regression function. Draw the regression line which fits the data and interpret your answer.

Maintenance Expenditures / 0 / 100 / 200 / 300 / 400 / 500 / 600 / 1000
Machine working life / 4 / 6 / 8 / 10 / 12 / 13 / 14 / 16

38. If Z is a random variable that follows a standard normal distribution:

a. What is the mean of the random variable Z?

b. What is the variance of the random variable Z?

c. What is the probability that the value of Z is equal to or lower than 1.28?

d. What is the probability that the value of Z is more than 1.56?

39. Suppose that X is a random variable which is modeled by a normal distribution with mean 50 and standard deviation 4. What is the probability that X is at least 53?

40. The variation in exam scores of students was modeled by a random variable R which follows a normal distribution with mean 100 and standard deviation 225.

a. What is the proportion of scores between 80 and 120?

b. What is the proportion of scores that are at least 90?

c. What is the proportion of scores that are more than 100?

41. Garden fertiliser is packed into plastic bags which are nominally ‘1.5 kg’. The net weights of these bags are known to follow a normal distribution with a mean weight of 1.55 kg and a standard deviation of 0.05 kg. A bag is selected at random. What is the probability that the bag contains less than 1.6 kg? (Hint: use the Standard Normal table).

42. A normal model was proposed for the distribution of the marks of Arab Open University students in an Economics course, with mean 72 and standard deviation 12. Use this model to calculate the following:

a. The proportion of students whose marks are less than 64.

b. The proportion of students whose marks are between 62 and 80.

c. The proportion of students whose marks are more than 85.

43. The random variable X follows a binomial distribution with n = 16 and p = 0.5. The true value of the probability P(12 ≤ X ≤ 15) is 0.0384. Use a normal distribution to find an approximate value of the probability.
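The normal approximation to a binomial probability replaces X with a normal variable having mean np and variance np(1 − p), and widens the range by half a unit at each end (the continuity correction). The sketch below reads the range in question 43 as 12 ≤ X ≤ 15, an assumption that is consistent with the quoted exact value of 0.0384, and compares the exact and approximate results.

```python
import math
from statistics import NormalDist

n, p = 16, 0.5

# Exact binomial probability P(12 <= X <= 15).
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(12, 16))

# Approximating normal distribution with matching mean and variance.
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

# Continuity correction: 12 <= X <= 15 becomes 11.5 <= Y <= 15.5.
approximate = approx_dist.cdf(15.5) - approx_dist.cdf(11.5)

print("exact:", round(exact, 4))
print("normal approximation:", round(approximate, 4))
```

Without the continuity correction the approximation would be noticeably further from the exact value for a sample this small.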

44. A sample of 122 cars in a particular renting agency was selected. For each car, the number of times that it had been rented in the preceding six months was counted. The sample mean and standard deviation of number of rents were 1.992 and 1.394, respectively.

a. What is the approximate 90% confidence interval for the mean number of rents (M) during the last six months?

b. Use the calculated confidence interval to test the null hypothesis that M is 2.5 against the alternative hypothesis that M is not 2.5.

c. What is the significance level of the test?

d. What is the conclusion of the test?

45. The following table shows the effects of a quantitative skills course on different groups of students.

Level / Improved / Did not improve / Total
Level 1 / 15 / 12 / 27
Level 2 / 13 / 5 / 18
Level 3 / 12 / 3 / 15
Total / 40 / 20 / 60

Calculate the value of the chi-squared test statistic for this set of data.

46. A tax inspector observes the spending of a representative sample of 25 shoppers at a convenience store. Mean spending from this sample is KD20 with a standard deviation of 10. Calculate the 99 percent confidence limits for the sample mean as an estimate of mean spending for all the store’s customers. (Hint: 25 is a small sample).

47. A survey conducted for Arab Open University showed that 80% of the students are tea drinkers, and the remaining students do not drink tea.

  1. For a sample that consists of 20 students, what is the probability that fifteen of them will be tea drinkers?
  2. What is the probability that the first student who drinks tea will be student number 10?

48. Use the following information to answer this question:

The profits, in KD’000, from a random sample of eight weeks for two connected shops X and Y, were found to be:

Week / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8
Shop X / 13 / 22 / 21 / 16 / 21 / 17 / 25 / 23
Shop Y / 29 / 26 / 33 / 34 / 36 / 38 / 28 / 31
  1. What is the median weekly profit for shop Y?
  2. What is the estimated mean of all weekly profits for shop X?
  3. What is the estimated standard deviation of the weekly profit of shop X?
  4. What is the skewness of the weekly profit of shop Y?

49. The probability distribution of the random variable X is given as follows:

x / 6 / 12 / 10 / 8 / 5 / 4 / 14 / 20
p(x) / 0.15 / 0.15 / 0.20 / 0.05 / 0.10 / 0.03 / 0.15 / 0.17

Find the mean and standard deviation of X.

50. The random variable X is normally distributed with mean 6 and variance 4. Let

a. What is the distribution of Y? Calculate its mean and variance.

b. Calculate the probability.

51. The weight of a sample of 100 men was measured. The sample mean was found to be 174 kg, and the sample standard deviation was 20. Calculate an exact 95% confidence interval for the mean weight of men.