Practice Problems for Applied Statistics Midterm #1

1)What is the difference between a population and a sample? What is our objective in examining samples?

2)Using the unemployment data in the Excel spreadsheet P1DATA.XLS (linked to the course website), create:

a)a frequency distribution

b)a relative frequency distribution

c)a percent frequency distribution

d)a cumulative relative frequency distribution

e)a cumulative percent frequency distribution

f)a histogram

g)an ogive

h)a stem-and-leaf display (use only the years 1981-1984 for this)
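The tabular parts of problem 2 can also be checked outside Excel. The following is a minimal Python sketch, not a required method: the `rates` values are hypothetical placeholders (not the contents of P1DATA.XLS), and both the helper name `frequency_table` and the choice of four equal-width classes are assumptions made for illustration.

```python
from itertools import accumulate

# Hypothetical unemployment rates (%), NOT the values in P1DATA.XLS.
rates = [5.8, 7.1, 7.6, 9.7, 9.6, 7.5, 7.2, 7.0, 6.2, 5.5, 5.3, 5.6]


def frequency_table(data, num_classes):
    """Return one row per equal-width class: (lower, upper, freq, rel, cum_rel)."""
    low, high = min(data), max(data)
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for x in data:
        # The maximum value is placed in the last class rather than a new one.
        index = min(int((x - low) / width), num_classes - 1)
        counts[index] += 1
    n = len(data)
    rel = [c / n for c in counts]
    cum_rel = list(accumulate(rel))
    return [
        (low + i * width, low + (i + 1) * width, counts[i], rel[i], cum_rel[i])
        for i in range(num_classes)
    ]


table = frequency_table(rates, 4)
for lower, upper, freq, rel, cum in table:
    print(f"{lower:5.2f} to {upper:5.2f}  freq={freq:2d}  rel={rel:.3f}  "
          f"pct={100 * rel:5.1f}%  cum_rel={cum:.3f}  cum_pct={100 * cum:5.1f}%")
```

The frequency column supplies the bar heights for the histogram, and the cumulative columns plotted against the upper class limits give the ogive.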

3)Using the unemployment data from problem 2,

a)what is the mean?

b)what is the median?

c)what is the mode?

d)what is the range?

e)what is the interquartile range?

f)what is the five-number summary?

g)what is the variance?

h)what is the standard deviation?

i)what is the z-score of the smallest observation?

j)what is the z-score of the largest observation?

k)create a table that lists the z-score for every item
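As a rough cross-check for problem 3, the measures above can be computed with Python's standard `statistics` module. This is only a sketch: the `data` values are hypothetical (not taken from P1DATA.XLS), the variance is the sample variance (dividing by n - 1), and the quartiles use the module's default "exclusive" method, which may differ slightly from the quartile convention used in class or by Excel.

```python
import statistics

# Hypothetical data, NOT the values in P1DATA.XLS.
data = [5.8, 7.1, 7.6, 9.7, 9.6, 7.5, 7.2, 7.0, 6.2, 5.5, 5.3, 5.6]

mean = statistics.mean(data)
median = statistics.median(data)
modes = statistics.multimode(data)             # all values tie if none repeats
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)   # default "exclusive" method
iqr = q3 - q1
five_number = (min(data), q1, median, q3, max(data))
variance = statistics.variance(data)           # sample variance (n - 1)
std_dev = statistics.stdev(data)               # sample standard deviation

# z-score of each observation: distance from the mean in standard deviations.
z_scores = [(x - mean) / std_dev for x in data]

print(f"mean={mean:.3f} median={median:.3f} modes={modes}")
print(f"range={data_range:.3f} IQR={iqr:.3f} five-number={five_number}")
print(f"variance={variance:.3f} std dev={std_dev:.3f}")
for x, z in zip(data, z_scores):
    print(f"{x:5.1f}  z={z:+.3f}")
```

The z-scores of the smallest and largest observations are the minimum and maximum of `z_scores`, since the z-score is an increasing function of the observation.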

4)Are there any unemployment outliers in the data used in problem 2? Justify your answer carefully.

5)Using the unemployment and interest rate data, create:

a)a scatterplot

b)a crosstabulation with five equal-width classes for each variable

c)Comment on the relationship between the two variables. Justify your answer by referring to the scatterplot and crosstabulation.

6)Using the unemployment and interest rate data,

a)what is the covariance?

b)what is the correlation coefficient?

c)Comment on the relationship between the two variables. Justify your answer by referring to the covariance and correlation coefficient.
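For problem 6, the sample covariance and correlation coefficient can be sketched in plain Python as below. The paired values are hypothetical and are not the course data; the function names `sample_covariance` and `correlation` are illustrative, and the sample (n - 1) form of the covariance is assumed.

```python
import math

# Hypothetical paired observations, NOT the course data.
unemployment = [5.8, 7.1, 7.6, 9.7, 9.6, 7.5]
interest = [11.2, 13.4, 16.6, 12.3, 9.1, 10.2]


def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations divided by n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Correlation coefficient: covariance divided by both standard deviations."""
    std_x = math.sqrt(sample_covariance(x, x))
    std_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (std_x * std_y)


print(f"covariance  = {sample_covariance(unemployment, interest):.4f}")
print(f"correlation = {correlation(unemployment, interest):.4f}")
```

The covariance depends on the units of both variables, while the correlation coefficient is unit-free and always lies between -1 and +1, which is why the two are interpreted differently.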

7)Using the interest rate data and the classes formed in problem 5,

a)create a table of grouped interest rate data

b)what is the mean of the grouped data?

c)what is the variance of the grouped data?

d)what is the standard deviation of the grouped data?
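The grouped-data calculations in problem 7 treat every observation in a class as if it sat at the class midpoint. A minimal sketch of that approach follows; the class limits and frequencies are hypothetical (not derived from the course data), the name `grouped_stats` is illustrative, and the sample (n - 1) variance is assumed.

```python
import math

# Hypothetical classes as (lower limit, upper limit, frequency).
classes = [(4.0, 6.0, 3), (6.0, 8.0, 5), (8.0, 10.0, 2)]


def grouped_stats(classes):
    """Return (mean, variance, standard deviation) using class midpoints."""
    n = sum(freq for _, _, freq in classes)
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    freqs = [freq for _, _, freq in classes]
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
    return mean, variance, math.sqrt(variance)


mean, variance, std_dev = grouped_stats(classes)
print(f"grouped mean={mean:.3f} variance={variance:.3f} std dev={std_dev:.3f}")
```

Because the midpoints stand in for the actual values, these results generally differ from the statistics of the ungrouped data.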

8)Compare the values calculated in problem 7 to the mean, variance, and standard deviation of the full interest rate data. Comment on the potential errors when we have grouped data.

9)Consider the following sales data:

Month / Sales / Month / Sales
January / $200 / July / $140
February / $190 / August / $150
March / $200 / September / $140
April / $180 / October / $120
May / $170 / November / $110
June / $170 / December / $90

a)Create a histogram of the data that might mislead people to believe that sales are generally increasing.

b)Create a time series plot that might mislead people to believe that sales are generally increasing.

c)Based on your answers to a) and b), what advice would you give to people who review business plans for potential investment?

d)Briefly comment on the relationship between transparency, ethics, and legality.

10)Evaluate the validity of the following statement: “A cutoff rule of z = +/-3 should be used to determine outliers”. If you disagree, comment on how one might determine an appropriate outlier rule.

11)Evaluate the validity of the following statement: “Ethical behavior demands that we present data in such a way that it is accurate and complete, but not transparent”.

12)Briefly comment on how outliers might cause descriptive statistics to be misleading.

13)Consider four sets A, B, C, and D such that A∩B≠∅, A∩C≠∅, A∩D=∅, B∩C≠∅, B∩D=∅, C∩D=∅, and A∩B∩C=∅. Draw a Venn diagram depicting this situation.

14)A door-to-door salesman has examined historical data on his success given the sex of the person who answers the door. 74% of the time, a woman answers the door. He has also noted the following:
P(sale)=0.3 (i.e., the salesman makes a sale at 30% of the houses he approaches)
P(sale & woman)=0.19 (i.e., 19% of the time, a woman answers the door and a sale follows).
What is the probability of getting a sale given that a man answers the door?
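One way to set up problem 14 is sketched below. It assumes that every door is answered by either a woman or a man, so that P(man) = 1 - P(woman) and P(sale & man) = P(sale) - P(sale & woman); the variable names are illustrative.

```python
# Values given in the problem statement.
p_woman = 0.74
p_sale = 0.30
p_sale_and_woman = 0.19

# Assumption: a man answers whenever a woman does not.
p_man = 1 - p_woman
# A sale happens either with a woman or with a man at the door.
p_sale_and_man = p_sale - p_sale_and_woman
# Definition of conditional probability.
p_sale_given_man = p_sale_and_man / p_man

print(f"P(man) = {p_man:.2f}")
print(f"P(sale & man) = {p_sale_and_man:.2f}")
print(f"P(sale | man) = {p_sale_given_man:.3f}")
```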

15)A bank screens credit applicants based on three factors: current debt, income, and prior payment history. 40% of all applicants are rejected. 15% of applicants fail the debt test. 20% of applicants fail the income test. 5% of applicants fail the payment history test. You know that a certain customer applied and was rejected. What is the probability that the customer was rejected due to low income? Comment on your ability to answer the question if 30% of all applicants are rejected (and the other numbers are the same).

16)Suppose that telemarketing sales are dependent on two factors: weather (when it’s raining, more people are home) and time of day (if you call during prime time, people are less likely to answer the phone). Those factors are independent. It rains with probability 0.1 and prime time constitutes 40% of the normal calling hours. A telemarketer can make 10 calls per hour. The net profit (including everything except telemarketer wages) per successful call is $9 and the probabilities of success on a given call are as follows.

/ Raining / Not Raining
Prime Time / 0.25 / 0.15
Not Prime Time / 0.3 / 0.2

Telemarketers charge $15 per hour. What is the expected profit per hour of calling? Should you implement a restricted calling plan? If so, what would you recommend? What is the expected profit per hour of calling under the new plan?
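The expected-value calculation in problem 16 can be organized as in the sketch below. It rests on several assumptions that the problem does not state outright: the telemarketer is paid only for hours actually spent calling, a "restricted calling plan" means calling only in the weather and time-of-day conditions where the expected hourly profit is positive, and the weather is known before each calling hour. The names used are illustrative.

```python
# Values given in the problem statement.
P_RAIN = 0.1
P_PRIME = 0.4
CALLS_PER_HOUR = 10
PROFIT_PER_SALE = 9.0
WAGE_PER_HOUR = 15.0

# Success probability per call, keyed by (prime_time, raining).
SUCCESS = {
    (True, True): 0.25,
    (True, False): 0.15,
    (False, True): 0.30,
    (False, False): 0.20,
}


def condition_probability(prime_time, raining):
    """Probability of a condition; weather and time of day are independent."""
    p_time = P_PRIME if prime_time else 1 - P_PRIME
    p_weather = P_RAIN if raining else 1 - P_RAIN
    return p_time * p_weather


def hourly_profit(prime_time, raining):
    """Expected profit for one hour of calling under a given condition."""
    sales = CALLS_PER_HOUR * SUCCESS[(prime_time, raining)]
    return sales * PROFIT_PER_SALE - WAGE_PER_HOUR


conditions = list(SUCCESS)

# Calling in every condition: weight each hourly profit by its probability.
unrestricted = sum(condition_probability(*c) * hourly_profit(*c) for c in conditions)

# Assumed restricted plan: call only when the expected hourly profit is positive.
profitable = [c for c in conditions if hourly_profit(*c) > 0]
share_called = sum(condition_probability(*c) for c in profitable)
restricted = sum(
    condition_probability(*c) * hourly_profit(*c) for c in profitable
) / share_called

print(f"Unrestricted expected profit per hour: ${unrestricted:.2f}")
print(f"Conditions kept under the restricted plan: {profitable}")
print(f"Restricted expected profit per calling hour: ${restricted:.3f}")
```

Dividing by `share_called` expresses the restricted result per hour actually spent calling, which is the natural comparison with the unrestricted figure under the wage assumption above.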