See the Attached Excel File for the Calculations

You have administered a standardized test of manual dexterity to two groups of 10 semi-skilled workers. One of these two groups of workers will be employed by you to work in a warehouse with many fragile items. The higher a worker's manual dexterity, the less likely that worker is to break significant inventory. Because of a unique contract, you must hire all 10 employees from one of the two groups and none of the employees from the other.

Group A Scores / Group B Scores
63 / 98
79 / 89
64 / 26
83 / 65
87 / 100
81 / 93
89 / 59
75 / 60
91 / 100
78 / 100

You must decide which group to choose. Choose at least two measures of central tendency and at least one measure of dispersion for each group and use those to make your choice. Be sure to justify your choice with at least one page of discussion and analysis.

See the attached Excel file for the calculations…

Two useful measures of central tendency for these data are the mean and the median. The mean is the arithmetic average of all the data points in each group. The median is the middle value within each group, meaning half the data lie above the median and half lie below it. With 10 scores per group, the median is the average of the fifth and sixth scores once they are sorted. The means for group A and group B are both 79, so they are equal. The median for group A is 80 and the median for group B is higher at 91. Based on this information alone, group B appears to have more strong candidates. However, group B also has some very low scores (one as low as 26), so it’s important to also study the dispersion of the data.
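As a quick check on the figures above, here is a minimal Python sketch that recomputes the mean and median for each group. It uses only the standard library; the variable names are my own and are not taken from the attached Excel file.

```python
from statistics import mean, median

# Scores copied from the table in the problem statement.
group_a = [63, 79, 64, 83, 87, 81, 89, 75, 91, 78]
group_b = [98, 89, 26, 65, 100, 93, 59, 60, 100, 100]

mean_a, mean_b = mean(group_a), mean(group_b)

# With an even number of scores, median() averages the two middle values.
median_a, median_b = median(group_a), median(group_b)

print("Group A: mean =", mean_a, "median =", median_a)
print("Group B: mean =", mean_b, "median =", median_b)
```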

A good measure of dispersion for these data is the range. The range is equal to the maximum value minus the minimum value. It gives us a picture of the full spread of the data, from the weakest member of the data set to the strongest. A small range means that the workers are all relatively consistent, and a large range means that they are not. The range for group A is 28 (91 – 63) and the range for group B is 74 (100 – 26). This means that the workers in group A are much more consistent than those in group B. In a warehouse with fragile items, one careless worker can do a great deal of damage. Therefore, a smaller range (given the same mean) implies a more consistent and useful set of employees.
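The range calculation can be sketched in a few lines of Python. The helper name score_range is hypothetical, and the sample standard deviation is included only as an extra cross-check of my own; it is not part of the original analysis.

```python
from statistics import stdev

# Scores copied from the table in the problem statement.
group_a = [63, 79, 64, 83, 87, 81, 89, 75, 91, 78]
group_b = [98, 89, 26, 65, 100, 93, 59, 60, 100, 100]


def score_range(scores):
    """Range = maximum value minus minimum value."""
    return max(scores) - min(scores)


range_a = score_range(group_a)  # 91 - 63
range_b = score_range(group_b)  # 100 - 26

# Extra check (an assumption of this sketch, not in the original answer):
# the sample standard deviation should point the same way as the range.
sd_a = stdev(group_a)
sd_b = stdev(group_b)

print("Range A:", range_a, "Range B:", range_b)
print("Sample SD A:", round(sd_a, 2), "Sample SD B:", round(sd_b, 2))
```

Both measures agree that group B is far more spread out than group A.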

All in all, I would go with the workers from group A. They have the same mean score as the workers in group B. Although group A’s median is lower (80 versus 91), group A is much more consistent: its range is a little over 1/3 the size of B’s range. The lowest score in group A is 63, compared with 26 in group B, so we will not have any very weak workers in the group and won’t have to worry about one person destroying a large share of the products.

You are the production manager for an operation that produces circuit boards. Each board is tested at the end of the manufacturing process. Rejected boards are discarded and have no future value. For product B-17, you have kept data on the number of rejects for the past 38 weeks.
These data are:
Number of rejected boards.
Week / Rejects
1 / 7
2 / 8
3 / 10
4 / 9
5 / 11
6 / 10
7 / 10
8 / 6
9 / 7
10 / 13
11 / 11
12 / 7
13 / 9
14 / 10
15 / 12
16 / 10
17 / 7
18 / 6
19 / 8
20 / 9
21 / 9
22 / 8
23 / 9
24 / 7
25 / 9
26 / 8
27 / 11
28 / 10
29 / 10
30 / 12
31 / 9
32 / 10
33 / 8
34 / 8
35 / 8
36 / 7
37 / 10
38 / 9
Prepare a frequency distribution chart.
Number of Boards Rejected in a Week / Frequency / Percent / Cumulative Frequency / Cumulative Percent
6 / 2 / 5.3 / 2 / 5.3
7 / 6 / 15.8 / 8 / 21.1
8 / 7 / 18.4 / 15 / 39.5
9 / 8 / 21.1 / 23 / 60.5
10 / 9 / 23.7 / 32 / 84.2
11 / 3 / 7.9 / 35 / 92.1
12 / 2 / 5.3 / 37 / 97.4
13 / 1 / 2.6 / 38 / 100.0
Total / 38 / 100.0
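The frequency distribution can be rebuilt from the weekly data with a short Python sketch. This is one possible way to tabulate it, using collections.Counter from the standard library; the variable names are illustrative and the original chart came from the attached Excel file.

```python
from collections import Counter

# Weekly reject counts for weeks 1 through 38, copied from the data above.
rejects = [7, 8, 10, 9, 11, 10, 10, 6, 7, 13, 11, 7, 9, 10, 12, 10, 7, 6, 8,
           9, 9, 8, 9, 7, 9, 8, 11, 10, 10, 12, 9, 10, 8, 8, 8, 7, 10, 9]

counts = Counter(rejects)
n = len(rejects)

# Each row: value, frequency, percent, cumulative frequency, cumulative percent.
rows = []
cumulative = 0
for value in sorted(counts):
    cumulative += counts[value]
    rows.append((value, counts[value], 100 * counts[value] / n,
                 cumulative, 100 * cumulative / n))

for value, freq, pct, cum, cum_pct in rows:
    print(f"{value} / {freq} / {pct:.1f} / {cum} / {cum_pct:.1f}")
```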
What is the Standard Deviation?
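Standard Deviation = 1.6765 (this is the sample standard deviation, which divides by n – 1 = 37; the mean is 9 rejects per week)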
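The standard deviation can be checked by hand with the sketch below. It assumes the sample formula (dividing by n – 1), which reproduces the 1.6765 figure; the population formula (dividing by n) is shown only for comparison and gives a slightly smaller value.

```python
from math import sqrt
from statistics import mean

# Weekly reject counts for weeks 1 through 38, copied from the data above.
rejects = [7, 8, 10, 9, 11, 10, 10, 6, 7, 13, 11, 7, 9, 10, 12, 10, 7, 6, 8,
           9, 9, 8, 9, 7, 9, 8, 11, 10, 10, 12, 9, 10, 8, 8, 8, 7, 10, 9]

n = len(rejects)
avg = mean(rejects)

# Sum of squared deviations from the mean.
sum_sq = sum((x - avg) ** 2 for x in rejects)

sample_sd = sqrt(sum_sq / (n - 1))   # divides by n - 1
population_sd = sqrt(sum_sq / n)     # divides by n, for comparison only

print("mean =", avg)
print("sample SD =", round(sample_sd, 4))
print("population SD =", round(population_sd, 4))
```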

What is the approximate probability that at least 7 rejected boards will be produced next week?
Be sure to clearly show how you arrived at your conclusion.
The cumulative percent column of the frequency distribution gives the answer. The only value below 7 in the 38 weeks of data is 6 rejects, so:
Prob(x ≥ 7) = 1 – Prob(x ≤ 6)
= 1 – Prob(x = 6)
= 1 – 0.053
= 0.947
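The same estimate can be obtained by counting directly, as in this small Python sketch. It treats the 38 observed weeks as the empirical distribution, which is the approach taken above; the variable names are my own.

```python
# Weekly reject counts for weeks 1 through 38, copied from the data above.
rejects = [7, 8, 10, 9, 11, 10, 10, 6, 7, 13, 11, 7, 9, 10, 12, 10, 7, 6, 8,
           9, 9, 8, 9, 7, 9, 8, 11, 10, 10, 12, 9, 10, 8, 8, 8, 7, 10, 9]

# Count the weeks with at least 7 rejects and divide by the number of weeks.
weeks_at_least_7 = sum(1 for x in rejects if x >= 7)
prob_at_least_7 = weeks_at_least_7 / len(rejects)

print(weeks_at_least_7, "of", len(rejects), "weeks")
print("Prob(x >= 7) is about", round(prob_at_least_7, 3))
```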
Look at the data below for the income levels and prices paid for cars for ten people:
Annual Income Level / Amount Spent on Car
38,000 / 12,000
40,000 / 16,000
117,000 / 41,000
17,000 / 3,500
23,000 / 6,500
79,000 / 21,000
33,000 / 5,000
66,000 / 8,000
15,000 / 1,500
52,000 / 6,000
Answer the following questions:
  1. What kind of correlation do you expect to find between annual income and amount spent on car? Will it be positive or negative? Will it be a strong relationship? Base your answer on your personal guess as well as by looking through the data.
I would expect a positive relationship between annual income and the amount spent on a car. I don’t think the relationship will be very strong, since I know many people who don’t make much money yet have expensive cars, and vice versa: many people who make quite a bit of money yet have cheaper cars. The data seem to agree with the positive direction: in general, the more people make, the more expensive their car.
  2. What is the direction of causality in this relationship - i.e. does having a more expensive car make you earn more money, or does earning more money make you spend more on your car? In other words, define one of these variables as your dependent variable (Y) and one as your independent variable (X).
Causality runs from income to spending: earning more money leads you to spend more on your car. Going the other direction wouldn’t make any sense. That makes income the independent variable (X) and the amount spent on the car the dependent variable (Y).
  3. What method do you think would be best for testing the relationship between your dependent and independent variable, ANOVA or regression? Explain your reasoning thoroughly with a discussion of both methods.
A regression would be the best way to look at the relationship between these two variables. Regression is used to analyze the relationship between an independent variable and a dependent variable when both are numeric. You can determine a linear regression equation and use it as a predictor of the Y value, given an X value.
ANOVA, on the other hand, is used for comparing the mean values of several groups of data, typically three or more. You can use it to determine whether the group means are all equal or not. Here we have no groups to compare, just one pair of numeric values for each person, so this is not an application of ANOVA.
  4. Go to this calculation page and enter in your data in the X and Y columns (don't use commas, enter 8,000 as 8000). Then click on the button "Y=MX+B". Then click on the "graph" button. Write out your equation as calculated, along with your coefficients. Discuss the significance and interpretation of this result, and discuss your graph.
The equation is:
y = 0.329224x - 3752.76
r = 0.88895

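The r coefficient of 0.889 shows that there is a strong positive linear relationship between the variables. Squaring it gives r² ≈ 0.79, so income accounts for about 79% of the variation in the amount spent on a car.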

The slope of about 0.33 says that, on average, each extra dollar of income goes with an additional 33 cents spent on the car.

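The y-intercept of -3752.76 is the predicted spending for a person with no income at all. A negative amount has no real-world interpretation, so the practical reading is that such a person won’t spend any money on a car. The lowest income in the data is 15,000, so the line should not be trusted that far below the observed range.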

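The graph shows that our regression line runs close to most of the data points, and that there are no strong outliers. The largest miss is the person who earns 66,000 but spent only 8,000, roughly 10,000 less than the line predicts.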
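The equation and r value reported by the calculation page can be reproduced with the least-squares formulas, as in the sketch below. It is written from scratch with the standard library only, so it does not reflect how the calculation page itself works, and the variable names are my own. The residuals are included to show how far each point sits from the line.

```python
from math import sqrt

# Data copied from the table above: X = annual income, Y = amount spent on car.
income = [38000, 40000, 117000, 17000, 23000, 79000, 33000, 66000, 15000, 52000]
spent = [12000, 16000, 41000, 3500, 6500, 21000, 5000, 8000, 1500, 6000]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(spent) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in income)
syy = sum((y - mean_y) ** 2 for y in spent)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, spent))

slope = sxy / sxx                      # m in y = mx + b
intercept = mean_y - slope * mean_x    # b in y = mx + b
r = sxy / sqrt(sxx * syy)              # Pearson correlation coefficient

# Residual = actual spending minus the spending predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(income, spent)]
worst = min(range(n), key=lambda i: -abs(residuals[i]))

print(f"y = {slope:.6f}x + ({intercept:.2f})")
print(f"r = {r:.5f}, r squared = {r * r:.3f}")
print("Largest residual:", round(residuals[worst]), "at income", income[worst])
```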