Fathom and Curve of best fit for Normal Data
Investigating the Age of the Wealthy.
1. Use data from DASL by visiting: Select Data Subjects > Economics > Next 10 > Billionaires 92 Datafile (if the instructions do not work, then use the following link: .)
2. Import the data into Fathom (Method 2). Right-click on the URL and copy it. In Fathom, from the File menu, select Import from URL. Right-click on the address box and select Paste. Select OK. A case box should appear with gold balls in it. Select the table of values and check that the data was imported. If not, use Method 1 from the cancer activity.
3. Graph wealth depending on age. To make an accurate scatter plot, there must only be numbers in the table of values. If you scroll down to case 98, there is an asterisk or a blank in the cell. This is another type of data clean-up that you will have to check for. Delete all cases that have at least one asterisk or blank, and delete the graph too.
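Outside Fathom, the clean-up rule in step 3 can be sketched in Python. This is only an illustration: the function name drop_incomplete and the sample rows are made up for the example and are not taken from the Billionaires 92 file.

```python
def drop_incomplete(rows):
    """Keep only the rows in which every cell holds a usable value."""
    def is_missing(cell):
        # Treat None, a blank, or a lone asterisk as a missing value.
        return cell is None or str(cell).strip() in ("", "*")

    return [row for row in rows if not any(is_missing(cell) for cell in row)]


# Hypothetical (wealth, age) rows, as they might look straight after import.
rows = [("2.5", "64"), ("1.8", "*"), ("1.3", ""), ("4.0", "71")]
clean = drop_incomplete(rows)
print(clean)
```

Only the rows with a number in every cell survive, which is the same effect as deleting the flagged cases by hand.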
I. Analysing One-Variable Data
4. Select a graph and place it on the desktop. Drag age and put it on the x-axis. Notice the shape of the dot plot. This looks like a normal distribution. To check, let's calculate the mean and median of the data. If they are approximately equal, the distribution is roughly symmetric, which is consistent with (but does not prove) a normal distribution.
5. Right-click on the graph. Select Plot Value. Type mean(age) and press OK. Repeat this process to calculate the median. Notice that they are approximately the same and that they are in the middle of the dot plot.
6. Change this graph to a histogram by clicking on Dot Plot and selecting Histogram. It is easy to see the mode if you double-click on the x-scale and change binWidth to 1. The mode is 68. The age interval is shown at the bottom of the screen. This age data is not exactly normal. To return to a computer-centred graph, select Rescale Graph Axes from the Graph menu.
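The checks in steps 4 to 6 (mean, median, and the mode from a histogram with bin width 1) can be reproduced with Python's standard library. The ages below are made-up values chosen for the sketch, not the worksheet data.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical ages; substitute the cleaned age column from the data file.
ages = [42, 55, 58, 61, 64, 64, 67, 70, 73, 86]

age_mean = mean(ages)
age_median = median(ages)

# With a bin width of 1, each bin is a single age, so the tallest bin is the mode.
counts = Counter(ages)
mode_age, mode_count = counts.most_common(1)[0]

print("mean:", age_mean, "median:", age_median)
print("mode:", mode_age, "occurs", mode_count, "times")
```

A mean and median that nearly coincide only indicate symmetry, so the later normal quantile plot is still needed as a second check.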
II. To obtain a visual confirmation that the data follows a Normal Distribution, look at a Normal Quantile Plot
7. A Normal Quantile Plot shows the distribution of continuous numeric data. It plots each case against the z-score (the difference between a value and the mean, divided by the standard deviation) associated with the percentile of that case. If the data are normal, the points should fall close to a straight line.
8. Change your histogram to a normal quantile plot by clicking on Histogram and selecting Normal Quantile Plot. Notice that your age data are very close to the straight line shown on the plot.
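The idea behind the normal quantile plot can be sketched in Python as follows. The plotting position (i - 0.5) / n used here is one common convention and is an assumption; Fathom may use a slightly different formula. The ages are made-up values.

```python
from statistics import NormalDist


def normal_quantile_points(values):
    """Pair each sorted value with the z-score of its approximate percentile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: the i-th smallest value sits at percentile (i - 0.5) / n.
    return [
        (std_normal.inv_cdf((i - 0.5) / n), value)
        for i, value in enumerate(ordered, start=1)
    ]


# Hypothetical ages; substitute the cleaned age column from the data file.
ages = [42, 55, 58, 61, 64, 64, 67, 70, 73, 86]
points = normal_quantile_points(ages)
for z, age in points:
    print(f"{z:6.2f}  {age}")
```

Plotting age against z for these pairs gives the same kind of picture as Fathom's plot: the closer the points lie to a straight line, the closer the data are to normal.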
III. To check whether the distribution of the data has the following properties of a Normal Probability distribution
50% of the data falls on each side of the mean
About 68% of the data falls within one standard deviation of the mean
About 95% of the data falls within two standard deviations of the mean
About 99.7% of the data falls within three standard deviations of the mean
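The 68%, 95% and 99.7% figures above come from the standard normal curve, and can be checked with a short Python sketch using the standard library:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of a normal distribution lying within k standard deviations of the mean.
shares = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```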
9. Convert the graph to a dot plot where each dot represents one data value. Select plot value from the Graph menu. To compute the sample standard deviation for the data enter the formula s(age). Select OK.
10. To sort the age values: Select the Age column in the table. Right click on the table and select Sort Ascending.
11. Count the data in each interval and divide by 225 to get the percentage within each interval. For example, for the interval within one standard deviation on either side of the mean, from mean(age) - s(age) to mean(age) + s(age), shift-select the desired interval in the table: click on case 35, press Shift, and click on case 193. Notice that the selected entries are counted at the bottom of the screen and highlighted on the graph. Divide this count of 159 by 225. So, about 70.67% of the data are in the interval mean(age) ± s(age). Notice again that these data are not exactly normal. Repeat this process to calculate the percentage of data in the two larger intervals, mean(age) ± 2 s(age) and mean(age) ± 3 s(age).
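The counting in steps 9 to 11 can also be sketched in Python. The helper name fraction_within and the sample ages are made up for illustration; only the 159 out of 225 figure comes from the worksheet.

```python
from statistics import mean, stdev


def fraction_within(values, k):
    """Fraction of values within k sample standard deviations of the mean."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation, like s(age) in Fathom
    low, high = centre - k * spread, centre + k * spread
    inside = sum(1 for value in values if low <= value <= high)
    return inside / len(values)


# Hypothetical ages; substitute the cleaned age column from the data file.
ages = [42, 55, 58, 61, 64, 64, 67, 70, 73, 86]
for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(ages, k):.2%}")

# The worksheet's own count: 159 of the 225 cases lie within one standard deviation.
worksheet_percent = round(159 / 225 * 100, 2)
print(worksheet_percent)  # 70.67
```

Because the function uses the unrounded mean and standard deviation, comparing its result with a hand count based on rounded values is one way to explore the rounding question raised later in the worksheet.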
IV. The Normal Curve: Plotting a normal curve on top of a histogram
12. With a histogram showing, choose Scale->Density from the Graph menu. This will normalize the area of the histogram to one unit squared so a normal curve will match with it. Choose Plot Function from the Graph menu.
13. In the formula editor, type: normalDensity(x, mean(age), s(age)). Click OK.
14. Remember to save your work before attempting to print it.
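For reference, the curve plotted in step 13 is the normal density with the sample mean and standard deviation as its parameters. A minimal Python sketch of that formula follows; the function name normal_density and the values of mu and sigma are hypothetical stand-ins for mean(age) and s(age).

```python
import math
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical stand-ins for mean(age) and s(age).
mu, sigma = 64.0, 11.6

peak = normal_density(mu, mu, sigma)
one_sd_out = normal_density(mu + sigma, mu, sigma)
print(f"height at the mean: {peak:.4f}")
print(f"height one SD away: {one_sd_out:.4f}")

# Cross-check against the standard library's implementation.
library_value = NormalDist(mu, sigma).pdf(70)
```

The total area under this curve is 1, which is why the histogram has to be on the Density scale before the two can be compared.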
Select the text tool: What bias may result from rounding off the mean and standard deviations for the calculations in step 11? Reflect how this data could be useful: What is the data telling us?
V. Further: Using Sliders to Analyse Data
15. Create another graph on the desktop and put age on the x-axis and wealth on the y-axis.
16. Bring down 4 sliders from the toolbar (two icons left of the A). Label them a, b, k, and d.
17. Right-click on the graph. Select Plot Function. Click on the + sign beside Functions. Click on the + sign beside Distributions. Click on the + sign beside Normal. Double-click on normalDensity (the description of this tool is at the bottom of the Expression for function screen).
18.Type in the letters a, k, b, and d as shown on the screen capture below. Select OK.
19. Using the sliders, try to fit a curve of best fit. Drag the slider on the scale to change the value of a, b, k, and d. Remember to save your work before attempting to print it.
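The slider exercise can be imitated in Python to see what each setting does and to compare two settings numerically. The screen capture showing the exact formula is not available here, so the form used below, a * normalDensity(k * (x - d)) + b, is an assumption about how the four letters are placed, and the data points are generated for the sketch rather than taken from the file.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def curve(x, a, k, d, b):
    """Assumed slider formula: a * normalDensity(k * (x - d)) + b."""
    # a stretches vertically, k compresses horizontally,
    # d shifts left/right, and b shifts up/down.
    return a * STD_NORMAL.pdf(k * (x - d)) + b


def sum_squared_error(points, a, k, d, b):
    """Total squared vertical distance between the points and the curve."""
    return sum((y - curve(x, a, k, d, b)) ** 2 for x, y in points)


# Made-up (age, wealth) points generated from known settings,
# so that the quality of a fit can be checked.
true_settings = dict(a=10.0, k=0.1, d=65.0, b=1.0)
points = [(x, curve(x, **true_settings)) for x in range(40, 91, 5)]

good = sum_squared_error(points, **true_settings)
worse = sum_squared_error(points, a=10.0, k=0.1, d=50.0, b=1.0)
print("error with matching sliders:", good)
print("error with d moved to 50:", worse)
```

Dragging a slider in Fathom corresponds to changing one argument here; a smaller sum of squared errors means the curve sits closer to the points.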
Select the text tool: Reflect how this data could be useful: What is the data telling us?
Create a question that could be investigated by analysing the data.
Note: The current list for the wealthiest 100 folks: . Select Continue to site.