SC2002, Baltimore—Education Program—Fathom Software Workshop

Tim Hendrix—Department of Mathematics & Computer Science
Meredith College, Raleigh, NC 27607, 11/17/02

Modeling & Analysis with Fathom Software


Dynamic Statistics™ Software
for Deeper Understanding
Software Authors: William Finzer, Tim Erickson, & Jill Binker; Lead Programmer: Kirk Swenson

Fathom is a statistical software package that promotes exploration, investigation, and understanding through discovery. Its vision for statistics education is similar to the vision that Geometer’s Sketchpad has brought to the dynamic study of geometry. The software is appropriate for both secondary and post-secondary study, and can be used in a variety of courses.

Website:

In this workshop, we will provide a “hands-on” overview of Fathom. The goal is not to provide detailed instruction in every feature of the software package; rather, the goal is to demonstrate ways in which the software can be used to promote modeling and visualization. I have used the software with both pre-service & in-service mathematics and science teachers to promote conceptual understanding of data analysis, and concepts of model building.

With Fathom software, one can:

View data in multiple representations: as attributes of individual cases, in “spreadsheet-like” tables, in summary tables, and in graphs.

Bring data into Fathom files in multiple ways: entering data directly in spreadsheet format; importing data from various text files (e.g., tab-delimited, comma-separated); copying and pasting from other files; and, most excitingly, importing data directly from the Internet!

Create contingency tables: cross-tabulate data intuitively.

Create dynamic graphs and explore data graphically: the graphs help students understand which types of graphs are appropriate for various forms of data. If “unlocked”, the data may be altered through direct manipulation of the graph; if “locked”, the data are not alterable.

Create a variety of function plots: for linear regression, fit a manual moveable line showing the sum of squares, a median-median line, or a least squares line; build more complex mathematical models with an extensive collection of functions; plot specific values; plot most probability distributions; establish model parameters with sliders.

Perform a variety of statistical analyses: compute descriptive statistics, estimate population parameters, and test hypotheses.

Investigate probability and statistics dynamically: collect random samples with replacement, collect measures from a sample, and create simulations.
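
The sampling idea in the last item can also be sketched outside Fathom. The following is a minimal Python illustration, not Fathom's own implementation; the population values and the function name `collect_measures` are made up for this example. It draws repeated samples with replacement and collects the mean of each sample as a measure.

```python
import random
import statistics

def collect_measures(population, sample_size, num_samples, seed=0):
    """Draw samples with replacement and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    measures = []
    for _ in range(num_samples):
        sample = rng.choices(population, k=sample_size)  # with replacement
        measures.append(statistics.mean(sample))
    return measures

# Hypothetical population: ages of ten people (invented values).
population = [23, 35, 41, 19, 52, 67, 30, 28, 45, 38]
measures = collect_measures(population, sample_size=5, num_samples=200)
print("mean of the sample means:", round(statistics.mean(measures), 2))
print("population mean:", statistics.mean(population))
```

Raising `num_samples` should pull the mean of the sample means closer to the population mean, which is the kind of behavior a simulation lets students watch.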

The essential basics of Fathom:

Collections: Fathom displays a collection of cases as a set of gold balls; each gold ball represents one case. For example, each individual subject in a human study is a case, and the variables about which data are collected are attributes.

One can scroll through each case in the data file by double-clicking on a collection; this view is called the Inspector. It is similar to a database view and from here, one can create measures, make comments, and control advanced features, such as sampling.

Data are summarized in a case table, which is similar to a spreadsheet view of the data. Each row is a case and each column is an attribute.

Graphs are displayed inside a separate window within the file. In fact, any of the views (a collection, case table, inspector, graph, or summary table) can be collapsed to an icon, such as a graph icon, by dragging the bottom right corner to the top left corner. This gives a Fathom file the feel of a desktop workspace with moveable, collapsible windows inside of the file.

Sliders allow the user to set values for the coefficients in a model, in order to explore the effects of changing specific parameters.

Summary tables are contingency tables or cross-tabulation tables that are easily built by dragging and dropping attributes into summary rows or columns.

Exploration # 1: Population Data

Goal of activity: Examining data collections, data inspectors, and case tables; creating summary tables and basic graphs

Fathom software comes with hundreds of data sets ready to be explored. Many data sets come from census files.

Under File, choose Open… and search for “BeverlyHills.ftm”. Note the extension for Fathom files (.ftm). The file is either loaded under presentation examples on the conference computer or located in the Sample Documents sub-directory under the Fathom program files; otherwise, download the file from

Once you have opened the file, drag the corners of the collection to collapse and expand the collection. Double-click on a gold ball to open the Case Inspector. Use the arrows at the bottom to scroll through the cases. What attributes have been collected for each case?

With the collection highlighted, drag a case table from the Tool Bar to the file window. Whoosh! The data are displayed in a spreadsheet-like table.

Drag a graph icon from the Tool Bar to the file window. Notice that it plots nothing at first. To graph data, drag an attribute to either the horizontal or vertical axis. Notice the drop-down menu in the upper right corner of the graph window. This menu indicates the various types of graphs that are possibly appropriate for the type of variables the user has chosen to graph. For example, if charting “gender”, the software won’t allow the possibility of a histogram.

Explore by graphing various attributes. We’ll look at a few different attributes in this file.

How to split a chart: When charting a single attribute, you may drag a categorical variable to the other axis to disaggregate the data within the graph.

Drag a summary table from the Tool Bar to the file window. As with the graph, the summary table is empty initially. Drag “sex” to the arrow pointing downward to the summary rows. What would be an appropriate attribute to cross-tabulate with gender? Try dragging “race” to the arrow pointing right to the summary columns.
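
For readers who want to see what a cross-tabulation computes, here is a small Python sketch. It is an illustration only, not how Fathom builds its summary tables; the cases and attribute values below are invented and are not taken from the BeverlyHills file, and `cross_tabulate` is a hypothetical helper name.

```python
from collections import Counter

def cross_tabulate(cases, row_attr, col_attr):
    """Count the cases for each (row value, column value) pair."""
    return Counter((case[row_attr], case[col_attr]) for case in cases)

# Hypothetical cases; the values are invented for illustration.
cases = [
    {"sex": "F", "race": "White"},
    {"sex": "M", "race": "Asian"},
    {"sex": "F", "race": "Asian"},
    {"sex": "F", "race": "White"},
    {"sex": "M", "race": "White"},
]

table = cross_tabulate(cases, "sex", "race")
rows = sorted({r for r, _ in table})
cols = sorted({c for _, c in table})

# Print a simple text version of the summary table.
print("sex".ljust(6) + "".join(c.ljust(8) for c in cols))
for r in rows:
    print(r.ljust(6) + "".join(str(table[(r, c)]).ljust(8) for c in cols))
```

Each cell is just a count of the cases that share one row value and one column value, which is why only categorical attributes make sense in the rows and columns.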

Point of this exploration: Fathom is easy to use and intuitive, and the nature of its features lends itself to exploring the data and understanding the differences between categorical and continuous data.

Exploration # 2: Importing Data from the Internet

Goal of activity: To import datafiles directly from websites on the Internet and to explore linear regression

Open a new Fathom file, and minimize it.

Open Internet Explorer or your preferred internet browser. Go to the following URL: which leads to the online Data and Story Library, a famous repository of datafiles.

Click on the link Data Subjects, which leads to a laundry list of data categories. Click on Education, then on Reading Test Scores Datafile. Scroll down the page and examine the data file. Go to the Location Bar, highlight the URL, and Copy it (under Edit or Control-C).

Pull up the blank Fathom file, go to File, and select Import from URL. A dialog box will appear, and Paste the URL of the Reading Test Scores Datafile into the URL box. Click OK, and see what happens.

Pull down a case table, and scan the data. Pull down a graph, and plot PRE1 versus POST1, creating a scatter plot. Is there a positive correlation between the data? Would you choose to fit a regression line to these data?
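
The question about positive correlation can be checked numerically as well as visually. The sketch below computes the Pearson correlation coefficient in plain Python. The scores are hypothetical stand-ins, not the values in the Reading Test Scores datafile, and `pearson_r` is a name chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical pretest and posttest scores (invented values).
pre = [4, 6, 9, 12, 16, 15, 14, 12]
post = [3, 5, 9, 8, 15, 13, 13, 12]

r = pearson_r(pre, post)
print("r =", round(r, 3))
```

A value of r near +1 suggests that a line with positive slope is a reasonable first model; a value near 0 suggests that a line would explain little.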

Under the Graph menu, choose Moveable Line. Drag it around and through the data. Again, under Graph menu, choose Show Squares. See if you can minimize the Sum of Squares. Once again, under Graph, choose Least Squares Line. Deselect the Show Squares option.
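
What Show Squares adds up can be written out directly. The following Python sketch, with made-up data and hypothetical function names, compares the sum of squared residuals for a hand-placed line against the least squares line. It is meant as a companion to the dragging exercise, not as a description of Fathom's internals.

```python
def sum_of_squares(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

guess = (1.5, 1.0)                  # a line placed by hand
best = least_squares_line(xs, ys)   # the line the formula gives

print("hand-placed line:", round(sum_of_squares(xs, ys, *guess), 3))
print("least squares line:", round(sum_of_squares(xs, ys, *best), 3))
```

No hand-placed line can produce a smaller total than the least squares line, which is what the dragging exercise lets students discover for themselves.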

Under Graph, select the Median-Median Line. What is the difference between these two regression lines? Which one is more robust? Drag a data point around. Which line is affected more, the least squares line or the median-median line?
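
The robustness question can also be explored in code. The sketch below uses one common textbook formulation of the median-median line (three groups by x, a median point for each group, slope from the outer two, intercept averaged over all three); Fathom's exact procedure may differ in its details. The data are invented, and "dragging" a point is simulated by changing one y value.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def median_median_line(xs, ys):
    """Return (slope, intercept) of a median-median line (needs >= 3 points)."""
    points = sorted(zip(xs, ys))
    n = len(points)
    base, extra = divmod(n, 3)
    # Two leftover points go to the outer groups; one leftover goes to the middle.
    outer = base + (1 if extra == 2 else 0)
    groups = [points[:outer], points[outer:n - outer], points[n - outer:]]
    summary = [
        (statistics.median([x for x, _ in g]), statistics.median([y for _, y in g]))
        for g in groups
    ]
    (xl, yl), (xm, ym), (xr, yr) = summary
    slope = (yr - yl) / (xr - xl)
    intercept = (yl + ym + yr - slope * (xl + xm + xr)) / 3
    return slope, intercept

# Invented data lying exactly on y = 2x + 1.
xs = list(range(1, 10))
ys = [2 * x + 1 for x in xs]
ys_dragged = ys[:-1] + [60]  # simulate dragging the last point far upward

print("least squares, before and after:",
      least_squares_line(xs, ys), least_squares_line(xs, ys_dragged))
print("median-median, before and after:",
      median_median_line(xs, ys), median_median_line(xs, ys_dragged))
```

In this example the dragged point changes the least squares slope noticeably, while the median-median line does not move, because the median of the right-hand group is unaffected by one extreme value.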

Open a New File. Return to Internet Explorer, and visit the following URL: Select, on this page, People Estimates. On the next page, choose from the drop-down menu, “Annual Series through 2001 by state” and click GO. This leads to a datafile. Again, copy the URL and return to Fathom. Practice importing this file directly from the internet.

Cleaning the data: After importing the census file, there is some mild cleaning to do. The columns need to be named as appropriate attributes, and a couple of cases need to be deleted. Overall, though, importing datafiles is rather pain-free.
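
The same kind of cleaning can be sketched in Python for comparison. The text below is a made-up, tab-delimited stand-in for an imported file; it does not reproduce the census file's actual layout or figures, and the attribute names and the helper `clean` are hypothetical.

```python
import csv
import io

# Made-up tab-delimited text: a title row, a row of unhelpful column
# names, two data rows, and a summary row that is not a real case.
raw_text = (
    "Population estimates (made-up sample)\t\t\n"
    "V1\tV2\tV3\n"
    "State A\t1000\t1010\n"
    "State B\t2500\t2480\n"
    "Total\t3500\t3490\n"
)

def clean(text, names, skip_rows, drop_labels):
    """Rename the columns, skip header rows, and drop unwanted cases."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    cases = []
    for row in rows[skip_rows:]:
        if row[0] in drop_labels:
            continue  # delete cases that are not real observations
        case = {names[0]: row[0]}
        for name, value in zip(names[1:], row[1:]):
            case[name] = int(value)
        cases.append(case)
    return cases

cases = clean(raw_text, ["State", "Pop2000", "Pop2001"],
              skip_rows=2, drop_labels={"Total"})
for case in cases:
    print(case)
```

The two decisions made here, what to call each attribute and which rows are not cases, are the same ones made by hand in the Fathom case table.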

Point of this activity: Importing datafiles is a relatively painless process, and the imported data are easy to modify. Linear regression can be explored intuitively rather than through a black-box approach.

Exploration # 3: Election Data Analysis

Goal of activity: To see how Fathom can promote exploration of data to investigate a problem and make both conjectures and conclusions.

Under File, choose Open and select the ElectionData.ftm file. If you are not on a conference computer, you may download the file from the following URL:

This problem originates from a former student, Adam Poetzel (a mathematics teacher at Champaign Central High School, Champaign, IL), who wanted to develop an online lesson activity for statistics students to explore dynamically the results of the 2000 Presidential election in the state of Florida. The set-up of the problem may be found at:

The datafile contains the number of votes cast in each county of the state of Florida for the 2000 presidential election. The datafile gives the number of votes cast for Bush, Gore, Buchanan, and the total number of votes cast in each county.

Pull down a Graph from the Tool Bar. Graph the Total Votes Cast on the horizontal axis. On the vertical axis, plot the votes for Bush. What does the scatter plot indicate? Try replacing the vertical axis with Votes for Gore. Anything interesting yet? Which counties are represented by the upper-rightmost data points of the scatter plot?

Outliers—Now, try replacing the vertical axis with Votes for Buchanan. Notice anything? What county represents the outlier? Notice that highlighting a data point on the graph highlights the case in the case table and vice-versa.
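
One way to back up a visual impression of an outlier is a numerical rule. The sketch below flags unusual values with a modified z-score based on the median and the median absolute deviation, using 3.5 as a common rule-of-thumb cutoff. This is a swapped-in technique for illustration, not something Fathom does automatically, and the county labels and vote counts are invented rather than taken from the Florida data.

```python
import statistics

def flag_outliers(labels, values, threshold=3.5):
    """Return labels whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [
        label
        for label, v in zip(labels, values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]

# Invented (total votes, votes for one candidate) pairs; NOT real data.
counties = {
    "County A": (10000, 30),
    "County B": (50000, 160),
    "County C": (120000, 350),
    "County D": (200000, 610),
    "County E": (430000, 3400),
    "County F": (300000, 900),
}

labels = list(counties)
shares = [votes / total for total, votes in counties.values()]
flagged = flag_outliers(labels, shares)
print("flagged:", flagged)
```

Working with the share of votes rather than the raw count keeps large counties from being flagged merely for being large.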

From this graphical analysis, are there conjectures you would be willing to make about the 2000 Presidential Election results in the state of Florida? How would you go about exploring and confirming your conjectures?

Point of this activity—Many questions that can be explored need data and graphical analysis of the data. Fathom can promote exploration and investigation of problems that would be otherwise prohibitive.

Exploration # 4: Moon Watch Data

Goal of activity: Graphical analysis, model-fitting, slider parameters

Open the file Moonwatch.ftm and explore with the case table. These data come from the website: At this website, there are links and explanations about the phases of the moon and details about how this dataset is generated.

Pull down a Graph from the Tool Bar. Plot Day on the horizontal axis. Plot January on the vertical axis. Describe the graph. How do we interpret this graph?

Drag March to the horizontal axis. What do you notice about the graph? Why is this the case? Can we explain this mathematical relationship via the moon? Drag another month to the horizontal axis. What shape is the scatter plot now? Any conclusions from this?

Return to January on the vertical axis with Day on the horizontal axis. Drag March to the main area of the graph—not either axis. Drop it on the main part of the graph. What happens? This defines March as a legend attribute—can you explain what the graph is telling us?

Under Graph, choose Remove Legend Attribute. Now, under Graph, choose Plot Function. What type of function would most closely relate to the data plotted over time? Can you create a graph to match the data points as closely as possible?

One of the graph icons on the file shows one fit of the data by a cosine curve.

Notice the sliders at the top of the file. Let’s discuss how they represent various parameters of the cosine curve graph. You can pull a slider from the Tool Bar. Under the graph menu, choose Plot Function, and expand Global Values. You can now place the sliders into a functional expression.
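
The role of the sliders can be mimicked in a few lines of Python. In this sketch each slider becomes a named parameter of a cosine function, and trying a range of settings for one parameter stands in for dragging its slider. The data are synthetic, generated from the model itself with a period of 29.5 days; they are not the Moonwatch values, and all names and parameter values are assumptions made for the example.

```python
import math

def cosine_model(day, amplitude, period, shift, midline):
    """Cosine curve with one parameter per slider."""
    return amplitude * math.cos(2 * math.pi * (day - shift) / period) + midline

def sum_of_squares(days, observed, amplitude, period, shift, midline):
    """Total squared difference between the data and the curve."""
    return sum(
        (obs - cosine_model(d, amplitude, period, shift, midline)) ** 2
        for d, obs in zip(days, observed)
    )

# Synthetic "observations" generated from known parameter values.
days = list(range(1, 32))
observed = [cosine_model(d, 0.5, 29.5, 3.0, 0.5) for d in days]

# Sweep the period slider from 25 to 35 in steps of 0.5,
# holding the other three sliders fixed.
period_settings = [25 + 0.5 * k for k in range(21)]
best_period = min(
    period_settings,
    key=lambda p: sum_of_squares(days, observed, 0.5, p, 3.0, 0.5),
)
print("best period setting:", best_period)
```

With real data, no slider setting would give a sum of squares of exactly zero, and several parameters would need adjusting together, which is where the discussion of what each parameter means becomes useful.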

Point of this activity—The nature of the data and problem should guide the selection of an appropriate mathematical model. Sliders allow one to explore the effects of various parameters on a particular model. Legend attributes allow us to compare subsets within the dataset. Relationships between attributes can be explained both mathematically and within the original context of the dataset, which is the way modeling should work.

Exploration # 5: Modeling in other environments; analyzed by Fathom

Goal of activity: Explore the modeling capabilities of NetLogo, export data to Fathom, and see important relationships.

Open up the program NetLogo. This software is a modeling environment. There are other sessions this weekend that will explore this modeling tool in more detail. Once the program is started, go to File, and choose Models Library.

A dialog box will appear with many categories of models in directories. Expand biology and choose Wolf-Sheep Predation.

Play with the parameter sliders and then click on Set-Up. Then, click on Go and watch the model run. Are you surprised by what happens to the populations of sheep and wolves? After a few seconds, click Go again to stop the model.

Under File, choose Export Plot. Name the file “sheep.csv” and at the next dialog box, click on Populations. Save the file on the desktop, where you can find it easily.

Open Fathom, and under File, choose Import from File. Navigate to the sheep.csv file and open it.

Clean up the attribute names and delete unneeded cases. Plot the number of sheep versus time and the number of wolves versus time. Can you explain the graphs in terms of the model?

Create a scatter plot of sheep versus wolves. What does the graph look like? Why is this graph in this shape? Can you explain the graph in terms of the model?
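
To see why a sheep-versus-wolves plot tends to loop rather than rise or fall, it can help to look at a much simpler predator-prey model. The sketch below steps the classic Lotka-Volterra equations forward in time. This is a different, equation-based model swapped in for illustration; NetLogo's Wolf-Sheep Predation is agent-based, and the rate constants here are arbitrary assumptions.

```python
def simulate(sheep, wolves, steps, dt=0.001, a=1.0, b=0.1, c=1.5, d=0.075):
    """Euler steps of: d(sheep)/dt = a*S - b*S*W, d(wolves)/dt = d*S*W - c*W."""
    history = [(sheep, wolves)]
    for _ in range(steps):
        d_sheep = (a * sheep - b * sheep * wolves) * dt
        d_wolves = (d * sheep * wolves - c * wolves) * dt
        sheep += d_sheep
        wolves += d_wolves
        history.append((sheep, wolves))
    return history

# With these constants the equilibrium is sheep = c/d = 20, wolves = a/b = 10.
history = simulate(sheep=10.0, wolves=5.0, steps=30000)
sheep_values = [s for s, _ in history]
wolf_values = [w for _, w in history]

print("sheep range:", round(min(sheep_values), 1), "to", round(max(sheep_values), 1))
print("wolf range:", round(min(wolf_values), 1), "to", round(max(wolf_values), 1))
```

Both populations swing above and below their equilibrium values, so the points (sheep, wolves) trace a loop around the equilibrium point. The exported NetLogo data will be noisier, but the same idea may help explain the shape of the scatter plot.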

Point of this activity—NetLogo provides a powerful tool for creating, tweaking, and running models. The data obtained from those models can be explored quickly and easily in Fathom, even leading to surprising results about deeper concepts—e.g., equilibrium of a predation model.

Exploration # 6: Modeling a Virus

Goal of activity: To see how various parameters affect the building of a model.

Return to NetLogo. Under File, choose Models Library, and expand Biology. Select Virus. Set all of the parameters to around 50%, set up the model, and then run it for about 10 seconds.

Export the plots again by saving them as “modelvirus.csv”. Note: Do not save as “virus.csv”—could be misleading! Import the file into Fathom and clean up the attribute names, deleting any unneeded cases.

Plot Time_in_Weeks on the horizontal axis and the number of people Sick on the vertical axis. Can you plot a function that goes through many of the data points? What factors need to be taken into consideration?

Equilibrium? Amplification? Decay? Periodicity? Phase shift?

Explore with the graph, and the function plot. See if you can explain the various mathematical parameters of the model in terms of the context of the virus.
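
One candidate function that has a parameter for each of the ideas listed above (equilibrium, amplitude, decay, period, and phase shift) is a damped cosine. The Python sketch below is only one possible form, offered as a starting point; whether it suits a particular run of the Virus model is an assumption to be tested against the data, and the parameter values are arbitrary.

```python
import math

def sick_model(week, equilibrium, amplitude, decay, period, phase):
    """Damped cosine: oscillations around an equilibrium that die away."""
    envelope = amplitude * math.exp(-decay * week)            # shrinking swing
    wave = math.cos(2 * math.pi * (week - phase) / period)    # repeating cycle
    return equilibrium + envelope * wave

# Arbitrary illustrative settings, one per "slider".
params = dict(equilibrium=40.0, amplitude=30.0, decay=0.05, period=12.0, phase=2.0)

for week in (0, 2, 8, 14, 100):
    print(week, round(sick_model(week, **params), 2))
```

Here `equilibrium` sets the level the curve settles toward, `amplitude` the size of the first swing, `decay` how quickly the swings shrink, `period` the number of weeks per cycle, and `phase` where the first peak falls.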

Point of this activity—models are inherently complex. Even with assumptions, and simplifications that are necessary, models must try to respond to conditions of the system being modeled. Mathematically, trying to fit a curve through the data of a model requires integrating and synthesizing analysis of various parameters of a model—both in terms of the mathematics and in terms of the context of the model.

Conclusion—Modeling is the process of looking at the world through mathematical eyes. This is not as simplistic as taking a set of data and fitting a line or quadratic function closely to the data. Rather, the process of modeling is a synthetic process, beginning with the problem in context, and integrating mathematical understanding with understanding of the problem situation. Iteration, analysis, and evaluation are necessary to refine the model.

Bigger conclusion—if we desire that our students graduate with a broader, synthesized understanding of mathematics that can be applied to “real” situations, we must offer them opportunities to simulate the building of models, and conceptual interpretation of those models. Moreover, if we desire that our future teachers change the ways in which K-12 students interact with mathematics towards greater modeling and visualization, then those future teachers need to learn mathematics from this perspective.

Presenter: Tim Hendrix

—I would be pleased to interact with you about these ideas or tools in any way that I can. Please feel free to contact me at:

Tim Hendrix, Asst. Professor of Mathematics

Department of Mathematics & Computer Science,

Meredith College

3800 Hillsborough Street

Raleigh, NC 27607

Phone: 919.760.8240

Fax: 919.760.8141

Email:

Websites:

Visit my website at:

Visit Meredith College Department of Mathematics & Computer Science at:

Visit Meredith College, Raleigh, NC:

Visit the Office for Mathematics, Science, and Technology Education at the University of Illinois at Urbana-Champaign at:


Visit the Fathom software website at:

Visit the NetLogo software website at: