Visualizing and Understanding Probability and Statistics:

Graphical Simulations Using Excel

Sheldon P. Gordon
Farmingdale State University of New York
Farmingdale, NY 11731

Florence S. Gordon
New York Institute of Technology
Old Westbury, NY 11568

Abstract

The article discusses the use of interactive graphical simulations in Excel for investigating a wide variety of random processes and sampling distributions that arise in introductory statistics courses. These include simulations of various probabilistic processes such as coin flipping and dice rolling, the Central Limit Theorem, confidence intervals and hypothesis testing, and regression analysis. The article emphasizes a variety of ways that such graphical simulations can be used effectively as classroom demonstrations to motivate statistical ideas, to increase student understanding of otherwise difficult concepts and methods, to provide tools by which the students can personally discover the underlying statistical concepts, and to "justify" or validate the statistical theory.

Students in introductory statistics courses are often asked to accept many statements and procedures on faith since the mathematical justification may be far too sophisticated for them to comprehend. In large measure, this can be attributed to the nature of statistics, which is quite unlike most other undergraduate mathematics offerings. At most large universities, introductory statistics courses are given by a statistics department, though often also by individual departments (such as economics, business, biology, or psychology) that use statistics heavily. At some small colleges, the introductory statistics courses are given by trained statisticians who are often in the mathematics department.

However, at many schools, particularly two-year colleges, introductory statistics is offered by the mathematics department with instructors (often part-time faculty) having little or no formal statistical training. According to the 2000 CBMS study [1], over 400,000 students took introductory statistics in a mathematics department; this is about four times the number taking such a course through a statistics department. Moreover, the AP Statistics program [2] in the high schools has been growing extremely rapidly; in 2002, about 50,000 students took the AP exam at the end of the course, and that number had been growing at an annual rate of about 25%. As with the college mathematicians who teach the introductory statistics course, most of the high school teachers have had little formal training in statistics beyond, perhaps, some teacher development workshops. It is these two groups to whom this article is primarily directed.

In mathematics courses, the underlying theory applies directly to the objective in question, whether it is a function being optimized, a system of linear equations being solved, or a differential equation being solved. In statistics, however, the theory applies to some unseen underlying population, whereas usually the student has just one sample in hand. In fact, the sample is only used to make a statistical inference about the unknown population, either to estimate a population parameter such as the mean μ or the proportion π or to test a hypothesis about one of these (or some other) parameters. Unfortunately, students see only the sample, but have no direct way to perceive the population or to develop any deep understanding of its properties. It is therefore not surprising that many students come out of an introductory statistics course having mastered, at best, a series of computational procedures, but with relatively little statistical understanding. Many others have been through an introductory course that focuses heavily on the use of technology, either calculator or software, to perform statistical computations, but that experience alone does not necessarily translate into statistical understanding.

Two key notions that underlie virtually every concept and method in statistics are randomness and variation among samples. Has the sample been collected in a truly random fashion that reflects the underlying population? How representative of that population is this one sample? How does this single sample compare to other possible samples drawn from the same population?

Fortunately, most of the critical topics in probability and statistical inference can be dramatically presented using computer graphics simulations that allow students to visualize the underlying statistical populations and so enhance their understanding of the statistical concepts and methods. Many years ago, the current authors addressed the challenge of making these notions evident to students by developing a comprehensive package of computer graphics simulations in BASIC that covered virtually every topic in introductory probability and statistics. Unfortunately, BASIC routines have become outmoded over the years, especially in terms of being able to provide students (or even colleagues) with copies of the files to explore the statistical concepts on their own computers. We have lately returned to this challenge and are developing a comparable package of graphical simulations using Excel because it is available on almost all computers today. (The complete package can be downloaded from the authors' website, as discussed at the end of this article.)

In the intervening years, many other statistics educators have thought deeply about these issues and have developed specialized software tools to implement random simulations of many of the basic statistical ideas and techniques. These issues have been discussed in a variety of articles in [3, 4, 5 and 6], as well as in individual papers in the on-line Journal of Statistics Education [7]. There are many applets available on the Web, and many of the same kinds of simulations can be generated using statistical software packages such as Minitab. These efforts have had considerable impact on the teaching of statistics among professional statisticians, but probably much less impact on the teaching of statistics by mathematicians or by users of statistics in other disciplines. In part, this may be due to the fact that mathematics faculty without statistical training are often uncomfortable with using such statistical packages in the classroom; others feel it is not appropriate to require students to purchase specialized software in addition to an expensive textbook and calculator; still others feel that the one software tool that almost all of their students will someday use in other courses and on the job is Excel (and the majority of students today appear to arrive on campus already knowing how to use it). As a consequence, there seem to be good reasons to have such simulations available in Excel, so that both faculty and students can have easy access to their use without needing to expend money or time in learning new programs.

Some Probabilistic Simulations

In this section, we consider a variety of topics in simple probability that are typically treated in most introductory statistics classes. A common problem for most students is the non-intuitive nature of probability. A large part of that problem is that many students cannot visualize what happens in the long run in a random situation, probably because they don’t fully understand what randomness is all about. An effective tool for demonstrating the difference between the long run and the short run is the use of a pair of graphical simulations, one for the Law of Large Numbers and the other for what the authors term the Chaos of Small Numbers. For instance, if you have a fair coin, on each flip there is a 50-50 chance of getting a Head (a Success) or a Tail (a Failure). In the long run, if you flip the coin repeatedly, you can expect roughly half of all the flips to come up Heads. But there is a huge gap between the results of a single flip and the results after a large number of flips!

In Figure 1, we show the results of a graphical simulation in Excel developed by the authors for flipping a fair coin 20 times. Figure 1a shows the 20 “actual” outcomes and Figure 1b shows the cumulative proportions of successes. Notice in Figure 1a that the first two flips were Successes (heads), the third was a Failure (a tail), and so forth, including two runs of four successive Successes. Overall, of the 20 repetitions, there were 14 Successes and 6 Failures. In Figure 1b, we see that because the first flip resulted in a Success, the cumulative proportion of Successes starts at 1; because the second flip was also a Success, the cumulative proportion remains at 1; after the third flip, a Failure, the cumulative proportion drops to 0.667, and so forth. Overall, however, the cumulative proportion seems to converge toward 0.50, but it certainly does not get particularly close during the first 20 flips.
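For readers who would like to experiment before downloading the package, a worksheet of this kind can be sketched with nothing more than Excel’s RAND, IF, and AVERAGE functions. The cell layout below is purely illustrative; it is not the layout of the packaged simulation, which provides its own controls for the probability of success and the number of repetitions:

    B1:  0.5                         the probability of a Success, which can be edited
    A4 to A23:  1, 2, ..., 20        the flip number
    B4:  =IF(RAND()<$B$1,1,0)        1 for a Success, 0 for a Failure; copied down to B23
    C4:  =AVERAGE($B$4:B4)           the cumulative proportion of Successes so far; copied down to C23

Charting column B as a column chart and column C as a line chart reproduces the two panels of Figure 1; pressing F9 forces every RAND to recalculate and so produces a new run, and extending the columns to 1000 rows produces displays in the spirit of Figure 3.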

Figure 2 shows the corresponding results for a second run of the same program, but the results are strikingly different. There is a very different pattern to the order in which Successes and Failures occur, with a majority of Failures, including a very unlikely run of nine in a row. Moreover, the pattern in the cumulative proportions of Successes is also quite different: a preponderance of Successes at the start and then the long run of Failures. Overall, 9 of the 20 flips resulted in Successes, so the final proportion of Successes was 0.45 and the convergence toward the theoretical limit of 0.50 is quite apparent. (A display such as this can be a powerful counterargument for families who feel an overwhelming conviction that their next child is bound to be of a particular gender.)

We note that the Excel program allows the user to select the probability of success, so it needn’t be just 0.5. The number of repetitions displayed is also under the user’s control and can be any number up to 25.

For comparison, Figure 3 shows the results with 1000 repetitions. Clearly, in the long run, the percentage of Successes converges to 0.5, as most would expect. In the “run” illustrated in the figure, there appear to be a majority of Failures, since the cumulative proportion of Successes stays somewhat below 0.5 for much of the time. Overall, about 49% of the outcomes on this simulation were Successes; repeated runs will create virtually the same overall result, although the detailed pattern will vary from run to run. Students need to see and think about these kinds of displays to give them a better understanding of randomness in the short run and the overall patterns that result in the long run.

Even a simple experiment such as flipping a set of two or three coins may be difficult for many students in terms of predicting the type of results that will occur. We can derive the associated probabilities theoretically, but most students in these courses are still not convinced about the accuracy of the results. Computer simulations can be very valuable in illustrating what is happening and in conveying the needed understanding and conviction.

At a very simplistic level, one can conduct a physical simulation of a simple coin-flipping experiment in class. Have each student take out several coins and ask them to flip the coins a total of, say, 10 times and record the results. Then compile all the results on the board and display them in a frequency distribution table or a histogram to illustrate the overall pattern. However, the authors’ experience is that this experiment can easily lead to the class getting bogged down in the details of enumerating and analyzing the data, so that many students will not truly “see” the overall patterns that emerge as different numbers of coins are flipped.

Instead, at a more sophisticated level, one can conduct the comparable experiment electronically, with the displays generated automatically. (If nothing else, this avoids having coins rolling all over the floor for the janitorial staff to collect in the evening.) A graphics routine that simulates the experiment repeatedly and displays the results visually can be extremely effective in conveying understanding. Students are able to detect the resulting patterns visually and to see that the same pattern replicates so long as the underlying assumption (that the coins are fair) is maintained. In turn, this provides the students with the conviction that theory indeed agrees with reality. For instance, Figure 4 shows the results of 500 simulated flips of a pair of fair coins: 0 Heads (2 Tails) came up 125 times (or 25.0% of the time); 1 Head and 1 Tail came up 243 times (or 48.6% of the time); and 2 Heads came up 132 times (or 26.4% of the time).
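A comparable worksheet for the two-coin experiment is just as easy to sketch; again, the cell layout below is only an illustration, not the simulation distributed in the package:

    B2:  =IF(RAND()<0.5,1,0)         the first coin, with 1 representing a Head; copied down through B501
    C2:  =IF(RAND()<0.5,1,0)         the second coin; copied down through C501
    D2:  =B2+C2                      the number of Heads on that flip of the pair; copied down through D501
    F2:  =COUNTIF($D$2:$D$501,0)     how many of the 500 flips produced 0 Heads
    F3:  =COUNTIF($D$2:$D$501,1)     how many produced exactly 1 Head
    F4:  =COUNTIF($D$2:$D$501,2)     how many produced 2 Heads

A bar chart of the counts in F2:F4 gives a display much like Figure 4, and pressing F9 generates a fresh set of 500 flips; adding a third column of coins and a fourth COUNTIF extends the sketch to the three-coin experiment discussed below.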

Based on the authors’ experience, these results are typically very much at variance with what most students expect to happen; intuitively, they usually expect each of the three possible outcomes to happen ⅓ of the time. Seeing these results, particularly when repeated runs are demonstrated to make the point that virtually the same percentages occur every time, goes a long way toward convincing the students that the theoretical predictions are valid. Alternatively, one can use the results of this kind of simulation to motivate the development of a theory that accounts for the observed outcomes: namely, that because the four equally likely outcomes are HH, HT, TH, and TT, 0 Heads and 2 Heads each arise ¼ of the time and 1 Head and 1 Tail arises ½ of the time.

One can change the number of coins in this simulation to three fair coins instead of two. The resulting display on one run of the graphical simulation is shown in Figure 5, where we see that the results are seemingly symmetric. Moreover, of the 500 repetitions, 0 Heads came up 62 times (or 12.4% of the time), 1 Head (and 2 Tails) came up 186 times (or 37.2% of the time), 2 Heads (and 1 Tail) came up 185 times (or 37.0% of the time), and 3 Heads came up 67 times (or 13.4% of the time). These values, and the comparable ones that occur with repeated runs of the simulation, suggest that the breakdown for the four possible outcomes is roughly ⅛, ⅜, ⅜ and ⅛. Again, these results can be used either to validate the predictions of the theory or to motivate the development of the theory to account for the observed outcomes. Furthermore, one can extend these developments by selecting more than three coins, say n = 4 and n = 5 coins.
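The corresponding theoretical values can also be checked directly in Excel: the probability of exactly k Heads among n fair coins is the binomial probability C(n, k)(½)^n, which the built-in BINOMDIST function (called BINOM.DIST in more recent versions of Excel) returns. For three coins, for example,

    =BINOMDIST(0,3,0.5,FALSE)        returns 0.125, that is, ⅛
    =BINOMDIST(1,3,0.5,FALSE)        returns 0.375, that is, ⅜
    =BINOMDIST(2,3,0.5,FALSE)        returns 0.375, that is, ⅜
    =BINOMDIST(3,3,0.5,FALSE)        returns 0.125, that is, ⅛

and the analogous calls with n = 4 or n = 5 provide the theoretical predictions against which runs of the simulation with four or five coins can be compared.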