INSTRUCTIONS FOR USING MICROSOFT EXCEL – PERFORMING DESCRIPTIVE AND INFERENTIAL STATISTICS AND GRAPHING
This exercise is intended as an introduction to Excel, not an introduction to statistics. If you have difficulty understanding terms like standard deviation, mean, t-test, p-value, I would refer you to the following websites. They were written for the general education classes in science here at Kennesaw and they provide a good overview of the importance and use of statistics. If you still have difficulty after reviewing the web sites, come and see me.
Basic Statistics
Inferential Statistics
Statistical Analyses
- Double click on Excel icon OR click on (start), (Programs), (Microsoft Excel)
- At the top of the worksheet, type in text information that will help you or others understand later what study the data reflect, computations performed, statistical test(s) performed, etc.
- At the top of each column, enter a descriptive title that identifies the data that will be entered in that column.
- Enter data/observations for a sample in columns. Note: the accompanying instructions are for data observations entered in columns. Each data type is entered in a separate column.
- To SORT data:
- Place the cursor in any cell within the data you want to sort.
- Click on the “Data” tab on the Toolbar at the top of the page.
- Select “Sort” from the pull-down menu that appears.
- When the dialog box appears, make sure that the radio button at the bottom that says “Header row” is selected.
- Using the pull-down menus, select the columns you wish to use to sort your data. You can select up to three different columns, but Excel will always sort them in the priority listed in the dialog box.
- Once you have filled in the relevant information in the dialog box, hit OK.
- To calculate the MEAN for a set of observations:
- position cursor in the cell where you would like the computed mean value to appear
- click on (fx) in the toolbar, click on (statistical), click on (average); note: once a function has been used it can be quickly found under the “Most Recently Used” category.
- a box will appear asking you for the observations you want to calculate the average of; position cursor in first cell of the column containing observations for a sample and, holding the left click down, select all values of a sample. Alternatively, you can hold the shift key down while pressing the down arrow key or the page down key.
- click (OK)
- Steps 1 – 4 can be repeated for other data sets OR you can copy the formula and paste it to another cell. The number of observations must be the same! Not only the formula, but all specifications associated with it will be copied. You can check this by clicking on the cell with the formula; in the formula bar at the top of the spreadsheet, the formula and the specifications are given (e.g. AVERAGE (C3:C17) indicates that the calculation is the average of cells C3 through C17).
- To determine the MODE AND MEDIAN for a data set, use (fx) to select mode and median (again, under the category Statistical)
- To calculate the VARIANCE for a set of observations:
- position cursor where you would like the computed variance value to appear
- click on (fx) in the toolbar, click on (statistical), scroll down the list in the box at the right and click on (VAR)
- a box will appear asking you for the observations you want to calculate the variance of; position cursor in first cell (of row or column) containing observations for a sample and, holding the left click down, select all values of a sample
- click (OK)
- the variance for the values selected will appear in the box you first set your cursor in. Steps 1 – 4 can be repeated for other data sets OR you can copy the formula and paste it to another cell. The number of observations must be the same! Not only the formula, but all specifications associated with it will be copied. You can check this by clicking on the cell with the formula; in the formula bar at the top of the spreadsheet, the formula and the specifications are given (e.g. AVERAGE (C3:C17) indicates that the calculation is the average of cells C3 through C17).
- To calculate the STANDARD DEVIATION for a set of observations:
- position cursor where you would like the computed standard deviation value to appear
- click on (fx) in the toolbar, click on (statistical), scroll down the list in the box at the right and click on (STDEV)
- a box will appear asking you for the observations you want to calculate the standard deviation of; position cursor in first cell (of row or column) containing observations for a sample and, holding the left click down, select all values of a sample
- click (OK)
- the standard deviation for the values selected will appear in the box you first set your cursor in. Steps 1 – 4 can be repeated for other data sets OR you can copy the formula and paste it to another cell. The number of observations must be the same! Not only the formula, but all specifications associated with it will be copied. You can check this by clicking on the cell with the formula; in the formula bar at the top of the spreadsheet, the formula and the specifications are given (e.g. AVERAGE (C3:C17) indicates that the calculation is the average of cells C3 through C17).
- To calculate the COEFFICIENT OF VARIATION for a set of observations:
- Divide the sample standard deviation by the sample mean.
- Multiply by 100.
- To perform a t-TEST:
- Enter the sample data sets to be compared in adjacent columns. It is recommended that you label the columns.
- Position cursor where you want the computed p value for the t-test to appear
- Click on (fx) in the toolbar, click on (statistical), scroll down the list of calculations in the box on the right and click on (TTEST)
- A box will appear asking you to define/select the following: array 1 (e.g. those observations in a given sample); array 2 (e.g. those observations in the other sample), tails (enter 2), type (depends on the relationship of data; if the data is “before and after” measurements or data for the same individuals under different treatments, then select “paired” – or; if the analysis is for unpaired data sets and the variances for the two data sets are similar, select 2; if the analysis is for unpaired data sets and the variances are clearly very different, select 3).
- When all boxes are filled in, click on (OK). If you want to change information in a box, simply click the mouse in the box, delete the information, and enter new information.
- A calculated p value will appear in the box you first set your cursor in.
Generating a Bar Graph
- If you enter the information to be graphed carefully, this can be a breeze! I recommend typing in your data as follows (label the rows and columns also, specifically noting the category of the I.V.!).
mean / SD
I.V. case 1
I.V. case 2
I.V. case 3
(etc.)
- Select any cell outside of the data range. Click on the Chart Wizard button on toolbar at the top of the page. Select: Column chart type and Clustered column chart sub-type, then click the Next button.
- A box will appear that will allow you to enter specifics about the chart: highlight only those cells that contain the I.V. names and the values for the means (left click and hold to highlight a group of cells). A preview of the graph will appear within a window of the box. Click the Next button.
- Another box will appear that will allow you to provide a title and legend and axes labels and units. Click the Next button.
- The next box to appear asks you where you want to place the new graph – on the same page as the one you are currently working in or on a new sheet. The choice is yours. To add standard error bars, position the cursor within a column and right click. Select Format Data Series. Select Y Error Bars. Click on the first Display (e.g. Both) and click the button next to Custom. Click in the + box and then go to the spreadsheet, find the corresponding SDs and highlight them with the mouse. Click in the – box and the go to the spreadsheet and highlight the corresponding SDs again (remember that we want to display the SD above, or +, and below, or -, a column). Click OK. The standard error bars should now appear on the graph.
- Any portion of the graph can be modified by placing mouse over the area that you want to change and right clicking. The options for change will occur in a pop-up box.
Generating a Scatter Plot and Calculating a Correlation Coefficient
- Enter the two data sets to be graphed in adjacent columns such that the variable you want to plot on the x axis is in the first (e.g. left most) column. It’s a good idea to label the columns for your reference.
- Click on the Chart Wizard and select XY (scatter) as the chart type and Scatter as the chart sub-type.
- A box will appear that will allow you to enter specifics about the chart . Click the mouse in the data range line, and then use mouse to highlight all the data in both columns. This tells Excel what data you want to plot. Do not include the column labels. Click the Next button
- Another box will appear that will allow you to provide a title and legend and axes labels and units. Click the Next button.
- The next box to appear asks you where you want to place the new graph – on the same page as the one you are currently working in or on a new sheet. The choice is yours. Regardless of location, you can change font size and type and any other feature of the graph.
- To calculate a correlation coefficient:
- Enter the two variables to be correlated in adjacent columns. It is recommended that you label the columns.
- Position cursor where you want the computed r value for the correlation coefficient to appear
- Click on (fx) in the toolbar, click on (statistical), scroll down the list of calculations in the box on the right and click on (CORREL)
- A box will appear asking you to define/select the following: array 1 (e.g. the observations on the first sample); and array 2.
- When all boxes are filled in, click on (OK). If you want to change information in a box, simply click the mouse in the box, delete the information, and enter new information.
- A calculated correlation coefficient will appear in the box you first set your cursor in.
- Use the table on the next page to determine whether the correlation is statistically significant.
Table 1. Significance levels and correlation coefficients. To determine whether a correlation is statistically significant (p < 0.05), go down the column until you reach the row with the correct number of paired observations (N). If the absolute value of your correlation coefficient is greater than the number listed, the correlation between your two variables is significant. If it is less than the number indicated, it is not significant.
NCorrelation
Coefficient
50.86
60.81
70.75
80.71
90.67
100.63
110.60
120.57
130.55
140.53
150.51
160.49
170.48
180.47
190.46
200.45
210.44
220.43
230.42
240.41
250.40
260.39
270.38
280.37
290.36
300.36
310.36
320.35
330.35
340.35
350.34
360.34
370.34
380.34
390.33
400.33
NCorrelation
Coefficient
410.33
420.32
430.32
440.32
450.32
460.31
470.31
480.31
490.31
50 0.30
510.30
520.30
530.30
540.30
550.29
560.29
570.29
580.29
590.29
600.28
610.28
620.28
630.28
640.28
650.27
660.27
670.27
680.27
690.27
700.26
710.26
720.26
730.26
740.26
750.26
760.25
NCorrelation
Coefficient
770.25
780.25
790.25
800.25
810.25
820.24
830.24
840.24
850.24
860.24
870.24
880.23
890.23
900.23
910.23
920.23
930.23
940.22
950.22
960.22
970.22
980.22
990.22
1000.21