MAE244 ANALYSIS c.3

STATISTICAL ANALYSIS

Mean and Standard Deviations

Statistical analysis is often used to explain variations in experimental data. It is also the basis on which predictions can be made from measurements (as in extrapolation). The statistical measures most familiar to students are probably the mean (or average), which describes the center or location of a sample, and the standard deviation, which measures the spread of the sample. The mean is defined as

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

where y_i is the variable of interest for each member of the sample and n is the number of observations in the sample. The standard deviation is the square root of the variance,

$$\text{Standard Deviation} = s = \sqrt{s^2}$$

where

$$s^2 = \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2$$

Example: A total of 58 AISI 1018 cold-drawn steel bars were tested to determine the 0.2 percent offset yield strength Sy in kpsi. The results were:

Sy (kpsi) / m
64 / 2
68 / 6
72 / 6
76 / 9
80 / 19
84 / 10
88 / 4
92 / 2

m is the number of measurements recorded at each value of Sy.

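Applying the previous equations, with each value of Sy weighted by its count m and n = 58:

$$\bar{S}_y = \frac{\sum m\,S_y}{n} = \frac{4548}{58} = 78.41\ \text{kpsi}$$

$$s = \sqrt{\frac{\sum m\,\left(S_y - \bar{S}_y\right)^2}{n}} = \sqrt{\frac{2462.07}{58}} = 6.52\ \text{kpsi}$$

Therefore, the yield strength of the steel is 78.41 ± 6.52 kpsi.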
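The arithmetic in this example can be checked with a short script. The following is a minimal sketch, not part of the original handout; the variable names are chosen here for illustration. It computes the spread with both the divisor n and the divisor n - 1, because the quoted value of 6.52 kpsi corresponds to the divisor n.

```python
import math

# Yield strength values (kpsi) and the number of bars measured at each value,
# copied from the table in the example.
strengths = [64, 68, 72, 76, 80, 84, 88, 92]
counts = [2, 6, 6, 9, 19, 10, 4, 2]

n = sum(counts)  # total number of observations

# Mean of grouped data: each value is weighted by its count m.
mean = sum(m * s for m, s in zip(counts, strengths)) / n

# Sum of squared deviations from the mean, again weighted by the counts.
sum_sq_dev = sum(m * (s - mean) ** 2 for m, s in zip(counts, strengths))

std_n = math.sqrt(sum_sq_dev / n)                # divisor n
std_n_minus_1 = math.sqrt(sum_sq_dev / (n - 1))  # divisor n - 1

print(f"n = {n}")
print(f"mean = {mean:.2f} kpsi")
print(f"s (divisor n) = {std_n:.2f} kpsi")
print(f"s (divisor n - 1) = {std_n_minus_1:.2f} kpsi")
```

With the tabulated data this gives a mean of 78.41 kpsi, s = 6.52 kpsi with the divisor n, and s = 6.57 kpsi with the divisor n - 1.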

Often in laboratory experiments, students collect data (e.g. strain) in response to some known stimulus (e.g. load) and are asked to determine the relationship between x (strain) and y (load). For example, Young's modulus (or the elastic modulus), E, is the slope of the linear relationship between stress (y) and strain (x). To find Young's modulus, a student could plot stress against strain and then draw the line that appears to fit the data best. The slope could then be found from the change in y over the change in x (y = mx + b from algebra). The problem with this method is that everyone would probably draw the line differently, so the experimental data would yield only estimates of E rather than a unique value. To overcome this shortcoming, the method of least squares is used.

Linear Regression (Least Squares Fit)

The least squares method can be used to fit a polynomial of any degree. Because our experiments will be conducted in the linear range of linear elastic materials, only a straight-line fit needs to be considered. The least squares fit therefore determines the slope, m, and the intercept, b, of the straight line

y = mx + b

that best represents the experimental data (x1, y1), (x2, y2), ..., (xn, yn). The least squares fit shows how changes in x affect y, where x is the independent variable and y is the dependent variable.

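Each measured point generally lies off the line, so an error term e is added to define the actual location of the points:

$$y_i = m x_i + b + e_i$$

The least squares line is the one that minimizes the sum of the squared errors, $\sum e_i^2$. For n data points (x_i, y_i), the slope, m, and the intercept, b, are calculated using the following equations:

$$m = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$

$$b = \frac{\sum y_i - m\sum x_i}{n}$$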
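As an illustration of the calculation, here is a minimal sketch of a straight-line least squares fit using the standard closed-form sums. The function name `least_squares_line` and the check data are made up for this example and do not come from the handout.

```python
def least_squares_line(x, y):
    """Return (m, b) for the least squares line y = m*x + b."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = (sum_y - m * sum_x) / n                     # intercept
    return m, b


# Made-up check data lying exactly on y = 2x + 1, so the fit
# should recover a slope of 2 and an intercept of 1.
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 5.0, 7.0]
m, b = least_squares_line(x_demo, y_demo)
print(f"slope m = {m}, intercept b = {b}")
```

Because the check points lie exactly on a line, every error term e is zero and the fit reproduces the line. With real measurements the points scatter about the fitted line.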


Correlation

The main use of regression is prediction. The sample correlation coefficient, r, is the statistic that measures the strength of the correlation (and hence of the prediction). It is found using

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}}$$

where r = 1 is a perfect positive fit and r = -1 is a perfect negative fit. r², the coefficient of determination, indicates the proportion of the variability in y that is explained by its linear association with x.

Example: r = 0.89, therefore r² = 0.79. Then 79% of the variability in y is explained by the linear relationship with x.
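The correlation coefficient can be computed from the same kind of sums used for the slope and intercept. The sketch below assumes the standard sample (Pearson) form of r; the function name `correlation_coefficient` and the scattered demo data are invented for illustration.

```python
import math


def correlation_coefficient(x, y):
    """Return the sample correlation coefficient r for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)

    spread_x = n * sum_xx - sum_x ** 2
    spread_y = n * sum_yy - sum_y ** 2
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when x or y does not vary")

    return (n * sum_xy - sum_x * sum_y) / math.sqrt(spread_x * spread_y)


# Made-up data with some scatter about an upward trend.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 4, 5, 4, 5]
r = correlation_coefficient(x_demo, y_demo)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.2f}")
```

For this demo data r is about 0.77 and r² is 0.60, so 60% of the variability in y is explained by the linear relationship with x.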

Regression is for prediction!

Correlation is the strength of the prediction!

Statistical analysis can be performed using Excel or Lotus 1-2-3, so it is not necessary to perform hand calculations using the above equations.

Excel's regression analysis tool performs linear regression by using the "least squares" method to fit a line through a set of observations. Students can analyze how a single dependent variable is affected by the values of one or more independent variables; for example, how an athlete's performance is affected by such factors as age, height, and weight. Students can apportion shares in the performance measure to each of these three factors, based on a set of performance data, and then use the results to predict the performance of a new, untested athlete.

For the following set of experimental data, regression analysis was performed using Excel.

Strain (μ mm/mm) / Stress (MPa)
0 / 0
180 / 5
570 / 10
700 / 15
1075 / 20
1300 / 25
1600 / 30
1690 / 35