D:\SG\aquila\qm\lehre\tacs-ue\ilmt\stat\STATG.DOC   Version 18.05.2004   page 1 of 10
The Use of Statgraphics 5.0
IN AND OUTPUT
1. Import of Data Files
2. Input of Data
3. Modification and Output of Results
MULTIPLE COMPARISONS
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ANOVA - Is there any difference between the samples?
7. Multiple Range Tests - Which samples are different?
VARIANCE COMPONENTS (Nested Designs): error of sampling, analysis
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ESTIMATE VARIANCE COMPONENTS
EXPERIMENTAL DESIGNS
1. Create Experiment
2. Run Experiments
3. Enter Data
4. Analyze Data
Files:
- In and Output
IO.XLS
REGR.TXT
- Anova
HVA.CSV
- Experimental Designs
TAU.XLS
TAU1.SFX
IN AND OUTPUT
1. Import of Data Files
Configuration: \Windows\Control Panel\Regional Settings\Number (German Windows: \Systemsteuerung\Ländereinstellung\Zahl):
Decimal symbol ...... .
Digit grouping symbol ...... blank
List separator ...... ;
File /Open Data File /File type: All Files (*.*)
- Excel: /IO.XLS /Variable Names from first row
- CSV: /HVA.CSV /comma delimited /Variable Names from first row
- Textfile: /REGR.TXT /tab delimited /Variable Names from first row
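For readers who want to check an import outside Statgraphics, the following is a minimal Python sketch of reading a comma-delimited file with variable names in the first row. The in-memory text `sample` is a hypothetical stand-in for the first lines of HVA.CSV; the real file layout may differ.

```python
import csv
import io

# Hypothetical stand-in for the first lines of HVA.CSV (comma delimited,
# variable names in the first row). The real file may be laid out differently.
sample = "PROBE,NR,TS\n1,1,90.91\n1,7,90.60\n1,11,90.40\n"

# DictReader takes the variable names from the first row.
reader = csv.DictReader(io.StringIO(sample), delimiter=",")
rows = [
    {"PROBE": int(r["PROBE"]), "NR": int(r["NR"]), "TS": float(r["TS"])}
    for r in reader
]

for row in rows:
    print(row)
```

For a tab-delimited text file such as REGR.TXT, the same sketch should work with `delimiter="\t"`.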
2. Input of Data
- Mark column /right mouse button /Modify Column
x y
1 11.5
2 12.4
3 13
4 16
5 17
- File /Save Data File as: Regr.sf
3. Modification and Output of Results
ANALYSIS
- File /Open Data File /regr.sf
- Relate /Simple Regression
- Tabular Options: Analysis Summary
- Graphical Options: Plot fitted model, Residuals vs. x
Modification
- click window 2x with left mouse button
- Click element with right mouse button
- Options
OUTPUT
a) to Statreporter: click window with right mouse button /Copy to Statreporter
b) from Statreporter to Winword: copy and paste
Textwindow
- click window 2x with left mouse button /mark text /Icon Cut -> Winword: insert as text
- (click window 2x with left mouse button /Icon Copy -> Winword: insert as object)
save Graphic to file
- Save Graph as regr.wmf
Without colours: Graphics \Options \Profile: Black and White
File \Page Setup: Black and White
Regression Analysis - Linear model: Y = a + b*X
Dependent variable: Y
Independent variable: X
                         Standard     T
Parameter    Estimate    Error        Statistic    P-Value
------
Intercept    9.6         0.73964      12.9793      0.0010
Slope        1.46        0.22301      6.5468       0.0072
------
Analysis of Variance
Source Sum of Squares Df Mean Square F-Ratio P-Value
------
Model 21.316 1 21.316 42.86 0.0072
Residual 1.492 3 0.497333
------
Total (Corr.) 22.808 4
Correlation Coefficient = 0.966739
R-squared = 93.4584 percent
Standard Error of Est. = 0.705219
The StatAdvisor
The output shows the results of fitting a linear model to describe the relationship between Y and X. The equation of the fitted model is:
Y = 9.6 + 1.46*X
Since the P-value in the ANOVA table is less than 0.01, there is a statistically significant relationship between Y and X at the 99% confidence level.
The R-Squared statistic indicates that the model as fitted explains 93.4584% of the variability in Y.
The correlation coefficient equals 0.966739, indicating a relatively strong relationship between the variables.
The standard error of the estimate shows the standard deviation of the residuals to be 0.705219. This value can be used to construct prediction limits for new observations by selecting the Forecasts option from the text menu.
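The regression output above can be cross-checked by hand. The following is a minimal Python sketch that recomputes the least-squares fit from the five x/y pairs entered under "Input of Data"; the variable names are chosen for illustration only and are not part of Statgraphics.

```python
from math import sqrt

# Data entered under "2. Input of Data"
x = [1, 2, 3, 4, 5]
y = [11.5, 12.4, 13, 16, 17]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross products about the means
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
syy = sum((yi - mean_y) ** 2 for yi in y)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

ss_model = slope * sxy          # "Model" sum of squares
ss_resid = syy - ss_model       # "Residual" sum of squares
r_squared = ss_model / syy
std_error_est = sqrt(ss_resid / (n - 2))
f_ratio = ss_model / (ss_resid / (n - 2))

print(f"Y = {intercept:.4g} + {slope:.4g}*X")
print(f"R-squared = {100 * r_squared:.4f} percent")
print(f"Standard Error of Est. = {std_error_est:.6f}")
print(f"F-Ratio = {f_ratio:.2f}")
```

The P-values are not recomputed here because they need the t and F distributions, which the Python standard library does not provide.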
MULTIPLE COMPARISONS
Problem: Which of the 5 products differ in moisture content?
Each product is analysed 5 times. The product averages are compared by multiple range tests.
1. Import of Data
HVA.CSV (comma delimited):
!The variables PROBE and NR must be sorted!
PROBE NR TS
1 1 90.91
1 7 90.60
1 11 90.40
1 13 90.52
1 15 90.77
2 2 90.79
2 5 90.36
2 8 90.32
2 18 90.59
2 21 90.51
3 4 90.27
3 12 90.37
3 17 90.38
3 20 90.31
3 24 90.49
4 6 90.57
4 9 90.82
4 14 90.63
4 19 90.98
4 23 90.44
5 3 90.24
5 10 90.35
5 16 90.15
5 22 90.08
5 25 90.46
2. Visualization of Data
a) Test for Trend: Data versus sequence of measurements
Icon Scatterplot: NR->X TS->Y PROBE->Select
Pane Options: PROBE->Point Codes, Points+Lines
b) Visual test for Outlier and Distribution
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
- Scatterplot
Graphic Options: Scatterplot
- Box-and-Whisker Plot
Graphic Options: Box-and-Whisker-Plot
Pane Options: vertical
3. Test for Outlier
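Grubbs: $PW = |x_i - \bar{x}| / s < T(n; \alpha)$, where PW is the test value, $\bar{x}$ and $s$ are the mean and standard deviation of the sample, $n$ is the number of replications and $\alpha$ is the significance level.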
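A minimal Python sketch of the Grubbs test value, using the five TS values of PROBE 1 from HVA.CSV as an example. The function name `grubbs_statistic` is hypothetical, and the critical value T still has to be taken from a table for the chosen number of replications and significance level.

```python
from statistics import mean, stdev

def grubbs_statistic(values):
    """Largest absolute deviation from the mean, in units of the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return max(abs(v - m) for v in values) / s

# TS values of PROBE 1 from HVA.CSV
probe_1 = [90.91, 90.60, 90.40, 90.52, 90.77]

pw = grubbs_statistic(probe_1)
print(f"PW = {pw:.4f}")  # compare with the tabulated critical value T
```

If PW stays below the tabulated T, the most extreme value is not flagged as an outlier.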
4. Test for Homogeneity of Variances
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Variance Check
Variance Check
Cochran's C test: 0.297977   P-Value = 0.99813
Bartlett's test: 1.19452   P-Value = 0.519824
Hartley's test: 6.5
Hartley: $PW = s^2_{max} / s^2_{min} < T(\alpha; \text{samples}; \text{replications} - 1)$
The StatAdvisor
The three statistics displayed in this table test the null hypothesis that the standard deviations of TS within each of the 5 levels of PROBE is the same. Of particular interest are the two P-values. Since the smaller of the P-values is greater than or equal to 0.05, there is not a statistically significant difference amongst the standard deviations at the 95.0% confidence level.
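The Cochran and Hartley statistics in the Variance Check table can be recomputed directly from the data. The following is a minimal Python sketch using the TS values from HVA.CSV; the dictionary `ts` and the other names are chosen for illustration only. Bartlett's test and the P-values are not reproduced here.

```python
from statistics import variance

# TS values from HVA.CSV, grouped by PROBE
ts = {
    1: [90.91, 90.60, 90.40, 90.52, 90.77],
    2: [90.79, 90.36, 90.32, 90.59, 90.51],
    3: [90.27, 90.37, 90.38, 90.31, 90.49],
    4: [90.57, 90.82, 90.63, 90.98, 90.44],
    5: [90.24, 90.35, 90.15, 90.08, 90.46],
}

# Sample variance of each PROBE level
variances = {probe: variance(values) for probe, values in ts.items()}

# Cochran's C: largest variance divided by the sum of all variances
cochran_c = max(variances.values()) / sum(variances.values())

# Hartley: largest variance divided by the smallest variance
hartley = max(variances.values()) / min(variances.values())

print(f"Cochran's C = {cochran_c:.6f}")
print(f"Hartley's ratio = {hartley:.2f}")
```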
5. Test for Normal Distribution
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Summary Statistics \Pane Options: Selection of parameters
Normal distribution (NV) can be assumed if the standardized skewness and standardized kurtosis lie within +/-2.
Summary Statistics for TS
PROBE Count Average
1 5 90.64
2 5 90.514
3 5 90.364
4 5 90.688
5 5 90.256
------
Total 25 90.4924
PROBE Variance Standard deviation
1 0.04085 0.202114
2 0.03583 0.189288
3 0.00698 0.0835464
4 0.04537 0.213002
5 0.02323 0.152414
------
Total 0.0530607 0.230349
PROBE Minimum Maximum
1 90.4 90.91
2 90.32 90.79
3 90.27 90.49
4 90.44 90.98
5 90.08 90.46
------
Total 90.08 90.98
PROBE Range Stnd. skewness
1 0.51 0.288577
2 0.47 0.589419
3 0.22 0.663105
4 0.54 0.39776
5 0.38 0.287198
------
Total 0.9 0.899742
PROBE Stnd. kurtosis Sum
1 -0.530679 453.2
2 -0.178296 452.57
3 0.314799 451.82
4 -0.446946 453.44
5 -0.589814 451.28
------
Total -0.279892 2262.31
The StatAdvisor
This table shows various statistics for TS for each of the 5 levels of PROBE. The one-way analysis of variance is primarily intended to compare the means of the different levels, listed here under the Average column. Select Means Plot from the list of Graphical Options to display the means graphically.
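R/s test (David): $T_u(n; \alpha) < PW = R/s < T_o(n; \alpha)$, where R is the range, s the standard deviation, n the number of replications, and $T_u$, $T_o$ are the lower and upper critical values.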
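A minimal Python sketch of the R/s test value for each PROBE level, computed from the TS values in HVA.CSV. The names are chosen for illustration only; the lower and upper critical values still have to be taken from a table.

```python
from statistics import stdev

# TS values from HVA.CSV, grouped by PROBE
ts = {
    1: [90.91, 90.60, 90.40, 90.52, 90.77],
    2: [90.79, 90.36, 90.32, 90.59, 90.51],
    3: [90.27, 90.37, 90.38, 90.31, 90.49],
    4: [90.57, 90.82, 90.63, 90.98, 90.44],
    5: [90.24, 90.35, 90.15, 90.08, 90.46],
}

# Test value PW = range / sample standard deviation, per PROBE level
r_over_s = {
    probe: (max(values) - min(values)) / stdev(values)
    for probe, values in ts.items()
}

for probe, pw in r_over_s.items():
    print(f"PROBE {probe}: R/s = {pw:.4f}")
```

Each value is then compared with the tabulated lower and upper limits for 5 replications.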
6. ANOVA - Is there any difference between the samples?
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Anova Table
ANOVA Table for TS by PROBE
Analysis of Variance
Source Sum of Squares Df Mean Square F-Ratio P-Value
------
Between groups 0.664416 4 0.166104 5.45 0.0039
Within groups 0.60904 20 0.030452
------
Total (Corr.) 1.27346 24
The StatAdvisor: The ANOVA table decomposes the variance of TS into two components: a between-group component and a within-group component. The F-ratio, which in this case equals 5.45462, is a ratio of the between-group estimate to the within-group estimate. Since the P-value of the F-test is less than 0.05, there is a statistically significant difference between the mean TS from one level of PROBE to another at the 95.0% confidence level. To determine which means are significantly different from which others, select Multiple Range Tests from the list of Tabular Options.
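The decomposition described above can be retraced with a few lines of Python. This is a minimal sketch using the TS values from HVA.CSV, with names chosen for illustration only. The P-value is not recomputed because it needs the F distribution.

```python
from statistics import mean

# TS values from HVA.CSV, grouped by PROBE
ts = {
    1: [90.91, 90.60, 90.40, 90.52, 90.77],
    2: [90.79, 90.36, 90.32, 90.59, 90.51],
    3: [90.27, 90.37, 90.38, 90.31, 90.49],
    4: [90.57, 90.82, 90.63, 90.98, 90.44],
    5: [90.24, 90.35, 90.15, 90.08, 90.46],
}

all_values = [v for values in ts.values() for v in values]
grand_mean = mean(all_values)

# Between groups: spread of the group means around the grand mean
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2 for values in ts.values()
)
# Within groups: spread of the single values around their group mean
ss_within = sum(
    (v - mean(values)) ** 2 for values in ts.values() for v in values
)

df_between = len(ts) - 1
df_within = len(all_values) - len(ts)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"Between groups: SS = {ss_between:.6f}, Df = {df_between}")
print(f"Within groups:  SS = {ss_within:.5f}, Df = {df_within}")
print(f"F-Ratio = {f_ratio:.5f}")
```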
7. Multiple Range Tests - Which samples are different?
Tabular Options: Multiple Range Tests
Pane Options: LSD, Tukey HSD, Scheffe, Bonferroni, Student-Newman-Keuls, Duncan
Multiple Range Tests for TS by PROBE: Method: 95.0 percent LSD
PROBE  Count  Mean     Homogeneous Groups
------
5      5      90.256   X
3      5      90.364   XX
2      5      90.514    XX
1      5      90.64      X
4      5      90.688     X
Contrast Difference +/- Limits
------
1 - 2 0.126 0.230221
1 - 3 *0.276 0.230221
1 - 4 -0.048 0.230221
1 - 5 *0.384 0.230221
2 - 3 0.15 0.230221
2 - 4 -0.174 0.230221
2 - 5 *0.258 0.230221
3 - 4 *-0.324 0.230221
3 - 5 0.108 0.230221
4 - 5 *0.432 0.230221
------
* denotes a statistically significant difference.
The StatAdvisor: This table applies a multiple comparison procedure to determine which means are significantly different from which others. The bottom half of the output shows the estimated difference between each pair of means. An asterisk has been placed next to 5 pairs, indicating that these pairs show statistically significant differences at the 95.0% confidence level. At the top of the page, 3 homogenous groups are identified using columns of X's. Within each column, the levels containing X's form a group of means within which there are no statistically significant differences. The method currently being used to discriminate among the means is Fisher's least significant difference (LSD) procedure. With this method, there is a 5.0% risk of calling each pair of means significantly different when the actual difference equals 0.
Multiple Range Tests for TS by PROBE: Method: 95.0 percent Bonferroni
PROBE  Count  Mean     Homogeneous Groups
------
5      5      90.256   X
3      5      90.364   XX
2      5      90.514   XX
1      5      90.64     X
4      5      90.688    X
Contrast Difference +/- Limits
------
1 - 2 0.126 0.348031
1 - 3 0.276 0.348031
1 - 4 -0.048 0.348031
1 - 5 *0.384 0.348031
2 - 3 0.15 0.348031
2 - 4 -0.174 0.348031
2 - 5 0.258 0.348031
3 - 4 -0.324 0.348031
3 - 5 0.108 0.348031
4 - 5 *0.432 0.348031
------
* denotes a statistically significant difference.
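The +/- limits of both tables follow from the within-group mean square of the ANOVA table. The following is a minimal Python sketch. The two t quantiles for 20 degrees of freedom are assumed table values (they are not printed by Statgraphics), and the function name `significant_pairs` is hypothetical.

```python
from itertools import combinations
from math import sqrt

# Group means and within-group mean square taken from the output above
means = {1: 90.64, 2: 90.514, 3: 90.364, 4: 90.688, 5: 90.256}
n_per_group = 5
ms_within = 0.030452

# Assumed two-sided t quantiles for 20 d.f., taken from a t table
T_LSD = 2.086    # alpha = 0.05 for each single comparison
T_BONF = 3.153   # alpha = 0.05 / 10, shared over the 10 pairwise comparisons

# Standard error of the difference between two group means
se_diff = sqrt(ms_within * 2 / n_per_group)

def significant_pairs(t_crit):
    """Return the +/- limit and the pairs whose mean difference exceeds it."""
    limit = t_crit * se_diff
    pairs = [
        (a, b)
        for a, b in combinations(sorted(means), 2)
        if abs(means[a] - means[b]) > limit
    ]
    return limit, pairs

lsd_limit, lsd_pairs = significant_pairs(T_LSD)
bonf_limit, bonf_pairs = significant_pairs(T_BONF)

print(f"LSD limit = {lsd_limit:.4f}, significant pairs: {lsd_pairs}")
print(f"Bonferroni limit = {bonf_limit:.4f}, significant pairs: {bonf_pairs}")
```

The wider Bonferroni limit reduces the number of flagged pairs, which matches the asterisks in the two contrast tables.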
VARIANCE COMPONENTS (Nested Designs): error of sampling, analysis
Problem: How large are the contributions of the sampling method and of the analysis method to the variability of the analysed moisture content?
To quantify the variance within the samples (analysis) and the variance between the sample averages (sampling), 5 samples are drawn from a bag and homogenized, and each sample is analysed 5 times.
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ESTIMATE VARIANCE COMPONENTS
\Compare\Analysis of Variance\Variance Components
PROBE->Factors in Order of Nesting TS->Dependent Variable
Tabular Options: Analysis Summary
Variance Components Analysis
Dependent variable: TS
Factors: PROBE
Number of complete cases: 25
Analysis of Variance for TS
Source Sum of Squares Df Mean Square Var. Comp. Percent Index
------
TOTAL (CORRECTED) 1.27346 24
------
PROBE 0.664416 4 0.166104 0.0271304 47.12 1
ERROR 0.60904 20 0.030452 0.030452 52.88 0
------
The StatAdvisor: The analysis of variance table shown here divides the variance of TS into 1 components, one for each factor. Each factor after the first is nested in the one above. The goal of such an analysis is usually to estimate the amount of variability contributed by each of the factors, called the variance components. In this case, the factor contributing the most variance is ERROR. Its contribution represents 52.8842% of the total variation in TS.
Error of sampling: $s_1^2 = (MQ_1 - MQ_0) / k$, where $k$ is the number of replications
Confidence limits:
lower limit $= \sqrt{(MQ_1 L_1^2 - MQ_0) / k} < s_1 <$ upper limit $= \sqrt{(MQ_1 L_2^2 - MQ_0) / k}$
with the factors $L_1(\alpha, Df_1, Df_0)$ and $L_2(\alpha, Df_1, Df_0)$
Error of analysis: $s_0^2 = MQ_0$
Confidence limits:
lower limit $= s_0 L_1 < s_0 <$ upper limit $= s_0 L_2$
with the factors $L_1(\alpha, Df_0, \infty)$ and $L_2(\alpha, Df_0, \infty)$
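The point estimates of both variance components can be recomputed from the mean squares in the table above. The following is a minimal Python sketch with names chosen for illustration only. The confidence limits are not computed because the factors L1 and L2 have to be taken from a table.

```python
from math import sqrt

# Mean squares from the variance components table
mq1 = 0.166104   # PROBE (between samples)
mq0 = 0.030452   # ERROR (within samples)
k = 5            # replications per sample

# Error of sampling and error of analysis as variances
var_sampling = (mq1 - mq0) / k
var_analysis = mq0

total = var_sampling + var_analysis
percent_sampling = 100 * var_sampling / total
percent_analysis = 100 * var_analysis / total

print(f"s1^2 = {var_sampling:.7f}  ({percent_sampling:.2f} %), s1 = {sqrt(var_sampling):.4f}")
print(f"s0^2 = {var_analysis:.6f}  ({percent_analysis:.2f} %), s0 = {sqrt(var_analysis):.4f}")
```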
EXPERIMENTAL DESIGNS
Problem: How large is the effect of heating time and concentration of starch solutions on the viscosity of the gelatinised starch?
Suspensions of starch in water at different concentrations are heated at 80 °C for different times. Defined shear tests are then carried out on these samples. The effects of starch concentration (Conc) and heating time (Time) on the shear resistance (tau) at D = 300 s^-1 are quantified.
1. Create Experiment
\Special \Experimental Design \Create Design \Screening Design
2 Factors, 1 Response, Fractional Design, 0 Center Point, 1 Replication
Randomize
correct Block
Tabular Options: Design Summary, Worksheet
Save Design File tau.sfx
Print Worksheet
Design Summary
Design class: Screening
Design name: Factorial 2^2
Base Design
Number of experimental factors: 2 Number of blocks: 1
Number of responses: 1 Number of centerpoints per block: 0
Number of runs: 8
Randomized: Yes
Factors Low High Units Continuous
------
Conc -1.0 1.0 Yes
Time -1.0 1.0 Yes
Responses Units
------
tau
The StatAdvisor: You have created a Factorial design which will study the effects of 2 factors in 8 runs. The design is to be run in a single block. The order of the experiments has been fully randomized. This will provide protection against the effects of lurking variables.
2. Run Experiments
3. Enter Data
\Special \Experimental Design \Open Design: tau.sfx
Tabular Options: Design Summary, Worksheet
!!!! take care of correct input of tau to the corresponding experiments !!!!
run  BLOCK  Conc  Time  tau
4    1      -1    -1    40
10   1       1    -1    105
7    1      -1     1    130
3    1       1     1    119
2    1      -1    -1    42
9    1       1    -1    98
5    1      -1     1    134
8    1       1     1    122
4. Analyze Data
\Special \Experimental Design \Analyze Design
Analysis Options:
max. Order Effect: 2
-> ignore Block number
Estimated Sigma from: Experimental Data
Tabular Options: Analysis Summary, ANOVA Table, Regression coeff., Optimization
Graphical Options: Pareto Chart, Main Effects, Interaction Plot, Response Plots, Diagnostic Plots
Analysis Summary
Estimated effects for tau
average = 98.75 +/- 1.10397
A:CONC = 24.5 +/- 2.20794
B:TIME = 55.0 +/- 2.20794
AB = -36.0 +/- 2.20794
------
Standard errors are based on total error with 4 d.f.
The StatAdvisor: This table shows each of the estimated effects and interactions. Also shown is the standard error of each of the effects, which measures their sampling error.
To plot the estimates in decreasing order of importance, select Pareto Charts from the list of Graphical Options.
To test the statistical significance of the effects, select ANOVA Table from the list of Tabular Options.
You can then remove insignificant effects by pressing the alternate mouse button, selecting Analysis Options, and pressing the Exclude button.
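The estimated effects can be retraced by hand from the worksheet. The following is a minimal Python sketch that reads the worksheet rows as (Conc, Time, tau) in coded units; the list `runs` and the function `effect` are hypothetical names. Each effect is the mean of tau at the high level minus the mean of tau at the low level.

```python
from statistics import mean

# Worksheet rows read as (Conc, Time, tau) in coded units (-1 = low, +1 = high)
runs = [
    (-1, -1, 40), (1, -1, 105), (-1, 1, 130), (1, 1, 119),
    (-1, -1, 42), (1, -1, 98), (-1, 1, 134), (1, 1, 122),
]

def effect(contrast):
    """Mean of tau where the contrast is +1 minus mean of tau where it is -1."""
    high = [tau for conc, time, tau in runs if contrast(conc, time) > 0]
    low = [tau for conc, time, tau in runs if contrast(conc, time) < 0]
    return mean(high) - mean(low)

average = mean(tau for _, _, tau in runs)
effect_conc = effect(lambda conc, time: conc)
effect_time = effect(lambda conc, time: time)
effect_ab = effect(lambda conc, time: conc * time)   # interaction contrast

print(f"average = {average}")
print(f"A:CONC  = {effect_conc}")
print(f"B:TIME  = {effect_time}")
print(f"AB      = {effect_ab}")
```

In a two-level design each regression coefficient in coded units is half the corresponding effect.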
Analysis of Variance for TAU:
Source Sum of Squares Df Mean Square F-Ratio P-Value
------
A:CONC 1200.5 1 1200.5 123.13 0.0004
B:TIME 6050.0 1 6050.0 620.51 0.0000
AB 2592.0 1 2592.0 265.85 0.0001
Total error 39.0 4 9.75
------
Total (corr.) 9881.5 7
R-squared = 99.6053 percent
R-squared (adjusted for d.f.) = 99.3093 percent
Standard Error of Est. = 3.1225
Mean absolute error = 2.0
Durbin-Watson statistic = 2.76282
The StatAdvisor: The ANOVA table partitions the variability in TAU into separate pieces for each of the effects. It then tests the statistical significance of each effect by comparing the mean square against an estimate of the experimental error. In this case, 3 effects have P-values less than 0.05, indicating that they are significantly different from zero at the 95.0% confidence level.
The R-Squared statistic indicates that the model as fitted explains 99.6053% of the variability in TAU. The adjusted R-squared statistic, which is more suitable for comparing models with different numbers of independent variables, is 99.3093%.
The standard error of the estimate shows the standard deviation of the residuals to be 3.1225.
The mean absolute error (MAE) of 2.0 is the average absolute value of the residuals.
The Durbin-Watson (DW) statistic tests the residuals to determine if there is any significant correlation based on the order in which they occur in your data file. Since the DW value is > 1.4, there is probably not any serious autocorrelation in the residuals.
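The summary statistics above can be recomputed from the worksheet. The following is a minimal Python sketch that reads the worksheet rows as (Conc, Time, tau) in the order listed. It assumes that the fitted value of each run equals the mean of the two replicates at the same settings, which holds for a 2^2 model with both main effects and the interaction. All names are chosen for illustration only.

```python
from collections import defaultdict
from math import sqrt

# Worksheet rows in the listed order, read as (Conc, Time, tau)
runs = [
    (-1, -1, 40), (1, -1, 105), (-1, 1, 130), (1, 1, 119),
    (-1, -1, 42), (1, -1, 98), (-1, 1, 134), (1, 1, 122),
]

# Fitted value per factor combination = mean of its replicates (assumption above)
cells = defaultdict(list)
for conc, time, tau in runs:
    cells[(conc, time)].append(tau)
fitted = {key: sum(values) / len(values) for key, values in cells.items()}

residuals = [tau - fitted[(conc, time)] for conc, time, tau in runs]

n = len(runs)
n_coeffs = 4  # constant, CONC, TIME, CONC*TIME
mean_tau = sum(tau for _, _, tau in runs) / n
ss_total = sum((tau - mean_tau) ** 2 for _, _, tau in runs)
ss_resid = sum(r * r for r in residuals)

r_squared = 1 - ss_resid / ss_total
std_error_est = sqrt(ss_resid / (n - n_coeffs))
mae = sum(abs(r) for r in residuals) / n

# Durbin-Watson: squared differences of successive residuals over the residual SS
durbin_watson = sum(
    (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)
) / ss_resid

print(f"R-squared = {100 * r_squared:.4f} percent")
print(f"Standard Error of Est. = {std_error_est:.4f}")
print(f"Mean absolute error = {mae}")
print(f"Durbin-Watson statistic = {durbin_watson:.5f}")
```

The Durbin-Watson value depends on the order of the rows, so a different row order gives a different result.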
Pareto Chart: Pane Options: Standardized
- all factors and interactions are significant
Regression coeffs. for tau
constant = 98.75
A:CONC = 12.25
B:TIME = 27.5
AB = -18.0
The StatAdvisor: This pane displays the regression equation which has been fitted to the data. The equation of the fitted model is
TAU = 98.75 + 12.25*CONC + 27.5*TIME - 18.0*CONC*TIME
where the values of the variables are specified in their original units.
To have STATGRAPHICS evaluate this function, select Predictions from the list of Tabular Options.
To plot the function, select Response Plots from the list of Graphical Options.
- at high CONC, TIME has only a small effect
- at high TIME, CONC has almost no effect
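The two observations can be checked by evaluating the fitted equation at the corners of the design. The following is a minimal Python sketch; the function names are hypothetical and the factor values are taken in the coded range -1 to +1 used in the design summary.

```python
def tau_model(conc, time):
    """Fitted model from the regression coefficients, factors in coded units."""
    return 98.75 + 12.25 * conc + 27.5 * time - 18.0 * conc * time

def time_effect(conc):
    """Change in TAU when TIME goes from low (-1) to high (+1) at fixed CONC."""
    return tau_model(conc, 1) - tau_model(conc, -1)

def conc_effect(time):
    """Change in TAU when CONC goes from low (-1) to high (+1) at fixed TIME."""
    return tau_model(1, time) - tau_model(-1, time)

print("TIME effect at low CONC: ", time_effect(-1))
print("TIME effect at high CONC:", time_effect(1))
print("CONC effect at low TIME: ", conc_effect(-1))
print("CONC effect at high TIME:", conc_effect(1))
```

The negative interaction term is what shrinks each effect when the other factor is at its high level.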
Surface Plot: Pane Options: show points
Contour Plot: Pane Options: Painted Regions
- the same TAU can be obtained with low CONC at high TIME
- at high CONC, TIME has only a small effect
Diagnostic Plot: Pane Options: Residuals vs. Run Order
Pane Options: Residuals vs. Factor A:conc