Valid Statistical Techniques, v1.03

Copyright 2017 by John N. Zorich Jr.; all rights reserved, with the following exception:
Anyone may use all or part of this document in their own company's internal procedures.

To create your own copy, "select all", "copy", & "paste" into a blank Word document.

Some text boxes may need adjustment if they overlap when viewed in Word_2011 or earlier.
Your compliments, criticisms, and suggestions regarding this document would be welcomed.

For the most up-to-date version, write to: JOHNZORICH@YAHOO.COM or visit JOHNZORICH.COM

1.  PURPOSE

The purpose of this procedure is to provide guidance on how to appropriately (i.e. validly) use some of the most common statistical techniques. In some cases, formulas are provided; however, it is not the purpose of this procedure to provide instructions for how to perform the calculations for every statistical technique discussed here; such instructions are easily obtained from published articles, reputable websites, or statistics textbooks.

2.  SCOPE

This procedure applies to common statistical techniques that are used to analyze product sample statistics so as to determine the corresponding product population parameters, in order to draw conclusions about the acceptability of the population (e.g. a lot) or of the process that produced the population. Examples of such statistics include averages, percentages, and %-in-specification. This procedure does not apply to statistical techniques used to analyze clinical trials data.

3.  GENERAL REFERENCES (other references are given in the body of the SOP)

a.  NIST/SEMATECH e-Handbook of Statistical Methods
This book is freely available at: http://www.itl.nist.gov/div898/handbook

The National Institute of Standards and Technology (NIST) is an agency of the United States Department of Commerce.

b.  Wikipedia ( https://en.wikipedia.org ) contains articles on many of the topics covered here.

4.  DEFINITIONS OF KEY TERMS (as used in this document)

a.  Product: raw material, component, in-process or finished goods

b.  Process: an activity that outputs a product

c.  Sample: a representative subset of a population

d.  Population: the entire data set or physical group from which the sample was taken; commonly, a population is a lot (or batch), but a population can also be regarded as the process from which a lot has been derived (e.g. during process validation, an entire pilot or validation lot can be considered a sample from the "population" process that produced it)

e.  Statistic: a summary value (e.g. an average) that has been calculated based upon inspection or measurement of all the members of a sample; a sample statistic is used to estimate the corresponding parameter

f.  Parameter: a summary value (e.g. an average) that has been calculated based upon inspection or measurement of all the members of a population

g.  Standard Error: the standard deviation of the population of sample statistics (e.g. the standard error of the sample mean is the standard deviation of the population of means that is obtained by taking the mean of each of all possible samples of a given size from a given population)

h.  Excel: the spreadsheet software program sold by Microsoft Corporation
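As an illustration of definition (g), the standard error of a sample mean can be demonstrated by simulation; the following Python sketch (for illustration only; it is not part of this procedure) draws many samples from a known population, takes each sample's mean, and compares the standard deviation of those means to the theoretical standard error.

```python
import random
import statistics
from math import sqrt

random.seed(1)  # fixed seed so the illustration is reproducible

pop_mean, pop_sd, n = 0.0, 1.0, 25

# Means of 10,000 samples of size n, each drawn from a Normal(0, 1) population
sample_means = [
    statistics.fmean(random.gauss(pop_mean, pop_sd) for _ in range(n))
    for _ in range(10_000)
]

# The standard deviation of the population of sample means is the standard error
empirical_se = statistics.stdev(sample_means)
theoretical_se = pop_sd / sqrt(n)  # sigma / sqrt(n) = 0.2 for n = 25
```

With 10,000 simulated samples, the empirical value lands within a few percent of the theoretical 0.2.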

5.  LIST OF STATISTICAL TECHNIQUES IN THIS SOP

·  Normality tests and normality transformations

·  Confidence intervals and limits

·  Coefficients of determination, and correlation coefficients

·  Statistical power and tests of statistical significance

·  Statistical process control (SPC)

·  Process capability and confidence/reliability calculations

·  AQL sampling plans

·  Guard-banded QC specifications

6.  FUNDAMENTAL VALIDITY CRITERIA FOR ALL STATISTICAL TECHNIQUES

Any statistical technique that is used within the scope of this procedure must meet the following criteria:

a.  Techniques must be taken from textbooks or articles that have been published by reputable publishing houses, journals, or websites (in the case of websites, reputability is a judgment call, the basis of which should be appropriately documented in quality-system records if the website is not a generally-recognized source); or taken from commercial statistical software programs.

b.  Output of the technique should provide information about the relevant population parameter, at a chosen level of confidence; a technique that outputs only a sample statistic should not be used for design verification, design or process or product validation, or product/lot disposition.

c.  Output of the technique must be as accurate and exact as is practically possible; that is, pre-computer-era "approximation" techniques should not be used for design verification, design or process or product validation, or product/lot disposition.

7.  NORMALITY TESTING AND NORMALITY TRANSFORMATIONS

To assess whether a data-set is normally distributed, plot the data on a Normal Probability Plot (a.k.a. NPP), which can be created by hand, in Excel, or with programs such as Minitab or StatGraphics. The plot will appear to be a straight line (with points scattered above and below the line in an approximately random fashion) only if the data is normally distributed; if the plot is not a straight line, then the data is not normally distributed. Although use of an NPP in such a manner is somewhat subjective, it is the best of all normality evaluation methods, as explained in the quotations below. There is no minimum sample size required; by contrast, other popular tests of normality require a minimum of 5 (e.g. the Shapiro-Francia W' test) or 6 (e.g. the Anderson-Darling A2* test) values.

To manually create an NPP, first arrange the data, as shown below for an n=12 sample:

Raw Data (sorted) / Rank / Median Rank / Z-table value **

where Median Rank = (Rank − 0.3) / (n + 0.4), and the Z-table value is the 1-sided value corresponding to the Median Rank:
8.80 / 1 / 0.0565 / -1.585
9.30 / 2 / 0.1371 / -1.093
10.30 / 3 / 0.2177 / -0.780
10.80 / 4 / 0.2984 / -0.529
11.00 / 5 / 0.3790 / -0.308
12.40 / 6 / 0.4597 / -0.101
13.23 / 7 / 0.5403 / 0.101
13.34 / 8 / 0.6210 / 0.308
14.98 / 9 / 0.7016 / 0.529
17.20 / 10 / 0.7823 / 0.780
18.06 / 11 / 0.8629 / 1.093
20.80 / 12 / 0.9435 / 1.585

** Microsoft Excel function "NORM.S.INV" can be used to obtain the appropriate
Z-table value. For example, NORM.S.INV(0.9435) = 1.585
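The Median Rank and Z-table columns above can also be generated outside of Excel; the following is a minimal Python sketch using only the standard library, in which NormalDist().inv_cdf plays the role of NORM.S.INV (the function name npp_points is illustrative, not from any library):

```python
from statistics import NormalDist

def npp_points(data):
    """Return (sorted value, median rank, z) triples for a Normal Probability
    Plot, using Median Rank = (rank - 0.3) / (n + 0.4)."""
    n = len(data)
    nd = NormalDist()  # the standard normal distribution (mean 0, sd 1)
    points = []
    for rank, x in enumerate(sorted(data), start=1):
        mr = (rank - 0.3) / (n + 0.4)
        points.append((x, mr, nd.inv_cdf(mr)))  # inv_cdf = NORM.S.INV
    return points

sample = [8.80, 9.30, 10.30, 10.80, 11.00, 12.40,
          13.23, 13.34, 14.98, 17.20, 18.06, 20.80]
pts = npp_points(sample)
# The last row reproduces the table above: median rank 0.9435, z 1.585
```

Plotting the z column against the sorted raw values then gives the NPP.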

Then plot the Z-table value (the right-hand column above) versus the raw data (the left-hand column above), as in the example plot:

[Plot: NPP of the example data]

Viewed closely, the plot of this data has a slight curve to it:

[Plot: the same NPP, with its slight curvature made visible]

If, as determined visually, the NPP has even a slight curve (as shown above), a transformation to normality must be sought that creates an NPP that is as straight as practically possible. Any subsequent statistical analysis is then performed on the transformed values, not on the original raw data; and any numerical requirement (e.g. a QC specification) that is used in the statistical analysis must be transformed in the same way as the raw data. If a transformation is used, the final results of the statistical analysis (e.g. a confidence interval) should be reported in units of the original raw data (by reverse-transforming them), not in units of the transformed values.

The following is a list of some commonly-used transformations to normality; if a transformation is needed, try them all and use the one that results in the straightest-looking NPP; only when it is not obvious which line is straightest should the transformation with the largest R2 value be chosen. When the range of the raw data (and QC specifications) spans 1.00, it may be useful to try the transformations after first adding 1.00 to each raw-data value (and to the QC specs).

TRANSFORMATION / MS EXCEL EQUIVALENT
Inverse (X) / = 1 / X
Square root (X) / = SQRT ( X )
Cube root (X) / = ( X ) ^ ( 1 / 3 )
Quadratic (X) / = ( X ) ^ 2
Cubic (X) / = ( X ) ^ 3
Logarithm (X) / = LN ( X )
Inverse Hyperbolic Sine (X) / = ASINH ( X )
Inverse Hyperbolic Sine (Square Root (X)) / = ASINH ( SQRT ( X ) )
Logit (X)
[ used only when all X values are between 0.00 and 1.00 ] / = LN ( X / ( 1 − X ) )

An "inverse" transformation was found to produce the straightest line for the sample data used in the previous plot:

[Plot: NPP of the inverse-transformed data, now acceptably straight]
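This choice of transformation can be double-checked numerically; the Python sketch below (illustrative only; the function name npp_r_squared is not from any library) computes the NPP R2 for the raw and the inverse-transformed example data, confirming that the inverse transformation yields the straighter plot:

```python
from statistics import NormalDist

def npp_r_squared(data):
    """R2 of a Normal Probability Plot: squared Pearson correlation between
    the sorted data and the median-rank Z-values (straighter plot -> higher R2)."""
    n = len(data)
    xs = sorted(data)
    zs = [NormalDist().inv_cdf((i - 0.3) / (n + 0.4)) for i in range(1, n + 1)]
    mx = sum(xs) / n
    mz = sum(zs) / n
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz * sxz / (sxx * szz)

sample = [8.80, 9.30, 10.30, 10.80, 11.00, 12.40,
          13.23, 13.34, 14.98, 17.20, 18.06, 20.80]
r2_raw = npp_r_squared(sample)
r2_inverse = npp_r_squared([1 / x for x in sample])
# r2_inverse exceeds r2_raw: the inverse transformation straightens the NPP
```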

8.  CONFIDENCE INTERVALS AND LIMITS

Any statistic (e.g. a sample average) is an estimate of the true value for the population from which the sample was taken; that true value is called the population parameter. The classic way to quantify the accuracy of that estimate is to calculate the relevant "confidence interval". A confidence interval was described by its inventor as a range "in which we may assume are contained the [parameter] values of the estimated characters of the population"; a confidence level (e.g. 95%) can be assigned to such an assumption (J. Neyman, "On the Two Different Aspects of the Representative Method," Journal of the Royal Statistical Society, 97, no. 4 (1934): p. 562).

Two-sided confidence intervals are ranges that extend both above and below the value of the sample statistic. A one-sided confidence interval has a single calculated limit on one side of the statistic, and extends from that limit as far as possible in the opposite direction (e.g. to infinity, or to a natural limit), as shown in the figure below.

In some cases, a confidence interval has a natural limit; for example, the largest possible value in a confidence interval for a proportion is 100%, and the smallest possible value in a confidence interval for a weight is 0.000 gram.

Any sample size is valid when calculating confidence intervals and limits. As can be seen in the figure below, the width of confidence intervals changes depending upon sample size. A narrow confidence interval based upon a large sample is no more valid than a wide confidence interval based upon a small sample size. They are both, e.g., 95% confidence estimates for the population parameter.

a.  CONFIDENCE LIMITS FOR COUNT-DATA PROPORTIONS

Proportions are ratios. For example, if there are 90 good parts in a sample of 100 parts, the proportion of good parts is 90/100. Similarly, if there are 10 defective parts in a sample of 100, the proportion of defective parts is 10/100. The numerator is always the count of the characteristic of interest (e.g. good, bad, green, red, etc.).

Proportions can be expressed as a decimal fraction, e.g. 0.10, or as a percentage, e.g. 10.0%. There are many published methods for calculating confidence intervals for a proportion; unfortunately, the intervals and limits produced by those various methods can be dramatically different, in part because some of the methods are rough approximations of the exact method. For example, the Z-table method gives an incorrectly narrow interval with incorrectly placed upper and lower confidence limits. The method described below is called the "Exact Confidence Interval"; it is the oldest method and is generally considered the "gold standard" (L. D. Brown, et al., "Interval Estimation for a Binomial Proportion," Statistical Science, 16, no. 2 (2001), "comment" p. 117).

When using software (e.g. Minitab or StatGraphics) to calculate confidence intervals on a sample proportion, and if given a choice as to which method to use, always choose the "Exact Confidence Interval".

To calculate Exact confidence limits for proportions using Excel, use the following formulas:

Upper 2-sided limit = BETA.INV ( 1 − (1 − C) / 2 , k + 1 , n − k )

Lower 2-sided limit = BETA.INV ( (1 − C) / 2 , k , n − k + 1 )

Upper 1-sided limit = BETA.INV ( C , k + 1 , n − k )

Lower 1-sided limit = BETA.INV ( 1 − C , k , n − k + 1 )

where

C = confidence desired, as a decimal fraction (e.g. use 0.95 for 95% confidence)

n = sample size

k = count of the characteristic of interest (e.g. "defective part") in the sample

For example, if C = 0.95, k = 10, and n = 100, then the proportion is 10/100 = 10% and

BETA.INV (1 – (1 – 0.95) / 2, 10 + 1, 100 – 10) = 0.176 = 17.6%

is the upper 2-sided 95% confidence limit.
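Where Excel is not available, the same Exact (Clopper-Pearson) limit can be computed from the binomial distribution itself; the Python sketch below (illustrative; the function names are this document's, not a library's) finds, by bisection, the proportion at which the binomial tail probability equals the allowed risk:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def exact_upper_limit(k, n, confidence, two_sided=True):
    """Upper Exact (Clopper-Pearson) confidence limit for the proportion k/n:
    the p at which P(X <= k) falls to the allowed tail risk. Equivalent to
    Excel's BETA.INV(1 - risk, k + 1, n - k)."""
    risk = (1 - confidence) / 2 if two_sided else (1 - confidence)
    if k == n:
        return 1.0
    lo, hi = k / n, 1.0  # P(X <= k) decreases as p grows, so bisect on p
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > risk:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# k = 10 defectives in n = 100 at 95% confidence, as in the example above
print(round(exact_upper_limit(10, 100, 0.95), 3))  # → 0.176
```

For k = 10, n = 100, and C = 0.95, this reproduces the 17.6% upper limit obtained from BETA.INV above.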

b.  CONFIDENCE LIMITS FOR MEASUREMENT-DATA AVERAGES

There are only two generally-accepted methods for calculating confidence intervals for sample means that are based upon measurement data; one method uses Z-tables and the other uses t-tables. Those 2 methods yield the same result (to many digits) when sample sizes are huge, but can differ greatly when sample sizes are not huge (when sample sizes are small, the Z-table method gives an incorrectly narrow interval with incorrectly placed upper and lower confidence limits, whereas the t-table method yields the correct interval and correct limits). Therefore, only the t-table method should be used.

To calculate confidence limits for averages using Excel, use the following formulas:

Upper 2-sided limit =

SampleAverage + CONFIDENCE.T (1 − C, SampleStdev, n )

Lower 2-sided limit =

SampleAverage − CONFIDENCE.T (1 − C, SampleStdev, n )

Upper 1-sided limit =

SampleAverage + CONFIDENCE.T (2*(1 − C), SampleStdev , n )

Lower 1-sided limit =

SampleAverage − CONFIDENCE.T (2*(1 − C), SampleStdev , n )

where

C = confidence desired, as a decimal fraction (e.g. use 0.95 for 95% confidence)

n = sample size

SampleStdev = standard deviation calculated using the "STDEV.S" Excel function

For example, if SampleAverage = 50, n =100, C = 0.95, and SampleStdev = 0.7, then
50 + CONFIDENCE.T (1 − 0.95, 0.7, 100) = 50.139

is the upper 2-sided 95% confidence limit.
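Where CONFIDENCE.T is not available, its result can be reproduced from first principles; the Python sketch below (illustrative, standard library only, with hypothetical function names) obtains the Student-t quantile by numerically integrating the t density and bisecting, and then rebuilds the same upper limit as the Excel example above:

```python
from math import gamma, sqrt, pi

def t_pdf(x, df):
    """Student's t probability density with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_quantile(p, df):
    """Quantile for p >= 0.5, found by Simpson-rule integration + bisection."""
    def cdf(t, steps=2000):
        h = t / steps
        s = t_pdf(0, df) + t_pdf(t, df)
        for i in range(1, steps):
            s += (4 if i % 2 else 2) * t_pdf(i * h, df)
        return 0.5 + s * h / 3  # 0.5 covers the lower half of the density
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_ci_halfwidth(confidence, s, n, two_sided=True):
    """Counterpart of Excel's CONFIDENCE.T: t * s / sqrt(n)."""
    p = 1 - (1 - confidence) / 2 if two_sided else confidence
    return t_quantile(p, n - 1) * s / sqrt(n)

# The document's example: mean 50, s = 0.7, n = 100, 95% two-sided
print(round(50 + mean_ci_halfwidth(0.95, 0.7, 100), 3))  # → 50.139
```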

c.  CONFIDENCE LIMITS FOR MEASUREMENT-DATA STANDARD DEVIATIONS

There are only two generally-accepted methods for calculating confidence intervals for sample standard deviations that are based upon measurement data; one method uses Z-tables and the other uses Chi-squared-tables. Those 2 methods yield the same result (to many digits) when sample sizes are huge, but can differ greatly when sample sizes are not huge (when sample sizes are small, the Z-table method gives an incorrectly narrow interval with incorrectly placed upper and lower confidence limits, whereas the Chi-squared method yields the correct interval and correct limits). Therefore, only the Chi-squared-table method should be used.