Statistical Methods for Process Improvements
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
Week 1
Knorr-Bremse Group
Why do We Need Statistics?
• Variability
– Does a process hit the target with a minimum of variability?
– The mean value determines if a process is on target. The standard deviation describes the variability of the process.
• Stability
– How does the process behave over time?
– A stable process has a consistent mean and a predictable variability over time.
[Figure: Xbar Chart of Process A (UCL = 13.116, X-double-bar = 9.959, LCL = 6.803) and Xbar Chart of Process B (UCL = 18.19, X-double-bar = 12.09, LCL = 5.98); x-axis: Sample (2 to 24), y-axis: Sample Mean.]
Knorr-Bremse Group 08 BB W1 Statistical Methods 07, D. Szemkus/H. Winkler Page 2/37
Which of these two processes would you prefer?
Interpretation of Variation
• Every process varies with time. Some processes show controlled variation, while other processes have uncontrolled variation (Walter Shewhart).
• Controlled variation is characterized by a stable and consistent pattern of variation over time. This type of variation is due to common causes.
• Uncontrolled variation is characterized by unpredictable variation. It is due to special or assignable causes.
• Process A runs with controlled variation.
• Process B shows uncontrolled variation.
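The distinction between controlled and uncontrolled variation can be sketched in a few lines of Python. This is a simplified illustration, not the calculation Minitab performs: it assumes equally sized subgroups, estimates the within-subgroup standard deviation by pooling the subgroup variances, and flags subgroup means outside the mean ± 3 sigma limits. The function names and the data are hypothetical.

```python
import math
import statistics

def xbar_limits(subgroups):
    """Return (center, lcl, ucl) for the subgroup means (simplified estimate)."""
    n = len(subgroups[0])                        # subgroup size, assumed constant
    means = [statistics.mean(g) for g in subgroups]
    center = statistics.mean(means)              # grand mean of the subgroup means
    # Pooled within-subgroup standard deviation (assumption: no bias correction)
    sigma_within = math.sqrt(statistics.mean(statistics.variance(g) for g in subgroups))
    half_width = 3 * sigma_within / math.sqrt(n)
    return center, center - half_width, center + half_width

def out_of_control(subgroups):
    """Indices of subgroup means that fall outside the control limits."""
    _, lcl, ucl = xbar_limits(subgroups)
    return [i for i, g in enumerate(subgroups)
            if not lcl <= statistics.mean(g) <= ucl]

# Hypothetical data: nine similar subgroups, then one shifted subgroup
stable = [[10, 11, 9, 10, 10]] * 9
shifted = stable + [[15, 16, 14, 15, 15]]

print(out_of_control(stable))    # no subgroup flagged
print(out_of_control(shifted))   # the shifted subgroup is flagged
```

A flagged subgroup is only a signal to look for a special cause; the sketch does not implement the additional run tests that control chart software usually offers.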
[Figure: Xbar Chart of Process A (UCL = 13.116, X-double-bar = 9.959, LCL = 6.803) and Xbar Chart of Process B (UCL = 18.19, X-double-bar = 12.09, LCL = 5.98); x-axis: Sample, y-axis: Sample Mean. The flagged points of Process B are annotated "Special Causes!".]
Can We Accept Variation?
• Every process shows variation
• We accept variation if:
– The total variation of the output is relatively small compared to the process specification and the process is on target.
– The process is stable over time.
[Figure: two panels showing cost over the output range with LSL, Nom and USL marked. Panel 1: "The traditional view of variation" (accepted variation between LSL and USL). Panel 2: "The new view of variation".]
Probabilities
Value   Comb.   Probability
2       1       0.0278
3       2       0.0556
4       3       0.0833
5       4       0.1111
6       5       0.1389
7       6       0.1667
8       5       0.1389
9       4       0.1111
10      3       0.0833
11      2       0.0556
12      1       0.0278
Total   36      1.0000
Probability of a value for die 1 = 1/6 = 0.1667
Probability of a value for die 2 = 1/6 = 0.1667
Probability for any combination = 1/6 x 1/6 = 1/36 = 0.0278
Graphic of a Probability Function
Customer requirement: values between 3 and 11
[Figure: bar chart of the probability function of the sum of two dice; x-axis: sum of the die results (2 to 12), y-axis: probability (0.02 to 0.18). LSL and USL mark the customer requirement; 2.8% of the outcomes lie outside each limit.]
Performance: 94.4%
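The probability table and the performance figure can be reproduced by enumerating all 36 combinations of two dice. The following is a small sketch; the variable names are illustrative, and LSL and USL are set to the customer requirement of 3 and 11.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely combinations give each sum
combos = Counter(a + b for a, b in product(range(1, 7), repeat=2))
prob = {total: Fraction(count, 36) for total, count in combos.items()}

for total in sorted(prob):
    print(total, combos[total], f"{float(prob[total]):.4f}")

# Share of outcomes that meet the customer requirement (3 to 11)
LSL, USL = 3, 11
performance = sum(p for total, p in prob.items() if LSL <= total <= USL)
print(f"Performance: {float(performance):.1%}")
```

Only the sums 2 and 12 fall outside the requirement, each with probability 1/36, which leaves 34/36 inside.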
The Normal Distribution Curve
[Figure: normal distribution curve over the units of measurement, centered at µ.]
The probability of exceeding a specification limit a is the area under the curve to the right of a:
$$p(x > a) = \int_{a}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left[(x-\mu)/\sigma\right]^2} \, dx$$
[Figure: normal curve from minus infinity to plus infinity with the specification limit a; the area up to a is the yield, the area beyond a is the probability of defects.]
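As an illustration of the area beyond a specification limit, the tail probability of a normal distribution can be evaluated with Python's standard library instead of integrating by hand. The process values below (mean 70, standard deviation 10, limit 90) are hypothetical and chosen only for the example.

```python
from statistics import NormalDist

def defect_probability(a, mu, sigma):
    """p(x > a) for a normal distribution with mean mu and standard deviation sigma."""
    return 1.0 - NormalDist(mu, sigma).cdf(a)

# Hypothetical process: mean 70, standard deviation 10, upper limit a = 90
p_defect = defect_probability(90, 70, 10)
print(f"p(x > a) = {p_defect:.4f}, yield = {1 - p_defect:.4f}")
```

The result is only meaningful if the process output really follows a normal distribution.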
Analytical Approach
• Determine if the process is stable.
• If the process is not stable:
– Identify and eliminate the causes of instability.
• If the process is stable:
– Determine/estimate the total amount of variation.
– Identify the sources of variation.
– Reduce the variation.
• We will now discuss statistical tools which will help us to do so.
Overview Data Types
• Attribute Data (Qualitative)
– Categories
– Yes, No
– Go, No go
– Machine 1, Machine 2, Machine 3
– Pass/Fail
• Variable Data (Quantitative)
– Discrete (Count) Data
  • Maintenance Equipment Failures, Number of HV-Arcs
  • Number of Customer Returns
  • Defects per Unit
– Continuous Data
  • Decimal Subdivisions are meaningful
  • Time, Pressure, Conveyor Speed
A Selection of Statistical Techniques
                              Factor X = Input
                              Discrete / Attributive       Continuous / Variable
Response Y = Output
Discrete / Attributive        Chi-Square                   Logistic Regression
Continuous / Variable         T-Test, ANOVA (F-Test),      Regression
                              Median Tests
Statistical techniques are available for all combinations of data types.
Basic Definitions in Statistics
• Types of data
• Measuring scale
• Measures for the center of the data
– Mean
– Median
• Measures for the variation of data
– Range
– Variance
– Standard deviation
• Normal distribution and normal probabilities
• Standard (Z) transformation
• Process capability metrics
Sample vs. Population
$\bar{x}$ = mean value of a sample; µ = mean value of the population
s = standard deviation of a sample; σ = standard deviation of the population
Sample statistics ($\bar{x}$, s) are used to estimate the population parameters (µ, σ).
Two Important Statistical Equations
Mean of the population:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Standard deviation of the population:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Mean of the sample:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Standard deviation of the sample:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
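The practical difference between the population and the sample formula is the divisor (N versus n - 1). A short sketch with Python's `statistics` module shows both on the same numbers; the data values are hypothetical.

```python
import statistics

data = [4, 8, 6, 5, 7]   # hypothetical measurements

mean = statistics.mean(data)
sigma = statistics.pstdev(data)   # population formula: divides by N
s = statistics.stdev(data)        # sample formula: divides by n - 1

print(f"mean = {mean}, sigma = {sigma:.4f}, s = {s:.4f}")
```

Because of the smaller divisor, the sample standard deviation is always a little larger than the population value computed from the same data.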
Description of the Center
• Mean: Arithmetic average of the data
– Reflects the influence of all data
– Strongly influenced by extreme values
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
• Median: Reflects the 50% rank, that is, the center of the data after sorting from low to high
– Does not include all values in the calculation
– Is “robust” to extreme values
Two successive steps influence the mean additively
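The different sensitivity of mean and median to extreme values can be illustrated with a tiny example. The numbers are hypothetical; the single value 40 plays the role of an extreme value.

```python
import statistics

values = [9, 10, 10, 11, 10]     # hypothetical measurements
with_outlier = values + [40]     # the same data plus one extreme value

print(statistics.mean(values), statistics.median(values))
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

The mean moves from 10 to 15 when the extreme value is added, while the median stays at 10.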
Description of the Variation
• The Range is the distance between the extreme values of a data set.
• The Variance is the sum of the squared deviations of each data point from the mean, divided by the degrees of freedom.
• The Standard Deviation is the square root of the variance.
The most common and useful measure of variation is the standard deviation.
Important! You can calculate the variance of two steps by adding the variances.
If $\sigma_A^2$ is the variance of step A and $\sigma_B^2$ is the variance of step B, then
$$\sigma_{Total}^2 = \sigma_A^2 + \sigma_B^2 \qquad \text{and} \qquad \sigma_{Difference}^2 = \sigma_A^2 - \sigma_B^2$$
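A numeric sketch of the addition rule for the total variance: variances add, standard deviations do not. The step values are hypothetical, and the rule as used here assumes that the two steps vary independently of each other.

```python
import math

sigma_a = 3.0   # hypothetical standard deviation of step A
sigma_b = 4.0   # hypothetical standard deviation of step B

# Variances add (assumption: the two steps are independent)
var_total = sigma_a ** 2 + sigma_b ** 2
sigma_total = math.sqrt(var_total)

print(f"total variance = {var_total}, total standard deviation = {sigma_total}")
```

The total standard deviation is 5, not 3 + 4 = 7, which is why the calculation has to be done on the variances.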
The Calculation of the Standard Deviation
Form sheet (rows 1 to 10, then the sum Σ):
No.   X   X - X̄   (X - X̄)²
Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
Standard deviation: $s = \sqrt{s^2}$
Assignment: Calculate the standard deviation of the following numbers: 2, 1, 3, 5, 4, 3. Use the form sheet above.
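The assignment can be checked with a few lines of Python that follow the columns of the form sheet (value, deviation from the mean, squared deviation). This is one possible way to lay out the calculation; the variable names are illustrative.

```python
import math

x = [2, 1, 3, 5, 4, 3]                        # numbers from the assignment
n = len(x)

mean = sum(x) / n
deviations = [xi - mean for xi in x]          # column X - mean
squared = [d ** 2 for d in deviations]        # column (X - mean)^2

variance = sum(squared) / (n - 1)             # divide by the degrees of freedom
s = math.sqrt(variance)

for xi, d, sq in zip(x, deviations, squared):
    print(f"{xi:>3} {d:>6.1f} {sq:>6.1f}")
print(f"sum of squares = {sum(squared):.1f}, mean = {mean:.1f}, "
      f"variance = {variance:.1f}, s = {s:.3f}")
```

The mean is 3, the squared deviations sum to 10, the variance is 10 / 5 = 2, and the standard deviation is about 1.414.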
The Effect of the Quadratic Deviation
By squaring the deviations, extreme values heavily affect the mean of the squared deviations, that is, the variance.
[Figure: squared deviation $(x - \bar{x})^2$ plotted against the deviation; x-axis: Deviates (0 to 10), y-axis: Sq-Dev (0 to 100).]
Descriptive Statistics for 3 Distributions
Stat > Basic Statistics > Display Des...
File: Distribution.mtw
Variable    N    N*   Mean     SE Mean   StDev    Minimum   Q1       Median   Q3
Symmetric   500  0    70.000   0.447     10.000   29.824    63.412   69.977   76.653
Pos Asym    500  0    70.000   0.447     10.000   62.921    63.647   65.695   72.821
Neg Asym    500  0    70.000   0.447     10.000   1.866     67.891   73.783   76.290
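The SE Mean column is the standard error of the mean, that is, the standard deviation divided by the square root of the sample size. A minimal check with the values from the table (StDev 10, N 500); the helper name `se_mean` is only illustrative.

```python
import math

def se_mean(stdev, n):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return stdev / math.sqrt(n)

print(round(se_mean(10.0, 500), 3))
```

All three distributions share the same mean, standard deviation and sample size, so they also share the same standard error, although their shapes differ strongly.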
The Histogram of these Distributions
Stat > Basic Statistics > Display Des... > Graphs
[Figure: three histograms (y-axis: Frequency): Histogram of Pos Asym (positive asymmetry), Histogram of Symmetric (symmetric distribution), Histogram of Neg Asym (negative asymmetry).]
Mean & Median
[Figure: Histogram (with Normal Curve) of Symmetric, of Pos Asym and of Neg Asym; each with Mean 70.00, StDev 10.00, N 500; y-axis: Frequency. The positions of the mean and the median are marked in the histograms.]
Different Forms of Distributions
[Figure: three curves labeled Distribution 1, Distribution 2 and Distribution 3.]
How do you interpret the differences?
Areas under the Normal Curve
[Figure: normal curve; x-axis: Output (-4 to 4), y-axis: Frequency (0.0 to 0.4). The marked areas cover 68 % (within ±1), 95 % (within ±2) and 99.73 % (within ±3 standard deviations of the mean).]
The shape of the curve is determined by the standard deviation and the mean.
Areas under the Normal Curve
Rules of thumb for the normal distribution
Rule 1
• Roughly 60-75% of all data are within +/- 1 standard deviation of the mean.
Rule 2
• Usually 90-98% of the data are within +/- 2 standard deviations of the mean.
Rule 3
• About 99-100% of the data are within +/- 3 standard deviations of the mean.
File: Distribution.mtw
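For an exactly normal distribution, the areas behind these rules of thumb can be computed directly. The sketch below uses the standard normal distribution from Python's standard library; `area_within` is an illustrative helper name.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal distribution: mean 0, standard deviation 1

def area_within(k):
    """Share of a normal distribution within +/- k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"+/- {k} standard deviations: {area_within(k):.2%}")
```

Real data sets deviate from these exact values, which is why the rules of thumb are stated as ranges.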
Testing for Normal Distributions
• Diagrams that describe the test results for a normal distribution are useful. They give us information about the behavior of the distribution. If the data follow a normal distribution, the normal probability plot displays a straight line.
• Minitab performs this test and generates this diagram. In addition to the graph, Minitab displays "A squared" and a "p value".
• A squared is a test statistic calculated according to Anderson and Darling. Its value reflects the summed squared distances of the individual data points from the straight line. Large A squared values indicate that the data do not follow a normal distribution.
• The p value helps to decide whether the data are normally distributed or not.
p values < 0.05: the data are not normally distributed
p values > 0.05: the data can be treated as normally distributed
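To make the A squared statistic less abstract, here is a sketch of the textbook Anderson-Darling formula for a normal distribution whose mean and standard deviation are estimated from the data. It is an illustration only: it does not compute the p value, it is not taken from Minitab, and the two test data sets are artificial (evenly spaced quantiles of a normal and of an exponential distribution).

```python
import math
import statistics
from statistics import NormalDist

def anderson_darling(data):
    """A-squared statistic against a normal distribution fitted to the data."""
    y = sorted(data)
    n = len(y)
    fitted = NormalDist(statistics.mean(y), statistics.stdev(y))
    f = [fitted.cdf(v) for v in y]            # fitted cumulative probabilities
    total = sum((2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
                for i in range(n))
    return -n - total / n

# Artificial data: ideal normal quantiles and strongly right-skewed quantiles
normal_like = [NormalDist(70, 10).inv_cdf((i + 0.5) / 100) for i in range(100)]
skewed = [-math.log(1 - (i + 0.5) / 100) for i in range(100)]

print(f"A squared, normal-like data: {anderson_darling(normal_like):.3f}")
print(f"A squared, skewed data:      {anderson_darling(skewed):.3f}")
```

The normal-like data give a small A squared value and the skewed data a much larger one, in line with the statement that large values indicate non-normal data.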
Testing for Normal Distributions
Stat > Basic Statistics > Normality Test
[Figure: Histogram (with Normal Curve) of Symmetric and Probability Plot of Symmetric (Normal); Mean 70.00, StDev 10.00, N 500, AD 0.418, P-Value 0.328; y-axes: Frequency and Percent.]
Testing for Normal Distributions
[Figure: Histogram (with Normal Curve) of Pos Asym and Probability Plot of Pos Asym (Normal); Mean 70.00, StDev 10.00, N 500, AD 46.489, P-Value < 0.005; y-axes: Frequency and Percent.]
Testing for Normal Distributions
[Figure: Histogram (with Normal Curve) of Neg Asym and Probability Plot of Neg Asym (Normal); Mean 70.00, StDev 10.00, N 500, AD 44.491, P-Value < 0.005; y-axes: Frequency and Percent.]
Analyze a Mystery Distribution
Generate a normal probability plot for the mystery data set C4. What is your conclusion?
[Figure: Probability Plot of Mystery (Normal); Mean 100.0, StDev 32.38, N 500, AD 27.108, P-Value < 0.005; y-axis: Percent.]
Descriptive Statistics with Minitab
Stat > Basic Statistics > Graphical Summary…
Summary for Symmetric
Anderson-Darling Normality Test: A-Squared 0.42, P-Value 0.328
Mean 70.000
StDev 10.000
Variance 100.000
Skewness -0.050008
Kurtosis 0.423256
N 500
Minimum 29.824
1st Quartile 63.412
Median 69.977
3rd Quartile 76.653
Maximum 103.301
95% Confidence Interval for Mean: 69.121 to 70.879
95% Confidence Interval for Median: 69.021 to 70.737
95% Confidence Interval for StDev: 9.416 to 10.662
[Figure: histogram of Symmetric with the 95% confidence intervals for mean and median.]
Skewness and kurtosis describe the asymmetry and the flatness of the distribution. The closer both values are to 0, the closer the data are to a normal distribution.
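Skewness and kurtosis can be sketched as the average third and fourth power of the standardized deviations. The moment-based formulas below are a simplification and an assumption of this example; statistics packages often apply sample-size corrections, so their reported values can differ slightly. The data sets are hypothetical.

```python
import statistics

def skewness(data):
    """Moment-based skewness: mean of the cubed standardized deviations."""
    m = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - m) / sd) ** 3 for x in data) / len(data)

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so that a normal distribution gives about 0."""
    m = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - m) / sd) ** 4 for x in data) / len(data) - 3.0

symmetric = [1, 2, 3, 4, 5]            # hypothetical, evenly spread data
right_tail = [1, 1, 1, 2, 10]          # hypothetical, one large value
left_tail = [-x for x in right_tail]   # mirror image of right_tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tail), skewness(left_tail))
```

The evenly spread data have a skewness of about 0 and a negative kurtosis (flatter than a normal curve); the tail to the right gives a positive skewness and its mirror image a negative one.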
Descriptive Statistics with Minitab
Summary for Pos Asym
Anderson-Darling Normality Test: A-Squared 46.49, P-Value < 0.005
Mean 70.000
StDev 10.000
Variance 100.000
Skewness 2.41707
Kurtosis 6.93041
N 500
Minimum 62.921
1st Quartile 63.647
Median 65.695
3rd Quartile 72.821
Maximum 130.366
95% Confidence Interval for Mean: 69.121 to 70.879
95% Confidence Interval for Median: 65.260 to 66.501
95% Confidence Interval for StDev: 9.416 to 10.662
[Figure: histogram of Pos Asym with the 95% confidence intervals for mean and median.]
A positive skewness value indicates a positive distortion (a long tail to the right). A positive kurtosis value indicates a high peak of the distribution.
Descriptive Statistics with Minitab
Summary for Neg Asym
Anderson-Darling Normality Test: A-Squared 44.49, P-Value < 0.005
Mean 70.000
StDev 10.000
Variance 100.000
Skewness -2.8688
Kurtosis 11.5897
N 500
Minimum 1.866
1st Quartile 67.891
Median 73.783
3rd Quartile 76.290
Maximum 77.106
95% Confidence Interval for Mean: 69.121 to 70.879
95% Confidence Interval for Median: 73.162 to 74.326
95% Confidence Interval for StDev: 9.416 to 10.662
[Figure: histogram of Neg Asym with the 95% confidence intervals for mean and median.]
A negative skewness value indicates a negative distortion (a long tail to the left). The kurtosis value is positive again and indicates a high peak.
Descriptive Statistics with Minitab
Summary for Mystery
Anderson-Darling Normality Test: A-Squared 27.11, P-Value < 0.005
Mean 100.00
StDev 32.38
Variance 1048.78
Skewness 0.00716
Kurtosis -1.63184
N 500
Minimum 41.77
1st Quartile 68.69
Median 104.20
3rd Quartile 130.81
Maximum 162.82
95% Confidence Interval for Mean: 97.15 to 102.85
95% Confidence Interval for Median: 82.78 to 117.66
95% Confidence Interval for StDev: 30.49 to 34.53
[Figure: histogram of Mystery with the 95% confidence intervals for mean and median.]
Probability Plots
Graph > Probability Plot…
[Figure: Probability Plot of Symmetric (Normal, 95% CI); Mean 70.00, StDev 10.00, N 500, AD 0.418, P-Value 0.328; x-axis: Symmetric (30 to 110), y-axis: Percent (0.1 to 99.9).]
Some Exercises
• Analyze the variable Y in the file Delivery Time.mtw.
– Are the data normally distributed?
– In which range do you expect 95% of the values?
• Analyze the variable Y, days / receiving, in the file Late Payment.mtw.
– What kind of distribution is it?
– In which range do you expect 99% of the values?
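For normally distributed data, the range that contains a given share of the values follows from the mean and the standard deviation. The exercise files are not part of this text, so the sketch below uses the mean (70) and standard deviation (10) of the Symmetric example instead; `central_range` is an illustrative helper name, and the approach is not valid for clearly non-normal data.

```python
from statistics import NormalDist

def central_range(mean, stdev, coverage):
    """Symmetric range around the mean that contains `coverage` of a normal distribution."""
    dist = NormalDist(mean, stdev)
    tail = (1.0 - coverage) / 2.0           # probability left out on each side
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

low95, high95 = central_range(70.0, 10.0, 0.95)
low99, high99 = central_range(70.0, 10.0, 0.99)
print(f"95% of the values: {low95:.1f} to {high95:.1f}")
print(f"99% of the values: {low99:.1f} to {high99:.1f}")
```

For skewed data such as the asymmetric examples, percentiles taken directly from the data are a safer way to answer the same question.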
Summary
• Types of data
• Scale of measurements
• Measure of the center of the data
– Mean
– Median
• Measure of the spread of data
– Range
– Variance
– Standard deviation
• Normal distribution and normal probabilities