Design of Engineering Experiments
Chapter 2 – Some Basic Statistical Concepts
• Describing sample data
– Random samples
– Sample mean, variance, standard deviation
– Populations versus samples
– Population mean, variance, standard deviation
– Estimating parameters
• Simple comparative experiments
– The hypothesis testing framework
– The two-sample t-test
– Checking assumptions, validity
Portland Cement Formulation (page 24)
Graphical View of the Data
Dot Diagram, Fig. 2.1, p. 24
If you have a large sample, a histogram may be useful
Box Plots, Fig. 2.3, p. 26
The Hypothesis Testing Framework
• Statistical hypothesis testing is a useful framework for many experimental situations
• Origins of the methodology date from the early 1900s
• We will use a procedure known as the two-sample t-test
The Hypothesis Testing Framework
• Sampling from a normal distribution
• Statistical hypotheses:
$$H_0: \mu_1 = \mu_2$$
$$H_1: \mu_1 \neq \mu_2$$
Estimation of Parameters
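$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \quad \text{estimates the population mean } \mu$$
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \quad \text{estimates the variance } \sigma^2$$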
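As a minimal sketch of the two estimators above in R (the language used later in these slides): the vector `y` holds made-up illustrative values, not the cement data, and the hand-written formulas are compared with the built-in `mean()`, `var()` and `sd()`.

```r
# Hypothetical observations, for illustration only (not the cement data)
y <- c(16.9, 16.4, 17.2, 16.5, 17.0)
n <- length(y)

# Sample mean: (1/n) * sum of the y_i
ybar <- sum(y) / n

# Sample variance: sum of squared deviations divided by n - 1
s2 <- sum((y - ybar)^2) / (n - 1)

# Sample standard deviation
s <- sqrt(s2)

# The built-in functions use the same definitions
print(c(ybar = ybar, mean = mean(y)))
print(c(s2 = s2, var = var(y), s = s, sd = sd(y)))
```

Note the divisor n - 1 rather than n in the variance; `var()` in R uses the same divisor.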
Summary Statistics (pg. 36)
Formulation 1 (“New recipe”): $\bar{y}_1 = 16.76$, $S_1^2 = 0.100$, $S_1 = 0.316$, $n_1 = 10$
Formulation 2 (“Original recipe”): $\bar{y}_2 = 17.04$, $S_2^2 = 0.061$, $S_2 = 0.248$, $n_2 = 10$
How the Two-Sample t-Test Works:
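Use the sample means to draw inferences about the population means
Difference in sample means:
$$\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$$
Standard deviation of the difference in sample means, using
$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$$
This suggests a statistic:
$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$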
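How the Two-Sample t-Test Works:
Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$
The previous ratio becomes
$$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Pool the individual sample variances:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$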
How the Two-Sample t-Test Works:
• Values of t0 that are near zero are consistent with the null hypothesis
• Values of t0 that are very different from zero are consistent with the alternative hypothesis
• t0 is a “distance” measure: how far apart the averages are, expressed in standard deviation units
• Notice the interpretation of t0 as a signal-to-noise ratio
The test statistic is
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
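The Two-Sample (Pooled) t-Test
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$$
$$S_p = 0.284$$
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -2.20$$
The two sample means are a little over two standard deviations apart
Is this a "large" difference?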
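The arithmetic on this slide can be sketched in R from the summary statistics alone. The variable names are illustrative, and the inputs are the rounded values from the Summary Statistics slide, so the results agree with the slide only up to rounding (t0 comes out at roughly -2.2).

```r
# Summary statistics as reported on the slides (rounded values)
ybar1 <- 16.76; s2_1 <- 0.100; n1 <- 10   # Formulation 1
ybar2 <- 17.04; s2_2 <- 0.061; n2 <- 10   # Formulation 2

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)
sp  <- sqrt(sp2)

# Test statistic: difference in means over its estimated standard error
t0 <- (ybar1 - ybar2) / (sp * sqrt(1 / n1 + 1 / n2))

print(round(c(sp2 = sp2, sp = sp, t0 = t0), 3))
```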
William Sealy Gosset (1876–1937)
Gosset's interest in barley cultivation led him to speculate that design of experiments should aim, not only at improving the average yield, but also at breeding varieties whose yield was insensitive (robust) to variation in soil and climate.
Developed the t-test (1908)
Gosset was a friend of both Karl Pearson and R.A. Fisher, an achievement, for each had a monumental ego and a loathing for the other.
Gosset was a modest man who cut short an admirer with the comment that “Fisher would have discovered it all anyway.”
The Two-Sample (Pooled) t-Test
• So far, we haven’t really done any “statistics”
• We need an objective basis for deciding how large the test statistic t0 really is
• In 1908, W. S. Gosset derived the reference distribution for t0 … called the t distribution
• Tables of the t distribution – see textbook appendix
t0 = -2.20
The Two-Sample (Pooled) t-Test
• A value of t0 between –2.101 and 2.101 is consistent with equality of means
• It is possible for the means to be equal and t0 to fall outside the interval from –2.101 to 2.101, but it would be a “rare event” … which leads to the conclusion that the means are different
• Could also use the P-value approach
t0 = -2.20
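The cutoff 2.101 quoted above can be looked up in R instead of a printed table. This is a small sketch that assumes a two-sided test at significance level 0.05 with n1 + n2 - 2 = 18 degrees of freedom, which is the setting consistent with the value 2.101; the variable names are illustrative.

```r
alpha <- 0.05                 # assumed significance level (two-sided)
df    <- 10 + 10 - 2          # n1 + n2 - 2 degrees of freedom

# Upper alpha/2 percentage point of the t distribution
t_crit <- qt(1 - alpha / 2, df)

# Observed statistic from the slides
t0 <- -2.20

# Reject equality of means when |t0| exceeds the critical value
reject <- abs(t0) > t_crit

print(round(t_crit, 3))
print(reject)
```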
The Two-Sample (Pooled) t-Test
• The P-value is the area (probability) in the tails of the t-distribution beyond -2.20 plus the probability beyond +2.20 (it’s a two-sided test)
• The P-value is a measure of how unusual the value of the test statistic is given that the null hypothesis is true
• The P-value is the risk of wrongly rejecting the null hypothesis of equal means if we reject at the observed value of t0 (it measures the rareness of the event)
• The P-value in our problem is P = 0.042
t0 = -2.20
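The two-tail area described above can be sketched in R with `pt()`, the cumulative distribution function of the t distribution. The sketch uses the rounded statistic t0 = -2.20 with 18 degrees of freedom, so the result is close to, but not exactly, the P = 0.042 reported on the slide.

```r
t0 <- -2.20   # rounded test statistic from the slides
df <- 18      # n1 + n2 - 2

# Area below -|t0| plus area above +|t0|; the t distribution is symmetric,
# so the two tails are equal
p_value <- 2 * pt(-abs(t0), df)

print(round(p_value, 3))
```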
Computer Two-Sample t-Test Results
Checking Assumptions – The Normal Probability Plot
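One way such a plot can be drawn in R is with `qqnorm()` and `qqline()`. The sketch below uses simulated values as a stand-in, since the raw observations are not listed in these slides; the mean and standard deviation passed to `rnorm()` are simply borrowed from the Formulation 1 summary statistics.

```r
set.seed(1)

# Simulated stand-in for one sample (NOT the actual cement data)
y <- rnorm(10, mean = 16.76, sd = 0.316)

# Theoretical normal quantiles (x) against ordered observations (y)
pts <- qqnorm(y, plot.it = FALSE)

plot(pts$x, pts$y,
     xlab = "Theoretical normal quantiles",
     ylab = "Observed values",
     main = "Normal probability plot")

# Reference line through the first and third quartiles
qqline(y)
```

Points falling roughly along the line are consistent with the normality assumption; with only 10 observations, moderate scatter is to be expected.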
Importance of the t-Test
• Provides an objective framework for simple comparative experiments
• Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one “side” of the cube versus the mean response at the opposite “side” of the cube
Confidence Intervals (See pg. 44)
• Hypothesis testing gives an objective statement concerning the difference in means, but it doesn’t specify “how different” they are
• General form of a confidence interval
$$L \le \theta \le U \quad \text{where} \quad P(L \le \theta \le U) = 1 - \alpha$$
• The 100(1- α)% confidence interval on the difference in two means:
$$\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\,n_1+n_2-2}\, S_p \sqrt{(1/n_1) + (1/n_2)} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\,n_1+n_2-2}\, S_p \sqrt{(1/n_1) + (1/n_2)}$$
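The interval formula above can be sketched in R from the summary statistics. The inputs are the rounded values from the slides and the variable names are illustrative, so the endpoints differ slightly from those of software run on the raw data.

```r
# Rounded summary statistics from the slides
ybar1 <- 16.76; s2_1 <- 0.100; n1 <- 10
ybar2 <- 17.04; s2_2 <- 0.061; n2 <- 10

conf_level <- 0.95
alpha <- 1 - conf_level
df <- n1 + n2 - 2

# Pooled standard deviation
sp <- sqrt(((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / df)

# Half-width: t percentage point times the standard error of the difference
half_width <- qt(1 - alpha / 2, df) * sp * sqrt(1 / n1 + 1 / n2)

d  <- ybar1 - ybar2
ci <- c(lower = d - half_width, upper = d + half_width)

print(round(ci, 3))
```

Both endpoints are negative, so the interval excludes zero, which agrees with rejecting equality of means at the 5% level.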
The t.test function in R
t.test (stats)
Student's t-Test
Description
Performs one and two sample t-tests on vectors of data.
Usage
t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)
Arguments of the t.test function
x - a (non-empty) numeric vector of data values.
y - an optional (non-empty) numeric vector of data values.
alternative - a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
mu - a number indicating the true value of the mean (or difference in means if you are performing a two sample test).
paired - a logical indicating whether you want a paired t-test.
var.equal - a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance; otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
Arguments of the t.test function
conf.level - confidence level of the interval.
formula - a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding groups.
data - an optional matrix or data frame containing the variables in the formula.
subset - an optional vector specifying a subset of observations to be used.
na.action - a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
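To illustrate the effect of the `var.equal` argument described above, here is a small sketch on simulated data (the values are generated with `rnorm()` and are not the cement data). It compares the pooled test with the default Welch test.

```r
set.seed(42)

# Simulated stand-ins for two samples of size 10 (NOT the cement data)
x <- rnorm(10, mean = 16.76, sd = 0.316)
y <- rnorm(10, mean = 17.04, sd = 0.248)

# Pooled test: assumes equal variances, df = n1 + n2 - 2
pooled <- t.test(x, y, var.equal = TRUE)

# Welch test (the default): approximate, usually fractional, df
welch <- t.test(x, y, var.equal = FALSE)

print(c(pooled_df = unname(pooled$parameter),
        welch_df  = unname(welch$parameter)))
print(c(pooled_t = unname(pooled$statistic),
        welch_t  = unname(welch$statistic)))
```

With equal sample sizes the two t statistics coincide and only the degrees of freedom (and hence the P-value and interval) differ; with unequal sample sizes the statistics differ as well.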
Example: the cement data
• The data are in the file cimento.txt, which includes the variable names.
• Read the file and run the t-test in R.
Using R
• dados <- read.table("m://aulas//flavia//cimento.txt", header = TRUE)
• stripchart(dados, at = c(1, 1.1))
• boxplot(dados)
t.test(dados$m, dados$u, alternative = "two.sided", var.equal = TRUE, paired = FALSE, conf.level = 0.95)
Two Sample t-test
data: dados$m and dados$u
t = -2.1869, df = 18, p-value = 0.0422
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.54507339 -0.01092661
sample estimates:
mean of x mean of y
16.764 17.042
Comparing the variances
• Given two independent samples from two normal distributions, before running the t-test to compare the means we need to check whether it is reasonable to assume equal variances. This decides whether we use the pooled t-test or instead an approximation to the number of degrees of freedom of the sampling distribution of the test statistic, in which case we work with an approximate rather than the exact distribution.
• If the samples do come from normal populations, then the sample variance, up to a constant, has a chi-square distribution with n-1 degrees of freedom, where n is the sample size.
• Since the samples are independent, it follows that, up to the constant, the two sample variances are independently distributed according to chi-square distributions.
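In summary...
$$\frac{(n_i - 1)S_i^2}{\sigma_i^2} \overset{ind}{\sim} \chi^2_{n_i - 1}, \quad i = 1, 2,$$
so that
$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1 - 1,\, n_2 - 1}$$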
Test of equality of variances
• Under the hypothesis that the variances are equal, the test statistic is the ratio of the sample variances and, in a two-sided test at significance level α, we reject the null hypothesis if:
$$\frac{S_1^2}{S_2^2} > F_{\alpha/2,\, n_1 - 1,\, n_2 - 1} \quad \text{or} \quad \frac{S_1^2}{S_2^2} < F_{1 - \alpha/2,\, n_1 - 1,\, n_2 - 1},$$
where
$$P(X > F_{\alpha,\, n,\, m}) = \alpha, \quad \text{with } X \sim F_{n,\, m}.$$
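The rejection rule for the variance-ratio test can be sketched in R with `qf()`. The sketch assumes the upper-tail convention, in which F with subscript α is the point with probability α to its right, takes α = 0.05 as an assumed level, and uses the rounded sample variances from the Summary Statistics slide; the variable names are illustrative.

```r
# Rounded sample variances and sample sizes from the slides
s2_1 <- 0.100; n1 <- 10
s2_2 <- 0.061; n2 <- 10
alpha <- 0.05   # assumed significance level

# Test statistic: ratio of the sample variances
f0 <- s2_1 / s2_2

# Upper-tail percentage points F_{alpha/2} and F_{1 - alpha/2}
f_upper <- qf(alpha / 2,     n1 - 1, n2 - 1, lower.tail = FALSE)
f_lower <- qf(1 - alpha / 2, n1 - 1, n2 - 1, lower.tail = FALSE)

# Two-sided rule: reject if the ratio is too large or too small
reject <- (f0 > f_upper) || (f0 < f_lower)

print(round(c(f0 = f0, f_lower = f_lower, f_upper = f_upper), 3))
print(reject)
```

Here the ratio falls between the two cutoffs, so equality of variances is not rejected, which supports using the pooled t-test for these data.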
• The var.test function is available in R
var.test(dados$m, dados$u, ratio = 1, alternative = "two.sided", conf.level = 0.95)
F test to compare two variances
data: dados$m and dados$u
F = 1.6293, num df = 9, denom df = 9, p-value = 0.4785
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.4046845 6.5593806
sample estimates:
ratio of variances
1.629257
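As a rough cross-check of the var.test output, the P-value and confidence interval can be recomputed from the reported variance ratio and degrees of freedom alone. This is a sketch of the usual two-sided F calculations, not a copy of the function's source, and the variable names are illustrative.

```r
# Values taken from the var.test output above
f_ratio <- 1.629257
df1 <- 9
df2 <- 9
conf_level <- 0.95
alpha <- 1 - conf_level

# Two-sided P-value: twice the smaller tail area
p_lower <- pf(f_ratio, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# Confidence interval for the ratio of the true variances
ci <- c(lower = f_ratio / qf(1 - alpha / 2, df1, df2),
        upper = f_ratio / qf(alpha / 2, df1, df2))

print(round(p_value, 4))
print(round(ci, 4))
```

The interval contains 1, in line with the large P-value: there is no evidence against equal variances.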