5/29/2013
Chapter 20
Comparison Tests:
Attribute (Pass/fail) Response
Introduction
• This chapter focuses on comparing attribute response
situations (e.g., do the failure frequencies of
completing a purchase order differ between two
departments?)
20.1 S4/IEE Application Examples:
Attribute Comparison Tests
• Manufacturing 30,000-foot-level quality metric: An S4/IEE
project is to reduce the number of defects in a printed circuit
board manufacturing process. A highly ranked input from the
cause-and-effect matrix was inspector; i.e., the team thought
that inspectors could be classifying failures differently. A null
hypothesis test of equality of defective rates reported by
inspector indicated that the difference was statistically
significant.
• Transactional 30,000-foot-level metric: DSO reduction was
chosen as an S4/IEE project. A cause-and-effect matrix
ranked as a possibly important input a difference between
companies in the number of defective invoices reported or
lost. The null hypothesis test of equality of defective
invoices by company indicated that there was a
statistically significant difference.
20.2 Comparing Attribute Data
• The methods presented can be used to compare
the frequency of failure of two production machines
or suppliers.
• The null hypothesis for the comparison tests is
that there is no difference, while the alternative
hypothesis is that there is a difference.
20.3 Sample Size:
Comparing Proportions
• From “How to Choose the Proper Sample Size (The ASQC
Basic References in Quality Control: Statistical Techniques,
Vol. 12)” by Gary G. Brush (ISBN 9780873890502)
• Multiply the appropriate single-sample population
equation by 2 to determine the sample size for each of the
two populations.
• Rule of thumb: there should be at least 5 failures for each
category.
Supplement: Inference on the
Difference Between 2 Proportions
• Set-Up:
• Let $X$ be the number of successes in $n_X$ independent
Bernoulli trials with success probability $p_X$, and let $Y$ be
the number of successes in $n_Y$ independent Bernoulli
trials with success probability $p_Y$, so that $X \sim \mathrm{Bin}(n_X, p_X)$
and $Y \sim \mathrm{Bin}(n_Y, p_Y)$.
• Define
$$\tilde{n}_X = n_X + 2, \qquad \tilde{n}_Y = n_Y + 2$$
$$\tilde{p}_X = \frac{X + 1}{\tilde{n}_X}, \qquad \tilde{p}_Y = \frac{Y + 1}{\tilde{n}_Y}$$
CI
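• Given the set-up just described, the $100(1 - \alpha)\%$ CI for
the difference $(p_X - p_Y)$ is
$$\tilde{p}_X - \tilde{p}_Y \pm z_{\alpha/2} \sqrt{\frac{\tilde{p}_X (1 - \tilde{p}_X)}{\tilde{n}_X} + \frac{\tilde{p}_Y (1 - \tilde{p}_Y)}{\tilde{n}_Y}}$$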
• If the lower limit of the confidence interval is less than
-1, replace it with -1.
• If the upper limit of the confidence interval is greater
than 1, replace it with 1.
• There is a traditional confidence interval as well. It is a
generalization of the one for a single proportion.
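For reference, the traditional interval mentioned in the last bullet is commonly written in the unadjusted form below. The slide does not print it, so treat this as the standard textbook (Wald) form rather than a formula taken from the source; here $\hat{p}_X = X / n_X$ and $\hat{p}_Y = Y / n_Y$ are the unadjusted sample proportions.

```latex
\hat{p}_X - \hat{p}_Y \pm z_{\alpha/2}
\sqrt{\frac{\hat{p}_X (1 - \hat{p}_X)}{n_X} + \frac{\hat{p}_Y (1 - \hat{p}_Y)}{n_Y}}
```

It has the same shape as the adjusted interval; the only difference is that no successes or trials are added to either sample.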
Example
Methods for estimating strength and stiffness requirements should be conservative in that they should overestimate rather than underestimate. The success rate of such a method can be measured by a probability of an overestimate. An article in Journal of Structural Engineering presents the results of an experiment that evaluated a standard method for estimating the brace force for a compression web brace. In a sample of 380 short test columns the method overestimated the force for 304 of them, and in a sample of 394 long test columns, the method overestimated the force for 360 of them. Find a 95% confidence interval for the difference between the success rates for long columns and short columns.
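The interval asked for here can be computed with a short Python sketch of the adjusted CI from the set-up slides. The function name `adjusted_diff_ci` and its parameters are illustrative, not from the source; X is taken as the long-column sample and Y as the short-column sample, since the question asks for long minus short.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_diff_ci(x, n_x, y, n_y, confidence=0.95):
    """Adjusted CI for p_X - p_Y (hypothetical helper name).

    Adds 1 success and 2 trials to each sample, as in the set-up slide.
    """
    n_x_adj, n_y_adj = n_x + 2, n_y + 2
    p_x_adj = (x + 1) / n_x_adj
    p_y_adj = (y + 1) / n_y_adj
    # Two-sided normal critical value z_{alpha/2}
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_x_adj * (1 - p_x_adj) / n_x_adj
                          + p_y_adj * (1 - p_y_adj) / n_y_adj)
    diff = p_x_adj - p_y_adj
    # Truncate the limits to [-1, 1], as the slide instructs
    return max(diff - half_width, -1.0), min(diff + half_width, 1.0)


# X = long columns (360 of 394), Y = short columns (304 of 380)
lower, upper = adjusted_diff_ci(360, 394, 304, 380)
print(f"95% CI for p_long - p_short: ({lower:.4f}, {upper:.4f})")
```

With these counts the interval is roughly (0.064, 0.162). It excludes zero, which suggests a higher success rate for long columns.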
Hypothesis Tests on the Difference
Between Two Proportions
• The procedure for testing the difference between
two populations is similar to the procedure for
testing the difference between two means.
• One possible pair of null and alternative hypotheses is
H0: pX – pY ≥ 0 versus H1: pX – pY < 0.
Comments
• The test is based on the statistic $\hat{p}_X - \hat{p}_Y$.
• We must determine the null distribution of this
statistic.
• By the Central Limit Theorem, since nX and nY are
both large, we know that the sample proportions
for X and Y have an approximately normal
distribution.
More on Proportions
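• The difference between the sample proportions is therefore
also approximately normally distributed.
• Let $\hat{p} = \dfrac{X + Y}{n_X + n_Y}$ be the pooled proportion. Then, when $p_X = p_Y$, approximately
$$\hat{p}_X - \hat{p}_Y \sim N\left(0,\ \hat{p}(1 - \hat{p})\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)\right)$$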
Hypothesis Test
• Let X ~ Bin(nX, pX) and Y ~ Bin(nY, pY). Assume nX and
nY are large, and that X and Y are independent.
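• To test a null hypothesis of the form H0: pX – pY ≤ 0,
H0: pX – pY ≥ 0, or H0: pX – pY = 0:
• Compute
$$\hat{p}_X = \frac{X}{n_X}, \qquad \hat{p}_Y = \frac{Y}{n_Y}, \qquad \hat{p} = \frac{X + Y}{n_X + n_Y}$$
• Compute the z-score:
$$z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1 - \hat{p})\left(1/n_X + 1/n_Y\right)}}$$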
P-value
Compute the P-value. The P-value is an area
under the normal curve, which depends on the
alternative hypothesis as follows:
• If the alternative hypothesis is H1: pX – pY > 0, then the P-value is the area to the right of z.
• If the alternative hypothesis is H1: pX – pY < 0, then the P-value is the area to the left of z.
• If the alternative hypothesis is H1: pX – pY ≠ 0, then the P-value is the sum of the areas in the tails cut off by z and -z.
Example
Industrial firms often employ methods of “risk transfer”, such as insurance or indemnity clauses in contracts, as a technique of risk management. An article reports the results of a survey in which managers were asked which methods played a major role in the risk management strategy of their firms. In a sample of 43 oil companies, 22 indicated that risk transfer played a major role, while in a sample of 93 construction companies, 55 reported that risk transfer played a major role. Can we conclude that the proportion of oil companies that employ the method of risk transfer is less than the proportion of construction companies that do?
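A minimal Python sketch of the pooled z-test applied to this example follows. The function name `two_proportion_z_test` and the `alternative` keyword are illustrative choices, not from the source. X is the oil-company sample and Y the construction-company sample, so the question corresponds to H1: pX – pY < 0.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x, n_x, y, n_y, alternative="two-sided"):
    """Pooled z-test for H0 about p_X - p_Y (hypothetical helper name)."""
    p_x_hat = x / n_x
    p_y_hat = y / n_y
    p_pooled = (x + y) / (n_x + n_y)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_x + 1 / n_y))
    z = (p_x_hat - p_y_hat) / se
    cdf = NormalDist().cdf
    if alternative == "greater":      # H1: p_X - p_Y > 0, area to the right of z
        p_value = 1 - cdf(z)
    elif alternative == "less":       # H1: p_X - p_Y < 0, area to the left of z
        p_value = cdf(z)
    elif alternative == "two-sided":  # H1: p_X - p_Y != 0, both tails
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Oil companies: 22 of 43; construction companies: 55 of 93
z, p_value = two_proportion_z_test(22, 43, 55, 93, alternative="less")
print(f"z = {z:.2f}, P-value = {p_value:.3f}")
```

The z-score is about -0.87 with a P-value near 0.19, so these samples do not support the conclusion that the oil-company proportion is smaller.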
20.4 Comparing Proportions
• The chi-square distribution can be used to compare the
frequency of occurrence for discrete variables.
• The test compares an observed frequency distribution with
a theoretical distribution.
• Data compilation and analysis take the form of the following
contingency table, in which observations are designated 𝑂𝑖𝑗
and expected values are calculated as 𝐸𝑖𝑗.
𝑨𝟏 𝑨𝟐 𝑨𝟑 ⋯ 𝑨𝒕 Total
𝑩𝟏 𝑂11 𝑂12 𝑂13 ⋯ 𝑂1𝑡 𝑇𝑟𝑜𝑤 1 = 𝑂11 + 𝑂12 + 𝑂13 + ⋯+ 𝑂1𝑡
𝐸11 𝐸12 𝐸13 ⋯ 𝐸1𝑡
𝑩𝟐 𝑂21 𝑂22 𝑂23 ⋯ 𝑂2𝑡 𝑇𝑟𝑜𝑤 2 = 𝑂21 + 𝑂22 + 𝑂23 + ⋯+ 𝑂2𝑡
𝐸21 𝐸22 𝐸23 ⋯ 𝐸2𝑡
𝑩𝟑 𝑂31 𝑂32 𝑂33 ⋯ 𝑂3𝑡 𝑇𝑟𝑜𝑤 3 = 𝑂31 + 𝑂32 + 𝑂33 + ⋯+ 𝑂3𝑡
𝐸31 𝐸32 𝐸33 ⋯ 𝐸3𝑡
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
𝑩𝒔 𝑂𝑠1 𝑂𝑠2 𝑂𝑠3 ⋯ 𝑂𝑠𝑡 𝑇𝑟𝑜𝑤 𝑠 = 𝑂𝑠1 + 𝑂𝑠2 + 𝑂𝑠3 + ⋯+ 𝑂𝑠𝑡
𝐸𝑠1 𝐸𝑠2 𝐸𝑠3 ⋯ 𝐸𝑠𝑡
Total 𝑇𝑐𝑜𝑙 1 𝑇𝑐𝑜𝑙 2 𝑇𝑐𝑜𝑙 3 ⋯ 𝑇𝑐𝑜𝑙 𝑡 𝑇 = 𝑇𝑟𝑜𝑤 1 + 𝑇𝑟𝑜𝑤 2 + 𝑇𝑟𝑜𝑤 3 + ⋯+ 𝑇𝑟𝑜𝑤 𝑠
• The expected values are calculated as
$$E_{ij} = \frac{T_{\text{row } i} \times T_{\text{col } j}}{T}$$
• The null hypothesis is that there is no difference, while the
alternative is that at least one of the proportions is different.
• The chi-square statistic (Table G) can be used to assess
this hypothesis, where the number of degrees of
freedom ($\nu$) is $(s - 1)(t - 1)$.
• If $\chi^2_{cal}$ is larger than the chi-square criterion, the null
hypothesis is rejected at $\alpha$ risk.
$$\chi^2_{cal} = \sum_{i=1}^{s} \sum_{j=1}^{t} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
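The expected-count and chi-square calculations can be sketched in a few lines of Python. The function name `chi_square_statistic` and the 2 x 2 table below are made up for illustration only. The comparison against the Table G criterion is left to the reader.

```python
def chi_square_statistic(observed):
    """Return (chi2_cal, degrees_of_freedom, expected) for an s x t table.

    `observed` is a list of s rows, each a list of t observed counts O_ij.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E_ij = (row i total) * (column j total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2_cal = sum((o - e) ** 2 / e
                   for o_row, e_row in zip(observed, expected)
                   for o, e in zip(o_row, e_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2_cal, dof, expected


# Hypothetical counts, not from the chapter
table = [[10, 20],
         [20, 10]]
chi2_cal, dof, expected = chi_square_statistic(table)
print(f"chi2_cal = {chi2_cal:.3f}, degrees of freedom = {dof}")
```

For this made-up table every expected count is 15, so each cell contributes 25/15 to the statistic.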
20.5 Example 20.1 Comparing Proportions
• The abilities of three x-ray inspectors at an airport were
evaluated on the detection of key items. A test was devised in
which 90 pieces of luggage were “bugged” with a device that
they should question. Each inspector was exposed to exactly
30 of the “bugged” items in random fashion. The null
hypothesis is that there is no difference between inspectors.
The alternative hypothesis is that at least one of the proportions
is different.
Observed       Insp 1   Insp 2   Insp 3   Treatment total
Detected         27       25       22          74
Undetected        3        5        8          16
Sample total     30       30       30          90
Expected       Insp 1     Insp 2     Insp 3     Treatment total
Detected       24.66667   24.66667   24.66667        74
Undetected     5.333333   5.333333   5.333333        16
Sample total   30         30         30              90

Chi-Square     Insp 1     Insp 2     Insp 3
Detected       0.220721   0.004505   0.288288
Undetected     1.020833   0.020833   1.333333
Sum: $\chi^2_{cal}$ = 2.888514
From Table G, $\chi^2_{0.05,2}$ = 5.99. Since 2.89 < 5.99, fail to reject $H_0$.
Chi-Square Test: Insp1, Insp2, Insp3
Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts
Insp1 Insp2 Insp3 Total
1 27 25 22 74
24.67 24.67 24.67
0.221 0.005 0.288
2 3 5 8 16
5.33 5.33 5.33
1.021 0.021 1.333
Total 30 30 30 90
Chi-Sq = 2.889, DF = 2, P-Value = 0.236
Minitab: Stat > Tables > Chi-sq test (2 way..)
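The figures in this example can be cross-checked without Minitab. The sketch below recomputes the statistic from the observed counts. For the P-value it relies on a property specific to 2 degrees of freedom: the chi-square upper-tail area is exactly exp(-x/2). Any other degrees of freedom would need a statistics library. Variable names are illustrative.

```python
from math import exp

# Observed counts from Example 20.1 (columns are inspectors 1 to 3)
observed = {"Detected": [27, 25, 22], "Undetected": [3, 5, 8]}

col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

chi2_cal = 0.0
for row in observed.values():
    row_total = sum(row)
    for o, col_total in zip(row, col_totals):
        e = row_total * col_total / grand_total  # expected count E_ij
        chi2_cal += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(col_totals) - 1)
assert dof == 2  # the closed form below holds only for 2 degrees of freedom

p_value = exp(-chi2_cal / 2)   # upper-tail area of chi-square with 2 df
critical_value = 5.99          # Table G value quoted in the example
reject_h0 = chi2_cal > critical_value

print(f"Chi-Sq = {chi2_cal:.3f}, DF = {dof}, P-Value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The result agrees with the reported Chi-Sq = 2.889 and P-Value = 0.236, and the decision is to fail to reject the null hypothesis.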
20.6 Comparing Nonconformance
Proportions and Count Frequencies
• Consider a situation in which an organization wants to
evaluate the nonconformance rates of several suppliers to
determine if there are differences. The chi-square approach
can assess this situation from an overall point of view.
• However, the methodology does not identify which suppliers
might be worse than the overall mean.
• A 𝑝-chart (for nonconformance data) or 𝑢-chart (for count data)
can be used to identify out-of-control-limit data.
• Some statistical programs (e.g., Minitab) use a methodology
similar to the Analysis of Means (ANOM) for both proportion
and count data when the sample size is the same. The null
hypothesis is that the rate from each category equates to the
overall mean.
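As an illustration of the 𝑝-chart idea mentioned above, the sketch below applies conventional three-sigma 𝑝-chart limits to the undetected counts from Example 20.1. This is the control-chart approach, not Minitab's ANOM procedure, whose decision limits are computed differently. Variable names are illustrative.

```python
from math import sqrt

# Undetected (nonconforming) items per inspector from Example 20.1
nonconforming = [3, 5, 8]
sample_size = 30  # items shown to each inspector

p_bar = sum(nonconforming) / (sample_size * len(nonconforming))
sigma = sqrt(p_bar * (1 - p_bar) / sample_size)

ucl = p_bar + 3 * sigma
lcl = max(0.0, p_bar - 3 * sigma)  # a proportion cannot fall below zero

# Inspectors (1-based) whose proportion falls outside the limits
flagged = [i + 1 for i, d in enumerate(nonconforming)
           if not lcl <= d / sample_size <= ucl]

print(f"p-bar = {p_bar:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
print("Out-of-limit inspectors:", flagged)
```

None of the three inspectors falls outside the limits, which is consistent with the chi-square test failing to reject the null hypothesis.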
20.7 Example 20.2 Comparing
Nonconformance Proportions
• For the data in Example 20.1
Minitab: Stat > ANOVA > Analysis of Means
20.8 Example 20.3 Comparing
Counts
Insp   Defects
1      330
2      350
3      285
4      320
5      315
6      390
7      320
8      270
9      310
10     318
Minitab: Stat > ANOVA > Analysis of Means
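The ANOM output for this example is not reproduced here. As a rough stand-in, the sketch below screens the counts with three-sigma limits of the kind used on a count control chart. It assumes every inspector examined the same amount of product, which the slide does not state; under that assumption a 𝑢-chart with one unit per inspector reduces to mean ± 3 × sqrt(mean). This is not Minitab's ANOM calculation.

```python
from math import sqrt

# Defect counts per inspector from Example 20.3
defects = {1: 330, 2: 350, 3: 285, 4: 320, 5: 315,
           6: 390, 7: 320, 8: 270, 9: 310, 10: 318}

# Assumption: equal inspection volume, so raw counts are comparable
mean_count = sum(defects.values()) / len(defects)
sigma = sqrt(mean_count)  # Poisson-based standard deviation of a count

ucl = mean_count + 3 * sigma
lcl = max(0.0, mean_count - 3 * sigma)

flagged = [insp for insp, count in defects.items()
           if not lcl <= count <= ucl]

print(f"mean = {mean_count:.1f}, LCL = {lcl:.1f}, UCL = {ucl:.1f}")
print("Inspectors outside the limits:", flagged)
```

Under these assumptions only inspector 6 (390 defects) falls outside the limits. Inspector 8 (270 defects) is just inside the lower limit.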
20.9 Example 20.4 Difference in
Two Proportions
Minitab: Stat > Basic Statistics > 2 Proportions
Test and CI for Two Proportions
Sample X N Sample p
1 6290 620000 0.010145
2 4661 490000 0.009512
Difference = p (1) - p (2)
Estimate for difference: 0.000632916
95% CI for difference: (0.000264020, 0.00100181)
Test for difference = 0 (vs not = 0): Z = 3.36 P-Value = 0.001
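The Minitab figures above can be approximately reproduced in Python. The printed Z and CI appear consistent with an unpooled standard error, which differs slightly from the pooled form used in the z-score formula earlier in the chapter. That is an inference from the numbers, not something the output states. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 6290, 620000   # sample 1
x2, n2 = 4661, 490000   # sample 2

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Unpooled standard error (assumed to match the printed output)
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z = diff / se_unpooled
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test of difference = 0

z_crit = NormalDist().inv_cdf(0.975)          # 95% two-sided critical value
ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

print(f"Estimate for difference: {diff:.9f}")
print(f"95% CI for difference: ({ci[0]:.9f}, {ci[1]:.8f})")
print(f"Z = {z:.2f}, P-Value = {p_value:.3f}")
```

Although the estimated difference is only about 0.0006, the very large sample sizes make it statistically significant.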