Post on 20-Dec-2015
![Page 1: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/1.jpg)
Chap 2 Introduction to Statistics
This chapter gives an overview of statistics, including histogram construction, measures of central tendency, and measures of dispersion
![Page 2: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/2.jpg)
INTRODUCTION TO STATISTICS
Statistics – deriving relevant information from data
Deals with:
1. Collection of data – census, GDP, football, accidents, no. of employees (male, female, department, etc.)
2. Collection, tabulation, analysis, interpretation, and presentation of quantitative data – can draw conclusions about the sample or population studied and make decisions on quality
![Page 3: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/3.jpg)
INTRODUCTION TO STATISTICS
Use of statistics in quality deals with the second meaning – inductive statistics
Examples:
- What can we learn from the data?
- What conclusions can be drawn?
- What does the data tell us about our process and product performance?
![Page 4: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/4.jpg)
INTRODUCTION TO STATISTICS
Understanding the use of statistics is vital in business:
- to make decisions based on facts
- in conducting business improvements
- in controlling and monitoring process, product, or service performance
Application of statistics to real-life problems, such as quality problems, will result in improved organizational performance
![Page 5: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/5.jpg)
Collection of data
Collect Data – direct observation or indirect through written or verbal questions (market research, opinion polls)
Direct observation – measured or visually checked; data are classified as variables or attributes
Variables data – measurable quality characteristics
Attributes – characteristics not measured but classified as conforming or non-conforming
![Page 6: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/6.jpg)
Collection of data
Data are collected with a purpose:
- to find out process conditions
- for improvement
Variables – quality characteristics that are measurable or countable
- CONTINUOUS – dimensions, weight, height, etc. (meter, gallon, p.s.i., etc.)
- DISCRETE – numbers that exhibit gaps, countable (no. of defective parts, no. of defects/car; whole numbers 1, 2, 3, …, 100)
![Page 7: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/7.jpg)
Collection of data
Attributes – quality characteristics that are non-measurable, and 'those we do not want to measure'
Example: surface appearance, color; acceptable / non-acceptable; conforming / non-conforming
Data are collected in the form of discrete values
Variables (e.g. weight of sugar) CAN be classified as attributes:
- weight within limits – number of conforming
- weight outside limits – number of non-conforming
![Page 8: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/8.jpg)
Summarizing Data
Consider this data set on the number of daily billing errors:
0 1 3 0 1 0 1 0
1 5 4 1 2 1 2 0
1 0 2 0 0 2 0 1
2 1 1 1 2 1 1
0 4 1 3 1 1 1
1 3 4 0 0 0 0
1 3 0 1 2 2 3
Data in this form are meaningless, not effective, and difficult to use
![Page 9: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/9.jpg)
Need to summarize data in the form of:
- Graphical – frequency distribution, histogram, graphs, charts, diagrams
- Analytical – measures of central tendency, measures of dispersion
![Page 10: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/10.jpg)
Frequency Distribution (FD)
Summary of how data (observations) occur within each subdivision or group of observed values
Helps visualize the distribution of data
Can see how the total frequency is distributed
Two types:
- Ungrouped data – listing of observed values
- Grouped data – observed values lumped together
[Figure: example cells for grouped data, with values 20.5–21.5, 21.5–22.5, 22.5–23.5]
![Page 11: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/11.jpg)
FD - Ungrouped Data
1. Establish an array, arranged in ascending or descending order (as in column 1)
2. Tabulate the frequency – place tally marking in column 2
3. Present in graphical form – Histogram, Relative freq. distr.
No of errors Tally mark Frequency
0 /////////// 13
1 ////
2 /////
3 ////
4
5
![Page 12: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/12.jpg)
FD – Ungrouped data
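| No. of errors | Freq | Relative freq | Cumulative freq | Rel cum freq |
|---|---|---|---|---|
| 0 | 15 | 0.29 | 15 | 0.29 |
| 1 | 20 | 0.38 | 35 | 0.67 |
| 2 | 8 | 0.15 | 43 | 0.83 |
| 3 | 5 | 0.10 | 48 | 0.92 |
| 4 | 3 | 0.06 | 51 | 0.98 |
| 5 | 1 | 0.02 | 52 | 1.00 |
| Total | 52 | | | |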
4 graphical representations
1. Frequency histogram
2. Relative freq histogram
3. Cumulative frequency histogram
4. Relative cum frequency histogram
[Figure: frequency histogram; x-axis: number of errors (0 to 5), y-axis: frequency]
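The tabulation steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the list `errors` is the billing-error data set copied from this chapter, and the function name `frequency_table` is a hypothetical helper.

```python
from collections import Counter

# Daily billing errors, copied from the data set in this chapter (52 values).
errors = [
    0, 1, 3, 0, 1, 0, 1, 0,
    1, 5, 4, 1, 2, 1, 2, 0,
    1, 0, 2, 0, 0, 2, 0, 1,
    2, 1, 1, 1, 2, 1, 1,
    0, 4, 1, 3, 1, 1, 1,
    1, 3, 4, 0, 0, 0, 0,
    1, 3, 0, 1, 2, 2, 3,
]


def frequency_table(data):
    """Return rows of (value, freq, rel_freq, cum_freq, rel_cum_freq)."""
    counts = Counter(data)          # tally each observed value
    n = len(data)
    rows, cum = [], 0
    for value in sorted(counts):    # ascending array of observed values
        cum += counts[value]
        rows.append((value, counts[value], counts[value] / n, cum, cum / n))
    return rows


table = frequency_table(errors)
for value, f, rf, cf, rcf in table:
    print(f"{value:>5} {f:>4} {rf:6.2f} {cf:>4} {rcf:6.2f}")
```

Any of the four histograms can then be drawn from the corresponding column of `table`.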
![Page 13: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/13.jpg)
Frequency Distribution For Grouped Data
Data that are continuous variables need grouping
Steps
1. Collect data and construct tally sheet
- Make tally – coded if necessary
- Too many data – group into cells
- Simplifies presentation of the distribution
- Too many cells – distort the true picture
- Too few cells – too concentrated
- No. of cells – judgment by analyst – trial and error
- Generally 5 to 20 cells
- Less than 100 data – use 5 to 9 cells
- 100 to 500 data – use 8 to 17 cells
- More than 500 data – use 15 to 20 cells
![Page 14: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/14.jpg)
CELL NOMENCLATURE
[Figure: a cell labelled with its midpoint, upper boundary, and cell interval (i)]
![Page 15: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/15.jpg)
2. Determine the range
$R = X_H - X_L$
R = range
$X_H$ = highest value of data
$X_L$ = lowest value of data
Example: If the highest number is 2.575 and the lowest number is 2.531, then $R = X_H - X_L = 2.575 - 2.531 = 0.044$
![Page 16: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/16.jpg)
3. Determine the cell interval
Cell interval = distance between adjacent cell midpoints
If possible, use odd interval values, e.g. 0.001, 0.07, 0.5, 3, so that midpoint values have the same number of decimal places as the data values
Use Sturges' rule: $i = R / (1 + 3.322 \log n)$
Trial and error: $h = R / i$, where h = number of cells or classes
- Assume i = 0.003; h = 0.044/0.003 = 15 cells
- Assume i = 0.005; h = 0.044/0.005 = 9 cells
- Assume i = 0.007; h = 0.044/0.007 = 6 cells
A cell interval of 0.005 with 9 cells will give the best presentation of the data. Use the guidelines in step 1.
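The trial-and-error comparison in step 3 can be checked with a short sketch. The function names are hypothetical, and the sample size `n = 110` is an assumption taken from the total of the frequency table shown later in this chapter; the logarithm in Sturges' rule is taken as base 10.

```python
import math


def sturges_interval(data_range, n):
    """Suggested cell interval from Sturges' rule: i = R / (1 + 3.322 log10 n)."""
    return data_range / (1 + 3.322 * math.log10(n))


def cells_for_interval(data_range, interval):
    """Number of cells h = R / i, rounded to the nearest whole cell."""
    return round(data_range / interval)


R = 0.044   # range from step 2 (2.575 - 2.531)
n = 110     # assumed sample size (total of the frequency table)

print(f"Sturges' rule suggests i = {sturges_interval(R, n):.4f}")
for i in (0.003, 0.005, 0.007):
    print(f"i = {i}: h = {cells_for_interval(R, i)} cells")
```

Under these assumptions the Sturges value falls between 0.005 and 0.007, which is consistent with choosing the odd interval 0.005.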
![Page 17: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/17.jpg)
4. Determine cell midpoints
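$MP_L = X_L + i/2$ (do not round) $= 2.531 + 0.005/2 = 2.533$ (the unrounded 2.5335 is truncated to the three decimal places of the data)
The 1st cell has 5 different values (as do the other cells): 2.531 2.532 2.533 2.534 2.535
Midpoint of 1st cell: 2.533
Midpoint of 2nd cell: 2.538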
![Page 18: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/18.jpg)
5. Determine cell boundaries
Limit values of a cell: lower and upper
Used to avoid ambiguity in placing data into cells
Boundary values have an extra decimal place or significant figure of accuracy compared with the observed values
- add 0.0005 to the highest value in the cell
- subtract 0.0005 from the lowest value in the cell
![Page 19: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/19.jpg)
6. Tabulate cell frequency
Post the number of observations in each cell
Frequency distribution table

| Cell boundary | Cell MP | Freq. |
|---|---|---|
| 2.531 – 2.535 | 2.533 | 6 |
| 2.536 – 2.540 | 2.538 | 8 |
| 2.541 – 2.545 | 2.543 | 12 |
| 2.546 – 2.550 | 2.548 | 13 |
| 2.551 – 2.555 | 2.553 | 20 |
| 2.556 – 2.560 | 2.558 | 19 |
| 2.561 – 2.565 | 2.563 | 13 |
| 2.566 – 2.570 | 2.568 | 11 |
| 2.571 – 2.575 | 2.573 | 8 |
| Total | | 110 |
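Steps 4 and 5 can be reproduced with a small sketch, assuming the lowest value 2.531, the cell interval 0.005, and the 9 cells chosen in step 3. The helper `build_cells` is hypothetical; it works in integer units of the last decimal place so that floating-point drift does not shift any limit.

```python
def build_cells(lowest, interval, n_cells, decimals=3):
    """Return (low limit, high limit, low boundary, high boundary, midpoint) per cell."""
    scale = 10 ** decimals
    low = round(lowest * scale)      # lowest value in units of the last decimal place
    step = round(interval * scale)   # cell interval in the same units
    cells = []
    for k in range(n_cells):
        lo = low + k * step          # lowest value in the cell
        hi = lo + step - 1           # highest value in the cell
        cells.append((
            lo / scale,
            hi / scale,
            (lo - 0.5) / scale,      # boundary: half a unit below the lowest value
            (hi + 0.5) / scale,      # boundary: half a unit above the highest value
            (lo + hi) / 2 / scale,   # midpoint of the cell
        ))
    return cells


cells = build_cells(2.531, 0.005, 9)
for lo, hi, lb, ub, mp in cells:
    print(f"{lo:.3f} - {hi:.3f}  boundaries {lb:.4f} - {ub:.4f}  midpoint {mp:.3f}")
```

Each midpoint comes out 0.005 above the previous one, which is a quick way to check a hand-built table.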
![Page 20: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/20.jpg)
A frequency distribution gives a better view of the central value and of how the data are dispersed than the unorganized data sheet
Histogram – describes variation in the process
Used to:
- solve problems
- determine process capability
- compare with specifications
- suggest shape of distribution
- indicate data discrepancies, e.g. gaps
![Page 21: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/21.jpg)
Characteristics Of Frequency Distribution
Symmetry, number of modes (one, two or multiple), peakedness of data
[Figure: distribution shapes – symmetrical, skewed right, skewed left, bi-modal, leptokurtic ('very peaked'), platykurtic (flatter)]
![Page 22: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/22.jpg)
Characteristics of Frequency Distribution
F.D. can give sufficient info to provide basis for decision making.
Distributions are compared regarding:
- Location
- Spread
- Shape
![Page 23: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/23.jpg)
Descriptive Statistics
Analytical methods allow comparison between data
2 main analytical methods for describing data:
- Measures of central tendency
- Measures of dispersion
Measure of central tendency of a distribution – a numerical value that describes the central position of the data
3 common measures:
- mean
- median
- mode
![Page 24: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/24.jpg)
Measure of Central Tendency
Mean – most common measure used
What is the middle value? What is the average number of rejects, errors, dimension of product?
Mean for Ungrouped Data – unarranged: $\bar{x}$ (x bar)

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$$
![Page 25: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/25.jpg)
Mean
Example
A QA engineer inspects 5 pieces for tyre tread depth (mm). What is the mean tread depth?
x1 = 12.3, x2 = 12.5, x3 = 12.0, x4 = 13.0, x5 = 12.8

$$\bar{x} = \frac{\sum x_i}{5} = \frac{62.6}{5} = 12.52\ \text{mm}$$
![Page 26: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/26.jpg)
Mean - Grouped Data
When data are already grouped in a frequency distribution

$$\bar{x} = \frac{\sum_{i=1}^{h} f_i x_i}{\sum_{i=1}^{h} f_i}$$

$\sum f_i$ (n) = sum of frequencies
$f_i$ = frequency in the ith cell
h = no. of cells/classes
$x_i$ = midpoint of the ith cell
![Page 27: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/27.jpg)
Mean - Grouped Data
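| Cell (i) | Class boundary | Mid Point ($x_i$) | Freq ($f_i$) | $f_i x_i$ | Cumulative freq ($\sum f_i$) |
|---|---|---|---|---|---|
| 1 | 1 – 20 | 10 | 2 | 20 | 2 |
| 2 | 21 – 40 | 30 | 10 | 300 | 12 |
| 3 | 41 – 60 | 50 | 20 | 1000 | 32 |
| 4 | 61 – 80 | 70 | 12 | 840 | 44 |
| 5 | 81 – 100 | 90 | 6 | 540 | 50 |
| Totals | | | 50 | 2700 | |

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{2700}{50} = 54$$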
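As a quick check of the grouped-data mean, here is a minimal sketch using the midpoints and frequencies from the table; the function name `grouped_mean` is a hypothetical helper.

```python
midpoints = [10, 30, 50, 70, 90]   # x_i, cell midpoints from the table
freqs = [2, 10, 20, 12, 6]         # f_i, cell frequencies from the table


def grouped_mean(x, f):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i)."""
    return sum(fi * xi for fi, xi in zip(f, x)) / sum(f)


print(grouped_mean(midpoints, freqs))
```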
![Page 28: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/28.jpg)
Weighted average
Tensile tests on an aluminium alloy were conducted with a different number of samples each time. Results are as follows:
1st test: $\bar{x}_1$ = 207 MPa, n = 5
2nd test: $\bar{x}_2$ = 203 MPa, n = 6
3rd test: $\bar{x}_3$ = 206 MPa, n = 3

$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

$\bar{x}_w$ = weighted avg.
$w_i$ = weight of ith average

$$\bar{x}_w = \frac{(5)(207) + (6)(203) + (3)(206)}{5 + 6 + 3} = 205\ \text{MPa}$$

or use weights whose sum equals 1.00:
W1 = 5/(5+6+3) = 0.36
W2 = 6/(5+6+3) = 0.43
W3 = 3/(5+6+3) = 0.21
Total = 1.00
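The weighted average can be sketched as follows, using the three test means and sample sizes given above. The function name is hypothetical. The sketch also shows that normalising the weights so they sum to 1.00 leaves the result unchanged.

```python
def weighted_average(values, weights):
    """Weighted average: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


test_means = [207, 203, 206]   # MPa, average of each tensile test
sample_sizes = [5, 6, 3]       # number of samples in each test, used as weights

xw = weighted_average(test_means, sample_sizes)

# Alternative: weights rescaled so that they sum to 1.00.
total = sum(sample_sizes)
unit_weights = [n / total for n in sample_sizes]
xw_unit = weighted_average(test_means, unit_weights)

print(f"{xw:.1f} MPa, {xw_unit:.1f} MPa")
```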
![Page 29: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/29.jpg)
Median – Ungrouped Data
Median – value which divides the total observations into 2 equal parts
Ungrouped data – 2 possibilities: the total number of data (N) is a) odd or b) even
If N is odd, the ((N+1)/2)th value is the median, e.g. 3 4 5 6 8: (N+1)/2 = 6/2 = 3, so the median is the 3rd number, 5
If N is even, the median is the average of the two middle values, e.g. 3 5 7 9: ½ of (5+7) = 6
NOTE: ORDER THE NUMBERS FIRST!
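The two cases can be written as one short function. This is a sketch with a hypothetical function name; it sorts the data first, as the note above requires.

```python
def median_ungrouped(values):
    """Median of ungrouped data: middle value (odd N) or mean of the two middle values (even N)."""
    ordered = sorted(values)       # order the numbers first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]        # the ((N + 1) / 2)th value
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_ungrouped([3, 4, 5, 6, 8]))
print(median_ungrouped([9, 3, 7, 5]))
```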
![Page 30: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/30.jpg)
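Median – Grouped Data
Need to find the cell / class having the middle value and interpolate within the cell using

$$Md = L_m + \left(\frac{\frac{n}{2} - cf_m}{f_m}\right) i$$

$L_m$ = lower boundary of the cell with the median
$cf_m$ = cumulative frequency of all cells below $L_m$
$f_m$ = class/cell frequency where the median occurs
i = cell interval
Example (grouped-data table from the mean example: n = 50, $L_m$ = 40.5, $cf_m$ = 12, $f_m$ = 20, i = 20)

$$Md = 40.5 + \left(\frac{\frac{50}{2} - 12}{20}\right) 20 = 53.5$$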
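The interpolation can be sketched as below. The function name is hypothetical, and the lower boundaries (0.5, 20.5, ...) are an assumption: they place each boundary half a unit below the lowest value in the cell, which matches the 40.5 used in the example.

```python
def grouped_median(lower_boundaries, freqs, interval):
    """Median of grouped data: L_m + ((n/2 - cf_m) / f_m) * i."""
    n = sum(freqs)
    cum = 0                                   # cumulative frequency below the current cell
    for lower, f in zip(lower_boundaries, freqs):
        if cum + f >= n / 2:                  # this cell contains the middle value
            return lower + (n / 2 - cum) * interval / f
        cum += f
    raise ValueError("no data")


# Grouped-data table from the mean example; boundaries are assumed (see lead-in).
lower_boundaries = [0.5, 20.5, 40.5, 60.5, 80.5]
freqs = [2, 10, 20, 12, 6]

md = grouped_median(lower_boundaries, freqs, interval=20)
print(md)
```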
![Page 31: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/31.jpg)
Measures of dispersion
describes how the data are spread out or scattered on each side of central value
both measures of central tendency & dispersion needed to describe data
Exam results
Class 1 – avg.: 60.0 marks, highest: 95, lowest: 25
Class 2 – avg.: 60.0 marks, highest: 100, lowest: 15
![Page 32: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/32.jpg)
Measures of dispersion
Main types – range, standard deviation, and variance
Range – difference between highest and lowest value: $R = X_H - X_L$
Standard deviation
Variance – standard deviation squared
A large value shows greater variability or spread
![Page 33: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/33.jpg)
Standard deviation
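For Ungrouped Data

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

s = sample std. dev.
$x_i$ = observed value
$\bar{x}$ = average
n = no. of observed values

or use

$$s = \sqrt{\frac{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)}}$$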
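The two forms of the sample standard deviation give the same result, which a short sketch can confirm. The function names are hypothetical, and the data are the five tread depth readings from the earlier mean example, reused here only for illustration.

```python
import math
import statistics


def std_dev_definition(data):
    """s = sqrt( sum((x_i - mean)^2) / (n - 1) )."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


def std_dev_computational(data):
    """s = sqrt( (n * sum(x_i^2) - (sum(x_i))^2) / (n * (n - 1)) )."""
    n = len(data)
    return math.sqrt((n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1)))


depths = [12.3, 12.5, 12.0, 13.0, 12.8]   # mm, readings from the mean example

s1 = std_dev_definition(depths)
s2 = std_dev_computational(depths)
print(f"{s1:.4f} {s2:.4f} {statistics.stdev(depths):.4f}")
```

The second form avoids computing the mean first, which is why it is convenient for hand calculation, but it is more sensitive to rounding of the intermediate sums.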
![Page 34: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/34.jpg)
Standard deviation – grouped data
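| Cell (i) | Class boundary | Mid Point ($x_i$) | Freq ($f_i$) | $f_i x_i$ | Cumulative freq ($\sum f_i$) |
|---|---|---|---|---|---|
| 1 | 1 – 20 | 10 | 2 | 20 | 2 |
| 2 | 21 – 40 | 30 | 10 | 300 | 12 |
| 3 | 41 – 60 | 50 | 20 | 1000 | 32 |
| 4 | 61 – 80 | 70 | 12 | 840 | 44 |
| 5 | 81 – 100 | 90 | 6 | 540 | 50 |
| Totals | | | 50 | 2700 | |

$$s = \sqrt{\frac{n \sum_{i=1}^{h} f_i x_i^2 - \left(\sum_{i=1}^{h} f_i x_i\right)^2}{n(n - 1)}} = \sqrt{\frac{50(166{,}600) - (2700)^2}{50(49)}} = \sqrt{424.49} = 20.6$$

NOTE: DO NOT ROUND OFF $f_i x_i$ and $f_i x_i^2$; ACCURACY IS AFFECTED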
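The grouped-data calculation can be reproduced with a minimal sketch that uses the midpoints and frequencies from the table. The function name is hypothetical; intermediate sums are kept unrounded, in line with the note above.

```python
import math


def grouped_std_dev(x, f):
    """s = sqrt( (n * sum(f_i x_i^2) - (sum(f_i x_i))^2) / (n * (n - 1)) ), with n = sum(f_i)."""
    n = sum(f)
    sum_fx = sum(fi * xi for fi, xi in zip(f, x))
    sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))
    return math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))


midpoints = [10, 30, 50, 70, 90]   # x_i from the table
freqs = [2, 10, 20, 12, 6]         # f_i from the table

s = grouped_std_dev(midpoints, freqs)
print(f"s = {s:.1f}")
```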
![Page 35: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/35.jpg)
Concept Of Population and Sample
Population examples: total daily production of steel shafts; a year's production volume of calculators
Compute $\bar{x}$ and s – sample statistics
True population parameters – μ and σ
Why sample?
- not possible to measure the whole population
- costs involved
- 100% manual inspection – accuracy/error
[Figure: a sample drawn from a population]
![Page 36: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/36.jpg)
Concept Of Population and Sample
SAMPLE – Statistics: $\bar{x}$ (mean), s (std. dev.)
POPN. – Parameters: μ (mean), σ (std. dev.)
![Page 37: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/37.jpg)
Normal Distribution
Also called Gaussian distribution
Symmetrical, unimodal, bell-shaped distribution with mean, median, and mode having the same value
Population curve – as sample size increases and cell interval decreases, we get a smooth polygon
[Figure: normal distribution (ND) curve]
![Page 38: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/38.jpg)
Normal Distribution
Much of the variation in nature and industry follows the N.D.
Variation in height of humans, weight of elephants, casting weights, size of piston rings
Electrical properties, material properties – tensile strength, etc.
![Page 39: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/39.jpg)
Example - ND
![Page 40: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/40.jpg)
Characteristics of ND
Can have different mean but same standard deviation
![Page 41: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/41.jpg)
Different standard deviation but same mean
![Page 42: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/42.jpg)
Relationship between std deviation and area under curve
![Page 43: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/43.jpg)
Normal Distribution Example
Need estimates of the mean and standard deviation, and the Normal Table
Example: From past experience a manufacturer concludes that the burnout time of a particular light bulb follows a normal distribution. A sample has been tested and the average ($\bar{x}$) found to be 60 days with a standard deviation (σ) of 20 days. How many bulbs can be expected to be still working after 100 days?
![Page 44: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/44.jpg)
Solution
The problem is to find the area under the curve beyond 100 days
Sketch the normal distribution and shade the area needed
Calculate the z value corresponding to the x value using the formula $Z = (x_i - \mu)/\sigma = (100 - 60)/20 = +2.00$
Look in the Normal Table for z = +2.00 – gives the area under the curve as 0.9773
But we want x > 100, or z > 2.00. Therefore area = 1.000 – 0.9773 = 0.0227, i.e. a 2.27% probability that the life of a light bulb is > 100 days
[Figure: normal curve with μ = 60, σ = 20, shaded area to the right of x = 100]
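The table lookup can be cross-checked in Python with `statistics.NormalDist` from the standard library (Python 3.8 or later). This is a sketch using the example's values; the variable names are illustrative only. The computed tail area may differ from the table-based figure in the last digit, because printed tables are rounded.

```python
from statistics import NormalDist

mean_life = 60   # days, from the example
std_dev = 20     # days, from the example
limit = 100      # days

z = (limit - mean_life) / std_dev                              # standardised value
area_below = NormalDist(mu=mean_life, sigma=std_dev).cdf(limit)  # P(x <= 100)
area_above = 1 - area_below                                    # P(x > 100)

print(f"z = {z:+.2f}, area below = {area_below:.4f}, area above = {area_above:.4f}")
```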
![Page 45: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/45.jpg)
Test For Normality
To determine whether data are normal
Probability plot – plot data on normal probability paper
Steps
1. Order the data
2. Rank the observations
3. Calculate the plotting position

$$PP = \frac{100(i - 0.5)}{n}$$

i = rank, n = sample size, PP = plotting position in %
4. Label data scale
5. Plot the points on normal probability paper
6. Attempt to fit by eye a 'best line'
7. Determine normality
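Steps 1 to 3 can be sketched as follows. The function name is hypothetical and the five-value sample is made up purely to show the calculation; it is not data from the slides.

```python
def plotting_positions(data):
    """Return (value, PP) pairs, with PP = 100 * (i - 0.5) / n for rank i."""
    ordered = sorted(data)                      # step 1: order the data
    n = len(ordered)
    return [
        (x, 100 * (i - 0.5) / n)                # step 3: plotting position in %
        for i, x in enumerate(ordered, start=1)  # step 2: rank the observations
    ]


sample = [2.55, 2.53, 2.57, 2.54, 2.56]   # hypothetical observations
positions = plotting_positions(sample)
for value, pp in positions:
    print(f"{value:.2f}  {pp:.0f}%")
```

The resulting pairs are what would be plotted on normal probability paper in steps 4 to 6.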
![Page 46: Chap 2 Introduction to Statistics This chapter gives overview of statistics including histogram construction, measures of central tendency, and dispersion](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d425503460f94a1e0df/html5/thumbnails/46.jpg)
Example