TRANSCRIPT
1Prof. Indrajit Mukherjee, School of Management, IIT Bombay
Sampling Techniques
Samples
Non-Probability Samples: Judgment, Convenience, Others
Probability Samples: Simple Random, Systematic, Stratified, Cluster
Type of sampling: Convenience
Selection strategy: Select cases based on their availability for the study.
Purpose: Saves time, money and effort, but at the expense of information and credibility.
Simple random sampling
Sampling method: Each member of the population is identified uniquely by a number, and members are selected by random number.
Result: Every member of the population has an equal chance of being selected into the sample.
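A minimal sketch of simple random sampling, assuming the population members are numbered 1 to N. The function name, the population size of 100 and the sample size of 10 are illustrative choices, not values from the slide.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)
    # Sampling without replacement: every member has the same chance
    # of ending up in the sample.
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical population of 100 numbered members, sample of 10.
sample = simple_random_sample(population_size=100, sample_size=10, seed=1)
print(sorted(sample))
```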
Simulating From Continuous Uniform
Random numbers r come from the Uniform [0, 1] distribution, so 0 ≤ r ≤ 1.
To obtain a value from the Uniform [a, b] distribution, stretch r by (b - a) and shift it by a: the result a + r(b - a) lies between a and b.
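The shift-and-stretch idea can be sketched in a few lines. The function name and the endpoints 2,000 and 5,000 are assumptions chosen for illustration.

```python
import random

def uniform_ab(a, b, rng):
    """Turn a Uniform[0, 1) draw into a Uniform[a, b) draw."""
    r = rng.random()          # r is in [0, 1)
    return a + r * (b - a)    # stretch by (b - a), then shift by a

rng = random.Random(42)
draws = [uniform_ab(2000, 5000, rng) for _ in range(5)]
print(draws)
```

Python's `random.uniform(a, b)` applies the same formula; the explicit version is shown only to mirror the slide.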
How to use a random number table to select a random sample
Each number in the table corresponds to a number on the list of your population. In the example below, #08 has been chosen as the starting point and the first student chosen is Carol Chan.
10 09 73 25 33 76
37 54 20 48 05 64
08 42 26 89 53 19
90 01 90 25 29 09
12 80 79 99 70 80
66 06 57 47 17 34
31 06 01 08 05 45
Step 3: Move to the next number, 42, and select the person corresponding to that number into the sample: #42, Tan Teck Wah.
Step 4: Continue to the next number that qualifies and select that person into the sample: #26, Jerry Lewis, followed by #89, #53 and #19.
Step 5: After you have selected student #19, go to the next line and choose #90. Continue in the same manner until the full sample is selected. If you encounter a number selected earlier (e.g., 90, 06 in this example), simply skip over it and choose the next number.
Starting point: move right to the end of the row, then down to the next row; move left to the end, then down to the next row, and so on.
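The table walk can be sketched as follows. This is a simplified reading, assuming every row is read left to right (as Step 5 suggests), a hypothetical population numbered 1 to 99, and a sample of 10; none of these values are stated on the slide.

```python
# Random number table from the slide (leading zeros dropped).
TABLE = [
    [10, 9, 73, 25, 33, 76],
    [37, 54, 20, 48, 5, 64],
    [8, 42, 26, 89, 53, 19],
    [90, 1, 90, 25, 29, 9],
    [12, 80, 79, 99, 70, 80],
    [66, 6, 57, 47, 17, 34],
    [31, 6, 1, 8, 5, 45],
]

def select_from_table(table, start_value, population_size, sample_size):
    """Read the table from start_value onward, skipping repeats and
    numbers that do not correspond to a population member."""
    numbers = [n for row in table for n in row]
    start = numbers.index(start_value)   # first occurrence of the start
    chosen = []
    for n in numbers[start:]:
        if 1 <= n <= population_size and n not in chosen:
            chosen.append(n)
        if len(chosen) == sample_size:
            break
    return chosen

picked = select_from_table(TABLE, start_value=8, population_size=99, sample_size=10)
print(picked)
```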
Systematic sampling (contd.): Example
1 26 51 76
2 27 52 77
3 28 53 78
4 29 54 79
5 30 55 80
6 31 56 81
7 32 57 82
8 33 58 83
9 34 59 84
10 35 60 85
11 36 61 86
12 37 62 87
13 38 63 88
14 39 64 89
15 40 65 90
16 41 66 91
17 42 67 92
18 43 68 93
19 44 69 94
20 45 70 95
21 46 71 96
22 47 72 97
23 48 73 98
24 49 74 99
25 50 75 100
N = 100
Want n = 20
N/n = 5
Select a random number from 1 to 5: chose 4
Start with #4 and take every 5th unit
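The selection above can be reproduced with a short sketch. The function name is illustrative; the values N = 100, n = 20 and the start of 4 come from the slide.

```python
def systematic_sample(N, n, start):
    """Take every k-th unit, where k = N // n, beginning at `start`."""
    k = N // n
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and the interval k")
    return list(range(start, N + 1, k))[:n]

units = systematic_sample(N=100, n=20, start=4)
print(units)
```

In practice the start would be drawn at random, for example with `random.randint(1, k)`; it is fixed at 4 here to match the slide.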
Stratified Random Sample: Stratified by Age
20 - 30 years old (homogeneous within: alike)
30 - 40 years old (homogeneous within: alike)
40 - 50 years old (homogeneous within: alike)
Heterogeneous (different) between strata
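A minimal sketch of stratified random sampling by age group. The stratum sizes, the 10% sampling fraction and the use of proportional allocation are assumptions for illustration; the slide does not specify them.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical member IDs for each age stratum.
strata = {
    "20-30": list(range(1, 41)),    # 40 members
    "30-40": list(range(41, 91)),   # 50 members
    "40-50": list(range(91, 101)),  # 10 members
}
picked_by_stratum = stratified_sample(strata, fraction=0.1, seed=7)
print({name: sorted(ids) for name, ids in picked_by_stratum.items()})
```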
Sample Spaces and Events: Random Experiments
Noise variables affect the transformation of inputs to outputs.
[Figure: Input -> System -> Output, with controlled variables and noise variables acting on the system]
Example: Machining a shaft on a lathe
Variables: rotation speed, traverse speed, tool type, tool sharpness, shaft material, shaft length, material removal per cut, part cleanliness, coolant flow, operator, material variation, ambient temperature, coolant age
Outputs (Y's): diameter, taper, surface finish
Four Types of Probability
Marginal: the probability of X occurring, P(X)
Union: the probability of X or Y occurring, P(X ∪ Y)
Joint: the probability of X and Y occurring, P(X ∩ Y)
Conditional: the probability of X occurring given that Y has occurred, P(X|Y)
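The four types can be illustrated on a single roll of a fair die. The events X (even number) and Y (number greater than 3) are hypothetical examples, not taken from the slide.

```python
from fractions import Fraction

outcomes = set(range(1, 7))   # one roll of a fair die
X = {2, 4, 6}                 # assumed event: roll is even
Y = {4, 5, 6}                 # assumed event: roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

marginal = prob(X)              # P(X)
union = prob(X | Y)             # P(X or Y)
joint = prob(X & Y)             # P(X and Y)
conditional = joint / prob(Y)   # P(X | Y) = P(X and Y) / P(Y)

print(marginal, union, joint, conditional)
```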
P(A and B) (Venn diagram)
[Figure: circles P(A) and P(B); their overlap is P(A and B)]
P(A or B)
Sample Spaces and Events: Venn Diagrams
[Figure: Venn diagram of four mutually exclusive events E1, E2, E3 and E4]
Collectively Exhaustive Events
• Events are said to be collectively exhaustive if the list of outcomes includes every possible outcome, e.g. heads and tails as the possible outcomes of a coin flip.
Example 3
Draw | Mutually exclusive | Collectively exhaustive
Draw a spade and a club | Yes | No
Draw a face card and a number card | Yes | Yes
Draw an ace and a 3 | Yes | No
Draw a club and a nonclub | Yes | Yes
Draw a 5 and a diamond | No | No
Draw a red card and a diamond | No | No
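The classifications can be checked by treating each event as a set of cards from a standard 52-card deck. This is a sketch under one assumption not stated on the slide: the ace counts as a number card, which is what makes "face card and number card" collectively exhaustive.

```python
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spade", "club", "heart", "diamond"]
DECK = set(product(RANKS, SUITS))
FACE = {"J", "Q", "K"}

def event(predicate):
    """Set of cards (rank, suit) that satisfy the predicate."""
    return {card for card in DECK if predicate(*card)}

def classify(a, b):
    mutually_exclusive = not (a & b)          # no card belongs to both
    collectively_exhaustive = (a | b) == DECK  # together they cover the deck
    return mutually_exclusive, collectively_exhaustive

pairs = {
    "spade, club": (event(lambda r, s: s == "spade"), event(lambda r, s: s == "club")),
    "face, number": (event(lambda r, s: r in FACE), event(lambda r, s: r not in FACE)),
    "ace, 3": (event(lambda r, s: r == "A"), event(lambda r, s: r == "3")),
    "club, nonclub": (event(lambda r, s: s == "club"), event(lambda r, s: s != "club")),
    "5, diamond": (event(lambda r, s: r == "5"), event(lambda r, s: s == "diamond")),
    "red, diamond": (event(lambda r, s: s in ("heart", "diamond")),
                     event(lambda r, s: s == "diamond")),
}
results = {name: classify(a, b) for name, (a, b) in pairs.items()}
for name, (exclusive, exhaustive) in results.items():
    print(name, exclusive, exhaustive)
```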
The following circuit operates only if there is a path of functional devices from left to right. The probability that each device functions is shown on the graph. Assuming that devices fail independently, what is the probability that the circuit operates?
[Figure: two devices in parallel between a and b, each functioning with probability 0.95]
Let T and B denote the events that the top and bottom devices operate, respectively. There is a path if at least one device operates. The probability that the circuit operates is
P(T or B) = 1 - P[(T or B)'] = 1 - P(T' and B')
A simple formula for the solution can be derived from the complements T' and B'. From the independence assumption,
P(T' and B') = P(T') P(B') = (1 - 0.95)^2 = 0.05^2
P(T or B) = 1 - 0.05^2 = 0.9975
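The same calculation, written as a short sketch that also handles more than two parallel devices. The function name is illustrative, and independence of failures is assumed, as on the slide.

```python
def parallel_reliability(probabilities):
    """Probability that at least one of several independent devices works."""
    all_fail = 1.0
    for p in probabilities:
        all_fail *= (1.0 - p)   # P(T' and B') = P(T') P(B') under independence
    return 1.0 - all_fail

p_circuit = parallel_reliability([0.95, 0.95])
print(round(p_circuit, 4))
```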
Probability (D|F)
P(D|F) = P(DF)/P(F)
[Figure: Venn diagram with regions P(D) and P(F) overlapping in P(DF)]
Random Variables (Numeric)
Experiment | Outcome | Random Variable | Range of Random Variable
Stock 50 Xmas trees | Number of trees sold | X = number of trees sold | 0, 1, 2, …, 50
Inspect 600 items | Number acceptable | Y = number acceptable | 0, 1, 2, …, 600
Send out 5,000 sales letters | Number of people responding | Z = number of people responding | 0, 1, 2, …, 5,000
Build an apartment building | % completed after 4 months | R = % completed after 4 months | 0 ≤ R ≤ 100
Test the lifetime of a light bulb (minutes) | Time bulb lasts, up to 80,000 minutes | S = time bulb burns | 0 ≤ S ≤ 80,000
105 221 183 186 121 181 180 143
97 154 153 174 120 168 167 141
245 228 174 199 181 158 176 110
163 131 154 115 160 208 158 133
207 180 190 193 194 133 156 123
134 178 76 167 184 135 229 146
218 157 101 171 165 172 158 169
199 151 142 163 145 171 148 158
160 175 149 87 160 237 150 135
196 201 200 176 150 170 118 149
Compressive strength (in psi) of 80 aluminum-lithium alloy specimens
Frequency Distributions and Histograms
[Figure: Histogram of compressive strength for 80 aluminum-lithium alloy specimens. x-axis: compressive strength (psi), 70 to 250; y-axes: frequency, 0 to 25, and relative frequency, 0 to 0.3125]
438 450 487 451 452 441 444 461 432 471
413 450 430 437 465 444 471 453 431 458
444 450 446 444 466 458 471 452 455 445
468 459 450 453 473 454 458 438 447 463
445 466 456 434 471 437 459 445 454 423
472 470 433 454 464 443 449 435 435 451
474 457 455 448 478 465 462 454 425 440
454 441 459 435 446 435 460 428 449 442
455 450 423 432 459 444 445 454 449 441
449 445 455 441 464 457 437 434 452 439
Histograms: useful for large data sets
Group values of the variable into bins, then count the number of observations that fall into each bin.
Plot frequency (or relative frequency) versus the values of the variable.
[Figure: Minitab histogram for the metal layer thickness data in table. x-axis: metal thickness, 405 to 495; y-axis: frequency, 0 to 30]
Histogram Example
Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class midpoints: 5, 15, 25, 35, 45, 55, 65
Frequency: 0, 3, 6, 5, 4, 2, 0
No gaps between bars, since the data are continuous.
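The class frequencies for this example can be reproduced with a short sketch. It assumes classes of width 10 starting at 10, each including its lower limit and excluding its upper limit; the function name is illustrative.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def class_frequencies(values, start, width, n_classes):
    """Count observations in classes [start, start + width), and so on."""
    counts = [0] * n_classes
    for v in values:
        idx = (v - start) // width
        if 0 <= idx < n_classes:
            counts[idx] += 1
    return counts

counts = class_frequencies(data, start=10, width=10, n_classes=5)
midpoints = [10 + 10 * i + 5 for i in range(5)]
for m, c in zip(midpoints, counts):
    print(m, c)
```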
How Many Class Intervals?
• Many (narrow class intervals)
  • may yield a very jagged distribution with gaps from empty classes
  • can give a poor indication of how frequency varies across classes
• Few (wide class intervals)
  • may compress variation too much and yield a blocky distribution
  • can obscure important patterns of variation
Calculation of Grouped Mean
Class Interval | Frequency (f) | Class Midpoint (M) | fM
20-under 30 | 6 | 25 | 150
30-under 40 | 18 | 35 | 630
40-under 50 | 11 | 45 | 495
50-under 60 | 11 | 55 | 605
60-under 70 | 3 | 65 | 195
70-under 80 | 1 | 75 | 75
Total | 50 | | 2150
Mean = ΣfM / Σf = 2150 / 50 = 43.0
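The grouped-mean arithmetic can be checked with a few lines. The tuple layout (lower limit, upper limit, frequency) and the function name are choices made for this sketch.

```python
# (lower limit, upper limit, frequency) for each class on the slide
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

def grouped_mean(classes):
    """Mean of grouped data: sum of f * midpoint divided by sum of f."""
    total_f = sum(f for _, _, f in classes)
    total_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fm / total_f

mean = grouped_mean(classes)
print(mean)
```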
Mode of Grouped Data
• Midpoint of the modal class
• Modal class has the greatest frequency
Class Interval | Frequency
20-under 30 | 6
30-under 40 | 18
40-under 50 | 11
50-under 60 | 11
60-under 70 | 3
70-under 80 | 1
Mode = (30 + 40) / 2 = 35
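A minimal sketch of the grouped mode, assuming the class frequencies 6, 18, 11, 11, 3 and 1 (the same frequencies as in the grouped-mean table), so that 30-under 40 is the modal class.

```python
# (lower limit, upper limit, frequency); frequencies are assumed as stated above
mode_classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
                (50, 60, 11), (60, 70, 3), (70, 80, 1)]

def grouped_mode(classes):
    """Midpoint of the class with the greatest frequency."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo + hi) / 2

mode = grouped_mode(mode_classes)
print(mode)
```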
6 1 5 7 8 6 0 2 4 2
5 2 4 4 1 4 1 7 2 3
4 3 3 3 6 3 2 3 4 5
5 2 3 4 4 4 2 3 5 7
5 4 5 5 4 5 3 3 3 12
Frequency Distribution: Discrete Data
• Discrete data: possible values are countable
Example: An advertiser asks 200 customers how many days per week they read the daily newspaper.
Number of days read | Frequency
0 | 44
1 | 24
2 | 18
3 | 16
4 | 20
5 | 22
6 | 26
7 | 30
Total | 200
Relative Frequency
Relative frequency: what proportion is in each category.
22% of the people in the sample report that they read the newspaper 0 days per week.
Number of days read | Frequency | Relative frequency
0 | 44 | 0.22
1 | 24 | 0.12
2 | 18 | 0.09
3 | 16 | 0.08
4 | 20 | 0.10
5 | 22 | 0.11
6 | 26 | 0.13
7 | 30 | 0.15
Total | 200 | 1.00
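The relative frequencies follow directly from the counts, as this short sketch shows. The variable names are illustrative; the counts are the ones from the newspaper example.

```python
# Number of days read -> number of customers
frequency = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

total = sum(frequency.values())
# Relative frequency = class frequency divided by the total count
relative = {days: count / total for days, count in frequency.items()}

for days, share in relative.items():
    print(days, frequency[days], f"{share:.2f}")
```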
Relative Frequency Plot and Probability Distributions
Histogram approximates a probability density function.
[Figure: axes x and f(x)]
Interpretations of Probability
Relative frequency of corrupted pulses sent over a communications channel.
Relative frequency of corrupted pulse = 2/10
[Figure: voltage versus time, with corrupted pulses marked]
Interpretations of Probability
P(E) = 30(0.01) = 0.30
The probability of the event E is the sum of the probabilities of the outcomes in E.
[Figure: sample space S of diodes, with event E as a subset]
Random Variables
Question Random Variable x Type
Familysize
x = Number of dependents in family reported on tax return
Discrete
Distance fromhome to store
x = Distance in miles fromhome to the store site
Continuous
Own dogor cat
x = 1 if own no pet;= 2 if own dog(s) only;= 3 if own cat(s) only;= 4 if own dog(s) and cat(s)
Discrete
Using past data on TV sales, … a tabular representation of the probability distribution for TV sales was developed.
Units sold | Number of days | x | f(x)
0 | 80 | 0 | .40
1 | 50 | 1 | .25
2 | 40 | 2 | .20
3 | 10 | 3 | .05
4 | 20 | 4 | .10
Total | 200 | | 1.00
Graphical Representation of the Probability Distribution
[Figure: bar chart. x-axis: values of random variable x (TV sales), 0 to 4; y-axis: probability, .10 to .50]
• Uniform Probability Distribution
• Normal Probability Distribution
• Exponential Probability Distribution
[Figure: three panels (Uniform, Normal, Exponential), each plotting f(x) against x]
Probability distributions: (a) discrete case, (b) continuous case
[Figure (a): probabilities p(x1), ..., p(x5) at the points x1, ..., x5; sometimes called a probability mass function]
[Figure (b): curve f(x) over the interval from a to b; sometimes called a probability density function]
Throwing a Die
[Figure: distribution of X; P(X) = 1/6 for each of X = 1, 2, 3, 4, 5, 6]
Example 2a
Rolling two dice results in a total of five spots showing. There are a total of 36 possible outcomes.
Outcome of roll = 5
Die 1 | Die 2 | Probability
1 | 4 | 1/36
2 | 3 | 1/36
3 | 2 | 1/36
4 | 1 | 1/36
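Enumerating all 36 outcomes confirms the four ways to roll a total of five. This is a small sketch assuming two fair six-sided dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))        # all (die 1, die 2) pairs
favorable = [roll for roll in rolls if sum(roll) == 5]

# Each outcome has probability 1/36, so the probabilities simply add up.
p_five = Fraction(len(favorable), len(rolls))
print(favorable, p_five)
```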
All Samples of subgroup size 2 from a Population
Sample | X̄ | Sample | X̄ | Sample | X̄
1,1 | 1 | 3,1 | 2 | 5,1 | 3
1,2 | 1.5 | 3,2 | 2.5 | 5,2 | 3.5
1,3 | 2 | 3,3 | 3 | 5,3 | 4
1,4 | 2.5 | 3,4 | 3.5 | 5,4 | 4.5
1,5 | 3 | 3,5 | 4 | 5,5 | 5
1,6 | 3.5 | 3,6 | 4.5 | 5,6 | 5.5
2,1 | 1.5 | 4,1 | 2.5 | 6,1 | 3.5
2,2 | 2 | 4,2 | 3 | 6,2 | 4
2,3 | 2.5 | 4,3 | 3.5 | 6,3 | 4.5
2,4 | 3 | 4,4 | 4 | 6,4 | 5
2,5 | 3.5 | 4,5 | 4.5 | 6,5 | 5.5
2,6 | 4 | 4,6 | 5 | 6,6 | 6
Sampling Distribution of X̄
X̄ | P(X̄)
1 | 1/36
1.5 | 2/36
2 | 3/36
2.5 | 4/36
3 | 5/36
3.5 | 6/36
4 | 5/36
4.5 | 4/36
5 | 3/36
5.5 | 2/36
6 | 1/36
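The sampling distribution of the mean of two dice can be rebuilt by listing every sample of size 2 and counting how often each mean occurs. A minimal sketch, assuming a population of the values 1 to 6 sampled with replacement:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

samples = list(product(range(1, 7), repeat=2))            # 36 ordered samples
mean_counts = Counter(Fraction(a + b, 2) for a, b in samples)

# P(sample mean = m) is the number of samples with mean m, out of 36
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(mean_counts.items())}

for m, c in sorted(mean_counts.items()):
    print(float(m), f"{c}/36")
```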
[Figure (b): sampling distribution of X̄. x-axis: X̄, 1 to 6; y-axis: p(X̄), 2/36 to 6/36]
[Figure: Sampling Distribution of X̄ for n = 5. x-axis: X̄ from 1 to 6, with 3.5 marked; y-axis: 0 to .6]
Sampling Distributions of Means
[Figure: Distributions of average scores from throwing dice; three panels, each with x-axis 1 to 6]
Central Limit Theorem
As sample size gets large enough, the sampling distribution of X̄ becomes almost normal regardless of the shape of the population.
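A small simulation can illustrate the theorem with dice averages. This is a sketch, not a proof: the sample sizes (1, 2, 5, 30), the 5,000 repetitions and the seed are arbitrary choices.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Average of n fair-die throws, repeated reps times."""
    return [statistics.fmean(rng.randint(1, 6) for _ in range(n))
            for _ in range(reps)]

rng = random.Random(0)
results = {n: sample_means(n, 5000, rng) for n in (1, 2, 5, 30)}

for n, means in results.items():
    # The centre stays near 3.5 while the spread shrinks roughly as 1/sqrt(n).
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Plotting a histogram of `results[30]` would show the bell shape; the printed spreads already show the distribution tightening around 3.5.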
Probability Distributions
Outcome | X | Number responding | p(X)
SA | 5 | 10 | 0.1
A | 4 | 20 | 0.2
N | 3 | 30 | 0.3
D | 2 | 40 | 0.3
SD | 1 | 50 | 0.1
[Figure: bar chart of p(X) for X = 1 to 5; y-axis: 0 to 0.35]
X | P(X=x)
0 | 0.205891
1 | 0.343152
2 | 0.266896
3 | 0.128505
4 | 0.042835
5 | 0.010471
6 | 0.001939
7 | 0.000277
8 | 0.000031
9 | 0.000003
10 | 0
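The slide does not say which distribution produced these values. As an inference only, they appear consistent with a binomial distribution with n = 15 trials and success probability p = 0.1; the sketch below checks that assumption against the table.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# n = 15 and p = 0.1 are assumed, not stated on the slide.
pmf_table = {k: round(binomial_pmf(k, 15, 0.1), 6) for k in range(11)}
for k, value in pmf_table.items():
    print(k, f"{value:.6f}")
```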
Probability Distributions and Probability Density Functions
[Figure: Density function of a loading on a long, thin beam. Axes: x, loading]
Probability Distributions and Probability Density Functions
Probability determined from the area under f(x).
[Figure: P(a < X < b) shown as the area under f(x) between a and b]
Continuous Uniform Random Variable
[Figure: Continuous uniform probability density function; f(x) has constant height 1/(b - a) between a and b]
Uniform Distribution or rectangular probability distribution
f(x) = 1/(b - a), where a ≤ x ≤ b
area = width × height = (b - a) × 1/(b - a) = 1
[Figure: rectangle of height 1/(b - a) between a and b]
Example
The amount of gasoline sold daily at a service station is uniformly distributed with a minimum of 2,000 gallons and a maximum of 5,000 gallons.
f(x) = 1/(5,000 - 2,000) for 2,000 ≤ x ≤ 5,000
Find the probability that daily sales will fall between 2,500 and 3,000 gallons.
Algebraically: what is P(2,500 ≤ X ≤ 3,000)?
Example
P(2,500 ≤ X ≤ 3,000) = (3,000 - 2,500) × 1/3,000 = 0.1667
“There is about a 17% chance that between 2,500 and 3,000 gallons of gas will be sold on a given day.”
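The same area calculation as a short sketch. The function name is illustrative; the interval is clipped to [a, b] so that values outside the distribution's range contribute nothing.

```python
def uniform_probability(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: width times height."""
    lo, hi = max(lo, a), min(hi, b)      # keep only the part inside [a, b]
    width = max(0.0, hi - lo)
    height = 1.0 / (b - a)
    return width * height

p_sales = uniform_probability(2500, 3000, a=2000, b=5000)
print(round(p_sales, 4))
```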
Cumulative Distribution Functions
[Figure: cumulative distribution function F(x) versus x; axis marks 0, 12.5 and 1]
Probability Distributions and Probability Density Functions
[Figure: Probability density function f(x) versus x; x-axis marks 12.5 and 12.6]
Normal Probability Distribution: Characteristics
The distribution is symmetric and bell-shaped.
Normal Probability Distribution: Characteristics
Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right).
The mean of a distribution
The mean is not necessarily the 50th percentile of the distribution (that’s the median).
The mean is not necessarily the most likely value of the random variable (that’s the mode).
[Figure: a distribution with µ, median and mode marked]
[Figure: Two probability distributions with different means, µ = 10 and µ = 20]
[Figure: Two probability distributions with same mean (µ = 10) but different standard deviations, σ = 2 and σ = 4]
The normal distribution: areas under the normal distribution
µ ± 1σ: 68.26%
µ ± 2σ: 95.46%
µ ± 3σ: 99.73%
[Figure: normal curve f(x) with mean µ and variance σ², x-axis marked from µ - 3σ to µ + 3σ]
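These areas can be computed from the error function. The sketch below uses the identity P(|X - µ| < kσ) = erf(k/√2); its results differ from the slide's 68.26% and 95.46% only in the last digit, which reflects rounding in printed tables.

```python
from math import erf, sqrt

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} sigma: {area:.2%}")
```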
Normal Distribution
Standardizing a normal random variable.
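Standardizing is usually written z = (x - µ)/σ. The sketch below applies it to hypothetical values (x = 13, µ = 10, σ = 2, none of them from the slide) and checks that the probability is the same before and after standardizing.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert x from a normal(mu, sigma) scale to the standard normal scale."""
    return (x - mu) / sigma

# Hypothetical values chosen for illustration.
z = standardize(13, mu=10, sigma=2)
p_direct = NormalDist(mu=10, sigma=2).cdf(13)   # P(X < 13) on the original scale
p_standard = NormalDist().cdf(z)                # P(Z < z) on the standard scale
print(z, round(p_direct, 4), round(p_standard, 4))
```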
Standard Normal Distribution
A normal distribution whose mean is zero and standard deviation is one is called the standard normal distribution (µ = 0, σ = 1).
$f(x) = \frac{1}{1 \cdot \sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - 0}{1}\right)^{2}}, \quad -\infty < x < \infty$
As we shall see shortly, any normal distribution can be converted to a standard normal distribution with simple algebra. This makes calculations much easier.
Normal Distribution: Example
[Figure: standard normal curve with z = 0 and z = 1.5 marked]
Areas under Standardized Normal Distribution
Using Excel to Compute Standard Normal Probabilities
Formula Worksheet
  | A | B
1 | Probabilities: standard normal distribution |
2 | P (z < 1.00) | =NORMSDIST(1)
3 | P (0.00 < z < 1.00) | =NORMSDIST(1)-NORMSDIST(0)
4 | P (0.00 < z < 1.25) | =NORMSDIST(1.25)-NORMSDIST(0)
5 | P (-1.00 < z < 1.00) | =NORMSDIST(1)-NORMSDIST(-1)
6 | P (z > 1.58) | =1-NORMSDIST(1.58)
7 | P (z < -0.50) | =NORMSDIST(-0.5)
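For readers without Excel, the same probabilities can be sketched in Python, where `statistics.NormalDist().cdf` plays the role of NORMSDIST. The dictionary keys are labels chosen for this example.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

probabilities = {
    "P(z < 1.00)": Z.cdf(1),
    "P(0.00 < z < 1.00)": Z.cdf(1) - Z.cdf(0),
    "P(0.00 < z < 1.25)": Z.cdf(1.25) - Z.cdf(0),
    "P(-1.00 < z < 1.00)": Z.cdf(1) - Z.cdf(-1),
    "P(z > 1.58)": 1 - Z.cdf(1.58),
    "P(z < -0.50)": Z.cdf(-0.5),
}
for label, p in probabilities.items():
    print(f"{label} = {p:.4f}")
```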