Topic 5 - Joint distributions and the CLT
• Joint distributions - pages 145 - 156
• Central Limit Theorem - pages 183 - 185
Joint distributions
• Often, we are interested in more than one random variable at a time.
• For example, what is the probability that a car will have at least one engine problem and at least one blowout during the same week?
• X = # of engine problems in a week
• Y = # of blowouts in a week
• P(X ≥ 1, Y ≥ 1) is what we are looking for
• To understand these sorts of probabilities, we need to develop joint distributions.
Discrete distributions
• A discrete joint probability mass function is given by
f(x,y) = P(X = x, Y = y), where
1. $f(x,y) \ge 0$ for all $x, y$
2. $\sum_{\text{all } (x,y)} f(x,y) = 1$
3. $P((X,Y) \in A) = \sum_{\text{all } (x,y) \in A} f(x,y)$
4. $E(h(X,Y)) = \sum_{\text{all } (x,y)} h(x,y)\, f(x,y)$
Return to the car example
• Consider the following joint pmf for X and Y (rows are values of X, columns are values of Y):
X\Y   0     1     2     3     4
0     1/2   1/16  1/32  1/32  1/32
1     1/16  1/32  1/32  1/32  1/32
2     1/32  1/32  1/32  1/32  1/32
• P(X ≥ 1, Y ≥ 1) =
• P(X ≥ 1) =
• E(X + Y) =
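The three blanks for the car example can be checked by applying properties 3 and 4 directly to the table. The following is a minimal sketch, not part of the original slides; the helper names `prob` and `expect` are made up for illustration, and exact fractions are used so no rounding is involved.

```python
from fractions import Fraction as F

# Joint pmf from the table: rows are x = 0, 1, 2; columns are y = 0, ..., 4.
table = [
    [F(1, 2),  F(1, 16), F(1, 32), F(1, 32), F(1, 32)],
    [F(1, 16), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
    [F(1, 32), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
]
pmf = {(x, y): p for x, row in enumerate(table) for y, p in enumerate(row)}

def prob(event):
    """Property 3: add f(x, y) over the pairs (x, y) in the event."""
    return sum(p for (x, y), p in pmf.items() if event(x, y))

def expect(h):
    """Property 4: add h(x, y) * f(x, y) over all pairs."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())

total = sum(pmf.values())                      # property 2: should equal 1
p_both = prob(lambda x, y: x >= 1 and y >= 1)  # P(X >= 1, Y >= 1)
p_engine = prob(lambda x, y: x >= 1)           # P(X >= 1)
e_sum = expect(lambda x, y: x + y)             # E(X + Y)

print(total, p_both, p_engine, e_sum)  # 1 1/4 11/32 47/32
```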
Joint to marginals
• The probability mass functions for X and Y individually (called marginals) are given by
$f_X(x) = \sum_{\text{all } y} f(x,y), \qquad f_Y(y) = \sum_{\text{all } x} f(x,y)$
• Returning to the car example:
fX(x) =
fY(y) =
E(X) =
E(Y) =
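For the car example, the marginals are just the row sums and column sums of the joint table. A small sketch of that calculation follows, assuming the table is laid out with X along the rows and Y along the columns; the variable names are illustrative only.

```python
from fractions import Fraction as F

# Joint pmf for the car example: rows are x = 0, 1, 2; columns are y = 0, ..., 4.
table = [
    [F(1, 2),  F(1, 16), F(1, 32), F(1, 32), F(1, 32)],
    [F(1, 16), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
    [F(1, 32), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
]

# f_X(x): sum over all y (row sums); f_Y(y): sum over all x (column sums).
f_X = [sum(row) for row in table]
f_Y = [sum(col) for col in zip(*table)]

# Expected values computed from the marginals.
e_X = sum(x * p for x, p in enumerate(f_X))
e_Y = sum(y * p for y, p in enumerate(f_Y))

print([str(p) for p in f_X])  # ['21/32', '3/16', '5/32']
print([str(p) for p in f_Y])  # ['19/32', '1/8', '3/32', '3/32', '3/32']
print(e_X, e_Y)               # 1/2 31/32
```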
Continuous distributions
• A joint probability density function for two continuous random variables, (X,Y), has the following four properties:
1. $f(x,y) \ge 0$ for all $x, y$
2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, dx\, dy = 1$
3. $P((X,Y) \in A) = \iint_{A} f(x,y)\, dx\, dy$
4. $E(h(X,Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x,y)\, f(x,y)\, dx\, dy$
Continuous example
• Consider the following joint pdf:
$f(x,y) = \dfrac{x(1 + 3y^2)}{4}, \quad 0 < x < 2,\ 0 < y < 1$
• Show condition 2 holds on your own.
• Show P(0 < X < 1, ¼ < Y < ½) = 23/512
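Both exercises can be checked exactly, taking the example pdf to be f(x,y) = x(1 + 3y²)/4 on 0 < x < 2, 0 < y < 1. Because this integrand is a product of a function of x and a function of y, the double integral over a rectangle is the product of two one-dimensional integrals. The sketch below uses hand-derived antiderivatives; the function names are hypothetical.

```python
from fractions import Fraction as F

def x_part(a, b):
    """Integral of x/4 over [a, b], using the antiderivative x**2 / 8."""
    a, b = F(a), F(b)
    return (b ** 2 - a ** 2) / 8

def y_part(c, d):
    """Integral of 1 + 3*y**2 over [c, d], using the antiderivative y + y**3."""
    c, d = F(c), F(d)
    return (d + d ** 3) - (c + c ** 3)

def rect_prob(a, b, c, d):
    """P(a < X < b, c < Y < d) for a rectangle inside the support."""
    return x_part(a, b) * y_part(c, d)

total = rect_prob(0, 2, 0, 1)                 # condition 2
p_rect = rect_prob(0, 1, F(1, 4), F(1, 2))    # P(0 < X < 1, 1/4 < Y < 1/2)

print(total, p_rect)  # 1 23/512
```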
Joint to marginals
• The marginal pdfs for X and Y can be found by
$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\, dx$
• For the previous example, find fX(x) and fY(y).
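One way to work the exercise, taking the example pdf to be f(x,y) = x(1 + 3y²)/4 on 0 < x < 2, 0 < y < 1, is to integrate out one variable at a time over the support. This is a worked sketch rather than the slides' own solution:

```latex
\begin{align*}
f_X(x) &= \int_0^1 \frac{x(1+3y^2)}{4}\,dy
        = \frac{x}{4}\Big[\,y + y^3\Big]_0^1
        = \frac{x}{2}, \qquad 0 < x < 2, \\
f_Y(y) &= \int_0^2 \frac{x(1+3y^2)}{4}\,dx
        = \frac{1+3y^2}{4}\Big[\,\frac{x^2}{2}\Big]_0^2
        = \frac{1+3y^2}{2}, \qquad 0 < y < 1.
\end{align*}
```

Each marginal integrates to 1 over its own range, which is a quick sanity check on the algebra.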
Independence of X and Y
• The random variables X and Y are independent if f(x,y) = fX(x) fY(y) for all pairs (x,y).
• For the discrete clunker car example, are X and Y independent?
• For the continuous example, are X and Y independent?
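For the discrete car example, the definition can be tested cell by cell: compute the marginals, then compare f(x,y) with fX(x)·fY(y) for every pair. A minimal sketch, with `is_independent` as an illustrative helper name:

```python
from fractions import Fraction as F

# Joint pmf for the car example: rows are x = 0, 1, 2; columns are y = 0, ..., 4.
table = [
    [F(1, 2),  F(1, 16), F(1, 32), F(1, 32), F(1, 32)],
    [F(1, 16), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
    [F(1, 32), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
]
f_X = [sum(row) for row in table]        # marginal of X (row sums)
f_Y = [sum(col) for col in zip(*table)]  # marginal of Y (column sums)

def is_independent(table, f_X, f_Y):
    """True only if f(x, y) == f_X(x) * f_Y(y) for every pair (x, y)."""
    return all(
        table[x][y] == f_X[x] * f_Y[y]
        for x in range(len(f_X))
        for y in range(len(f_Y))
    )

independent = is_independent(table, f_X, f_Y)
print(independent)                    # False
print(table[0][0], f_X[0] * f_Y[0])   # 1/2 399/1024
```

A single failing pair, such as (0, 0) here, is enough to rule out independence. For the continuous example, if the pdf is x(1 + 3y²)/4, it factors as (x/2)·((1 + 3y²)/2), which is the product of the two marginals, so X and Y are independent in that case.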
Sampling distributions
• We assume that each data value we collect represents a random selection from a common population distribution.
• The collection of these independent random variables is called a random sample from the distribution.
• A statistic is a function of these random variables that is used to estimate some characteristic of the population distribution.
• The distribution of a statistic is called a sampling distribution.
• The sampling distribution is a key component to making inferences about the population.
StatCrunch example
• StatCrunch subscriptions are sold for 6 months ($5) or 12 months ($8).
• From past data, I can tell you that roughly 80% of subscriptions are $5 and 20% are $8.
• Let X represent the amount in $ of a purchase.
• E(X) =
• Var(X) =
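The mean and variance of a single purchase follow from the two-point distribution P(X = 5) = 0.8, P(X = 8) = 0.2. A short sketch of the arithmetic, using Var(X) = E(X²) − [E(X)]² and exact fractions (the variable names are illustrative):

```python
from fractions import Fraction as F

# pmf of one purchase amount X, in dollars.
pmf = {5: F(4, 5), 8: F(1, 5)}

e_x = sum(x * p for x, p in pmf.items())        # E(X)
e_x2 = sum(x * x * p for x, p in pmf.items())   # E(X^2)
var_x = e_x2 - e_x ** 2                         # Var(X) = E(X^2) - E(X)^2

print(e_x, e_x2, var_x)           # 28/5 164/5 36/25
print(float(e_x), float(var_x))   # 5.6 1.44
```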
StatCrunch example continued
• Now consider the amounts of a random sample of two purchases, X1, X2.
• A natural statistic of interest is X1 + X2, the total amount of the purchases.
[Table to fill in: Outcomes | X1 + X2 | Probability]
[Table to fill in: X1 + X2 | Probability]
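The two tables can be filled in by listing every ordered pair of purchases, multiplying the individual probabilities (the two purchases are independent because they form a random sample), and then collecting outcomes that give the same total. A sketch under those assumptions:

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

# pmf of one purchase amount, in dollars.
pmf = {5: F(4, 5), 8: F(1, 5)}

# First table: each outcome (x1, x2), its total, and its probability.
outcomes = [
    ((x1, x2), x1 + x2, pmf[x1] * pmf[x2])
    for x1, x2 in product(pmf, repeat=2)
]

# Second table: sampling distribution of X1 + X2 (outcomes with equal totals merged).
dist = defaultdict(F)  # F() is the fraction 0
for _, total, p in outcomes:
    dist[total] += p

for total in sorted(dist):
    print(total, dist[total])
# 10 16/25
# 13 8/25
# 16 1/25
```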
StatCrunch example continued
• E(X1 + X2) =
• E([X1 + X2]²) =
• Var(X1 + X2) =
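These three quantities can be computed from the sampling distribution of X1 + X2 and compared with the single-purchase values. The sketch below rebuilds the distribution by enumeration and checks that the mean and variance of the total are twice those of one purchase, as expected for two independent draws; `moments` is an illustrative helper.

```python
from fractions import Fraction as F
from itertools import product

pmf = {5: F(4, 5), 8: F(1, 5)}  # one purchase amount, in dollars

def moments(dist):
    """Return (E[V], E[V^2], Var[V]) for a pmf given as {value: probability}."""
    m1 = sum(v * p for v, p in dist.items())
    m2 = sum(v * v * p for v, p in dist.items())
    return m1, m2, m2 - m1 ** 2

# Sampling distribution of the total of two independent purchases.
total_dist = {}
for x1, x2 in product(pmf, repeat=2):
    total_dist[x1 + x2] = total_dist.get(x1 + x2, 0) + pmf[x1] * pmf[x2]

e_x, _, var_x = moments(pmf)
e_t, e_t2, var_t = moments(total_dist)

print(e_t, e_t2, var_t)                    # 56/5 3208/25 72/25
print(e_t == 2 * e_x, var_t == 2 * var_x)  # True True
```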
StatCrunch example continued
• If I have n purchases in a day, what is
– my expected earnings?
– the variance of my earnings?
– the shape of my earnings distribution for large n?
• Let’s experiment by simulating 1000 days with 100 purchases per day.
• StatCrunch
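The same experiment can be approximated outside StatCrunch. The sketch below is a stand-in for the in-class demo, not a reproduction of it: it simulates 1000 days of 100 purchases each with Python's standard `random` module and a fixed seed chosen arbitrarily. With E(X) = 5.6 and Var(X) = 1.44 for one purchase, the daily total should have mean near 100 × 5.6 = 560 and variance near 100 × 1.44 = 144.

```python
import random
from statistics import mean, variance

PRICES = [5, 8]       # dollars
WEIGHTS = [0.8, 0.2]  # rough purchase shares from the example

def simulate_day_totals(days=1000, purchases=100, seed=0):
    """Return one total per simulated day; the seed is an arbitrary choice."""
    rng = random.Random(seed)
    return [
        sum(rng.choices(PRICES, weights=WEIGHTS, k=purchases))
        for _ in range(days)
    ]

totals = simulate_day_totals()
sim_mean = mean(totals)
sim_var = variance(totals)

# Share of days within two standard deviations (2 * 12 = 24 dollars) of 560;
# a roughly bell-shaped distribution puts about 95% of days in this range.
share_within_2sd = sum(abs(t - 560) <= 24 for t in totals) / len(totals)

print(sim_mean, sim_var, share_within_2sd)
```

Plotting a histogram of `totals` would show the bell shape that motivates the next slide.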
Central Limit Theorem
• We have just illustrated one of the most important theorems in statistics.
• As the sample size, n, becomes large, the distribution of the sum of a random sample from a distribution with mean μ and variance σ² becomes approximately Normal with mean nμ and variance nσ².
• As a rule of thumb, a sample size of at least 30 is typically required to use the CLT.
• The amazing part of this theorem is that it holds regardless of the form of the underlying distribution, as long as its mean and variance are finite.
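For reference, the theorem is often written in standardized form. The notation S_n for the sum is introduced here for convenience and does not appear on the slides:

```latex
\[
S_n = X_1 + X_2 + \cdots + X_n, \qquad
\frac{S_n - n\mu}{\sigma\sqrt{n}} \;\xrightarrow{\;d\;}\; N(0,1)
\quad \text{as } n \to \infty,
\]
\[
\text{so for large } n: \quad
P(S_n \le s) \approx \Phi\!\left(\frac{s - n\mu}{\sigma\sqrt{n}}\right),
\]
```

where Φ is the standard Normal cdf and X_1, ..., X_n are independent draws from a distribution with mean μ and finite variance σ².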
Airplane example
• Suppose the weight of an airline passenger has a mean of 150 lbs. and a standard deviation of 25 lbs. What is the probability the combined weight of 100 passengers will exceed the maximum allowable weight of 15,500 lbs?
• How many passengers should be allowed on the plane if we want this probability to be at most 0.01?
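Both questions can be approached with the CLT approximation: the combined weight of n passengers is treated as Normal with mean 150n and standard deviation 25√n. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `prob_overweight` and the search range are choices made for this illustration, and the answers are only as good as the Normal approximation.

```python
from math import sqrt
from statistics import NormalDist

MEAN_LB = 150.0     # mean weight of one passenger
SD_LB = 25.0        # standard deviation of one passenger
LIMIT_LB = 15500.0  # maximum allowable combined weight

def prob_overweight(n):
    """Approximate P(total weight of n passengers > limit) via the CLT."""
    total = NormalDist(mu=n * MEAN_LB, sigma=SD_LB * sqrt(n))
    return 1.0 - total.cdf(LIMIT_LB)

# Question 1: 100 passengers. The limit is 2 standard deviations above the mean.
p_100 = prob_overweight(100)

# Question 2: largest n whose overweight probability is at most 0.01.
# The probability grows with n, so a simple scan over a generous range is enough.
max_n = max(n for n in range(1, 200) if prob_overweight(n) <= 0.01)

print(round(p_100, 4), max_n)  # 0.0228 99
```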
The sample mean
• For constant c, E(cY) = cE(Y) and Var(cY) = c²Var(Y)
• E(X̄) =
• Var(X̄) =
• The CLT says that for large samples, X̄ is approximately normal with a mean of μ and a variance of σ²/n.
• So, the variance of the sample mean decreases with n.
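The two blanks follow from the constant rules with c = 1/n applied to the sum of the sample. A short derivation sketch, assuming X_1, ..., X_n are independent with common mean μ and variance σ²:

```latex
\begin{align*}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{X}) &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
            = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}.
\end{align*}
```

The variance step uses independence, which lets the variance of the sum be written as the sum of the variances.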
Sampling applet