7.1 Discrete and Continuous Random Variables
teachers.sduhsd.net/bshay/combined ch7_8...


Upload: trinhdung

Post on 30-May-2018


A random variable is:

• We usually denote random variables by _____________________ such as X or Y

• When a random variable X describes a random phenomenon, the ___________________ S just lists the _________________ of the random variable.

• Example:

7.1 Discrete and Continuous Random Variables

• What is the probability distribution of the discrete random variable X that counts the number of heads in four tosses of a coin?

• Probability of tossing at least 2 heads?

• Probability of at least one head?

Let X = count of heads in 4 tosses

X

P(x)
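The coin-toss table can be checked with a short Python sketch. It assumes a fair coin (P(heads) = 0.5), which the slide does not state explicitly; the names `pmf`, `p_at_least_2`, and `p_at_least_1` are illustrative only.

```python
from fractions import Fraction
from math import comb

n = 4  # number of tosses

# Assumption: fair coin, so all 2^n toss sequences are equally likely
# and P(X = k) = C(n, k) / 2^n.
pmf = {k: Fraction(comb(n, k), 2**n) for k in range(n + 1)}

p_at_least_2 = sum(p for k, p in pmf.items() if k >= 2)
p_at_least_1 = 1 - pmf[0]  # complement of "no heads"

for k, p in pmf.items():
    print(f"P(X = {k}) = {p}")
print("P(X >= 2) =", p_at_least_2)
print("P(X >= 1) =", p_at_least_1)
```

Using exact fractions keeps the table in the same form a hand calculation would give (sixteenths).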

Discrete random variable

A discrete random variable X has:

The probability distribution of a discrete random variable X lists:

The probabilities must satisfy 2 requirements:

1)

2)

Example

• The instructor of a large class gives 15% each of A’s and D’s, 30% each of B’s and C’s, and 10% F’s. Choose a student at random from this class. The student’s grade on a 4-pt scale (A = 4) is a random variable X. Find the probability that the student got a B or better.

You!

• Construct the probability distribution for the number of boys in a three-child family. Find the following probabilities:

1) P(2 or more boys)

2) P(No boys)

3) P(1 or fewer boys)

In an article in the journal Developmental Psychology (March 1986), a probability distribution for the age X (in years) when male college students began to shave regularly is shown:

Here is the probability distribution for X in table form:

1) Is this a valid probability distribution? What is the random variable of interest? Is X discrete?

2) What is the most common age at which a randomly selected male college student begins shaving?

3) What is the probability that a randomly selected male college student begins shaving at 16? What is the probability that a randomly selected male college student begins shaving before 15?

X 11 12 13 14 15 16 17 18 19 20+

P(x) 0.013 0 0.027 0.067 0.213 0.267 0.240 0.093 0.067 0.013
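The three questions about the shaving table can be checked mechanically. The sketch below copies the table values as given; the variable names are illustrative, and ages are kept as strings because the last category is "20+".

```python
ages = ["11", "12", "13", "14", "15", "16", "17", "18", "19", "20+"]
probs = [0.013, 0.0, 0.027, 0.067, 0.213, 0.267, 0.240, 0.093, 0.067, 0.013]
dist = dict(zip(ages, probs))

# Validity: every probability is between 0 and 1, and they sum to 1
# (a small tolerance allows for floating-point rounding).
is_valid = all(0 <= p <= 1 for p in probs) and abs(sum(probs) - 1) < 1e-9

most_common_age = max(dist, key=dist.get)  # age with the largest probability
p_at_16 = dist["16"]
p_before_15 = sum(dist[a] for a in ("11", "12", "13", "14"))

print(is_valid, most_common_age, p_at_16, round(p_before_15, 3))
```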

Continuous Random Variables

Example: S = {all numbers x between 0 and 1 inclusive}

• The probability distribution of X assigns probabilities as ___________________________

• Any density curve has area exactly ___ underneath it (probability = ___).

A continuous random variable X:

Example

• A random number generator will spread its output uniformly across the entire interval from 0 to 9 as we allow it to generate a long sequence of numbers. The results of many trials are represented by the density curve of a ____________________.

• Find the probability that the generator produces a number X between 3 and 7

• Find the probability that the generator produces a number X less than or equal to 5 or greater than 8
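For a uniform density, a probability is just the length of the interval divided by the total length. A minimal sketch for the 0-to-9 generator follows; the helper `uniform_prob` is a hypothetical name, not a library function.

```python
def uniform_prob(lo, hi, a=0.0, b=9.0):
    """P(lo < X < hi) for X uniform on [a, b]: overlap length / total length."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)

p_3_to_7 = uniform_prob(3, 7)
# "X <= 5 or X > 8" is two disjoint intervals, so their probabilities add.
p_le5_or_gt8 = uniform_prob(0, 5) + uniform_prob(8, 9)

print(round(p_3_to_7, 4), round(p_le5_or_gt8, 4))
```

Whether an endpoint is included makes no difference here, since a continuous distribution assigns probability 0 to any single value.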

Special Note:

• All continuous probability distributions assign probability ____ to every _________________.

• The probability of x __.8 is the same as x __ .8

Example:

Find P(.79 < x < .81) =

Find P(.799 < x < .801) =

Find P(.7999 < x < .8001) =

Find P(x=.8) =

Normal Distributions as Probability Distributions

• Because any density curve describes an assignment of probabilities, ________________________________________.

• If X has the N(μ, σ) distribution, then $z = \frac{x - \mu}{\sigma}$ is a ______________________________ having the distribution N(0,1).

Example

• An opinion poll asks an SRS of 1500 adults what they consider to be the most serious problem affecting schools. Suppose that if we could ask all adults this question, 30% would say “drugs.”

• Assume your sample proportion follows a normal distribution: N(.3, .0118).

• Given: Mean = .3, and Standard dev. = .0118

• Find the probability that the poll result differs from the truth about the population by more than 2 percentage points.
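The poll question asks for the area in both tails more than 0.02 away from the mean. A sketch using Python's standard-library `statistics.NormalDist`, taking the N(.3, .0118) model from the slide as given:

```python
from statistics import NormalDist

p_hat = NormalDist(mu=0.3, sigma=0.0118)  # model for the sample proportion
margin = 0.02                             # 2 percentage points

# "Differs by more than 2 points" means below 0.28 or above 0.32.
p_differs = p_hat.cdf(0.3 - margin) + (1 - p_hat.cdf(0.3 + margin))

z = margin / p_hat.stdev                  # standardized distance, about 1.69
print(round(z, 2), round(p_differs, 4))
```

By symmetry the same answer comes from doubling one tail of the standard normal beyond z.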

1) The probabilities that a randomly selected customer purchases 1, 2, 3, 4, or 5 items at a convenience store are .32, .12, .23, .18, and .15, respectively.

a) Identify the random variable of interest. X = ____. Then construct a probability distribution (table), and draw a probability distribution histogram.

b) Find P(X>3.5)

c) Find P(1.0 <X<3.0)

d) Find P(X<5)

2) A certain probability density function is made up of two straight-line segments. The first segment begins at the origin and goes to the point (1,1). The second segment goes from (1,1) to the point (x, 1).

a) Sketch the distribution function, and determine what x has to be in order for this to be a legitimate density curve.

b) Find P(0<X<.5)

c) Find P(X=1)

d) Find P(0<X<1.25)

e) Circle the correct option: X is an example of a (discrete) (continuous) random variable.

Mean of Discrete Random Variable

• The mean of a set of observations is their __________________, whereas the mean of a random variable X is an ________________________________

• The mean of a random variable X is often called the __________________ of X, and describes the ____________ average outcome. It is a ___________________.

To find the mean (weighted average) of X:

Example 7.6, p. 483

X     1     2     3     4     5     6     7     8     9
Prob  1/9   1/9   1/9   1/9   1/9   1/9   1/9   1/9   1/9

V     1     2     3     4     5     6     7     8     9
Prob  .301  .176  .125  .097  .079  .067  .058  .051  .046

Mean of X (μx) =

Mean of V (μV) =
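The weighted-average calculation for both tables can be sketched in a few lines of Python. The helper name `discrete_mean` is hypothetical; the probabilities are copied from the two tables above.

```python
def discrete_mean(values, probs):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(x * p for x, p in zip(values, probs))

values = range(1, 10)
probs_x = [1 / 9] * 9  # every digit equally likely
probs_v = [.301, .176, .125, .097, .079, .067, .058, .051, .046]

mu_x = discrete_mean(values, probs_x)
mu_v = discrete_mean(values, probs_v)
print(round(mu_x, 3), round(mu_v, 3))
```

The mean of V is pulled toward the small digits because they carry most of the probability.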

Locating the Mean in a discrete distribution =

Example (Mean of a prob. distribution = ______________)

Two dice are rolled simultaneously. If both show a 6, then the player wins $20, otherwise the player loses the game. It costs $2.00 to play the game. What is the expected gain or loss of the game?

The standard deviation of X is:

Ex. 7.7

• Linda sells cars and motivates herself by using probability estimates of her sales. She estimates her car sales as follows:

• Find the mean and variance of X.

Cars sold 0 1 2 3

Prob. .3 .4 .2 .1
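A sketch of the mean and variance calculation for Linda's sales, using the definition of variance as the probability-weighted average of squared deviations from the mean (variable names are illustrative):

```python
from math import sqrt

cars = [0, 1, 2, 3]
probs = [0.3, 0.4, 0.2, 0.1]

mean = sum(x * p for x, p in zip(cars, probs))
# Variance: weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(cars, probs))
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 3))
```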

Define in your own words the Law of Large Numbers:

• Ex. 7.8: The distribution of the heights of women is close to normal with N(64.5, 2.5). The graph plots the values as we add women to our sample.

• Eventually the mean of the observations gets close to the __________________________and settles down at that value.

More on Law of Large #’s

The law says that the ___________________of many independent observations/decisions are __________________________. Insurance companies, grocery stores, and other industries can predict demand even though their many customers make independent decisions.

Examples:

How large is a large number?

We can’t write a rule for how many trials are needed to guarantee a mean outcome close to μ; this depends on the _______________ of the random outcomes. The more variable the outcomes, the more _____ are needed to ensure that the mean outcome is close to the distribution mean μX.

You: Emergency Evacuation

Time to Evacuate (nearest hr)   Probability
13                              0.04
14                              0.25
15                              0.40
16                              0.18
17                              0.10
18                              0.03

• A panel of meteorological and civil engineers studying emergency evacuation plans for Florida’s Gulf Coast in the event of a hurricane has estimated that it would take between 13 and 18 hours to evacuate people living in a low-lying land, with the probabilities shown here.

• Find the mean, variance, and standard deviation of the distribution.

Note: To find the _____ (or difference) of the means of 2 random variables, ______ the individual means of the random variables X and Y together.

If 2 random variables are ____________, the variance of the sum (or difference) of the 2 random variables is ______________________ of the 2 individual variances. DOES NOT WORK for standard deviations!

Rules of Means Rule 1: Rule 2:

Rules of Variances Rule 1: Rule 2:

Rules Example

The following data comes from a normally distributed population. Given that X={2, 9, 11, 22} and Y={5, 7, 15, 21}, illustrate the rules for means and variances.

Rule 1 for means: Find 3 + 2X and the new mean

Rule 2 for means: Find X + Y and mean of X + Y

Rule 1 for variances: Find 3 + 2X and the new variance
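The three illustrations can be checked numerically. This sketch makes two assumptions that the slide leaves open: each listed value is treated as equally likely (so population formulas apply), and X + Y is formed by pairing the values in the order listed.

```python
from statistics import mean, pvariance

X = [2, 9, 11, 22]
Y = [5, 7, 15, 21]
a, b = 3, 2  # the transformation 3 + 2X

transformed = [a + b * x for x in X]
sums = [x + y for x, y in zip(X, Y)]  # assumption: values paired in listed order

# Rule 1 for means: mean(a + bX) = a + b * mean(X)
print(mean(transformed), a + b * mean(X))
# Rule 2 for means: mean(X + Y) = mean(X) + mean(Y)
print(mean(sums), mean(X) + mean(Y))
# Rule 1 for variances: var(a + bX) = b^2 * var(X); adding a changes nothing
print(pvariance(transformed), b**2 * pvariance(X))
```

The variance-of-a-sum rule is not illustrated here because it additionally requires X and Y to be independent.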

Example #1

Gain Communication sells units to both the military and civilian markets. Next year’s sales depend on market conditions that are unpredictable. Given the military and civilian division estimates and the fact that Gain makes a profit of $2000 on each military unit sold and $3500 on each civilian unit sold, find:

a) The mean and the variance of the number X of communication units.

b) The best estimate of next year’s profit.

Military
Units sold:   1000   3000   5000   10,000
Probability:  .1     .3     .4     .2

Civilian
Units sold:   300   500   750
Probability:  .4    .5    .1
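Part (b) combines the two rules for means: the expected profit is 2000 times the mean military sales plus 3500 times the mean civilian sales. A sketch with illustrative names follows; it reads "best estimate" as the expected value, which is an interpretation rather than something the slide states.

```python
def discrete_mean(values, probs):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(x * p for x, p in zip(values, probs))

military_mean = discrete_mean([1000, 3000, 5000, 10_000], [.1, .3, .4, .2])
civilian_mean = discrete_mean([300, 500, 750], [.4, .5, .1])

# Rules for means: E(2000 X + 3500 Y) = 2000 E(X) + 3500 E(Y).
expected_profit = 2000 * military_mean + 3500 * civilian_mean

print(round(military_mean), round(civilian_mean), round(expected_profit))
```

No independence assumption is needed for this step, because the rule for the mean of a sum holds regardless.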

Example #2

The probabilities that a randomly selected customer purchases 1, 2, 3, 4, or 5 items at a convenience store are .32, .12, .23, .18, and .15, respectively.

a) Construct a probability distribution (table) for the data and verify that this is a legitimate probability distribution.

b) Calculate the mean of the random variable. Interpret this value in the context of this problem.

c) Find the standard deviation of X.

d) Suppose 2 customers (A and B) are selected at random. Find the mean and the standard deviation of the difference in the number of items purchased by A and by B. Show your work.

Example #3

Any linear combination of independent Normal random variables is also Normally distributed.

Suppose that the mean height of policemen is 70 inches with a standard deviation of 3 inches, and that the mean height of policewomen is 65 inches with a standard deviation of 2.5 inches. If heights of policemen and policewomen are Normally distributed, find the probability that a randomly selected policewoman is taller than a randomly selected policeman.
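One way to set this up is to look at the difference D = (woman's height) − (man's height), which is Normal when the two heights are independent. A sketch using the standard library follows; independence of the two selections is an assumption carried over from the statement about linear combinations.

```python
from math import sqrt
from statistics import NormalDist

mu_m, sd_m = 70, 3      # policemen
mu_w, sd_w = 65, 2.5    # policewomen

# D = W - M: means subtract, variances ADD (assumes independence).
mu_d = mu_w - mu_m
sd_d = sqrt(sd_w**2 + sd_m**2)
diff = NormalDist(mu=mu_d, sigma=sd_d)

p_woman_taller = 1 - diff.cdf(0)   # P(D > 0)
print(round(sd_d, 3), round(p_woman_taller, 4))
```

Note that the standard deviations themselves are never added or subtracted; only the variances combine.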

Example #4

Here’s a game: If a player rolls two dice and gets a sum of 2 or 12, he wins $20. If the person gets a 7, he wins $5. The cost to play the game is $3. Find the expected payout for the game.

Example #5

The random variable X takes the two values μ − σ and μ + σ, each with probability 0.5. Use the definition of mean and variance for discrete random variables to show that X has mean μ and standard deviation σ.

There are 4 runners on the New High School team. The team is planning to participate in a race in which each runner runs a mile. The team time is the sum of the individual times for the 4 runners. Assume that the individual times of the 4 runners are all independent of each other. The individual times, in minutes, of the runners in similar races are approximately normally distributed with the following means and standard deviations.

a) Runner 3 thinks that he can run a mile in less than 4.2 minutes in the next race. Is this likely to happen? Explain.

b) The distribution of possible team times is approximately normal. What are the mean and standard deviation of this distribution?

c) Suppose the team’s best time to date is 18.4 minutes. What is the probability that the team will beat its own best time in the next race?

            Mean   Std. Dev.
Runner 1    4.9    0.15
Runner 2    4.7    0.16
Runner 3    4.5    0.14
Runner 4    4.8    0.15
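The team-time questions follow the same pattern: means add, and (because the runners are independent) variances add. A sketch with the table values, using illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

means = [4.9, 4.7, 4.5, 4.8]
sds = [0.15, 0.16, 0.14, 0.15]

# (a) Runner 3 finishing under 4.2 minutes
p_runner3 = NormalDist(4.5, 0.14).cdf(4.2)

# (b) Team time = sum of four independent Normal times
team_mean = sum(means)
team_sd = sqrt(sum(s**2 for s in sds))   # add variances, then take the root

# (c) Probability the team beats its best time of 18.4 minutes
p_beat_best = NormalDist(team_mean, team_sd).cdf(18.4)

print(round(p_runner3, 4), round(team_mean, 2), round(team_sd, 4), round(p_beat_best, 4))
```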

• 2NIP: 2 outcomes: N: I: P:

• Examples:

The binomial setting: 1. 2. 3. 4.

EX 8.1

• If both parents carry genes for the O and A blood types, each child has probability = .25 of getting two O genes, resulting in blood type O. Assume independence.

• A “success” is a child with blood type O; X counts the number of O blood types among 5 children of these parents in 5 independent observations.

• Is this a binomial distribution?

The binomial distribution is:

Ex 8.2

• You are dealt 10 cards from a shuffled deck and wish to count the number X of red cards as you go through the deck.

• A “success” is a red card.

• Is this a binomial distribution?

How to calculate Binomial Probabilities

• The binomial coefficient:

• Binomial Probability:

What this means:

Given an SRS of size ___ from a population with proportion ___ of successes. When the population is much larger than the sample:

Example 8.3

• An engineer selects an SRS of 10 switches from a large shipment for detailed inspection. Unknown to the engineer, 10% of them fail to meet the specifications. What is the probability that no more than 1 of the 10 switches in the sample fail inspection?

Note: Assuming that a defective switch is drawn first (p=0.1), the probability for the second switch being defective changes to p=0.0999. For practical purposes, this behaves like a binomial setting even if the condition of independence does not strictly hold.

• Ex. 8.6: The number X of switches that fail inspection in Example 8.3 has approximately the binomial distribution with n = 10 and p = .1. Find the probability that no more than 1 switch fails.

Binomial Probability:

Calculator

• We want to calculate P(X ≤ 1) when X is B(10, .1)

• P(X ≤ 1) = P(X=0)+P(X=1)

• Calculator:

PDF (example 8.1 cont’d)

• Given a discrete random variable X, the ___________________________ assigns a probability to each value of X. The probabilities must satisfy the ______________________ given in chapter 6.

• B(5, .25) distribution where n=5 children

X 0 1 2 3 4 5

P(x) .2373 .3955 .2637 .0879 .0146 .000977

PDF by calculator

• How we arrived at the _________________________ for B(5, .25) distribution where n=5 children:

X 0 1 2 3 4 5

P(x) .2373 .3955 .2637 .0879 .0146 .000977
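The table values can be reproduced outside the calculator. The function `binom_pdf` below is a hypothetical Python stand-in that mirrors the calculator's binompdf(n, p, X) argument order; it is not a built-in.

```python
from math import comb

def binom_pdf(n, p, k):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

table = {k: binom_pdf(5, 0.25, k) for k in range(6)}

for k, prob in table.items():
    print(k, round(prob, 6))
print("total:", round(sum(table.values()), 6))
```

Rounded to four places, the results agree with the P(x) row above, and the six probabilities sum to 1.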

Alternative Calculation/shortcut! CDF

• Given a random variable X, the ____________________________of X calculates the _____ of the probabilities for 0, 1, 2…up to the value of X.

• i.e., it calculates the probability of obtaining _________ X successes in n trials.

• For last example:

Example 8.8

• Corinne is a basketball player who makes 75% of her free throws. In one game, she shoots 12 free throws and makes only 7 of them. Find the probability of making a basket on at most 7 free throws.

• Check Binomial Settings first (Use binompdf(n,p,X)):

• Now use binomcdf:
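A cumulative probability is a running sum of the individual binomial probabilities. The sketch below defines hypothetical Python helpers `binom_pdf` and `binom_cdf` that mirror the calculator commands (same argument order), then applies them to Corinne's free throws.

```python
from math import comb

def binom_pdf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): add up P(X = 0), P(X = 1), ..., P(X = k)."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

p_at_most_7 = binom_cdf(12, 0.75, 7)
print(round(p_at_most_7, 4))

# The same helper covers strict and "greater than" questions, e.g. for B(23, .7):
p_lt_17 = binom_cdf(23, 0.7, 16)      # P(X < 17) = P(X <= 16)
p_le_17 = binom_cdf(23, 0.7, 17)      # P(X <= 17)
p_gt_17 = 1 - binom_cdf(23, 0.7, 17)  # P(X > 17) = 1 - P(X <= 17)
```

For whole-number counts, "less than 17" and "at most 16" are the same event, which is why the strict inequality shifts the cutoff by one.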

Extra Practice of CDF... Given B(23, .7) find:

1. P(X<17) =

2. P(X≤17) =

3. P(X>17) =

Example: Each child born to a set of parents has probability .25 of having blood type O. If these parents have 5 kids, what is the probability that exactly 2 of them have type O blood?

Binomial with n = 5 tries and p = .25. Find P(X = 2), where S=success=.25 and F=fail=.75

1st way: List all 10 possible outcomes, each with the same probability (for ex, the 1st: SSFFF = (.25)(.25)(.75)(.75)(.75))

P(X=2) =

2nd way: Using the binomial formula:

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

2 ways, same answer
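Both routes can be carried out in Python to confirm they agree. The sketch below lists every arrangement of 2 successes among 5 children and compares the total with the binomial formula; the variable names are illustrative.

```python
from itertools import product
from math import comb

n, k, p = 5, 2, 0.25

# 1st way: list every S/F sequence of length 5 with exactly 2 successes.
outcomes = ["".join(seq) for seq in product("SF", repeat=n) if seq.count("S") == k]
prob_each = p**k * (1 - p) ** (n - k)   # e.g. SSFFF = (.25)(.25)(.75)(.75)(.75)
p_by_listing = len(outcomes) * prob_each

# 2nd way: the binomial formula, where comb(n, k) = n! / (k! (n - k)!).
p_by_formula = comb(n, k) * p**k * (1 - p) ** (n - k)

print(len(outcomes), outcomes[0], round(p_by_listing, 4), round(p_by_formula, 4))
```

The binomial coefficient is simply the count of arrangements that the first method lists by hand.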

• The binomial distribution is a _________________ of a probability distribution for a discrete random variable.

• It is therefore possible to find the mean and standard deviation of a binomial in the same way as we did for a discrete random variable – but you really don’t want to.

Mean of Binomial Random Variable: Standard Deviation of Binomial Random Variable:

Ex 8.11

• The count X of bad switches is binomial with n = 10 and p = .1. Find the mean and standard deviation:

The Normal distribution can be used as an approximation for the binomial distribution when n is large. N(___, _______) Rule of Thumb:

Ex. 8.12: Shopping

• Sample surveys show that fewer people enjoy shopping than in the past. A survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that “I like buying new clothes, but shopping is often frustrating and time-consuming.” Population: all U.S. residents 18+. Suppose 60% of all adult U.S. residents “agree.” What is the probability that 1520 or more of the sample agree?
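With n = 2500 the count is awkward to sum by hand, so a Normal approximation is a natural tool. The sketch below uses the standard binomial facts mean = np and standard deviation = sqrt(np(1 − p)), and checks the usual rule of thumb (np ≥ 10 and n(1 − p) ≥ 10) as an assumption about what the slide intends; no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist

n, p = 2500, 0.6

mean = n * p                   # expected count of "agree"
sd = sqrt(n * p * (1 - p))     # binomial standard deviation

# Rule of thumb (assumed): both expected counts should be at least 10.
rule_ok = n * p >= 10 and n * (1 - p) >= 10

approx = NormalDist(mean, sd)
p_1520_or_more = 1 - approx.cdf(1520)

print(rule_ok, mean, round(sd, 2), round(p_1520_or_more, 4))
```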

• Binomial distributions often arise in discrimination cases when the population in question is large. The generic question is “If the selection were made at random from the entire population, what is the probability that the number of members of a protected class hired/promoted/laid off would be as small/large as it actually was?” This assumes that all members of the qualified population have equal merit, so it’s just a first step. If the population is large, we can act as if the candidates are chosen independently.

• In 2004, the National Institutes of Health announced that it would give a few new Director Pioneer Awards for research. The awards were highly valued: $500,000 per year for five years for research support. Nine awards were made, all to men. This caused an outcry.

• There were 1300 nominees for the award, 80% male. Suppose that all nominees are equally qualified. If we choose 9 at random, the number of women among the winners has (to a close approximation) the binomial distribution with n=9 and p=0.2. Call the number of women X.

• Find P(no awards go to women), P(at least one woman), P(no more than one woman), the mean number of women in repeated random drawings, and the standard deviation. Can we use the normal approximation to calculate these probabilities?


At an archaeological site that was an ancient swamp, the bones from 20 brontosaur skeletons have been unearthed. The bones do not show any sign of disease or malformation. It is thought that these animals wandered into a deep area of the swamp and became trapped in the swamp bottom. The 20 left femur bones (thigh bones) were located and 4 of these left femurs are to be randomly selected without replacement for DNA testing to determine gender.

a) Let X be the number out of the 4 selected left femurs that are from males. Based on how these bones were sampled, explain why the probability distribution of X is not binomial.

b) Suppose that the group of 20 brontosaurs whose remains were found in the swamp had been made up of 10 males and 10 females. What is the probability that all 4 in the sample to be tested are male?

c) The DNA testing revealed that all 4 femurs tested were from males. Based on this result and your answer from part (b), do you think that males and females were equally represented in the group of 20 brontosaurs stuck in the swamp? Explain.

d) Is it reasonable to generalize your conclusion from part c) pertaining to the group of 20 brontosaurs to the population of all brontosaurs? Explain why or why not.
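Because the femurs are drawn without replacement from only 20 bones, part (b) is a counting problem rather than a binomial one. The sketch below counts the ways to choose 4 males out of 10 against all ways to choose 4 of 20, and shows the value a (mistaken) binomial model would give for comparison.

```python
from math import comb

males, females, drawn = 10, 10, 4

# Without replacement: (ways to pick 4 of the 10 males) / (ways to pick any 4 of 20).
p_all_male = comb(males, drawn) / comb(males + females, drawn)

# Same thing as a chain of conditional probabilities: 10/20 * 9/19 * 8/18 * 7/17.
p_chain = (10 / 20) * (9 / 19) * (8 / 18) * (7 / 17)

# What independence (sampling with replacement) would wrongly give.
p_if_binomial = 0.5**4

print(round(p_all_male, 4), round(p_chain, 4), p_if_binomial)
```

The gap between the two answers shows why independence cannot be assumed when the sample is a sizable share of a small population.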

More discrimination in the workplace

There are several thousand workers at a particular factory, of which 30% are Hispanic. We randomly select a sample of 15 employees to serve on a committee to study and recommend changes to the employee benefits program. But only 3 Hispanic employees were selected, and the Hispanic employees have charged that the selection process was rigged to favor non-Hispanics. Is there evidence of this? Specifically, what is the probability that at most 3 Hispanics are chosen for the committee?

• Used when the goal is to obtain a _________ number of _________________.

• The random variable X is defined as counting the number of trials needed to obtain that ___________.

• Possible values of a geometric random variable: 1, 2, 3…(infinite) since it is theoretically possible to proceed indefinitely without ever obtaining a success.

• Examples:

The Geometric Setting: 1. 2. 3. 4.

The Geometric Setting: 2PIFS

1. 2:

2. Probability:

3. I: The observations are ______________

4. FS: variable of interest is the number of trials required to obtain the _____________

Examples

• An experiment consists of rolling a single die. The event of interest is rolling a 3; this event is called a success. The random variable X is defined as X = the number of trials until a 3 occurs. Is this a geometric setting?

• Suppose you repeatedly draw cards without replacement from a deck of 52 cards until you draw an ace. Is this a geometric setting?

Ex. 8.17: An experiment consists of rolling a single die. The event of interest is rolling a 3; this event is called a success. The random variable X is defined as X = the number of trials until a 3 occurs.

• P(X=1) =

• P(X=2) =

• P (X=3) =

Probability of success:____ Probability of failure:____ Rule for Calculating Geometric Probabilities:
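For the die example, getting the first 3 on roll n means n − 1 failures followed by one success, so P(X = n) = (1 − p)^(n − 1) · p. A sketch with exact fractions follows, assuming a fair die; `geom_pdf` is a hypothetical helper name.

```python
from fractions import Fraction

def geom_pdf(p, n):
    """P(X = n): n - 1 failures in a row, then the first success on trial n."""
    return (1 - p) ** (n - 1) * p

p = Fraction(1, 6)   # assumption: fair die, so P(rolling a 3) = 1/6

for n in (1, 2, 3):
    print(f"P(X = {n}) = {geom_pdf(p, n)}")
```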

Example: Glenn likes the game at the fair where you toss a coin into a saucer. You win if the coin comes to rest in the saucer w/o sliding off. Glenn has played this a lot and has determined that he wins 1 out of every 12 times he plays. He believes his chances of winning are the same for each toss. He has no reason to think the tosses are not independent. Let X be the # of tosses until a win.

1) Find the probability of success on any given trial

2) Find the expected number of tosses until a win (mean)

3) Find the standard deviation

The Mean (Expected Value) of a Geometric Random Variable: The Variance of a Geometric Random Variable:
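The standard results for a geometric random variable are mean 1/p and variance (1 − p)/p². A sketch applying them to Glenn's game, where p = 1/12 comes from the example:

```python
from math import sqrt

p = 1 / 12   # Glenn wins 1 out of every 12 tosses

mean = 1 / p                  # expected number of tosses until the first win
variance = (1 - p) / p**2
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

The standard deviation is nearly as large as the mean, which reflects how spread out waiting times can be.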

• Roll a die until a 3 is observed. Find the probability that it takes more than 6 rolls to observe a 3.

• Let Y be the number of Glenn’s coin tosses until a coin stays in the saucer. The expected number is 12. Find the probability that it takes more than 12 tosses to win a stuffed animal.

P (X>n):
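Needing more than n trials is the same as failing on each of the first n trials, so P(X > n) = (1 − p)^n. A sketch for the two questions above follows, assuming a fair die; `geom_tail` is a hypothetical helper name.

```python
def geom_tail(p, n):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n

p_die_more_than_6 = geom_tail(1 / 6, 6)       # more than 6 rolls to see a 3
p_glenn_more_than_12 = geom_tail(1 / 12, 12)  # more than 12 tosses to win

print(round(p_die_more_than_6, 4), round(p_glenn_more_than_12, 4))
```

Even though Glenn expects to win within 12 tosses on average, there is still roughly a one-in-three chance that it takes longer.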

Calculator for Geometric Distributions:

Distr: _____________ (p, n)

Distr: _____________ (p, n)

For the offices in a large office building, there are 100 different lock-and-key combinations. You start testing locks to see if the key will fit. The number of locks X you must test to find one that the key fits has a geometric distribution with p = 1/100 = 0.01. (The necessary assumption here is that each office is equally likely to have any of the 100 combos; this permits us to say that p remains constant at 1/100 on each trial).

1) What is the expected number of offices you will have to visit in order to find an office with a lock that the key fits?

2) What is the probability that you will have to visit at least 200 offices in order to find an office with a lock that the key fits?

3) What is the probability that you will have to visit at most 200 offices?

There is a probability of 0.08 that a vaccine will cause a certain side effect. Suppose that a number of patients are inoculated with the vaccine. We are interested in the number of patients vaccinated until the first side effect is observed.

1. Define the random variable of interest. X=?____________

2. Verify that this describes a geometric setting.

3. Find the probability that exactly 5 patients must be vaccinated in order to observe the first side effect.

4. Construct a probability distribution table for X (up through X = 5).

5. How many patients would you expect to have to vaccinate in order to observe the first side effect?

6. What is the probability that the number of patients vaccinated until the first side effect is observed is at most 5?

Exploring Geometric Distributions with the TI83

• Page 547 Technology Toolbox.

• Note: The probability distribution histogram is strongly skewed to the right. The height of each bar after the 1st is the height of the previous bar times the probability of failure (1-p). Since you are multiplying each consecutive height by a number <1, each new bar will always be shorter than the previous. Therefore the histogram will ALWAYS be right-skewed.
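The shrinking-bar pattern can be seen without a graphing calculator. This sketch uses p = 0.08 from the vaccine exercise, prints a rough text histogram for the first ten values of X, and checks that each bar is (1 − p) times the one before; the scale factor 500 is an arbitrary choice to make the bars visible.

```python
p = 0.08  # probability of success on each trial (vaccine side effect)

# Bar heights P(X = k) for k = 1..10.
heights = [(1 - p) ** (k - 1) * p for k in range(1, 11)]

# Ratio of each bar to the previous one; every ratio should equal 1 - p.
ratios = [later / earlier for earlier, later in zip(heights, heights[1:])]

for k, h in enumerate(heights, start=1):
    print(f"{k:2d} {h:.4f} " + "#" * round(h * 500))
print("ratios:", [round(r, 4) for r in ratios])
```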