random variables streamlining probability: probability distribution, expected value and standard...

Post on 16-Jan-2016


Random Variables

Streamlining Probability: Probability Distribution, Expected Value and Standard Deviation of a Random Variable

Graphically and Numerically Summarize a Random Experiment

Principal vehicle by which we do this: random variables

A random variable assigns a number to each outcome of an experiment

Random Variables

Definition: A random variable is a numerical-valued variable whose value is based on the outcome of a random event.

Denoted by upper-case letters X, Y, etc.

When the number of possible values of X is finite (number of heads in 3 tosses of a coin) or countably infinite (number of tosses until you get 3 heads in a row), the random variable is discrete. (Will study continuous rv’s later).

Examples: Discrete rv’s

1. X = # of games played in a randomly selected World Series

Possible values of X are x=4, 5, 6, 7

2. Y=score on 13th hole (par 5) at Augusta National golf course for a randomly selected golfer on day 1 of 2015 Masters

y=3, 4, 5, 6, 7

Examples: Discrete rv’s

Number of girls in a 5-child family

Number of customers that use an ATM in a 1-hour period

Number of tosses of a fair coin that are required until you get 3 heads in a row (note that this discrete random variable has a countably infinite number of possible values: x=3, 4, 5, 6, 7, . . .)

CUSIP     IND  CONAME                       PE    NPM
60855410  4    MOLEX INC                    24.7  8.7
40262810  5    GULFMARK INTL INC            21.4  8.1
81180410  4    SEAGATE TECHNOLOGY           21.3  2.2
46489010  9    ISOMEDIX INC                 25.2  21.1
69318010  9    PCA INTERNATIONAL INC        21.4  4.7
26157010  7    DRESS BARN INC               24.5  4.5
90249410  4    TYSON FOODS INC              20.9  3.9
4886910   5    ATLANTIC SOUTHEAST AIRLINES  20.1  15.7
87183910  9    SYSTEM SOFTWARE ASSOC INC    23.7  11.6
62475210  4    MUELLER (PAUL) CO            14.5  3.9
36473510  7    GANTOS INC                   15.7  1.8
00755P10  9    ADVANTAGE HEALTH CORP        23.3  5.3
23935910  2    DAWSON GEOPHYSICAL CO        14.9  9.3
68555910  4    ORBIT INTERNATIONAL CP       15.0  3.0
16278010  4    CHECK TECHNOLOGY CORP        17.1  3.2
51460610  4    LANCE INC                    19.0  8.5
4523710   4    ASPECT TELECOMMUNICATIONS    25.7  8.2
74555310  4    PULASKI FURNITURE CORP       22.0  2.1
80819410  4    SCHULMAN (A.) INC            19.4  6.0
19770920  9    COLUMBIA HOSPITAL CORP       18.3  3.1
23790310  4    DATA MEASUREMENT CORP        11.3  2.6
11457710  4    BROOKTREE CORP               13.8  13.6
00431L10  9    ACCESS HEALTH MARKETING INC  22.4  11.0
29605610  4    ESCALADE INC                 10.8  2.0
23303110  4    DBA SYSTEMS INC              6.3   5.0
64124610  4    NEUTROGENA CORP              27.2  9.0
59492810  6    MICROAGE INC                 9.0   0.5
22821010  7    CROWN BOOKS CORP             24.4  1.8
190710    4    AST RESEARCH INC             9.7   7.3
46978310  6    JACO ELECTRONICS INC         31.9  0.4
531320    4    ADAC LABORATORIES            18.5  10.6
49766010  4    KIRSCHNER MEDICAL CORP       33.0  0.8
30205210  4    EXIDE ELECTRS GROUP INC      29.0  2.4
46065P10  5    INTERPROVINCIAL PIPE LN      11.9  19.2
19247910  4    COHERENT INC                 40.2  1.2


Data Variables and Data Distributions

Data variables are known outcomes.


DATA DISTRIBUTION: Price-Earnings Ratios

Class (bin)  Class Boundary  Tally           Frequency  Relative Frequency
1            6.00-12.99      |||| |          6          6/35 = 0.171
2            13.00-19.99     |||| ||||       10         10/35 = 0.286
3            20.00-26.99     |||| |||| ||||  14         14/35 = 0.400
4            27.00-33.99     ||||            4          4/35 = 0.114
5            34.00-40.99     |               1          1/35 = 0.029

Handout 2.1, P. 10


Data Variables and Data Distributions

Data variables are known outcomes.

Data distributions tell us what happened.

Random Variables and Probability Distributions

Random variables are unknown chance outcomes.

Probability distributions tell us what is likely to happen.

Data variables are known outcomes.

Data distributions tell us what happened.

Notation

X = the random variable (profits)

xi = outcome i

x1 = 10

x2 = 5

x3 = 1

x4 = -4

Economic Scenario  Profit X ($ Millions)  Probability
Great              x1 = 10                0.20
Good               x2 = 5                 0.40
OK                 x3 = 1                 0.25
Lousy              x4 = -4                0.15

Notation

P is the probability

p(xi) = Pr(X = xi) is the probability of X being outcome xi

p(x1) = Pr(X = 10) = .20

p(x2) = Pr(X = 5) = .40

p(x3) = Pr(X = 1) = .25

p(x4) = Pr(X = -4) = .15

[Figure: Probability Histogram. Horizontal axis: Profit, from -4 to 12; vertical axis: Probability, from .05 to .40; bars labeled Lousy, OK, Good, Great.]


Probability Distribution of Number of Games Played in Randomly Selected World Series

Estimate based on results from 1946 to 2014.

x     4              5              6              7
p(x)  12/65 = 0.185  12/65 = 0.185  14/65 = 0.215  27/65 = 0.415

[Figure: Probability Histogram, Number of Games in Randomly Selected World Series. Bars at x = 4, 5, 6, 7 with heights 0.185, 0.185, 0.215, 0.415.]

Probability distributions: requirements

Notation: p(x) = Pr(X = x) is the probability that the random variable X has value x

Requirements

1. \( 0 \le p(x) \le 1 \) for all values x of X

2. \( \sum_{\text{all } x} p(x) = 1 \)
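The two requirements can be checked mechanically. The following is a minimal sketch, not part of the original slides; the helper name `is_valid_distribution` and the dictionary layout (value x mapped to p(x)) are assumptions made for illustration. It uses the World Series probabilities given earlier.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every p(x) lies in [0, 1] and the p(x) sum to 1.

    `probs` maps each value x to p(x); this layout is an illustrative choice.
    """
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# World Series distribution: x -> p(x)
world_series = {4: 12 / 65, 5: 12 / 65, 6: 14 / 65, 7: 27 / 65}
print(is_valid_distribution(world_series))

# A table whose probabilities do not sum to 1 fails the check
not_a_distribution = {0: 0.5, 1: 0.6}
print(is_valid_distribution(not_a_distribution))
```

A tolerance is used for the sum because fractions such as 12/65 are not stored exactly in floating point.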

Expected Value of a Discrete Random Variable

A measure of the “middle” of the values of a random variable

k = the number of outcomes

µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk)

Weighted mean: each outcome is weighted by its probability

Mean or Expected Value

Sample Mean

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]

\[ \bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{1}{n}x_1 + \frac{1}{n}x_2 + \frac{1}{n}x_3 + \dots + \frac{1}{n}x_n \]

\[ E(x) = \sum_{i=1}^{k} x_i \, P(X = x_i) \]

Other Weighted Means

1. Stock Market: The Dow Jones Industrial Average

The “Dow” consists of 30 companies (the 30 companies in the “Dow” change periodically)

To compute the Dow Jones Industrial Average, a weight proportional to the company’s “size” is assigned to each company’s stock price

2. GPA: A=4, B=3, C=2, D=1, F=0

Five 3-hour courses: 2 A's (6 hrs), 1 B (3 hrs), 2 C's (6 hrs)

\[ \text{GPA} = \frac{4 \cdot 6 + 3 \cdot 3 + 2 \cdot 6}{15} = \frac{45}{15} = 3.0 \]

k = the number of outcomes (k=4)

µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk)

EXAMPLE

µ = 10*.20 + 5*.40 + 1*.25 – 4*.15 = 3.65 ($ mil)
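The weighted-mean calculation above can be reproduced in a few lines. This is a small sketch rather than anything from the slides; the function name `expected_value` and the dictionary of profit values are illustrative assumptions.

```python
def expected_value(dist):
    """Weighted mean: sum of x * p(x) over all outcomes.

    `dist` maps each outcome x to its probability p(x).
    """
    return sum(x * p for x, p in dist.items())

# Profit ($ millions) -> probability, from the economic-scenario table
profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}

mu = expected_value(profit)
print(round(mu, 2))  # 3.65
```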


[Figure: probability histogram of profit (bars Lousy, OK, Good, Great) with the mean µ = 3.65 ($ mil) marked on the Profit axis.]

Interpretation

E(x) is not the value of the random variable x that you “expect” to observe if you perform the experiment once

Interpretation

E(x) is a “long run” average; if you perform the experiment many times and observe the random variable x each time, then the average x̄ of these observed x-values will get closer to E(x) as you observe more and more values of the random variable x.
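The long-run interpretation can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the slides: it draws from the profit distribution of the earlier example (E(x) = 3.65) with Python's `random.choices`, and the seed and sample sizes are arbitrary choices.

```python
import random

# Profit distribution from the economic-scenario example
outcomes = [10, 5, 1, -4]
probs = [0.20, 0.40, 0.25, 0.15]

rng = random.Random(0)  # arbitrary fixed seed so repeated runs match

def running_average(n):
    """Average of n simulated observations of the random variable."""
    draws = rng.choices(outcomes, weights=probs, k=n)
    return sum(draws) / n

# The average tends to settle near E(x) = 3.65 as n grows
for n in (10, 1_000, 100_000):
    print(n, running_average(n))
```

Small samples can land far from 3.65; only the averages over many observations are expected to be close.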

Example: Green Mountain Lottery

State of Vermont

choose 3 digits from 0 through 9; repeats allowed

win $500

x     $0    $500
p(x)  .999  .001

E(x) = $0(.999) + $500(.001) = $.50

Example (cont.)

E(x) = $.50

On average, each ticket wins $.50.

Important for Vermont to know

E(x) is not necessarily a possible value of the random variable (values of x are $0 and $500)

Example

Let X = number of heads in 3 tosses of a fair coin.

So the probability distribution of X is:

x     0    1    2    3
p(x)  1/8  3/8  3/8  1/8

E(x) (or µ) is

\[ E(x) = \sum_{i=1}^{4} x_i \, p(x_i) = 0\left(\tfrac{1}{8}\right) + 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{8}\right) = \tfrac{12}{8} = 1.5 \]
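The distribution of the number of heads can be derived by listing all 8 equally likely toss sequences. The following sketch is one way to do that, assuming a fair coin as in the example; the variable names are illustrative. Exact fractions are used so the result matches the table without rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely sequences of 3 tosses, e.g. ('H', 'T', 'H')
tosses = list(product("HT", repeat=3))

# Count how many sequences give each number of heads
counts = Counter(seq.count("H") for seq in tosses)

# p(x) = (number of sequences with x heads) / 8
dist = {x: Fraction(c, len(tosses)) for x, c in sorted(counts.items())}

# Expected value as the probability-weighted sum
mean = sum(x * p for x, p in dist.items())

print(dist)
print(mean)  # 3/2
```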

US Roulette Wheel and Table

The roulette wheel has alternating black and red slots numbered 1 through 36.

There are also 2 green slots numbered 0 and 00.

A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is . . .

If you bet $1 on the winning number, you receive $36, so your winnings are $35

American Roulette 0 - 00 (The European version has only one 0.)

US Roulette Wheel: Expected Value of a $1 bet on a single number

Let x be your winnings resulting from a $1 bet on a single number; x has 2 possible values

x     -1     35
p(x)  37/38  1/38

E(x) = -1(37/38) + 35(1/38) = -.05

So on average the house wins 5 cents on every such bet. A “fair” game would have E(x)=0.
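As a check on the arithmetic, the sketch below recomputes the expected winnings with exact fractions and then solves for the payout that would make the game fair (E(x) = 0). The variable names are illustrative, and the fair-payout step is an added derivation rather than something stated on the slide.

```python
from fractions import Fraction

p_win = Fraction(1, 38)   # one winning slot out of 38
p_lose = 1 - p_win        # 37/38

# Expected winnings of a $1 bet paying 35:1
ev = 35 * p_win + (-1) * p_lose
print(ev, round(float(ev), 4))  # -1/19 -0.0526

# A payout b is fair when b * p_win - 1 * p_lose = 0
fair_payout = p_lose / p_win
print(fair_payout)  # 37
```

The exact value is -1/19, about -5.3 cents per dollar bet, which the slide rounds to -.05.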

The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in …

Summarizing data and probability

Data

Histogram

measure of the center: sample mean x̄

measure of spread: sample standard deviation s

Random variable

Probability Histogram

measure of the center: population mean µ

measure of spread: population standard deviation σ

Standard Deviation of a Discrete Random Variable

Measures how “spread out” the random variable is

Sample variance (from data):

\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{1805.703}{34} = 53.10892 \]

Variance

The deviations of the individual x's from the mean (expected value) of their probability distribution: xi - µ

Var(X) = σ² (sigma squared) is the variance of the probability distribution

Variation in data: \( X_i - \bar{X} \)

Variance of discrete random variable X

\[ \mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i) \]


P. 207, Handout 4.1, P. 4

Example

\[ \sigma^2 = (x_1-\mu)^2 \cdot P(X=x_1) + (x_2-\mu)^2 \cdot P(X=x_2) + (x_3-\mu)^2 \cdot P(X=x_3) + (x_4-\mu)^2 \cdot P(X=x_4) \]

\[ = (10-3.65)^2 \cdot 0.20 + (5-3.65)^2 \cdot 0.40 + (1-3.65)^2 \cdot 0.25 + (-4-3.65)^2 \cdot 0.15 = 19.3275 \]

Here µ = 3.65 in every deviation, following \( \sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i) \).

Standard Deviation: of More Interest than the Variance

The population standard deviation is the square root of the population variance

Standard Deviation (σ) = Positive Square Root of the Variance

\[ \sigma = \sqrt{\sigma^2} \]

σ, or SD, is the standard deviation of the probability distribution

Standard Deviation

\[ \sigma^2 = 19.3275 \]

\[ \sigma \text{ (or SD)} = \sqrt{19.3275} \approx 4.40 \ (\$ \text{ mil.}) \]
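The variance and standard deviation of the profit example can be recomputed directly from the definition. This is a minimal sketch with illustrative helper names (`mean`, `variance`), not code from the slides.

```python
import math

def mean(dist):
    """Expected value: sum of x * p(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X): sum of (x - mu)**2 * p(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Profit ($ millions) -> probability
profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}

var = variance(profit)
sd = math.sqrt(var)  # standard deviation is the positive square root
print(round(var, 4), round(sd, 2))  # 19.3275 4.4
```

The standard deviation is in the same units as the profit ($ millions), while the variance is in squared units, which is one reason the SD is usually the quantity reported.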

Finance and Investment Interpretation

X = return on an investment (stock, portfolio, etc.)

E(x) = µ = expected return on this investment

σ is a measure of the risk of the investment

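Example

A basketball player shoots 3 free throws. P(make) = P(miss) = 0.5. Let X = number of free throws made.

x     0    1    2    3
p(x)  1/8  3/8  3/8  1/8

E(X) = 1.5

Compute the variance:

\[ \sigma^2 = \sum_{i=1}^{k} (x_i - E(X))^2 \, P(X = x_i) \]

\[ \sigma^2 = (0-1.5)^2 \tfrac{1}{8} + (1-1.5)^2 \tfrac{3}{8} + (2-1.5)^2 \tfrac{3}{8} + (3-1.5)^2 \tfrac{1}{8} = 2.25 \cdot \tfrac{1}{8} + .25 \cdot \tfrac{3}{8} + .25 \cdot \tfrac{3}{8} + 2.25 \cdot \tfrac{1}{8} = .75 \]

\[ \sigma = \sqrt{.75} = .866 \]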

© 2010 Pearson Education


Expected Value of a Random Variable

Example: The probability model for a particular life insurance policy is shown. Find the expected annual payout on a policy.

We expect that the insurance company will pay out $200 per policy per year.


Standard Deviation of a Random Variable

Example: The probability model for a particular life insurance policy is shown. Find the standard deviation of the annual payout.

68-95-99.7 Rule for Random Variables

For random variables x whose probability histograms are approximately mound-shaped:

\[ P(\mu - \sigma \le x \le \mu + \sigma) \approx .68 \]

\[ P(\mu - 2\sigma \le x \le \mu + 2\sigma) \approx .95 \]

\[ P(\mu - 3\sigma \le x \le \mu + 3\sigma) \approx .997 \]

(µ - σ, µ + σ) = (50-5, 50+5) = (45, 55)

P(µ - σ ≤ X ≤ µ + σ) = P(45 ≤ X ≤ 55) = .048+.057+.066+.073+.078+.08+.078+.073+.066+.057+.048 = .724

Rules for E(X), Var(X) and SD(X): adding a constant a

If X is a rv and a is a constant:

E(X+a) = E(X)+a

Example: a = -1

E(X+a)=E(X-1)=E(X)-1

Rules for E(X), Var(X) and SD(X): adding constant a (cont.)

Var(X+a) = Var(X)

SD(X+a) = SD(X)

Example: a = -1

Var(X+a)=Var(X-1)=Var(X)

SD(X+a)=SD(X-1)=SD(X)


Economic Scenario  Profit X ($ Millions)  X+2          Probability P
Great              x1 = 10                x1+2 = 10+2  P(X=x1) = 0.20
Good               x2 = 5                 x2+2 = 5+2   P(X=x2) = 0.40
OK                 x3 = 1                 x3+2 = 1+2   P(X=x3) = 0.25
Lousy              x4 = -4                x4+2 = -4+2  P(X=x4) = 0.15

E(x + a) = E(x) + a; SD(x + a)=SD(x); let a = 2

[Figure: two probability histograms of profit (horizontal axis: Profit, -4 to 14; vertical axis: Probability, 0 to 0.5). Original distribution X: µ = 3.65, σ = 4.40. Shifted distribution X+2: µ = 5.65, σ = 4.40.]

New Expected Value

Long (UNC-CH) way: E(x+2) = 12(.20) + 7(.40) + 3(.25) + (-2)(.15) = 5.65

Smart (NCSU) way: a=2; E(x+2) = E(x) + 2 = 3.65 + 2 = 5.65

New Variance and SD

Long (UNC-CH) way (compute from “scratch”): Var(X+2) = (12-5.65)²(0.20) + … + (-2-5.65)²(0.15) = 19.3275

SD(X+2) = √19.3275 = 4.40

Smart (NCSU) way: Var(X+2) = Var(X) = 19.3275

SD(X+2) = SD(X) = 4.40
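Both routes can be compared numerically. The sketch below computes the mean and SD of X+2 from scratch and checks them against the shortcut rules; the helper `mean_and_sd` is a hypothetical name introduced for this illustration.

```python
import math

def mean_and_sd(dist):
    """Return (E(X), SD(X)) for a dict mapping x -> p(x)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, math.sqrt(var)

profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}

# Long way: shift every outcome by a = 2, keep the probabilities
a = 2
shifted = {x + a: p for x, p in profit.items()}

mu_x, sd_x = mean_and_sd(profit)
mu_shift, sd_shift = mean_and_sd(shifted)

# Shortcut rules: E(X+a) = E(X) + a and SD(X+a) = SD(X)
print(round(mu_shift, 2), round(mu_x + a, 2))
print(round(sd_shift, 2), round(sd_x, 2))
```

Adding a constant slides the whole histogram along the axis without changing its spread, which is why only the mean moves.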

Rules for E(X), Var(X) and SD(X): multiplying by constant b

E(bX) = b E(X)

Var(bX) = b²Var(X)

SD(bX) = |b|SD(X)

Example: b = -1

E(bX) = E(-X) = -E(X)

Var(bX) = Var(-1X) = (-1)²Var(X) = Var(X)

SD(bX) = SD(-1X) = |-1|SD(X) = SD(X)

Expected Value and SD of Linear Transformation a + bx

Let X=number of repairs a new computer needs each year. Suppose E(X)= 0.20 and SD(X)=0.55

The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair.

What are the mean and standard deviation of the yearly cost of the service contract?

Cost = $100 + $25X

E(cost) = E($100+$25X) = $100 + $25E(X) = $100 + $25*0.20 = $100 + $5 = $105

SD(cost) = SD($100+$25X) = SD($25X) = $25*SD(X) = $25*0.55 = $13.75
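The two rules combine into a single helper for any linear transformation a + bX. The function below is a sketch with a hypothetical name (`linear_transform`); it needs only the mean and SD of X, not the full distribution.

```python
def linear_transform(mean_x, sd_x, a, b):
    """Mean and SD of a + b*X, given the mean and SD of X.

    E(a + bX) = a + b*E(X); SD(a + bX) = |b|*SD(X) (the constant a drops out).
    """
    return a + b * mean_x, abs(b) * sd_x

# Service contract: $100 per year plus $25 per repair
mean_cost, sd_cost = linear_transform(mean_x=0.20, sd_x=0.55, a=100, b=25)
print(round(mean_cost, 2), round(sd_cost, 2))
```

Using `abs(b)` keeps the SD non-negative when b is negative, matching the rule SD(bX) = |b|SD(X).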

Addition and Subtraction Rules for Random Variables

E(X+Y) = E(X) + E(Y); E(X-Y) = E(X) - E(Y)

When X and Y are independent random variables:

1. Var(X+Y) = Var(X) + Var(Y)

2. SD(X+Y) = √(Var(X) + Var(Y)); SD’s do not add: SD(X+Y) ≠ SD(X) + SD(Y)

3. Var(X−Y) = Var(X) + Var(Y)

4. SD(X−Y) = √(Var(X) + Var(Y)); SD’s do not subtract: SD(X−Y) ≠ SD(X) − SD(Y), and SD(X−Y) ≠ SD(X) + SD(Y)

Motivation for Var(X-Y) = Var(X) + Var(Y)

Let X = amount automatic dispensing machine puts into your 16 oz drink (say at McD’s)

A thirsty, broke friend shows up.

Let Y = amount you pour into friend’s 8 oz cup

Let Z = amount left in your cup; Z = ?

Z = X-Y

Var(Z) = Var(X-Y) = Var(X) + Var(Y) (has 2 components)

Example: rv’s NOT independent

X=number of hours a randomly selected student from our class slept between 9 am yesterday and 9 am today.

Y=number of hours a randomly selected student from our class was awake between 9 am yesterday and 9 am today. Y = 24 – X.

What are the expected value and variance of the total hours that a student is asleep and awake between 9 am yesterday and 9 am today?

Total hours that a student is asleep and awake between 9 am yesterday and 9 am today = X+Y

E(X+Y) = E(X+24-X) = E(24) = 24

Var(X+Y) = Var(X+24-X) = Var(24) = 0.

We don't add Var(X) and Var(Y) since X and Y are not independent.

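Pythagorean Theorem of Statistics for Independent X and Y

[Figure: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X+Y); the squares on the sides have areas a² = Var(X), b² = Var(Y) and c² = a² + b² = Var(X+Y).]

a² + b² = c²: Var(X) + Var(Y) = Var(X+Y)

a + b ≠ c: SD(X) + SD(Y) ≠ SD(X+Y)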

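Pythagorean Theorem of Statistics for Independent X and Y

[Figure: right triangle with SD(X) = 3, SD(Y) = 4 and SD(X+Y) = 5; the squares on the sides have areas Var(X) = 9, Var(Y) = 16 and Var(X+Y) = 25 = 9 + 16.]

3² + 4² = 5²: Var(X) + Var(Y) = Var(X+Y)

3 + 4 ≠ 5: SD(X) + SD(Y) ≠ SD(X+Y)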

Example: meal plans

Regular plan: X = daily amount spent

E(X) = $13.50, SD(X) = $7

Expected value and standard deviation of total spent in 2 consecutive days?

E(X1+X2) = E(X1) + E(X2) = $13.50 + $13.50 = $27

\[ SD(X_1 + X_2) = \sqrt{Var(X_1 + X_2)} = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{(\$7)^2 + (\$7)^2} = \sqrt{\$^2 49 + \$^2 49} = \sqrt{\$^2 98} \approx \$9.90 \]

SD(X1 + X2) ≠ SD(X1) + SD(X2) = $7 + $7 = $14

Example: meal plans (cont.)

Jumbo plan for football players

Y = daily amount spent

E(Y) = $24.75, SD(Y) = $9.50

Amount by which football player’s spending exceeds regular student spending is Y-X

E(Y-X) = E(Y) – E(X) = $24.75 - $13.50 = $11.25

\[ SD(Y - X) = \sqrt{Var(Y - X)} = \sqrt{Var(Y) + Var(X)} = \sqrt{(\$9.50)^2 + (\$7)^2} = \sqrt{\$^2 90.25 + \$^2 49} = \sqrt{\$^2 139.25} \approx \$11.80 \]

SD(Y - X) ≠ SD(Y) - SD(X) = $9.50 - $7 = $2.50

For random variables, X+X≠2X

Let X be the annual payout on a life insurance policy. From mortality tables E(X)=$200 and SD(X)=$3,867.

1) If the payout amounts are doubled, what are the new expected value and standard deviation?

Double payout is 2X.

E(2X) = 2E(X) = 2*$200 = $400

SD(2X) = 2SD(X) = 2*$3,867 = $7,734

2) Suppose insurance policies are sold to 2 people. The annual payouts are X1 and X2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout?

E(X1 + X2) = E(X1) + E(X2) = $200 + $200 = $400

\[ SD(X_1 + X_2) = \sqrt{Var(X_1 + X_2)} = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{(3867)^2 + (3867)^2} = \sqrt{14{,}953{,}689 + 14{,}953{,}689} = \sqrt{29{,}907{,}378} = \$5{,}468.76 \]

The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.
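The contrast between 2X and X1 + X2 comes down to two different SD rules, which the short sketch below places side by side. It assumes, as the example does, that the two policyholders are independent; the variable names are illustrative.

```python
import math

mean_x = 200.0   # E(X), annual payout per policy ($)
sd_x = 3867.0    # SD(X) ($)

# Doubling one policy's payout: SD(2X) = 2 * SD(X)
mean_double = 2 * mean_x
sd_double = 2 * sd_x

# Two independent policies: variances add, then take the square root
mean_two = mean_x + mean_x
sd_two = math.sqrt(sd_x**2 + sd_x**2)

print(mean_double, sd_double)       # 400.0 7734.0
print(mean_two, round(sd_two, 2))   # 400.0 5468.76
```

The expected payouts agree, but the SD for two independent policies is smaller by a factor of √2, since independent outcomes partly offset one another.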
