Unit 6: Probability

Upload: aron-rodgers

Post on 18-Jan-2016


Page 1: Unit 6: Probability. Math? Ugh! Why bother? You hear on TV a gubernatorial candidate has a 5% lead over her opponent. Should you believe she’ll win? You’ve

Unit 6: Probability


Math? Ugh! Why bother?

• You hear on TV a gubernatorial candidate has a 5% lead over her opponent. Should you believe she’ll win?

• You’ve got sample data. How far might your average (or whatever) be off from the population average?

• You’ve got experimental data. It doesn’t seem to match the prevailing theory. How likely is it that you’ve found something new?


Probability

• Start with finite probability (“frequency theory”), to understand rules
– finite number of possible results in “sample space”, usually equally likely

• Move to continuous probability, to include “normal curve”, etc.
– sample space is all numbers [maybe in some interval]


Finite probabilities

• “Event”: some set of possible “outcomes”, i.e., values in the “sample space”

• Probability of an event (with equally likely outcomes): # of outcomes in the event (called “successful outcomes”) / # of all possible outcomes (expressed as fraction or %)
– Ex: Roll a die. P(getting < 3) = 2/6 = 1/3.

• Idea: P(A) = fraction of times A would occur if experiment is repeated many times

• Equal likelihood of outcomes is important
– Flip 2 quarters: TT or HH twice as likely as HT?
– Similarly, roll 2 dice: 36 equally likely outcomes, and 4,4 seems only half as likely as 3,5 (unless dice have different colors)
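The point about equal likelihood can be checked by listing all 36 ordered outcomes for two dice. This is a minimal Python sketch, assuming the dice are treated as distinguishable; the helper name `prob` is illustrative and not from the slides.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two distinguishable dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) = (# outcomes in the event) / (# all outcomes), outcomes equally likely."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

# "4,4" can happen one way; "a 3 and a 5" can happen two ways: (3,5) and (5,3).
p_double_four = prob(lambda o: o == (4, 4))
p_three_and_five = prob(lambda o: sorted(o) == [3, 5])

print(p_double_four)     # 1/36
print(p_three_and_five)  # 1/18
```

Counting ordered pairs is what makes the 36 outcomes equally likely; counting unordered pairs would not.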


Immediate results

• P(A^c) = 1 – P(A) (A^c is the set of all outcomes not in A, the “complement” of A)
– Ex: Roll one die. P(a four) = 1/6, so P(not a four) = 1 – 1/6 = 5/6
– Ex: Flip a coin 5 times. P(no heads) = 1/32, so P(at least one head) = 31/32.

• 0 ≤ P(A) ≤ 1

Hard example: In 5-card draw, P(4-of-a-kind)
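A hedged sketch of both calculations on this slide follows. The four-of-a-kind count uses binomial coefficients (introduced later in the unit) through Python's `math.comb`, and assumes a standard 52-card deck with all 5-card hands equally likely. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Complement rule: P(at least one head in 5 flips) = 1 - P(no heads).
p_no_heads = Fraction(1, 2) ** 5
p_at_least_one_head = 1 - p_no_heads

# Four of a kind: pick the rank (13 ways), take all four cards of that rank,
# then pick the fifth card from the remaining 48.
four_of_a_kind_hands = 13 * 48
all_hands = comb(52, 5)
p_four_of_a_kind = Fraction(four_of_a_kind_hands, all_hands)

print(p_at_least_one_head)  # 31/32
print(p_four_of_a_kind)     # 1/4165
```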


Example: The first box model

• A box contains tickets with the numbers 1, 1, 4, 4, 4, 7, 7, 7, 7, 12. Pick a ticket at random.

• P(1) = 2/10 = 20%

• P(7) = 4/10 = 40%

• P(not 7) = 60%

• P(1 or 12) = 3/10 = 30%

• P(even) = 4/10 = 40%
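The box model translates directly into counting. A small Python sketch, assuming each ticket is equally likely to be drawn (`prob` is an illustrative helper name):

```python
from fractions import Fraction

box = [1, 1, 4, 4, 4, 7, 7, 7, 7, 12]

def prob(event):
    """Fraction of tickets in the box for which the event holds."""
    return Fraction(sum(1 for t in box if event(t)), len(box))

p_one = prob(lambda t: t == 1)
p_seven = prob(lambda t: t == 7)
p_not_seven = 1 - p_seven
p_one_or_twelve = prob(lambda t: t in (1, 12))
p_even = prob(lambda t: t % 2 == 0)

print(p_one, p_seven, p_not_seven, p_one_or_twelve, p_even)
```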


Boolean operations: And (I)

• Conditional probability P(A|B) (“probability of A given B”): Sample space is restricted to B (i.e., we know B has occurred). Now compute probability of A.
– Ex: Pick a card from a (straight) deck. (Face cards don’t include aces.)
  • P(♥) = 13/52 = 1/4
  • P(♥ | face card) = 3/12 = 1/4
– Ex: Two cards dealt face down:
  • P(2nd is deuce) = 4/52 = 1/13
  • P(2nd is deuce | 1st is deuce) = 3/51 = 1/17
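"Restrict the sample space to B, then count A" can be written almost word for word in code. The sketch below assumes a standard 52-card deck and treats the two dealt cards as an ordered pair; `cond_prob` and the other names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["♠", "♥", "♦", "♣"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def cond_prob(space, a, b=lambda x: True):
    """P(A | B): keep only outcomes where B holds, then count those in A."""
    given = [x for x in space if b(x)]
    return Fraction(sum(1 for x in given if a(x)), len(given))

is_heart = lambda card: card[1] == "♥"
is_face = lambda card: card[0] in ("J", "Q", "K")

p_heart = cond_prob(deck, is_heart)
p_heart_given_face = cond_prob(deck, is_heart, is_face)

# Two cards dealt in order: all 52 * 51 ordered pairs are equally likely.
deals = list(permutations(deck, 2))
first_deuce = lambda deal: deal[0][0] == "2"
second_deuce = lambda deal: deal[1][0] == "2"

p_second_deuce = cond_prob(deals, second_deuce)
p_second_given_first = cond_prob(deals, second_deuce, first_deuce)

print(p_heart, p_heart_given_face)           # 1/4 1/4
print(p_second_deuce, p_second_given_first)  # 1/13 1/17
```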


Boolean operations: And (II)

• Multiplication rule: P(A and B) = P(A)•P(B|A)

• “Independent events”: P(B|A) = P(B)
– So with independent events, the multiplication rule becomes P(A and B) = P(A)•P(B)
– Remark: If A is independent of B, then B is independent of A.


Examples

• Boxes of tickets
– Box 1: A1,A2,A2,B1,B1,B2 : letters and numbers are not independent
– Box 2: A1,A2,A2,B1,B2,B2 : letters and numbers are independent

• From box of 1, 1, 4, 4, 4, 7, 7, 7, 7, 12, pick two tickets:
– with replacement: P(two 4’s) = (3/10)(3/10) and P(1 then 7) = (2/10)(4/10)
– without replacement: P(two 4’s) = (3/10)(2/9) and P(1 then 7) = (2/10)(4/9)

• Die thrown 4 times
– Which is more likely, 3333 or 1436?
– What is P(all 4 scores ≤ 2)?
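The ticket and die examples can be verified by brute-force enumeration. This is a sketch under the assumption that every ordered draw (or sequence of rolls) is equally likely. Tickets are drawn by position so that equal-numbered tickets stay distinct, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

box = [1, 1, 4, 4, 4, 7, 7, 7, 7, 12]

def prob(outcomes, event):
    """Fraction of equally likely outcomes for which the event holds."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

positions = range(len(box))
# With replacement: the same position may be drawn twice (100 ordered draws).
with_repl = [(box[i], box[j]) for i, j in product(positions, repeat=2)]
# Without replacement: two different positions (90 ordered draws).
without_repl = [(box[i], box[j]) for i, j in permutations(positions, 2)]

two_fours = lambda d: d == (4, 4)
one_then_seven = lambda d: d == (1, 7)

print(prob(with_repl, two_fours), prob(with_repl, one_then_seven))
print(prob(without_repl, two_fours), prob(without_repl, one_then_seven))

# Die thrown 4 times: 6**4 equally likely sequences.
rolls = list(product(range(1, 7), repeat=4))
p_3333 = prob(rolls, lambda r: r == (3, 3, 3, 3))
p_1436 = prob(rolls, lambda r: r == (1, 4, 3, 6))
p_all_at_most_2 = prob(rolls, lambda r: max(r) <= 2)

print(p_3333, p_1436, p_all_at_most_2)
```

Any one specific sequence of four rolls has the same probability, which is why 3333 and 1436 come out equal.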


Boolean operations: And (III)

• Ex: Caucasian woman with blonde ponytail snatched purse, jumped into yellow car driven by black man with mustache and beard. Man and woman fitting description arrested. At trial, prosecutor says probabilities are: yellow car, 1/10; man with mustache, 1/4; woman with ponytail, 1/10; woman with blonde hair, 1/3; black man with beard, 1/10; interracial couple in car, 1/1000. So chances are 1/(10•4•10•3•10•1000) = 1/12,000,000 that they are the wrong people. (???)


Boolean operations: Or (I)

• Addition rule: P(A or B (or both)) = P(A) + P(B) – P(A and B)

• “Mutually exclusive events”: P(A and B) = 0; i.e., if one occurs, the other cannot
– With mutually exclusive events, addition rule becomes P(A or B) = P(A) + P(B)


Examples

• Pick a card.
– P(A or K) = 4/52 + 4/52 = 2/13
– P(A or ♠) = 4/52 + 13/52 – 1/52 = 4/13

• Tickets 1-100 in a box, draw one.
– P(≤ 10 or ≥ 90) = 10/100 + 11/100 = 21/100
– P(≤ 10 or div by 5) = 10/100 + 20/100 – 2/100 = 7/25
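The ticket examples give a quick check of the addition rule against a direct count. A minimal sketch, assuming each of the 100 tickets is equally likely (function names are illustrative):

```python
from fractions import Fraction

tickets = range(1, 101)

def prob(event):
    """Fraction of tickets for which the event holds."""
    return Fraction(sum(1 for t in tickets if event(t)), len(tickets))

def p_or(a, b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return prob(a) + prob(b) - prob(lambda t: a(t) and b(t))

low = lambda t: t <= 10
high = lambda t: t >= 90
div5 = lambda t: t % 5 == 0

print(p_or(low, high))  # 21/100 (mutually exclusive, nothing to subtract)
print(p_or(low, div5))  # 7/25   (tickets 5 and 10 are counted once, not twice)
```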


Expected values

• Ex 1: Flip a coin 10 times, paying $1 to play each time. You win $.50 (plus your $1) if you get a head. How much should you expect to win?

• Ex 2: Roll two dodecahedral (12-sided) dice. You win $10 (plus your payment to play) if you get doubles. How much should you pay to play for a fair game?
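One way to work both examples is to sum (net gain) × (probability) over the outcomes. The sketch below rests on an assumed reading of the rules, which the slide does not spell out: the stake is lost when you do not win, and returned along with the prize when you do. `expected_value` is an illustrative helper.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of value * probability over (value, probability) pairs."""
    return sum(value * p for value, p in outcomes)

# Ex 1 (assumed reading): heads nets +$0.50, tails nets -$1.
half = Fraction(1, 2)
per_flip = expected_value([(Fraction(1, 2), half), (Fraction(-1), half)])
ten_flips = 10 * per_flip

# Ex 2: two 12-sided dice give 144 equally likely rolls, 12 of them doubles.
p_doubles = Fraction(12, 144)
# A fair game has expected net gain 0: 10 * p - price * (1 - p) = 0.
fair_price = 10 * p_doubles / (1 - p_doubles)

print(per_flip, ten_flips)  # -1/4 -5/2
print(fair_price)           # 10/11
```

Under that reading, Ex 1 loses $2.50 on average over 10 flips, and Ex 2 is fair at a price of about 91 cents.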


Two similar examples:

• From text: Paradox of the Chevalier de Méré: P(at least 1 ace in 4 rolls of die) > P(at least 1 double-ace in 24 rolls of 2 dice)

• Birthday problem: With 30 people in a room, how likely is it that at least two have the same birth date?
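Both examples are easiest through the complement: compute the chance that the event never happens, then subtract from 1. A short sketch for the de Méré comparison, assuming fair dice and independent rolls (variable names are illustrative):

```python
from fractions import Fraction

# P(at least one ace in 4 rolls) = 1 - P(no ace in any of the 4 rolls).
p_ace_in_4 = 1 - Fraction(5, 6) ** 4

# P(at least one double-ace in 24 rolls of two dice) = 1 - (35/36)**24.
p_double_ace_in_24 = 1 - Fraction(35, 36) ** 24

print(float(p_ace_in_4))          # a little over 1/2
print(float(p_double_ace_in_24))  # a little under 1/2
```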


The Birthday problem

# people   P(no match)   P(match)
2          0.99726027    0.00273973
3          0.99179583    0.00820417
4          0.98364409    0.01635591
5          0.97286443    0.02713557
6          0.95953752    0.04046248
7          0.94376430    0.05623570
8          0.92566471    0.07433529
9          0.90537617    0.09462383
10         0.88305182    0.11694818
11         0.85885862    0.14114138
12         0.83297521    0.16702479
13         0.80558972    0.19441028
14         0.77689749    0.22310251
15         0.74709868    0.25290132
16         0.71639599    0.28360401
17         0.68499233    0.31500767
18         0.65308858    0.34691142
19         0.62088147    0.37911853
20         0.58856162    0.41143838
21         0.55631166    0.44368834
22         0.52430469    0.47569531
23         0.49270277    0.50729723
24         0.46165574    0.53834426
25         0.43130030    0.56869970
26         0.40175918    0.59824082
27         0.37314072    0.62685928
28         0.34553853    0.65446147
29         0.31903146    0.68096854
30         0.29368376    0.70631624
31         0.26954537    0.73045463
32         0.24665247    0.75334753
33         0.22502815    0.77497185
34         0.20468314    0.79531686
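The table can be regenerated with the multiplication rule: person 1 can have any birthday, person 2 must avoid 1 day, person 3 must avoid 2 days, and so on. This sketch assumes 365 equally likely birthdays (no leap day) and independent people; `p_no_match` is an illustrative name.

```python
def p_no_match(people, days=365):
    """P(all birthdays differ) = (365/365)(364/365)...((365-people+1)/365)."""
    p = 1.0
    for already_taken in range(people):
        p *= (days - already_taken) / days
    return p

for n in (2, 10, 23, 30):
    no_match = p_no_match(n)
    print(n, round(no_match, 8), round(1 - no_match, 8))
```

The match probability first passes 1/2 at 23 people.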


Tree diagram

Flip a coin, then roll a die, list all alternatives
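Since the diagram itself did not survive in this transcript, here is a small sketch that lists the same alternatives: each coin result branches into the six die results, giving 2 × 6 = 12 leaves. The text layout is only an illustration of a tree.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Print one branch per coin result, with a sub-branch per die result.
for c in coin:
    print(c)
    for d in die:
        print("   ", c, d)

# The leaves of the tree are the (coin, die) pairs.
alternatives = list(product(coin, die))
print(len(alternatives))  # 12
```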


The Monty Hall Problem

(From Marilyn vos Savant’s column)

Game show: Three doors hide a car and 2 goats. Contestant picks a door. Host opens one of the other doors to reveal a goat. Contestant then may switch to the other unopened door. Is it better to stay with the original choice or to switch; or doesn’t it matter?

Marilyn’s answer: Switch!

Many respondents: Doesn’t matter. (“You’re the goat!”)
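The two strategies can be compared by enumerating every equally likely (car location, first pick) pair. This sketch assumes the host always opens a goat door other than the contestant's pick. When the host has two goat doors to choose from, the code takes the first; that choice does not change who wins. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

doors = [0, 1, 2]

def win_probabilities():
    """Return (P(win by staying), P(win by switching)) over all 9 games."""
    games = list(product(doors, doors))  # (car, first pick), equally likely
    stay_wins = switch_wins = 0
    for car, pick in games:
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in doors if d != pick and d != car)
        # Switching means taking the one remaining unopened door.
        switched = next(d for d in doors if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    return Fraction(stay_wins, len(games)), Fraction(switch_wins, len(games))

p_stay, p_switch = win_probabilities()
print(p_stay, p_switch)  # 1/3 2/3
```

Staying wins only when the first pick was the car (1 time in 3); switching wins in every other case.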


Tree diagram of Stayer’s possible games


Binomial coefficients (I)

• How many ways are there to choose k things (without regard to order) from a set of n things?
– How many ways are there to choose 3 club officers from a set of 5 to get funded for a trip to a convention?
– How many ways are there to choose 2 cards out of the 4 of a given rank to form a pair?
– How many ways are there to choose, out of 8 replications of an experiment, 6 to be successful?


Ways to arrange 3 letters taken from {a,b,c,d,e}:

abc abd abe acb acd ace adb adc ade aeb
aec aed bac bad bae bca bcd bce bda bdc
bde bea bec bed cab cad cae cba cbd cbe
cda cdb cde cea ceb ced dab dac dae dba
dbc dbe dca dcb dce dea deb dec eab eac
ead eba ebc ebd eca ecb ecd eda edb edc

Ways to arrange 3 given letters: a, b, c

abc acb bac bca cab cba
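Both lists can be generated with Python's `itertools`, which also shows the relationship between them: 60 ordered arrangements, 6 orderings of each set of three letters, so 60 / 6 = 10 unordered selections. The variable names are illustrative.

```python
from itertools import combinations, permutations

letters = "abcde"

# Ordered arrangements of 3 letters taken from the 5.
arrangements = ["".join(p) for p in permutations(letters, 3)]
# Orderings of 3 given letters.
orderings = ["".join(p) for p in permutations("abc")]
# Unordered selections of 3 letters from the 5.
selections = ["".join(c) for c in combinations(letters, 3)]

print(len(arrangements), len(orderings), len(selections))  # 60 6 10
print(selections)
```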


Binomial coefficients (II)

• Step one: How many ways are there to choose k things in order from a set of n things?
– n(n-1)(n-2)...(n-k+1)

• Step two: How many ways are there to order k given things?
– k(k-1)(k-2)...1

• Step three: Divide.
– C(n,k) = [n(n-1)(n-2)...(n-k+1)]/[k(k-1)(k-2)...1]

• Notation: n! = n(n-1)(n-2)...1 [1 if n=0]
– C(n,k) = n!/[k! (n-k)!]
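The three steps map onto three short functions. This is a sketch with illustrative function names (Python's built-in `math.comb` computes the same quantity), applied to the three counting questions posed earlier in the unit.

```python
from math import factorial, prod

def ordered_choices(n, k):
    """Step one: n(n-1)...(n-k+1) ways to choose k things in order."""
    return prod(range(n - k + 1, n + 1))

def choose(n, k):
    """Step three: divide by the k! orderings of the chosen things (step two)."""
    return ordered_choices(n, k) // factorial(k)

def choose_factorial(n, k):
    """Same count via the factorial formula n!/[k! (n-k)!]."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(choose(5, 3))  # 3 club officers from 5: 10
print(choose(4, 2))  # 2 cards from the 4 of a rank: 6
print(choose(8, 6))  # 6 successes among 8 replications: 28
```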


Binomial coefficients (I) revisited

• How many ways are there to choose k things (without regard to order) from a set of n things?
– How many ways are there to choose 3 club officers from a set of 5 to get funded for a trip to a convention?
– How many ways are there to choose 2 cards out of the 4 of a given rank to form a pair?
– How many ways are there to choose, out of 8 replications of an experiment, 6 to be successful?


Binomial probabilities (I)

• General question: Suppose an experiment is carried out n times under the same conditions. A given event (set of outcomes) A has probability p. What is the probability that A occurs exactly k times out of the n repetitions?
– Ex: Roll a die 5 times. Probability of getting exactly three 4’s?
  • There are exactly C(5,3) = 10 patterns of three 4’s and 2 non-4’s
  • Each has probability (1/6)^3 (5/6)^2
  • So the answer is 10 (1/6)^3 (5/6)^2

• In general, the answer is C(n,k) p^k (1-p)^(n-k)
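The general formula is a one-liner in code. A minimal sketch, assuming independent repetitions with the same p each time (`binomial_pmf` is an illustrative name), applied to the die example:

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, k, p):
    """P(exactly k successes in n repetitions) = C(n,k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Roll a die 5 times: exactly three 4's.
p_three_fours = binomial_pmf(5, 3, Fraction(1, 6))

print(p_three_fours)         # 125/3888
print(float(p_three_fours))  # about 0.032
```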


• Requirements for binomial probability:
– (1) Experiment has 2 complementary outcomes.
– (2) On repeated trials, probabilities don’t change.

• Though the formula gives the probability of exactly k “successes” out of n repetitions of an experiment, we will usually use it for counting at least k “successes” out of n
– So we have to add up the probabilities for k and k+1 and k+2 and ... and n.


Examples of binomial distributions (I)

• In a family of 5 kids, P(exactly 3 girls)

• Roll a die 15 times, P(exactly 4 twos)

• Roll two dice 10 times, P(at most two sums of 5)


Examples of binomial distributions (II)

• Feed vitamin A to one each of 10 pairs of rats, then all run a maze. In 7 pairs, the A-rat was faster. If vitamin A was no help (i.e., each rat was equally likely to be faster), how likely is it that, just by chance, A-rat was faster in at least 7 pairs?

• In a county that is 40% Caucasian, how likely is it that a jury pool of 20 people has 18 or more Caucasians?
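Both questions ask for "at least k" successes, so the binomial probabilities for k, k+1, ..., n are added up. The sketch below assumes independent trials with a fixed success probability; for the jury pool, that means treating each person as an independent draw from the county. Function names are illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, k, p):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def at_least(n, k, p):
    """P(k or more successes) = sum of the pmf for k, k+1, ..., n."""
    return sum(binomial_pmf(n, j, p) for j in range(k, n + 1))

# Rats: A-rat faster in at least 7 of 10 pairs, if each pair is a fair coin flip.
p_rats = at_least(10, 7, Fraction(1, 2))

# Jury pool: 18 or more Caucasians out of 20 when p = 0.40.
p_jury = at_least(20, 18, Fraction(2, 5))

print(p_rats, float(p_rats))  # 11/64, about 0.17
print(float(p_jury))          # on the order of 5 in a million
```

A result like 7 of 10 happens by chance fairly often, while the jury pool outcome would be extremely rare under random selection.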


Math stuff about binomial coefficients

• They’re called that because they are the coefficients of x and y in the expansion of (x+y)^n:
– C(n,0)x^n + C(n,1)x^(n-1)y + C(n,2)x^(n-2)y^2 + ... + C(n,n-1)xy^(n-1) + C(n,n)y^n

• For small n, compute C(n,k) with “Pascal’s triangle”: 1’s in first row and column, then each entry is sum of the one above and the one to the left
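A sketch of Pascal's triangle in its usual row-by-row layout, where row n holds C(n,0) through C(n,n) and each interior entry is the sum of the two neighbouring entries in the row before it. This is the same table as the square layout with 1's along the edges, just rotated. `pascal_rows` is an illustrative name.

```python
from math import comb

def pascal_rows(n_max):
    """Return rows 0..n_max of Pascal's triangle; rows[n][k] equals C(n, k)."""
    rows = [[1]]
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # Edges are 1; each interior entry adds two adjacent entries of the previous row.
        interior = [prev[k - 1] + prev[k] for k in range(1, n)]
        rows.append([1] + interior + [1])
    return rows

rows = pascal_rows(5)
for row in rows:
    print(row)
```

The last row printed, 1 5 10 10 5 1, gives the coefficients in the expansion of (x+y)^5.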


(More from Marilyn vos Savant’s column)

Suppose we assume that 5% of the people are drug users. A drug test is 95% accurate (i.e., it produces a correct result 95% of the time, whether the person is using drugs or not). A randomly chosen person tests positive. Is the person highly likely to be a drug user?

Marilyn’s answer: Given your conditions, once the person has tested positive, you may as well flip a coin to determine whether she or he is a drug user. The chances are only 50-50. But the assumptions, the make-up of the test group and the true accuracy of the tests themselves are additional considerations.

(To see this, suppose the population is 10,000 people; compare numbers of false positives and true positives.)
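The suggested head count for 10,000 people can be carried out directly. This sketch uses only the column's stated assumptions (5% users, 95% accuracy for users and non-users alike); variable names are illustrative.

```python
from fractions import Fraction

population = 10_000
users = population * 5 // 100        # 5% are drug users
non_users = population - users

true_positives = users * 95 // 100   # users correctly flagged
false_positives = non_users * 5 // 100  # non-users wrongly flagged

# Among everyone who tests positive, what fraction really are users?
p_user_given_positive = Fraction(true_positives, true_positives + false_positives)

print(true_positives, false_positives)  # 475 475
print(p_user_given_positive)            # 1/2
```

The false positives from the large non-user group exactly balance the true positives from the small user group, hence the 50-50 answer.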


Drug [disease] testing probabilities

Drug [disease] present?   Test positive    Test negative
Yes                       “Sensitivity”    False negative
No                        False positive   “Specificity”

Ex: Suppose the Bovine test for lactose abuse has a sensitivity of 0.99 and a specificity of 0.95; and that 7% of a certain population abuses lactose. If a person tests positive on the Bovine test, how likely is it that (s)he really abuses lactose?

         pos    neg    Sum
abuser   .99    .01    1
clean    .05    .95    1


[Plot, assuming 7% of population is really positive: x = sensitivity, y = specificity, z = P(pos test => pos). Curve: x = .99. Points: x = .99; y = .95, .90; z = .6, .42]
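The plotted quantity z can be computed from prevalence, sensitivity and specificity by comparing true positives with all positives. This is a sketch, not the slides' own derivation; the function name `positive_predictive_value` is borrowed from standard terminology and is not used in the slides.

```python
from fractions import Fraction

def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(really positive | positive test) = true positives / all positives."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

prevalence = Fraction(7, 100)
sensitivity = Fraction(99, 100)

# The two plotted specificities: .95 (the lactose example) and .90.
z_95 = positive_predictive_value(prevalence, sensitivity, Fraction(95, 100))
z_90 = positive_predictive_value(prevalence, sensitivity, Fraction(90, 100))

print(float(z_95))
print(float(z_90))
```

Even with a sensitivity of 0.99, dropping the specificity from .95 to .90 lowers the chance that a positive test is a true positive noticeably, because false positives come from the much larger clean group.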


Counting dragonflies

(thanks to Profs. V. MacMillen and R. Arnold)


Only two pairs

• 30 censuses altogether, 17 with only two pairs

• Of 17, 12 had both in same plot

• Do they prefer to lay eggs in proximity?
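One hedged way to approach the question is a binomial calculation. It assumes there are three plots (the census tables use columns P1, P2, P3), that with no preference each pair independently picks a plot at random with equal chance, and that the 17 censuses are independent. None of these assumptions is stated on the slide, and the names below are illustrative.

```python
from fractions import Fraction
from math import comb

def at_least(n, k, p):
    """P(k or more successes in n independent trials with success probability p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Assumed: 3 equally attractive plots, so two pairs land in the same plot
# with probability 3 * (1/3) * (1/3) = 1/3.
p_same_plot = Fraction(1, 3)

# Chance of seeing 12 or more "same plot" censuses out of 17 with no preference.
p_value = at_least(17, 12, p_same_plot)

print(float(p_value))
```

Under these assumptions the observed 12 of 17 would be very unlikely by chance alone, which is consistent with a preference for laying eggs in proximity.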


Censuses with >2 pairs

P1 P2 P3
0  0  3
0  3  0
3  0  0
1  0  2
1  2  0
3  0  0
1  2  0
2  1  0
1  2  0
0  0  3
1  2  0

P1 P2 P3
4  0  0
0  0  4
3  1  0
4  0  0
0  4  0
0  2  2

P1 P2 P3
3  2  0
0  0  5
0  3  2
3  1  1
2  3  0
2  3  0
1  3  1
1  0  4
1  2  2
0  1  4
0  0  5

Up to 12 at the same time


With 3 pairs