
1 Dec 2011 COMP80131-SEEDSM8 1

Scientific Methods 1

Barry & Goran

‘Scientific evaluation, experimental design & statistical methods’

COMP80131

Lecture 8: Statistical Methods-Significance tests & confidence limits

www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131


Introduction

• Statistical significance testing has so far been applied on the assumption of either:

(1) a discrete population with a binomial distribution, or

(2) a continuous population with a known normal pdf & known std.

• Before proceeding further, take a quick look at a few more prob distributions & pdfs.

• Significance testing can be adapted to any of these.


Exponential pdf

• Lifetimes, e.g. of light bulbs, follow an exponential distribution:

$$\mathrm{pdf}(x) = \begin{cases} (1/\mu)\, e^{-x/\mu} & : x \ge 0 \\ 0 & : x < 0 \end{cases}$$

[Figure: exponential pdf for mean 2; x axis from 0 to 10, pdf axis from 0 to 0.5]

mu = 2; % named mu rather than mean, so that MATLAB's mean function is not shadowed
x = 0:0.1:10; y = exppdf(x,mu); plot(x,y);

Mean = μ

Std = μ also
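As a rough numerical check of the statement that the mean and the std of the exponential distribution are the same value, the pdf can be integrated directly. This is only a sketch: the value mu = 2 is an assumed example, and exppdf requires the Statistics Toolbox.

```matlab
% Numerical check of the exponential pdf (mu = 2 is an assumed example value)
mu = 2;
f = @(x) exppdf(x, mu);                          % pdf(x) = (1/mu)*exp(-x/mu) for x >= 0
area = integral(f, 0, Inf);                      % total probability, expected close to 1
m = integral(@(x) x .* f(x), 0, Inf);            % mean, expected close to mu
v = integral(@(x) (x - m).^2 .* f(x), 0, Inf);   % variance, expected close to mu^2
fprintf('area = %.4f, mean = %.4f, std = %.4f\n', area, m, sqrt(v));
```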


Poisson Distribution

• For applications that involve counting the number of times a random event occurs in a given amount of time, e.g. the number of people walking into a store in an hour.

$$\mathrm{prob}(x) = \frac{\lambda^{x}\, e^{-\lambda}}{x!} \quad \text{where } x \text{ is a non-negative integer}$$

• λ is both the mean & the variance of the distribution.

• Poisson & exponential distributions are related.

• If the number of counts follows a Poisson distribution, then the interval between individual counts follows an exponential distribution.

• As λ gets larger, the Poisson pdf → normal with µ = λ, σ² = λ.
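The claim that λ is both the mean and the variance can be checked by summing over the Poisson probabilities. A minimal sketch, assuming λ = 5 as an example value and truncating the sum at x = 100, where the remaining tail is negligible:

```matlab
% Mean and variance of a Poisson distribution (lambda = 5 is an assumed example)
lambda = 5;
x = 0:100;                   % truncation point chosen so that the tail is negligible
p = poisspdf(x, lambda);     % prob(x) = lambda^x * exp(-lambda) / x!
m = sum(x .* p);             % mean
v = sum((x - m).^2 .* p);    % variance
fprintf('sum = %.6f, mean = %.6f, variance = %.6f\n', sum(p), m, v);
```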


Poisson distributions in MATLAB

x = 0:16; y = poisspdf(x,5); stem(x,y);

[Figure: stem plot of prob(x) against x for λ = 5; x from 0 to 16, prob(x) from 0 to 0.18]

x = 0:60; y = poisspdf(x,20); stem(x,y);

[Figure: stem plot of prob(x) against x for λ = 20; x from 0 to 60, prob(x) from 0 to 0.09]
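To see how the Poisson probabilities move towards a normal pdf with µ = λ and σ² = λ as λ grows, the largest gap between the two can be computed for a few values of λ. The values 5, 20 and 100 are assumed examples, not taken from the slides.

```matlab
% Largest gap between Poisson probabilities and the matching normal pdf
x = 0:300;
lambdas = [5 20 100];        % assumed example values
d = zeros(size(lambdas));
for k = 1:numel(lambdas)
    lambda = lambdas(k);
    d(k) = max(abs(poisspdf(x, lambda) - normpdf(x, lambda, sqrt(lambda))));
end
disp([lambdas; d]);          % second row should shrink as lambda grows
```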


Chi-squared distribution

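Given V independent, normally distributed random variables X1, X2, …, XV, all with mean = 0 & std = 1, let

$$\chi^2(V) = X_1^2 + X_2^2 + \dots + X_V^2$$

Then the pdf of samples x of χ² is:

$$\mathrm{pdf}(x) = \begin{cases} \dfrac{x^{V/2-1}\, e^{-x/2}}{2^{V/2}\, \Gamma(V/2)} & : x \ge 0 \\ 0 & : x < 0 \end{cases}$$

The ‘Gamma function’ Γ(x) is a generalisation of the factorial to non-integers: Γ(n) = (n−1)! for a positive integer n.

This pdf tells us about the variance of a population.

If s = std of a sample of V observations (taken about the known pop-mean) of a normally distributed pop with std σ:

$$V s^2 / \sigma^2 \sim \chi^2(V)$$

(When s is instead computed about the sample mean of n observations, V = n − 1.)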
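The chi-squared pdf formula can be compared against MATLAB's chi2pdf as a sanity check. A small sketch with V = 4 as an assumed example; gamma is MATLAB's built-in Gamma function.

```matlab
% Chi-squared pdf written out explicitly, compared with chi2pdf
V = 4;                                   % assumed degrees of freedom
x = 0.5:0.5:15;
y_formula = x.^(V/2 - 1) .* exp(-x/2) ./ (2^(V/2) * gamma(V/2));
y_builtin = chi2pdf(x, V);
maxdiff = max(abs(y_formula - y_builtin));
fprintf('largest difference = %.2e\n', maxdiff);
g5 = gamma(5);                           % Gamma(5) = 4! = 24
```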


Plot chi2 pdf with V = 4

[Figure: chi-squared pdf for V = 4; x from 0 to 15, pdf from 0 to 0.2]

x = 0:0.2:15; y = chi2pdf(x,4); plot(x,y)


Student’s t-distribution pdf

$$\mathrm{pdf}(t) = \frac{\Gamma((V+1)/2)}{\sqrt{V\pi}\; \Gamma(V/2)} \left(1 + t^2/V\right)^{-(V+1)/2} \quad : -\infty < t < \infty$$

Depends on a single parameter V (degrees of freedom).

As V → ∞, t-pdf approaches standard normal distribution.

If x is a random sample of size n from a normal distribution with mean μ, then the t-statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$ (with x̄ = sample-mean & s = sample-stdev)

has Student's t-pdf with V = n - 1 degrees of freedom.


Compare t-pdf(V=5) with normal

[Figure: t-pdf with V = 5 (blue) and standard normal pdf (red); x from -5 to 5, pdf from 0 to 0.4]

x = -5:0.1:5;y = tpdf(x,5);z = normpdf(x,0,1);plot(x,y,'b',x,z,'r');
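The same comparison can be repeated for larger V to see the t-pdf closing in on the standard normal pdf. The values of V used here are assumed examples.

```matlab
% Largest gap between the t-pdf and the standard normal pdf for growing V
x = -5:0.1:5;
V = [5 30 1000];             % assumed example degrees of freedom
d = zeros(size(V));
for k = 1:numel(V)
    d(k) = max(abs(tpdf(x, V(k)) - normpdf(x, 0, 1)));
end
disp([V; d]);                % second row should shrink as V grows
```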


MATLAB functions for t-dist

• pdf for t-distribution with V degrees of freedom: y = tpdf(t,V);

(With samples with n values, V = n-1)

• Cumulative df with V degrees of freedom: p = tcdf(t,V). Prob of rand var being ≤ t

• Complementary df (area under ‘tail’ from t to ∞): p = 1 - tcdf(t,V). Prob of rand var being > t
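A short illustration of the cumulative and complementary forms, using V = 5 and t = 2.015 as assumed example values (2.015 is the familiar table value that leaves about 5% in the upper tail for 5 degrees of freedom):

```matlab
% Cumulative and complementary probabilities for the t-distribution
V = 5;                       % assumed degrees of freedom (sample of n = 6 values)
t = 2.015;                   % assumed example value
p_le = tcdf(t, V);           % prob of rand var being <= t
p_gt = 1 - tcdf(t, V);       % prob of rand var being > t (upper tail)
p_mid = tcdf(0, V);          % the t-pdf is symmetric about 0, so this is 0.5
fprintf('P(<= t) = %.4f, P(> t) = %.4f\n', p_le, p_gt);
```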


Inverse-cdf in MATLAB

• Inverse of cumulative distrib function:

• If p = tcdf(t,V) then t = tinv(p,V)

Value of t such that prob of rand var being ≤ t is p

• If p = normcdf(z,m,σ) then z = norminv(p,m,σ)

Value of z such that prob of rand var being ≤ z is p

Complementary version:

t = tinv(1-p,V)

Value of t such that prob of rand var being > t is p.

Similarly for complementary version of norminv
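The inverse functions can be checked by a round trip through the cdf. In this sketch p = 0.95 and V = 9 are assumed example values.

```matlab
% Round trip between cdf and inverse cdf
p = 0.95; V = 9;                 % assumed example values
t = tinv(p, V);                  % value with prob p of rand var being <= t
p_back = tcdf(t, V);             % recovers p
z = norminv(0.975, 0, 1);        % standard normal value with 97.5% of the area below it
t_upper = tinv(1 - 0.05, V);     % complementary version: prob of rand var being > t_upper is 0.05
fprintf('t = %.4f, z = %.4f\n', t, z);
```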


Significance testing: z-test

• Assume Normal population with known stddev = σ.

• Null hypothesis: pop-mean = μ0

• Alternative hyp: pop-mean > μ0 (the upper-tail p-value below tests this direction)

• Take one sample of n values & calculate the z-statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$ (with x̄ = sample-mean & σ = pop-stdev)

If pop-mean = μ0, dist of z will be standard Normal (mean = 0, std = 1)

[Figure: standard Normal pdf plotted against z; pdf from 0 to 0.4]

If mean of z is 0, how likely is a value ≥ z as just calculated?

p-value = prob(x ≥ z) = 1 - normcdf(z,0,1)

If p-value < significance level alpha (α), reject null hyp.
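A worked one-tailed z-test may make the steps concrete. All the numbers here (hypothesised mean 100, pop-stdev 15, sample of 25 values with sample-mean 106) are hypothetical and are not taken from the lecture.

```matlab
% One-tailed z-test on hypothetical numbers
mu0 = 100;        % hypothesised pop-mean (assumed)
sigma = 15;       % known pop-stdev (assumed)
n = 25;           % sample size (assumed)
xbar = 106;       % sample-mean (assumed)
alpha = 0.05;     % significance level

z = (xbar - mu0) / (sigma / sqrt(n));   % z-statistic, here 6/3 = 2
p = 1 - normcdf(z, 0, 1);               % upper-tail p-value
reject = p < alpha;                     % true means reject the null hyp
fprintf('z = %.2f, p-value = %.4f, reject = %d\n', z, p, reject);
```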


Alternative formulation

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$ (with x̄ = sample-mean & σ = pop-stdev)

Assuming we need 95% confidence, α = 0.05

Let z(α) = norminv(1-α,0,1) = 1.65

Prob of getting rand var ≥ 1.65 is less than 0.05

If z ≥ 1.65, it is outside our 95% ‘confidence limit’ that the null hyp may be true. So reject null hyp.

Confidence limit for z is -∞ to 1.65

Neglect possibility that z may be negative. (1-tailed test)

Confidence limit for sample-mean is -∞ to μ0 + 1.65σ/√n
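The one-tailed confidence limit can be computed directly rather than read from a table. The numbers below are hypothetical (hypothesised mean 100, pop-stdev 15, n = 25, sample-mean 106).

```matlab
% One-tailed 95% confidence limit for the sample-mean (hypothetical numbers)
mu0 = 100; sigma = 15; n = 25; xbar = 106;   % all assumed
alpha = 0.05;
zc = norminv(1 - alpha, 0, 1);               % critical z, about 1.65
upper = mu0 + zc * sigma / sqrt(n);          % upper confidence limit for the sample-mean
reject = xbar > upper;                       % outside the limit means reject the null hyp
fprintf('critical z = %.4f, upper limit = %.4f, reject = %d\n', zc, upper, reject);
```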


2-tailed test

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$ (with x̄ = sample-mean & σ = pop-stdev)

Assuming we need 95% confidence, α = 0.05

Allowing possibility that z < 0, extreme portions of tails are for z > z(α/2) and for z < -z(α/2).

prob(z ≥ z(α/2)) + prob(z ≤ -z(α/2)) = 2 prob(z ≥ z(α/2)) = α

Now, z(α/2) = norminv(1-α/2,0,1) = 1.96

Prob of getting rand var ≥ 1.96 or ≤ -1.96 is 0.05

If z > 1.96 or z < -1.96, it is outside our 95% ‘confidence limit’ that the null hyp may be true. So reject null hyp.

Confidence limits for z are -1.96 to 1.96

Confidence limits for sample-mean are μ0 - 1.96σ/√n to μ0 + 1.96σ/√n
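For the 2-tailed case, the limits and the matching 2-tailed p-value can be sketched as follows. The numbers are hypothetical (hypothesised mean 100, pop-stdev 15, n = 25, sample-mean 106).

```matlab
% 2-tailed 95% confidence limits for the sample-mean (hypothetical numbers)
mu0 = 100; sigma = 15; n = 25; xbar = 106;   % all assumed
alpha = 0.05;
zc = norminv(1 - alpha/2, 0, 1);             % critical z, about 1.96
lo = mu0 - zc * sigma / sqrt(n);             % lower confidence limit
hi = mu0 + zc * sigma / sqrt(n);             % upper confidence limit
z = (xbar - mu0) / (sigma / sqrt(n));
p2 = 2 * (1 - normcdf(abs(z), 0, 1));        % 2-tailed p-value: both tails counted
reject = (xbar < lo) || (xbar > hi);
fprintf('limits %.4f to %.4f, 2-tailed p = %.4f, reject = %d\n', lo, hi, p2, reject);
```

The decision from the limits and the decision from comparing p2 with α agree, since both describe the same two tail areas.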


Significance testing: t-test

• Assume Normal population with unknown stddev.

• Null hypothesis: pop-mean = μ0

• Alternative hyp: pop-mean > μ0 (the upper-tail p-value below tests this direction)

• Take one sample of n values & calculate the t-statistic:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$ (with x̄ = sample-mean & s = sample-stdev)

If pop-mean = μ0, dist of t will be standard t-pdf (blue) with V = n-1.

How likely is calculated value of t?

‘1-tailed’ p-value = prob(x ≥ t) = 1 - tcdf(t, n-1)

If p-value < significance level alpha (α), reject null hyp.

[Figure: t-pdf (blue) and normal pdf (red) plotted against t; t from -5 to 5, pdf from 0 to 0.4]
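A one-tailed t-test on a small sample might look like this. The eight data values and the hypothesised mean of 5 are made up for illustration; std uses the n-1 divisor, which is the sample-stdev the t-statistic expects.

```matlab
% One-tailed t-test on a hypothetical sample
x = [5.1 4.9 5.6 5.8 6.0 5.3 5.4 5.7];   % made-up sample values
mu0 = 5;                                  % hypothesised pop-mean (assumed)
alpha = 0.05;
n = numel(x);
xbar = mean(x);                           % sample-mean
s = std(x);                               % sample-stdev (n-1 divisor)
t = (xbar - mu0) / (s / sqrt(n));         % t-statistic
p = 1 - tcdf(t, n - 1);                   % upper-tail p-value with V = n-1
reject = p < alpha;
fprintf('t = %.3f, p-value = %.4f, reject = %d\n', t, p, reject);
```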


Alternative formulation (2-tailed)

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$ (with x̄ = sample-mean & s = sample-stdev)

• Null Hyp is that pop-mean is μ0

• Assuming we need 95% confidence, α = 0.05

• Confidence limits for μ0 are:

$$\bar{x} - \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n} \quad \text{to} \quad \bar{x} + \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n}$$

If value of μ0 is outside these limits, reject the null hyp that population mean is μ0

Can say with 95% confidence that pop-mean > μ0 or < μ0

If μ0 is within these confidence limits, cannot reject null-hyp.
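The 2-tailed confidence limits built from tinv can be computed in a few lines. The sample values and the hypothesised mean of 5 are made up for illustration.

```matlab
% 2-tailed 95% confidence limits from a hypothetical sample
x = [5.1 4.9 5.6 5.8 6.0 5.3 5.4 5.7];   % made-up sample values
mu0 = 5;                                  % hypothesised pop-mean (assumed)
alpha = 0.05;
n = numel(x);
xbar = mean(x);
s = std(x);                               % sample-stdev (n-1 divisor)
tc = tinv(1 - alpha/2, n - 1);            % critical t for V = n-1
lo = xbar - tc * s / sqrt(n);             % lower confidence limit
hi = xbar + tc * s / sqrt(n);             % upper confidence limit
reject = (mu0 < lo) || (mu0 > hi);        % mu0 outside the limits means reject
fprintf('limits %.4f to %.4f, reject = %d\n', lo, hi, reject);
```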


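Difference betw z-test & t-test (2-tailed)

• With z-test pop-std (σ) is known; with t-test σ is unknown.

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$ (with x̄ = sample-mean & σ = pop-stdev)

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$ (with x̄ = sample-mean & s = sample-stdev)

For z-test, p-value = prob(x ≥ z) = 1 - normcdf(z,0,1)

For t-test, p-value = prob(x ≥ t) = 1 - tcdf(t,n-1)

(These are 1-tailed p-values; for a 2-tailed test use |z| or |t| and double the result.)

Same Null-hyp: pop-mean = μ0 : reject if μ0 outside conf limits

Confidence limits for z-test:

$$\bar{x} - \mathrm{norminv}(1-\alpha/2,\, 0,\, 1)\, \sigma/\sqrt{n} \quad \text{to} \quad \bar{x} + \mathrm{norminv}(1-\alpha/2,\, 0,\, 1)\, \sigma/\sqrt{n}$$

Confidence limits for t-test:

$$\bar{x} - \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n} \quad \text{to} \quad \bar{x} + \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n}$$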
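The practical effect of not knowing σ shows up in the multiplier: the t multiplier is always larger than the normal one, so the t-test limits are wider, and the gap closes as n grows. The sample sizes below are assumed examples.

```matlab
% Normal and t multipliers for 2-tailed 95% confidence limits
alpha = 0.05;
n = [5 10 30 100];                        % assumed example sample sizes
zc = norminv(1 - alpha/2, 0, 1);          % same for every n, about 1.96
tc = tinv(1 - alpha/2, n - 1);            % depends on V = n-1
disp([n; tc; repmat(zc, size(n))]);       % rows: n, t multiplier, z multiplier
```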


Non-Gaussian populations

• If samples of size n are ‘randomly’ chosen from a pop with mean μ & std σ, the pdf of their mean, m1 say, approaches a Normal (Gaussian) pdf with mean μ & std σ/√n as n is made larger & larger.

• Regardless of whether the population is Gaussian or not!

• This is the Central Limit Theorem

• Tests can be made to work for non-Gaussian populations provided n is ‘large enough’.
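A quick simulation can illustrate the theorem for a clearly non-Gaussian population. This sketch draws from an exponential population with mean 2 (so std 2 as well); the sample size, the number of repetitions and the random seed are all arbitrary choices.

```matlab
% Central Limit Theorem illustrated with an exponential population
rng(0);                          % fixed seed so the run is repeatable (arbitrary choice)
mu = 2;                          % pop mean; for an exponential pop the std is also mu
n = 50;                          % size of each sample (assumed)
N = 10000;                       % number of samples drawn (assumed)
samples = exprnd(mu, n, N);      % each column is one sample of n values
m1 = mean(samples);              % one sample-mean per column
fprintf('mean of m1 = %.4f (pop mean %.4f)\n', mean(m1), mu);
fprintf('std of m1  = %.4f (sigma/sqrt(n) = %.4f)\n', std(m1), mu / sqrt(n));
% hist(m1, 40) would show a roughly bell-shaped histogram
```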


Barry’s Assignment

• Deadline 20 Dec 2011

• Email to [email protected] with ‘SEEDSM’ in title, or hand in paper copy to SSO

• Exam statistics are in examdata.dat and examdata.xls in www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131 (or navigate from www.cs.man.ac.uk/~barry)


Question 1

• What are the essential differences between Bayesian and ‘frequentist’ statistics?


Question 2: fair coin test

Suppose we obtain heads 15 times out of 20 flips of a coin. By establishing confidence limits, state whether it is likely to be a fair coin.


Question 3: Exam statistics

• Analyse the fictitious exam results & comment on features.

• Compute means, stds & vars for each subject & histograms for the distributions.

• Make observations about performance in each subject & overall.

• Do marks support the hypothesis that people good at Music are also good at Maths?

• Do they support the hypothesis that people good at English are also good at French?

• Do they support the hypothesis that people good at Art are also good at Maths?

• If you have access to only 50 rows of this data, investigate the same hypotheses.

• What conclusions could you draw, and with what degree of certainty?


Question 4: Bayes Theorem

(a) A patient goes to a doctor with a bad cough & a fever. The doctor needs to decide whether he has ‘swine flu’. Let statement S = ‘has bad cough and fever’ & statement F = ‘has swine flu’. The doctor consults his medical books and finds that about 40% of patients with swine-flu have these same symptoms. Assuming that, currently, about 1% of the population is suffering from swine-flu and that currently about 5% have bad cough and fever (due to many possible causes including swine-flu), we can apply Bayes theorem to estimate the probability of this particular patient having swine-flu.

(b) A doctor in another country knows from his text-books that for 40% of patients with swine-flu, the statement S, ‘has bad cough and fever’ is true. He sees many patients and comes to believe that the probability that a patient with ‘bad cough and fever’ actually has swine-flu is about 0.1 or 10%. If there were reason to believe that, currently, about 1% of the population have a bad cough and fever, what percentage of the population is likely to be suffering from swine-flu?