C.O.R.E.: May the “Power” (Statistical) Be with You!


May the “Power” (Statistical) Be with You!

Dr. Mickey Shachar, C.O.R.E. Webinar Series

13 October, 2015

Power Analysis Topics

Error Types · Effect Size · Power Analysis

Hypothesis Testing

• RQ: Is there a statistically significant difference in students’ academic performance in Math between the classes of Dr. Adam and Dr. Eve?

• Hnull: There is no statistically significant difference in students’ academic performance in Math between the classes of Dr. Adam and Dr. Eve.

You are the Dean and receive the following report:


• Report: An Independent Samples t test was run to compare the means of a Math test between Dr. Eve’s class, M = 90.96 (SD = 12.60), and Dr. Adam’s class, M = 89.32 (SD = 15.38), yielding a statistically significant difference with t(1358) = 2.164, p = .031. Hence we reject Hnull and conclude that Dr. Eve’s students outperformed Dr. Adam’s students. (A sketch reproducing this test from the summary statistics appears after the options below.)

• What should the Dean do based on these accurate, true results?
  ◦ A: Critique Dr. Adam on his students’ low performance and set a deadline and minimal score for him to meet.
  ◦ B: Promote Dr. Eve and let Dr. Adam eat his heart out.
  ◦ C: Results are subject to chance due to small sample size, and we need to rerun the study with a larger sample.
  ◦ D: Attend Dr. Shachar’s C.O.R.E. Power Webinar.
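As a quick check, the Dean’s numbers can be roughly reproduced from the summary statistics alone. The sketch below is a Python/SciPy illustration (not part of the original webinar); two equal classes of 680 students are assumed, since the slide only implies a total N of 1360 via df = 1358, which is why the computed t differs slightly from the reported 2.164.

from scipy.stats import ttest_ind_from_stats

# Reproduce the report from summary statistics; the 680/680 split is an assumption.
t, p = ttest_ind_from_stats(
    mean1=90.96, std1=12.60, nobs1=680,   # Dr. Eve's class
    mean2=89.32, std2=15.38, nobs2=680,   # Dr. Adam's class
    equal_var=True,
)
print(f"t({680 + 680 - 2}) = {t:.3f}, p = {p:.3f}")   # roughly t(1358) = 2.15, p = .03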


• Problems with hypothesis significance testing based on p values:
  ◦ The p-value depends essentially on two things: the size of the effect and the size of the sample. One would get a ‘significant’ result either if the effect were very big (despite having only a small sample) or if the sample were very big (even if the actual effect size were tiny); the relation shown below makes this dependence explicit.
  ◦ We are looking at “statistical significance” and not at “practical significance”.

• If only the null hypothesis is available and is rejected, at most the conclusion is that “the difference is not zero”.

• When the President asks the Five-Star General to estimate the war casualties, can he give “not zero” as a satisfactory answer?!
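In symbols (a standard textbook identity, added here for clarity rather than taken from the slides), the two-sample t statistic ties the p-value to both the standardized effect size d and the group sizes:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{SD_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = d\,\sqrt{\frac{n_1 n_2}{n_1 + n_2}} $$

so even a tiny d yields a “significant” p once n1 and n2 are large enough.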


• We should be concerned not only with whether a null hypothesis is false, but also with how false it is.

• In other words, if the difference is not zero, how large a difference should one expect?

• The larger the effect size (the difference between the Hnull and Halt means), the greater the power of the test.


• A-Priori: it allows you to decide, in the process of designing an experiment/study:
  ◦ How large a sample is needed to enable statistical judgments that are accurate and reliable, and
  ◦ How likely your statistical test will be to detect effects of a given size in a particular situation.
• Without these calculations, sample size may be too high or too low:
  ◦ If sample size is too low, the experiment will lack precision.
  ◦ If sample size is too large, time and resources will be wasted.

• Post-Hoc: it allows you to decide, after the study was executed:
  ◦ Whether the study attained an acceptable power, and
  ◦ Whether the results have practical significance.

• APA publication requirements: all study publications should report, in addition to p values, the effect sizes (ES) and their confidence intervals (CI).
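For the standardized mean difference d, one common large-sample approximation (not given in the webinar) for building such a confidence interval is:

$$ SE_d \approx \sqrt{\frac{n_1 + n_2}{n_1 n_2} + \frac{d^2}{2(n_1 + n_2)}}, \qquad 95\%\ \text{CI} \approx d \pm 1.96\,SE_d $$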

Power Analysis Topics

Error Types: Type I = alpha · Type II = beta · Power = 1 - beta

Effect Size · Power Analysis

• The null hypothesis is either true or false.
• The null hypothesis is either rejected or not rejected.
• Only 4 possible things can happen:

                        State of the World: H0      State of the World: H1
Our Decision: H0        Correct Acceptance          Type II Error (beta)
Our Decision: H1        Type I Error (alpha)        Correct Rejection


Common acceptance in the social sciences:

• Type I error (alpha) must be kept at or below .05.
• Type II error (beta) must be kept low as well.
• “Statistical power,” which is equal to 1 - beta, must be kept correspondingly high.
• Ideally, power should be at least .80 to detect a reasonable departure from the null hypothesis.


Power Analysis Topics

Error Types · Effect Size · Power Analysis

Effect Size: ES Indices · Cohen’s Conventions

• Effect size (ES) is a name given to a family of indices that measure the magnitude of a treatment effect (Becker, 2000).
  ◦ Unlike significance tests, these indices are independent of sample size.
• There is a wide array of formulas used to measure ES:
  ◦ as the standardized difference between two means, ‘d’ or ‘g’;
  ◦ as the correlation between the independent variable (IV) classification and the individual scores on the dependent variable (DV), ‘r’;
  ◦ others: OR, HR, RR, etc.

In its simplest form, effect size, denoted by the symbol ‘d’, is the mean difference between groups in standard-score form, i.e., the ratio of the difference between the means to the standard deviation.
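In symbols, using the pooled standard deviation (the slide gives only the verbal definition):

$$ d = \frac{\bar{X}_1 - \bar{X}_2}{SD_{pooled}}, \qquad SD_{pooled} = \sqrt{\frac{(n_1 - 1)SD_1^2 + (n_2 - 1)SD_2^2}{n_1 + n_2 - 2}} $$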


Cohen’s Conventions    Standardized Difference of Means ‘d’    Correlation ‘r’
‘Small’                0.2                                     0.1
‘Medium’               0.5                                     0.3
‘Large’                0.8                                     0.5


Power Analysis Topics

Error Types · Effect Size · Power Analysis

Power Analysis: a-priori · post-hoc

The factors influencing power in a statistical test:

• What kind of statistical test is being performed.
  ◦ You will need to calculate a different effect size per test type!
• Sample size. In general, the larger the sample size, the larger the power.
• The size of experimental effects. If the null hypothesis is wrong by a substantial amount, power will be higher than if it is wrong by a small amount.
• The level of error in experimental measurements. Anything that enhances the accuracy and consistency of measurement can increase statistical power.


• To ensure a statistical test will have adequate power, one usually must perform special analyses prior to running the experiment, to calculate how large an N is required.
• The question is, “How large an N is necessary to produce reasonably high power in this situation, while maintaining alpha at a reasonably low value?”


To determine the sample size needed, we play with four factors:

1. Obtain “ES”. Where do we find it?
   a. Literature review
   b. Pilot study
   c. An “educated conjecture”
2. Define alpha (<= .05)
3. Define power (1 - beta), e.g., .80
4. Calculate sample size (using a statistical calculator such as G*Power); see the example below.



For a t test with ES = .2, alpha = .05, and power = .80, we will need N = 788 subjects for our sample.
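The slide’s figure comes from G*Power; as an illustrative alternative (statsmodels is this transcript’s stand-in, not the tool used in the webinar), the same a-priori calculation can be done in Python:

from statsmodels.stats.power import TTestIndPower
import math

# A-priori sample size for an independent-samples t test,
# mirroring the G*Power inputs: d = 0.2, alpha = .05, power = .80.
n_per_group = TTestIndPower().solve_power(
    effect_size=0.2,          # Cohen's d ("small")
    alpha=0.05,               # Type I error rate
    power=0.80,               # 1 - beta
    ratio=1.0,                # equal group sizes
    alternative='two-sided',
)
print(math.ceil(n_per_group))        # about 394 per group
print(2 * math.ceil(n_per_group))    # about 788 in total, matching the slide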

Now that we are done with our study, we need to check how well the actual results we found did in terms of power. Again, we play with four factors:

1. Input “ES” from our study
2. Define alpha (<= .05)
3. Input sample size from our study
4. Calculate power (can use G*Power)


For our t test with ES = .091, alpha = .05, and sample size N = 1360, we obtained a dismal power of .388!
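Again the value comes from G*Power; the corresponding post-hoc sketch with statsmodels (two equal groups of 680 assumed, since only the total N = 1360 is stated) is:

from statsmodels.stats.power import TTestIndPower

# Post-hoc (achieved) power for the Dean's study:
# d = 0.091, alpha = .05, N = 1360 split as 680 + 680 (assumed).
achieved_power = TTestIndPower().power(
    effect_size=0.091,
    nobs1=680,                # per-group sample size
    alpha=0.05,
    ratio=1.0,                # second group also 680
    alternative='two-sided',
)
print(round(achieved_power, 3))      # about 0.388, the "dismal" power on the slide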


• Hypothesis testing based on the p value provides only statistical significance.
• Power analysis is crucial for your study:
  ◦ A-priori: to determine the required sample size.
  ◦ Post-hoc: to calculate and examine power from the actual research study, and to examine the practical significance of the research findings.
• If you fired Dr. Adam, reinstate him!


• G*Power, v. 3.1.9.2 (2015). Buchner, Erdfelder, Faul, & Lang. Free software download: http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3
• Eveland, J. D. (2008). Using “G*Power” for Statistical Power and Sample Size Analysis. (Download instructions to follow for the PPT.)
• Becker, L. A. (2000). Effect Sizes. Retrieved from: http://www.uccs.edu/lbecker/effect-size.html

Visit https://www.trident.edu/webinars/core/


Attention Faculty, Students, Alumni and Guest Speakers in Business, Health Sciences, and Education:

• Have you wanted to present your ongoing scholarly and professional work to a general audience?
• CORE Grand Rounds provides a platform for professional development and increased engagement, with constructive feedback from peers and scholars-in-training.
• Email Dr. Bernice B. Rumala at [email protected] to sign up.


Thank You

May the “power” be with you

Dr. Mickey Shachar, [email protected]

• To receive more information about C.O.R.E., please visit the C.O.R.E. webpage at: www.trident.edu/webinars/core
• For further information about Trident’s doctoral programs in educational leadership, business, and health sciences, please visit: https://www.trident.edu/degrees/doctoral/
• If you have any comments for C.O.R.E., you may email Dr. Bernice B. Rumala, C.O.R.E. Chair, at: [email protected]
