Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis
Scott A. Venners, Ph.D., MPH
November 13, 2003

Page 1: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Thinking About Data:

A Simple Principle to Help You Improve your Scientific Data Analysis

Scott A. Venners, Ph.D., MPH

November 13, 2003

Page 2: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

PowerPoint slides available at:

www.artima.com/AMU/lecture.ppt

(Try tomorrow)

Page 3: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Classes First Data Set

?

Page 4: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Y = Outcome Variable

X = Predictor of Interest

Cov1…N = Potential Confounders (Covariates)

Y = X + Cov1 + Cov2 + Cov3 + … + Cov(n)

X p-value <0.05?

Yes - Write a paper.
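
Written out as a regression equation, the recipe on this slide amounts to something like the following. The coefficient symbols \beta and the error term \varepsilon are not on the slide; they are assumed here for illustration, and the linear form is only one choice (a binary outcome would usually call for a logistic model instead).

```latex
Y = \beta_0 + \beta_X X + \beta_1 \mathrm{Cov}_1 + \beta_2 \mathrm{Cov}_2 + \dots + \beta_n \mathrm{Cov}_n + \varepsilon
```

In this notation, the "X p-value <0.05?" check is a test of the single hypothesis \beta_X = 0 within this one model.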

Page 5: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Simple Principle:

1. Your model only represents one possible explanation of data.

2. You must actively think of all possible alternative explanations and test them.

3. Those that are not testable define the uncertainty of your analysis.

Page 6: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Possible Explanations of Data

(Can Test)

(Cannot Test)

Page 9: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Possible Explanations of Data

(Cannot Test)

Model

Page 10: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

(Can Test)

Do not stop here!

(Cannot Test)

Model

Page 11: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Skills you need:

1. Thinking of possible explanations

2. Knowing how to test them.

Page 12: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Example 1: Simple model.

Skill: Visualizing Confounding

Page 13: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Example 1: Does an inactive lifestyle increase the risk of low bone density?

[Figure legend: Inactive Lifestyle, Active Lifestyle]

Page 14: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups]

Page 15: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups. Legend: Inactive Lifestyle, Active Lifestyle, Low Bone Density]

Page 16: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups]

What else could cause this result?

Female, Smoking, Excessive Alcohol, Old Age…

Page 17: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups]

               Active Lifestyle    Inactive Lifestyle
Female         49%                 51%
Smoking        21%                 19%
Ex Alcohol     1%                  1%
Old Age        30%                 50%

Page 18: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups]

Is the association between inactive lifestyle and low bone density confounded by old age?

Old Age: 30% (Active Lifestyle), 50% (Inactive Lifestyle)

Page 19: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups]

Is the association between inactive lifestyle and low bone density confounded by old age?

Old Age: 30% (Active Lifestyle), 50% (Inactive Lifestyle)

No

Page 20: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Low Bone Density (%) by lifestyle, in two panels]

Older Age: Active 30%, Inactive 50%

Younger Age: Active 30%, Inactive 50%

Page 21: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Inactive Lifestyle and Active Lifestyle groups]

Is the association between inactive lifestyle and low bone density confounded by old age?

Old Age: 30% (Active Lifestyle), 50% (Inactive Lifestyle)

Yes

Page 22: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Low Bone Density (%) by lifestyle, in two panels]

Younger Age: Active 0%, Inactive 0%

Older Age: Active 100%, Inactive 100%
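
The two stratified pictures in this example (low bone density identical across age strata, versus low bone density fully determined by age) can be compared with a short Python sketch. It assumes the 30% and 50% old-age shares apply to the active and inactive groups respectively; the function and variable names are illustrative and do not come from the slides.

```python
# Share of older people in each lifestyle group (30% / 50%, from the slides).
P_OLD = {"active": 0.30, "inactive": 0.50}

# Low bone density (%) within each age stratum, for the two stratified pictures.
SAME_IN_BOTH_STRATA = {
    "active": {"old": 30, "young": 30},
    "inactive": {"old": 50, "young": 50},
}
AGE_EXPLAINS_ALL = {
    "active": {"old": 100, "young": 0},
    "inactive": {"old": 100, "young": 0},
}


def crude_rate(p_old, rates):
    """Crude outcome rate (%) in a group: stratum rates weighted by the age mix."""
    return p_old * rates["old"] + (1 - p_old) * rates["young"]


def crude_difference(scenario):
    """Inactive minus active, ignoring age."""
    return crude_rate(P_OLD["inactive"], scenario["inactive"]) - crude_rate(
        P_OLD["active"], scenario["active"]
    )


def stratum_differences(scenario):
    """Inactive minus active, computed separately within each age stratum."""
    return {
        age: scenario["inactive"][age] - scenario["active"][age]
        for age in ("old", "young")
    }


for name, scenario in [
    ("same in both strata", SAME_IN_BOTH_STRATA),
    ("age explains all", AGE_EXPLAINS_ALL),
]:
    print(name, round(crude_difference(scenario), 6), stratum_differences(scenario))
```

Under these assumptions both scenarios give the same crude gap of 20 percentage points between inactive and active groups, yet the within-stratum gap is 20 points in one and 0 in the other. The crude comparison alone cannot tell them apart, which is why the answer to the confounding question can be "No" or "Yes" for the same 30% versus 50% age split.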

Page 23: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Independent Effect(s)

[Figure: Low Bone Density (%) by lifestyle, Active (0) vs. Inactive (1), in Older Age (1) and Younger Age (0) panels]

Inactive Only: 10 + 0(Old) + 20(Inactive)
  Older Age (1): Active 10%, Inactive 30%
  Younger Age (0): Active 10%, Inactive 30%

Page 24: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

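Independent Effect(s)

[Figures: Low Bone Density (%) by lifestyle, Active (0) vs. Inactive (1), in Older Age (1) and Younger Age (0) panels]

Older Age Only: 10 + 20(Old) + 0(Inactive)
  Older Age (1): Active 30%, Inactive 30%
  Younger Age (0): Active 10%, Inactive 10%

Inactive Only: 10 + 0(Old) + 20(Inactive)
  Older Age (1): Active 10%, Inactive 30%
  Younger Age (0): Active 10%, Inactive 30%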

Page 25: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

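Independent Effect(s)

[Figures: Low Bone Density (%) by lifestyle, Active (0) vs. Inactive (1), in Older Age (1) and Younger Age (0) panels]

Both Older Age and Inactive: 10 + 20(Old) + 20(Inactive)
  Older Age (1): Active 30%, Inactive 50%
  Younger Age (0): Active 10%, Inactive 30%

Older Age Only: 10 + 20(Old) + 0(Inactive)
  Older Age (1): Active 30%, Inactive 30%
  Younger Age (0): Active 10%, Inactive 10%

Inactive Only: 10 + 0(Old) + 20(Inactive)
  Older Age (1): Active 10%, Inactive 30%
  Younger Age (0): Active 10%, Inactive 30%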

Page 26: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Independent Effect(s)

[Figure: Low Bone Density (%) by lifestyle, Active (0) vs. Inactive (1), in Older Age (1) and Younger Age (0) panels]

Both Older Age and Inactive: 10 + 20(Old) + 20(Inactive)
  Older Age (1): Active 30%, Inactive 50%
  Younger Age (0): Active 10%, Inactive 30%

Page 27: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Independent Effect(s)

[Figures: Low Bone Density (%) by lifestyle, Active vs. Inactive, in Older Age (1) and Younger Age (0) panels]

Both Older Age and Inactive: 10 + 20(Old) + 20(Inactive)
  Older Age (1): Active 30%, Inactive 50%
  Younger Age (0): Active 10%, Inactive 30%

Older Age and Inactive Interaction: 10 + 20(Old) + 20(Inactive) + 10(Old*Inactive)
  Older Age (1): Active 30%, Inactive 60%
  Younger Age (0): Active 10%, Inactive 30%
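
The four models in this example can be evaluated cell by cell with a small Python sketch. The coefficients are taken from the slides; the function name, the dictionary layout, and the 0/1 tuple keys are illustrative choices, not part of the original.

```python
def low_bone_density_pct(old, inactive, b_old, b_inactive, b_interaction=0, baseline=10):
    """Predicted low bone density (%) for 0/1 indicators of older age and inactivity."""
    return (
        baseline
        + b_old * old
        + b_inactive * inactive
        + b_interaction * old * inactive
    )


# Coefficients as written on the slides.
MODELS = {
    "Inactive only": dict(b_old=0, b_inactive=20),
    "Older age only": dict(b_old=20, b_inactive=0),
    "Both": dict(b_old=20, b_inactive=20),
    "Interaction": dict(b_old=20, b_inactive=20, b_interaction=10),
}


def cell_table(**coefs):
    """Predicted percentage for every (old, inactive) combination."""
    return {
        (old, inactive): low_bone_density_pct(old, inactive, **coefs)
        for old in (0, 1)
        for inactive in (0, 1)
    }


for name, coefs in MODELS.items():
    print(name, cell_table(**coefs))
```

The interaction term only changes the cell where both indicators are 1, so the "Both" and "Interaction" models agree everywhere except for older, inactive people (50% versus 60%).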

Page 28: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Example 2:

Sometimes just putting potential confounders into the model is not correct.

Page 29: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Example 2: Does passive smoking increase the risk of chronic cough?

[Figure legend: Passive Smoking, No Passive Smoking]

Page 30: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Passive Smoking and No Passive Smoking groups]

Page 31: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Passive Smoking and No Passive Smoking groups. Legend: Passive Smoking, No Passive Smoking, Chronic Cough]

Cough: 25% in both groups

Page 32: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Passive Smoking and No Passive Smoking groups]

What else could cause this result?

Active Smoking…

Cough: 25% in both groups

Page 33: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Passive Smoking and No Passive Smoking groups]

Is the association between passive smoking and cough confounded by active smoking?

Active Smoking: 45% (No Passive Smoking), 17% (Passive Smoking)

Page 34: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Cough (%) by passive smoking, in two panels]

Active Smoking: No Passive 47%, Passive 47%

No Active Smoking: No Passive 7%, Passive 20%

Page 35: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Cough (%) by passive smoking, in two panels]

Active Smoking: No Passive 47%, Passive 47%

No Active Smoking: No Passive 7%, Passive 20%

How to model?

Page 36: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: Cough (%) by passive smoking, No Passive (0) vs. Passive (1), in two panels]

Active Smoking (1): No Passive 47%, Passive 47%

No Active Smoking (0): No Passive 7%, Passive 20%

How to model?

?Cough% = 7 + 40(Smoke) + 13(Passive) - 13(Smoke*Passive)

No
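
The cough model can be checked against the four stratified percentages with a short Python sketch. The cell values and coefficients come from the slides. The helper names are illustrative. The assignment of the 45% active-smoking share to the group without passive exposure (and 17% to the exposed group) is an assumption, chosen because it reproduces the 25% crude cough rate in both groups.

```python
# Observed cough (%) keyed by (active smoking, passive smoking), both coded 0/1.
CELLS = {(0, 0): 7, (0, 1): 20, (1, 0): 47, (1, 1): 47}


def cough_pct(smoke, passive):
    """Cough% = 7 + 40(Smoke) + 13(Passive) - 13(Smoke*Passive), as on the slide."""
    return 7 + 40 * smoke + 13 * passive - 13 * smoke * passive


def additive_model_suffices(cells):
    """True only if the passive-smoking effect is the same in both smoking strata."""
    effect_in_nonsmokers = cells[(0, 1)] - cells[(0, 0)]
    effect_in_smokers = cells[(1, 1)] - cells[(1, 0)]
    return effect_in_nonsmokers == effect_in_smokers


def crude_cough(p_smoke, passive):
    """Crude cough (%) in a passive-smoking group with active-smoker share p_smoke."""
    return p_smoke * cough_pct(1, passive) + (1 - p_smoke) * cough_pct(0, passive)


print(all(cough_pct(s, p) == CELLS[(s, p)] for (s, p) in CELLS))
print(additive_model_suffices(CELLS))
print(round(crude_cough(0.45, 0), 2), round(crude_cough(0.17, 1), 2))
```

Passive smoking adds 13 points among non-smokers and nothing among active smokers, so a model with only main effects for active and passive smoking cannot reproduce all four cells; the negative product term is what cancels the passive effect in smokers.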

Page 37: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Example 3:

Sometimes explanations for data are not so clear.

Page 38: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Odds ratios of early pregnancy loss.

Husband’s current smoking    Crude OR    p       Adjusted* OR    p
None                         Ref                 Ref
<20 cigs/day                 1.19        .429    1.04            .854
>=20 cigs/day                2.18        .013    1.81            .049

* Adjusted for husband and wife’s ages, education, stress, exposure to dust and noise, husband’s alcohol use, previous smoking, and exposure to toxins, and wife’s body-mass index.

Page 39: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

If husband’s education is removed from the model:

Husband’s current smoking    Crude OR    p       Adjusted OR    p
None                         Ref                 Ref
<20 cigs/day                 1.19        .429    1.14           .576
>=20 cigs/day                2.18        .013    2.02           .022

Page 40: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Husband’s Smoking    None    <20 cigs/day    >=20 cigs/day
High School          79%     59%             50%

Page 41: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

[Figure: % Early Pregnancy Loss by Husband’s Smoking (None, <20 cigs/day, >=20 cigs/day), with separate series for < High School and >= High School. Extracted values: 20%, 29%, 44% and 22%, 21%, 30%.]
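
To put the percentages in this figure on the same scale as the odds-ratio tables, each series can be converted to odds ratios relative to the "None" group. The sketch below is in Python. The transcript does not make clear which education group each series belongs to, so the series are labeled only "first" and "second". The function names are illustrative, and these are unadjusted ratios computed from the rounded percentages, not the author's model estimates.

```python
LEVELS = ["None", "<20 cigs/day", ">=20 cigs/day"]

# Proportions with early pregnancy loss, in the order extracted from the slide.
SERIES = {
    "first series": [0.20, 0.29, 0.44],
    "second series": [0.22, 0.21, 0.30],
}


def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)


def odds_ratio(p_exposed, p_reference):
    """Odds ratio of an exposed group relative to a reference group."""
    return odds(p_exposed) / odds(p_reference)


def ors_vs_none(proportions):
    """Odds ratios for each smoking level relative to the first ('None') level."""
    return [odds_ratio(p, proportions[0]) for p in proportions[1:]]


for name, proportions in SERIES.items():
    for level, ratio in zip(LEVELS[1:], ors_vs_none(proportions)):
        print(name, level, round(ratio, 2))
```

The ratios differ noticeably between the two series, which is the kind of pattern that a single adjusted odds ratio can hide.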

Page 42: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

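[Figure: % Early Pregnancy Loss by Husband’s Smoking (None, <20 cigs/day, >=20 cigs/day), with separate series for < High School and >= High School. Extracted values: 20%, 29%, 44% and 22%, 21%, 30%.]

Husband’s Smoking    None    <20 cigs/day    >=20 cigs/day
High School          79%     59%             50%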

Page 43: Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis

Main Points:

Whether your results are good or bad, always think beyond your preferred explanation for the data.

Explore all possibilities before choosing your preferred model.

Acknowledge what you cannot test as your limitations.