Cognitive Biases in Data Science

www.flydata.com Cognitive Biases in Data Science Copyright © 2014 FlyData Inc. All rights reserved. www.flydata.com

Upload: flydata-inc

Post on 17-Jul-2015


TRANSCRIPT

Page 1: Cognitive Biases in Data Science

Cognitive Biases in Data Science

Page 2: Cognitive Biases in Data Science

Introduction


● We often think of “data” as objective information
● In reality, data can be just as subjective as the people who record it!
● In scientific fields especially…
○ empirical methods are used to observe nature
○ data should always be collected and interpreted impartially


Page 3: Cognitive Biases in Data Science

Introduction


● Cognitive biases are an obstacle to interpreting information impartially
○ They can easily skew results
○ They are innate tendencies
● Here are four major biases that are known to have considerable effects on research and science:


Page 4: Cognitive Biases in Data Science

#1 Confirmation Bias


Page 5: Cognitive Biases in Data Science

Confirmation Bias


● Confirmation bias is the tendency to process information in a way that confirms one’s preconceptions or hypotheses.
○ We actively seek out and assign more value to data that confirms our own hypotheses...
○ And we ignore or understate evidence that could suggest otherwise!

Page 6: Cognitive Biases in Data Science

Confirmation Bias


● You may have “good” preconceptions from an educated intuition or previous experiences…
● But in many cases preconceptions are not that well founded!
○ Relying on them can directly affect the results of a study or analysis!

Page 7: Cognitive Biases in Data Science

#2 Observation Bias


Page 8: Cognitive Biases in Data Science

Observation Bias


● The tendency to look where good results are expected, or where it is most convenient to observe
○ Data that is easy to access is not necessarily the most important!
● The most available and best-known data source may often be a good one…
○ But no data analysis is complete without a complete picture of your data.
● Data science is about producing actionable insights
○ If only the wrong things are being observed and measured, you produce false insights!

Page 9: Cognitive Biases in Data Science

Observation Bias


● To be an effective researcher, perhaps it’s best to frequently ask yourself these questions:
○ “Am I measuring the right things?”
○ “Are there better sources from which to get data?”

Page 10: Cognitive Biases in Data Science

#3 Funding Bias


Page 11: Cognitive Biases in Data Science

Funding Bias


● The unconscious tendency to skew models, data, or interpretations of data in a way that favors the objectives of a financial sponsor or employer.
○ Sometimes called sponsorship bias
● Any scientist or researcher should keep this in mind
○ Unknowingly making a business decision with flawed data will ultimately damage the sponsor!
○ It will damage your career
○ ...and it’s just bad science!

Page 12: Cognitive Biases in Data Science

Example


● In the 1990s, the tobacco industry funded a number of research studies on the effects of tobacco and cigarette smoking
● After investigation, industry sponsors and research centers were found to:
○ Present findings in a misleading way
○ Withhold certain findings about the relationship between smoking and cancer
● This is a prime example of funding bias.

Page 13: Cognitive Biases in Data Science

#4 Sampling Bias


Page 14: Cognitive Biases in Data Science

Sampling Bias


● In experimentation, we take a sample, which should be representative of the whole population
○ This is achieved by statistical techniques and well-designed randomization
○ What happens if proper randomization isn’t achieved?
● It’s not uncommon for researchers to introduce sampling bias
○ The groups or data selected for experimentation are unintentionally not representative of the population
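
To make the randomization point concrete, here is a minimal Python sketch comparing a "convenient" sample with a simple random sample. The population, the sample size, and the function names are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import random
from statistics import mean

def convenience_sample(population, n):
    """Take the first n records: easy to collect, but not randomized."""
    return population[:n]

def simple_random_sample(population, n, seed=0):
    """Draw n records uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(population, n)

# Hypothetical population, ordered so that the most convenient
# records (the first ones) are systematically the smallest values.
population = list(range(1, 1001))

conv = convenience_sample(population, 100)
rand = simple_random_sample(population, 100)

print("population mean: ", mean(population))
print("convenience mean:", mean(conv))  # far below the population mean
print("random mean:     ", mean(rand))  # typically close to the population mean
```

The convenience sample only ever sees the low end of the population, so its mean is badly off regardless of sample size. The random sample can still be off by chance, but it is not skewed in any systematic direction.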

Page 15: Cognitive Biases in Data Science

Sampling Bias


● No matter how big or diverse the sample is...
○ There is always a possibility of inconsistency in data or sample collection
● This bias also ties in with the other three biases!
○ If any of those biases affects the way in which you collect samples, then you’re also introducing sampling bias!
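
One rough way to spot this kind of problem is to compare how groups are represented in the sample against the population. The Python sketch below does that under stated assumptions: the group labels ("mobile", "desktop"), the counts, and the helper names are all hypothetical, and a large gap is only a warning sign, not a formal statistical test.

```python
from collections import Counter

def proportions(labels):
    """Return the share of each group label in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}

def max_proportion_gap(population_labels, sample_labels):
    """Largest absolute difference in group share between the
    population and the sample, over the population's groups."""
    pop = proportions(population_labels)
    samp = proportions(sample_labels)
    return max(abs(pop[g] - samp.get(g, 0.0)) for g in pop)

# Hypothetical population: 60% mobile users, 40% desktop users.
population_labels = ["mobile"] * 600 + ["desktop"] * 400

# A sample collected only where it was convenient (mostly desktop).
skewed_sample = ["mobile"] * 20 + ["desktop"] * 80
# A sample whose group shares match the population.
balanced_sample = ["mobile"] * 60 + ["desktop"] * 40

print(max_proportion_gap(population_labels, skewed_sample))    # large gap
print(max_proportion_gap(population_labels, balanced_sample))  # no gap
```

A check like this only covers the groups you thought to label, so it complements proper randomization rather than replacing it.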

Page 16: Cognitive Biases in Data Science


Check us out!

-> http://flydata.com

[email protected]

Toll Free: 1-855-427-9787


We are an official data integration partner of Amazon Redshift