Web Science: How is it different?


Upload: daniel-tunkelang

Post on 08-Sep-2014


DESCRIPTION

Web Science: How is it different?
Daniel Tunkelang, LinkedIn
Keynote Address at ACM Web Science 2014 Conference

The scientific method of observation, measurement, and experiment may be our greatest achievement as a species. The technological innovation we enjoy today is the product of a culture of systematized scientific experimentation. But historically scientific experimentation has been expensive. Experiments consumed natural resources, took a long time to conduct, and required even more time and labor to analyze. In order to be productive, scientists have had to factor these costs into their work and to optimize accordingly.

Web science is different. Not, as some have speciously argued, because big data has made the scientific method obsolete. The key difference is that web science has changed the economics of scientific experimentation. Thus, even as web scientists apply the traditional scientific method, they optimize based on very different economics.

In this talk, I'll survey how web science has changed our approach to experimentation, for better and for worse. Specifically, I'll talk about differences in hypothesis generation, offline analysis, and online testing.

Bio

Daniel Tunkelang is Head of Query Understanding at LinkedIn, where he previously formed and led the product data science team. LinkedIn search allows members to find people, companies, jobs, groups and other content. His team aims to provide users with the best possible results that satisfy their information needs and help to get insights from professional data. Tunkelang has BS and MS degrees in computer science and math from MIT, and a PhD in computer science from CMU. He co-founded the annual symposium on human-computer interaction and information retrieval (HCIR) and wrote the first book on Faceted Search (Morgan and Claypool 2009).

Prior to joining LinkedIn, Tunkelang was Chief Scientist of Endeca (acquired by Oracle in 2011 for $1.1B) and leader of the local search quality team at Google, mapping local businesses to their home pages. He is the co-inventor of 20 patents.

TRANSCRIPT

Page 1: Web science - How is it different?

Web Science: How is it different?

Daniel Tunkelang, Head of Query Understanding

Page 2: Web science - How is it different?
Page 3: Web science - How is it different?

tl;dr:

The scientific method is alive and well. Big data has just changed the economics.

Page 4: Web science - How is it different?

How have the web and big data changed science?

Let’s ask some of the experts.

Page 5: Web science - How is it different?

“You have to kiss a lot of frogs to find one prince. So how can you find your prince faster? By finding more frogs and kissing them faster and faster.”

Mike Moran, Do It Wrong Quickly: How the Web Changes the Old Marketing Rules, 2007

Cited by Kohavi in Online Controlled Experiments at Large Scale, 2013

Page 6: Web science - How is it different?

Web Science = faster, cheaper experiments.

Page 7: Web science - How is it different?

“The cost of experimentation is now the same or less than the cost of analysis. You can get more value…by doing a quick experiment than from doing a sophisticated analysis.”

Michael Schrage, Value-Creation, Experiments, and Why IT Does Matter, 2010

Page 8: Web science - How is it different?

Web Science = more experiments, less analysis?

Page 9: Web science - How is it different?

“with massive data, this approach to science — hypothesize, model, test — is becoming obsolete… Petabytes allow us to say: "Correlation is enough." We can stop looking for models…analyze the data without hypotheses…throw the numbers into the biggest computing clusters the world…and let…algorithms find patterns where science cannot.”

Chris Anderson, The End of Theory, 2008

Page 10: Web science - How is it different?

RIP

Scientific Method

1600 BCE – late 20th century

Killed by Big Data?

Page 11: Web science - How is it different?

No.

Page 12: Web science - How is it different?

Let’s rewind.

Page 13: Web science - How is it different?

What makes it science?

Page 14: Web science - How is it different?

Hypothesis

Page 15: Web science - How is it different?

Model

Page 16: Web science - How is it different?

Test

Page 17: Web science - How is it different?

The scientific method still works today.

What’s changed is the economics.

Page 18: Web science - How is it different?

Scientific Method, 1747

Page 19: Web science - How is it different?

Scientific Method, Today

Page 20: Web science - How is it different?

It’s the economy, science.

Yesterday: Experiments are expensive, choose hypotheses wisely.

Today: Experiments are cheap, do as many as you can!

Page 21: Web science - How is it different?

What about Web Science?

Page 22: Web science - How is it different?

A/B testing: everybody’s doing it.
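
As a rough illustration of why such experiments are cheap to evaluate, the sketch below compares conversion rates between a control group and a variant group with a two-proportion z-test. The function name, the counts, and the choice of test are assumptions made for illustration; the talk does not say how any particular company analyzes its experiments.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a, conv_b: number of conversions in groups A and B (hypothetical counts)
    n_a, n_b:       number of users exposed to A and B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 5.0% vs 5.5% conversion with 20,000 users per arm.
z, p = two_proportion_z_test(1000, 20000, 1100, 20000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Once traffic is logged, the analysis is a few lines of arithmetic, which is one way to read the claim that experiments are now cheap.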

Page 23: Web science - How is it different?

Google: 20k search experiments per year

Page 24: Web science - How is it different?

hypotheses

Page 25: Web science - How is it different?

The Myth of Insight

Page 26: Web science - How is it different?

Scientists gain insight by staring at data.

Page 27: Web science - How is it different?

Big data tools improve data exploration.

Page 28: Web science - How is it different?

In hypothesis generation, quantity trumps quality.

Page 29: Web science - How is it different?

Except when it doesn’t.

Page 30: Web science - How is it different?
Page 31: Web science - How is it different?

Easier to analyze data than research humans.

Page 32: Web science - How is it different?

But we pay the price.

Example: search engine improvements in batch evaluations don’t always predict real user benefits.

[Hersh et al., 2000] Do Batch and User Evaluations Give the Same Results?

[Turpin & Hersh, 2001] Why Batch and User Evaluations do not Give the Same Results

[Turpin & Scholer, 2006] User Performance versus Precision Measures for Simple Search Tasks

But also see…

[Smucker & Jethani, 2010] Human Performance and Retrieval Precision Revisited
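
To make the notion of a batch evaluation concrete, here is a minimal sketch of one common offline metric, precision at k, computed against a fixed set of relevance judgments. The document IDs, the judgments, and the function name are hypothetical; the cited papers use their own collections and measures.

```python
def precision_at_k(ranked_doc_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Hypothetical relevance judgments for a single query.
relevant = {"d1", "d3", "d4"}

# Hypothetical rankings from a baseline system and an "improved" system.
baseline = ["d1", "d7", "d3", "d9", "d4"]
improved = ["d1", "d3", "d4", "d7", "d9"]

print(precision_at_k(baseline, relevant, 3))  # 2 of the top 3 are relevant
print(precision_at_k(improved, relevant, 3))  # 3 of the top 3 are relevant
```

The improved system wins on this offline number, but no user was involved in computing it. The studies listed above ask whether such a gain translates into better task performance for real users.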

Page 33: Web science - How is it different?

When local optimization is cheap, you neglect the rest.

Page 34: Web science - How is it different?

To summarize: how is web science different?

• Online testing is cheaper and scalable.

• Data exploration tools make hypothesis generation cheaper and easier.

• But the experiments that are easy and cheap aren’t always the most valuable.

• Easy to forget our biases as scientists.

Page 35: Web science - How is it different?

Take-Aways

• The scientific method is alive and well. Big data has just changed the economics.

• Cheaper hypothesis testing and generation have already been transformative. That’s why big data matters.

• But we neglect the human side of scientific experimentation at our peril.

Page 36: Web science - How is it different?

Daniel Tunkelang

[email protected]

linkedin.com/in/dtunkelang