The Science of Managing Data Scientists
DESCRIPTION
Creating great products powered by big data can be challenging. Data science work is often ambiguous, which can make results unpredictable and scheduling almost impossible. Many popular software engineering processes just won’t work for these innovative and ambitious projects. Waterfall falls apart; it doesn’t make sense to define the product before understanding the limitations of the data and technology. And shoehorning data experiments into tight agile sprints is difficult, and it doesn’t necessarily lend itself to discoveries that involve a lot of inspiration and perspiration before a light bulb moment. Even with a working process, few teams collaborate truly effectively. Projects that involve machine learning, algorithm development, or other deeply technical endeavors are filled with advanced math and complicated terminology, which leaves plenty of teams with communication gaps that prevent the synergy realized when working cohesively. Thankfully there are solutions to these problems! Based on personal experience and interviews with many other leaders spearheading big data initiatives, this session aims to distill these lessons into actionable strategies you can use to improve process and communication for your own team.
TRANSCRIPT
The Science of Managing Data Scientists
Kate Matsudaira
Tuesday, February 26, 13
What are they doing all day?
http://data.whicdn.com/images/29273643/funny-science-news-experiments-memes-super-science_large.jpg
Data science is different
Research doesn’t fit the traditional SDLC
image src: http://laurajul.dk/wp-content/uploads/2011/10/Screen-shot-2011-10-06-at-00.03.57.png
image src: legoexpress.tumblr.com
Good help is hard to find
(and keep)
What are they doing all day?
image src: www.ideachampions.com
Bring transparency
Logistics
Trust
Communication
Communication
Do you speak the same language?
image src: abclang.livejournal.com
What does it mean to be finished?
image source: ladywhodoesntlunch.blogspot.com
Define “quality”
image source: http://www.sodahead.com/fun/
What do precision and recall mean?
Give them a lesson in semantics
Before
Precision: 80%
Recall: 25%
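For readers who want the two terms pinned down, here is a minimal sketch in Python. The function name `precision_recall` and the sample item IDs are hypothetical; the sample is chosen only so that the result matches the 80% / 25% figures on this slide.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a collection of retrieved items.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 5 results returned, 4 of them relevant,
# out of 16 relevant items in total.
retrieved = ["d1", "d2", "d3", "d4", "x1"]
relevant = [f"d{i}" for i in range(1, 17)]
p, r = precision_recall(retrieved, relevant)
print(f"Precision: {p:.0%}  Recall: {r:.0%}")
```

Even with the definitions in hand, the raw percentages say little to a non-specialist, which is the point of the "Before" slide.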
After
For the top search terms*, accessories appear 25% of the time on the first page.
For the top search terms*, the head products are present 90% of the time in the top 3 results and 98% of the time in the top 5.
* Top search terms are the 1000 most popular queries on our website over the last 30 days.
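A business-facing metric like "head product in the top 3" can be computed along these lines. This is a sketch under stated assumptions, not the method used for the numbers on the slide: the function name `top_k_presence_rate`, the data shapes, and the four sample queries are all hypothetical.

```python
def top_k_presence_rate(results_by_query, head_product_by_query, k):
    """Fraction of queries whose head product appears in the top k results."""
    if not results_by_query:
        return 0.0
    hits = sum(
        1
        for query, results in results_by_query.items()
        if head_product_by_query.get(query) in results[:k]
    )
    return hits / len(results_by_query)


# Hypothetical ranked results for four queries (a real run would use the
# top search terms as defined on the slide).
results = {
    "laptop": ["macbook-air", "sleeve", "charger", "mouse", "dock"],
    "tv": ["hdmi-cable", "wall-mount", "remote", "bravia-ex620", "stand"],
    "phone": ["case", "phone-x", "charger", "screen-protector", "cable"],
    "camera": ["tripod", "bag", "lens-cap", "sd-card", "strap"],
}
head = {
    "laptop": "macbook-air",
    "tv": "bravia-ex620",
    "phone": "phone-x",
    "camera": "camera-z",
}
print(f"top 3: {top_k_presence_rate(results, head, 3):.0%}")
print(f"top 5: {top_k_presence_rate(results, head, 5):.0%}")
```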
Measure with data that matters
image source: http://www.freepatentsonline.com/6971185-0-large.jpg
This is a really hard problem.
We know it is hard... but we don’t know why it is hard.
Use analogies & examples
The dandelion swayed in the gentle breeze like an oscillating electric fan set on medium.
Constructing model lineages for products is really difficult.
Be specific
Macbook Air
Macbook
Macbook Pro
?
Bravia EX500 LCD
Bravia EX620 LED
Bravia EX523 LCD
?
Take it up a level
Product Evolution
We are building a lineage
image src: www.clarkgriswoldcollection.com
Show before
and after
Logistics
How do you create a sense of urgency?
image source: http://favim.com/image/503182/
What are the reasons behind it?
image source: http://dailyfailcenter.com/sites/default/files/fail/explain-this-shit.jpg
Deadlines in research?
They don’t call it research for nothin’
image source: http://sausagetails.com/2012/07/
You can’t predict the future
image source: maxseesmovies.blogspot.com
Applying “agile” to R&D
image source:http://mousebreath.com/wp-content/uploads/2011/08/funny-farmers.jpg
Backlog of experiments
Regular demos
Defined workflow with iterations
Trips to Hawaii
That doesn’t sound agile....
Start here: Collect data
Do we have the right data? (No: collect more data. Yes: continue.)
Do we have the infrastructure to analyze the data? (No: build it... Yes: continue.)
Run experiments
Get results
Are we finished or do we need more.....? (more research, more data)
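One way to read the workflow on this slide is as a loop. The following is a rough sketch only: the function `research_loop`, the step names, the toy scoring, and the `max_rounds` time box are all assumptions added for illustration, and the Yes/No routing reflects one reading of the flowchart.

```python
def research_loop(steps, max_rounds=10):
    """Walk the workflow until the team decides it is finished.

    `steps` is a dict of callables (hypothetical names):
      collect_data(), has_right_data(), has_infrastructure(),
      build_infrastructure(), run_experiments() -> results,
      is_finished(results) -> bool
    Returns (results, rounds_used). `max_rounds` is an assumed time box.
    """
    results = None
    for round_number in range(1, max_rounds + 1):
        steps["collect_data"]()
        if not steps["has_right_data"]():
            continue  # No: go back and collect more data
        if not steps["has_infrastructure"]():
            steps["build_infrastructure"]()  # No: build it...
        results = steps["run_experiments"]()
        if steps["is_finished"](results):
            return results, round_number
    return results, max_rounds


# Toy stand-ins for the real steps; the scores are made up.
state = {"batches": 0, "infra_ready": False}


def collect_data():
    state["batches"] += 1


def build_infrastructure():
    state["infra_ready"] = True


toy_steps = {
    "collect_data": collect_data,
    "has_right_data": lambda: state["batches"] >= 2,
    "has_infrastructure": lambda: state["infra_ready"],
    "build_infrastructure": build_infrastructure,
    "run_experiments": lambda: 50 + 10 * state["batches"],
    "is_finished": lambda score: score >= 80,
}
results, rounds = research_loop(toy_steps)
print(results, rounds)
```

Capping the number of rounds is one possible way to give open-ended research a checkpoint without pretending the outcome is predictable.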
Experiments can take a while...
image source: www.wallpapermay.com
Take tools out of the equation
image source: http://www.holesinyoursocks.com/2011/02/14/funny-monday-tool-love-happy-valentines-day/
Focus on feeds and files
Image source: http://carcat.files.wordpress.com/2009/03/funny-pictures-cat-searches-for-a-file.jpg
Format & storage standards
/prices
/date=2012-07-01
price-obs.2012-07-01.csv.gz
/date=2012-07-02
/full
/date=2012-07-01
2012-07-01T00-10-00.csv.gz
/inc
2012-07-01T00-20-00.csv.gz
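A naming convention like the one on this slide is easy to enforce with a couple of helpers. This is a minimal sketch, assuming `/prices` is the root directory that holds the `date=` partitions; the function names `daily_price_path` and `snapshot_filename` are hypothetical.

```python
from datetime import date, datetime


def daily_price_path(day, root="/prices"):
    """Path of the daily price-observation file for `day`."""
    stamp = day.isoformat()  # e.g. 2012-07-01
    return f"{root}/date={stamp}/price-obs.{stamp}.csv.gz"


def snapshot_filename(moment):
    """File name for a timestamped feed file, e.g. 2012-07-01T00-10-00.csv.gz."""
    return moment.strftime("%Y-%m-%dT%H-%M-%S") + ".csv.gz"


print(daily_price_path(date(2012, 7, 1)))
print(snapshot_filename(datetime(2012, 7, 1, 0, 10, 0)))
```

Generating every path through shared helpers keeps producers and consumers of a feed agreeing on where files live, whatever tools each side uses.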
Where is your golden set?
image source: ads/2011/09/4bc2dd714ecb9eebb3a66d074638.jpeg
Trust
What are they doing all day?
Building trust
image source: http://writealoud.com/funny-dinosaur-pictures/
Their motivations
image src: http://www.fredhoogervorst.com/oni.app/local/upload/03897400db.jpg
Hard problems to solve
Recognition for a job well done
Open the door to higher-ups
My work in the wild serving customers
Being GOLD!
Minimize time-to-ship
image source: : http://2smallerwheels.blogspot.com/2011/09/speedy-delivery.html
Let them own the data
image source: http://www.cloudproviderusa.com/weekly-dose-of-humor-mo-data-mo-problems/
Let me write a program
image src: http://funnyfilez.funnypart.com/pictures/FunnyPart-com-geek_on_the_go.jpg
Algorithms can’t solve everything
Even though we wish they could!
Manual data can be awesome
image source: http://www.funnyjunk.com/funny_pictures/4416618/LOOK+AT+DE+PUPPY/
Can you do the impossible?
image source: anybody-want-a-peanut.blogspot.com
Data miners aren’t magicians
image source: http://1.bp.blogspot.com/_zmT48r8jfbE/S1z4t5L7JxI/AAAAAAAAAJU/1vYZmc5lhBU/s400/steve-brooks-magician.jpg
You CAN make failure look good
image source: http://fightthebees.com/wp-content/uploads/2011/11/cow-fail.jpg
Your challenge:
Take a difficult problem and transform it into a feature
Should you choose to accept it....
HARD: Value Estimation
1.7GHz dual-core Intel Core i5 $300
256GB flash storage
Intel HD Graphics 4000
Apple premium
$250
8GB SDRAM
$200
$200
$100
Estimated Value: $1050
EASIER: Comparing Products
Think with your business hat
image source: memegenerator.net
How do you surface ideas & insights?
image source: http://static.someecards.com/someecards/usercards/1327680406361_4248272.png
Show ‘em
image source: neslihandurmusoglu.edublogs.org
And show ‘em often
image source: http://www.social-science.co.uk/research/
Negative results happen
image source: http://www.theofrak.com/2012/09/existential-star-wars.html
The journey is the reward
image source: failblog.org
Logistics
Trust
Communication
Questions
http://katemats.com
@katemats
May the forces of evil get lost on the way to your doorstep.