data science in 2016: moving up by paco nathan at big data spain 2015
![Page 1: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/1.jpg)
![Page 2: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/2.jpg)
Data Science in 2016: Moving Up
2015-10-15 • Madrid • http://bigdataspain.org/
Paco Nathan, @pacoid O’Reilly Media
![Page 3: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/3.jpg)
• general patterns
• trends and analysis: the discipline, the jobs
• some good examples: moving up into use cases
• glimpses ahead: an emerging context
• a proposed theme
Data Science 2016: Moving Up
![Page 4: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/4.jpg)
Design Patterns
![Page 5: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/5.jpg)
Design Patterns
Methodology for cloud-computing architecture (2008-06-29) http://ceteri.blogspot.com/2008/06/methodology-for-cloud-computing.html
![Page 6: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/6.jpg)
cluster scheduler
data pipes
some cloud
containers
analytics
search/index
elastic compute
elastic storage
Design Patterns
![Page 7: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/7.jpg)
Design Patterns
some cloud
![Page 8: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/8.jpg)
Design Patterns
some cloud
DataStax $189.7M
Confluent $30.9M
Databricks $47M
Jupyter $6M
Elastic $104M
Docker $162M
Mesosphere $48.75M
![Page 9: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/9.jpg)
Design Patterns: Issues
some cloud
• integration could be better
• that implies sharing markets
• VCs in Silicon Valley dislike that
• customers need integration
![Page 10: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/10.jpg)
some cloud
Design Patterns: Where?
![Page 11: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/11.jpg)
Design Patterns: Where?
some cloud
![Page 12: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/12.jpg)
Design Patterns: Where?
some cloud
![Page 13: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/13.jpg)
Design Patterns: Where?
some cloud
![Page 14: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/14.jpg)
Design Patterns: Where?
some cloud
![Page 15: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/15.jpg)
Design Patterns: Where?
some cloud
• that playing field becomes overly crowded, soon…
• what happens at that point?
![Page 16: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/16.jpg)
• so much emphasis on plumbing: `data engineering`
• not enough on domain expertise, which trumps all
Much activity in Big Data seems awkwardly focused at the bottom of the tech stack: infrastructure, not domain
However, that may be changing…
Design Patterns: Opinion
![Page 17: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/17.jpg)
Interesting Trends
![Page 18: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/18.jpg)
Interesting Trends
There are many possible trends to discuss, but let’s concentrate on four of these going into 2016:
• leveraging multicore and large memory spaces
• generalized libraries for frequently repeated work
• workflows blend the best of people and computing
• framework for a big leap ahead, not just incremental
![Page 19: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/19.jpg)
Original definitions for what became relational databases had less to do with dedicated SQL products and more in common with something like Spark SQL
Interesting Trend #1: Contemporary Hardware
A relational model of data for large shared data banks Edgar Codd Communications of the ACM (1970) dl.acm.org/citation.cfm?id=362685
![Page 20: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/20.jpg)
Python Java/Scala R SQL …
DataFrame Logical Plan
LLVM JVM GPU NVRAM
Unified API, One Engine, Automatically Optimized
Tungsten backend
language frontend
…
from Databricks
Interesting Trend #1: Contemporary Hardware
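As a rough illustration of the "unified API, one engine, automatically optimized" idea on this slide, the Python sketch below records a lazy logical plan and fuses adjacent filters before running it. The `LazyFrame` class, its methods, and the sample rows are hypothetical names and data invented here. This is not Spark's API, and it is far simpler than what Spark SQL actually does.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class LazyFrame:
    """Hypothetical lazy table: records a plan instead of running eagerly."""
    rows: Tuple[Row, ...]
    plan: Tuple[Tuple[str, object], ...] = ()

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyFrame":
        return LazyFrame(self.rows, self.plan + (("filter", predicate),))

    def select(self, *columns: str) -> "LazyFrame":
        return LazyFrame(self.rows, self.plan + (("select", columns),))

    def optimized_plan(self) -> List[Tuple[str, object]]:
        """Fuse runs of adjacent filters into a single step (one pass over rows)."""
        fused: List[Tuple[str, object]] = []
        for op, arg in self.plan:
            if op == "filter" and fused and fused[-1][0] == "filter":
                previous = fused[-1][1]
                fused[-1] = ("filter", lambda r, p=previous, q=arg: p(r) and q(r))
            else:
                fused.append((op, arg))
        return fused

    def collect(self) -> List[Row]:
        """Execute the optimized plan; nothing runs until this is called."""
        result = list(self.rows)
        for op, arg in self.optimized_plan():
            if op == "filter":
                result = [r for r in result if arg(r)]
            else:  # "select"
                result = [{c: r[c] for c in arg} for r in result]
        return result


# Invented sample rows, for illustration only.
events = LazyFrame((
    {"city": "A", "year": 2015, "talks": 60},
    {"city": "A", "year": 2014, "talks": 45},
    {"city": "B", "year": 2015, "talks": 30},
))
query = (events.filter(lambda r: r["year"] == 2015)
               .filter(lambda r: r["talks"] > 40)
               .select("city", "talks"))
print(len(query.plan), "logical steps ->", len(query.optimized_plan()), "physical steps")
print(query.collect())
```

The caller only describes what is wanted; the engine is free to rearrange how it is computed, which is the property the slide attributes to the DataFrame logical plan.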
![Page 21: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/21.jpg)
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal Josh Rosen spark-summit.org/2015/events/deep-dive-into-project-tungsten-bringing-spark-closer-to-bare-metal/
Physical Execution: CPU Efficient Data Structures
Keep data closer to CPU cache
Interesting Trend #1: Contemporary Hardware
from Databricks
![Page 22: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/22.jpg)
Interesting Trend #2: Generalized Libraries
Tensors are a good way to handle time-series geo-spatially distributed linked data with lots of N-dimensional attributes
In other words, nearly a general case for handling much of the data that we’re likely to encounter
That’s better than attempting to shoehorn data into matrix representation, then writing lots of custom code to support it
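A minimal sketch of the idea, assuming a sparse coordinate layout in plain Python: each observation is indexed by (time, location, attribute), so adding another attribute or site needs no change to the layout. The site names, readings, and helper functions are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Sparse order-3 tensor in coordinate form: (time, location, attribute) -> value.
# All keys and readings below are invented for illustration.
Index = Tuple[str, ...]
tensor: Dict[Index, float] = {
    ("2015-10-14", "site_a", "temperature"): 21.5,
    ("2015-10-14", "site_a", "humidity"): 0.40,
    ("2015-10-14", "site_b", "temperature"): 18.0,
    ("2015-10-15", "site_a", "temperature"): 22.5,
    ("2015-10-15", "site_b", "temperature"): 19.0,
}


def slice_mode(t: Dict[Index, float], mode: int, label: str) -> Dict[Index, float]:
    """Fix one mode (0=time, 1=location, 2=attribute) and drop it from the keys."""
    return {
        key[:mode] + key[mode + 1:]: value
        for key, value in t.items()
        if key[mode] == label
    }


def mean_over_mode(t: Dict[Index, float], mode: int) -> Dict[Index, float]:
    """Average out one mode, e.g. collapse time to get a per-location summary."""
    totals, counts = defaultdict(float), defaultdict(int)
    for key, value in t.items():
        reduced = key[:mode] + key[mode + 1:]
        totals[reduced] += value
        counts[reduced] += 1
    return {key: totals[key] / counts[key] for key in totals}


temperature = slice_mode(tensor, mode=2, label="temperature")  # time x location
per_site = mean_over_mode(temperature, mode=0)                 # mean per site
print(per_site)
```

The same two helpers work for any number of modes, which is the contrast with forcing the data into one fixed matrix and writing custom code around it.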
![Page 23: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/23.jpg)
Tensor factorization may be problematic, but probabilistic solutions seem to provide relatively general case solutions:
The Tensor Renaissance in Data Science Anima Anandkumar @UC Irvine radar.oreilly.com/2015/05/the-tensor-renaissance-in-data-science.html
Spacey Random Walks and Higher Order Markov Chains David Gleich @Purdue slideshare.net/dgleich/spacey-random-walks-and-higher-order-markov-chains
Interesting Trend #2: Generalized Libraries
![Page 24: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/24.jpg)
Interesting Trend #3: Leveraging Workflows
[Diagram, circa 2010: a data science workflow. Labels: ETL into cluster/cloud; data prep; features; explore; unsupervised learning; visualize, reporting; learners, parameters; train set; test set; models; evaluate; optimize; scoring; production data; data pipelines; use cases; actionable results; decisions, feedback; foo algorithms; bar developers. Phases: representation, optimization, evaluation.]
APIs, algorithms, developer-centric template thinking – these only go so far; the overall context is a workflow…
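To make the workflow framing concrete, here is a minimal Python sketch that runs several of the stages named in the diagram (data prep, features, train/test split, learner, evaluation, scoring) end to end. The synthetic data, the single-threshold learner, and all function names are hypothetical. A real workflow would also route results back to people for decisions and feedback.

```python
import random
from typing import List, Tuple

Example = Tuple[float, int]  # (feature, label)


def prepare(raw: List[Tuple[str, int]]) -> List[Example]:
    """Data prep + features: drop malformed records, parse the numeric feature."""
    cleaned = []
    for text, label in raw:
        try:
            cleaned.append((float(text), label))
        except ValueError:
            continue  # skip rows that fail to parse
    return cleaned


def split(data: List[Example], test_fraction: float,
          seed: int) -> Tuple[List[Example], List[Example]]:
    """Shuffle reproducibly, then hold out a test set."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(threshold: float, data: List[Example]) -> float:
    """Evaluate: share of examples where (feature > threshold) matches the label."""
    hits = sum(1 for x, y in data if int(x > threshold) == y)
    return hits / len(data)


def train(train_set: List[Example]) -> float:
    """Learner + optimize: pick the candidate threshold with best training accuracy."""
    candidates = sorted(x for x, _ in train_set)
    return max(candidates, key=lambda t: accuracy(t, train_set))


# Synthetic raw data: label is 1 when the value exceeds 5, plus one malformed row.
raw = [(str(v), int(v > 5)) for v in range(0, 11)] + [("n/a", 0)]
data = prepare(raw)
train_set, test_set = split(data, test_fraction=0.3, seed=42)
model = train(train_set)
print("threshold:", model, "test accuracy:", accuracy(model, test_set))
print("score for 7.5:", int(7.5 > model))  # scoring a new production record
```

Each function maps to one box in the diagram, and the learner itself is only one small step among them.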
![Page 25: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/25.jpg)
look beyond an API, beyond a code repo … think of people and machines working together
Interesting Trend #3: Leveraging Workflows
![Page 26: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/26.jpg)
Chris Ré, @Stanford https://www.macfound.org/fellows/943/
Drugs, DNA, and Dinosaurs: Building High Quality Knowledge Bases with DeepDive Strata CA (2015)
The Thorn in the Side of Big Data: too few artists Strata CA (2014)
Interesting Trend #4: A Leap Ahead
![Page 27: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/27.jpg)
Interesting Trend #4: A Leap Ahead
cognitive computing “flywheel”: probabilistic reasoning about complex data and predictions together
![Page 28: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/28.jpg)
Interesting Trend #4: A Leap Ahead
![Page 29: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/29.jpg)
Data Scientists
![Page 30: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/30.jpg)
William Cleveland “Data Science: an Action Plan for Expanding the Technical Areas of the Field of Statistics,” International Statistical Review (2001), 69, 21-26 http://www.stat.purdue.edu/~wsc/papers/datascience.pdf
Leo Breiman “Statistical modeling: the two cultures”, Statistical Science (2001), 16:199-231 http://projecteuclid.org/euclid.ss/1009213726
…also good to mention John Tukey
Data Scientists: Primary Sources
![Page 31: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/31.jpg)
Data Scientists: Five Years of Strata Conference
![Page 32: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/32.jpg)
One 2015 report (RJMetrics) tallied a minimum of 11,400 data scientists worldwide by scraping LinkedIn
So many suddenly, really? Perhaps that’s doubtful…
Comparing surveys: O’Reilly Media conducts salary surveys of data scientists, which also explore the tools they use
2013 – tools, trends, not all data is “Big”, coding scripts!
2014 – correlation of tools and skills, rapid evolution
2015 – divide blurring between open source and proprietary
Data Scientists: Everywhere, all the time?
![Page 33: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/33.jpg)
http://radar.oreilly.com/2015/09/2015-data-science-salary-survey.html John King, Roger Magoulas
Data Scientists: 2015 Survey
![Page 34: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/34.jpg)
Data Scientists: 2015 Survey
![Page 35: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/35.jpg)
Moving Up
![Page 36: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/36.jpg)
Enlitic http://www.enlitic.com/
deep learning to assist doctors treating cancer
Moving Up: Medicine
![Page 37: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/37.jpg)
Moving Up: Medicine
“Whatever the models might discover or predict, Howard isn’t suggesting they’ll do away with a doctor’s judgment. Rather, artificially intelligent computers could provide strong, unbiased second opinions, or perhaps lead a doctor down a path of investigation she otherwise wouldn’t have considered.”
With Enlitic, a veteran data scientist plans to fight disease using deep learning GigaOM (2014-08-22) https://gigaom.com/2014/08/22/with-enlitic-a-veteran-data-scientist-plans-to-fight-disease-using-deep-learning/
![Page 38: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/38.jpg)
Moving Up: Political Platform
http://www.predikon.ch/en/voting-patterns/residents
![Page 39: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/39.jpg)
Moving Up: Political Platform
Mining Democracy Matthias Grossglauser @EPFL ICT Labs (2015) http://ictlabs-summer-school.sics.se/slides/mining%20democracy.pdf
What if a political candidate could cluster political positions in a multi-dimensional data space, to optimize for being recommended to voters?
http://www.predikon.ch/en/voting-patterns/residents
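One way to picture the question on this slide: treat voters and platforms as points in an issue space, assume a recommender that suggests the nearest platform to each voter, and search for the platform that wins the most recommendations. The Python sketch below does this by brute force over a grid. The voter data, the nearest-platform rule, and the function names are all assumptions made for illustration; this is not the method used by Predikon or in the cited talk.

```python
from itertools import product
from math import dist
from typing import List, Sequence, Tuple

Position = Tuple[float, ...]

# Invented data: each voter and rival is a point in a 2-issue space, scores in [-1, 1].
voters: List[Position] = [
    (-0.8, -0.6), (-0.7, -0.9), (-0.9, -0.7),          # one cluster of opinion
    (0.6, 0.7), (0.8, 0.5), (0.7, 0.9), (0.9, 0.8),    # a larger cluster
    (0.0, 0.1),                                        # a centrist voter
]
rivals: List[Position] = [(-0.8, -0.7), (0.0, 0.0)]


def recommendations_won(candidate: Position, rivals: Sequence[Position],
                        voters: Sequence[Position]) -> int:
    """Count voters whose nearest platform (Euclidean distance) is the candidate's."""
    return sum(
        1 for v in voters
        if dist(v, candidate) < min(dist(v, r) for r in rivals)
    )


def best_position(rivals: Sequence[Position], voters: Sequence[Position],
                  steps: int = 21) -> Tuple[Position, int]:
    """Brute-force search over a grid of platforms in [-1, 1] x [-1, 1]."""
    axis = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    return max(
        ((p, recommendations_won(p, rivals, voters)) for p in product(axis, repeat=2)),
        key=lambda pair: pair[1],
    )


position, won = best_position(rivals, voters)
print("platform:", position, "recommendations won:", won)
print("at the large cluster's middle:", recommendations_won((0.7, 0.7), rivals, voters))
```

In this toy data, a platform placed in the middle of the largest cluster wins fewer recommendations than one positioned to undercut a rival, which hints at why optimizing for a recommender raises the questions the talk points to.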
![Page 40: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/40.jpg)
Moving Up: Government Ethics
The White House has a plan to help society through data analysis Fortune (2015-09-30) http://fortune.com/2015/09/30/dj-patil-white-house-data/
![Page 41: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/41.jpg)
Moving Up: Government Ethics
The White House has a plan to help society through data analysis Fortune (2015-09-30) http://fortune.com/2015/09/30/dj-patil-white-house-data/
“Opening up government data about child labor to concerned data scientists; recruiting folks to help analyze data about suicide prevention, social injustice and incarceration; a call for mandatory and `intrinsic` ethics instruction in every course teaching students data science; and an effort to help the transgender community create its own census of sorts, so that members and society can get a better grasp on the issues that matter to the group.”
![Page 42: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/42.jpg)
Moving Up: Neuroscience
Analytics + Visualization for Neuroscience: Spark, Thunder, Lightning Jeremy Freeman 2015-01-29 youtu.be/cBQm4LhHn9g?t=28m55s
![Page 43: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/43.jpg)
For excellent examples of Science and Data together see CodeNeuro, particularly for use of Jupyter notebooks + Apache Spark
Moving Up: Neuroscience
![Page 44: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/44.jpg)
Learning
![Page 45: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/45.jpg)
Learning: What About MOOCs?
![Page 46: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/46.jpg)
Massive Open Online Courses – seven-year trend, beginning with:
Connectivism and Connective Knowledge George Siemens, Stephen Downes University of PEI (2008) http://cck11.mooc.ca/
Learning: What About MOOCs?
Adios Ed Tech. Hola something else George Siemens (2015-09-09) http://www.elearnspace.org/blog/2015/09/09/adios-ed-tech-hola-something-else/
![Page 47: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/47.jpg)
Online education: MOOCs taken by educated few Ezekiel Emanuel, Nature 503, 342 (2013-11-21)
• 80% of students already have an advanced degree
• 80% come from the richest 6% of the population
Michael Shanks @Stanford: “retrenchment around traditional disciplines will make disparities even more pronounced”
An Early Report Card on Massive Open Online Courses Geoffrey Fowler, WSJ (2013-10-08)
Amherst, Duke, etc., have rejected edX
Learning: What About MOOCs?
![Page 48: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/48.jpg)
Learning: What About MOOCs?
So then, what else works better?
![Page 49: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/49.jpg)
How to Flip a Class CTL @UT/Austin http://ctl.utexas.edu/teaching/flipping-a-class/how
1. identify where the flipped classroom model makes the most sense for your course
2. spend class time engaging students in application activities with feedback
3. clarify connections between inside and outside of class learning
4. adapt your materials for students to acquire course content in preparation of class
5. extend learning beyond class through individual and collaborative practice
Learning: Inverted Classroom
![Page 50: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/50.jpg)
Scalable Learning David Black-Schaffer @Uppsala, Sverker Janson @KTH SICS https://www.scalable-learning.com/
• active learning: Flipped Classroom and Just-in-time Teaching
• exams built directly into specific diagrams within videos
• metrics for where in video+code that students get stuck
• instructor can customize subsequent classroom discussions (active teaching phase) based on stuck/unstuck metrics
Learning: Inverted Classroom
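The "stuck/unstuck" metrics above could be approximated along these lines: bucket pause and rewind events by position in the video and flag the segments where a large share of the class hesitated. The event log format, the 30-second buckets, and the 50% threshold are all assumptions for this sketch; how Scalable Learning actually computes its metrics is not described in the slides.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical event log: (student_id, second_in_video, event). Invented data.
events: List[Tuple[str, int, str]] = [
    ("s1", 95, "rewind"), ("s2", 100, "pause"), ("s3", 110, "rewind"),
    ("s1", 240, "pause"), ("s4", 98, "rewind"), ("s2", 400, "pause"),
]


def stuck_segments(events: List[Tuple[str, int, str]], class_size: int,
                   bucket_seconds: int = 30, min_share: float = 0.5) -> Dict[int, float]:
    """Return {segment_start_second: share_of_class} for segments where at
    least `min_share` of students paused or rewound."""
    students_by_bucket: Dict[int, Set[str]] = defaultdict(set)
    for student, second, event in events:
        if event in {"pause", "rewind"}:
            bucket = (second // bucket_seconds) * bucket_seconds
            students_by_bucket[bucket].add(student)
    return {
        start: len(students) / class_size
        for start, students in sorted(students_by_bucket.items())
        if len(students) / class_size >= min_share
    }


print(stuck_segments(events, class_size=4))
```

An instructor could use the flagged segment (here the one starting at second 90) to choose what to revisit in the next classroom session.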
![Page 51: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/51.jpg)
Learning programming at scale Philip Guo O’Reilly Radar (2015-08-13) http://radar.oreilly.com/2015/08/learning-programming-at-scale.html
• PythonTutor
• Codechella
Tutors could keep an eye on around 50 learners during a 30-minute session, start 12 chat conversations, and help 3 learners concurrently
Learning: Collaborative Learning
![Page 52: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/52.jpg)
Data-driven Education and the Quantified Student Lorena Barba @GWU PyData Seattle (2015) https://youtu.be/2YIZ2SY9mW4
• keynote talk: abstract, slides
• homepage
• Open edX Universities Symposium, DC 2015-11-11
Learning: If you study just one link from this talk…
![Page 53: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/53.jpg)
If by some bizarre chance you haven’t used it already, go to https://jupyter.org/
• 50+ different language kernels
• new funding 2015-07
• UC Berkeley, Cal Poly
• nbgrader autograder by Jess Hamrick
• jupyterhub multi-user server
• curating a list of examples
• repeatable science!
see also: Teaching with Jupyter Notebooks http://tinyurl.com/scipy2015-education
Learning: Jupyter Project
![Page 54: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/54.jpg)
Embracing Jupyter Notebooks at O'Reilly Andrew Odewahn O’Reilly Media (2015-05-07) https://beta.oreilly.com/ideas/jupyter-at-oreilly
O’Reilly Media is using our Atlas platform to make Jupyter Notebooks a first class authoring environment for our publishing program
Jupyter, Thebe, Atlas, Docker, etc.
Learning: O’Reilly Media
![Page 55: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/55.jpg)
Learning: O’Reilly Media
https://beta.oreilly.com/
![Page 56: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/56.jpg)
in-person blended on-demand
MostlySynchronous
MostlyAsynch
InvertedClassroom
Subscription
Free
Content
Learning: Audience Patterns
![Page 57: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/57.jpg)
Is it possible to measure “distance” between a learner and a subject community?
From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews Julian McAuley, Jure Leskovec http://i.stanford.edu/~julian/pdfs/www13.pdf
Learning: Machine Learning about People Learning
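One simple way to put a number on that "distance", offered only as a sketch: compare the vocabulary a learner uses with the vocabulary of the community, using cosine distance between bag-of-words counts. This is a stand-in chosen for brevity, not the model from the McAuley and Leskovec paper, and the text snippets are invented.

```python
from collections import Counter
from math import sqrt
from typing import Iterable


def term_counts(texts: Iterable[str]) -> Counter:
    """Bag-of-words counts over lowercase whitespace tokens."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity: near 0.0 for the same term mix, 1.0 for no shared terms."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


# Invented snippets standing in for community posts and two learners' questions.
community = term_counts(["spark dataframe join shuffle partition",
                         "dataframe schema partition cache"])
novice = term_counts(["how to open csv file", "install spark error"])
regular = term_counts(["dataframe join shuffle slow", "cache partition schema"])

print("novice :", round(cosine_distance(novice, community), 3))
print("regular:", round(cosine_distance(regular, community), 3))
```

Tracking how this number shrinks over time for a given learner would be one crude proxy for movement from amateur toward connoisseur.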
![Page 58: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/58.jpg)
Learning, Assessment, Team Building, Diversity – these can be accomplished together, in situ
Collective Intelligence in Human Groups Anita Williams Woolley @CMU https://youtu.be/Bz1dDiW2mvM
• balance of participation (no one dominates)
• 2+ women engaging within the group
• group size < 9
• diversity of formal backgrounds
Learning: Machine Learning about People Learning
![Page 59: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/59.jpg)
People + Automation
![Page 60: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/60.jpg)
Data Science teams apply machine learning (automation) to help arrive at key insights, to learn what is important in data sets – finding the proverbial needle in the haystack
Cognitive Computing exhibits people + automation as a process, in a learning context
That’s also a basic tenet of workflows in general: people + automation
And a key aspect of the emerging gig economy too…
People + Automation
![Page 61: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/61.jpg)
People + Automation: Gig Economy
![Page 62: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/62.jpg)
People + Automation: Gig Economy
http://orchestra.unlimitedlabs.com/
“Workflows with humans and machines”
![Page 63: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/63.jpg)
People + Automation: Gig Economy
Workers in a World of Continuous Partial Employment Tim O’Reilly Medium (2015-08-31) https://medium.com/the-wtf-economy/workers-in-a-world-of-continuous-partial-employment-4d7b53f18f96
http://conferences.oreilly.com/next-economy
![Page 64: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/64.jpg)
Learning is key. Effective use of Data Science in these new economic conditions requires people + automation, learning together – albeit in different ways. Plus, there’s an excellent framework for that:
Autopoiesis and Cognition Humberto Maturana, Francisco Varela Springer (1973) https://books.google.es/books?id=nVmcN9Ja68kC
People + Automation
![Page 65: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/65.jpg)
I’d like to leave this as a theme for you to consider about Data Science 2016, Moving Up into use cases…
We see an intersection of key points in both the emerging Cognitive Computing context and the Gig Economy in general:
systems of people + automation, learning together
It posits an interesting duality for us to leverage
With that I wish you a great conference here at Big Data Spain!
People + Automation
![Page 66: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/66.jpg)
Gracias
![Page 67: Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015](https://reader031.vdocuments.us/reader031/viewer/2022030402/587414971a28abcb5b8b5065/html5/thumbnails/67.jpg)
contact:
Just Enough Math O’Reilly (2014)
justenoughmath.com
preview: youtu.be/TQ58cWgdCpA
monthly newsletter for updates, events, conf summaries, etc.: liber118.com/pxn/
Intro to Apache Spark O’Reilly (2015) shop.oreilly.com/product/0636920036807.do