TRANSCRIPT
DATA SCIENCE POP-UP
AUSTIN
Surfing Silver: Dynamic Bayesian Forecasting for Fun and Profit
Jonathan Dinu, Author and Teacher
clearspandex
SURFING SILVER: DYNAMIC BAYESIAN FORECASTING FOR FUN AND PROFIT
Jonathan Dinu // April 13th, 2016 // @clearspandex
THE 2008 ELECTION
let me tell you a little story...
> Nate Silver
> Drew Linzer
> Josh Putnam
> Simon Jackman
THE THEORY BEHIND THE MAGIC
Courtesy of 538 and Drew Linzer (Votamatic)
CHALLENGES
> Historical Predictions susceptible to Uncertainty
> Sparse pre-election Poll Data
> Sampling Error and House Effects Bias Polls
WHAT DREW (AND NATE) DID DIFFERENTLY
> State-level vs. National Polls
> Online Updates as more data become available
> Not All Polls are Created Equal (weights/averages)
> (Probabilistic) Forecasting in addition to Estimation
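The online-update and weighting bullets above can be sketched as a conjugate Normal update. This is a minimal illustration, not the model used by 538 or Votamatic: the function name `update_normal`, the prior, and the poll numbers are all hypothetical, and each poll is weighted only by its binomial sampling precision (no house-effect correction).

```python
import math

def update_normal(prior_mean, prior_sd, poll_share, poll_n):
    """Update a Normal belief about a candidate's vote share with one poll.

    Assumption: the poll is an unbiased sample, so its variance is the
    binomial approximation p * (1 - p) / n.
    """
    poll_var = poll_share * (1.0 - poll_share) / poll_n
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    poll_prec = 1.0 / poll_var         # larger polls carry more weight
    post_prec = prior_prec + poll_prec
    post_mean = (prior_prec * prior_mean + poll_prec * poll_share) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical prior (e.g. from a historical model) and hypothetical polls.
mean, sd = 0.52, 0.03
polls = [(0.49, 600), (0.51, 1200), (0.50, 400)]  # (share, sample size)

for share, n in polls:
    mean, sd = update_normal(mean, sd, share, n)
    print(f"after poll (share={share:.2f}, n={n}): mean={mean:.4f}, sd={sd:.4f}")
```

Each new poll pulls the estimate toward the data and shrinks the uncertainty, so the forecast can be refreshed as polls arrive instead of being refit from scratch.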
DYNAMIC BAYESIAN FORECASTING [2]
[Slide equations for the National, State, and Forecasts components are not captured in this transcript]
Not shown here: informative priors based on historical predictions
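For orientation, here is a sketch of the general shape such a model takes, loosely following the structure of the Linzer paper listed in REFERENCES. It is a reconstruction under assumed notation, not the slide's own equations: $i$ indexes states, $j = 1, \dots, J$ indexes days with $J$ as election day, $h_i$ is a historical (structural) forecast for state $i$, and $s_i^2$ is the variance of the prior around it.

```latex
\begin{align*}
y_k &\sim \mathrm{Binomial}\left(n_k,\ \pi_{i[k]\,j[k]}\right) && \text{poll } k \text{, taken in state } i \text{ on day } j \\
\operatorname{logit}(\pi_{ij}) &= \beta_{ij} + \delta_j && \text{state effect plus national effect} \\
\delta_j \mid \delta_{j+1} &\sim \mathcal{N}\left(\delta_{j+1},\ \sigma_\delta^2\right), \quad \delta_J = 0 && \text{National} \\
\beta_{ij} \mid \beta_{i,j+1} &\sim \mathcal{N}\left(\beta_{i,j+1},\ \sigma_\beta^2\right) && \text{State} \\
\beta_{iJ} &\sim \mathcal{N}\left(\operatorname{logit}(h_i),\ s_i^2\right) && \text{informative prior from history} \\
\hat{\pi}_{iJ} &= \operatorname{logit}^{-1}(\beta_{iJ}) && \text{Forecasts}
\end{align*}
```

The random walks run backward from election day, so on days with few polls the estimate leans on the historical prior, and it leans on the polls as they accumulate.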
STRUCTURED PREDICTION: SUPERVISED LEARNING ON SEQUENCES
GRAPHICAL MODELS
> Assess Risk (uncertainty) as Probability of Failure
> Unobservable (hidden) Failure States
> Proactive/Early Prediction
> Interpretable Latent Properties
> Online Algorithm (iteratively improve)
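One simple graphical model with hidden failure states and an online update is a hidden Markov model filtered with the forward algorithm. The sketch below is illustrative only and is not taken from the talk: the two states, the sensor readings, and every probability are hypothetical, and `init` is treated as the belief one step before the first reading.

```python
def forward_filter(observations, init, trans, emit):
    """Return P(hidden state | readings so far) after each reading.

    init[i]     : belief over states before the first reading
    trans[i][j] : P(next state j | current state i)
    emit[j][o]  : P(reading o | state j)
    """
    n = len(init)
    belief = list(init)
    history = []
    for obs in observations:
        # Predict: push the current belief through the transition model.
        predicted = [sum(belief[i] * trans[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the new reading, then normalize.
        unnorm = [predicted[j] * emit[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical machine: state 0 = healthy, state 1 = degrading (absorbing).
init = [0.95, 0.05]
trans = [[0.9, 0.1],
         [0.0, 1.0]]
emit = [[0.8, 0.2],   # healthy: mostly "normal" readings (0)
        [0.3, 0.7]]   # degrading: mostly "warning" readings (1)
readings = [0, 0, 1, 1, 1]

for t, belief in enumerate(forward_filter(readings, init, trans, emit)):
    print(f"t={t} reading={readings[t]} P(degrading)={belief[1]:.3f}")
```

The belief in the hidden "degrading" state is a probability rather than a hard label, and it rises as warning readings accumulate, which is what makes early, risk-based intervention possible.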
KEY IDEAS
> Uncertainty
> Point vs. Distribution (or confidence intervals)
> Bayesian vs. Frequentist methods
> Temporal variability
All models are wrong, but some models are useful... or something
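To make "Point vs. Distribution" concrete, the sketch below compares two forecasts that share the same point estimate but differ in uncertainty. It assumes the forecast distribution is Normal, and the helper names and numbers are hypothetical.

```python
import math

def prob_above(threshold, mean, sd):
    """P(X > threshold) for X ~ Normal(mean, sd), via the error function."""
    z = (threshold - mean) / (sd * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))

def interval_95(mean, sd):
    """Central 95% interval of a Normal(mean, sd) forecast."""
    return mean - 1.96 * sd, mean + 1.96 * sd

# Same point estimate (51% vote share), very different uncertainty.
for label, mean, sd in [("tight", 0.51, 0.005), ("wide", 0.51, 0.03)]:
    lo, hi = interval_95(mean, sd)
    win = prob_above(0.5, mean, sd)
    print(f"{label}: point={mean:.2f} interval=({lo:.3f}, {hi:.3f}) P(win)={win:.2f}")
```

The point estimates are identical, but only the distribution shows that one forecast is close to a sure thing and the other is not far from a coin flip.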
IOT IMPACT: DETECTING MACHINE FAILURES
> Structural Predictions susceptible to Uncertainty (Supervised Learning)
> Sparse Data (costly to measure)
> Sampling Error Biases Inspections (prediction in the absence of data)
> Online Updates as more data become available
> Not All Sensors are Created Equal (weights/averages)
> (Probabilistic) Forecasting in addition to Estimation
INDUSTRIAL MACHINES [3]
[3] HTTP://WWW.CITEMASTER.NET/GET/8BD1ACC0-F04B-11E3-BBAF-00163E009CC7/SALFNER05PREDICTING.PDF
MORE INTERPRETABLE: WE HAVE TO ACTUALLY FIX THE MACHINES AFTER ALL...
REFERENCES
> The Signal and the Noise
> Data Journalism Handbook
> Dynamic Bayesian Forecasting of Presidential Elections in the States (Drew A. Linzer)
> Time for Change model (Alan Abramowitz)
> Bayesian Data Analysis (Andrew Gelman et al.)
> Causality (Judea Pearl)
> 538: How we are Forecasting the 2016 Primaries
> Predicting Time-to-Failure of Industrial Machines with Temporal Data Mining