abstractions conference 2016 - machine learning in healthcare – ml for the rest of us

Post on 16-Apr-2017

307 Views

Category:

Data & Analytics

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

|

Machine Learning in Healthcare – ML for the Rest of Us

Mohinder DickSenior Software ArchitectUPMC EnterprisesAugust 18, 2016

Email: dickm@upmc.eduTwitter: @mobyware

|

GoalBreakdown the perception that machine learning is too complex.

Address assumptions of what machine learning IS, and what it’s NOT.

Impart the basics of machine learning.

|©2015 UPMC Enterpr ises : PROPRIETARY3

Machine Learning can be perceived as Science Fiction

|©2015 UPMC Enterpr ises : PROPRIETARY4

Machine Learning can be perceived as Complex and Vastin terms of research, terminology,

and technology

|©2015 UPMC Enterpr ises : PROPRIETARY5

|©2015 UPMC Enterpr ises : PROPRIETARY6

|©2015 UPMC Enterpr ises : PROPRIETARY7

What we’ll accomplish today

• Machine Learning Basics in 8 Slides• How to get started• Let’s teach machines (Demo)• Machine Learning in Healthcare• Q & A• References

|©2015 UPMC Enterpr ises : PROPRIETARY8

Machine Learning in 8 Slides | Overview

Machine Learning

Supervised Unsupervised

Linear Non Linear Self Organizing Maps

K MeansClustering

Linear Classification

Linear Regression

NN Decision Trees

Random Forces

|©2015 UPMC Enterpr ises : PROPRIETARY9

Machine Learning in 8 Slides | Key Terms

Term Description

Machine Learning Algorithms that automatically improve with data

Model Statistical representation of the algorithm’s experience

Supervised Learning Algorithms that improves using labeled examples

Linear Algorithm Predictions are proportional to the feature/input values

Feature Data point that affect the target you are trying to predict.

Target Variable Data point that you are trying to predict

Classification Predicting a categorical target variable

Regression Predicting a continuous target variable

|©2015 UPMC Enterpr ises : PROPRIETARY10

Machine Learning in 8 Slides | Supervised Models

Unlabeled Data ML Model Predictions

• ML algorithms process “unseen” data to give predictions• A model is a statistical representation of the algorithms experiences/training

How do you get a model?

|©2015 UPMC Enterpr ises : PROPRIETARY11

Machine Learning in 8 Slides | Supervised Models

Labeled Data Algorithm Trained Model

• Supervised ML algorithms get their name because they learn with help• You have to provide them experience in labeled examples• The algorithm translates that data into a representation called a model• It can be used later to make predictions• The more data the better

|©2015 UPMC Enterpr ises : PROPRIETARY12

Machine Learning in 8 Slides | Linear Models

Linear RegressionLinear Classification

Labeled Data Algorithm Trained Model

ERROR

|©2015 UPMC Enterpr ises : PROPRIETARY13

Machine Learning in 8 Slides | Labels & Features

Labeled Data Algorithm Trained Model

Features

Label

+ Good/Bad + Real Lame+ Great Score

• During training features with labels are given to the algorithm to generate the model• Both features and labels can be categorical or continuous• The type of your target affects the flavor of algorithm you chose

|©2015 UPMC Enterpr ises : PROPRIETARY14

Machine Learning in 8 Slides | Labels & Features

Facial Recognition

Geometric features called “Haars”

ML Model

Is this a face?

Non-linear model

Predictions

|©2015 UPMC Enterpr ises : PROPRIETARY15

Machine Learning in 8 Slides | Recap

Term Description

Machine Learning Algorithms that automatically improve with data

Model Statistical representation of the algorithm’s experience

Supervised Learning Algorithms that improves using labeled examples

Linear Algorithm Predictions are proportional to the feature/input values

Feature Data point that affect the target you are trying to predict.

Target Variable Data point that you are trying to predict

Classification Predicting a categorical target variable

Regression Predicting a continuous target variable

|©2015 UPMC Enterpr ises : PROPRIETARY16

How to get started | Training, Data, Tools & Approaches

TrainingBooks

MOOCsInstructor Led

DataGovernment Agencies

Vendors

Tools Data Pipeline

SparkPython

|©2015 UPMC Enterpr ises : PROPRIETARY17

How to get started | Tool Details

Apache SparkCluster aware execution

MLLib for machine learning

PythonLoosely typed languageLibraries – scikit, pandas

and numpy

R Statistical language,

IDE ecosystem

|©2015 UPMC Enterpr ises : PROPRIETARY18

How to get started | Tool Details

|©2015 UPMC Enterpr ises : PROPRIETARY19

How to get started | Data Driven Approach

Problem •Hospital bed capacity

Data •Admitting Hospital•Patient diagnosis and demographics

Analysis•Data representation•Data analysis•Evaluation

Policy •Increase capacity at hospitals with average patient age under 40

|©2015 UPMC Enterpr ises : PROPRIETARY20

How to get started | Data Driven Approach

|©2015 UPMC Enterpr ises : PROPRIETARY21

Machine Learning in Healthcare | Data Driven Approach

Project AUses ML and patient Records to predict negative patient outcomes

Project B• Uses user-feedback to innovatively improve the ML Model

Project C• Grass roots ML project that yielded insight into factors that affect the length of a patients stay.

|©2015 UPMC Enterpr ises : PROPRIETARY22

Machine Learning in Healthcare | Project A

• Hospitals already have digital records, because of the ACA• Can use public algorithm and software to make predictions• Predictions save lives and/or money

ModelData in EMR: MedsLabsVisit HistoryNotes

+ SEPSIS+ Adverse Drug Reaction+ % Readmission

Predictions

|©2015 UPMC Enterpr ises : PROPRIETARY23

Machine Learning in Healthcare | Project B

• System uses proprietary model to make recommendations to clinicians• Hospital decided to improve system by developing a way to incorporate feedback• Hospital improves the model overtime without having to buy a new system

Health Records Model Predictions

MLRecommendations

User

Feedback

|©2015 UPMC Enterpr ises : PROPRIETARY24

Machine Learning in Healthcare | Project C

• Interested individuals built a model using visit data• Predictions were just OKAY• BUT inspecting the model allowed the team to learn new factors that were unknown about

the different factors affected the clinical outcomes

Visitor Info Model+ Length of Stay (LOS)

Predictions

Insight about LOS

|©2015 UPMC Enterpr ises : PROPRIETARY25

Let’s teach machines

https://github.com/MobyWare/ml-abstractions-intro-python

|©2015 UPMC Enterpr ises : PROPRIETARY26

References

Resource LocationPresentation link [search slideshare]

Demo code (direct) http://mybinder.org/repo/mobyware/ml-abstractions-intro-python

Demo code https://github.com/MobyWare/ml-abstractions-intro-python

EDX Course list (see data) https://www.edx.org/course

Coursera Course list https://www.coursera.org/courses/?domains=data-science

Spark Download Page http://spark.apache.org/downloads.html

Anaconda Python Install Page http://docs.continuum.io/anaconda/install

R install page https://cran.r-project.org/

R Studio install page https://www.rstudio.com/products/rstudio/download/

Ebay Tech Blog on Spark http://www.ebaytechblog.com/2014/05/28/using-spark-to-ignite-data-analytics/

|

Questions

top related