Modeling with Rules - ima.umn.edu

Cynthia Rudin, Assistant Professor of Statistics, Massachusetts Institute of Technology
Joint work with: David Madigan (Columbia); Allison Chang, Ben Letham (MIT PhD students); Dimitris Bertsimas (MIT); Tyler McCormick (UW); Gene Kogan (Independent)

Upload: duongminh

Post on 14-May-2019



Pages 2-3

Would like predictive models that are both accurate and interpretable.

Accuracy = classification accuracy
Interpretability =
•  concise: model is small
•  convincing: there are reasons behind each prediction

Page 4

Decision List: Traffic jam in Boston?

Rule                        Prediction   Correct
fenway park=1                1           97/100 times
rush_hour=0                 -1           474/523 times
rain=0, construction=0      -1           329/482 times
Friday=1                    -1           3/3 times
rain=1                       1           452/892 times
otherwise                   -1           10/15 times
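A decision list like the one above is read top to bottom, and the first rule that applies makes the prediction. The following is a minimal sketch of that reading, assuming each day is a dictionary of 0/1 features; the feature keys and the helper name `predict` are illustrative and not from the slides.

```python
from typing import Dict, List, Tuple

Rule = Tuple[Dict[str, int], int]  # (conditions, predicted label)

# Rules transcribed from the slide, in order; the empty condition is "otherwise".
TRAFFIC_LIST: List[Rule] = [
    ({"fenway_park": 1}, 1),
    ({"rush_hour": 0}, -1),
    ({"rain": 0, "construction": 0}, -1),
    ({"friday": 1}, -1),
    ({"rain": 1}, 1),
    ({}, -1),
]


def predict(rules: List[Rule], x: Dict[str, int]) -> int:
    """Return the label of the first rule whose conditions all hold for x."""
    for conditions, label in rules:
        if all(x.get(k) == v for k, v in conditions.items()):
            return label
    raise ValueError("decision list has no default rule")


day = {"fenway_park": 0, "rush_hour": 1, "rain": 1, "construction": 1, "friday": 0}
print(predict(TRAFFIC_LIST, day))  # 1: rain=1 is the first rule that applies
```

Because evaluation stops at the first match, the reason behind each prediction is a single rule.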

Page 5

Dichotomy in the State of the Art

Accuracy: Support Vector Machines, Boosted Decision Trees
vs.
Interpretability: Decision Trees

Page 6

Daydreaming

•  Nice if the whole algorithm were interpretable
OR
•  Want the accuracy of SVM/Boosted DT and the interpretability of Decision Trees.

Page 7

Outline

•  Part 1: Humans can interpret the predictions, and understand the full algorithm
   Sequential Event Prediction with Association Rules (R, Letham, Aouissi, Kogan, Madigan), COLT 2011

•  Part 2: Bayesian hierarchical modeling with rules
   A Hierarchical Model for Association Rule Mining of Sequential Events: An Approach to Automated Medical Symptom Prediction (McCormick, R, Madigan), Annals of Applied Statistics, forthcoming 2012

•  Part 3: Accurate rule classifiers using MIO
   Ordered Rules for Classification: A Discrete Optimization Approach to Associative Classification (Bertsimas, Chang, R), in progress

Page 8

Association Rule Mining: (Agrawal, Imielinski, and Swami, 1993) & (Agrawal and Srikant, 1994)

Page 9

Construction=1   Rain=1   Traffic=1

Page 10

15 times we saw construction and rain, and 13 out of 15 of those times we also saw traffic.

Supp(construction=1 & rain=1) = 15
Supp(traffic=1 & construction=1 & rain=1) = 13

Conf(construction=1 & rain=1 → traffic=1) = 13/15
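The counts above can be reproduced with a short sketch, assuming each observation is a set of item strings. The toy data and the function names `supp` and `conf` are illustrative; only the totals 15 and 13 come from the slide.

```python
from typing import FrozenSet, Iterable, List


def supp(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def conf(lhs: Iterable[str], rhs: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Conf(lhs -> rhs) = Supp(lhs & rhs) / Supp(lhs); assumes Supp(lhs) > 0."""
    lhs, rhs = set(lhs), set(rhs)
    return supp(lhs | rhs, transactions) / supp(lhs, transactions)


# Toy data built to match the slide: 15 days with construction and rain,
# 13 of which also had traffic, plus 5 unrelated rainy days.
both = frozenset({"construction=1", "rain=1"})
transactions = [both | {"traffic=1"}] * 13 + [both] * 2 + [frozenset({"rain=1"})] * 5

print(supp(both, transactions))                 # 15
print(conf(both, {"traffic=1"}, transactions))  # 13/15
```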

Page 11

"Max Confidence, Min Support" Algorithm

Step 1. Find all rules $a \to b$, where $\mathrm{Supp}(a) \ge \theta$.

Step 2. Rank rules in descending order of $\mathrm{Conf}(a \to b)$; recommend the right hand side of the first rule that applies.

Supp(a)    Conf(a → b)
15         13/15 = .867
25         20/25 = .8
17         12/17 = .706
50         34/50 = .68

Decision list fragment from the slide (rule, prediction):
   ...            1
   rush hour=0   -1
   Friday=1      -1
   otherwise     -1
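The two steps can be sketched as a filter followed by a sort. The rule labels "A" to "D" are hypothetical stand-ins, because the rule text for the four table rows did not survive in the transcript; the counts are the ones in the table, and the thresholds are example values.

```python
from typing import List, NamedTuple


class RuleStats(NamedTuple):
    name: str     # hypothetical label for a table row
    supp_a: int   # Supp(a)
    supp_ab: int  # Supp(a & b)


def max_conf_min_supp(rules: List[RuleStats], theta: int) -> List[RuleStats]:
    """Step 1: keep rules with Supp(a) >= theta. Step 2: sort by confidence, descending."""
    kept = [r for r in rules if r.supp_a >= theta]
    return sorted(kept, key=lambda r: r.supp_ab / r.supp_a, reverse=True)


RULES = [
    RuleStats("A", 15, 13),
    RuleStats("B", 25, 20),
    RuleStats("C", 17, 12),
    RuleStats("D", 50, 34),
]

print([r.name for r in max_conf_min_supp(RULES, theta=10)])  # ['A', 'B', 'C', 'D']
print([r.name for r in max_conf_min_supp(RULES, theta=20)])  # ['B', 'D']
```

Raising the threshold from 10 to 20 discards the two lowest-support rules outright, including the one with the highest confidence.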

Page 12

15 times we saw construction and rain, and 13 out of 15 of those times we also saw traffic.

Conf=.99, Supp=10000 vs. Conf=1, Supp=10

Supp(construction=1 & rain=1) = 15
Supp(traffic=1 & construction=1 & rain=1) = 13

Conf(construction=1 & rain=1 → traffic=1) = 13/15

Page 13

Bayesian version of the confidence

$\mathrm{AdjustedConf}(a \to b) := \dfrac{\mathrm{Supp}(a \,\&\, b)}{\mathrm{Supp}(a) + K}$

Page 14

"Adjusted Confidence" Algorithm

Step 1. Find all rules $a \to b$.

Step 2. Rank rules in descending order of $\mathrm{AdjustedConf}(a \to b)$; recommend the right hand side of the first rule that applies.

Supp(a)    AdjustedConf(a → b), K = 5
25         20/(25+5) = .67
15         13/(15+5) = .65
50         34/(50+5) = .62
17         12/(17+5) = .55

Decision list fragment from the slide (rule, prediction):
   rush hour=0   -1
   Friday=1       0
   otherwise     -1
   ...            1
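As a rough sketch, the ranking differs from plain confidence only in the denominator. The labels "A" to "D" below are hypothetical names for the four table rows; the counts and K = 5 are from the slides.

```python
from typing import Dict, List, Tuple

# (Supp(a), Supp(a & b)) for each table row; the row labels are hypothetical.
COUNTS: Dict[str, Tuple[int, int]] = {
    "A": (15, 13),
    "B": (25, 20),
    "C": (17, 12),
    "D": (50, 34),
}


def adjusted_conf(supp_a: int, supp_ab: int, k: float) -> float:
    """AdjustedConf(a -> b) = Supp(a & b) / (Supp(a) + K)."""
    return supp_ab / (supp_a + k)


def rank(counts: Dict[str, Tuple[int, int]], k: float) -> List[str]:
    """Rule labels in descending order of adjusted confidence."""
    return sorted(counts, key=lambda name: adjusted_conf(*counts[name], k), reverse=True)


print(rank(COUNTS, k=0))  # ['A', 'B', 'C', 'D'], plain confidence
print(rank(COUNTS, k=5))  # ['B', 'A', 'D', 'C'], the order in the table above
```

With K = 5 the rule with support 25 overtakes the one with support 15, and no rule has to be discarded by a support threshold.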

Page 15

•  Rare rules can be used

•  Among rules with similar confidence, prefers rules with higher support

•  K encourages larger support, helps with prediction

Conf=.99, Support=10000 vs. Conf=1, Support=10

$\mathrm{AdjustedConf}(a \to b) := \dfrac{\mathrm{Supp}(a \,\&\, b)}{\mathrm{Supp}(a) + K}$
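A short worked calculation shows how small K can be and still reverse the comparison above. It assumes "Support" in the example means $\mathrm{Supp}(a)$, so the two rules have $\mathrm{Supp}(a \,\&\, b) = 9900$ and $10$ respectively.

```latex
\begin{align*}
\frac{9900}{10000 + K} > \frac{10}{10 + K}
  &\iff 9900\,(10 + K) > 10\,(10000 + K) \\
  &\iff 99000 + 9900K > 100000 + 10K \\
  &\iff 9890K > 1000 \\
  &\iff K > \tfrac{100}{989} \approx 0.101
\end{align*}
% At K = 5: 9900/10005 \approx 0.9895 versus 10/15 \approx 0.667.
```

Under that assumption, any K above roughly 0.1 ranks the high-support rule first, whereas plain confidence (K = 0) prefers the rule seen only 10 times.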

Page 16

– Humans can understand the prediction, and the algorithm

– Good for sequential event problems, where a set of events happen in a particular order
   •  e.g., for predicting what a customer will put next into an online shopping cart, or for predicting medical symptoms in a sequence

– Having larger K helps with generalization
   •  algorithmic stability (pointwise hypothesis stability)
   •  other learning theoretic implications

– Performs better empirically than the Max-Conf Min-Support classifiers in our experiments

A Learning Theory Framework for Association Rules and Sequential Events (R, Letham, Kogan, Madigan), SSRN 2011

Sequential Event Prediction with Association Rules (R, Letham, Aouissi, Kogan, Madigan), COLT 2011

Page 17

Outline (same as page 7)

Pages 18-19

Recommender Systems for Medical Conditions

Input medical condition:

Prediction based on your medical history:

dyspepsia & epigastric pain    heartburn depression        high blood pressure

Gastroesophageal reflux        high blood pressure

Page 20

Medical Condition Prediction

[Figure: a patient's conditions along a timeline t, with top-3 recommendations at successive points.]

Conditions shown: heartburn, headache, dyspepsia; fungal infection, heartburn; epigastric pain, hypertension, dyspepsia

Recommendations: 1. rhinitis 2. dyspepsia 3. low back pain
Recommendations: 1. dyspepsia 2. high blood pressure 3. low back pain
Recommendations: 1. epigastric pain 2. heartburn 3. high blood pressure

Pages 21-25

Hierarchical Association Rule Model (HARM)

$i$: patient index; $r$: rule index of $\mathrm{lhs}_r \to \mathrm{rhs}_r$

$y_{ir} := \mathrm{Supp}_i(\mathrm{rhs}_r \,\&\, \mathrm{lhs}_r)$
$n_{ir} := \mathrm{Supp}_i(\mathrm{lhs}_r)$

We'll model
$y_{ir} \sim \mathrm{Binomial}(n_{ir}, p_{ir})$
$p_{ir} \sim \mathrm{Beta}(\pi_{ir}, \tau_i)$

(slide annotation: shared across individuals)

Under this model, $E(p_{ir} \mid y_{ir}, n_{ir}) = \dfrac{y_{ir} + \pi_{ir}}{n_{ir} + \pi_{ir} + \tau_i}$.
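The posterior mean above is the standard Beta-Binomial update, and a small sketch makes the shrinkage visible. The function name and the numeric values of $\pi_{ir}$ and $\tau_i$ below are illustrative assumptions, not values from the talk.

```python
def posterior_mean(y: int, n: int, pi: float, tau: float) -> float:
    """E[p | y, n] when y ~ Binomial(n, p) and p ~ Beta(pi, tau)."""
    return (y + pi) / (n + pi + tau)


# A rule seen 3 times and correct 3 times has raw confidence 1.0,
# but with an assumed prior Beta(1, 4) its estimate is pulled toward the prior.
print(posterior_mean(y=3, n=3, pi=1.0, tau=4.0))  # 0.5

# With no data for this patient, the estimate is the prior mean pi / (pi + tau).
print(posterior_mean(y=0, n=0, pi=1.0, tau=4.0))  # 0.2
```

With $\pi_{ir} = 0$ and $\tau_i = K$ the expression reduces to the adjusted confidence from Part 1, which is one way to read "Bayesian version of the confidence".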

Pages 26-29

Hierarchical Association Rule Model (HARM)

$\pi_{ir} = \exp(M_i' \beta_r + \gamma_i)$

$M \in \mathbb{R}^{I \times D}$ (observable characteristics)

Example: $\pi_{ir} = \exp(\beta_{r,0} + \beta_{r,1} 1_{\mathrm{male}} + \gamma_i) = \exp(\beta_{r,1} 1_{\mathrm{male}}) \exp(\beta_{r,0} + \gamma_i)$

Pages 30-31

Hierarchical Association Rule Model (HARM)

$i$: patient index; $r$: rule index of $\mathrm{lhs}_r \to \mathrm{rhs}_r$

$y_{ir} := \mathrm{Supp}_i(\mathrm{rhs}_r \,\&\, \mathrm{lhs}_r)$
$n_{ir} := \mathrm{Supp}_i(\mathrm{lhs}_r)$

$y_{ir} \sim \mathrm{Binomial}(n_{ir}, p_{ir})$
$p_{ir} \sim \mathrm{Beta}(\pi_{ir}, \tau_i)$
$\pi_{ir} = \exp(M_i' \beta_r + \gamma_i)$

$\log(\tau_i) \sim \mathrm{Normal}(0, \sigma_\tau^2)$
$\log(\beta_{rd}) \sim \mathrm{Normal}(\mu_\beta, \sigma_\beta^2)$
$\log(\gamma_i) \sim \mathrm{Normal}(\mu_\gamma, \sigma_\gamma^2)$

diffuse uniform priors on $\mu_\beta, \sigma_\beta^2, \sigma_\tau^2$

HARM estimates posterior distribution (MCMC), then ranks rules by posterior mean.
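To make the layers of the model concrete, here is a forward simulation of the generative process as written on the slide. It is only a sketch of the data-generating direction, not the MCMC fit; the hyperparameter values, the function name `simulate_harm`, and the toy inputs are all assumptions chosen for illustration.

```python
import math
import random
from typing import List


def simulate_harm(M: List[List[float]], n: List[List[int]], n_rules: int,
                  mu_beta: float = -1.0, sd_beta: float = 0.5,
                  mu_gamma: float = -1.0, sd_gamma: float = 0.5,
                  sd_tau: float = 0.5, seed: int = 0) -> List[List[int]]:
    """Draw y[i][r] from the hierarchy, given covariates M and counts n."""
    rng = random.Random(seed)
    n_dims = len(M[0])
    # log(beta_rd) ~ Normal(mu_beta, sd_beta^2), one vector per rule
    beta = [[rng.lognormvariate(mu_beta, sd_beta) for _ in range(n_dims)]
            for _ in range(n_rules)]
    y = []
    for i, m_i in enumerate(M):
        tau_i = rng.lognormvariate(0.0, sd_tau)         # log(tau_i) ~ Normal(0, sd_tau^2)
        gamma_i = rng.lognormvariate(mu_gamma, sd_gamma)
        row = []
        for r in range(n_rules):
            pi_ir = math.exp(sum(m * b for m, b in zip(m_i, beta[r])) + gamma_i)
            p_ir = rng.betavariate(pi_ir, tau_i)        # p_ir ~ Beta(pi_ir, tau_i)
            # y_ir ~ Binomial(n_ir, p_ir), drawn as a sum of Bernoulli trials
            row.append(sum(rng.random() < p_ir for _ in range(n[i][r])))
        y.append(row)
    return y


# Two patients with covariates (intercept, male indicator) and three rules.
M = [[1.0, 0.0], [1.0, 1.0]]
n = [[10, 0, 4], [7, 3, 20]]
print(simulate_harm(M, n, n_rules=3))
```

Fitting runs in the other direction: given observed $y_{ir}$ and $n_{ir}$, sample the posterior over these quantities and rank rules by the posterior mean of $p_{ir}$.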

Page 32

Hierarchical Association Rule Model (HARM)

•  43,000 patient encounters
•  ~2,300 patients, age (> 40)
•  pre-existing conditions dealt with separately
•  used 25 most common conditions, and 25 least common conditions

Pages 33-34

[Figure: each patient's encounters along a timeline t, split into training and test.]

For trials = 1:500
•  Form training and test sets:
   –  sample ~200 patients
   –  for each patient, randomly split encounters into training and test
•  For each patient, iteratively make predictions on test encounters
   –  get 1 point whenever our top 3 recommendations contain patient's next condition
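The scoring loop for one patient can be sketched as follows. It assumes one condition per encounter and that each true condition is added to the history before the next prediction; `most_frequent` is a placeholder recommender used only to make the example run, not one of the methods compared in the talk.

```python
from collections import Counter
from typing import Callable, List, Sequence


def top3_score(train: Sequence[str], test: Sequence[str],
               recommend: Callable[[List[str]], List[str]]) -> float:
    """Fraction of test conditions that appear in the top 3 recommendations."""
    history = list(train)
    points = 0
    for next_condition in test:
        if next_condition in recommend(history)[:3]:
            points += 1
        history.append(next_condition)  # reveal the truth before the next prediction
    return points / len(test) if test else 0.0


def most_frequent(history: List[str]) -> List[str]:
    """Placeholder recommender: conditions ordered by how often they occurred."""
    return [condition for condition, _ in Counter(history).most_common()]


train = ["heartburn", "headache", "dyspepsia", "heartburn"]
test = ["dyspepsia", "rhinitis", "heartburn"]
print(top3_score(train, test, most_frequent))  # 2 of 3 test conditions hit
```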

Page 35

[Figure (a) All patients: proportion of correct predictions (y-axis, 0.0 to 0.6) for HARM, Conf., Adj. k=.25, Adj. k=.5, Adj. k=1, Adj. k=2, Thresh.=2, Thresh.=3.]

Page 36

Myocardial infarction in patients with hypertension, in treatment (T) and placebo (P) groups

[Figure: rescaled risk by age group (40-50, 51-60, 61-70, over 70) for T and P, in two panels: HARM and Confidence. Key: middle half, middle 90%, mean of posterior means.]

Page 37

Myocardial infarction in patients with high cholesterol, in treatment (T) and placebo (P) groups

[Figure: rescaled risk by age group (40-50, 51-60, 61-70, over 70) for T and P, in two panels: HARM and Confidence. Key: middle half, middle 90%, mean of posterior means.]

Page 38

Outline (same as page 7)

Pages 39-42

Mixed Integer Optimization

•  MIO/MIP is a style of mathematical programming
•  Not generally used for ML: perception from the 1970s that MIOs are intractable
•  Not all valid MIO formulations are equally strong
•  Can use LP relaxations for very large scale problems
•  Association rules historically plagued by "combinatorial explosion"...

Page 43

Ordered Rules for Classification

•  Minimize misclassification error, regularize by height of the highest null rule.

"null rules": higher one predicts the default class and ends the list.

Pages 44-46

MIO Learning Algorithm

[MIO formulation shown on the slide; its objective terms are annotated as:]

•  Maximize classification accuracy
•  Maximize rank of the highest null rule (regularization)
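The transcript does not include the MIO formulation itself, so the sketch below swaps in exhaustive search over rule orderings to illustrate the two objective terms on a toy problem. The additive objective `correct + c * rank`, the weight `c`, and the toy data are assumptions; a real instance would hand the same objective to a MIO solver rather than enumerate permutations.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Rule = Tuple[Dict[str, int], int]     # (left-hand side, predicted class); {} is a null rule
Example = Tuple[Dict[str, int], int]  # (features, true class)


def list_predict(order: Sequence[Rule], x: Dict[str, int]) -> int:
    """Predict with the first rule in the ordering whose left-hand side holds."""
    for lhs, label in order:
        if all(x.get(k) == v for k, v in lhs.items()):
            return label
    raise ValueError("ordering contains no null rule")


def best_ordering(rules: Sequence[Rule], data: Sequence[Example], c: float) -> List[Rule]:
    """Search all orderings; return the best list, cut at its highest null rule."""
    best: List[Rule] = []
    best_obj = float("-inf")
    for order in permutations(rules):
        correct = sum(list_predict(order, x) == y for x, y in data)
        top_null = min(i for i, (lhs, _) in enumerate(order) if not lhs)
        # Rank counts from the bottom, so a null rule nearer the top ranks higher.
        obj = correct + c * (len(order) - 1 - top_null)
        if obj > best_obj:
            best, best_obj = list(order[: top_null + 1]), obj
    return best


RULES: List[Rule] = [({"rain": 1}, 1), ({"rush_hour": 0}, -1), ({}, 1), ({}, -1)]
DATA: List[Example] = [
    ({"rain": 1, "rush_hour": 1}, 1),
    ({"rain": 1, "rush_hour": 0}, -1),
    ({"rain": 0, "rush_hour": 0}, -1),
    ({"rain": 0, "rush_hour": 1}, -1),
    ({"rain": 1, "rush_hour": 1}, 1),
]

print(best_ordering(RULES, DATA, c=0.1))  # rush_hour=0 -> -1, rain=1 -> 1, default -1
print(best_ordering(RULES, DATA, c=2.0))  # heavy regularization: default -1 only
```

A small `c` keeps the three-rule list that classifies every toy example correctly; a large `c` pushes a null rule to the top and trades accuracy for a shorter list.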

Page 47

Experiments

•  Five algorithms
   – Logistic Regression (LogReg)
   – Support Vector Machines / RBF kernel (SVM)
   – Classification and Regression Trees (CART)
   – Boosted Decision Trees (AdaBoost)
   – Ordered Rules for Classification (ORC)

•  Several publicly available datasets (UCI)
•  Accuracy averaged over 3 folds

Page 48

Classification Accuracy

[Table not captured in the transcript.]

Page 49

CART on Tic Tac Toe

[Figure: CART decision tree with yes/no splits on board squares (o, ~o, x, ~x) and leaf values including 1, 0, .26, .47, .92; further branches elided.]

CART accuracy = 0.9388715

Page 50

ORC on Tic Tac Toe

[Figure: ordered list of 9 rules. Rules 1-8 are boards with three x's in a row, each predicting "x wins"; rule 9 predicts "x does not win".]

ORC accuracy = 1

Page 51

MONKS Problems 1

•  6 integer valued features taking values 1,2,3,4
•  Examples are in class 1 if either a1=a2 or a5=1

Page 52

CART on MONKS Problems 1

•  Examples are in class 1 if either a1=a2 or a5=1

Page 53

ORC on MONKS Problems 1

•  Examples are in class 1 if either a1=a2 or a5=1

a1=3, a2=3 → 1 (33/33)
a1=2, a2=2 → 1 (30/30)
a5=1 → 1 (65/65)
a1=1, a2=1 → 1 (31/31)
∅ → −1 (152/288)
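One way to check that this list encodes the target concept is to compare the two on every combination of the relevant features. The sketch assumes a1 and a2 range over {1, 2, 3} and a5 over {1, 2, 3, 4}; the list has no rule for a1=a2=4, so a wider range for a1 and a2 would not be covered. The helper names are illustrative.

```python
from itertools import product
from typing import Dict, List, Tuple

Rule = Tuple[Dict[str, int], int]

# The ORC list from the slide, in order; {} is the default (null) rule.
ORC_LIST: List[Rule] = [
    ({"a1": 3, "a2": 3}, 1),
    ({"a1": 2, "a2": 2}, 1),
    ({"a5": 1}, 1),
    ({"a1": 1, "a2": 1}, 1),
    ({}, -1),
]


def predict(rules: List[Rule], x: Dict[str, int]) -> int:
    """Return the label of the first rule whose conditions all hold for x."""
    for conditions, label in rules:
        if all(x.get(k) == v for k, v in conditions.items()):
            return label
    raise ValueError("decision list has no default rule")


def target(x: Dict[str, int]) -> int:
    """Class 1 if either a1=a2 or a5=1, as stated on the slide."""
    return 1 if x["a1"] == x["a2"] or x["a5"] == 1 else -1


# Assumed feature ranges: a1, a2 in 1..3 and a5 in 1..4.
grid = [{"a1": a1, "a2": a2, "a5": a5}
        for a1, a2, a5 in product(range(1, 4), range(1, 4), range(1, 5))]
mismatches = [x for x in grid if predict(ORC_LIST, x) != target(x)]
print(len(mismatches))  # 0: the list reproduces the concept on this grid
```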

Page 54

•  The bottom line: You don't need to sacrifice accuracy to get interpretability.

Page 55

Outline (same as page 7)

current work coming up

Page 56

•  Association Rules / Associative Classification
•  Decision Trees
•  Decision Lists
•  Logical Analysis of Data (LAD)
•  Bayesian Analysis
•  ML algorithms that use rules as features

Page 57

Current Work

•  Machine Learning for the NYC Power Grid
   –  cover of IEEE Computer, spotlight issue for IEEE TPAMI in February, WIRED Science, Slashdot, US News & World Report...

•  Supervised Ranking, Equivalences between Ranking and Classification, Ranking with MIO

•  Reverse-Engineering Quality Rankings
   –  in Businessweek last week

•  ML algorithms that understand how they will be used for a subsequent task

•  Several other projects

Page 58

Thank you!