Model Automation in R Using MASS, randomForest, forecast, and caret

Upload: will-johnson

Post on 22-Jan-2018

Who is Will Johnson?

● Database Manager at Uline (Pleasant Prairie)
● MS Predictive Analytics (2015)
● Operating www.LearnByMarketing.com
  ○ R tutorials, thoughts on analysis.

LearnByMarketing.com

Agenda

1. What is Model Automation
2. Pros and Cons of Model Automation
3. Decision Trees and Random Forests {randomForest}
4. Stepwise Regression {MASS}
5. Auto.Arima for time series {forecast}
6. Hyperparameter Search {caret}

What is Model Automation?

Hypothesis Space

vs

Hyperparameter Space

Pros and Cons of Model Automation

PROS:

● You Don’t Have to Think!
● “Faster” Iterations.
● See what’s “Important”

CONS:

● You Don’t Have to Think!
● Jellybeans

Decision Trees

● Gini Index + Entropy
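The two impurity measures named on this slide can be computed by hand. The following is a minimal base R sketch, not the code of any tree package; the function names (gini_impurity, entropy_bits, split_impurity) and the toy data are made up for illustration.

```r
# Gini impurity of a node: 1 - sum(p_k^2) over class proportions p_k
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Shannon entropy of a node in bits: -sum(p_k * log2(p_k))
entropy_bits <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]  # drop empty classes so log2(0) never appears
  -sum(p * log2(p))
}

# Size-weighted impurity of the two child nodes of a candidate split
# (lower is better)
split_impurity <- function(labels, goes_left, impurity = gini_impurity) {
  n <- length(labels)
  left <- labels[goes_left]
  right <- labels[!goes_left]
  (length(left) / n) * impurity(left) + (length(right) / n) * impurity(right)
}

# Hypothetical toy data: one numeric predictor, two classes
labels <- c("yes", "yes", "yes", "no", "no", "no")
x <- c(1, 2, 3, 4, 5, 6)

gini_impurity(labels)           # impurity of the unsplit node
entropy_bits(labels)
split_impurity(labels, x <= 3)  # split that separates the classes
split_impurity(labels, x <= 2)  # split that leaves one child mixed
```

A tree grower would evaluate many candidate splits this way and keep the one with the lowest weighted impurity.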

randomForest

● Mean Decrease in Gini Index

library(randomForest)
rf <- randomForest(y ~ ., data = dat)
rf$importance   # Var Name + Importance
varImpPlot(rf)  # Visualization

Stepwise Regression

● AIC
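As a rough illustration of the criterion, AIC can be reproduced by hand as 2k minus twice the log-likelihood, where k counts the estimated parameters. The sketch below uses only base R and assumes the built-in mtcars data; the two models are arbitrary examples.

```r
# Assumption: mtcars stands in for the modeling data
mod_big   <- lm(hp ~ wt + cyl, data = mtcars)
mod_small <- lm(hp ~ wt, data = mtcars)

# AIC = 2k - 2 * logLik, with k = coefficients plus the residual variance
manual_aic <- function(mod) {
  ll <- logLik(mod)
  k <- attr(ll, "df")
  2 * k - 2 * as.numeric(ll)
}

manual_aic(mod_big)
AIC(mod_big)  # built-in value for comparison

# Stepwise search compares models through extractAIC(). For lm fits it
# drops terms that are constant for a fixed data set, so its values
# differ from AIC() by the same offset for every model.
AIC(mod_big) - extractAIC(mod_big)[2]
AIC(mod_small) - extractAIC(mod_small)[2]
```

Because the offset is identical across models fitted to the same rows, both versions rank candidate models the same way.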

Stepwise Regression

library(MASS)

# Assumption: mt is the built-in mtcars data (its columns match the formula below)
mt <- mtcars

# Get the independent variables (exclude the dependent variable hp)
indep_vars <- paste(setdiff(names(mt), "hp"), collapse = "+")
# Turn those variable names into a formula
upper_form <- formula(paste("~", indep_vars))
# ~mpg + cyl + disp + drat + wt + qsec + vs + am + gear + carb

# Step backward: start from the full model and remove one variable at a time
mod <- lm(hp ~ ., data = mt)
stepAIC(mod, direction = "backward", trace = TRUE)

# Create a model using only the intercept
mod_lower <- lm(hp ~ 1, data = mt)

# Step forward: add one variable at a time
stepAIC(mod_lower, direction = "forward",
        scope = list(upper = upper_form, lower = ~1))

# Step forward or backward at each step, starting from the intercept-only model
stepAIC(mod_lower, direction = "both",
        scope = list(upper = upper_form, lower = ~1))

Auto.Arima

● Time Series models.
● AutoRegressive…
● Moving Averages…
● With Differencing!
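The three ingredients above can be seen in a hand-specified fit before any automation. This is a minimal sketch using only base R's stats functions on simulated data; the coefficients, series length, and seed are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed so the simulation is repeatable

# Simulate a series with one autoregressive term (0.7), one moving-average
# term (0.3), and one order of differencing
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.7, ma = 0.3), n = 200)

# Differencing: replace each value with its change from the previous value
dy <- diff(y)

# Fit the same structure by hand; order = c(p, d, q) gives the number of
# AR terms, the number of differences, and the number of MA terms
fit <- arima(y, order = c(1, 1, 1))
coef(fit)

# Forecast 10 steps ahead
fc <- predict(fit, n.ahead = 10)
fc$pred
```

An automated search tries several (p, d, q) combinations like this one and keeps the best-scoring fit.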

library(forecast)
library(fpp)

# Load the elecequip series that ships with fpp
data("elecequip")
# Keep the first 180 observations for fitting
ee <- elecequip[1:180]
model <- auto.arima(ee, stationary = TRUE)
#         ar1      ma1      ma2     ma3  intercept
#      0.8428  -0.6571  -0.1753  0.6353    95.7265
# s.e. 0.0431   0.0537   0.0573  0.0561     3.2223

# Plot the 10-step forecast and overlay the held-out actuals in red
plot(forecast(model, h = 10))
lines(x = 181:191, y = elecequip[181:191], type = "l", col = "red")

Auto.Arima

train {caret}

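library(caret)

# 10-fold cross-validation, repeated 10 times
# (repeats only takes effect with method = "repeatedcv")
tctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
# Candidate values for rpart's complexity parameter cp
rpart_opts <- expand.grid(cp = seq(0.0, 0.01, by = 0.001))

# Assumption: dat is the modeling data frame and train_log selects the
# training rows (neither is defined on the slide)
rpart_model <- train(y ~ ., data = dat, method = "rpart", metric = "Kappa",
                     trControl = tctrl, tuneGrid = rpart_opts,
                     subset = train_log)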
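Conceptually, a hyperparameter search of this kind scores each candidate value with cross-validation and keeps the best one. The sketch below shows that loop in base R only; it is not caret's implementation. It swaps in a polynomial degree on the built-in mtcars data for the cp grid, and the fold count and seed are arbitrary.

```r
set.seed(1)  # arbitrary seed so the fold assignment is repeatable

dat <- mtcars  # assumption: stand-in data set
k_folds <- 5
# Randomly assign every row to one of the folds
fold_id <- sample(rep(seq_len(k_folds), length.out = nrow(dat)))

# Candidate hyperparameter values: polynomial degree of wt
degree_grid <- 1:4

# For each candidate, average the held-out RMSE over the folds
cv_rmse <- sapply(degree_grid, function(deg) {
  fold_rmse <- sapply(seq_len(k_folds), function(f) {
    train_rows <- dat[fold_id != f, ]
    test_rows <- dat[fold_id == f, ]
    fit <- lm(mpg ~ poly(wt, deg), data = train_rows)
    pred <- predict(fit, newdata = test_rows)
    sqrt(mean((test_rows$mpg - pred)^2))
  })
  mean(fold_rmse)
})

results <- data.frame(degree = degree_grid, rmse = cv_rmse)
results
best_degree <- results$degree[which.min(results$rmse)]
best_degree
```

The grid, the resampling scheme, and the metric are the same three choices that tuneGrid, trControl, and metric express in the train() call.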

train {caret}

Recap

library(randomForest)  varImpPlot()
library(MASS)          stepAIC()
library(forecast)      auto.arima()
library(caret)         train()

Questions?

LearnByMarketing.com