Model Automation in R
TRANSCRIPT
Who is Will Johnson?
● Database Manager at Uline (Pleasant Prairie)
● MS Predictive Analytics (2015)
● Operating www.LearnByMarketing.com
○ R tutorials, thoughts on analysis.
Learn By Marketing.com
Agenda
1. What is Model Automation
2. Pros and Cons of Model Automation
3. Decision Trees and Random Forests {randomForest}
4. Stepwise Regression {MASS}
5. Auto.Arima for time series {forecast}
6. Hyperparameter Search {caret}
Pros and Cons of Model Automation
PROS:
● You Don’t Have to Think!
● “Faster” Iterations.
● See what’s “Important”
CONS:
● You Don’t Have to Think!
● Jellybeans
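Assuming "Jellybeans" refers to the multiple-comparisons problem (test enough unrelated variables and a few will look "significant" by chance), here is a minimal base-R sketch on pure noise. The sample size, the number of predictors and the 0.05 cutoff are illustrative choices, not taken from the slides.

```r
# Every predictor here is random noise, so any "significant" result is spurious.
set.seed(42)
n <- 100            # illustrative sample size
n_predictors <- 20  # illustrative number of candidate variables

y <- rnorm(n)
X <- as.data.frame(matrix(rnorm(n * n_predictors), nrow = n))

# Fit one simple regression per predictor and keep the slope's p-value
p_values <- sapply(X, function(x) {
  fit <- lm(y ~ x)
  summary(fit)$coefficients["x", "Pr(>|t|)"]
})

# Count how many noise variables pass the usual 0.05 threshold
sum(p_values < 0.05)
```

An automated search over many variables runs this kind of experiment implicitly, which is why the selected variables still need a sanity check.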
randomForest
● Mean Decrease in Gini Index
library(randomForest)
rf <- randomForest(y ~ ., data = dat)
rf$importance   # Var Name + Importance
varImpPlot(rf)  # Visualization
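The slide's dat and y are not defined in the transcript, so here is a runnable sketch of the same calls on the built-in iris data. The data set, the seed and the name rf_iris are stand-ins for illustration only.

```r
library(randomForest)

set.seed(1)  # forests are random; fix the seed for repeatable importances
rf_iris <- randomForest(Species ~ ., data = iris)

# For a classification forest fit with default settings, the importance
# matrix has a single column: MeanDecreaseGini
imp <- rf_iris$importance
imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE), , drop = FALSE]

# Same ranking as a dot chart
varImpPlot(rf_iris)
```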
Stepwise Regression
library(MASS)
# Assumption: mt is the built-in mtcars data set (the column names below match it)
mt <- mtcars
# Full model: hp regressed on every other variable
mod <- lm(hp ~ ., data = mt)
# Step backward, removing one variable at a time
stepAIC(mod, direction = "backward", trace = TRUE)

# Get the independent variables (excluding the dependent variable hp)
indep_vars <- paste(setdiff(names(mt), "hp"), collapse = "+")
# Turn those variable names into a formula
upper_form <- formula(paste("~", indep_vars))
# ~mpg + cyl + disp + drat + wt + qsec + vs + am + gear + carb

# Create a model using only the intercept
mod_lower <- lm(hp ~ 1, data = mt)
# Step forward, adding one variable at a time
stepAIC(mod_lower, direction = "forward", scope = list(upper = upper_form, lower = ~1))
# Step forward or backward at each step, starting from the intercept-only model
stepAIC(mod_lower, direction = "both", scope = list(upper = upper_form, lower = ~1))
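stepAIC() returns the selected model, so it can be stored and inspected rather than only printed. A short sketch, assuming the built-in mtcars data; the names full_model and selected are illustrative.

```r
library(MASS)

full_model <- lm(hp ~ ., data = mtcars)

# trace = FALSE suppresses the step-by-step printout
selected <- stepAIC(full_model, direction = "backward", trace = FALSE)

formula(selected)  # the variables that survived the search
selected$anova     # the sequence of steps taken and the AIC at each one
```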
Auto.Arima
● Time Series models.
● AutoRegressive…
● Moving Averages…
● With Differencing!
library(forecast)
library(fpp)
# Load the elecequip series and train on the first 180 observations
data("elecequip")
ee <- elecequip[1:180]
# Restrict the search to stationary models
model <- auto.arima(ee, stationary = TRUE)
#          ar1      ma1      ma2     ma3  intercept
#       0.8428  -0.6571  -0.1753  0.6353    95.7265
# s.e.  0.0431   0.0537   0.0573  0.0561     3.2223
# Plot a 10-step forecast and overlay the held-out actuals in red
plot(forecast(model, h = 10))
lines(x = 181:191, y = elecequip[181:191], type = "l", col = "red")
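To make the three ingredients concrete (autoregression, moving average, differencing), here is a base-R sketch that simulates a series with known orders and fits it by hand with stats::arima(). The coefficients, the series length and the seed are arbitrary illustrative values. auto.arima() automates the choice of the order that is fixed manually here.

```r
set.seed(123)

# Simulate an ARIMA(1,1,1): one AR term (0.7), one difference, one MA term (-0.4)
sim <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.7, ma = -0.4), n = 300)

# Fit the same structure by specifying order = c(p, d, q) explicitly
fit <- arima(sim, order = c(1, 1, 1))

# Estimated AR and MA coefficients; they should land near the simulated values
coef(fit)
```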
train {caret}
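library(caret)
# 10-fold cross-validation, repeated 10 times ("repeats" only applies to "repeatedcv")
tctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
# Candidate complexity-parameter (cp) values to search over
rpart_opts <- expand.grid(cp = seq(0.0, 0.01, by = 0.001))
# Assumption: dat is the modeling data frame and train_log selects the training rows
rpart_model <- train(y ~ ., data = dat, method = "rpart", metric = "Kappa", trControl = tctrl,
                     tuneGrid = rpart_opts, subset = train_log)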
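A runnable sketch of the same search on the built-in iris data, showing where the results end up. The data set, the seed and the smaller resampling setup (5 folds, 2 repeats) are stand-ins to keep the run short. method = "repeatedcv" is used here because the repeats argument has no effect with plain "cv".

```r
library(caret)

set.seed(7)
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)
grid <- expand.grid(cp = seq(0, 0.01, by = 0.001))

fit <- train(Species ~ ., data = iris, method = "rpart",
             metric = "Kappa", trControl = ctrl, tuneGrid = grid)

fit$bestTune                                 # the cp value with the best Kappa
fit$results[, c("cp", "Accuracy", "Kappa")]  # resampled performance per cp
```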
Recap
library(randomForest)  varImpPlot()
library(MASS)          stepAIC()
library(forecast)      auto.arima()
library(caret)         train()