Posted on 13-Aug-2020
TRANSCRIPT
INTRODUCTION TO MACHINE LEARNING
Regression: Simple and Linear
Introduction to Machine Learning
Regression Principle
Predictors → Regression → Response
Example
Shop Data: sales, competition, district size, ...
Data Analyst: relationship?
● Predictors: competition, advertisement, …
● Response: sales
Shopkeeper: predictions!
Simple Linear Regression
● Simple: one predictor to model the response
● Linear: approximately linear relationship
Linearity plausible? Scatterplot!
Example
● Relationship: advertisement → sales
● Expectation: positively correlated
Example
● Observation: upward linear trend
● First step: simple linear regression
[Scatterplot: sales (0-500) vs. advertisement (5-15)]
Model
Fitting a line: y = β₀ + β₁x + ε
● Predictor: x ● Intercept: β₀
● Response: y ● Slope: β₁
● Statistical Error: ε
[Scatterplot: sales vs. advertisement with a fitted regression line]
Estimating Coefficients
● True Response: yᵢ
● Fitted Response: ŷᵢ = β̂₀ + β̂₁xᵢ
● Residuals: rᵢ = yᵢ − ŷᵢ
● Minimize! RSS = Σᵢ rᵢ², for i = 1, …, n (n = #observations)
[Scatterplot: sales vs. advertisement, residuals drawn as vertical distances to the fitted line]
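For simple linear regression, the coefficients that minimize the residual sum of squares have a closed form. A minimal R sketch on a simulated dataset (the columns and true coefficients here are made up for illustration, not the course's actual shop data) checks the formulas against lm():

```r
# Simulated data with an approximately linear ads -> sales relationship
set.seed(1)
ads   <- runif(30, min = 5, max = 15)
sales <- 50 + 25 * ads + rnorm(30, sd = 40)

# Closed-form least-squares estimates minimizing the residual sum of squares
b1 <- sum((ads - mean(ads)) * (sales - mean(sales))) / sum((ads - mean(ads))^2)
b0 <- mean(sales) - b1 * mean(ads)

# lm() computes the same coefficients
fit <- lm(sales ~ ads)
```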
Estimating Coefficients

> my_lm <- lm(sales ~ ads, data = shop_data)

● Response: sales ● Predictor: ads

> my_lm$coefficients
Returns coefficients
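A runnable sketch of this call, on a simulated shop_data (the data frame and its values are assumptions, not the course's dataset):

```r
# Hypothetical shop_data: ads as predictor, sales as response
set.seed(42)
shop_data <- data.frame(ads = runif(30, 5, 15))
shop_data$sales <- 50 + 25 * shop_data$ads + rnorm(30, sd = 40)

my_lm <- lm(sales ~ ads, data = shop_data)  # response ~ predictor
my_lm$coefficients  # named vector: (Intercept) and ads (the slope)
```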
Prediction with Regression
Predicting new outcomes:

> y_new <- predict(my_lm, x_new, interval = "confidence")

● Estimated Response: ŷ = β̂₀ + β̂₁ · x_new (from the estimated coefficients)
● New Predictor Instance: x_new (must be a data frame)
● Provides a confidence interval

Example: Ads: $11,000 → Sales: $380,000
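A hedged end-to-end sketch of predict() on a simulated shop_data (data and numbers are made up), showing the data-frame requirement for the new instance:

```r
set.seed(42)
shop_data <- data.frame(ads = runif(30, 5, 15))
shop_data$sales <- 50 + 25 * shop_data$ads + rnorm(30, sd = 40)
my_lm <- lm(sales ~ ads, data = shop_data)

# The new predictor instance must be a data frame whose column
# matches the predictor name used in the formula
x_new <- data.frame(ads = 11)
y_new <- predict(my_lm, x_new, interval = "confidence")
y_new  # matrix with columns: fit (estimate), lwr and upr (confidence bounds)
```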
Accuracy: RMSE
Measure of accuracy: RMSE = √( (1/n) Σᵢ (yᵢ − ŷᵢ)² )
● yᵢ: true response ● ŷᵢ: estimated response ● n: #observations
RMSE has unit + scale: difficult to interpret!
Example: RMSE = $76,000. Meaning?
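The formula above can be computed directly from the model's residuals; a minimal sketch on simulated data (column names and values assumed):

```r
set.seed(42)
shop_data <- data.frame(ads = runif(30, 5, 15))
shop_data$sales <- 50 + 25 * shop_data$ads + rnorm(30, sd = 40)
my_lm <- lm(sales ~ ads, data = shop_data)

# RMSE: square root of the mean squared residual
rmse <- sqrt(mean(my_lm$residuals^2))
rmse  # same unit and scale as sales
```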
Accuracy: R-squared
R² = 1 − RSS / TSS, with TSS = Σᵢ (yᵢ − ȳ)² the total SS (ȳ: sample mean response)
Interpretation: % of explained variance; close to 1 = good fit!

> summary(my_lm)$r.squared

Example: 0.84
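Computing R² by hand from RSS and TSS and comparing it to summary() makes the definition concrete (simulated data; names are assumptions):

```r
set.seed(42)
shop_data <- data.frame(ads = runif(30, 5, 15))
shop_data$sales <- 50 + 25 * shop_data$ads + rnorm(30, sd = 40)
my_lm <- lm(sales ~ ads, data = shop_data)

rss <- sum(residuals(my_lm)^2)                           # residual sum of squares
tss <- sum((shop_data$sales - mean(shop_data$sales))^2)  # total sum of squares
r2  <- 1 - rss / tss                                     # fraction of explained variance
```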
INTRODUCTION TO MACHINE LEARNING
Let’s practice!
INTRODUCTION TO MACHINE LEARNING
Multivariable Linear Regression
Example
Simple Linear Regression, one predictor at a time:

> lm(sales ~ ads, data = shop_data)
> lm(sales ~ comp, data = shop_data)

[Scatterplots: sales vs. advertisement, and sales vs. nearby competition]

Each model ignores the other predictor: loss of information!
Multi-Linear Model
Solution: combine the predictors in one multiple linear model!
● Individual effect of each predictor
● Higher predictive power
● Higher accuracy
Multi-Linear Regression Model
y = β₀ + β₁x₁ + β₂x₂ + ε
● Predictors: x₁, x₂ ● Response: y
● Coefficients: β₀, β₁, β₂ ● Statistical Error: ε
Estimating Coefficients
● Residuals: rᵢ = yᵢ − ŷᵢ (true minus fitted response)
● Minimize! RSS = Σᵢ rᵢ², for i = 1, …, n (n = #observations)
Extending!
More predictors: total inventory, district size, …
Extend the methodology to p predictors: y = β₀ + β₁x₁ + … + β_p x_p + ε

> my_lm <- lm(sales ~ ads + comp + ..., data = shop_data)

● Response: sales ● Predictors: ads, comp, …
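A concrete two-predictor fit on simulated data (columns ads and comp and all coefficients are illustrative assumptions):

```r
# Hypothetical shop_data with two predictors
set.seed(42)
shop_data <- data.frame(ads  = runif(30, 5, 15),
                        comp = rpois(30, lambda = 5))
shop_data$sales <- 230 + 25 * shop_data$ads - 19 * shop_data$comp +
  rnorm(30, sd = 40)

# One coefficient per predictor, plus the intercept
my_lm <- lm(sales ~ ads + comp, data = shop_data)
coef(my_lm)
```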
RMSE & Adjusted R-Squared
More predictors:
● Lower RMSE and higher R-squared
● Higher complexity and cost
Solution: adjusted R-squared
● Penalizes more predictors
● Used to compare models

> summary(my_lm)$adj.r.squared

In Example: 0.819 → 0.906
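The adjustment can be written out explicitly; a sketch on simulated data (data and names assumed) verifies it against summary():

```r
set.seed(42)
shop_data <- data.frame(ads = runif(30, 5, 15), comp = rpois(30, 5))
shop_data$sales <- 230 + 25 * shop_data$ads - 19 * shop_data$comp +
  rnorm(30, sd = 40)
my_lm <- lm(sales ~ ads + comp, data = shop_data)

n  <- nrow(shop_data)   # number of observations
p  <- 2                 # number of predictors
r2 <- summary(my_lm)$r.squared
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)  # penalizes extra predictors
```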
Influence of predictors
● p-value: indicator of the influence of a parameter
● Low p-value: more likely the parameter has a significant influence

> summary(my_lm)

Call:
lm(formula = sales ~ ads + comp, data = shop_data)

Residuals:
     Min       1Q   Median       3Q      Max
-131.920  -23.009   -4.448   33.978  146.486

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  228.740     80.592   2.838 0.009084 **
ads           25.521      5.900   4.325 0.000231 ***
comp         -19.234      4.549  -4.228 0.000296 ***

The Pr(>|t|) column lists the p-values.
Example
● Want 95% confidence: p-value <= 0.05
● Want 99% confidence: p-value <= 0.01
Note: Do not mix up R-squared with p-values!

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  228.740     80.592   2.838 0.009084 **
ads           25.521      5.900   4.325 0.000231 ***
comp         -19.234      4.549  -4.228 0.000296 ***
Assumptions
● Just make a model, make a summary, and look at p-values?
● Not that simple!
● We made some implicit assumptions
Verifying Assumptions
Residuals should be:
● Independent: no pattern in the residual plot?
● Identically normally distributed: normal Q-Q plot approximately a line?

> plot(lm_shop$fitted.values, lm_shop$residuals)
Draws the residual plot
> qqnorm(lm_shop$residuals)
Draws the normal Q-Q plot

[Residual Plot: residuals vs. estimated sales. Normal Q-Q Plot: residual quantiles vs. theoretical quantiles]
Verifying Assumptions
[Residual Plot and Normal Q-Q Plot for the fitted shop model]
● Important to avoid mistakes!
● Alternative tests exist
INTRODUCTION TO MACHINE LEARNING
Let’s practice!
INTRODUCTION TO MACHINE LEARNING
k-Nearest Neighbors and Generalization
Non-Parametric Regression
Problem: Visible pattern, but not linear
[Scatterplot: y vs. x with a clear non-linear pattern]
Non-Parametric Regression
Problem: Visible pattern, but not linear
Solutions:
● Transformation: tedious
● Multi-linear Regression: advanced
● Non-Parametric Regression: doable
Non-Parametric Regression
Problem: Visible pattern, but not linear
Techniques:
● k-Nearest Neighbors
● Kernel Regression
● Regression Trees
● …
No parameter estimations required!
k-NN: Algorithm
Given a training set and a new observation:
[Scatterplot: training points y vs. x, with the new observation marked]
k-NN: Algorithm
Given a training set and a new observation:
1. Calculate the distance in the predictors
[Scatterplot: distances from the new observation to all training points]
k-NN: Algorithm
Given a training set and a new observation:
2. Select the k nearest (here: k = 4)
[Scatterplot: the 4 nearest training points highlighted]
k-NN: Algorithm
Given a training set and a new observation:
3. Aggregate the response of the k nearest (mean of the 4 responses)
[Scatterplot: the mean of the 4 highlighted responses marked]
k-NN: Algorithm
Given a training set and a new observation:
4. The outcome is your prediction
[Scatterplot: the aggregated mean shown as the prediction for the new observation]
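The four steps above can be sketched as a small R function (knn_predict, the simulated data, and k = 4 are illustrative assumptions, not course code):

```r
# Minimal k-NN regression for a single predictor
knn_predict <- function(x_train, y_train, x_new, k = 4) {
  d <- abs(x_train - x_new)     # 1. distance in the predictor
  nearest <- order(d)[1:k]      # 2. select the k nearest
  mean(y_train[nearest])        # 3. aggregate: mean of the k responses
}                               # 4. the outcome is the prediction

set.seed(1)
x <- runif(40, 2, 6)
y <- 10 * sin(x) + 50 + rnorm(40, sd = 0.5)
knn_predict(x, y, x_new = 4.7, k = 4)
```

No parameters are estimated: the training set itself is the model.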
Choosing k
● k = 1: perfect fit on the training set, but poor predictions
● k = #obs in training set: just the mean, also poor predictions
● Reasonable: k = 20% of #obs in the training set
Bias-variance trade-off!
Generalization in Regression
● Built your own regression model
● Worked on the training set
● Does it generalize well?
● Two techniques:
● Hold-out: simply split the dataset
● k-fold cross-validation
Hold-Out Method for Regression
Split the dataset into a training set and a test set:
● Build the regression model on the training set
● Calculate the RMSE within the training set
● Predict the outcome of the test set
● Calculate the RMSE within the test set
● Compare test RMSE and training RMSE
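The steps above can be sketched in a few lines of R (the simulated shop_data and the 70/30 split ratio are assumptions for illustration):

```r
set.seed(123)
shop_data <- data.frame(ads = runif(200, 5, 15))
shop_data$sales <- 50 + 25 * shop_data$ads + rnorm(200, sd = 40)

# Hold out 30% of the rows as a test set
n   <- nrow(shop_data)
idx <- sample(n, size = round(0.7 * n))
train <- shop_data[idx, ]
test  <- shop_data[-idx, ]

# Fit on the training set only
model <- lm(sales ~ ads, data = train)
rmse  <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

rmse_train <- rmse(train$sales, predict(model, train))
rmse_test  <- rmse(test$sales,  predict(model, test))
# Comparable train and test RMSE suggests the model generalizes
```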
Under- and Overfitting
[Three fits of the same data: underfit, good fit, overfit]
● Good fit: Fit ✔ Generalize ✔ Prediction ✔
● Underfit: Fit ✘ Generalize ✔ Prediction ✘
● Overfit: Fit ✔ Generalize ✘ Prediction ✘
INTRODUCTION TO MACHINE LEARNING
Let’s practice!