
INTRODUCTION TO MACHINE LEARNING

Regression: Simple and Linear


Regression Principle

PREDICTORS → REGRESSION → RESPONSE


Example
Shop Data: sales, competition, district size, …
Data Analyst: relationship?

● Predictors: competition, advertisement, …

● Response: sales

Shopkeeper: predictions!


Simple Linear Regression
● Simple: one predictor to model the response

● Linear: approximately linear relationship

Linearity plausible? Scatterplot!


Example
● Relationship: advertisement → sales

● Expectation: positively correlated


Example
● Observation: upwards linear trend

● First Step: simple linear regression

[Scatterplot: sales vs. advertisement]

Model
Fitting a line: y = β₀ + β₁·x + ε
● Predictor: x
● Response: y
● Intercept: β₀
● Slope: β₁
● Statistical error: ε


Estimating Coefficients
Minimize the sum of squared residuals, where a residual is the true response minus the fitted response:
RSS = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²
with yᵢ the true response, ŷᵢ the fitted response, and n the number of observations.
[Scatterplot: sales vs. advertisement, showing the fitted line and the residuals]
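As a minimal sketch of this criterion, assuming the shop_data data frame with sales and ads columns that the next slide uses:

> fit <- lm(sales ~ ads, data = shop_data)   # least-squares fit
> res <- shop_data$sales - predict(fit)      # true response minus fitted response
> sum(res^2)                                 # RSS, the quantity being minimized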


Estimating Coefficients
> my_lm <- lm(sales ~ ads, data = shop_data)   # formula: response ~ predictor
> my_lm$coefficients                           # returns the estimated coefficients
[Scatterplot: sales vs. advertisement with the fitted line]
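For illustration, a sketch assuming the my_lm fit above; the coefficients come back as a named vector:

> coefs <- my_lm$coefficients
> coefs["(Intercept)"]   # estimated intercept
> coefs["ads"]           # estimated slope for the ads predictor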


Prediction with Regression
Predicting new outcomes: ŷ_new = β̂₀ + β̂₁·x_new, using the estimated coefficients.
> y_new <- predict(my_lm, x_new, interval = "confidence")   # provides a confidence interval
● x_new: the new predictor instance; must be a data frame
● y_new: the estimated response
Example: Ads: $11,000 → Sales: $380,000
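A minimal sketch of the call, assuming ads is measured in thousands of dollars (an assumption) and remembering that the new instance must be a data frame whose column matches the predictor name in the formula:

> x_new <- data.frame(ads = 11)                    # new predictor instance as a data frame
> predict(my_lm, x_new, interval = "confidence")   # point estimate plus confidence bounds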


Accuracy: RMSE
Measure of accuracy:
RMSE = √( (1/n) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² )
with yᵢ the true response, ŷᵢ the estimated response, and n the number of observations.
RMSE carries the unit and scale of the response → difficult to interpret!
Example: RMSE = $76,000. Meaning?
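A sketch of the computation, assuming the my_lm fit and shop_data from before:

> res <- shop_data$sales - predict(my_lm)   # true minus estimated response
> sqrt(mean(res^2))                         # RMSE, in the same unit as sales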


Accuracy: R-squared
R² = 1 − RSS / TSS, with total sum of squares TSS = Σᵢ₌₁ⁿ (yᵢ − ȳ)², where ȳ is the sample mean response.
Interpretation: the % of explained variance; close to 1 → good fit!
> summary(my_lm)$r.squared   # returns R-squared
Example: R² = 0.84
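As a hand-computed check, a sketch assuming my_lm and shop_data; the result matches the value summary() reports:

> rss <- sum((shop_data$sales - predict(my_lm))^2)          # residual sum of squares
> tss <- sum((shop_data$sales - mean(shop_data$sales))^2)   # total sum of squares
> 1 - rss / tss                                             # equals summary(my_lm)$r.squared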

INTRODUCTION TO MACHINE LEARNING

Let’s practice!

INTRODUCTION TO MACHINE LEARNING

Multivariable Linear Regression



Example
Simple Linear Regression:
> lm(sales ~ ads, data = shop_data)
> lm(sales ~ comp, data = shop_data)
[Scatterplots: sales vs. advertisement; sales vs. nearby competition]
Loss of information!


Multi-Linear Model
Solution: combine the predictors in one multi-linear model, e.g. sales = β₀ + β₁·ads + β₂·comp + ε, where each slope captures the individual effect of its predictor.
● Higher predictive power
● Higher accuracy


Multi-Linear Regression Model
y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε
● Predictors: x₁, …, xₚ
● Response: y
● Statistical error: ε
● Coefficients: β₀, β₁, …, βₚ


Estimating Coefficients
As before, minimize the sum of squared residuals:
RSS = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²
with yᵢ the true response, ŷᵢ the fitted response, and n the number of observations.


Extending!
Extend the methodology to p predictors: total inventory, district size, …
> my_lm <- lm(sales ~ ads + comp + ..., data = shop_data)   # formula: response ~ predictors
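As a concrete sketch with the two predictors that the later summary output names (ads and comp):

> my_lm <- lm(sales ~ ads + comp, data = shop_data)   # multi-linear fit with two predictors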


RMSE & Adjusted R-Squared
More predictors → lower RMSE and higher R-squared, but also higher complexity and cost.
Solution: adjusted R-squared
● Penalizes additional predictors
● Used to compare models
> summary(my_lm)$adj.r.squared   # returns adjusted R-squared
In the example: 0.819 → 0.906
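A sketch of such a comparison, assuming (it is not stated explicitly) that 0.819 belongs to the simple model and 0.906 to the two-predictor model:

> lm_simple <- lm(sales ~ ads, data = shop_data)
> lm_multi <- lm(sales ~ ads + comp, data = shop_data)
> summary(lm_simple)$adj.r.squared   # 0.819 in the example (assumed pairing)
> summary(lm_multi)$adj.r.squared    # 0.906 in the example (assumed pairing)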


Influence of predictors
● p-value: an indicator of a parameter's influence
● The lower the p-value, the more likely the parameter has a significant influence
> summary(my_lm)

Call:
lm(formula = sales ~ ads + comp, data = shop_data)

Residuals:
     Min       1Q   Median       3Q      Max
-131.920  -23.009   -4.448   33.978  146.486

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  228.740     80.592   2.838 0.009084 **
ads           25.521      5.900   4.325 0.000231 ***
comp         -19.234      4.549  -4.228 0.000296 ***

The Pr(>|t|) column lists the p-values.


Example
● Want 95% confidence → p-value ≤ 0.05
● Want 99% confidence → p-value ≤ 0.01
Note: Do not mix up R-squared with p-values!
In the coefficients table above, the p-values of the intercept, ads, and comp are all below 0.01.
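For illustration, a sketch assuming the my_lm fit above; the p-values can also be extracted and tested programmatically:

> pvals <- summary(my_lm)$coefficients[, "Pr(>|t|)"]   # p-value per coefficient
> pvals < 0.05                                         # significant at the 95% level?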


Assumptions
● Just make a model, make a summary, and look at p-values?
● Not that simple!
● We made some implicit assumptions


Verifying Assumptions
Residuals should be:
● Independent: does the residual plot show no pattern?
● Identically normal: is the normal Q-Q plot approximately a line?
[Residual Plot: residuals vs. estimated sales. Normal Q-Q Plot: residual quantiles vs. theoretical quantiles]
> plot(lm_shop$fitted.values, lm_shop$residuals)   # draws the residual plot
> qqnorm(lm_shop$residuals)                        # draws a normal Q-Q plot


Verifying Assumptions
[Normal Q-Q Plot and Residual Plot, as above]

● Important to avoid mistakes!

● Alternative tests exist
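One example of such a test, as a sketch assuming the lm_shop fit above, is the Shapiro-Wilk test of residual normality from base R:

> shapiro.test(lm_shop$residuals)   # null hypothesis: the residuals are normally distributed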

INTRODUCTION TO MACHINE LEARNING

Let’s practice!

INTRODUCTION TO MACHINE LEARNING

k-Nearest Neighbors and Generalization


Non-Parametric Regression

[Scatterplot: y vs. x]
Problem: Visible pattern, but not linear


Non-Parametric Regression
Problem: Visible pattern, but not linear
Solutions:
● Transformation (tedious)
● Multi-linear regression (advanced)
● Non-parametric regression (doable)


Non-Parametric Regression
Problem: Visible pattern, but not linear

Techniques:

● k-Nearest Neighbors

● Kernel Regression

● Regression Trees

● …

No parameter estimation required!


k-NN: Algorithm
Given a training set and a new observation:
1. Calculate the distance in the predictors
2. Select the k nearest (here k = 4)
3. Aggregate the responses of the k nearest (here: the mean of the 4 responses)
4. The outcome is your prediction
[Scatterplots: y vs. x, highlighting the new observation, its k = 4 nearest neighbors, and the resulting prediction]
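A minimal base-R sketch of these four steps for a single numeric predictor; knn_regress, x_train, and y_train are hypothetical names, not from the course:

> knn_regress <- function(x_train, y_train, x_new, k = 4) {
+   d <- abs(x_train - x_new)   # 1. distance in the predictor
+   nn <- order(d)[1:k]         # 2. indices of the k nearest
+   mean(y_train[nn])           # 3.-4. mean of their responses is the prediction
+ }
> knn_regress(x_train = c(4.5, 4.6, 4.8, 4.9, 5.0),
+             y_train = c(45.5, 46.0, 47.5, 48.0, 48.5),
+             x_new = 4.7, k = 4)   # toy data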

Choosing k
● k = 1: perfect fit on the training set, but poor predictions
● k = #observations in the training set: predicts the overall mean, also poor predictions
Reasonable: k = 20% of the #observations in the training set
Bias-variance trade-off!


Generalization in Regression
● Built your own regression model
● Worked on the training set
● Does it generalize well?
● Two techniques:
● Hold out: simply split the dataset
● k-fold cross-validation

Introduction to Machine Learning

Hold Out Method for Regression
Split the dataset into a training set and a test set, then:
1. Build the regression model on the training set
2. Calculate the RMSE within the training set
3. Predict the outcomes of the test set
4. Calculate the RMSE within the test set
5. Compare the test RMSE with the training RMSE
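A sketch of these five steps, assuming shop_data and an (assumed) 70/30 split:

> set.seed(1)                                     # reproducible split
> n <- nrow(shop_data)
> train_idx <- sample(n, size = round(0.7 * n))   # indices of the training set
> train <- shop_data[train_idx, ]
> test <- shop_data[-train_idx, ]
> fit <- lm(sales ~ ads + comp, data = train)     # build model on the training set
> rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))
> rmse(train$sales, predict(fit))                 # training RMSE
> rmse(test$sales, predict(fit, test))            # test RMSE: compare with the training RMSE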


Under and Overfitting
[Three scatterplots of y vs. x: an underfitted model, a well-fitted model, and an overfitted model, each rated on Fit, Generalize, and Prediction]

INTRODUCTION TO MACHINE LEARNING

Let’s practice!
