Data Mining - Week 1: Linear Regression
Christof Monz, Informatics Institute, University of Amsterdam

Outline

• Plotting real-valued predictions
• Linear regression
• Error function




Linear Regression


• Predict real values (as opposed to discrete classes)
• Simple machine learning prediction task
• Assumes a linear correlation between the data and the target values

Scatter Plots


[Figure: scatter plot of the training data, x on the horizontal axis (10 to 45), y on the vertical axis (10 to 40)]


Linear Regression


• Find the line that approximates the data as closely as possible
• y = a + b·x, where b is the slope and a is the y-intercept
• a and b should be chosen such that they minimize the difference between the predicted values and the values in the training data

Error Functions


• There are a number of ways to define an error function
• Sum of absolute errors = ∑_{i∈D} |y_i − (a + b·x_i)|
• Sum of squared errors = ∑_{i∈D} (y_i − (a + b·x_i))², where y_i is the true value
• Squared error is the most commonly used
• Task: find the parameters a and b that minimize the squared error over the training data
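The two error functions can be written out in a few lines of Python; the data points and the line parameters below are made-up illustrations, not from the slides:

```python
# Toy training data D: pairs (x_i, y_i), and a candidate line y = a + b*x.
D = [(10, 12), (20, 24), (30, 33), (40, 45)]  # made-up example points
a, b = 1.0, 1.1                               # hypothetical parameter instantiation

# Sum of absolute errors: |y_i - (a + b*x_i)| summed over the training set.
sum_abs_err = sum(abs(y - (a + b * x)) for x, y in D)
# Sum of squared errors: (y_i - (a + b*x_i))^2 summed over the training set.
sum_sq_err = sum((y - (a + b * x)) ** 2 for x, y in D)

print(sum_abs_err, sum_sq_err)  # both come out to 2.0 for these points
```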


Error Functions


• Normalized error functions:
  • Mean squared error = (∑_{i∈D} (y_i − (a + b·x_i))²) / |D|
  • Relative squared error = (∑_{i∈D} (y_i − (a + b·x_i))²) / (∑_{i∈D} (y_i − ȳ)²), where ȳ = (1/|D|) ∑_{i∈D} y_i
  • Root relative squared error = √( (∑_{i∈D} (y_i − (a + b·x_i))²) / (∑_{i∈D} (y_i − ȳ)²) )
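The normalized variants follow directly from the sum of squared errors; a minimal sketch with made-up data and a hypothetical parameter choice:

```python
import math

# Toy data and a hypothetical line y = a + b*x (made-up for illustration).
D = [(10, 12), (20, 24), (30, 33), (40, 45)]
a, b = 1.0, 1.1

sse = sum((y - (a + b * x)) ** 2 for x, y in D)   # sum of squared errors
y_bar = sum(y for _, y in D) / len(D)             # mean target value
tss = sum((y - y_bar) ** 2 for _, y in D)         # spread of y around its mean

mse = sse / len(D)        # mean squared error: averages out the data set size
rse = sse / tss           # relative squared error: compares against predicting the mean
rrse = math.sqrt(rse)     # root relative squared error: back on the scale of y

print(mse, rse, rrse)
```

Dividing by the spread around the mean makes the error comparable across data sets: an RSE below 1 means the line beats the trivial "always predict ȳ" model.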

Minimizing Error Functions


• There are roughly two ways:
  • Try different parameter instantiations and see which ones lead to the lowest error (search)
  • Solve mathematically (closed form)
• Most parameter estimation problems in machine learning can only be solved by searching
• For linear regression, we can solve it mathematically


Minimizing SSE


• SSE = ∑_{i∈D} (y_i − (a + b·x_i))²
• Take the partial derivatives with respect to a and b
• Set each partial derivative equal to zero and solve for a and b respectively
• The resulting values for a and b minimize the error and can be used to predict unseen data instances
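Spelling out the two conditions (a standard derivation, not printed on the slides):

```latex
\frac{\partial\,\mathrm{SSE}}{\partial a}
  = -2\sum_{i\in D}\bigl(y_i - (a + b\,x_i)\bigr) = 0
  \;\Rightarrow\; a = \bar{y} - b\,\bar{x}
\qquad
\frac{\partial\,\mathrm{SSE}}{\partial b}
  = -2\sum_{i\in D} x_i\bigl(y_i - (a + b\,x_i)\bigr) = 0
```

Substituting a = ȳ − b·x̄ into the second condition and solving for b yields the closed-form expression for b used below.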

Applying Linear Regression


• For a given training set we first compute b:
  b = (|D| ∑_{i∈D} x_i y_i − ∑_{i∈D} x_i ∑_{i∈D} y_i) / (|D| ∑_{i∈D} x_i² − (∑_{i∈D} x_i)²)
• and then a, using the value computed for b: a = ȳ − b·x̄
• For any new instance x′ (i.e. an instance that was not in the training set), the predicted value is a + b·x′
• Extendible to multi-valued functions
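The two formulas translate directly into code; a minimal sketch (the function names and example data are my own, not from the slides):

```python
def fit_linear(D):
    """Closed-form simple linear regression: returns (a, b) for y = a + b*x."""
    n = len(D)
    sum_x = sum(x for x, _ in D)
    sum_y = sum(y for _, y in D)
    sum_xy = sum(x * y for x, y in D)
    sum_xx = sum(x * x for x, _ in D)
    # Slope first, exactly as in the formula above ...
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # ... then the intercept from the means: a = mean(y) - b * mean(x)
    a = sum_y / n - b * (sum_x / n)
    return a, b

def predict(a, b, x_new):
    """Predicted value for an unseen instance x'."""
    return a + b * x_new

D = [(1, 3), (2, 5), (3, 7)]    # made-up data lying exactly on y = 1 + 2x
a, b = fit_linear(D)
print(a, b, predict(a, b, 10))  # 1.0 2.0 21.0
```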


Linear Regression


• Used to predict real-number values, given numerical input variables
• Parameters can be estimated analytically (i.e. by applying some mathematics), which won't be the case for most parameter estimation algorithms we'll see later on
• Extendible to non-linear functions, e.g. log-linear regression
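The slides mention log-linear regression only by name; one common reading (an assumption here, not spelled out on the slides) is to fit the ordinary closed-form line to log-transformed target values, which models y = e^(a + b·x):

```python
import math

def fit_linear(D):
    """Closed-form simple linear regression: returns (a, b) for y = a + b*x."""
    n = len(D)
    sx = sum(x for x, _ in D); sy = sum(y for _, y in D)
    sxy = sum(x * y for x, y in D); sxx = sum(x * x for x, _ in D)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return sy / n - b * sx / n, b

# Made-up data generated from y = exp(0.5 + 0.3*x).
D = [(x, math.exp(0.5 + 0.3 * x)) for x in range(1, 6)]

# Fit a straight line to (x, log y) instead of (x, y).
a, b = fit_linear([(x, math.log(y)) for x, y in D])
print(a, b)                      # recovers roughly 0.5 and 0.3

# Predictions are back-transformed through exp, e.g. for a new instance x' = 7:
predicted = math.exp(a + b * 7)
```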

Correlation


• So far we have used linear regression to predict target values (prediction)
• Linear regression can also be used to determine how closely two variables are correlated (description)
• The smaller the error, the stronger the correlation between the variables
• Correlation means that there is some (interesting) relation between the variables, but not necessarily a causal one
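The link between the error and correlation strength can be made precise: for simple linear regression, 1 minus the relative squared error equals the squared Pearson correlation coefficient r². A small numerical check (this identity is standard, but not stated on the slides; the data is made up):

```python
import math

D = [(10, 12), (20, 24), (30, 31), (40, 45)]  # made-up data
n = len(D)
xs = [x for x, _ in D]; ys = [y for _, y in D]
x_bar = sum(xs) / n; y_bar = sum(ys) / n

# Closed-form fit (equivalent to the slide's formula for b).
b = sum((x - x_bar) * (y - y_bar) for x, y in D) / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

# Relative squared error of the fitted line.
rse = sum((y - (a + b * x)) ** 2 for x, y in D) / sum((y - y_bar) ** 2 for y in ys)

# Pearson correlation coefficient of x and y.
r = sum((x - x_bar) * (y - y_bar) for x, y in D) / math.sqrt(
    sum((x - x_bar) ** 2 for x in xs) * sum((y - y_bar) ** 2 for y in ys))

print(1 - rse, r ** 2)  # the two quantities agree
```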


Recap


• Linear regression
• Error rates
• Analytical parameter estimation