
Page 1: Day 6 - Linear Regression and Logistic Regression

Day 6 - Linear Regression and Logistic Regression Agenda:

• Linear Regression
  ◦ Examples
  ◦ Issues to Pay Attention To with Linear Regression

• Classification and Logistic Regression
  ◦ Training classifiers
  ◦ Evaluating classifiers

• More thoughts on the Square Capital example and whether to approach the problem as regression or classification

Page 2: Day 6 - Linear Regression and Logistic Regression

CS 6140: Machine Learning — Fall 2021 — Paul Hand

HW 2
Due: Wednesday September 29, 2021 at 2:30 PM Eastern time via Gradescope.

Names: [Put Your Name(s) Here]

You can submit this homework either by yourself or in a group of 2. You may consult any and all resources. You may submit your answers to this homework by directly editing this tex file (available on the course website) or by submitting a PDF of a Jupyter or Colab notebook. When you upload your solutions to Gradescope, make sure to tag each problem with the correct page.

Question 1. In this problem, you will fit polynomials to one-dimensional data using linear regression.

(a) Generate training data (x_i, y_i) for i = 1, ..., 8 by x_i ∼ Uniform([0,1]) and y_i = f(x_i) + ε_i, where f(x) = 1 + 2x − 2x², ε_i ∼ N(0, σ²), and σ = 0.1. Plot the training data and the function f.

Response:

(b) In this problem, you will find the best fit degree-d polynomial for the above data for each d between 0 and 7. Find it with least squares linear regression by minimizing the training mean squared error (MSE)

    min_θ  (1/2) Σ_{i=1}^{8} ( y_i − Σ_{k=0}^{d} θ_k x_i^k )²    (1)

using the Normal Equations. Use numpy.linalg.solve to solve the Normal Equations instead of computing a matrix inverse. On 8 separate plots, plot the data and the best fit degree-d polynomial.
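Parts (a) and (b) can be sketched as follows. This is a minimal illustration, not a full solution: the seed, the helper names fit_poly and mse, and printing in place of the requested plots are all choices made here, not part of the assignment.

```python
import numpy as np

# Part (a): training data x_i ~ Uniform([0,1]), y_i = f(x_i) + eps_i,
# with f(x) = 1 + 2x - 2x^2 and sigma = 0.1.  Seed chosen arbitrarily.
rng = np.random.default_rng(0)
n, sigma = 8, 0.1
x = rng.uniform(0.0, 1.0, size=n)
y = 1 + 2 * x - 2 * x**2 + sigma * rng.normal(size=n)

def fit_poly(x, y, d):
    """Least-squares degree-d polynomial fit via the Normal Equations.

    Columns of X are 1, x, x^2, ..., x^d; we solve (X^T X) theta = X^T y
    with numpy.linalg.solve rather than forming a matrix inverse."""
    X = np.vander(x, d + 1, increasing=True)
    return np.linalg.solve(X.T @ X, X.T @ y)

def mse(x, y, theta):
    """Mean squared error of the polynomial with coefficients theta."""
    X = np.vander(x, len(theta), increasing=True)
    return np.mean((y - X @ theta) ** 2)

# Part (b): fit each degree d = 0..7 (plots omitted in this sketch).
for d in range(8):
    theta = fit_poly(x, y, d)
    print(d, mse(x, y, theta))
```

Since the degree-d models are nested, the training MSE can only decrease as d grows, which is the behavior part (c) asks you to observe.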

Response:

(c) Plot the MSE with respect to the training data (training MSE) as a function of d. Which value of d provided the lowest training MSE?

Response:

(d) Generate a test set of 1000 data points sampled according to the same process as in part (a). Plot the MSE with respect to the test data (test MSE) as a function of d. Which value of d provided the lowest test MSE?

Response:

Question 2. Linear regression using gradient descent and TensorFlow

[Handwritten diagram, largely illegible in the transcript: an ML workflow — model, learn, evaluate, use]

Page 3: Day 6 - Linear Regression and Logistic Regression

Least Squares Formulation for Linear Regression (for a general model)

Given data D = {(x_i, y_i)}_{i=1..n}, with x_i ∈ R^d and y_i ∈ R.

Model: y = Σ_j θ_j g_j(x) + error, for chosen functions g_j.

Want:

    min_θ ‖y − Xθ‖²    (minimizes the sum of squares)

where y = (y_1, ..., y_n) ∈ R^n and X ∈ R^{n×k} has entries X_ij = g_j(x_i).

Solution given by the Normal Equations:

    XᵀX θ = Xᵀy

Page 4: Day 6 - Linear Regression and Logistic Regression

Examples of setting up and solving linear regression

Find the best fit cubic through 1d data.

Data: {(x_i, y_i)}_{i=1..n}, with x_i, y_i ∈ R.

Model: y = θ0 + θ1 x + θ2 x² + θ3 x³ + noise.

Find min_θ ‖y − Xθ‖², where row i of X is (1, x_i, x_i², x_i³).

Solution given by the Normal Equations:

    XᵀX θ = Xᵀy

Page 5: Day 6 - Linear Regression and Logistic Regression

Find best fit plane

Data: {(x_i1, x_i2, y_i)}_{i=1..n} ∈ R² × R.

Model: y = θ0 + θ1 x1 + θ2 x2 + noise.

Find min_θ ‖y − Xθ‖², where row i of X is (1, x_i1, x_i2) and y = (y_1, ..., y_n).

Solution given by the Normal Equations:

    XᵀX θ = Xᵀy
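The plane fit above can be sketched in a few lines. The data here is synthetic (made-up coefficients and noise level) purely to show the design-matrix construction; only the model y = θ0 + θ1 x1 + θ2 x2 + noise comes from the slide.

```python
import numpy as np

# Synthetic data from an assumed "true" plane y = 0.5 + 2*x1 - 1*x2 + noise.
rng = np.random.default_rng(1)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 0.5 + 2.0 * x1 - 1.0 * x2 + 0.05 * rng.normal(size=n)

# Design matrix with rows (1, x_i1, x_i2); solve the Normal Equations.
X = np.column_stack([np.ones(n), x1, x2])
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # close to the true coefficients [0.5, 2.0, -1.0]
```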

Page 6: Day 6 - Linear Regression and Logistic Regression

Solving an Optimization Problem using Gradient Descent

    min_x F(x)

Gradient descent: take successive steps downhill,

    x_{t+1} = x_t − α ∇F(x_t)

where α > 0 is the step size (learning rate), and −∇F(x_t) points in the direction of steepest descent.

[Depiction of gradient descent: iterates stepping downhill across the level sets of F]
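The update rule above can be sketched for least squares (this is also the idea behind Question 2 of the homework, though that asks for TensorFlow; the step size, step count, and data here are arbitrary choices for illustration).

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, steps=500):
    """Minimize the MSE (1/n)||y - X theta||^2 by gradient descent.

    alpha is the step size (learning rate); each iteration moves
    theta in the direction of steepest descent, -grad F(theta)."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = 2 / len(y) * X.T @ (X @ theta - y)  # gradient of the MSE
        theta = theta - alpha * grad               # step downhill
    return theta

# Synthetic data from an assumed line y = 1 - 2*x + small noise.
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=100)
print(gradient_descent(X, y))  # approaches the least-squares solution
```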

Page 7: Day 6 - Linear Regression and Logistic Regression

Things that can go wrong: Underfitting and Overfitting Things that can go wrong: numerical instability

Underfitting: the model (e.g., y = θ0) is not expressive enough.

Overfitting: the model is too expressive; it can fit the noise.

Numerical instability: solving the Normal Equations XᵀXθ = Xᵀy can be ill-conditioned.
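The numerical-instability point can be made concrete with polynomial features: forming XᵀX squares the condition number of X, so high-degree fits become numerically unusable. The grid and degrees below are arbitrary choices for illustration.

```python
import numpy as np

# Condition number of X^T X for polynomial design matrices on [0, 1].
x = np.linspace(0, 1, 50)
for d in (3, 10, 20):
    X = np.vander(x, d + 1, increasing=True)
    print(d, np.linalg.cond(X.T @ X))  # grows explosively with d
```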

Page 8: Day 6 - Linear Regression and Logistic Regression

Other topics: What happens when there are fewer data points than features? What happens if there are outliers in the data? How do you deal with categorical features?

If n > k: usually y = Xθ has no exact solution, but there is a unique least-squares θ̂ (here X ∈ R^{n×k}).

If n < k: there are typically many solutions; X has a null space.

Outliers: in min_θ ‖y − Xθ‖², an outlier can skew the estimate of θ arbitrarily much. This is an issue with the MSE. More robust alternatives: the median, or minimizing ‖y − Xθ‖₁.

Categorical features: use an indicator variable, e.g.

    y = θ0 + θ1 x1 + θ2 x2,   where x2 = 1 if person i is a consultant, 0 otherwise.
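The indicator-feature idea can be sketched as follows; the tiny dataset (jobs, x1, y) is invented for illustration, and only the encoding pattern comes from the slide.

```python
import numpy as np

# Made-up data: a numeric feature x1 and a categorical job label.
jobs = np.array(["consultant", "engineer", "consultant", "engineer"])
x1 = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([5.0, 3.0, 7.0, 5.0])

# Indicator feature: x2 = 1 if person i is a consultant, 0 otherwise.
x2 = (jobs == "consultant").astype(float)

# Fit y = theta0 + theta1*x1 + theta2*x2 by least squares.
X = np.column_stack([np.ones_like(x1), x1, x2])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)  # theta2 is the fitted "consultant" offset
```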

Page 9: Day 6 - Linear Regression and Logistic Regression

Be careful about whether you want to view your problem as a prediction task

Page 10: Day 6 - Linear Regression and Logistic Regression

Classification and Logistic Regression

Viewing Regression and Classification as function estimation problems

Regression: let f: R^d → R, with y = f(x) + noise. Given (x_i, y_i) for i = 1..n, find f̂ ≈ f.

Classification: predict membership in a category. Let f: R^d → {0, 1}, with y = f(x) + noise. Given (x_i, y_i) for i = 1..n, find f̂; it determines a decision boundary.

Terminology:

• X: input variables, predictors, independent variables, features
• Y: response, dependent variable, output variable
• f: model, predictor, hypothesis

Page 11: Day 6 - Linear Regression and Logistic Regression

Parametric Approach: Choose a model for f with unknown parameters. Estimate the parameters.

Parametric Regression (predict a continuous value):

Model: f_θ: R^d → R, with y = f_θ(x) + noise. Given (x_i, y_i) for i = 1..n, find θ̂.

Parametric Classification (predict membership in a category):

Model: f_θ: R^d → {0, 1}, with y = f_θ(x) + noise. Given (x_i, y_i) for i = 1..n, find θ̂; f_θ̂ determines a decision boundary.

Approach for estimating θ:

• Select a model for f with parameters θ.
• Minimize the loss between the training labels and the predictions on the training data.

Page 12: Day 6 - Linear Regression and Logistic Regression

Binary Classification in 2D with logistic regression

    min_θ Σ_i L(y_i, f_θ(x_i))

where L is the loss function and f_θ(x_i) is the prediction.

Example: linear regression, with L(y, ŷ) = (y − ŷ)² (square loss, or ℓ2 loss).

Training data: {(x_i, y_i)}_{i=1..n}.

[Scatter plot in the (x1, x2) plane: class 1 points and class 0 points]

Page 13: Day 6 - Linear Regression and Logistic Regression

Given this data, draw a decision boundary (a curve where you would say class 1 is on one side and class 0 is on the other side)

[Two sketches in the (x1, x2) plane: a simple boundary separating class 1 from class 0, and an overly wiggly boundary that is fitting noise]

Page 14: Day 6 - Linear Regression and Logistic Regression

Model: ŷ = σ(θ0 + θ1 x1 + θ2 x2), where

    σ(z) = 1 / (1 + e^{−z})

is the sigmoid (logistic) function.

Solve min_θ Σ_i L(y_i, ŷ_i) for θ̂.

Predict: for a new sample x, predict class 1 if ŷ ≥ 1/2, class 0 if ŷ < 1/2.

What loss function should you use? One choice is the log loss:

    L(y, ŷ) = −log ŷ if y = 1;   −log(1 − ŷ) if y = 0

where y is binary and ŷ is continuous. Equivalently,

    L(y, ŷ) = −[ y log ŷ + (1 − y) log(1 − ŷ) ].
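The model and loss above can be sketched end to end. The learning rate, step count, and synthetic linearly-separable data are assumptions made for illustration; the sigmoid model, log-loss objective, and 1/2 threshold come from the slide.

```python
import numpy as np

def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, alpha=0.5, steps=2000):
    """Fit logistic regression by gradient descent on the mean log loss."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        yhat = sigmoid(X @ theta)
        grad = X.T @ (yhat - y) / len(y)  # gradient of the mean log loss
        theta -= alpha * grad
    return theta

# Synthetic 2D data labeled by the (assumed) rule x1 + x2 > 0.
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
y = (X[:, 1] + X[:, 2] > 0).astype(float)

theta = train_logreg(X, y)
pred = (sigmoid(X @ theta) >= 0.5).astype(float)  # threshold at 1/2
print((pred == y).mean())  # training accuracy near 1
```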

Page 15: Day 6 - Linear Regression and Logistic Regression

Decision Boundary for Logistic Regression

Training data: {(x_i, y_i)}_{i=1..n}.

Model: ŷ = σ(θ0 + θ1 x1 + θ2 x2).

Predict: for a new sample x, class 1 if ŷ ≥ 1/2, class 0 if ŷ < 1/2.

The decision boundary is linear:

    ŷ = 1/2  ⟺  θ0 + θ1 x1 + θ2 x2 = 0.

Page 16: Day 6 - Linear Regression and Logistic Regression

Activity: Could you use logistic regression to build a reasonable classifier for the following data?

[Sketch: class 1 points forming a ring around class 0 points]

With ŷ = σ(θ0 + θ1 x1 + θ2 x2), the decision boundary is linear, which is a bad fit for this data.

Page 17: Day 6 - Linear Regression and Logistic Regression

Evaluating Classifiers Activity: Someone invents a test for a rare disease that affects 0.1% of the population. The test has accuracy 99.9%. Are you convinced this is a good test?

Confusion matrix for a binary prediction:

• True Positive, False Positive
• False Negative, True Negative

    Accuracy = (TP + TN) / (TP + TN + FP + FN)
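The rare-disease activity can be worked through with the accuracy formula above: a "test" that simply declares everyone negative already achieves 99.9% accuracy at 0.1% prevalence, so accuracy alone is not convincing. The population size here is an arbitrary choice.

```python
# All-negative "classifier" on a population with 0.1% prevalence.
population = 100_000
sick = population // 1000            # 0.1% of the population is sick

true_negatives = population - sick   # everyone is declared negative
accuracy = true_negatives / population
recall = 0 / sick                    # it finds none of the sick patients
print(accuracy, recall)  # 0.999 0.0
```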

Page 18: Day 6 - Linear Regression and Logistic Regression

Activity: You are building a binary classifier that detects whether a pedestrian is crossing the sidewalk within 30 feet of a self driving car. If the detection is positive, the car puts on the brakes. Would you rather have good precision and great recall or good recall and great precision?

    Precision = TP / (TP + FP):  what fraction of predicted positives are real?

    Recall = TP / (TP + FN):  what fraction of positives are correctly predicted?

We want high precision and high recall.

Page 19: Day 6 - Linear Regression and Logistic Regression

There is a trade off between True Positives and False Positives, and between True Negatives and False Negatives

Training data: {(x_i, y_i)}_{i=1..n}.

Model: ŷ = σ(θ0 + θ1 x1 + θ2 x2).

Predict: for a new sample x, class 1 if ŷ ≥ τ, class 0 if ŷ < τ. The threshold τ = 1/2 is the default, but you could choose any value; moving it trades one error type against the other.

Page 20: Day 6 - Linear Regression and Logistic Regression

Receiver Operating Characteristic Curves

[ROC curve: true positive rate vs. false positive rate]

• Each point on the curve is a classifier with a different threshold.
• One end of the curve: declare everything positive. The other end: declare everything negative.
• A random classifier lies on the diagonal.

Page 21: Day 6 - Linear Regression and Logistic Regression

Comparing classifiers and Area-Under-Curve (AUC) Also common to plot precision-recall curves

[ROC plot: TP rate vs. FP rate; we want AUC close to 1]

[Precision-recall curve: precision vs. recall]
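The ROC construction from the last two pages can be sketched by sweeping the threshold over a set of scores and accumulating the area with the trapezoid rule. The scores and labels below are made up for illustration; only the TP-rate/FP-rate sweep and the AUC idea come from the slides.

```python
import numpy as np

# Made-up true labels and classifier scores (higher score = more positive).
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9])

# Sweep thresholds from "declare everything negative" (inf) downward;
# each threshold is one classifier, i.e., one point on the ROC curve.
thresholds = np.concatenate([[np.inf], np.sort(scores)[::-1]])
tpr, fpr = [], []
for t in thresholds:
    pred = scores >= t
    tpr.append(np.sum(pred & (y_true == 1)) / np.sum(y_true == 1))
    fpr.append(np.sum(pred & (y_true == 0)) / np.sum(y_true == 0))

# AUC by the trapezoid rule over the (fpr, tpr) points; want AUC near 1.
auc = sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
          for i in range(1, len(fpr)))
print(auc)
```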