
Page 1: Title Slide

BITS F464: MACHINE LEARNING

Lecture-01: Introduction

Dr. Kamlesh Tiwari, Assistant Professor

Department of Computer Science and Information Systems Engineering, BITS Pilani, Rajasthan-333031, INDIA

Jan 10, 2018 (Campus @ BITS-Pilani Jan-May 2018)

Page 2: Introduction

Introduction

ML depends upon pattern recognition, which corresponds to finding regularities in the data.

- There should be a pattern.
- It is fine if we are unable to describe the pattern mathematically.
- Sufficient examples (data) are required.

Consider e-mail filtering into SPAM / Not-SPAM. The assumption is that there are some words whose frequencies are correlated with this label (a small sketch appears at the end of this slide).

Netflix Prize (2009): an open competition to predict user ratings for films. A prize of USD 1 million was given to BellKor's Pragmatic Chaos, the team that improved on the previous prediction accuracy by ∼10.06% (using matrix factorization).
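
As a toy illustration of the word-frequency assumption behind spam filtering, here is a minimal sketch (not from the slides): the word list, weights, and threshold are made-up values for illustration only.

```python
# Minimal sketch (assumed values): score an e-mail by the frequency of a few
# hand-picked words assumed to be correlated with SPAM.
SPAM_WORDS = {"free": 1.0, "winner": 1.5, "credit": 0.8, "offer": 0.7}  # assumed weights
THRESHOLD = 2.0  # assumed decision threshold

def spam_score(message: str) -> float:
    words = (w.strip("!?.,") for w in message.lower().split())
    return sum(SPAM_WORDS.get(w, 0.0) for w in words)

def classify(message: str) -> str:
    return "SPAM" if spam_score(message) > THRESHOLD else "Not-SPAM"

print(classify("You are a winner! Claim your free credit offer now"))  # SPAM
print(classify("Meeting moved to 2 PM tomorrow"))                      # Not-SPAM
```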


Page 6: Building Blocks

Building Blocks

Input: $x$
Output: $y$
Training data: $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$

$x^{(i)}$ could be multivariate, say $x^{(i)} = (x^{(i)}_1, x^{(i)}_2, \dots, x^{(i)}_n)$.

Target function (the true function): $f : x \to y$

Hypothesis: $h : x \to y$

Accuracy: agreement between $f$ and $h$

The issue: the true function $f$ is not known.
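
A small sketch of these building blocks, assuming made-up training pairs and a hand-picked hypothesis (not from the slides); since $f$ is unknown, agreement can only be measured on the observed labels.

```python
# Minimal sketch (toy data assumed): training pairs (x, y), a candidate
# hypothesis h, and accuracy measured as agreement with the labels.
training_data = [((1.0, 2.0), 1), ((2.0, 0.5), 1), ((-1.0, -1.0), -1), ((0.0, -2.0), -1)]

def h(x):
    # A hand-picked hypothesis: label by the sign of the sum of the features.
    return 1 if sum(x) > 0 else -1

# f itself is unknown; we can only check agreement on the observed (x, y) pairs.
accuracy = sum(h(x) == y for x, y in training_data) / len(training_data)
print(f"accuracy = {accuracy:.2f}")  # 1.00 on this toy data
```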


Page 7: The Flow of ML

The Flow of ML

The unknown true function $f$ generates observations, which are collected as training data. The machine learning algorithm, searching a hypothesis space $H$ (with bias), outputs a hypothesis $h$.


Page 8: A Toy Model

A Toy Model

The problem: credit approval.
Input: $x = (x_1, x_2, \dots, x_n)$
Let $x_1$ = accountBal, $x_2$ = Salary, $x_3$ = age, ...
What weights should we give? Say $w_1 = 0.6$, $w_2 = 0.3$, $w_3 = -0.1$, ...

The model:

$$\begin{cases}\text{APPROVE} & \text{if } \sum_{i=1}^{n} w_i x_i > \text{Threshold}\\[2pt] \text{DENY/REJECT} & \text{otherwise}\end{cases}$$

Simplified:

$$h(x) = \operatorname{sign}\!\left(\sum_{i=1}^{n} w_i x_i - \text{Threshold}\right)$$

Add an extra term $x_0 = 1$ with $w_0 = -\text{Threshold}$; then

$$h(x) = \operatorname{sign}\!\left(\sum_{i=0}^{n} w_i x_i\right)$$
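
A minimal sketch of this linear threshold model, assuming the example weights from the slide (0.6, 0.3, -0.1) and a made-up threshold; the applicant feature values are illustrative only.

```python
# Minimal sketch (assumed numbers): the credit-approval model
# h(x) = sign(sum_i w_i * x_i), with x_0 = 1 absorbing the threshold as w_0.
THRESHOLD = 0.5                       # assumed
w = [-THRESHOLD, 0.6, 0.3, -0.1]      # w_0 = -Threshold, then w_1..w_3 from the slide

def h(x):
    # x = (x_1, ..., x_n); prepend x_0 = 1 so the threshold becomes w_0.
    x = [1.0] + list(x)
    s = sum(wi * xi for wi, xi in zip(w, x))
    return +1 if s > 0 else -1        # +1 = APPROVE, -1 = DENY/REJECT

# Example applicant: (accountBal, Salary, age), scaled to comparable ranges (assumed).
print(h((0.8, 0.9, 0.3)))   # -0.5 + 0.48 + 0.27 - 0.03 = 0.22 > 0  -> +1 (APPROVE)
print(h((0.1, 0.2, 0.9)))   # -0.5 + 0.06 + 0.06 - 0.09 = -0.47 < 0 -> -1 (DENY)
```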


Page 13: A Toy Model (Contd.)

A Toy Model (Contd.)

Can you recognize $h(x) = \operatorname{sign}\!\left(\sum_{i=0}^{n} w_i x_i\right)$?

It is a linear equation (a line in two dimensions) or, in general, a hyperplane.

The vector $(w_1, w_2, \dots, w_n)$ is normal to the plane.
What changes this plane? The $w_i$'s.
Learning: use a misclassified example $(x, y)$ to update $w_i \leftarrow w_i + y\,x_i$.
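
Below is a small sketch of that update rule on toy 2-D data: the classic perceptron loop, under the assumption of made-up, linearly separable points; it is not code from the lecture.

```python
# Minimal sketch (toy data assumed): perceptron-style learning with the update
# w_i <- w_i + y * x_i applied whenever an example is misclassified.
data = [((2.0, 1.0), +1), ((1.0, 3.0), +1), ((-1.0, -2.0), -1), ((-2.0, 1.0), -1)]

w = [0.0, 0.0, 0.0]                  # w_0 (bias via x_0 = 1), w_1, w_2

def predict(w, x):
    s = w[0] + w[1] * x[0] + w[2] * x[1]
    return +1 if s > 0 else -1

for _ in range(20):                  # a few passes suffice on separable toy data
    mistakes = 0
    for x, y in data:
        if predict(w, x) != y:       # misclassified: move the plane toward the example
            w[0] += y * 1.0          # x_0 = 1
            w[1] += y * x[0]
            w[2] += y * x[1]
            mistakes += 1
    if mistakes == 0:
        break

print(w, [predict(w, x) for x, _ in data])  # all four toy points classified correctly
```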


Page 15: Loss Function

Loss Function

Performance is the closeness of the hypothesis function to the target function. For example:

- Classification (0/1 loss):
  $$\text{loss}(y, h(x)) = \begin{cases} 1 & \text{if } h(x) \neq y \\ 0 & \text{otherwise} \end{cases}$$

- Regression (squared loss):
  $$\text{loss}(y, h(x)) = (h(x) - y)^2$$
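
A small sketch of these two loss functions in code (illustrative values only):

```python
# Minimal sketch: 0/1 loss for classification and squared loss for regression.
def zero_one_loss(y, y_hat):
    """1 if the prediction disagrees with the label, 0 otherwise."""
    return 1 if y_hat != y else 0

def squared_loss(y, y_hat):
    """(h(x) - y)^2; already 0 when the prediction is exact."""
    return (y_hat - y) ** 2

print(zero_one_loss(+1, -1), zero_one_loss(+1, +1))    # 1 0
print(squared_loss(2.0, 2.5), squared_loss(2.0, 2.0))  # 0.25 0.0
```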


Page 16: Performance

Performance

Error rates include the chance of accepting an intruder (False Acceptance Rate, FAR) and that of rejecting a genuine individual (False Rejection Rate, FRR).

The Equal Error Rate (EER) corresponds to the point where FAR and FRR are equal.
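
A sketch of how FAR, FRR, and an approximate EER point can be computed from match scores; the score lists below are made-up, and the EER is taken at the threshold where the two rates are closest.

```python
# Minimal sketch (made-up scores): sweep a threshold over similarity scores,
# compute FAR (impostors accepted) and FRR (genuine users rejected), and
# report the threshold where the two rates are closest (approximate EER).
genuine  = [0.9, 0.8, 0.75, 0.7, 0.6, 0.55]   # scores for genuine attempts (assumed)
impostor = [0.5, 0.45, 0.4, 0.35, 0.3, 0.65]  # scores for impostor attempts (assumed)

def far_frr(threshold):
    far = sum(s >= threshold for s in impostor) / len(impostor)  # intruders accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine rejected
    return far, frr

thresholds = [t / 100 for t in range(0, 101)]
eer_t = min(thresholds, key=lambda t: abs(far_frr(t)[0] - far_frr(t)[1]))
far, frr = far_frr(eer_t)
print(f"threshold={eer_t:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```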


Page 17: ROC Curve

Receiver Operating Characteristic (ROC) Curve


Page 18: Thank You

Thank You!

Thank you very much for your attention!

Queries?
