TRANSCRIPT
BITS F464: MACHINE LEARNING
Lecture-01: Introduction
Dr. Kamlesh Tiwari, Assistant Professor
Department of Computer Science and Information Systems Engineering, BITS Pilani, Rajasthan-333031, INDIA
Jan 10, 2018 (Campus @ BITS-Pilani Jan-May 2018)
Introduction

ML depends upon Pattern Recognition, which corresponds to finding regularities in the data.
- There should be a pattern.
- No issue if we are unable to describe it mathematically.
- Sufficient examples (data) are required.

Consider e-mail filtering (SPAM / Not-SPAM). The assumption is that there are some words whose frequency is correlated with this classification.

Netflix Prize (2009): an open competition to predict user ratings for films. A prize of USD 1 million was given to the BellKor's Pragmatic Chaos team, which improved on the previous prediction by ∼10.06% (used matrix factorization).
Machine Learning (BITS F464) M T F (2-3PM) 6151@BITS-Pilani Lecture-02 (Jan 10, 2018)
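The matrix-factorization idea behind the winning approach can be sketched as follows. This is a minimal illustration, not the prize-winning system: the toy rating matrix, the rank k, the learning rate, and the regularization constant are all assumptions.

```python
import numpy as np

# Toy user x film rating matrix; 0 marks an unobserved rating (assumed data).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

rng = np.random.default_rng(0)
k = 2                                             # latent-factor rank (assumed)
U = rng.normal(scale=0.1, size=(R.shape[0], k))   # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))   # film factors

lr, reg = 0.01, 0.02                              # step size, regularization (assumed)
observed = np.argwhere(R > 0)
for _ in range(2000):
    for i, j in observed:
        err = R[i, j] - U[i] @ V[j]               # error on one observed rating
        U[i] += lr * (err * V[j] - reg * U[i])    # SGD step on user factors
        V[j] += lr * (err * U[i] - reg * V[j])    # SGD step on film factors

pred = U @ V.T   # predicted ratings, including the unobserved cells
```

The unobserved cells of `pred` are the model's rating predictions; the low-rank structure is what lets observed ratings inform them.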
Building Blocks

Input: x
Output: y
Training data: (x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))
x^(i) could be multivariate, say x^(i) = (x^(i)_1, x^(i)_2, ..., x^(i)_n)
Target function (the true function): f : x → y
Hypothesis: h : x → y
Accuracy: agreement between f and h
The issue is that the true function is not known.
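These objects can be sketched in code. The particular f, h, and data distribution below are invented for illustration; in practice f is exactly what we do not have.

```python
import random

# A hidden "true" function f: x -> y (unknown in practice; assumed here).
def f(x):
    return 1 if 2 * x[0] + x[1] > 3 else -1

# Training data: m example pairs (x^(i), y^(i)) labelled by f.
random.seed(0)
data = []
for _ in range(20):
    x = (random.uniform(0, 3), random.uniform(0, 3))
    data.append((x, f(x)))

# A candidate hypothesis h: x -> y.
def h(x):
    return 1 if x[0] + x[1] > 2 else -1

# Accuracy: fraction of examples on which h agrees with f.
accuracy = sum(h(x) == y for x, y in data) / len(data)
```

The learner only ever sees `data`; it must judge h by its agreement with the labels, never with f directly.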
The Flow of ML

True Function (Unknown) f → Observations → Training Data → Machine Learning (searching a Hypothesis Space H, with bias) → Hypothesis h
A Toy Model

The Problem: credit approval.
Input: x = (x_1, x_2, ..., x_n)
Let x_1 = accountBal, x_2 = Salary, x_3 = age, ...
What weights should we give? Say w_1 = 0.6, w_2 = 0.3, w_3 = -0.1, ...

The Model: if

  ∑_{i=1}^{n} w_i x_i > Threshold

then APPROVE, otherwise DENY/REJECT.

Simplified:

  h(x) = sign(∑_{i=1}^{n} w_i x_i − Threshold)

Add an extra term x_0 (fixed at 1, with w_0 = −Threshold); then

  h(x) = sign(∑_{i=0}^{n} w_i x_i)
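The bias-absorption trick can be sketched directly. The threshold value and the sign convention at zero are assumptions; the weights are the illustrative ones from the slide.

```python
# Linear threshold model for credit approval, with the threshold
# absorbed as w_0 by prepending a constant feature x_0 = 1.
def sign(v):
    return 1 if v >= 0 else -1    # convention at 0 is an assumption

threshold = 1.0                   # assumed value for illustration
w = [-threshold, 0.6, 0.3, -0.1]  # w_0 = -Threshold, then w_1, w_2, w_3

def h(x):
    """x = (accountBal, Salary, age); returns +1 APPROVE, -1 DENY."""
    xs = [1.0] + list(x)          # prepend x_0 = 1
    return sign(sum(wi * xi for wi, xi in zip(w, xs)))
```

With x_0 fixed at 1, the decision sign(∑_{i=0}^{n} w_i x_i) is identical to comparing ∑_{i=1}^{n} w_i x_i against the threshold, but the model is now a single dot product.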
A Toy Model (Contd.)

Can you recognize h(x) = sign(∑_{i=0}^{n} w_i x_i)? It is a linear equation: a line in two dimensions, a hyperplane in general.
The vector (w_1, w_2, ..., w_n) is normal to the plane.
What changes this plane? The w_i's.
Learning: use a misclassified example (x, y) to update w_i ← w_i + y x_i.
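This update rule is the perceptron learning algorithm. A minimal sketch (the toy dataset, chosen to be linearly separable so the loop terminates, is an assumption):

```python
# Perceptron: repeatedly pick a misclassified example (x, y) and
# update w <- w + y * x.  Each x includes the constant x_0 = 1 component.
def sign(v):
    return 1 if v >= 0 else -1

# Toy linearly separable data: (x_0=1, x_1, x_2) with labels +/-1 (assumed).
data = [((1.0,  2.0,  3.0),  1),
        ((1.0,  3.0,  2.0),  1),
        ((1.0, -1.0, -2.0), -1),
        ((1.0, -2.0, -1.0), -1)]

w = [0.0, 0.0, 0.0]
changed = True
while changed:                        # repeat until nothing is misclassified
    changed = False
    for x, y in data:
        if sign(sum(wi * xi for wi, xi in zip(w, x))) != y:
            w = [wi + y * xi for wi, xi in zip(w, x)]   # w <- w + y x
            changed = True
```

Geometrically, each update tilts the hyperplane toward classifying the offending example correctly; on separable data the loop is guaranteed to terminate.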
Loss Function

Performance is the closeness of the hypothesis function to the target function. For example:

Classification (0-1 loss):

  loss(y, h(x)) = 1 if h(x) ≠ y, 0 otherwise

Regression (squared loss):

  loss(y, h(x)) = (h(x) − y)^2 if h(x) ≠ y, 0 otherwise
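Both losses are one-liners in code (the example arguments are arbitrary):

```python
# 0-1 loss for classification: 1 on a mistake, 0 otherwise.
def zero_one_loss(y, hx):
    return 1 if hx != y else 0

# Squared loss for regression: penalizes large errors quadratically.
# (It is already 0 when h(x) = y, so no separate case is needed.)
def squared_loss(y, hx):
    return (hx - y) ** 2
```

Example: `zero_one_loss(1, -1)` is 1 (a misclassification), and `squared_loss(3.0, 2.5)` is 0.25.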
Performance

Error rates include the chance of accepting an intruder (the False Acceptance Rate, FAR) and that of rejecting a genuine individual (the False Rejection Rate, FRR).
The Equal Error Rate (EER) corresponds to the point where FAR and FRR are equal.
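The EER can be located by sweeping a decision threshold over match scores and taking the threshold where FAR and FRR come closest. A sketch, where the two score lists are invented for illustration:

```python
# Sweep a threshold t over similarity scores; at each t compute
# FAR (fraction of intruders accepted) and FRR (fraction of genuine
# users rejected), and keep the t where the two rates are closest.
genuine  = [0.9, 0.8, 0.7, 0.45, 0.55]   # genuine-attempt scores (assumed)
impostor = [0.5, 0.6, 0.4, 0.3, 0.2]     # intruder-attempt scores (assumed)

best = None
for t in [s / 100 for s in range(101)]:
    far = sum(s >= t for s in impostor) / len(impostor)  # intruder accepted
    frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
    if best is None or abs(far - frr) < abs(best[1] - best[2]):
        best = (t, far, frr)

t_eer, far_eer, frr_eer = best
eer = (far_eer + frr_eer) / 2   # report the midpoint at the crossover
```

Raising the threshold trades FAR for FRR; the EER summarizes the whole trade-off in one number, which is why it is a common single-figure benchmark.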
Receiver Operating Characteristic (ROC) Curve
Thank You!
Thank you very much for your attention!
Queries?