TRANSCRIPT
BITS F464: MACHINE LEARNING
Lecture-01: Introduction
Dr. Kamlesh Tiwari, Assistant Professor
Department of Computer Science and Information Systems Engineering, BITS Pilani, Rajasthan-333031, INDIA
Jan 10, 2018 (Campus @ BITS-Pilani Jan-May 2018)
Introduction

ML depends upon Pattern Recognition, which corresponds to finding regularities in the data.
- There should be a pattern.
- No issue if we are unable to describe it mathematically.
- Sufficient examples (data) are required.

Consider e-mail filtering (SPAM / Not-SPAM). The assumption is that there are some words whose frequency is correlated with this classification.

Netflix Prize (2009): an open competition to predict user ratings for films. A prize of USD 1 million was given to the BellKor's Pragmatic Chaos team, which improved on the previous prediction by ∼10.06% (used matrix factorization).
Machine Learning (BITS F464) M T F (2-3PM) 6151@BITS-Pilani Lecture-02 (Jan 10, 2018)
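The matrix-factorization idea behind the winning approach can be sketched as follows. This is a minimal illustration, not the prize-winning system: the toy rating matrix, the rank k, the learning rate, and the regularization constant are all assumptions.

```python
import numpy as np

# Toy user x film rating matrix; 0 marks an unobserved rating (assumed data).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

rng = np.random.default_rng(0)
k = 2                                             # latent-factor rank (assumed)
U = rng.normal(scale=0.1, size=(R.shape[0], k))   # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))   # film factors

lr, reg = 0.01, 0.02                              # step size, regularization (assumed)
observed = np.argwhere(R > 0)
for _ in range(2000):
    for i, j in observed:
        err = R[i, j] - U[i] @ V[j]               # error on one observed rating
        U[i] += lr * (err * V[j] - reg * U[i])    # SGD step on user factors
        V[j] += lr * (err * U[i] - reg * V[j])    # SGD step on film factors

pred = U @ V.T   # predicted ratings, including the unobserved cells
```

The unobserved cells of `pred` are the model's rating predictions; the low-rank structure is what lets observed ratings inform them.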
Building Blocks

Input: x
Output: y
Training data: (x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))
x^(i) could be multivariate, say x^(i) = (x^(i)_1, x^(i)_2, ..., x^(i)_n)
Target function (the true function): f : x → y
Hypothesis: h : x → y
Accuracy: agreement between f and h
The issue is that the true function is not known.
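These objects can be sketched in code. The particular f, h, and data distribution below are invented for illustration; in practice f is exactly what we do not have.

```python
import random

# A hidden "true" function f: x -> y (unknown in practice; assumed here).
def f(x):
    return 1 if 2 * x[0] + x[1] > 3 else -1

# Training data: m example pairs (x^(i), y^(i)) labelled by f.
random.seed(0)
data = []
for _ in range(20):
    x = (random.uniform(0, 3), random.uniform(0, 3))
    data.append((x, f(x)))

# A candidate hypothesis h: x -> y.
def h(x):
    return 1 if x[0] + x[1] > 2 else -1

# Accuracy: fraction of examples on which h agrees with f.
accuracy = sum(h(x) == y for x, y in data) / len(data)
```

The learner only ever sees `data`; it must judge h by its agreement with the labels, never with f directly.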
The Flow of ML

True Function (Unknown) f → Observations → Training Data → Machine Learning (searching a Hypothesis Space H, with bias) → Hypothesis h
A Toy Model

The Problem: credit approval.
Input: x = (x_1, x_2, ..., x_n)
Let x_1 = accountBal, x_2 = Salary, x_3 = age, ...
What weights should we give? Say w_1 = 0.6, w_2 = 0.3, w_3 = -0.1, ...

The Model: if

  ∑_{i=1}^{n} w_i x_i > Threshold

then APPROVE, otherwise DENY/REJECT.

Simplified:

  h(x) = sign(∑_{i=1}^{n} w_i x_i − Threshold)

Add an extra term x_0 (fixed at 1, with w_0 = −Threshold); then

  h(x) = sign(∑_{i=0}^{n} w_i x_i)
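The bias-absorption trick can be sketched directly. The threshold value and the sign convention at zero are assumptions; the weights are the illustrative ones from the slide.

```python
# Linear threshold model for credit approval, with the threshold
# absorbed as w_0 by prepending a constant feature x_0 = 1.
def sign(v):
    return 1 if v >= 0 else -1    # convention at 0 is an assumption

threshold = 1.0                   # assumed value for illustration
w = [-threshold, 0.6, 0.3, -0.1]  # w_0 = -Threshold, then w_1, w_2, w_3

def h(x):
    """x = (accountBal, Salary, age); returns +1 APPROVE, -1 DENY."""
    xs = [1.0] + list(x)          # prepend x_0 = 1
    return sign(sum(wi * xi for wi, xi in zip(w, xs)))
```

With x_0 fixed at 1, the decision sign(∑_{i=0}^{n} w_i x_i) is identical to comparing ∑_{i=1}^{n} w_i x_i against the threshold, but the model is now a single dot product.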
A Toy Model (Contd.)

Can you recognize h(x) = sign(∑_{i=0}^{n} w_i x_i)? It is a linear equation: a line in two dimensions, a hyperplane in general.
The vector (w_1, w_2, ..., w_n) is normal to the plane.
What changes this plane? The w_i's.
Learning: use a misclassified example (x, y) to update w_i ← w_i + y x_i.
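This update rule is the perceptron learning algorithm. A minimal sketch (the toy dataset, chosen to be linearly separable so the loop terminates, is an assumption):

```python
# Perceptron: repeatedly pick a misclassified example (x, y) and
# update w <- w + y * x.  Each x includes the constant x_0 = 1 component.
def sign(v):
    return 1 if v >= 0 else -1

# Toy linearly separable data: (x_0=1, x_1, x_2) with labels +/-1 (assumed).
data = [((1.0,  2.0,  3.0),  1),
        ((1.0,  3.0,  2.0),  1),
        ((1.0, -1.0, -2.0), -1),
        ((1.0, -2.0, -1.0), -1)]

w = [0.0, 0.0, 0.0]
changed = True
while changed:                        # repeat until nothing is misclassified
    changed = False
    for x, y in data:
        if sign(sum(wi * xi for wi, xi in zip(w, x))) != y:
            w = [wi + y * xi for wi, xi in zip(w, x)]   # w <- w + y x
            changed = True
```

Geometrically, each update tilts the hyperplane toward classifying the offending example correctly; on separable data the loop is guaranteed to terminate.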
Loss Function

Performance is the closeness of the hypothesis function to the target function. For example:

Classification (0-1 loss):

  loss(y, h(x)) = 1 if h(x) ≠ y, 0 otherwise

Regression (squared loss):

  loss(y, h(x)) = (h(x) − y)^2 if h(x) ≠ y, 0 otherwise
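Both losses are one-liners in code (the example arguments are arbitrary):

```python
# 0-1 loss for classification: 1 on a mistake, 0 otherwise.
def zero_one_loss(y, hx):
    return 1 if hx != y else 0

# Squared loss for regression: penalizes large errors quadratically.
# (It is already 0 when h(x) = y, so no separate case is needed.)
def squared_loss(y, hx):
    return (hx - y) ** 2
```

Example: `zero_one_loss(1, -1)` is 1 (a misclassification), and `squared_loss(3.0, 2.5)` is 0.25.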
Performance

Error rates include the chance of accepting an intruder (the False Acceptance Rate, FAR) and that of rejecting a genuine individual (the False Rejection Rate, FRR).
The Equal Error Rate (EER) corresponds to the point where FAR and FRR are equal.
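The EER can be located by sweeping a decision threshold over match scores and taking the threshold where FAR and FRR come closest. A sketch, where the two score lists are invented for illustration:

```python
# Sweep a threshold t over similarity scores; at each t compute
# FAR (fraction of intruders accepted) and FRR (fraction of genuine
# users rejected), and keep the t where the two rates are closest.
genuine  = [0.9, 0.8, 0.7, 0.45, 0.55]   # genuine-attempt scores (assumed)
impostor = [0.5, 0.6, 0.4, 0.3, 0.2]     # intruder-attempt scores (assumed)

best = None
for t in [s / 100 for s in range(101)]:
    far = sum(s >= t for s in impostor) / len(impostor)  # intruder accepted
    frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
    if best is None or abs(far - frr) < abs(best[1] - best[2]):
        best = (t, far, frr)

t_eer, far_eer, frr_eer = best
eer = (far_eer + frr_eer) / 2   # report the midpoint at the crossover
```

Raising the threshold trades FAR for FRR; the EER summarizes the whole trade-off in one number, which is why it is a common single-figure benchmark.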
Receiver Operating Characteristic (ROC) Curve
Thank You!
Thank you very much for your attention!
Queries?