CSC2515 Tutorial 3, Jan 27 2015, Presented by Ali Punjani


Page 1:

CSC2515 Tutorial 3

Jan 27 2015

Presented by Ali Punjani

Page 2:

Outline

• Multivariate Linear Regression, Demo
• Cross-Validation, Review
• k-NN Classification, Demo

Page 3:

Multivariate Linear Regression

• We want to predict an output, such as the median house price, from multi-dimensional observations

• Each house is a data point n, with observations indexed by j:

x^{(n)} = (x_1^{(n)}, \ldots, x_d^{(n)})

• A simple predictor is the analogue of a linear classifier, producing a real-valued y for input x with parameters w (assuming x_0 = 1):

y = w_0 + \sum_{j=1}^{d} w_j x_j = w^T x
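A minimal MATLAB sketch of fitting such a predictor by least squares, on synthetic data with illustrative variable names (not the tutorial's actual demo code):

    % Least-squares fit of y = w'*x on synthetic data.
    N = 100; d = 3;
    X = [ones(N,1), randn(N,d)];    % design matrix; first column is x0 = 1
    w_true = [2; 1; -0.5; 3];       % hypothetical "true" parameters
    y = X*w_true + 0.1*randn(N,1);  % noisy real-valued targets

    w = X \ y;                      % minimizes ||X*w - y||^2
    y_hat = X*w;                    % predictions for the training inputs

The backslash operator solves the least-squares problem directly, which is the usual way to fit a linear model in MATLAB.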

Page 4:

Multivariate Data

• Multiple measurements (sensors)
• d inputs/features/attributes
• N instances/observations/examples

X = \begin{bmatrix} X_1^1 & X_2^1 & \cdots & X_d^1 \\ X_1^2 & X_2^2 & \cdots & X_d^2 \\ \vdots & \vdots & & \vdots \\ X_1^N & X_2^N & \cdots & X_d^N \end{bmatrix}

(each row is one instance, each column one feature)

Page 5:

Multivariate Parameters

Mean: E[x] = \mu = [\mu_1, \ldots, \mu_d]^T

Covariance: \Sigma \equiv \mathrm{Cov}(X) = E\left[(X - \mu)(X - \mu)^T\right] = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1d} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{d1} & \sigma_{d2} & \cdots & \sigma_d^2 \end{bmatrix}, \quad \sigma_{ij} \equiv \mathrm{Cov}(X_i, X_j)

Correlation: \mathrm{Corr}(X_i, X_j) \equiv \rho_{ij} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}
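These sample estimates are one-liners in MATLAB; a sketch on synthetic data (the mixing matrix is just an arbitrary way to induce correlations):

    % Sample mean, covariance, and correlation of an N-by-d data matrix.
    N = 500; d = 3;
    X = randn(N,d) * [1 0 0; 0.5 1 0; 0 0.2 2];  % correlated synthetic data

    mu    = mean(X)';     % d-by-1 sample mean
    Sigma = cov(X);       % d-by-d sample covariance, Sigma(i,j) = sigma_ij
    rho   = corrcoef(X);  % rho(i,j) = sigma_ij / (sigma_i * sigma_j)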

Page 6:

Multivariate Normal Distribution

x \sim N_d(\mu, \Sigma)

p(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right]

• Mahalanobis distance: (x − μ)^T Σ^{−1} (x − μ) measures the distance from x to μ in terms of Σ (it normalizes for differences in variances and for correlations)
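The density can be evaluated directly from this formula; a small MATLAB sketch with made-up parameters (the Statistics Toolbox function mvnpdf computes the same value):

    % Multivariate normal density at a point x.
    mu    = [0; 0];
    Sigma = [1 0.8; 0.8 2];
    x     = [1; 1];

    dx    = x - mu;
    maha2 = dx' * (Sigma \ dx);     % squared Mahalanobis distance
    d     = numel(mu);
    p     = exp(-0.5*maha2) / ((2*pi)^(d/2) * sqrt(det(Sigma)));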

Page 7:

Bivariate Normal

[Figure slide: bivariate normal density plots]

Page 8:

[Figure slide] Lecture notes for E. Alpaydın, 2010, Introduction to Machine Learning 2e, © The MIT Press (V1.0)

Page 9:

Independent Inputs: Naive Bayes

• If the x_i are independent, the off-diagonals of Σ are 0, and the Mahalanobis distance reduces to a weighted (by 1/σ_i) Euclidean distance:

p(x) = \prod_{i=1}^{d} p_i(x_i) = \frac{1}{(2\pi)^{d/2} \prod_{i=1}^{d} \sigma_i} \exp\left[-\frac{1}{2} \sum_{i=1}^{d} \left(\frac{x_i - \mu_i}{\sigma_i}\right)^2\right]

• If the variances are also equal, it reduces to the Euclidean distance
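Under independence the joint density is just the product of one-dimensional normals, which a few lines of MATLAB can confirm (illustrative parameter values):

    % Product of 1-D normals == multivariate normal with diagonal covariance.
    mu    = [0 1 -1];
    sigma = [1 0.5 2];
    x     = [0.3 0.8 -2];

    p_prod = prod( exp(-0.5*((x - mu)./sigma).^2) ./ (sqrt(2*pi)*sigma) );
    p_diag = exp(-0.5*sum(((x - mu)./sigma).^2)) ...
             / ((2*pi)^(numel(x)/2) * prod(sigma));
    % p_prod and p_diag are equal (up to floating-point rounding)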

Page 10:

Parametric Classification

• If p(x | C_i) ~ N(μ_i, Σ_i):

p(x \mid C_i) = \frac{1}{(2\pi)^{d/2} |\Sigma_i|^{1/2}} \exp\left[-\frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right]

• Discriminant functions:

g_i(x) = \log p(x \mid C_i) + \log P(C_i) = -\frac{d}{2}\log 2\pi - \frac{1}{2}\log|\Sigma_i| - \frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) + \log P(C_i)
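A sketch of this discriminant as a small MATLAB function (say, saved as discriminant.m; the name and interface are illustrative):

    function g = discriminant(x, mu, Sigma, prior)
    % Gaussian log-discriminant g_i(x) = log p(x|C_i) + log P(C_i),
    % with x and mu as d-by-1 column vectors.
    d  = numel(mu);
    dx = x - mu;
    g  = -0.5*d*log(2*pi) - 0.5*log(det(Sigma)) ...
         - 0.5*(dx' * (Sigma \ dx)) + log(prior);
    end

A new point is then assigned to the class i whose g_i(x) is largest.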

Page 11:

MATLAB Demo

• Multivariate Linear Regression

Page 12:

Cross-Validation

• Why validation?
   • performance estimation
   • model selection (e.g. hyperparameters)

• Hold-out validation
   • split the dataset into a training set and a test set
   • drawbacks: wastes part of the dataset, and the estimate of the error rate may be misleading

• Cross-validation

Page 13:

Cross-Validation

• random subsampling
• k-fold cross-validation
• leave-one-out cross-validation (k = N)


Page 16:

Cross-Validation

• How many folds do we need?
   • with larger k, the error estimate tends to be more accurate
   • but the computation takes longer

• In practice: the larger the dataset, the smaller k can be
• A common choice for k-fold cross-validation is k = 10
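A minimal k-fold loop in MATLAB, reusing the linear-regression fit from the earlier sketch as the model being evaluated (X, y, and squared error are stand-ins for whatever model and loss are of interest):

    % k-fold cross-validation of a linear-regression fit.
    k    = 10;
    N    = size(X,1);
    fold = mod(randperm(N), k) + 1;   % randomly assign each example to a fold
    err  = zeros(k,1);
    for i = 1:k
        test   = (fold == i);
        w      = X(~test,:) \ y(~test);              % train on the other folds
        err(i) = mean((X(test,:)*w - y(test)).^2);   % error on held-out fold
    end
    cv_error = mean(err);             % cross-validated error estimate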

Page 17:

Some Issues with Cross-Validation

• Intensive use of cross-validation can overfit if you explore too many models, by tuning hyperparameters to predict the whole training set well
   • hold out an additional test set before doing any model selection, and check that the best model also performs well on it

• Time-consuming (at least if done naively)
   • there are efficient tricks that can save work over brute force

Page 18:

k-Nearest Neighbors

• k-NN is a simple algorithm that stores all available training examples and predicts the value/class of an unseen instance based on a similarity measure
• k = 1
   • predict the same value/class as the nearest instance in the training set
• k > 1
   • find the k closest training examples
   • predict class: majority vote
   • predict value: average weighted by inverse distance

• Memory-based; no explicit training or model

Page 19:

k-NN Classification

• similarity measure: Euclidean distance, etc.
• assumption behind Euclidean distance: uncorrelated inputs with equal variances

• predict class: majority vote
   • k preferably odd, to avoid ties in binary classification
• choice of k
   • smaller k: higher variance (less stable)
   • larger k: higher bias (less precise)
   • cross-validation can help

• MATLAB demo
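In the spirit of that demo, a compact sketch of such a classifier as a MATLAB function (hypothetical name and interface; xquery is a single 1-by-d query point):

    function yhat = knn_classify(Xtrain, ytrain, xquery, k)
    % Predict the class of xquery by majority vote among its
    % k nearest training examples under Euclidean distance.
    d2 = sum(bsxfun(@minus, Xtrain, xquery).^2, 2);  % squared distances
    [~, order] = sort(d2);                           % nearest first
    yhat = mode(ytrain(order(1:k)));                 % majority vote
    end

Note that mode breaks ties toward the smallest label, which is one reason to prefer odd k for binary classification.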