machine learning - piazza

Machine Learning Slides: James Hays, Isabelle Guyon, Erik Sudderth, Mark Johnson, Derek Hoiem Photo: CMU Machine Learning Department protests G20


Post on 12-Jun-2022


Page 1: Machine Learning - Piazza

Machine Learning

Slides: James Hays, Isabelle Guyon, Erik Sudderth, Mark Johnson, Derek Hoiem
Photo: CMU Machine Learning Department protests G20

Page 2: Machine Learning - Piazza
Page 3: Machine Learning - Piazza

Clustering: group together similar points and represent them with a single token

Key Challenges:
1) What makes two points/images/patches similar?
2) How do we compute an overall grouping from pairwise similarities?

Slide: Derek Hoiem

Page 4: Machine Learning - Piazza

How do we cluster?

•  K-means
   –  Iteratively re-assign points to the nearest cluster center

•  Agglomerative clustering
   –  Start with each point as its own cluster and iteratively merge the closest clusters

•  Mean-shift clustering
   –  Estimate modes of pdf

•  Spectral clustering
   –  Split the nodes in a graph based on assigned links with similarity weights

Page 5: Machine Learning - Piazza

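Clustering for Summarization

Goal: cluster to minimize variance in data given clusters
–  Preserve information

$$\mathbf{c}^*, \boldsymbol{\delta}^* = \underset{\mathbf{c},\,\boldsymbol{\delta}}{\operatorname{argmin}} \; \frac{1}{N} \sum_{j}^{N} \sum_{i}^{K} \delta_{ij} \left( \mathbf{c}_i - \mathbf{x}_j \right)^2$$

$\delta_{ij}$: whether $\mathbf{x}_j$ is assigned to $\mathbf{c}_i$
$\mathbf{c}_i$: cluster center
$\mathbf{x}_j$: data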

Slide: Derek Hoiem
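As a rough illustration of the summarization objective on this slide, the sketch below evaluates the average squared distance between each point and its assigned center. The function name `kmeans_objective` and the toy data are assumptions made for this example; the assignment is stored as one center index per point rather than as a full indicator matrix.

```python
def kmeans_objective(points, centers, assignment):
    """Average squared distance from each point x_j to its assigned center c_i.

    assignment[j] is the index i of the center that point j belongs to,
    which plays the role of the 0/1 indicator delta_ij on the slide.
    """
    total = 0.0
    for x, i in zip(points, assignment):
        c = centers[i]
        total += sum((cd - xd) ** 2 for cd, xd in zip(c, x))
    return total / len(points)


# Hypothetical toy data: two pairs of points, one center per pair.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

good = kmeans_objective(points, centers, [0, 0, 1, 1])  # each point to its near center
bad = kmeans_objective(points, centers, [1, 1, 0, 0])   # each point to the far center
print(good, bad)
```

A lower value means the centers summarize the data with less lost variance, which is why the good assignment scores far below the swapped one.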

Page 6: Machine Learning - Piazza

K-means algorithm

Illustration: http://en.wikipedia.org/wiki/K-means_clustering

1. Randomly select K centers

2. Assign each point to nearest center

3. Compute new center (mean) for each cluster

Page 7: Machine Learning - Piazza

K-means algorithm

Illustration: http://en.wikipedia.org/wiki/K-means_clustering

1. Randomly select K centers

2. Assign each point to nearest center

3. Compute new center (mean) for each cluster

Back to 2
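The three steps and the "back to 2" loop can be sketched in plain Python as follows. This is a minimal sketch, not a tuned implementation: the names `kmeans`, `max_iters`, and `seed` are assumptions for this example, ties go to the lowest-index center, and an empty cluster simply keeps its previous center.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # 1. randomly select K centers
    assignment = None
    for _ in range(max_iters):
        # 2. assign each point to the nearest center
        new_assignment = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_assignment == assignment:  # nothing changed: converged
            break
        assignment = new_assignment
        # 3. compute the new center (mean) for each cluster
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = mean(members)
    return centers, assignment


# Hypothetical toy data: two well-separated groups along a line.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0),
          (100.0, 100.0), (101.0, 101.0), (102.0, 102.0)]
centers, assignment = kmeans(points, k=2)
print(sorted(centers), assignment)
```

In general the result depends on the random initial centers, so in practice the algorithm is often run several times and the solution with the lowest objective is kept.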

Page 8: Machine Learning - Piazza

Building Visual Dictionaries

1.  Sample patches from a database
    –  E.g., 128 dimensional SIFT vectors

2.  Cluster the patches
    –  Cluster centers are the dictionary

3.  Assign a codeword (number) to each new patch, according to the nearest cluster
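Step 3 amounts to a nearest-center lookup, and counting the resulting codewords gives a histogram for an image. The sketch below assumes the dictionary has already been learned; the helper names and the 2-dimensional toy descriptors are hypothetical stand-ins for real 128-dimensional SIFT vectors.

```python
def nearest_codeword(descriptor, dictionary):
    """Index of the cluster center (codeword) closest to the descriptor."""
    return min(
        range(len(dictionary)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, dictionary[i])),
    )


def codeword_histogram(descriptors, dictionary):
    """Count how many descriptors fall on each codeword."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_codeword(d, dictionary)] += 1
    return counts


# Hypothetical dictionary of three codewords and four new patches.
dictionary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
patches = [(0.1, -0.1), (0.9, 1.2), (4.0, 6.0), (5.1, 5.0)]

codes = [nearest_codeword(p, dictionary) for p in patches]
hist = codeword_histogram(patches, dictionary)
print(codes, hist)
```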

Page 9: Machine Learning - Piazza

Examples of learned codewords

Sivic et al. ICCV 2005 http://www.robots.ox.ac.uk/~vgg/publications/papers/sivic05b.pdf

Most likely codewords for 4 learned “topics”
EM with multinomial (problem 3) to get topics

Page 10: Machine Learning - Piazza

Agglomerative clustering

Page 11: Machine Learning - Piazza

Agglomerative clustering

Page 12: Machine Learning - Piazza

Agglomerative clustering

Page 13: Machine Learning - Piazza

Agglomerative clustering

Page 14: Machine Learning - Piazza

Agglomerative clustering

Page 15: Machine Learning - Piazza

Agglomerative clustering

How to define cluster similarity?
-  Average distance between points, maximum distance, minimum distance
-  Distance between means or medoids

How many clusters?
-  Clustering creates a dendrogram (a tree)
-  Threshold based on max number of clusters or based on distance between merges

[Figure: dendrogram; vertical axis: distance]
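The choices listed on this slide (how to measure cluster similarity, and whether to stop at a number of clusters or at a merge distance) can be made concrete with a small, unoptimized sketch. The function names, the linkage labels `"single"`, `"complete"`, `"average"`, and the parameters `num_clusters` and `max_distance` are assumptions for this example.

```python
import math


def linkage_distance(a, b, linkage):
    """Distance between clusters a and b (lists of points)."""
    pairwise = [math.dist(p, q) for p in a for q in b]
    if linkage == "single":      # minimum distance between points
        return min(pairwise)
    if linkage == "complete":    # maximum distance between points
        return max(pairwise)
    if linkage == "average":     # average distance between points
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerative(points, num_clusters=1, max_distance=math.inf, linkage="average"):
    clusters = [[p] for p in points]  # start with each point as its own cluster
    merges = []                       # merge distances, i.e. dendrogram heights
    while len(clusters) > num_clusters:
        d, i, j = min(
            (linkage_distance(clusters[i], clusters[j], linkage), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_distance:          # threshold on distance between merges
            break
        merges.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges


# Hypothetical toy data: two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]

by_count, merges = agglomerative(points, num_clusters=2, linkage="single")
by_threshold, _ = agglomerative(points, max_distance=2.0)
print(by_count, merges)
print(by_threshold)
```

Stopping at two clusters merges both pairs and leaves the distant point alone, while the distance threshold of 2.0 stops earlier and keeps three clusters.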

Page 16: Machine Learning - Piazza

Agglomerative clustering demo
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletH.html

Page 17: Machine Learning - Piazza

Conclusions: Agglomerative Clustering

Good
•  Simple to implement, widespread application
•  Clusters have adaptive shapes
•  Provides a hierarchy of clusters

Bad
•  May have imbalanced clusters
•  Still have to choose number of clusters or threshold
•  Need to use an “ultrametric” to get a meaningful hierarchy

Page 18: Machine Learning - Piazza

•  Versatile technique for clustering-based segmentation

D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.

Mean shift segmentation

Page 19: Machine Learning - Piazza

Mean shift algorithm
•  Try to find modes of this non-parametric density

Page 20: Machine Learning - Piazza

Kernel density estimation

Kernel density estimation function

Gaussian kernel
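The formulas for this slide did not survive extraction. As a hedged stand-in, the sketch below uses the standard one-dimensional kernel density estimate with a Gaussian kernel, f(x) = 1/(n h) Σ_i K((x − x_i)/h); the function names, the bandwidth value, and the sample data are assumptions for this example.

```python
import math


def gaussian_kernel(u):
    """Standard 1-D Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, bandwidth):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / bandwidth) for xi in samples) / (n * bandwidth)


# Hypothetical samples with two groups, around 1 and around 5.
samples = [1.0, 1.2, 0.8, 5.0, 5.3]
h = 0.5

near_mode = kde(1.0, samples, h)   # inside the denser group
in_valley = kde(3.0, samples, h)   # between the two groups
print(near_mode, in_valley)
```

The estimate is high near each group of samples and low between them; those local maxima are the modes that mean shift looks for.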

Page 21: Machine Learning - Piazza

Region of interest

Center of mass

Mean Shift vector

Slide by Y. Ukrainitz & B. Sarel

Mean shift

Page 22: Machine Learning - Piazza

Region of interest

Center of mass

Mean Shift vector

Slide by Y. Ukrainitz & B. Sarel

Mean shift

Page 23: Machine Learning - Piazza

Region of interest

Center of mass

Mean Shift vector

Slide by Y. Ukrainitz & B. Sarel

Mean shift

Page 24: Machine Learning - Piazza

Region of interest

Center of mass

Mean Shift vector

Mean shift

Slide by Y. Ukrainitz & B. Sarel

Page 25: Machine Learning - Piazza

Region of interest

Center of mass

Mean Shift vector

Slide by Y. Ukrainitz & B. Sarel

Mean shift

Page 26: Machine Learning - Piazza

Region of interest

Center of mass

Mean Shift vector

Slide by Y. Ukrainitz & B. Sarel

Mean shift

Page 27: Machine Learning - Piazza

Region of interest

Center of mass

Slide by Y. Ukrainitz & B. Sarel

Mean shift

Page 28: Machine Learning - Piazza

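Computing the Mean Shift

Simple Mean Shift procedure:
•  Compute mean shift vector
•  Translate the Kernel window by m(x)

$$\mathbf{m}(\mathbf{x}) = \left[ \frac{\sum_{i=1}^{n} \mathbf{x}_i \, g\!\left( \frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^2}{h^2} \right)}{\sum_{i=1}^{n} g\!\left( \frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^2}{h^2} \right)} - \mathbf{x} \right]$$

$$g(\mathbf{x}) = -k'(\mathbf{x})$$

Slide by Y. Ukrainitz & B. Sarel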
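A minimal sketch of the mean shift vector, assuming a Gaussian profile k(u) = exp(−u/2), so that g(u) = −k′(u) is proportional to exp(−u/2) and the constant cancels in the ratio. The function name and the toy points are hypothetical.

```python
import math


def mean_shift_vector(x, points, h):
    """Weighted mean of the points minus x, with weights g(||x - x_i||^2 / h^2)."""
    # Assumed Gaussian profile: g(u) is proportional to exp(-u / 2).
    weights = [
        math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, p)) / h ** 2)
        for p in points
    ]
    total = sum(weights)
    weighted_mean = [
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(len(x))
    ]
    return tuple(m - a for m, a in zip(weighted_mean, x))


# Hypothetical points placed symmetrically around (1, 1).
points = [(0.0, 1.0), (2.0, 1.0), (1.0, 0.0), (1.0, 2.0)]

at_center = mean_shift_vector((1.0, 1.0), points, h=1.0)
at_edge = mean_shift_vector((0.0, 1.0), points, h=1.0)
print(at_center, at_edge)
```

At the symmetric center the vector is (numerically) zero, so the window stays put; from the left-hand point it points in the positive x direction, toward the denser region.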

Page 29: Machine Learning - Piazza

Attraction basin

•  Attraction basin: the region for which all trajectories lead to the same mode

•  Cluster: all data points in the attraction basin of a mode

Slide by Y. Ukrainitz & B. Sarel

Page 30: Machine Learning - Piazza

Attraction basin

Page 31: Machine Learning - Piazza

Mean shift clustering
•  The mean shift algorithm seeks modes of the given set of points
1.  Choose kernel and bandwidth
2.  For each point:
    a)  Center a window on that point
    b)  Compute the mean of the data in the search window
    c)  Center the search window at the new mean location
    d)  Repeat (b, c) until convergence
3.  Assign points that lead to nearby modes to the same cluster
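The procedure can be sketched with a flat (uniform) window, where step (b) is a plain average of the points inside the window. The names `shift_to_mode`, `mean_shift_cluster`, and `merge_radius`, as well as the rule "modes closer than half the bandwidth count as the same mode", are assumptions for this example.

```python
import math


def shift_to_mode(start, points, bandwidth, tol=1e-6, max_iters=200):
    """Steps 2a-2d: move a flat window to the local mean until it stops moving."""
    x = start
    for _ in range(max_iters):
        window = [p for p in points if math.dist(p, x) <= bandwidth]
        new_x = tuple(sum(c) / len(window) for c in zip(*window))
        if math.dist(new_x, x) < tol:
            return new_x
        x = new_x
    return x


def mean_shift_cluster(points, bandwidth, merge_radius=None):
    if merge_radius is None:
        merge_radius = bandwidth / 2  # assumption: how close counts as "nearby"
    modes, labels = [], []
    for p in points:
        mode = shift_to_mode(p, points, bandwidth)
        # Step 3: points whose trajectories end at nearby modes share a cluster.
        for idx, m in enumerate(modes):
            if math.dist(mode, m) <= merge_radius:
                labels.append(idx)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return modes, labels


# Hypothetical toy data: two small groups of points.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
modes, labels = mean_shift_cluster(points, bandwidth=1.0)
print(modes, labels)
```

Unlike K-means, the number of clusters is not given up front; it falls out of the bandwidth choice.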

Page 32: Machine Learning - Piazza

Segmentation by Mean Shift

•  Compute features for each pixel (color, gradients, texture, etc.)
•  Set kernel size for features Kf and position Ks
•  Initialize windows at individual pixel locations
•  Perform mean shift for each window until convergence
•  Merge windows that are within width of Kf and Ks

Page 33: Machine Learning - Piazza

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Mean shift segmentation results

Page 34: Machine Learning - Piazza

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Page 35: Machine Learning - Piazza

Mean-shift: other issues

•  Speedups
   –  Binned estimation
   –  Fast search of neighbors
   –  Update each window in each iteration (faster convergence)

•  Other tricks
   –  Use kNN to determine window sizes adaptively

•  Lots of theoretical support
   D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.

Page 36: Machine Learning - Piazza

Mean shift pros and cons

•  Pros
   –  Good general-practice segmentation
   –  Flexible in number and shape of regions
   –  Robust to outliers

•  Cons
   –  Have to choose kernel size in advance
   –  Not suitable for high-dimensional features

•  When to use it
   –  Oversegmentation
   –  Multiple segmentations
   –  Tracking, clustering, filtering applications

Page 37: Machine Learning - Piazza

Spectral clustering
Group points based on links in a graph

A B

Page 38: Machine Learning - Piazza

Cuts in a graph

A B

Normalized Cut
•  a cut penalizes large segments
•  fix by normalizing for size of segments
•  volume(A) = sum of costs of all edges that touch A

Source: Seitz
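The normalization formula itself is not in the extracted text. The sketch below assumes the usual definition, Ncut(A, B) = cut(A, B)/volume(A) + cut(A, B)/volume(B), on a symmetric weight matrix, with volume(A) computed as the sum of the weighted degrees of the nodes in A (so an edge inside A is counted from both ends). The graph and function names are hypothetical.

```python
def build_weights(n, edges):
    """Symmetric weight matrix from (i, j, weight) triples."""
    w = [[0.0] * n for _ in range(n)]
    for i, j, s in edges:
        w[i][j] = w[j][i] = s
    return w


def cut(weights, a, b):
    """Total weight of the edges running between node sets a and b."""
    return sum(weights[i][j] for i in a for j in b)


def volume(weights, a):
    """Sum of the costs of all edges that touch a node in a."""
    return sum(weights[i][j] for i in a for j in range(len(weights)))


def normalized_cut(weights, a, b):
    c = cut(weights, a, b)
    return c / volume(weights, a) + c / volume(weights, b)


# Hypothetical graph: two triangles joined by a 0.3 bridge, plus node 6
# hanging off the second triangle by a weak 0.2 link.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.3), (5, 6, 0.2)]
W = build_weights(7, edges)

isolate = ([6], [0, 1, 2, 3, 4, 5])    # split off the single weak node
balanced = ([0, 1, 2], [3, 4, 5, 6])   # split between the two triangles

print(cut(W, *isolate), cut(W, *balanced))
print(normalized_cut(W, *isolate), normalized_cut(W, *balanced))
```

The plain cut prefers chopping off the single node because that costs the least, whereas the normalized cut strongly prefers the balanced split.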

Page 39: Machine Learning - Piazza

Normalized cuts for segmentation

Page 40: Machine Learning - Piazza

Visual PageRank
•  Determining importance by random walk
   –  What’s the probability that you will randomly walk to a given node?
•  Create adjacency matrix based on visual similarity
•  Edge weights determine probability of transition

Jing Baluja 2008
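The random-walk idea can be sketched as a power iteration over a similarity matrix. This is an illustrative sketch only: the damping factor of 0.85, the function name `visual_rank`, and the toy similarity values are assumptions, and every image is assumed to have at least one nonzero similarity.

```python
def visual_rank(similarity, damping=0.85, iters=100):
    n = len(similarity)
    # Edge weights determine the probability of a transition:
    # normalize each row so it sums to 1.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([s / total for s in row])

    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return rank


# Hypothetical similarities: image 0 resembles every other image,
# while images 1-3 resemble only image 0.
similarity = [
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
rank = visual_rank(similarity)
print(rank)
```

The image that the walk visits most often, here image 0, receives the highest score and would be ranked as the most representative.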

Page 41: Machine Learning - Piazza

Which algorithm to use?
•  Quantization/Summarization: K-means
   –  Aims to preserve variance of original data
   –  Can easily assign new point to a cluster

Quantization for computing histograms

Summary of 20,000 photos of Rome using “greedy k-means”

http://grail.cs.washington.edu/projects/canonview/

Page 42: Machine Learning - Piazza

Which algorithm to use?
•  Image segmentation: agglomerative clustering
   –  More flexible with distance measures (e.g., can be based on boundary prediction)
   –  Adapts better to specific data
   –  Hierarchy can be useful

http://www.cs.berkeley.edu/~arbelaez/UCM.html

Page 43: Machine Learning - Piazza

Things to remember
•  K-means useful for summarization, building dictionaries of patches, general clustering

•  Agglomerative clustering useful for segmentation, general clustering

•  Spectral clustering useful for determining relevance, summarization, segmentation

Page 44: Machine Learning - Piazza

Clustering
Key algorithm
•  K-means

Page 45: Machine Learning - Piazza