machine learning saarland university, ss 2007 holger bast marjan celikik kevin chang stefan funke...

Machine LearningSaarland University, SS 2007

Holger BastMarjan CelikikKevin ChangStefan Funke

Joachim Giesen

Max-Planck-Institut für InformatikSaarbrücken, Germany

Lecture 1, Friday April 19th, 2007(basics and example applications)

Overview of this Lecture

Machine Learning Basics

– Classification

– Objects as feature vectors

– Regression

– Clustering

Example applications

– Surface reconstruction

– Preference Learning

– Netflix challenge (how to earn $1,000,000)

– Text search

Classification

Given a set of points, each labeled + or –

– learn something from them …

– … in order to predict label of new points

++ – –

––

–?–

this is an instance of supervised learning

Classification — Quality

Which classifier is better?

– answer requires a model of where the data comes from

– and a measure of quality/accuracy

++ – –

––

Classification — Outliers and Overfitting

We have to find a balance between two extremes

– oversimplification ( large classification error)

– overfitting ( lack of regularity)

– again: requires a model of the data

++ – –

––

Classification — Point Transformation

If a classifier does not work for the original data

– try it on a transformation of the data

– typically: make points linearly separable by a suitable mapping to a higher-dimensional space

+ ++ +++– – – ++0

– –

map x to (x , |x|)

Classification — more labels

– –

––

Typically:

– first, basic technique for binary classification

– then, extension to more labels

Objects as Feature Vectors

But why learn something about points ?

General Idea:

– represent objects as points in a space of fixed dimension

– each dimension corresponds to a so-called feature of the object

Very crucial:

– selection of features

– normalization of vectors

Example: Objects with attributes

– features = values

– normalize by reference value for each feature

Person 1 Person 2 Person 3

188 cm 181 cm 190 cm75 kg 90 kg 77 kg

age 36 age 32 age 34

1887536

1819033

Person 4

176 cm55 kg

age 24

heightweightage

1907734

1725534

1.040.940.90

1.011.130.83

height/180weight/70age/30

1.060.960.85

0.960.690.60

2 8 28 5 82 7 2

282858272

Example: Images

– features = pixels(with grey values)

– often fine without further normalization

1 6 16 6 61 6 1

Image 1 Image 2

pixel (1,1)

pixel (1,2)

pixel (1,3)

pixel (2,1)

pixel (2,2)

pixel (2,3)

pixel (3,1)

pixel (3,2)

pixel (3,3)

161666161

Example: Text documents– features = words

– normalize to unit norm

1110001

LearningMachineSSStatisticalTheory20062007

Doc. 1

Machine LearningSS 2007

Doc. 1

Doc. 2

Statistical

LearningTheory

SS 2007

Doc. 2

Statistical

LearningTheory

SS 2007

Doc. 3

Statistical

LearningTheory

SS 2006

Doc. 3

Statistical

LearningTheory

SS 2006

1011101

1011110

Example: Text documents– features = words

– normalize to unit norm

0.50.50.50000.5

LearningMachineSSStatisticalTheory20062007

Doc. 1

Doc. 2

Statistical

LearningTheory

SS 2007

Doc. 2

Statistical

LearningTheory

SS 2007

Doc. 3

Statistical

LearningTheory

SS 2006

Doc. 3

Statistical

LearningTheory

SS 2006

0.400.40.40.400.4

0.400.40.40.40.40

Regression

Learn a function that maps objects to values

Similar trade-off as for classification:

– risk of oversimplification vs. risk of overfitting

given value(typically multi-dimensional)

value to learn(typically a real number)

Regression

Learn a function that maps objects to values

Similar trade-off as for classification:

– risk of oversimplification vs. risk of overfitting

given value(typically multi-dimensional)

value to learn(typically a real number)

Clustering

Partition given set of points into clusters

machine learning saarland university, ss 2007 holger bast marjan celikik kevin chang stefan funke...

Documents

introduction -...

saarbrücken informatik saarbrücken computational...

cornelia funke - srce od mastila

flow complex joachim giesen friedrich-schiller-universität...

fearless (a mirrorworld novel) by cornelia funke

alfa kick-off meeting in saarbrücken

funke abimbola frsa - global equality & diversity

218zzz2u9z8k37r9ob41kso0-wpengine.netdna-ssl.com · general...

funke shell tube he e

saarbrücken, germany - home | interreg nwe€¦ · wp.i2...

french students in saarbrücken

asea reviews – charles funke

new zealand signature varietal: sauvignon blanc alex giesen,...

funke vpc pipe coupling - rsk databasen

m. andreini, n. van de giesen,

saarbrücken, germany - llvm

meister eckhart - sister odila funke (1916)

04 brian giesen pandemic flu case

laboratory catalogue for milk analysis - funke-dr.n.gerber

evacuation head-count at giesen wines - whosonlocation ·...