1 cse 446 machine learning daniel weld xiao ling congle zhang
TRANSCRIPT
![Page 1: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/1.jpg)
1
CSE 446Machine Learning
Daniel Weld
Xiao Ling
Congle Zhang
![Page 2: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/2.jpg)
2©2005-2009 Carlos Guestrin
What is Machine Learning ?
![Page 3: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/3.jpg)
3©2005-2009 Carlos Guestrin
Machine Learning
Study of algorithms that improve their performance at some task with experience
Data UnderstandingMachine Learning
![Page 4: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/4.jpg)
4
Why?
Is this topic important?
![Page 5: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/5.jpg)
5©2005-2009 Carlos Guestrin
Exponential Growth in Data
Data UnderstandingMachine Learning
![Page 6: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/6.jpg)
6©2005-2009 Carlos Guestrin
Supremacy of Machine Learning
Machine learning is preferred approach to Speech recognition, Natural language processing Web search – result ranking Computer vision Medical outcomes analysis Robot control Computational biology Sensor networks …
This trend is accelerating Improved machine learning algorithms Improved data capture, networking, faster computers Software too complex to write by hand New sensors / IO devices Demand for self-customization to user, environment
![Page 7: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/7.jpg)
7
Logistics
![Page 8: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/8.jpg)
8©2005-2009 Carlos Guestrin, D. Weld,
Syllabus
Covers a wide range of Machine Learning techniques – from basic to state-of-the-art
You will learn about the methods you heard about: Naïve Bayes, logistic regression, nearest-neighbor, decision trees, boosting, neural
nets, overfitting, regularization, dimensionality reduction, error bounds, loss function, VC dimension, SVMs, kernels, margin bounds, K-means, EM, mixture models, semi-supervised learning, HMMs, graphical models, active learning
Covers algorithms, theory and applications It’s going to be fun and hard work
![Page 9: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/9.jpg)
9©2005-2009 Carlos Guestrin
Prerequisites
Probabilities Distributions, densities, marginalization…
Basic statistics Moments, typical distributions, regression…
Algorithms Dynamic programming, basic data structures, complexity…
Programming Mostly your choice of language, but Python (NumPy) + Matlab will be very
useful We provide some background, but the class will be fast paced
Ability to deal with “abstract mathematical concepts”
![Page 10: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/10.jpg)
10
Staff
Two Great TAs: Fantastic resource for learning, interact with them!
Xiao Ling, CSE 610, xiaoling@cs Office hours: TBA
Congle Zhang, CSE 524, clzhang@cs Office hours: TBA
Administrative Assistant Alicen Smith, CSE 546, asmith@cs
![Page 11: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/11.jpg)
11
Text Books
Required Text: Pattern Recognition and Machine Learning; Chris Bishop
Optional: Machine Learning; Tom Mitchell The Elements of Statistical Learning: Data Mining, Inference,
and Prediction; Trevor Hastie, Robert Tibshirani, Jerome Friedman
Information Theory, Inference, and Learning Algorithms; David MacKay
Website: Andrew Ng’s AI class videos Website: Tom Mitchell’s AI class videos
![Page 12: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/12.jpg)
12
Grading
4 homeworks (55%) First one goes out Fri 1/6/12
Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early
Midterm (15%) Circa Feb 10 in class
Final (30%) TBD by registrar
![Page 13: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/13.jpg)
13©2005-2009 Carlos Guestrin
Homeworks
Homeworks are hard, start early Due at the beginning of class
Minus 33% credit for each day (or part of day) late All homeworks must be handed in, even for zero credit
Collaboration You may discuss the questions Each student writes their own answers Write on your homework anyone with whom you collaborate Each student must write their own code for the programming part Please don’t search for answers on the web, Google, previous years’
homeworks, etc. Ask us if you are not sure if you can use a particular reference
![Page 14: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/14.jpg)
14
Communication
Main discussion board https://catalyst.uw.edu/gopost/board/xling/25219/
Urgent announcements cse446@cs Subscribe:
http://mailman.cs.washington.edu/mailman/listinfo/cse446
To email instructors, always use: cse446_instructor@cs
![Page 15: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/15.jpg)
15
Space of ML Problems
What is B
eing Learned?
Type of Supervision (eg, Experience, Feedback)
LabeledExamples
Reward Nothing
Discrete Function
Classification Clustering
Continuous Function
Regression
Policy Apprenticeship Learning
ReinforcementLearning
![Page 16: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/16.jpg)
16©2009 Carlos Guestrin
Classification
from data to discrete classes
![Page 17: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/17.jpg)
17©2005-2009 Carlos Guestrin
![Page 18: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/18.jpg)
Spam filtering
18©2009 Carlos Guestrin
data prediction
![Page 19: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/19.jpg)
19©2009 Carlos Guestrin
Text classification
Company home page
vs
Personal home page
vs
Univeristy home page
vs
…
![Page 20: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/20.jpg)
20©2009 Carlos Guestrin
Object detection
Example training images for each orientation
(Prof. H. Schneiderman)
![Page 21: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/21.jpg)
21©2009 Carlos Guestrin
Reading a noun (vs verb)
[Rustandi et al., 2005]
![Page 22: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/22.jpg)
Weather prediction
22©2009 Carlos Guestrin
![Page 23: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/23.jpg)
The classification pipeline
23©2009 Carlos Guestrin
Training
Testing
![Page 24: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/24.jpg)
24©2009 Carlos Guestrin
Regression
predicting a numeric value
![Page 25: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/25.jpg)
Stock market
25©2009 Carlos Guestrin
![Page 26: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/26.jpg)
Weather prediction revisted
26©2009 Carlos Guestrin
Temperature
![Page 27: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/27.jpg)
27©2009 Carlos Guestrin
Modeling sensor data
Measure temperatures at some locations
Predict temperatures throughout the environment
SERVER
LAB
KITCHEN
COPYELEC
PHONEQUIET
STORAGE
CONFERENCE
OFFICEOFFICE50
51
52 53
54
46
48
49
47
43
45
44
42 41
3739
38 36
33
3
6
10
11
12
13 14
1516
17
19
2021
22
242526283032
31
2729
23
18
9
5
8
7
4
34
1
2
3540
[Guestrin et al. ’04]
![Page 28: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/28.jpg)
28©2009 Carlos Guestrin
Clustering
discovering structure in data
![Page 29: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/29.jpg)
Clustering Data: Group similar things
![Page 30: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/30.jpg)
Clustering images
30©2009 Carlos Guestrin [Goldberger et al.]
Set of Images
![Page 31: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/31.jpg)
Clustering web search results
31©2009 Carlos Guestrin
![Page 32: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/32.jpg)
32©2009 Carlos Guestrin
Reinforcement Learning
training by feedback
![Page 33: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/33.jpg)
33
Reinforcement Learning
©2005-2009 Carlos Guestrin
![Page 34: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/34.jpg)
35©2009 Carlos Guestrin
Learning to act
Reinforcement learning An agent
Makes sensor observations Must select action Receives rewards
positive for “good” states negative for “bad” states
[Ng et al. ’05]
![Page 35: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/35.jpg)
36
In Summary
What is B
eing Learned?
Type of Supervision (eg, Experience, Feedback)
LabeledExamples
Reward Nothing
Discrete Function
Classification Clustering
Continuous Function
Regression
Policy Apprenticeship Learning
ReinforcementLearning
![Page 36: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/36.jpg)
37
In Summary
What is B
eing Learned?
Type of Supervision (eg, Experience, Feedback)
LabeledExamples
Reward Nothing
Discrete Function
Classification Clustering
Continuous Function
Regression
Policy Apprenticeship Learning
ReinforcementLearning
![Page 37: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/37.jpg)
38
Key Concepts
![Page 38: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/38.jpg)
Classifier
0.0 1.0 2.0 3.0 4.0 5.0 6.0
0.0
1.0
2.0
3.0
Hypothesis:Function for labeling examples
++
+ +
++
+
+
- -
-
- -
--
-
-
- +
++
-
-
-
+
+ Label: -Label: +
?
?
?
?
![Page 39: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/39.jpg)
40
Generalization
Hypotheses must generalize to correctly classify instances not in the training data.
Simply memorizing training examples is a consistent hypothesis that does not generalize.
![Page 40: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/40.jpg)
ML = Function Approximation
41
c(x)
x
May not be any perfect fitClassification ~ discrete functions
h(x)
h(x) = contains(`nigeria’, x) contains(`wire-transfer’, x)
![Page 41: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/41.jpg)
© Daniel S. Weld 42
Why is Learning Possible?
Experience alone never justifies any conclusion about any unseen instance.
Learning occurs when
PREJUDICE meets DATA!
Learning a “Frobnitz”
![Page 42: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/42.jpg)
© Daniel S. Weld 43
Bias
The nice word for prejudice is “bias”.Different from “Bias” in statistics
What kind of hypotheses will you consider?What is allowable range of functions you use when
approximating?What kind of hypotheses do you prefer?
![Page 43: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/43.jpg)
© Daniel S. Weld 44
Some Typical Biases
Occam’s razor“It is needless to do more when less will suffice”
– William of Occam,
died 1349 of the Black plagueMDL – Minimum description lengthConcepts can be approximated by ... conjunctions of predicates
... by linear functions
... by short decision trees
Frobnitz?
![Page 44: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/44.jpg)
45
ML as Optimization
Specify Preference Bias aka “Loss Function”
Solve using optimization Combinatorial Convex Linear Nasty
©2005-2009 Carlos Guestrin
![Page 45: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/45.jpg)
Overfitting
Hypothesis H is overfit when H’ and H has smaller error on training examples, but H has bigger error on test examples
![Page 46: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/46.jpg)
Overfitting
Hypothesis H is overfit when H’ and H has smaller error on training examples, but H has bigger error on test examples
Causes of overfitting Training set is too small Large number of features
Big problem in machine learning One solution: Validation set
![Page 47: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/47.jpg)
© Daniel S. Weld 48
Overfitting
Accuracy
0.9
0.8
0.7
0.6
On training dataOn test data
Model complexity (e.g., number of nodes in decision tree)
![Page 48: 1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang](https://reader035.vdocuments.us/reader035/viewer/2022062308/56649ddf5503460f94ad902c/html5/thumbnails/48.jpg)
49
The Road Ahead