introduction to machine learning (and notation) · adapted from machine learning: a probabilistic...
TRANSCRIPT
![Page 1: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/1.jpg)
Introduction to Machine Learning (and Notation)
David I. InouyeSaturday, August 24, 2019
David I. Inouye Introduction to Machine Learning 0
![Page 2: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/2.jpg)
Announcements
▸TA: Liming Wu
▸Homework 1 will be posted by Wed due next Wed▸Submit GitHub username ASAP:
https://forms.gle/A4to4Q7huAiKaQBN9
▸Hopefully, first quiz on Wednesday, beginning of class
David I. Inouye Introduction to Machine Learning 1
![Page 3: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/3.jpg)
Outline
▸Supervised learning▸Regression▸Classification
▸Unsupervised learning
▸Other key concepts
David I. Inouye Introduction to Machine Learning 2
![Page 4: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/4.jpg)
The goal of supervised learning is to estimate a mapping (or function) between input and output
David I. Inouye Introduction to Machine Learning 3
Input𝒙
Output𝑦
Mapping𝑓
![Page 5: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/5.jpg)
The goal of supervised learning is to estimate a mapping (or function) between input and outputgiven only input-output examples
David I. Inouye Introduction to Machine Learning 4
Input𝒙
Output𝑦?
![Page 6: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/6.jpg)
The set of input-output pairs is called a training set,denoted by 𝒟 = 𝒙&, 𝑦& &()
*
▸Input 𝒙&▸Called features (ML), attributes, or covariates (Stats).
Sometimes just variables.▸Can be numeric, categorical, discrete, or nominal.▸Examples
▸[height, weight, age, gender]▸[𝑥), 𝑥-,⋯ , 𝑥/] – A d-dimensional vector of numbers▸Image▸Email message
▸Output 𝑦&▸Called output, response, or target (or label)▸Real-valued/numeric output: e.g., 𝑦& ∈ ℛ▸Categorical, discrete, or nominal output: 𝑦& from finite
set, i.e., 𝑦& ∈ {1,2,⋯ , 𝑐}
David I. Inouye Introduction to Machine Learning 5
![Page 7: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/7.jpg)
If the output 𝑦& is numeric, then the problem is known as regression
David I. Inouye Introduction to Machine Learning 6
NOTE: Input 𝑥 does not have to be numeric. Only the output 𝑦 must be numeric.
![Page 8: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/8.jpg)
If the output 𝑦& is numeric, then the problem is known as regression
▸Given height 𝑥&, predict age 𝑦&
▸Predict GPA given SAT score
▸Predict SAT score given GPA
▸Predict GRE given SAT and GPA
David I. Inouye Introduction to Machine Learning 7
![Page 9: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/9.jpg)
If output is categorical, then the problem is known as classification
David I. Inouye Introduction to Machine Learning 8
![Page 10: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/10.jpg)
If output is categorical, then the problem is known as classification
▸Given height 𝑥, predict “male” (𝑦 = 0) or “female” (𝑦 = 1)
▸Predict defaulting on loan (“yes” or “no”) given salary and mortgage payment
David I. Inouye Introduction to Machine Learning 9
![Page 11: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/11.jpg)
Side note: Encoding / representing a categorical variable can be done in many ways
▸Suppose the categorical variable is “yes” and “no”▸Canonical ways: “no” -> 0 and “yes -> 1▸What are other possible encodings?
▸What if there are more than two categories such as cats, dogs, fish and snakes?
▸What is good and bad about using {1,2,3,4} for above example of animals?
▸One-hot encoding is another common way
David I. Inouye Introduction to Machine Learning 10
![Page 12: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/12.jpg)
The goal of unsupervised learning is to find “interesting patterns” ONLY in the input
David I. Inouye Introduction to Machine Learning 11
Input𝒙
??? ▸Also called descriptive
learning or knowledge discovery
▸What are “interesting patterns”?▸Could be many things▸Clusters▸Correlations
![Page 13: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/13.jpg)
In unsupervised learning, the training set is only a set of input values 𝒟 = 𝒙& &()*
▸Estimate natural clusters (or groups) of customers
▸Estimate the correlation between height andweight, 𝒙 = [ℎ, 𝑤]
▸Estimate a single number that summarizes all variables of wealth (e.g. credit score)
David I. Inouye Introduction to Machine Learning 12
![Page 14: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/14.jpg)
Given this dataset, should we use supervised or unsupervised learning?
David I. Inouye Introduction to Machine Learning 13
𝑑 features/attributes/covariates
𝑛 samples/observations/examples
yes no
Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012.
Color Shape Size (cm) Is it good?
Blue Square 10 yes
Red Ellipse 2.4 yes
Red Ellipse 20.7 no
![Page 15: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/15.jpg)
Is this a regression or classification problem?
David I. Inouye Introduction to Machine Learning 14
𝑑 features/attributes/covariates
𝑛 samples/observations/examples
yes no
Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012.
Color Shape Size (cm) Is it good?
Blue Square 10 yes
Red Ellipse 2.4 yes
Red Ellipse 20.7 no
![Page 16: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/16.jpg)
Suppose we assume classification, which features are the input 𝒙 and which are the output 𝑦?
David I. Inouye Introduction to Machine Learning 15
𝑑 features/attributes/covariates
𝑛 samples/observations/examples
yes no
Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012.
Color Shape Size (cm) Is it good?
Blue Square 10 yes
Red Ellipse 2.4 yes
Red Ellipse 20.7 no
![Page 17: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/17.jpg)
Suppose we assume regression, which features are the input 𝒙 and which are the output 𝑦?
David I. Inouye Introduction to Machine Learning 16
𝑑 features/attributes/covariates
𝑛 samples/observations/examples
yes no
Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012.
Color Shape Size (cm) Is it good?
Blue Square 10 yes
Red Ellipse 2.4 yes
Red Ellipse 20.7 no
![Page 18: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/18.jpg)
How could we use unsupervised learning?
David I. Inouye Introduction to Machine Learning 17
𝑑 features/attributes/covariates
𝑛 samples/observations/examples
yes no
Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012.
Color Shape Size (cm) Is it good?
Blue Square 10 yes
Red Ellipse 2.4 yes
Red Ellipse 20.7 no
![Page 19: Introduction to Machine Learning (and Notation) · Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012. Color Shape Size (cm) Is it good? Blue](https://reader034.vdocuments.us/reader034/viewer/2022043008/5f98cc2dadc0b80146025109/html5/thumbnails/19.jpg)
The dataset cannot determine the task,rather the context determines the task
David I. Inouye Introduction to Machine Learning 18
𝑑 features/attributes/covariates
𝑛 samples/observations/examples
yes no
Adapted from Machine Learning: A Probabilistic Perspective, Ch. 1, Kevin P. Murphy, 2012.
Color Shape Size (cm) Is it good?
Blue Square 10 yes
Red Ellipse 2.4 yes
Red Ellipse 20.7 no