csc 411: lecture 01: introduction
TRANSCRIPT
![Page 1: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/1.jpg)
CSC 411: Lecture 01: Introduction
Rich Zemel, Raquel Urtasun and Sanja Fidler
University of Toronto
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 1 / 44
![Page 2: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/2.jpg)
Today
Administration details
Why is machine learning so cool?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 2 / 44
![Page 3: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/3.jpg)
The Team I
Instructors:
I Raquel Urtasun
I Richard Zemel
Email:
Offices:
I Raquel: 290E in Pratt
I Richard: 290D in Pratt
Office hours: TBA
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 3 / 44
![Page 4: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/4.jpg)
The Team II
TA’s:
I Siddharth Ancha
I Azin Asgarian
I Min Bai
I Lluis Castrejon Subira
I Kaustav Kundu
I Hao-Wei Lee
I Renjie Liao
I Shun Liao
I Wenjie Luo
I David Madras
I Seyed Parsa Mirdehghan
I Mengye Ren
I Geoffrey Roeder
I Yulia Rubanova
I Elias Tragas
I Eleni Triantafillou
I Shenlong Wang
I Ayazhan Zhakhan
Email:
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 4 / 44
![Page 5: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/5.jpg)
Admin Details
Liberal wrt waiving pre-requisites
I But it is up to you to determine if you have the appropriate background
Do I have the appropriate background?
I Linear algebra: vector/matrix manipulations, properties
I Calculus: partial derivatives
I Probability: common distributions; Bayes Rule
I Statistics: mean/median/mode; maximum likelihood
I Sheldon Ross: A First Course in Probability
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 5 / 44
![Page 6: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/6.jpg)
Course Information (Section 1)
Class: Mondays at 11-1pm in AH 400
Instructor: Raquel Urtasun
Tutorials: Monday, 3-4pm, same classroom
Class Website:
http://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/
CSC411_Fall16.html
The class will use Piazza for announcements and discussions:
https://piazza.com/utoronto.ca/fall2016/csc411/home
First time, sign up here:
https://piazza.com/utoronto.ca/fall2016/csc411
Your grade will not depend on your participation on Piazza. It’s just a
good way for asking questions, discussing with your instructor, TAs and your
peers
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 6 / 44
![Page 7: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/7.jpg)
Course Information (Section 2)
Class: Wednesdays at 11-1pm in MS 2170
Instructor: Raquel Urtasun
Tutorials: Wednesday, 3-4pm, BA 1170
Class Website:
http://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/
CSC411_Fall16.html
The class will use Piazza for announcements and discussions:
https://piazza.com/utoronto.ca/fall2016/csc411/home
First time, sign up here:
https://piazza.com/utoronto.ca/fall2016/csc411/home
Your grade will not depend on your participation on Piazza. It’s just a
good way for asking questions, discussing with your instructor, TAs and your
peers
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 7 / 44
![Page 8: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/8.jpg)
Course Information (Section 3)
Class: Thursdays at 4-6pm in KP 108
Instructor: Richard Zemel
Tutorials: Thursday, 6-7pm, same class
Class Website:
http://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/
CSC411_Fall16.html
The class will use Piazza for announcements and discussions:
https://piazza.com/utoronto.ca/fall2016/csc411/home
First time, sign up here:
https://piazza.com/utoronto.ca/fall2016/csc411/home
Your grade will not depend on your participation on Piazza. It’s just a
good way for asking questions, discussing with your instructor, TAs and your
peers
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 8 / 44
![Page 9: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/9.jpg)
Course Information (Section 4)
Class: Fridays at 11-1pm in MS 2172
Instructor: Richard Zemel
Tutorials: Thursday, 3-4pm, same class
Class Website:
http://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/
CSC411_Fall16.html
The class will use Piazza for announcements and discussions:
https://piazza.com/utoronto.ca/fall2016/csc411/home
First time, sign up here:
https://piazza.com/utoronto.ca/fall2016/csc411/home
Your grade will not depend on your participation on Piazza. It’s just a
good way for asking questions, discussing with your instructor, TAs and your
peers
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 9 / 44
![Page 10: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/10.jpg)
Textbook(s)
Christopher Bishop: ”Pattern Recognition and Machine Learning”, 2006
Other Textbooks:
I Kevin Murphy: ”Machine Learning: a Probabilistic Perspective”I David Mackay: ”Information Theory, Inference, and Learning
Algorithms”I Ethem Alpaydin: ”Introduction to Machine Learning”, 2nd edition,
2010.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 10 / 44
![Page 11: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/11.jpg)
Textbook(s)
Christopher Bishop: ”Pattern Recognition and Machine Learning”, 2006
Other Textbooks:
I Kevin Murphy: ”Machine Learning: a Probabilistic Perspective”I David Mackay: ”Information Theory, Inference, and Learning
Algorithms”I Ethem Alpaydin: ”Introduction to Machine Learning”, 2nd edition,
2010.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 10 / 44
![Page 12: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/12.jpg)
Requirements (Undergrads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 11 / 44
![Page 13: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/13.jpg)
Requirements (Undergrads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 11 / 44
![Page 14: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/14.jpg)
Requirements (Undergrads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 11 / 44
![Page 15: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/15.jpg)
Requirements (Undergrads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 11 / 44
![Page 16: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/16.jpg)
Requirements (Grads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 12 / 44
![Page 17: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/17.jpg)
Requirements (Grads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 12 / 44
![Page 18: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/18.jpg)
Requirements (Grads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 12 / 44
![Page 19: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/19.jpg)
Requirements (Grads)
Do the readings!
Assignments:
I Three assignments, first two worth 15% each, last one worth 25%, for
a total of 55%
I Programming: take code and extend it
I Derivations: pen(cil)-and-paper
Mid-term:
I One hour exam
I Worth 20% of course mark
Final:
I Focused on second half of course
I Worth 25% of course mark
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 12 / 44
![Page 20: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/20.jpg)
More on Assigments
Collaboration on the assignments is not allowed. Each student is responsiblefor his/her own work. Discussion of assignments should be limited toclarification of the handout itself, and should not involve any sharing ofpseudocode or code or simulation results. Violation of this policy is groundsfor a semester grade of F, in accordance with university regulations.
The schedule of assignments is included in the syllabus. Assignments aredue at the beginning of class/tutorial on the due date.
Assignments handed in late but before 5 pm of that day will be penalized by5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day willbe assessed thereafter.
Extensions will be granted only in special situations, and you will need aStudent Medical Certificate or a written request approved by the instructorat least one week before the due date.
Final assignment is a bake-off: competition between ML algorithms. We willgive you some data for training a ML system, and you will try to develop thebest method. We will then determine which system performs best on unseentest data. Grads can do own project.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 13 / 44
![Page 21: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/21.jpg)
More on Assigments
Collaboration on the assignments is not allowed. Each student is responsiblefor his/her own work. Discussion of assignments should be limited toclarification of the handout itself, and should not involve any sharing ofpseudocode or code or simulation results. Violation of this policy is groundsfor a semester grade of F, in accordance with university regulations.
The schedule of assignments is included in the syllabus. Assignments aredue at the beginning of class/tutorial on the due date.
Assignments handed in late but before 5 pm of that day will be penalized by5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day willbe assessed thereafter.
Extensions will be granted only in special situations, and you will need aStudent Medical Certificate or a written request approved by the instructorat least one week before the due date.
Final assignment is a bake-off: competition between ML algorithms. We willgive you some data for training a ML system, and you will try to develop thebest method. We will then determine which system performs best on unseentest data. Grads can do own project.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 13 / 44
![Page 22: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/22.jpg)
More on Assigments
Collaboration on the assignments is not allowed. Each student is responsiblefor his/her own work. Discussion of assignments should be limited toclarification of the handout itself, and should not involve any sharing ofpseudocode or code or simulation results. Violation of this policy is groundsfor a semester grade of F, in accordance with university regulations.
The schedule of assignments is included in the syllabus. Assignments aredue at the beginning of class/tutorial on the due date.
Assignments handed in late but before 5 pm of that day will be penalized by5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day willbe assessed thereafter.
Extensions will be granted only in special situations, and you will need aStudent Medical Certificate or a written request approved by the instructorat least one week before the due date.
Final assignment is a bake-off: competition between ML algorithms. We willgive you some data for training a ML system, and you will try to develop thebest method. We will then determine which system performs best on unseentest data. Grads can do own project.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 13 / 44
![Page 23: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/23.jpg)
More on Assigments
Collaboration on the assignments is not allowed. Each student is responsiblefor his/her own work. Discussion of assignments should be limited toclarification of the handout itself, and should not involve any sharing ofpseudocode or code or simulation results. Violation of this policy is groundsfor a semester grade of F, in accordance with university regulations.
The schedule of assignments is included in the syllabus. Assignments aredue at the beginning of class/tutorial on the due date.
Assignments handed in late but before 5 pm of that day will be penalized by5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day willbe assessed thereafter.
Extensions will be granted only in special situations, and you will need aStudent Medical Certificate or a written request approved by the instructorat least one week before the due date.
Final assignment is a bake-off: competition between ML algorithms. We willgive you some data for training a ML system, and you will try to develop thebest method. We will then determine which system performs best on unseentest data. Grads can do own project.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 13 / 44
![Page 24: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/24.jpg)
More on Assigments
Collaboration on the assignments is not allowed. Each student is responsiblefor his/her own work. Discussion of assignments should be limited toclarification of the handout itself, and should not involve any sharing ofpseudocode or code or simulation results. Violation of this policy is groundsfor a semester grade of F, in accordance with university regulations.
The schedule of assignments is included in the syllabus. Assignments aredue at the beginning of class/tutorial on the due date.
Assignments handed in late but before 5 pm of that day will be penalized by5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day willbe assessed thereafter.
Extensions will be granted only in special situations, and you will need aStudent Medical Certificate or a written request approved by the instructorat least one week before the due date.
Final assignment is a bake-off: competition between ML algorithms. We willgive you some data for training a ML system, and you will try to develop thebest method. We will then determine which system performs best on unseentest data. Grads can do own project.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 13 / 44
![Page 25: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/25.jpg)
Provisional Calendar (Section 1)
Intro + Linear Regression
Linear Classif. + Logistic Regression
Non-parametric + Decision trees
Multi-class + Prob. Classif I
Thanksgiving
Prob. Classif II + NNets I
Nnet II + Clustering
Midterm + Mixt. of Gaussians
Reading Week
PCA/Autoencoders + SVM
Kernels + Ensemble I
Ensemble II + RL
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 14 / 44
![Page 26: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/26.jpg)
Provisional Calendar (Sections 2,3,4)
Intro + Linear Regression
Linear Classif. + Logistic Regression
Non-parametric + Decision trees
Multi-class + Prob. Classif I
Prob. Classif II + NNets I
Nnet II + Clustering
Midterm + Mixt. of Gaussians
PCA/Autoencoders + SVM
Kernels + Ensemble I
Ensemble II + RL
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 15 / 44
![Page 27: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/27.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 28: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/28.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problem
Figure : How can we make a robot cook?
I In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 29: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/29.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 30: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/30.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 31: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/31.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behave
I From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 32: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/32.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 33: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/33.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 34: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/34.jpg)
What is Machine Learning?
How can we solve a specific problem?
I As computer scientists we write a program that encodes a set of rules
that are useful to solve the problemI In many cases is very difficult to specify those rules, e.g., given a
picture determine whether there is a cat in the image
Learning systems are not directly programmed to solve a problem, instead
develop own program based on:
I Examples of how they should behaveI From trial-and-error experience trying to solve the problem
Different than standard CS:
I Want to implement unknown function, only have access e.g., to sample
input-output pairs (training examples)
Learning simply means incorporating information from the training examples
into the system
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
![Page 35: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/35.jpg)
Tasks that requires machine learning: What makes a 2?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 17 / 44
![Page 36: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/36.jpg)
Tasks that benefits from machine learning: cooking!
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 18 / 44
![Page 37: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/37.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 38: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/38.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?
I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 39: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/39.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 40: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/40.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 41: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/41.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 42: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/42.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 43: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/43.jpg)
Why use learning?
It is very hard to write programs that solve problems like recognizing ahandwritten digit
I What distinguishes a 2 from a 7?I How does our brain do it?
Instead of writing a program by hand, we collect examples that specify thecorrect output for a given input
A machine learning algorithm then takes these examples and produces aprogram that does the job
I The program produced by the learning algorithm may look verydifferent from a typical hand-written program. It may contain millionsof numbers.
I If we do it right, the program works for new cases as well as the oneswe trained it on.
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
![Page 44: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/44.jpg)
Learning algorithms are useful in many tasks
1. Classification: Determine which discrete category the example is
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 20 / 44
![Page 45: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/45.jpg)
Examples of Classification
What digit is this?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44
![Page 46: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/46.jpg)
Examples of Classification
Is this a dog?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44
![Page 47: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/47.jpg)
Examples of Classification
what about this one?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44
![Page 48: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/48.jpg)
Examples of Classification
Am I going to pass the exam?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44
![Page 49: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/49.jpg)
Examples of Classification
Do I have diabetes?
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44
![Page 50: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/50.jpg)
Learning algorithms are useful in many tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 22 / 44
![Page 51: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/51.jpg)
Examples of Recognizing patterns
Figure : Siri: https://www.youtube.com/watch?v=8ciagGASro0
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 23 / 44
![Page 52: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/52.jpg)
Examples of Recognizing patterns
Figure : Photomath: https://photomath.net/
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 23 / 44
![Page 53: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/53.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 24 / 44
![Page 54: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/54.jpg)
Examples of Recommendation systems
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 25 / 44
![Page 55: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/55.jpg)
Examples of Recommendation systems
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 25 / 44
![Page 56: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/56.jpg)
Examples of Recommendation systems
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 25 / 44
![Page 57: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/57.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 26 / 44
![Page 58: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/58.jpg)
Examples of Information Retrieval
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 27 / 44
![Page 59: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/59.jpg)
Examples of Information Retrieval
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 27 / 44
![Page 60: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/60.jpg)
Examples of Information Retrieval
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 27 / 44
![Page 61: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/61.jpg)
Examples of Information Retrieval
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 27 / 44
![Page 62: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/62.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
5. Computer vision: detection, segmentation, depth estimation, optical flow,etc
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 28 / 44
![Page 63: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/63.jpg)
Computer Vision
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 29 / 44
![Page 64: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/64.jpg)
Computer Vision
Figure : Kinect: https://www.youtube.com/watch?v=op82fDRRqSY
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 29 / 44
![Page 65: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/65.jpg)
Computer Vision
[Gatys, Ecker, Bethge. A Neural Algorithm of Artistic Style. Arxiv’15.]
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 29 / 44
![Page 66: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/66.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
5. Computer vision: detection, segmentation, depth estimation, optical flow,etc
6. Robotics: perception, planning, etc
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 30 / 44
![Page 67: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/67.jpg)
Autonomous Driving
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 31 / 44
![Page 68: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/68.jpg)
Flying Robots
Figure : Video: https://www.youtube.com/watch?v=YQIMGV5vtd4
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 32 / 44
![Page 69: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/69.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
5. Computer vision: detection, segmentation, depth estimation, optical flow,etc
6. Robotics: perception, planning, etc
7. Learning to play games
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 33 / 44
![Page 70: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/70.jpg)
Playing Games: Atari
Figure : Video: https://www.youtube.com/watch?v=V1eYniJ0Rnk
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 34 / 44
![Page 71: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/71.jpg)
Playing Games: Super Mario
Figure : Video: https://www.youtube.com/watch?v=wfL4L_l4U9A
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 35 / 44
![Page 72: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/72.jpg)
Playing Games: Alpha Go
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 36 / 44
![Page 73: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/73.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
5. Computer vision: detection, segmentation, depth estimation, optical flow,etc
6. Robotics: perception, planning, etc
7. Learning to play games
8. Recognizing anomalies: Unusual sequences of credit card transactions, panicsituation at an airport
9. Spam filtering, fraud detection: The enemy adapts so we must adapt too
10. Many more!
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 37 / 44
![Page 74: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/74.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
5. Computer vision: detection, segmentation, depth estimation, optical flow,etc
6. Robotics: perception, planning, etc
7. Learning to play games
8. Recognizing anomalies: Unusual sequences of credit card transactions, panicsituation at an airport
9. Spam filtering, fraud detection: The enemy adapts so we must adapt too
10. Many more!
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 37 / 44
![Page 75: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/75.jpg)
Learning algorithms are useful in other tasks
1. Classification: Determine which discrete category the example is
2. Recognizing patterns: Speech Recognition, facial identity, etc
3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon,Netflix).
4. Information retrieval: Find documents or images with similar content
5. Computer vision: detection, segmentation, depth estimation, optical flow,etc
6. Robotics: perception, planning, etc
7. Learning to play games
8. Recognizing anomalies: Unusual sequences of credit card transactions, panicsituation at an airport
9. Spam filtering, fraud detection: The enemy adapts so we must adapt too
10. Many more!
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 37 / 44
![Page 76: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/76.jpg)
Human Learning
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 38 / 44
![Page 77: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/77.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)I Regression: real-valued output (predicting market prices, customer
rating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 78: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/78.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vector
I Classification: 1-of-N output (speech recognition, object recognition,medical diagnosis)
I Regression: real-valued output (predicting market prices, customerrating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 79: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/79.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)
I Regression: real-valued output (predicting market prices, customerrating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 80: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/80.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)I Regression: real-valued output (predicting market prices, customer
rating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 81: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/81.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)I Regression: real-valued output (predicting market prices, customer
rating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 82: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/82.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)I Regression: real-valued output (predicting market prices, customer
rating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract features
I How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 83: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/83.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)I Regression: real-valued output (predicting market prices, customer
rating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 84: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/84.jpg)
Types of learning tasks
Supervised: correct output known for each training example
I Learn to predict output when given an input vectorI Classification: 1-of-N output (speech recognition, object recognition,
medical diagnosis)I Regression: real-valued output (predicting market prices, customer
rating)
Unsupervised learning
I Create an internal representation of the input, capturingregularities/structure in data
I Examples: form clusters; extract featuresI How do we know if a representation is good?
Reinforcement learning
I Learn action to maximize payoffI Not much information in a payoff signalI Payoff is often delayed
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
![Page 85: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/85.jpg)
Machine Learning vs Data Mining
Data-mining: Typically using very simple machine learning techniques onvery large databases because computers are too slow to do anything moreinteresting with ten billion examples
Previously used in a negative sense
I misguided statistical procedure of looking for all kinds of relationshipsin the data until finally find one
Now lines are blurred: many ML problems involve tons of data
But problems with AI flavor (e.g., recognition, robot navigation) still domainof ML
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 40 / 44
![Page 86: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/86.jpg)
Machine Learning vs Data Mining
Data-mining: Typically using very simple machine learning techniques onvery large databases because computers are too slow to do anything moreinteresting with ten billion examples
Previously used in a negative sense
I misguided statistical procedure of looking for all kinds of relationshipsin the data until finally find one
Now lines are blurred: many ML problems involve tons of data
But problems with AI flavor (e.g., recognition, robot navigation) still domainof ML
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 40 / 44
![Page 87: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/87.jpg)
Machine Learning vs Data Mining
Data-mining: Typically using very simple machine learning techniques onvery large databases because computers are too slow to do anything moreinteresting with ten billion examples
Previously used in a negative sense
I misguided statistical procedure of looking for all kinds of relationshipsin the data until finally find one
Now lines are blurred: many ML problems involve tons of data
But problems with AI flavor (e.g., recognition, robot navigation) still domainof ML
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 40 / 44
![Page 88: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/88.jpg)
Machine Learning vs Data Mining
Data-mining: Typically using very simple machine learning techniques onvery large databases because computers are too slow to do anything moreinteresting with ten billion examples
Previously used in a negative sense
I misguided statistical procedure of looking for all kinds of relationshipsin the data until finally find one
Now lines are blurred: many ML problems involve tons of data
But problems with AI flavor (e.g., recognition, robot navigation) still domainof ML
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 40 / 44
![Page 89: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/89.jpg)
Machine Learning vs Statistics
ML uses statistical theory to build models
A lot of ML is rediscovery of things statisticians already knew; often
disguised by differences in terminology
But the emphasis is very different:
I Good piece of statistics: Clever proof that relatively simple estimation
procedure is asymptotically unbiased.
I Good piece of ML: Demo that a complicated algorithm produces
impressive results on a specific task.
Can view ML as applying computational techniques to statistical problems.
But go beyond typical statistics problems, with different aims (speed vs.
accuracy).
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 41 / 44
![Page 90: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/90.jpg)
Machine Learning vs Statistics
ML uses statistical theory to build models
A lot of ML is rediscovery of things statisticians already knew; often
disguised by differences in terminology
But the emphasis is very different:
I Good piece of statistics: Clever proof that relatively simple estimation
procedure is asymptotically unbiased.
I Good piece of ML: Demo that a complicated algorithm produces
impressive results on a specific task.
Can view ML as applying computational techniques to statistical problems.
But go beyond typical statistics problems, with different aims (speed vs.
accuracy).
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 41 / 44
![Page 91: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/91.jpg)
Machine Learning vs Statistics
ML uses statistical theory to build models
A lot of ML is rediscovery of things statisticians already knew; often
disguised by differences in terminology
But the emphasis is very different:
I Good piece of statistics: Clever proof that relatively simple estimation
procedure is asymptotically unbiased.
I Good piece of ML: Demo that a complicated algorithm produces
impressive results on a specific task.
Can view ML as applying computational techniques to statistical problems.
But go beyond typical statistics problems, with different aims (speed vs.
accuracy).
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 41 / 44
![Page 92: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/92.jpg)
Machine Learning vs Statistics
ML uses statistical theory to build models
A lot of ML is rediscovery of things statisticians already knew; often
disguised by differences in terminology
But the emphasis is very different:
I Good piece of statistics: Clever proof that relatively simple estimation
procedure is asymptotically unbiased.
I Good piece of ML: Demo that a complicated algorithm produces
impressive results on a specific task.
Can view ML as applying computational techniques to statistical problems.
But go beyond typical statistics problems, with different aims (speed vs.
accuracy).
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 41 / 44
![Page 93: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/93.jpg)
Machine Learning vs Statistics
ML uses statistical theory to build models
A lot of ML is rediscovery of things statisticians already knew; often
disguised by differences in terminology
But the emphasis is very different:
I Good piece of statistics: Clever proof that relatively simple estimation
procedure is asymptotically unbiased.
I Good piece of ML: Demo that a complicated algorithm produces
impressive results on a specific task.
Can view ML as applying computational techniques to statistical problems.
But go beyond typical statistics problems, with different aims (speed vs.
accuracy).
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 41 / 44
![Page 94: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/94.jpg)
Cultural gap (Tibshirani)
MACHINE LEARNINGweights
learning
generalization
supervised learning
unsupervised learning
large grant: $1,000,000
conference location:Snowbird, French Alps
STATISTICS
parameters
fitting
test set performance
regression/classification
density estimation, clustering
large grant: $50,000
conference location: Las Vegas inAugust
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 42 / 44
![Page 95: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/95.jpg)
Course Survey
Please complete the following survey this week:https://docs.google.com/forms/d/e/
1FAIpQLScd5JwTrh55gW-O-5UKXLidFPvvH-XhVxr36AqfQzsrdDNxGQ/
viewform?usp=send_form
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 43 / 44
![Page 96: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/96.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
![Page 97: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/97.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
![Page 98: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/98.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
![Page 99: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/99.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
![Page 100: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/100.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
![Page 101: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/101.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
![Page 102: CSC 411: Lecture 01: Introduction](https://reader035.vdocuments.us/reader035/viewer/2022062413/5867ab3f1a28abb4408bdb06/html5/thumbnails/102.jpg)
Initial Case Study
What grade will I get in this course?
Data: entry survey and marks from this and previous years
Process the data
I Split into training set; and test setI Determine representation of input;I Determine the representation of the output;
Choose form of model: linear regression
Decide how to evaluate the system’s performance: objective function
Set model parameters to optimize performance
Evaluate on test set: generalization
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44