SB2b Statistical Machine Learning
Hilary Term 2017

Mihaela van der Schaar and Seth Flaxman
Guest lecturer: Yee Whye Teh

Department of Statistics, Oxford

Slides and other materials available at:
http://www.oxford-man.ox.ac.uk/~mvanderschaar/home_page/course_ml.html



Administrative details

Course Structure

MMath Part B & MSc in Applied Statistics

Lectures:
  Wednesdays 12:00-13:00, LG.01.
  Thursdays 16:00-17:00, LG.01.

MSc:
  4 problem sheets, discussed at the classes: weeks 2, 4, 6, 7 (check website)

Part C:
  4 problem sheets
  Class Tutors: Lloyd Elliott, Kevin Sharp, and Hyunjik Kim
  Please sign up for the classes on the sign up sheet!


Course Aims

1. Understand statistical fundamentals of machine learning, with a focus on supervised learning (classification and regression) and empirical risk minimisation.

2. Understand difference between generative and discriminative learning frameworks.

3. Learn to identify and use appropriate methods and models for given data and task.

4. Learn to use the relevant R or Python packages to analyse data, interpret results, and evaluate methods.


Syllabus I

Part I: Introduction to supervised learning (4 lectures)
  Empirical risk minimization
  Bias/variance, Generalization, Overfitting, Cross validation
  Regularization
  Logistic regression
  Neural networks

Part II: Classification and regression (3 lectures)
  Generative vs. Discriminative models
  K-nearest neighbours, Maximum Likelihood Estimation, Mixture models
  Naive Bayes, Decision trees, CART
  Support Vector Machines
  Random forest, Bootstrap Aggregation (Bagging), Ensemble learning
  Expectation Maximization


Syllabus II

Part III: Theoretical frameworks

Statistical learning theory

Decision theory

Part IV: Further topics

Optimisation

Hidden Markov Models

Backward-forward algorithms

Reinforcement learning


Overview Statistical Machine Learning

What is Machine Learning?

http://gureckislab.org


What is Machine Learning?

Arthur Samuel, 1959
  Field of study that gives computers the ability to learn without being explicitly programmed.

Tom Mitchell, 1997
  Any computer program that improves its performance at some task through experience.

Kevin Murphy, 2012
  To develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest.



What is Machine Learning?

data

Information, Structure, Prediction, Decisions, Actions

Larry Page about DeepMind’s ML systems that can learn to play video games like humans


What is Machine Learning?

[Diagram: Machine Learning at the centre, surrounded by related fields: statistics, computer science, cognitive science, psychology, mathematics, engineering, operations research, physics, biology, genetics, business, finance]


What is Data Science? Early years

John Tukey, The Future of Data Analysis, 1962

For a long time I have thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt. ... All in all I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.

Four driving forces, according to Tukey
  The formal theories of statistics
  Accelerating developments in computers...
  The challenge, in many fields, of more and ever larger bodies of data
  The emphasis on quantification in an ever wider variety of disciplines


What is Data Science?

Bin Yu, Let us own Data Science, IMS Presidential Address, 2014
  Statistics
  Domain/science knowledge
  Computing
  Collaboration/teamwork
  Communication to outsiders

David Donoho, 50 years of Data Science, 2015
  “Greater Data Science”:
    Data Exploration and Preparation
    Data Representation and Transformation
    Computing with Data
    Data Modeling
    Data Visualization and Presentation
    Science about Data Science


Statistics vs Machine Learning

Traditional Problems in Applied Statistics

  Well formulated question that we would like to answer.
  Expensive data gathering and/or expensive computation.
  Create specially designed experiments to collect high quality data.

Information Revolution
  Improvements in data processing and data storage.
  Powerful, cheap, easy data capturing.
  Lots of (low quality) data with potentially valuable information inside.

CS and Stats forced back together: unified framework of data, inferences, procedures, algorithms
  statistics taking computation seriously
  computing taking statistical risk seriously

Michael I. Jordan: On the Computational and Statistical Interface and "Big Data"
Max Welling: Are Machine Learning and Statistics Complementary?


Overview Types of Machine Learning

Types of Machine Learning

Unsupervised learning

Extract key features of the “unlabelled” data
  clustering, signal separation, density estimation
  Goal: representation, hypothesis generation, visualization

Supervised learning

Data contains “labels”: every example is an input-output pair
  classification, regression
  Goal: prediction on new examples


Types of Machine Learning

Semi-supervised Learning

A database of examples, only a small subset of which are labelled.

Multi-task Learning

A database of examples, each of which has multiple labels corresponding to different prediction tasks.

Reinforcement Learning

An agent acting in an environment, given rewards for performing appropriate actions, learns to maximize their reward.


Overview Supervised Learning

Supervised Learning

Unsupervised learning:
  To “extract structure” and postulate hypotheses about data generating process from “unlabelled” observations $x_1, \ldots, x_n$.
  Visualize, summarize and compress data.

Supervised learning:
  In addition to the observations of $X$, we have access to their response variables / labels $Y \in \mathcal{Y}$: we observe $\{(x_i, y_i)\}_{i=1}^n$.
  Types of supervised learning:
    Classification: discrete responses, e.g. $\mathcal{Y} = \{+1, -1\}$ or $\{1, \ldots, K\}$.
    Regression: a numerical value is observed and $\mathcal{Y} = \mathbb{R}$.

The goal is to accurately predict the response $Y$ on new observations of $X$, i.e., to learn a function $f : \mathbb{R}^p \to \mathcal{Y}$, such that $f(X)$ will be close to the true response $Y$.
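As a rough illustration of what "learning a function f from labelled pairs" can look like, here is a minimal sketch using the 1-nearest-neighbour rule. The choice of rule, the helper names and the toy data are assumptions made for this example, not something prescribed by the slides. The same fitting routine serves both classification (labels in {+1, -1}) and regression (real-valued labels).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors in R^p."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fit_nearest_neighbour(xs, ys):
    """Return a function f that predicts the label of the closest training point."""
    def f(x_new):
        nearest = min(range(len(xs)), key=lambda i: squared_distance(xs[i], x_new))
        return ys[nearest]
    return f


# Toy training inputs in R^2 (made up for illustration).
x_train = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2)]

# Classification: discrete responses in {+1, -1}.
y_class = [-1, -1, +1, +1]
classify = fit_nearest_neighbour(x_train, y_class)

# Regression: real-valued responses for the same inputs.
y_reg = [0.1, 0.3, 2.0, 2.2]
regress = fit_nearest_neighbour(x_train, y_reg)

print(classify((0.1, 0.0)))  # label of the closest training point
print(regress((1.0, 1.1)))   # response of the closest training point
```

Whether the prediction is "close to the true response" on new data is exactly the generalization question taken up in Part I of the syllabus.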


Applications of Machine Learning

spam filtering
recommendation systems
fraud detection
self-driving cars
image recognition
stock market analysis

ImageNet: Computer Eyesight Gets a Lot More Accurate, Krizhevsky et al, 2012
New applications of ML: Machine Learning is Eating the World


Machine learning in practice

Spam detection

Observations $X$ are text documents
Labels $Y$ are spam = +1 and not spam = −1.
How do we encode documents of different lengths as a vector $X \in \mathbb{R}^p$?
Given a set of labelled documents $\{(x_i, y_i)\}_{i=1}^n$ how do we learn a function $f : \mathbb{R}^p \to \mathcal{Y}$?

Many answers to both questions will be covered in this course: logistic regression, naive Bayes, neural networks, Support Vector Machines, etc.
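One possible answer to the encoding question is a bag-of-words representation: fix a vocabulary of p words and count how often each occurs, so that documents of any length map to vectors of the same length p. The sketch below is an illustration under that assumption; the function names, the tokenisation rule and the toy documents are all made up for this example.

```python
import re


def tokenize(document):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", document.lower())


def build_vocabulary(documents):
    """Map each distinct word in the corpus to a column index 0..p-1."""
    words = sorted({w for doc in documents for w in tokenize(doc)})
    return {word: j for j, word in enumerate(words)}


def encode(document, vocabulary):
    """Return the word-count vector of a document; its length is p = len(vocabulary)."""
    x = [0] * len(vocabulary)
    for w in tokenize(document):
        if w in vocabulary:  # words not seen when building the vocabulary are ignored
            x[vocabulary[w]] += 1
    return x


docs = ["Win money now", "Meeting moved to noon", "Win win win a free prize now"]
labels = [+1, -1, +1]  # spam = +1, not spam = -1 (toy labels)

vocab = build_vocabulary(docs)
X = [encode(d, vocab) for d in docs]

for x, y in zip(X, labels):
    print(y, x)
```

Every row of `X` has the same length regardless of document length, so any of the classifiers listed above could in principle be trained on the pairs `(X[i], labels[i])`.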


Image classification

Observations $X$ are images
Labels $Y \in \{0, 1, \ldots, 9\}$
Learn a function $f : \mathbb{R}^p \to \mathcal{Y}$
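To make the domain of f concrete, one simple (assumed) encoding is to flatten an h-by-w grid of pixel intensities into a vector with p = h * w entries and rescale it to [0, 1]. The 3 x 3 grid below is made up; real images are of course much larger.

```python
def flatten(image):
    """Turn an h-by-w grid of pixel intensities into a vector of length p = h * w."""
    return [float(pixel) for row in image for pixel in row]


def rescale(vector, max_value=255.0):
    """Map raw intensities in [0, max_value] to [0, 1]."""
    return [v / max_value for v in vector]


# A made-up 3x3 grey-scale "image" showing a vertical stroke.
image = [[0, 255, 0],
         [0, 255, 0],
         [0, 255, 0]]

x = rescale(flatten(image))
print(len(x), x)
```

Flattening discards the spatial layout of the pixels, which is one reason other encodings are often preferred in practice.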


Face recognition

Observations $X$ are images
Labels $Y$ are a very large set of people: {Queen Elizabeth, Bill Gates, Justin Trudeau, Leonardo DiCaprio, etc.}
How do we encode images as vectors $X \in \mathbb{R}^p$?
Given a set of labelled images $\{(x_i, y_i)\}_{i=1}^n$ how do we learn a function $f : \mathbb{R}^p \to \mathcal{Y}$?

Fundamentally harder or different than image classification?


Face detection

Farfade, Saberian, and Li 2015 https://arxiv.org/pdf/1502.02766v3.pdf


Observations X are images

What are the labels Y?

How should our function f work?
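One way to think about these questions, offered as an assumption rather than as the method of the cited paper: treat every window of the image as its own example with a binary label (face / not face), and let f assign a score to each window. The sketch below slides a window over a toy image and collects the scores into a grid; `mean_brightness` is a hypothetical stand-in for a learned classifier.

```python
def window_scores(image, size, score):
    """Apply `score` to every size-by-size window and return the grid of scores."""
    height, width = len(image), len(image[0])
    heatmap = []
    for top in range(height - size + 1):
        row = []
        for left in range(width - size + 1):
            patch = [r[left:left + size] for r in image[top:top + size]]
            row.append(score(patch))
        heatmap.append(row)
    return heatmap


def mean_brightness(patch):
    """Hypothetical scorer: a real detector would use a trained face/non-face model."""
    values = [v for r in patch for v in r]
    return sum(values) / len(values)


# Made-up 4x4 image with a bright 2x2 block in the middle.
image = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]

heatmap = window_scores(image, 2, mean_brightness)
for row in heatmap:
    print(row)
```

Under this framing the output for one image is a whole grid of scores rather than a single label, and windows with high scores are the candidate detections.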


Machine translation

Kyunghyun Cho https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-gpus-part-3/

Observations X are sentences in language A

Labels Y are sentences in language B

How should we encode X and Y numerically?

Is this regression or classification?
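A simple (assumed) numerical encoding is to give every distinct token an integer id, so that a sentence becomes a variable-length sequence of ids. The sketch below uses made-up sentence pairs and whitespace tokenisation; the names `build_index`, `encode` and the `<unk>` token are choices made for this example.

```python
def build_index(sentences):
    """Assign an integer id to every distinct token; id 0 is reserved for unknown tokens."""
    index = {"<unk>": 0}
    for sentence in sentences:
        for token in sentence.lower().split():
            index.setdefault(token, len(index))
    return index


def encode(sentence, index):
    """Encode a sentence as a list of token ids (0 for tokens not in the index)."""
    return [index.get(token, 0) for token in sentence.lower().split()]


# Made-up parallel sentences in language A and language B.
source = ["the cat sleeps", "the dog sleeps"]
target = ["le chat dort", "le chien dort"]

src_index = build_index(source)
tgt_index = build_index(target)

x = encode("the dog sleeps", src_index)
y = encode("le chien dort", tgt_index)
print(x, y)
```

With this encoding both X and Y are sequences of discrete symbols of varying length, so the task fits neither plain regression nor single-label classification; one common view is to treat it as a sequence of classification decisions over the target vocabulary.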


Speech recognition

Dahl et al. 2012


Self-driving cars

27 million connections and 250 thousand parameters
devblogs.nvidia.com/parallelforall/deep-learning-self-driving-cars/


Product recommendation

Fully observe all user interactions on a website (what pages they view, what items they buy, what reviews they leave, etc.)
What products should be recommended to them? On which websites?
How can you phrase this as supervised learning?
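One possible phrasing, sketched under assumptions: make one example per (user, item) pair, use observed behaviour such as the number of views as the input x, and use "bought or not" as the label y. The log format, field names and feature choice below are hypothetical.

```python
# Hypothetical interaction log: (user, item, action).
log = [
    ("ann", "book", "view"),
    ("ann", "book", "view"),
    ("ann", "book", "buy"),
    ("ann", "lamp", "view"),
    ("bob", "lamp", "view"),
    ("bob", "lamp", "buy"),
    ("bob", "book", "view"),
]


def make_examples(log):
    """Build one example per (user, item): x = [number of views], y = +1 if bought else -1."""
    views, bought = {}, set()
    for user, item, action in log:
        key = (user, item)
        views.setdefault(key, 0)
        if action == "view":
            views[key] += 1
        elif action == "buy":
            bought.add(key)
    return [(key, [views[key]], +1 if key in bought else -1) for key in sorted(views)]


examples = make_examples(log)
for key, x, y in examples:
    print(key, x, y)
```

A classifier trained on such pairs could then score items a user has not yet bought, and the highest-scoring items become the recommendations. Richer features (reviews, pages viewed, item attributes) would slot into x in the same way.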


Machine learning in practice Software

Software

R
Python: scikit-learn, mlpy, Theano
Weka, mlpack, Torch, Shogun, TensorFlow...
Matlab/Octave


Machine learning advances in 2016 and challenges ahead

2016:
  Free/open source software for deep learning: TensorFlow (Google), CNTK (Microsoft), PaddlePaddle (Baidu), MXNet (Amazon)
  Audio generation
  Go
  Advances in machine translation (Google Translate)

2017 and beyond:
  Increasing concern about, regulation of algorithms
  Transparency / explainability in machine learning
  Effect of increasing automation of work on society
  Medical advances?