Deep Learning Primer - a brief introduction

Deep Learning Primer, Anantharaman Narayana Iyer, 7th June 2014

Upload: ananth

Post on 24-May-2015

Category:

Data & Analytics



DESCRIPTION

Deep learning is receiving phenomenal attention due to breakthrough results in several AI tasks and significant research investment by top technology companies such as Google, Facebook, Microsoft, and IBM. For someone who has not been introduced to this technology, it may be daunting to learn concepts such as feature learning, Restricted Boltzmann Machines, and autoencoders all at once and then apply them to their own AI applications. This presentation is the first of several in a series intended for practitioners.

TRANSCRIPT

Page 1: Deep Learning Primer - a brief introduction

Deep Learning Primer

Anantharaman Narayana Iyer, 7th June 2014

Page 2: Deep Learning Primer - a brief introduction

What is Deep Learning? Deep learning is a machine learning technique distinguished by 2 defining characteristics:

1. Deep Architecture
• Multiple layers of learning.
• Methodologies to train these layers that get close to the global optimum, alleviating the effect of local minima arising from a non-convex objective function.

2. Feature Learning (aka Representation Learning)
• Traditional machine learning system designs, such as logistic regression, involve manual feature design. In contrast, a deep learning system automatically learns the features given the input.

[Diagram: Input → Automatic Feature Extraction → Features → Machine Learning System → Output]
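The two defining characteristics can be sketched in a few lines of NumPy: each layer applies a nonlinear transformation to the preceding layer's output, so the network computes progressively more abstract features. This is a minimal sketch; the layer sizes, random weights, and tanh nonlinearity are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def layer(x, W, b):
    """One layer: affine transform followed by a nonlinearity (tanh)."""
    return np.tanh(W @ x + b)

rng = np.random.default_rng(0)
x = rng.standard_normal(8)            # raw input (e.g. pixel values)

# Two stacked layers: each consumes the previous layer's output,
# so the learned representation becomes progressively more abstract.
W1, b1 = rng.standard_normal((5, 8)), np.zeros(5)
W2, b2 = rng.standard_normal((3, 5)), np.zeros(3)

h1 = layer(x, W1, b1)                 # first learned representation
h2 = layer(h1, W2, b2)                # more abstract representation
```

In a trained network the weights `W1`, `W2` would be learned from data rather than drawn at random; the point here is only the layered, nonlinear composition.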

Page 3: Deep Learning Primer - a brief introduction

Why is there a phenomenal interest?
• Considered the next big thing in Machine Learning by several experts

• Breakthrough results reported in:
– Speech Recognition
• Microsoft Audio Video Indexing Service (MAVIS) reduced word error rates by about 30% on 4 major benchmarks
– Object Recognition
• MNIST digit recognition: error rate 0.27%
• Successful image recognition by Google
– Natural Language Processing
• The SENNA system reported state-of-the-art results in tasks like POS tagging, chunking, and Named Entity Recognition
• Substantial investment in this technology recently by top technology companies

Page 4: Deep Learning Primer - a brief introduction

Building a deep learning system
• Many ways to build a deep learning system, with the defining characteristics being:
– Multiple layers, where each layer performs a nonlinear transformation of the output generated by its preceding layer
– Automatic feature learning, where the features are progressively more abstract
– Hierarchical in nature
• Broad approaches/categorizations
– Unsupervised or generative models
– Supervised discriminative models
– Hybrid (use an unsupervised model as an aid to perform superior discrimination)
• Common building blocks for unsupervised and hybrid approaches
– Restricted Boltzmann Machines (RBM)
– Autoencoders
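As a rough illustration of the autoencoder building block named above, here is a minimal NumPy sketch of an autoencoder's reconstruction objective. The tied weights (decoding with the transpose of the encoding matrix), the sigmoid nonlinearity, and the layer sizes are simplifying assumptions for this sketch; the slides do not specify them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_loss(x, W, b, c):
    """Reconstruction loss of a tied-weight autoencoder:
    encode with W, decode with W.T, and compare to the input itself."""
    h = sigmoid(W @ x + b)        # hidden code: the learned features
    x_hat = sigmoid(W.T @ h + c)  # reconstruction of the input
    return np.mean((x - x_hat) ** 2)

rng = np.random.default_rng(1)
x = rng.random(6)                         # a toy input vector
W = 0.1 * rng.standard_normal((4, 6))     # 4 hidden < 6 inputs: a bottleneck
loss = autoencoder_loss(x, W, np.zeros(4), np.zeros(6))
```

Training would adjust `W`, `b`, and `c` to drive this loss down; because the hidden layer is narrower than the input, the code `h` is forced to capture the input's structure rather than copy it.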

Page 5: Deep Learning Primer - a brief introduction

Application Example
Problem: Suppose we need to build a deep learning system to detect whether a given digital image contains a human face. Inputs are the image pixels and the output is binary.
• We can think of the human face as composed of a few key facial constituents such as ears, eyes, and nose. These in turn can be thought of as contours with well-defined edges, which are themselves constituted by specific patterns of pixels.
• We think of this as generating edges from the input pixels, generating the facial aspects from the edges, and from those detecting a human face.
• The role of a hidden layer in this system is to perform a nonlinear transform of its inputs (a lower level of abstraction) and produce a more abstract output (e.g., generating a nose object from the given contours).
• Thus we progressively move up in abstraction, starting from raw pixels and ending up with a face object.

Page 6: Deep Learning Primer - a brief introduction

High level implementation steps
• Suppose we implement the given application as a deep neural network as follows:
– Pixel values constitute the input layer
– A single output unit constitutes the output layer
– We will have 2 hidden layers
• We will use a stacked autoencoder as the basic building block.
– An autoencoder (AE) neural network learns, via unsupervised learning, to produce an output that is the same as its input. Thus, given pixel values x as input, the goal of the AE is to produce an output image that is the same as the input.
– As we have 2 hidden layers, we will require 2 AEs, say AE1 and AE2. We will create a bottleneck by having fewer hidden units than input units.
• Layerwise pretraining
– Train AE1 unsupervised with the available images (which may or may not contain a human face). The outputs of AE1's hidden units now constitute the "learnt" features at an abstraction higher than the input pixels (e.g., edges from pixels).
– Cascade the output of the hidden layer of AE1 into AE2 and train AE2 to learn more abstract features (e.g., facial components from edges)
• Add a logistic regression layer as the output layer and stack the 2 AEs and the output layer to constitute a neural network
• Fine-tune this network using backpropagation with a smaller number of labeled images
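The pretraining and stacking steps above could be sketched roughly as follows in NumPy. This is a toy sketch, not a faithful implementation: the linear decoder, learning rate, epoch count, layer sizes, and random data are all simplifying assumptions, and the final supervised fine-tuning pass is omitted for brevity (the logistic output weights are left at zero, to be trained on labeled images).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_autoencoder(X, n_hidden, lr=0.1, epochs=200):
    """Unsupervised pretraining of one autoencoder layer.
    Encoder: h = sigmoid(W x + b); linear decoder: x_hat = V h + c.
    Returns the encoder parameters and the hidden codes for X."""
    n_in = X.shape[1]
    W = 0.1 * rng.standard_normal((n_hidden, n_in))
    V = 0.1 * rng.standard_normal((n_in, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n_in)
    for _ in range(epochs):
        for x in X:
            h = sigmoid(W @ x + b)
            x_hat = V @ h + c
            e = x_hat - x                       # reconstruction error
            dh = (V.T @ e) * h * (1 - h)        # backprop through the encoder
            V -= lr * np.outer(e, h); c -= lr * e
            W -= lr * np.outer(dh, x); b -= lr * dh
    H = sigmoid(X @ W.T + b)                    # the "learnt" features for X
    return W, b, H

# Toy data standing in for flattened image pixels (no labels used here).
X = rng.random((50, 16))

# Step 1: train AE1 on raw pixels; its hidden units are the first features.
W1, b1, H1 = pretrain_autoencoder(X, n_hidden=8)
# Step 2: cascade -- train AE2 on AE1's hidden outputs.
W2, b2, H2 = pretrain_autoencoder(H1, n_hidden=4)

# Step 3: stack the two encoders with a logistic-regression output unit.
w_out, b_out = np.zeros(4), 0.0                 # to be fine-tuned supervised

def predict(x):
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    return sigmoid(w_out @ h2 + b_out)          # P(face | image)
```

Fine-tuning would then run backpropagation through the whole stack (both encoders plus the logistic layer) on the smaller labeled set, adjusting all parameters jointly.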