MIT 6.870 Grounding Object Recognition and Scene Understanding: Lecture 1

Page 1: MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1

Wednesdays 1-4pm, Room 13-1143
Instructor: Antonio Torralba
Email: [email protected]

6.870 Grounding object recognition and scene understanding

http://people.csail.mit.edu/torralba/courses/6.870/6.870.recognition.htm

Some slides are borrowed from other classes (see links on the course web site). Let me know if I forget to give credit to the right people.


http://groups.csail.mit.edu/vision/courses/6.869/


Grading

•  Class participation: 20%

•  Paper presentations: 40%

•  Course project: 40%


Course project

•  Topics for projects: a project can derive from one of the papers studied or from your own research.

•  Work individually or in pairs.

•  Results written up as a 4-page paper in CVPR format.

•  Short presentation at the end of the semester.


Paper presentations (40%)

Email me at the end of class to schedule the following week. We will first decide together how to structure the week.

•  Presenter:
   –  Present the key ideas, background material, and technical details.
   –  Show me the slides two days before the class.
   –  Test the basic ideas of the paper(s), using code available online or writing toy code.
   –  Create toy test problems that reveal something about the algorithm.
   –  Offer constructive criticism.


Readings  


Lecture 1: Class goals and a short introduction

6.870 Grounding object recognition and scene understanding


What is vision?

•  What does it mean, to see? “to know what is where by looking”.

•  How to discover from images what is present in the world, where things are, what actions are taking place.

from Marr, 1982


The importance of images

$100 million

“Dora Maar au Chat”, Pablo Picasso, 1941

Some images are more important than others


Why is vision hard?


The structure of ambient light


The Plenoptic Function

The intensity P can be parameterized as:

P(θ, φ, t, λ, X, Y, Z)

“The complete set of all convergence points constitutes the permanent possibilities of vision.” (Gibson)

Adelson & Bergen, 91
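
As an illustration that is not on the slide itself, familiar kinds of images can be read as low-dimensional slices of P, in the spirit of the examples Adelson & Bergen give. The labels below are a sketch under the assumption of an ideal pinhole camera at a fixed viewpoint (X, Y, Z):

```latex
\begin{align*}
\text{grayscale photograph:} \quad & P(\theta, \phi)
  && \text{fixed } t,\ X,\ Y,\ Z;\ \text{integrated over } \lambda \\
\text{color photograph:} \quad & P(\theta, \phi, \lambda)
  && \text{fixed } t,\ X,\ Y,\ Z \\
\text{movie (static camera):} \quad & P(\theta, \phi, t, \lambda)
  && \text{fixed } X,\ Y,\ Z \\
\text{all viewpoints, all times:} \quad & P(\theta, \phi, t, \lambda, X, Y, Z)
\end{align*}
```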


Why is vision hard?


Measuring light vs. measuring scene properties

We perceive two squares, one on top of the other.


Measuring light vs. measuring scene properties

by Roger Shepard (“Turning the Tables”)

Depth processing is automatic, and we cannot shut it down…


Measuring light vs. measuring scene properties

(c) 2006 Walt Anthony


Assumptions can be wrong

Ames room


By Aude Oliva


Why is vision hard?


Some things have strong variations in appearance


Some things know that you have eyes

Brady, M. J., & Kersten, D. (2003). Bootstrapped learning of novel objects. J Vis, 3(6), 413-422


A short history of vision


The early optimism


The crisis of the 80’s


Object recognition: is it really so hard?

Yes, object recognition is hard… (or at least it seems so for now…)


Challenges 1: viewpoint variation

Michelangelo 1475-1564


Challenges 2: illumination

slide credit: S. Ullman


Challenges 3: occlusion

Magritte, 1957


Challenges 4: scale


Challenges 5: deformation

Xu, Beihong 1943


Challenges 6: background clutter

Klimt, 1913


Challenges 7: intra-class variation


Challenges

Brady, M. J., & Kersten, D. (2003). Bootstrapped learning of novel objects. J Vis, 3(6), 413-422


Discover the camouflaged object

Brady, M. J., & Kersten, D. (2003). Bootstrapped learning of novel objects. J Vis, 3(6), 413-422


Any guesses?


So, let’s make the problem simpler: Block world

Nice framework to develop fancy math, but too far from reality…

Object Recognition in the Geometric Era: a Retrospective. Joseph L. Mundy. 2006


Object Recognition in the Geometric Era: a Retrospective. Joseph L. Mundy. 2006

Binford and generalized cylinders


Recognition by components

Irving Biederman. Recognition-by-Components: A Theory of Human Image Understanding. Psychological Review, 1987.


Recognition by components

The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of generalized-cone components, called geons (N = 36), can be derived from contrasts of five readily detectable properties of edges in a two-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination.

The “contribution lies in its proposal for a particular vocabulary of components derived from perceptual mechanisms and its account of how an arrangement of these components can access a representation of an object in memory.”


A do-it-yourself example

1)  We know that this object is nothing we know

2)  We can split this object into parts that everybody will agree on

3)  We can see how it resembles something familiar: “a hot dog cart”

“The naive realism that emerges in descriptions of nonsense objects may be reflecting the workings of a representational system by which objects are identified.”


Stages of processing

“Parsing is performed, primarily at concave regions, simultaneously with a detection of nonaccidental properties.”


Nonaccidental properties

Certain properties of edges in a two-dimensional image are taken by the visual system as strong evidence that the edges in the three-dimensional world contain those same properties.

Nonaccidental properties (Witkin & Tenenbaum, 1983): properties that would only rarely be produced by accidental alignments of viewpoint and object features, and that consequently are generally unaffected by slight variations in viewpoint.


Examples:

•  Collinearity

•  Smoothness

•  Symmetry

•  Parallelism

•  Cotermination


From generalized cylinders to GEONS

“From variation over only two or three levels in the nonaccidental relations of four attributes of generalized cylinders, a set of 36 GEONS can be generated.”

Geons represent a restricted form of generalized cylinders.


Objects and their geons


Scenes and geons

Mezzanotte & Biederman


The importance of spatial arrangement


Parts and Structure approaches

With a different perspective, these models focused more on the geometry than on defining the constituent elements:

•  Fischler & Elschlager 1973
•  Yuille ’91
•  Brunelli & Poggio ’93
•  Lades, v.d. Malsburg et al. ’93
•  Cootes, Lanitis, Taylor et al. ’95
•  Amit & Geman ’95, ’99
•  Perona et al. ’95, ’96, ’98, ’00, ’03, ’04, ’05
•  Felzenszwalb & Huttenlocher ’00, ’04
•  Crandall & Huttenlocher ’05, ’06
•  Leibe & Schiele ’03, ’04
•  Many papers since 2000

Figure from [Fischler & Elschlager 73]


But, despite promising initial results… things did not work out so well (lack of data, lack of processing power, lack of reliable methods for low-level and mid-level vision).

Instead, a different way of thinking about object detection started making some progress: learning-based approaches and classifiers, which ignored low- and mid-level vision.

Maybe the time has come to return to some of the earlier models, more grounded in intuitions about visual perception.


Renewed optimism


Neocognitron

Fukushima (1980). Hierarchical multilayered neural network

S-cells work as feature-extracting cells. They resemble simple cells of the primary visual cortex in their response.

C-cells, which resemble complex cells in the visual cortex, are inserted in the network to allow for positional errors in the features of the stimulus. The input connections of C-cells, which come from S-cells of the preceding layer, are fixed and invariable. Each C-cell receives excitatory input connections from a group of S-cells that extract the same feature, but from slightly different positions. The C-cell responds if at least one of these S-cells yields an output.
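
A minimal sketch of the "responds if at least one S-cell fires" rule, written as a maximum over a local neighborhood of S-cell responses. This is a simplification chosen for illustration: Fukushima's C-cells use fixed excitatory connections with a saturating response, not a hard maximum, and the function name, pooling size, and toy inputs below are assumptions.

```python
import numpy as np

def c_cell_layer(s_map, pool=3):
    """Pool an S-cell response map over non-overlapping (pool x pool) neighborhoods.

    Each output (C-cell) is the maximum S-cell response in its neighborhood,
    so it is positive whenever at least one of those S-cells fired.
    """
    h, w = s_map.shape
    out_h, out_w = h // pool, w // pool
    # Crop so the map tiles evenly, then view it as a grid of blocks.
    blocks = s_map[:out_h * pool, :out_w * pool].reshape(out_h, pool, out_w, pool)
    return blocks.max(axis=(1, 3))

# The same feature detected at two slightly different positions.
s_a = np.zeros((6, 6))
s_a[1, 1] = 1.0
s_b = np.zeros((6, 6))
s_b[2, 2] = 1.0  # shifted by one pixel, still inside the same 3x3 neighborhood

c_a = c_cell_layer(s_a)
c_b = c_cell_layer(s_b)
```

Both inputs produce the same C-cell map, which is the tolerance to positional error that the slide describes.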


Neocognitron  

Learning is done greedily for each layer


Convolutional Neural Network

The output neurons share all the intermediate levels

LeCun et al., 98


Face detection and the success of learning-based approaches

•  The representation and matching of pictorial structures - Fischler, Elschlager (1973)
•  Face recognition using eigenfaces - M. Turk and A. Pentland (1991)
•  Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995)
•  Graded Learning for Object Detection - Fleuret, Geman (1999)
•  Robust Real-time Object Detection - Viola, Jones (2001)
•  Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)
•  …


Faces everywhere

http://www.marcofolio.net/imagedump/faces_everywhere_15_images_8_illusions.html


Feret dataset, 1996 DARPA

The face age


Rapid Object Detection Using a Boosted Cascade of Simple Features

Paul Viola, Michael J. Jones
Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA

Most of this work was done at Compaq CRL before the authors moved to MERL

Manuscript available on web:
http://citeseer.ist.psu.edu/cache/papers/cs/23183/http:zSzzSzwww.ai.mit.eduzSzpeoplezSzviolazSzresearchzSzpublicationszSzICCV01-Viola-Jones.pdf/viola01robust.pdf


Haar-like filters and cascades

Viola and Jones, ICCV 2001

Using an integral image, the sum (and hence the average) of the intensities in a block is computed from four lookups, independently of the block size.

Also Fleuret and Geman, 2001
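
A minimal sketch of the four-lookup block sum using an integral image (summed-area table), the device behind Viola and Jones's Haar-like features. The function names, the zero-padding convention, and the particular two-rectangle feature are illustrative assumptions, not taken from the slides or the paper's code.

```python
import numpy as np

def integral_image(img):
    """Return ii with ii[r, c] = sum of img[:r, :c] (extra zero row and column)."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def block_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] from four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: left half minus right half of a block (even width)."""
    half = width // 2
    return (block_sum(ii, top, left, height, half)
            - block_sum(ii, top, left + half, height, half))

img = np.arange(36, dtype=np.float64).reshape(6, 6)
ii = integral_image(img)

s = block_sum(ii, 1, 2, 3, 4)          # sum of img[1:4, 2:6]
mean = s / (3 * 4)                     # average intensity in that block
f = two_rect_feature(ii, 0, 0, 4, 4)   # img[0:4, 0:2].sum() - img[0:4, 2:4].sum()
```

The integral image is built once per image; after that, every block sum costs the same four lookups whatever the block size, which is what makes evaluating many filters at many scales cheap.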


Face detection


Families of recognition algorithms

Bag of words models
•  Csurka, Dance, Fan, Willamowski, and Bray 2004
•  Sivic, Russell, Freeman, Zisserman, ICCV 2005

Voting models
•  Viola and Jones, ICCV 2001
•  Heisele, Poggio, et al., NIPS 01
•  Schneiderman, Kanade 2004
•  Vidal-Naquet, Ullman 2003

Constellation models
•  Fischler and Elschlager, 1973
•  Burl, Leung, and Perona, 1995
•  Weber, Welling, and Perona, 2000
•  Fergus, Perona, & Zisserman, CVPR 2003

Rigid template models
•  Sirovich and Kirby 1987
•  Turk, Pentland, 1991
•  Dalal & Triggs, 2006

Shape matching, deformable models
•  Berg, Berg, Malik, 2005
•  Cootes, Edwards, Taylor, 2001


Scene understanding

•  Carbonetto, de Freitas & Barnard (2004)
•  Kumar, Hebert (2005)
•  Torralba, Murphy, Freeman (2004)
•  Fink & Perona (2003)
•  Sudderth, Torralba, Willsky, Freeman (2005)
•  Hoiem, Efros, Hebert (2005)
•  Torralba, Sinha (2001)
•  Rabinovich et al. (2007)
•  Heitz and Koller (2008)
•  Desai, Ramanan, and Fowlkes (2009)
•  Choi, Lim, Torralba, Willsky (2010)


NSF Frontiers in computer vision workshop, 2011


MobilEye


Demo: Google Goggles


The labeling crisis

[Image annotated with labels: DUCK (three), PERSON (four), GRASS, TREE, LAKE, BENCH, PATH, SKY, SIGN]


So what does object recognition involve?

Slide by Fei-Fei, Fergus, Torralba


Verification: is that a lamp?

Slide by Fei-Fei, Fergus, Torralba


Detection: are there people?

Slide by Fei-Fei, Fergus, Torralba


Identification: is that Potala Palace?

Slide by Fei-Fei, Fergus, Torralba


Object categorization

[Image annotated with labels: mountain, building, tree, banner, vendor, people, street lamp]

Slide by Fei-Fei, Fergus, Torralba


Scene and context categorization

•  outdoor
•  city
•  …

Slide by Fei-Fei, Fergus, Torralba


Is this space large or small? How far are the buildings in the back?

Slide by Fei-Fei, Fergus, Torralba


Activity

What is this person doing? What are these two doing?

Slide by Fei-Fei, Fergus, Torralba


What are we tuned to?

The visual system is tuned to process structures typically found in the world.


The visual system seems to be tuned to a set of images:

Demo inspired from D. Field


Remember these images


Did you see this image?


Test 2

Remember these images


Did you see this image?


Data

Human vision
•  Many input modalities
•  Active
•  Supervised, unsupervised, semi-supervised learning. It can look for supervision.

Robot vision
•  Many poor input modalities
•  Active, but it does not go far

Internet vision
•  Many input modalities
•  It can reach everywhere
•  Tons of data


Kinect


CSE 576, Spring 2008: Stereo matching

Active stereo with structured light

Project “structured” light patterns onto the object
•  simplifies the correspondence problem

[Figure labels: camera 1, camera 2, projector; camera 1, projector]

Li Zhang’s one-shot stereo

Slide credit: Rick Szeliski

Li Zhang, Brian Curless, and Steven M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. In Proceedings of the 1st International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT), Padova, Italy, June 19-21, 2002, pp. 24-36.


Willow Garage

http://www.willowgarage.com/pages/pr2/overview


Class goals

•  Vision and language

•  Vision and robotics

•  Vision and others

To provide the right vision tools for non-experts in vision.

Thinking about the tasks to find new representations.

The strategies our visual system uses are tuned to our visual world