cs231m mobile computer vision - stanford university · presentation per team evaluation: •clarity...
TRANSCRIPT
![Page 1: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/1.jpg)
CS231M · Mobile Computer VisionSpring 2015
Instructors: - Prof Silvio Savarese (Stanford, CVGLab)
- Dr Kari Pulli (Light)
CA: - Saumitro Dasgupta [email protected]
![Page 2: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/2.jpg)
Agenda
• Administrative– Requirements
– Grading policy
• Mobile Computer Vision
• Syllabus & Projects
![Page 3: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/3.jpg)
Structure of the course
• First part: – Familiarize with android mobile platform
– Work on two programming assignments on the mobile platform
• Second part:– Teams will work on a final project
– Teams present in class 1-2 state-of-the-art papers on mobile computer vision
![Page 4: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/4.jpg)
What you will learn
• How to program on the Android development platform
• State-of-the computer vision algorithms for mobile platforms
• Implement a working vision-based app on a mobile device
![Page 5: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/5.jpg)
What you need to do
• Two programming assignments [40%]
• Course project [40%]
• Team presentations in class [15%]
• Class participation [5%]
• No midterms no finals!
![Page 6: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/6.jpg)
Programming assignments [40%]
• Problem-0: warm up assignment (it won’t be graded)
• Self-calibration; it will guide you through installing the Android platform and help you run a simple toy application
• Two problem sets [20% each]
• Each graded problem set includes – a programming assignment [15%]
– a write-up [5%]
• Two topics:– Feature detection, descriptors, image matching, panorama
and HDR image construction– Object and landmark recognition; Image classification
Skeleton code for each PA will be available with links to library
![Page 7: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/7.jpg)
Programming assignments [40%]
• Programming assignments will be implemented on Nvidia Shield: a Tegra-based (K1) Android tablet– Simulators can only be used for debugging
• NVIDIA tablets are available for each student who is taking the course for credit.
• Tablets kindly lent by Nvidia
• Supporting material/tutorials will be based upon the Android platform
![Page 8: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/8.jpg)
Programming assignments [40%]
• Important dates:
– Problem-0• Released: 4/6 (next Monday)• Due: 4/13
– First problem assignment:• Released on 4/13 • Due on 4/24
– Second problem assignment:• Released on 4/24 • Due on 5/6
![Page 9: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/9.jpg)
Course Project [40%]
• Goal: implement a computer vision application on a mobile platform
• We encourage students to use the NVIDIA Tegra-based Android tablet
• Students can use iOS for final project (but please let us know if this is the case)
• NOTE: programming assignments must be completed on android
• Simulator can only be used for debugging
![Page 10: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/10.jpg)
Course Project [40%]
• Teams: 1-3 people per team
– The quality of your project will be judged regardless of the number of people on the team
– Be nice to your partner: do you plan to drop the course?
![Page 11: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/11.jpg)
Course Project [40%]
• Evaluation:
• Proposal report 10%• Final report 20% • Final presentation 10%
![Page 12: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/12.jpg)
Course Project [40%]
–Project proposal report due on 5/8
–Project presentations on 5/27, 6/1, 6/3
– Final project due: 6/7
• Important dates:
![Page 13: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/13.jpg)
Some examples of projects are:
• Recover the 3D layout of a room and augment it with new IKEA furniture
• Recognize your friend's face and link it to your friends on Facebook
• Localize yourself in a google map and visualize the closest restaurant on the smart phone's display
• Detect and face and turn it into a cartoon (and share it with friends)
• Create HDR panoramic images• Recognize landmarks on the Stanford campus (e.g.:
Memorial Church) and link it to relevant info from the web (Wikipedia, photos from other users, etc...)
![Page 14: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/14.jpg)
Presentations in class [15%]
– Each student team will present 1-2 state-of-the-art papers on mobile computer vision
– Topics are pre-assigned but student teams can bid to present a paper of interest.
– Paper topics will be uploaded to the course syllabus soon.
– We are currently estimating a 15-30 minutes presentation per team
Evaluation:• Clarity of the presentation• Ability to master the topic• Ability to answer questions
![Page 15: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/15.jpg)
Class participation [5%]
• Participate in class by attending, asking questions and participate in class discussions
• During lectures• During team presentations
• In-class and piazza participation both count.
• Quantity and quality of your questions will be used for evaluating class participation.
![Page 16: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/16.jpg)
Late policy
• No late submission for problem sets and Project
• Two “24-hours one-time late submission bonus" are available; • that is, you can use this bonus to submit your PA
late after at most 24 hours. After you use your bonuses, you must submit on your assignment on time
• NOTE: 24-hours bonuses are not available for projects; project reports must be submitted in time.
![Page 17: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/17.jpg)
Prerequisites
• CS131A, CS231A, CS232 or equivalent
• Familiar with C++ and JAVA
![Page 18: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/18.jpg)
Collaboration policy
– Read the student code book, understand what is ‘collaboration’ and what is ‘academic infraction’.
– Discussing project assignment with each other is allowed, but coding must be done individually
– Using on line presentation material (slides, etc…) is not allowed in general. Exceptions can be made and individual cases will be discussed with the instructor.
– On line software/code can be used but students must consult instructor beforehand. Failing to communicate this to the instructor will result to a penalty
![Page 19: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/19.jpg)
Agenda
• Administrative– Requirements
– Grading policy
• Advanced topics in computer vision
• Syllabus & Projects
![Page 20: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/20.jpg)
20
From the movie Minority Report, 2002
![Page 21: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/21.jpg)
211990 20102000
Computer vision and Mobile Applications
Kooaba
A9
Kinect
EosSystems Autostich
Google Goggles
• More powerful machine vision• Better clouds • Increase computational power• More bandwidth
![Page 22: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/22.jpg)
Panoramic Photography
kolor
![Page 23: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/23.jpg)
HDR
23
Intellsys
![Page 24: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/24.jpg)
Digital photography
24Adobe Photoshop
![Page 25: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/25.jpg)
25
3D modeling of landmarks
![Page 26: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/26.jpg)
VLSAM
26
By Johnny Lee
![Page 27: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/27.jpg)
27
Augmented reality
![Page 28: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/28.jpg)
Fingerprint biometrics
![Page 29: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/29.jpg)
29
Visual search and landmarks recognition
![Page 31: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/31.jpg)
Face detection
![Page 32: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/32.jpg)
321990 20102000
Kooaba
A9
Kinect
EosSystems Autostich
Google Goggles
Computer vision and Mobile Applications
![Page 33: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/33.jpg)
EosSystems
331990 20102000
2D
3D
Google Goggles
![Page 34: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/34.jpg)
341990
EosSystems
20102000
2D
3D
Google Goggles
![Page 35: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/35.jpg)
40
Current state of computer vision
3D Reconstruction• 3D shape recovery • 3D scene reconstruction • Camera localization• Pose estimation
2D Recognition• Image matching• Object detection• Texture classification• Activity recognition
Mobile computer vision
![Page 36: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/36.jpg)
Embedded systems
A special purpose computer system enclosed or encapsulated within a physical system
They are everywhere today!• Consumer electronics• Communication• Entertainment• Transportation• Health• Home appliances
Co
urt
esy
of
Pro
f Ti
m C
hen
g
![Page 37: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/37.jpg)
• Video-assisted robots• Medical imaging devices
Examples of embedded systems
Co
urt
esy
of
Pro
f Ti
m C
hen
g
![Page 38: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/38.jpg)
• Video-assisted robots• Medical imaging devices• Autonomous cars
Examples of embedded systems
![Page 39: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/39.jpg)
Co
urt
esy
of
Pro
f Ti
m C
hen
g
![Page 40: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/40.jpg)
• Video-assisted robots• Medical imaging devices• Autonomous cars• Smart phones• Tablets• Glasses
Examples of embedded systems
![Page 41: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/41.jpg)
Smart phones & tablets:common characteristics
![Page 42: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/42.jpg)
• Accelerometer
• Compass/gyroscope
![Page 43: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/43.jpg)
• Nvidia Tegra-K1 based Android tablet
![Page 44: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/44.jpg)
Operating systems
Android OS
• Version 5 (lollipop)
![Page 45: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/45.jpg)
Challenges
• The mobile system has limited: computational power - bandwidth - memory
• Computer vision algorithms with
- What to compute on the client (features, tracks)- How much data must be transferred to the back end- What to compute on the back end- How much data must be transferred back to the client to visualize results
- Guaranteed (high)accuracy- Efficient (use little computational power/memory)- Fast (possibly real time)
• Many of these CV problems are still open
![Page 46: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/46.jpg)
Computing nodes (back end) (cloud, server)
The internet; on-line repositories
mobile system(client)
Client and Server paradigm
![Page 47: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/47.jpg)
In this class
• We will explore computer vision algorithms
– Meet challenges above
– Can be implemented on a mobile system
![Page 48: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/48.jpg)
Class organizationPart 1:• Review the Android architecture • Introduce Android development platform
Part 2:• Feature detection and descriptors• Image matching• Panorama and HDR images
Part 3:• Object detection, landmark recognition and image classification• Deep Learning and random forest
Part 4:• Tracking, VSLAM, and virtual augmentation
Part 5:• Special topics in mobile computer vision• Project discussion and presentations
![Page 49: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/49.jpg)
Class organizationPart 1:• Review the Android architecture • Introduce Android development platform
Part 2:• Feature detection and descriptors• Image matching• Panorama and HDR images
Part 3:• Object detection, landmark recognition and image classification• Deep Learning and random forest
Part 4:• Tracking, VSLAM, and virtual augmentation
Part 5:• Special topics in mobile computer vision• Project discussion and presentations
![Page 50: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/50.jpg)
•Detectors:•DoG• Harris
•Descriptors:• SURF: Speeded Up Robust Features• Implementation on mobile platforms
Features and descriptors
[Bay et al 06][Chen et al 07]
![Page 51: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/51.jpg)
Image matching
• M. Brown and D. Lowe, “recognizing panoramas”, 03• Yingen Xiong and Kari Pulli, "Fast Panorama Stitching for High-Quality Panoramic Images on Mobile Phones", IEEE Transactions on consumer electronics, 2010
![Page 52: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/52.jpg)
Automatic Panorama Stitching
Sources: M. Brown
• M. Brown and D. Lowe, “recognizing panoramas”, 03• Yingen Xiong and Kari Pulli, "Fast Panorama Stitching for High-Quality Panoramic Images on Mobile Phones", IEEE Transactions on consumer electronics, 2010
![Page 53: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/53.jpg)
Sources: M. Brown
• M. Brown and D. Lowe, “recognizing panoramas”, 03• Yingen Xiong and Kari Pulli, "Fast Panorama Stitching for High-Quality Panoramic Images on Mobile Phones", IEEE Transactions on consumer electronics, 2010
Automatic Panorama Stitching
![Page 54: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/54.jpg)
High Dynamic Range (HDR) images
LDR images
Weightmaps
Mertens, Kautz, van Reeth PG 2007
![Page 55: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/55.jpg)
Class organizationPart 1:• Review the Android architecture • Introduce Android development platform
Part 2:• Feature detection and descriptors• Image matching• Panorama and HDR images
Part 3:• Object detection, landmark recognition and image classification• Deep Learning and random forest
Part 4:• Tracking, VSLAM, and virtual augmentation
Part 5:• Special topics in mobile computer vision• Project discussion and presentations
![Page 56: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/56.jpg)
65
Visual search and landmarks recognition
![Page 57: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/57.jpg)
Courtesy of R. Szelisky and S. Seitz
Location/landmark recognitionQuack et al 08Hays & Efros 08Li et al 08
![Page 58: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/58.jpg)
Web repositories
(on the cloud; eg picasa)
Yeh et al 04Zheng et al 09
![Page 59: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/59.jpg)
Shape and object matching
A
• Shape Classification Using the Inner-Distance [Ling and Jacobs 07]
![Page 60: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/60.jpg)
- K. Grauman and T. Darrell 2005- S. Lazebnik et al, 2006- D. Nister et al. 2006,
Bag of words representations
- Accuracy- Efficiency- Scalability to large database
• Pyramid matching• Recognition with a Vocabulary Tree
![Page 61: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/61.jpg)
Deep learning and random forests
Courtesy of Laura Skelton
Hinton, Bengio, Lecun, Ng, Breiman, Amit, Geman, etc….
![Page 62: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/62.jpg)
Class organizationPart 1:• Review the Android architecture • Introduce Android development platform
Part 2:• Feature detection and descriptors• Image matching• Panorama and HDR images
Part 3:• Object detection, landmark recognition and image classification• Deep Learning and random forest
Part 4:• Tracking, VSLAM, and virtual augmentation
Part 5:• Special topics in mobile computer vision• Project discussion and presentations
![Page 63: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/63.jpg)
Features from videos
Courtesy of Jean-Yves Bouguet
• Features from videos• Descriptors from videos • On-line feature tracking
•Ferrari et al 01•Skrypnyk & Lowe 04•Takacs et al 07•Ta et al 09•Klein & Murray 09
![Page 64: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/64.jpg)
3D reconstruction and camera localization
• SFM• VSLAM
•Ferrari et al 01•Skrypnyk & Lowe 04•Takacs et al 07•Ta et al 09•Klein & Murray 09
Courtesy of Jean-Yves Bouguet
![Page 65: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/65.jpg)
Tracking and Virtual Reality insertions
"Server-side object recognition and client-side object tracking for mobile augmented reality", Stephan Gammeter , Alexander Gassmann, Lukas Bossard, Till Quack, and Luc Van Gool
![Page 66: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/66.jpg)
Class organizationPart 1:• Review the Android architecture • Introduce Android development platform
Part 2:• Feature detection and descriptors• Image matching• Panorama and HDR images
Part 3:• Object detection, landmark recognition and image classification• Deep Learning and random forest
Part 4:• Tracking, VSLAM, and virtual augmentation
Part 5:• Special topics in mobile computer vision• Project discussion and presentations
![Page 67: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/67.jpg)
Agenda
• Administrative– Requirements
– Grading policy
• Mobile Computer Vision
• Syllabus
![Page 68: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/68.jpg)
Next Lecture
• Overview of the Android platform & guiding examples by Dr Alejandro Troccoli (Nvidia)
• Please come and pick up your tablet!
– Today until 6pm, 126 Gates
– Tomorrow from 2-3pm, 126 Gates
![Page 69: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/69.jpg)
![Page 70: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/70.jpg)
Computer vision and Applications
79
2D
3D
EosSystems
Google Goggles
![Page 71: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/71.jpg)
Computer vision and Applications
80
2D
3D Newapplications
EosSystems
Google Goggles
![Page 72: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/72.jpg)
How to make this to work?
• Image classification
• Object detection
• Tracking
• Matching
• 3D reconstruction
Solve a number of challenging computer vision problems and implement them on a mobile system
![Page 73: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/73.jpg)
Does this image contain a car? [where?]
car
Detection & object recognition
Building
clock
person
![Page 74: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/74.jpg)
Building
clock
personcar
Detection:Which object does this image contain? [where?]
![Page 75: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/75.jpg)
clock
Does this image contain a clock? [where?]
Detection & object recognition
![Page 76: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/76.jpg)
![Page 77: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/77.jpg)
Challenges: illumination
image credit: J. Koenderink
![Page 78: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/78.jpg)
Challenges: scale
slide credit: Fei-Fei, Fergus & Torralba
![Page 79: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/79.jpg)
Challenges: deformation
![Page 80: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/80.jpg)
Challenges: occlusion
Magritte, 1957 slide credit: Fei-Fei, Fergus & Torralba
![Page 81: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/81.jpg)
Challenges: background clutter
Kilmeny Niland. 1995
![Page 82: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/82.jpg)
Challenges: viewpoint variation
Michelangelo 1475-1564 slide credit: Fei-Fei, Fergus & Torralba
![Page 83: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/83.jpg)
Challenges: intra-class variation
![Page 84: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/84.jpg)
Recognition
– Recognition task
– Search strategy: Sliding Windows
• Simple
• Computational complexity (x,y, S, , N of classes)- BSW by Lampert et al 08
- Also, Alexe, et al 10
Viola, Jones 2001,
![Page 85: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/85.jpg)
Recognition
– Recognition task
– Search strategy: Sliding Windows
• Simple
• Computational complexity (x,y, S, , N of classes)
• Localization• Objects are not boxes
- BSW by Lampert et al 08
- Also, Alexe, et al 10
Viola, Jones 2001,
![Page 86: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/86.jpg)
Recognition
– Recognition task
– Search strategy: Sliding Windows
• Simple
• Computational complexity (x,y, S, , N of classes)
• Localization• Objects are not boxes
• Prone to false positive
- BSW by Lampert et al 08
- Also, Alexe, et al 10
Non max suppression: Canny ’86….Desai et al , 2009
Viola, Jones 2001,
![Page 87: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/87.jpg)
Star models by Latent SVM
Felzenszwalb, McAllester, Ramanan, 08• Source code:
![Page 88: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/88.jpg)
• Visual codebook is used to index votes for object position
B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit
Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004
training image
visual codeword with
displacement vectors
Implicit shape models
Credit slide: S. Lazebnik
![Page 89: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/89.jpg)
Face Recognition• Digital photography
• Automatic face tagging
![Page 90: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/90.jpg)
The Viola/Jones Face Detector
• A “paradigmatic” method for real-time object detection
• Training is slow, but detection is very fast
• Extensions to mobile applications
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
![Page 91: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/91.jpg)
History: single object recognition
•No intra-class variation•High view point changes
Single 3D Object Recognition
![Page 92: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/92.jpg)
Lowe. ’99, ’04
•Handle severe occlusions•Fast!
Co
urtesy o
f D. Lo
we
Single 3D Object Recognition
Hsiao et al CVPR 10
![Page 93: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/93.jpg)
+ GPS
•Recognizing landmarks
Single 3D Object Recognition
![Page 94: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/94.jpg)
Co
urt
esy
of
Jean
-Yve
s B
ou
guet
–V
isio
n L
ab, C
alif
orn
ia In
stit
ute
of
Tech
no
logy
Tracking and 3D modeling
G. Klein and D. Murray. Improving the agility of keyframebased SLAM. In ECCV08, 2008.
G. Klein and D. Murray. Parallel tracking and mapping on a camera phone. In ISMAR’09, 2009
![Page 95: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/95.jpg)
Tracking and Virtual Reality insertions
"Server-side object recognition and client-side object tracking for mobile augmented reality", Stephan Gammeter , Alexander Gassmann, Lukas Bossard, Till Quack, and Luc Van Gool
![Page 96: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/96.jpg)
Detection and 3d Modeling
Min Sun, Gary Bradski, Bing-xin Xu, Silvio Savarese, Depth-Encoded Hough Voting for Joint ObjectDetection and Shape Recovery, ECCV 2009
![Page 97: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/97.jpg)
Min Sun, Gary Bradski, Bing-xin Xu, Silvio Savarese, Depth-Encoded Hough Voting for Joint ObjectDetection and Shape Recovery, ECCV 2009
Detection and 3d Modeling
![Page 98: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/98.jpg)
Automatic Panorama Stitching
Sources: M. Brown
![Page 99: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/99.jpg)
Sources: M. Brown
•M. Brown and D. Lowe, “recognizing panoramas”, 03•Yingen Xiong and Kari Pulli, "Fast Panorama Stitching for High-Quality Panoramic Images on Mobile Phones", IEEE Transactions on consumer electronics, 2010
Automatic Panorama Stitching
![Page 100: CS231M Mobile Computer Vision - Stanford University · presentation per team Evaluation: •Clarity of the presentation •Ability to master the topic •Ability to answer questions](https://reader035.vdocuments.us/reader035/viewer/2022071107/5fe1e4ca613f6e5e360b6215/html5/thumbnails/100.jpg)
Agenda
• Administrative– Requirements
– Grading policy
• Advanced Topics in Mobile Computer Vision
• Syllabus & Projects