Topic 19: Machine Learning and Visual Sensing*
CS 349: Machine Learning
Oliver Cossairt
*Some slides taken from “Machine Learning Visual Sensing and Machine Learning” lecture by Vivek Boominathan and Ashok Veeraraghavan,
2019 CVPR Workshop on Data-Driven Computational Imaging, https://ciml.media.mit.edu/
Computational Visual Sensing
Scene → Optical System → Computational Algorithm → Output
Pairing computational algorithms with optical systems has expanded visual sensing ability.
Example 1: Imaging Through Scattering Media
Fog Snow Water
Moving Particle Scatterer (e.g. biological tissue)
Camera
Measurements
“Coherent Inverse Scattering via Transmission Matrices: Efficient Phase Retrieval Algorithms and a Public Dataset,” Chris Metzler*, Manoj Sharma*, Sudarshan Nagesh, Richard Baraniuk, Oliver Cossairt, Ashok Veeraraghavan
Learning to see through Scattering Media
Measurement → Reconstruction Algorithm → Denoised Reconstruction
Ground Truth
Diffuser acts as multiple scattering material
Training Set: Input Images, Measurements
Experimental Validation: Imaging through Scattering Media
[Figure: two examples, each showing Measurement, Reconstruction Algorithm, Reconstruction, and Ground Truth]
Training Set: Input Images, Measurements
Example 2: On-Chip Holographic Microscopy
CMOS sensor
LED
Biological sample
Large field-of-view, compact, cost-effective
2.2 × 2.2 µm pixel resolution
Blepharisma hologram
Training Dataset: Peranema, Spirostomum, Didinium, Euplotes, Paramecium, Blepharisma
Collect hologram dataset with high resolution
Frames at 0.00 s and 3.25 s
3D Reconstruction Algorithm
3D Tracking Result
“Dictionary-based phase retrieval for space-time super resolution single lens-free on-chip holographic video,” Winston (Zihao) Wang et al., COSI 2017.
Summary: Visual Sensing using Machine Learning
(I) Backend ML: Optical System, Machine Learning
(II) Joint design with ML: Optical System, Machine Learning
(III) ML with Optical System: Machine Learning System with Optical Layer(s) and Electronic Layer(s)
Visual Sensing using Machine Learning
Part I: Backend ML
Optical System (fixed) → Machine Learning (data-driven) → Output (improve quality)
Data-driven “learned” models can improve the quality of visual sensing systems.
Thin Optical System
Drastically reducing camera thickness by replacing lens with thin mask
S. Asif et al., IEEE Transactions on Computational Imaging (2016)
FlatCam
Ill-posed inverse problem
Forward model: $Y = \Phi_L X \Phi_R^{T}$, where $Y$ is the capture and $X$ is the scene
Ill-posed: $\Phi_L$ and $\Phi_R$ are poorly conditioned
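To see why poor conditioning matters, here is a minimal sketch that simulates a separable forward model of the form $Y = \Phi_L X \Phi_R^{T}$ and inverts it naively. The matrices are synthetic stand-ins with geometrically decaying singular values, not FlatCam calibration data; the sizes, the decay rate, the noise level, and the helper name `ill_conditioned_matrix` are all assumptions made for illustration.

```python
import numpy as np

def ill_conditioned_matrix(m, n, decay, rng):
    """Hypothetical transfer matrix with geometrically decaying singular values."""
    k = min(m, n)
    u, _ = np.linalg.qr(rng.standard_normal((m, k)))
    v, _ = np.linalg.qr(rng.standard_normal((n, k)))
    s = decay ** np.arange(k)            # s[0] = 1, s[-1] = decay**(k - 1)
    return (u * s) @ v.T

rng = np.random.default_rng(0)
n = 32                                    # assumed scene size: n x n
phi_l = ill_conditioned_matrix(n, n, 0.7, rng)
phi_r = ill_conditioned_matrix(n, n, 0.7, rng)

x = rng.random((n, n))                    # stand-in scene
y_clean = phi_l @ x @ phi_r.T             # separable forward model
y_noisy = y_clean + 1e-6 * rng.standard_normal(y_clean.shape)

# Naive inversion: X = phi_l^-1 @ Y @ phi_r^-T, with no regularization.
x_naive = np.linalg.solve(phi_l, y_noisy) @ np.linalg.inv(phi_r).T

cond_l = np.linalg.cond(phi_l)
rel_err = np.linalg.norm(x_naive - x) / np.linalg.norm(x)
print(f"cond(phi_l) = {cond_l:.2e}, relative error of naive inverse = {rel_err:.2e}")
```

Even with noise far below the signal level, the naive inverse is dominated by amplified noise, which is the motivation for regularized or learned reconstruction.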
Regularized reconstruction
Reconstruction Regularization
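The equation on this slide did not survive extraction. The two labels most plausibly annotate the terms of a Tikhonov-regularized least-squares objective (Tikhonov regularization is the baseline named on the results slide). Assuming a separable forward model $Y = \Phi_L X \Phi_R^{T}$ and a regularization weight $\lambda$ (a symbol introduced here), that objective and its closed-form minimizer can be written as:

```latex
\hat{X} = \arg\min_{X}\;
  \underbrace{\lVert \Phi_L X \Phi_R^{T} - Y \rVert_F^{2}}_{\text{reconstruction}}
  \;+\;
  \underbrace{\lambda \lVert X \rVert_F^{2}}_{\text{regularization}}

% With SVDs \Phi_L = U_L \Sigma_L V_L^{T} and \Phi_R = U_R \Sigma_R V_R^{T},
% and \sigma_L^{2}, \sigma_R^{2} the vectors of squared singular values:
\hat{X} = V_L \left[
  \left( \Sigma_L U_L^{T} Y U_R \Sigma_R \right)
  \oslash
  \left( \sigma_L^{2} (\sigma_R^{2})^{T} + \lambda \mathbf{1}\mathbf{1}^{T} \right)
\right] V_R^{T}
```

Here $\oslash$ denotes element-wise division. Each coefficient is damped by $\sigma_{L,i}\sigma_{R,j} / (\sigma_{L,i}^{2}\sigma_{R,j}^{2} + \lambda)$, so directions with small singular values are suppressed instead of amplified.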
Data-driven reconstruction
[Figure labels: Measurement, Previous, Deep learning, New]
End-to-end approach
Measurement → Model inversion → Perceptual enhancement → Output
Model inversion: $W_1 \times Y \times W_2^{T}$ (learned; based on forward model)
Perceptual enhancement: fully trainable deep network
Naïve approach
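The model-inversion stage can be sketched as a pair of matrices applied on either side of the measurement. In the sketch below, `w1` and `w2` are initialized from regularized pseudo-inverses of synthetic forward matrices, which is one plausible reading of "learned, based on forward model"; in a trained system they would then be updated by gradient descent together with the enhancement network, which is omitted here. All names, sizes, and the value of `lam` are illustrative assumptions.

```python
import numpy as np

def tikhonov_pinv(phi, lam):
    """Regularized pseudo-inverse (phi^T phi + lam * I)^-1 phi^T."""
    n = phi.shape[1]
    return np.linalg.solve(phi.T @ phi + lam * np.eye(n), phi.T)

def model_inversion(y, w1, w2):
    """Inversion stage: X_hat = W1 @ Y @ W2^T."""
    return w1 @ y @ w2.T

rng = np.random.default_rng(0)
phi_l = rng.standard_normal((48, 32))     # synthetic, well-conditioned stand-ins
phi_r = rng.standard_normal((48, 32))
x = rng.random((32, 32))                  # stand-in scene
y = phi_l @ x @ phi_r.T                   # noise-free measurement

w1 = tikhonov_pinv(phi_l, 1e-8)           # would be trainable parameters
w2 = tikhonov_pinv(phi_r, 1e-8)
x_hat = model_inversion(y, w1, w2)
print("max abs error:", np.abs(x_hat - x).max())
```

With well-conditioned matrices and no noise, the initialization alone recovers the scene; the learned version earns its keep when the matrices are poorly conditioned and the measurements are noisy.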
Results
Raw Captures
Tikhonov regularization
Data-driven End-to-End
Event-driven Video Frame Synthesis
Zihao Wang¹, Weixin Jiang¹, Kuan He¹, Boxin Shi², Aggelos Katsaggelos¹, Oliver Cossairt¹
¹ Northwestern University, ² Peking University
2nd Int’l Workshop on Physics Based Vision meets Deep Learning (PBDL), in conjunction with ICCV 2019
What’s an event camera? Another high-speed camera?
6/2/20 Wang et al. PBDL2019, ICCV Workshop 16
Scenario: moving poster with shapes
Capture: 22 FPS
Display: 1.1 FPS
Data from DAVIS dataset
Each pixel:
• Compares brightness variations (blue: increase; red: decrease)
• Small latency (microsecond level)
• 10^6 FPS! (at max)
• Works independently (asynchronous)
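The per-pixel behaviour listed above can be sketched as a simple simulation: each pixel keeps its own reference log brightness and fires a signed event when the current value departs from it by a contrast threshold. This is a simplified, frame-sampled model (at most one event per pixel per sample), not the DAVIS sensor pipeline; the function name `generate_events`, the threshold, and the toy frames are assumptions.

```python
import numpy as np

def generate_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Emit (t, row, col, polarity) whenever a pixel's log brightness moves
    by at least `threshold` away from that pixel's own reference level."""
    log_ref = np.log(frames[0] + eps)          # per-pixel reference level
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(frame + eps)
        diff = log_now - log_ref
        fired = np.abs(diff) >= threshold
        for r, c in zip(*np.nonzero(fired)):
            polarity = 1 if diff[r, c] > 0 else -1   # +1 brighter, -1 darker
            events.append((t, int(r), int(c), polarity))
        # Pixels act independently: only those that fired reset their reference.
        log_ref[fired] = log_now[fired]
    return events

# Toy scene: a bright dot moving one pixel to the right per sample.
frames = np.full((3, 4, 4), 0.1)
frames[0, 1, 0] = 1.0
frames[1, 1, 1] = 1.0
frames[2, 1, 2] = 1.0
events = generate_events(frames, timestamps=[0.0, 0.001, 0.002])
for event in events:
    print(event)
```

Static pixels produce no output at all, which is why the data rate follows scene motion rather than a fixed frame clock.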
We propose intensity frame + events for high frame-rate video synthesis
Our approach: fusion of intensity frame + events
DMR: Differentiable Model-based Reconstruction
Results (DMR)
Columns: Blurry images | Events during exposure | EDI [CVPR’19] | Ours
Video recovery
• Motion deblur case
• Given a blurry image + events in-exposure, recover intermediate sharp frames.
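A minimal sketch of how a blurry frame and in-exposure events can be combined, assuming the standard event model in which each event changes log intensity by a fixed contrast step `c`. Under that assumption the blurry image is the average of the latent frames, and the latent frames are the reference frame scaled by `exp(c * accumulated events)`. This is the relation underlying event-based double-integral approaches such as the EDI baseline in the comparison; it is not the DMR method, and the function name, `c`, and the synthetic data are illustrative.

```python
import numpy as np

def sharpen_with_events(blurry, event_sums, c=0.2):
    """Recover the sharp reference frame and the intermediate frames.

    blurry     : (H, W) average of the latent frames over the exposure
    event_sums : (T, H, W) signed event counts accumulated from the
                 reference time to each of T sample times in the exposure
    c          : contrast threshold, assumed known and constant
    """
    gain = np.exp(c * event_sums)              # L_t / L_ref under the event model
    latent_ref = blurry / gain.mean(axis=0)    # invert B = mean_t(L_ref * gain_t)
    return latent_ref, latent_ref * gain

# Synthetic check: build frames from the same model, blur them, then recover.
rng = np.random.default_rng(0)
true_ref = rng.random((8, 8)) + 0.5
event_sums = rng.integers(-2, 3, size=(5, 8, 8)).astype(float)
event_sums[0] = 0.0                            # first sample is the reference time
true_frames = true_ref * np.exp(0.2 * event_sums)
blurry = true_frames.mean(axis=0)

ref_hat, frames_hat = sharpen_with_events(blurry, event_sums, c=0.2)
print("max abs error:", np.abs(frames_hat - true_frames).max())
```

On real data the contrast threshold is noisy and unknown, which is one reason a model-based reconstruction with learned or optimized components can do better than this direct inversion.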
Visual Sensing using Machine Learning
Part I: Backend ML
Part II: Joint design with ML
Optical System + Machine Learning (data-driven) → Output (improve quality)
Joint design with data-driven techniques can bring out the best in both systems.
Example 1: PhaseCam3D - Learning Phase Masks For Passive Single-view Depth Estimation
Yicheng Wu, Vivek Boominathan, Huaijin Chen, Aswin Sankaranarayanan, and Ashok Veeraraghavan. “PhaseCam3D—Learning Phase Masks for Passive Single View Depth Estimation.” IEEE International Conference on Computational Photography (ICCP), 2019
Defocus of general lens
Defocused image
× Identical PSF at both sides of the focal plane.
× Impossible to tell the depth based on the blur size.
PSFs at different depths: Far, Focus, Near
Lens, Sensor
trentwoodsphoto.com
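The ambiguity can be checked with the geometric thin-lens model, in which the blur-circle diameter depends only on the absolute difference in inverse depth from the focus plane. The sketch below picks a near depth and computes the far depth that yields the same blur; the lens parameters are arbitrary example values, and `blur_diameter` is a name introduced here.

```python
def blur_diameter(depth, focus_depth, focal_length, aperture):
    """Geometric blur-circle diameter on the sensor (thin-lens model).
    All lengths are in metres."""
    sensor_dist = 1.0 / (1.0 / focal_length - 1.0 / focus_depth)
    return aperture * sensor_dist * abs(1.0 / depth - 1.0 / focus_depth)

focal_length = 0.05      # 50 mm lens (example value)
aperture = 0.025         # 25 mm aperture diameter, i.e. f/2 (example value)
focus_depth = 1.0        # lens focused at 1 m

near = 0.8
# The far depth with the same inverse-depth offset from the focus plane.
far = 1.0 / (2.0 / focus_depth - 1.0 / near)

blur_near = blur_diameter(near, focus_depth, focal_length, aperture)
blur_far = blur_diameter(far, focus_depth, focal_length, aperture)
print(f"near = {near:.3f} m -> blur {blur_near * 1e3:.4f} mm")
print(f"far  = {far:.3f} m -> blur {blur_far * 1e3:.4f} mm")
```

Two different depths give the same blur size, so blur size alone cannot determine depth with a conventional lens.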
PhaseCam3D: an end-to-end learning approach
Scene → Optical System → Digital network → Depth map
Optical System (PhaseCam3D sensor): Lens, Phase mask, Sensor
• Differentiable optical model
• Digital network
• End-to-end learning
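A minimal sketch of the kind of optical model that can sit in front of a network: a Fourier-optics pupil function whose phase combines a mask height map and a depth-dependent defocus term, with the PSF given by the squared magnitude of its Fourier transform. Every step is a smooth operation, so the same computation written in an autodiff framework would let gradients reach the height map. This is not the PhaseCam3D implementation; the function name, `delta_n`, `defocus_waves`, the wavelength, and the random mask are assumptions.

```python
import numpy as np

def psf_from_height_map(height_map, defocus_waves, wavelength=550e-9, delta_n=0.5):
    """PSF of a circular aperture carrying a phase mask, at a given defocus.

    height_map    : (n, n) mask thickness in metres
    defocus_waves : defocus at the aperture edge, in waves (sign = side of focus)
    delta_n       : refractive-index difference between mask material and air
    """
    n = height_map.shape[0]
    half = (n - 1) / 2
    coords = (np.arange(n) - half) / half          # symmetric grid on [-1, 1]
    xx, yy = np.meshgrid(coords, coords)
    r2 = xx**2 + yy**2
    aperture = (r2 <= 1.0).astype(float)
    mask_phase = 2 * np.pi * delta_n * height_map / wavelength
    defocus_phase = 2 * np.pi * defocus_waves * r2
    pupil = aperture * np.exp(1j * (mask_phase + defocus_phase))
    field = np.fft.fftshift(np.fft.fft2(pupil, s=(2 * n, 2 * n)))
    psf = np.abs(field) ** 2
    return psf / psf.sum()

flat = np.zeros((64, 64))                          # no mask: ordinary lens
rng = np.random.default_rng(0)
mask = rng.uniform(0.0, 1e-6, size=(64, 64))       # arbitrary stand-in mask

same_without_mask = np.allclose(psf_from_height_map(flat, 2.0),
                                psf_from_height_map(flat, -2.0))
same_with_mask = np.allclose(psf_from_height_map(mask, 2.0),
                             psf_from_height_map(mask, -2.0))
print("plain lens, PSF identical on both sides of focus:", same_without_mask)
print("with phase mask, PSF identical on both sides:", same_with_mask)
```

Without a mask, the two sides of focus give the same PSF; a suitable mask breaks that symmetry, which is what makes depth recoverable from a single coded image. In an end-to-end setup the height map would be optimized rather than drawn at random.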
Optimal simulation results
Height map
PSFs (labeled -10 to 10)
Sharp image, Coded image
Estimated disparity, True disparity
Fabricate the learned phase mask
Photonic Professional GT, Nanoscribe GmbH
Two-photon lithography 3D printer
Fabricated phase mask
2.835 mm
Accuracy evaluation: compare with Kinect
Coded Images
Estimated depth by PhaseCam3D
Estimated depth by Kinect
Depth color scale: 0.6 to 1.4 m
Error: 1.25 cm
Visual Sensing using Machine Learning
Part I: Backend ML
Part II: Joint design with ML
Part III: ML with Optical System
Machine Learning System: Optical Layer(s) → Electronic Layer(s) → Vision (Inference/Classification)
Incorporating optical layer(s) into a machine learning system can decrease latency and power.
Example 1: ASP Vision: Optically Computing the First Layer of CNNs using Angle Sensitive Pixels
Huaijin G. Chen, Suren Jayasuriya, Jiyue Yang, Judy Stephen, Sriram Sivaramakrishnan, Ashok Veeraraghavan, and Alyosha Molnar. “ASP vision: Optically computing the first layer of convolutional neural networks using angle sensitive pixels.” Computer Vision and Pattern Recognition (CVPR), 2016.
ASP camera as first layer of DNN
ASP Vision: Sensor + Deep Learning Co-Design
Scene → ASP Camera (L1, optically computed) → Reduced CNN (L2, L3, ···, LN) → Output (“Elephant”)
The ASP camera has Gabor-like filters, which show up as kernels in many CNNs, e.g. AlexNet.
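To make the idea concrete, the sketch below builds a small bank of Gabor kernels and applies it as a fixed, non-trainable first layer, so that only the later layers would need to be computed electronically. The kernel parameters and the bank size of 12 (four orientations times three wavelengths) are arbitrary choices that echo the prototype's 12 first-layer filters; they are not the measured ASP responses, and the function names are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gabor_kernel(size, theta, wavelength, sigma):
    """Real-valued Gabor: a Gaussian envelope times an oriented cosine."""
    half = (size - 1) / 2
    y, x = np.mgrid[0:size, 0:size] - half
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * x_rot / wavelength)

def gabor_bank(size=7, n_orient=4, wavelengths=(3.0, 5.0, 8.0), sigma=2.0):
    """Stack of kernels covering several orientations and spatial frequencies."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    return np.stack([gabor_kernel(size, t, w, sigma)
                     for w in wavelengths for t in thetas])

def fixed_first_layer(image, kernels):
    """Valid cross-correlation of a grayscale image with every kernel."""
    windows = sliding_window_view(image, kernels.shape[1:])   # (H', W', k, k)
    return np.einsum("hwij,fij->fhw", windows, kernels)

rng = np.random.default_rng(0)
image = rng.random((32, 32))          # stand-in grayscale input
bank = gabor_bank()
features = fixed_first_layer(image, bank)
print(bank.shape, features.shape)
```

The feature maps produced here play the role of the optically computed L1 output that a reduced CNN would consume.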
How many FLOPs can we save by skipping the first layer?

| | VGG-M (Original) | VGG-M (Prototype) | NiN (Original) | NiN (Prototype) | LeNet (Original) | LeNet (Prototype) |
| --- | --- | --- | --- | --- | --- | --- |
| # of Conv. Layers | 8 | 8 | 9 | 9 | 4 | 4 |
| Input Image Size | 224 × 224 × 3 | 224 × 224 × 3 | 32 × 32 × 3 | 32 × 32 × 3 | 28 × 28 × 1 | 28 × 28 × 1 |
| # of First Layer Filters | 96 | 12 | 192 | 12 | 20 | 12 |
| First Layer Conv. Kernel | 7 × 7 × 96 | 7 × 7 × 12 | 5 × 5 × 192 | 5 × 5 × 12 | 5 × 5 × 20 | 5 × 5 × 12 |
| FLOPS of First Layer | 708.0 M | 88.5 M | 14.75 M | 921.6 K | 392 K | 235 K |
| Total FLOPS | 6.02 G | 3.83 G | 200.3 M | 157 M | 10.4 M | 8.8 M |
| First Layer FLOPS Saving | 11.76% | 2.3% | 7.36% | 0.6% | 3.77% | 2.67% |
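The first-layer figures are consistent with counting one operation per kernel weight at every input pixel position, that is height × width × kernel² × input channels × filters. That counting rule is inferred from the numbers rather than stated on the slide, so treat it as an assumption; the helper name is also introduced here. A short check against the tabulated values:

```python
def first_layer_flops(height, width, channels, kernel, n_filters):
    """Operation count assuming one op per weight at every input position."""
    return height * width * kernel * kernel * channels * n_filters

# Input size, kernel width, original filter count, and total FLOPs as tabulated.
networks = {
    "VGG-M": dict(size=(224, 224, 3), kernel=7, original=96, total=6.02e9),
    "NiN": dict(size=(32, 32, 3), kernel=5, original=192, total=200.3e6),
    "LeNet": dict(size=(28, 28, 1), kernel=5, original=20, total=10.4e6),
}

for name, cfg in networks.items():
    h, w, c = cfg["size"]
    flops = first_layer_flops(h, w, c, cfg["kernel"], cfg["original"])
    share = 100 * flops / cfg["total"]
    print(f"{name}: {flops / 1e6:.2f} M first-layer FLOPs, {share:.2f}% of total")
```

The saving is the first layer's share of the total, so it is largest for networks whose first layer sees a large input with many filters.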
Privacy-preserving action recognition using coded aperture videos, CVPRW’19
Conventional action recognition: classifier → action labels
Privacy-preserving action recognition: motion features → classifier → action labels
Action labels:
1. Hopping
2. Staggering
3. Jumping up
4. Jumping jack
5. Squat
6. Standing up
7. Sitting down
8. Throw
9. Clapping
10. Handwaving
Coded Aperture Camera
THANK YOU AND
CONGRATULATIONS!