Signals and Systems (18-396), Image and Video Processing (18-798),
and Life Beyond…
Prof. Tsuhan Chen
Sample Courses in Signal Processing and Communication
• Signals and Systems
• Estimation, Detection and Ident.
• Digital Signal Processing II
• Image and Video Processing
• Multimedia Communication
• Applied Stochastic Proc.
• Optical Image and Radar Proc.
• Pattern Recognition
• Error Control Coding
• Digital Signal Processing I
• Dig. Comm. and Sig. Proc. Syst. Design
• Fund. Comm. Sys.
Sound
Tsuhan Chen
Digital Audio

| | Frequency Band (Hz) | Sampling Rate (kHz) | Bits per Sample | Raw Bitrate (kbits/s) |
|---|---|---|---|---|
| Telephone Speech | 300~3400 | 8 | 8 | 64 |
| Wideband Speech | 50~7000 | 16 | 8 | 128 |
| Mediumband Audio | 10~11000 | 24 | 16 | 384 |
| Wideband Audio | 10~22000 | 48 | 16 | 768 |

– CD: 44.1 kHz × 16 bits × 2 channels = 1.411 Mbits/s
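The raw bitrates on this slide follow from sampling rate × bits per sample × number of channels. A minimal sketch of that arithmetic; the function name `raw_bitrate_kbps` and the `FORMATS` dictionary are illustrative names, not part of the slides:

```python
def raw_bitrate_kbps(sampling_rate_khz, bits_per_sample, channels=1):
    """Raw (uncompressed) bit rate in kbits/s: kHz x bits/sample x channels."""
    return sampling_rate_khz * bits_per_sample * channels

# (sampling rate in kHz, bits per sample), copied from the table on this slide
FORMATS = {
    "Telephone Speech": (8, 8),
    "Wideband Speech": (16, 8),
    "Mediumband Audio": (24, 16),
    "Wideband Audio": (48, 16),
}

for name, (rate_khz, bits) in FORMATS.items():
    print(f"{name}: {raw_bitrate_kbps(rate_khz, bits)} kbits/s")

# CD audio: 44.1 kHz, 16 bits, 2 channels
cd_kbps = raw_bitrate_kbps(44.1, 16, channels=2)
print(f"CD: {cd_kbps / 1000:.3f} Mbits/s")
```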
MPEG-1 Audio
• ISO/IEC 11172-3 (1988~1991)
  – First high quality audio compression standard
  – Sampling rates: 32, 44.1, 48 kHz
  – CD quality two-channel audio at ~256 kbits/s
    • CD: 44.1 kHz × 16 bits × 2 = 1.411 Mbits/s
  – YES, this is MP3!!!
• Quality demonstration
  – Stereo 44.1 kHz at 64 kbits/s
  – Stereo 44.1 kHz at 128 kbits/s
  – Stereo 44.1 kHz at 192 kbits/s
  – Stereo 44.1 kHz at 256 kbits/s
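To put the demonstration bitrates in perspective, one can compare each with the raw CD rate of 1.411 Mbits/s quoted on this slide. A small sketch of that ratio; `compression_ratio` is an illustrative name, and the ratios are plain arithmetic rather than figures from the slides:

```python
# Raw CD rate from the slide: 44.1 kHz x 16 bits x 2 channels, in kbits/s
CD_RAW_KBPS = 44.1 * 16 * 2

def compression_ratio(coded_kbps, raw_kbps=CD_RAW_KBPS):
    """How many times smaller the coded stream is than the raw stream."""
    return raw_kbps / coded_kbps

# The four stereo demonstration rates listed on the slide
for rate in (64, 128, 192, 256):
    print(f"{rate} kbits/s -> about {compression_ratio(rate):.1f}:1")
```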
Image
Image
RGB Color
[Figure: an image shown as a grid of pixels with axes m and n; each pixel is labeled with its R, G, B values]
Sample pixel values (R, G, B): (255, 200, 200), (150, 170, 253), (251, 200, 190), (124, 110, 123), (204, 203, 202), (151, 140, 139), (248, 220, 242), (190, 170, 90), (151, 148, 149), (244, 222, 214), (253, 100, 120), (230, 120, 234), (149, 244, 130), (159, 149, 150), (254, 133, 200), (0, 0, 0)
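The pixel grid on this slide can be modeled as a 2D array whose entries are (R, G, B) triples in the range 0 to 255. The sketch below assumes m indexes rows and n indexes columns, which the slide does not state; the 2 × 3 layout and the helper names `pixel` and `channel` are illustrative, while the values are taken from the slide:

```python
# A tiny 2 x 3 RGB image built from pixel values shown on the slide.
# Assumption: first index m = row, second index n = column.
image = [
    [(255, 200, 200), (150, 170, 253), (251, 200, 190)],
    [(124, 110, 123), (204, 203, 202), (0, 0, 0)],
]

def pixel(img, m, n):
    """Return the (R, G, B) triple at row m, column n."""
    return img[m][n]

def channel(img, index):
    """Extract one color plane (0 = R, 1 = G, 2 = B) as a 2D list."""
    return [[px[index] for px in row] for row in img]

print(pixel(image, 0, 1))   # one pixel as an (R, G, B) triple
print(channel(image, 0))    # the red plane of the whole image
```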
Sample Images
Lena Pepper Baboon
512 × 512 × 3 bytes = 768 KB
With JPEG, ~32 KB
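The storage figure above is width × height × 3 bytes (one byte each for R, G, and B). A short sketch of the arithmetic, assuming 1 KB = 1024 bytes; the resulting 24:1 ratio is derived from the slide's two numbers rather than stated on it:

```python
def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of an image with one byte per color channel."""
    return width * height * bytes_per_pixel

raw_kb = raw_image_bytes(512, 512) / 1024   # assumes 1 KB = 1024 bytes
jpeg_kb = 32                                # approximate JPEG size from the slide
ratio = raw_kb / jpeg_kb

print(f"raw: {raw_kb:.0f} KB, JPEG: ~{jpeg_kb} KB, ratio about {ratio:.0f}:1")
```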
Sampling
Spatial Subsampling
Original (256×256)
(64×64): MSE = 2058, MAE = 24, CR = 16:1
(32×32): MSE = 3924, MAE = 36, CR = 64:1
Aliasing!!!
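A minimal sketch of plain subsampling and of the two error measures, on a toy grayscale image. It assumes the error is measured after enlarging the small image back to full size by pixel replication; the slides do not say how their MSE and MAE were computed, and all function names here are illustrative:

```python
def subsample(img, factor):
    """Keep every factor-th pixel in each direction, with no pre-filtering."""
    return [row[::factor] for row in img[::factor]]

def replicate(img, factor):
    """Enlarge by repeating each pixel factor times in both directions."""
    return [[v for v in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def mse(a, b):
    """Mean squared error between two equally sized 2D lists."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def mae(a, b):
    """Mean absolute error between two equally sized 2D lists."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

# 4 x 4 checkerboard: the finest pattern this grid can hold
checker = [[255 if (r + c) % 2 else 0 for c in range(4)] for r in range(4)]

small = subsample(checker, 2)      # only ever samples the dark squares
restored = replicate(small, 2)
print(small, mse(checker, restored), mae(checker, restored))
```

On the checkerboard every retained sample lands on a dark square, so the pattern vanishes entirely. This is a toy instance of the aliasing the slide warns about. Subsampling by a factor k in each direction keeps 1/k² of the pixels, which matches the 16:1 and 64:1 ratios quoted for 256→64 and 256→32.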
Sampling
Spatial Subsampling w/Averaging
Original (256×256)
(64×64): MSE = 1010, MAE = 18, CR = 16:1
(32×32): MSE = 1643, MAE = 26, CR = 64:1
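Averaging before subsampling can be sketched as replacing each k × k block by its mean. The code below assumes non-overlapping blocks, image dimensions divisible by the factor, and error measured against a pixel-replicated enlargement; none of these details are given on the slide, and the function names are illustrative:

```python
def subsample_with_averaging(img, factor):
    """Replace each factor x factor block by its mean value."""
    out = []
    for r in range(0, len(img), factor):
        out_row = []
        for c in range(0, len(img[0]), factor):
            block = [img[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def mse_after_replication(original, small, factor):
    """MSE between the original and the small image enlarged by replication."""
    total = 0.0
    count = 0
    for r, row in enumerate(original):
        for c, value in enumerate(row):
            total += (value - small[r // factor][c // factor]) ** 2
            count += 1
    return total / count

# 4 x 4 checkerboard test pattern
checker = [[255 if (r + c) % 2 else 0 for c in range(4)] for r in range(4)]

averaged = subsample_with_averaging(checker, 2)
print(averaged, mse_after_replication(checker, averaged, 2))
```

On this test pattern each block averages to mid-gray instead of collapsing to black, which is the same qualitative effect as the lower MSE and MAE reported on the slide at equal compression ratio.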
Quantization
Original (24-bit)
(12-bit): MSE = 9670, MAE = 78, CR = 2:1
(6-bit): MSE = 10381, MAE = 82, CR = 4:1
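One common way to go from 24 bits to 12 or 6 bits per pixel is uniform quantization of each color channel. The sketch below assumes the bits are split equally across R, G, and B (4 or 2 bits per channel) and that values are reconstructed at the midpoint of each bin; the slide does not specify either choice, and the function names are illustrative:

```python
def quantize_channel(value, bits):
    """Keep the `bits` most significant bits of an 8-bit value,
    then reconstruct at the midpoint of the quantization bin."""
    shift = 8 - bits
    level = value >> shift
    return (level << shift) + (1 << shift) // 2

def quantize_pixel(rgb, bits_per_channel):
    """Quantize each of the three channels independently."""
    return tuple(quantize_channel(v, bits_per_channel) for v in rgb)

def compression_ratio(bits_per_channel):
    """Ratio of the original 24 bits/pixel to the quantized bits/pixel."""
    return 24 / (3 * bits_per_channel)

px = (255, 200, 200)           # a pixel value from the RGB slide
print(quantize_pixel(px, 4))   # 12 bits per pixel
print(quantize_pixel(px, 2))   # 6 bits per pixel
print(compression_ratio(4), compression_ratio(2))
```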
Video
Video
[Figure: Sequence → Frame or Picture → Line → Pixel or Pel, with frames ordered along time]
Video Data
• Video

| | Pels/line | Lines | Frames/s | Bytes/pel | Bit rate |
|---|---|---|---|---|---|
| Video Telephony (CIF) | 352 | 288 | 10 | 1.5 | 12.2 Mbits/s |
| Broadcast TV (ITU-R 601 4:2:2) | 720 | 480 | 30 | 2 | 166 Mbits/s |
| HDTV | ~1280 | ~720 | 60 | 2 | 885 Mbits/s |

• So, we need MPEG-1 (VCD etc.), MPEG-2 (DVD etc.), MPEG-4 (some camcorders, etc.)
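Each bit rate in this table is pels/line × lines × frames/s × bytes/pel × 8 bits. A minimal sketch that reproduces the three figures; `video_bitrate_mbps` is an illustrative name, and 1 Mbit is taken as 10^6 bits:

```python
def video_bitrate_mbps(pels_per_line, lines, frames_per_s, bytes_per_pel):
    """Raw video bit rate in Mbits/s (1 Mbit = 1e6 bits)."""
    bits_per_s = pels_per_line * lines * frames_per_s * bytes_per_pel * 8
    return bits_per_s / 1e6

# Rows copied from the table on this slide
cif = video_bitrate_mbps(352, 288, 10, 1.5)
tv = video_bitrate_mbps(720, 480, 30, 2)
hdtv = video_bitrate_mbps(1280, 720, 60, 2)

for name, rate in (("CIF", cif), ("ITU-R 601", tv), ("HDTV", hdtv)):
    print(f"{name}: {rate:.1f} Mbits/s")
```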
Computer Graphics
Face Animation
• Wire-frame mesh model with texture mapping
Computer Vision
Face Tracking
Use color information to segment target vs. non-target pixels
Use deformable template to track the target
Lip Tracking
• Use a Gaussian mixture with three Gaussians to model the color distribution of the mouth
• Template: two parabolas defined by λ = (a,b,c,d,e)
[Figure: mouth template with the parameters a, b, c, d, e labeled]
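For reference, a mixture of three Gaussians over a color vector has the general form below. The symbols $\mathbf{x}$, $w_k$, $\boldsymbol{\mu}_k$, and $\boldsymbol{\Sigma}_k$ are generic notation assumed here, not taken from the slide, which does not say which color space or features are used:

```latex
p(\mathbf{x}) = \sum_{k=1}^{3} w_k \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
\qquad w_k \ge 0, \quad \sum_{k=1}^{3} w_k = 1
```

Each component contributes a weighted Gaussian density, so a pixel's color can be scored by how likely it is under the mouth model.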
Eye Tracking
• Find the center of the darkest region in the search window
  – Normalization
  – Thresholding
  – Finding the center
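The three steps named on this slide can be sketched on a small grayscale search window. The sketch assumes min-max normalization to [0, 1], a fixed threshold, and a centroid of the dark pixels as the "center"; the threshold value 0.2 and the function name are hypothetical, and the slide does not give the actual procedure in this detail:

```python
def darkest_region_center(window, threshold=0.2):
    """Return the (row, col) centroid of the darkest pixels, or None.

    window: 2D list of gray levels. threshold: hypothetical cutoff
    applied after normalization (not a value from the slides).
    """
    flat = [v for row in window for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a flat window

    # 1) Normalization: map gray levels to [0, 1]
    norm = [[(v - lo) / span for v in row] for row in window]

    # 2) Thresholding: keep the coordinates of sufficiently dark pixels
    dark = [(r, c) for r, row in enumerate(norm)
            for c, v in enumerate(row) if v <= threshold]
    if not dark:
        return None

    # 3) Finding the center: centroid of the dark pixels
    n = len(dark)
    return (sum(r for r, _ in dark) / n, sum(c for _, c in dark) / n)

# 5 x 5 bright window with a dark plus-shaped blob centered at (2, 2)
window = [[200] * 5 for _ in range(5)]
for r, c in [(1, 2), (2, 1), (2, 2), (2, 3), (3, 2)]:
    window[r][c] = 20

center = darkest_region_center(window)
print(center)
```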
Tracking in a Car…
Face/Eye/Hand Tracking
Driver Verification:Security and User Preference
GestureCam
Airbag Deployment Control
Gesture-Controlled Map Browsing
Higher Dimensions?
7D Plenoptic Function [Adelson’91]
$$f(V_x, V_y, V_z, \theta, \psi, \lambda, t)$$
[Figure labels: $(V_x, V_y, V_z)$ and $(\theta, \psi)$]
Image-Based Rendering
• Plenoptic Function [Adelson’91] [McMillan’95]
• Lumigraph/Lightfield [Gortler/Grzeszczuk’96] [Levoy’96]
• Concentric Mosaics [Shum’99]
“The Matrix”
EyeVision [Kanade’01]
Super Bowl XXXV
4D IBR (incl. time)
[Figure panels: Before Correction, After Correction]
Self-Reconfigurable Camera Array
[Levoy, Stanford] [McMillan, MIT] [Zhang and Chen, CMU]
Setup
Results
• Real-time capturing/calibration/rendering
  – 48 webcams sensor network
  – 2 step-motors each (translation and pan)
• Building the next version…
  – More mobile and wireless
Ongoing: Mirror/Lens Array
This is lightfield/lumigraph!
Future: “Transparent Material”
Many applications…
Camera Array
3D Display
Information Retrieval (Pattern Recognition)
Hand-Drawn Sketch Retrieval
User sketches a query
Query Sketch
Similar Sketch
Page stored in Database
Trademark Retrieval
Query
Retrieved Trademarks
Trademark Retrieval
Hand-Drawn Query
Retrieved Trademarks
3D Object Retrieval
Sketched 3D Query too…
3D Protein Structures too…
Summary
• Signals and Systems
• Image and Video Processing
• Computer Vision
• Computer Graphics
• Pattern Recognition
• Information Retrieval