February 5, 2014 Sam Siewert
Computer and Machine
Vision
Lecture Week 4
Part-1
FFMPEG FAQ Read It!!
http://ffmpeg.org/faq.html
You should know how to Decode Video (recorded from your camera or pre-recorded by someone else)
You should know how to Encode Video (to
turn in with your labs)
Outline of Week 4
Practical Methods for Dealing with Camera Streams, Frame by Frame, and Decoding/Re-encoding for Analysis (and Lab Reports!!)
Deeper Dive on Color, Human Vision Characteristics
Wrap-up on Convolution and Transformation
– Finish Reading Through Chapter 3 in CV (Image Processing and Transforms)
– Finish Reading in OpenCV through Chapter 6
– Start asking Questions about Example Code as We go
Introduction to Segmentation and Recognition Problem and Approaches
– Next Step is Histograms and Thresholds
– Goal is Recognition
– Then 3D
Ffmpeg (avconv) Notes
sudo apt-get install ffmpeg
ffmpeg -i movie.mpg -ss 30 -t 30 movie%d.ppm
– Extracts 30 seconds of frames, starting 30 seconds into the movie
ssiewert@ssiewert-VirtualBox:~/a485/media$ ffmpeg -i big_buck_bunny_480p_surround-fix.avi -ss 30 -t 30 bbb%d.ppm
ffmpeg version 0.8.6-4:0.8.6-0ubuntu0.12.04.1, Copyright (c) 2000-2013 the Libav developers
built on Apr 2 2013 17:02:36 with gcc 4.6.3
Input #0, avi, from 'big_buck_bunny_480p_surround-fix.avi':
Duration: 00:09:56.45, start: 0.000000, bitrate: 2957 kb/s
Stream #0.0: Video: mpeg4 (Simple Profile), yuv420p, 854x480 [PAR 1:1 DAR 427:240], 24 tbr, 24 tbn, 24 tbc
Stream #0.1: Audio: ac3, 48000 Hz, 5.1, s16, 448 kb/s
Incompatible pixel format 'yuv420p' for codec 'ppm', auto-selecting format 'rgb24'
[buffer @ 0x907700] w:854 h:480 pixfmt:yuv420p
[avsink @ 0x9054c0] auto-inserting filter 'auto-inserted scaler 0' between the filter 'src' and the filter 'out'
[scale @ 0x905b60] w:854 h:480 fmt:yuv420p -> w:854 h:480 fmt:rgb24 flags:0x4
Output #0, image2, to 'bbb%d.ppm':
Metadata:
encoder : Lavf53.21.1
Stream #0.0: Video: ppm, rgb24, 854x480 [PAR 1:1 DAR 427:240], q=2-31, 200 kb/s, 90k tbn, 24 tbc
Stream mapping:
Stream #0.0 -> #0.0
Press ctrl-c to stop encoding
...
Last message repeated 719 times -0kB time=29.00 bitrate= -0.0kbits/s
frame= 720 fps= 38 q=0.0 Lsize= -0kB time=30.00 bitrate= -0.0kbits/s
video:864686kB audio:0kB global headers:0kB muxing overhead -100.000002%
ssiewert@ssiewert-VirtualBox:~/a485/media$
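The frame count and total size reported in the log above can be checked with a little arithmetic. This is only a sketch: the frame rate, duration, and resolution come from the log, but the 15-byte binary PPM (P6) header is an assumption about how the frames were written, not something the log states.

```python
# Sanity-check the ffmpeg log: frame count and total PPM size.
FPS = 24              # "24 tbr" in the log
SECONDS = 30          # -t 30
WIDTH, HEIGHT = 854, 480
BYTES_PER_PIXEL = 3   # rgb24

# Assumed P6 header: magic number, "width height", and maxval lines.
header = f"P6\n{WIDTH} {HEIGHT}\n255\n".encode("ascii")

frames = FPS * SECONDS
frame_bytes = len(header) + WIDTH * HEIGHT * BYTES_PER_PIXEL
total_kib = frames * frame_bytes / 1024

print(frames)             # number of extracted frames
print(frame_bytes)        # bytes per PPM file
print(round(total_kib))   # total size in units of 1024 bytes
```

Under these assumptions the result agrees with the `frame= 720` and `video:864686kB` lines in the log, which works out to roughly 1.2 MB per uncompressed frame.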
Now with PPM Frames
PPM is Simple, but No Compression
– Good for CV
– http://en.wikipedia.org/wiki/Netpbm_format (Read this!)
– JPEG, PNG are Compressed
– TIFF is an Alternative, but More Complex
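To show how simple the format is, here is a minimal sketch of a binary PPM (P6) writer and reader using only the Python standard library. The function names `write_ppm` and `read_ppm` are made up for this illustration, and the reader deliberately handles only 8-bit files without comment lines in the header.

```python
import os
import tempfile


def write_ppm(path, width, height, pixels):
    """Write 8-bit RGB samples (flat bytes, R,G,B per pixel) as binary PPM (P6)."""
    if len(pixels) != width * height * 3:
        raise ValueError("pixel buffer size does not match width * height * 3")
    with open(path, "wb") as f:
        f.write(f"P6\n{width} {height}\n255\n".encode("ascii"))
        f.write(bytes(pixels))


def read_ppm(path):
    """Read a simple binary PPM (P6, maxval 255, no header comments)."""
    with open(path, "rb") as f:
        data = f.read()
    fields, pos = [], 0
    while len(fields) < 4:
        # Skip whitespace, then collect one header token.
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        fields.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates header and pixel data
    magic, width, height, maxval = (fields[0], int(fields[1]),
                                    int(fields[2]), int(fields[3]))
    if magic != b"P6" or maxval != 255:
        raise ValueError("this sketch only handles 8-bit binary PPM (P6)")
    pixels = data[pos:pos + width * height * 3]
    if len(pixels) != width * height * 3:
        raise ValueError("file is shorter than the header promises")
    return width, height, pixels


if __name__ == "__main__":
    # Round-trip a 2x2 test image through a temporary file.
    path = os.path.join(tempfile.mkdtemp(), "test.ppm")
    image = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255])
    write_ppm(path, 2, 2, image)
    print(read_ppm(path) == (2, 2, image))
```

Because the samples are stored uncompressed, every pixel can be located directly from its row and column, which is what makes the format convenient for frame-by-frame analysis.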
Simple Re-encode
When Quality is not a Concern, Keep it Simple
ffmpeg -f image2 -i bbb%d.ppm bbbtrans.mpg
vlc bbbtrans.mpg
Quality Encoding is Tricky
Use MPEG4 HQ Settings, Encode 480p, AR=4:3
ffmpeg -f image2 -i bbb%d.ppm -maxrate 20000k -bufsize 32M -s 640x480 -vcodec mpeg4 -qscale 1 bbbtranshq.mp4
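Because `-s 640x480` sets the output frame size, it is worth checking how that size relates to the source frames. The sketch below assumes square pixels (the extraction log reports PAR 1:1) and simply reduces each width:height pair to lowest terms; the helper name `aspect_ratio` is made up for this example.

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms (square pixels assumed)."""
    return Fraction(width, height)


source = aspect_ratio(854, 480)   # frame size reported in the extraction log
target = aspect_ratio(640, 480)   # -s 640x480 in the command above

print(f"source {source.numerator}:{source.denominator}")
print(f"target {target.numerator}:{target.denominator}")
print("aspect ratio changes:", source != target)
```

The target reduces to 4:3 as the slide states, while the source frames are 427:240, so the picture is rescaled horizontally. Whether that matters depends on the analysis being done on the output.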
Deeper Dive on Color and
Human Vision
Color, 3D Cues, and a Bit of
Physiology / Psychology of
Vision
Reminder
Computer Vision has the Goal to Emulate,
Understand, Extend and Repair Human
Vision Systems
Machine Vision has the Goal to Automate
a Process Using Instrumentation including
Photometers
Radiometry vs. Photometry
Radiometry – the study of light from the viewpoint of Physics
– Energy (joules), Flux (photons/cm²), Power (watts)
– Full Electromagnetic Spectrum including Visible, from Far Infrared to Gamma-rays
Photometry – the study of light from the perspective of “useful”, often “visible”, light
Machine Vision Color in Brief
Physics of Wavelength and Intensity of Electromagnetic Waves Simplified
– Luminance (Y), Chrominance U=B-Y (blue-luma), V=R-Y (red-luma)
– Additive Primary Hues (R, G, B)
– Secondary Hues (Combined R+G = Yellow, R+B = Magenta, G+B = Cyan): CMY
– Complete subset of Colors a device can capture or display is its Gamut
– Chromaticity is independent of the luminance
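The luminance and chrominance definitions above can be written out directly. This is a sketch under one stated assumption: the slide does not say which luma weights to use, so Rec. 709 weights are assumed here. Broadcast standards also scale U and V, and that scaling is omitted so the code matches U=B-Y and V=R-Y as written.

```python
# Assumed Rec. 709 luma weights; the slide does not specify the weights.
KR, KG, KB = 0.2126, 0.7152, 0.0722


def rgb_to_yuv(r, g, b):
    """Convert normalized RGB (0.0-1.0) to luma Y and unscaled color differences."""
    y = KR * r + KG * g + KB * b
    u = b - y   # blue minus luma
    v = r - y   # red minus luma
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv: recover R and B, then solve the luma equation for G."""
    r = v + y
    b = u + y
    g = (y - KR * r - KB * b) / KG
    return r, g, b


print(rgb_to_yuv(0.5, 0.5, 0.5))   # neutral gray: U and V are (near) zero
print(rgb_to_yuv(1.0, 0.0, 0.0))   # pure red: V is strongly positive
```

Green never needs its own difference channel because it can be recovered from Y, U, and V, which is why three values are still enough to describe the pixel.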
Measured in Visible for Classic Photometry (and Infrared)
– Visible Photometer (detectors sensitive to intensity in λ bands)
  Luminous flux compared to standard source (lumen) – visible to eye
  Number of photons or flux in photons/cm²
– Spectrograph (frequency separation from incident wave)
– Radiometry or Radiant flux – total power in watts
Physiology of What We Can See
– Tristimulus Response of Photoreceptors: L (560-580 nm), S (420-440 nm), M (530-540 nm)
– What We “See” in Various Lighting Conditions
– Different and Perhaps Not Identical to the Photometer
– Perhaps Not Identical to Display in Lighting Conditions
Encoding from Measurements to Digital Samples
– Per Pixel, by color band, in an array
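As a concrete illustration of "per pixel, by color band, in an array", the sketch below assumes a row-major, band-interleaved layout (R,G,B,R,G,B,...), which is how rgb24 PPM data is stored. Other layouts, such as planar, exist and would need a different formula. The helper names are made up for this example.

```python
def pixel_offset(row, col, band, width, bands=3):
    """Index of one sample in a row-major, band-interleaved buffer."""
    return (row * width + col) * bands + band


def get_pixel(buf, row, col, width, bands=3):
    """Return all band samples for the pixel at (row, col) as a tuple."""
    start = pixel_offset(row, col, 0, width, bands)
    return tuple(buf[start:start + bands])


# A 2x2 image: red and green on the first row, blue and white on the second.
image = bytes([255, 0, 0,   0, 255, 0,
               0, 0, 255,   255, 255, 255])

print(get_pixel(image, 1, 0, width=2))   # (0, 0, 255)
```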
Additive model
Subtractive model
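The two models named above can be compared with a few lines of code. This is an idealized sketch: additive mixing is modeled as a clipped per-channel sum of 8-bit RGB lights, and the subtractive (CMY) amounts as the simple complement of RGB. Real inks and displays do not behave this cleanly.

```python
def add_light(*colors):
    """Additive mixing of 8-bit RGB lights, clipped at 255 per channel."""
    return tuple(min(255, sum(c[i] for c in colors)) for i in range(3))


def rgb_to_cmy(rgb):
    """Idealized subtractive amounts: the complement of each additive primary."""
    return tuple(255 - v for v in rgb)


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

yellow = add_light(RED, GREEN)
magenta = add_light(RED, BLUE)
cyan = add_light(GREEN, BLUE)
white = add_light(RED, GREEN, BLUE)

print(yellow, magenta, cyan, white)
print(rgb_to_cmy(yellow))   # (0, 0, 255): only the yellow component is non-zero
```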
Displayed Color Transformation Demo
Adapting Rec 709 sRGB Color Model to Different Lighting Conditions
– Cinematic, Ambient Fluorescent or Tungsten, Full-spectrum sunlight
Based on Primaries, Tristimulus, Lighting, We “See” a different Gamut
CRT Gamut and CIE Chromaticity Diagram
Plot of Colors Visible to Eye
Chromaticity Diagram
L(560-580nm) S(420-440nm) M(530-540nm)
Rec 709 Color Space and D65 White Point
http://en.wikipedia.org/wiki/File:CIE-1931_diagram_in_LAB_space.svg
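To connect the Rec 709 color space and the D65 white point to positions on the chromaticity diagram, the sketch below maps linear RGB to CIE 1931 xy coordinates. The matrix is the commonly published linear Rec. 709 / sRGB to XYZ conversion, rounded to four places. The input is assumed to be linear (no gamma encoding), and the function name `rgb_to_xy` is made up for this example.

```python
# Linear Rec. 709 / sRGB to CIE XYZ (D65), coefficients rounded to four places.
RGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def rgb_to_xy(r, g, b):
    """Map linear RGB to CIE 1931 xy chromaticity coordinates."""
    x_, y_, z_ = (m[0] * r + m[1] * g + m[2] * b for m in RGB_TO_XYZ)
    total = x_ + y_ + z_
    if total == 0:
        raise ValueError("black has no defined chromaticity")
    return x_ / total, y_ / total


white = rgb_to_xy(1.0, 1.0, 1.0)       # lands near the D65 white point
dim_white = rgb_to_xy(0.25, 0.25, 0.25)
red = rgb_to_xy(1.0, 0.0, 0.0)         # one corner of the Rec 709 triangle

print(white, dim_white, red)
```

Full white and dim white map to the same point, roughly (0.3127, 0.3290), which is the sense in which chromaticity is independent of luminance. The three primaries give the corners of the gamut triangle on the diagram.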
Demo of Transform
Work with eeColor for RT Color Transform at 1080p
Gamut Boost for Lighting Conditions and Display Characteristics
http://www.eecolor.com/
Color and Object Recognition Demo (Revisited)
Object Recognition and Tracking Using Color in Real-Time
Use Color Models or “Signatures” for Known Objects
– General Color Perception and Recognition – Computer Vision
– Specific Color Signature Recognition – Machine Vision
– Controlled Lighting, a priori algorithm; tracks not primary colors, but rather centroids of objects with a color signature
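The idea of tracking the centroid of an object with a known color signature can be sketched as follows. This is not the demo's actual algorithm. It assumes the signature is a single RGB value known in advance and that a pixel matches when every channel is within a fixed tolerance, which is only plausible under controlled lighting. The function name and parameters are made up for this example.

```python
def color_centroid(pixels, width, height, signature, tolerance):
    """Return the (row, col) centroid of pixels matching a color signature.

    `pixels` is a row-major list of (R, G, B) tuples. A pixel matches when
    every channel is within `tolerance` of `signature`. Returns None when
    no pixel matches.
    """
    row_sum = col_sum = count = 0
    for row in range(height):
        for col in range(width):
            p = pixels[row * width + col]
            if all(abs(p[i] - signature[i]) <= tolerance for i in range(3)):
                row_sum += row
                col_sum += col
                count += 1
    if count == 0:
        return None
    return row_sum / count, col_sum / count


# 4x4 gray frame with a 2x2 orange object at rows 1-2, columns 2-3.
GRAY, ORANGE = (90, 90, 90), (250, 130, 20)
frame = [GRAY] * 16
for r in (1, 2):
    for c in (2, 3):
        frame[r * 4 + c] = ORANGE

print(color_centroid(frame, 4, 4, signature=(255, 128, 16), tolerance=10))
```

Running the same function on each decoded frame gives a track of the object's position over time, without ever classifying individual primary colors.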
ECEN 4623/5623 – University of Colorado Boulder