HSA-4146: Creating Smarter Applications and Systems Through Visual Intelligence
Presentation by Jeff Bier at the AMD Developer Summit (APU13), November 11-13, 2013
Jeff Bier, Founder, Embedded Vision Alliance / President, BDTI
AMD Developer Summit, November 13, 2013
Creating Smarter Applications and Systems
Through Visual Intelligence
“Computer vision is the science and technology of machines that see,
where ‘see’ means that the machine is able to extract information from
an image that is necessary to solve some task.”
– Adapted from en.wikipedia.org/wiki/Computer_vision
Computer vision is distinct from other types of video and image
processing: it involves extracting meaning from visual inputs.
We use the term “embedded vision” to refer to the practical deployment
of computer vision into a wide range of products and applications
• Industrial, automotive, medical, defense, retail, gaming, consumer
electronics, security, education, …
• In embedded systems, mobile devices, PCs and the cloud
Copyright © 2013 Embedded Vision Alliance 2
Computer Vision / Embedded Vision
The proliferation of embedded vision is enabled by:
• Hardware: processors, sensors, etc.
• Software: tools, algorithms, libraries, APIs
Why is Embedded Vision Proliferating Now?
Embedded vision upgrades what machines know about the physical world,
and how they interact with it
This enables dramatic improvements in existing products—and creation of
new types of products
Embedded vision can:
• Boost efficiency: Improving throughput and quality
• Enhance safety: Detecting danger and preventing accidents
• Simplify usability: Making the “user interface” disappear
• Fuel innovation: Enabling us to do things that were previously impossible
What Does Embedded Vision Enable?
Embedded Vision:
The Software-Defined Sensor
Established (or rapidly growing)
embedded vision markets:
• Factory automation
• Agriculture
• Video game consoles
• Military
• Automotive safety
• Augmented reality for retail
(in store, at home, mobile)
• Public safety and security
Example Embedded Vision Application Areas
Emerging embedded vision
markets:
• Building automation
• Toys and games
• User interfaces (mobile devices,
cars, consumer electronics)
• Robots for many uses and
settings
• Education
• Clinical and home health care
• Field service (e.g., equipment
repair)
• Aids for the visually impaired
Augmenting Human Capabilities: OrCam
Visual Interpreter for the Sight Impaired
www.youtube.com/watch?v=ykDDxWbt5Nw
• Infinitely varying inputs in many applications: uncontrolled lighting,
orientation, motion, occlusion
• Complex, multi-layered algorithms
• Lack of analytical models means
exhaustive experimentation is required
• Numerous algorithms and algorithm
parameters to choose from
• Most vision applications involve high data rates and
complex algorithms, resulting in high computation requirements
• For vision to be widely deployed, it must be implemented in many
designs that are constrained in cost, size, and power consumption
• Most product creators lack experience with embedded vision
What Makes Embedded Vision Hard?
A typical embedded vision pipeline:
Typical total compute load: ~10-100 billion operations/second
Loads can vary dramatically with pixel rate and algorithm complexity
How Does Embedded Vision Work?
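The quoted compute-load range can be sanity-checked with simple arithmetic; the frame size, frame rate, and per-pixel operation count below are illustrative assumptions, not figures from the presentation:

```python
# Back-of-envelope estimate of embedded vision compute load.
# Frame size, frame rate, and ops/pixel are illustrative assumptions.
width, height = 1920, 1080          # 1080p frames
fps = 30                            # frames per second
ops_per_pixel = 300                 # rough cost of a multi-stage pipeline

pixels_per_second = width * height * fps
total_ops = pixels_per_second * ops_per_pixel
print(f"{pixels_per_second / 1e6:.0f} Mpixels/s, "
      f"{total_ops / 1e9:.1f} billion ops/s")
```

Even these modest assumptions land inside the ~10-100 billion operations/second range; higher resolutions or more complex algorithms push the load toward the top of it.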
Image Acquisition → Image Pre-processing → Feature Detection → Segmentation → Object Analysis → Heuristics or Expert System
• Early stages (acquisition, pre-processing): ultra-high data rates; low to medium algorithm complexity
• Middle stages (feature detection, segmentation): high to medium data rates; medium algorithm complexity
• Late stages (object analysis, heuristics or expert system): low data rates; high algorithm complexity
How Does Embedded Vision Work?
Detect lane markings on the road and warn when car veers out of the lane
A simplified solution:
• Acquire road images from front-facing camera (often with fish-eye
lens)
• Apply pre-processing (primarily lens correction)
• Perform edge detection
• Detect lines in the image with Hough transform
• Determine which lines are lane markings
• Track lane markings and estimate positions in the next frame
• Assess car’s trajectory with respect to lane and warn driver in case
of lane departure
Lane Departure Warning—The Problem
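The simplified solution above can be organized as a per-frame pipeline. Every function below is a hypothetical placeholder (the names and the offset-based warning rule are assumptions for illustration, not the presenter's implementation):

```python
# Skeleton of the simplified lane-departure pipeline described above.
# Each stage is a hypothetical placeholder standing in for the real
# algorithm (lens correction, edge detection, Hough transform, ...).

def undistort(frame):
    """Reverse lens distortion (placeholder: identity)."""
    return frame

def detect_edges(frame):
    """Edge detection, e.g. Sobel or Canny (placeholder)."""
    return []

def hough_lines(edges):
    """Detect candidate lines via Hough transform (placeholder)."""
    return []

def select_lane_markings(lines):
    """Filter lines by length, position, angle, color (placeholder)."""
    return lines

def departure_warning(markings, lane_width=3.5):
    """Warn if the car's offset from lane center exceeds half a lane."""
    if not markings:
        return False
    offset = sum(m["offset"] for m in markings) / len(markings)
    return abs(offset) > lane_width / 2

def process_frame(frame):
    frame = undistort(frame)
    edges = detect_edges(frame)
    markings = select_lane_markings(hough_lines(edges))
    return departure_warning(markings)
```

The structure mirrors the bullet list: each stage consumes the previous stage's output, and only the final, low-data-rate stage makes the warning decision.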
• Lenses (especially inexpensive ones) tend to distort images
• Straight lines become curves
• Distorted images tend to thwart vision
algorithms
Lane-Departure Warning: Lens Distortion
Section based on “Lens Distortion Correction” by Shehrzad Qureshi; used with permission. Image courtesy of and © Luis Alvarez
• A typical solution is to use a known test pattern to quantify the lens
distortion and generate a set of warping coefficients that enable the
distortion to be (approximately) reversed
• The good news: the calibration procedure is performed once
• The bad news: the resulting coefficients then must be used to
“undistort” (warp) each frame before further processing
• Warping requires interpolating between pixels
Lens Distortion: A Solution
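The per-pixel warp with interpolation can be sketched as follows. The single-coefficient radial model and the coordinate convention are illustrative assumptions; real calibration produces a richer coefficient set:

```python
# Sketch of undistorting one output pixel: map its coordinate through a
# simple radial distortion model, then bilinearly interpolate the source.
# The one-coefficient model (k1) is an illustrative assumption.

def bilinear(img, x, y):
    """Sample an image (list of rows) at fractional coordinates (x, y)."""
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, len(img[0]) - 1), min(y0 + 1, len(img) - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def undistort_pixel(img, x, y, cx, cy, k1):
    """Fetch the distorted source sample for undistorted pixel (x, y)."""
    dx, dy = x - cx, y - cy              # offset from distortion center
    r2 = dx * dx + dy * dy
    scale = 1 + k1 * r2                  # radial model: r_d = r * (1 + k1*r^2)
    return bilinear(img, cx + dx * scale, cy + dy * scale)
```

Running this for every pixel of every frame is what makes the "undistort" step costly even though calibration itself happens only once.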
In our lane-departure warning system, edge detection is the first step in
detecting lines (which may correspond to lane markings)
• Edge detection is a well-understood technique
• Primarily comprises 2D FIR filtering
• Computationally intensive pixel processing
• Many algorithms are available (Canny, Sobel, etc.)
• Some algorithms are highly data-parallel
• Others (e.g. Canny) include steps such as edge-tracing that
reduce data parallelism
Lane Departure Warning: Edge Detection
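The 2D FIR filtering can be illustrated with the Sobel operator on a grayscale image stored as a list of rows; this is a sketch, not an optimized implementation:

```python
# Minimal Sobel edge-magnitude sketch: two 3x3 FIR filters estimate the
# horizontal and vertical gradients; |gx| + |gy| approximates magnitude.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]    # border pixels left at zero
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = abs(gx) + abs(gy)  # L1 approximation of gradient
    return out
```

The inner loops are fully data-parallel, which is why this kind of filtering maps well to GPUs and DSPs; Canny's later edge-tracing stage does not share that property.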
Edge thinning removes spurious edge pixels
• Improves output of Hough transform
• Often performed in multiple passes over the frame
• Also useful in other applications
Lane-Departure Warning: Edge Thinning
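A minimal sketch of one thinning pass, assuming suppression along the horizontal direction only; a full implementation suppresses along the local gradient direction instead:

```python
# One simple form of edge thinning: non-maximum suppression along the
# horizontal direction. Input is an edge-magnitude image (list of rows);
# a pixel survives only if it is a horizontal local maximum.

def thin_horizontal(mag):
    h, w = len(mag), len(mag[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(1, w - 1):
            # ">=" on the left, ">" on the right keeps exactly one pixel
            # of a flat plateau
            if mag[y][x] >= mag[y][x - 1] and mag[y][x] > mag[y][x + 1]:
                out[y][x] = mag[y][x]
    return out
```

Thinning a few-pixel-wide edge response down to a one-pixel ridge is what sharpens the Hough transform's vote peaks in the next stage.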
• Hough transform examines the edge pixels found in the image, and
detects predefined shapes (typically lines or circles)
• In a lane-departure warning system, Hough transform is used to detect
lines, which may correspond to lane markings on the road
Lane-Departure Warning: Hough Transform
[Figure: original image; edges detected; lines detected]
The Hough accumulator is similar to a histogram
• Each detected edge pixel is a “vote” for all of the lines that pass
through the pixel’s position in the frame
• Lines with the most “votes” are detected in the image
• Uses a quantized line-parameter space (e.g. angle and distance
from origin)
• Must compute all possible line-parameter values for each detected
edge pixel
Lane-Departure Warning: Hough Transform
[Figure: every possible line through an edge pixel gets one vote when the pixel is processed; a line passing through many edge pixels accumulates many votes]
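The voting scheme described above can be sketched in a few lines of pure Python; the theta resolution and integer rho quantization are illustrative choices:

```python
# Minimal Hough-transform voting sketch: each edge pixel casts one vote
# for every quantized line (theta, rho) that passes through it.
import math

def hough_votes(edge_pixels, n_theta=180):
    acc = {}                                  # sparse accumulator
    for (x, y) in edge_pixels:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # normal-form line parameters: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            key = (t, rho)
            acc[key] = acc.get(key, 0) + 1
    return acc

# Collinear edge pixels on the horizontal line y = 5: the cell for
# theta = 90 degrees, rho = 5 collects one vote from every pixel.
votes = hough_votes([(x, 5) for x in range(10)])
```

A production implementation would use a dense 2-D accumulator array and then scan it for peaks above a vote threshold; note that every edge pixel touches all `n_theta` cells, which is why this stage is compute-heavy despite its simple arithmetic.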
• Filter the detected lines to discard lines that are not likely to be lane
markings
• Find start and end points of line segments
• Filter by length, position, and angle
• Filter by line color and background color
• Additional heuristics may apply (e.g. dashed or solid lines are likely
to be lane markings, but lines with uneven gaps are not)
• Possibly classify the lines as lane markings or other lane indication (e.g.
curb)
Lane-Departure Warning:
Detecting Lane Markings
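The length/position/angle filtering can be sketched as a predicate over candidate line segments. The segment representation and the numeric thresholds are illustrative assumptions, not values from the presentation:

```python
# Sketch of filtering Hough-detected lines down to likely lane markings.
# Thresholds (min length, angle window) are illustrative assumptions.
import math

def is_lane_marking(line,
                    min_length=40,           # pixels
                    angle_range=(20, 70)):   # degrees from horizontal
    """line: dict with segment endpoints x1, y1, x2, y2."""
    dx = line["x2"] - line["x1"]
    dy = line["y2"] - line["y1"]
    length = math.hypot(dx, dy)
    angle = abs(math.degrees(math.atan2(dy, dx)))
    if angle > 90:                           # fold to [0, 90]
        angle = 180 - angle
    return length >= min_length and angle_range[0] <= angle <= angle_range[1]

candidates = [
    {"x1": 0, "y1": 0, "x2": 50, "y2": 50},   # 45 degrees, long: keep
    {"x1": 0, "y1": 0, "x2": 100, "y2": 0},   # horizontal: discard
]
markings = [l for l in candidates if is_lane_marking(l)]
```

Color and dash-pattern heuristics would be layered on top of this geometric filter in the same way: each one is an independent predicate that prunes the candidate list.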
• Tracking lane markings from each frame to the next
• Helps eliminate spurious errors
• Provides a measure of the car’s trajectory relative to the lane
• Typically done using a predictive filter:
• Predict new positions of lane markings in the current frame
• Match the lane markings to the predicted positions and compute the
prediction error
• Update the predictor for future frames
• Kalman filters are often used for prediction in vision applications
• Theoretically these are the fastest-converging filters
• Found in OpenCV
• Simpler filters are often sufficient
• Very low computational demand due to low data rates
• E.g., 2 lane marking positions × 30 fps → 60 samples per second
Lane-Departure Warning:
Tracking Lane Markings
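The predict/match/update loop above can be illustrated with an alpha-beta filter, an example of the "simpler filters" that are often sufficient; the 1-D state and the gain values are illustrative assumptions:

```python
# Predict/match/update tracking with an alpha-beta filter: simpler than
# a Kalman filter, and cheap enough for the low data rates noted above.
# The gains alpha and beta are illustrative assumptions.

class AlphaBetaTracker:
    def __init__(self, position, alpha=0.5, beta=0.1):
        self.position = position      # e.g. lane-marking lateral offset
        self.velocity = 0.0
        self.alpha, self.beta = alpha, beta

    def predict(self, dt=1.0):
        """Predict the marking's position in the next frame."""
        return self.position + self.velocity * dt

    def update(self, measured, dt=1.0):
        """Match the measurement to the prediction and correct the state."""
        predicted = self.predict(dt)
        error = measured - predicted          # prediction error
        self.position = predicted + self.alpha * error
        self.velocity += self.beta * error / dt
        return error

tracker = AlphaBetaTracker(position=100.0)
for measurement in [102.0, 104.0, 106.0]:     # marking drifting steadily
    tracker.update(measurement)
```

After a few frames the estimated velocity picks up the steady drift, so the filter both smooths spurious detections and yields the trajectory measure used for the departure decision.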
• The basic algorithm presented is not robust; it may need
significant enhancements for real-world conditions
• Must work properly on curved roads
• Must handle diverse conditions (e.g. glare on wet road at
night)
• Integration with other automotive safety functions
Lane-Departure Warning: Challenges
• Nearly all embedded vision systems use a CPU, but using only CPUs
is often impractical due to power, size, and/or cost
• Many other processor types are used: GPUs, DSPs,
FPGAs, many-core arrays, specialized datapath engines, etc.
• But this can create big challenges for developers:
• Figuring out how to partition
• Complex programming models
• Multiple languages, tool flows
Heterogeneous Workloads Often Map Most
Efficiently to Heterogeneous Architectures
[Figure: the pipeline stages (Image Acquisition, Image Pre-processing, Feature Detection, Segmentation, Object Analysis, Heuristics or Expert System) mapped onto different processor types]
[Figure: heterogeneous SoC block diagram with CPU, GPU, DSP, and ISPs alongside connectivity, display, navigation, sensor, and multimedia blocks. Source: Qualcomm]
The Embedded Vision Alliance (www.Embedded-Vision.com) is a
partnership of 35 leading embedded vision technology and services
suppliers
Mission: Inspire and empower product creators (including mobile
app developers) to incorporate visual intelligence into their
products
The Alliance provides free, high-quality technical educational
resources for engineers
• The Embedded Vision Academy offers in-depth tutorial
articles, video “chalk talks,” code examples, tools and
discussion forums
• The Embedded Vision Insights newsletter delivers news,
Alliance updates and new resources
Companies interested in becoming sponsoring members of the
Alliance should contact [email protected]
Helping Product Creators Harness
Embedded Vision
• “Embedded vision” refers to practical systems that extract meaning
from visual inputs
• Embedded vision upgrades what machines know about the physical
world, and how they interact with it, enabling dramatic improvements
in existing products—and creation of new types of products
• To date, embedded vision has largely been limited to low-profile
applications like surveillance and industrial inspection
• Thanks to the emergence of high-performance, low-cost, energy-efficient
programmable processors, this is changing
• Heterogeneous processors are often best for embedded vision
• HSA increases flexibility and simplifies programming
• The Embedded Vision Alliance provides a wide range of resources to
help product creators incorporate visual intelligence into their products
Conclusions
• Eye-Catching Vision Video Clips: http://www.embedded-vision.com/eye-catching-embedded-vision-clips
• Embedded Vision Alliance News Stream: http://www.embedded-vision.com/news
• BDTI OpenCV Executable Demo Package (no programming required): www.embeddedvisionacademy.com/opencvdemo
• BDTI Quick-Start OpenCV Kit: www.embeddedvisionacademy.com/opencvkit
Resources
Thank You
Visit us at www.Embedded-Vision.com
• Alliance Member companies position themselves as
leaders in front of thousands of product creators
who visit the Alliance web site each month
• Multiple Embedded Vision Summit conferences
each year introduce Member companies and their
products to prospective customers
• Our Member companies meet quarterly to develop
business partnerships and gain insights into
embedded vision markets and technology trends
• We secure frequent press coverage on embedded
vision topics, gaining exposure for our members as
thought leaders
• Companies interested in joining the Alliance may
contact us via [email protected]
Vision Technology and Service Suppliers:
Join the Alliance