TRANSCRIPT
© 2014 Embedded Vision Alliance 1
“Half of the human brain is devoted
directly or indirectly to vision.”
– Paraphrased from Prof. Mriganka Sur, MIT
Jeff Bier, Founder, Embedded Vision Alliance / President, BDTI
Presentation for Cubic Corporation, November 10, 2014
The Era of Machines That See:
Opportunities, Challenges, and Trends in
Embedded Vision
Computer vision: systems that extract meaning from visual inputs
Computer vision has been an
active research field for decades,
with limited commercial applications
Embedded vision: the practical, widely
deployable evolution of computer vision
• Applications: industrial, automotive,
medical, defense, retail, gaming,
consumer electronics, security, education, …
• Embedded systems, mobile devices, PCs and the cloud
Computer Vision Embedded Vision
The proliferation of embedded vision is enabled by:
• Hardware: processors, sensors, etc.
• Software: tools, algorithms, libraries, APIs
Why is Embedded Vision Proliferating Now?
[Chart: DSP Performance: High-end, Single-core DSPs from TI. MMACs/second by year, 1996-2012, rising to roughly 10 GMACs/second by 2012. Source: BDTI Analysis]
Enabling Embedded Vision:
Processor Performance
Most machines are useful only to the extent that they interact with the
physical world
Visual information is the richest source of information about the real
world: People, places, and things
Vision is the highest-bandwidth mode for machines to obtain info from
the real world
Embedded vision can:
• Boost efficiency: Improving throughput and quality
• Enhance safety: Detecting danger and preventing accidents
• Simplify usability: Making the “user interface” disappear
• Fuel innovation: Enabling us to do things that were impossible
The Highest Bandwidth Input Channel
What Does Embedded Vision Enable?
Augmented Reality for Realistic Simulations
www.youtube.com/watch?v=v_bo1m7kXgQ
Augmenting Human Capabilities: OrCam
Visual Interpreter for the Sight Impaired
www.youtube.com/watch?v=ykDDxWbt5Nw
Dyson 360 Robot Vacuum
www.youtube.com/watch?v=oguKCHP7jNQ
Touch+ Makes Any Surface Multi-touch
www.youtube.com/watch?v=v_bo1m7kXgQ
Embedded Vision Increases
Human Productivity and Safety
www.youtube.com/watch?v=9Wv9k_ssLcI
Mercedes: www.youtube.com/watch?v=WGgSyA8HXyY
Philips: www.youtube.com/watch?v=2M7AFoqJyDI
IKEA: www.youtube.com/watch?v=DhbHnec4se0
LEGO: www.youtube.com/watch?v=mUuVvY4c4-A
www.youtube.com/watch?v=Td7cKB2BxIo
Amazon: www.youtube.com/watch?v=bnqnvL8B0k0
www.youtube.com/watch?v=8gy5tYVR-28
Stanley: www.youtube.com/watch?v=orTO3E0Vvok
Audi: www.youtube.com/watch?v=2YqflcbCVZg
Tesco: www.youtube.com/watch?v=bMCw7-lYUKw
Major League Baseball: bit.ly/1qylyRI
CENTR Cam: vimeo.com/91037496
More Videos for Later
How Does Embedded Vision Work?
[Diagram: each successive pipeline stage performs data reduction]
How Does Embedded Vision Work?
A simplified embedded vision pipeline:
Typical total compute load: ~10-100 billion operations/second
Loads can vary dramatically with pixel rate and algorithm complexity
Image Acquisition → Image Pre-processing → Feature Detection → Segmentation → Object Analysis → Heuristics or Expert System
• Early stages (acquisition, pre-processing): ultra-high data rates; low to medium algorithm complexity
• Middle stages (feature detection, segmentation): high to medium data rates; medium algorithm complexity
• Late stages (object analysis, heuristics or expert system): low data rates; high algorithm complexity
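As a sanity check on the compute-load figure above, here is a back-of-envelope estimate in Python. The frame size, frame rate, and per-pixel operation counts are illustrative assumptions, not measurements:

```python
# Back-of-envelope estimate of early-pipeline compute load.
# All figures below are illustrative assumptions.
width, height, fps = 1920, 1080, 30          # 1080p30 camera
pixel_rate = width * height * fps            # pixels per second

# Assume pixel-level stages cost roughly 100-500 operations per pixel
# (filtering, color conversion, gradient computation).
low = pixel_rate * 100
high = pixel_rate * 500

print(f"{pixel_rate / 1e6:.0f} Mpixels/s -> "
      f"{low / 1e9:.0f}-{high / 1e9:.0f} billion ops/s")
```

Even with these modest assumptions, a single 1080p30 stream lands in the tens of billions of operations per second, consistent with the ~10-100 billion figure cited above.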
How Does Embedded Vision Work?
A typical embedded vision algorithm:
Vision Algorithm Development Challenges
Image Acquisition → Lens Correction → Image Pre-processing → Segmentation → Object Analysis → Heuristics or Expert System
• Infinitely varying inputs in many applications…
• Uncontrolled conditions: lighting, orientation, motion, occlusion
• Leads to ambiguity…
• Leads to the need for complex,
multi-layered algorithms to extract
meaning from pixels
• Plus:
• Lack of analytical models means
exhaustive experimentation is required
• Numerous algorithms and algorithm
parameters to choose from
• It’s a whole-system problem
What Makes Embedded Vision Hard?
• Most vision applications involve high data rates and complex algorithms
• For vision to be widely deployed, it must be implemented in many
designs that are constrained in cost, size, and power consumption
• These constraints, combined with high performance and bandwidth
demands, create challenging design problems
• Algorithms are diverse and dynamic, so fixed-function compute engines
are less attractive
• Modern embedded CPUs may have the muscle, but are often too
expensive or power-hungry
• Many vision applications require parallel or specialized hardware
• E.g., DSP, GPU, FPGA or other co-processor
• Most product creators lack experience with embedded vision
Implementing Embedded Vision
is Challenging
Example: Lane Marking Detection
Detect lane markings on the road and warn when car veers out of the lane
The “textbook” solution:
• Acquire road images from front-facing camera (often with fish-eye
lens)
• Apply pre-processing (primarily lens correction)
• Perform edge detection
• Detect lines in the image with Hough transform
• Determine which lines are lane markings
• Track lane markings and estimate positions in the next frame
• Assess car’s trajectory with respect to lane and warn driver in case
of lane departure
Lane-Departure Warning—The Problem
In a lane-departure warning system, edge detection is the first step in
detecting lines (which may correspond to lane markings)
• Edge detection is a well-understood technique
• Primarily comprises 2D FIR filtering
• Computationally-intensive pixel processing
• Many algorithms are available (Canny, Sobel, etc.)
Lane-Departure Warning: Edge Detection
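To make the 2D FIR filtering point concrete, here is a minimal Sobel edge detector in NumPy. This is an illustrative sketch: the naive convolution loop and the synthetic test frame are assumptions of this example, and a real system would use an optimized library such as OpenCV:

```python
import numpy as np

# Sobel edge detection expressed as 2D FIR filtering (illustrative
# sketch; the naive convolution is written for clarity, not speed).
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=np.float32)  # horizontal gradient
KY = KX.T                                      # vertical gradient

def filter2d(img, kernel):
    """Naive 'valid'-mode 2D correlation."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1),
                   dtype=np.float32)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

def sobel_magnitude(img):
    gx = filter2d(img, KX)
    gy = filter2d(img, KY)
    return np.hypot(gx, gy)    # per-pixel gradient magnitude

# Synthetic 8x8 frame with a vertical step edge between columns 3 and 4:
frame = np.zeros((8, 8), dtype=np.float32)
frame[:, 4:] = 255.0
mag = sobel_magnitude(frame)
# The strongest responses straddle the step; elsewhere the magnitude is 0.
```

The per-pixel multiply-accumulate loop is exactly why this stage dominates the compute budget: every output pixel costs nine multiplies and adds per kernel.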
Edge thinning removes spurious edge pixels
• Improves output of Hough transform
• Often performed in multiple passes over the frame
• Also useful in other applications
Lane-Departure Warning: Edge Thinning
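One simple form of edge thinning is non-maximum suppression. The sketch below keeps an edge pixel only if it is a horizontal local maximum of gradient magnitude, which is adequate for the near-vertical edges of lane markings; the threshold and tie-breaking rule are assumptions for illustration (full NMS would follow each pixel's gradient direction):

```python
import numpy as np

# Edge thinning via horizontal non-maximum suppression (illustrative).
def thin_edges(mag, threshold=100.0):
    padded = np.pad(mag, ((0, 0), (1, 1)))       # zero-pad left/right
    left, right = padded[:, :-2], padded[:, 2:]  # horizontal neighbors
    # Strict comparison on the left keeps exactly one pixel of a plateau.
    keep = (mag >= threshold) & (mag > left) & (mag >= right)
    return keep

# A thick (two-pixel-wide) edge response collapses to one pixel per row:
mag = np.array([[0.0, 80.0, 200.0, 200.0, 80.0, 0.0]])
thin = thin_edges(mag)
print(thin.astype(int))    # [[0 0 1 0 0 0]]
```

Thinning like this is what keeps the Hough accumulator from being flooded with near-duplicate votes from thick edge responses.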
• Hough transform examines the edge pixels found in the image, and
detects predefined shapes (typically lines or circles)
• In a lane-departure warning system, Hough transform is used to detect
lines, which may correspond to lane markings on the road
• For good results, lens distortion correction is important
Lane-Departure Warning: Hough Transform
Original image Edges detected Lines detected
Similar to a histogram
• Each detected edge pixel is a “vote” for all of the lines that pass
through the pixel’s position in the frame
• Lines with the most “votes” are detected in the image
• Uses a quantized line-parameter space (e.g. angle and distance
from origin)
• Must compute all possible line-parameter values for each detected
edge pixel
Lane-Departure Warning: Hough Transform
[Diagram: edge pixels voting in the quantized line-parameter space. Every possible line through an edge pixel gets one vote when that pixel is processed; a line passing through many edge pixels accumulates many votes.]
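The voting scheme described above can be sketched directly in NumPy. This is a toy implementation with an assumed 1-degree, 1-pixel quantization of the (angle, distance) space; in practice an optimized routine such as OpenCV's HoughLines would be used:

```python
import numpy as np

# Toy Hough line transform: each edge pixel votes for every quantized
# (theta, rho) line through it, where rho = x*cos(theta) + y*sin(theta).
def hough_lines(edge_pixels, n_theta=180):
    thetas = np.deg2rad(np.arange(n_theta))      # 1-degree angle bins
    max_rho = int(np.ceil(max(np.hypot(x, y) for y, x in edge_pixels)))
    acc = np.zeros((2 * max_rho + 1, n_theta), dtype=np.int32)
    for y, x in edge_pixels:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + max_rho, np.arange(n_theta)] += 1  # one vote per bin
    return acc, max_rho

# Five collinear edge pixels on the vertical line x = 3:
pixels = [(y, 3) for y in range(5)]
acc, max_rho = hough_lines(pixels)
votes = acc[3 + max_rho, 0]    # bin for theta = 0 degrees, rho = 3
print(votes)                   # 5: every pixel voted for the line x = 3
```

Note the cost structure the slide describes: each edge pixel computes a rho for all 180 angle bins, so total work scales with the number of edge pixels times the parameter-space resolution.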
Filter the detected lines to discard lines that are not likely to be lane
markings
• Find start and end points of line segments
• Filter by length, position, and angle
• Filter by line color and background color
• Additional heuristics may apply (e.g. dashed or solid lines are likely
to be lane markings, but lines with uneven gaps are not)
Possibly classify the lines as lane markings or other lane indication (e.g.
curb)
Lane-Departure Warning: Detecting Lane
Markings
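A filtering pass like the one above might look like this in Python. The thresholds and the specific length/angle tests are illustrative assumptions, not values from the presentation:

```python
import math

# Heuristic filter over candidate line segments, keeping only those
# plausible as lane markings. Thresholds are illustrative assumptions.
def is_lane_candidate(x0, y0, x1, y1,
                      min_length=40.0,
                      min_angle_deg=20.0, max_angle_deg=160.0):
    length = math.hypot(x1 - x0, y1 - y0)
    # Segment angle folded into [0, 180) degrees.
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
    # Reject short segments and near-horizontal lines, which are
    # unlikely to be lane markings in a forward-facing camera view.
    return length >= min_length and min_angle_deg <= angle <= max_angle_deg

print(is_lane_candidate(0, 0, 30, 100))   # True: long, steep segment
print(is_lane_candidate(0, 0, 100, 5))    # False: nearly horizontal
```

Color, dash-pattern, and background tests from the slide would be added as further predicates in the same style.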
• Tracking lane markings from each frame to the next
• Helps eliminate spurious errors
• Provides a measure of the car’s trajectory with respect to the lane
• Typically done using a predictive filter:
• Predict new positions of lane markings in the current frame
• Match the lane markings to the predicted positions and compute the
prediction error
• Update the predictor for future frames
• Kalman filters are often used for prediction in vision applications
• Under linear-Gaussian assumptions, the Kalman filter is the optimal (minimum mean-square-error) estimator
• Simpler filters are often sufficient
• Very low computational demand due to low data rates
• E.g. 2 lane-marking positions × 30 fps = 60 samples per second
Lane-Departure Warning: Tracking Lane
Markings
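As the slides note, simpler predictors than a full Kalman filter often suffice at these data rates. A minimal alpha-beta (g-h) tracker implementing the predict/match/update loop above might look like this; the gains and the measurement sequence are illustrative assumptions:

```python
# Alpha-beta (g-h) filter as a simple predictive tracker: predict the
# new position, measure, compute the prediction error, update.
# Gains here are illustrative assumptions; a Kalman filter would
# derive equivalent gains from noise models.
class AlphaBetaTracker:
    def __init__(self, position, alpha=0.5, beta=0.1):
        self.position = position    # e.g. lane-marking x offset (pixels)
        self.velocity = 0.0         # pixels per frame
        self.alpha, self.beta = alpha, beta

    def predict(self):
        # Constant-velocity prediction for the next frame.
        return self.position + self.velocity

    def update(self, measurement):
        predicted = self.predict()
        error = measurement - predicted          # prediction error
        self.position = predicted + self.alpha * error
        self.velocity += self.beta * error
        return self.position

tracker = AlphaBetaTracker(position=100.0)
for measured in [102.0, 104.0, 106.0]:           # marking drifting right
    tracker.update(measured)
# After three frames the estimated velocity is converging toward the
# 2 px/frame drift:
print(round(tracker.velocity, 2))                # 0.77
```

At 60 samples per second and a handful of multiply-adds per update, the computational demand is negligible next to the pixel-processing stages.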
• The basic algorithm presented is not robust; it will need significant
enhancements for real-world conditions
• Must work properly on curved roads
• Must handle diverse conditions (e.g. glare on wet road at night)
• Integration with other automotive safety functions
• May impose memory, processing load, or synchronization constraints
• Tradeoffs between computational demand and quality
• E.g. video resolution, choice of algorithms, etc.
• Billions to hundreds of billions of operations per second required,
depending on tradeoffs
Lane-Departure Warning: Challenges
Processors for Embedded Vision
Though challenged with respect to performance and efficiency, unaided high-performance embedded CPUs are attractive for some vision applications
Vision algorithms are initially developed on PCs with general-purpose
CPUs
CPUs are easiest to use: tools, operating systems, middleware, etc.
Most systems need a CPU for other tasks
However:
Performance and/or efficiency is often inadequate
Memory bandwidth is a common bottleneck
Example: Intel Atom (used in NI 1772C Smart Camera)
Best for: Applications with modest performance needs;
quick time to market
Trend to watch: Integration of GPGPUs with embedded CPUs
High-performance Embedded CPUs
Trend: Heterogeneous Architectures
[Diagram: trade-offs among performance/$, performance/W, and development effort]
• Very heterogeneous processors
• Benefit from huge investments by suppliers
• Hardware performance, efficiency, integration
• Application development infrastructure
• Mobile apps have become a primary locus of software development
• APs can be difficult to buy and use for embedded applications
• APs are used in some embedded applications (sometimes in mobile
device form, sometimes via a system-on-module)
Trend: Mobile Application Processors
[Diagram: a mobile application processor integrating CPU, GPU, DSP, ISP, and VPU cores]
• Graphics processing units (GPUs) are massively parallel machines
• GPUs and their tools have evolved to support non-graphics workloads
(“general-purpose GPU” or “GPGPU”)
• Important recent developments:
• Now in mobile application processors and embedded processors
• OpenCL support
• HSA (Heterogeneous System Architecture)
• Generalized to CPUs, DSPs, FPGAs
Trend: General-Purpose GPU
• As more vision applications achieve high volumes, vision-specific
processors are emerging
• All are co-processors, working in tandem with a CPU
• Many are sold as licensable IP for custom chips:
• Apical, Cadence (Tensilica), CEVA, CogniVue, videantis
• A few are sold as chips: Mobileye, Movidius, TI (EVE)
• Some are do-it-yourself kits:
• For chip designers: Synopsys Processor Designer
• For system designers: Xilinx Zynq
• Challenges:
• Unique programming models and environments
• Limited libraries
• Important (potential) trend: OpenVX
Trend: Vision-Specific (Co-)Processors
FPGA flexibility is very valuable for embedded vision applications
Enables custom specialization and enormous parallelism
Enables selection of I/O interfaces and on-chip peripherals
However:
FPGA design is hardware design, typically done at a low level (register transfer level)
Ease of use improving due to:
Platforms
IP block libraries
Emerging high-level tools
Example: Xilinx Spartan-3 XC3S4000 (in Eutecus Bi-i V301HD smart camera)
Best for: High performance needs with tight size/power/cost budgets
Trends to watch: Vision IP libraries and reference designs; high-level tools
FPGA + CPU
The Embedded Vision Alliance (www.Embedded-Vision.com) is a
partnership of 42 leading embedded vision technology
and services suppliers
Mission: Inspire and empower product creators (including
mobile app developers) to incorporate visual intelligence into
their products
The Alliance provides low-cost, high-quality technical
educational resources for engineers
• The Alliance website offers in-depth tutorial articles,
video “chalk talks,” code examples, discussion forums
• The Embedded Vision Insights newsletter delivers news,
Alliance updates and new resources
Empowering Product Creators to
Harness Embedded Vision
• “Embedded vision” refers to practical systems that extract meaning
from visual inputs
• Embedded vision upgrades what machines know about the physical
world, and how they interact with it, enabling dramatic improvements
in existing products—and creation of new types of products
• Thanks to the emergence of high-performance, low-cost, energy-efficient programmable processors, embedded vision is proliferating
into almost every segment of the electronics industry
• Embedded vision is a huge opportunity for the electronics industry
• Engineers can leverage the Embedded Vision Alliance to learn practical
techniques in embedded vision
Conclusions
Thank You!
ANALYSIS • ADVICE • ENGINEERING FOR EMBEDDED PROCESSING TECHNOLOGY
© 2014 BDTI
BDTI: Over 20 Years of Embedded Processing Leadership
Since 1992, BDTI has helped technology suppliers and users build better products, win business, and reduce risk through:
1. Published Analysis and Training
• Building awareness of technology options, capabilities, and trade-offs through webinars, articles, blogs, and reports
• Teaching effective development techniques live and online
2. Contract Analysis Services
• Strengthening technology marketing and strategy through an independent view, hands-on evaluation, and deep insight
• Enabling rapid, confident technology-selection decisions via benchmarking and analysis
3. Contract Engineering Services
• Delivering specialized engineering services to prove feasibility, increase efficiency, and speed time-to-market
• Solving difficult problems in performance and power consumption
BDTI Services for Vision Systems Design
BDTI provides vision system design services for product development.
• A highly trusted partner—consistently delivering projects right the first time, on time and on budget
• Knowledge of vision applications, algorithms and tools, including OpenCV
“BDTI has set a new standard for quality. BDTI’s deliverables were orders of magnitude better than other vendors’.” - Group Program Manager, Fortune 500 systems and software company
BDTI provides onsite training in embedded engineering
• Workshop and hands-on training in design and implementation of vision systems
Learn more about BDTI’s capabilities in vision engineering at http://www.embedded-vision.com/platinum-members/bdti/overview