mobile: driving the next wave with gpu compute... · gpu compute delivers high performance ... data...
TRANSCRIPT
© Imagination Technologies USA Summit 2
CPU
Compute
GPU
Graphics and compute
Today’s opportunity
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
Differentiated camera applications
Exposed for
highlights
Exposed for
Shadows
HDR
Android’s camera subsystem is now
modelled after professional camera
In an attempt to maintain a consistent
platform running on multiple SoCs, Google
limits rate of adoption of new features
Android
version
Addition to
android.camera.hal
Lollipop
KitKat
JB MR2
JB MR1
Camera
HAL
3.2 Noise reduction
3.1 None
3.0 None
2.0 HDR
JB
ICS
ICS
Auto focus
Video stabilization
1.0 Face detection
© Imagination Technologies USA Summit 3
Today’s opportunity
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
Differentiated camera applications
CPU
Compute
GPU
Graphics and compute
Head
in focus
Wings
in focus
Everything
in focus
GPU compute delivers high performance
for many image processing algorithms
OEMs will choose SoCs that allow them to
differentiate their Android products
Features most requested are computational
photography and computer vision:
Sensor processing – stereo, array and ToF
Panoramas – real-time and high-res
Depth of field (focus stacking)
Gesture recognition
Augmented reality – real lighting
Bringing new features to market fast
requires:
large amounts of processing power
programmability
© Imagination Technologies USA Summit 4
SMP configurations unlikely to scale efficiently beyond four CPUs
GPU multi-processor and multi-pipe configurability enables far more
extensive processor scaling
OpenCL unlocks the full potential of GPUs
GPU
Graphics and compute
USSE1
GPU increasingly dominates SoC area Particularly in premium SoCs
CPU
Compute
2012 2013 2014
65nm SoC
CPU1
2015
© Imagination Technologies USA Summit 5
CPU
Compute
GPU increasingly dominates SoC area Particularly in premium SoCs
GPU
Graphics and compute
SMP
USSE (4 pipes)
USSE (4 pipes)
USSE (4 pipes)
USSE (4 pipes)
40nm SoC
CPU1 CPU0
Sch
ed
ule
r
2012 2013 2014 2015
SMP configurations unlikely to scale efficiently beyond four CPUs
GPU multi-processor and multi-pipe configurability enables far more
extensive processor scaling
OpenCL unlocks the full potential of GPUs
© Imagination Technologies USA Summit 6
GPU increasingly dominates SoC area Particularly in premium SoCs
CPU
Compute
GPU
Graphics and compute
28nm SoC
SMP
CPU1 CPU0 CPU3 CPU2
Sch
ed
ule
r
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
2012 2013 2014 2015
SMP configurations unlikely to scale efficiently beyond four CPUs
GPU multi-processor and multi-pipe configurability enables far more
extensive processor scaling
OpenCL unlocks the full potential of GPUs
© Imagination Technologies USA Summit 7
GPU increasingly dominates SoC area
CPU
Compute
GPU
Graphics and compute
28nm SoC
SMP
CPU1 CPU0 CPU3 CPU2
Sch
ed
ule
r
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
2012 2013 2014
100s of Giga Flops
2015
SGX 544
Rogue Series 8
Rogue Series 7
Rogue Series 6
© Imagination Technologies USA Summit 8
Mobile SoCs provide a single unified
system memory that is shared between all
the hardware blocks
The available bandwidth is often 10x less
than a Desktop PCIe bus
SoC bandwidth does not scale
CPU
Compute
GPU
Graphics and compute
SMP
CPU1 CPU0 CPU3 CPU2
Sch
ed
ule
r
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
USC (16 pipes)
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
© Imagination Technologies USA Summit 9
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
Android’s problem with buffer copies
CPU
Compute
Display Subsystem
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
VDE Encode, Decode
ISP Camera
GPU
Graphics and compute
YUV input YUV copy
Unnecessary copying of
buffers can cripple
performance and power
RGB copy RGB output
Android dictates the formats of camera
and video data presented to apps
developers
The OS APIs may copy frames from one
format to another
Unnecessarily increases bandwidth
Unnecessarily reduces achievable GPU
compute performance
Performance losses can be quickly
compounded, especially when
processing HD video content
© Imagination Technologies USA Summit 10
A suite of software extensions that
enables efficient interoperability of
software running on PowerVR GPUs with
many other SoC hardware blocks
Interoperable with:
CPU
ISP
VDE
PowerVR imaging framework for Android The basis of OEM product differentiation
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
© Imagination Technologies USA Summit 11
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
VDE Encode, Decode
ISP Camera
GPU
Graphics and compute
Display Subsystem
PowerVR imaging framework for Android
Software extensions enable images
written by the ISP to be directly read by
the GPU
Gralloc and EGL used to maintain data
interoperability between the ISP/camera HAL
and the GPU
EGL Image pointers enable separate access
to the Y, U and V planes
EGL extensions enable mapping of image
data into OpenCL and OpenGL ES data
types
Gralloc buffer
NV12
Y
UV
image2d_t
EGLImage KHR
image2d_t
EGLImage KHR
How it works
USC1 USC0
USC3 USC2
• Denoise filter
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading • Denoise filter
© Imagination Technologies USA Summit 12
Software extensions enable images
written by the ISP to be directly read by
the GPU
Gralloc and EGL used to maintain data
interoperability between the ISP/camera HAL
and the GPU
EGL Image pointers enable separate access
to the Y, U and V planes
EGL extensions enable mapping of image
data into OpenCL and OpenGL ES data
types
Other extensions enable interoperability
with the CPU and VDE
Many complex vision and computational
software pipelines can be created,
incorporating the VDE and other
compatible hardware on the SoC
PowerVR imaging framework for Android How it works
GPU
Graphics and compute
CPU
Compute
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
Display Subsystem
USC3 USC2
CPU1 CPU0
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading • Denoise filter • Denoise filter
• H264 encode
© Imagination Technologies USA Summit 13
Improved picture quality for a given
camera sensor
Improves low-light performance
PowerVR imaging framework examples Denoising
GPU
Graphics and compute
CPU
Compute
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
Display Subsystem
USC3 USC2
CPU1 CPU0
Without GPU compute With GPU compute • Denoise filter • Denoise filter
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading
• H264 encode
© Imagination Technologies USA Summit 14
Reveal full dynamic range of video
Compensate the excess or lack of
light
Bring out shadows and highlight
details
PowerVR imaging framework examples HDR
With GPU compute Without GPU compute
GPU
Graphics and compute
CPU
Compute
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
• YUV-to-RGB conversion
• Blend
• Tone map
• H264 decode
• Decode additional bits
© Imagination Technologies USA Summit 15
Reduces frame-to-frame jitter when
user is walking/in motion
Provides smooth recording and
playback of user-generated content
Improves low-light performance
PowerVR imaging framework examples Image stabilization
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
• H264 encode
• Apply transformation matrix
• Read gyro data
• Construct transformation matrix
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading Without GPU compute With GPU compute
© Imagination Technologies USA Summit 16
PowerVR imaging framework examples Face detection
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
Accurate face detection enables
camera auto-focus and auto-
exposure
Enables selective high-fidelity
encoding of key regions of interest
(and bit-rate savings on background)
Without GPU compute With GPU compute
• Blob detector
• Auto focus
• Auto exposure
• Integral image
• Face detect
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading
© Imagination Technologies USA Summit 17
PowerVR imaging framework examples Beautification
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
Without GPU compute With GPU compute
• H264 encode • Poisson Cloning
• Bilateral filter
• Bilateral filter • Integral image
• Face detect
• Blob detector
• Auto focus
• Auto exposure
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading
Skin smoothing, blemish and wrinkle
removal
Popular feature in image and video
sharing apps
© Imagination Technologies USA Summit 18
PowerVR imaging framework examples Background removal
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
Without GPU compute With GPU compute
• H264 encode • Background subtraction
• Person detect
• Integral image
• HoG • Generate
depth map
• Blob detector
• Auto focus
• Auto exposure
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading
• RGB sensor
• ToF/depth sensor
Immersive video conferencing
Reduced bandwidth
Live presentations with presenter
overlaid onto PowerPoint
Gaze correction preserves eye
contact
© Imagination Technologies USA Summit 19
PowerVR imaging framework examples Augmented reality
Perspective correct positioning of objects in
screen
Mimicking light sources to improve the reality
of the object in scene
Improved online shopping experience
Without GPU compute With GPU compute
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1
• Compose camera stream with 3D object
• Apply lighting to 3D object
• Light source calculation
• Light source calculation
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading
CPU0 • 3D object
selection
© Imagination Technologies USA Summit 20
PowerVR imaging framework examples Security cameras
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
• H264 encode • Track • Object detect
• HoG • Integral image
• Face detect
• Blob detector
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading With GPU compute
• Face and object recognition
Mass transit monitoring and security
Identify known persons in crowds
Identify suspicious objects
Reduced CCTV operator costs
© Imagination Technologies USA Summit 21
PowerVR imaging framework examples Retail analytics
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
• H264 encode • Track • Person and
object detect
• Integral image
• HoG • Generate
depth map
• Blob detector
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading With GPU compute
• RGB sensor
• ToF/depth sensor
• Face and object recognition
Providing supermarkets retail information
akin to retail websites
Track people in store, demographics (age,
gender, ethnicity), dwell time
Feedback to retailers and advertisers
Upsell via in-store digital signage
© Imagination Technologies USA Summit 22
PowerVR imaging framework examples
GPU
Graphics and compute
CPU
Compute
Mem
ory
contr
olle
r, c
aches a
nd inte
rconnect
RPU Wi-Fi, Bluetooth,sensors connectivity R
F
VDE Encode, Decode
ISP Camera
Display Subsystem
USC1 USC0
USC3 USC2
CPU1 CPU0
Automotive
• Headlight detection
• Bilateral control algorithm
• Heuristics
• Object recognition
• Kalman smoothing
• Panoramic stitching
• Markov random field
• Bilateral filter
• Ransac detection
• Turn analysis
• Spatial post-processing
• Fish-eye
• Image segmentation
• HoG calculation
• Sensor acquisition
• Defective pixel correction
• Auto white balance
• Lens shading
• Camera
• Radar
• Infrared / thermal
• Ultrasonic
• ToF / depth
Pedestrian detection
Lane departure warning
Traffic sign recognition
Night vision
Intelligent high-beam control
Adaptive cruise control
Headway monitoring and collision
avoidance
Surround view
With GPU compute
© Imagination Technologies USA Summit 23
Libraries PowerVR Imaging Framework Zero-copy Compute
Integration models Integrating the PowerVR imaging framework into Android
Application Software
Hardware Abstraction Layer
Linux Kernel
Android SDK
Hardware (System-on-Chip)
Heterogeneous Compute CPU Compute
VDE Encode, Decode
ISP Camera
OCL Driver UM
OpenGL ES UM
OpenVX
Camera Driver Video Driver
Camera HAL
Media Player
Stock App
Camera Stock
App
Media Player
API
Camera Services
API
Video HAL
Graphics Driver KM
GPU Graphics and compute
© Imagination Technologies USA Summit 24
Libraries PowerVR Imaging Framework Zero-copy Compute
Integration models Integrating the PowerVR imaging framework into Android
Application Software
Hardware Abstraction Layer
Linux Kernel
Android SDK
Hardware (System-on-Chip)
Heterogeneous Compute CPU Compute
VDE Encode, Decode
ISP Camera
OCL Driver UM
OpenGL ES UM
OpenVX
Camera HAL
Media Player
Stock App
Camera Stock
App
Media Player
API
Camera Services
API
Video HAL
Graphics Driver KM
GPU Graphics and compute
Differentiated Multimedia Application
• New features enabled by GPU compute
• Higher performance
• Lower power
Camera Driver Video Driver
© Imagination Technologies USA Summit 25
Libraries
Integration models Integrating the PowerVR imaging framework into Android
Application Software
Hardware Abstraction Layer
Linux Kernel
Android SDK
Hardware (System-on-Chip)
Heterogeneous Compute CPU Compute
VDE Encode, Decode
ISP Camera
Camera HAL Video HAL
GPU Graphics and compute
PowerVR Vision Demo
Fish-eye effect Beautification
HDR mode
Differentiated Multimedia Application – Actual OEM product
© Imagination Technologies USA Summit 26
Ecosystem Partner Silicon Vendor OEM
OEM Product
BSP
How we can engage
PowerVR based SoC
PVR imaging framework
License a PowerVR GPU for your next product
Incorporate the PowerVR imaging framework into your Android BSP
If you provide custom camera software to your OEMs, incorporate the framework into this software as well
PowerVR based SoC
Select a PowerVR-based SoC for your next product
Identify premium features you wish to support using GPU compute
Develop the required software algorithms yourself – or license from Imagination and its ecosystem partners
Optional camera platform
Obtain an OEM platform that supports our PowerVR imaging framework
Prototype your computer vision and photography algorithms using our OpenCL SDK
Camera / MM app
BSP
PVR imaging framework
Optional camera platform
OEM Product
BSP
PVR imaging framework
Demo of new feature
SoC
© Imagination Technologies USA Summit 27
How we can engage
Many development boards and OEM products are now available in the market with a PowerVR
Rogue GPU and OpenCL driver
Most platforms now support PowerVR imaging framework extensions
Development boards
Meizu MX4
MediaTek MT6595
Rogue G6200
Merrii OptimusBoard
AllWinner A80
Rogue G6230
Dell Venue 7/8
Intel Atom Z3480
Rogue G6430
ASUS ZenFone 2
Intel Atom Z3460
Rogue G6430
© Imagination Technologies USA Summit 28
Conclusion
Imagination’s PowerVR imaging framework for Android has been successfully
deployed by OEMs in mobile and tablet products, enabling new camera and
multimedia applications
The framework’s zero-copy buffer management extensions were crucial to building
apps that efficiently leverage all available hardware resources (ISP, CPU, GPU
compute and VDE)
The resulting application software minimizes SoC bandwidth while delivering high
performance and low power consumption
Game changing technology from Imagination
www.imgtec.com/gpucompute