
April 13, 2010 Kari Pulli 1

Mobile Phone HW

Kari Pulli Research Fellow Nokia Research Center Palo Alto

Outline

Mobile devices: huge volumes, next computing frontier
Some key challenges & technical developments
  • power, heat, displays, computation, imaging
Different types of phones -> different technologies
ARM CPU core generations
DSPs
TI OMAP 3430 application engine
Tear-down of N900
Mobile GPU (MBX & SGX)
Sensors: accelerometer, GPS
Cameras: autofocus + zoom + flash

The Scale is Unparalleled in Human History

  • Over 219 countries and territories have mobile networks
  • More than 1 billion Nokia devices sold
  • More than 4 billion mobile subscriptions worldwide
  • Over 80% of the world’s population has coverage
  • More than 1 million new mobile subscribers are added daily

Challenge? Power!

Power is the ultimate bottleneck
  • Devices usually run on batteries, not plugged into the wall
Battery improvement doesn’t follow Moore’s law
  • Only 5-10% per year
Gene’s law
  • "power consumption of integrated circuits decreases exponentially" over time => batteries will last longer
  • Since 1994, the power required to run an IC has declined 10x every 2 years
  • But the performance of 2 years ago is not enough
      • Pump up the speed, use up the power savings
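As a rough illustration of the gap described above, the following sketch compounds the quoted rates over a decade. The function names and the 10-year horizon are illustrative choices, not from the slides; the rates (5-10% per year for batteries, 10x per 2 years for IC power) are the ones stated above.

```python
def battery_gain(years, annual_rate):
    """Capacity multiplier after compounding a yearly improvement rate."""
    return (1.0 + annual_rate) ** years


def ic_power_reduction(years, factor=10.0, period_years=2.0):
    """Power reduction multiplier if power drops by `factor` every `period_years`."""
    return factor ** (years / period_years)


if __name__ == "__main__":
    horizon = 10  # years; an arbitrary horizon chosen for illustration
    print(f"Battery at 5%/yr:  {battery_gain(horizon, 0.05):.2f}x")
    print(f"Battery at 10%/yr: {battery_gain(horizon, 0.10):.2f}x")
    print(f"IC power reduction: {ic_power_reduction(horizon):.0f}x")
```

Under these assumptions the battery improves by a small single-digit factor while the power needed for a fixed workload falls by orders of magnitude, which is why the savings tend to be spent on more performance.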

Power-Flexibility Conflict (Source: T. Claasen, ISSCC99)

[Figure: power efficiency (32b GOPS/Watt, log scale from 0.001 to 1000) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.07 µm). Labels on the chart: hardwired data paths, reconfigurable logic, instruction set processors; hand-crafted design, standard-cell based design, field-programmable gate array, coarse-grain reconfigurable, application-specific processor / DSP, general purpose processor.]

Challenge: Thermal management!

But ridiculously good batteries still won’t be the miracle cure
  • The devices are small
  • Generated heat must get out
  • No room for fans
Thermal management must be considered early in the design
  • Hot spot would fry electronics
      • Or at least inconvenience the user…
  • Conduct the heat through the walls, and finally release to the ambient

New design constraint: Thermal Dissipation Capability of Mobile Device

[Figure: power (W, axis from 1 to 6) for speech, feature, and smart phones, broken down into cellular RF, cellular BB, local connectivity, display + backlight, camera, audio, apps engine, mass memory, power conversion, and miscellaneous. Dissipation capability is marked for a 100cc plastic monoblock, a 100cc metal monoblock, a 100cc plastic clamshell (open), a small metal communicator (open), and a large plastic communicator (open); the region above is labeled "cooling required".]

N900 power usage

Changed? Displays!

Resolution
  • 84 x 48 -> 96 x 65 -> 176 x 208 -> 320 x 240
  • Communicators: 640 x 200 -> 800 x 352
  • “Multimedia computers”: 800 x 480
Color depth
  • gray scale: 1 -> 2 -> 4 -> 8 bit
  • RGB: 12 -> 16 -> 18 -> 24 bit
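To make the growth in resolution and color depth concrete, here is a small sketch that computes the size of one uncompressed frame. The pairings of resolution and color depth are illustrative assumptions (the slide lists the two progressions separately), and the calculation assumes tightly packed pixels with no padding.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes for one tightly packed frame, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8


# Resolution/depth pairings are assumptions made for illustration only.
EXAMPLES = [
    ("84 x 48, 1-bit gray", 84, 48, 1),
    ("176 x 208, 12-bit RGB", 176, 208, 12),
    ("320 x 240, 16-bit RGB", 320, 240, 16),
    ("800 x 480, 24-bit RGB", 800, 480, 24),
]

if __name__ == "__main__":
    for label, w, h, bpp in EXAMPLES:
        print(f"{label}: {framebuffer_bytes(w, h, bpp):,} bytes")
```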

Display technologies

LCD: Transmissive
  • Brightness comes only from backlight
  • High color saturation and contrast
  • Optimized for indoor use
LCD: Transflective
  • Each pixel is divided into a transmissive and a reflective part
  • Display always on (partial mode)
  • Bright performance outdoors
OLED: Emissive
  • Organic LEDs emit the light
  • Extremely vivid colors and high contrast
  • Excellent viewing angles and response time
  • Optimal for moving images

Expanding color gamut

2005: 38 % NTSC gamut
  • Nokia 6600
2006: 75 % NTSC gamut
  • Nokia 6131
2007: 100 % NTSC gamut
  • OLED displays
  • Nokia 8300

Touch displays: Resistive vs. capacitive

Resistive
  • two flexible sheets, resistive coating, air gap
  • passive, pressing with any material is OK
  • high accuracy, lower cost
  • multi-touch?
      • with area / bounding box + heuristics
      • real multi-touch becoming possible
Capacitive
  • insulator (glass) and transparent conductor coating
  • touching with another conductor (skin) changes capacitance
  • faster, more responsive, more expensive
  • can’t use simple stylus
  • with a grid of sensors, real multi-touch

N900 Display

3.5 inch
Resistive (not multi-touch)
800 x 480 (iPhone 480x320)
  • 105 pix/cm = 267 ppi
16M color (24bit) TFT LCD
Content-adaptive backlight control
  • allows reducing backlight or brightness levels depending on the image being displayed
Ambient light sensors
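The pixel density quoted for the N900 display (105 pix/cm = 267 ppi) can be checked from the resolution and diagonal. This is a minimal sketch assuming square pixels and a diagonal of exactly 3.5 inches; the function name is illustrative.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_inches):
    """Pixel density along the diagonal, assuming square pixels."""
    return math.hypot(width_px, height_px) / diagonal_inches


ppi = pixels_per_inch(800, 480, 3.5)

if __name__ == "__main__":
    print(f"{ppi:.1f} ppi")           # pixels per inch
    print(f"{ppi / 2.54:.1f} pix/cm")  # 1 inch = 2.54 cm
```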

Future? Displays!

Physical size remains limited
  • TV-out connection
  • Near-eye displays?
  • Projectors
  • Roll-up flexible displays?

Changed? Computation!

Moore’s law in action
  • 3410 (2002)
      • ARM 7 @ 26MHz
      • Not much caching, narrow bus
      • Von Neumann architecture (shared instr / data bus)
      • 3-stage pipeline
          • FETCH: instruction fetched from memory
          • DECODE: decoding of registers used in instruction
          • EXECUTE: register(s) read from Register Bank, shift and ALU operation, register(s) written back to Register Bank

Changed? Computation!

Moore’s law in action
  • 6600 (2003)
      • ARM 9 @ 104MHz
      • Decent caching, better bus
      • Harvard architecture (separate instr / data bus)
      • 5-stage pipeline
          • FETCH: instruction fetch
          • DECODE: ARM or Thumb instruction decode, register read
          • EXECUTE: shift + ALU
          • MEMORY: memory access
          • WRITE: register write
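The effect of pipeline depth and clock rate can be sketched with the textbook cycle count for an ideal pipeline. This assumes no stalls, branches, or cache misses, which real ARM7 and ARM9 cores do not achieve; the stage counts (3 and 5) and clock rates (26 and 104 MHz) come from the slides, while the function names and the 1000-instruction workload are illustrative.

```python
def pipeline_cycles(instructions, stages):
    """Cycles for an ideal pipeline: fill the pipe once, then retire one per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def run_time_us(instructions, stages, clock_mhz):
    """Run time in microseconds (cycles divided by MHz gives microseconds)."""
    return pipeline_cycles(instructions, stages) / clock_mhz


if __name__ == "__main__":
    n = 1000  # illustrative instruction count
    print(f"3-stage @ 26 MHz:  {run_time_us(n, 3, 26.0):.2f} us")
    print(f"5-stage @ 104 MHz: {run_time_us(n, 5, 104.0):.2f} us")
```

In this idealized model the deeper pipeline costs only two extra fill cycles; the gain comes from the shorter stages allowing a higher clock.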

Changed? Computation!

Moore’s law in action
  • 6630 (2004)
      • ARM 9 @ 220MHz
      • Faster memories
  • N93 (2006)
      • ARM 11 @ 330MHz
      • HW floating-point unit
      • 3D HW (fixed functionality)

ARM11 Pipeline Architecture

[Figure: ARM11 pipeline]
  • Four fetch/decode/issue stages: PF1, PF2 (prefetch), DE (decode of ARM, Thumb, and Java instructions), ISS (register read)
  • Branch prediction: static & dynamic branch prediction, branch folding, return stack
  • Extended ALU and MAC pipes
      • ALU pipeline: SH, ALU, SAT, WB ex
      • MAC pipeline: MAC1, MAC2, MAC3
      • Data cache access: LS add, DC1, DC2, WB LS

Changed? Computation!

Moore’s law in action
  • N900 (2009)
      • ARM Cortex A8 @ 600MHz
      • Neon co-processor: parallel FP
      • 3D HW (programmable shaders)
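The clock rates quoted in this progression (26 MHz in 2002 for the 3410, 600 MHz in 2009 for the N900) imply a growth rate that can be estimated as follows. This is a back-of-the-envelope sketch; clock rate is only one ingredient of performance, and the function names are illustrative.

```python
import math


def annual_growth(start_value, end_value, years):
    """Average yearly growth rate, e.g. 0.5 means +50% per year."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def doubling_time_years(start_value, end_value, years):
    """Years needed to double at the average exponential growth rate."""
    return years * math.log(2) / math.log(end_value / start_value)


if __name__ == "__main__":
    print(f"Growth per year: {annual_growth(26, 600, 7):.0%}")
    print(f"Doubling time:   {doubling_time_years(26, 600, 7):.2f} years")
```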

Changed? Imaging

Even faster improvement on cameras
  • 3410
      • no camera
  • 6600
      • 640 x 480 still images
      • 176 x 144 video 6-15 fps
  • 6630
      • 1280 x 960 still images
      • 176 x 144 video 15 fps
  • N93
      • 2048 x 1536 still images
      • 640 x 480 video 30 fps
      • 3x optical zoom
      • Carl Zeiss Vario-Tessar lens
  • N900
      • 2584 x 1938 still images
      • 848 x 480 video 25 fps

Architecture families

“Cost”: S30, lowest cost
  • Blocks: core SW, cellular modem, RF, FM, display, camera
“Size”: S40/S60, optimize size, maximum flexibility, lots of feature possibilities
  • Blocks: core SW, cellular modem, RF, display, camera, video codec HWA, ISP HWA, GFX HWA, BT, WLAN, FM, GPS
“Performance”: S60, separate app CPU
  • Blocks: core SW, cellular modem, RF, display, camera, video codec HWA, ISP HWA, GFX HWA, BT, WLAN, FM, GPS

Architecture of a Cellular Device

[Figure: block diagram. System chip (ASIC; GSM/EDGE/WCDMA/CDMA2000): system logic, DSP, MCU, SRAM, RFIF, with flash (STD); protocol SW. Analog chip (ASIC): audio codec, BB regulators, energy mgmt. Multimedia chip (ASIC): HW accelerators, DSP, MCU, SRAM, with flash (STD); application SW, Symbian OS. Connected parts: antennas and RF, display, camera, input device, memory card, GPS, Bluetooth, WLAN, DVB-H, FM radio, etc.]

Block Diagram of a Cellular Phone

[Figure: block diagram annotated with process technologies (BiCMOS, GaAs HBT, 0.25µm CMOS, 0.13µm CMOS). Power supply: 720mAh battery, battery charging control, regulators. Mixed-signal: ADC, DAC, audio codec. BB: DSP, MCU, control interfaces logic, memory (SRAM, flash). RF: SYNTH, PA, 900 / 1800 / 1900 bands. Peripherals: keyboard, LCD, LCD backlight, infrared, vibra, microphone, earpiece, handsfree, bottom connector, SIM card.]

There are many radios

  • GSM/WCDMA (6+ bands)
  • Bluetooth
  • GPS
  • WLAN
  • WiMAX
  • DVB-H
  • UWB
  • FM radio
  • RF-ID

New radio applications
  • LTE
  • DMB
  • EPC
  • Wibree
  • Galileo
  • DRM

>10 antennas in terminal -> 25% of the volume

Reconfigurable internal quad-band cellular antenna
[Figure: S11 (dB) versus frequency (MHz), with the 850, 900, 1800, and 1900 bands marked]

Example ARM based System

[Figure: ARM based SoC. Blocks shown: ARM processor core (DEBUG port, nIRQ and nFIQ inputs), on-chip RAM, external memory interface (FLASH, SDRAM), DMA port, AMBA AHB, APB bridge, AMBA APB, interrupt controller, ARM PrimeCell peripherals, GPIO, clocks and reset controller.]

ARM core deeply embedded within a SoC
  • External debug via JTAG port
Design has both external and internal memories
  • Of varying width, speed, size
Includes an interrupt controller
  • Core supports two interrupt inputs (nIRQ, nFIQ)
Elements connected using AMBA (Advanced Microcontroller Bus Architecture)
  • APB - Advanced Peripheral Bus
  • AHB - Advanced High-performance Bus

What is DSP?

A DSP is just a CPU
  • optimized for processing digital samples
  • program it in C, optimize with “intrinsic” functions
Audio/video analytics (voice/face recognition)
Codecs: MP3, AAC, Ogg, Theora, MPEG4, H264
Pre-/post-processing: equalization, mixing, blending

[Figure: signal chain. The real world (temperature, pressure, position, speed, flow, humidity, sound, light) -> analog signal -> signal conditioning -> conversion to digital -> digital signal -> conversion to analog -> signal conditioning. Supporting blocks: power management, interface, clocks & timers.]

DSP vs. CPU

Memory accesses
  • A DSP can do several memory accesses in a single instruction, i.e., DSPs have relatively high bandwidth between the core and memory
Loops
  • DSPs are optimized for the repetition or looping of operations common in signal processing applications
Addressing modes
  • DSPs have specialized addressing modes, such as indirect, circular, and bit-reversed addressing, useful for signal processing
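Circular addressing lets a filter keep a sliding window of recent samples without moving data in memory. The sketch below emulates the idea in software with a modulo index; on a DSP the wrap-around is done by the address-generation hardware. The class name and the example coefficients are illustrative, not from the slides.

```python
class CircularFir:
    """FIR filter whose delay line is a circular buffer."""

    def __init__(self, taps):
        self.taps = list(taps)
        self.buf = [0.0] * len(self.taps)  # delay line, initially silent
        self.head = 0                      # index of the most recent sample

    def step(self, sample):
        """Push one input sample and return one output sample."""
        n = len(self.buf)
        # Advance the write position with wrap-around instead of shifting data.
        self.head = (self.head + 1) % n
        self.buf[self.head] = sample
        acc = 0.0
        # Multiply-accumulate over the window, newest sample first.
        for k, coeff in enumerate(self.taps):
            acc += coeff * self.buf[(self.head - k) % n]
        return acc


if __name__ == "__main__":
    fir = CircularFir([0.5, 0.25, 0.25])
    print([fir.step(x) for x in [4.0, 8.0, 0.0, 0.0]])
```

The inner loop is the multiply-accumulate pattern that DSPs typically execute as a hardware loop with one or more memory reads per cycle.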


TI OMAP 3430


OMAP3 Features

ARM 600 MHz Cortex™-A8 Core
  • ARMv7: Over 2x perf. of ARMv6 SIMD
  • In-Order, Dual-Issue, Superscalar Core
  • NEON™ Multimedia Architecture
  • Both Integer and Floating Point SIMD
  • Jazelle (Java acceleration)
  • Dynamic Branch Prediction
  • Support for Non-Invasive Debug
ARM Cortex™-A8 Memory Architecture
  • 16K-Byte Instruction Cache
  • 16K-Byte Data Cache
  • 256K-Byte L2 Cache
112K-Byte ROM
64K-Byte Shared SRAM
2D/3D Graphics Accelerator
  • Tile Based Architecture, 10 MPoly/sec
  • Universal Pixel & Vertex Shader
  • OpenGL ES 1.1 and 2.0, OpenVG 1.0 and Direct3D Mobile
Camera Image Signal Processing (ISP)
  • CCD and CMOS Imager Interface
  • RAW Data Interface
  • Preview Engine for Real-Time Image Processing
  • Histogram Module/Auto-Exposure, Auto-White Balance, and Auto-Focus Engine
  • Resize Engine From 1/4x to 4x
Display Subsystem
  • Parallel Digital Output
      • Up to 24-Bit RGB, HD max. Resolution
      • Supports Up to 2 LCD Panels
      • Remote Frame Buffer Interface
  • Two 10-Bit Digital-to-Analog Converters (DACs)
      • Composite NTSC/PAL Video
      • Luma/Chroma Separate Video (S-Video)
  • Rotation 90-, 180-, and 270-degrees
  • Resize Images From 1/4x to 8x
  • Color Space Converter
  • 8-bit Alpha Blending

N900 Memory

RAM
  • 256 MB RAM
Mass storage
  • 256 MB NAND
      • formatted with UBIFS (unsorted block image file system)
      • bootloader, kernel, root file system
  • 32 GB eMMC (embedded multimedia card)
      • 768 MB swap memory
      • 2 GB mounted as /home
      • 25 GB mounted as /home/user/MyDocs (VFAT, virtual file allocation table)
  • microSDHC extension slot (secure digital high capacity)

The devices are complex

[Figure: annotated phone circuit board. Labeled parts: power switch, GPS antenna clips, BT / WLAN antenna, audio comp., µSD card holder, SIM holder, vibra, RF ASIC, GPS module, camera connector, BT / FM radio, camera DSP, cellular µ-processor, application µP, cellular memory, display connector, UI connector, RF amps, EM IC, EM / audio IC, flash connector, gallery switch, camera switch, battery connector, EMC ICs, 2nd camera, backup battery, power regulators, RF filters, cellular antenna connector, volume switches, USB connector, WLAN, hole for camera.]

[Figure: board photo with labeled parts: OMAP3, 256MB NAND Flash, 256MB DDR SDRAM, power mgmnt & audio, FM radio transmitter, GPS, Bluetooth, FM radio receiver, touch screen controller, proximity detector, microphone, camera socket.]

[Figure: board photo with labeled parts: 32GB NAND Flash, analog baseband & power mgmnt, 16MB NOR Flash, 16MB DDR SDRAM, 4 channel audio codec, digital baseband & accelerator, USB transceiver, GSM / W-CDMA transceiver, two power amplifiers, 3-axis accelerometer, camera socket, battery gas gauge, flash lamp driver, stereo headphone amplifier, battery charger.]

PowerVR MBX

OpenGL ES 1.1
  • Tile-based renderer
Parts
  • TA = Tile Accelerator
  • ISP = Image Synthesis Processor
  • TSP = Texture and Shading Processor

Tile Accelerator

Screen is divided into tiles
Each tile has a display list
  • tristrips, texture handles, multi-tex args, render states, …
  • if a strip overlaps several tiles, they all point to the same data
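The per-tile display lists can be sketched as a binning pass. This is a simplified illustration, not the MBX hardware algorithm: it bins by screen-space bounding box (which is conservative and may list a triangle in a tile it does not actually touch), uses single triangles rather than strips, and assumes a 32-pixel square tile. Each list stores only an index, so tiles that share a triangle all point to the same data.

```python
def bin_triangles(triangles, width, height, tile=32):
    """Return {(tile_x, tile_y): [triangle indices]} using bounding-box binning.

    `triangles` is a list of three (x, y) integer vertex tuples in pixels.
    The tile size of 32 is an assumption made for this sketch.
    """
    cols = (width + tile - 1) // tile
    rows = (height + tile - 1) // tile
    bins = {(tx, ty): [] for ty in range(rows) for tx in range(cols)}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Clamp the bounding box to the screen, then convert to tile coordinates.
        tx0 = max(0, min(xs) // tile)
        tx1 = min(cols - 1, max(xs) // tile)
        ty0 = max(0, min(ys) // tile)
        ty1 = min(rows - 1, max(ys) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)  # store a reference, not a copy
    return bins


if __name__ == "__main__":
    tris = [
        [(2, 2), (10, 2), (2, 10)],       # fits inside one tile
        [(20, 20), (40, 20), (20, 40)],   # straddles four tiles
    ]
    for key, display_list in sorted(bin_triangles(tris, 64, 64).items()):
        print(key, display_list)
```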

Image Synthesis Processor

Rasterization
HSR = Hidden Surface Removal
  • determines visible pixels
Texture grouping
  • tag which pixel gets which texture
  • deferred texturing

Texture and Shading Processor

Color interpolation
Texture mapping
Advantages of tiling
  • can afford to process colors at a higher resolution than that of the framebuffer
      • both in color
          • 32-bit vs. 16 bit
      • and in space
          • (next slide)

FSAA

Full-Screen Anti-Aliasing
  • only the tile needs 4x memory, the framebuffer doesn’t
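The memory saving can be estimated with simple arithmetic. The sketch below compares a hypothetical 4x supersampled full framebuffer with a 4x supersampled on-chip tile. The 800 x 480 resolution and the 32-bit vs. 16-bit color depths appear elsewhere in these slides; the 32 x 32 tile size is an assumption made only for this example.

```python
def buffer_bytes(width, height, bytes_per_pixel, samples_per_pixel=1):
    """Size of a color buffer holding `samples_per_pixel` samples per pixel."""
    return width * height * bytes_per_pixel * samples_per_pixel


TILE_W, TILE_H = 32, 32  # assumed tile size, not from the slides

framebuffer_16bit = buffer_bytes(800, 480, 2)              # final 16-bit framebuffer
supersampled_fb = buffer_bytes(800, 480, 4, 4)             # 4x samples at 32-bit, whole screen
supersampled_tile = buffer_bytes(TILE_W, TILE_H, 4, 4)     # 4x samples at 32-bit, one tile

if __name__ == "__main__":
    print(f"16-bit framebuffer:        {framebuffer_16bit:,} bytes")
    print(f"4x supersampled screen:    {supersampled_fb:,} bytes")
    print(f"4x supersampled tile only: {supersampled_tile:,} bytes")
```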

PowerVR SGX: Unified Architecture

[Figure: POWERVR SGX block diagram. Blocks shown: vertex data master (geometry), pixel data master (ISP), general purpose data master, coarse grain scheduler (CGS), Universal Scalable Shader Engine (multi-threaded execution units, each with a thread scheduler), tiling co-processor, pixel co-processor, texturing co-processor, multi-level cache, control and register bus, host CPU interface (host bus), system memory interface (system memory bus, system memory).]

Fragment processing of frame N happens simultaneously with geometry processing of frame N+1 while the CPU processes frame N+2. Unified architecture means all DSP resources are available for OpenCL use.

POWERVR USSE (Universal Scalable Shader Engine) Thread Scheduling

[Figure: vertex, pixel, and general/control sources feed task queues in the coarse grain scheduler (CGS), which distributes tasks as threads to multi-threaded execution units 1 to N. Each unit has thread control, thread queues (16 pending per pipe), and active threads (4 active per pipe).]

A thread that hits a data stall is switched out, and a ready thread is switched in with no cycle loss.

Video Codec Evolution

[Figure: full quality compression ratio (axis from 20 to 50) versus year (1992, 1997, 2002, 2005). Codecs plotted: MPEG-1, H.261; MPEG-2, H.263; MPEG-4, H.263+; RealVideo9, WMV9 = VC-1; MPEG-4 AVC / H.264, AVS 1.0, AVS-M; MPEG-4 SVC (scalability). Callouts mark the dominant format in consumer electronics, in mobile phones, and in streaming over the internet. Applications listed: 3GPP Rel-6, 3GPP2, IP datacasting over DVB-H, video conferencing, DVD High-Definition, …]

Increased video and display resolutions drive need for higher processing, storage and data transfer capacity

[Figure: evolution of required video processing capacity in pixels/s (axis marks at 10 MPix/s and 20 MPix/s) over 2005, 2006, and 2007. Video formats: QCIF/30 fps, CIF/15 fps, CIF/30 fps, QVGA/30 fps, VGA/30 fps, SDTV/30 fps, HDTV/30 fps. Display resolutions: QVGA (320 x 240), HVGA (480 x 320), VGA (640 x 480), WVGA (800 x 480). Video codec complexity not taken into account.]
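The pixel rates behind this chart follow directly from frame size and frame rate. A minimal sketch for the formats whose dimensions are unambiguous: QVGA and VGA sizes are given above, and QCIF (176 x 144) and CIF (352 x 288) are the standard sizes, which the slide does not spell out. SDTV and HDTV are left out because their exact frame sizes are not stated.

```python
# Frame sizes in pixels. QCIF and CIF use their standard dimensions.
FORMATS = {
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "QVGA": (320, 240),
    "VGA": (640, 480),
}


def pixel_rate(fmt, fps):
    """Pixels per second that must be processed for a given format and frame rate."""
    width, height = FORMATS[fmt]
    return width * height * fps


if __name__ == "__main__":
    for fmt, fps in [("QCIF", 30), ("CIF", 15), ("CIF", 30), ("QVGA", 30), ("VGA", 30)]:
        print(f"{fmt}/{fps} fps: {pixel_rate(fmt, fps) / 1e6:.2f} MPix/s")
```

As the chart notes, this counts raw pixels only and ignores codec complexity.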

Logical steps of GPS

1. The basis of GPS is "triangulation" from satellites.
2. To "triangulate," a GPS receiver measures distance using the travel time of radio signals.
3. To measure travel time, GPS needs very accurate timing. The secret to perfect timing is to make an extra satellite measurement: if three perfect measurements can locate a point in 3D, then four imperfect measurements can do the same thing.
4. Along with distance, you need to know exactly where the satellites are in space. High orbits and careful monitoring are the secret.
5. Finally, you must correct for any delays the signal experiences as it travels through the atmosphere.
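Step 3 can be made concrete: with four satellites there are four unknowns, the receiver position (x, y, z) and the receiver clock error, expressed here as a range bias in metres. The sketch below solves for them with Newton iteration on the pseudorange equations. It is a simplified illustration that ignores atmospheric delays (step 5) and measurement noise; the satellite coordinates in the usage example are made up, and the function names are illustrative.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def locate(sats, pseudoranges, iterations=10):
    """Estimate [x, y, z, clock_bias] (metres) from four satellites.

    Model: pseudorange_i = distance(receiver, sat_i) + clock_bias.
    Starts from the Earth's centre with zero bias.
    """
    est = [0.0, 0.0, 0.0, 0.0]
    for _ in range(iterations):
        jac, resid = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            dx, dy, dz = est[0] - sx, est[1] - sy, est[2] - sz
            dist = math.sqrt(dx * dx + dy * dy + dz * dz)
            jac.append([dx / dist, dy / dist, dz / dist, 1.0])
            resid.append(rho - (dist + est[3]))
        step = solve_linear(jac, resid)
        est = [e + s for e, s in zip(est, step)]
    return est


# Made-up satellite positions (metres), roughly at GPS orbital radius.
SATS = [
    (0.0, 0.0, 26.6e6),
    (20.0e6, 0.0, 17.5e6),
    (-10.0e6, 17.3e6, 17.5e6),
    (-10.0e6, -17.3e6, 17.5e6),
]

if __name__ == "__main__":
    truth, bias = (1.0e6, 2.0e6, 6.0e6), 3000.0
    rho = [math.sqrt(sum((t - s) ** 2 for t, s in zip(truth, sat))) + bias
           for sat in SATS]
    print([round(v, 3) for v in locate(SATS, rho)])
```

The fourth measurement is what allows the clock bias to be solved for instead of requiring a perfect clock in the receiver.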

Accelerometers

Many technologies available
  • Surface micromachined capacitive MEMS
  • Plasma etched micromachined capacitive MEMS
  • Bulk micromachined capacitive MEMS
  • Thermal (heated gas)
  • Piezoresistive
  • Piezoelectric
  • Vibrating crystal technology
  • Microfab thin metal layers
Accuracy
  • lo: 3-4b; med: 8b; hi: 12-16b

[Figure: piezoresistive triaxial accelerometer. Cross-sectional views of changes in the X and Y axes and in the Z axis, showing the fixed parts and the piezoresistive elements (resistance value R; R1 to R4 marked + or -). Counter force in response to piezoresistance; simultaneous detection by the triaxial accelerometer. A full bridge circuit formed by 4 piezoresistors detects the unbalanced voltage.]
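The full bridge readout can be sketched as two voltage dividers whose difference is the output. The wiring assumed here (R1 over R2 on one side, R3 over R4 on the other) and the excitation voltage and resistor values are illustrative assumptions; the figure only states that four piezoresistors form a full bridge that detects an unbalanced voltage.

```python
def bridge_output(v_excitation, r1, r2, r3, r4):
    """Differential output of a Wheatstone bridge.

    Assumed wiring: divider A is r1 (top) over r2 (bottom),
    divider B is r3 (top) over r4 (bottom); output = V_A - V_B.
    """
    v_a = v_excitation * r2 / (r1 + r2)
    v_b = v_excitation * r4 / (r3 + r4)
    return v_a - v_b


if __name__ == "__main__":
    r, dr, vex = 1000.0, 1.0, 3.0  # ohms, ohms, volts (illustrative values)
    print(bridge_output(vex, r, r, r, r))                      # balanced bridge
    print(bridge_output(vex, r - dr, r + dr, r + dr, r - dr))  # under acceleration
```

With all four resistors changing in opposite pairs, the output is approximately the excitation voltage times dR/R, while common changes such as temperature drift cancel.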


Typical Camera Module (e.g. SMIA)

Actuators for Autofocusing and Zooming

Main classes
  • Stepper motor
      • mainstream actuator solution for Digital Still Cameras
  • Voice coil
      • for autofocus
  • Piezoelectric
      • for both autofocusing and zooming
  • Liquid lens
      • for autofocus
  • Extended Depth of Focus (Digital Focus)
      • not an actuator technology
      • for “always in focus” or “instant Macro”

LED “flash”

LED = Light Emitting Diode
  • a p-n junction in semiconductor material, converts electrical energy into light
The color of light depends on the materials
  • typical materials are GaAsP-type and InGaN-type semiconductors
LED has good efficiency
  • compared to other light sources
  • ~ 5% – 40%

[Figures: p-n junction under forward bias; internal structure of a high-power LED]

Xenon flash

Glass or quartz tube
  • filled with Xenon gas
  • sealed with electrodes
Trigger voltage
  • is normally >10X tube voltage (~ 3.5kV / 300V)
Low power (<20Lux.s) Xenon flash
  • short act time (100-200us)
  • high luminous flux
Photo sensor
  • limits the risk of overexposure at short distance
Module (~20mm x 10mm x 6mm; capacitor diameter: ~7mm)
  • with maximum allowed 35uF capacitor (~15Lux.sec / 1m)
  • with photo sensor
  • with overall electrical insulation

[Figure: typical xenon flash circuit]
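The figures above allow a rough estimate of the energy in one flash. The sketch assumes the 35uF capacitor is charged to the ~300V tube voltage and fully discharged during the flash with no losses, which is an idealization; the function names are illustrative.

```python
def capacitor_energy_joules(capacitance_farads, voltage_volts):
    """Energy stored in a capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_farads * voltage_volts ** 2


def average_power_watts(energy_joules, duration_seconds):
    """Average power if the energy is released evenly over the duration."""
    return energy_joules / duration_seconds


if __name__ == "__main__":
    energy = capacitor_energy_joules(35e-6, 300.0)
    print(f"Stored energy: {energy:.3f} J")
    for act_time in (100e-6, 200e-6):  # the 100-200us act time quoted above
        print(f"Average power over {act_time * 1e6:.0f} us: "
              f"{average_power_watts(energy, act_time) / 1000:.1f} kW")
```

Under these assumptions a joule or two released within a couple of hundred microseconds corresponds to kilowatts of peak electrical power, which is what gives xenon its high luminous flux.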

Spectrum of light

White LED spectrum
  • lacks content in the blue-green and red areas
Xenon spectrum
  • wide spectral distribution and excellent color balance

[Figure: spectra of the LED flash (white LED) and the xenon flash, compared with the response area of a sensor]