Parallel Processors Todd Charlton Eric Uriostique

Upload: alan-freeman

Post on 18-Dec-2015

Current Technology

• It is hard to find a single-core processor anymore.

• Multi-core chips are in cell phones, laptops, etc.

• Large systems can contain 512+ processors

The Motivation

• Divide and Conquer – Higher Throughput

• Lower Power Consumption

$P = C V^2 f$

The Motivation

• We need more performance on the same power budget. How?
• Remember: $P = C V^2 f$
• Scale voltage and frequency to 80%:
  • $P' = C (0.8V)^2 (0.8f) = 0.512 \, C V^2 f$
• This drops power by about 50%
• Add an additional core
• Result: 1.6x speedup with roughly the same power

The Motivation

• How about reducing power consumption while keeping the same performance?
• Remember: $P = C V^2 f$
• Scale voltage and frequency to 50%:
  • $P' = C (0.5V)^2 (0.5f) = 0.125 \, C V^2 f$
• This drops power to 12.5%
• Add an additional core
• Result: 25% of the original power consumption with the same performance
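
The arithmetic behind the two scaling scenarios can be checked with a short sketch. It assumes the capacitance C stays constant (so it cancels out of the ratio) and that throughput scales perfectly with core count and frequency. The function names `relative_power` and `relative_throughput` are illustrative and do not come from the slides.

```python
def relative_power(voltage_scale, frequency_scale, cores=1):
    """Dynamic power relative to one core at full voltage and frequency.

    Based on P = C * V^2 * f with C held constant, so C cancels out.
    """
    return cores * voltage_scale ** 2 * frequency_scale


def relative_throughput(frequency_scale, cores=1):
    """Throughput relative to one full-speed core (assumes perfect scaling)."""
    return cores * frequency_scale


# Scenario 1: two cores, each at 80% voltage and 80% frequency.
print(relative_power(0.8, 0.8, cores=2))   # close to 1.0: about the same power
print(relative_throughput(0.8, cores=2))   # 1.6x the original throughput

# Scenario 2: two cores, each at 50% voltage and 50% frequency.
print(relative_power(0.5, 0.5, cores=2))   # 0.25: a quarter of the power
print(relative_throughput(0.5, cores=2))   # 1.0: the same throughput
```

The perfect-scaling assumption is optimistic. Any serial portion of the workload reduces the real speedup.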

Amdahl’s Law

• “Speed-up is limited by amount of work that can be done in parallel”

Credit: watermint.org
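
Amdahl's Law is usually written as S = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. The sketch below evaluates that formula. The 90% parallel fraction is an arbitrary example value, not a figure from the slides.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's Law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# With 90% parallel work, the speedup can never exceed 1 / 0.1 = 10x,
# no matter how many processors are added.
for n in (2, 4, 8, 512):
    print(n, round(amdahl_speedup(0.9, n), 2))
```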

Ways To Parallelize

1. Multi-Threading:
  • Multi-thread your application on one chip
  • More elegant

2. Multi-Processing:
  • Flash serial code to separate chips
  • No worrying about scheduling!

Let’s Multi-Thread

• One Application: Counting maize pixels

2 Processors

4 Processors
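
A minimal sketch of the pixel-counting idea splits the image into horizontal strips, counts each strip on its own thread, and sums the partial counts. The image layout (a list of rows of RGB tuples), the `is_maize` color thresholds, and all function names are assumptions made for illustration. The slides do not show the actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def is_maize(pixel):
    """Hypothetical color test: treat strongly yellow RGB pixels as maize."""
    r, g, b = pixel
    return r > 200 and g > 160 and b < 100


def count_in_rows(rows):
    """Count maize pixels in one horizontal strip of the image."""
    return sum(1 for row in rows for pixel in row if is_maize(pixel))


def count_maize_pixels(image, workers=2):
    """Split the image into one strip per worker and sum the partial counts."""
    strip = max(1, -(-len(image) // workers))  # ceiling division
    strips = [image[i:i + strip] for i in range(0, len(image), strip)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_in_rows, strips))


maize, blue = (255, 203, 5), (0, 39, 76)
image = [
    [maize, blue, maize],
    [blue, blue, blue],
    [maize, maize, blue],
    [blue, maize, blue],
]
print(count_maize_pixels(image, workers=2))
print(count_maize_pixels(image, workers=4))
```

In CPython the global interpreter lock limits the speedup for pure-Python work like this, so treat the block as a structural sketch of the split-and-sum pattern rather than a benchmark.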

Multi-Threading in µProcessors

• Spin Propeller Processor

• Multi-thread on 8 cores
  • One application runs on 8 cores
• Uses its own high-level language and a form of assembly
• Used in the CMUcam4

Problems with Multi-Threading

• Steep learning curve
  • Learning the language

• Parallel slowdown
  • It takes a lot of time to set up a new thread. If that thread does not have much work, it is not worth the overhead.
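
The overhead trade-off can be illustrated with a toy cost model. It assumes every thread costs a fixed `setup_cost` to start, that threads are started one after another, and that the work divides evenly. The model and its names are assumptions for illustration, and the time units are arbitrary.

```python
def parallel_time(work, threads, setup_cost):
    """Estimated run time: sequential thread setup plus an even share of work."""
    return threads * setup_cost + work / threads


def worth_threading(work, threads, setup_cost):
    """True if the modeled parallel run beats doing all the work serially."""
    return parallel_time(work, threads, setup_cost) < work


# Plenty of work per thread: 4 * 5 + 100 / 4 = 45 units versus 100 serial.
print(worth_threading(work=100, threads=4, setup_cost=5))

# Too little work: 4 * 5 + 10 / 4 = 22.5 units versus 10 serial.
print(worth_threading(work=10, threads=4, setup_cost=5))
```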

Multi-Threading Libraries

• You cannot program serially and still take advantage of parallel processing

• Intel’s Threading Building Blocks (TBB)
• OpenMP
• Boost and pthreads
• All of these are libraries for C/C++

Multi-Processing: Beaglebone

• Processor
  • 720 MHz ARM Cortex-A8
  • 3D graphics accelerator
  • ARM Cortex-M3 for power management
  • 2x Programmable Realtime Unit (PRU) RISC CPUs
  • PRUs share memory space with the A8

Shared Memory Space

Multi-Processing: Custom with Message Passing

• Designate a processor for each frequent task

• Send messages to the "Boss" as necessary
• Since every processor's workload is minimal, slower and lower-power chips can be used

• Overall: same system performance

Message Passing
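
The boss-and-workers pattern can be sketched on a single machine, with threads standing in for the separate chips and a queue standing in for the message link. The task names, the payloads, and the `worker` and `boss` functions are hypothetical. A real multi-chip design would use a hardware link instead.

```python
import queue
import threading


def worker(name, readings, inbox):
    """A dedicated processor: handles its one task, then reports to the boss."""
    inbox.put((name, sum(readings)))


def boss(inbox, expected):
    """Collect one message per worker, waiting for each one to arrive."""
    results = {}
    while len(results) < expected:
        name, value = inbox.get()  # blocks; arrival order is not guaranteed
        results[name] = value
    return results


inbox = queue.Queue()
tasks = {"sensor": [1, 2, 3], "motor": [10, 20], "radio": [7]}
threads = [
    threading.Thread(target=worker, args=(name, readings, inbox))
    for name, readings in tasks.items()
]
for t in threads:
    t.start()
results = boss(inbox, expected=len(tasks))
for t in threads:
    t.join()
print(results)
```

The boss waits on the queue instead of assuming a message is already there, because delivery time and ordering are not guaranteed.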

Problems with Multi-Processing

• Shared Memory Space
  • Boards like this are hard to find and configure

• Message Passing
  • Can’t assume messages are received immediately

Recap

• Go parallel if you want:
  • Higher Throughput
  • Lower Power

• Two Ways:
  • Multi-Threading – Spin
    • Speed up one application
  • Multi-Processing – Beaglebone
    • Do more tasks at the same time

• Don’t forget Amdahl’s Law!

Questions