Cell Broadband Engine Spencer Dennis Nicholas Barlow

Post on 16-Nov-2021

Page 1: Cell Broadband Engine

Cell Broadband Engine

Spencer Dennis, Nicholas Barlow

Page 2: Cell Broadband Engine

The Cell Processor

◦ Objective: “[to bring] supercomputer power to everyday life”

◦ Bridge the gap between conventional CPUs and high-performance GPUs

Page 3: Cell Broadband Engine

History

Original patent application in 2002

Generations
◦ 90 nm - 2005
◦ 65 nm - 2007 (PowerXCell 8i)
◦ 45 nm - 2009

Page 4: Cell Broadband Engine

Design

Cost $400 million to develop

Team of 400 engineers

STI Design Center
◦ Sony
◦ Toshiba
◦ IBM

Page 5: Cell Broadband Engine

PS3

Employed as CPU
◦ Clocked at 3.2 GHz
◦ Theoretical maximum performance of 230.4 GFLOPS (single precision)

Utilized alongside NVIDIA RSX 'Reality Synthesizer' GPU
◦ Complemented graphical performance
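
A rough sketch of how a peak single-precision figure for a chip like this is usually derived. The per-core assumptions are not stated in the slides: each SIMD unit is taken to operate on four 32-bit lanes and to issue one fused multiply-add per cycle (counted as two floating-point operations per lane), and the PPE's vector unit is counted alongside the eight SPEs. The function name `peak_gflops` is hypothetical.

```python
def peak_gflops(cores, clock_ghz, lanes=4, flops_per_lane=2):
    """Theoretical peak in GFLOPS.

    cores          -- number of SIMD-capable cores counted
    clock_ghz      -- clock frequency in GHz
    lanes          -- 32-bit lanes per 128-bit SIMD register (assumed: 4)
    flops_per_lane -- operations per lane per cycle (assumed: 2, multiply + add)
    """
    return cores * lanes * flops_per_lane * clock_ghz


if __name__ == "__main__":
    # One core at 3.2 GHz under the assumptions above.
    print(round(peak_gflops(1, 3.2), 1))
    # Eight SPEs plus the PPE's vector unit (assumed to have the same peak).
    print(round(peak_gflops(9, 3.2), 1))
```

This is an upper bound only: it assumes every core issues a multiply-add on every cycle, which real workloads rarely sustain.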

Page 6: Cell Broadband Engine

Architecture Overview

◦ 8 Synergistic Processing Elements (SPE)

◦ One dual-issue Power Processor Element (PPE)

◦ Memory Interface Controller (MIC)

◦ Element Interconnect Bus (EIB)

◦ Bus Interface Controller (BIC)

Page 7: Cell Broadband Engine

SPU/SPE - Synergistic Processing Unit/Element

SXU - Synergistic Execution Unit

LS - Local Store

SMF - Synergistic Memory Flow controller

EIB - Element Interconnect Bus

PPE - Power Processor Element

MIC - Memory Interface Controller

BIC - Bus Interface Controller

Page 8: Cell Broadband Engine
Page 9: Cell Broadband Engine
Page 10: Cell Broadband Engine

Synergistic Processing Element (SPE)

128-bit dual-issue SIMD dataflow
◦ “Single Instruction Multiple Data”
◦ Optimized for data-level parallelism
◦ Designed for vectorized floating point calculations.
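
To make the 128-bit SIMD idea concrete, here is a minimal emulation in plain Python. It is not SPU code: the helper names (`pack_vec`, `unpack_vec`, `simd_multiply_add`) are hypothetical, and the byte order is chosen only for illustration. The point is that one 128-bit register holds four 32-bit floats, and a single operation is applied to all four lanes at once.

```python
import struct

LANES = 4            # 128 bits / 32 bits per single-precision float
VEC_FORMAT = ">4f"   # four packed floats; byte order chosen for illustration


def pack_vec(values):
    """Pack exactly four floats into a 16-byte (128-bit) register image."""
    if len(values) != LANES:
        raise ValueError("a 128-bit register holds exactly four 32-bit floats")
    return struct.pack(VEC_FORMAT, *values)


def unpack_vec(register):
    """Return the four float lanes stored in a 16-byte register image."""
    return list(struct.unpack(VEC_FORMAT, register))


def simd_multiply_add(a, b, c):
    """One 'instruction': compute a*b + c independently in every lane."""
    lanes = zip(unpack_vec(a), unpack_vec(b), unpack_vec(c))
    return pack_vec([x * y + z for x, y, z in lanes])


if __name__ == "__main__":
    a = pack_vec([1.0, 2.0, 3.0, 4.0])
    b = pack_vec([0.5, 0.5, 0.5, 0.5])
    c = pack_vec([1.0, 1.0, 1.0, 1.0])
    print(unpack_vec(simd_multiply_add(a, b, c)))  # [1.5, 2.0, 2.5, 3.0]
```

A scalar loop would need four separate multiply-add steps for the same result, which is why data-parallel workloads map well onto this style of hardware.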

Page 11: Cell Broadband Engine

SPE Continued

◦ Workhorses of the Processor

◦ Handle most of the computational workload

◦ Each contains its own Instruction + Data Memory

◦ “Local Store”

▫ Embedded SRAM
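
The practical consequence of a private local store is that data must be moved in and out explicitly. The sketch below is a toy model, not a real Cell API: `SpeModel`, `dma_get`, `dma_put` and `add_one_to_every_byte` are hypothetical names, the 256 KB capacity is an assumption (the slides give no size), and the 16 KB chunk size is arbitrary.

```python
LOCAL_STORE_BYTES = 256 * 1024  # assumed capacity; the slides do not give a size


class SpeModel:
    """Toy SPE: it can only compute on bytes held in its own local store."""

    def __init__(self, main_memory):
        self.main_memory = main_memory                 # shared, owned by the caller
        self.local_store = bytearray(LOCAL_STORE_BYTES)

    def _check(self, ls_offset, main_offset, size):
        if size < 0 or ls_offset < 0 or main_offset < 0:
            raise ValueError("negative offset or size")
        if ls_offset + size > LOCAL_STORE_BYTES:
            raise ValueError("transfer does not fit in the local store")
        if main_offset + size > len(self.main_memory):
            raise ValueError("transfer runs past the end of main memory")

    def dma_get(self, ls_offset, main_offset, size):
        """Copy main memory -> local store (explicit, programmer-issued)."""
        self._check(ls_offset, main_offset, size)
        self.local_store[ls_offset:ls_offset + size] = \
            self.main_memory[main_offset:main_offset + size]

    def dma_put(self, ls_offset, main_offset, size):
        """Copy local store -> main memory."""
        self._check(ls_offset, main_offset, size)
        self.main_memory[main_offset:main_offset + size] = \
            self.local_store[ls_offset:ls_offset + size]


def add_one_to_every_byte(main_memory, chunk=16 * 1024):
    """Process a buffer of any length by staging it through the local store."""
    spe = SpeModel(main_memory)
    for offset in range(0, len(main_memory), chunk):
        size = min(chunk, len(main_memory) - offset)
        spe.dma_get(0, offset, size)                   # stage input
        for i in range(size):                          # compute locally
            spe.local_store[i] = (spe.local_store[i] + 1) % 256
        spe.dma_put(0, offset, size)                   # write results back
```

The get/compute/put loop is the part the programmer has to write by hand, which a hardware-managed cache would otherwise hide.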

Page 12: Cell Broadband Engine

Power Processor Element (PPE)

Responsible for governing SPEs
◦ SPEs act as “extensions” of the PPE

Shares main memory with SPEs
◦ Can initiate accesses for SPE cores

Power Architecture
◦ Implements Power Architecture Hypervisor
▫ Can run multiple operating systems concurrently

Memory (1st generation)
◦ Split L1 cache: 32 KB instruction and 32 KB data
◦ Unified 512 KB L2 cache

Page 13: Cell Broadband Engine

Element Interconnect Bus

High bandwidth internal bus

1st generation: 96 Bytes/cycle

Four 16-byte rings
◦ Each ring can handle up to 3 simultaneous data transfers

12 on and off ramps
◦ Each SPE + PPE
◦ Memory controller
◦ 2 off-chip I/O interfaces
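
The 96 bytes/cycle figure can be reproduced from the ring parameters on this slide with one extra assumption that the slides do not state: the bus is taken to run at half the processor clock, so each 16-byte transfer occupies two processor cycles. The function names are hypothetical.

```python
def eib_peak_bytes_per_cpu_cycle(rings=4, transfers_per_ring=3,
                                 ring_width_bytes=16,
                                 cpu_cycles_per_bus_cycle=2):
    """Peak bus throughput per processor cycle.

    cpu_cycles_per_bus_cycle=2 is an assumption (bus at half the CPU clock).
    """
    concurrent_transfers = rings * transfers_per_ring   # 4 * 3 = 12
    return concurrent_transfers * ring_width_bytes / cpu_cycles_per_bus_cycle


def eib_peak_gb_per_s(clock_ghz):
    """Convert the per-cycle peak into GB/s for a given processor clock."""
    return eib_peak_bytes_per_cpu_cycle() * clock_ghz


if __name__ == "__main__":
    print(eib_peak_bytes_per_cpu_cycle())  # 96.0
```

This is an instantaneous ceiling with all twelve transfer slots busy; sustained bandwidth depends on which units are talking to each other and is lower in practice.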

Page 14: Cell Broadband Engine

Memory Interface Controller (MIC)

Asynchronous memory controller

Retrieves data from main memory for the SPEs’ local stores and the PPE’s cache.

Supports two Rambus XDR memory banks

Page 15: Cell Broadband Engine

Bus Interface Controller

Provides asynchronous interface between EIB and IO interfaces

Two flexible IO interfaces to rest of system
◦ One interface can be reconfigured to provide a Symmetric Multiprocessing (SMP) interface

Contains pervasive unit
◦ Provides test, debug and monitoring functionality
▫ Chip-level error checking
◦ Provides clock generation & distribution control
◦ Power on Reset Unit (POR)
▫ Responsible for unit initialization
◦ Performance monitoring

Power Management Unit (PMU)
◦ Allows software-controlled power reduction

Thermal Management Unit (TMU)

Page 16: Cell Broadband Engine

Developing for Cell

Octopiler
◦ Takes high-level sequential code and parallelizes it to optimize it for a multiprocessor system
▫ High-level languages
◦ Divides code nine ways
▫ 8 sets of instructions are written for the SPEs
▫ The final set is written for the PowerPC PPE

GCC
◦ IBM-sourced plugins for Cell PPU/SPU development
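
The nine-way split described for Octopiler can be pictured with a hand-written analogue. This is not Octopiler output and not Cell code: eight worker threads stand in for the SPEs, the calling thread plays the PPE, and the names `spe_kernel` and `ppe_main` are hypothetical. The workload (a sum of squares) is chosen only because it partitions cleanly.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SPES = 8  # eight worker "SPEs"; the caller acts as the ninth part, the "PPE"


def spe_kernel(chunk):
    """Work done by one SPE stand-in: sum of squares over its slice."""
    return sum(x * x for x in chunk)


def ppe_main(data):
    """Controller: partition the data, dispatch to workers, combine results."""
    n = len(data)
    # Eight contiguous slices of near-equal size that together cover the input.
    bounds = [n * i // NUM_SPES for i in range(NUM_SPES + 1)]
    chunks = [data[bounds[i]:bounds[i + 1]] for i in range(NUM_SPES)]
    with ThreadPoolExecutor(max_workers=NUM_SPES) as pool:
        partials = list(pool.map(spe_kernel, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(ppe_main(list(range(100))))  # 328350
```

The appeal of an automatic parallelizing compiler is that the programmer would write only the sequential loop and leave the partitioning and dispatch in `ppe_main` to the tool.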

Page 17: Cell Broadband Engine

SPU ISA

Page 18: Cell Broadband Engine

SPU ISA (cont’d)

Page 19: Cell Broadband Engine

Applications (In Depth)

Console Gaming
◦ PS3
▫ PPE controls 6 SPEs, delegating tasks
▫ 1 SPE is OS reserved, 1 SPE is redundant (disabled to improve manufacturing yield)

Supercomputing
◦ IBM BladeCenter QS Series
▫ Easy scalability

Password cracking
◦ High parallelism allows for high brute-force performance

Page 20: Cell Broadband Engine
Page 21: Cell Broadband Engine

Conclusion

Discontinued in 2009
◦ Difficult development environment
▫ Programmer-managed SPE memory
▫ Explicit parallelism
▫ Two separate ISAs

Idea still lives on…
◦ General Purpose GPU
▫ Intel Larrabee architecture
▫ Intel Many Integrated Core architecture
▫ AMD FireStream
▫ Nvidia Tesla
