![Page 1: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/1.jpg)
Parallel Processing with PlayStation3
Lawrence Kalisz
![Page 2: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/2.jpg)
Topics
Cell Processor
1. History
2. Architecture
Parallel Programming
1. Install Linux
2. Examples
PS3 Cluster
1. Applications
2. Examples
![Page 3: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/3.jpg)
PS3 Cell Processor: History
Created by Sony, Toshiba, and IBM (STI)
400 Engineers
½ Billion Dollars
![Page 4: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/4.jpg)
PS3 Cell Processor: Architecture
![Page 5: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/5.jpg)
PS3 Cell Processor: Architecture
![Page 6: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/6.jpg)
PS3 Cell Processor: Architecture
Power Processing Element (PPE)
Synergistic Processing Element (SPE)
Element Interconnect Bus (EIB)
Memory System
Network Card & Graphics Card
![Page 7: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/7.jpg)
Power Processor Element
PPE handles operating system and control tasks
● 64-bit Power Architecture with VMX
● In-order, 2-way hardware simultaneous multi-threading (SMT)
● 32KB L1 cache (I & D) and 512KB L2
![Page 8: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/8.jpg)
Synergistic Processing Element
Specialized high-performance core
Three main components
1. SPU: Synergistic Processor Unit
2. LS: local store memory
3. MFC: Memory Flow Controller, manages data in and out of the SPE
The SPU can only access (load & store) data in the SPE local store
On the PS3, 6 of the Cell's 8 SPEs are available to applications: 1 is disabled to improve chip yield and 1 is reserved for the system software
![Page 9: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/9.jpg)
SPE: Data IN and OUT Steps
SPU needs data
1. SPU initiates MFC request for data
2. MFC requests data from memory
3. Data is copied to local store
4. SPU can access data from local store
SPU operates on the data, then copies it from local store back to memory in a similar process
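The steps above can be sketched as a small simulation. This is not Cell SDK code: the class `SimulatedSPE` and the method names `mfc_get` and `mfc_put` are hypothetical, main memory is modelled as a plain Python list, and the local store capacity is counted in elements rather than bytes. It only illustrates the pattern of copying data in, computing on the local copy, and copying results back.

```python
class SimulatedSPE:
    """Toy model of one SPE: the SPU works only on its local store."""

    def __init__(self, main_memory, local_store_capacity=1024):
        self.main_memory = main_memory        # shared memory, reached only via the MFC
        self.capacity = local_store_capacity  # hypothetical capacity, in elements
        self.local_store = []

    def mfc_get(self, start, count):
        """Steps 1-3: request a transfer and copy data into the local store."""
        if count > self.capacity:
            raise ValueError("transfer does not fit in the local store")
        self.local_store = self.main_memory[start:start + count]

    def compute(self, fn):
        """Step 4: the SPU loads and stores local-store data only."""
        self.local_store = [fn(x) for x in self.local_store]

    def mfc_put(self, start):
        """Copy results from the local store back to main memory."""
        self.main_memory[start:start + len(self.local_store)] = self.local_store


def process_in_chunks(main_memory, fn, chunk=1024):
    """Stream a large array through the local store one chunk at a time."""
    spe = SimulatedSPE(main_memory, local_store_capacity=chunk)
    for start in range(0, len(main_memory), chunk):
        count = min(chunk, len(main_memory) - start)
        spe.mfc_get(start, count)
        spe.compute(fn)
        spe.mfc_put(start)
    return main_memory
```

For example, `process_in_chunks(data, lambda x: x * 2, chunk=4)` doubles every element of `data` while never holding more than 4 elements in the simulated local store.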
![Page 10: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/10.jpg)
SPE: Data IN and OUT Steps
![Page 11: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/11.jpg)
Element Interconnect Bus
• Physically overlaps all processor elements
• Central arbiter supports up to 3 concurrent transfers per ring
• 2-stage, dual round-robin arbiter
• Each port supports concurrent 16B in and 16B out data paths
• Ring topology is transparent to element data interface
• Each EIB data port supports 25.6 GB/s in each direction
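The per-port figure can be checked with a line of arithmetic. The sketch below assumes the EIB runs at 1.6 GHz, half of a 3.2 GHz core clock; that clock value is not stated on the slide, and the function name is made up for illustration.

```python
BYTES_PER_CYCLE = 16   # from the slide: 16 B in and 16 B out per port
BUS_CLOCK_HZ = 1.6e9   # assumption: EIB clocked at half of a 3.2 GHz core


def port_bandwidth_gb_per_s(bytes_per_cycle=BYTES_PER_CYCLE, clock_hz=BUS_CLOCK_HZ):
    """Peak one-way bandwidth of a single EIB port, in GB/s."""
    return bytes_per_cycle * clock_hz / 1e9


if __name__ == "__main__":
    print(port_bandwidth_gb_per_s())
```

Under that assumption, 16 B per cycle at 1.6 GHz gives the 25.6 GB/s quoted for each direction.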
![Page 12: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/12.jpg)
PS3 Cell: Parallel Programming
![Page 13: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/13.jpg)
PS3 Cell: Parallel Programming
Current working Linux distros:
1. Fedora Core 5
2. Yellow Dog 5.0
3. Gentoo PowerPC 64 edition
4. Debian
OpenMPI (for use with cluster)
IBM’s CELL SDK
![Page 14: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/14.jpg)
PS3 Cell: Parallel Programming
Cell performance is ~10x better than a general-purpose processor (GPP) for media and other applications that can take advantage of its SIMD capability
◦ PPE performance is comparable to that of a traditional GPP
◦ SPE performance is mostly the same as, or better than, a GPP with SIMD
◦ Performance scales with the number of SPEs
![Page 15: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/15.jpg)
PS3 Cell: Parallel Programming
Programming becomes an exercise in partitioning, mapping (layout), routing (communication), and scheduling
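As a rough illustration of the partitioning, mapping, and scheduling parts, the sketch below takes a set of already-partitioned tasks with estimated costs and maps each one to the least-loaded of six workers, most expensive task first. The function name, the task names, and the costs are hypothetical, the greedy heuristic is just one possible choice, and routing (communication) is not modelled.

```python
import heapq


def schedule(task_costs, num_workers=6):
    """Greedy mapping: give each task, largest first, to the least-loaded worker."""
    heap = [(0, worker) for worker in range(num_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = {worker: [] for worker in range(num_workers)}
    ordered = sorted(task_costs.items(), key=lambda item: item[1], reverse=True)
    for task, cost in ordered:
        load, worker = heapq.heappop(heap)   # worker with the smallest load so far
        assignment[worker].append(task)
        heapq.heappush(heap, (load + cost, worker))
    loads = {w: sum(task_costs[t] for t in tasks) for w, tasks in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    costs = {"tile0": 5, "tile1": 4, "tile2": 3, "tile3": 3, "tile4": 2, "tile5": 1}
    assignment, loads = schedule(costs, num_workers=2)
    print(assignment)
    print(loads)
```

The finishing time is set by the most heavily loaded worker, so a balanced mapping matters more as the number of SPEs grows.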
![Page 16: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/16.jpg)
PS3 Cell: Parallel Programming
AI Backgammon player
![Page 17: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/17.jpg)
PS3 Cell: Parallel Programming
AI Backgammon player
• 1M board evaluations in ~3 seconds (6 SPEs)
• Data parallel implementation, linear speedup
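A data parallel evaluation of this kind might look like the sketch below. The evaluation function is a hypothetical stand-in, not the backgammon player's real one, and six Python threads stand in for six SPEs. Python threads do not give the linear speedup reported on the slide; the code only shows the decomposition, where every worker runs the same function on its own slice of the boards.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_board(board):
    """Hypothetical evaluation: a position-weighted sum of the board values."""
    return sum(index * value for index, value in enumerate(board))


def split(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for part in range(parts):
        size = base + (1 if part < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def evaluate_chunk(chunk):
    """Work done by one worker: evaluate every board in its own chunk."""
    return [evaluate_board(board) for board in chunk]


def evaluate_all(boards, workers=6):
    """Evaluate all boards, one chunk per worker, keeping the input order."""
    chunks = split(boards, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate_chunk, chunks))
    return [score for chunk_scores in results for score in chunk_scores]
```

Because the boards are independent, no communication between workers is needed until the scores are gathered at the end.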
![Page 18: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/18.jpg)
PS3 Cell: Parallel Programming
SPU programs are designed and written to work together but are compiled independently
Separate compilers and toolchains (ppu-gcc and spu-gcc)
Each SPU program compiles to a small ELF image that can be embedded in the PPU program
![Page 19: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/19.jpg)
PS3 Cell: Parallel Programming
BLUE-STEEL
![Page 20: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/20.jpg)
PS3 Cell: Parallel Programming
BLUE-STEEL
• Full ray tracer running on each SPE
• Data parallel implementation
http://www.youtube.com/watch?v=C3ARXUSKXAM&feature=player_detailpage
![Page 21: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/21.jpg)
PS3 Cell: Parallel Programming
BLUE-STEEL
A solution to the rendering equation
◦ Triangle Rasterization
– Fast: possible in real time on a single core
– Inaccurate or tedious for global effects such as shadows, reflection, refraction, or global illumination
◦ Ray Tracing
– Slow unless done on multiple cores
– Accurate and natural shadows, reflection, and refraction
![Page 22: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/22.jpg)
PS3 Cell: Parallel Programming
BLUE-STEEL
Build a fast ray tracer from the ground up to take advantage of multiple cores.
– 6 accessible cores for rendering
![Page 23: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/23.jpg)
PS3 Cell: Parallel Programming
Ray Tracing
• Shoot a ray through each pixel on the screen
• Check for intersections with each object in the scene
• Keep the closest intersection
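A minimal sketch of these three steps, assuming a scene made only of spheres, a camera at the origin looking along +z, and an image plane at z = 1. None of this comes from BLUE-STEEL itself; the function names and the dictionary layout of a sphere are hypothetical.

```python
import math


def intersect_sphere(origin, direction, center, radius):
    """Return the nearest positive ray parameter t, or None if the ray misses."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None


def closest_hit(origin, direction, spheres):
    """Check every object and keep the closest intersection."""
    best_t, best_sphere = None, None
    for sphere in spheres:
        t = intersect_sphere(origin, direction, sphere["center"], sphere["radius"])
        if t is not None and (best_t is None or t < best_t):
            best_t, best_sphere = t, sphere
    return best_t, best_sphere


def trace_image(width, height, spheres):
    """Shoot one ray through each pixel; record the closest sphere's name or None."""
    image = []
    for row in range(height):
        y = 1.0 - 2.0 * (row + 0.5) / height
        line = []
        for col in range(width):
            x = 2.0 * (col + 0.5) / width - 1.0
            _, sphere = closest_hit((0.0, 0.0, 0.0), (x, y, 1.0), spheres)
            line.append(sphere["name"] if sphere else None)
        image.append(line)
    return image
```

Each pixel is computed independently of the others, which is what makes the loop in `trace_image` straightforward to divide among several cores.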
![Page 24: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/24.jpg)
PS3 Cell: Parallel Programming
Ray Tracing
• Shade each point according to the material of the object, as well as the lights in the scene
• Cast rays for shadows, reflection, and refraction
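The shading step can be sketched for the simplest case: a diffuse (Lambertian) material, a single point light, and a shadow ray cast toward that light. These modelling choices, the function names, and the `albedo` parameter are assumptions for illustration; reflection and refraction rays are left out.

```python
import math


def hit_sphere(origin, direction, center, radius):
    """Nearest positive t along a unit-length direction, or None on a miss."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-6:
            return t
    return None


def shade(point, normal, albedo, light_pos, spheres):
    """Diffuse brightness at `point`, or 0.0 if an object blocks the light."""
    to_light = [l - p for l, p in zip(light_pos, point)]
    distance = math.sqrt(sum(x * x for x in to_light))
    light_dir = [x / distance for x in to_light]
    # Start the shadow ray slightly off the surface to avoid hitting it again.
    start = [p + 1e-4 * n for p, n in zip(point, normal)]
    for sphere in spheres:
        t = hit_sphere(start, light_dir, sphere["center"], sphere["radius"])
        if t is not None and t < distance:
            return 0.0  # something lies between the point and the light
    cos_theta = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return albedo * cos_theta
```

A point on top of a unit sphere with the light directly overhead gets the full `albedo`; placing a second sphere between the point and the light drops the result to 0.0.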
![Page 25: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/25.jpg)
PS3 Cell: Parallel Programming
BLUE-STEEL
![Page 26: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/26.jpg)
PS3: Cluster Applications
Air Force
Folding@home
PS3 Gravity Grid
LACAL Student Cluster
![Page 27: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/27.jpg)
References
http://groups.csail.mit.edu/cag/ps3/
http://impact.asu.edu/cse520fa07/lec19-PS3-cell-tutorial.pdf
http://www.youtube.com/watch?v=VxaLmS7XPiI
http://en.wikipedia.org/wiki/PlayStation_3_cluster
http://www.netlib.org/utk/people/JackDongarra/PAPERS/scop3.pdf
![Page 28: Parallel Processing with PlayStation3 Lawrence Kalisz](https://reader036.vdocuments.us/reader036/viewer/2022081602/55182c3a55034691678b4d13/html5/thumbnails/28.jpg)
Any questions?