Bandwidth-efficient graphics with ARM Mali GPUs
June 27th (Friday), 2014
Hessed Choi @ ARM
Memory Bandwidth
Bandwidth: Where does it go?
▪ Vertex load
▪ Varyings
▪ Textures
▪ Framebuffer output
Bandwidth: Where are we?
▪ Less power = less memory bandwidth
  ▪ Desktop: 170 Watts to >300 Watts… That’s just the GPU!
  ▪ Console: 80 - 100 Watts (CPU/GPU/WiFi/Network)
  ▪ Mobile: 3 - 7 Watts (CPU/GPU/Modem/WiFi)
▪ Need smarter solutions
Mali Architecture
ARM® Mali™ GPU Rendering
▪ Mali is a tile-based deferred rendering architecture
  ▪ Framebuffer is divided into tiles
  ▪ Renders tile by tile
  ▪ 16x16 tile size
    ▪ Color
    ▪ Depth
    ▪ Stencil
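As a rough illustration of what 16x16 tiling means in practice, the sketch below counts the tiles needed to cover a framebuffer. Only the 16x16 tile size comes from the slide; the 1920x1080 resolution and the helper name `tile_grid` are assumptions made for this example.

```python
import math

TILE_EDGE = 16  # tile edge in pixels, as stated on the slide


def tile_grid(width, height, tile_edge=TILE_EDGE):
    """Return (columns, rows) of tiles needed to cover a width x height framebuffer."""
    # Partial tiles at the right and bottom edges still count as whole tiles.
    return math.ceil(width / tile_edge), math.ceil(height / tile_edge)


# Assumed example resolution: 1080p.
cols, rows = tile_grid(1920, 1080)
print(cols, rows, cols * rows)  # prints "120 68 8160"
```

Under these assumptions a 1080p frame is covered by 120 x 68 = 8160 tiles. The bottom row is only partly used because 1080 is not a multiple of 16.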
Deferred Shading and Extensions Support
Deferred Shading
▪ Popular technique on PC and console games
▪ Limitation
  ▪ Very memory bandwidth intensive
  ▪ Traditionally not a good fit for mobile
Extensions (1): Shader Framebuffer Fetch
▪ Fragment shader extensions for OpenGL® ES 2.0 and above
▪ Allows reading of existing framebuffer color, depth and stencil values
▪ Enables:
  ▪ Programmable blending
  ▪ Programmable depth/stencil testing
  ▪ Soft particles
  ▪ Reconstruction of 3D position
  ▪ etc.
http://www.khronos.org/registry/gles/extensions/ARM/ARM_shader_framebuffer_fetch.txt
http://www.khronos.org/registry/gles/extensions/ARM/ARM_shader_framebuffer_fetch_depth_stencil.txt
Extensions (2): Shader Pixel Local Storage (PLS)
▪ Fragment shader extension for OpenGL® ES 3.0 and above
▪ Enables reading and writing the current pixel’s data that is persistent throughout the lifetime of the framebuffer
  ▪ On the ARM® Mali™-T600 series this amounts to 128 bits per pixel
  ▪ Mali-T760 can support even more data per pixel
▪ Independent of framebuffer format
http://www.khronos.org/registry/gles/extensions/EXT/EXT_shader_pixel_local_storage.txt
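To put the 128 bits per pixel figure in context, the sketch below converts it into bytes per pixel, the number of RGBA8-sized values that fit, and the storage for one 16x16 tile. The 128-bit budget and the tile size are taken from the slides; the function name `pls_budget` and the idea of measuring the budget in RGBA8 slots are assumptions for illustration, not part of the extension.

```python
PLS_BITS_PER_PIXEL = 128  # Mali-T600 figure from the slide
TILE_EDGE = 16            # tile edge in pixels, from the rendering slide
RGBA8_BITS = 32           # 4 channels x 8 bits


def pls_budget(bits_per_pixel=PLS_BITS_PER_PIXEL, tile_edge=TILE_EDGE):
    """Return (bytes per pixel, RGBA8-sized slots per pixel, bytes per tile)."""
    bytes_per_pixel = bits_per_pixel // 8
    rgba8_slots = bits_per_pixel // RGBA8_BITS
    bytes_per_tile = bytes_per_pixel * tile_edge * tile_edge
    return bytes_per_pixel, rgba8_slots, bytes_per_tile


print(pls_budget())  # prints "(16, 4, 4096)"
```

Under these assumptions the budget is 16 bytes per pixel, enough for four RGBA8-sized values, or 4096 bytes for a full 16x16 tile.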
Deferred Shading: Resolve
▪ Compute final pixel color based on the Pixel Local Storage data
▪ Output to current framebuffer format
Bandwidth Comparison: Deferred Shading
[Bar chart: Write, Read and Total bandwidth in MB/s (axis 0 to 2250) for two series, "Using extensions" and "Multiple render targets"]
Deferred shading example rendering to 4xRGBA8, 1080p@30fps
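The chart values themselves are not recoverable from this transcript, but the multiple-render-targets case can be estimated from the stated configuration. The sketch below is a back-of-envelope calculation, not the measurement behind the slide. It assumes every pixel of all four RGBA8 targets is written once per frame and read back once in the lighting pass, with no compression or caching and 1 MB = 10^6 bytes. The function name `gbuffer_traffic_mb_s` is hypothetical.

```python
def gbuffer_traffic_mb_s(width, height, targets, bytes_per_texel, fps):
    """Traffic in MB/s (10^6 bytes) to touch every texel of every target once per frame."""
    bytes_per_frame = width * height * targets * bytes_per_texel
    return bytes_per_frame * fps / 1e6


# Configuration from the slide: 4 x RGBA8 (4 bytes per texel), 1080p, 30 fps.
write_mb_s = gbuffer_traffic_mb_s(1920, 1080, 4, 4, 30)
read_mb_s = write_mb_s  # assumption: each target is read back exactly once
total_mb_s = write_mb_s + read_mb_s

print(f"{write_mb_s:.1f} {read_mb_s:.1f} {total_mb_s:.1f}")  # prints "995.3 995.3 1990.7"
```

Under these assumptions the G-buffer alone costs roughly 1 GB/s in each direction, which falls within the chart's 0 to 2250 MB/s axis range. Keeping that data on-chip with the extensions is what avoids this traffic.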
Roadmap
Future
▪ Various deferred shading/lighting
▪ Order independent transparency
▪ Deferred virtual texturing
▪ Volume rendering
▪ etc.
http://geomerics.com/downloads/SIGGRAPH-2013-SamMartinEtAl-Challenges.pdf
Questions? Thank you.