Acceleration on many-cores CPUs and GPUs Dinesh Manocha Lauri Savioja

Post on 21-Dec-2015

Page 1: Acceleration on many-cores CPUs and GPUs Dinesh Manocha Lauri Savioja

Acceleration on many-cores CPUs and GPUs

Dinesh Manocha, Lauri Savioja

Frustum Tracing Pipeline

Frustum-Triangle Intersection

Frustum Tracing Results (7 cores)

Scene      Triangles (∆s)   Diffraction   #Frusta   Time (msec)
Theater    54               NO            56K       33
Factory    174              NO            40K       27
Game       14K              NO            206K      273
Sibenik    71K              NO            198K      598
City       72K              YES           80K       206
SodaHall   1.5M             YES           108K      373

Interactive geometric propagation on complex scenes

[Chandak et al. 2008]

Scaling of FastV with the number of cores

Fastest accurate geometric propagation algorithm [Chandak et al. 2009]

Numerical Acoustics with Adaptive Rectangular Decomposition on the GPU

Nikunj Raghuvanshi+, Brandon Lloyd*, Naga K. Govindaraju*, Ming C. Lin+

+ Department of Computer Science, UNC Chapel Hill
* Microsoft Corporation

Rectangular Decomposition

Numerical Acoustics can be solved very efficiently on a rectangular domain

Decompose complex domains into rectangles
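
One reason a rectangular domain is cheap to solve: with rigid walls, cosine modes are eigenfunctions of the Laplacian, so every mode evolves as an independent harmonic oscillator and can be time-stepped on its own. The following is a minimal 1D sketch of that idea, assuming a rigid-walled interval and no sound source. The function names and the speed-of-sound default are illustrative assumptions, not taken from the slides or from the authors' code.

```python
import math

def mode_frequencies(num_modes, length, speed_of_sound=343.0):
    """Angular frequency of cosine mode i on a rigid-walled interval (assumed setup)."""
    return [speed_of_sound * math.pi * i / length for i in range(num_modes)]

def step_modes(current, previous, omegas, dt):
    """Advance all modal coefficients by one time step of size dt.

    For the source-free oscillator m'' = -omega^2 m, the recurrence
    m(t + dt) = 2 cos(omega dt) m(t) - m(t - dt) holds exactly.
    """
    return [2.0 * math.cos(w * dt) * c - p
            for c, p, w in zip(current, previous, omegas)]

# Usage: excite only mode 3 with unit amplitude, starting at rest.
omegas = mode_frequencies(4, length=5.0)
dt = 1e-4
current = [0.0, 0.0, 0.0, 1.0]                         # coefficients at t = 0
previous = [0.0, 0.0, 0.0, math.cos(omegas[3] * dt)]   # coefficients at t = -dt
for _ in range(100):
    current, previous = step_modes(current, previous, omegas, dt), current
```

After 100 steps the excited coefficient should match cos(omega * 100 * dt) up to rounding. Going from pressure samples to modal coefficients and back is what the cosine transform provides.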

Leveraging GPU for acoustics

Solution of Wave Equation within each rectangle can be done using a Discrete Cosine Transform (DCT)

DCTs can be done using FFT

Use an efficient FFT implementation on the GPU:

Govindaraju, N. K., Lloyd, B., Dotsenko, Y., Smith, B., and Manferdelli, J. 2008. High performance discrete Fourier transforms on graphics processors. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing.
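
To illustrate how a DCT can be computed with an FFT, here is a small CPU-only sketch in plain Python. It uses one well-known reordering trick (even-indexed samples first, odd-indexed samples reversed, then a single FFT of the same length followed by a phase rotation). Whether the GPU implementation cited above uses this particular mapping is not stated in the slides, so treat it as an assumption. The helper names are hypothetical, and the toy radix-2 FFT only accepts power-of-two lengths.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def dct2_via_fft(x):
    """Unnormalized DCT-II: X[k] = sum_j x[j] * cos(pi * (j + 1/2) * k / N)."""
    n = len(x)
    # Even-indexed samples in order, then odd-indexed samples in reverse order.
    v = list(x[0::2]) + list(x[1::2])[::-1]
    spectrum = fft(v)
    # Rotate each FFT bin by a quarter-sample phase and keep the real part.
    return [(cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
            for k in range(n)]

def dct2_direct(x):
    """Reference O(N^2) evaluation of the same unnormalized DCT-II."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n))
            for k in range(n)]

samples = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
fast = dct2_via_fft(samples)
slow = dct2_direct(samples)
```

The two results should agree to rounding error. The FFT route costs O(N log N) instead of O(N^2), which is why a fast GPU FFT translates directly into a fast DCT.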

FFT on the GPU

Performance

Scene Name   Volume (m³)   Time: FDTD (CPU)      Time: Our Technique (GPU)   Speedup
Corridor     375           365 min               4 min                       ~90x
House        1,275         2718 min              13 min                      ~200x
Cathedral    13,650        ~1 week (projected)   30 min                      ~300x
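
As a quick arithmetic check, the rounded speedups can be recomputed from the timings in the table. This sketch treats the projected "~1 week" as exactly seven days, which is an assumption. The variable names are illustrative.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # assumes "~1 week" means exactly 7 days

# (scene, FDTD CPU minutes, GPU minutes) as reported in the table
timings = [
    ("Corridor", 365, 4),
    ("House", 2718, 13),
    ("Cathedral", MINUTES_PER_WEEK, 30),
]

def speedup(cpu_minutes, gpu_minutes):
    """Ratio of CPU time to GPU time."""
    return cpu_minutes / gpu_minutes

speedups = {scene: speedup(cpu, gpu) for scene, cpu, gpu in timings}
for scene, factor in speedups.items():
    print(f"{scene}: about {factor:.0f}x")
```

The computed ratios come out near 91, 209, and 336, in line with the rounded figures of ~90x, ~200x, and ~300x.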

Conclusion

Rectangular decomposition combines a GPU FFT with algorithmic improvements, yielding a ~100x performance improvement for numerical acoustics

Case: Real-time acoustic radiance transfer

Case continued

More information in: S. Siltanen, T. Lokki, and L. Savioja, "Frequency domain acoustic radiance transfer for real-time auralization," Acta Acustica united with Acustica, vol. 95, no. 1, pp. 106-117, 2009.