Low-power Task Scheduling for GPU Energy Reduction
Li Tang, Yiji Zhang
Introduction
• DVFS (dynamic voltage and frequency scaling) implementation
• Building a linear regression power model for the GPU
DVFS implementation
• Dynamic Voltage and Frequency Scaling (DVFS)
▫ A method that varies the energy supplied to a task by scaling the operating voltage and frequency.
• Power & energy consumption
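For reference, the textbook CMOS relations between voltage, frequency, power, and energy are sketched below. This is an assumption about what this slide covers; the symbols $C_{\text{eff}}$ (effective switched capacitance), $V$, $f$, and $T$ (run time) are not from the slides.

```latex
P_{\text{dynamic}} \approx C_{\text{eff}} \, V^{2} f,
\qquad
E = \int_{0}^{T} P(t)\, dt \approx \bar{P} \, T
```

Lowering $V$ and $f$ reduces power, but run time $T$ usually grows as $f$ drops, so energy falls only when power decreases faster than run time increases.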
GPU architecture and linear regression power model
[Figure: GPU architecture, showing on-chip components and device memory]
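• GPU linear power model:
$$P_{\text{total}} = \sum_{i=1}^{n} P_i^{\max} \, u_i + P_0$$
▫ $P_{\text{total}}$: total power
▫ $P_i^{\max}$: maximum power of the i-th component
▫ $u_i$: usage rate of the i-th component
▫ $P_0$: intercept power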
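A minimal host-side sketch of evaluating the linear power model, assuming the per-component maximum powers and the intercept have already been obtained by regression. The function name `estimate_total_power` and its parameters are hypothetical, not from the slides. The code contains no device code, so it builds with nvcc or a plain C++ compiler.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>

// Total power = intercept + sum over components of
// (maximum power of component i) * (usage rate of component i).
// Usage rates are expected to lie in [0, 1]. All names are illustrative.
double estimate_total_power(const std::vector<double>& max_power,
                            const std::vector<double>& usage_rate,
                            double intercept_power) {
    // Each component needs exactly one coefficient and one usage rate.
    assert(max_power.size() == usage_rate.size());
    double total = intercept_power;
    for (std::size_t i = 0; i < max_power.size(); ++i) {
        total += max_power[i] * usage_rate[i];
    }
    return total;
}
```

With illustrative (not measured) inputs, two components with maximum powers 10 and 20, usage rates 0.5 and 0.25, and an intercept of 5 give 5 + 5 + 5 = 15.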
Energy measurement
• NI USB-6216 DAQ + two FLUKE 80i-110s current clamps
• Sampling rate:
▫ 10 readings per millisecond (10 kHz)
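A minimal sketch of turning fixed-rate readings into an energy figure, assuming the current-clamp readings have already been converted to power using the supply voltage (that conversion is not described in the slides). It uses a simple rectangle rule; the function name `energy_from_samples` and its parameters are hypothetical. The code is host-only, with no device code.

```cuda
#include <cassert>
#include <vector>

// Integrates power samples taken at a fixed rate into energy.
// power_samples:  instantaneous power readings, one per sample (assumed in watts)
// samples_per_ms: sampling rate, e.g. 10.0 for 10 readings per millisecond
// Returns energy in joules under those unit assumptions (rectangle rule).
double energy_from_samples(const std::vector<double>& power_samples,
                           double samples_per_ms) {
    assert(samples_per_ms > 0.0);
    const double dt_seconds = 1e-3 / samples_per_ms;  // time covered by one sample
    double energy = 0.0;
    for (double p : power_samples) {
        energy += p * dt_seconds;
    }
    return energy;
}
```

At 10 readings per millisecond each sample covers 0.1 ms, so one second of a constant 50 W load is 10,000 samples and integrates to about 50 J.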
Preliminary results
• WAXPY function:
▫ W[i] = alpha*X[i] + beta*Y[i] (i: thread index)
• Kernel launch:
▫ WAXPY<<<num_blocks, num_threads>>>
• Vector size and type:
▫ 1,000,000 elements of type float
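A minimal CUDA sketch of a WAXPY kernel and its launch. The slides give only the formula and the launch syntax, so the argument list, the grid-stride loop (assumed here so that small launch configurations still cover all 1,000,000 elements), the use of managed memory, and the values of `alpha`, `beta`, `num_blocks`, and `num_threads` are all assumptions rather than the authors' code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// W[i] = alpha * X[i] + beta * Y[i]
// Grid-stride loop (an assumption): each thread handles elements
// i, i + stride, i + 2*stride, ... so any launch size covers all n elements.
__global__ void WAXPY(int n, float alpha, const float* X,
                      float beta, const float* Y, float* W) {
    int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        W[i] = alpha * X[i] + beta * Y[i];
    }
}

int main() {
    const int n = 1000000;            // vector size from the slides
    const int num_blocks = 16;        // illustrative launch configuration
    const int num_threads = 64;       // illustrative launch configuration
    const float alpha = 2.0f, beta = 3.0f;  // illustrative scalars

    float *X, *Y, *W;
    cudaMallocManaged(&X, n * sizeof(float));
    cudaMallocManaged(&Y, n * sizeof(float));
    cudaMallocManaged(&W, n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        X[i] = 1.0f;
        Y[i] = 2.0f;
    }

    WAXPY<<<num_blocks, num_threads>>>(n, alpha, X, beta, Y, W);
    cudaError_t err = cudaGetLastError();   // catches launch errors
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();      // catches execution errors
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("W[0] = %f, W[n-1] = %f\n", W[0], W[n - 1]);

    cudaFree(X);
    cudaFree(Y);
    cudaFree(W);
    return 0;
}
```

Changing `num_blocks` and `num_threads` reproduces the kind of sweep over launch configurations reported in the results, without changing the kernel itself.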
Thread*Block      1*1       1*4       1*16      1*64      4*64      16*64     64*16
WAXPY Time        44.053    11.529    3.503     1.056     0.266     0.070     0.071
WAXPY GPU Power   31.878    33.872    43.263    59.015    57.223    63.709    48.098
WAXPY GPU Energy  1404.298  390.498   151.542   62.298    15.195    4.466     3.415
▫ 16*64 vs. 64*16: nearly equal time (0.070 vs. 0.071), but higher power for 16*64 (63.709 > 48.098)
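As a consistency check, the energy row matches power multiplied by time, up to rounding of the reported values. The slides do not state units, so none are assumed here. Worked for the first and last columns:

```latex
E = P \cdot t: \qquad
31.878 \times 44.053 \approx 1404.3,
\qquad
48.098 \times 0.071 \approx 3.415
```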