Lecture 6 Welcome - Utrecht University
LECTURE 6
WELCOME
PART 1
THE CACHE
cache
Why is RAM slow?
Runs at a lower clock speed
Too far from the CPU
c = 300,000 km/s
at 4 GHz: 7.5 cm per cycle
signals in copper travel slower than c
actually about 5 cm per cycle
so at most 2.5 cm there and back
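The arithmetic behind these numbers can be checked with a short sketch. The velocity factor of 2/3 for copper is an assumption, chosen because it reproduces the slide's 5 cm; the function name is made up for this example.

```python
# Distance a signal can cover during one clock cycle.
SPEED_OF_LIGHT_M_PER_S = 300_000_000  # the slide's rounded value of c
COPPER_VELOCITY_FACTOR = 2 / 3        # assumption: matches the slide's 5 cm


def distance_per_cycle_cm(clock_hz, velocity_factor=1.0):
    """Centimetres a signal travels in one clock cycle."""
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor / clock_hz * 100


in_vacuum = distance_per_cycle_cm(4e9)
in_copper = distance_per_cycle_cm(4e9, COPPER_VELOCITY_FACTOR)
round_trip_reach = in_copper / 2  # the signal has to go there and come back

print(f"vacuum: {in_vacuum:.1f} cm, copper: {in_copper:.1f} cm, "
      f"reach: {round_trip_reach:.1f} cm")
```

Anything further away than the round-trip reach cannot answer within a single cycle, whatever its own speed.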
Level 1 cache
Level 2 cache
Registers: 0 cycles
L1: 2 cycles
L2: 15 cycles
RAM: 80 cycles
Registers: 0 cycles
Level 1 cache (32KB): 4 cycles
Level 2 cache (256KB): 11 cycles
Level 3 cache (6MB): 39 cycles
RAM: 107 cycles
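To get a feel for what these latencies mean in practice, here is a rough sketch of the average cost of a memory access. The latencies are the ones listed above; the shares of accesses served by each level are purely hypothetical and depend entirely on the program.

```python
# Latencies in cycles, taken from the slide.
LATENCY_CYCLES = {"L1": 4, "L2": 11, "L3": 39, "RAM": 107}


def average_cycles(served_by):
    """Average cycles per access, given the share of accesses each level serves."""
    if abs(sum(served_by.values()) - 1.0) > 1e-9:
        raise ValueError("shares must add up to 1")
    return sum(LATENCY_CYCLES[level] * share
               for level, share in served_by.items())


# Hypothetical shares, for illustration only.
cache_friendly = {"L1": 0.90, "L2": 0.06, "L3": 0.03, "RAM": 0.01}
cache_hostile = {"L1": 0.0, "L2": 0.0, "L3": 0.0, "RAM": 1.0}

print(average_cycles(cache_friendly))
print(average_cycles(cache_hostile))
```

Even a small share of accesses that go all the way to RAM contributes noticeably to the average.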
CACHE
slot address data
0 0050 411CBB372B37
1 0000 0A3246F3762B
2 0030 8910EE24BACF
3 0080 2AB348FE376C
RAM
address data
0000 0A3246F3762B
0010 64000101EA67
0020 2BD634633642
0030 8910EE24BACF
0040 374C34648232
0050 411CBB372B37
0060 283E34A8623A
0070 A83829200176
0080 2AB348FE376C
Fully associative cache
Retrieving data:
CPU wants to read from RAM
Cache searches for address
If found, data is returned
Otherwise, RAM is used
Obtained data is stored in cache
Writing data:
CPU wants to write to RAM
Cache searches for address
If found, data is written
Otherwise, new entry is created
Data to be written is stored in cache
Stored data is written to RAM ‘later’
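The read and write steps above can be sketched as a small simulation. This is a minimal model, not a description of real hardware: the slides do not say which entry is replaced when the cache is full, so least-recently-used replacement is an assumption here, and the class and method names are made up for this example.

```python
from collections import OrderedDict


class FullyAssociativeCache:
    """Tiny write-back cache: any address may occupy any slot."""

    def __init__(self, ram, capacity=4):
        self.ram = ram                 # dict: address -> data
        self.capacity = capacity
        self.entries = OrderedDict()   # address -> [data, dirty]; oldest first
        self.hits = 0
        self.misses = 0

    def _make_room(self):
        # Assumption: evict the least recently used entry when full.
        if len(self.entries) >= self.capacity:
            old_address, (old_data, dirty) = self.entries.popitem(last=False)
            if dirty:
                self.ram[old_address] = old_data  # the 'later' write to RAM

    def read(self, address):
        if address in self.entries:          # found: return cached data
            self.hits += 1
            self.entries.move_to_end(address)
            return self.entries[address][0]
        self.misses += 1                     # otherwise: use RAM
        data = self.ram[address]
        self._make_room()
        self.entries[address] = [data, False]  # store obtained data in cache
        return data

    def write(self, address, data):
        if address in self.entries:          # found: overwrite cached data
            self.hits += 1
            self.entries[address] = [data, True]
            self.entries.move_to_end(address)
        else:                                # otherwise: create a new entry
            self.misses += 1
            self._make_room()
            self.entries[address] = [data, True]

    def flush(self):
        """Write all modified entries back to RAM."""
        for address, entry in self.entries.items():
            if entry[1]:
                self.ram[address] = entry[0]
                entry[1] = False


ram = {0x0000: "0A3246F3762B", 0x0030: "8910EE24BACF",
       0x0050: "411CBB372B37", 0x0080: "2AB348FE376C"}
cache = FullyAssociativeCache(ram)
cache.read(0x0050)                    # miss: fetched from RAM, stored in cache
cache.read(0x0050)                    # hit: served from the cache
cache.write(0x0030, "FFFFFFFFFFFF")   # miss: new entry, RAM not updated yet
print(cache.hits, cache.misses, ram[0x0030])
cache.flush()
print(ram[0x0030])
```

Note that the membership test on a dictionary hides the cost the slides warn about: hardware has to compare the address against every slot.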
CACHE
line tag data
0000 0000 000000000000
0001 0000 000000000000
0002 1A50 8910EE24BACF
0003 0B70 2AB348FE376C
0004 0000 000000000000
0005 0000 000000000000
0006 0000 000000000000
0007 0000 000000000000
Set associative cache (one entry per line, also called a direct-mapped cache)
Address: 0B700003
line: 0003, tag: 0B70
Steps:
Split address into ‘line’ and ‘tag’
At cache line ‘line’, verify ‘tag’
If tag matches, return data
Otherwise, get data from RAM
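The steps above can be sketched as follows, using the slide's layout in which the low 16 bits of the address select the line and the high 16 bits form the tag. The function names, the dictionary representation and the RAM contents are assumptions made for this example; empty lines are simply absent from the dictionary instead of holding tag 0000.

```python
def split_address(address):
    """Split a 32-bit address into (line, tag): low 16 bits, high 16 bits."""
    return address & 0xFFFF, address >> 16


def lookup(cache, ram, address):
    """Return (data, hit). On a miss, fetch from RAM and replace the line."""
    line, tag = split_address(address)
    entry = cache.get(line)
    if entry is not None and entry[0] == tag:   # tag matches: return data
        return entry[1], True
    data = ram[address]                         # otherwise: get data from RAM
    cache[line] = (tag, data)
    return data, False


# Cache contents from the slide: line -> (tag, data).
cache = {0x0002: (0x1A50, "8910EE24BACF"),
         0x0003: (0x0B70, "2AB348FE376C")}
ram = {0x0CA00006: "hypothetical"}  # placeholder contents, not from the slide

line, tag = split_address(0x0B700003)
print(f"line {line:04X}, tag {tag:04X}")
print(lookup(cache, ram, 0x0B700003))   # tag matches at line 0003
print(lookup(cache, ram, 0x0CA00006))   # line 0006 is empty: goes to RAM
```

Only one tag comparison is needed per access, which is what makes this design cheap compared to searching every slot.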
Address: 0CA00006
line: 0006, tag: 0CA0
Address: 098A0006
line: 0006, tag: 098A
Both addresses map to line 0006, so they cannot be cached at the same time.
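A short sketch shows what happens when a program alternates between these two addresses. It assumes the same split as on the slide (low 16 bits select the line, high 16 bits are the tag) and tracks tags only; the function name is made up for this example.

```python
def count_hits_one_entry_per_line(addresses):
    """Count cache hits when every line can hold exactly one tag."""
    cache = {}  # line -> tag currently stored there
    hits = 0
    for address in addresses:
        line, tag = address & 0xFFFF, address >> 16
        if cache.get(line) == tag:
            hits += 1
        else:
            cache[line] = tag  # replaces whatever was stored in this line
    return hits


alternating = [0x0CA00006, 0x098A0006] * 4   # 8 accesses, same line
same_address = [0x0CA00006] * 8              # 8 accesses, same address

print(count_hits_one_entry_per_line(alternating))
print(count_hits_one_entry_per_line(same_address))
```

In the alternating pattern every access replaces the entry the next access needs, so the cache never helps.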
N-way set associative cache
CACHE
line tag 1 data 1
0000 0000 000000000000
0001 0000 000000000000
0002 1A50 8910EE24BACF
0003 0B70 2AB348FE376C
0004 0000 000000000000
0005 0000 000000000000
0006 0000 000000000000
0007 0000 000000000000
CACHE
line tag 2 data 2
0000 0000 000000000000
0001 0000 000000000000
0002 0000 000000000000
0003 0FC0 1056BBA001FF
0004 0000 000000000000
0005 0000 000000000000
0006 0000 000000000000
0007 0000 000000000000
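The two tables above can be modelled as a cache in which every line holds up to N tags. The sketch below tracks tags only and assumes least-recently-used replacement within a line, which the slides do not specify; the class name is made up for this example.

```python
class SetAssociativeCache:
    """Each line holds up to `ways` tags (data omitted for brevity)."""

    def __init__(self, ways=2):
        self.ways = ways
        self.sets = {}   # line -> list of tags, least recently used first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line, tag = address & 0xFFFF, address >> 16
        tags = self.sets.setdefault(line, [])
        if tag in tags:              # check all (at most N) tags of this line
            tags.remove(tag)
            tags.append(tag)         # mark as most recently used
            self.hits += 1
            return True
        if len(tags) >= self.ways:   # line is full: drop the oldest tag
            tags.pop(0)              # assumption: LRU replacement
        tags.append(tag)
        self.misses += 1
        return False


# Tags 0B70 and 0FC0 both live in line 0003, as in the tables above.
pattern = [0x0B700003, 0x0FC00003] * 4

two_way = SetAssociativeCache(ways=2)
one_way = SetAssociativeCache(ways=1)
for address in pattern:
    two_way.access(address)
    one_way.access(address)

print(two_way.hits, two_way.misses)
print(one_way.hits, one_way.misses)
```

With two ways, both addresses stay cached after their first access; with one way, the same pattern never hits.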
Caching – Summary
Fully associative cache:
Based on an address, we search through all cache lines to see if
the requested data is available. This kind of cache must be small,
or the number of tag comparisons becomes huge.
Set associative cache (one entry per line, also called direct-mapped):
Based on the address, we determine the single cache line where our
data could be, and check only that line. Addresses that map to the
same cache line keep evicting each other, which renders the cache
useless for them.
N-way set associative cache:
Every cache line can now hold N addresses. We need to check all
N tags, so N must be small. However, several addresses sharing the
same cache line can now be cached at the same time.
So… How does this affect your program?
1. 64 bytes per cache line:
2. 32KB L1 cache, 8-way set associative:
3. Memory latency of 107 cycles:
4. Prefetching:
5. L1 instruction cache:
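The first two numbers in this list can be turned into a quick calculation. The sketch assumes the set is selected by the address bits directly above the 64-byte line offset, which is a common arrangement but is not stated on the slide; the function names are made up for this example.

```python
LINE_SIZE = 64            # bytes per cache line (from the slide)
CACHE_SIZE = 32 * 1024    # 32KB L1 cache (from the slide)
WAYS = 8                  # 8-way set associative (from the slide)
NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)


def set_index(address):
    """Set a byte address maps to (assumed: bits just above the line offset)."""
    return (address // LINE_SIZE) % NUM_SETS


def lines_touched(count, stride):
    """Number of distinct cache lines hit by `count` accesses `stride` bytes apart."""
    return len({(i * stride) // LINE_SIZE for i in range(count)})


print(NUM_SETS)                       # sets in the L1 cache
print(lines_touched(16, stride=4))    # 16 consecutive 4-byte values
print(lines_touched(16, stride=64))   # 16 values spaced one line apart
print({set_index(i * 4096) for i in range(9)})  # addresses 4096 bytes apart
```

Sixteen consecutive 4-byte values share one cache line, while a 64-byte stride touches a new line on every access. Under the stated assumption, addresses 4096 bytes apart all compete for the same set, so a ninth such address no longer fits in the 8 ways.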
PART 2
TOTAL RECAP
“Dear Charles,
In almost every computation a
great variety of arrangements
for the succession of the
processes is possible, and various
considerations must influence
the selection amongst them
(...).
One essential object is to
choose that arrangement which
shall tend to reduce to a
minimum the time necessary for
completing the calculation.
Therefore, one should attend
PR3 and learn from it.
Love, Ada.”
10 TIPS straight from Ada Lovelace & Charles Babbage!
“HOW TO PASS PR3”
(0. Read the slides once more.)
1. Choose your tools. (timer, compiler, SVN, Excel, etc.)
2. Measure & note. (original performance, scalability, time for various parts of the app)
3. Take a step back. (think, don’t type: what could be done smarter? – then research)
4. Resist the urge. (don’t touch that sqrtf yet. Improve algorithms instead)
5. Measure & note. (things changed radically, so measure again, and write down things)
6. Now give in to the urge. (go wild: Cache. Low level. Multithread.)
7. Measure. Note. (don’t forget! More results means a better report and a higher grade.)
8. Goto 6. (there’s always more to tweak. Mind diminishing returns though.)
9. Add some SIMD. It’s mandatory. (really. Don’t forget.)
10. Add polish. Hand in. (at least make it *look* professional, it really helps)
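The "measure & note" tips can be supported by a tiny timing helper like the one below. This is only a sketch in Python; the function name is made up, and for the actual assignment you would use whatever timer fits your own toolchain.

```python
import time


def measure(function, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        function()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)   # the fastest run has the least noise
    return best


def workload():
    # Stand-in for the part of the application being measured.
    return sum(i * i for i in range(100_000))


baseline = measure(workload)
print(f"baseline: {baseline * 1000:.3f} ms")
```

Taking the best of several runs reduces the influence of other processes; write the numbers down before and after every change.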
Wednesday in the exam week – By MAIL!
FRIDAY
THE END