multiprocessors and...
TRANSCRIPT
![Page 1: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/1.jpg)
Multiprocessors and Multithreading
Jason Mars
Sunday, March 3, 13
![Page 2: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/2.jpg)
Parallel Architectures for Executing Multiple Threads
Sunday, March 3, 13
![Page 3: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/3.jpg)
Parallel Architectures for Executing Multiple Threads
• Multiprocessor – multiple CPUs tightly coupled enough to cooperate on a single problem.
Sunday, March 3, 13
![Page 4: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/4.jpg)
Parallel Architectures for Executing Multiple Threads
• Multiprocessor – multiple CPUs tightly coupled enough to cooperate on a single problem.
• Multithreaded processors (e.g., simultaneous multithreading) – single CPU core that can execute multiple threads simultaneously.
Sunday, March 3, 13
![Page 5: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/5.jpg)
Parallel Architectures for Executing Multiple Threads
• Multiprocessor – multiple CPUs tightly coupled enough to cooperate on a single problem.
• Multithreaded processors (e.g., simultaneous multithreading) – single CPU core that can execute multiple threads simultaneously.
• Multicore processors – multiprocessor where the CPU cores coexist on a single processor chip.
Sunday, March 3, 13
![Page 6: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/6.jpg)
Multiprocessors
• Not that long ago, multiprocessors were expensive, exotic machines – special-purpose engines to solve hard problems.
• Now they are pervasive.
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
Sunday, March 3, 13
![Page 7: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/7.jpg)
Classifying Multiprocessors
• Flynn Taxonomy
• Interconnection Network
• Memory Topology
• Programming Model
Sunday, March 3, 13
![Page 8: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/8.jpg)
Flynn Taxonomy
• SISD (Single Instruction Single Data)• Uniprocessors
• SIMD (Single Instruction Multiple Data)• Examples: Illiac-IV, CM-2, Nvidia GPUs, etc.
• Simple programming model• Low overhead
• MIMD (Multiple Instruction Multiple Data)• Examples: many, nearly all modern multiprocessors or multicores
• Flexible• Use off-the-shelf microprocessors or microprocessor cores
• MISD (Multiple Instruction Single Data)• ???
Sunday, March 3, 13
![Page 9: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/9.jpg)
Interconnection Networks
• Bus• Network• pros/cons?
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
Sunday, March 3, 13
![Page 10: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/10.jpg)
Memory Topology
• UMA (Uniform Memory Access)• NUMA (Non-uniform Memory Access)• pros/cons?
cpu
cpu
cpu
cpu
.
.
.
M
M
M
M
.
.
.
Network
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
Network
Cache
Processor
Cache
Processor
Cache
Processor
Memory Memory Memory
Sunday, March 3, 13
![Page 11: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/11.jpg)
Programming Model
• Shared Memory -- every processor can name every address location• Message Passing -- each processor can name only it’s local memory.
Communication is through explicit messages.• pros/cons?
Network
Cache
Processor
Cache
Processor
Cache
Processor
Memory Memory Memory
Sunday, March 3, 13
![Page 12: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/12.jpg)
Programming Model
• Shared Memory -- every processor can name every address location• Message Passing -- each processor can name only it’s local memory.
Communication is through explicit messages.• pros/cons?
Network
Cache
Processor
Cache
Processor
Cache
Processor
Memory Memory Memory
find the max of 100,000 integers on 10 processors.
Sunday, March 3, 13
![Page 13: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/13.jpg)
Parallel Programming
• Shared-memory programming requires synchronization to provide mutual exclusion and prevent race conditions• locks (semaphores)• barriers
Processor A index = i++;
Processor B index = i++;
i = 47
Sunday, March 3, 13
![Page 14: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/14.jpg)
Parallel Programming
• Shared-memory programming requires synchronization to provide mutual exclusion and prevent race conditions• locks (semaphores)• barriers
Processor A index = i++;
Processor B index = i++;
i = 47
load i;inc i;
store i;
load i;inc i;
store i;
Sunday, March 3, 13
![Page 15: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/15.jpg)
Parallel Programming
• Shared-memory programming requires synchronization to provide mutual exclusion and prevent race conditions• locks (semaphores)• barriers
Processor A index = i++;
Processor B index = i++;
i = 47
load i;inc i;
store i;load i;inc i;
store i;
Sunday, March 3, 13
![Page 16: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/16.jpg)
Parallel Programming
• Shared-memory programming requires synchronization to provide mutual exclusion and prevent race conditions• locks (semaphores)• barriers
Processor A index = i++;
Processor B index = i++;
i = 47
Sunday, March 3, 13
![Page 17: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/17.jpg)
Parallel Programming
• Shared-memory programming requires synchronization to provide mutual exclusion and prevent race conditions• locks (semaphores)• barriers
Processor A index = i++;
Processor B index = i++;
i = 47
load i;
inc i;
store i;
load i;
inc i;
store i;
Sunday, March 3, 13
![Page 18: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/18.jpg)
But...
• That ignores the existence of caches
• How do caches complicate the problem of keeping data consistent between processors?
Sunday, March 3, 13
![Page 19: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/19.jpg)
Multiprocessor Caches (Shared Memory)
• the problem -- cache coherency
• the solution?
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
ii
Sunday, March 3, 13
![Page 20: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/20.jpg)
Multiprocessor Caches (Shared Memory)
• the problem -- cache coherency
• the solution?
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
i
inc i;
i
Sunday, March 3, 13
![Page 21: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/21.jpg)
Multiprocessor Caches (Shared Memory)
• the problem -- cache coherency
• the solution?
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
i
inc i;load i;
i
Sunday, March 3, 13
![Page 22: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/22.jpg)
Multiprocessor Caches (Shared Memory)
• the problem -- cache coherency
• the solution?
Cache
Processor
Cache
Processor
Cache
Processor
Single bus
Memory I/O
i
inc i;load i;
i
Sunday, March 3, 13
![Page 23: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/23.jpg)
What Does Coherence Mean?
• Informally:
• Any read must return the most recent write
• Too strict and very difficult to implement
• Better:
• A processor sees its own writes to a location in the correct order.
• Any write must eventually be seen by a read
• All writes are seen in order (“serialization”). Writes to the same location are seen in the same order by all processors.
• Without these guarantees, synchronization doesn’t work
Sunday, March 3, 13
![Page 24: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/24.jpg)
Solutions
Sunday, March 3, 13
![Page 25: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/25.jpg)
Solutions
• Snooping Solution (Snoopy Bus):• Send all requests for unknown data to all processors• Processors snoop to see if they have a copy and respond accordingly • Requires “broadcast”, since caching information is at processors• Works well with bus (natural broadcast medium)• Dominates for small scale machines (most of the market)
Sunday, March 3, 13
![Page 26: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/26.jpg)
Solutions
• Snooping Solution (Snoopy Bus):• Send all requests for unknown data to all processors• Processors snoop to see if they have a copy and respond accordingly • Requires “broadcast”, since caching information is at processors• Works well with bus (natural broadcast medium)• Dominates for small scale machines (most of the market)
• Directory-Based Schemes• Keep track of what is being shared in one centralized place (for each
address) => the directory• Distributed memory => distributed directory (avoids bottlenecks)• Send point-to-point requests to processors (to invalidate, etc.)• Scales better than Snooping for large multiprocessors
Sunday, March 3, 13
![Page 27: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/27.jpg)
Implementing Coherence Protocols
• How do you find the most up-to-date copy of the desired data?
• Snooping protocols
• Directory protocols
Cache tagand data
Processor
Single bus
Memory I/O
Snooptag
Cache tagand data
Processor
Snooptag
Cache tagand data
Processor
Snooptag
Sunday, March 3, 13
![Page 28: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/28.jpg)
Implementing Coherence Protocols
• How do you find the most up-to-date copy of the desired data?
• Snooping protocols
• Directory protocols
Cache tagand data
Processor
Single bus
Memory I/O
Snooptag
Cache tagand data
Processor
Snooptag
Cache tagand data
Processor
Snooptag
Write-Update vs Write-Invalidate
Sunday, March 3, 13
![Page 29: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/29.jpg)
Parallel Architectures for Executing Multiple Threads
• Multiprocessor – multiple CPUs tightly coupled enough to cooperate on a single problem.
• Multithreaded processors (e.g., simultaneous multithreading) – single CPU core that can execute multiple threads simultaneously.
• Multicore processors – multiprocessor where the CPU cores coexist on a single processor chip.
Sunday, March 3, 13
![Page 30: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/30.jpg)
Dean Tullsen
Simultaneous Multithreading
(A Few of Dean Tullsen’s 1996 Thesis Slides)
Sunday, March 3, 13
![Page 31: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/31.jpg)
Dean Tullsen
Hardware Multithreading
Conventional Processor
PC
regs
CPU
inst
ruct
ion
stre
amSunday, March 3, 13
![Page 32: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/32.jpg)
Dean Tullsen
Hardware Multithreading
Conventional Processor
PC
regs
CPU
inst
ruct
ion
stre
am
Multithreaded
Sunday, March 3, 13
![Page 33: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/33.jpg)
Dean Tullsen
Hardware Multithreading
Conventional Processor
PC
regs
CPU
inst
ruct
ion
stre
am
PC
regs
Multithreaded
Sunday, March 3, 13
![Page 34: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/34.jpg)
Dean Tullsen
Hardware Multithreading
Conventional Processor
PC
regs
CPU
inst
ruct
ion
stre
am
PC
regsPC
regs
Multithreaded
Sunday, March 3, 13
![Page 35: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/35.jpg)
Dean Tullsen
Hardware Multithreading
Conventional Processor
PC
regs
CPU
inst
ruct
ion
stre
am
PC
regs
PC
regsPC
regs
Multithreaded
Sunday, March 3, 13
![Page 36: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/36.jpg)
Superscalar (vs Superpipelined)
(multiple instructions in the same stage, same CR as scalar)
(more total stages, faster clock rate)
Sunday, March 3, 13
![Page 37: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/37.jpg)
Dean Tullsen
Superscalar Execution
Issue SlotsTime (proc cycles)
Sunday, March 3, 13
![Page 38: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/38.jpg)
Dean Tullsen
Superscalar Execution
Issue SlotsTime (proc cycles) Vertical waste
Sunday, March 3, 13
![Page 39: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/39.jpg)
Dean Tullsen
Superscalar Execution
Issue SlotsTime (proc cycles) Vertical waste
Horizontal waste
Sunday, March 3, 13
![Page 40: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/40.jpg)
Dean Tullsen
Superscalar Execution with Fine-Grain Multithreading
Issue SlotsTime (proc cycles)
Thread 1
Thread 2
Thread 3
Sunday, March 3, 13
![Page 41: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/41.jpg)
Dean Tullsen
Simultaneous Multithreading
Issue SlotsTime (proc cycles)
Thread 1
Thread 2
Thread 3
Thread 4
Thread 5
Sunday, March 3, 13
![Page 42: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/42.jpg)
Dean Tullsen
SMT Performance
0
1.7500
3.5000
5.2500
7.0000
1 2 3 4 5 6 7 8
Thro
ughp
ut (I
nstru
ctio
ns p
er C
ycle
)
Number of Threads
Simultaneous Multithreading
Fine-Grain Multithreading
Conventional Superscalar
Sunday, March 3, 13
![Page 43: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/43.jpg)
Parallel Architectures for Executing Multiple Threads
• Multiprocessor – multiple CPUs tightly coupled enough to cooperate on a single problem.
• Multithreaded processors (e.g., simultaneous multithreading) – single CPU core that can execute multiple threads simultaneously.
• Multicore processors – multiprocessor where the CPU cores coexist on a single processor chip.
Sunday, March 3, 13
![Page 44: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/44.jpg)
Multicore Processors (aka Chip Multiprocessors)
• Multiple cores on the same die, may or may not share L2 or L3 cache.
• Intel, AMD both have quad core processors. Sun Niagara T2 is 8 cores x 8 threads (64 contexts!)
• Everyone’s roadmap seems to be increasingly multi-core.
CPU CPU CPU
CPU CPU CPU
Sunday, March 3, 13
![Page 45: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/45.jpg)
The Latest Processors
Tegra 3 (5 Cores) Intel Nehalem (4 Cores)Multicore Multicore + SMT
Sunday, March 3, 13
![Page 46: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/46.jpg)
Nehalem
Sunday, March 3, 13
![Page 47: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/47.jpg)
Nehalem
Fetch
Sunday, March 3, 13
![Page 48: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/48.jpg)
Nehalem
Fetch
Decode
Sunday, March 3, 13
![Page 49: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/49.jpg)
Nehalem
Fetch
Decode
Execute
Sunday, March 3, 13
![Page 50: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/50.jpg)
Nehalem
Fetch
Decode
Execute
Mem/WB
Sunday, March 3, 13
![Page 51: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/51.jpg)
CSE 141 Dean Tullsen
Sunday, March 3, 13
![Page 52: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/52.jpg)
CSE 141 Dean Tullsen
Sunday, March 3, 13
![Page 53: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/53.jpg)
CSE 141 Dean Tullsen
Sunday, March 3, 13
![Page 54: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/54.jpg)
Nehalem in a Nutshell
• Up to 8 cores (i7, 4 cores)• 2 SMT threads per core• 20+ stage pipeline• x86 instructions translated to RISC-like uops• Superscalar, 4 “instructions” (uops) per cycle (more with fusing)• Caches (i7)
• 32KB 4-way set-associative I cache per core• 32KB, 8-way set-associative D cache per core• 256 KB unified 8-way set-associative L2 cache per core• 8 MB shared 16-way set-associative L3 cache
Sunday, March 3, 13
![Page 55: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/55.jpg)
Key Points
Sunday, March 3, 13
![Page 56: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/56.jpg)
Key Points
• Network vs. Bus
Sunday, March 3, 13
![Page 57: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/57.jpg)
Key Points
• Network vs. Bus
• Message-passing vs. Shared Memory
Sunday, March 3, 13
![Page 58: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/58.jpg)
Key Points
• Network vs. Bus
• Message-passing vs. Shared Memory
• Shared Memory is more intuitive, but creates problems for both the programmer (memory consistency, requiring synchronization) and the architect (cache coherency).
Sunday, March 3, 13
![Page 59: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/59.jpg)
Key Points
• Network vs. Bus
• Message-passing vs. Shared Memory
• Shared Memory is more intuitive, but creates problems for both the programmer (memory consistency, requiring synchronization) and the architect (cache coherency).
• Multithreading gives the illusion of multiprocessing (including, in many cases, the performance) with very little additional hardware.
Sunday, March 3, 13
![Page 60: Multiprocessors and Multithreadingcseweb.ucsd.edu/classes/wi13/cse141-b/slides/10-Multithreading.pdf · • Multicore processors – multiprocessor where the CPU cores coexist on](https://reader031.vdocuments.us/reader031/viewer/2022022007/5acffbf97f8b9a8b1e8d5d99/html5/thumbnails/60.jpg)
Key Points
• Network vs. Bus
• Message-passing vs. Shared Memory
• Shared Memory is more intuitive, but creates problems for both the programmer (memory consistency, requiring synchronization) and the architect (cache coherency).
• Multithreading gives the illusion of multiprocessing (including, in many cases, the performance) with very little additional hardware.
• When multiprocessing happens within a single die/processor, we call that a chip multiprocessor, or a multi-core architecture.
Sunday, March 3, 13