Source: meseec.ce.rit.edu/756-projects/spring2017/2-3.pdf
MEMORY MANAGEMENT IN MULTIPROCESSOR AND
MULTICORE SYSTEMS
INSTRUCTOR: Dr. MUHAMMAD SHAABAN
PRESENTED BY:
MOKSHAN SHETTIGAR
AGENDA
➢ History of Multiprocessor and Multicore systems.
➢ Difference between Multicores and Multiprocessors.
➢ Different Cache Levels.
➢ How Partitioning of Cache can be done.
➢ Conclusion and Results.
HISTORY OF MULTIPROCESSOR AND MULTICORE SYSTEMS
➢ Multiprocessors came into the picture when the demand for higher performance
grew.
➢ Multicore systems were introduced much later.
➢ The idea of multiprocessing is often traced back to the 19th century and Charles
Babbage's Analytical Engine (first described in 1837).
HISTORY OF MULTIPROCESSOR AND MULTICORE SYSTEMS
➢ The first multicore chip, the POWER4, was introduced by IBM in 2001 (source: IBM100).
➢ The chip delivered a data throughput of more than 100 gigabytes per second to its
cache memory, and its chip-to-chip communication modules operated at
over 35 gigabytes per second.
MULTIPROCESSOR ARCHITECTURE
There are two types of multiprocessor system:
➢ Symmetric Multiprocessor System.
➢ Master-Slave Multiprocessor System (Asymmetric System).
MULTIPROCESSOR ARCHITECTURE
➢ Master-Slave Multiprocessor System (Asymmetric System).
I. It is an outdated design.
II. It has many problems: the master processor controls all the others, so it becomes a
bottleneck and a single point of failure.
➢ Symmetric Multiprocessor System.
I. There is no master-slave concept.
II. Load is balanced, and each processor handles calls from the OS.
MULTICORE ARCHITECTURE
➢ In multicore systems, each processor has a number of cores.
➢ Each core does a specific part of the task.
➢ This increases the throughput.
➢ The level of parallelization depends on the
number of cores.
MULTICORE ARCHITECTURE
➢ This is a quad-core processor, which means it has four cores.
➢ Such a processor can provide up to four times the throughput of a comparable single-core
processor, provided the workload is fully parallel.
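The four-times figure is an upper bound. As a rough illustration, Amdahl's law (not part of the original slides, brought in here as an assumption) estimates the speedup when only a fraction of the work can run in parallel. The function name `amdahl_speedup` and the sample fractions are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction`
    of the work can be spread across them (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not sped up; the parallel part is divided by the core count.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Sample fractions are illustrative only.
    for fraction in (1.0, 0.9, 0.5):
        print(fraction, amdahl_speedup(fraction, cores=4))
```

With four cores, a fully parallel task reaches a speedup of 4.0, while a task that is only half parallel reaches 1.6.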
MULTICORE ARCHITECTURE
➢ Single-Core Systems
• These systems consume less power.
• These systems are too slow for
the current era of heavy computation.
➢ Multicore Systems
• These systems offer a high level of parallelization.
• They are used for high-end computation and
intensively parallel workloads.
SHARED CACHE
This is what a shared cache in a multicore/multiprocessor system
looks like:
➢ Level 1 (L1) cache
➢ Level 2 (L2) cache
➢ Level 3 (L3) cache
CACHE PARTITIONING
➢ Cache contention occurs when both
cores compete for the entire
cache.
➢ Applications running on different cores try to use the
same line in the cache.
➢ This collision increases access time.
CACHE PARTITIONING
➢ Here each core has its own section of the L2
cache.
➢ This partitioning helps eliminate
contention between the cores.
➢ This reduces the access time.
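The effect of partitioning can be sketched with a toy simulation. This is a minimal model under stated assumptions, not the scheme from the slides: a direct-mapped cache with 8 sets, two cores, and a contrived trace in which both cores' lines collide in the shared cache. The names `count_misses` and `interleaved_trace` are hypothetical.

```python
def count_misses(trace, num_sets, partitioned, num_cores=2):
    """Count misses in a direct-mapped cache.

    trace: iterable of (core_id, line_address) pairs.
    partitioned: if True, each core may only use its own equal slice of the sets.
    """
    sets = [None] * num_sets            # one resident line per set
    slice_size = num_sets // num_cores  # sets per core when partitioned
    misses = 0
    for core, line in trace:
        if partitioned:
            index = core * slice_size + line % slice_size
        else:
            index = line % num_sets
        if sets[index] != line:         # miss: fetch the line, evicting the old one
            misses += 1
            sets[index] = line
    return misses


def interleaved_trace(rounds):
    """Two cores loop over four lines each; the accesses alternate between cores."""
    trace = []
    for _ in range(rounds):
        for offset in range(4):
            trace.append((0, offset))      # core 0 uses lines 0..3
            trace.append((1, 8 + offset))  # core 1 uses lines 8..11
    return trace


if __name__ == "__main__":
    trace = interleaved_trace(rounds=10)
    print("shared:", count_misses(trace, num_sets=8, partitioned=False))
    print("partitioned:", count_misses(trace, num_sets=8, partitioned=True))
```

In this contrived trace the shared cache misses on all 80 accesses, because each core keeps evicting the other's line, while the partitioned cache misses only on the 8 cold accesses. The price is that each core sees only half of the sets.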
LOCALITY OF REFERENCE
➢ Temporal Locality
• A resource that was accessed recently is likely to be accessed
again soon.
• Sensitive to the size of the cache.
➢ Spatial Locality
• Resources near an accessed resource are likely to be
accessed in the near future.
• Sensitive to the line size of the cache.
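Both sensitivities can be illustrated with a small simulation. This is a sketch under assumptions that are not in the slides: a fully associative cache with LRU replacement, word-granular addresses, and the hypothetical helper name `miss_count`.

```python
from collections import OrderedDict


def miss_count(addresses, num_lines, words_per_line):
    """Misses for a word-address trace in a fully associative LRU cache."""
    cache = OrderedDict()                  # keys are line numbers, ordered by recency
    misses = 0
    for addr in addresses:
        line = addr // words_per_line
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


if __name__ == "__main__":
    # Spatial locality: one sequential pass over 64 words, varying the line size.
    scan = list(range(64))
    print("1-word lines:", miss_count(scan, num_lines=8, words_per_line=1))
    print("4-word lines:", miss_count(scan, num_lines=8, words_per_line=4))

    # Temporal locality: four passes over 16 words, varying the cache size.
    loop = list(range(16)) * 4
    print("8 lines:", miss_count(loop, num_lines=8, words_per_line=1))
    print("16 lines:", miss_count(loop, num_lines=16, words_per_line=1))
```

The sequential scan drops from 64 misses to 16 when the line grows from one word to four, which is the spatial effect. The repeated loop drops from 64 misses to 16 once the cache is large enough to hold the whole working set, which is the temporal effect.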
ORGANIZATION OF CACHE
Figures: Dual Data Cache Organization; Organization of the Split Data Cache.
ORGANIZATION OF CACHE
➢ Dual Data Cache.
• Here the L1 cache is split into two separate caches, Spatial and Temporal.
• The line size affects the access time for spatial data.
• The line size is the same for the temporal and spatial caches.
• The default cache is the temporal cache.
➢ Organization of the Split Data Cache.
• Here only the temporal cache has two levels, T1C and T2C.
• Each line in the spatial cache is four words long.
• Each line in the temporal cache is one word long.
• On a cache miss, the line goes to the spatial cache by default.
• Two counters decide whether or not to allocate the line to the temporal cache.
ORGANIZATION OF CACHE
➢ SDC1 model
• This model is derived from the split data cache organization.
• Here, however, the T2C cache has a four-word line size.
• At run time it is decided whether a line should be transferred from the spatial
cache to the temporal (T2C) cache.
• A 128-line bus is used to transfer lines from the spatial cache to the temporal cache.
• Two counters, X and Y, are used to make this decision.
• The transfer follows the algorithm described next.
ORGANIZATION OF CACHE
➢ ALGORITHM.
• Two associated counters, X and Y, are used.
• The Y counter is incremented every time the line is accessed.
• The X counter is incremented every time a word from the upper half of the line is accessed.
• The X counter is decremented every time a word from the lower half of the line is accessed.
• If the threshold values for both counters are reached, the block is
transferred from the spatial cache to the temporal cache.
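The counter bookkeeping above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold values, the reading of "upper half" as words 2 and 3 of a four-word line, and the test on the absolute value of X are all assumptions, and the names `LineCounters` and `should_move_to_temporal` are hypothetical.

```python
from dataclasses import dataclass

WORDS_PER_LINE = 4  # four-word spatial lines, as stated for the SDC1 model
X_THRESHOLD = 2     # hypothetical value, not given in the slides
Y_THRESHOLD = 4     # hypothetical value, not given in the slides


@dataclass
class LineCounters:
    x: int = 0  # +1 per upper-half access, -1 per lower-half access
    y: int = 0  # +1 per access to the line

    def record_access(self, word_index: int) -> None:
        if not 0 <= word_index < WORDS_PER_LINE:
            raise ValueError("word index outside the line")
        self.y += 1
        # Assumption: words 2 and 3 form the upper half of the line.
        if word_index >= WORDS_PER_LINE // 2:
            self.x += 1
        else:
            self.x -= 1

    def should_move_to_temporal(self) -> bool:
        # Assumption: both thresholds count as reached when
        # Y >= Y_THRESHOLD and |X| >= X_THRESHOLD.
        return self.y >= Y_THRESHOLD and abs(self.x) >= X_THRESHOLD


if __name__ == "__main__":
    counters = LineCounters()
    for word in (3, 2, 3, 3):  # accesses concentrated in the upper half
        counters.record_access(word)
    print(counters, counters.should_move_to_temporal())
```

Under these assumptions, a line whose accesses alternate evenly between the two halves keeps X near zero and stays in the spatial cache, while a line whose accesses concentrate in one half becomes a candidate for the temporal cache.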
ORGANIZATION OF CACHE
➢ SDC2 MODEL.
• This model has a shared second-level cache.
• It is reported to have better bus utilization than the SDC1 model.
• The block size is the same in both first-level caches.
• The second-level cache is four times larger than the first level.
• It has a bus size of 128 lines.
CACHE BALANCER
➢ ACCESS RATE ALGORITHM.
• This algorithm determines the relative utilization of the cache banks.
• If a bank's access rate is high enough, it can be inferred that the bank is
accessed frequently.
• The access rate is defined as the ratio of the number of accesses made to a cache bank to the
total number of accesses made to the shared cache.
• Two thresholds, thot and tcold, are used as the measuring factors in the algorithm's loops.
• thot marks locations that are used very frequently, and tcold marks locations that are
not used frequently.
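A minimal sketch of the hot/cold classification follows. It assumes that the access rate is each bank's share of all shared-cache accesses and that thot and tcold act as simple thresholds on that share. The threshold values, the "neutral" label, and the function names are hypothetical and are not taken from the CacheBalancer paper.

```python
T_HOT = 0.40   # hypothetical threshold: at or above this share a bank is "hot"
T_COLD = 0.10  # hypothetical threshold: at or below this share a bank is "cold"


def access_rates(bank_accesses):
    """Map each bank to its share of all accesses made to the shared cache."""
    total = sum(bank_accesses.values())
    if total == 0:
        return {bank: 0.0 for bank in bank_accesses}
    return {bank: count / total for bank, count in bank_accesses.items()}


def classify_banks(bank_accesses, t_hot=T_HOT, t_cold=T_COLD):
    """Label each bank 'hot', 'cold', or 'neutral' from its access rate."""
    labels = {}
    for bank, rate in access_rates(bank_accesses).items():
        if rate >= t_hot:
            labels[bank] = "hot"
        elif rate <= t_cold:
            labels[bank] = "cold"
        else:
            labels[bank] = "neutral"
    return labels


if __name__ == "__main__":
    # Illustrative access counts per bank, not measured data.
    counts = {"bank0": 600, "bank1": 250, "bank2": 120, "bank3": 30}
    print(classify_banks(counts))
```

A balancer built on such labels could then steer new allocations away from hot banks and toward cold ones, which is one way to read the contention reduction reported for CacheBalancer.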
RESULTS
CONCLUSIONS.
➢ All the models were explained and their results were determined.
➢ Bus utilization is improved in the SDC2 model.
➢ With CacheBalancer, contention is reduced by up to 60% and execution time is
reduced by up to 22%.
➢ Cache utilization is also increased by 50% with CacheBalancer.
➢ Access time is lower for SDC2 than for SDC1.
REFERENCES
➢ CacheBalancer: Access Rate and Pain Based Resource Management for Chip
Multiprocessors, Jurrien de Klerk, Sumeet S. Kumar, Rene van Leuken.
➢ Two Management Approaches of the Split Data Cache in Multiprocessor Systems, J.
Sahuquillo, A. Pont.
➢ https://www.pcper.com/category/tags/processor?page=2.
➢ https://www.wikipedia.org/.