Revisiting Resource Partitioning for Multi-core Chips: Integration of Shared Resource Partitioning on a Commercial RTOS
21 Apr. 2017
PAK, EUNJI
Senior researcher, ETRI (Electronics and Telecommunications Research Institute)
CMAAS’2017
Agenda
• Qplus-AIR, a commercial RTOS
• Comprehensive shared resource partitioning implementation on Qplus-AIR
Qplus-AIR
ARINC 653 compliant RTOS, certifiable for DO-178B Level A
Introduction to Qplus-AIR
• Qplus-AIR
  - Developed by ETRI for safety-critical systems (2010~2012)
  - Main operating system for the IFCC (Integrated Flight Control Computer) of a UAV (Unmanned Aerial Vehicle), KAI
  - Integrates MC (Mission Control), FC (Flight Control), and C&C (Communications and Commands) in the IFCC
  - ARINC 653 compliant RTOS*
  - Robust partitioning among applications
    - Spatial and temporal
    - Prevents cross-application influence and error propagation among applications
  - Easy integration of multiple applications with different degrees of criticality
*Airlines Electronic Engineering Committee, Avionics Application Software Standard Interface, ARINC Specification 653 Part 1, 2006
Introduction to Qplus-AIR
• Qplus-AIR
  - Certifiable package for DO-178B Level A
  - Lightweight ARINC 653 support: kernel-level implementation
  - Support for multicore platforms (2014~)
• RTWORKS
  - A commercial version of Qplus-AIR
  - Managed by RTST (2013~), ETRI's spin-off company
  - Started with 4 developers and now has 11 OS developers
  - Work on AUTOSAR (automotive industry standard) and ISO 26262 ASIL D is in progress
• ETRI focuses on research issues while RTST focuses on commercialization
Application Examples
• Safety-critical industrial applications
  - Integrated flight control computer of an unmanned aerial vehicle, 2010~2012
  - Tiltrotor flight control computer, 2012
  - Nuclear power plant control system, 2013
  - HUMS (Health and Usage Monitoring System) for helicopters, 2013~2016
  - Subway screen-door control system, 2016 (exported to Brazil)
  - Communication system of self-propelled guns, 2017~
  - (project) Autonomous driving car, 2015~
Comprehensive shared resource partitioning implementation on Qplus-AIR
Contents
• Introduction
• HW platform: P4080
• Comprehensive resource partitioning implementation
  - Memory bus bandwidth partitioning
  - DRAM bank partitioning
  - Shared cache partitioning: set-based / way-based
• Combining all the techniques on Qplus-AIR
• Evaluations
• Conclusions & Future Work
Introduction [1/2]
• Robust partitioning among applications (partitions)
  - Qplus-AIR supports spatial and temporal partitioning
  - Ensures independent execution of multiple applications with various safety-criticality levels
• Robust partitioning may no longer be valid on multicore
  - Multiple cores share hardware resources such as cache or memory
  - Concurrently executing applications affect each other due to contention on shared resources
  - Major source of timing variability
  - Pessimistic WCET estimation → overprovisioning of hardware resources and low system utilization
  - In safety-critical systems, we had to turn off all but one core
Introduction [2/2]
• We must deal with resource contention properly
  - The WCET of tasks must stay guaranteed and tightly bounded
  - Especially for safety-critical applications that require certification
• Requirement of inter-core interference mitigation
  - “The applicant has identified the interference channels that could permit interference to affect the software applications hosted on the MCP cores, and has verified the applicant’s chosen means of mitigation of the interference.” - FAA CAST (Certification Authorities Software Team)-32A Position Paper*
• Comprehensive shared resource partitioning implementation on an ARINC 653 compliant RTOS
  - Integrates a number of resource partitioning schemes, each of which targets a different shared hardware resource, on Qplus-AIR
  - Unique challenges due to the fact that the RTOS did not support Linux-like dynamic paging
*Certification Authorities Software Team, Position Paper CAST-32A: Multi-core Processors, 2016.
HW platform, P4080 [1/2]
• P4080 architecture*
  - Eight PowerPC e500mc cores
  - Each core has a private 32 KB I / 32 KB D L1 cache and a 128 KB L2 cache
  - Two 32-way 1 MB L3 caches with cache-line interleaving
  - Two memory controllers for two 2 GB DDR DIMM modules (each DIMM module has 16 DRAM banks)
  - CoreNet coherency fabric: a high-bandwidth switch that interconnects the cores and other SoC modules and supports several concurrent transactions
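[Figure: P4080 block diagram. PowerPC e500mc cores (each with L1 caches, an L2 cache, and a CoreNet interface) connect through the CoreNet fabric to two L3 caches, each in front of a DDR controller and a DIMM module, and to peripherals such as DUART, GPIO, FMan, BMan, and QMan]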
*P4080 QorIQ Integrated Processor Hardware Specifications, Feb 2014.
HW platform, P4080 [2/2]
• Partitioning support of recent PowerPC processors*
Hardware Support for Robust Partitioning in Freescale QorIQ Multicore SoCs (P4080 and derivatives), Rev. 0
Overall partitioning model
Figure 4. Example of a Partitioned System
In this model, there are four distinct partitions, each running on two cores. The main memory is divided into several physical regions:
• Private
• Shared between partitions; accessible at user level
• Shared among partitions; restricted to hypervisor level
This mapping is enforced by the cores’ MMUs accessible only at the hypervisor level. System peripherals (PCIe and sRIO) in this example are not shared -- each is allocated to a partition usage. As such, the hypervisor is able to restrict their DMA-accessible memory range to some part of the memory region assigned to the partition through the MMU.
The shared internal memory (CPC) is partially partitioned, which provides two partition-specific sub-ranges.
NOTE: This CPC allocation can be done per-way. Each way is configured to work either as a cache or as a fixed-address sRAM.
1.7 Hypervisors
Several hypervisor technologies are proposed for the P4080 to address different purposes.
RTOS suppliers, such as GreenHills, SysGo and WindRiver, have developed their own hypervisor technology with particular focus on safety and robust partitioning.
*Hardware Support for Robust Partitioning in Freescale QorIQ Multicore SoCs (P4080 and derivatives)
Main memory is divided into several physical regions
• Private
• Shared between partitions; accessible at user level
• Shared among partitions; restricted to hypervisor level
This mapping is enforced by the cores' MMUs
System peripherals are not shared
• The hypervisor is able to restrict their DMA-accessible memory range to some part of the memory region (through the MMU)
CPC is partitioned
• Way partition (32 KB per way)
Each core is allocated to a partition
Restrict the coherency overhead
• Disable coherency, which prevents snoop overhead
• Specify a group participating in coherency
Resource partitioning mechanisms
• 1. Memory bus (interconnect) bandwidth partitioning
• 2. Memory bank partitioning
• Shared cache partitioning
  - 3. Set-based cache partitioning with page coloring
  - 4. Way-based cache partitioning with the support of P4080 hardware
• All the techniques are combined and integrated on Qplus-AIR
• Paging
  - Memory bank partitioning and set-based cache partitioning assume that the OS supports Linux-like paging
  - Paging implementation in Qplus-AIR
Resource partitioning mechanisms: Memory bus bandwidth regulator [1/2]
• Bus bandwidth regulator*
  - Limits the bandwidth usage per core
[Figure: bandwidth regulation for Core 1 and Core 2 sharing the memory bus (CoreNet fabric): 1) set a memory bus bandwidth budget; 2) count the number of requests sent to the memory bus; 3) generate an interrupt (Core 1 at 10/10, Core 2 at 3/10); 4) throttle the requests from Core 1]
*H. Yun, G. Yao, R. Pellizzoni, M. Caccamo, and L. Sha. Memory bandwidth management for efficient performance isolation in multi-core platforms. IEEE Transactions on Computers, 65:562–576, 2015.
Resource partitioning mechanisms: Memory bus bandwidth regulator [2/2]
• Implementation
  - Set up the budget and configure an interrupt to be generated when a core exhausts its budget
    - Configure the performance monitoring control registers and performance monitoring counters
  - The OS scheduler throttles further execution on that core
    - Implement an interrupt handler for the interrupt that the PMC generates
    - The scheduler de-schedules the tasks on the core
• Period of bandwidth regulator execution
  - If too short, the overhead becomes excessive; if too long, predictability worsens
  - The default period of our implementation is 5 ms
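The budget, count, interrupt, and throttle cycle can be sketched as a small simulation. This is a minimal model, not the Qplus-AIR code: the class `CoreRegulator`, the function `simulate`, and the request counts are hypothetical, and a plain counter stands in for the performance monitoring counter (PMC).

```python
from dataclasses import dataclass


@dataclass
class CoreRegulator:
    """Per-core budget state for one regulation period (hypothetical model)."""
    budget: int             # memory requests allowed per period
    used: int = 0           # requests counted so far (stands in for the PMC)
    throttled: bool = False

    def on_period_start(self) -> None:
        """Replenish the budget at the start of each regulation period."""
        self.used = 0
        self.throttled = False

    def on_memory_request(self) -> bool:
        """Return True if the request is issued, False if the core is throttled."""
        if self.throttled:
            return False
        self.used += 1
        if self.used >= self.budget:
            # Models the PMC interrupt: the handler asks the scheduler
            # to de-schedule the tasks on this core until the next period.
            self.throttled = True
        return True


def simulate(budgets, demands, periods=1):
    """Count the requests each core actually issues over the given periods."""
    cores = [CoreRegulator(b) for b in budgets]
    issued = [0] * len(cores)
    for _ in range(periods):
        for core in cores:
            core.on_period_start()
        for i, core in enumerate(cores):
            for _ in range(demands[i]):
                if core.on_memory_request():
                    issued[i] += 1
    return issued
```

With a budget of 10 requests per period, a core that wants 20 requests is cut off after 10, while a core that wants only 3 is unaffected; a shorter period replenishes sooner but runs the regulator more often.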
Resource partitioning mechanisms: Bank-aware memory allocation
• DRAM bank-aware memory allocation*
  - Manages memory allocation so that no application shares its memory banks with applications running on other cores
[Figure: 1) Application 1 (Core 1) and Application 2 (Core 2) request memory; 2) the OS allocates physical memory so that each application's virtual memory maps, through the page table (virtual-to-physical address translation) and the hardware MMU, to physical memory in its own DRAM bank (Bank 1 or Bank 2)]
*H. Yun, R. Mancuso, Z.-P. Wu, and R. Pellizzoni. PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In RTAS, 2014.
[Figure: P4080 memory address mapping, showing which physical address bits select DRAM banks, L3 cache sets, and L2 cache sets]
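A bank-aware allocator can be sketched as a free list of page frames grouped by DRAM bank, from which each core draws only frames in its own banks. This is an illustrative sketch: the class `BankAwareAllocator` and the helper `bank_of` are hypothetical, and the bank-select bit position (`BANK_SHIFT = 14`) is an assumption made for the example, not a value confirmed by the slides. Only the 16 banks per DIMM and the 4 KB page size come from the source.

```python
PAGE_SHIFT = 12   # 4 KB pages
BANK_SHIFT = 14   # ASSUMPTION: lowest bank-select address bit (illustrative only)
BANK_BITS = 4     # 16 DRAM banks per DIMM module (from the slides)


def bank_of(phys_addr: int) -> int:
    """Extract the DRAM bank index from a physical address."""
    return (phys_addr >> BANK_SHIFT) & ((1 << BANK_BITS) - 1)


class BankAwareAllocator:
    """Hands out page frames so that each core only touches its own banks."""

    def __init__(self, base: int, size: int, core_banks: dict):
        self.core_banks = core_banks          # core id -> set of bank indices
        self.free = {}                        # bank index -> list of free frames
        for addr in range(base, base + size, 1 << PAGE_SHIFT):
            self.free.setdefault(bank_of(addr), []).append(addr)

    def alloc_page(self, core: int) -> int:
        """Return the physical address of a free frame in one of the core's banks."""
        for bank in sorted(self.core_banks[core]):
            frames = self.free.get(bank)
            if frames:
                return frames.pop(0)
        raise MemoryError(f"no free page frame left in the banks of core {core}")
```

Because the bank bits sit above the 4 KB page offset in this model, choosing the physical frame is enough to choose the bank, which is why the technique depends on page-granular memory management.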
Resource partitioning mechanisms: Set-based cache partitioning [1/2]
• Set-based partitioning via page coloring*
  - Allocation of physical memory considering the cache set location
  - $\text{number of colors} = \dfrac{\text{cache size}}{\text{page size} \times \text{cache associativity}}$
[Figure: 1) Application 1 (Core 1) and Application 2 (Core 2) request memory; 2) the RTOS allocates physical memory so that each application maps to its own cache sets. The color bits are the L3 cache set index bits that overlap the physical page number]
*R. Mancuso, R. Dudko, E. Betti, M. Cesati, M. Caccamo, and R. Pellizzoni. Real-time cache management framework for multi-core architectures. In RTAS, 2013.
*M. Chisholm, B. C. Ward, N. Kim, and J. H. Anderson. Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In RTSS, 2015.
Resource partitioning mechanisms: Set-based cache partitioning [2/2]
• Implementation
  - Manipulates the virtual-to-physical address mapping to allocate disjoint cache sets to each core
  - Of the address bits [15:7] that form the cache set index, it exploits bits [15:12], which intersect with the physical page number on the P4080
• L2 co-partitioning & restrictions of set-based partitioning
  - Co-partitioning of the L2 cache
    - The L3 cache set partition (color) is determined by bits [15:12] and the L2 cache set by bits [13:6]
    - Using bits [13:12] has the side effect of co-partitioning the L2 cache
  - Only bits [15:14] are allowed for L3 cache set partitioning
    - The number of cache partitions is limited to 4
    - If we adopt it for 8 cores, some cache sets are inevitably shared by 2 cores
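The color arithmetic above can be checked with a short sketch. The bit ranges [15:12], [13:6], and [15:14] come from the slides; the function names are hypothetical, and treating the two cache-line-interleaved 1 MB L3 caches as one 2 MB cache is an assumption, chosen because it yields the 16 colors implied by the four bits [15:12].

```python
KB = 1024
MB = 1024 * KB


def num_colors(cache_size: int, page_size: int, associativity: int) -> int:
    """number of colors = cache size / (page size * cache associativity)."""
    return cache_size // (page_size * associativity)


def field(addr: int, lo: int, hi: int) -> int:
    """Extract address bits [hi:lo], inclusive."""
    return (addr >> lo) & ((1 << (hi - lo + 1)) - 1)


def l3_color(phys_addr: int) -> int:
    """Page color: the L3 set-index bits that overlap the page number, [15:12]."""
    return field(phys_addr, 12, 15)


def l3_partition(phys_addr: int) -> int:
    """Partition index when only bits [15:14] are used for L3 partitioning."""
    return field(phys_addr, 14, 15)


def usable_partition_bits(color_bits, l2_set_bits):
    """Color bits that do not overlap the L2 set index (avoids L2 co-partitioning)."""
    color = set(range(color_bits[0], color_bits[1] + 1))
    l2 = set(range(l2_set_bits[0], l2_set_bits[1] + 1))
    return sorted(color - l2)
```

Under these assumptions, `num_colors(2 * MB, 4 * KB, 32)` gives 16, and removing the bits shared with the L2 set index leaves bits 14 and 15, which is where the limit of 4 partitions comes from.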
Resource partitioning mechanisms: Way-based cache partitioning [1/2]
• Way-based partitioning with hardware-level support
  - Configure main memory with multiple distinct partitions
    - For each partition, register the (memory range, target, and partition ID) in the LAW (Local Access Window) register
  - Partition the L3 cache and allocate disjoint cache ways to each core
    - Configure the L3 cache (CPC) related registers so that transactions from the specified partition can allocate blocks in the designated cache ways
    - E.g., transactions from 'partition 1' allocate blocks in 'way 0, 1, 2, 3'
[Figure: way-based partitioning. Cores access memory through their MMUs and the CoreNet coherency fabric; the Local Access Windows divide physical memory (DDR3 DRAM) into Part. 1 to Part. 4 plus a shared region, and the CPC configuration register assigns each partition its own ways in the CPC (L3 cache)]
Resource partitioning mechanisms: Way-based cache partitioning [2/2]
• Relaxed restrictions on the number of cache partitions
  - With set-based cache partitioning, the number of cache partitions is restricted to at most four
  - The P4080 supports cache partitioning at per-way granularity, with each way providing 32 KB
    - The L3 cache is 32-way and can be partitioned into 32 parts
• Limitations of way-based cache partitioning
  - Way-based cache partitioning cannot be used with set-based cache partitioning or memory bank partitioning
    - Conflicting requirements on memory allocation
    - Sequential vs. interleaving
    - May be relevant to all other PowerPC chip models
  - Cache way locking allows integration
    - Most ARM processors support cache way locking
    - The PowerPC e500mc processor supports cache locking at block granularity
[Figure: sequential memory allocation (Part. 1 for core 1, then Part. 2 for core 2) vs. interleaved allocation (Part. 1, Part. 2, Part. 1, Part. 2, ...)]
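Way-based partitioning amounts to giving each partition a disjoint set of the 32 ways. The sketch below models that with bitmasks; it is not the CPC register layout. The function names are hypothetical, and mapping way i to bit i of the mask is an assumption made for readability (the real register bit order may differ). The 32 ways and 32 KB per way come from the slides.

```python
L3_WAYS = 32       # the L3 cache is 32-way (from the slides)
WAY_SIZE_KB = 32   # each way provides 32 KB (from the slides)


def allocate_ways(ways_per_partition):
    """Return one way mask per partition; bit i set means way i (ASSUMED order)."""
    masks = []
    next_way = 0
    for n in ways_per_partition:
        if n <= 0 or next_way + n > L3_WAYS:
            raise ValueError("requested ways do not fit in a 32-way cache")
        masks.append(((1 << n) - 1) << next_way)
        next_way += n
    return masks


def capacity_kb(mask: int) -> int:
    """Cache capacity covered by a way mask, in KB."""
    return bin(mask).count("1") * WAY_SIZE_KB


def are_disjoint(masks) -> bool:
    """True if no way is assigned to more than one partition."""
    seen = 0
    for m in masks:
        if seen & m:
            return False
        seen |= m
    return True
```

For example, `allocate_ways([4, 4, 4, 4])` gives partition 1 the ways 0 to 3 (mask `0xF`, 128 KB), matching the example on the previous slide, and up to 32 single-way partitions fit.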
Implementation issues from the perspective of an RTOS [1/4]
• Challenges: paging
  - Page coloring assumes that the OS manages memory with fixed-size pages (normally 4 KB)
  - Qplus-AIR deliberately avoids paging because a TLB miss within a paging scheme worsens timing predictability
• Memory management of Qplus-AIR
  - Managed with variable-sized pages rather than fixed 4 KB pages
    - Kernel data/code, partition regions
    - Manages each region as one large page: 1 TLB entry for each region
  - The OS locks the entry in the TLB, forcing all the mapping data to stay in the TLB
  - The size of memory for each application is configured by developers
  - The MMU is used to prevent cross-application memory accesses
[Figure: memory layout with regions for kernel data, kernel code, Partition 1, Partition 2, and Partition 3; example region sizes are 16 MB and 64 MB]
Implementation issues from the perspective of an RTOS [3/4]
• Memory management in the P4080
  - Two levels of MMU
    - Hardware-managed L1 MMU
    - Software-managed L2 MMU
  - Each MMU consists of
    - A TLB for variable-sized pages (VSP), with 11 different page sizes (4 KB~4 GB)
    - A TLB for 4 KB fixed-sized pages (FSP)
    - TLB locking for variable-sized pages
• Modified memory management of Qplus-AIR
  - To support page coloring, which is used to implement memory bank partitioning and set-based cache partitioning
  - Manages each application's memory regions with 4 KB granularity
  - Management of kernel regions is unchanged, to preserve the performance predictability of kernel execution
[ref.] PowerPC e500mc core reference manual
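The cost of moving from one large page per region to 4 KB pages shows up in the number of TLB entries needed. The sketch below assumes the 11 variable page sizes are powers of 4 starting at 4 KB, which is consistent with the stated range of 4 KB to 4 GB but is not spelled out in the slides; the function names are hypothetical.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def vsp_page_sizes():
    """ASSUMPTION: 11 variable page sizes, 4 KB * 4**k for k = 0..10."""
    return [4 * KB * 4 ** k for k in range(11)]


def tlb_entries_needed(region_size: int, page_size: int) -> int:
    """Number of TLB entries needed to map a region with pages of one size."""
    return -(-region_size // page_size)   # ceiling division
```

Under this assumption, the example region sizes of 16 MB and 64 MB are both valid single-page sizes, so a 64 MB partition needs 1 locked TLB entry with variable-sized pages but 16384 entries with 4 KB pages, far more than a TLB can hold at once.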
Implementation issues from the perspective of an RTOS [3/4]
• Overhead of paging
  - 'Latency' benchmark with varying data size and access pattern
    - Sequential access and random access of a linked list
  - Measures the average memory access latency
[Figure: paging overhead (sequential access): average memory latency vs. data size (KB, 0 to 10000), no paging vs. paging. Up to 6% overhead when data size > 2 MB. Note: TLB hit ratio = 98.43%; the L2 TLB has 512 entries]
[Figure: paging overhead (random access): average memory latency vs. data size (KB, 0 to 10000), no paging vs. paging. Up to 197% overhead when data size > 2 MB]
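The 2 MB threshold and the 98.43% TLB hit ratio are consistent with simple arithmetic on the 512-entry L2 TLB. The sketch below is a back-of-the-envelope model rather than a measurement: it assumes 4 KB pages, one 64 B block per access, and exactly one TLB miss per page touched once the data no longer fits in the TLB reach. The function names are hypothetical.

```python
def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Amount of memory that the TLB can map at once."""
    return entries * page_size


def sequential_tlb_hit_ratio(page_size: int, block_size: int) -> float:
    """Hit ratio when each page is missed once and then hit for its other blocks."""
    accesses_per_page = page_size // block_size
    return (accesses_per_page - 1) / accesses_per_page
```

With 512 entries of 4 KB the reach is exactly 2 MB, and sequential access over 64 B blocks gives 63 hits out of every 64 accesses, about 98.4%. Random access loses that per-page reuse, which fits the much larger overhead reported for it.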
Implementation issues from the perspective of an RTOS [4/4]
• Analysis of overhead
  - The degradation is due to the MMU architecture of the e500mc core
    - L1 instruction and data TLBs and a unified L2 TLB
    - The L1 MMU is controlled as an inclusive cache of the L2 MMU
    - In the PowerPC e6500 core, the L1 and L2 MMUs are not inclusive
• Requirements for predictable paging
  - Some studies have focused on predictable paging*
  - COTS hardware provides means for implementing predictable paging: a software-managed TLB or TLB locking
[Figure: as the data size increases, TLB entries for data evict (replace out) instruction TLB entries from the L2 TLB; the matching L1 I-TLB entries are then invalidated (inclusion property), causing L1 I-TLB misses even if the code size is within the L1 I-TLB coverage]
*D. Hardy and I. Puaut. Predictable code and data paging for real time systems. In ECRTS, 2008.
*T. Ishikawa, T. Kato, S. Honda, and H. Takada. Investigation and improvement on the impact of TLB misses in real-time systems. In OSPERT, 2013.
Resource partitioning mechanisms: Integration of partitioning schemes
• Four techniques with paging
  - Memory bus partitioning (RP_BUS), memory bank partitioning (RP_BANK), set-based cache partitioning (RP_$SET), and way-based cache partitioning (RP_$WAY)
• Integration of memory bus, memory bank, and set-based and way-based cache partitioning mechanisms
  - Note that way-based cache partitioning cannot be integrated with memory bank partitioning or set-based cache partitioning
• Possible integration options
  - Integration option #1: RP_BUS, RP_BANK, and RP_$SET
    - Restrictions on the number of available cache partitions
  - Integration option #2: RP_BUS and RP_$WAY
    - Contention on memory banks is unavoidable
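The compatibility rule behind the two options can be written as a small check. The mechanism names come from the slides; the `CONFLICTS` table and the function `is_valid_combination` are a hypothetical encoding of the stated rule that RP_$WAY cannot be combined with RP_BANK or RP_$SET.

```python
# Pairs of mechanisms that cannot be integrated, per the slides.
CONFLICTS = {
    frozenset({"RP_$WAY", "RP_BANK"}),
    frozenset({"RP_$WAY", "RP_$SET"}),
}


def is_valid_combination(mechanisms) -> bool:
    """True if no conflicting pair of mechanisms is enabled together."""
    enabled = set(mechanisms)
    return not any(pair <= enabled for pair in CONFLICTS)


OPTION_1 = {"RP_BUS", "RP_BANK", "RP_$SET"}
OPTION_2 = {"RP_BUS", "RP_$WAY"}
```

Both listed options pass the check, while any set that adds RP_$WAY to option #1 is rejected.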
Evaluations [1/5]
• Evaluation setup
  - Hardware platform
    - P4080 with 4 or 8 of its 8 cores active
  - Software platform
    - Qplus-AIR
  - Synthetic benchmarks
    - Latency: traverses a linked list, performing a read/write operation on each node; memory requests are made one at a time
    - Bandwidth: accesses memory sequentially with no data dependency between consecutive accesses, so the CPU generates multiple memory requests in parallel, maximizing the memory-level parallelism (MLP) available in the memory system
  - Metric
    - Average memory access latency (ns): time to read/write one block (64 B)
    - Average latency is normalized to the best case without resource contention
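The access pattern of the Latency benchmark can be illustrated with a pointer-chasing sketch, where each step depends on the previous load so that only one request is outstanding at a time. This is not the benchmark used in the evaluation: Python will not reproduce the timing behavior, the function names are hypothetical, and computing normalized performance as best-case latency divided by measured latency is an assumption (it yields values at or below 1, as in the result tables).

```python
import random


def build_chain(n_nodes: int, seed: int = 0):
    """Build a linked list as a 'next index' array forming one random cycle."""
    order = list(range(n_nodes))
    random.Random(seed).shuffle(order)
    nxt = [0] * n_nodes
    for i in range(n_nodes):
        nxt[order[i]] = order[(i + 1) % n_nodes]
    return nxt, order[0]


def traverse(nxt, start: int) -> int:
    """Walk the chain once; each load address depends on the previous load."""
    visited = 0
    node = start
    while True:
        visited += 1
        node = nxt[node]
        if node == start:
            break
    return visited


def normalized_performance(best_latency_ns: float, measured_latency_ns: float) -> float:
    """ASSUMED metric: 1.0 means no slowdown relative to the best case."""
    return best_latency_ns / measured_latency_ns
```

A Bandwidth-style kernel would instead index the array sequentially, so that consecutive accesses are independent and can overlap.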
Evaluations [2/5]
• Evaluation setup
  - Two benchmark mixes
    - 4-core MIX: causes contention on all the memory resources, to evaluate each partitioning mechanism and the integrated one
    - 8-core MIX: shows the limitation of set-based cache partitioning
  - Data size configuration

4-core MIX
  Core 1              Core 2, 3           Core 4
  Latency (512 KB)    Bandwidth (4 MB)    Bandwidth (32 MB)

8-core MIX
  Core 1, 2           Core 3, 4, 5, 6     Core 7, 8
  Latency (512 KB)    Bandwidth (4 MB)    Bandwidth (32 MB)

Data sizes (example platform: 2 MB LLC on a 4-core CPU)
  Data size     Description                                  Example                                                        Cache (LLC) hit rate
  LLC           Size of LLC divided by number of cores       2 MB / 4 cores = 512 KB                                        100%
  DRAM/small    Twice the size of LLC                        2 MB × 2 = 4 MB                                                0%
  DRAM/large    Significantly larger than LLC                Much larger than 2 MB (32 MB in our experimental setup)        0%
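The data size rules in the table reduce to two small formulas. The sketch below restates them; the function name `data_sizes` is hypothetical, and the DRAM/large size is passed in because the slides only say it is much larger than the LLC (32 MB in the setup).

```python
KB = 1024
MB = 1024 * KB


def data_sizes(llc_bytes: int, n_cores: int, large_bytes: int) -> dict:
    """Working-set sizes for the three benchmark classes in the table."""
    return {
        "LLC": llc_bytes // n_cores,   # fits in one core's share of the LLC
        "DRAM/small": 2 * llc_bytes,   # twice the LLC size
        "DRAM/large": large_bytes,     # much larger than the LLC
    }
```

For the example platform, `data_sizes(2 * MB, 4, 32 * MB)` gives 512 KB, 4 MB, and 32 MB, the three sizes used in the benchmark mixes.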
Evaluations [3/5]
• 4-core MIX, Integration option #1
  - RP_BANK, RP_$SET, and RP_BUS
  - (b) RP_BANK: all the cores are enabled to access banks in parallel
  - (c) Adding RP_$SET ensures 512 KB of L3 cache for the Latency (LLC) app running on core 1 (56% improvement compared to the worst case, from 0.41 to 0.97)
    - Moreover, core 1 issues fewer main-memory accesses, which helps performance on the other cores
  - (d) Adding RP_BUS: performance when all techniques are put together
[Figure: bar chart of normalized performance for core1 to core4 under (a) WORST, (b) RP_BANK, (c) RP_BANK+RP_$SET, (d) RP_BANK+RP_$SET+RP_BUS, (e) BEST; 1 is the performance w/o interference]

          (a)    (b)    (c)    (d)    (e)
  core1   0.41   0.55   0.97   0.97   1.00
  core2   0.49   0.57   0.62   0.78   1.00
  core3   0.50   0.57   0.62   0.79   1.00
  core4   0.93   0.87   0.87   0.85   1.00
Evaluations [4/5]
• 4-core MIX, Integration option #2
  - RP_$WAY and RP_BUS
  - RP_BANK is inapplicable
    - In this benchmark, memory accesses are not concentrated on one bank, since RP_$WAY allocates memory to each core sequentially
    - However, the worst case could arise depending on task behavior
  - RP_$WAY vs. RP_$SET
    - Paging overhead on the RTOS degrades performance
    - 3%, 16%, 17%, and 13% for the applications on core 1, 2, 3, and 4, respectively
[Figure: bar chart of normalized performance for core1 to core4 under (a) WORST, (b) RP_$WAY, (c) RP_$WAY+RP_BUS, (d) BEST, shown alongside the RP_BANK+RP_$SET result from option #1; 1 is the performance w/o interference]

          (a)    (b)    (c)    (d)
  core1   0.41   1.00   1.00   1.00
  core2   0.49   0.78   0.91   1.00
  core3   0.50   0.79   0.91   1.00
  core4   0.93   1.01   0.89   1.00
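One way to read the paging overhead figures is as the gap, in percentage points of normalized performance, between RP_$WAY (no paging needed) and RP_BANK+RP_$SET (paging required). That reading is an assumption, not something the slides state. The values below are copied from the two 4-core result tables, and the function name is hypothetical.

```python
# Normalized performance per core, copied from the 4-core result tables.
RP_WAY = {1: 1.00, 2: 0.78, 3: 0.79, 4: 1.01}
RP_BANK_SET = {1: 0.97, 2: 0.62, 3: 0.62, 4: 0.87}


def gap_in_points(a: dict, b: dict) -> dict:
    """Per-core difference a - b, in percentage points, rounded to integers."""
    return {core: round((a[core] - b[core]) * 100) for core in a}
```

Under this reading, cores 1 to 3 reproduce the reported 3%, 16%, and 17%. Core 4 comes out as 14 from the rounded table values rather than the reported 13%, which may simply reflect rounding in the published table.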
Evaluations [5/5]
• 8-core MIX, Integration #1
  - Restrictions on the number of possible cache partitions
    - RP_$SET: 4 partitions; RP_$WAY: 32 partitions on the P4080 platform
    - Performance of Latency (LLC) is about 64% and 88% with RP_$SET and RP_$WAY, respectively
  - Overhead of paging
    - Compare the performance in (b) and (d), or (c) and (e)
[Figure: bar chart of normalized performance for core1 to core8 under (a) WORST, (b) RP_BANK+RP_$SET, (c) RP_BANK+RP_$SET+RP_BUS, (d) RP_$WAY, (e) RP_$WAY+RP_BUS, and BEST; 1 is the performance w/o interference]

          (a)    (b)    (c)    (d)    (e)    (f)
  core1   0.37   0.64   0.64   0.88   0.87   1.00
  core2   0.37   0.64   0.63   0.88   0.86   1.00
  core3   0.30   0.42   0.54   0.52   0.71   1.00
  core4   0.30   0.42   0.54   0.52   0.71   1.00
  core5   0.30   0.42   0.54   0.53   0.71   1.00
  core6   0.30   0.42   0.54   0.53   0.71   1.00
  core7   0.82   0.75   0.74   0.94   0.79   1.00
  core8   0.82   0.74   0.73   0.94   0.79   1.00
Conclusions & Future Work
• Conclusions
  - Qplus-AIR, an ARINC 653 compliant RTOS
  - Comprehensive shared resource partitioning implementation on an ARINC 653 compliant RTOS, Qplus-AIR
    - Issues in implementing and combining multiple resource partitioning mechanisms
    - The unique challenges we encountered because the RTOS did not support Linux-like dynamic paging
• Future Work
  - Predictable paging
  - Evaluation with real-world applications
References [1/2]
[1] Airlines Electronic Engineering Committee, Avionics Application Software Standard Interface, ARINC Specification 653 Part 1, 2006.
[2] BIOS and kernel developer's guide for AMD family 15h processors, March 2012.
[3] ARM Cortex-A53 Technical Reference Manual, 2014.
[4] P4080 QorIQ Integrated Processor Hardware Specifications, Feb 2014.
[5] Certification Authorities Software Team, Position Paper CAST-32A: Multi-core Processors, 2016.
[6] QorIQ T2080 Reference Manual, 2016.
[7] M. Chisholm, B. C. Ward, N. Kim, and J. H. Anderson. Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In RTSS, 2015.
[8] J. Flodin, K. Lampka, and W. Yi. Dynamic budgeting for settling DRAM contention of co-running hard and soft real-time tasks. In SIES, 2014.
[9] D. Hardy and I. Puaut. Predictable code and data paging for real time systems. In ECRTS, 2008.
[10] T. Ishikawa, T. Kato, S. Honda, and H. Takada. Investigation and improvement on the impact of TLB misses in real-time systems. In OSPERT, 2013.
[11] H. Kim, A. Kandhalu, and R. Rajkumar. A coordinated approach for practical OS-level cache management in multi-core real-time systems. In ECRTS, 2013.
[12] T. Kim, D. Son, C. Shin, S. Park, D. Lim, H. Lee, B. Kim, and C. Lim. Qplus-AIR: A DO-178B certifiable ARINC 653 RTOS. In The 8th ISET, 2013.
References [2/2]
[13] R. Mancuso, R. Dudko, E. Betti, M. Cesati, M. Caccamo, and R. Pellizzoni. Real-time cache management framework for multi-core architectures. In RTAS, 2013.
[14] M. D. Bennett and N. C. Audsley. Predictable and efficient virtual addressing for safety-critical real-time systems. In ECRTS, 2001.
[15] J. Nowotsch and M. Paulitsch. Leveraging multi-core computing architectures in avionics. In EDCC, 2012.
[16] J. Nowotsch, M. Paulitsch, D. Buhler, H. Theiling, S. Wegener, and M. Schmidt. Multi-core interference-sensitive WCET analysis leveraging runtime resource capacity enforcement. In ECRTS, 2014.
[17] S. A. Panchamukhi and F. Mueller. Providing task isolation via TLB coloring. In RTAS, 2015.
[18] M. K. Qureshi and Y. N. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In MICRO, 2006.
[19] R. E. Kessler and M. D. Hill. Page replacement algorithms for large real-indexed caches. In ACM Trans. on Comp. Sys., 1992.
[20] L. Sha, M. Caccamo, R. Mancuso, J.-E. Kim, and M.-K. Yoon. Single core equivalent virtual machines for hard real-time computing on multicore processors, white paper. 2014.
[21] N. Suzuki, H. Kim, D. de Niz, B. Anderson, L. Wrage, M. Klein, and R. Rajkumar. Coordinated bank and cache coloring for temporal protection of memory accesses. In ICCSE, 2013.
[22] H. Yun, R. Mancuso, Z.-P. Wu, and R. Pellizzoni. PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In RTAS, 2014.
[23] H. Yun, G. Yao, R. Pellizzoni, M. Caccamo, and L. Sha. Memory bandwidth management for efficient performance isolation in multi-core platforms. IEEE Transactions on Computers, 65:562–576, 2015.