risc-vharry/riscv/riscv-summary.pdf · an overview of the instruction set architecture harry h....

323
RISC-V: An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University [email protected] January 26, 2018 The RISC-V project de2ines and describes a standardized Instruction Set Architecture (ISA). RISC-V is an open-source speci2ication for computer processor architectures, not a particular chip or implementation. To date, several different groups have designed and fabricated silicon implementations of the RISC-V speci2ications. Based on the performance of these implementations and the growing need for interoperability among vendors, it appears that the RISC-V standard will increase in importance. This document introduces and explains the RISC-V standard, giving an informal overview of the architecture of a RISC-V processor. This document does not discuss a particular implementation; instead we discuss some of the various design options that are included in the formal RISC-V speci2ication. This document gives the reader an initial introduction to the RISC-V design. Date Last Modi=ied: January 26, 2018 Available Online: www.cs.pdx.edu/~harry/riscv

Upload: others

Post on 27-May-2020

34 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

RISC-V:AnOverviewofthe

InstructionSetArchitecture

HarryH.PorterIIIPortlandStateUniversity

[email protected]

January26,2018

TheRISC-Vprojectde2inesanddescribesastandardizedInstructionSetArchitecture(ISA).RISC-Visanopen-sourcespeci2icationforcomputerprocessorarchitectures,notaparticularchiporimplementation.Todate,severaldifferentgroupshavedesignedandfabricatedsiliconimplementationsoftheRISC-Vspeci2ications.Basedontheperformanceoftheseimplementationsandthegrowingneedforinteroperabilityamongvendors,itappearsthattheRISC-Vstandardwillincreaseinimportance.

ThisdocumentintroducesandexplainstheRISC-Vstandard,givinganinformaloverviewofthearchitectureofaRISC-Vprocessor.Thisdocumentdoesnotdiscussaparticularimplementation;insteadwediscusssomeofthevariousdesignoptionsthatareincludedintheformalRISC-Vspeci2ication.

ThisdocumentgivesthereaderaninitialintroductiontotheRISC-Vdesign.

DateLastModi=ied:January26,2018 AvailableOnline: www.cs.pdx.edu/~harry/riscv

Page 2: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

TableofContentsListofInstructions 6

Chapter1:Introduction 11Introduction 11TheRISC-VNamingConventions 13TheUsualDisclaimer/RequestForCorrections 15DocumentRevisionHistory/PermissiontoCopy 16BasicTerminology 17

Chapter2:BasicOrganization 20MainMemory 20LittleEndian 20TheRegisters 23ControlandStatusRegisters(CSRs) 25Alignment 26

Chapter3:Instructions 28Instructions 28TheCompressedInstructionExtension 28InstructionEncoding 31User-LevelInstructions 36ArithmeticInstructions(ADD,SUB,…) 38LogicalInstructions(AND,OR,XOR,…) 49ShiftingInstructions(SLL,SRL,SRA,…) 52MiscellaneousInstructions 57BranchandJumpInstructions 65LoadandStoreInstructions 78IntegerMultiplyandDivide 89

Chapter4:FloatingPointInstructions 103FloatingPointing–ReviewofBasicConcepts 103TheFloatingPointExtensions 119FloatingPointLoadandStoreInstructions 124BasicFloatingPointArithmeticInstructions 127FloatingPointConversionInstructions 133

RISC-VArchitectureSummary/Porter Page� of�2 323

Page 3: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

TableofContents

FloatingPointMoveInstructions 137FloatingPointCompareandClassifyInstructions 142

Chapter5:RegisterandCallingConventions 146StandardUsageofGeneralPurposeRegisters 146SavingRegistersAcrossCalls 148Registerx0-TheZeroRegister(“zero”) 150Registerx1-TheReturnAddress(“ra”) 150Registerx2-TheStackPointer(“sp”) 151Registerx3-TheGlobalPointer(“gp”) 152Registerx4-TheThreadBasePointer(“tp”) 152Registerx5-x7,x28-x31-TempRegisters(“t0-t6”) 153Registerx8,x9,x18-x27-SavedRegisters(“s0-s11”) 154Registerx8-FramePointer(“fp”) 154Registerx10-x17-ArgumentRegisters(“a0-a7”) 155CallingConventionsforArguments 156FloatingPointRegisters 158

Chapter6:CompressedInstructions 160CompressedInstructions 160InstructionFormats 161LoadInstructions 162StoreInstructions 168Jump,Call,andConditionalBranchInstructions 175LoadConstantintoaRegister 178Arithmetic,Shift,andLogicInstructions 180MiscellaneousInstructions 189InstructionEncoding 191

Chapter7:ConcurrencyandAtomicInstructions 195HardwareThreads(HARTs) 195TheRISC-VMemoryModel 196ConcurrencyControl-ReviewandBackground 198AReviewofCachingConcepts 200RISC-VConcurrencyControl 205TheFENCEInstructions 205Load-Reserved/Store-ConditionalSemantics 210

RISC-VArchitectureSummary/Porter Page� of�3 323

Page 4: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

TableofContents

AtomicMemoryControlBits:“aq”and“rl” 213UsingLRandSCInstructions 217DeadlockandStarvation–Background 219StarvationandLR/SCSequences 220TheAtomicMemoryOperation(AMO)Instructions 221

Chapter8:PrivilegeModes 227PrivilegeModes 227InterfaceTerminology 230Exceptions,Interrupts,TrapsandTrapHandlers 234ControlandStatusRegisters(CSRs) 235CSRListing 237InstructionstoRead/WritetheCSRs 241BasicCSRs:CYCLE,TIME,INSTRET 246CSRRegisterMirroring 249AnOverviewofImportantCSRs 252ExceptionProcessingandInvokingaTrapHandler 257TheStatusRegister 268AdditionalCSRs 280SystemCallandReturnInstructions 284

Chapter9:VirtualMemory 288PhysicalMemoryAttributes(PMAs) 288CachePMAs 291PMAEnforcement 291VirtualMemory 296Sv32(Two-LevelPageTables) 302TheSv32PageTableAlgorithm 309Sv39(Three-LevelPageTables) 311Sv48(Four-LevelPageTables) 313

Chapter10:WhatisNotCovered 316TheDecimalFloatingPointExtension(“L”) 316TheBitManipulationExtension(“B”) 316TheDynamicallyTranslatedLanguagesExtension(“J”) 316TheTransactionalMemoryExtension(“T”) 316ThePackedSIMDExtension(“P”) 317

RISC-VArchitectureSummary/Porter Page� of�4 323

Page 5: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

TableofContents

TheVectorSIMDExtension(“V”) 317PerformanceMonitoring 317Debug/TraceMode 318

AcronymList 319

AbouttheAuthor 323

RISC-VArchitectureSummary/Porter Page� of�5 323

Page 6: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

ListofInstructions

AddImmediate 38AddImmediateWord 39AddImmediateDouble 40Add 41AddWord 42AddDouble 42Subtract 45SubtractWord 46SubtractDouble 46SignExtendWordtoDoubleword 47SignExtendDoublewordtoQuadword 47Negate 48NegateWord 48NegateDoubleword 49AndImmediate 49And 49OrImmediate 50Or 50XorImmediate 50Xor 51Not 51ShiftLeftLogicalImmediate 52ShiftLeftLogical 52ShiftRightLogicalImmediate 53ShiftRightLogical 53ShiftRightArithmeticImmediate 54ShiftRightArithmetic 54ShiftInstructionsforRV64andRV128 56Nop 57Move(RegistertoRegister) 57LoadUpperImmediate 58LoadImmediate 58AddUpperImmediatetoPC 59LoadAddress 60SetIfLessThan(Signed) 61SetLessThanImmediate(Signed) 61SetIfGreaterThan(Signed) 62

RISC-VArchitectureSummary/Porter Page� of�6 323

Page 7: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

ListofInstructions

SetIfLessThan(Unsigned) 62SetLessThanImmediate(Unsigned) 62SetIfGreaterThan(Unsigned) 63SetIfEqualToZero 63SetIfNotEqualToZero 64SetIfLessThanZero(signed) 64SetIfGreaterThanZero(signed) 65BranchifEqual 65BranchifNotEqual 67BranchifLessThan(Signed) 67BranchifLessThanOrEqual(Signed) 67BranchifGreaterThan(Signed) 68BranchifGreaterThanOrEqual(Signed) 68BranchifLessThan(Unsigned) 69BranchifLessThanOrEqual(Unsigned) 69BranchifGreaterThan(Unsigned) 70BranchifGreaterThanOrEqual(Unsigned) 70JumpAndLink(Short-DistanceCALL) 71Jump(Short-Distance) 73JumpAndLinkRegister 73JumpRegister 74Return 74CallFarawaySubroutine 75TailCall(FarawaySubroutine)/Long-DistanceJump 77LoadByte(Signed) 78LoadByte(Unsigned) 78LoadHalfword(Signed) 79LoadHalfword(Unsigned) 79LoadWord(Signed) 80LoadWord(Unsigned) 81LoadDoubleword(Signed) 81LoadDoubleword(Unsigned) 82LoadQuadword 83StoreByte 83StoreHalfword 84StoreWord 85StoreDoubleword 85StoreQuadword 86Multiply 89Multiply–HighBits(Signed) 91

RISC-VArchitectureSummary/Porter Page� of� 7 323

Page 8: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

ListofInstructions

Multiply–HighBits(Unsigned) 91Multiply–HighBits(SignedandUnsigned) 92MultiplyWord 94Divide(Signed) 98Divide(Unsigned) 98Remainder(Signed) 99Remainder(Unsigned) 99DivideWord(Signed) 100DivideWord(Unsigned) 100RemainderWord(Signed) 101RemainderWord(Unsigned) 101FloatingLoad(Word) 124FloatingLoad(Double) 125FloatingLoad(Quad) 125FloatingStore(Word) 126FloatingStore(Double) 126FloatingStore(Quad) 127FloatingAdd 127FloatingSubtract 128FloatingMultiply 129FloatingDivide 129FloatingMinimum 130FloatingMaximum 130FloatingSquareRoot 131FloatingFusedMultiply-Add 132FloatingFusedMultiply-Subtract 132FloatingFusedMultiply-Add/Subtract(Quad) 133FloatingPointConversion(Integerto/fromSinglePrecision) 134FloatingPointConversion(Integerto/fromDoublePrecision) 135FloatingPointConversion(Integerto/fromQuadPrecision) 136FloatingPointConversion(Single/Doubleto/fromQuadPrecision) 136FloatingSignInjection(SinglePrecision) 137FloatingSignInjection(DoublePrecision) 138FloatingSignInjection(DoublePrecision) 138FloatingMove 139FloatingNegate 139FloatingAbsoluteValue 140FloatingMoveTo/FromIntegerRegister(SinglePrecision) 140FloatingMoveTo/FromIntegerRegister(DoublePrecision) 141FloatingPointComparison 142

RISC-VArchitectureSummary/Porter Page� of� 8 323

Page 9: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

ListofInstructions

FloatingPointClassify 144C.LW-LoadWord 162C.LW-LoadDoubleword 162C.LQ-LoadQuadword 163C.FLW-LoadSingleFloat 164C.FLD-LoadDoubleFloat 164C.LWSP-LoadWordfromStackFrame 165C.LDSP-LoadDoublewordfromStackFrame 166C.LQSP-LoadQuadwordfromStackFrame 166C.FLWSP-LoadSingleFloatfromStackFrame 167C.FLDSP-LoadDoubleFloatfromStackFrame 168C.SW-StoreWord 168C.SD-StoreDoubleword 169C.SQ-StoreQuadword 170C.FSW-StoreSingleFloat 170C.FSD-StoreDoubleFloat 171C.SWSP-StoreWordtoStackFrame 172C.SDSP-StoreDoublewordtoStackFrame 172C.SQSP-StoreQuadwordtoStackFrame 173C.FSWSP-StoreSingleFloattoStackFrame 173C.FSDSP-StoreDoubleFloattoStackFrame 174C.J-Jump(PC-relative) 175C.JAL-JumpandLink/Call(PC-relative) 175C.JR-JumpRegister 176C.JALR-JumpRegisterandLink/Call 176C.BEQZ-BranchifZero(PC-relative) 177C.BNEQZ-BranchifNotZero(PC-relative) 177C.LI-LoadImmediate 178C.LUI-LoadUpperImmediate 179C.ADDI-AddImmediate 180C.ADDIW-AddImmediate(Word) 180C.ADDI16SP-AddImmediatetoStackPointer 181C.ADDI4SPN-AddImmediatetoStackPointer 182C.SLLI-ShiftLeftLogical(Immediate) 182C.SRLI-ShiftRightLogical(Immediate) 183C.SRAI-ShiftRightArithmetic(Immediate) 184C.ANDI-LogicalAND(Immediate) 185C.MV-MoveRegistertoRegister 185C.ADD-AddRegistertoRegister 186C.AND-AddRegistertoRegister 186

RISC-VArchitectureSummary/Porter Page� of� 9 323

Page 10: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

ListofInstructions

C.OR-AddRegistertoRegister 187C.XOR-AddRegistertoRegister 187C.SUB-AddRegistertoRegister 188C.ADDW-AddRegistertoRegister 188C.SUBW-AddRegistertoRegister 189C.NOP-NopInstruction 189C.EBREAK-DebuggingBreakInstruction 190C.ILLEGAL-IllegalInstruction 190FENCE 207FENCE.I–(InstructionCacheFlushing) 208LoadReserved(Word) 211LoadReserved(Doubleword) 212StoreConditional(Word) 212StoreConditional(Doubleword) 213AtomicSwap 221AtomicAdd/AND/OR/XOR/Max/Min(Word) 223AtomicAdd/AND/OR/XOR/Max/Min(Doubleword) 224CSRRead/Write 242CSRReadandSetBits 243CSRReadandClearBits 243CSRRead/WriteImmediate 244CSRReadandSetBitsImmediate 245CSRReadandClearBitsImmediate 245Read“CYCLE” 247Read“CYCLEH” 247Read“TIME” 247Read“TIMEH” 248Read“INSTRET” 248Read“INSTRETH” 248EnvironmentCall(SystemCall) 284Break(InvokeDebugger) 285MRET:MachineModeTrapHandlerReturn 285SRET:SupervisorModeTrapHandlerReturn 285URET:UserModeTrapHandlerReturn 286WFI:WaitForInterrupt 286SFENCE.VMA:SupervisorFenceforVirtualMemory 301

RISC-VArchitectureSummary/Porter Page� of� 10 323

Page 11: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

Introduction

AnInstructionSetArchitecture(ISA)de2ines,describes,andspeci2ieshowaparticularcomputerprocessorcoreworks.TheISAdescribestheregistersanddescribeseachmachine-levelinstruction.TheISAtellsexactlywhateachinstructiondoesandhowitisencodedintobits.

TheISAformstheinterfacebetweenhardwareandsoftware.HardwareengineersdesigndigitalcircuitstoimplementagivenISAspeci2ication.Softwareengineerswritecode(operatingsystems,compilers,etc.)basedonagivenISAspeci2ication.

ThereareanumberofInstructionSetArchitecturesinwidespreaduse,forexample:

x86-64(AMD,Intel) ARM(ARMHoldings) SPARC(Sun/Oracle)

EachoftheseISAsisproprietaryandverycomplex.ThedetailsareoftenobscuredinlengthymanualsandsomedetailsoftheISAarenotmadepublicatall.Furthermore,thewidelyusedISAshavebeenaroundforyearsandtheirdesignscarrybaggageasaresult,e.g.,forbackwardcompatibility.Sincetheselegacydesignswere2irstcreated,we’velearnedmoreabouthowtodesigncomputers.Changesinsiliconhardwaretechnologyhavealsohadanimpactonwhichdesignchoicesarenowoptimal.

TheRISC-VprojectcameoutofUCBerkeleytoaddresssomeoftheseissues.RISC-Vispronounced“RISC-2ive”.

OnegoalwastocreateamodernISAincorporatingthebestcurrentideasinprocessordesign.TheystrovetocreateanISAthatwasmuchsimplerthanthelegacyISAs,butatthesametime,wasalsopracticalandintendedtoaccommodatereallyfasthardwareimplementations.

RISC-VArchitectureSummary/Porter Page� of�11 323

Page 12: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

AnothergoalwastocreateapureReducedInstructionSet(RISC)architecture.Thegoalistobeabletoexecuteoneinstructionperclockcycleandtoachievethis,eachinstructionneedstobesimpleandlimited.

Anothergoalwastocreateanopen-sourceISA.ExistingISAsareproprietary.Theyareowned,managed,andcontrolledbycorporateentitieslikeIntel,SunMicrosystems,andARMHoldings.Theopen-sourceapproachtakenbyRISC-VmeansthatmanydifferentcompaniescanprovidehardwareimplementationsoftheRISC-Varchitecture.CreatinganecosysteminwhichmultiplevendorscancompeteinimplementingasingleISAshouldresultinmanyofthebene2itsseeninotheropen-sourceprojects.

TheRISC-Vdesignisnotasingle,fullyspeci2ied,concreteISA.Instead,RISC-Visasomewhatgeneralizedspeci2icationwhichcanbeinstantiatedor2leshed-outtodescribeanactualsiliconpart.

Thedesignersunderstoodthattherearemanydifferentmarkets,manydifferentapplicationareasforcomputers,manydifferentdesignconstraints,andsoon.Forexample,anembeddedcomputerforadishwasherneedstobecheap,reliable,andsimple,butdoesn’trequirespeed,supportforanoperatingsystem,multiplecores,orsupportfor64-bitoperations.Ontheotherhand,othercomputerswillhavemultiplecores,64-bitoperations,etc.

TheRISC-VprojectapproachesthisplethoraofdesignchoicesbyintroducinganumberofoptionsintotheISA.Inthisrespect,RISC-VisreallyasingleInstructionSetArchitecture;itisacollectionofrelatedISAs.Perhapsitcanbeviewedasamenu,fromwhichaparticularimplementationwillchoosesomeitems,butnotothers.ThedocumentationprovidedbyRISC-Visincomplete;anactualprocessorcorewillpresumablycomewithdocumentationsayingexactlyhowandwhichpartsoftheRISC-Vspeci2icationitimplements.

Forexample,theRISC-Vspeci2icationdoesnotanswerthesequestions:

Howwidearetheregisters?Optionsare:32bits,64bits,and128bits

Howmanyregistersarethere? Optionsare:16or32 Howlongareinstructions? 32bitinstructionsaremandatory;16bitformsareoptional

RISC-VArchitectureSummary/Porter Page� of� 12 323

Page 13: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

Ishardwaremultiply/divideincluded? Is9loatingpointsupported?

Optionsare:notsupported,single,ordoubleprecision Issupportforanoperatingsystem(i.e.,SupervisorMode)included? Arepagetablessupportedand,ifso,whichoptionisselected?

Asaresultofallthisparameterization,theRISC-Vspeci2icationisdif2iculttoread.Forexample,thelengthofregistersisgivenasavariableXLEN.Soinsteadofsayingsomethinglike

“…loadsa32bitvalueintobits[31:0]”thedocumentationsays

“…loadsanXLENbitvalueintobits[XLEN-1:0]”.

ThisdocumentisanattempttodescribeRISC-Vusingamoretraditionalapproach.WeproceedbychoosingaparticularISAthatmeetstheRISC-Vspeci2icationsanddescribethat.Tomakethingsmoreconcrete,wewilldescribethe32bitRISC-Vvariant,butmakeadditionalcommentsaboutthe64-bitand128-bitvariants.

TheRISC-VNamingConventions

Aspreviouslydescribed,theRISC-Vspeci2icationisnotasingleISA.Instead,itisacollectionofISAoptions.AnamingconventionisusedinwhichaparticularISAvariationisgivenacodednametellingwhichISAoptionsarepresentandsupported.Aparticularhardware(chip)canbedescribedorsummarizedwithsuchacodedname,indicatingwhichRISC-Vfeaturesareimplementedbythechip.

Forexample,considerthefollowingRISC-Vname:

RV32IMAFD

The“RV”standsfor“RISC-V”andallcodednamesbeginwith“RV”.

The“32”indicatesthatregistersare32bitswide.Otheroptionsare64bitsand128bits:

RV32 32-bitmachines RV64 64-bitmachines RV128 128-bitmachines

RISC-VArchitectureSummary/Porter Page� of� 13 323

Page 14: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

Theremaininglettershavethesemeanings:

I–Basicintegerarithmeticissupported M–Multiplyanddividearesupportedinhardware A–Theinstructionsimplementingatomicsynchronizationaresupported F–Singleprecision(32bit)2loatingpointissupported D–Doubleprecision(64bit)2loatingpointissupported

Eachoftheseisconsideredtobean“extension”ofthebaseISA,exceptforthe“I”(basicintegerinstructions),whichisalwaysrequired.

Theletter“G”isusedasanabbreviationfor“IMAFD”:

RV32G=RV32IMAFD

Inaddition,thereareadditionalvariantsandextensions:

S–Supervisormodeisimplemented Q–Quad-precision(128bit)2loatingpointissupported C–Compressed(i.e.,16bit)instructionsaresupported E–Embeddedmicroprocessors,withonly16registers

TheRISC-VdocumentationalsomentionsseveraladditionalISAdesignextensions,butonlysaystheseinstructionswillbespeci2iedatsomefuturedate.Presently,thereisnothingspeci2icforthefollowingextensions:

L–Decimalarithmeticinstructions V–Vectorarithmeticinstructions P–PackedSIMDinstructions B–Bitmanipulationinstructions T–Transactionalmemorysupport

TheapproachwetakeinthisdocumentistodescribeaRISC-Varchitecturewiththefollowingfeatures:

•32registers •Registersare32bitswide •Multiplyanddivideinstructionsarepresent •Singleanddoubleprecision2loatingpointinstructionsarepresent

RISC-VArchitectureSummary/Porter Page� of� 14 323

Page 15: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

•Atomicinstructionsarepresent •Supervisormodeissupported •Virtualmemoryissupported

Ratherthanspecifyallvariantssimultaneouslyusinggeneralities,wewillfocusonaspeci2icISAandthencommentonpossiblevariations.

TheRISC-VdocumentationprovidesanISA“standard”whichcanbeadoptedandusedfreelybydifferentgroups.Thestandardleavessomedecisionsopen,givingimplementersseveralchoices.Forexample,theimplementerisfreetochoosethesizeoftheregisters;thestandardmentions32-bits,64-bits,and128-bitsbutdoesnotmandateaparticularchoice.

Inotherareas,thestandardisun2inished.Forexample,mentionismadeofdecimal2loatingpointinstructions,butthesectiondiscussingthemsays“tobe2illedinlater”.

Also,theRISC-Vdocumentationacknowledgesthatsomeimplementerswillchoosetoviolatethestandardorextendthestandard.Theyusetheterminology“non-standardextensions”torefertofeaturesthatmightbepresentinagivenchip,butwhichdonotconformtotheRISC-Vstandard.

Commentary:TheRISC-Vdocumentationcontainsanumberofenlighteningparentheticalremarksdescribingthevariousdesignchoicestheyconsideredandofferingjusti2icationsforthedesigndecisionstheymade.ThislevelofthoughtfulnessisabsentinmostextantISAs.Insomecases,theRISC-VdocumentationseemstoaddressthedesignspaceitselfandoffersalanguageandframeworkforfutureISAdevelopment.

TheUsualDisclaimer/RequestForCorrections

Thisisaderivativework,basedon:

• TheRISC-VInstructionSetManual,VolumeI:User-LevelISADocument(Version2.2,May7,2017)

• TheRISC-VInstructionSetManual,VolumeII:PrivilegedArchitecture(Version1.10,May7,2017)

Consulttheof2icialdocumentation,whichprevails.

RISC-VArchitectureSummary/Porter Page� of� 15 323

Page 16: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

TheRISC-Viscomplex.Thisdocumenttakeslibertiesandsimpli2iesthings.Ourgoalistointroducethegeneraldesignandexplainthemainideas.Whilethismaylooklikeamanualorreferencework,itisnot.Tomakethismaterialcomprehensible…

•Somedetailsaresimpli2ied. •Somestatementsarenotstrictlytrueorcomplete. •Somematerialmaysimplybeincorrect.

Thisdocumentuses“???”tomarkincompleteorquestionableinformation.Pleasecontacttheauthorifyou2ind…

•Inaccurateinformationthatyoucancorrect •Incompleteinformationthatyoucan2illin •Confusingtextthatneedstobereworded

DocumentRevisionHistory/PermissiontoCopy

Versionnumbersarenotusedtoidentifyrevisionstothisdocument.Insteadthedateandtheauthor’snameisused.Thedocumenthistoryis:

Date AuthorJanuary26,2018 HarryH.PorterIII<Initialversion>

InthespiritofRISC-Vandtheopen-sourcemovement,theauthorgrantspermissiontofreelycopyand/ormodifythisdocument,withthefollowingrequirement:

Youmustnotalterthissection,excepttoaddtotherevisionhistory.Youmustappendyourdate/nametotherevisionhistory.

Youarealsofreetoadaptthisdocumenttodescribeaparticularimplementation.Ifyouspecializethisdocumenttoaspeci2ichardwaredesign,youshouldchangethetitletore2lectthisnewuse.Anymaterialliftedshouldbereferenced.

RISC-VArchitectureSummary/Porter Page� of� 16 323

Page 17: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

BasicTerminology

There has been some confusion in computer science documentation regardingabbreviationsforlargenumbers.Forexample:

4K=? 4,000 4,096

Weusethefollowingpre2ixnotationforlargenumbers,whichisbecomingcommoninthecontextofcomputerarchitecture:

Pre=ix Example Value Ki kibi KiByte 210 1,024 ~103 Mi mebi MiByte 220 1,048,576 ~106 Gi gibi GiByte 230 1,073,741,824 ~109 Ti tebi TiByte 240 1,099,511,627,776 ~1012 Pi pebi PiByte 250 1,125,899,906,842,624 ~1015 Ei exbi EiByte 260 1,152,921,504,606,846,976 ~1018

Contrastthistothestandardmetricpre2ixes,whichweavoid:

Pre=ix Example Value K kilo KByte 103 1,000 M mega MByte 106 1,000,000 G giga GByte 109 1,000,000,000 T tera TByte 1012 1,000,000,000,000 P peta PByte 1015 1,000,000,000,000,000 E exa EByte 1018 1,000,000,000,000,000,000

Inthisdocument,weusetheterms“byte”,“halfword”,“word”,“doubleword”,and“quadword”torefertovarioussizesofbinarydata.

number number of bytes of bits example value (in hex) ======== ======= =================================== byte 1 8 A4 halfword 2 16 C4F9 word 4 32 AB12CD34 doubleword 8 64 01234567 89ABCDEF quadword 16 128 4B6D073A 9A145E40 35D0F241 DE849F03

RISC-VArchitectureSummary/Porter Page� of� 17 323

Page 18: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

The bits within an 8-bit byte are numbered from 0 (lower, least signi2icant) to 7(upper,mostsigni2icant).

7654 3210 ==== ==== 0000 0000

Thebitswithina16-bithalfwordarenumberedfrom0to15.

15 12 8 4 0 ==== ==== ==== ==== 0000 0000 0000 0000

Thebitswithina32-bitwordarenumberedfrom0to31.

31 28 24 20 16 12 8 4 0 ==== ==== ==== ==== ==== ==== ==== ==== 0000 0000 0000 0000 0000 0000 0000 0000

Likewise, thebitswithina64-bitdoublewordarenumbered from0 to63and thebitswithina128-bitquadwordarenumberedfrom0to127.

Weusethefollowingnotationtorepresentarangeofbits:

Example Meaning ========= ====================================================== [7:0] Bits 0 through 7; e.g., all bits in a byte [31:0] Bits 0 through 31; e.g., all bits in a word [31:28] Bits 28 through 31; e.g., the upper 4 bits in a word [1:0] Bits 0 through 1; e.g., the least significant 2 bits

Asinglehexdigit canbeused to represent4bits (half abyte, sometimes calleda“nibble”),asfollows:

RISC-VArchitectureSummary/Porter Page� of� 18 323

Page 19: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter1:Introduction

Binary Hex ====== === 0000 0 0001 1 0010 2 0011 3 0100 4 0101 5 0110 6 0111 7 1000 8 1001 9 1010 A 1011 B 1100 C 1101 D 1110 E 1111 F

The 8 bits within a byte are conveniently expressed with two hex digits. Forexample:

8-bit byte In Hex ========== ======== 1010 0100 A4

The32bitsinawordaregivenwith8hexdigits.Forexample:

32-bit word In Hex ============================================= ======== 1010 1011 0001 0010 1100 1101 0011 0100 AB12CD34

RISC-VArchitectureSummary/Porter Page� of� 19 323

Page 20: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

MainMemory

Mainmemoryisbyteaddressable.Addressesare4bytes(i.e.,32bits)long,allowingforupto4GiBytestobeaddressed.

TheRISC-Vdocumentationdiscussesseveraldifferentaddresssizeoptions,butwe’llstartsimplewith32bitaddresses.

Memorycanbeviewedasasequenceofbytes:

address data (in hex) (in hex) ======== ======== 00000000 89 00000001 AB 00000002 CD 00000003 EF 00000004 01 00000005 23 00000006 45 00000007 67 ... ... FFFFFFFC E0 FFFFFFFD E1 FFFFFFFE E2 FFFFFFFF E3

“Low”memoryreferstosmallermemoryaddresses,whichwillbeshownhigheronthepagethan“high”memoryaddresses,asintheaboveexample.

LittleEndian

TheRISC-Visalittleendianarchitecture.

RISC-VArchitectureSummary/Porter Page� of�20 323

Page 21: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

Asanexample,assumethatmainmemoryholdsthefollowingbytes:

address data (in hex) (in hex) ======== ======== ... ... E5000004 1A E5000005 2B E5000006 3C E5000007 4D E5000008 5E E5000009 6F E500000A 70 E500000B 81 E500000C 92 E500000D A3 E500000E B4 E500000F C5 ... ...

Considerloadingaregistersthatis32bits(4bytes)wide.ThereareseveralLOADinstructions,whichcanmoveeitherabyte,ahalfword,orawordfrommemoryintoaregister.

AssumethataLOADinstructionloadsabytefromaddress0xE5000004.Theregisterwillthencontain:

0x0000001A

AssumethataLOADinstructionloadsahalfword(2bytes)fromthesameaddress.Theregisterwillthencontain:

0x00002B1A

AssumethataLOADinstructionloadsaword(4bytes)fromthesameaddress.Theregisterwillthencontain:

0x4D3C2B1A

Notethatwecanviewthememoryasholdingasequenceofproperlyalignedhalfwords,whichwecannaturallyrepresentasfollows:

RISC-VArchitectureSummary/Porter Page� of� 21 323

Page 22: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

address data (in hex) (in hex) ======== ======== ... ... E5000004 2B1A E5000006 4D3C E5000008 6F5E E500000A 8170 E500000C A392 E500000E C5B4 ... ...

Orwecanviewthismemoryasholdingasequenceofproperlyalignedwords,whichlookslikethis:

address data (in hex) (in hex) ======== ======== ... ... E5000004 4D3C2B1A E5000008 81706F5E E500000C C5B4A392 ... ...

Commentary:Inalittleendianarchitecture,theorderofthebytesischangedwheneverdataiscopiedfrommemorytoaregisterorstoredfromaregisterintomemory.Thiscanbeasourceofconfusion,particularlywhenhumanslookataprintoutofmemorycontents.

Bigendianarchitecturesaresimplertounderstandsincethebytesarenotreorderedduringloadsandstores.

Noticethatwithalittleendianarchitecture,the2irstbyteinmemory(0x1Aintheexample)alwaysgoesintothesamebitsintheregister,regardlessofwhethertheinstructionismovingabyte,halfword,orword.Thiscanresultinasimpli2icationofthecircuitry.

Thespecmentionsthattheyselectedlittleendiansinceitiscommerciallydominant.Theyalsomentionthepossibilityof“bi-endian”architectures,whichmixbig-andlittle-endianness.

RISC-VArchitectureSummary/Porter Page� of� 22 323

Page 23: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

TheRegisters

Forconcreteness,thisdocumentfocusesprimarilyonRV32,the32-bitvariantofRISC-V.Theboxbelowdiscusses“OtherRegisterSizes”.

Thegeneralpurposeregistersare32bits(i.e.,4bytesoroneword)inwidth.

Thereare32registers.Theregistersarenamedx0,x1,x2,…x31.

OtherRegisterSizes:WemainlyfocusondescribingtheRV32variantofRISC-V,inwhichtheregistersare32bitsinwidth.

IntheRV64variant,theregistersare64bits(i.e.,8bytesoradoubleword)insize.

IntheRV128variant,theregistersare128bits(i.e.,16bytesoraquadword)insize.

Inallcases,thenumberofregistersis32.

If2loatingpointissupported,therewillbeadditional=loatingpointregistersnamedf0,f1,f2,…f31.Theseregistersaredistinctfromx0,x1,x2,…x31.Floatingpointregisterswillbediscussedlater.

Registerx0isaspecial“zeroregister”.Whenread,itsvalueisalways0x00000000.Wheneverthereisanattempttowritetox0,thedataissimplydiscarded.

AllotherregistersaretreatedidenticallybytheISA;thereisnothingspecialaboutanyregister.

Thereisnospecialsupportfora“stackpointer,”a“framepointer,”ora“linkregister”.Althoughtheseareimportantconceptsandareusedintheimplementationofprogramminglanguages,theISAdoesnothaveanyspecializedinstructionsforthem.Anyregistercanbeusedforthesefunctions.

RISC-VArchitectureSummary/Porter Page� of� 23 323

Page 24: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

ReducedNumberofGeneralPurposeRegisters:IntheRISC-V“E”extension,thenumberofregistersisreducedto16.

Theregistersarenamedx0,x1,x2,…x15.

Thisextensionisonlyapplicableto32-bitmachines.Inotherwords,thebaseISAwilleitherbeRV32IorRV32E,withtheonlydifferencebeingthenumberofgeneralpurposeregisters.For64-bitand128-bitmachines,therewillalwaysbeafullsizedregistersetwith32registers,so“E”doesnotapplytoRV64orRV128.

Allinstructionformatsarethesame.Inotherwords,thereisnochangetotheinstructionencodingsbetweenthenormalRV32IandthereducedRV32Evariants.Theonlydifferenceisthatthe5-bit2ieldsforencodingaregisternumberarelimitedtocontainingonlythevaluesof0through15(inbinary:00000through01111).

Thisextensionismeantforsimpli2ied,embedded(“E”)computers.Thespecsuggeststhatthecircuitryforafullsizedregistersetcanconsume50%ofthechiprealestate,notcountingthespaceusedforcache.Thus,dividingthenumberofregistersinhalfwillresultina25%savingsinrealestate.

CompressedInstructions:IntheRISC-V“C”extensionforcompressedinstructions,8registers(x8,x9,…x15)areeasilyaccessible.Toaccesstheotherregisters,theprogrammermayhavetouseanormal,uncompressedinstruction.Therefore,compilersareencouraged(butnotrequired)touseonlyregistersx8,x9,…x15.

Inthisextension,thefactthatcertainregisterswillbeusedasstackpointer,linkregister,etc.ismadeuseof,sosomeregistersaretreatedslightlydifferentlyhere.

Wediscusscompressedinstructionslater.

Inadditiontothegeneralpurposeregisters,thereisalsoaprogramcounter(PC),whosewidthmatchesthewidthofthegeneralpurposeregisters.

Inthevariantwefocuson,thePCregisteris32bitswide.IntheRV64andRV128variants,thesizeofthePCincreasesandmatchesthesizeofthegeneralpurposeregisters.

RISC-VArchitectureSummary/Porter Page� of� 24 323

Page 25: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

ManyprocessorISAsincludea“statusregister”,sometimescalleda“conditioncoderegister.”Sucharegisterusuallycontainsbitssuchas:

•Sign/NegativeValue •Zero/Equal •CarryBit •Over2low

InotherISAs,thereisusuallyaCOMPAREinstruction(whichwillsetbitsinthestatusregister)andseveralBRANCHinstructions(whichwilltestthestatusregisterbitsandconditionallyjump).

TheRISC-Vdoesnotincludeanysuch“statusregister.”Instead,theBRANCHinstructionsinRISC-Vwillperformboththetestandtheconditionaljump.

Commentary:Thenormalpatternofmostcodeinothernon-RISC-VarchitecturesistoexecuteaCOMPAREinstructionand,immediatelyafterward,executeaBRANCHinstruction.Theygotogetherandeffectivelyperformasingle“test-and-jump”operation.BycombiningthemintoasingleinstructionintheRISC-Varchitecture,greaterperformanceef2iciencycanbeachievedwheneverthis“test-and-jump”operationmustbeperformed.

Furthermore,byeliminatingthe“statusregister”,theprocessingofinterruptsisstreamlined.Here’swhy:Thecodeinanyinterrupthandlerroutinewillcertainlymodifythestatusregister,sothereforethestatusregistermustbeautomaticallysavedwheneveraninterrupthandlerisinvokedandrestoredwheneverthehandlerreturns.Additionalarchitecturalcomplexitywouldberequiredtosupporttheseoperations(whichmustbeatomic),butthiscomplexityisavoidedwiththeRISC-Vapproach.

ControlandStatusRegisters(CSRs)

TheentirestateofarunningRISC-Vcoreconsistsof:

•Theintegerregistersx1,x2,…x31 •The2loatingpointregisters,if2loatingpointissupported. •TheProgramCounter(PC) •Asetof“ControlandStatusRegisters”(CSRs)

RISC-VArchitectureSummary/Porter Page� of� 25 323

Page 26: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

The“ControlandStatusRegisters”(CSRs)areusedfortheprotectionandprivilegesystem.TheprivilegesystemisusedbytheOSkerneltoprotectitselfandmanageuser-levelprocesses.

Atanymoment,theRISC-Vprocessorwillbeexecutingeitherinuser-modeorinsupervisor-mode.Kernelcodeisexecutedinsupervisor-modeandapplicationprogramsareexecutedinuser-mode.[Actuallytherearetwonon-usermodes;we’llgointodetailslater.]

Each“ControlandStatusRegister”(CSR)hasaspecialnameandeachhasauniquefunction.Readingand/orwritingaCSRwillhaveaneffectontheprocessoroperation.ReadingandwritingtotheCSRsisusedtoperformalloperationsthatcannotbeperformedwithnormalinstructions.ThebehaviorassociatedwitheachCSRisoftenquitecomplexandeachregistermustbeunderstoodindependently.

Thereisalarge2ileofCSRs:therecanbeupto4,096CSRsde2ined.TheCSRsarereadandwrittenwithjustacoupleofgeneral-purposeinstructions.

AfewdozenCSRsarede2inedinthespec,givingtheirnames,behaviors,andeffects.ThespecleavesopenthepossibilityofmoreCSRsbeingde2inedanditisclearthattherewillbesigni2icantvariationbetweenimplementationsinthedetailsofwhichCSRsareimplementedandexactlyhoweachoneworks.

Fortunately,theCSRscanbeignoredat2irst.TheCSRsareprimarilyusedbyOScode.Forexample,theCSRsareusedforinterruptprocessing,threadswitching,andpagetablemanipulation.

Inorderunderstandtheuser-modeinstructionsetandtocreateuser-levelcode,theCSRscanandshouldbeignored,especiallyonyour2irstintroductiontoRISC-V.

Alignment

A“halfwordaligned”addressisanaddressthatisamultipleof2.Thelastbitofahalfword-alignedaddresswillalwaysbe0.Likewise,a“wordaligned”addressisamultipleof4,andendswiththebits00.Similarlyfor“doublewordalignment”and“quadwordalignment”.

RISC-VArchitectureSummary/Porter Page� of� 26 323

Page 27: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter2:BasicOrganization

Ahalfword-sizedvalueissaidtobe“properlyaligned”ifitisstoredatahalfword-alignedaddress.Aword-sizedvalueisproperlyalignedifitisstoredataword-alignedaddress,andsimilarlyforothersizes.

RISC-VdoesnotrequiredatatobeproperlyalignedfortheLOADandSTOREinstructions.Inotherwords,anyvaluemaybestoredatanyaddress.However,properalignmentisgoodpracticeandencouraged.Inmostimplementations,LOADandSTOREinstructionswillperformmuchfasterwhenthedataisproperlyaligned.

However,thereisanalignmentrequirementforinstructions.

Instructionsare32bitsinlengthandmustbestoredatword-alignedlocations.Anattempttojumptoanunalignedaddressedwillcausean“instructionmisaligned”exception.

Commentary:IntheRISC-V“C”extensionforcompressedinstructions,halfword-sizedinstructionsareallowed.Inthiscase,thealignmentrequirementisslightlyrelaxed:Instructions(whether16or32bits),mustonlybehalfwordaligned,notwordaligned.

Sinceallinstructionsmustbeatleasthalfwordaligned,allbranchtargetaddressandtheprogramcounter(PC)valuesmustbeevennumbers,i.e.,mustendwithasingle0bit.Someinstructionsmakeuseofthefactthattheleastsigni2icantbitisalways0,allowingamoreef2icientencodingofinstructionsbyleavingthis2inalbitimplicit.

RISC-VArchitectureSummary/Porter Page� of� 27 323

Page 28: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Instructions

Instructionsare32bits(4bytes,1word)inlengthandmustbestoredatword-alignedmemorylocations.InstructionsfortheRV64andRV128variantsarestill32bitslong.

Somecomputerarchitecturesaresaidtoemploy“two-address”instructions.Forexample,thefollowinginstructionmentionsonlytworegisters.(ThisisnotRISC-V.)

ADD x4,x5 # x4 = x4 + x5

RISC-Visa“threeaddress”architecture.TheADDinstructionlookslikethis:

ADD x4,x5,x7 # x4 = x5 + x7

NotethatintheRISC-Vassemblylanguage,thedestinationistypicallytheleftmostoperand.

TheCompressedInstructionExtension

Thischapterdescribesthe32-bitinstructions.Howeverinthissection,weintroducethe16-bitcompressedinstructions.Wewillcoverthecompressedinstructionsfullyinalaterchapter.

IntheRISC-V“C”extensionforcompressedinstructions,halfword-sizedinstructionsarealsoallowed.Ifthisextensionisimplemented,both16-bitand32-bitinstructionscanbeused.

RISC-VArchitectureSummary/Porter Page� of�28 323

Page 29: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

The16-bitand32-bitinstructionscanbeintermixed.Thereisno“mode”bittoputtheprocessorinto“compressed-instruction”mode,asthereisinotherprocessors.

Withthisextension,thealignmentrestrictionforinstructionsisrelaxedtohalfword-alignment.

Eachcompressed16-bitinstructionisexactlyequivalentinfunctiontoa32-bitinstruction.However,therearemany32-bitinstructionsforwhichthereisnoequivalent16-bitversion.

Thus,eachcompressed16-bitinstructioncanbethoughtofasashorthandforsomelonger32-bitinstruction.Theideaisthatthemostfrequentlyusedinstructionshave16-bitversions.Byusingtheshorterversions,thesizeofprogramcodecanbereduced.TheRISC-Vdesignershavedoneresearchtodeterminewhichinstructionsarethebestcandidatestobegivencompressedvariants.

Commentary:Reducingthesizeofcoderesultsinincreasedprocessorperformancesinceitallowsmoreinstructionstobecached,reducingthetimetofetchinstructionsfrommainmemory,whichisoftenaperformancebottleneck.

Inatypicalhardwareimplementation,whenacompressedinstructionisfetchedandloadedintotheInstructionRegister(IR)priortobeingexecuted,thehardwarewillnoticethatitisacompressedinstruction.Atthattime,thecompressedinstructionwillimmediatelybeexpandedfrom16bitsintotheequivalent32bitinstruction.Thereafter,thereisnoneedforanyadditionalhardwarelogictosupportthecompressedinstructionset.

Toaddress32registers,5bitsarerequired.Consideraninstructionwhichrefersto3registers,suchas:

ADD x8,x9,x10

Ifthisinstructionisencodedusing5bitsforeachregister,15bitswouldbeconsumed.Withonly16bitstoworkwithincompressedinstructions,thiswillclearlynotwork,sinceonly1bitisleftforencodingtheopcode.

Instead,manyofthecompressedinstructionsrestrictaccesstoonly8oftheregisters.Withthisrestriction,registerscanbeencodedusingonly3bits.

RISC-VArchitectureSummary/Porter Page� of� 29 323

Page 30: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Manyofthecompressedinstructionsonlyallowaccesstoregistersx8,x9,…x15.However,certainotherregistersare(byconventionandhabit)usedforspecialpurposes,suchasfora“stackpointer”(x2)or“linkregister/returnaddress”(x1),andsomecompressedinstructionsimplicitlyrefertotheseregisters.

Somecompressedinstructionsallowanyofthe32registerstobespeci2ied,employinga5-bit2ieldtoencodetheregister.Obviously,theseinstructionsdon’thaveroomforthreesuchregisteroperands.

Ingeneral,RISC-Vusesthree-addressinstructions.However,somecompressedinstructionsusethetwo-addressapproach.Forexample,theinstruction

ADD x8,x8,x10 # Add x10 to x8 canberepresentedincompressedformsincethedestinationregisterisalsoanoperand.

Moreprecisely,acompressedinstructionoftheform:

C.ADDW RegD,RegB # RegD = RegD + RegB

canbeexpandedto

ADDW RegD,RegD,RegB # Equivalent 32-bit inst

Inassemblylanguage,compressedinstructionsareidenti2iedwiththe“C.”pre2ix.

Theinstructionopcodehereendswitha“W”suf2ix,indicatingthatitoperatesona32-bitword,asopposedto64-bitor128-bitvalues.

Inatypicalimplementation,acompressedinstructionwillbeexpandedtotheequivalent32-bitinstructionbytheFETCHandDECODEhardware,afteraninstructionisfetchedfrommemory,directlybeforeitisexecuted.

However,theexpansiontoafull-sized32-bitinstructioncouldbeperformedbytheassembler.Thiswouldbeusefulwhenassemblingforamachinethatdoesnotsupportthecompressedextension.

AssemblerNote:Asophisticatedassemblerwillautomaticallygeneratecompressedinstructionswheneveritcan.Theideaisthattheprogrammer(or

RISC-VArchitectureSummary/Porter Page� of� 30 323

Page 31: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

compiler)willcreateonly32-bitinstructions.Uponencounteringa32-bitinstructionthatcanalsobecodedasa16-bitinstruction,theassemblerwillchoosethesmallerinstruction.Suchanassemblerwillrelieveprogrammers(andcompilers)fromtheburdenofselectingcompressedinstructions,althoughasophisticatedcompilermaybeabletogenerateshortercodesequencesifitisawareofwhichinstructionscanbecompressed.

InstructionEncoding

Eachinstructionis32-bitsinlength.

Inotherwords,theISAisbuiltarounda32-bitinstructionsize.Eachcompressed16-bitinstructioncanalsoberepresentedasaequivalent32-bitinstruction,sothesetof32-bitinstructionfullydescribestheavailableinstructions.Theleastsigni2icant2bitsalwaysindicatewhethertheinstructionis16or32bits,sotheprocessorcanimmediatelytellwhetherornotithasjustloadedacompressedinstruction.

Details:Abitpatternendingin11indicatesa32-bitinstruction.Theotherbitpatterns(00,01,10)indicateacompressedinstruction.Thisapproachreducestheavailablenumberofunique16-bitinstructionsby¼,andreducesthenumberofeffectivebitsavailableforthe32-bitinstructionsto30bits:areasonablecompromise.

Thedistinguishingbitsareintheleastsigni2icantpositions,whichisappropriateforalittleendianarchitecture.

Thereisalsoaprovisionforinstructionsthatarelongerthan32bits,butnosuchinstructionsarespeci2ied.Suchinstructionswouldbeanon-standardextension.

Longerinstructionsmustbeamultipleof16bitsinlength.Anencodingschemeisspeci2ied,whichshowshowinstructionsoflength48bits,64bits,etc.aretohavetheirlengthencoded.Beyondthelengthencoding,furtherinstructionsdetailsareleftunspeci2iedanduptotheimplementersofnon-standarddevices.

RISC-VArchitectureSummary/Porter Page� of� 31 323

Page 32: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Forinstructionsoflength32bits,theleastsigni2icant2bitsmustbe11.Inaddition,thenext3bitsmustnotbe111,sincethisindicatesaninstructionoflengthgreaterthan32-bits.

Compressed(16-bit)Instructions:xxxx xxxx xxxx xxAA (whereAA≠11)

Normal(32-bit)Instructions:xxxx xxxx xxxx xxxx xxxx xxxx xxxB BB11 (whereBBB≠111)

LongerInstructions:xxxx xxxx ... xxxx xxxx xxxx xxxx xxx1 1111

Consulttheof2icialdocumentationfordetailsforlongerinstructions.

Inthissection,wewilldiscussonly32-bitinstructions.

Sincethereare32registers,a2ieldwithawidthof5bitsisusedtoencodeeachregisteroperandwithininstructions.

Theprototypicalinstructionmentions3registers,whicharesymbolicallycalled

RegD–Thedestination Reg1–The2irstoperand Reg2–Thesecondoperand

Inaddition,anumberofinstructionscontainimmediatedata.Theimmediatedatavalueisalwayssign-extendedtoyielda32-bitvalue.

Thefollowingsizesofimmediatedatavaluesareused:

Notation Fieldwidth NumberofValues Rangeofvalues Immed-12 12bits 212=4Ki -2,048..+2,047 Immed-20 20bits 220=1Mi -524,288..+524,287

Commentary:Loadinganarbitrary32bitvalueintoaregisterusinginstructionswhicharethemselvesonly32-bitsnecessarilyrequirestwoinstructions.

Notethat12+20equals32.InRISC-V,anarbitrary32-bitvaluecanbesplitintotwopiecesandloadedintoaregisterwithtwoinstructions.Thedesignershavechosenthe12-20splitforanumberofreasons.Manycommonconstantsand

RISC-VArchitectureSummary/Porter Page� of� 32 323

Page 33: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

offsetsfallwithintherangeofthesmallerimmed-12values.Also,theRISC-Vvirtualmemorysupportsapagesizeof4KiBytes.

Herearetheinstructionformats:

R-typeinstructions: Operands: RegD,Reg1,Reg2 Example:

ADD x4,x6,x8 # x4 = x6+x8 I-typeinstructions: Operands: RegD,Reg1,Immed-12 Examples:

ADDI x4,x6,123 # x4 = x6+123LW x4,8(x6) # x4 = Mem[8+x6]

S-typeinstructions: Operands: Reg1,Reg2,Immed-12 Example:

SW x4,8(x6) # Mem[8+r6] = x4 (word) B-typeinstructions(avariantofS-type): Operands: Reg1,Reg2,Immed-12 Example:

blt x4,x6,loop # if x4<x6, goto offset(pc) U-typeinstructions: Operands: RegD,Immed-20 Example:

LUI x4,0x12AB7 # x4 = value<<12AUIPC x4,0x12AB7 # x4 = (value<<12) + pc

J-typeinstructions(avariantofU-type): Operands: RegD,Immed-20 Example:

jal x4,foo # call: pc=offset+pc; x4=ret addr

TheonlydifferencebetweenS-typeandB-typeinstructionsishowthe12-bitimmediatevalueishandled.InanB-typeinstruction,theimmediatevalueis

RISC-VArchitectureSummary/Porter Page� of� 33 323

Page 34: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

multipliedby2(i.e.,shiftedleft1bit)beforebeingused.IntheS-typeinstruction,thevalueisnotshifted.

Inbothcases,sign-extensionoccurs.Inparticular,thebitstotheleftaresynthesizedby2illingtheminwithacopyofthemostsigni2icantbitactuallypresent.

S-typeimmediatevalues: Actualvalueused(wheres=sign-extension):

ssss ssss ssss ssss ssss VVVV VVVV VVVV Rangeofvalues:

-2,048..+2,0470xFFFFF800..0x000007FF

B-typeimmediatevalues: Actualvalueused(wheres=sign-extension):

ssss ssss ssss ssss sssV VVVV VVVV VVV0 Rangeofvalues:

-4,096..+4,094(inmultiplesof2)0xFFFFF000..0x00000FFE

TheonlydifferencebetweenU-typeandJ-typeinstructionsishowthe20-bitimmediatevalueishandled.InaU-typeinstruction,theimmediatevalueisshiftedleftby12bitstogivea32bitvalue.Inotherwords,theimmediatevalueisplacedintheuppermost20bits,andthelower12bitsarezero-2illed.

InaJ-typeinstruction,theimmediatevalueisshiftedleftby1bit(i.e.,multipliedby2).Itisalsosign-extended.

U-typeimmediatevalues: Actualvalueused:

VVVV VVVV VVVV VVVV VVVV 0000 0000 0000 Thevalueisalwaysalignedtoamultipleof4,096. J-typeimmediatevalues: Actualvalueused(wheres=sign-extension):

ssss ssss sssV VVVV VVVV VVVV VVVV VVV0 Rangeofvalues:

-1,048,576..+1,048,574(inmultiplesof2)0xFFF00000..0x000FFFFE

Next,wegivetheencodingsforthedifferenttypesofinstructions.Inthefollowing,eachletterrepresentsasinglebit,accordingtothefollowinglegend:

RISC-VArchitectureSummary/Porter Page� of� 34 323

Page 35: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

DDDDD=RegD 11111=Reg1 22222=Reg2 VVVVV=Immediatevalue XXXXX=Op-code/functioncode

R-typeinstructions: Operands: RegD,Reg1,Reg2 Encoding:

XXXX XXX2 2222 1111 1XXX DDDD DXXX XXXX I-typeinstructions: Operands: RegD,Reg1,Immed-12 Encoding:

VVVV VVVV VVVV 1111 1XXX DDDD DXXX XXXX S-typeandB-typeinstructions: Operands: Reg1,Reg2,Immed-12 Encoding:

VVVV VVV2 2222 1111 1XXX VVVV VXXX XXXX U-typeandJ-typeinstructions: Operands: RegD,Immed-20 Encoding:

VVVV VVVV VVVV VVVV VVVV DDDD DXXX XXXX

Commentary:NotethatReg1,Reg2,andRegDoccurinthesameplaceinallinstructionformats.

Thissimpli2iesthechipcircuitry,whichwouldbemorecomplexif,forexample,RegDwassometimesinonepartoftheinstructionandothertimesinadifferentplaceintheinstruction.

Aconsequenceofkeepingtheregister2ieldsinthatsameplaceisthatimmediatedatavaluesinaninstructionaresometimesnotalwaysincontiguousbits.AndnotethatthebitsencodingtheimmediatevaluesintheS-typeandB-typeinstructionsarenotcontiguous.

RISC-VArchitectureSummary/Porter Page� of� 35 323

Page 36: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

I-type: VVVV VVVV VVVV ---- ---- ---- ---- ——

S-typeandB-type: VVVV VVV- ---- ---- ---- VVVV V--- ——

U-typeandJ-type: VVVV VVVV VVVV VVVV VVVV ---- ---- ——

WhilebothS-typeandB-typehavea12-bitimmediatevalue,thepreciseorderofthebitsinthose2ieldsdiffersbetweenthetwoformats.Whileyouwouldreasonablyassumethebitsareinorder,theyareinfactscrambledupalittleintheB-typeinstructionformat.Similarly,thebitsofthe20-bitimmediatevalue2ieldintheJ-typeformatsarescrambled,ascomparedtoU-type.Consultthespecfordetails.

Immediatevaluesarealwayssign-extended.Althoughtheimmediatevaluecanbedifferentsizesandmaybebrokenintomultiple2ields,thesignbitisalwaysinthesameinstructionbit(namelytheleftmostbit,bit31).Thissimpli2iesandspeedssign-extensioncircuitry.

User-LevelInstructions

TheRISC-Vspeci2icationbreakstheinstructionsintotwobroadcategories:“user-level”instructionsand“privileged”instructions.Webeginbydescribingtheindividualuser-levelinstructions.We’llcovertheprivilegedinstructionslater.

Theinstructionsetisquitesmallandtightlyde2ined;therearenotasmanyinstructionsintheRISC-Varchitectureasinotherarchitectures.

Severalfamiliarinstructions(whichyouwould2indinotherarchitectures)aresimplyspecialcasesofmoregeneralRISC-Vinstructions.

Theof2icialRISC-Vdocumentationfocusesontheactualinstructionsandmentionsspecialcasesonlybrie2ly,inconnectionwiththemoregeneralinstruction.Wewillseparateoutsomeofthesespecialcasesandpresentthemasusefulinstructionsintheirownright,mentioningthattheyareactuallyimplementedasspecialcasesof

RISC-VArchitectureSummary/Porter Page� of� 36 323

Page 37: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

otherinstructions.Also,someotheruseful“instructions”areactuallyexpandedbytheassemblerintoasequenceoftwosimplerinstructions.

TheRISC-Vstandarddescribethreedifferentmachinesizes:

RV32 32-bitregisters RV64 64-bitregisters RV128 128-bitregisters

AnyparticularRISC-Vhardwarechipwillimplementonlyoneoftheabovestandards.However,eachsizeisstrictlymorepowerfulthanthesmallersizes.AnRV64chipwillincludealltheinstructionsnecessarytomanipulate32-bitdata,soitcaneasilydoanythinganRV32chipcan.Likewise,anRV128chipisastrictsupersetoftheRV32andRV64chips.

Youshouldreadtheinstructiondescriptionsbelowassumingthattheregisterwidthis32bits.However,thesamedescriptionsapplyandmakesensefor64-bitor128-bitregisters,exceptwherespeci2icallynoted.

Regarding64-bitand128-bitextensions:Forthelargermachinesizes,thebasic32-bitinstructionsworkidentically.

Insummary,smallerdatavaluesaresign-extendedto2itintolargerregisters.Thus,32-bitvaluesaresign-extendedto64bitsonanRV64machine.Thismeansyoucanrun32-bitcodedirectlyona64-bitmachinewithnochanges!

Ofcourse,someinstructionsfromaRV64machinewillnotbepresentonaRV32machine,socodethattrulyrequires64-bitswon’twork.

Whenusinga64-bitmachinetoexecutecodewrittenfora32-bitmachine,keepinmindthattheupper32bitsofregisterswillnormallycontainthesignextension,andnotazeroextension.

Likewise,32-bitand64-bitvaluesaresign-extendedto128-bitsonanRV128machine.

The12-bitand20-bitimmediatevaluesininstructionsaresign-extendedtothebasicmachinesize.ThisincludestheimmediatevaluesinthetwoU-typeinstructions(LUIandAUIPC).

RISC-VArchitectureSummary/Porter Page� of� 37 323

Page 38: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Forinstructionsthatarenotpresentonallmachinesizes,wemakespecialnotes.Otherwise,theinstructionispresentinallvariants.

Theof2icialdocumentationdoesnotdwellontheassemblernotation.Theassemblernotationpresentedhereissometimesa“bestguess”ofwhatisacceptedbythetypicalRISC-Vassemblertools.

ArithmeticInstructions(ADD,SUB,…)

AddImmediate

GeneralForm:ADDI RegD,Reg1,Immed-12

Example:ADDI x4,x9,123 # x4 = x9 + 0x0000007B

Description:Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)isaddedtothecontentsofReg1andtheresultisplacedinRegD.

Comments:Thereisno“subtractimmediate”instructionbecausesubtractionisequivalenttoaddinganegativevalue.

Encoding: ThisisanI-typeinstruction.

Commentary:Thereisnodistinctionbetweensignedandunsignedaddition;thesamehardwareyieldscorrectresultregardlessofwhetherbothvaluesareconsideredtorepresentsignedvaluesorbothareconsideredtorepresentunsignedvalues.

Theonlydistinctionbetweensignedandunsignedadditionisinhowover2lowistobedetected.Thesameistrueofsubtraction.ThisisnotuniquetoRISC-V;itistrueofallcomputers.

Over2lowisignoredinRISC-V.Whenover2lowoccurs,noexceptionsareraisedandnostatusbitsareset.Thissatis2iestheneedsofthe“C”language,butmaybeproblematicforlanguagesthatmandateover2lowdetection.

RISC-VArchitectureSummary/Porter Page� of� 38 323

Page 39: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thus,RISC-Vhasonlyonesortofaddition;RISC-Vdoesnotdistinguishbetweensignedandunsignedaddition.Likewise,thereisnodistinctionbetweensignedandunsignedsubtraction.

Notethatover2lowdetectioncanbeprogrammedfairlysimply.Forexample,thefollowingcodesequencewilladdtwounsignedvaluesandbranchonover2low: ADDI RegD,Reg1,Immed-12

BLTU RegD,Reg1,overflow ifRegD<UReg1thenbranch

Thefollowingcodesequencewilladdtwosignedvaluesandbranchonover2low.However,itwillonlyfunctioncorrectlyaslongasweknowthattheimmediatevalueispositive. ADDI RegD,Reg1,Immed-12

BLT RegD,Reg1,overflow ifRegD<SReg1thenbranch

Inthegeneralcaseofsignedaddition,youcanusethefollowingcodesequence,whichwillrequireacoupleofadditionalregisters: ADD RegD,Reg1,Reg2

SLTI Reg3,Reg2,0 Reg3=(Reg2<S0)?1:0 SLT Reg4,RegD,Reg1 Reg4=(RegD<SReg1)?1:0 BNE Reg3,Reg4,overflow ifReg3≠Reg4thenbranch

AddImmediateWord

GeneralForm:ADDIW RegD,Reg1,Immed-12

Example:ADDIW x4,x9,123 # x4 = x9 + 0x0000007B

Description:Thisinstructionisonlypresentin64-bitand128-bitmachines.Theoperationisperformedusing32-bitarithmetic.

Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)isaddedtothecontentsofReg1.Theresultisthentruncatedto32-bits,signed-extendedto64or128bitsandplacedinRegD.

Encoding: ThisisanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 39 323

Page 40: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

RV32/RV64/RV128:TheADDIinstructionispresentinallmachinevariations.TheADDIinstructionisusedtoperform32-bitadditionona32-bitmachine,64-bitadditionona64-bitmachine,and128-bitadditionona128-bitmachine.

Toperform32-bitadditionona64-bitor128-bitmachine,theADDIWinstructionisused.

Toperform64-bitadditionona128-bitmachine,theADDIDinstructionisused.

Whenworkingwith32-bitdataona64-bitmachine,theupperhalfoftheregisterwillcontainnothingbutthesignextensionofthelowerhalf.Thisistrueregardlessofwhetherthedatainthelower32bitsistobeinterpretedasasignedinteger,anunsignedinteger,orsomethingotherthananinteger.

TounderstandhowtheADDIinstructiondiffersfromADDIWona64-bitmachine,considerthisexample:

0x 0000 0000 0000 0001 + 0x 0000 0000 7FFF FFFF -------------------------- 0x 0000 0000 8000 0000 Result of ADDI

0x 0000 0000 0000 0001 + 0x 0000 0000 7FFF FFFF -------------------------- 0x FFFF FFFF 8000 0000 Result of ADDIW

AddImmediateDouble

GeneralForm:ADDID RegD,Reg1,Immed-12

Example:ADDID x4,x9,123 # x4 = x9 + 0x0000007B

Description:Thisinstructionisonlypresentin128-bitmachines.Theoperationisperformedusing64-bitarithmetic.

RISC-VArchitectureSummary/Porter Page� of� 40 323

Page 41: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)isaddedtothecontentsofReg1.Theresultisthentruncatedto64-bits,signed-extendedto128bitsandplacedinRegD.

Encoding: ThisisanI-typeinstruction.

SignExtension:Considertheproblemofsign-extendinga32-bitvaluetoa64-bitvalueonaRV64machine.Forexample,the64-bitvalue:

0x 3B4C 204E 92A7 5321

shouldbe“coerced”to2itwithintherangeofa32-bitsignedinteger,resultingin:

0x FFFF FFFF 92A7 5321

TheADDIWinstructioncandothis.Simplyadding0tothevaluewillproducethedesiredresult.

Likewise,theADDIDinstructioncanbeusedtocoercea128-bitvalueintoa64-bitvalue.

Add

GeneralForm:ADD RegD,Reg1,Reg2

Example:ADD x4,x9,x13 # x4 = x9+x13

Description:ThecontentsofReg1isaddedtothecontentsofReg2andtheresultisplacedinRegD.

Comments:Thereisnodistinctionbetweensignedandunsigned.Over2lowisignored.

Encoding: ThisisanR-typeinstruction.

RV32/RV64/RV128:TheADDinstructionispresentinallmachinevariations.TheADDinstructionisusedtoperform32-bitadditionona32-bitmachine,64-bitadditionona64-bitmachine,and128-bitadditionona128-bitmachine.

RISC-VArchitectureSummary/Porter Page� of� 41 323

Page 42: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Toperform32-bitadditionona64-bitor128-bitmachine,theADDWinstructionisused.

Toperform64-bitadditionona128-bitmachine,theADDDinstructionisused.

AddWord

GeneralForm:ADDW RegD,Reg1,Reg2

Example:ADDW x4,x9,x13 # x4 = x9+x13

Description:ThecontentsofReg1isaddedtothecontentsofReg2andtheresultisplacedinRegD.

RV32/RV64/RV128:Thisinstructionisonlypresentin64-bitand128-bitmachines.Theoperationisperformedusing32-bitarithmetic.

Comments:Thereisnodistinctionbetweensignedandunsigned.Over2lowbeyond32-bitsisignored.The32-bitresultissign-extendedto2illtheupperbitsofthedestinationregister.

Encoding: ThisisanR-typeinstruction.

AddDouble

GeneralForm:ADDD RegD,Reg1,Reg2

Example:ADDD x4,x9,x13 # x4 = x9+x13

Description:ThecontentsofReg1isaddedtothecontentsofReg2andtheresultisplacedinRegD.

RV32/RV64/RV128:Thisinstructionisonlypresentin128-bitmachines.Theoperationisperformedusing64-bitarithmetic.

Comments:

RISC-VArchitectureSummary/Porter Page� of� 42 323

Page 43: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thereisnodistinctionbetweensignedandunsigned.Over2lowbeyond64-bitsisignored.The64-bitresultissign-extendedto2illtheupperbitsofthedestinationregister.

Encoding: ThisisanR-typeinstruction.

Commentary:TheRISC-Vauthorsdecidedtode2inemanyinstructionsgenerally,withoutspecifyinghowmanybitstheyoperateon.Instead,thenumberofbitsoperatedonbytheinstructionisdeterminedbytheregistersizeofthemachine.

Forexample,theADDinstructionperforms32-bitadditiononaRV32machine,64-bitadditiononanRV64machine,and128-bitadditiononanRV128machine.

Toperform32-bitadditionona64-bitmachine,anewinstructionisincluded,sincetheupper32bitsoftheoperandsmustbeignoredandtheresultmusthavetheupper32bits2illedwiththepropersign-extension.

Soforthe64-bitmachine(RV64),theyaddedasecondinstruction(ADDW)toadd32-bitquantities.ForRV128,theyaddedathirdinstruction(ADDD)toadd64-bitquantities.

Thefollowingtableshowshowmanybitseachinstructionoperatesonandhighlightsaslightlackoforthogonality:

SizeofMachine 32-bit 64-bit 128-bit BasicInstructionSet: ADD 32 64 128 RV64adds: ADDW 32 32 RV128adds: ADDD 64

Analternateapproachisthefollowing:(ThisisnotRISC-V.)

SpecifythatthebasicADDinstructionwillperform32-bitadditionregardlessofregistersize;for64-bitand128-bitmachines,thisinstructionwillignoretheupperbitsoftheoperandsandsign-extendtheresult.Then,forRV64andRV128machines,anewinstruction(notpresentonRV32machines)isincludedto

RISC-VArchitectureSummary/Porter Page� of� 43 323

Page 44: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

perform64bitaddition.Finally,forRV128machines,athirdinstructionisincludedtoperform128bitaddition.

Toillustratethisalternateapproach,wecanmakeupreasonablenamesforthesehypotheticalinstructionsandshowwhichmachineshavewhichinstructionswiththefollowingtable: SizeofMachine 32-bit 64-bit 128-bit BasicInstructionSet: ADDW 32 32 32 RV64adds: ADDD 64 64 RV128adds: ADDQ 128

Thesetwoapproachesareequivalent,havingthesameinstructions,whichonlydifferintheirnames.Thenameschosenforinstructionsisanissueorthogonaltohowtheinstructionsareencoded.TheRISC-Vdesignersselectedthe2irstnamingschemetomirrortheinstructionopcodeencodingpatternstheychose.

Thisissueappliestothefollowinginstructiongroups:

ADDI ADD SUBI SUB NEG SLLI SLL SRLI SRL SRAI SRA LD ST MUL DIV DIVU REM

RISC-VArchitectureSummary/Porter Page� of� 44 323

Page 45: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

REMU

Ormorespeci2ically:

RV32 RV64 RV128 ADDI ADDIW ADDID ADD ADDW ADDD SUBI SUBIW SUBID SUB SUBW SUBD NEG NEGW NEGD SLLI SLLIW SLLID SLL SLLW SLLD SRLI SRLIW SRLID SRL SRLW SRLD SRAI SRAIW SRAID SRA SRAW SRAD LD LDW LDD ST STW STD MUL MULW MULD DIV DIVW DIVD DIVU DIVUW DIVUD REM REMW REMD REMU REMUW REMUD

Subtract

GeneralForm:SUB RegD,Reg1,Reg2

Example:SUB x4,x9,x13 # x4 = x9-x13

Description:ThecontentsofReg2issubtractedfromthecontentsofReg1andtheresultisplacedinRegD.

Comments:Thereisnodistinctionbetweensignedandunsigned.Over2lowisignored.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 45 323

Page 46: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

SubtractWord

GeneralForm:SUBW RegD,Reg1,Reg2

Example:SUBW x4,x9,x13 # x4 = x9-x13

Description:ThecontentsofReg2issubtractedfromthecontentsofReg1andtheresultisplacedinRegD.

RV32/RV64/RV128:Thisinstructionisonlypresentin64-bitand128-bitmachines.Theoperationisperformedusing32-bitarithmetic.

Comments:Thereisnodistinctionbetweensignedandunsigned.Over2lowbeyond32-bitsisignored.The32-bitresultissign-extendedto2illtheupperbitsofthedestinationregister.

Encoding: ThisisanR-typeinstruction.

SubtractDouble

GeneralForm:SUBD RegD,Reg1,Reg2

Example:SUBD x4,x9,x13 # x4 = x9-x13

Description:ThecontentsofReg2issubtractedfromthecontentsofReg1andtheresultisplacedinRegD.

RV32/RV64/RV128:Thisinstructionisonlypresentin128-bitmachines.Theoperationisperformedusing64-bitarithmetic.

Comments:Thereisnodistinctionbetweensignedandunsigned.Over2lowbeyond64-bitsisignored.The64-bitresultissign-extendedto2illtheupperbitsofthedestinationregister.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 46 323

Page 47: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

SignExtendWordtoDoubleword

GeneralForm:SEXT.W RegD,Reg1

Example:SEXT.W x4,x9 # x4 = Sign-extend(x9)

Description:Thisinstructionisonlyavailablefor64-bitand128-bitmachines.

Thevalueinthelower32bitsofReg1issigned-extendedto64or128bitsandplacedinRegD.

Comments:Thisinstructionisusefulwhena32-bitsignedvaluemustbe“coerced”toalargervalueon64-bitand128-bitmachine.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

ADDIW RegD,Reg1,0

Commentary:TheRISC-Vdocumentationusestwodistinctsuf2ixnotationstodenotethesizeofinstructions.Forexample:

ADDIWSEXT.W

SignExtendDoublewordtoQuadword

GeneralForm:SEXT.D RegD,Reg1

Example:SEXT.D x4,x9 # x4 = Sign-extend(x9)

Description:Thisinstructionisonlyavailablefor128-bitmachines.

Thevalueinthelower64bitsofReg1issigned-extendedto128bitsandplacedinRegD.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

RISC-VArchitectureSummary/Porter Page� of� 47 323

Page 48: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

ADDID RegD,Reg1,0

Negate

GeneralForm:NEG RegD,Reg2

Example:NEG x4,x9 # x4 = -x9

Description:ThecontentsofReg2isarithmeticallynegatedandtheresultisplacedinRegD.

Comments:Theresultiscomputedbysubtractionfromzero.Over2lowcanonlyoccurwhenthemostnegativevalueisnegated.Over2lowisignored.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SUB RegD,x0,Reg2

NegateWord

GeneralForm:NEGW RegD,Reg2

Example:NEGW x4,x9 # x4 = -x9

Description:ThecontentsofReg2isarithmeticallynegatedandtheresultisplacedinRegD.

RV64/RV128:Thisinstructionisonlypresentin64-bitand128-bitmachines.Theoperationisperformedusing32-bitarithmetic,whereastheNEGinstructionoperateson64-bitor128-bitquantitiesinthelargermachines.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SUBW RegD,x0,Reg2

RISC-VArchitectureSummary/Porter Page� of� 48 323

Page 49: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

NegateDoubleword

GeneralForm:NEGD RegD,Reg2

Example:NEGD x4,x9 # x4 = -x9

Description:ThecontentsofReg2isarithmeticallynegatedandtheresultisplacedinRegD.

RV64/RV128:Thisinstructionisonlypresentin128-bitmachines.Theoperationisperformedusing64-bitarithmetic.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SUBD RegD,x0,Reg2

LogicalInstructions(AND,OR,XOR,…)

AndImmediate

GeneralForm:ANDI RegD,Reg1,Immed-12

Example:ANDI x4,x9,123 # x4 = x9 & 0x0000007B

Description:Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)islogicallyANDedwiththecontentsofReg1andtheresultisplacedinRegD.

Encoding: ThisisanI-typeinstruction.

And

GeneralForm:AND RegD,Reg1,Reg2

RISC-VArchitectureSummary/Porter Page� of� 49 323

Page 50: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Example:AND x4,x9,x13 # x4 = x9 & x13

Description:ThecontentsofReg1islogicallyANDedwiththecontentsofReg2andtheresultisplacedinRegD.

Encoding: ThisisanR-typeinstruction.

OrImmediate

GeneralForm:ORI RegD,Reg1,Immed-12

Example:ORI x4,x9,123 # x4 = x9 | 0x0000007B

Description:Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)islogicallyORedwiththecontentsofReg1andtheresultisplacedinRegD.

Encoding: ThisisanI-typeinstruction.

Or

GeneralForm:OR RegD,Reg1,Reg2

Example:OR x4,x9,x13 # x4 = x9 | x13

Description:ThecontentsofReg1islogicallyORedwiththecontentsofReg2andtheresultisplacedinRegD.

Encoding: ThisisanR-typeinstruction.

XorImmediate

GeneralForm:XORI RegD,Reg1,Immed-12

RISC-VArchitectureSummary/Porter Page� of� 50 323

Page 51: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Example:XORI x4,x9,123 # x4 = x9 ^ 0x0000007B

Description:Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)islogicalXORedwiththecontentsofReg1andtheresultisplacedinRegD.

Encoding: ThisisanI-typeinstruction.

Xor

GeneralForm:XOR RegD,Reg1,Reg2

Example:XOR x4,x9,x13 # x4 = x9 ^ x13

Description:ThecontentsofReg1islogicallyXORedwiththecontentsofReg2andtheresultisplacedinRegD.

Encoding: ThisisanR-typeinstruction.

Not

GeneralForm:NOT RegD,Reg1

Example:NOT x4,x9 # x4 = ~x9

Description:ThecontentsofReg1isfetchedandeachofthebitsis2lipped.TheresultingvalueiscopiedintoRegD.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

XORI RegD,Reg1,-1 # Note that -1 = 0xFFFFFFFF

RISC-VArchitectureSummary/Porter Page� of� 51 323

Page 52: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

ShiftingInstructions(SLL,SRL,SRA,…)

ShiftLeftLogicalImmediate

GeneralForm:SLLI RegD,Reg1,Immed-12

Example:SLLI x4,x9,5 # x4 = x9<<5

Description:Theimmediatevaluedeterminesthenumberofbitstoshift.ThecontentsofReg1isshiftedleftthatmanybitsandtheresultisplacedinRegD.Theshiftvalueisnotadjusted,i.e.,0meansnoshiftingisdone.(Apparentlyallvalues,including0,areallowed.???)

RV64/RV128:For32-bitmachines,theshiftamountmustbewithin0..31.For64-bitmachines,theshiftamountmustbewithin0..63.For128-bitmachines,theshiftamountmustbewithin0..127.

Encoding: ThisisanI-typeinstruction.

ShiftLeftLogical

GeneralForm:SLL RegD,Reg1,Reg2

Example:SLL x4,x9,x13 # x4 = x9<<x13

Description:RegisterReg2containstheshiftamount.ThecontentsofReg1isshiftedleftandtheresultisplacedinRegD.

RV64/RV128:For32-bitmachines,theshiftamountmustbewithin0..31.For64-bitmachines,theshiftamountmustbewithin0..63.For128-bitmachines,theshiftamountmustbewithin0..127.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 52 323

Page 53: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

ShiftRightLogicalImmediate

GeneralForm:SRLI RegD,Reg1,Immed-12

Example:SRLI x4,x9,5 # x4 = x9>>5

Description:Theimmediatevaluedeterminesthenumberofbitstoshift.ThecontentsofReg1isshiftedrightthatmanybitsandtheresultisplacedinRegD.Theshiftis“logical”,i.e.,zerobitsarerepeatedlyshiftedinonthemost-signi2icantend.

RV64/RV128:For32-bitmachines,theshiftamountmustbewithin0..31.For64-bitmachines,theshiftamountmustbewithin0..63.For128-bitmachines,theshiftamountmustbewithin0..127.

Encoding: ThisisanI-typeinstruction.

ShiftRightLogical

GeneralForm:SRL RegD,Reg1,Reg2

Example:SRL x4,x9,x13 # x4 = x9>>x13

Description:RegisterReg2containstheshiftamount.ThecontentsofReg1isshiftedrightandtheresultisplacedinRegD.Theshiftis“logical”,i.e.,zerobitsarerepeatedlyshiftedinonthemost-signi2icantend.

RV64/RV128:For32-bitmachines,theshiftamountmustbewithin0..31.For64-bitmachines,theshiftamountmustbewithin0..63.For128-bitmachines,theshiftamountmustbewithin0..127.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 53 323

Page 54: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

ShiftRightArithmeticImmediate

GeneralForm:SRAI RegD,Reg1,Immed-12

Example:SRAI x4,x9,5 # x4 = x9>>>5

Description:Theimmediatevaluedeterminesthenumberofbitstoshift.ThecontentsofReg1isshiftedrightthatmanybitsandtheresultisplacedinRegD.Theshiftis“arithmetic”,i.e.,thesignbitisrepeatedlyshiftedinonthemost-signi2icantend.

RV64/RV128:For32-bitmachines,theshiftamountmustbewithin0..31.For64-bitmachines,theshiftamountmustbewithin0..63.For128-bitmachines,theshiftamountmustbewithin0..127.

Encoding: ThisisanI-typeinstruction.

ShiftRightArithmetic

GeneralForm:SRA RegD,Reg1,Reg2

Example:SRA x4,x9,x13 # x4 = x9>>>x13

Description:RegisterReg2containstheshiftamount.ThecontentsofReg1isshiftedrightandtheresultisplacedinRegD.Theshiftis“arithmetic”,i.e.,thesignbitisrepeatedlyshiftedinonthemost-signi2icantend.

RV64/RV128:For32-bitmachines,theshiftamountmustbewithin0..31.For64-bitmachines,theshiftamountmustbewithin0..63.For128-bitmachines,theshiftamountmustbewithin0..127.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 54 323

Page 55: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

RV32/RV64/RV128:Theshiftoperationsde2inedaboveworkontheentireregister,regardlessofitssize.Fora64-bitmachine,theyshiftall64bits;fora128-bitmachine,theyshiftall128bits.

Notethatwhenusing64-bitregisterstocontain32-bitvalues,wecannotsimplyuse64-bitshiftoperations.Weneedsomenewinstructions.

Toillustratetheproblem,considertheresultofright-shiftingthefollowing32-bitvalueby5ona32-bitmachine: before: 0x 8000 0000

shiftrightlogical: 0x 0400 0000shiftrightarithmetic: 0x FC00 0000

Whathappensifwetrytouse64-bitshiftinstructions?

Ifthevalueisrepresentedasa64-bitsign-extendedvalue,wegetthewrongvalueforthelogicalshift: before: 0x FFFF FFFF 8000 0000

shiftrightlogical: 0x 07FF FFFF FC00 0000shiftrightarithmetic: 0x FFFF FFFF FC00 0000

Ontheotherhand,ifthevalueisrepresentedasa64-bitzero-extendedvalue,wegetthewrongvalueforthearithmeticshift: before: 0x 0000 0000 8000 0000

shiftrightlogical: 0x 0000 0000 0400 0000shiftrightarithmetic: 0x 0000 0000 0400 0000

Asaconsequence,severalnewshiftinstructionsareneededfor64-bitmachinestoperform32-bitshiftingcorrectly.FortheRV64architecturalextension,thefollowinginstructionsareadded: SLLIW RegD,Reg1,Immed-12

SLLW RegD,Reg1,Reg2SRLIW RegD,Reg1,Immed-12SRLW RegD,Reg1,Reg2SRAIW RegD,Reg1,Immed-12SRAW RegD,Reg1,Reg2

wheretheshiftamountsarerestrictedto5bits(i.e.,0..31).

RISC-VArchitectureSummary/Porter Page� of� 55 323

Page 56: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thesenewinstructionsperformashiftofbetween0and31bitsandtheyoperateonlyonthelowerorder32bits,leavingtheupper32bitsunmodi2ied.

Forexample,whenshiftingrightby5wegetthecorrectvalueinthelow-order32-bitsregardlessofwhatisintheupper-bits. before: 0x 89AB CDEF 8000 0000

SRLW(logical): 0x 89AB CDEF 0400 0000SRAW(arithmetic): 0x 89AB CDEF FC00 0000

Similarly,whenwegotoa128-bitmachine,anothergroupofshiftinstructionsisneededtoperform64-bitshiftingcorrectly.FortheRV128architecture,thefollowinginstructionsareadded: SLLID RegD,Reg1,Immed-12

SLLD RegD,Reg1,Reg2SRLID RegD,Reg1,Immed-12SRLD RegD,Reg1,Reg2SRAID RegD,Reg1,Immed-12SRAD RegD,Reg1,Reg2

wheretheshiftamountsarerestrictedto6bits(i.e.,0..63).

ThesenewRV128instructionsperformashiftofbetween0and63bitsandtheyoperateonlyonthelowerorder64bits,leavingtheupper64bitsunmodi2ied.

ShiftInstructionsforRV64andRV128

GeneralForm: SLLIW RegD,Reg1,Immed-12 # RV64 and RV128 only

SLLW RegD,Reg1,Reg2 # RV64 and RV128 onlySRLIW RegD,Reg1,Immed-12 # RV64 and RV128 onlySRLW RegD,Reg1,Reg2 # RV64 and RV128 onlySRAIW RegD,Reg1,Immed-12 # RV64 and RV128 onlySRAW RegD,Reg1,Reg2 # RV64 and RV128 only

SLLID RegD,Reg1,Immed-12 # RV128 onlySLLD RegD,Reg1,Reg2 # RV128 onlySRLID RegD,Reg1,Immed-12 # RV128 onlySRLD RegD,Reg1,Reg2 # RV128 onlySRAID RegD,Reg1,Immed-12 # RV128 onlySRAD RegD,Reg1,Reg2 # RV128 only

RISC-VArchitectureSummary/Porter Page� of� 56 323

Page 57: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Comment: Seeabove.

MiscellaneousInstructions

Nop

GeneralForm:NOP

Example:NOP # Do nothing

Description:Thisinstructionhasnoeffect.

Comment:Thereareseveralwaystoencodethe“nop”operation.UsingADDIisthecanonical,recommendedway.Alsonotethatinstructionswiththebitencodingof0x00000000and0xFFFFFFFFarespeci2icallynotusedfor“nop”.Thesetwovaluesarecommonlyreturnedfrommemoryunitswhentheactualmemoryismissing(i.e.,unpopulated).Thetwovalues0x00000000and0xFFFFFFFFarespeci2iedas“illegalinstructions”andwillcausean“illegalinstructionexception”iffetchedandexecuted.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

ADDI x0,x0,0

Move(RegistertoRegister)

GeneralForm:MV RegD,Reg1

Example:MV x4,x9 # x4 = x9

Description:ThecontentsofReg1iscopiedintoRegD.

Encoding:

RISC-VArchitectureSummary/Porter Page� of� 57 323

Page 58: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

ADDI RegD,Reg1,0

LoadUpperImmediate

GeneralForm:LUI RegD,Immed-20

Example:LUI x4,0x12345 # x4 = 0x12345<<12 (0x12345000)

Description:Theinstructioncontainsa20-bitimmediatevalue.Thisvalueisplacedintheleftmost(i.e.,upper,mostsigni2icant)20bitsoftheregisterRegDandtherightmost(i.e.,lower,leastsigni2icant)12-bitsaresettozero.

RV64/RV128:Thedescriptionaboveappliesto32-bitmachines.Regardlessofregistersize,theimmediatevalueismovedintobits31:12.Inthecaseof64-bitregistersor128-bitregisters,thevalueissign-extendedto2illtheupper32or96bits(i.e.,bits63:32orbits127:32).

Comment:Thisinstructionisoftenuseddirectlybeforeaninstructioncontaininga12-bitimmediatevalue,whichwillbeaddedintoRegD.Together,theyareusedtoeffectivelymakea32-bitvalue.Inthecaseof64-bitor128-bitmachines,thiswillbea32-bitsignedvalue,intherange-2,147,483,648..2,147,483,647(i.e.,-231..231-1).

Encoding: ThisisaU-typeinstruction.

LoadImmediate

GeneralForm:LI RegD,Immed-32

Example:LI x4,123 # x4 = 0x0000007B

Description:Theimmediatevalue(whichcanbeany32-bitvalue)iscopiedintoRegD.

Encoding:

RISC-VArchitectureSummary/Porter Page� of� 58 323

Page 59: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thisisaspecialcaseofmoregeneralinstructionsandisassembleddifferentlydependingontheactualvaluepresent.

Iftheimmediatevalueisintherangeof-2,048..+2,047,thenitcanbeassembledidenticallyto:

ADDI RegD,x0,Immed

Iftheimmediatevalueisnotwithintherangeof-2,048..+2,047butiswithintherangeofa32-bitnumber(i.e.,-2,147,483,648..+2,147,483,647)thenitcanbeassembledusingthistwo-instructionsequence:

LUI RegD,Upper-20ADDI RegD,RegD,Lower-12

where“Upper-20”representstheuppermost20bitsofthevalueand“Lower-12”representstheleastsigni2icant12-bitsofthevalue.

Commentary:Noticethattheassemblermustbecarefulwhencomputingthe“Upper-20”and“Lower-12”piecesfromanarbitrary32-bitvalue.Theassemblercannotsimplybreakthevalueapart.

Why?Becausetheimmediate12bitvaluewillbesign-extendedinthesecond(ADDI)instruction.Ifthemostsigni2icantbitof“Lower-12”is1,thentheADDIinstructionwillbeworkingwithanegativevalue,causingtheincorrectvaluetobecomputed.

Instead,theassemblerneedstocomputethis: Given:Value(a32-bitquantity) Compute: Lower12=Value[11:0] x=Value–SignExtend(Lower12) Upper20=(x>>12)[19:0]TheresultingLUI/ADDIsequencewillloadthecorrectsignedvalueon32,64,and128bitmachines,sincethequantitiesaresigned-extendedinbothinstructions.

AddUpperImmediatetoPC

GeneralForm:AUIPC RegD,Immed-20

RISC-VArchitectureSummary/Porter Page� of� 59 323

Page 60: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Example:AUIPC x4,0x12345 # x4 = PC + (0x12345<<12)

Description:Theinstructioncontainsa20-bitimmediatevalue.Thisvalueismovedintotheleftmost(i.e.,upper,mostsigni2icant)20bitsofandtherightmost(i.e.,lower,leastsigni2icant)12-bitsaresettozero.ThenumbersocreatedisthenaddedtothecontentsoftheProgramCounter.TheresultisplacedinRegD.ThevalueofthePCusedhereistheaddressoftheinstructionthatfollowstheAUIPC.

RV64/RV128:Thedescriptionaboveappliesto32-bitmachines.Inthecaseof64-bitor128-bitsregisters,theimmediatevalueissign-extendedbeforebeingaddedtothePC.ThesizeofthePCequaltothesizeoftheregisters.

Comment:Thisinstructionisoftenuseddirectlybeforeaninstructioncontaininga12-bitimmediatevalue,whichwillbeaddedintoRegD.Together,theyareusedtoeffectivelymakea32-bitPC-relativeoffset.Thisisadequatetoaddressanylocationina32-bit(4GiByte)addressspace.

Inthecaseof64-bitor128-bitmachines,thiswillbea32-bitsignedoffset,intherange-2,147,483,648..2,147,483,647(i.e.,-231..231-1).Iftheaddressspaceislargerthan4GiBytes,thistechniquewillfail;adifferentinstructionsequenceisrequired.

ThecurrentPCcanbeobtainedbyusingthisinstructionwithanimmediatevalueofzero.UseofJALtodeterminethecurrentPCisnotrecommended.

Encoding: ThisisaU-typeinstruction.

LoadAddress

GeneralForm:LA RegD,Address

Example:LA x4,MyVar # x4 = &MyVar

Description:TheaddressofsomememorylocationiscopiedintoRegD.Noaccesstomemoryoccurs.

Encoding:

RISC-VArchitectureSummary/Porter Page� of� 60 323

Page 61: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thereisnoactual“loadaddress”instruction;insteadtheassemblersubstitutesasequenceoftwoinstructionstoachievethesameeffect.

The“address”canrefertoanylocationwithinthe32-bitmemoryspace.TheaddressisconvertedtoaPC-relativeaddress,withanoffsetof32bits.Thisoffsetisthenbrokenintotwopieces:a20-bitpieceanda12-bitpiece.Theinstructionisassembledusingthesetwoinstructions:

AUIPC RegD,Upper-20ADDI RegD,RegD,Lower-12

SetIfLessThan(Signed)

GeneralForm:SLT RegD,Reg1,Reg2

Example:SLT x4,x9,x13 # x4 = (x9<x13) ? 1 : 0

Description:ThecontentsofReg1iscomparedtothecontentsofReg2usingsignedcomparison.IfthevalueinReg1islessthanthevalueinReg2,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding: ThisisanR-typeinstruction.

SetLessThanImmediate(Signed)

GeneralForm:SLTI RegD,Reg1,Immed-12

Example:SLTI x4,x9,123 # x4 = (x9<0x0000007B) ? 1 : 0

Description:Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)iscomparedtothecontentsofReg1usingsignedcomparison.IfthevalueinReg1islessthantheimmediatevalue,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding: ThisisanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 61 323

Page 62: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

SetIfGreaterThan(Signed)

GeneralForm:SGT RegD,Reg1,Reg2

Example:SGT x4,x9,x13 # x4 = (x9>x13) ? 1 : 0

Description:ThecontentsofReg1iscomparedtothecontentsofReg2usingsignedcomparison.IfthevalueinReg1isgreaterthanthevalueinReg2,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding:Thisisaspecialcaseofadifferentinstruction.Thisinstructionisassembledidenticallyto:

SLT RegD,Reg2,Reg1 # Note: regs are switched

Note:Thereisno“SetIfGreaterThanImmediate(signed)”instruction.

SetIfLessThan(Unsigned)

GeneralForm:SLTU RegD,Reg1,Reg2

Example:SLTU x4,x9,x13 # x4 = (x9<x13) ? 1 : 0

Description:ThecontentsofReg1iscomparedtothecontentsofReg2usingunsignedcomparison.IfthevalueinReg1islessthanthevalueinReg2,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding: ThisisanR-typeinstruction.

SetLessThanImmediate(Unsigned)

GeneralForm:SLTIU RegD,Reg1,Immed-12

Example:SLTIU x4,x9,123 # x4 = (x9<0x0000007B) ? 1 : 0

RISC-VArchitectureSummary/Porter Page� of� 62 323

Page 63: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Description:Theimmediatevalue(asign-extended12-bitvalue,i.e.,-2,048..+2,047)iscomparedtothecontentsofReg1usingunsignedcomparison.IfthevalueinReg1islessthantheimmediatevalue,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding: ThisisanI-typeinstruction.

SetIfGreaterThan(Unsigned)

GeneralForm:SGTU RegD,Reg1,Reg2

Example:SGTU x4,x9,x13 # x4 = (x9>x13) ? 1 : 0

Description:ThecontentsofReg1iscomparedtothecontentsofReg2usingunsignedcomparison.IfthevalueinReg1isgreaterthanthevalueinReg2,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding:Thisisaspecialcaseofadifferentinstruction.Thisinstructionisassembledidenticallyto:

SLTU RegD,Reg2,Reg1 # Note: regs are switched

Note:Thereisno“SetIfGreaterThanImmediateUnsigned”instruction.

SetIfEqualToZero

GeneralForm:SEQZ RegD,Reg1

Example:SEQZ x4,x9 # x4 = (x9==0) ? 1 : 0

Description:IfthevalueinReg1iszero,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Comment:Thisinstructionisimplementedwithanunsignedcomparisonagainst1.Usingunsignednumbers,theonlyvaluelessthan1is0.Thereforeiftheless-thanconditionholds,thevalueinReg1mustbe0.

RISC-VArchitectureSummary/Porter Page� of� 63 323

Page 64: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SLTIU RegD,Reg1,1

SetIfNotEqualToZero

GeneralForm:SNEZ RegD,Reg2

Example:SNEZ x4,x9 # x4 = (x9≠0) ? 1 : 0

Description:IfthevalueinReg2isnotzero,thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Comment:Thisinstructionisimplementedwithanunsignedcomparisonagainst0.Usingunsignednumbers,theonlyvaluenotlessthan0is0.Thereforeiftheless-thanconditionholds,thevalueinReg2mustbenotbe0.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SLTU RegD,x0,Reg2

SetIfLessThanZero(signed)

GeneralForm:SLTZ RegD,Reg1

Example:SLTZ x4,x9 # x4 = (x9<0) ? 1 : 0

Description:IfthevalueinReg1islessthanzero(usingsignedarithmetic),thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SLT RegD,Reg1,x0

RISC-VArchitectureSummary/Porter Page� of� 64 323

Page 65: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

SetIfGreaterThanZero(signed)

GeneralForm:SGTZ RegD,Reg2

Example:SGTZ x4,x9 # x4 = (x9>0) ? 1 : 0

Description:IfthevalueinReg2isgreaterthanzero(usingsignedarithmetic),thevalue1isstoredinRegD.Otherwise,thevalue0isstoredinRegD.

Comment:“Reg2>0”isequivalentto“0<Reg2”.

Encoding:Thisisaspecialcaseofamoregeneralinstruction.Thisinstructionisassembledidenticallyto:

SLT RegD,x0,Reg2

BranchandJumpInstructions

BranchifEqual

GeneralForm:BEQ Reg1,Reg2,Immed-12

Example:BEQ x4,x9,MyLabel # If x4==x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.Ifequal,controljumps.ThetargetaddressisgivenasaPC-relativeoffset.Moreprecisely,theoffsetissign-extended,multipliedby2,andaddedtothevalueofthePC.ThevalueofthePCusedistheaddressoftheinstructionfollowingthebranch,notthebranchitself(???).Theoffsetismultipliedby2,sinceallinstructionsmustbehalfwordaligned.Thisgivesaneffectiverangeof-4,096..4,094(inmultiplesof2),relativetothePC.

Comment:Mostconditionalbranchesoccurwithinsmallishsubroutines/functions,sothelimitedrangeshouldbeadequate.

RISC-VArchitectureSummary/Porter Page� of� 65 323

Page 66: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Ifthelimitedsizeoftheoffsetisinadequate,codesuchasthefollowing: BEQ x4,x9,MyLabel

mustbealteredto: BNE x4,x9,Skip J MyLabel # a “long jump”

Skip:Thisappliestoallconditionalbranchinstructions.

Encoding: ThisisaB-typeinstruction.

Commentary:Instructionsizes,locations,andrelativeoffsetsaredeterminedbytheassemblerandlinker,notbythecompiler.However,thecompiler(whichselectsinstructionsandproducestheassemblycodesequence)doesnothaveenoughknowledgetodeterminewhetheranoffsetwillbewithinrange.Therefore,itcannotknowwhetherornota“longjump”isnecessary.

Whatisthesolution?

(1)Thecompileralwayserrsonthesafesideandgeneratesmanyunnecessary“longjumps.”Notacceptable.

(2)Thecompilerignorestheproblem,alwaysusesshortjumps,andoccasionallygeneratescodethatwillnotassemblewithoutfurtherattention.Notacceptable.

(3)Thecompilerismodi2iedtocountinstructionsanddeterminewhen“longjumps”shouldbeinserted.Thisputsaburdenonthecompiler,butthecompilermayalreadybekeepingtrackofthelengthsofthecodesequencesitgenerates,inordertocomparealternatives.Inorderforthistowork,theassemblermustbeforbiddenfromdoingthingslikecompilingtheLIinstructionconditionally,asdescribedearlier.

(4)Theassemblerdetectsoffsetover2lowsandquietlyaltersthecodesequencebyreversingthesenseofthetestandinsertinga“longjump”instruction.Thismaybetheeasiest,butviolatesthetraditionalone-to-onemappingbetweenassemblerandmachinecode.

RISC-VArchitectureSummary/Porter Page� of� 66 323

Page 67: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

BranchifNotEqual

GeneralForm:BNE Reg1,Reg2,Immed-12

Example:BNE x4,x9,MyLabel # If x4≠x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.Ifnotequal,controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding: ThisisaB-typeinstruction.

BranchifLessThan(Signed)

GeneralForm:BLT Reg1,Reg2,Immed-12

Example:BLT x4,x9,MyLabel # If x4<x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1islessthanReg2(usingsignedcomparison),controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding: ThisisaB-typeinstruction.

BranchifLessThanOrEqual(Signed)

GeneralForm:BLE Reg1,Reg2,Immed-12

Example:BLE x4,x9,MyLabel # If x4<=x9 goto MyLabel

RISC-VArchitectureSummary/Porter Page� of� 67 323

Page 68: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1islessthanorequaltoReg2(usingsignedcomparison),controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

BGE Reg2,Reg1,Immed-12 # Note: regs are swapped

BranchifGreaterThan(Signed)

GeneralForm:BGT Reg1,Reg2,Immed-12

Example:BGT x4,x9,MyLabel # If x4>x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1isgreaterthanReg2(usingsignedcomparison),controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

BLT Reg2,Reg1,Immed-12 # Note: regs are swapped

BranchifGreaterThanOrEqual(Signed)

GeneralForm:BGE Reg1,Reg2,Immed-12

Example:BGE x4,x9,MyLabel # If x4>=x9 goto MyLabel

RISC-VArchitectureSummary/Porter Page� of� 68 323

Page 69: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1isgreaterthanorequaltoReg2(usingsignedcomparison),controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding: ThisisaB-typeinstruction.

BranchifLessThan(Unsigned)

GeneralForm:BLTU Reg1,Reg2,Immed-12

Example:BLTU x4,x9,MyLabel # If x4<x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1islessthanReg2(usingunsignedcomparison),controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding: ThisisaB-typeinstruction.

BranchifLessThanOrEqual(Unsigned)

GeneralForm:BLEU Reg1,Reg2,Immed-12

Example:BLEU x4,x9,MyLabel # If x4<=x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1islessthanorequaltoReg2(usingunsignedcomparison),controljumpstoaPC-relativetargetaddress.

RISC-VArchitectureSummary/Porter Page� of� 69 323

Page 70: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

BGEU Reg2,Reg1,Immed-12 # Note: regs are swapped

BranchifGreaterThan(Unsigned)

GeneralForm:BGTU Reg1,Reg2,Immed-12

Example:BGTU x4,x9,MyLabel # If x4>x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1isgreaterthanReg2(usingunsignedcomparison),controljumpstoaPC-relativetargetaddress.

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

BLTU Reg2,Reg1,Immed-12 # Note: regs are swapped

BranchifGreaterThanOrEqual(Unsigned)

GeneralForm:BGEU Reg1,Reg2,Immed-12

Example:BGEU x4,x9,MyLabel # If x4>=x9 goto MyLabel

Description:ThecontentsofReg1iscomparedtothecontentsofReg2.IfReg1isgreaterthanorequaltoReg2(usingunsignedcomparison),controljumpstoaPC-relativetargetaddress.

RISC-VArchitectureSummary/Porter Page� of� 70 323

Page 71: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Comment:Thetargetlocationgivenbytheoffsetmustbewithintherangeof-4,096..4,094(inmultiplesof2),relativetothePC.Seethe“BEQ”instruction.

Encoding: ThisisaB-typeinstruction.

ComparisonstoZero:Thefollowingabbreviationsaresimplevariationsonthepreviousinstructions.Theycompareavalueinaregistertozerousingsignedcomparison.Thezeroregister(x0)isimplicitintheseforms.

AbbreviatedForm: AssembledAs: BEQZ Reg1,target BEQ Reg1,x0,target BNEZ Reg1,target BNE Reg1,x0,target BLTZ Reg1,target BLT Reg1,x0,target BLEZ Reg1,target BGE x0,Reg1,target BGTZ Reg1,target BLT x0,Reg1,target BGEZ Reg1,target BGE Reg1,x0,target

Commentary:TheRISC-Varchitecturedoesnotincludeconditionallyexecutedinstructions.Theideawithaconditionallyexecutedinstructionisthattheinstructionexecutionispredicatedonsomeconditioncode(s).Theinstructionisexecuted,andwilleitherworknormallyorhavenoeffect,dependingonwhethertheconditionistrueorfalse.Thebene2itisthatconditionalbranchinstructions(withtheirpipelineissues)cansometimesbeavoided,resultinginimprovedef2iciency.TheRISC-Vdesignersconsideredandrejectedconditionalexecution.

JumpAndLink(Short-DistanceCALL)

GeneralForm:JAL RegD,Immed-20

Example:JAL x1,MyFunct # Goto MyFunct, x1=RetAddr

Description:Thisinstructionisusedtocallasubroutine(i.e.,function).Thereturnaddress(i.e.,thePC,whichistheaddressoftheinstructionfollowingtheJAL)issavedinRegD.

RISC-VArchitectureSummary/Porter Page� of� 71 323

Page 72: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

ThetargetaddressisgivenasaPC-relativeoffset.Moreprecisely,theoffsetissign-extended,multipliedby2,andaddedtothevalueofthePC.ThevalueofthePCusedistheaddressoftheinstructionfollowingtheJAL,nottheJALitself(???).Theoffsetismultipliedby2,sinceallinstructionsmustbehalfwordaligned.Thisgivesaneffectiverangeof±1MiByte,i.e.,-1,048,576..1,048,574(inmultiplesof2),relativetothePC.

AssemblerShorthand:Byconvention,x1isgenerallyusedasthe“linkregister”.Iftheregisterisnotmentioned,thenx1isimplied.

JAL MyFunct # Call MyFunctComment:

Theprogrammingconventionistouseregisterx1asa“linkregister.”Thisreturnaddressisstoredinx1,ratherthanbeingpushedontoastackinmemory,thusmakingcallsandreturnsfaster.However,ifsomeroutine“foo”willcallanotherroutine“bar”,then“foo”mustsavex1beforecalling“bar”.The“foo”routinemightpushx1ontoastackinmemory,or“foo”mightsavex1ina“callee-saved”registerinthehopeofavoidinganymemoryaccess.

Exceptions:Thismaygeneratean“instructionmisalignedexception.”Thetargetaddresswillnecessarilybeamultipleof2,butitmaynotbemultipleof4.Formachinesthatdonotsupport16-bitinstructions,thiswillcauseanalignmentexception.

Encoding: ThisisaJ-typeinstruction.

Commentary:Mostprogramscodesegmentsarefairlysmall,sothelimitedrangeofJALshouldbeadequate.

Iftheprogramsizeexceeds1MiByte,thelimitedsizeoftheoffsetmaybeinadequate.Codesuchasthefollowing:

JAL x1,MyFunct

mustbealteredto:

LUI Reg2,Offset-20JALR x1,Reg2,Offset-12

RISC-VArchitectureSummary/Porter Page� of� 72 323

Page 73: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

TypicallyseveralroutinesareseparatelycompiledsothecompilerwillnotknowhowfarawaythetargetfunctionisfromtheCALLinstruction.Eventheassemblerwillnotknowandonlyatlink-timecanitbedeterminedwhethera20-bitoffsetisadequate.Oneapproachisforthelinkertoprintoutaninscrutableerrormessage.

Jump(Short-Distance)

GeneralForm:J Immed-20

Example:J MyLabel # Goto MyLabel

Description:ThetargetaddressisgivenasaPC-relativeoffset.Theeffectiverangeis±1MiByte,i.e.,-1,048,576..1,048,574(inmultiplesof2),relativetothePC.

Exceptions:Thismaygeneratean“instructionmisalignedexception.”Thetargetaddresswillnecessarilybeamultipleof2,butitmaynotbemultipleof4.Formachinesthatdonotsupport16-bitinstructions,thiswillcauseanalignmentexception.

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

JAL x0,Immed-20 # Discard return address

JumpAndLinkRegister

GeneralForm:JALR RegD,Reg1,Immed-12

Example:JALR x1,x4,MyFunct # Goto MyFunct, x1=RetAddr

Description:Thisinstructionisusedtocallasubroutine(i.e.,function).Thereturnaddress(i.e.,thePC,whichistheaddressoftheinstructionfollowingtheJAL)issavedinRegD.

ThetargetaddressiscomputedbyaddingtheoffsettothecontentsofReg1.Moreprecisely,theoffsetissign-extendedandaddedtothevalueofReg1.The

RISC-VArchitectureSummary/Porter Page� of� 73 323

Page 74: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

offsetisnotmultipliedby2.Thisgivesaneffectiverangeof±2KiByte,i.e.,-2,048..2,047,relativetotheaddressinReg1.

Comment:Thisinstructioncanbeusedinseveralways.SeetheJR,RET,CALL,andTAILinstructions.

AssemblerShorthand:Byconvention,x1isusedasthe“linkregister”.Thefollowingform:

JALR Reg1 # Call *Reg1isusedshorthandfor:

JALR x0,Reg1,0 Exceptions:

Thismaygeneratean“instructionmisalignedexception.”Encoding: ThisisanI-typeinstruction.

JumpRegister

GeneralForm:JR Reg1

Example:JR Reg1 # Goto *Reg1, i.e., PC = Reg1

Description:JumptotheaddressinReg1.

Exceptions:Thismaygeneratean“instructionmisalignedexception.”

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

JALR x0,Reg1,0 # Discard ret addr; offset=0

Return

GeneralForm:RET

Example:RET # Goto *x1, i.e., PC = x1

RISC-VArchitectureSummary/Porter Page� of� 74 323

Page 75: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Description:Byconvention,x1isusedasthe“linkregister”andwillholdareturnaddress.Thisinstructionreturnsfromasubroutine/function.

Exceptions:Thismaygeneratean“instructionmisalignedexception.”

Encoding:Thisisaspecialcaseofanotherinstruction.Thisinstructionisassembledidenticallyto:

JALR x0,x1,0 # PC=x1+0; don’t save prev PC

CallFarawaySubroutine

GeneralForm:CALL Immed-32

Example:CALL MyFunct # PC = new addr; x1 = ret addr

Description:Byconvention,x1isusedasthe“linkregister”andwillholdareturnaddress.Thisinstructioncallsasubroutine/functionusingaPC-relativescheme,wherethesubroutineoffsetfromtheCALLinstructionexceedsthe20-bitlimit(i.e.,±1MiByte)oftheJALinstruction.Thisinstructionmodi2iesregisterx6.

Encoding:Inordertodealwiththelargerdistancetothesubroutine,this“synthetic”instructionwillbeassembledusingthefollowingtwo-instructionsequence.Thetargetaddresscanbeexpressedasa32-bitoffsetfromtheProgramCounter.Thisoffsetisbrokenintotwopieces,whichareaddedtothePCintwosteps.

AUIPC x6,Immed-20 JALR x1,x6,Immed-12

TheAUIPCinstructionaddsthePCtotheupper20-bitportionofthe32-bitoffsetandplacestheresultinatemporaryregister.TheJALRinstructionaddsinthelower12bitsofthe32-bitoffsetandtransferscontrolbyloadingthesumintothePC.Italsosavesthereturnaddressinx1.

TheCALLinstructionmakesuseoftheconventionthatx1isthelinkregister.Italsousesx6,whichisa“caller-savedtemporaryregister”byconvention.

RISC-VArchitectureSummary/Porter Page� of� 75 323

Page 76: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Commentary:Theactualvalues2illedintotheinstructions(intheImmed-20andImmed-122ields)mustbecomputedbythelinker,sincetheactualtargetaddresswill,ingeneral,notbeavailabletotheassembler.Thelinkermustconvertthetargetaddressintoa32-bitoffsetfromtheProgramCounter.Atruntime,thePCvalueusedwillbetheaddressoftheAUIPC(nottheJALRinstruction)sothevaluemustberelativetothataddress.

Alsonotethatthe12-bitimmediatevaluewillbesignextendedintheJALRinstruction,sosimplybreakingtheoffsetapartintopieceswillnotwork.Instead,thelinkermust2irstisolatethelow-order12bitsandsign-extendit.Thenthelinkermustsubtractitfromthefull-32-bitoffset,yieldingtheupperImmed-20portion.

ThisappliestotheTAILinstruction,too.

Exceptions: Thissequencemaygeneratean“instructionmisalignedexception”ifthetarget

offsetfromwhichtheimmediatevaluesarecomputedisnotwordalignedandtheprocessordoesnotsupport16-bitinstructions.

TailRecursion:Somesubroutines/functionsendbycallinganotherfunction.Forexample,thelaststatementinafunctionnamed“foo”maybetoinvoke/callafunctionnamed“bar.”Whenthecalledfunction(bar)returns,theoriginalfunction(foo)immediatelycompletesandreturns.Anyvaluereturnedbythecalledfunction(bar)willbeusedasthereturnvaluefromthecaller(foo).

Thispatternoftenoccursinfunctionalprograms,whererecursionisusedtoimplementrepetitiveloopingbehavior.Withoutspecialattention,somefunctionalprogramsthatuserecursioninthiswaycancreateverydeepstacksbypushingthousandsofreturnaddressesontothestack.

Acommonoptimizationiscalledthe“tailrecursionoptimization”.Hereishowitworks.Thecompiledcodeforthecaller(foo)ismodi2iedtonotactuallycallbar,butinsteadtosimplyjumptothe2irstinstructioninbar.Then,whenbarreturns,theprocessorwilleffectivelyreturnfromfoo,sincenootherreturnaddresswassaved.Furthermore,anythingreturnedbybarwillbenaturallybereturnedtofoo’scaller.Thetailrecursionoptimizationeffectivelyturnsfunctionalprogramswhichuserecursiontoperformloopingactivitiesintoinstructionsequencesthatuse

RISC-VArchitectureSummary/Porter Page� of� 76 323

Page 77: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

“goto”instructions.Itcovertsrecursiveprogramsintoloopingprogramswith“goto”instructions,makingrecursiveprogrammingtechniquespractical.

Quiteoftenarecursivefunctionwillcallitself.Insuchcasesofthetailrecursionoptimization,the“goto”neednotbealong-distancejumpandtheJinstructionwillsuf2ice.Butinothercases,mutualrecursionmightinvolveseparatelycompiledfunctionsandrequirethelong-distancejumpingabilityoftheTAILinstruction.

TailCall(FarawaySubroutine)/Long-DistanceJump

GeneralForm:TAIL Immed-32

Example:TAIL MyFunct # PC = new addr; Discard ret addr

Description:ThisinstructionisusedtojumptoadistantlocationusingaPC-relativescheme,wherethedisplacementfromtheTAILinstructiontothetargetinstructionexceedsthe20-bitlimit(i.e.,±1MiByte)oftheJinstruction(jumpshortdistance).Thisinstructionmodi2iesregisterx6.

Comment:Thisinstructionisnothingmorethanalong-distance“goto”instruction;thename“TAIL”maybeconfusing,butre2lectshowitwillsometimesbeused.

Encoding:This“synthetic”instructionwillbeassembledusingthefollowingtwo-instructionsequence.

AUIPC x6,Immed-20 JALR x0,x6,Immed-12

SeethecommentsfortheCALLinstruction.Theonlydifferenceisthatherethereturnaddressisdiscarded(x0),insteadofbeingsavedinx1.

Exceptions: LiketheCALLinstruction,thissequencemaygeneratean“instruction

misalignedexception.”

RISC-VArchitectureSummary/Porter Page� of� 77 323

Page 78: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

LoadandStoreInstructions

LoadByte(Signed)

GeneralForm:LB RegD,Immed-12(Reg1)

Example:LB x4,1234(x9) # x4 = Mem[x9+1234]

Description:An8-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.Thevalueissign-extendedtothefulllengthoftheregister.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Thereisnoalignmentissueandthisinstructionwillexecuteatomically.Encoding: ThisisanI-typeinstruction.

LoadByte(Unsigned)

GeneralForm:LBU RegD,Immed-12(Reg1)

Example:LBU x4,1234(x9) # x4 = Mem[x9+1234]

Description:An8-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.Thevalueiszero-extendedtothefulllengthoftheregister.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Thereisnoalignmentissueandthisinstructionwillexecuteatomically.Encoding: ThisisanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 78 323

Page 79: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

LoadHalfword(Signed)

GeneralForm:LH RegD,Immed-12(Reg1)

Example:LH x4,1234(x9) # x4 = Mem[x9+1234]

Description:A16-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.Thevalueissign-extendedtothefulllengthoftheregister.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.halfword-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanI-typeinstruction.

LoadHalfword(Unsigned)

GeneralForm:LHU RegD,Immed-12(Reg1)

Example:LHU x4,1234(x9) # x4 = Mem[x9+1234]

Description:A16-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.Thevalueiszero-extendedtothefulllengthoftheregister.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

RISC-VArchitectureSummary/Porter Page� of� 79 323

Page 80: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanI-typeinstruction.

LoadWord(Signed)

GeneralForm:LW RegD,Immed-12(Reg1)

Example:LW x4,1234(x9) # x4 = Mem[x9+1234]

Description:A32-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

RV64/RV128:Foramachinewitharegisterwidthlargerthan32-bits,thevalueissign-extendedtothefulllengthoftheregister.

Encoding: ThisisanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 80 323

Page 81: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

LoadWord(Unsigned)

GeneralForm:LWU RegD,Immed-12(Reg1)

Example:LWU x4,1234(x9) # x4 = Mem[x9+1234]

Description:Thisinstructionisonlyavailablefor64-bitand128-bitmachines.

A32-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.Thevalueiszero-extendedtothefulllengthoftheregister.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Inamachinewith32-bitregisters,neithersign-extensionnorzero-extensionisnecessaryforvaluethatisalready32bitswide.Thereforethe“signedload”instruction(LW)doesthesamethingasthe“unsignedload”instruction(LWU),makingLWUredundant.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanI-typeinstruction.

LoadDoubleword(Signed)

GeneralForm:LD RegD,Immed-12(Reg1)

Example:LD x4,1234(x9) # x4 = Mem[x9+1234

RISC-VArchitectureSummary/Porter Page� of� 81 323

Page 82: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Description:Thisinstructionisonlyavailablefor64-bitand128-bitmachines.

A64-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.Signextensionoccursformachineswith128-bitregisters.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanI-typeinstruction.

LoadDoubleword(Unsigned)

GeneralForm:LDU RegD,Immed-12(Reg1)

Example:LDU x4,1234(x9) # x4 = Mem[x9+1234]

Description:Thisinstructionisonlyavailablefor128-bitmachines.

A64-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.Thevalueiszero-extendedto128-bits.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

RISC-VArchitectureSummary/Porter Page� of� 82 323

Page 83: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanI-typeinstruction.

LoadQuadword

GeneralForm:LQ RegD,Immed-12(Reg1)

Example:LQ x4,1234(x9) # x4 = Mem[x9+1234]

Description:Thisinstructionisonlyavailablefor128-bitmachines.

A128-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanI-typeinstruction.

StoreByte

GeneralForm:SB Reg2,Immed-12(Reg1)

Example:SB x4,1234(x9) # Mem[x9+1234] = x4

RISC-VArchitectureSummary/Porter Page� of� 83 323

Page 84: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Description:An8-bitvalueiscopiedfromregisterReg2tomemory.Theupper(moresigni2icant)bitsinReg2areignored.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Thereisnoalignmentissueandthisinstructionwillexecuteatomically.Encoding: ThisisanS-typeinstruction.

StoreHalfword

GeneralForm:SH Reg2,Immed-12(Reg1)

Example:SH x4,1234(x9) # Mem[x9+1234] = x4

Description:A16-bitvalueiscopiedfromregisterReg2tomemory.Theupper(moresigni2icant)bitsinReg2areignored.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.halfword-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanS-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 84 323

Page 85: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

StoreWord

GeneralForm:SW Reg2,Immed-12(Reg1)

Example:SW x4,1234(x9) # Mem[x9+1234] = x4

Description:A32-bitvalueiscopiedfromregisterReg2tomemory.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

RV64/RV128:Foramachinewitharegisterwidthlargerthan32-bits,theupperbitsoftheregisterareignored.

Encoding: ThisisanS-typeinstruction.

StoreDoubleword

GeneralForm:SD Reg2,Immed-12(Reg1)

Example:SD x4,1234(x9) # Mem[x9+1234] = x4

Description:Thisinstructionisonlyavailablein64-bitand128-bitmachines.

A64-bitvalueiscopiedfromregisterReg2tomemory.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.Fora128-bitmachinetheupperbitsoftheregisterareignored.

RISC-VArchitectureSummary/Porter Page� of� 85 323

Page 86: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.doubleword-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanS-typeinstruction.

StoreQuadword

GeneralForm:SQ Reg2,Immed-12(Reg1)

Example:SQ x4,1234(x9) # Mem[x9+1234] = x4

Description:Thisinstructionisonlyavailablein128-bitmachines.

A128-bitvalueiscopiedfromregisterReg2tomemory.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

Encoding: ThisisanS-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 86 323

Page 87: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Commentary:Forthe“store”instructions,theRISC-VdocumentationdoesnotmakeitclearwhethertheaddressisformedusingReg1orReg2,norwhetherthesourceregisterisReg1orReg2.However,basedontheassumptionthatthesameregisterwillbeusedforaddresscalculationinboth“load”and“store”instructionsandoncommentselsewhere,weconcludetheorderiswhatisshownhere.

Commentary:TheRISC-Vdocumentationsuggeststhatassemblerexpectsthememoryaddressoperandtobethesecondoperand,notthe2irst.Thisviolatesthegeneralpatterninassemblycodethatdatamovementistowardtheleft,i.e.,fromthesecondoperandtothe2irst.

Commentary:Byconvention,registerx2isusedasastacktoppointer(SP).However,theRISC-Varchitecturedoesnotcontainanyspecialinstructionsformanipulatingthestack;registerx2istreateduniformlytoallotherregistersbytheISA,eventhoughtheassemblermaysupportsomeshorthandnotationsthatfacilitateusex2asastackpointer.

Compilersfortraditionalprogramminglanguagestypicallyplacelocalvariablesinastackframe(i.e.,anactivationrecord)whichisconvenientlyaccessedusingthestacktoppointer.Toreadandwritesuchlocalvariables,the“load”and“store”instructionswillnaturallyuseoffsetsfromx2:

LB x5,offset(x2) # Load from variableSB x5,offset(x2) # Store into variableADDI x2,x2,4 # Pop 4 bytes off stackLW x5,0(x2) # Fetch word on stack top

Assumingthatstackframestendtohavemodestsizes,theRISC-Vinstructionset,whichutilizes12-bitoffsets(-2,048..+2,047),workswell.

Forglobal(i.e.,sharedorcommon)variables,oneapproachistokeepasinglepointertoablockofglobalvariables.Byconvention,registerx3isoftenusedassucha“globalbasepointer.”Again,assumingtheamountofmemoryrequiredfortheglobalvariablesismodest,the12-bitoffsetworkswell.

Incaseswherethedesiredoffsetsexceedthe12-bitlimit,the“loadupperimmediate”instruction(LUI)comestotherescue.RecallthatLUItakesa20-bitimmediatevalue,shiftsitleft12bits,andmovesitintoaregister.

RISC-VArchitectureSummary/Porter Page� of� 87 323

Page 88: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Asanexample,thefollowinginstructionsequencecanbeusedtoloadavaluefromalocationoffsetfromabasepointer(inthiscasex2),whentheoffsetexceedsthe12-bitlimit.

LUI x5,Immed-20 # Grab the upper 20 bitsADD x5,x5,x2 # Add the base pointerLB x5,Immed-12(x5) # Add the lower 12 bits

Asimilarsequencecanbeusedfor“store”instructions,althoughanotherregisterwouldberequired.

Notethatsomecaremustbetakenwhenbreakinga32-bitoffsetinto12and20bitpiecesbecauseofthesign-extensionthatwilloccurwiththe12-bitportionwhenitisaddedin.

Commentary:Insomecasesitwillbenecessarytousethe“load”or“store”instructionstoaccessarbitrarylocationsinmemory.

Insomecases,aPC-relativeaddressispreferred;inothercasesanabsoluteaddressispreferred.

Anyaddresswithina32-bitaddressspacecanbeaccessedusinga32-bitoffsetfromthePC,sincea32-bitoffset(signedornot)fromthePCwillwraparoundfrom0xFFFFFFFFto0x00000000.Any32-bitoffsetcanbebrokenintotwopieces:a20-bitupperportionanda12-bitlowerportion.

Forcaseswhereanabsoluteaddress(i.e.,notaPC-relativeaddress)isdesired,recallthatthe“LoadUpperImmediate”instruction(LUI)containsa20-bitimmediatevaluewhichisshiftedleft12bitsandmovedintoaregister.

Forexample,toloadahalfwordfromanarbitrarylocationinmemory,aninstructionsequencelikethiscanbeused:

LUI x4,Immed-20LH x4,Immed-12(x4)

(Inthecaseofa“store”instruction,asecondregisterwouldbeneeded.)

RISC-VArchitectureSummary/Porter Page� of� 88 323

Page 89: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Asmentionedelsewhere,caremustbetakenwhenbreakinga32-bitoffsetinto12and20bitpiecesbecauseofthesign-extensionthatwilloccurwiththe12-bitportionwhenitisaddedin.

ForcaseswhereaPC-relativeaddressisdesired,recallthatthe“AddUpperImmediatetoPC”instruction(AUIPC)containsa20-bitimmediatevaluewhichisshiftedleft12bits,addedtothePC,andstoredinaregister.

Forexample,toloadahalfwordusinganarbitraryPC-relativeoffset,aninstructionsequencelikethiscanbeused:

AUIPC x4,Immed-20LH x4,Immed-12(x4)

PC-relativeaddressesarepreferredinsituationswhere(1)thecodemaytherelocatedtoarbitrarylocationsafterlink-time;(2)theloadinstructionandthetargetareassembledinthesame2ileandtheresomereasontocompleteinstructiongenerationbeforelink-time;and(3)theaddressspaceexceeds32-bits(4GiBytes),butthetargetlocationlieswithin±2GiBytesoftheloadinstruction.

IntegerMultiplyandDivide

TheRISC-Vspecmakesthemultiplyanddivideinstructionsoptional.Thissectiondescribestheseinstructions,whichmayormaynotbeimplemented.(TherewasachangebetweenRISC-Vversion1.9and2.0;wedescribethemorerecentversionhere.)

Multiply

GeneralForm:MUL RegD,Reg1,Reg2

Example:MUL x4,x9,x13 # x4 = x9*x13

Description:ThecontentsofReg1ismultipliedbythecontentsofReg2andtheresultisplacedinRegD.

RISC-VArchitectureSummary/Porter Page� of� 89 323

Page 90: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

RV32/RV64/RV128:Regardlessofthesizeoftheregisters,theresultoftheirmultiplicationwillbetwiceaslarge,andthereforerequire2registerstocontain.Thisinstructioncapturesthelower-orderhalfoftheresultandmovesitintothedestinationregister.Seethecommentaryandtheothermultiplyinstructionsfortheupperhalf.

Comments:Thereisnodistinctionbetweensignedandunsigned;theresultisidentical.Over2lowisignored.

Encoding: ThisisanR-typeinstruction.

BinaryMultiplication–UpperBits,Signedvs.Unsigned

Whenevertwovaluesaremultiplied,theresultmayrequireuptotwiceasmanybitstorepresent.Forexample,considermultiplyingthesetwo8bitvalues:

1101 0101 1011 1011

Ifweviewtheseasunsignednumbers,wegetthisresult: 1101 0101 = 213 (unsigned) × 1011 1011 = 187 (unsigned)1001 1011 1001 0111 = 39,831 (unsigned)

Ifweviewtheseassignedvalues,wegetthisresult: 1101 0101 = -43 (signed) × 1011 1011 = -69 (signed)0000 1011 1001 0111 = 2,967 (signed)

Ifweviewonevalueassignedandtheotherasunsigned,wegetthisresult: 1101 0101 = -43 (signed) × 1011 1011 = 187 (unsigned)1110 0000 1001 0111 = -8,041 (signed)

Regardlessofwhetherweareinterpretingtheoperandsassignedorunsignednumbers,notethattheleast-signi2icanthalfoftheresultisalwaysthesamebits.Thisisnotacoincidence;itisalwaystrue.

However,themost-signi2icanthalfcanbedifferent,asthesethreecasesprove.

RISC-VArchitectureSummary/Porter Page� of� 90 323

Page 91: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Forthisreason,theremustbethreeadditionalmultiplyinstructionstoobtainthemost-signi2icanthalfoftheresult:

MULH –signedoperands MULHU –unsignedoperands MULHSU –onesignedoperandandoneunsignedoperand

Wemaywishtodetectover2low,i.e.,tomakesuretheresultissmallenoughto2itintothesamenumberofbitsastheoperands.Foranunsignedresult,wecanchecktomakesurethebitsofthemost-signi2icanthalfareallzero.Forasignedresult,wemustchecktomakesurethatthebitsofthemost-signi2icanthalfareallthesameasthesignbitoftheleast-signi2icanthalf.

Multiply–HighBits(Signed)

GeneralForm:MULH RegD,Reg1,Reg2

Example:MULH x4,x9,x13 # x4 = HighBits(x9*x13)

Description:ThecontentsofReg1ismultipliedbythecontentsofReg2andthemost-signi2icanthalfoftheresultisplacedinRegD.Bothoperandsandtheresultareinterpretedassignedvalues.

Encoding:ThisisaR-typeinstruction.

Multiply–HighBits(Unsigned)

GeneralForm:MULHU RegD,Reg1,Reg2

Example:MULHU x4,x9,x13 # x4 = HighBits(x9*x13)

Description:ThecontentsofReg1ismultipliedbythecontentsofReg2andthemost-signi2icanthalfoftheresultisplacedinRegD.Bothoperandsandtheresultareinterpretedasunsignedvalues.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 91 323

Page 92: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Multiply–HighBits(SignedandUnsigned)

GeneralForm:MULHSU RegD,Reg1,Reg2

Example:MULHSU x4,x9,x13 # x4 = HighBits(x9*x13)

Description:ThecontentsofReg1ismultipliedbythecontentsofReg2andthemost-signi2icanthalfoftheresultisplacedinRegD.Oneoperandisinterpretedassignedandoneoperandisinterpretedasunsignedandtheresultisinterpretedasasignedvalue.

Thespecsuggeststhat: Reg2=multiplier=signed Reg1=multiplicand=unsignedbutthisinterpretationisalsoapossibility??? Reg2=multiplier=unsigned Reg1=multiplicand=signed

Encoding: ThisisanR-typeinstruction.

RecommendedUsage:Typically,theprogrammerwillwanttoobtainthefullresultofamultiplication,i.e.,bothupperhalfandlowerhalves.Thisrequirestwomultiplyinstructions.

Forexample,thefollowingsequence

MULH x4,x9,x13 # compute upper halfMUL x5,x9,x13 # compute lower half

willplacetheresultintheregisterpairx4:x5.

Itisrecommendedthattheinstructionsalwaysbeplacedinthisorder,i.e.,withtheupperhalfcomputed2irstandthelowerhalfsecond.Insomeimplementations,theexecutionunitmayrecognizethiscommonprogrammingidiomand,atthemicro-architecturallevel,fusethesetwoinstructionsintoasinglemultiplicationoperation,therebyimprovingperformance.

RISC-VArchitectureSummary/Porter Page� of� 92 323

Page 93: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

AdditionalMultiplyInstructionFor64-bitand128-bitMachines

64-bitand128-bitmachinesalsoaddanadditionalmultiplyinstruction(MULW)toproperlyperform32-bitmultiplication,asitwouldbedoneona32-bitmachine.

Tounderstandwhythisisnecessary,considera64-bitmachinebeingusedtoexecute32-bitcodeoperatingon32-bitintegers.ARV64machinewillstoreevery32-bitvalueinthelower-orderbitsofa64-bitregister,withtheupper32bitscontainingthesignextensionofthelower-order32bits.

Butwhenwemultiplytwo32-bitvalues,theresultmightover2lowandbelargerthan32-bits.Theresultwedesireisthelow-order32bitsofthecorrectanswer,withasignextensionintheupper32bits.So,toproperlyemulatethebehaviorofa32-bitmultiply,wedon’tactuallywantthemathematicallycorrectresult.

Let’slookatanexample.Tomakeourexamplemanageable,let’sconsideramachinewith16-bitregisters,whicharebeingusedtocontain8-bitvalues(ratherthan64-bitregisterscontaining32-bitvalues).

Imaginethatwewishtomultiply117×94.Bothnumberscanberepresentedas8-bitsignedvalues,buttheresult(10,998)cannotbe.

0000 0000 0111 0101 = +117× 0000 0000 0101 1110 = +94 0010 1010 1111 0110 = +10,998

Thisisasituationwhereover2lowoccurs,inthesensethattheresultcannotberepresentedinonly8bits.Inordertoemulatean8-bitmachine,wewantthelow-order8bitsofthecorrectresult,withsignextensionintheupper8bits,likethis:

1111 1111 1111 0110

TheMULinstructionwouldgiveusthemathematicallycorrectresult,namely

0010 1010 1111 0110

Butwedon’twantthemathematicallycorrectresult.Insteadweneedanewinstructiontogiveustheappropriatevalueforemulatingthe8-bitmultiplyoperation,namelytherightlower-orderbits,withsignextensionintheupperpartoftheregister:

RISC-VArchitectureSummary/Porter Page� of� 93 323

Page 94: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

1111 1111 1111 0110

MultiplyWord

GeneralForm:MULW RegD,Reg1,Reg2

Example:MULW x4,x9,x13 # x4 = x9*x13

Description:ThecontentsofReg1ismultipliedbythecontentsofReg2andtheresultisplacedinRegD.Onlythelowerorder32-bitsoftheresultareused;thelower32bitsaresignedextendedtothefulllengthoftheregister.

Comment:Thisinstructionisusedtoproperlyemulate32-bitmultiplicationona64-bitor128-bitmachine.

Notethatonlytheleast-signi2icant32bitsofReg1andReg2canpossiblyaffecttheresult.

Ifyouwanttheupper32-bitsofthefull64-bitresultusetheMULinstructionona64-bitmachine.

RV32/RV64/RV128:Thisinstructionisonlyavailableon64-bitand128-bitmachines.

Encoding: ThisisanR-typeinstruction.

Commentary:Considerthefollowingsequencewhenexecutedona32-bitmachine:

MULH x4,x9,x13 # compute upper halfMUL x5,x9,x13 # compute lower half

Itwillplacethe64-bitresultintheregisterpairx4:x5.

Nowconsiderexecutingthissamesequenceona64-bitmachine.Thex5registerwillcontainthefull64-bitvalue,nota32-bit,sign-extendedvalue.Thex4registerwillcontainnothingbutmeaninglesssignbits.

RISC-VArchitectureSummary/Porter Page� of� 94 323

Page 95: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Previouslywesaidthat32-bitcodecouldbeexecutedona64-bitmachinewithnochange:Theideawasthattheupper32bitsoftheregistersaresimplyignored.Thisisaclearexception:inthissequencevalidbitsareplacedintheupper32bits.Therefore,thecodesequencetoperforma32-bitmultiplywith64-bitresultmustbedifferentforRV32andRV64machines.

AdditionalMultiplyInstructionFor128BitMachines

Itwouldseemlogicalfora“double”versionofMUL(withthename“MULD”)toexistforRV128machines,inanalogytotheMULWinstruction.Howeverthespecdoesnotmentionthisinstruction.

IntegerDivision–BasicConcepts

Withdivisionandremainderthereisalwaysquestionabouthownegativeoperandsaretreated.

Considerdividingabyn(thatis,a/n).

q=aDIVn #computequotientr=aREMn #computeremainder

Thereisagreementthatthequotient(q)andremainder(r)mustobeytheseequations:

a=nq+r |r|<|q|

Manylanguages(C,C++,Java)perform“truncateddivision”:

q=trunc(a/n)r=a-ntrunc(a/n)

whichproducestheseresults:

7 / 3 = 2 7 % 3 = 1-7 / 3 = -2 -7 % 3 = -1 7 / -3 = -2 7 % -3 = 1 -7 / -3 = 2 -7 % -3 = -1

Anotherreasonablede2initionis“2looreddivision”:

RISC-VArchitectureSummary/Porter Page� of� 95 323

Page 96: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

q=⌊a/n⌋r=a-n⌊a/n⌋

whichproducesthefollowingresults.Thedot(•)indicatesdifferenceswithtruncateddivision.

7 / 3 = 2 7 % 3 = 1-7 / 3 = -3 • -7 % 3 = 2 • 7 / -3 = -3 • 7 % -3 = -2 •-7 / -3 = 2 -7 % -3 = -1

Thereisalsoade2initioncalled“Euclideandivision”,inwhichtheremainderisnevernegative.Thedot(•)indicatesdifferenceswithboththepreviousde2initions.

7 / 3 = 2 7 % 3 = 1-7 / 3 = -3 -7 % 3 = 2 7 / -3 = -2 7 % -3 = 1-7 / -3 = 3 • -7 % -3 = 2 •

TheRISC-VspecmakesnocommitmentregardingtheexactmeaningofDIVandREM.ThefollowingquotefromWikipediaispertinent:

“…Euclideandivisionissuperiortotheotheronesintermsofregularityandusefulmathematicalproperties,although2looreddivision…isalsoagoodde2inition.Despiteitswidespreaduse,truncateddivisionisshowntobeinferiortotheotherde2initions.”

— DaanLeijen,DivisionandModulusforComputerScientists

DivisionErrorConditions

RISC-Vsupportsbothsigneddivision(i.e.,dividingasignednumberbyasignednumber)andunsigneddivision(i.e.,dividinganunsignednumberbyanunsignednumber).Withmultiplication,youcouldmixsignedandunsigned;notsowithdivision.

Concerningthesizesoftheresult,notethatthefollowingmusthold,since|n|≥1:

|q|≤|a||r|<|a|

RISC-VArchitectureSummary/Porter Page� of� 96 323

Page 97: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Thus,iftheoperands(aandn)are32-bits,thentheresults(qandr)willalmostalways2itinto32-bits.

Thereisexactlyoneexceptioninwhichtheresultswillnot2it.Thisiscalled“divisionover2low”anditcanonlyoccurwithsigneddivision.

LetMrepresent-231,whichisthemostnegativenumberrepresentablein32-bits.IfwedivideMby-1,theresultis+231,whichisonegreaterthanthelargest32-bitnumber.

TheRISC-Vspeci2iesthatdivisionover2lowwillbehandledasfollows:

CorrectResult: q=+231 r=0 ActualResult: q=-231 r=0

Theotherwell-knownissueiswithdivisionbyzero.Thespecsaysthattheresultwillbe:

q=0xFFFFFFFF r=a

Thiscanbeinterpretedas:

SignedDivide-By-ZeroResult: q=-1 r=a

UnsignedDivide-By-ZeroResult: q=+232-1 i.e.,themaximumunsignednumber r=a

Ourdiscussionoferrorconditionsisfor32-bitdivision,butnaturallyextendsandappliesto64-bitand128-bitdivision.

Bothdivide-by-zeroanddivision-over2lowerrorsoccurwithoutcausingatraporexceptionandinstructionexecutioncontinueswithnoindication.Infactthereare

RISC-VArchitectureSummary/Porter Page� of� 97 323

Page 98: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

noarithmetictrapsorexceptions.Ifsomehigh-levellanguagemandatesthatdivide-by-zeromustbecaught,thenthecompilermustinsertasingleBRANCH-IF-ZEROinstruction.Theactualresultscomputedbythehardwarearechosenbecausethesearethevaluesthatnaturallyresultfromatypicalhardwareimplementation.

Ifthehigh-levellanguagemandatesthatdivision-over2lowshouldbecaught(andtheyalloughtto!)thenatleastoneadditionaltest-and-branchinstructionmustbeexecuted.

Divide(Signed)

GeneralForm:DIV RegD,Reg1,Reg2

Example:DIV x4,x9,x13 # x4 = x9 DIV x13

Description:ThecontentsofReg1isdividedbythecontentsofReg2andthequotientisplacedinRegD.Bothoperandsandtheresultaresignedvalues.

Comments:Divide-by-zeroanddivision-over2lowresultinmathematicallyincorrectresults.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

Divide(Unsigned)

GeneralForm:DIVU RegD,Reg1,Reg2

Example:DIVU x4,x9,x13 # x4 = x9 DIV x13

Description:ThecontentsofReg1isdividedbythecontentsofReg2andthequotientisplacedinRegD.Bothoperandsandtheresultareunsignedvalues.

Comments:Divide-by-zeroproducesamathematicallyincorrectresult.Division-over2lowcannotoccur.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 98 323

Page 99: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Remainder(Signed)

GeneralForm:REM RegD,Reg1,Reg2

Example:REM x4,x9,x13 # x4 = x9 REM x13

Description:ThecontentsofReg1isdividedbythecontentsofReg2andtheremainderisplacedinRegD.Bothoperandsandtheresultaresignedvalues.

Comments:Divide-by-zeroanddivision-over2lowresultinmathematicallyincorrectresults.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

Remainder(Unsigned)

GeneralForm:REMU RegD,Reg1,Reg2

Example:REMU x4,x9,x13 # x4 = x9 REM x13

Description:ThecontentsofReg1isdividedbythecontentsofReg2andtheremainderisplacedinRegD.Bothoperandsandtheresultareunsignedvalues.

Comments:Divide-by-zeroproducesamathematicallyincorrectresult.Division-over2lowcannotoccur.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

RecommendedUsage:Oftentheprogrammerwillwanttoobtainboththequotientandremainder.ItisrecommendedthattheDIVbedone2irstandtheREMbedonesecond.

Forexample:

RISC-VArchitectureSummary/Porter Page� of� 99 323

Page 100: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

DIV x4,x9,x13 # x4 = x9 DIV x13REM x5,x9,x13 # x5 = x9 REM x13

Insomeimplementations,theexecutionunitmayrecognizethiscommonpatternandfusethesetwoinstructionsintoasingledivisionoperation,therebyimprovingperformance.

AdditionalDivideInstructionsFor64BitMachines

For64-bitmachines,thereare4additionalDIVIDE/REMAINDERinstructionstoperform32-bitoperations.Inotherwords,theabove4instructionswillperform32-bitoperationsonaRV32machinebutwillperform64-bitoperationsonanRV64machine.Thefollowing4instructionswillworkjustliketheaboveinstructionswouldworkona32-bitmachine,bysign-extendingeverythingupto64-bits.

DivideWord(Signed)

GeneralForm:DIVW RegD,Reg1,Reg2

Example:DIVW x4,x9,x13 # x4 = x9 DIV x13

RV32/RV64/RV128:Thisinstructionisonlyavailableon64-bitand128-bitmachines.

Description:ThecontentsofReg1isdividedbythecontentsofReg2andthequotientisplacedinRegD.Bothoperandsandtheresultaresignedvalues.Onlythelow-order32bitsoftheoperandsareusedandthe32-bitresultissigned-extendedto2illthedestinationregister.

Comments:Divide-by-zeroanddivision-over2lowresultinmathematicallyincorrectresults.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

DivideWord(Unsigned)

GeneralForm:DIVUW RegD,Reg1,Reg2

RISC-VArchitectureSummary/Porter Page� of� 100 323

Page 101: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Example:DIVUW x4,x9,x13 # x4 = x9 DIV x13

RV32/RV64/RV128:Thisinstructionisonlyavailableon64-bitand128-bitmachines.

Description:ThecontentsofReg1isdividedbythecontentsofReg2andthequotientisplacedinRegD.Bothoperandsandtheresultareunsignedvalues.Onlythelow-order32bitsoftheoperandsareusedandthe32-bitresultissigned-extended(???)to2illthedestinationregister.

Comments:Divide-by-zeroproducesamathematicallyincorrectresult.Division-over2lowcannotoccur.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

RemainderWord(Signed)

GeneralForm:REMW RegD,Reg1,Reg2

Example:REMW x4,x9,x13 # x4 = x9 REM x13

RV32/RV64/RV128:Thisinstructionisonlyavailableon64-bitand128-bitmachines.

Description:ThecontentsofReg1isdividedbythecontentsofReg2andtheremainderisplacedinRegD.Bothoperandsandtheresultaresignedvalues.Onlythelow-order32bitsoftheoperandsareusedandthe32-bitresultissigned-extendedto2illthedestinationregister.

Comments:Divide-by-zeroanddivision-over2lowresultinmathematicallyincorrectresults.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

RemainderWord(Unsigned)

GeneralForm:REMUW RegD,Reg1,Reg2

RISC-VArchitectureSummary/Porter Page� of� 101 323

Page 102: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter3:Instructions

Example:REMUW x4,x9,x13 # x4 = x9 REM x13

RV32/RV64/RV128:Thisinstructionisonlyavailableon64-bitand128-bitmachines.

Description:ThecontentsofReg1isdividedbythecontentsofReg2andtheremainderisplacedinRegD.Bothoperandsandtheresultareunsignedvalues.Onlythelow-order32bitsoftheoperandsareusedandthe32-bitresultissigned-extended(???)to2illthedestinationregister.

Comments:Divide-by-zeroproducesamathematicallyincorrectresult.Division-over2lowcannotoccur.Seediscussionabove.

Encoding: ThisisanR-typeinstruction.

AdditionalDivideInstructionsFor128BitMachines

Itwouldseemlogicalfor“double”versionsoftheseinstructionstoexistforRV128machines.Howeverthespecdoesnotmentionthese.

RISC-VArchitectureSummary/Porter Page� of� 102 323

Page 103: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FloatingPointing–ReviewofBasicConcepts

Thissectioncanbeskippedifyouarefamiliarwith2loatingpointarithmetic.Thediscussionhereisgeneral,andnotspeci2ictoRISC-V.

TheIEEE754-2008standarddescribeshow2loatingpointnumbersaretoberepresentedandhow2loatingpointoperationsaretobeexecutedbycomputers.

Thestandardiscomplicatedanddetailed.Thissectionismeanttobeanintroductionandisnotanexhaustivedescription.MostmodernprocessorISAsimplementtheIEEE754-2008speci2ication,butthespeci2icationhasoptionsandsomepartsarenotfullyimplementedonsomecomputers.

Thespeci2icationde2inestwoimportantdatatypes:

SinglePrecision(32-bit“2loat”values) DoublePrecision(64-bit“double”values)

Somecomputersimplementonlysingleprecision;othercomputersimplementboth.

Ineithercase,theideaistorepresentarealrationalnumberinawaysimilartoscienti2icnotation.Forexample,thefollowingnumberisgiveninscienti2icnotation:

6.022×1023(anapproximationtoAvogadro’sconstant)

Withonly32bitsofprecision(or64bitsfordoubleprecision),therearelimitstotheamountofprecisionandthesizeoftheexponents.Theavailablebitsareusedasfollows:

RISC-VArchitectureSummary/Porter Page� of�103 323

Page 104: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Numberofbitsusedfor… Single Double Sign 1 1 Exponent 8 11 Value 23 52 Total 32 64

Furthermore,with2loatingpointanumericalvalueisrepresentedinbinary(notdecimal)andthisintroducessomesubtletieswhengoingbackandforthbetweentheinternalbitpatternsanddecimalrepresentationswhichhumanscanread.

Everypositiveintegercanberepresentedwitha2initenumberofdigitsanda2initenumberofbits.Forexample,hereisthesamenumber,representedbothways.Ofcourse,thisnumberrequiresafewmorecharactersinbinary,buttherepresentedvalueisequal.

2,468 (decimal) 100110100100 (binary)

Wecommonlyrepresentrationalnumbersindecimalusinga“decimalpoint”,asin:

123.456

Wecanalsorepresentrationalnumbersinbinaryusinga“binarypoint”,asin:

101.0101

Withdecimalnumbers,thepositionofeachdigitisimportantandwetalkabouttheplacevalueofthedigits.Theplacevaluesareallpowersof10:

… 1000 100 10 1 1/10 1/100 1/100 1/1000 … … 103 102 101 100 10-1 10-2 10-3 10-4 …

Wecanusetheplacevaluestocomputethevalueofanumber,asyoulearnedinprimaryschool:

123.456 =(1×102)+(2×101)+(3×100)+(4×10-1)+(5×10-2)+(6×10-3) =(1×100)+(2×10)+(3×1)+(4×0.1)+(5×0.01)+(6×0.001)

RISC-VArchitectureSummary/Porter Page� of� 104 323

Page 105: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Likewise,withbinarynumbers,thevalueofeachbitisscaledaccordingtotheplacevalueofthebit.Butwithbinary,theplacevaluesareallpowersof2:

… 8 4 2 1 1/2 1/4 1/8 1/16 … … 23 22 21 20 2-1 2-2 2-3 2-4 …

Usingthis,wecanconvertbinarynumbersintodecimalnumbers:

101.0101 =(1×22)+(0×21)+(1×20)+(0×2-1)+(1×2-2)+(0×2-3)+(1×2-4) =(1×4)+(0×2)+(1×1)+(0×1/2)+(1×1/4)+(0×1/8)+(1×1/16) =4+1+1/4+1/16 =5.3125

Somerationalnumbersrequireanin2initenumberofdigitsintheirdecimalrepresentation.Forexample:

1/3=0.33333…=0.3(3)*

Likewise,somerationalnumbersmayrequireanin2initenumberofbitsintheirbinaryrepresentation.Butregardlessofwhetherwerepresentarationalnumberindecimalorbinary,thein2initestringsofdigits/bitswillsettleintoasimplerepeatingpattern.Thisistrueofallrationalnumbers,butirrationalnumbers(e.g.,𝜋,√2)donothavesuchsimpledecimalorbinaryrepresentations.Neithertheirdecimalnortheirbinaryexpansionswilleverexhibitarepeatingpattern.

Somenumbersmanyhavea2initerepresentationindecimalbutrequireanin2initesequenceinbinary.Forexample,thefollowingnumber:

4.3

requiresanin2initebinaryexpansiontorepresentit,namely:

100.01001100110011…=100.01(0011)*

Itturnsoutthateverybinarynumberwithoutarepeatingpartcanberepresentedwitha2initenumberofdecimaldigits.Furthermore,thenumberofdigitstotherightofthedecimalpointwillneverexceedthenumberofplacestotherightofthebinarypoint.Forexample:

RISC-VArchitectureSummary/Porter Page� of� 105 323

Page 106: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

101.1101(binary)=5.8125(decimal)

Turningto2loatingpointrepresentation,wehavelimitednumberofbitsavailable,whichmeanswecannotaccommodatearbitraryprecision.Noteverynumberisrepresentable,sowemustroundnumberstoanearbynumberthatisrepresentable.

Forexample,thenumber6.022×1023canonlyberepresentedapproximately,eventhoughitappearsnottohaveagreatamountofprecision.Hereistheclosestnumberthatcanberepresentedusingasingleprecision2loatingpoint:

6.02200013124147498450944×1023

Ontheotherhand,itturnsoutthatthisnumber:

2.383496609792×1012

canberepresentedexactlyusingonly32bitsingleprecision.Thenextlargestvaluethatcanberepresentedexactlyhappenstobe:

2.383496871936×1012

WecanmakethefollowingstatementsaboutIEEE754-20082loatingpointnumberrepresentation:

• Every2loatingpointnumbershasasign.Everynumberiseitherpositiveornegative.

• Therearetworepresentationsforzero:positivezero(i.e.,+0.0)andnegativezero(i.e.,-0.0).

• Therearetworepresentationsofin2inity:positivein2inity(+∞or+inf)andnegativein2inity(-∞or-inf)

• Theexponentmaybepositiveornegative,allowingbothverylargenumbersandverysmallnumbers.

• Thereisaspecialrepresentationcalled“notanumber”(“NaN”).Thisvaluecanrepresentamissingvalueortheresultofaunde2inedoperation,suchasdivide

RISC-VArchitectureSummary/Porter Page� of� 106 323

Page 107: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

byzero.Insomeimplementationstherearetwovariations,called“quietNaN”and“signalingNaN”.

• Every32-bitinteger(i.e.,everyintegerintherange-2,147,483,648to+2,147,483,647)canberepresentedexactlywitha64bitdoubleprecision2loatingpointnumber,butnotwithasingleprecision2loat.Infact,theintegerrangeisalittlegreater:every54-bitintegercanberepresentedexactlyindoubleprecision2loatingpoint.Almostalllargerintegerswillgetrounded.

• Every25-bitinteger(i.e.,everyintegerintherange-16,777,216to+16,777,215)canberepresentedexactlywitha32bitsingleprecision2loatingpointnumber.Almostallintegersoutsideofthisrangewillgetrounded.

Hereistherangeofvaluesthatcanberepresented.(Weusedecimalnotationhereandapproximatetheexactvalues.)

SinglePrecision Largestvalue: ~3.40282347×10+38 Smallestvalueabovezero: ~1.17549440×10-38 Digitsofaccuracy: about7

DoublePrecision Largestvalue: ~1.7976931348623157×10+308 Smallestvalueabovezero: ~2.2250738585072014×10-308 Digitsofaccuracy: about16

Rememberthatnoteveryvalueintheaboverangescanberepresented.Why?Recallthereisacountablein2inityofrationalnumbersjustbetweenanytwonumbers,yetwithonly32or64bits,weonlyhaveasmallnumberofuniquerepresentations.

Floatingpointarithmeticismeanttomimicmathematicalarithmetic,butitmustberememberedthattheyareonlyapproximatelythesame:

• Theexactvalueorresultofanoperationisnotalwaysrepresentable,sothecomputedanswerisoftennotmathematicallycorrect.

• Floatingpointadditionisnotalwaysassociative,duetoroundingerrors.Thatis,(x+y)+zisnotalwaysequaltox+(y+z).

RISC-VArchitectureSummary/Porter Page� of� 107 323

Page 108: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

• Floatingpointmultiplicationisnotalwaysassociative.Thatis,(x*y)*zisnotalwaysequaltox*(y*z).

• Floatingpointmultiplicationdoesnotalwaysdistributeoveradditionwiththeexactsameresults.Thatis,x*(y+z)isnotalwaysequalto(x*y)+(x*z).

However,wecansaythis:

• Floatingpointadditionandmultiplicationarecommutative,likemath.Forexample,x+y=y+x,soyoudon’thavetoworryabouttheorderofoperandsforasingleoperation.

Herearethedifferenttypesofthingsthatcanberepresentedina2loatingpointbitpattern:

•Positivezero(+0.0) •Negativezero(-0.0) •Positivein2inity(+∞or+inf) •Negativein2inity(-∞or-inf) •Not-a-number(NaN) QuietNan(qNaN) SignalingNan(sNaN)

•Normalnumbers(or“normalizednumbers”)•Denormalizednumbers(or“denormals”)

Zero–PositiveandNegative

Thereareexactlytwowaystorepresentzero,oneispositiveandtheotherisnegative.Thisisunlikemath,wherethereisonlyasinglenumbercalledzeroanditisunsigned.

Herearesomeinterestingbehaviors:

1/+0yields+∞ 1/-0yields–∞ +0willnormallycompareasequalto-0(e.g.,the==inthe“C”language) Somelanguagesprovideawaytodistinguish+0and-0.

Thereareadditionalbehaviors,suchas:

RISC-VArchitectureSummary/Porter Page� of� 108 323

Page 109: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

-0/-∞yields+0

Although+0and-0maycompareasequal,theymayalsoresultindifferentoutcomesinsomecomputations.Thischallengesourunderstandingofthemeaningof“equal”,tosaytheleast.

Thebitpatternrepresentationofzerois:

single double +0.0 0x00000000 0x0000000000000000 -0.0 0x80000000 0x8000000000000000

Notethatthe2loatingpointrepresentationfor+0.0isbit-for-bitidenticaltotherepresentationfor0inbinaryintegerrepresentation(bothsignedandunsigned).Also,-0.0isrepresentedidenticallytothemostnegativesignedinteger.

NotaNumber-NaN

Operationslikethefollowingaremathematicallyunde2inedand,whenattempted,willresultinaNaNresult,toindicatethattheresultisunde2ined.

0/0 ∞/∞ 0*∞

Otheroperationsaremathematicallyde2inedbutgiveacomplexresult.Complexnumbersarenothandledby2loatingpoint,sooperationssuchasthefollowingwillreturnNaN.

Squarerootofanegativenumber Logofanegativenumber

AnotheruseofNaNistorepresentanuninitializedormissingvalue.Ifavariableisusedbeforeitisinitialized,spuriousincorrectresultsmightoccur,butthiscanbeavoidedifthevariablecontainsNaN.

TheIEEEspecactuallymentionstwokindsofNaN:“signalingNaN”and“quietNaN”.Butusually,wejusttalkaboutNaNwithoutmakinganydistinctionaboutwhetheritissignalingorquiet.

RISC-VArchitectureSummary/Porter Page� of� 109 323

Page 110: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

A“signalingNaN”issupposedtocauseabreakinthe2lowofexecutionwhenitisencounteredinacomputation.Thatis,atraporexceptionofsomesortwilloccur,andthenormalinstructionsequencewillbeinterruptedimmediately.SignalingNaNsmightreasonablyusedforuninitializedvalues:theirusemayrepresentaprogrambugwhichneedsattention.Intheory,signalingNaNsmightalsobeusedasplaceholdersforvalues(suchascomplexnumbers)whichrequirespecialhandling.

Theideawitha“quietNaN”,isthatitcanbeusedasanoperandinarithmeticoperations.Furthermore,aquietNaNwillbepropagated.Thatis,ifoneoftheargumentsisaquietNaN,theresultwillalsobeaquietNaN.Thisallowsalengthysequenceofoperationstobeperformedquicklywithnospecialtestingforproblems.OnceaNaNappears,asaresultofsomeerror,itwillpersistinthechainofcomputations.Eachsubsequentoperationwillcompletenormally,withoutcausinganexceptionortrapeventhoughsomesortoferroroccurredearlierinthesequence.Ifanyproblemsoccuratanystepofthecomputation,the2inalresultwillbeaquietNaN.Therefore,itissuf2icienttoperformonlyasingletestforNaNaftertheentirecomputationtoseeifanyerrorsaroseatanystageofthecomputation.

ThespecdoesnotrequiresignalingNaNs;theyareoptional.OneimplementationapproachisforthehardwaretointerpretallNaNvaluesidentically,basicallyasquietNaNs.

ThereareseveralbitpatternsthatcanbeusedtorepresentNaNs,sothereisnotasinglebitpatternforNaN.

Avalueisde2inedtorepresentNaNif(1)theexponent2ieldisall1’s,and(2)thebitsofthefraction2ieldarenotallzero.(Ifthefractionbitsareallzero,thenthevalueiseither+∞or–∞.)ThesignbitofaNaNvalueisignored.

IfadistinctionbetweenquietandsignalingNaNisimplemented,thenoneofthebitsinthefraction2ieldwillbeusedtodistinguishbetweenquietandsignaling.

TheexactbitpatternsforNaNarenotfullyspeci2iedandcanvarybetweenimplementations.

Wecansaythatavaluewithallbitssettoone(i.e.,therepresentationforthesignedinteger-1whichis0xFFFFFFFFor0xFFFFFFFFFFFFFFFF)willde2initelyrepresentaNaNandwillalmostcertainlyrepresentaquietNaN.Forexample,thisall-onespatternwillbeaquietNaNforIntel,AMD,SPARC,ARM,RISC-V,etc.

RISC-VArchitectureSummary/Porter Page� of� 110 323

Page 111: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

MixingSingleandDoublePrecisionUsingNaN

Therearemanybitsinthefraction2ield,andtheonlyrequirementforNaNisthattheycannotallbezero.Thus,thereisroomtostoresomeadditionaldatawithintheNaN.SoaNaNcancarryasortof“payload”valueinthefractionbits.ThiscapabilitymayormaynotbeusedinaparticularimplementationofIEEE754-2008.

Forexample,thefraction2ieldinadoubleis52bits.AssumethatonebitisreservedtobealwayssettoindicatethatthisisaNaN,andassumethatasecondbitisreservedandusedtodistinguishbetweenaquietNaNandasignalingNaN.Thisleaves50bitsthatcanbeusedtostoreaarbitraryvalue.Noticethatthisisenoughroomtostoreanentiresingleprecision2loatingpointnumber.

Imagineamachinethatimplementsdoubleprecisionarithmeticanduses64-bitregisterstostore2loatingpointvalues.Howmightthismachinestore32-bitsingleprecisionvaluesinthesesameregisters?

Any64-bitvalueinwhichthehighorder32bitsareset,willbealwaysrecognizedasaNaN.Oneapproachtostoringasingleprecisionvalueina64bitregisteristostorethesingleprecisionvalueintheleastsigni2icantbits32bitsandall1sinthemostsigni2icant32bits.

Allsingleprecisionoperationswillonlylookattheleastsigni2icant32-bitsoftheoperandsand,fortheresultvalue,willalwayssetthemost-signi2icant32bitsto1s.

Anyaccidentalattempttoperformadoubleprecisionoperationonaregistercontainingasingleprecisionvalue,willinterpretthatoperandasaNaN.

NormalandDenormalizedNumbers

Noteverynumberisrepresentableandtherepresentablenumbersarespacedoutonthenumberline.Soeachpossible2loatingpointvalueisseparatedbyanumericaldistancefromthenextsmallestnumberandfromthenextlargestnumber.Asthenumbersgetsmallerandclosertozero,thespacinggetssmallerandthenumbersareclosertogether.Asthenumbersgetlarger,thespacingisfartherapart.

Forexample,thefollowingnumbersdifferbyaverysmallamount:

RISC-VArchitectureSummary/Porter Page� of� 111 323

Page 112: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

4.567×10-25 4.568×10-25

Ontheotherhand,thesetwonumbersdifferbyaverylargeamount:

4.567×10+25 4.568×10+25

Howeverinbothexamplesabove,theaccuracyisthesame:4digitsofprecision.

However,thereareonlyalimitednumberofbitsavailabletorepresenttheexponents.Exponentscannotcontinuetogetmorenegativeandwecannotrepresentsmallerandsmallernumbers,evermoreclosetozero.Therefore,thispatternofthe2loatingpointnumbersbecomingspacedevermorecloselyastheygetcloserandclosertozerocannotcontinue.Somethinghastochangeasthenumbersgetsmallerandapproachzero.

Whathappensisthatbelowsomesize,therepresentablevaluesaresimplyspaceduniformlyallthewaydowntozero.Thisistheroleofdenormalizednumbers.

Most2loatingpointnumbersare“normal”numbers.Normalnumbershaveabout7digitsofaccuracy(forsingleprecision)and16digitsofaccuracy(fordoubleprecision).Inotherwords,wecanapproximateanydesiredvaluewithabout7(or16)digitsofaccuracy.

Anotherwaytolookatdenormalizednumbersisthis:Forverysmallvalues,wecannotapproximatethevaluewithfullaccuracy.Aswegetcloserandclosertozero,wecanapproximatethetruevaluewithfewerandfewerplacesofaccuracy.Forreallytinyvalues,wemayevenbeforcedtouse0.0torepresentthevalue,essentiallylosingallaccuracy.

Wecanmakethefollowingstatementsaboutdenormalizednumbers:

• Alldenormalizednumbersareveryclosetozero.• Denormalizednumbersextendonboththepositiveandnegativesidesofzero.

• +0.0and-0.0arethemselvesrepresentedasdenormalizednumbers.• Alldenormalizednumbersareregularlyandevenlyspaced.(Exception:+0.0and-0.0haveanin2initesimaldifferenceandareconsideredequal.)

RISC-VArchitectureSummary/Porter Page� of� 112 323

Page 113: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

• Thelargestdenormalizednumberisjustlessthanthesmallestpositivenormalnumber.

• Likewise,themostnegativedenormalizednumberisjustgreaterthantheleastnegativenormalnumber.

• Itisgenerallysafetoignorethedistinctionbetweennormalizedanddenormalizednumberswhenusing2loatingpointinyourapplications.

Therearerulesfordeterminingtheprecisionoftheresultsofanarithmeticcalculationinvolvingscienti2icnotation.Butifverysmallvalues(i.e.,denormalizednumbers)ariseduringacomputation,thenyourassumptionsaboutprecisionwillbeviolatedandthe2inalresultswillhavereducedprecision.Insomecases,the2inalresultwillbeameaningless,incorrectvalue.

Warning:AlwaysrememberthatnumbersasrepresentedincomputersareNOTtruemathematicalnumbers.ComputerarithmeticisNOTmathematicalarithmetic.Remember:“int”sarenotintegersand“2loats”arenotrealorrationalnumbers.

Computervaluesandcomputationaremereapproximationsofmathematicallypureideals.Agoodprogrammerknowshowimportantitistounderstandandremembertheirdifferencesincreatingreliablesoftware.

RepresentationofSinglePrecisionFloatingPointValues

Asingleprecision2loatingpointnumberisrepresentedwitha32-bitwordasshownhere:

LetNbethenumberrepresented.Let“sign”bethemostsigni2icantbit.Let“exponent”bethebitpatterninbits30:23.Let“fraction”bethebitpatterninbits22:0.

Thenumberrepresentedis:

N=(-1)sign×1.fraction×2exponent

RISC-VArchitectureSummary/Porter Page� of� 113 323

Page 114: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

The2irsttermsimplygivesthesignofthenumber:0=positiveand1=negative.Notethatthemostsigni2icantbitholdsthesignbitforboth2loatingpointnumbersandsignedintegers.

Whenconvertinganumbersuchas101.0101into2loatingpointformat,youshould2irstshiftthedecimalplacetojustaftertheleftmost1bit.Everynumber(exceptzero)willalwayscontainatleastasingle1bit.Thus,themostsigni2icantbitmustbea1andrepresentingitisredundant.Thisexplainswhywepre2ixthefractionalpartwith“1.”.(Thistrickofmakingonebitimplicitdoesn’tworkwithdecimalnumbers:theleadingdigitcanbeanythingexcept0,sowecannotmakeitimplicit.)

Thereare8bitsintheexponent2ield.Theinterpretationofthe“exponent”bitpatternsis:

BitPattern MeaningofExponentField 0000 0000 DenormalizedNumbers,includingzero 0000 0001 -126 ... ... 0111 1110 -1 0111 1111 0 1000 0000 +1 ... ... 1111 1110 +127 1111 1111 InZinity,Not-a-Number

Fornormalizednumbers,theexponenthasaneffectiverangeof-126..+127.

Thesmallestpositivenormalizednumberis:

Inbinary: 1.00000000000000000000000×2-126 (Thereare23zeros)

Inbits: 0x00800000=00000000100000000000000000000000

Decimalapproximation: 1.17549435×10-38

RISC-VArchitectureSummary/Porter Page� of� 114 323

Page 115: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Thelargestnormalizednumberis:

Inbinary: 1.11111111111111111111111×2+127 (Thereare1+23ones)

Inbits: 0x7F7FFFFF=01111111011111111111111111111111

Decimalapproximation: 3.4028235×10+38

Iftheexponentisallones(i.e.,11111111),thenthevalueofthefractionmatters.Ifthefractionisallzeros,thenthevalueis+∞or–∞dependingonthesignbit.

+∞: 0x7F800000=01111111100000000000000000000000

-∞: 0xFF800000=11111111100000000000000000000000

Iftheexponentisallones(i.e.,11111111)andthevalueofthefractionisnotallzeros,thenNaNisrepresented.TherearemultiplerepresentationsthataretobeinterpretedasNaNvalues.Thecanonical,preferredrepresentationofNaNisoftenthis:

NaN(typical): 0xFFFFFFFF=11111111111111111111111111111111

Iftheexponent2ieldisallzeros(i.e.,00000000),thenthevalueisadenormalizednumber.Thevalueofthenumberis:

N=(-1)sign×0.fraction×2-126

Noticethattheleadingimplicit“1”bitisnolongerassumed;itisnow“0”.Alsotheexponentisalways-126,whichhappenstobethesmallestexponentfornormalizednumbers.

Herearesomesamplenumbersthatmayhelpexplaindenormalizednumbers:

RISC-VArchitectureSummary/Porter Page� of� 115 323

Page 116: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Smallestnormalizednumber: 1.00000000000000000000000×2-126 (24bitsofprecision) Largestdenormalizednumber: 0.11111111111111111111111×2-126 (23bitsofprecision) … Randomdenormalizednumber: 0.00000000001100101110101×2-126 (13bitsofprecision) … Smallestdenormalizednumber: 0.00000000000000000000001×2-126 (1bitofprecision) +0.0: 0.00000000000000000000000×2-126 (0bitsofprecision)

RepresentationofDoublePrecisionFloatingPointValues

Doubleprecision2loatingpointnumbersarerepresentedusingananalogousscheme.Theonlydifferenceisthenumberofbitsinthe“exponent”and“fraction”2ields.

Hereistherepresentationofa64-bitdoubleprecision2loatingpointvalue:

Thereare11bitsintheexponent2ield,insteadof8asinsingleprecision.Theinterpretationofthe“exponent”bitpatternsis:

RISC-VArchitectureSummary/Porter Page� of� 116 323

Page 117: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

BitPattern MeaningofExponentField 000 0000 0000 DenormalizedNumbers,includingzero 000 0000 0001 -1022 ... ... 011 1111 1110 -1 011 1111 1111 0 100 0000 0000 +1 ... ... 111 1111 1110 +1023 111 1111 1111 InZinity,Not-a-Number

Fornormalizednumbers,theexponenthasaneffectiverangeof-1022..+1023.

Thesmallestpositivenormalizednumberis:

Inbinary: 1.000000000...00000000000×2-1022 (Thereare52zeros)

Inbits: 0x0010000000000000

Decimalapproximation: 2.2250738585072014×10-308

Thelargestnormalizednumberis:

Inbinary: 1.111111111...11111111111×2+1023 (Thereare1+52ones)

Inbits: 0x7FEFFFFFFFFFFFFF

Decimalapproximation: 1.7976931348623157×10+308

Thesmallestdenormalizednumberis:

Inbinary: 0.000000000...00000000001×2-1022

Inbits: 0x0000000000000001

Decimalapproximation: 5.0×10-324

RISC-VArchitectureSummary/Porter Page� of� 117 323

Page 118: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

OtherFloatingPointSizes

Inadditiontothewellknownsingleprecision(32-bit)anddoubleprecision(64-bit)sizes,theIEEE754-2008standardalsodescribesthese2loatingpointsizes.Theyaremuchlesscommon.

16bits(halfprecision) 128bits(quadrupleprecision) 256bits(octupleprecision)

Thereisalsomentionofdecimal-basedrepresentations.

Rounding

Theresultofacomputationmaybeanumberthatisnotpreciselyrepresentable.Forexample,theadditionoftwodoublenumbersmaynotbeexactlyrepresentableasadouble.

Forexample,herearetwonumberswith5digitsofprecision.Theirsumhas8digitsofprecision.

12.345 =1.2345×101 + .067891 =6.7891×10-2 12.412891 =1.2412891×101

TheIEEEspecsaysthattheexactresultshouldbe“rounded”toanumberthatcanberepresented.Forexample,whentwodoublesareadded,theirresultwillbesomedoublevalue.

Intheaboveexample,tobringtheresultbackdownto5digitsofprecision,someaccuracymustbesacri2iced.Likewise,when2loatingpointoperationsareperformed,theremaybealossofaccuracyasaresultofrounding.

TheIEEEspeclistsseveralwaysthatavaluecanberoundedtosomethingthatcanberepresented:

RISC-VArchitectureSummary/Porter Page� of� 118 323

Page 119: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

•Roundtothenearestnumber (Foratie,thevaluewithazerointheleastsigni2icantbitischosen.) •Roundtowardzero(i.e.,truncate) •Roundtowardpositivein2inity(i.e.,roundup) •Roundtowardnegativein2inity(i.e.,rounddown)

Inordertoperformroundingcorrectly,acomputermayneedtoperformcalculations(e.g.,multiplication)withgreaterprecisionto2irstcomputethecorrectvalue.Then,asthe2inalstepinthecalculation,thevaluemustbeproperlyroundedto2itintotheavailable2loatingpointbits.

Hereisanotherexample,showingthattheentireeffectofanoperationcanbelostastheresultofrounding. 12.345 =1.2345×101 + .00000067891 =6.7891×10-7 12.34500067891 =1.234500067891×101

Roundingtoavaluewiththesameprecision(eitherroundingdown,roundingtozero,orroundingtothenearest)givestheinitialoperand,unchanged.Hereistheroundedresult:

12.345 =1.2345×101

TheFloatingPointExtensions

TheRISC-Vspecdescribestheseextensionstosupport2loatingpointarithmetic

F–Singleprecision2loatingpoint(32bitvalues) D–Doubleprecision2loatingpoint(64bitvalues) Q–Quadprecision2loatingpoint(128bitvalues)

The“D”extensionisasupersetof“F”;whendoubleprecisionisimplemented,allinstructionsoperatingonsingleprecisionvalueswillalsobeincluded.

Likewise,the“Q”isasupersetof“D”;whenthe“Q”extensionisimplemented,all“F”and“D”instructionswillalsobeimplemented.

RISC-VArchitectureSummary/Porter Page� of� 119 323

Page 120: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Inthisdocument,weintermixthedescriptionsof“F”,“D”,and“Q”extensions,ratherthandescribeoneaftertheother.

FloatingPointRegisters

Thereare322loatingpointregisters,namedf0,f1,…f31.

Inthe“F”extension,eachregistercanholdonesingleprecision2loatingpointvalue.Thus,eachregisteris32bitswide.Allregistersfunctionidentically;thereisnothingspecialaboutf0,asthereiswithx0oftheintegerregisters.

Inthe“D”extension,theseregistersareall64bitswideandcanholdeitherasingleprecisionvalueoradoubleprecisionvalue.Iftheregistercontainsasingleprecision2loatingpoint,thenthemostsigni2icant32bitsoftheregisterwillallbe1.Wheninterpretedasadouble,suchavaluewillberecognizedasaNaN.

Inthe“Q”extension,theregistersare128bitswide.Eachregistercanholdeitherasingle,double,orquadprecision2loatingpointvalue.Inthecaseofsmallervalues,theupperbitswillbesetto1sothatthevaluewillappeartobeNaNifinterpretedasalargerprecisionvalue.

Inadditionthereisa“FloatingPointControlandStatusRegister”(called“FCSR”).

TheFCSRisdividedintothefollowing2ields:

Bits Widthinbits Description 0 1 NX:Inexact 1 1 UF:Under2low 2 1 OF:Over2low 3 1 DZ:DivideByZero 4 1 NV:InvalidOperation 5:7 3 FloatingPointRoundingMode(FRM) 8:31 24 (unused)

TheFloatingPointControlandStatusRegister(FCSR)isoneofthe“ControlandStatusRegisters(CSRs),whicharediscussedelsewhereinthisdocument.Thisparticularregisterisfreelyaccessibleregardlessoftheoperatingmode(user,supervisor,ormachinemode),sothisneednotbeaconcernhere.

RISC-VArchitectureSummary/Porter Page� of� 120 323

Page 121: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

TherearesomegeneralpurposeinstructionsusedforreadingandwritingtheCSRsandthesearediscussedelsewhere.

Collectively,the5singlebit2ieldsarereferredtoasthe“FloatingPointFlags”(FFLAGS):

FFLAGS(FloatingPointFlags): NX:Inexact UF:Under2low OF:Over2low DZ:DivideByZero NV:InvalidOperation

These52lagscanberesettozerobywritingtotheFloatingPointControlandStatusRegister(FCSR).Theymaybesettoonefromtime-to-timeas2loatingpointinstructionsareexecuted.Onceset,theywillremainset.Therefore,thesebitscanbequeriedafterasequenceofinstructionstodetermineifanyofseveralunusualconditionshasoccurredduringthepreviousinstructionsequence.

Themeaningofeachbitisstraightforward.

Iftheresultofanoperationhadtoberounded,thenthe“NX:Inexact”bitwillbeset.Iftheresultistoosmallto2itinanormalizedformandisalsoinexact,thenthe“UF:Under2low”bitwillbesetandtheresultwillbereturnedasadenormalizedvalue.Iftheresultistoolargetoberepresented,thenthe“OF:Over2low”bitwillbesetand+∞or-∞willusuallybereturnedastheresult.The“DZ:DivideByZero”bitissetforoperationslike1/0andlog(0)and+∞or-∞willbereturnedastheresult.Certainoperationsareconsideredinvalid,suchas“squarerootofanegativenumber”.The“NV:InvalidOperation”bitwillbesetandNaN(bydefault,quietNaN)willbereturned.

FloatingPointinstructionsthathaveproblemswillsettheFFLAGSbitsbutwillnevercauseatraporexception.Instead,instructionexecutionwillcontinueuninterrupted.

IftheresultofaninstructionisNaN,thenaquietNaNwillbeinserted,unlessotherwisedocumented.RISC-VwillinsertthefollowingrepresentationforaquietNaN:

RISC-VArchitectureSummary/Porter Page� of� 121 323

Page 122: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

SinglePrecisionquietNaN: 0x7FC00000=0 11111111 10000000000000000000000

The“unused”zerobitsmaybeusedforotherRISC-Vextensions,butnormallytheywillappeartobezerowhenread.Floatingpointcodeshouldignorethevalueofthis2ieldandnotattempttochangeit.

Thereareseveraldifferentwaysthata2loatingpointresultcanberoundedto2itintotheavailableprecision:

•Roundtothenearestnumber (Foratie,thevaluewithazerointheleastsigni2icantbitischosen.) •Roundtowardzero(i.e.,truncate) •Roundtowardnegativein2inity(i.e.,rounddown) •Roundtowardpositivein2inity(i.e.,roundup)

Theroundingmodecanbeeitherstaticordynamic.

Thereisa“RoundingMode”(RM)2ieldineach2loatingpointinstruction.Instaticmode,theroundingmethodtobeusedisencodedintheRMbitswithintheinstruction,accordingtothesecodes:

000 Roundtothenearestnumber (Foratie,thevaluewithazerointheleastsigni2icantbitischosen.) 001 Roundtowardzero(i.e.,truncate) 010 Roundtowardnegativein2inity(i.e.,rounddown) 011 Roundtowardpositivein2inity(i.e.,roundup) 100 Roundtothenearestnumber (Foratie,roundawayfromzero.) 101 invalid 110 invalid 111 Usedynamicmode:ConsulttheCSRfortheroundingmethod.

Withthedynamicmode,theroundingmodewillbedeterminedbytheFloatingPointRoundingMode(FRM)bitsintheFloatingPointControlandStatusRegister(FCSR).IftheinstructionsRM2ieldcontains111,thentheroundingmodetobeusedisdetermineddynamicallywhentheinstructionisexecutedbyconsultingtheFCSR.TheFRMbitsintheFCSRtellwhichroundingmethodtouse,accordingtothefollowingtable.Thistableisthesameasabove,exceptforthelastline.

RISC-VArchitectureSummary/Porter Page� of� 122 323

Page 123: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

000 Roundtothenearestnumber (Foratie,thevaluewithazerointheleastsigni2icantbitischosen.) 001 Roundtowardzero(i.e.,truncate) 010 Roundtowardnegativein2inity(i.e.,rounddown) 011 Roundtowardpositivein2inity(i.e.,roundup) 100 Roundtothenearestnumber (Foratie,roundawayfromzero.) 101 invalid 110 invalid 111 invalid

Ifa2loatingpointinstructionwillnotneedtoperformroundingbutcontainsanRM2ield(forexample,theFNEGinstruction),theRM2ieldshouldbesetto000.

TheRISC-Vspecdoesnotindicateorsuggesthowtheroundingmodeshouldbewritteninassemblycode.???

The2lagsportion(NX,UF,OF,DZ,NV)oftheFloatingPointStatusRegister(FCSR)isaccessibleandmirroredinasecondCSRcalled(FFLAGS).Likewise,theroundingmodebitsoftheFloatingPointStatusRegister(FCSR)areaccessibleandmirroredinathirdCSRcalled(FRM).

ControlandStatusRegistersarecoveredelsewhere,buttheregisterconcernedwiththe2loatingpointextensionare:

CSRAddr Name Description001 f=lags Floatingpointing2lags002 frm Dynamicroundingmode003 fcsr Concatenationoffrm+f2lags

Theseregsareread/writeinanyprivilegemode.

Separateassemblerinstructionsarede2inedtoaccesstheseregisters,butthesearejustshorthand(syntacticsugar)forother,moregeneralinstructions.Theinstructionsare:

FRFLAGS-ReadtheFFLAGSregisterFRFLAGS RegD RegD=CSR[FFLAGS]

FSFLAGS-SwaptheFFLAGSregisterFSFLAGS RegD,Reg1 RegD=CSR[FFLAGS];CSR[FFLAGS]=Reg1

RISC-VArchitectureSummary/Porter Page� of� 123 323

Page 124: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FRRM-ReadtheFRM(RoundingMode)registerFRRM RegD RegD=CSR[FRM]

FSRM-SwaptheFRM(RoundingMode)registerFSRM RegD,Reg1 RegD=CSR[FRM];CSR[FRM]=Reg1

Not-A-Number-NaN

Thespecsaysthatthe“canonicalNaN”is0x7fc00000.Presumably,theymeanaquietNaN.TheRISC-VspecmentionstheexistenceofasignalingNaN,sayingthatifoneisencounteredinanoperation,aninvalidinstructionexceptionwilloccur.However,thespecdoesnotsayhowasignalingNaNisdistinguished.???

FloatingPointLoadandStoreInstructions

Thefollowinginstructionswillmovedatabetweenmemoryanda2loatingpointregister.

WeusethenotationFReg1,FReg2,andFRegDtoindicate2loatingpointregisters,justasweuseReg1,Reg2,andRegDtoindicateintegerregisters.

FloatingLoad(Word)

GeneralForm:FLW FRegD,Immed-12(Reg1)

Example:FLW f4,1234(x9) # f4 = Mem[x9+1234]

Description:A32-bitvalueisfetchedfrommemoryandmovedinto2loatingregisterFRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

RISC-VArchitectureSummary/Porter Page� of� 124 323

Page 125: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

RISC-VExtensions:Thisinstructionrequiresthe“F”extension.

Encoding: ThisisanI-typeinstruction.

FloatingLoad(Double)

GeneralForm:FLD FRegD,Immed-12(Reg1)

Example:FLD f4,1234(x9) # f4 = Mem[x9+1234]

Description:A64-bitvalueisfetchedfrommemoryandmovedinto2loatingregisterFRegD.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.doubleword-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

RISC-VExtensions:Thisinstructionrequiresthe“D”extension.

Encoding: ThisisanI-typeinstruction.

FloatingLoad(Quad)

GeneralForm:FLQ FRegD,Immed-12(Reg1)

RISC-VArchitectureSummary/Porter Page� of� 125 323

Page 126: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Comment: Analogous;Onlyavailablein“Q”quadprecision2loatingpointextension.

FloatingStore(Word)

GeneralForm:FSW FReg2,Immed-12(Reg1)

Example:FSW f4,1234(x9),f4 # Mem[x9+1234] = f4

Description:A32-bitvalueiscopiedfromregisterFReg2tomemory.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.word-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

RISC-VExtensions:Thisinstructionrequiresthe“F”extension.

Encoding: ThisisanI-typeinstruction.

FloatingStore(Double)

GeneralForm:FSD FReg2,Immed-12(Reg1)

Example:FSD f4,1234(x9) # Mem[x9+1234] = f4

Description:A64-bitvalueiscopiedfromregisterFReg2tomemory.ThememoryaddressisformedbyaddingtheoffsettothecontentsofReg1.

RISC-VArchitectureSummary/Porter Page� of� 126 323

Page 127: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Comment:Thetargetlocationgivenbythe12-bitoffsetmustbewithintherangeof-2,048..2,047relativetothevalueinReg1.

Theaddressofthememorylocationisnotrequiredtobeproperlyaligned(i.e.doubleword-aligned),butitisassumedthatthisinstructionwillexecutefasterwithproperlyalignedaddresses.

Thisinstructionisguaranteedtoexecuteatomicallyiftheaddressisproperlyaligned.Iftheaddressisnotaligned,thereisnoguaranteeofatomicoperation.

RISC-VExtensions:Thisinstructionrequiresthe“D”extension.

Encoding: ThisisanI-typeinstruction.

FloatingStore(Quad)

GeneralForm:FSQ FReg2,Immed-12(Reg1)

Comment: Analogous;Onlyavailablein“Q”quadprecision2loatingpointextension.

BasicFloatingPointArithmeticInstructions

FloatingAdd

GeneralForm:FADD.S FRegD,FReg1,FReg2 (single precision)FADD.D FRegD,FReg1,FReg2 (double precision)FADD.Q FRegD,FReg1,FReg2 (quad precision)

Example:FADD.S f4,f9,f13 # f4 = f9+f13 (32 bits)FADD.D f4,f9,f13 # f4 = f9+f13 (64 bits)FADD.Q f4,f9,f13 # f4 = f9+f13 (128 bits)

RISC-VArchitectureSummary/Porter Page� of� 127 323

Page 128: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Description:ThevalueinFReg1isaddedtothevalueinFReg2andtheresultisplacedinFRegD.

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FADD.Srequiresthe“F”extension.FADD.Drequiresthe“D”extension.FADD.Qrequiresthe“Q”extension.

Encoding: ThisisanR-typeinstruction.

FloatingSubtract

GeneralForm:FSUB.S FRegD,FReg1,FReg2 (single precision)FSUB.D FRegD,FReg1,FReg2 (double precision)FSUB.Q FRegD,FReg1,FReg2 (quad precision)

Example:FSUB.S f4,f9,f13 # f4 = f9-f13 (32 bits)FSUB.D f4,f9,f13 # f4 = f9-f13 (64 bits)FSUB.Q f4,f9,f13 # f4 = f9-f13 (128 bits)

Description:ThevalueinFReg1issubtractedfromthevalueinFReg2andtheresultisplacedinFRegD.

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FSUB.Srequiresthe“F”extension.FSUB.Drequiresthe“D”extension.FSUB.Qrequiresthe“Q”extension.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 128 323

Page 129: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FloatingMultiply

GeneralForm:FMUL.S FRegD,FReg1,FReg2 (single precision)FMUL.D FRegD,FReg1,FReg2 (double precision)FMUL.Q FRegD,FReg1,FReg2 (quad precision)

Example:FMUL.S f4,f9,f13 # f4 = f9*f13 (32 bits)FMUL.D f4,f9,f13 # f4 = f9*f13 (64 bits)FMUL.Q f4,f9,f13 # f4 = f9*f13 (128 bits)

Description:ThevalueinFReg1ismultipliedbythevalueinFReg2andtheresultisplacedinFRegD.

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FMUL.Srequiresthe“F”extension.FMUL.Drequiresthe“D”extension.FMUL.Qrequiresthe“Q”extension.

Encoding: ThisisanR-typeinstruction.

FloatingDivide

GeneralForm:FDIV.S FRegD,FReg1,FReg2 (single precision)FDIV.D FRegD,FReg1,FReg2 (double precision)FDIV.Q FRegD,FReg1,FReg2 (quad precision)

Example:FDIV.S f4,f9,f13 # f4 = f9/f13 (32 bits)FDIV.D f4,f9,f13 # f4 = f9/f13 (64 bits)FDIV.Q f4,f9,f13 # f4 = f9/f13 (128 bits)

Description:ThevalueinFReg1isdividedbythevalueinFReg2andtheresultisplacedinFRegD.

RISC-VArchitectureSummary/Porter Page� of� 129 323

Page 130: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FDIV.Srequiresthe“F”extension.FDIV.Drequiresthe“D”extension.FDIV.Qrequiresthe“Q”extension.

Encoding: ThisisanR-typeinstruction.

FloatingMinimum

GeneralForm:FMIN.S FRegD,FReg1,FReg2 (single precision)FMIN.D FRegD,FReg1,FReg2 (double precision)FMIN.Q FRegD,FReg1,FReg2 (quad precision)

Example:FMIN.S f4,f9,f13 # f4 = Min(f9,f13) (32 bits)FMIN.D f4,f9,f13 # f4 = Min(f9,f13) (64 bits)FMIN.Q f4,f9,f13 # f4 = Min(f9,f13) (128 bits)

Description:ThevaluesinFReg1andFReg2arecomparedandthesmalleroneisplacedinFRegD.

RISC-VExtensions:FMIN.Srequiresthe“F”extension.FMIN.Drequiresthe“D”extension.FMIN.Qrequiresthe“Q”extension.

Encoding: ThisisanR-typeinstruction.

FloatingMaximum

GeneralForm:FMAX.S FRegD,FReg1,FReg2 (single precision)FMAX.D FRegD,FReg1,FReg2 (double precision)FMAX.Q FRegD,FReg1,FReg2 (quad precision)

Example:FMAX.S f4,f9,f13 # f4 = Max(f9,f13) (32 bits)

RISC-VArchitectureSummary/Porter Page� of� 130 323

Page 131: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FMAX.D f4,f9,f13 # f4 = Max(f9,f13) (64 bits)FMAX.Q f4,f9,f13 # f4 = Max(f9,f13) (128 bits)

Description:ThevaluesinFReg1andFReg2arecomparedandthelargeroneisplacedinFRegD.

RISC-VExtensions:FMAX.Srequiresthe“F”extension.FMAX.Drequiresthe“D”extension.FMAX.Qrequiresthe“Q”extension.

Encoding: ThisisanR-typeinstruction.

FloatingSquareRoot

GeneralForm:FSQRT.S FRegD,FReg1 (single precision)FSQRT.D FRegD,FReg1 (double precision)FSQRT.Q FRegD,FReg1 (quad precision)

Example:FSQRT.S f4,f9 # f4 = sqrt(f9) (32 bits)FSQRT.D f4,f9 # f4 = sqrt(f9) (64 bits)FSQRT.Q f4,f9 # f4 = sqrt(f9) (128 bits)

Description:ThesquarerootofthevalueinFReg1iscomputedandtheresultisplacedinFRegD.

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FSQRT.Srequiresthe“F”extension.FSQRT.Drequiresthe“D”extension.FSQRT.Qrequiresthe“Q”extension.

Encoding: ThisisaR-typeinstruction,inwhichtheReg22ieldisallzeros.

RISC-VArchitectureSummary/Porter Page� of� 131 323

Page 132: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FloatingFusedMultiply-Add

GeneralForm:FMADD.S FRegD,FReg1,FReg2,FReg3 (single precision)FNMADD.S FRegD,FReg1,FReg2,FReg3 (single precision)FMADD.D FRegD,FReg1,FReg2,FReg3 (double precision)FNMADD.D FRegD,FReg1,FReg2,FReg3 (double precision)

Example:FMADD.S f2,f5,f6,f7 # f2 = (f5*f6)+f7FNMADD.S f2,f5,f6,f7 # f2 = (-(f5*f6))+f7FMADD.D f2,f5,f6,f7 # f2 = (f5*f6)+f7FNMADD.D f2,f5,f6,f7 # f2 = (-(f5*f6))+f7

Description:Thisinstructionperformsamultiplicationandanaddition.Avariation(whichisshownherewiththeopcodeFNMADD)willoptionallynegatetheproductbeforetheaddition.Seethespecconcerningcaseswhentheargumentsare0,∞,andNaN.

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FMADD.SandFNMADD.Srequirethe“F”extension.FMADD.DandFNMADD.Drequirethe“D”extension.

Encoding: Thisinstructionisunusualinthatithasfouroperands,allofwhichare

registers.Thus,itdoesnot2itintoanyoftheinstructionformatsthatweredescribedearlier.ThisinstructiontypeiscalledanR4-typeinstruction.TheR4-typeinstructionformatisusedonlyforthisandthefusedmultiply-subtractinstructions.

FloatingFusedMultiply-Subtract

GeneralForm:FMSUB.S FRegD,FReg1,FReg2,FReg3 (single precision)FNMSUB.S FRegD,FReg1,FReg2,FReg3 (single precision)FMSUB.D FRegD,FReg1,FReg2,FReg3 (double precision)FNMSUB.D FRegD,FReg1,FReg2,FReg3 (double precision)

RISC-VArchitectureSummary/Porter Page� of� 132 323

Page 133: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Example:FMSUB.S f2,f5,f6,f7 # f2 = (f5*f6)-f7FNMSUB.S f2,f5,f6,f7 # f2 = (-(f5*f6))-f7FMSUB.D f2,f5,f6,f7 # f2 = (f5*f6)-f7FNMSUB.D f2,f5,f6,f7 # f2 = (-(f5*f6))-f7

Description:Thisinstructionperformsamultiplicationandasubtraction.Avariation(whichisshownherewiththeopcodeFNMSUB)willoptionallynegatetheproductbeforethesubtraction.Seethespecconcerningcaseswhentheargumentsare0,∞,andNaN.

RoundingMode:Thisinstructioncontainsa3bitRoundingMode(RM)2ield,whichwilldeterminetheroundingmethodtobeused.

RISC-VExtensions:FMSUB.SandFNMSUB.Srequirethe“F”extension.FMSUB.DandFNMSUB.Drequirethe“D”extension.

Encoding: Thisinstructionisunusualinthatithasfouroperands,allofwhichare

registers.Thus,itdoesnot2itintoanyoftheinstructionformatsthatweredescribedearlier.ThisinstructiontypeiscalledanR4-typeinstruction.TheR4-typeinstructionformatisusedonlyforthisandthefusedmultiply-addinstructions.

FloatingFusedMultiply-Add/Subtract(Quad)

GeneralForm:FMADD.Q FRegD,FReg1,FReg2,FReg3FNMADD.Q FRegD,FReg1,FReg2,FReg3FMSUB.Q FRegD,FReg1,FReg2,FReg3FNMSUB.Q FRegD,FReg1,FReg2,FReg3

Comment: Analogous;Onlyavailablein“Q”quadprecision2loatingpointextension.

FloatingPointConversionInstructions

RISC-VArchitectureSummary/Porter Page� of� 133 323

Page 134: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Thereisasetofinstructionsforconvertingfromonerepresentationtoanother.Thefollowingcodesareusedtonamethedifferentformats:

W 32-bitsignedinteger WU 32-bitunsignedinteger L 64-bitsignedinteger LU 64-bitunsignedinteger S Single-precision2loatingpointnumber D Double-precision2loatingpointnumber

FloatingPointConversion(Integerto/fromSinglePrecision)

GeneralForms:FCVT.W.S RegD,FReg1FCVT.WU.S RegD,FReg1FCVT.L.S RegD,FReg1 (availableinRV64orRV128only)FCVT.LU.S RegD,FReg1 (availableinRV64orRV128only)FCVT.S.W FRegD,Reg1FCVT.S.WU FRegD,Reg1FCVT.S.L FRegD,Reg1 (availableinRV64orRV128only)FCVT.S.LU FRegD,Reg1 (availableinRV64orRV128only)

Examples:FCVT.W.S r2,f5 # r2(int32) = f5(float)FCVT.WU.S r2,f5 # r2(uint32) = f5(float)FCVT.L.S r2,f5 # r2(int64) = f5(float)FCVT.LU.S r2,f5 # r2(uint64) = f5(float)FCVT.S.W f4,r7 # f4(float) = r7(int32)FCVT.S.WU f4,r7 # f4(float) = r7(uint32)FCVT.S.L f4,r7 # f4(float) = r7(int64)FCVT.S.LU f4,r7 # f4(float) = r7(uint64)

Description:Theseinstructionsperformaconversionofasingleprecision2loatingformatto/fromanintegerformat.

Rounding:TheseinstructionscontainRoundingMode(RM)bitswhichdeterminehowroundingistobedone.

Whenconvertingtoaninteger,thevalues+∞,NaN,andvaluesthataretoolargetoberepresentedinthetargetformatarestoredasthelargestinteger

RISC-VArchitectureSummary/Porter Page� of� 134 323

Page 135: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

valuethatcanberepresented.Thevalue–∞andvaluesthataretoonegativetoberepresentedinthetargetformatareconvertedintothesmallestintegervaluethatcanberepresented.

RISC-VExtensions:Theseinstructionsrequirethe“F”extension.Furthermore,theinstructionsinvolving64bitintegersarenotavailableinRV32,asnotedinthegeneralformsabove.

Encoding: ThisisanR-typeinstruction,inwhichtheReg22ieldisusedforadditional

opcodebits.

FloatingPointConversion(Integerto/fromDoublePrecision)

GeneralForms:FCVT.W.D RegD,FReg1FCVT.WU.D RegD,FReg1FCVT.L.D RegD,FReg1 (availableinRV64orRV128only)FCVT.LU.D RegD,FReg1 (availableinRV64orRV128only)FCVT.D.W FRegD,Reg1FCVT.D.WU FRegD,Reg1FCVT.D.L FRegD,Reg1 (availableinRV64orRV128only)FCVT.D.LU FRegD,Reg1 (availableinRV64orRV128only)

Examples:FCVT.W.D r2,f5 # r2(int32) = f5(double)FCVT.WU.D r2,f5 # r2(uint32) = f5(double)FCVT.L.D r2,f5 # r2(int64) = f5(double)FCVT.LU.D r2,f5 # r2(uint64) = f5(double)FCVT.D.W f4,r7 # f4(double) = r7(int32)FCVT.D.WU f4,r7 # f4(double) = r7(uint32)FCVT.D.L f4,r7 # f4(double) = r7(int64)FCVT.D.LU f4,r7 # f4(double) = r7(uint64)

Description:Theseinstructionsperformaconversionofadoubleprecision2loatingformatto/fromanintegerformat.

Rounding:TheseinstructionscontainRoundingMode(RM)bitswhichdeterminehowroundingistobedone.

RISC-VArchitectureSummary/Porter Page� of� 135 323

Page 136: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Whenconvertingtoaninteger,thevalues+∞,NaN,andvaluesthataretoolargetoberepresentedinthetargetformatarestoredasthelargestintegervaluethatcanberepresented.Thevalue–∞andvaluesthataretoonegativetoberepresentedinthetargetformatareconvertedintothesmallestintegervaluethatcanberepresented.

RISC-VExtensions:Theseinstructionsrequirethe“D”extension.Furthermore,theinstructionsinvolving64bitintegersarenotavailableinRV32,asnotedinthegeneralformsabove.

Encoding: ThisisanR-typeinstruction,inwhichtheReg22ieldisusedforadditional

opcodebits.

ProgrammingHint:Tostore+0.0ina2loatingregister,usetheseinstructions:

FCVT.S.W FRegD,x0 # Single precisionFCVT.D.W FRegD,x0 # Double precision

FloatingPointConversion(Integerto/fromQuadPrecision)

GeneralForms:FCVT.W.Q RegD,FReg1FCVT.WU.Q RegD,FReg1FCVT.L.Q RegD,FReg1 (availableinRV64orRV128only)FCVT.LU.Q RegD,FReg1 (availableinRV64orRV128only)FCVT.Q.W FRegD,Reg1FCVT.Q.WU FRegD,Reg1FCVT.Q.L FRegD,Reg1 (availableinRV64orRV128only)FCVT.Q.LU FRegD,Reg1 (availableinRV64orRV128only)

Comment: Analogous;Onlyavailablein“Q”quadprecision2loatingpointextension.

FloatingPointConversion(Single/Doubleto/fromQuadPrecision)

GeneralForms:FCVT.S.Q FRegD,FReg1FCVT.Q.S FRegD,FReg1

RISC-VArchitectureSummary/Porter Page� of� 136 323

Page 137: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FCVT.D.Q FRegD,FReg1FCVT.Q.D FRegD,FReg1

Comment: Analogous;Onlyavailablein“Q”quadprecision2loatingpointextension.

FloatingPointMoveInstructions

Thenextthreeinstructionscopyasignedvaluefromoneregistertoanotherregister,whilemodifyingthesignbitbasedonthesignfromanothervalue.

FloatingSignInjection(SinglePrecision)

GeneralForms:FSGNJ.S FRegD,FReg1,FReg2 (inject)FSGNJN.S FRegD,FReg1,FReg2 (negate)FSGNJX.S FRegD,FReg1,FReg2 (exclusive-or)

Examples:FSGNJ.S f2,f5,f6 # f2 = sign(f6) * |f5|FSGNJN.S f2,f5,f6 # f2 = -sign(f6) * |f5|FSGNJX.S f2,f5,f6 # f2 = sign(f6) * f5

Description:EachoftheseinstructionscopiesthevalueinFReg1intoFRegD.Howeverthesignoftheresultischanged,basedonthesignofthevalueinFReg2.Inthe“inject”instruction,thesignfromFReg2isusedfortheresult.Inthe“negate”instruction,thesignfromFReg2is2irst2lippedandthenusedfortheresult.Inthe“exclusive-or”instruction,thesignsfromFReg1andFReg2areXOR-edtogetherandthenusedfortheresult.

Comments:TheseinstructionsareusefulasimplementationsofspecialcasesfortheinstructionsFNEG.S,FABS.S,andFMV.S.Inaddition,theIEEE754-2008speccallsfora“copysign”operation.Theseoperationsmayalsohaveuseinsomemathlibraryfunctions.

RISC-VExtensions:Theseinstructionsrequirethe“F”extension.Presumably,inthe“D”extension,thisinstructionwillonlymodifythelowerorder32bits,butitispossiblethatitwillalsosettheupper32bitstoallones.???

RISC-VArchitectureSummary/Porter Page� of� 137 323

Page 138: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Encoding: ThisisanR-typeinstruction.

FloatingSignInjection(DoublePrecision)

GeneralForms:FSGNJ.D FRegD,FReg1,FReg2 (inject)FSGNJN.D FRegD,FReg1,FReg2 (negate)FSGNJX.D FRegD,FReg1,FReg2 (exclusive-or)

Examples:FSGNJ.D f2,f5,f6 # f2 = sign(f6) * |f5|FSGNJN.D f2,f5,f6 # f2 = -sign(f6) * |f5|FSGNJX.D f2,f5,f6 # f2 = sign(f6) * f5

Description:EachoftheseinstructionscopiesthevalueinFReg1intoFRegD.Howeverthesignoftheresultischanged,basedonthesignofthevalueinFReg2.Inthe“inject”instruction,thesignfromFReg2isusedfortheresult.Inthe“negate”instruction,thesignfromFReg2is2irst2lippedandthenusedfortheresult.Inthe“exclusive-or”instruction,thesignsfromFReg1andFReg2areXOR-edtogetherandthenusedfortheresult.

Comments:TheseinstructionsareusefulasimplementationsofspecialcasesfortheinstructionsFNEG.D,FABS.D,andFMV.D.Inaddition,theIEEE754-2008speccallsfora“copysign”operation.Theseoperationsmayalsohaveuseinsomemathlibraryfunctions.

RISC-VExtensions:Theseinstructionsrequirethe“D”extension.

Encoding: ThisisanR-typeinstruction.

FloatingSignInjection(DoublePrecision)

GeneralForms:FSGNJ.Q FRegD,FReg1,FReg2 (inject)FSGNJN.Q FRegD,FReg1,FReg2 (negate)FSGNJX.Q FRegD,FReg1,FReg2 (exclusive-or)

Comment: Analogous;Onlyavailablein“Q”quadprecision2loatingpointextension.

RISC-VArchitectureSummary/Porter Page� of� 138 323

Page 139: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

FloatingMove

GeneralForms:FMV.S FRegD,FReg1 (single precision)FMV.D FRegD,FReg1 (double precision)FMV.Q FRegD,FReg1 (quad precision)

Examples:FMV.S f2,f5 # f2 = f5 (32 bits)FMV.D f2,f5 # f2 = f5 (64 bits)FMV.Q f2,f5 # f2 = f5 (128 bits)

Description:ThevalueinFReg1iscopiedintoFRegD.

RISC-VExtensions:FMV.Srequiresthe“F”extension.FMV.Drequiresthe“D”extension.FMV.Qrequiresthe“Q”extension.

Encoding:Theseinstructionsarespecialcasesofmoregeneralinstructions.Theyareassembledidenticallyto:

FSGNJ.S FRegD,FReg1,FReg1 (inject) FSGNJ.D FRegD,FReg1,FReg1 (inject)

FloatingNegate

GeneralForms:FNEG.S FRegD,FReg1 (single precision)FNEG.D FRegD,FReg1 (double precision)FNEG.Q FRegD,FReg1 (quad precision)

Examples:FNEG.S f2,f5 # f2 = -(f5) (32 bits)FNEG.D f2,f5 # f2 = -(f5) (64 bits)FNEG.Q f2,f5 # f2 = -(f5) (128 bits)

Description:ThevalueinFReg1isnegatedandthencopiedintoFRegD.

RISC-VArchitectureSummary/Porter Page� of� 139 323

Page 140: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

RISC-VExtensions:FNEG.Srequiresthe“F”extension.FNEG.Drequiresthe“D”extension.FNEG.Qrequiresthe“Q”extension.

Encoding:Theseinstructionsarespecialcasesofmoregeneralinstructions.Theyareassembledidenticallyto:

FSGNJN.S FRegD,FReg1,FReg1 (negate) FSGNJN.D FRegD,FReg1,FReg1 (negate)

FloatingAbsoluteValue

GeneralForms:FABS.S FRegD,FReg1 (single precision)FABS.D FRegD,FReg1 (double precision)FABS.Q FRegD,FReg1 (quad precision)

Examples:FABS.S f2,f5 # f2 = |f5| (32 bits)FABS.D f2,f5 # f2 = |f5| (64 bits)FABS.Q f2,f5 # f2 = |f5| (128 bits)

Description:TheabsolutevalueofthequantityinFReg1iscopiedintoFRegD.

RISC-VExtensions:FABS.Srequiresthe“F”extension.FABS.Drequiresthe“D”extension.FABS.Qrequiresthe“Q”extension.

Encoding:Theseinstructionsarespecialcasesofmoregeneralinstructions.Theyareassembledidenticallyto:

FSGNJX.S FRegD,FReg1,FReg1 (exclusive-or) FSGNJX.D FRegD,FReg1,FReg1 (exclusive-or)

FloatingMoveTo/FromIntegerRegister(SinglePrecision)

GeneralForms:FMV.X.W RegD,FReg1FMV.W.X FRegD,Reg1

RISC-VArchitectureSummary/Porter Page� of� 140 323

Page 141: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Examples:FMV.X.W x2,f5 # x2 = f5FMV.W.X f5,x2 # f5 = x2

Description:32-bitsarecopiedfroma2loatingpointregistertoanintegerregister,orfromanintegerregistertoa2loatingpointregister.Thebitsareunaltered;i.e.,noconversionisperformed.

InthecaseofRV64andRV128(wheretheintegerregistersarelargerthan32bits),whenthedestinationisanintegerregister,theupperbitsoftheintegerregisterwillbe2illedwithzeros.Whenthesourceisanintegerregister,theupperbitswillbeignored.

Noerrorconditionscanarise.

Inanolderversionofthespec,theletter“S”wasusedinplaceof“W”intheopcode,i.e.,“FMV.X.S”and“FMV.S.X”.

RISC-VExtensions:Theseinstructionsrequirethe“F”extension.

Encoding: ThisisanR-typeinstruction.

FloatingMoveTo/FromIntegerRegister(DoublePrecision)

GeneralForms:FMV.X.D RegD,FReg1FMV.D.X FRegD,Reg1

Examples:FMV.X.D x2,f5 # x2 = f5FMV.D.X f5,x2 # f5 = x2

Description:64-bitsarecopiedfroma2loatingpointregistertoanintegerregister,orfromanintegerregistertoa2loatingpointregister.Thebitsareunaltered;i.e.,noconversionisperformed.

InthecaseofRV128(wheretheintegerregistersarelargerthan64bits),whenthedestinationisanintegerregister,theupperbitsoftheintegerregisterwillbe2illedwithzeros.Whenthesourceisanintegerregister,theupperbitswillbeignored.

RISC-VArchitectureSummary/Porter Page� of� 141 323

Page 142: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Noerrorconditionscanarise.RISC-VExtensions:

Theseinstructionsrequirethe“D”extension.Encoding: ThisisanR-typeinstruction.

Note:The“Q”extensiondoesnotprovidethefollowinginstructions.Thisisintentional.

FMV.X.Q RegD,FReg1FMV.Q.X FRegD,Reg1

Inordertomoveaquadprecisionvaluefroma2loatingpointregistertoanintegerregister,you’llneedtomoveittomemory2irst,thenmoveittotheintegerregister.

FloatingPointCompareandClassifyInstructions

FloatingPointComparison

GeneralForms:FLT.S RegD,FReg1,FReg2 (single precision)FLE.S RegD,FReg1,FReg2 (single precision)FEQ.S RegD,FReg1,FReg2 (single precision)

FLT.D RegD,FReg1,FReg2 (double precision)FLE.D RegD,FReg1,FReg2 (double precision)FEQ.D RegD,FReg1,FReg2 (double precision)

FLT.Q RegD,FReg1,FReg2 (quad precision)FLE.Q RegD,FReg1,FReg2 (quad precision)FEQ.Q RegD,FReg1,FReg2 (quad precision)

RISC-VArchitectureSummary/Porter Page� of� 142 323

Page 143: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

Examples:FLT.S x2,f5,f6 # x2 = (f5<f6) ? 1 : 0FLE.S x2,f5,f6 # x2 = (f5≤f6) ? 1 : 0FEQ.S x2,f5,f6 # x2 = (f5=f6) ? 1 : 0

FLT.D x2,f5,f6 # x2 = (f5<f6) ? 1 : 0FLE.D x2,f5,f6 # x2 = (f5≤f6) ? 1 : 0FEQ.D x2,f5,f6 # x2 = (f5=f6) ? 1 : 0

FLT.Q x2,f5,f6 # x2 = (f5<f6) ? 1 : 0FLE.Q x2,f5,f6 # x2 = (f5≤f6) ? 1 : 0FEQ.Q x2,f5,f6 # x2 = (f5=f6) ? 1 : 0

Description:Two2loatingpointvaluesinthe2loatingpointregistersarecomparedandavalueindicatingtheresultisstoredinanintegerregister.Theinstructionsstorea“1”iftherelationship(<,≤,or=)holdstrueand“0”iftherelationshipdoesnothold.

Thereisnoneedfor>or≥instructionssincetheprogrammercanachievethesameresultbyusingthe<and≤instructionsbysimplyswappingthetworegisteroperands.

ErrorConditions:IfeitheroperandisaNot-a-Number(NaN)valuethena“0”willbestoredinthedestinationregister.Thisappliestoallthreecomparisoninstructions(<,≤,and=).

TheIEEEspecrequiresthatthe<or≤shouldresultina“signalingNaN”ifeitheroperandisaNaN.ForRISC-V,theinstructionexecutionwillneverbeinterrupted,soasignalingNaNcannotbeimplementedbyinterruptingtheinstruction2low.RISC-Vhandlesthiscaseasfollows:For<and≤comparisons,ifeitheroperandisNaN,thentheNV(invalidoperation)bitintheFloatingPointerControlandStatusRegister(FCSR)willbesetto1.Otherwise,thebitwillbeleftunchanged.

TheIEEEspectreatsthe=comparisondifferently,sayingthatifeitheroperandisNaN,thentheresultshouldbea“quietNaN”.TheRISC-Vimplementsthisafollows:Forthe=comparison,theNV(invalidoperation)bitwillneverbemodi2ied.

RISC-VExtensions:The“.S”instructionsrequirethe“F”extension.

RISC-VArchitectureSummary/Porter Page� of� 143 323

Page 144: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

The“.D”instructionsrequirethe“D”extension.The“.Q”instructionsrequirethe“Q”extension.

Encoding: ThisisanR-typeinstruction.

FloatingPointClassify

GeneralForms:FCLASS.S RegD,FReg1 (single precision)FCLASS.D RegD,FReg1 (double precision)FCLASS.Q RegD,FReg1 (quad precision)

Examples:FCLASS.S x2,f5 # x2 = class(f5)FCLASS.D x2,f5 # x2 = class(f5)FCLASS.Q x2,f5 # x2 = class(f5)

Description:Theinstructionlooksata2loatingpointvalueanddetermineswhatsortofanumberitis.Forexample,doesthevaluerepresent+0.0,-0.0,Nan,+∞,-∞,etc?

Theresultoftheclassi2icationofthe2loatingpointvaluewillbestoredinanintegerregister,asfollows:Allbitsoftheintegerregisterwillbeclearedto“0”,exceptthatexactlyonebitwillbesetto“1”,toindicatewhatsortofthingthe2loatingpointregistercontains.

Herearethepossibleresultsthatcanbestoredintotheintegerregister:

BitSet ValueStored Meaning0 1 FReg1contains−∞.1 2 FReg1containsanegativenormalnumber.2 4 FReg1containsanegativesubnormalnumber.3 8 FReg1contains−0.4 16 FReg1contains+0.5 32 FReg1containsapositivesubnormalnumber.6 64 FReg1containsapositivenormalnumber.7 128 FReg1contains+∞.8 256 FReg1containsasignalingNaN.9 512 FReg1containsaquietNaN.

RISC-VArchitectureSummary/Porter Page� of� 144 323

Page 145: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter4:FloatingPointInstructions

RISC-VExtensions:The“.S”instructionsrequirethe“F”extension.The“.D”instructionsrequirethe“D”extension.The“.Q”instructionsrequirethe“Q”extension.

Encoding: ThisisanR-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of� 145 323

Page 146: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

StandardUsageofGeneralPurposeRegisters

The32bitversionoftheRISC-Varchitecturetreatsallregistersidenticallysotheassemblylanguageprogrammerisfreetouseanyregisterashe/shewishes.Theonlyexceptionisregisterx0,whichalwayscontainszero,andcanbeusedasadestinationwhenthedatashouldbediscarded.

However,thereisaRISC-V“standardcallingconvention”whichspeci2ieshowregisterswillnormallybeusedbythecompilerandmostassemblylanguageprogrammerswillfollowthisconvention.

Inthe“C”(CompressedInstructions)extension,16bitinstructionsaresupported,andeachcompressedinstructionisashorter,abbreviatedformofa32-bitinstruction.Thecompressedinstructionsetwasdesignedundertheassumptionthattheregisterswillbeusedinthestandardwaysandonlythemostcommoninstructionsaregiven16bitequivalents.Assuch,theregistersarenottreatedidenticallyinthecompressedinstructionset.

Forexample,the32bit“call”instruction(JAL)cansavethereturnaddressinanyregister,althoughthestandardconventionsmandatethatregisterx1willalwaysbeusedtosavethereturnaddress.Makinguseofthisconvention,the16bitversionoftheinstruction(C.JAL)savesthereturnaddressinx1andonlythisregister.Thereisnocompressedinstructiontosavethereturnaddressinanyotherregister,whichisreasonablesincethisisnotsomethingnormallydone.

Thenamesofthegeneralpurposeregistersarex0,x1,…x31.Theregistersarealsogivensecond,alternatenames.Theassemblerwillaccepteithername,sotheprogrammercanusewhichevernameseemsclearest.

Herearetheregisters:

RISC-VArchitectureSummary/Porter Page� of�146 323

Page 147: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Other SavedAcross Name Name Description Calls x0 zero Zero x1 ra ReturnAddress x2 sp StackPointer yes /calleesaved x3 gp GlobalPointer x4 tp ThreadPointer x5 t0 Temp x6 t1 Temp

x7 t2 Temp x8 s0,fp SavedReg/FramePointer yes/calleesaved x9 s1 SavedReg yes/calleesaved x10 a0 Functionargument x11 a1 Functionargument x12 a2 Functionargument x13 a3 Functionargument x14 a4 Functionargument x15 a5 Functionargument x16 a6 Functionargument x17 a7 Functionargument x18 s2 SavedReg yes /calleesaved x19 s3 SavedReg yes /calleesaved x20 s4 SavedReg yes /calleesaved x21 s5 SavedReg yes /calleesaved x22 s6 SavedReg yes /calleesaved x23 s7 SavedReg yes /calleesaved x24 s8 SavedReg yes /calleesaved x25 s9 SavedReg yes /calleesaved x26 s10 SavedReg yes /calleesaved x27 s11 SavedReg yes /calleesaved x28 t3 Temp x29 t4 Temp x30 t5 Temp x31 t6 Temp

Thecompressedinstructionsaredesignedtoalloweasyaccessto8registers,namelyx8,x9,…x15.Intheabovetable,these8registersarehighlighted.

RISC-VArchitectureSummary/Porter Page� of� 147 323

Page 148: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

SavingRegistersAcrossCalls

Byconvention,someregistersaresavedacrossfunctioncalls.Forprogramsthatfollowthestandardcallingconventions(andweassumethatalmostalldo),afunction(say“foo”)willcallanotherfunction(say“bar”).The“caller”isfooandthe“callee”isbar.

Wenormallyusetheterminology“calleesaved”toindicatethataregisterwillbepreservedacrossacall.Forexample,functionbarwillnotmodifytheregister(orifitdoes,itwill2irstsaveandthenrestoretheregister),sothatthecallerfoocanrelyonitsvalueremainingunchangedbybar.

Presumablythecallerfoowillhavelocalvariableswhosevaluesmustbepreservedacrossacalltobar.Presumablythecalleebarwillneedsometemporaryvariablestocompleteitstaskandcomputeitsresult.Iffoocannottrustbartopreserveitslocalvariables,thenfoowillneedtosaveregistersand,afterbarreturns,restoretheirvalues.Thissavingandrestoringisdonetomainmemory(typicallytothelocalstackframe),andsuchmemoryreferencesareveryslow.Iffooreliesonbartodothesavingandrestoringtheregisters,thentheburdenisplacedonbartoperformtheseslowmemorytasks.

Therearetradeoffsandgeneratingoptimalcodeistricky.Itmaybemostef2icientforthecallerfootosavetheregisters,sinceitmaybethecasethatcallerfoocansavearegisterjustonce,makeanumberofcallstobar,andthenrestoretheregister.Iftheregisterhadbeensavedinthecalleebar,therewouldhavebeenmultiplesavesandrestores.Ontheotherhand,itmaybebestforthesavingandrestoringtobeperformedinthecalleebar.Perhapsbarwilloftentakeanexecutionpaththatdoesnotdisrupttheregister,sothesavingandrestoringcanbeavoidedaltogetherinmanycases.

Thetradeoff(whethertoperformthesavinginthecallerorcallee)isadecisionthatmustbemadeforeachoftheregistersinthecallerfoo.Ifbothfunctionsareprogrammedsimultaneouslybyacleverprogrammer,thenhe/shecanmakeadhocdecisionsaboutthesavingandrestoringofresistersinanattempttoproduceoptimalcode.

However,almostallcodeiscompiledandmostcompilerslookateachfunctionincompleteisolation.Effectively,eachfunctionisseparatelycompiled,sothereisno

RISC-VArchitectureSummary/Porter Page� of� 148 323

Page 149: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

possibilityofcoordinatingandoptimizingthesavingofregistersbetweenmultiplefunctions.

Instead,theapproachistocreateastandardconventionmandatingwhichregistersaresavedbythecalleebarandwhicharenot.Registersthatarenotrequiredbytheconventiontobepreservedacrossacallaretermed“callersaved”,whichmeansthatitisuptothecallerfootosaveandrestoretheregisters,iftheycontaindatathatmustbepreservedacrossthefunctioncalltobar.

Theconventiondictateswhichregistersarerequiredtobepreservedacrossacall,anditisassumedthatcompilersandprogrammerswillfollowthecallingconventions.Iftheydo,thentheseparatelycompiledorseparatelywrittencodeisinteroperable.

Assumingthattheconventionsaretobefollowedwhencompilingsomefunctionfoo,thecompiler/programmerisfreetousewhicheverregistermakesmostsense.Forexample,avariablethatmustbepreservedacrossfunctioncalls(e.g.,tootherfunctionslikebar)willmostlikelybeplacedina“calleesaved”register.Therefore,thesaveandrestoreinstructionsareavoidedinthecaller’scode,andmightbeavoidedaltogetherifbardoesn’tevenusetheregisterinquestion.Ontheotherhand,thecompiler/programmermightprefertouseatemporary(callersaved)register,especiallyiftherearenocallswithinfoo,orifthevariabledoesnotneedtobesavedacrossanycallsthatdooccur.

Byestablishingacallingconventionsaboutwhichregistersarecalleesavedandwhicharenot,thedesignersaremakingtheimportantdecisionabouttheratiobetweentemporarycalleesavedregisters.Assumingthatthereareenoughregistersineachclass,thennoregistersaving/restoringwillbenecessary.

Butnotethatwithrecursiveroutines,therecanbeacon2lict.Consideralocalvariablethatmustbepreservedacrossarecursivecall.Byplacingthevariableinatemporary(callersaved)registertherecanbeaproblem.Sincethisregisterwillnotbesavedacrosstherecursivecall,thefunctionmustsaveitbeforethecallandrestoreitafterwards.Ontheotherhand,ifitisplacedinacalleesavedregisterwedon’thavetobotherwithsaving/restoringitaroundtherecursivecall.Butsincethefunctionwillbeusingacalleesavedregister,thefunctionitselfmustsavethepreviousvalueuponentryandrestoreitbeforereturn.

Soeitherway,theregistermustbesaveandrestored.Thismakessenseforrecursivefunctionsthathavelocalvariablesthatmustbepreservedacrosstherecursivecall.

RISC-VArchitectureSummary/Porter Page� of� 149 323

Page 150: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Theonlyquestioniswhethertoperformthesavebeforethecallinstructionorafterthecallinstruction,andlikewiseperformtherestoreafterthereturninstructionorbeforethereturninstruction.Thecompilermakesthisdecisionwhenitchoosestoplacethevariableinacallee-orcaller-savedregister.

Notethatcompilers/programmersaresometimesfreetoviolatethecallingconventionsordertoproducesuperiorcode.Thisisonlyfeasiblewhenthecompiler/programmerhasaccesstoboththecalledfunction(bar)andallpotentialcallers(suchasfoo).Insuchcases,thecompiler/programmercanchoosetousetheregistersinanon-standardwaytoimproveef2iciency.Forexample,itmaybethatbarwouldbene2itfromhavingmoretemporaryregistersandnoneofthecallershavedatathatmustbepreservedacrossthecalltobar,sothecompiler/programmercanuseregistersthatwouldotherwisebe“calleesaved”as“callersaved”.Compilerscankeepallthedetailsstraightbutprogrammersshouldbecareful.Manyassemblylanguageprogrammingbugsaretheresultofmistakesinvolvingthepropersavingandrestoringofregisters.

Registerx0-TheZeroRegister(“zero”)

Asmentionedearlier,registerx0isadummyregister.Whenread,itsvalueisalwayszero;whenstoredinto,thedataissimplydiscarded.

Registerx1-TheReturnAddress(“ra”)

Whenafunctioniscalled,thereturnaddressmustbesavedsothattheRETURNinstructioncanreturntothecorrectlocationinthecaller’scode.Inoldercomputers,theCALLinstructionstoredthereturnaddressinmemoryonthestack;inmodernISAs(includingRISC-V)theCALLinstructionsavesthereturnaddressinaregister.TheRETURNinstructionretrievesthevaluefromthisregister.TheRETURNinstructionisthennothingmorethanaJR(jumpthroughregister)instruction.

Byconvention,registerx1isusedforthispurpose.

SavingthereturnaddressinaregisterisfasterthansavingitonthestacksincethememoryWRITEandREADoperationsareavoided.Thisschemeisidealforleaffunctions.(“Leaf”functionsdonotcallotherfunctionsandarethusleafnodesinthe

RISC-VArchitectureSummary/Porter Page� of� 150 323

Page 151: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

activation/callingtree.)Manyfunctioncallsaretoleaffunctionssothissavesalotmemoryactivity.

Fornon-leaffunctions,thereturnaddressmustbesavedsomewhereelse.Sincex1willbeusedinsubsequentcalls,thereturnaddressinx1mustbemovedsomewhereelsetoavoidgettingclobberedbeforeitcanbeusedintheRETURNinstruction.Thevaluecaneitherbesavedinacalleesavedregisteroronthestack.(Movingthereturnaddresstoacalleesavedregistermaynecessitateotherregistersavingelsewhere,forcingasavetostackmemoryelsewhere.Forrecursivefunctions,thereisnowaytoavoidusingthein-memorystack;wecanonlyshiftaroundwhenthememoryREADs/WRITEsoccurs.)

Registerx2-TheStackPointer(“sp”)

RISC-Vassumesthatx2containsthestacktoppointerandthatthestackgrowsdownward.Thestackisassumedtoalwaysbequadword(16byte)aligned.Somefunctions/methodsmayoperateentirelyusingregistersandmaynotneedanystackspace.TheRISC-Vdesigntriestofacilitatesuchfunctionssincetheymakenomemoryaccesses.However,manyfunctionswillallocatestackspacetoholdreturnaddresses,localvariables,savedregisters,etc.Thelocalvariables,etc.,arestoredina“stackframe”,whichissometimescalledan“activationrecord.”Recursivefunctionsareknownforneedingstackspaceandallocatinganewstackframeforeachinvocation.

Afunctionthatrequiresstackstoragewillgrowthestack(alwaysbyamultipleof16)bysubtractingfromthestacktoppointersp.Variableswithinthestackframecanbeaddressedusingpositiveoffsetsfromregisterx2.Thisorganizationmotivatesseveralofthecompressedinstructions,inwhichtheoffsetmustbepositiveandisneversignextended.

Registerx2iscallee-saved,whichmeansthatanyfunctionthatgrowsthestacktocreateastackframemustalsoshrinkitbyanequalamountbeforereturning.Inotherwords,afunctionmusthavetheneteffectofpoppingeverythingthatitpushed,leavingthestackasitfoundit,andeffectivelypreservingthevalueofx2.Thus,x2issaidtobe“callee-saved”.

RISC-VArchitectureSummary/Porter Page� of� 151 323

Page 152: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Registerx3-TheGlobalPointer(“gp”)

Globalvariablesarealsocalledstaticvariables,andarecontrastedwithlocalvariables.Globalvariablesaresharedbyallfunctions.Globalvariablesremaininexistencethroughouttheentireprogram,whilelocalvariablescomeintoexistencewhenafunctioniscalledandgooutofexistencewhenthefunctionreturns.

Whilelocalvariablesareoftenkeptinregistersoronthestackatunpredictablelocations,globalvariablesareplacedat2ixedlocationsinmemory.Unfortunatelytheselocationsareoftenfarawayfromthecodethataccessesthem.PC-relativeaddressingcanbeused,butisoftenineffectivesincetherequiredoffsetscanbelarge.Absoluteaddressescanoftenbeproblematictoo,sincetheabsoluteaddressescanalsobelarge,whichwouldnecessitateusingextrainstructions.

ThesolutionsupportedbytheRISC-Vcallingconventionistoplacetheglobalvariablestogetherandinitializearegistertopointtothisarea.Byconvention,registerx3isusedforthis.Thentheindividualvariablescanbeconvenientlyaddressedbyusingasmalloffsetfromtheglobalpointer.

Theglobalpointeristypicallyinitializedearlyintheprogramandneverchanged.So,insomesense,itis“callee-saved”althoughwedidn’tmarkitassuchinthetableabove.

Notallprogramswillusethistechnique,sothisregistermayactuallybeusedforsomethingcompletelydifferent.

Registerx4-TheThreadBasePointer(“tp”)

Inmulti-threadedprograms,eachthreadmayhaveacollectionof“thread-speci2ic”variables,whicharenotlocaltoanyfunction.Instead,theyaresharedbyallcoderunninginthisthread,butareinvisibletootherthreads.

Thread-speci2icvariablesaredistinctfromglobalvariables.Thereisonlyonecopyofeachglobalvariableandeachvariablehasauniqueoffsetfromtheglobalbaseregister,x3/gp.Thethreadbaseregister(tp)hasthesamevalueinallcoderunninginallthreads,whichcausesallcodetoaccessthatsamesetofvariables,regardlessofwhichthreaditisexecutingin.

RISC-VArchitectureSummary/Porter Page� of� 152 323

Page 153: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Inamulti-threadedprogram,eachthreadhasauniqueregisterset.WhentheOSswitchesfromonethreadtoanother,allregistersfromthepreviouslyrunningthreadwillbesavedtosomeareaofmemoryandthenewlyactivethreadwillhaveitsregistersrestoredfromadifferentmemoryarea.Ingeneral,theregistersinonethreadwillbecompletelydifferentfromtheregistersinanotherthread,eventhoughthethreadsarecooperatingandexecutingintheexactsameaddressspace.Forexample,eachthreadwillhaveitsownstack,sothex2(sp)registerineachthreadwillcontainadifferentvalue.

Sinceallthreadssharetheglobalvariables,registerx3(gp)willhavethesamevalueinallthreads.Inaddition,eachthreadcanhaveitsownprivatesetofvariables,whicharecalled“thread-speci2ic”variables.Byconvention,thisblockofvariableswillbepointedtobyregisterx4(tp),soeachthreadwillhaveadifferentvalueinitsx4register.Asinglesectionofcodethataccessesthread-speci2icvariableswillaccessadifferentsetofvariables,dependingonwhichthreaditisrunningin,andthisworksbecauseeachthreadhasauniquevalueinitsx4(tp)register.

Manyapplicationsarenotmulti-threaded,orcontainnothread-speci2icvariables.Insuchcases,itseemstheconventionleavesunspeci2iedhowx4istobeused.

Thethreadpointeristypicallyinitializedearlyintheexecutionofanewthreadandneverchanged.So,insomesense,itis“callee-saved”althoughwedidn’tmarkitassuchinthetableabove.

Registerx5-x7,x28-x31-TempRegisters(“t0-t6”)

Byconvention,thereare7registerswhicharefreetobeusedbyanyfunctionwithoutsavingtheirpreviouscontents.Typicallythecompilerwillplacetheintermediateresultsofcomputationintotempregisters.Insomecases,thecompilermayplacelocalvariablesinthetempregisters.

Thetempregistersare“callersaved”,whichmeansthatthecodewithinafunction“foo”isnotrequiredtosavetheirpreviouscontentsbeforeusingthem.Ontheotherhand,sincethisruleappliestoallfunctions,itmustbeassumedthatcallinganyotherfunction(suchas“bar”)mayresultintheseregistersbeingchanged.Thus,ifthecaller(foo)caresaboutthevalueofatempregister,itmustsavetheregisterbeforecallingbar.

RISC-VArchitectureSummary/Porter Page� of� 153 323

Page 154: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Registerx8,x9,x18-x27-SavedRegisters(“s0-s11”)

Byconvention,thereare12registerswhicharedesignatedascallee-savedregisters.Theconventionisthatafunctionmustnotmodifytheseregisters.Ifsomefunctionwishesorneedstouseoneoftheseregisters,thatfunctionmust2irstsaveitspreviousvalueandthenrestorethatvaluebeforereturning.

Thismakestheseregisterespeciallyusefulinstoringlocalvariablesthatmustbepreservedacrosscallstootherfunctions.Thereisareasonablechancethatthecalledfunctionwillnottouchtheseregistersatall,sosaving/restoringtomainmemoryisoftenavoidedcompletely.

Twooftheseregisters(x8/s0andx9/s1)areespeciallyeasytoaccessusingthe16bitcompressedinstructions.Thereforethecompilershouldprefertouseoneofthesetworegisterswheneverpossible.

Registerx8-FramePointer(“fp”)

Registerx8(whichisalsos0,oneofthecallee-savedregisters)isdesignatedastheframepointerandgivenasecondname,“fp”.

Noteveryfunctionwillneedaframepointerand,forafunctionthatdoesnotneedaframepointer,thisregistercanbeviewedass0,justanothercallee-savedregister.

Tounderstandthepurposeofaframepointer,considertwosortsoffunction.

The2irstsortoffunctionwillhaveasmallnumberoflocalvariables.Furthermore,eachofthelocalvariableswillhavea2ixedsize,knownatcompile-time.Forsuchafunction,thestackframecanbelaidoutbythecompilerandalloffsetswithinthestackframewillbeknown,smallnumbers.Thissortoffunctioncanaccessitsvariablesinthestackframebyusingpositiveoffsetsfromthestacktoppointer(x2).Thissortofafunctionwillnotneedaframepointer.

Thesecondsortoffunctionmayhavealargenumberoflocalvariablesormaycreatealocalvariable(suchasadynamicarray)whosesizeisnotknownuntilruntimewhenthefunctionisactuallycalled.Insuchafunction,thecompilerwillnotknow

RISC-VArchitectureSummary/Porter Page� of� 154 323

Page 155: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

thelayoutoftheframe.Thelocalvariablescannotbeaccessedusingsmallpositiveoffsetsfromthestackpointerregister,x2.

Theideawithusingaframepointeristhis:Uponentrytothefunction,theframepointerregisterwillbesettoaknown,2ixedvalue.Typically,thiswouldbetheoldvalueofthestackpointerbeforethenewstackframeisallocated.Then,thestackframeisallocatedinacoupleofsteps.Firstthe2ixedsizedvariablesareallocatedbyadjustingthestacktoppointer.Thenthedynamicallysizedvariablesareallocated,bydecrementingthestacktoppointerbyvaluesdeterminedatruntime.Afterthisinitializationandcreationoflocalvariables,theframepointerisleftpointingtooneend(thetop,higherend)oftheframe(ormorepreciselytothe2irstbyteofthepreviousframe)andthestackpointerisleftpointingtothebottom,lowerendoftheframe.Offsetsrelativetothestacktoppointerareproblematic,sincethecompilerwillnotknowexactlyhowmuchthestackpointerwasadjusted.Instead,thelocalvariableswillbeaccessedusingnegativeoffsetsrelativetotheframepointer.

(Inanotherapproach,thelocalvariablescanbepushedontothestack2irst.Thentheframepointerisinitializedtothecurrentstacktop.Finally,thedynamicallysizedvariablesarepushedontothestack.Thisallowsthelocalvariablestobeaccessedusingapositiveoffsettotheframepointer.IntheRISC-Vdesign,positiveoffsetsarepreferredinthe16bitcompressedinstructionformat,sothismightbeabetterapproach.)

Theframepointeriscalleesaved.Thismeansthatthevalueoftheregistermustberestoredbeforethefunctionreturns.Typically,thefunctionprologuewillstorethepreviousvalueofthefpregisterintheframeandthefunctionepiloguewillrestorefprightbeforereturning.

Registerx10-x17-ArgumentRegisters(“a0-a7”)

Typically,argumentstofunctionsarepassedinregistersandthisgroupofregistersismeanttobeusedforthatpurpose.

Also,avaluethatistobereturnedfromafunctionwillusuallybereturnedinaregister.

RISC-VArchitectureSummary/Porter Page� of� 155 323

Page 156: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Thereturnvaluewillbereturnedinx10(a0),orifthevalueistoolargeto2itintoasingleregister,itwillbereturnedintheregisterpairx10-x11(a0-a1).Thisisthesameregister(s)thatisusedtopassinthe2irstargument.

CallingConventionsforArguments

Argumentsarenormallypassedtoafunctioninregistersandthecallingconventiondetailsexactlyhowargumentvaluesaretobeplacedinregisters.

Iftherearetoomanyarguments,thenthe2irstargumentsarepassedintheregisters,andtheremainderarepassedonthestack.Also,ifanindividualargumentistoolarge,itwillbepassedinmemoryandtheregisterwillcontainapointertothismemoryregion.

Ifthereturnedvalueissmallenough,itwillbereturnedina0-a1,otherwiseitisreturnedinmemory.

Here,wesummarizethecallingconventionsforRV32,the32bitarchitecture.

• Floatsarepassedinthe2loatingpointregisters,iftheyexist.Otherwise,2loatsarepassedinthegeneralpurposeregisters,justlikeints.

• Othervaluesbrides2loats(e.g.,ints,chars,bools,pointers,structs)arepassedinthe8generalpurposeregistersa0-a7,aslongastheyarenolargerthan64bits.So,ifthevaluewill2itintooneortworegisters,theywillbepassedinregisters.

• Valuessmallerthan32bitsaresign-extendedto32bitsandpassedinaregister.[Apparently,theyare“widenedaccordingtothesignoftheirtype,thensignextended”;myinterpretationmaybeincorrect.???]

• Valuesof32bitsarepassedinasingleregister.

• Valuesofupto64bitsinsizearepassedinaregisterpair.Theleastsigni2icant32-bitswillbeplacedintheregisterwiththesmallernumber,inlittleendianstyle.

RISC-VArchitectureSummary/Porter Page� of� 156 323

Page 157: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

• Valueslargerthan64bitsarealwayspassed“byreference”.Thismeansthatthevalueisplacedinaregionofmemory.Thisregioniseitherinthecaller’sstackframeorispushedontothestackatthetimeofthefunctioncall.Theaddresstothisregionappearsinthelistofargumentsandispassedinaregister,inplaceofthevalue.[Itseemsthatanaddressisalwaysused,regardlessofwhetherornotthereisroomintheregistersfortheaddress.]

• Iftherearenotenoughregisters,onlythe2irstfewargumentswillbepassedinregistersandtheremainderwillbepassedinmemory.Theextra,remainingargumentswillpushedontothestack,sotheywillbeatknownoffsetsfromthestacktoppointeruponentrytothecalledfunction.Thus,theycanbeeasilyaccessedusingpositiveoffsetsfromtheframepointerregister(fp).

• Iftherearenotquiteenoughregisters,a64bitvaluecanbesplitintotwo32-bitparts,withtheleastsigni2icantwordpassedinaregisterandthemostsigni2icantwordpassedinmemory.

• Followingthe“C”languageconvention,argumentsarealways“passedbyvalue”,whichmeanstheyareeffectivelylocalvariableswhichcanbemodi2ied;afterthefunctionreturns,themodi2iedvalueislost.Forexample,structspassedasargumentswillbecopiedandanychangestothestuccowillbelost.Ifthisisnotwhattheprogrammerwants,thenhe/shecanexplicitlypassanaddress,inwhichcasethepointeris“passedbyvalue”,andthevaluepointedtoiseffectively“passedbyreference”.

• Allvaluespassedinmemory(e.g.,argumentsinthestackframe)arealigned,accordingtothesizeofthevalue.

• Ifthereturnedvalueis64bitsorless,itwillbereturnedinregistera0-a1(orthe2loatingpointregisters,ifapplicable).

• Ifthereturnedvalueislargerthan64bits,thecallerwillallocatememoryspaceforthereturnedvalue(inthecaller’sstackframe)andwillpassapointertothisblockofmemory.Thepointerwillbepassedasthe2irstargument,implicitlyinsertedinfrontoftheotherarguments.

Sixoftheseregisters(x10-x15,i.e.a0-a5)areespeciallyeasytoaccessusingthe16bitcompressedinstructions.Therefore,themajorityoffunctions(whichhave6orfewerarguments)canbecalledusingregistersalone.

RISC-VArchitectureSummary/Porter Page� of� 157 323

Page 158: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Commentary:Theargumentregistersareotherwisetreatedliketemporaries,inthattheyarecallersaved,i.e.,notpreservedacrosscalls.Ifanargumentregisterisnotusedforpassingarguments,thenititisfreetobeusedasatemporaryworkregister.

Itisunclearwhyitisnecessarytodifferentiatebetweenthesetwoclasses.Whycan’ttempregistersbeusedtopassargumentsincaseswheretherearealotofarguments?Ifafunctionwithmanyargumentstrulyneedsatemporaryregister,thenitcansavesomeofitsargumentsinitsstackframe.Thisisnoextrawork,sincethecallerwouldhavehadtosavethevaluesinitsframeotherwise.Anditwillprobablybethecasethatthecalleefunctioncanbeabitsmarteraboutwhathastobesavedinmemorythanthecaller.

FloatingPointRegisters

Herearethe2loatingpointregisters,withtheirintendedusage.The16bitcompressedinstructionsfavorregistersf8-f15andthesearehighlightedbelow.

RISC-VArchitectureSummary/Porter Page� of� 158 323

Page 159: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter5:RegisterandCallingConventions

Other SavedAcross Name Name Description Calls f0 ft0 Temp f1 ft1 Temp f2 ft2 Temp f3 ft3 Temp f4 ft4 Temp f5 ft5 Temp f6 ft6 Temp

f7 ft7 Temp f8 fs0 SavedReg yes/calleesaved f9 fs1 SavedReg yes/calleesaved f10 fa0 Functionargument f11 fa1 Functionargument f12 fa2 Functionargument f13 fa3 Functionargument f14 fa4 Functionargument f15 fa5 Functionargument f16 fa6 Functionargument f17 fa7 Functionargument f18 fs2 SavedReg yes /calleesaved f19 fs3 SavedReg yes /calleesaved f20 fs4 SavedReg yes /calleesaved f21 fs5 SavedReg yes /calleesaved f22 fs6 SavedReg yes /calleesaved f23 fs7 SavedReg yes /calleesaved f24 fs8 SavedReg yes /calleesaved f25 fs9 SavedReg yes /calleesaved f26 fs10 SavedReg yes /calleesaved f27 fs11 SavedReg yes /calleesaved f28 ft8 Temp f29 ft9 Temp f30 ft10 Temp f31 ft11 Temp

RISC-VArchitectureSummary/Porter Page� of� 159 323

Page 160: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

CompressedInstructions

Normally,RISC-Vinstructionsare32bitsinlength.However,the“C”(CompressedInstructions)extensionaddsinstructionswhichare16bitslong.

The“C”extensionmerelyaddsinstructionstotheISA;alltheother32bitinstructionsremainvalid,unchangedandfullyfunctional.

Each16bitinstructionisanabbreviationforalonger32bitinstruction.Thus,nonewfunctionalityisadded.Thepurposeofthe16bitinstructionsistoreducecodesize:byreplacinglongerinstructionswiththeirequivalentshorterversions,codesizeisreduced.

Notall32bitinstructionshaveanequivalent16bitcounterpart.ThegoaloftheRISC-Vdesignistoprovide16bitinstructionsforthemostpopularandfrequentlyoccurring32bitinstructions,therebyreducingthecodesizeasmuchaspossible.Ablockofcodeinwhichall32bitinstructionshappentohave16bitcounterpartscanbereducedtohalfitssize.Buttypicalcodesequenceswillcontainsomeinstructionswhichhaveno16bitcounterparts,sothereductioninsizewillnotbeasgreat.Thedesignersestimatean25%reductionincodesize.

InotherISAdesignsthereisa“compressedmode”:Whentheprocessorisplacedincompressedmode,theshorterinstructionsareexecuted.RISC-Vdoesnotworkthisway.TheRISC-Vdesignhascarefullyencodedtheinstructionssothatthelengthoftheinstructioncanbedeterminedandthebytesfetchedfrommemorycanbeunambiguouslyinterpretedaseither16bitor32bitinstructions.Thus,theprogrammercanfreelyintermixlongandshortinstructions.

RISC-VArchitectureSummary/Porter Page� of�160 323

Page 161: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

InstructionFormats

Thereareseveralinstructionformatsusedforthecompressedinstructions.Inlistingtheformatsbelow,wewillusetheseabbreviationsfor2ields:

------ OpcodeBitsDDDDD RegDDDD RegD(x8..x15only)

aaaaa Reg1aaa Reg1(x8..x15only)bbbbb Reg2bbb Reg2(x8..x15only)VVVVV ImmediateValueXXXXX Offset

Herearethemaininstructionforms.(Eachcharacterindicatesasinglebit.)

RegisterFormat----aaaaabbbbb------DDDDDbbbbb--

ImmediateFormat---VaaaaaVVVVV-----VDDDDDVVVVV--

WideImmediateFormat---VVVVVVVVDDD--

Stack-relativeStoreFormat---VVVVVVbbbbb--

LoadFormat---VVVaaaVVDDD--

StoreFormat---VVVaaaVVbbb--

BranchFormat---XXXaaaXXXXX--

JumpFormat---XXXXXXXXXXX--

Theseformatsareschematic.Fortheindividualinstructionsdescribedinthefollowingsections,wegivetheexactencoding,includingtheactualopcodebits,aswellasotherencodingconstraints.

RISC-VArchitectureSummary/Porter Page� of� 161 323

Page 162: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

LoadInstructions

C.LW-LoadWord

GeneralForm:C.LW RegD,Immed-6(Reg1)

Description:Movea32bitvaluefrommemoryintoregisterRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedasapositiveoffsettothebaseaddressinregisterReg1.

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

RegistersRegDandReg1arerestrictedtox8..x15.Encoding:

010VVVaaaVVDDD00WhereDDD=RegD.

Whereaaa=Reg1.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.

Equivalent32BitInstruction:LW RegD,Offset(Reg1)

C.LW-LoadDoubleword

GeneralForm:C.LD RegD,Immed-6(Reg1)

Description:Movea64bitvaluefrommemoryintoregisterRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedasapositiveoffsettothebaseaddressinregisterReg1.

RISC-VArchitectureSummary/Porter Page� of� 162 323

Page 163: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

RegistersRegDandReg1arerestrictedtox8..x15.Encoding:

011VVVaaaVVDDD00WhereDDD=RegD.

Whereaaa=Reg1.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.OnlyavailableforRV64andRV128.

Equivalent32BitInstruction:LD RegD,Offset(Reg1)

C.LQ-LoadQuadword

GeneralForm:C.LQ RegD,Immed-6(Reg1)

Description:Movea128bitvaluefrommemoryintoregisterRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby16,andaddedasapositiveoffsettothebaseaddressinregisterReg1.

SincetheImmed-6valueisscaledby16,thisgivesaneffectiverangeof0..1,008,inmultiplesof16.

RegistersRegDandReg1arerestrictedtox8..x15.Encoding:

001VVVaaaVVDDD00 WhereaaadenotesReg1.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.OnlyavailableforRV128.

Equivalent32BitInstruction:LQ RegD,Offset(Reg1)

RISC-VArchitectureSummary/Porter Page� of� 163 323

Page 164: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.FLW-LoadSingleFloat

GeneralForm:C.FLW FRegD,Immed-6(Reg1)

Description:Movea32bitvaluefrommemoryinto2loatingpointregisterFRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedasapositiveoffsettothebaseaddressingeneralpurposeregisterReg1.

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

RegisterFRegDisrestrictedtof8..f15.

RegisterReg1isrestrictedtox8..x15.Encoding:

011VVVaaaVVDDD00 WhereaaadenotesReg1.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.Requires“F”(SinglePrecisionFloatingPoint)extension.OnlyavailableforRV32.

Equivalent32BitInstruction:FLW FRegD,Offset(Reg1)

C.FLD-LoadDoubleFloat

GeneralForm:C.FLD FRegD,Immed-6(Reg1)

Description:Movea64bitvaluefrommemoryinto2loatingpointregisterFRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedasapositiveoffsettothebaseaddressingeneralpurposeregisterReg1.

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

RegisterFRegDisrestrictedtof8..f15.

RISC-VArchitectureSummary/Porter Page� of� 164 323

Page 165: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

RegisterReg1isrestrictedtox8..x15.Encoding:

001VVVaaaVVDDD00 WhereaaadenotesReg1.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.Requires“D”(DoublePrecisionFloatingPoint)extension.OnlyavailableforRV32andRV64.

Equivalent32BitInstruction:FLD FRegD,Offset(Reg1)

C.LWSP-LoadWordfromStackFrame

GeneralForm:C.LWSP RegD,Immed-6

Description:Movea32bitvaluefrommemoryintoregisterRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

Thedestinationregistercanbeanyofthe32registers,exceptx0.Encoding:

010VDDDDDVVVVV10WhereDDDDD=RegDandcannotbex0=00000.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.

Equivalent32BitInstruction:LW RegD,Offset(x2)

RISC-VArchitectureSummary/Porter Page� of� 165 323

Page 166: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.LDSP-LoadDoublewordfromStackFrame

GeneralForm:C.LDSP RegD,Immed-6

Description:Movea64bitvaluefrommemoryintoregisterRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

Thedestinationregistercanbeanyofthe32registers,exceptx0.Encoding:

011VDDDDDVVVVV10WhereDDDDD=RegDandcannotbex0=00000.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.OnlyavailableforRV64andRV128.

Equivalent32BitInstruction:LD RegD,Offset(x2)

C.LQSP-LoadQuadwordfromStackFrame

GeneralForm:C.LQSP RegD,Immed-6

Description:Movea128bitvaluefrommemoryintoregisterRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby16,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby16,thisgivesaneffectiverangeof0..1008inmultiplesof16.

Thedestinationregistercanbeanyofthe32registers,exceptx0.

RISC-VArchitectureSummary/Porter Page� of� 166 323

Page 167: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Encoding:001VDDDDDVVVVV10

WhereDDDDD=RegDandcannotbex0=00000.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.OnlyavailableforRV128.

Equivalent32BitInstruction:LQ RegD,Offset(x2)

C.FLWSP-LoadSingleFloatfromStackFrame

GeneralForm:C.FLWSP FRegD,Immed-6

Description:Movea32bitvaluefrommemoryinto2loatingpointregisterFRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

Thedestinationregistercanbeanyofthe322loatingpointregisters.Encoding:

011VDDDDDVVVVV10WhereDDDDD=FRegD.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.Requires“F”(SinglePrecisionFloatingPoint)extension.OnlyavailableforRV32.

Equivalent32BitInstruction:FLW FRegD,Offset(x2)

RISC-VArchitectureSummary/Porter Page� of� 167 323

Page 168: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.FLDSP-LoadDoubleFloatfromStackFrame

GeneralForm:C.FLDSP FRegD,Immed-6

Description:Movea64bitvaluefrommemoryinto2loatingpointregisterFRegD.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

Thedestinationregistercanbeanyofthe322loatingpointregisters.Encoding:

001VDDDDDVVVVV10WhereDDDDD=FRegD.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.Requires“D”(DoublePrecisionFloatingPoint)extension.OnlyavailableforRV32andRV64.

Equivalent32BitInstruction:FLD FRegD,Offset(x2)

StoreInstructions

C.SW-StoreWord

GeneralForm:C.SW Reg2,Immed-6(Reg1)

Description:Movea32bitvaluefromregisterReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedasapositiveoffsettothebaseaddressinregisterReg1.

RISC-VArchitectureSummary/Porter Page� of� 168 323

Page 169: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

RegistersReg1andReg2arerestrictedtox8..x15.Encoding:

110VVVaaaVVbbb00 WhereaaadenotesReg1andbbbdenotesReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

SW Reg2,Offset(Reg1)

C.SD-StoreDoubleword

GeneralForm:C.SD Reg2,Immed-6(Reg1)

Description:Movea64bitvaluefromregisterReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedasapositiveoffsettothebaseaddressinregisterReg1.

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

RegistersReg1andReg2arerestrictedtox8..x15.Encoding:

111VVVaaaVVbbb00 WhereaaadenotesReg1andbbbdenotesReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.OnlyavailableforRV64andRV128.

Equivalent32BitInstruction:SD Reg2,Offset(Reg1)

RISC-VArchitectureSummary/Porter Page� of� 169 323

Page 170: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.SQ-StoreQuadword

GeneralForm:C.SQ Reg2,Immed-6(Reg1)

Description:Movea128bitvaluefromregisterReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby16,andaddedasapositiveoffsettothebaseaddressinregisterReg1.

SincetheImmed-6valueisscaledby16,thisgivesaneffectiverangeof0..1,008,inmultiplesof16.

RegistersReg1andReg2arerestrictedtox8..x15.Encoding:

101VVVaaaVVbbb00 WhereaaadenotesReg1andbbbdenotesReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.OnlyavailableforRV128.

Equivalent32BitInstruction:SQ Reg2,Offset(Reg1)

C.FSW-StoreSingleFloat

GeneralForm:C.FSW FReg2,Immed-6(Reg1)

Description:Movea32bitvaluefrom2loatingpointregisterFReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedasapositiveoffsettothebaseaddressingeneralpurposeregisterReg1.

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

RegisterReg1isrestrictedtox8..x15.RegisterFReg2isrestrictedtof8..f15.

RISC-VArchitectureSummary/Porter Page� of� 170 323

Page 171: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Encoding:111VVVaaaVVbbb00

WhereaaadenotesReg1andbbbdenotesFReg2.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.Requires“F”(SinglePrecisionFloatingPoint)extension.OnlyavailableforRV32.

Equivalent32BitInstruction:FSW FReg2,Offset(Reg1)

C.FSD-StoreDoubleFloat

GeneralForm:C.FSD FReg2,Immed-6(Reg1)

Description:Movea64bitvaluefrom2loatingpointregisterFReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedasapositiveoffsettothebaseaddressingeneralpurposeregisterReg1.

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

RegisterReg1isrestrictedtox8..x15.RegisterFReg2isrestrictedtof8..f15.Encoding:

101VVVaaaVVbbb00 WhereaaadenotesReg1andbbbdenotesFReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.Requires“D”(DoublePrecisionFloatingPoint)extension.OnlyavailableforRV32andRV64.

Equivalent32BitInstruction:FSD FReg2,Offset(Reg1)

RISC-VArchitectureSummary/Porter Page� of� 171 323

Page 172: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.SWSP-StoreWordtoStackFrame

GeneralForm:C.SWSP Reg2,Immed-6

Description:Movea32bitvaluefromregisterReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

Thesourceregistercanbeanyofthe32generalpurposeregisters.Encoding:

110VVVVVVbbbbb10 WherebbbbbdenotesReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

SW Reg2,Offset(x2)

C.SDSP-StoreDoublewordtoStackFrame

GeneralForm:C.SDSP Reg2,Immed-6

Description:Movea64bitvaluefromregisterReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

Thesourceregistercanbeanyofthe32generalpurposeregisters.Encoding:

111VVVVVVbbbbb10

RISC-VArchitectureSummary/Porter Page� of� 172 323

Page 173: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

WherebbbbbdenotesReg2.WhereVVVVVV=Immed-6.

Availability:Requires“C”(CompressedInstruction)extension.OnlyavailableforRV64andRV128.

Equivalent32BitInstruction:SD Reg2,Offset(x2)

C.SQSP-StoreQuadwordtoStackFrame

GeneralForm:C.SQSP Reg2,Immed-6

Description:Movea128bitvaluefromregisterReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby16,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby16,thisgivesaneffectiverangeof0..1,016,inmultiplesof16.

Thesourceregistercanbeanyofthe32generalpurposeregisters.Encoding:

101VVVVVVbbbbb10 WherebbbbbdenotesReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.OnlyavailableforRV128.

Equivalent32BitInstruction:SQ Reg2,Offset(x2)

C.FSWSP-StoreSingleFloattoStackFrame

GeneralForm:C.FSWSP FReg2,Immed-6

RISC-VArchitectureSummary/Porter Page� of� 173 323

Page 174: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Description:Movea32bitvaluefromregisterFReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby4,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby4,thisgivesaneffectiverangeof0..252,inmultiplesof4.

Thesourceregistercanbeanyofthe322loatingpointregisters.Encoding:

111VVVVVVbbbbb10 WherebbbbbdenotesFReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.Requires“F”(SinglePrecisionFloatingPoint)extension.OnlyavailableforRV32.

Equivalent32BitInstruction:FSW FReg2,Offset(x2)

C.FSDSP-StoreDoubleFloattoStackFrame

GeneralForm:C.FSDSP FReg2,Immed-6

Description:Movea64bitvaluefromregisterFReg2tomemory.Toformtheaddress,theImmed-6valueiszero-extended,multipliedby8,andaddedtothestackpointer(x2).

SincetheImmed-6valueisscaledby8,thisgivesaneffectiverangeof0..504,inmultiplesof8.

Thesourceregistercanbeanyofthe322loatingpointregisters.Encoding:

101VVVVVVbbbbb10 WherebbbbbdenotesFReg2.

WhereVVVVVV=Immed-6.Availability:

Requires“C”(CompressedInstruction)extension.

RISC-VArchitectureSummary/Porter Page� of� 174 323

Page 175: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Requires“D”(DoublePrecisionFloatingPoint)extension.OnlyavailableforRV32andRV64.

Equivalent32BitInstruction:FSD FReg2,Offset(x2)

Jump,Call,andConditionalBranchInstructions

C.J-Jump(PC-relative)

GeneralForm:C.J Immed-11

Description:AjumpistakenusingaPC-relativeoffset.The11bitimmediatevalueismultipliedbytwo(sinceinstructionsmustbehalfwordaligned),sign-extended,andaddedtotheprogramcounter(PC),togivethetargetaddress.

SincetheImmed-11valueisscaledby2,thisgivesaneffectiverangeof-2,048..+2,046,inmultiplesof2.

Encoding:101XXXXXXXXXXX01

WhereXXXXXXXXXXXdenotestheImmed-11offset.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

JAL x0,Offset

C.JAL-JumpandLink/Call(PC-relative)

GeneralForm:C.JAL Immed-11

Description:AfunctioncallistakenusingaPC-relativeoffset.The11bitimmediatevalueismultipliedbytwo(sinceinstructionsmustbehalfwordaligned),sign-extended,andaddedtotheprogramcounter(PC),togivethetargetaddress.

RISC-VArchitectureSummary/Porter Page� of� 175 323

Page 176: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

SincetheImmed-11valueisscaledby2,thisgivesaneffectiverangeof-2,048..+2,046,inmultiplesof2.

Thereturnaddress(i.e.,theaddressoftheinstructionfollowingtheC.JAL)issavedinregisterx1(alsoknownasregisterra).

Encoding:001XXXXXXXXXXX01

WhereXXXXXXXXXXXdenotestheImmed-11offset.Availability:

Requires“C”(CompressedInstruction)extension.OnlyavailableforRV32.

Equivalent32BitInstruction:JAL x1,Offset

C.JR-JumpRegister

GeneralForm:C.JR Reg1

Description:AjumpistakentotheaddressinregisterReg1.

RegisterReg1canbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.Encoding:

1000aaaaa0000010 WhereaaaaadenotesReg1andcannotbex0=00000.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

JAL x0,Reg1,0

C.JALR-JumpRegisterandLink/CallGeneralForm:

C.JALR Reg1 Description:

AfunctioncallistakentotheaddressinregisterReg1.

Thereturnaddress(i.e.,theaddressoftheinstructionfollowingtheC.JALR)issavedinregisterx1(alsoknownasregisterra).

RISC-VArchitectureSummary/Porter Page� of� 176 323

Page 177: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

RegisterReg1canbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.Encoding:

1001aaaaa0000010 WhereaaaaadenotesReg1andcannotbex0=00000.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

JALR x1,Reg1,0

C.BEQZ-BranchifZero(PC-relative)

GeneralForm:C.BEQZ Reg1,Immed-8

Description:AconditionalbranchistakenusingaPC-relativeoffset.ThebranchistakenifthevalueinregisterReg1iszero.

The8bitimmediatevalueismultipliedbytwo(sinceinstructionsmustbehalfwordaligned),sign-extended,andaddedtotheprogramcounter(PC),togivethetargetaddress.

SincetheImmed-8valueisscaledby2,thisgivesaneffectiverangeof-256..+254,inmultiplesof2.

Encoding:110XXXaaaXXXXX01

WhereXXXXXXXXdenotestheImmed-8offset. WhereaaadenotesReg1.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

BEQ Reg1,x0,Offset

C.BNEQZ-BranchifNotZero(PC-relative)

GeneralForm:C.BNEQZ Reg1,Immed-8

RISC-VArchitectureSummary/Porter Page� of� 177 323

Page 178: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Description:AconditionalbranchistakenusingaPC-relativeoffset.ThebranchistakenifthevalueinregisterReg1isnotzero.

The8bitimmediatevalueismultipliedbytwo(sinceinstructionsmustbehalfwordaligned),sign-extended,andaddedtotheprogramcounter(PC),togivethetargetaddress.

SincetheImmed-8valueisscaledby2,thisgivesaneffectiverangeof-256..+254,inmultiplesof2.

Encoding:111XXXaaaXXXXX01

WhereXXXXXXXXdenotestheImmed-8offset. WhereaaadenotesReg1.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

BNE Reg1,x0,Offset

LoadConstantintoaRegister

C.LI-LoadImmediate

GeneralForm:C.LI RegD,Immed-6

Description:Loadanimmediatevalueintherange-32..+31intoregisterRegD.

Thetargetregistercanbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.

The6bitimmediatevalueissign-extended.Encoding:

010VDDDDDVVVVV01 WhereVVVVVVdenotestheImmed-6value. WhereDDDDDdenotesRegD,andcannotbex0=00000.

RISC-VArchitectureSummary/Porter Page� of� 178 323

Page 179: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Availability:Requires“C”(CompressedInstruction)extension.

Equivalent32BitInstruction:ADDI RegD,x0,Value

C.LUI-LoadUpperImmediate

GeneralForm:C.LUI RegD,Immed-6

Description:ThisinstructionisashortformfortheLUIinstruction,butwitharestrictedrangeofvalues.Thisinstructionloadsbits[17:12]ofthedestinationregister.Theupperbits[…:18]are2illedwiththesignextension.Thelower12bits[11:0]are2illedwithzeros.

Toputitanotherway,theimmediatevalueissignextended,multipliedby4,096,andloadedintothedestinationregister.SincetheImmed-6valueisscaledby212=4,096,thisgivesaneffectiverangeofvaluesof-131,040..+126,976,inmultiplesof4,096.

Thetargetregistercanbeanygeneralpurposeregisterexceptx0orx2.(Registerx2isnormallyusedforthestackpointer.)

Encoding:011VDDDDDVVVVV01

WhereVVVVVVdenotestheImmed-6value. WhereDDDDDdenotesRegD,andcannotbex0=00000orx2=00010.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

LUI RegD,Value

RISC-VArchitectureSummary/Porter Page� of� 179 323

Page 180: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Arithmetic,Shift,andLogicInstructions

C.ADDI-AddImmediate

GeneralForm:C.ADDI RegD,Immed-6

Description:Addsanimmediatevalueintherange-32..+31toregisterRegD.

Theregistercanbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.

The6bitimmediatevalueissign-extended.Encoding:

000VDDDDDVVVVV01 WhereVVVVVVdenotestheImmed-6valueandcannotbezero. WhereDDDDDdenotesRegD,andcannotbex0=00000.Comment: IfVVVVVV=000000andDDDDD=00000,thisencodingisidenticaltoC.NOP. IfVVVVVV≠000000andDDDDD=00000,thisencodingcanbeahint. IfVVVVVV=000000andDDDDD≠00000,thisencodingisreserved.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

ADDI RegD,RegD,Value

C.ADDIW-AddImmediate(Word)

GeneralForm:C.ADDIW RegD,Immed-6

Description:Addsanimmediatevalueintherange-32..+31toregisterRegD.

Theregistercanbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.

The6bitimmediatevalueissign-extended.

ThepreviouslydescribedC.ADDIinstructionperformstheadditionineither32,64,or128bits,asdeterminedbythearchitecture,RV32,RV64,orRV128.

RISC-VArchitectureSummary/Porter Page� of� 180 323

Page 181: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

ThisinstructionC.ADDIWwilltruncatetheresultto32bits,thensignextendedthe32bitstothefullregistersize.

Avalueofzeroisallowed.Thiswillsimplyforcealargervalueintoa32bitvalue,signextendingtothefullregistersize.

Encoding:001VDDDDDVVVVV01

WhereVVVVVVdenotestheImmed-6value.(Zeroisallowed). WhereDDDDDdenotesRegD,andcannotbex0=00000.Comment: IfDDDDD=00000,thisencodingcanbeahint.Availability:

Requires“C”(CompressedInstruction)extension.OnlyavailableforRV64andRV128.

Equivalent32BitInstruction:ADDIW RegD,RegD,Value

C.ADDI16SP-AddImmediatetoStackPointer

GeneralForm:C.ADDI16SP Immed-6

Description:Thisinstructionisusedtogroworshrinkthestackbyadjustingthestackpointerregister.

The6bitimmediatevalueismultipliedby16,sign-extended,andaddedtothestackpointerregister(x2).

Itisassumedthatthestackpointerisalwaysquadwordaligned,i.e.,amultipleof16.SincetheImmed-6valueisscaledby16,thisgivesaneffectiveadjustmentvalueof-512..+496,inmultiplesof16.

Avalueofzeroisnotallowed,sinceadjustingthestackpointerbyzeroispointless.

Encoding:011V00010VVVVV01

WhereVVVVVVdenotestheImmed-6value,andcannotbe000000.Availability:

Requires“C”(CompressedInstruction)extension.

RISC-VArchitectureSummary/Porter Page� of� 181 323

Page 182: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Equivalent32BitInstruction:ADDI x2,x2,Value

C.ADDI4SPN-AddImmediatetoStackPointer

GeneralForm:C.ADDI14SPN RegD,Immed-8

Description:Thisinstructionisusedtoloadtheaddressofalocalstackvariableintoaregister.Weassumethevariableislocatedwiththestackframe,thatthestackgrowsdownward,andthatthestackpointerpointstothelowestaddress.Therefore,alllocalvariables(locatedwithinthestackframe)willbeaccessedwithpositiveoffsetsfromregisterx2(thestackpointer).

The8bitimmediatevalueismultipliedby4,zero-extended,andaddedtothestackpointerregister(x2),andmovedintothedestinationregisterRegD.

SincetheImmed-8valueisscaledby4,thisgivesanoffsetintherange-512..+508,inmultiplesof4.

Thedestinationregisterislimitedtox8,x9,…x15.

Avalueofzeroisnotallowed,sinceaMVinstructioncanbeusedinstead.Encoding:

000VVVVVVVVDDD00 WhereVVVVVVVVdenotesImmed-8,andcannotbe00000000. WhereDDDdenotesRegD.Availability:

Requires“C”(CompressedInstruction)extension.AvailableonlyforRV32andRV64.

Equivalent32BitInstruction:ADDI RegD,x2,Value

C.SLLI-ShiftLeftLogical(Immediate)

GeneralForm:C.SLLI RegD,Immed-6

RISC-VArchitectureSummary/Porter Page� of� 182 323

Page 183: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Description:ThisinstructionshiftsthevalueinaregisterRegDleftbyanamountspeci2iedbytheImmed-62ield.

ForRV32(32-bitarchitectures),theshiftamountmustbe1..31(i.e.,000001-011111).

ForRV64(64-bitarchitectures),theshiftamountmustbe1..63(i.e.,000001-111111).

ForRV128(128-bitarchitectures),theshiftamountmaybe1..64.(Shiftamountsof1..63areencodedas000001-111111;ashiftamount64isencodedas000000.)

Thedestinationregistercanbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.

Encoding:000VDDDDDVVVVV10

WhereVVVVVVdenotesImmed-6.Seerestrictionsinthedescription. WhereDDDDDdenotesRegD,andcannotbex0=00000.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

SLLI RegD,RegD,Value

C.SRLI-ShiftRightLogical(Immediate)

GeneralForm:C.SRLI RegD,Immed-6

Description:ThisinstructionshiftsthevalueinaregisterRegDrightbyanamountspeci2iedbytheImmed-62ield.Thisisa“logicalshift,i.e.,zerosareshiftedinfromtheleft.

ForRV32(32-bitarchitectures),theshiftamountmustbe1..31(i.e.,000001-011111).

ForRV64(64-bitarchitectures),theshiftamountmustbe1..63(i.e.,000001-111111).

RISC-VArchitectureSummary/Porter Page� of� 183 323

Page 184: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

ForRV128(128-bitarchitectures),thepossibleshiftamountsare1..31,64,and96-127.[Myreadingofthespecsuggeststhattheencodingoftheshiftamountisthis:000001-011111means1-31.000000means64.100000-111111isinterpretedasasignedvalueintherange-32..-1,whichwecancallX.Theshiftamountisrightby128-Xbits.Thisgivesarangeof96..127,resp.]

Thedestinationregistercanbex8,x9,…x15.Encoding:

100V00DDDVVVVV01 WhereVVVVVVdenotesImmed-6.Seerestrictionsinthedescription. WhereDDDdenotesRegD.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

SRLI RegD,RegD,Value

C.SRAI-ShiftRightArithmetic(Immediate)

GeneralForm:C.SRAI RegD,Immed-6

Description:ThisinstructionshiftsthevalueinaregisterRegDrightbyanamountspeci2iedbytheImmed-62ield.Thisisa“arithmeticshift,i.e.,thesignbitisrepeatedlyshiftedinfromtheleft.

ForRV32(32-bitarchitectures),theshiftamountmustbe1..31(i.e.,000001-011111).

ForRV64(64-bitarchitectures),theshiftamountmustbe1..63(i.e.,000001-111111).

ForRV128(128-bitarchitectures),thepossibleshiftamountsare1..31,64,and96-127.[Myreadingofthespecsuggeststhattheencodingoftheshiftamountisthis:000001-011111means1-31.000000means64.100000-111111isinterpretedasasignedvalueintherange-32..-1,whichwecancallX.Theshiftamountisrightby128-Xbits.Thisgivesarangeof96..127,resp.]

RISC-VArchitectureSummary/Porter Page� of� 184 323

Page 185: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Thedestinationregistercanbex8,x9,…x15.Encoding:

100V01DDDVVVVV01 WhereVVVVVVdenotesImmed-6.Seerestrictionsinthedescription. WhereDDDdenotesRegD.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

SRAI RegD,RegD,Value

C.ANDI-LogicalAND(Immediate)

GeneralForm:C.ANDI RegD,Immed-6

Description:ThisinstructionperformsthelogicalANDoperationwiththevalueinaregisterandwritestheresulttothatsameregister.

TheImmed-6valueissign-extended.

TheregisterRegDcanbex8,x9,…x15.Encoding:

100V10DDDVVVVV01 WhereVVVVVVdenotesImmed-6. WhereDDDdenotesRegD.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

ANDI RegD,RegD,Value

C.MV-MoveRegistertoRegister

GeneralForm:C.MV RegD,Reg2

Description:ThisinstructionmovesavaluefromReg2toRegD.

RISC-VArchitectureSummary/Porter Page� of� 185 323

Page 186: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

TheregistersRegDandReg2canbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.

Encoding:1000DDDDDbbbbb10

WhereDDDDDdenotesRegD. WherebbbbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

ADD RegD,x0,Reg2

C.ADD-AddRegistertoRegister

GeneralForm:C.ADD RegD,Reg2

Description:ThisinstructionaddsthevaluesinRegDandReg2andplacestheresultinRegD.

TheregistersRegDandReg2canbex1,x2,…x31,i.e.,anygeneralpurposeregisterexceptx0.

Encoding:1001DDDDDbbbbb10

WhereDDDDDdenotesRegD. WherebbbbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

ADD RegD,RegD,Reg2

C.AND-AddRegistertoRegister

GeneralForm:C.AND RegD,Reg2

Description:ThisinstructionlogicallyANDsthevaluesinRegDandReg2andplacestheresultinRegD.

RISC-VArchitectureSummary/Porter Page� of� 186 323

Page 187: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

TheregistersRegDandReg2canbex8,x9,…x15.Encoding:

100011DDD11bbb01 WhereDDDdenotesRegD. WherebbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

AND RegD,RegD,Reg2

C.OR-AddRegistertoRegister

GeneralForm:C.OR RegD,Reg2

Description:ThisinstructionlogicallyORsthevaluesinRegDandReg2andplacestheresultinRegD.

TheregistersRegDandReg2canbex8,x9,…x15.Encoding:

100011DDD10bbb01 WhereDDDdenotesRegD. WherebbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

OR RegD,RegD,Reg2

C.XOR-AddRegistertoRegister

GeneralForm:C.XOR RegD,Reg2

Description:ThisinstructionlogicallyXORsthevaluesinRegDandReg2andplacestheresultinRegD.

TheregistersRegDandReg2canbex8,x9,…x15.

RISC-VArchitectureSummary/Porter Page� of� 187 323

Page 188: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Encoding:100011DDD01bbb01

WhereDDDdenotesRegD. WherebbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

XOR RegD,RegD,Reg2

C.SUB-AddRegistertoRegister

GeneralForm:C.SUB RegD,Reg2

Description:ThisinstructionsubtractsthevalueinReg2fromthevalueinRegDandplacestheresultinRegD.

TheregistersRegDandReg2canbex8,x9,…x15.Encoding:

100011DDD00bbb01 WhereDDDdenotesRegD. WherebbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.Equivalent32BitInstruction:

SUB RegD,RegD,Reg2

C.ADDW-AddRegistertoRegister

GeneralForm:C.ADDW RegD,Reg2

Description:ThisinstructionaddsthevalueinReg2tothevalueinRegDandplacestheresultinRegD.Theresultvalueistruncatedto32bitsandthensignextendedtothefulllengthoftheregister.

TheregistersRegDandReg2canbex8,x9,…x15.

RISC-VArchitectureSummary/Porter Page� of� 188 323

Page 189: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Encoding:100111DDD01bbb01

WhereDDDdenotesRegD. WherebbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.AvailableforRV64andRV128only.

Equivalent32BitInstruction:ADDW RegD,RegD,Reg2

C.SUBW-AddRegistertoRegister

GeneralForm:C.SUBW RegD,Reg2

Description:ThisinstructionsubtractsthevalueinReg2fromthevalueinRegDandplacestheresultinRegD.Theresultvalueistruncatedto32bitsandthensignextendedtothefulllengthoftheregister.

TheregistersRegDandReg2canbex8,x9,…x15.Encoding:

100111DDD00bbb01 WhereDDDdenotesRegD. WherebbbdenotesReg2.Availability:

Requires“C”(CompressedInstruction)extension.AvailableforRV64andRV128only.

Equivalent32BitInstruction:SUBW RegD,RegD,Reg2

MiscellaneousInstructions

C.NOP-NopInstruction

GeneralForm:C.NOP

RISC-VArchitectureSummary/Porter Page� of� 189 323

Page 190: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Description:Thisinstructiondoesnothing.TheencodingcanbeviewedasaspecialcaseoftheC.ADDIinstructionwherethedestinationisx0andtheimmediatevalueis0.

Encoding:0000000000000001

Availability:Requires“C”(CompressedInstruction)extension.

Equivalent32BitInstruction:ADDI x0,x0,0

C.EBREAK-DebuggingBreakInstruction

GeneralForm:C.EBREAK

Description:Thisinstructionisusedbydebuggers.Theideaisthatadebuggerwilltemporarilyreplacesomeinstructionwiththisinstruction.Whenexecuted,anexceptionwilloccur,allowingthedebuggertogetcontrol.

Encoding:1001000000000010

Availability:Requires“C”(CompressedInstruction)extension.

Equivalent32BitInstruction:EBREAK

C.ILLEGAL-IllegalInstruction

GeneralForm:C.ILLEGAL

Itisdoubtfulthatmostassemblerswouldrecognizetheopcode“C.ILLEGAL”,butweincludeitanyway.

Description:Thispatternisde2inedasbeingillegal.Anyattempttoexecuteitwillcausean“illegalinstruction”exception.Allzeroswasintentionallychosensothatanyattempttobranchintounpopulatedordysfunctionalmemory(whichoftenreadsaszeros)willresultinanimmediatetrap.

RISC-VArchitectureSummary/Porter Page� of� 190 323

Page 191: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

Encoding:0000000000000000

Availability:Requires“C”(CompressedInstruction)extension.

Equivalent32BitInstruction:ILLEGAL

InstructionEncoding

Thefollowingtablelistsallthecompressedinstructionsandisorderedinanattempttoshowthecompletecoverageofthe16bitinstructionencodingspace.

SomebitpatternsarereservedbytheRISC-Vspecforfutureuse;suchpatternsaremarked<Reserved>andshouldnotbeusedbyimplementors.

Somebitpatternsareunassignedandavailableforspeci2icimplementationstode2ineastheywish;suchpatternsaremarked<NSE>(NonstandardExtension).

Somebitpatternsaremarkedas<Hint>;implementorsarefreetousethesebitpatternstoencodehintstotheprocessortotheimproveexecutionspeed.Inallcases,thepatternsmarked<Hint>areequivalenttoinstructionsthatinvolvemodifyingaregisterandthatspeci2icallydisallowtheuseofregisterx0asthedestinationregister.Thus,implementorsarefreetoignorethe<Hint>patternsandexecutetheminthenaturalway,bycomputingaresultbutsendingthatvaluestothezeroregisterx0,causingtheseinstructionstobehaveasnopinstructions.

Recallthattheleastsigni2icant2bitsareusedtodistinguish16-bitinstructionsfromlongerinstructions.Iftheleastsigni2icant2bitsare11,thentheinstructionisnota16bitcompressedinstruction.

Thistableismore-or-lesssortedonleastsigni2icant2bits(00,01,10),followedbythemostsigni2icantbitsinorder.

C.ILLEGAL 0000000000000000

<Reserved> 00000000000DDD00 RegD≠0

C.ADDI4SPN 000VVVVVVVVDDD00 Value≠0

RISC-VArchitectureSummary/Porter Page� of� 191 323

Page 192: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.FLD 001VVVaaaVVDDD00 RV32andRV64only

C.LQ 001VVVaaaVVDDD00 RV128only

C.LW 010VVVaaaVVDDD00

C.FLW 011VVVaaaVVDDD00 RV32only

C.LD 011VVVaaaVVDDD00 RV64andRV128only

<Reserved> 100XXXXXXXXXXX00

C.FSD 101VVVaaaVVbbb00 RV32andRV64only

C.SQ 101VVVaaaVVbbb00 RV128only

C.SW 110VVVaaaVVbbb00

C.FSW 111VVVaaaVVbbb00 RV32only

C.SD 111VVVaaaVVbbb00 RV64andRV128only

C.ADDI 000VDDDDDVVVVV01 RegD≠0;Value≠0

<Reserved> 0000DDDDD0000001 RegD≠0;Value=0

<Hint> 000V00000VVVVV01 RegD=0;Value≠0

C.NOP 0000000000000001 RegD=0;Value=0

C.JAL 001VVVVVVVVVVV01 RV32only

C.ADDIW 001VDDDDDVVVVV01 Rv64andRV128only;RegD≠0

<Hint> 001V00000VVVVV01 Rv64andRV128only;RegD=0

C.LI 010VDDDDDVVVVV01 RegD≠0

<Hint> 010VDDDDDVVVVV01 RegD=0

C.LUI 011VDDDDDVVVVV01 RegD≠0,2;Value≠0

<Reserved> 011VDDDDDVVVVV01 RegD≠0,2;Value=0

<Hint> 011V00000VVVVV01 RegD=0

C.ADDI16SP 011V00010VVVVV01 RegD=2;Value≠0

<Reserved> 011V00010VVVVV01 RegD=2,Value=0

C.SRLI 100000DDDVVVVV01 RV32;V≠0

<Hint> 100000DDD0000001 RV32;V=0

<NSE> 100100DDDVVVVV01 RV32;

RISC-VArchitectureSummary/Porter Page� of� 192 323

Page 193: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

C.SRLI 100V00DDDVVVVV01 RV64;V≠0

<Hint> 100000DDD0000001 RV64;V=0

C.SRLI 100V00DDDVVVVV01 RV128only;V≠0

C.SRLI64 100000DDD0000001 RV128only;V=0

C.SRAI 100001DDDVVVVV01 RV32;V≠0

<Hint> 100001DDD0000001 RV32;V=0

<NSE> 100101DDDVVVVV01 RV32;

C.SRAI 100V01DDDVVVVV01 RV64;V≠0

<Hint> 100001DDD0000001 RV64;V=0

C.SRAI 100V01DDDVVVVV01 RV128only;V≠0

C.SRAI64 100001DDD0000001 RV128only;V=0

C.SUB 100011DDD00bbb01

C.XOR 100011DDD01bbb01

C.OR 100011DDD10bbb01

C.AND 100011DDD11bbb01

C.SUBW 100111DDD00bbb01 RV64andRV128only

<Reserved> 100111DDD00bbb01 RV32only

C.ADDW 100111DDD01bbb01 RV64andRV128only

<Reserved> 100111DDD01bbb01 RV32only

<Reserved> 100111XXX10XXX01

<Reserved> 100111XXX11XXX01

C.ANDI 100V10DDDVVVVV01

C.J 101VVVVVVVVVVV01

C.BEQZ 110VVVaaaVVVVV01

C.BNEZ 111VVVaaaVVVVV01

C.SLLI 0000DDDDDVVVVV10 RV32only;RegD≠0;V≠0

<Hint> 0000DDDDD0000010 RV32only;RegD≠0;V=0

<NSE> 0001DDDDDVVVVV10 RV32only;RegD≠0

RISC-VArchitectureSummary/Porter Page� of� 193 323

Page 194: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter6:CompressedInstructions

<Hint> 000V00000VVVVV10 RV32only;RegD=0

C.SLLI 000VDDDDDVVVVV10 RV64only;RegD≠0;V≠0

<Hint> 0000DDDDD0000010 RV64only;RegD≠0;V=0

<Hint> 000V00000VVVVV10 RV64only;RegD=0

C.SLLI 000VDDDDDVVVVV10 RV128only;RegD≠0;V≠0

C.SLLI64 0000DDDDD0000010 RV128only;RegD≠0;V=0

<Hint> 000V00000VVVVV10 RV128only;RegD=0

C.FLDSP 001VDDDDDVVVVV10 RV32andRV64only

C.LQSP 001VDDDDDVVVVV10 RV128only;RegD≠0

<Reserved> 001VDDDDDVVVVV10 RV128only;RegD=0

C.LWSP 010VDDDDDVVVVV10 RegD≠0

<Reserved> 010VDDDDDVVVVV10 RegD=0

C.LDSP 011VDDDDDVVVVV10 RV64andRV128only;RegD≠0

<Reserved> 011VDDDDDVVVVV10 RV64andRV128only;RegD=0

C.FLWSP 011VDDDDDVVVVV10 RV32only

C.JR 1000aaaaa0000010 RegA≠0

<Reserved> 1000aaaaa0000010 RegA=0

C.MV 1000DDDDDbbbbb10 RegB≠0;RegD≠0

<Hint> 1000DDDDDbbbbb10 RegB≠0;RegD=0

C.EBREAK 1001000000000010

C.JALR 1001aaaaa0000010 RegA≠0

C.ADD 1001DDDDDbbbbb10 RegB≠0;RegD≠0

<Reserved> 1001DDDDDbbbbb10 RegB≠0;RegD=0

C.FSDSP 101VVVVVVbbbbb10 RV32andRV64only

C.SQSP 101VVVVVVbbbbb10 RV128only

C.SWSP 110VVVVVVbbbbb10

C.FSWSP 111VVVVVVbbbbb10 RV32only

C.SDSP 111VVVVVVbbbbb10 RV64andRV128only

RISC-VArchitectureSummary/Porter Page� of� 194 323

Page 195: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

HardwareThreads(HARTs)

Asimpleprocessorhasasingle2low-of-control.Thatis,itoperatesasasingle“thread”,withaFETCH-DECODE-EXECUTEloop.Instructionsareexecutedone-after-the-other.(Inthecaseofpipelining,severalinstructionsmaybeinsomestageofexecutionatanymoment,buttheinstructionsarecompleted(or“retired”)one-after-the-other,andthepipelinedbehaviorisexactlythesameasifeachinstructionhadbeenexecutedalonetocompletionintheordertheyappearinthecode.)

Normally,anoperatingsystemwillmultiplextheprocessor,therebyprovidingmanythreads,viatimeslicing.Forclarity,wecandistinguishbetween“hardwarethread”and“softwarethreads”.TheOScanmultiplexasinglehardwarethreadtoprovidemultiplesoftwarethreads.

Inamulti-coreormultiprocessorsystem,eachcoreisrunningitsownindependentFETCH-DECODE-EXECUTEcycle.Thus,therewillbeanumberofhardwarethreadsequaltothenumberofcores.HowtheOSimplementssoftwarethreadsontopoftheavailablehardwarethreadsisacomplextopicwhichconcernsthekerneldesignandorganization.

Ina“superscalar”core,thereismorethanoneinstructionstream.Thereisa2ixednumber(suchas2or4)ofindependentinstructionstreams.Forexample,someparticularsuperscalarcoremightexecutetwoindependentinstructionstreamssimultaneously.Suchacoreisprovidingtwohardwarethreads.Eachstreamhasitsownindependentsetofregistersanditsownprogramcounter,more-or-lessasiftheywereseparatecoresinamultiprocessorsystem.

Onebene2itofthesuperscalarorganizationisthatexpensiveandrarelyusedcircuits(e.g.,2loatingpointdivide)canbeprovidedonlyinasinglecopy,andsharedbybothhardwarethreads.Also,theremaybemultipleinstantiationsofmorecriticalcircuits

RISC-VArchitectureSummary/Porter Page� of�195 323

Page 196: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

(e.g.,integeraddition)whichareallocateddynamicallytothoseinstructionsthatneedthem.

Withsuperscalarorganization,itispossiblethatmorethanoneinstructioncanberetiredperclockcycle,providinggreaterperformance.

De=initionof“HART”:Inthediscussionofconcurrencycontrolandsynchronization,theRISC-Vdocumentationusestheterm“HART”torefertoahardwarethread.Thehardwarethreads(HARTs)mightbeimplementedondistinctprocessorswithinamultiprocessorsystem,orbymultiplecoresinasingleprocessor,oronasinglesuperscalarcore.TheRISC-Vspeccoversallpossibilities,includinghybridcombinations.

TheRISC-VMemoryModel

It’sacommonlyacceptedprincipleinprocessordesignthat,withinasingleinstructionstream,datadependenciesarerespected.Inotherwords,theprocessorexecutesinstructionsinorderandwillnotreorderinstructions.

Forexample,aninstructionthatloadsaregisterwithavaluefrommemoryandaninstructionthatstoresthecontentsofthatsameregisterinmemoryshouldneverbereordered.Assemblylanguageprogrammersarefreetoassumethateachinstructionisexecutedfrombeginningtocompletionintheorderitwasspeci2ied.Anyoptimizationsorreorderingdonebythehardwaremustpreservethisconstraint.Forexample,twoinstructionsthataccessdifferentregistersanddifferentmemoryaddressescansafelybereorderedorexecutedsimultaneouslytoachievefasterexecution.

However,betweendifferenthardwarethreads,theorderinwhichoperationsexecutemaybeindeterminate.Forexample,considertwoindependentprocessorsinamultiprocessorsystemwithsharedmemory.Imaginetwoinstructionswhichbothstoreintoasinglememorylocation;theymayexecuteinarbitraryorder,resultinginadifferentvaluebeingstoredlast.

Hereisonecommonrequirementforpropersynchronizationinasystemofcooperatingprocesses.(By“cooperatingprocesses”,wemeanasystemwithmultiplethreadsthataredesignedtoworktogetherconcurrentlyexecutingto

RISC-VArchitectureSummary/Porter Page� of� 196 323

Page 197: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

completesomegivenprogrammingtaskcorrectly,regardlessofthevagariesofindeterminateexecution.)

SynchronizationofReadersandWriters:Withapotentiallysharedpieceofdata,awritermustwaittowritethedatauntilallpreviousreadersare2inishedreadingtheoldvalue.Likewise,areadermustwaituntilanypreviouswriteris2inishedwritingthenewvalue.

TheRISC-Vspecassumesthatreadsandwritesissuedbyasinglehardwarethread(HART),mustbevisibletothatsameHARTintheorderexecuted.Thatis,thecore’sactualorderofexecutionofaHART’sinstructionsmustbeindistinguishablefromthesequentialorderinwhichtheprogrammerwrotetheinstructions.Perhapsthecorewillreorderinstructionstoimproveperformance,butanyreorderingmustbe“safe”inthesenseofhavingthesameeffectasbeingissuedintheoriginalorder.

Thisassumptionissomethingthateveryprogrammertakesforgranted:thecomputerwillexecutetheircodeinstructionsintheordertheprogrammerwrotethem.Forsinglethreadsexecutinginisolation,thereisnothingmuchmoretotalkabout.Butthingsquicklygetcomplicatedwhentherearemultiplethreads.

AsfarastheapparentorderoftheoperationsasviewedbyotherHARTs,theRISC-Vspecdoesnotassumeanything.Infact,theRISC-Vspecallowsthefollowing:

ItmayappeartoasecondHARTthattheinstructionsexecutedbytheZirstHARTaredoneoutoforder.

SoitispossiblethattwodistinctHARTswillseethesamesetofoperationshappeninginadifferentorder.ThisisaveryrealpossibilityonsomesystemswhentheoperationsinquestionarereadsandwritestomemoryandtheindividualHARTsareoperatingwithprivatecaches.

Forexample,considerthefollowingsequenceofinstructionsissuedbyHARTAtomemorylocationX.

WriteX!4 ReadX WriteX!5 ReadX

RISC-VArchitectureSummary/Porter Page� of� 197 323

Page 198: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

The2irstreadwillreturn4andthesecondreadwillreturn5.HoweverHARTBmightexecutetheseinstructions:

ReadX ReadX

The2irstreadmightreturn5andthesecondreadmightreturn4.ItappearstoHARTBthatthewriteswereexecutedoutoforder,probablyduetoinconsistenciesbetweentheircaches.

ConcurrencyControl-ReviewandBackground

Thissectionisnotspeci2ictoRISC-Vandcanbesafelyskippedifyouarefamiliarwithconcurrencycontrol.

Inanysystemwithmultiplethreads,someformofsynchronizationandconcurrencycontrolisrequired.Withoutit,thebehaviorofsomecodesequencesmaybenondeterministicandprogramerrorsmayresult.Forexample,eachofseveralthreadsmayneedtoperiodicallyqueryandupdateasharedpieceofdata.Withoutanycontrol,twoormorethreadsmaybetryingtoreadandupdatetheshareddatasimultaneously;therearisesapossibilitythatsomethreadsmayseeaninconsistentstateofthedataorthatsomeupdatesmaybelostorthatseveralconcurrentupdateswillresultintheshareddatabeingputintoainvalidorinconsistentstate.

Inonecommonapproachtoconcurrencycontrol—particularlyamongsoftwarethreads—a“lock”iscreatedtoprotecttheshareddata.Anyblockofcodeaccessingtheshareddatamust2irst“acquire”(or“obtain”or“set”)thelockbeforeaccessingtheshareddata,andmust“release”(i.e.,“free”,“clear”)thelockaftertheaccessiscomplete.Asequenceofcodewhichaccessestheshareddataiscalleda“criticalsection”andtheprogrammermustremembertoacquirethelockatthebeginningofthecriticalsectionandreleasethelockattheendofthecriticalsection.

Operatingsystemstypicallyprovide“locks”,alongwith“acquire”and“release”operations.Implementingtheseoperationsonasingleprocessorsystemissimple:theOSmerelyhastoavoidathread-switchoccurringduringtheacquireandreleaseoperationsthemselves.Thiscanbedoneeasilyandef2icientlybydisablinginterruptsforthedurationoftheacquire/releaseoperations.

RISC-VArchitectureSummary/Porter Page� of� 198 323

Page 199: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Thingsaresigni2icantlymorecomplexformulti-coresystemswithsharedmemory.Nowthereis“multiprocessing”(trueconcurrency)andnotjust“multithreading”(simulatedconcurrencyusingtime-slicing).Thelockitselfbecomesshareddataandaccesstothelockineffectbecomesacriticalsectionofitsown.Blockinginterruptsononeprocessorisnolongeradequate.Thereexistalgorithmstosolvethisproblempurelyinsoftware(Peterson’salgorithm;Dekker’salgorithm)butarchitecturalsupportintheISAisrequiredforperformancereasons.

Inthepasttherehavebeenacoupleofinstructionsproposedtosupportthe“acquire”and“release”operationsforlocks.Onesuchinstructionwasthe“test-and-set”instruction.Thisinstructionincludestwooperands:anaddressandaregister.Theinstructionreadsfrommemoryattheaddressgivenandstorestheresultintotheregister.Thentheinstructionsstoresa2ixedknownvalue(usually“1”)intothesamememorylocation.Boththeread-memoryandthewrite-memoryoperationarespeci2iedtobe“atomic”,whichmeansthatthereisaguaranteethatnootherinstructionwilloccurbetweentheread-memoryandwrite-memoryfromanyothersource(e.g.,othercoresorI/Odevices).

Historically,hardwarecanenforcetheatomicityguaranteebysomehowblockingallaccessestoallmemorylocationsforallothercoresbetweenthereadandwritephasesofthetest-and-setinstruction.Thiswillslowotherpartsofthesystemdownalittle,butifthenumberofcoresissmall,it’snotmuchofaburden.

Thetest-and-setinstructioncanbeusedtoimplementalockverysimply.Here’show:thelockisrepresentedasasinglememorylocation,with“0”meaning“notlocked”and“1”meaning“locked”.The“acquire”operationconsistsofexecutingthetest-and-setinstruction.Aftertheinstruction,thelockwillbeinthe“set”stateandwillcontain“1”.The“acquire”operationthenexaminesthepreviouslockvalue,whichwasretrievedfrommemoryandstoredinaregister.Ifitwas“0”,thenallisgood:thelockwaspreviouslyfree.Ithasnowbeenacquiredandthecriticalsectioncanbeentered.However,ifthepreviousvaluewas“1”,thenthelockwasapparentlyalreadyset.Thelockhasnotbeenacquired.Theacquirecodemusttryagain,eitherimmediately,orafterawait,orafterthreadrescheduling.The“release”operationisimplementedbysimplystoringa“0”intothememorylocationrepresentingthelock.

AnotherapproachtoISAdesignistoprovidea“swap”instruction,insteadofthetest-and-setinstruction.Likethetest-and-setinstruction,theswapinstructionhasaread-memoryphaseandawrite-memoryphase.Theinstructionistakestwooperands:aregisterandamemoryaddress.Theinstructionswapsthevalues;aftertheinstructiontheregistercontainsthepreviousvaluefromthememorywordand

RISC-VArchitectureSummary/Porter Page� of� 199 323

Page 200: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

thememorywordcontainsthepreviousvalueintheregister.Theswapinstructionisageneralizationofthetest-and-setinstruction:byplacinga“1”intheregisterbeforetheswapinstructionisexecuted,theeffectwillbeidenticaltoatest-and-setinstruction.

Anotherapproachisthe“compare-and-swap”(CAS)instruction,whichisusedintheX-86architecture.TheCASinstructiontakesfouroperands:amemoryaddress,aregistercontainingan“expectedvalue”,andaregistercontaininga“newvalue”,andaregisterinwhichtostoreareturncode.Theinstructiondoesseveralthingsatomically,i.e.,withoutthepossibilityofinterruptionorinterferencefromotherthreads.First,theinstructionreadsavaluefrommemoryandcomparesittothe“expectedvalue.”Ifthetwoaredifferent,theinstructionfailsandreturnsacodetoindicatefailure.Butifthetwovaluesareequal,theinstructionstoresthe“newvalue”intothememorylocationandreturnsacodeindicatingsuccess.

Theproblemwithalltheseinstructionsisthateachrequirestwomemoryoperations.ThisviolatesthegeneralRISCprincipleofkeepingallinstructionssimple,minimal,andfast.Twomemoryoperationsinasingleinstructionsistoomuchandcomplicatesthecircuitry.

Furthermore,theseinstructionscanslowothercoresdownsincetheymustbeexecutedatomicallyandpresumablytheyworkbyfreezinguptheentirememorysystemduringtheirexecution.

Theproblemswithatomicallyexecutingbothareadandwritetogethergetworsewithmorecores.Theseproblemsalsogetworsewithmorefrequentlockingoperations.Tosupportincreasedconcurrency,it’sagoodideatohave2iner-grainedlocking,i.e.,tohavemorelocks,havingeachlockprotectlessdata.Finer-grainedlockingimpliesmore“acquire”and“release”operations,evenifthelocksareusuallyfoundtobefree.Theproblemsalsogetworsewithincreasedsharingofdata,ageneralizedtrendweareseeinginmanycontexts.

AReviewofCachingConcepts

Thissectionisnotspeci2ictoRISC-Vandcanbesafelyskippedifyouarefamiliarwithcaching.

RISC-VArchitectureSummary/Porter Page� of� 200 323

Page 201: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Theproblemofsynchronizationbetweenmultipleprocessors(e.g.,coresorHARTs)getsmuchmorecomplexinthepresenceofcachememories.Inshort,theproblemstemsfromthefactthatasinglelocationinmemorycanhavemultiplevaluesatonce,becausethedatacanbepresentinseveralcacheswithinconsistentvalues.Let’sreviewthebasicideasbehindcachememoriesbeforewediscusstheRISC-Vinstructions.

Inmanycomputers,thebandwidthbetweenmainmemoryandtheprocessorcoreisaperformancebottleneck.Toincreaseperformance,a“cache”memoryisplacedbetweenmainmemoryandthecore.

Wheneverthecoretriestofetchdatafrommemory,thecacheischeckedand,ifitcontainsthedata,thedatacanbesenttothecoreimmediately.Otherwise,thedatamustbefetchedfrommainmemory,whichtakesmoretime.Thedatawillbesenttothecoreaswellasbeingstoredincachememorytospeedfuturerequestsforthesamedata.

Whendataiswrittenfromthecoretomemory,thedatawillsometimesbewrittentothecachememory,whichspeedsfutureaccessesifthecorewantstoreadthedataagainsoon(“writeallocate”).Somesystemsavoidcachingtheupdateddata(“no-write-allocate”).Ineithercase,theupdateddatamustbecopiedtomainmemoryatsomepoint.Thewritetomainmemorymayoccurimmediately,atthetimethecore2irstwritesthedatatocache,andthisiscalled“write-through”.Alternatively,writetomainmemorycanbedelayeduntillater.Atsomelatertime,whenspaceinthecacheisneededforsomeotherdata,theupdatedmodi2ieddatavaluewill2inallybewrittentomainmemory.Thispolicyiscalled“write-back”.

Cachesdonotoperateonbytesindividually;insteadtheystoredatainunitsof“cacheblock”.Acachelineisablockofcontiguousdatabytes(typically64bytes)whicharecopiedtogetherasaunitinandoutofthecache.Inarequestforasinglebyteorhalfword(forexample),theentirecacheblockcontainingthedesiredbyteswilleitherbeinthecache(a“cachehit”)ornot(a“cachemiss”).

Thetypicalalignmentrequirementsensurethatamultipledataunit(suchasahalfword)willnevercrossacacheblockboundary;thedatawilleitherbeentirelyinoneblockorentirelyinanotherblock.Misaligneddatawilloccasionallycrosscacheblockboundariesandtheprocessofreadingintoorwritingfromthecoretothememorysystemwillbecomplicatedandgenerallyrequireabouttwiceasmuchtime.

RISC-VArchitectureSummary/Porter Page� of� 201 323

Page 202: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Cachesareoftenorganizedintolevels.Forexample,athree-levelcachemightconsistof:

L1 Closesttothecore.Fastest,smallestincapacity. L2 Intermediate L3 Closesttomainmemory.Slowest,largestincapacity.

Inasystemwithmultiplecoresonasinglechip,eachindividualcoremightpossessitsownprivateL1andL2caches,whileallcoreswillsharealargerL3cache.

Therearedifferencesbetweeninstructionandregulardata.Inparticular,instructionsarealwaysread-only.Sincethedataisnevermodi2iedbythecore,thereisnoneedtoprovidehardwaretoimplementthewrite-allocate/write-no-allocateorwrite-back/write-throughpolicies.

Asatypicalexample,eachcorewillhavetwoL1caches;onefordataandoneforinstructions.Theterm“uni=iedcache”meansthatthecacheholdsbothinstructionsanddata.Typicallytheslower,largercache(e.g.,L3)isuni2ied.

InmanyISAs,includingRISC-V,allaccesstoI/Odevicesis“memory-mappedI/O”whichmeansthattheI/Odevicesareassignedaddressesinthephysicalmemorymap.TosenddataorcommandstoanI/Odevice,thecorewillissue“store”instructionsandtoretrievedataorstatusinformationfromanI/Odevice,thecorewillexecute“load”instructions.

ManyI/Odevicesalsoaccessmainmemorydirectly.Forexample,thecoremightmovedataintomemoryandthenissuecommandstoanI/Odevicewhichsubsequentlycausethedevicetoretrievethesamedatadirectlyfrommemory.

Thus,themainmemorysystemmayhaveseveral“ports”allowingdatatobereadandwrittenseparatelyby:

•Differentcoresonasinglechip •Differentprocessorchips •VariousI/Odevices

Amoderncomputersystemwillconsistofanumberofbussesandanumberofdifferentdevicessittingoneachbus.

RISC-VArchitectureSummary/Porter Page� of� 202 323

Page 203: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Whichallthiscomplexity,thereisaproblemofsynchronizationanddataconsistency.Forexample,acoremaywriteabyteofdatatomemoryandexpectsomeI/Odeviceorothercoretofetchthatbyte.Inthepresenceofcaches,theremaybeseveraldifferentvaluesforthesamememoryaddressatthesameinstant.Withoutfurthercontrol,a“reader”maynotseethevaluewrittenbya“writer”eventhoughthewriteoccursearlierintime.

Notethatthedataconsistencyproblemdoesnotoccurforread-onlydata.Ifaparticularbyteofdataneverchanges,thenthereisnotaproblem.Perhapsthedataiscachedorperhapsitmustbefetchedfrommainmemory.Butsincethereisonlyonevalue,thereisneverapossibilitythatthecachecontainsanoutdatedvalue.Asynchronizationproblemcanonlyoccurwhendataisupdated.

Notethatalldataiswrittenatsomepoint,exceptperhapsdatastoredinRead-OnlyMemory(ROM).Wemustassumethatwhenanycomputerispoweredup,thecontentsofdynamicmemoryareunde2ined;evenread-onlydatamustinvolveaninitialwritetothememorysocaremustbetakentoavoidreadingtheinitialunde2ineddata.

Thesynchronizationproblemoccursbecausemultiplevaluesforagivenmemorylocationcanbestoredinvariouscaches.Ifthecachesareeliminated,thenthedatamustnecessarilybeconsistent;butcachesyieldtremendousperformancebene2its,sosupportforsynchronizationin2luencesISAdesign.

Wecanmakeadistinctionbetweenprivatecachesandshared(i.e.,memory-sideorglobal)caches.Considerasystemwithseveralcoresandasinglesharedmainmemory.Eachcoremayhaveitsownprivatememorycache,inwhichcaseasinglebyteofmemorymaybecachedsimultaneouslyinseveraldifferentprivatecaches.Anupdatebyanysinglecoretothisbyteoughttobevisibletoallcoresbut(withoutsynchronization)othercoresmightreadoutdatedvaluesfromtheirprivatecaches.

If,insteadthereisonlyasinglesharedcachesittingbetweenmainmemoryandallcores,thenthesynchronizationproblemgoesaway.Agivenbytecanonlybecachedinoneplace,namelythesharedcache.Sinceallcoresareusingthissharedcache,anyupdatetothebytewillbere2lectedinthesharedcache,andthereforeseenbyallothercores.

Modernsystemsoftenuseamixedofprivateandsharedcaches.

Acachememoryconsistsofasetof“cachelines”.

RISC-VArchitectureSummary/Porter Page� of� 203 323

Page 204: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Eachlineinacachecontainsacouplebitsidentifyingwhatdataiscontainedinthelineandwhetherthedatais“valid”or“invalid”.Moreprecisely,acachelinegenerallycontainsthefollowingpiecesofinformation:

•Thecachelinedata(e.g.,ablockof64databytes) •Theaddressofthedatablock •A“validbit”,totellwhetherthiscachelineismeaningful •A“dirtybit”,totellwhetherthedatahasbeenmodi2ied

Wediscussvirtualmemoryandaddresstranslationelsewhere.Butherewepointoutthateachcachelinemustcontaintheaddressofthedatastoredinthatcacheline.

Intheeasiest-to-understandapproach,eachcachelinecontainsthephysicaladdressofitsdatablock.Thecacheisan“associativememory”usingthephysicaladdressasthesearchkey(i.e.,theassociativeindex).However,becauseofvirtualmemoryaddresstranslation,thevirtualaddressisknownearlierintimethanthephysicaladdress,soitmakessensetoindexthecachelinesbyvirtualaddress.Realdesignsaremorecomplex,butforourpurposeswedonotneedtogofurtherhere.

Acrudesolutiontothecacheconsistencyproblemistoperiodicallyemptytheentirecache.

“Flushingthecache”meanswritingallupdateddatabacktomainmemoryandmarkingallcachelines“invalid”.Soasimpleapproachisto2lushallcachesafteranywrite.Flushingtheentirecacheafteranywriteisoverlyconservativeandnotthemostef2icientapproach,butitwillsolvethecacheconsistencyproblem.

Thesoftwaremightbealittlesmarterandavoidfullcache2lusheswhennotstrictlyrequired.Forexample,whenonecorewritestomainmemory,thereisaquestionofwhethercachesbelongingtoothercoresmustbe2lushed.Iftheprogrammercanbecertainthatthedatawillneverbereadbyothercores,thencallingforacache2lushcanbeavoided.

TheRISC-Vdesignersrecognizethata2iner-grainedcontrolovercachesandmemorysynchronizationisrequired,andprovidetheFENCEinstruction,whichwillbedescribedlater.

RISC-VArchitectureSummary/Porter Page� of� 204 323

Page 205: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

RISC-VConcurrencyControl

RISC-Vsupportsconcurrencycontrol,synchronization,andcacheconsistencywiththeseinstructions:

Alwaysavailable: FENCE Synchronizedatareadsandwrites FENCE.I Synchronizetheinstructioncache

Availableinthe“A”extension(for“AtomicInstructions”) LR LoadReserved SC StoreConditional AMOSWAP Atomicallyswaptwovalues AMOADD Atomicallyaddtwovalues AMOAND AtomicallyANDtwovalues AMOOR AtomicallyORtwovalues AMOXOR AtomicallyXORtwovalues AMOMAX AtomicallyMAXtwovalues AMOMIN AtomicallyMINtwovalues

UnderRevision:TheRISC-Vspecstatesthattheirmemorymodelisunderrevision,implyingthattheseinstructionsmaychange.

TheFENCEInstructions

Therearetwofenceinstructions:FENCEandFENCE.Iandtheseinstructionsarealwaysavailable.Theydonotrequirethe“A”(AtomicOperations)extension.

Thegeneralideawitha“fence”isthatalloperationsarepartitionedintotwosets:thepredecessorsetandthesuccessorset.

Thefencerequiresalloperationsinthepredecessorsettocompletebeforeanyoftheoperationsinthesuccessorsetcanproceed.InthecaseofRISC-V,thepredecessoroperationsaretheinstructionsthatmustcompletebeforethefence,andthesuccessoroperationsarethosethatmaynotbeginuntilafterthefence.

RISC-VArchitectureSummary/Porter Page� of� 205 323

Page 206: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Asananalogy,considertheTourdeFrance,whereallthecyclistsmustcompleteeachstageoftherace,beforeanyridercanbeginthenextstage.Onlyaftertheslowestrider2inishesthepreviousstageoftheraceareanyridersallowedtostartthenextstage.Thestagesareeffectivelyseparateby“fences”.

TheFENCEinstructioninRISC-Vmayonlyaffectinstructionsthatreadorwritetothememorysystem.Itisusedtocontrolconcurrentaccessestosharedmemory.

Recallthatthephysicalmemoryaddressspacecanbepopulatedwith:

•Normal,physicalmemory(containingstoreddatabytes) •Memory-mappedI/Odevices

Ahardwarethread(HART)caneitherreadorwritevaluesto/fromthememoryspace.Thisyieldsthesecombinations:

R Read–readdatafromphysicalmemory W Output–writedatatophysicalmemory I Input–readdatafromamemory-mappedI/Odevice O Output–writedatatoamemory-mappedI/Odevice

Ofcoursemanyoperations(suchasamovementofdatafromoneregistertoanother)donottouchmemoryatall.Suchoperationsdonotfallintoanyofthesefourclassi2icationsandareignoredbytheFENCEinstruction.

ImaginethatsomeHARTexecutestwomemorywrite(“W”)operationsinsuccession.Perhapsthe2irstwriteoperationissettingalock,whichisintendedtoprotectsomepieceofsharedata.Onlyafterthelockhasbeenset,isitpermissibletoupdatetheshareddata.(Thisistheideabehindprotectingshareddatawithlocks;theshareddatashouldonlybeaccessedwhenthelockisheldbythethread.)

OfcoursetheHARTinquestionwillexecutethewritetothelockbeforethewritetothedata,butbecauseoftherelaxedmemorymodelassumedbytheRISC-Vspec,itispossiblethatotherHARTswillseethedatawriteappearingtocomebeforethelockwrite.OtherHARTscouldthereforeseetheupdateddatabeforeseeingthelockgettingset,therebyviolatingthefunctionalityandintegrityofthelockprotocol.TheFENCEinstructionprovidesawaytopreventthisdisasterandallowsprogrammerstoimplementproperconcurrencycontrolandcreatecorrect,reliablecooperatingprocesses.

RISC-VArchitectureSummary/Porter Page� of� 206 323

Page 207: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

TheFENCEinstructioncanbeusedtomakesurethatallotherHARTswillseethewriteoccurringbeforetheread.Inthisexample,wewilluseaFENCEinstructiontomakesurethatallwritestomemorythatwereexecutedbeforetheFENCEinstructionwillappeartoexecutebeforeallwritetomemorythatwillbeexecutedaftertheFENCEinstruction.

ForthepurposesoftheFENCEinstruction,weuse“P”foroperationsoccurringpriortotheFENCE(predecessoroperations)and“S”foroperationsoccurringaftertheFENCEinstruction(successoroperations).Thus,wehave:

Predecessoroperations: PR DatareadsoccurringbeforetheFENCE PW DatawritesoccurringbeforetheFENCE PI Memory-mappedreads(inputs)occurringbeforetheFENCE PO Memory-mappedwrites(output)occurringbeforetheFENCE

Successoroperations: SR DatareadsoccurringaftertheFENCE SW DatawritesoccurringaftertheFENCE SI Memory-mappedreads(inputs)occurringaftertheFENCE SO Memory-mappedwrites(output)occurringaftertheFENCE

Inourexample,wewanttoforceallwritestodatamemorythatwereissuedbeforetheFENCEtocompletebeforeallwritetodatamemorythatwillbeissuedaftertheFENCEinstruction.Thus,wecaninsertthefollowingFENCEinstructionbetweenthewritetothelockandthereadfromtheshareddata:

...writetolock…FENCE PW,SW # Complete past writes before future writes

…writetoshareddata…

FENCE

GeneralForm:FENCE PI,PO,PR,PW,SI,SO,SR,SW

Example:Thisinstructioncontains8singlebit2ields.Eachcanbeenabled(setto1)ordisabled(clearedto0).The2ieldsarecalledPI,PO,PR,PW,SI,SO,SR,SW.Thespecdoesnotindicatehowthisinstructionistobecodedinassembly.Perhapsthebitscanbesetbybeingmentionedandclearedotherwise.Forexample:

RISC-VArchitectureSummary/Porter Page� of� 207 323

Page 208: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

FENCE PR,PW,SR,SWDescription:

Seetextabove.Availability:

Thisinstructiondoesnotrequirethe“A”extension(AtomicInstructions).Itisalwaysavailable.

Encoding: ThisisavariationofanI-typeinstruction,wheretheRegDandReg22ieldsare

notusedandsettozeroandonly8oftheImmed-12bitsareused.

ImplementationofFENCE:OnesimpleimplementationofFENCEisforthehardwaretosimplyignorethedistinctionbetweenR,W,I,andOandsimplycompletealloperationsoccurringbeforetheFENCEbeforestartinganyoperationsaftertheFENCE.Inthemostconservativebutsimplestimplementation,thehardwaremightjust2lushallcaches,forcingasortofhard,system-widefenceoperation.Hopefullysomeimplementationswillprovidethe2iner-grainedcontrolpermittedbytheinstruction.

TheFLUSHinstructionisthemaintoolfor2lushingthecache,butisintendedtooperateononlydatacaches.However,instructionsmaybecachedinaseparateinstructioncacheandthereisaneedtoupdatethiscachefromtimetotime.ThisisthepurposeoftheFENCE.Iinstruction.

Wheneverbytesarewrittentomemoryandthesebytesareintendedtobesubsequentlyexecutedasinstructions,anupdatetotheinstructioncachemayberequired.Forexample,awrite(e.g.,theSTstoreinstruction)tomemorylocationXmaybeexecuted.SubsequentlythecorewillexecutetheinstructionlocatedataddressX;ofcoursethenewvalueshouldbefetched.However,iftheinstructioncachealreadycontainsthepreviouscontentsforlocationX,theremaybeaproblem.Thecorecannotsimplyusethecachedvalue.

TheeffectoftheFENCE.Iinstructionisequivalenttoinvalidatingtheentireinstructioncache,therebyforcingallfuturefetchestogotomemory.ThisguaranteesthatthenewestvalueforlocationXwillbefetched.

FENCE.I–(InstructionCacheFlushing)

GeneralForm:FENCE.I

RISC-VArchitectureSummary/Porter Page� of� 208 323

Page 209: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Example:FENCE.I # Flush the instruction cache

Description:InvalidatetheentireinstructioncachelocaltothecurrentHART,oratleastdosomethingoperationallyequivalent.Seetext.

Availability:Thisinstructiondoesnotrequirethe“A”extension(AtomicInstructions).Itisalwaysavailable.

Encoding: ThisisavariationofanI-typeinstruction,wheretheRegD,Reg2,and

Immed-122ieldsarenotusedandsettozero.

Note:ThisinstructiononlyaffectsthecurrentHART.Inamulti-coresystem,itisprobablethatinstructionswillbecachedininstructioncachesthatareprivatetoeachcore.IfcoreAwritesdatatomemorylocationXandexpectstoexecutethatsamedata(e.g.,jumptoX),thenitshouldexecuteaFENCE.Iinstructiontomakesurethatitscachedoesn’tcontainanobsoletevalueforlocationX.

However,thiswillnothelpothercores,whichmayhavecachedolderdatafromlocationXintheirprivateinstructioncaches.Inorderto2lushtheinstructioncachesofothercores,softwareoncoreAwillhavetosomehowsignalsoftwarerunningontheothercores,whichwilltheneachneedtoexecuteFENCE.Iinstructions,to2lushtheirownprivatecaches.

ImplementationNotes:OneimplementationofFENCE.Iissimplyto2lushtheentireinstructioncache,i.e.,tosetallcachelinesto“invalid”.

Thismayseemlikeacoarse-grainedapproach,butitisassumedthatsoftwaretypicallyupdatesalargeblockofinstructionsinmemoryand,onlyafterwritingallthedata,willtherebeajumptothenewlycreatedcode.Forexample,consideraJust-In-Time(JIT)compiler,whichonlycompilesafunction/methodatthetimeitis2irstcalled/invoked.Thereisacleartransitionfromcompile-phasetoexecution-phase.Thus,onlyasingleFENCE.Iinstructionwouldbeneeded,andthepreviouscontentsoftheinstructioncachearelikelytobeunneededinthenearfutureanyway,soinvalidatingthemisacceptable.

Anotherapproachassumesthatthedataandinstructioncachesarekeptconsistent.Forexample,theinstructioncachemightbe“snooping”writestothe

RISC-VArchitectureSummary/Porter Page� of� 209 323

Page 210: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

datacache.Everywritetothedatacachewouldhavetheside-effectofupdatingtheinstructioncachewhennecessary.Whenevertheinstructioncachehappenedtocontaindatafromthesameaddressbeingwrittento,thatcachelineintheinstructioncachewouldbeupdatedoratleastinvalidated.Ifthecachesarekeptcoherent,thentheFENCE.Iinstructioncanbeimplementedbysimply2lushingthepipeline,whichmaypotentiallycontainoutdatedinstructiondata.

Load-Reserved/Store-ConditionalSemantics

TheRISC-Vdesignsupportsanapproachtoconcurrencycontrolcalledload-reserved/store-conditional(LR/SC).Thisisalsocalled“load-link/store-conditional(LL/SC).

We’llbeginbydescribingtheLRandSCinstructions,whichareonlyavailableinthe“A”(AtomicOperations)extensiontotheRISC-Vspec.

Theideaisthatanatomicoperationsuchastest-and-set,swap,orcompare-and-setisbrokenintotwoseparateinstructions.The2irstinstruction(“load-reserved”)performsthereadphaseandthesecondinstruction(“store-conditional”)performsthewritephase.

TheLRinstructionisverysimilartoatypicalLOADinstruction.Itreadsavaluefrommemoryandstoresitintoaregister.Butinaddition,theinstructionalso“reserves”thatmemorylocation.Youcanimaginethatthisreservationconsistsofmakinganoteofwhichcore(or“HART”inRISC-Vterminology)touchedwhichmemorylocationwithanLRinstruction.Thisreservationnoteissomethingthatismanagedbythememorysynchronization/controlhardwareandnotbythecoresattemptingtoaccessmemory.

Atsomelatertime,anSCinstructionwillbeexecuted.Itwillstoreavaluefromaregisterintomemory,justlikeatypicalSTOREinstruction.However,theinstructionmay“succeed”or“fail”.TheSCinstructionhasthreeoperands:aregistercontainingthevaluetobestoredinmemory,amemoryaddress,andaregisterintowhichthereturncodewillbestored.

Therearetworeturncodes:“0”indicatessuccessand“non-zero”indicatesfailure.Failuremightbecausedbymisalignedaddresses,accessviolations,etc.,butthese

RISC-VArchitectureSummary/Porter Page� of� 210 323

Page 211: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

arebesidethepoint.Theprimaryuseofthesuccess/failurehastodowithconcurrencycontrol.

TheproperwaytousetheseinstructionistoexecuteanLRinstructiononsomememorylocationandthen,shortlythereafter,toexecuteanSCinstructiononthatsamememorylocation.Ifnoothercore/HARThasstoredintothatmemorylocationsincetheLRinstruction,thentheSCwillsucceedandthevaluewillbestoredintomemory.Butifsomeothercore/HARThasstoredanythingintothatmemorylocation,thentheSCwillfailandnothingwillbestoredintomemory.Presumably,thecodewillthenloopandretrytheLR/SCsequenceuntilitsucceeds.

AreservationismadebyaLRinstructionandremainsvalidforasmall,2initeamountoftime(about16instructions)speci2iedbytheRISC-Vspec.TheSCinstructionmustbeexecutedwithinthiswindowofopportunity.Ifsomeothercore/HARTsneaksinanexecutesanLRonthesamememorylocation,thenthereservationiscancelledandtheSCwillfail.

LoadReserved(Word)

GeneralForm:LR.W RegD,(Reg1)

Example:LR.W x4,(x9) # x4 = Mem[x9] and reserve

Description:A32-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisinReg1.

Comment:Thisinstructionplacesa“reservation”onablockofmemorycontainingthisword.Seecommentsinthetextregardingsemantics.

Theaddressbeproperlyaligned.Thereservationmayincludemorethanjustthebytesaddressed,butwillatleastincludethesebytes.

RV64/RV128:Foramachinewitharegisterwidthlargerthan32-bits,thevalueissign-extendedtothefulllengthoftheregister.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding: ThisisavariationofanR-typeinstruction,whereReg2is00000.

RISC-VArchitectureSummary/Porter Page� of� 211 323

Page 212: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

LoadReserved(Doubleword)

GeneralForm:LR.D RegD,(Reg1)

Example:LR.D x4,(x9) # x4 = Mem[x9] and reserve

Description:A64-bitvalueisfetchedfrommemoryandmovedintoregisterRegD.ThememoryaddressisinReg1.

Comment:SeeLR.W.

RV64/RV128:OnlyavailableonRV64andRV128.ForRV128,thevalueissign-extendedtothefulllengthoftheregister.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding: ThisisavariationofanR-typeinstruction,whereReg2is00000.

StoreConditional(Word)

GeneralForm:SC.W RegD,Reg2,(Reg1)

Example:SC.W x5,x4,(x9) # Mem[x9]=x4; x5=success code

Description:A32-bitvalueiscopiedfromregisterReg2tomemory.ThememoryaddressisinReg1.Asuccess/failcodeisplacedinRegDwhere0=successandnon-zero=failure.Onfailure,memoryisnotchanged.

Comment:Thisinstructionassumesa“reservation”haspreviouslybeenmadebyanLRinstructiononablockofmemorycontainingthisword.Seecommentsinthetextfordetails.

Theaddressbeproperlyaligned.Thereservationmayincludemorethanjustthebytesaddressed,butwillatleastincludethesebytes.

RISC-VArchitectureSummary/Porter Page� of� 212 323

Page 213: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Thevalueof“1”willusuallybeusedtoindicatefailure,butthereisapossibilityfordifferentcodestobeusedtoindicatedifferentreasonsforfailure.

RV64/RV128:Foramachinewitharegisterwidthlargerthan32-bits,theupperbitsoftheregisterareignored.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding: ThisisanR-typeinstruction.

StoreConditional(Doubleword)

GeneralForm:SC.D RegD,Reg2,(Reg1)

Example:SC.D x5,x4,(x9) # Mem[x9]=x4; x5=success code

Description:A64-bitvalueiscopiedfromregisterReg2tomemory.ThememoryaddressisinReg1.Asuccess/failcodeisplacedinRegDwhere0=successandnon-zero=failure.Onfailure,memoryisnotchanged.

Comment:SeeSC.W

RV64/RV128:OnlyavailableonRV64andRV128.ForRV128,theupperbitsoftheregisterareignored.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding: ThisisanR-typeinstruction.

AtomicMemoryControlBits:“aq”and“rl”

The“A”extensionprovidesthefollowingatomicmemoryoperation(AMO)instructions:

RISC-VArchitectureSummary/Porter Page� of� 213 323

Page 214: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

LR LoadReserved SC StoreConditional AMOSWAP Atomicallyswaptwovalues AMOADD Atomicallyaddtwovalues AMOAND AtomicallyANDtwovalues AMOOR AtomicallyORtwovalues AMOXOR AtomicallyXORtwovalues AMOMAX AtomicallyMAXtwovalues AMOMIN AtomicallyMINtwovalues

Eachinstructioncontainstwoadditionalbitstocontroltheiroperationabitfurtherthansuggestedpreviously.Thesebitsarecalled“aq”and“rl”(foracquireandrelease).

Thebitscanbesetusingthefollowingassemblernotation:

lr.w.aq x5,(x6) # set the “aq” bit to 1lr.w.rl x5,(x6) # set the “rl” bit to 1lr.w.aq.rl x5,(x6) # set both bits

Sometimesinstructionsonasinglethread/HARTwillbereorderedbythecoreexecutionunittoachievegreaterperformance.Forexample,imaginetwoinstructionsappearinginsequence:2irstastoreinstructionfollowedbyaloadinstruction.Iftheaddressesintheseinstructionsaredifferent,thentheexecutionunitcanreasonablyperformbothoperationsinanyorder.Perhaps,thecorewillissuebothinstructionssimultaneously;thememoryunitmaycompletetheoperationoftheinstructionsinanyorder,perhapsduetothechancecontentsofthecache.Sincethetwoinstructionsaretouchingdifferentmemorylocations,theycanbesafelyreordered,right?

Usuallythereorderingissafe,butinthepresenceofotherconcurrentthreads/HARTs,therecanbeissues.Forexample,imaginethatoneofthetwomemorylocationsrepresentsalockthatisintendedtocontrolaccesstotheotherlocation.Now,itbecomescriticalthattheoperationsareexecutedinthecorrectorder.

(Asanexampleoftheproblem,imaginethatthestoreoperationisbeingusedtosetalock.Onlyafterthelockisset,isitallowabletoloadvaluesfromtheshareddata.Theshareddataisonlyguaranteedtobeinaconsistent,usablestatebyotherprocesseswhenthelockisfree;wheneversomeprocessholdsthelock,thestateofthedatamaybeinaninconsistent,un2inishedstate,andmustnotbeaccessedby

RISC-VArchitectureSummary/Porter Page� of� 214 323

Page 215: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

otherprocesses.Butnowimaginethattheloadoperationisperformed2irst,beforethelockissetwiththestoreinstruction.ThismeansthattheHARThasaccessedsharedmemorywithout2irstacquiringthelock.Thismightallowittoseeaversionorstateofthedatathatitshouldnotsee;aviolationoftheconventiontoacquirelocksbeforeaccessingtheshareddata.)

The“aq”and“rl”bitscontroltheorderingofinstructionsonasingleHART.TheydonothaveanyeffectontheschedulingofinstructionsexecutedondifferentHARTs.Inotherwords,thebitsaretherejusttomakesurethatreorderingofinstructionsonthecurrentHART(whichtheexecutionunitmightnormallydotoincreaseperformance)aretobesuppressedsothatconcurrencycontrolcodewillfunctionproperly.

The“aq”bithasthefollowingmeaning.Whenaninstructionwiththe“aq”(acquire)bitsetisexecuted,itmeansthatanyinstructionsthatfollowthe“aq”instructionmustbedelayedandcannotbeexecutedbeforethe“aq”instruction.The“aq”instructionwillbeexecuted2irst,beforetheinstructionsthatfollowitinthesequential2lowofinstructionsonthisHART.

The“rl”bithasthefollowingmeaning.Whenaninstructionwiththe“rl”(release)bitsetisexecuted,itmeansthatanyinstructionthatprecedesthe“rl”instructionmustbedoneandexecutedtocompletionbeforethe“rl”instruction.The“rl”instructionwillbeexecutedlast,aftertheinstructionsthatprecedeitinthesequential2lowofinstructionsonthisHART.

WhilewehavejustdescribedthebitsasapplyingtoonlyoneHART,thiswasasimpli2ication.Therearereallymultipleviewsoftheorderinwhichinstructionsareexecuted.First,asmentioned,thereistheorderthattheHARTitselfexecutesthetwoinstructions.Butsecond,thereistheorderinwhichotherHARTsobservetheexecution.Perhapsduetocachingeffects,thesecondordermaybedifferent.

Forexample,imaginethattherearetwostoreinstructionstobeexecutedbyoneHART.The2irstinstructionisusedtosetalockandthesecondinstructionisanupdatetotheshareddata.Presumably,anyupdateshouldonlybedoneprivatelybyaHARTthatisalreadyholdingthelock.TheconcernistheorderinwhichtheotherHARTsobservethesetwomemorystoreoperations:otherHARTsmustobservethestoretothelocktohappen2irst,thussignalingthatthelockisnolongerfree.OtherHARTsmustnotbeallowedtoseeaviewinwhichthesecondstoreappearedtooccurbeforethelockwasset,otherwisethisopensthedoortoallowingsomeHARTtoaccessdatathatshouldnotbeaccessible.

RISC-VArchitectureSummary/Porter Page� of� 215 323

Page 216: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

The“aq”and“rl”bitsimposeorderingconstraintsontheviewsobservedbyallHARTsoftheinstructionstreamoftheHARTusingthem.OtherHARTswillseeonlythelegalinstructionsequences.

Whenboth“aq”and“rl”bitsaresetinsomeinstruction,somethingcalled“sequentialconsistency”isimposed.AllotherHARTswillseethesameorderingasoccurredintheHARTcontainingtheinstruction:AllinstructionsthatwereexecutedbeforetheinstructioninquestionwillappeartoallotherHARTsashappeningbeforetheinstruction.AllinstructionsthatwereexecutedaftertheinstructionwillappeartoallotherHARTsashappeningaftertheinstruction.

RISC-Vdividesaccessestothememorysystemintotwocategories:

•Accessestophysicalmemory •Accessestomemory-mappedI/O

EachAMOinstruction(i.e.,LR,SC,AMOSWAP,AMOADD,AMOAND,AMOOR,AMOXOR,AMOMAX,AMOMIN)willaccessexactlyoneofthesetwodomains:eithermemoryorI/O.

The“aq”and“rl”bitsapplyonlytothedomainthatisbeingaccessed.Thebitsimposeorderingconstrainsonallaccessestothedomaininquestion,butimposenoconstraintsontheotherdomain.

Commentary:I’mguessingthemotivationisthateachdomain(memoryorI/O)willhaveaseparateapproachtoimplementingthesynchronizationconstraintsimposedbythe“aq”and“rl”bits.Theuseofthebitswilltriggereitheronemechanismortheother.

Orperhapstheimplementationsineachdomainwillimposewidelydifferentperformancehits.Certainlythe“aq”and“rl”bitswillberequiredtoworkproperlyinthememorydomainandpresumablytheimplementationwillendeavortoprovidehighperformance,sincelocksinmemoryarecommon.ButmaybetheirusewithintheI/Odomainwillberareandasuper-slowimplementationisacceptable,aslongasitcanbeisolatedfromthememorysystem.

Orperhapsitisenvisionedthat,insomeimplementationsofthestandard,thememorysystemwillsupportthe“aq”and“rl”semantics,butwillnotsupportthespeci2iedbehaviorforaccessestotheI/Osystem.

RISC-VArchitectureSummary/Porter Page� of� 216 323

Page 217: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

???

UsingLRandSCInstructions

Takealookatthefollowingcodesequence.Weassumethatthereisalock,whoseaddressisinregisterx10.Whenthelockholds“0”thelockisfree;thevalueof“1”isusedtoindicatethelockisset.

Inline2weloadthecurrentvalueofthelockandplaceareservationonthatlocation.Inline3,wechecktoseeifthelockwasalreadyset.Ifso,wejumptosomeothercode.Thiscodemightsimplyjumpbackto“lock”totryagainimmediately;thisiscalleda“spin”lockbecausethecodeloops,waitingforthevaluetobecomezero.Orperhapstheothercodewillwait,reschedulethreadsorwhatever,andcomebacktotryagainsometimelater.

Line4loadsthevalue“1”intoatemporaryregister.Line5istheSTORE-CONDITIONALinstruction,whichstoresthe“1”(indicating“locked”)intothelock.Thisinstructionwillsucceedorfailandthecodeisplacedintox5totellwhichhappened.

IfsomeotherHARTmanagedtostoreintothelock,thentheSCwillfailandwillnotbeupdatedbythisHART.Otherwise,ifallisgoodandthelockreservationisstillintact,thentheSCwillsucceedand“1”(representing“locked”)willbestoredintothelock.

ThelastlinelooksatthereturncodefromtheSCinstruction.IftheSCfailed,thiscodeloopsbacktothebeginningtotryagain.

1 lock:2 lr.w x5,(x10) # Grab the lock’s value3 bnz x5,fail # If already locked, fail4 li x7,1 # Store “1” into lock5 sc.w x5,x7,(x10) # to indicate “locked”6 bnz x5,lock # If failure, try again

RISC-VArchitectureSummary/Porter Page� of� 217 323

Page 218: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Toreleasealock,allweneedtodoisstorea“0”intothelockvalue.Seethefollowingcode.Recallthatx0alwaysholds“0”.

7 unlock:8 sw x0,(x10) # Set to “unlocked”

Theinstructionsequencesaboveareokayinonerespect,but2lawedinanother.First,theywillatomicallyacquiredthelockinsuchawaythatonlyoneHARTatatimewillacquirethelock.However,itisassumedthatlocksareusedtoprotectcriticalsectionsofcode.Thecriticalsectioncodewillaccesssomeshareddata.Allaccessestotheshareddatamustoccurafteralockoperationandbeforeanunlockoperation,inorderforthelocktobemeaningful.

ButwithmultipleHARTs,itispossiblethat(duetoreorderingofmemoryoperationsanddifferentviewsoftheorder)otherHARTsmightexperiencetheseaccessesinthecriticalsectioncodehappeningeitherbeforethelockoperationoraftertheunlockoperation.Thiscouldsabotagethelock’sfunctionality.

Sowecanrewritethiscode,makinguseofthe“aq”and“rl”bits.Thecorrectedcodeisshownbelow.

The“rl”bitissetwithintheLRinstruction,whichmeansthatanymemoryoperationsthatcameearlierinthesequencewillbecompleted(inthesenseofbeingvisibletootherHARTs)beforethelocksequenceisbegun.Wedon’twantanythingwe’vedonethatshouldbevisibleoutsidethecriticalsectiontobeaccidentallyhidden.

The“aq”bitissetwithintheSCinstruction,whichmeansthatanythingthathappensafterthelocksequence(i.e.,anythinginthecriticalsectioncode)willbeseenbyotherHARTsasoccurringafterthelocksequence.Thismakessurethatnocriticalsectionstuffcanleakoutinfrontofthelocksequence.

1 lock:2 lr.w.rl x5,(x10) # Grab the lock’s value3 bnz x5,fail # If already locked, fail4 li x7,1 # Store “1” into lock5 sc.w.aq x5,x7,(x10) # to indicate “locked”6 bnz x5,lock # If failure, try again

Likewise,wedothesamewithintheunlocksequence,shownnext.

RISC-VArchitectureSummary/Porter Page� of� 218 323

Page 219: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

7 unlock:8 lr.w.rl x5,(x10) # Reserve the lock9 sc.w.aq x5,x0,(x10) # Set to “unlocked”10 bnz x5,unlock # If failure, try again

WeswitchfromusingaSTOREWORDinstructiontousingaLR/SCcombination,sothatwecanusethe“rl”and“aq”bits.WeuseanSCinstructiontostoretheunlockcode(“0”)intothelockandweuseanLRinstruction,becausetheSCinstructionwillfailifwedonotholdareservationonthememorylocation.The“rl”bitforceseverythingwehavedonebeforetheunlocksequencetoappeartootherHARTsasoccurringbeforethelockissetto“0”(unlocked).The“aq”bitontheSCinstructionforcesanythingwedonext,toappearasoccurringaftertheunlocksequence.

Wedon’tbothertochecktheoldvalueofthelock,sinceweassumethat(intheabsenceofbugs),wewouldnevertrytounlockalockthatwasnotpreviouslylocked.However,wemusttestforthefailureoftheSC;itisalwayspossiblethatsomeotherHARTtriedtoacquirethelock,touchedthememorylocationrepresentingthelock,andinvalidatedourreservation.

DeadlockandStarvation–Background

Deadlockoccurswhenevertwo(ormore)processesareeachholdingaresource(e.g.,theyhaveeachacquiredalock)andeachiswaitingforaresourceheldbyanotherprocess(e.g.,tryingtoacquireanotherlock).Onceadeadlocksituationoccurs,theprocesseswillbefrozen,pendingsomeoutsideintervention.DeadlockisaconcernaddressedbytheOSandthereareanumberofapproachestoavoidingordealingwithit.

Thereisanothersimilarproblemcalled“starvation”.(Theterm“livelockissometimesusedtomeanstarvation.)Withstarvation,theprocessesarenotnecessarilyfrozenbut,duetothevagariesofchance,someprocessesnevermakeforwardprogress.

Deadlockrequiresatleasttworesources(suchaslocks)thatarebeingfoughtover.Ontheotherhand,“starvation”canoccurwithonlyoneresource.

RISC-VArchitectureSummary/Porter Page� of� 219 323

Page 220: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Tounderstandstarvation,imagineascenarioinwhichtwoHARTs/processes(calledAandB)competeforasinglelock.ThecoderunningonHARTA2indsthatthelockiscurrentlyheldbyBandrespondsbywaitingandthenretryingtoacquirethelock.Weassumethatnothread/processholdsalockinde2initely(i.e.,weassumethesearecooperatingprocesseswithoutanyprogrambugs),soBmusteventuallyreleasethelock.Butwhatif,duetothepoorluckofA,HARTBhappenstore-acquirethelockagainbeforeAhasachancetocheckit.WhenAchecksasecondtime,Aonceagain2indsthatthelockisstillunavailable.Imaginethatthisrepeatsinde2initely:Bisrepeatedlyreleasingthelock,butAisunabletomakeanyforwardprocess,becausebythetimeAchecks,Bhasalreadyre-acquiredthelock.Biswell-behaved,neverholdingthelockinde2initely,yetAendsupbeingfrozenandunabletomakeprogress.Thisisstarvation.

ItseemslikestarvationoughttoeventuallyresolveitselfwhenthebadluckofA2inallyendsandAgetsachancetoacquirethelock,butthislineofreasoningisdangerous.Funnythingscanhappenwhenprocesses/threads/HARTsarenotbehavingpurelystochastically.Thepossibilityofstarvationmustbeaddressed.

StarvationandLR/SCSequences

Considerthecodesequencetoacquirealockshownearlier.ThatsequenceissuesaLRinstructionandthen,acoupleofinstructionslater,itissuesanSCinstruction.TheguaranteemadebytheRISC-Vspecisthat,iftheSCissuccessfulthennootherHARTshavewrittentothelocation(evenifthevaluewrittenhappenstobethesamevalue).IfanotherHARTmanagedtoexecuteitsSC2irst,thentheSConthisHARTwillfail.TheSCmayfailforotherreasonsaswell.ButiftheSCsucceeds,wecanbesurethatthelocationinquestionhasnotbeenmodi2iedsincetheLRretrievedtheavalue.

However,theRISC-Vspecmakesanadditional,muchstrongerguarantee,anditisthis:IfyoukeepretryinganLR/SCsequence,itwilleventuallysucceed.Starvation(i.e.,livelock)isguaranteednottohappen.TheSCmayfailafewtimes,butitwilleventuallysucceed.Theideaisthata“denialofserviceattack”byaheavy2lowofrequestsfromotherHARTswillneverpreventanyHARTfrommakingprogress.

TherearesomeconstraintsplacedontheLR/SCsequenceinorderforthisguaranteetobemade,andthe“lock”codesequenceshownabovemeetsthem.TherecanbecodebetweentheLRandSC(includingarepeatlooptokeeptestinguntilit

RISC-VArchitectureSummary/Porter Page� of� 220 323

Page 221: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

succeeds).Thebasicrestrictionisthatthesequenceofinstructionscannotbetoolong,namely16instructions.Alsoothermemoryoperationsareforbidden.Alsoexception/trapprocessingisforbidden.Alsoinstructionsthatmighttakealongtimetocomplete(suchasintegermultiplyor2loatingpointoperations)areforbidden.

Commentary:Guaranteeingthatstarvationwillnotoccurisnotnecessarilystraightforward.Oneapproachinvolvesmaintainingaqueuethatsomehowre2lectshowlongaprocesshasbeenwaiting,andgivingprioritytotheprocessthathasbeenwaitinglongest.Anotherapproachinvolvespassinga“baton”aroundinsequencefromprocesstoprocess.Eachprocessgetsthebatoninturnand,withit,achancetomoveforward.Butafteralimitedamountoftime,itmustpassthebatonontothenextprocess.

TheAtomicMemoryOperation(AMO)Instructions

TheAMOSWAP(AtomicSwap)instructioncanbeusedtosimultaneouslyreadavaluefrommemoryandstoreanewvalueintothatsamelocation.Theinstructioncandothisatomically,whichmeansthatnointerveninginstructionthattriestostoreintothislocationcansneakinandexecute.

Typically,anatomicswapinstructionisusedtosetalock.Theinstructionsetsthelocktothe“locked”valuewhileatthesametimereadingtheoldvaluetomakesurethelockwaspreviously“unlocked.”Theinstructionmustbeatomictoensurethattwoconcurrentprocessesdon’tsimultaneouslystorethe“locked”valueintoan“unlocked”lockwithoutrealizingit.

AtomicSwap

GeneralForm:AMOSWAP.W.aq.rl RegD,Reg2,(Reg1) # 32-bitsAMOSWAP.D.aq.rl RegD,Reg2,(Reg1) # 64-bits

Examples:AMOSWAP.W x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4AMOSWAP.W.aq x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4AMOSWAP.D.rl x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4

Description:

RISC-VArchitectureSummary/Porter Page� of� 221 323

Page 222: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

AvalueisreadfromthememorylocationwhoseaddressisinReg1andthevalueisplacedintoRegD.ThenthevaluefromReg2iswrittentothememorylocation.InthecasethatReg2andRegDarethesameregister,thevaluesareswapped,i.e.,thevaluestoredinmemoryisthepreviousvalueofReg2.

Thisinstructioncontainsan“aq”bitandan“rl”bit,whichcontroltheorderingofinstructionsappearingbeforeandafterthisinstruction.

Thememoryaddressmustbeproperlyalignedorelsean“AMOAddressmisaligned”exceptionwillberaised.

RV64/RV128:AMOSWAP.DisonlyavailableonRV64andRV128.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding: ThisisanR-typeinstruction.

CodeExample:TheAMOSWAPinstructioncanbeusedtoimplementthe“lock”and“unlock”operationsonalock,asweshowhere.Thelockwillberepresentedas

0=unlocked,i.e.,free 1=locked

Weassumeregisterx10containstheaddressofthelockandx5isusedasatemporary.Hereiscodetosetthelock:

li x5,1 # x5 = 1retry:

amoswap.w.aq x5,x5,(x10) # x5 = oldlock & lock = 1bnez x5,retry # Retry if previously set

Afterthecriticalsection,whichpresumablyaccessesandmodi2iessomeshareddata,wehavethefollowingcodetorelease(i.e.,free)thelock:

amoswap.w.rl x0,x0,(x10) # Release lock by storing 0

TheinitialAMOSWAPhasthe“aq”bitset,whichmeansthatanyinstructionsthatfollowtheAMOSWAPcannotbeobservedbyotherHARTstoexecutebeforetheAMOSWAP.Thispreventscodefromthecriticalsectionfrom“leaking”outbeforethelockisset.

RISC-VArchitectureSummary/Porter Page� of� 222 323

Page 223: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

The2inalAMOSWAPhasthe“rl”bitset,whichmeansthatanyinstructionsthatprecedetheAMOSWAPcannotbeobservedbyotherHARTstoexecuteaftertheAMOSWAP.Thispreventscodefromthecriticalsectionfrom“leaking”outafterthelockisreleased.

AtomicAdd/AND/OR/XOR/Max/Min(Word)

GeneralForm:AMOADD.W.aq.rl RegD,Reg2,(Reg1) # additionAMOAND.W.aq.rl RegD,Reg2,(Reg1) # logical ANDAMOOR.W.aq.rl RegD,Reg2,(Reg1) # logical ORAMOXOR.W.aq.rl RegD,Reg2,(Reg1) # logical XORAMOMAX.W.aq.rl RegD,Reg2,(Reg1) # signed maximumAMOMAXU.W.aq.rl RegD,Reg2,(Reg1) # unsigned maximumAMOMIN.W.aq.rl RegD,Reg2,(Reg1) # signed minimumAMOMINU.W.aq.rl RegD,Reg2,(Reg1) # unsigned minimum

Examples:AMOADD.W x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4+x5AMOADD.W.aq x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4+x5

Description:AvalueisreadfromthememorylocationwhoseaddressisinReg1andthevalueisplacedintoRegD.AbinaryoperationisthenperformedonthevaluefetchedfrommemoryandthevalueinReg2andtheresultiswrittenbacktothememorylocation.InthecasethatReg2andRegDarethesameregister,theoperationisperformedusingtheinitialvalueoftheregister;thevaluestoredintheregisterwillbethevaluefetchedfrommemory.

Thisinstructioncontainsan“aq”bitandan“rl”bit,whichcontroltheorderingofinstructionsappearingbeforeandafterthisinstruction.

Thememoryaddressmustbeproperlyalignedorelsean“AMOAddressmisaligned”exceptionwillberaised.

RV64/RV128:TheresultstoredintoRegDwillbesign-extendedtothefullregistersize,i.e.,to64or128bits.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding:

RISC-VArchitectureSummary/Porter Page� of� 223 323

Page 224: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

TheseareR-typeinstructions.

AtomicAdd/AND/OR/XOR/Max/Min(Doubleword)

GeneralForm:AMOADD.D.aq.rl RegD,Reg2,(Reg1) # additionAMOAND.D.aq.rl RegD,Reg2,(Reg1) # logical ANDAMOOR.D.aq.rl RegD,Reg2,(Reg1) # logical ORAMOXOR.D.aq.rl RegD,Reg2,(Reg1) # logical XORAMOMAX.D.aq.rl RegD,Reg2,(Reg1) # signed maximumAMOMAXU.D.aq.rl RegD,Reg2,(Reg1) # unsigned maximumAMOMIN.D.aq.rl RegD,Reg2,(Reg1) # signed minimumAMOMINU.D.aq.rl RegD,Reg2,(Reg1) # unsigned minimum

Examples:AMOADD.D x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4+x5AMOADD.D.aq x5,x4,(x9) # x5=Mem[x9]; Mem[x9]=x4+x5

Description:AvalueisreadfromthememorylocationwhoseaddressisinReg1andthevalueisplacedintoRegD.AbinaryoperationisthenperformedonthevaluefetchedfrommemoryandthevalueinReg2andtheresultiswrittenbacktothememorylocation.InthecasethatReg2andRegDarethesameregister,theoperationisperformedusingtheinitialvalueoftheregister;thevaluestoredintheregisterwillbethevaluefetchedfrommemory.

Thisinstructioncontainsan“aq”bitandan“rl”bit,whichcontroltheorderingofinstructionsappearingbeforeandafterthisinstruction.

Thememoryaddressmustbeproperlyalignedorelsean“AMOAddressmisaligned”exceptionwillberaised.

RV64/RV128:TheseinstructionsareonlyavailableonRV64andRV128.

Availability:Thisinstructionisonlyavailableinthe“A”extension(AtomicInstructions).

Encoding: TheseareR-typeinstructions.

Implementation:EachAMOoperationrequiresbothareadofmemoryandawritetomemory,aswellasasimplebinaryoperation.Itmaybethattheread-

RISC-VArchitectureSummary/Porter Page� of� 224 323

Page 225: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

operate-writefunctionalitywillbeimplementedcompletelywithinthememorysubsystem,insteadofwithinthecore.UponencounteringanAMOinstruction,thecorewouldsendtheinitialvalueofReg2tothememorysystem.Thememorysubsystemwouldthencompletetheoperation,performingtheatomicread-operate-writecycleandthensendingtheresultvaluebacktothecore,tobestoredinRegD.Byof2loadingtheAMOfunctionalitytothememorysubsystem,theatomicityisguaranteedbythememorysystemalone.

Implementation:AnotherapproachtoimplementingtheAtomicMemoryOperations(AMO)wouldmakeuseofthemoreprimitiveLRandSCoperations.Themicroarchitecturemight,forexample,implementtheAMOinstructionsintermsofthesimplerfunctionality,whichthemicroarchitecturealsousestoimplementtheLRandSCinstructions.

SequentialConsistency:Aseriesofreadsandwritestomemoryissaidtobe“sequentiallyconsistent”ifthefollowingstatementshold.(1)AllofthereadandwriteoperationsissuedbyalltheHARTscanbelinearizedintoasingle,undisputedorder,andthesameorderisobservedbyallHARTs.(2)TheoperationsofanyoneHARTmustofcourseappearinthisorderingintheordertheHARTactuallycalledforthem,althoughtheoperationsfromanytwoHARTscanbeinterleavedarbitrarily.(3)Oratleasttheresultsofthecomputationareindistinguishablefromsuchaglobal,linearorderingofalloperations.

Toforceallreadsandwritestoconformtosequentialconsistency,theprogrammercanusetheatomicoperationsasfollows.Thiswilleffectivelylinearizeallreadsandwritestomemory,sothateveryHARTwillagreeontheorderthateverythingisperformed.

Forreadssuchas: LD.Wx4,(x7)

Substitute: LR.W.aq.rl x4,(x7)

Forwritessuchas: ST.W(x7),x4

Substitute: AMOSWAP.W.aq.rl x0,x4,(x7)

RISC-VArchitectureSummary/Porter Page� of� 225 323

Page 226: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter7:ConcurrencyandAtomicInstructions

Settingthe“aq”and“rl”bitsforcesallotherHARTstoseeinstructionsthatoccurbeforetheinstructiontoappeartoexecutetocompletionbeforetheinstruction.Also,allinstructionsthatoccuraftertheinstructionwillappeartootherHARTstoexecuteaftertheinstruction.

Implementation:NotethatanAMOSWAPoperationthatdiscardsthepreviousvaluefrommemory(i.e.,RegD=x0),canavoidthereadphaseoftheinstruction.Thisisstillausefulatomicinstruction,distinctfromastore(ST)instruction,duetothepresenceofthe“aq”and“rl”bits.Perhapssuchaninstructioncanbeoptimizedbythemicroarchitecturetoavoidthereadphase.

RISC-VArchitectureSummary/Porter Page� of� 226 323

Page 227: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

PrivilegeModes

Atanytime,theRISC-Vprocessorisoperatinginexactlyoneofthefollowing“modes”:

U-Mode(UserMode) !Lowestprivilege S-Mode(SupervisorMode) M-Mode(MachineMode) !Highestprivilege

Commentary:Previousversionsincludedafourthprivilegelevel,inwhichahypervisorwastoexecute.HypervisorModewasprettymuchidenticaltoSupervisorModeandwas2laggedas“tobespeci2iedinthefuture.”

U-Mode(UserMode) !Lowestprivilege S-Mode(SupervisorMode) H-Mode(HypervisorMode) M-Mode(MachineMode) !Highestprivilege

Typically,user-levelapplicationswillexecuteinUserMode.WhenevertheapplicationwantsanOSservice,itwillmakeasystemcall.TheOScodethathandlesthiscallwillexecuteinSupervisorMode,i.e.,atahigherprivilegelevel.Uponreturntotheapplication,themodewillbeloweredbacktoUserMode.

Mostinstructionscanbeexecutedinanymode,butsomeinstructionsareprivilegedandcanonlybeexecutedinahighermode.

Theprivilegelevelsarestrictlyordered.Anythingthatcanbedoneatalowerprivilegelevelcanbedoneinanyoneofthehigherlevels.Forexample,anythingthatcanbedoneinUserModecanbedoneinSupervisorMode.

RISC-VArchitectureSummary/Porter Page� of�227 323

Page 228: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

MachineModeisthehighestprivilegelevel;allinstructionsarelegalatthislevelandnothingisprotected.Whentheprocessorbeginsoperationafterpoweringuporafterareset,itisplacedinMachineMode.

Theremainingmodes(UserandSupervisormodes)areprettymuchthesameaseachother.Inotherwords,thereislittletodistinguishthem,excepttheirrelationshiptoeachother.IfyouunderstandUserMode,thenyouwillalsounderstandmostofSupervisorMode.

Thereisoneexceptiontotheuniformityandthisregardsvirtualmemoryandpagetables.ThisfunctionalityislocatedinSupervisorMode.

TheRISC-Vspecdescribestheprivilegemechanisminsomewhatgeneraltermsanditiseasytoimaginetheinsertionofadditionallevelsofprivilege,althoughitshouldbenotedthatafourthmode(HypervisorMode),whichexistedinpreviousversions,hasbeeneliminated.

Itappearsthatthecurrentmodeisnotavailableinanyuser-visibleregister.Inotherwords,itisdif2icultforcodetodeterminethecurrentmode.Itcannotsimplybereadfromaregister;insteadthecurrentmodeisimplicitinthefunctionalitythatisavailable.

[Thislackseemstobeintentional.Afterall,codeistypicallywrittentoruninonemodeoranotherandrarelyneedstoaskwhichmodeitisrunningin.Forexample,ahypervisormaywishtorunahostedOSisarestrictedlower-privilegemodethantheOSismeanttorunat,therebymaintainingfullcontrolofthemachineandpreventingthehostedOSfromcorruptingthehypervisoritselforotherhostedOSes.ThehostedOSexpectstoruninahigherprivilegemodethanitactuallyisrunningat;thehypervisormustcreateandpreservetheillusionthattheOSisrunninginahigherprivilegemodethanitactuallyis.]

TheRISC-Vspeci2icationdoesnotrequireallmodestobeimplemented.Someprocessorsmaynothaveallmodes.Herearethelegalpossibilities:

Option1 Option2 Option3 U U S M M M

RISC-VArchitectureSummary/Porter Page� of�228 323

Page 229: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

ThemostbasicprocessorwillhaveonlyMachineMode.Inasense,thereisnoprivilegesysteminsuchasimpleprocessorsincealloperationscanbeperformedatanytime.Animplementationlikethiswouldbesuitableforanembeddedmicrocontroller.

Withoption2,therearetwoprivilegelevels:UserandMachine.ThismightbeappropriateforasimpleOSwithsomeprotectedsecuritymonitorrunninginMachineModeandoneormoreapplicationsrunninginUserMode.

Option3isintendedformoreelaboratesystemsthatwillberunningatraditionalOSlikeUnix/Linux.

RISC-VArchitectureSummary/Porter Page� of�229 323

Page 230: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Insomeimplementationsseveralinstructionsand/orISAfeatureswillbemissingandtherewillbeaneedtoemulatethemissingfunctionality.Thenaturalorganizationwillplacethecodetoemulatethemissinginstructions/featuresinMachineModesothattheOS(whichrunsinSupervisorMode)doesnotneedtobeawareofthesedetails.

PreviousRISC-VversionsdocumentedaHypervisorModeintendedtosupporthypervisors.Thehypervisorwastointendedtorunatitsownprivilegelevel,whileeachofthehostedOSesraninSupervisorMode.AlthoughHypervisorModehasbeeneliminated,weincludethefollowingdiagramanyway.

Intheabovediagramswesuggestwhichmodeswillbeusedforwhichsortsofcode,butthisisnotmandatory.Inparticular,anoperatingsystemkernelmightruninMachineModeinsomesystemsandinSupervisorModeinothersystems.

InterfaceTerminology

TheRISC-Vdocumentationdiscussestheinterfacingbetweenapplications,operatingsystems,andhypervisors.

Everyapplicationprogramrunsinaparticularenvironment,whichiscalledthe“ApplicationExecutionEnvironment”(AEE).Howtheapplicationinterfaceswiththeunderlyingexecutionenvironmentiscalledthe“ApplicationBinaryInterface”(ABI).

RISC-VArchitectureSummary/Porter Page� of�230 323

Page 231: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

TypicallyanOSsitsunderneaththeapplicationandimplementstheApplicationBinaryInterface,whichconsistsoftheUserModeinstructionsetalongwiththecollectionofsystemcallsthatareavailabletoapplications.TheApplicationBinaryInterfaceisthesumtotalofwhattheapplicationprogrammerneedstounderstandinordertowriteprograms;theprogrammerdoesnothavetounderstandorknowwhatisgoingonwithintheApplicationExecutionEnvironment.

AnexampleApplicationBinaryInterfacewouldcombinetheprocessorISAalongwiththeOSsystem-callinterface.

IfanapplicationisportedtoanewApplicationExecutionEnvironment,thentheapplicationwillfunctionidenticallywithnoproblems,aslongasthenewenvironmentimplementsexactlythesameApplicationBinaryInterface.

Typicallythereareseveralapplicationsrunningatanyonetime.TheOSgenerallyprovidesoneApplicationBinaryInterfaceandallapplicationsarewrittentomeetthisspeci2ication.However,anOSmightimplementmorethanoneApplicationBinaryInterface,asshowninthenextdiagram.

TheOSmightberunningonbaremetal.OrtheOSmightberunningontopofahypervisororsomeothermonitor/security/bootstrappingsoftware.Inanycase,theOSiscreatedandwrittentomeetthespeci2icationsofa“SupervisorExecution

RISC-VArchitectureSummary/Porter Page� of�231 323

Page 232: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Environment”(SEE).Thisspeci2icationiscalledthe“SupervisorBinaryInterface”(SBI).

IftheOSisportedtoanewenvironment(forexample,tonewcomputerhardware),theOSshouldfunctionidentically,aslongasthenewexecutionenvironmentimplementstheexactsameSupervisorBinaryInterface.

TypicallyanOSisdesignedtorundirectlyonabaremachinewithnounderlyingsoftware.Inthiscase,theSBIconsistsofaspeci2icationoftheprocessor’sISA.

However,theOSmayinfactberunningontopofahypervisor.AslongasthehypervisorimplementsthecorrectSupervisorBinaryInterface,theOSshouldbeunawarethatitisrunningontopofahypervisor.

ItisconceivablethatthehypervisorprovidesmultipleSupervisorBinaryInterfaces.Forexample,onehypervisormightprovideanSBImimickingaPCandasecondSBImimickinganApplecomputer.SuchahypervisorwouldthanfacilitatethesimultaneousexecutionofbothaMSWindowsoperatingsystemandtheMacOS.Suchasetupisshownbelow.

Thehypervisoritselfrunsinsomeenvironment,whichiscalledthe“Hyperv2laisorExecutionEnvironment”(HEE).Theunderlyingenvironmentimplementsaparticularinterface,calledthe“HypervisorBinaryInterface”(HBI).

A“type-1hypervisor”(ornativehypervisor)isaprogramthatisintendedtorunonabaremachine.Inotherwords,thereisnosoftwarebelowthehypervisor.Thehypervisorachievesallitsworkusingonlytheinstructionsavailableontheprocessoranddoesnotmakeanysystemcalls.Foratype-1hypervisor,theHypervisorExecutionEnvironmentisjusttheprocessorcore’sinstructionsetarchitecture(ISA).

RISC-VArchitectureSummary/Porter Page� of�232 323

Page 233: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

A“type-2hypervisor”(orhostedhypervisor)isaprogramthatisintendedtorunontopofothersoftware.Thehypervisorsitsontopofsomeotheroperatingsystemand,asfarastheunderlyingOSisconcerned,thehypervisorisjustanotherapplicationprogram.AnexampleofthiswouldbeVMWare(whichrunsontopoftheAppleMacOSandprovidesaSupervisorBinaryInterfacetomimicthePCsoWindowscanrun.OtherexamplesareQEMUandVirtualBox.

Inordertoruncomputersystemsasef2icientlyaspossible,hardwareimplementationformostinstructionsisdesirable.Themajorityofcommonoperationsmustbeimplementeddirectlyinhardware,byinstructionsexecutedinthecurrentmode.

However,somecodewilloccasionallytrytoexecuteinstructionsthatrequiresoftwareintervention.Perhapsaparticularimplementationfailstoprovidehardwaretosupportcertainoperations;exampleswouldbethelackofhardwaresupportfor“integermultiplication”or“2loatingpointarithmetic”.Orperhapsthecodeisrunninginahypervisorenvironmentinwhichaparticularinstructioncannotbeblindlyexecuted,butmustbeintercepted,monitored,and/oralteredbeforebeingexecuted.Regardlessofthereasontheinstructioncannotbeexecuteddirectly,an“exception”willoccurwhentheinstructionisencountered.Themodewillbeincreasedtoahigherprivilegelevel,softwarewillintervene,andeventuallyareturntotheoriginalprivilegemodewilloccurandinstructionexecutionwillresume.

AllISAdesignerstrytominimizethenumberofoperationsthatrequiresoftwareinterventionand,wheninterventionisnecessary,trytomaketheupcalltotheexceptionhandlerquickandef2icient.

OvertheyearswehaveaccumulatedgoodexperiencewiththeinterfacebetweenUserModecodeandoperatingsystemcode.Muchhasbeendonetoreducethefrequencyofsystemcallsandmaketheupcallinterfacefastandef2icient.

Morerecentlyweareusingmorehypervisorsoftwaretosupportlegacyoperatingsystems.InordertoimplementtheSupervisorBinaryInterface(SBI)expectedbytheOS,thehypervisormaybeobligatedtostepinfrequently,whenevertheOSexecutesaprivilegedinstruction.ProvidinggoodISAsupportforef2icienthypervisorsisanactiveresearcharea.

RISC-VArchitectureSummary/Porter Page� of�233 323

Page 234: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Exceptions,Interrupts,TrapsandTrapHandlers

Duringtheexecutionofaninstruction,an“exception”mayoccur.

HerearethedifferenttypesofexceptionsmentionedintheRISC-Vspec:

SynchronousExceptions:Illegalinstruction

Instructionaddressmisaligned Instructionaccessfault Loadaddressmisaligned Loadaccessfault Store/AtomicMemoryOperation(AMO)addressmisaligned Store/AtomicMemoryOperation(AMO)accessfault Environmentcall(i.e.,the“syscall”trapinstruction) BreakpointAsynchronousExceptions(Interrupts): Timerinterrupt Softwareinterrupt Externalinterrupt

Asynchronousexception(sometimesjustcalledan“exception”)occursasaresultoftryingtoexecuteaninstructionandissaidtobesynchronoussinceitisintimatelyconnectedwithaparticularinstruction.Thesynchronousexceptionwillbedetectedduringinstructionexecutionatthetimetheexceptionalconditionisdetected.Forexample,duringinstructiondecode,thehardwaremaydetectabadopcode2ield,whichwillinitiatethe“illegalinstruction”exceptionprocessing.

Theothertypeofexceptionisan“interrupt”.Duringtheexecutionofaninstruction,aninterruptmayoccur.Interruptsarecausedbyeventsoutsidethecurrentinstructionexecutionand,assuch,areasynchronous.Wheneveraninterruptoccurs,itwillbecomeassociatedwithasingleinstruction(basicallythecurrentlyexecutinginstruction)andthatinstructionischosentoreceivetheinterruptexception.

Anexceptioncanbehandledorignored.Iftheexceptionishandled,thena“trap”willoccur.Thetrapprocessinginvolvesatransferofcontroltoa“traphandler”routine.Trapprocessingconsistsofafewhardwareoperations,suchasmodifyingacoupleofhardware2lags,savingthePC,andeffectingatransferofcontroltothe2irstinstructionofthetraphandlerroutine.

RISC-VArchitectureSummary/Porter Page� of�234 323

Page 235: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

A“traphandler”isasoftwareroutineandwillusuallybeexecutedatahigherprivilegemodethantheinstructionreceivingtheexception.Forexample,an“illegalinstruction”exceptionmightoccurinapplicationcoderunninginUserMode.ThetraphandlerwillruninSupervisorModeand,whencomplete,thehandlermayreturntoexecutinginstructionsinUserMode.

However,thechangeofmodemaynotalwaysoccur.Insomecases,thetraphandlerwillexecuteatthesameprivilegemodeastheinstructionreceivingtheexception.WithRISC-V,sometrapsmaybeentirelycontainedinUserMode;theapplicationcodewillberesponsibleforhandlingitsowntrapsandsupervisorcodewillnotbeinvolved.

A“timerinterrupt”iscausedwhenaseparatetimercircuitindicatesthatapredetermineintervalhasended.Thetimersubsystemwillinterruptthecurrentlyexecutingcode.TimerinterruptsaretypicallyhandledbytheOSwhichusesthemtoimplementtime-slicedmultithreading.

An“externalinterrupt”comesfromoutsidetheprocessorandtheprecisenatureofthecausewilldependontheapplication.Forexample,aRISC-Vprocessorusedinanembeddedprocesscontrolsystemmightreceiveexternalinterruptsfromvarioussensorsdemandingattention.

A“softwareinterrupt”iscausedbysettingabitinthemachinestatusword.Thiscanbeusefulinamulti-corechipwhereathreadrunningononecoreneedstosendaninterruptsignaltoanothercore.

ControlandStatusRegisters(CSRs)

Inadditiontothebasicuser-levelinstructionsdiscussedpreviously,whichcanbeexecutedbycoderunninginanymode,thereareanumberofadditionalfeaturesthatanyonewritinganOSkernelcodewillneedtounderstand.

AtthecenteroftheRISC-VprivilegesystemareanumberofControlandStatusRegisters(CSRs),whicharedifferentfromthegeneralpurposeand2loatingpointregisters.EachCSRhasauniquenameandspecializedfunction.

RISC-VArchitectureSummary/Porter Page� of�235 323

Page 236: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

ControloftheprivilegesystemisperformedentirelybyreadingandwritingtheCSRs.OS/Kernelcodecanperformprivilegedoperationssolelybyreading/writingtoCSRs.

TheRISC-Vspeci2icationallowsforupto4,096CSRstobepresent,butinmostimplementations,notalladdresseswillbeused.Infact,thereareonlyacoupleofdozenCSRsde2inedbytheRISC-Vspeci2ication.Theotheraddressesareleftopenanddifferentimplementationsmaychoosetoaddadditionalnon-standardCSRs,uptothelimitof4,096registers.

EachCSRhasanaddress.Or,ifyouprefer,youcanthinkofeachCSRashavinganumberintherange0to4,095.

Sincetherecanbeupto4,096CSRs,a12-bitaddressisusedtoaddresseachCSR.(Recallthat212=4,096.)TheinstructionsthatreadandwritetheCSRsusetheI-typeinstructionformat.Thisinstructionformatincludesa12-bitimmediate2ield,andthis2ieldisusedtocontainthe12bitaddressoftheCSRregisterbeingoperatedon.

ThesizeoftheCSRsisspeci2iedtobethesamesizeasthegeneralpurposeregisters.SoalltheCSRswillbe32bitsinaRV32machine.Ina64-bitmachine,theCSRsare64bitswide.Likewise,inamachinewith128bitregisters,theCSRsare128bitswide.

Inordertoworkwithallsizesofmachines,thespeci2icationnevermakesuseofmorethan32bitsinanyCSRregister.For64-bitand128-bitmachines,theupper32or96bitsareneverusedandare2illedwithzeros.[Actually,thereareminorexceptions.]

EachCSRhasadifferentmeaninganduse.SomeCSRsarecomprisedofacollectionofspecialized2ields(some2ieldsareonly1or2bitsinlength),whileotherCSRscontainasinglefulllengthvalue,suchasa32bitaddress.

EachCSRbelongstooneoftheprivilegemodes.Inotherwords,someCSRsareMachineModeCSRs,someCSRsareSupervisorModeCSRs,andtheremainingCSRsareUserModeCSRs.

AMachineModeCSRcanonlyberead/writtenwhentheprocessorisinMachineMode;itisillegaltoaccessitwhenrunninginanyothermode.ASupervisorModeCSRcanberead/writtenwhentheprocessorisineitherMachineModeor

RISC-VArchitectureSummary/Porter Page� of�236 323

Page 237: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

SupervisorMode,butnotfromcoderunninginUserMode.AUserModeCSRcanberead/writtenregardlessofthecurrentoperatingmode.

Thereareseveralinstructionswhichareusedtomanageandcontroltheprivilegesystem,andtoperformprivilegedoperations.Theinstructionsare:

ECALL–Tomakeasystemcallfromalowerprivilegeleveltoahighermode EBREAK–Usedbydebuggerstogetcontrol;similartoECALL URET–ToreturnfromtraphandlerthatwasrunninginUserMode SRET–ToreturnfromtraphandlerthatwasrunninginSupervisorMode MRET–ToreturnfromtraphandlerthatwasrunninginMachineMode WFI–Gointosleep/lowpowerstateandWaitForInterrupt CSR…–Instructionstoread/writetheControlandStatusRegisters(CSRs)

WritingtoaCSRwillchangethestateoftheprocessorcore.Forexample,toenable/disableinterrupts,theprogramwouldwritetooneoftheCSRs.

Reading/writingsomeCSRsisnotallowedatlowerprivilegelevels.SomeCSRsareread-onlyatallprivilegelevelsandarethereforesetinstonebythechipdesigners,e.g.,todescribethecore’scapabilities.

KeyIdea:TounderstandtheRISC-Vprivilegesystemandexceptionprocessing,itisbothnecessaryandsuf2icienttounderstandtheCSRsandtheirfunctionality.

CSRListing

TheControlandStatusRegisters(CSRs)arelistedbelowforreference.Theirfunctionswillbedescribedlater.

ForeachCSR,wegiveits12-bitaddress(using3hexdigits).Wealsoindicatewhetheritcanbeaccessedineachoftheprivilegemodes(User,Supervisor,Machine)and,ifso,whatsortofaccess(read-onlyorread/write)isallowed.

Recallthattheprocessorisexecutinginoneofthethreemodesatanymoment.ItisaprivilegeviolationtotrytoaccessaCSRthatisnotvisibleinthecurrentmode.Likewise,itisaprivilegeviolationtotrytowritetoaCSRthatisread-onlyinthecurrentmode.Ifthishappens,anillegalinstructionexceptionwillbesignaled.

RISC-VArchitectureSummary/Porter Page� of�237 323

Page 238: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Visibilityin…AddrName Description U S M

Counters:

C00 cycle Clockcyclecounter r r rB00 mcycle r/wC80 cycleh Upperhalfofcycle(RV32only) r r rB80 mcycleh r/w

C01 time Currenttimeinticks r r rC81 timeh Upperhalfoftime(RV32only) r r r

C02 instret Numberofinstructionsretired r r rB02 minstret r/wC82 instreth Upperhalfofinstret(RV32only) r r rB82 minstreth r/w

ExceptionProcessing:

000 ustatus Statusregister r/w r/w r/w100 sstatus r/w r/w300 mstatus r/w 004 uie Interrupt-enableregister r/w r/w r/w104 sie r/w r/w304 mie r/w

005 utvec Traphandlerbaseaddress r/w r/w r/w105 stvec r/w r/w305 mtvec r/w

102 sedeleg Exceptiondelegationregister r/w r/w302 medeleg r/w

103 sideleg Interruptdelegationregister r/w r/w303 mideleg r/w 106 scounteren Counterenable r/w r/w

RISC-VArchitectureSummary/Porter Page� of�238 323

Page 239: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

306 mcounteren r/w

TrapHandling:

040 uscratch Tempregisterforuseinhandler r/w r/w r/w140 sscratch r/w r/w340 mscratch r/w

041 uepc PreviousvalueofPC r/w r/w r/w141 sepc r/w r/w341 mepc r/w

042 ucause Trapcausecode r/w r/w r/w142 scause r/w r/w342 mcause r/w

043 utval Badaddressorbadinstruction r/w r/w r/w143 stval r/w r/w343 mtval r/w

044 uip Interruptpending r/w r/w r/w144 sip r/w r/w344 mip r/w

VirtualMemory:

180 satp Addresstranslationandprotection r/w r/w

Informational:

301 misa ISAandextensions r/wF11 mvendorid VendorID rF12 marchid ArchitectureID rF13 mimpid ImplementationID rF14 mhartid HardwarethreadID r

FloatingPoint:

001 f=lags Floatingpointing2lags r/w r/w r/w002 frm Dynamicroundingmode r/w r/w r/w

RISC-VArchitectureSummary/Porter Page� of�239 323

Page 240: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

003 fcsr Concatenationoffrm+f2lags r/w r/w r/w

PerformanceMonitoring:

323 mhpmevent3 Eventselector#3 r/wC03 hpmcounter3 Eventcounter#3 r r rB03 mhpmcounter3 r/wC83 hpmcounter3h Upperhalfofcounter(RV32only) r r rB83 mhpmcounter3h r/w

324 mhpmevent3 Eventselector#4 r/wC04 hpmcounter4 EventCounter#4 r r rB04 mhpmcounter4 r/wC84 hpmcounter4h Upperhalfofcounter(RV32only) r r rB84 mhpmcounter4h r/w

… … … … … …

33F mhpmevent31 Eventselector#31 r/wC1F hpmcounter31 EventCounter#31 r r rB1F mhpmcounter31 r/wC9F hpmcounter31h Upperhalfofcounter(RV32only) r r rB9F mhpmcounter31h r/w

PhysicalMemoryProtection:

3A0 pmpcfg0 PMPCon2igurationword#0 r/w3A1 pmpcfg1 PMPCon2igurationword#1 r/w3A2 pmpcfg2 PMPCon2igurationword#2 r/w3A3 pmpcfg3 PMPCon2igurationword#3 r/w

3B0 pmpaddr0 PMPAddress#0 r/w3B1 pmpaddr1 PMPAddress#1 r/w… … … …3BF pmpaddr15 PMPAddress#15 r/w

Debug/Trace:

7A0 tselect Triggerregisterselect r/w7A1 tdata1 Triggerdata#1 r/w

RISC-VArchitectureSummary/Porter Page� of�240 323

Page 241: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

7A2 tdata2 Triggerdata#2 r/w7A3 tdata3 Triggerdata#3 r/w

7B0 dcsr Debugcontrolandstatus r/w7B1 dpc DebugPC r/w7B2 dscratch Scratchregister r/w

InstructionstoRead/WritetheCSRs

RegardlessofwhichControlandStatusRegister(CSR)isinvolved,thereareonly6instructionsthatareusedtoreadandwritetheregisters:

CSRRW–ReadandwriteaCSR CSRRS–Readandsetselectedbitsto1 CSRRC–Readandclearselectedbitsto0 CSRRWI–ReadandwriteaCSR(fromimmediatevalue) CSRRSI–Readandsetselectedbitsto1(usingimmediatemask) CSRRCI–Readandclearselectedbitsto0(usingimmediatemask)

ThereareafewotherinstructionsthatreadormodifyaCSR,buttheseadditionalinstructionsaremerelyspecialcases(i.e.,shorthandorsyntacticsugar)foroneoftheaboveinstructions.

Eachofthese6instructionsbothreadsandwritesasingleCSRinasingle,atomicaction.

EachoftheseinstructionsreadsaCSRbycopyingitspreviousvalueintooneofthegeneralpurposeregisters(x1,x2,…).Often,youonlywanttomodifytheregisteranddon’tcareaboutitspreviousvalue;ifso,youcanspecifythedestinationregistertobex0.

EachoftheseinstructionsisalsocapableofmodifyingtheCSR.Often,youonlywanttoreadtheregister;ifso,youcanspecifythesourcevalueasregisterx0.(Thisisaspecialcase.TheCSRisremainsunchanged;itisnotsettozero.)

Thenewvaluecanbeeitherafullvalue,orabitmaskwhichwillbeusedtosetorclearselectedbits.Thenewvalue(orbitmask)caneithercomefromaregisterorcanbecontainedwithinanimmediatedata2ieldwithintheinstructionitself.

RISC-VArchitectureSummary/Porter Page� of�241 323

Page 242: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

WhethercodeisallowedtoaccessaparticularCSRisdependentonthecurrentprivilegemodeatwhichthecodeisbeingexecuted,aswellaswhichCSRisinvolved.Forexample,ifcoderunninginUserModeattemptstoreadorupdateoneoftheMachineModeCSRs(suchasmstatus),thiswillnotbeallowed.

Ifanattemptismadetoreadand/orwritetoaCSRthatisforbiddenatthecurrentprivilegelevel,thenan“IllegalInstruction”exceptionwilloccur.Theinstructionwillnotcompleteandexceptionprocessingwillensue.

WithinafewoftheCSRs,theprotectionisona2ield-by-2ieldbasis.Inotherwords,someofthebitsmaybeprotectedfromchange(i.e.,read-only),whiletheremainingbitscanbefreelychanged(i.e.,read-write).Forexample,withinthemstatusCSR,itispossibletomodifysomebitsbutnotallowabletomodifyotherbits.(ButforotherCSRs,theentireregisteriseitherwritableornot;thatis,allbitshavethesameprotection.)Anyattempttomodifybitsthatareread-onlywillsimplybeignored.

CSRRead/Write

GeneralForm:CSRRW RegD,Reg1,Immed-12

Example:CSRRW x9,x4,cycle # x9=CSR[0xC00]; CSR[0xC00]=x4

Description:Thisinstructioncanbeusedtoreadfromand/orwritetoaCSR.

TheImmed-122ieldisusedtoencodetheaddressofoneofthe4,096ControlandStatusRegisters(CSRs).ThepreviousvalueoftheCSRiscopiedtothedestinationregisterandthevalueofthesourceregisterReg1iscopiedtotheCSR.Thisisanatomicoperation.

Intheaboveexample,aCSRnamed“cycle”isbeingaccessed.ThisCSRhappenstohavethe12-bitaddress0xC00.Thisregistercontainsacounterofthenumberofclockcyclessinceitwaslastwrittento.Thecounterisautomaticallyincrementedasinstructionsareexecuted.

ToreadaCSRwithoutwritingtoit,thesourceregisterReg1canbespeci2iedasx0.TowriteaCSRwithoutreadingit,thedestinationregisterRegDcanbespeci2iedasx0.

RISC-VArchitectureSummary/Porter Page� of�242 323

Page 243: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Comment:InlowerprivilegemodessomeoftheCSRsareinaccessible.AnattempttoreadfromorwritetoaCSRmaycauseanillegalinstructionexception.

Encoding: ThisisanI-typeinstruction.

CSRReadandSetBits

GeneralForm:CSRRS RegD,Reg1,Immed-12

Example:CSRRS x9,x4,mstatus # x9=CSR[0x300]; CSR[0x300]|=x4

Description:TheImmed-122ieldisusedtoencodetheaddressofoneofthe4,096ControlandStatusRegisters(CSRs).ThepreviousvalueoftheCSRiscopiedtothedestinationregisterandthensomeselectedbitsoftheCSRaresetto1.ThevalueinReg1isusedasabitmasktoselectwhichbitsaretobesetintheCSR.Otherbitsareunchanged.Thisisanatomicoperation.

Intheaboveexample,aCSRnamed“mstatus”isbeingaccessedwhichhasthe12-bitaddress0x300.Thisregistercontainsanumberofbitswhichcontrolwhichcontrolinterruptprocessing.

ThisinstructioncanbeusedtosimplyreadaCSRwithoutupdatingit.IfReg1isx0,thennoupdatetotheCSRwilloccur.

Exception:Maycauseanillegalinstructionexceptionifthecurrentprivilegemodeisnothighenough.

Encoding: ThisisanI-typeinstruction.

CSRReadandClearBits

GeneralForm:CSRRC RegD,Reg1,Immed-12

Example:CSRRC x9,x4,mstatus # x9=CSR[0x300]; CSR[0x300]&=~x4

RISC-VArchitectureSummary/Porter Page� of�243 323

Page 244: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Description:TheImmed-122ieldisusedtoencodetheaddressofoneofthe4,096ControlandStatusRegisters(CSRs).ThepreviousvalueoftheCSRiscopiedtothedestinationregisterandthensomeselectedbitsoftheCSRareclearedto0.ThevalueinReg1isusedasabitmasktoselectwhichbitsaretobeclearedintheCSR.Otherbitsareunchanged.Thisisanatomicoperation.

ThisinstructioncanbeusedtosimplyreadaCSRwithoutupdatingit.IfReg1isx0,thennoupdatetotheCSRwilloccur.

Exception:Maycauseanillegalinstructionexceptionifthecurrentprivilegemodeisnothighenough.

Encoding: ThisisanI-typeinstruction.

CSRRead/WriteImmediate

GeneralForm:CSRRWI RegD,Immed-5,Immed-12

Example:CSRRWI x9,3,mstatus # x9=CSR[0x300]; CSR[0x300] = 3

Description:TheImmed-122ieldisusedtoencodetheaddressofoneofthe4,096ControlandStatusRegisters(CSRs).ThepreviousvalueoftheCSRiscopiedtothedestinationregisterandthentheentireCSRiswrittento.The5-bit2ieldthatisnormallyusedforReg1iszero-extendedandusedasthesourcevaluethatismovedintotheCSR.Thisisanatomicoperation.

Thisinstructionmakesbits[4:0]inanyCSRparticularlyeasytomodify.Exception:

Maycauseanillegalinstructionexceptionifthecurrentprivilegemodeisnothighenough.

Encoding: ThisisanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of�244 323

Page 245: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

CSRReadandSetBitsImmediate

GeneralForm:CSRRSI RegD,Immed-5,Immed-12

Example:CSRRSI x9,3,mstatus # x9=CSR[0x300]; CSR[0x300]|=3

Description:TheImmed-122ieldisusedtoencodetheaddressofoneofthe4,096ControlandStatusRegisters(CSRs).ThepreviousvalueoftheCSRiscopiedtothedestinationregisterandthensomeselectedbitsoftheCSRaresetto1.The5-bit2ieldthatisnormallyusedforReg1iszero-extendedandusedasabitmasktoselectwhichbitsaretobesetintheCSR.Otherbitsareunchanged.Thisisanatomicoperation.

Thisinstructionmakesbits[4:0]inanyCSRparticularlyeasytosetto“1”.Exception:

Maycauseanillegalinstructionexceptionifthecurrentprivilegemodeisnothighenough.

Encoding: ThisisanI-typeinstruction.

CSRReadandClearBitsImmediate

GeneralForm:CSRRCI RegD,Immed-5,Immed-12

Example:CSRRCI x9,3,status # x9=CSR[0x300]; CSR[0x300]&=~3

Description:TheImmed-122ieldisusedtoencodetheaddressofoneofthe4,096ControlandStatusRegisters(CSRs).ThepreviousvalueoftheCSRiscopiedtothedestinationregisterandthensomeselectedbitsoftheCSRareclearedto0.The5-bit2ieldthatisnormallyusedforReg1iszero-extendedandusedasabitmasktoselectwhichbitsaretobeclearedintheCSR.Otherbitsareunchanged.Thisisanatomicoperation.

Thisinstructionmakesbits[4:0]inanyCSRparticularlyeasytoclearto“0”.Exception:

Maycauseanillegalinstructionexceptionifthecurrentprivilegemodeisnothighenough.

RISC-VArchitectureSummary/Porter Page� of�245 323

Page 246: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Encoding: ThisisanI-typeinstruction.

BasicCSRs:CYCLE,TIME,INSTRET

TherearethreeControlandStatusRegisters(CSRs)thatareparticularlyeasytounderstand,solet’sstartwiththem:

cycle–counterofclockcyclestime–currentrealtimeinstret–counterofinstructionsretired(i.e.,executed)

EachoftheseCSRsisautomaticallyincrementedasinstructionsareexecuted.Allarereadableatanyprivilegemode.Togethertheycanbeusedtomeasurecodespeedandperformance.

TheCSR“cycle”countsthenumberofclockcycles.TheCSR“time”measuresrealtimeinunitsof“ticks.”Thenumberoftickspersecondisimplementationdependentandcanbedeterminedelsewhere.TheCSR“instret”countsthenumberofinstructionsretired(i.e.,thenumberofinstructionscompleted).

TheseCSRsareallUserModeregisters,whichmeanstheycanbereadbycodeexecutinginanymode.Theseregistersareread-only,sotheycannotbewrittento.

Wecanresetthesecounters,buttheresetoperationrequireshigherprivilege;UserModecodeshouldnotbeabletowritetothesecounters.Toaccommodateupdatingbutonlyathigherprivileges,theseregistersare“mirrored”byMachineModeregisters.Themirroredversions(calledmcycle,minstret,…)havedifferentnamesanddifferentaddresses.Assuch,theycanonlybewrittentobycoderunninginMachineMode.

Toavoidpossibleover2low,thesecountersneedtobe64-bits.For64-bitand128-bitmachines,eachofthethreeCSRswillbelargeenoughtoavoidover2low.

For32-bitmachines,theRV32speci2icationbreakseachcounterintotwo32-bitCSRs.Inotherwords,theRV32speci2icationintroduces3additionalregisterstocontainthehigh-order32-bits:

RISC-VArchitectureSummary/Porter Page� of�246 323

Page 247: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

cycle–clockcycles(lower32bits)cycleh–clockcycles(higher32bits)time–realtime(lower32bits)timeh–realtime(higher32bits)instret–instructionsretired(lower32bits)instreth–instructionsretired(higher32bits)

Thefollowinginstructionsareshortformsofotherinstructions.Theyaredesignedtomakereadingtheabove-mentionedcountersespeciallyeasy.

Read“CYCLE”

GeneralForm:RDCYCLE RegD

Example:RDCYCLE x9 # x9=CSR[cycle]

Encoding: Thisisasimpli2iedformofamoregeneralinstruction.

Read“CYCLEH”

GeneralForm:RDCYCLEH RegD

Example:RDCYCLEH x9 # x9=CSR[cycleh]

Comment: Onlyon32-bitmachines.Encoding: Thisisasimpli2iedformofamoregeneralinstruction.

Read“TIME”

GeneralForm:RDTIME RegD

Example:RDTIME x9 # x9=CSR[time]

RISC-VArchitectureSummary/Porter Page� of�247 323

Page 248: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Encoding: Thisisasimpli2iedformofamoregeneralinstruction.

Read“TIMEH”

GeneralForm:RDTIMEH RegD

Example:RDTIMEH x9 # x9=CSR[timeh]

Comment: Onlyon32-bitmachines.Encoding: Thisisasimpli2iedformofamoregeneralinstruction.

Read“INSTRET”

GeneralForm:RDINSTRET RegD

Example:RDINSTRET x9 # x9=CSR[instret]

Encoding: Thisisasimpli2iedformofamoregeneralinstruction.

Read“INSTRETH”

GeneralForm:RDINSTRETH RegD

Example:RDINSTRETH x9 # x9=CSR[instreth]

Comment: Onlyon32-bitmachines.Encoding: Thisisasimpli2iedformofamoregeneralinstruction.

ExampleCode:Thefollowingsequencecanbeusedtocorrectlyreada64-bitcounterona32-bitmachinewherethecounterisbrokenintotwo32-bitpieces.

RISC-VArchitectureSummary/Porter Page� of�248 323

Page 249: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Thereisapossibilitythat,betweenreadingthetwohalves,over2lowfromthelower-order32bitsintotheupperbitsmightoccur.

again: rdcycleh x3 # Read high bits rdcycle x2 # Read low bits rdcycleh x4 # Check for overflow bne x3,x4,again # from low to high

CSRRegisterMirroring

Atanymomenttheprocessorisexecutinginsomemode,i.e.,ineitherUser(U),Supervisor(S),orMachine(M)Mode.

TheRISC-Vspecde2inesafewdozenControlandStatusRegisters(CSRs)andtheyarebrokeninto3groups.Thereisonegroupforeachmode,soeachCSRbelongstoeitherUserMode,SupervisorMode,orMachineMode.

WhenrunninginMachineModeallCSRsareaccessible,regardlessofwhichgrouptheybelongto,sinceMachineModeisthehighestprivilegelevel.InSupervisorMode,onlytheSupervisorandUserCSRscanbeaccessed.Finally,whenrunninginUserMode,onlyUserCSRsareaccessible.

Tostatethesamethinganotherway,aMachineCSRcanonlybereadorwrittenwhenrunninginMachineMode.ASupervisorCSRcanbewrittenwhenrunningineitherMachineModeorSupervisorMode.Finally,aUserCSRcanbereadorwrittenregardlessofthecurrentprivilegemode.

EachCSRhasaname.

Inmanycases,the2irstletteroftheregisternameindicateswhichgroupitisin(either“u”,“s”,or“m”).

Examplesthatfollowthe2irst-letterconventionaremstatusandmisa(accessibleonlyinMachineMode),sstatusandsatp(accessibleinSupervisorandMachineModes),andustatusanducause(accessibleinUser,Supervisor,andMachine

RISC-VArchitectureSummary/Porter Page� of�249 323

Page 250: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Modes).Counterexamplesarecycle,time,andinstretwhichoughttobeginwith“u”sincetheyareaccessibleinUser,Supervisor,andMachineModes.

AparticularimplementationmayaddnewCSRsinadditiontothosedescribedintheRISC-Vstandard.EachCSRregisterhasanaddressandtheaddressesare12bitswide,allowingforupto4,096CSRs.

Details:TheaddressspaceforCSRsusesa12-bitaddress,allowingfor4,096differentCSRregisters.Thisaddressspaceispartitionedinto4regions,withupto1,024registerseach.Thereisoneblockofregistersforeachmode(U,S,andM)alongwithanextra“reserved”blockwhichwasformerlyusedforHypervisorMode.HypervisorModehasbeeneliminated.

Furthermore,eachregisteralsofallsintooneofthefollowingaccessrestrictioncategories:

registersde2inedbytheRISC-Vstandard •read-only •read-write extensions,notde2inedbythestandard •read-only •read-write registersusedfordebugmode •standard •non-standardextensions

ThisallowsanimplementationoftheRISC-Vspectoaddnew,non-standardCSRswithoutfearthattheiraddresseswillbeoverloadedinfuturerevisionstothestandard.

Themode(U,S,orM)andtheaccessrestrictioncategorycanbedeterminedsolelyfrombitsintheCSRaddress.Thus,thehardwarecaneasilycheckwhetheragiventypeofaccess(e.g.,aread-writeaccessfromcoderunninginUserMode)isallowed,simplybylookingattheaddress.TheencodingofbitsintheCSRaddressisratherconvoluted;consulttheof2icialdocumentationfordetails.

Someoftheregistersaremirrored,whichmeansthesameconceptualregisterisavailableattwoormoreaddresses.Forexample,thecycleregister,whichcontainsa

RISC-VArchitectureSummary/Porter Page� of�250 323

Page 251: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

counterofthecurrentnumberofclockcycles,ismirroredinbothUserModeandMachineMode.AmirroredCSRwillhavetwonamesandtwoaddresses:

Address RegisterName 0xC00 cycle(read-only) 0xB00 mcycle(read-write)

Thisgroupingandmirroringtechniquehasacoupleofadvantages.

Oneadvantageofmirroringtheregistersisthatasingleregistercanberead-onlyatoneprivilegelevelandyetwritableatahigherprivilegelevel.

Forexample,the“cycle”register(whichcountshowmanyclockcycleshaveoccurredsincelastbeingreset)isconceptuallyasingleregisterandwouldcertainlybeimplementedassuch.Theregistershouldbevisibleandreadabletoallcode,buttheresetoperation(whichwritestotheregister)shouldonlybeallowedwhenrunningatahigherprivilegelevel.Thus,wewantthe“cycle”registertoberead-onlyforUserModecode,butmodi2iablebyMachineModecode.

Toachievethis,theregisteris“mirrored”.Thereisaread-onlyCSRnamedcyclewhichcanbeaccessedatanyprivilegelevel,andthereisadifferentread-writeCSRnamedmcyclewhichcanonlybeaccessedwhenrunningattheMachineModeprivilegelevel.Actually,thereisonlyasingle,underlying“mirrored”register,soanyvaluewrittentomcyclewillbeimmediatelyvisiblewhenreadingfromcycle.

Accesstoregistersatahigherprivilegelevelthanthecurrentmodeisalwaysforbidden.Thisallowsanoperatingsystemtohideinformationfromuser-levelcode.Forexample,itmightbeusedinahypervisortohideinformationsothatanoperatingsystemcanbetrulyfooledaboutwhatsortofprocessoritisrunningon.

ThisgroupingmechanismisauniformapproachtopreventinglowerprivilegecodefrommodifyingCSRsthatmustbeprotected.Forexample,user-levelcodemustbepreventedfrommodifyingthevirtualmemorypagingschemeandthishappensnaturallybecausetheCSRassociatedwithpagetablesisnotaCSRthatisaccessibleinUserMode.

Someregisterscontainanumberofbit2ieldsandthese2ieldsneedtohavedifferentaccessibility.Forexample,inthemachinestatusregister,some2ieldsshouldberead-onlywhileother2ieldsareupdatable.Thisissupportedinthefollowingway.An

RISC-VArchitectureSummary/Porter Page� of�251 323

Page 252: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

attempttoupdatea2ieldthatisread-onlywillbeignored,eventhoughother2ieldsinthesameCSRcanbeupdated.

The“status”registerisusedindeterminingwhetherinterruptsareenabledandwhatvirtualmemorymechanismiscurrentlyineffect,amongotherthings.Thisregisterismirroredatallprivilegelevels,usingadifferentCSRateachlevel.

Address RegisterName 0x000 ustatus 0x100 sstatus 0x300 mstatus

Withinthestatusregister,some2ieldsareinvisiblewhenrunningatlowerprivilegelevels.Thisissupportedbygivingtheregisteradifferentnameforeachmode.Somebitsthatarede2inedinthemstatusregisterwillbereadsimplyaszerosinthesstatusregister,tore2lectthefactthatthesebitsmustbeaccessibleinMachineModebutnotinSupervisorMode.

AnOverviewofImportantCSRs

InthissectionwedescribethemostimportantControlandStatusRegisters(CSRs).Thisdescriptiongoeshand-in-handwithdescribingfeaturesofaRISC-VpricessornotcoveredelsewhereandofconcernonlytoprogrammersofOSandkernelcode.

Recallthatthe2irstletteroftheCSRregisternamewilltypicallybem,s,orutoindicatewhichprivilegelevelisrequiredinordertoaccessthisregister.Forexample,the“misa”registercanonlybereadwhenrunninginMachineMode,whilethe“ustatus”registermaybeaccessedfromanymode.

misa–MachineInstructionSetArchitecture(ISA)

ThisCSRgivesinformationaboutthebasicarchitectureofthemachine.Ittellstheregisterwidth(32,64,or128)andindividualbitsinthisCSRalsoindicatewhichofthevariousoptionsandextensionsdetailedbytheRISC-Vspeci2icationhavebeenimplemented.ThisregisterencodeswhetherthismachineisanRV32IM,anRV64IMAFDQ,orsomeothervariantoftheRISC-Varchitecture.

RISC-VArchitectureSummary/Porter Page� of�252 323

Page 253: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Theregisterwidthofthemachine(either32,64,or128)isencodedinthemostsigni2icanttwobitsofthisCSR.Thisisclever,sinceitmakesitpossibleforthesamecodetobeexecutedondifferentmachineswithdifferentregisterwidths.Usingonlyinstructionstotestthesignoftheregisterandshifttheregisterleft,thecodecandeterminewhattheregistersizeis,andthenbranchaccordinglytodifferentcodeblocksforeachofthethreepossibilities.

Thisregisterisread-write.

Somemachinesmaysupportmultipleregisterwidths.Forexample,anRV64machinemaybecapableofrunningas(i.e.,emulating)anRV32machine.Uponpower-onorreset,themisaregisterwillbesettoindicatethewidestregisterwidththecoreiscapableofimplementing.Softwarecansetthisregistertoeffectivelyturn(forexample)anRV64machineintoanRV32machine.

Thelower-order26bitscorrespondtothelettersA,B,…Z(“A”=bit0,“B”=bit1,etc.)Eachbitwillbesettoindicatewhetherthisimplementationsupportsthecorrespondingextension.Forexample,bit5willbesetifthecoresupportsthe“F”singleprecision2loatingpointextension.

mvendorid–MachineVendorID

Forcommercialimplementations,thisCSRidenti2iesbynumberthevendor/manufacturer/organizationthathasproducedthischip.ThenumberusedhereistheIDissuedbyasemiconductorengineeringtradeorganizationcalledJEDEC.Forresearchandnon-commercialimplementations,thisregisterwillcontainzero.

Thisregisterisread-only.

marchid–MachinearchitectureID

ThisCSRidenti2iestheparticulararchitectureofthepartandisessentiallythe“partnumber”or“modelnumber”.Forcommercialimplementations,thisnumberisassignedbythevendor.Forsomenon-commercialoropen-sourceprojects,anumbermaybeassignedbytheRISC-VFoundation.Otherwise,thisregisterwillcontainzero.

Thisregisterisread-only.

RISC-VArchitectureSummary/Porter Page� of�253 323

Page 254: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

mimpid–MachineImplementationID

Givenaparticularvendor(asidenti2iedinmvendorid)andapart/modelnumber(asidenti2iedinmarchid),theremaybeseveralversions.Thisnumberidenti2iestheparticularimplementationorversionoftheprocessor.Itmaybezero.

Thisregisterisread-only.

cycle–CycleCounter(read-only)mcycle–CycleCounter(writable)

Thecycleregisterisaccessibleread-onlyinallmodes.Itcountshardwareclockcycles.ThecountercanberesetbywritingtothemcycleCSR,whichisonlyaccessibleinMachineMode.

instret–InstructionCounter(read-only)minstret–InstructionCounter(writable)

Theinstretregisterisaccessibleread-onlyinallmodes.Itcountsthenumberofinstructionsexecuted(ormoreprecisely,thenumberofinstructionscompleted“instructionsretired”).ThecountercanberesetbywritingtotheminstretCSR,whichisonlyallowableinMachineMode.

time–CurrentTime(read-only)

Thecurrentrealtimeinticks.Seethecommentsformtime.Thisregisterisashadowofmtime.ThetimeCSRisaccessibleatallprivilegelevels.

Details:Theseregistersallrequire64bits,whichisproblematiconRV32machines.Todealwith32-bitmachines,thereareadditionalregisters(whosenamesendin“h”)tocontaintheupper32bits.

loworderbits highorder32bits(RV32only) cycle cycleh mcycle mcycleh instret instreth minstret minstreth time timeh

RISC-VArchitectureSummary/Porter Page� of�254 323

Page 255: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

mtime–CurrentTimemtimecmp–TimeofNextInterrupt

Themtimeregisterprovidesthecurrentrealtimeasacountof“ticks”.Themeaningof“tick”(i.e.,thetick-rateofhowmanytickspersecondandthevalueoftime-zero,atwhichthecounterbegancounting)isdeterminedelsewhere.

Themtimecmpregisterisusedtotriggerthenexttimerinterrupt.Whenthemtimeregisterequals(orexceeds)thevalueinmtimecmp,atimerinterruptwillbetriggered.

TechnicallymtimeandmtimecmparenotControlandStatusRegisters(CSRs),buttheyarelistedheresincetheyaresimilar.Each“register”isa64bitcounterwhichisactuallyimplementedviamemory-mapping,unliketheCSRs.

Commentary:Puttingareal-timeclockinsidethecoreisnotpracticalsinceacorecanberunatdifferentfrequenciesatdifferenttimes(e.g.,under-clockingtoreducepowerconsumptionorover-clockingto…uh…causeittomalfunction).TypicallythereareseveralcoresonachipandtherealtimeclockisessentiallyanI/Odevicesharedbyallcores,providingconsistenttimevaluestoallcores.Thus,therealtimeclockisaccessedbytheuseofmemory-mappedaccesses,likeotherI/Odevices.

Typicallyprocessorchipsareclockedbyimpreciseclockcircuitry.Insystemsthataimtoprovideaccuraterealtimeclocks,theclockisnotimplementedonthesamechipastheprocessorcores.

Presumably,therealtimeclockhasonemtimecmpregisterpercore,allowingeachcoretodeterminewhenitsnexttimerinterruptwillhappen.

Commentary:Themtimecmpregisteris64bitsbutiftheRISC-Vcoreisonly32bits,thentherecouldbeaproblemsettingit.Itmustbeupdatedusingtwostore-word(SW)instructions,eachstoring32bits.Iftheprogrammeriscareless,the2irstloadmighttriggeraspurioustimerinterrupt.TheRISC-Vdocumentationprovidesthiscodetoavoidtheproblem:

RISC-VArchitectureSummary/Porter Page� of�255 323

Page 256: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

# The new value for mtimecmp is in registers x11:x10li x5,-1 # 0xffffffff for lower 32 bitssw x5,mtimecmp # The 64-bit value is ≥ old valuesw x11,mtimecmp+4 # The 64-bit value is ≥ new valuesw x10,mtimecmp

mhartid–MachineHardwareThread(HART)ID

RISC-Vdocumentationde2inesthetermHARTtomean“HardwareThread”.Thisregisterdoesnotre2lectahigherlevel(e.g.,operatingsystem)conceptofthread.

Inasingle-coresystemwithasingle,simpleFETCH-DECODE-EXECUTEpipeline,thereonlyoneHART.

Inamulti-coresystem,whereeachcorewillexecuteasingle2low-of-control,eachcorewillhaveitsownHART.Eachcore’sHARTwillexecuteconcurrentlywiththeothercores’HARTs.ThisCSRidenti2ieswhichcoreisexecuting.

Itmaybeimportanttoidentifyonethreadasa“masterthread”.OneHARTmustbegivenanIDofzero.

Thenumberofhardwarethreadsis2ixedbuttheapplicationsoftwarewillneedanunpredictableandchangingnumberofthreads.TheOSwillmaptraditionalOSthreadsontotheavailablehardwarethreads.

Someadvancedsuperscalarcoresmayimplement“hardwaremultithreading,”inwhichasinglecoreiscapableofexecutingmorethanaoneindependent2low-of-controlatatime.Inotherwords,multi-threadingisperformeddirectlyinhardware.Forexample,asinglecoremightbeabletoexecutetwohardwarethreadsatatime.Theadvantageofhardwaremultithreadingisthatthecomputationalresourcesofthecore(e.g.,multipliers,adders,etc.)canbeusedmoreef2iciently.Insuchasystem,eachhardwarethreadwouldbeidenti2iedwithauniqueHARTandeachcorewouldbeexecutingmorethanoneHARTsimultaneously.

Thisregisterisread-only.

RISC-VArchitectureSummary/Porter Page� of�256 323

Page 257: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

mstatus–MachineStatusRegistersstatus–SupervisorStatusRegisterustatus–UserStatusRegister

Conceptually,theprocessor“statusregister”isasingleregister,butitismirroredatallprivilegelevels.

Bymodifyingthisregister,thesoftwarecandothingslikeenable/disableinterruptsandchangetheISA.(Bysettingbitsinthestatusregister,onecancausetheprocessortoexecuteasifitwerea32bitRV32machine,eventhoughitisactuallya64bitRV64machine.)

Understandingsomeofthe2ieldinthisCSRrequiresunderstandinghowexceptionsareprocessedandtraphandlersareinvoked.

ThisCSRiscomplexandcontainsalargenumberof2ields.Assuch,wedelayfurtherdiscussionuntillater.

mtvec–MachineTrapVectorBaseAddressstvec–SupervisorTrapVectorBaseAddressutvec–UserTrapVectorBaseAddress

Whenatrapoccurs(asaresultofasynchronousexceptionorasynchronousinterrupt),ajumpwillbetakendirectlytothetraphandlerroutine.

Thereisasingletraphandlerforeachprivilegelevel.Theseregisterscontaintheaddressofthesethreetraphandlers.Inotherwords,whenanexceptionoccurs(andistobehandled,notignored),theprogramcounter(PC)issettothevalueinthisCSR,causingajumptothe2irstinstructionofthetraphandlercode.

Thenextsectiongivesadditionaldetail.

ExceptionProcessingandInvokingaTrapHandler

ExceptionsalwaystraptotheMachineModetraphandler2irst,regardlessoftheprivilegemodeatthetimetheexceptionoccurs.TheMachineModetraphandlerwillexecuteinMachineModeand(presumably)endwithanMRETinstruction,whichwillreturntothecodethatwasinterrupted.Theinterruptedcodemayhave

RISC-VArchitectureSummary/Porter Page� of�257 323

Page 258: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

beenrunninginUser,Supervisor,orMachineMode;theMRETinstructionwillrestoretheprivilegeleveltowhateveritwaswhentheexceptionoccurred.

However,wemaynotwanttohandletheexceptioninMachineMode;wemightwanttohandleitinSupervisorModeorevenUserMode.Assuch,thereisafacilityto“delegate”someorallexceptionstothelowerprivilegelevels.

Forexample,ifaparticulartypeofexceptionshouldbehandledbytheSupervisorModetraphandler,thenthisexceptionwillbedelegatedfromMachineModedowntoSupervisorMode.Thetraphandleraddresswillbedeterminedfromthestvecregister(notmtvec);thehandlerwillruninSupervisorMode;andthehandlerwillendwithanSRETinstruction.

Likewise,ifaparticulartypeofexceptionoughttobedealtwithinUserMode,thenanyexceptionofthistypewillbefurtherdelegatedfromSupervisortoUserMode.Thetraphandleraddresswillbedeterminedfromtheutvecregister;thehandlerwillruninUserMode;andthehandlerwillendwithanURETinstruction.

WhetheranexceptionistobedelegatedfromMachineModetoSupervisorMode,ordownfurthertoUserMode,isdeterminedbythesettingsofvariousbitsinotherCSRs.Wedescribethedelegation(“deleg”)CSRselsewhere.

Thereisonlyasingletraphandlerateachofthethreeprivilegelevels(atleastinthebasespeci2ication).Themtvec,stvec,andutvecseCSRsarenormallywritableandOSsoftwarewillstoretheaddressofthetraphandlerintoaCSRbeforetheexceptionoccurs.

Oncetheexceptionoccursandthehandlerisinvoked,itscodemustexamineotherCSRstodeterminethenatureoftheexception,i.e.,whichexceptioncausedthetraphandlertobeinvoked.

(Thetraphandleraddressesmaybehardwired,inwhichcasetheseCSRsareread-only.Itisalsoallowablefortheretobeimplementation-dependentrestrictionsonwhichvaluesthemtvec/stvec/utvecCSRscancontain.)

ImplementationsarefreetoaddadditionaltraphandlersasextensionstotheRISC-Vspec.Inparticular,itmaymakesensefortheprocessortohaveadifferenttraphandlerforeachkindofexception.Thiscanmakehandlingtrapsfaster,sinceiteliminatestheneedforsoftwaretodeterminewhichtypeofexceptioncausedthetrapandjumpconditionally.

RISC-VArchitectureSummary/Porter Page� of�258 323

Page 259: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Thetraphandler(orhandlers,iftherearemorethanone)mustbeginonword-alignedaddresses.Thesemeansthatanyaddressstoredinthemtvec/stvec/utvecCSRsmusthave“00”astheleastsigni2icanttwobits.TheRISC-Vspecmakesuseofthesetwobitsasfollows.

Ifthelasttwobitsare“00”,thenitmeanstheCSRcontainstheaddressofasingletraphandleranditisuptothecodeinthehandlertodeterminewhichtypeofexceptionhasoccurred.

Ifthelasttwobitsare“01”,thenitmeansthereisacollectionoftraphandlers,oneforeachtypeofasynchronousinterrupt.Thisisgoodsinceweoftenwanttohandleasynchronousinterruptsreallyquickly.Moreparticularly,thereisajumptable(i.e.,anarrayofwords,whereeachwordcontainsaJUMPinstruction)andthemtvec/stvec/utvecCSRcontainstheaddressofthisjumptable.

[Details:Whenanasynchronousinterruptoccurs,theCSRisconsultedtolocatethejumptable.Theinterrupttypeisconsultedtodeterminewhichtableentryistobeused(i.e.,anindexintothearray).Thenajumpismadetotheassociatedentry,byloadingthePCwiththecomputedaddress.Presumably,eacharrayelementcontainsajumpinstructiontothe2irstinstructionofthehandler.Althoughtherecanbeadifferenttraphandlerforeachkindofasynchronousinterrupt(“timerinterrupt”,“externalinterrupt”,…),therewillonlybeonehandlerforallsynchronousexceptions(“illegalinstruction”,“addressmisaligned”,…).The2irstelementinthejumptable(index=0)willbeatraphandlerforallsynchronousexceptions,aswellasfora“user-modesoftwareinterrupt”.]

Theremainingbitpatterns“10”and“11”arenotused.

TheNon-MaskableInterrupt:HardwareFailure

Sometrapsare“maskable”andothersare“non-maskable”.Amaskableinterruptcaneitherbehandled,orcanbeignored,orcanbepassedfromahigherprivilegeleveltoalowerprivilegelevel.

Anon-maskableinterrupt(NMI)willbehandledimmediatelyandcannotbeignored.Onlyoneeventcancauseanon-maskableinterrupt:

•Ahardwarefailureisdetected(e.g.,byerrorcheckingcircuitry)

RISC-VArchitectureSummary/Porter Page� of�259 323

Page 260: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

IntheeventofahardwareerrorNMI,thefollowinghappens:

•TheprivilegemodeissettoMachineMode.•ThepreviousPCwillbesavedinthemepcregister.•ThePCissettosome2ixed,predetermined,implementationdependentvalue.•Themcauseregisterissettoindicatethenatureofthefailure.

Nootherprocessorstateisaltered,whichmaybehelpfulindiagnosingtheproblem.

ResetProcessing

Thefollowingeventsarenotconsideredtobeexceptionsandarenothandledastraps:

•Power-on-reset,triggeredwhenthesystemisturnedon•Pressingaphysicalrestart/resetbutton•Awatchdogtimertimesout•Powerlevelsdropbelowspecs,triggeringa“brownout”event•Alow-powersleepstateisendedandtheprocessor“wakesup”

Intheaboveevents,thefollowinghappens:

•TheprivilegemodeissettoMachineMode•TheMIE(Interruptenable)bitinthestatuswordissetto0.•TheMPRVbitissetto0,whichturnsoffanyaddresstranslation.•Themcauseregisterissettoindicatewhicheventhasoccurred.•ThePCissettosome2ixed,predetermined,implementationdependentvalue.

Theremainingprocessorstateisunde2ined.

Power-On-Reset:Whenpoweris2irstappliedtoaprocessor,theelectronicswilltypicallycausea“power-on-resetinterrupt”tooccur.Typically,thisinitialprocessingsequencewillforceajumptoaknown,predeterminedaddressinsomeread-onlyportionofmemorybyloadingtheProgramCounter(PC)withsomeknownvalue.Theeffectistoforceajumptotheinitialstartupbootstrapprogram.

WatchdogTimer:A“watchdogtimer”consistsofanindependentelectroniccircuitthatisusedtorestartacomputerthathasexperiencedacatastrophicsoftwarefailure,e.g.,gonebrain-deadinanin2initeloop.

RISC-VArchitectureSummary/Porter Page� of�260 323

Page 261: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Thistechniquecanallowacomputertorecoverfromotherwisefatalmalfunctionssuchasprogrambugsandtransienthardwareerrors.Thewatchdogtimertechniquehassavedspaceprobesthathavehadfailuresin2light:theprobemaygosilentforseveraldaysbutthenwakesup,performsafullreset,andcontinuestocompletethemission.

Awatchdogtimerwillcauseanon-maskableinterruptafteracertainamountoftimehaselapsedwithoutthetimerbeingreset.Resettingthetimeriscalled“feedingthedog”.Whensoftwareisworkingnormally,theassumptionisthatthetimerwillberesetatregularintervals,thusfeedingthedog.Butifsomethinggoeswrongandthetimerisnotreset,theinterruptwilloccur.

ExceptionDelegation

Wenextlookatthemechanismthatallowsexceptionstobemaskedand/ordelegatedtolowerprivilegelevels.

Bydefault,everyexception(whethersynchronousorasynchronous)ishandledbyatraphandlerrunninginMachineMode.Ifthekernelprogrammerwantsthetraptobehandledatalowerprivilegelevel,thenonepossibilityisfortheprogrammertowritecodetoswitchthemodeandthenpasscontrolbacktothelowerprivilegelevel.

However,thistechniqueofsoftwaredelegationisnotveryef2icientanditisfastertohavesomeexceptionsautomaticallytraptoahandlerrunningatthelowerprivilegelevel.RISC-Vsupportsdelegationoftrapsbythehardware,asonepossibledesignoption.Ifthisoptionispresentintheimplementation,thentheexceptionwillbypasstheMachineModetraphandlerandgodirectlytoatraphandlerrunningatalowerlevel.

Thisiscontrolledbythefollowing4delegationCSRs:

RISC-VArchitectureSummary/Porter Page� of�261 323

Page 262: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

medeleg–MachineExceptionDelegationRegistersedeleg–SupervisorExceptionDelegationRegister andmideleg–MachineInterruptDelegationRegistersideleg–SupervisorInterruptDelegationRegister

Weareusingthisterminology:

“exception”=synchronousexception(e.g.,illegalinstruction) “interrupt”=asynchronousexception(e.g.,timerinterrupt)

IfthebitcorrespondingtotheexceptiontypeintheMachineModedelegationregisters(medelegormideleg)isset,thentheMachineModetraphandlerwillnotbeinvoked.Instead,thetrapwillbeimmediatelydelegatedtothenextlowestprivilegelevel,i.e.,toSupervisorMode.

Next,theappropriatebitintheSupervisorModedelegationregisters(sedelegandsideleg)willbechecked.Ifset,thetrapwillbefurtherdelegatedtoUserMode.

Thetrapcannotbefurtherdelegated,sotherearenoUserModedelegationCSRs.

TrapsarealwayshandledinMachineMode,unlessdelegatedtoalowerlevel.The“deleg”CSRstellwhichexceptionsandinterruptsaredelegatedtowhichprivilegelevel.

Forexample,thehandlingofsomeparticulartrapmightbedelegatedfromMachineModetoSupervisorModecode.ThetrapmightbefurtherdelegatedtoatraphandlerrunninginUserMode.

Theseregisterscanbewrittento.Theycontrolwhetherindividualexceptionsandinterruptswillbedelegated,i.e.,processedbyatraphandlerrunningatalowerprivilegelevel,orwillbeprocessedatthehigherlevel.

ThemedelegregistertellswhetheransynchronousexceptionwillbehandledbytheMachineModetraphandler(whoseaddressisinthemtvecCSR),orwillbedelegatedtothenextlowerlevel.IfSupervisorModeisimplemented(insomechipsitmaynotbe),thenthesedelegregistertellswhethertheexceptionwillbehandledinSupervisorModeorfurtherdelegatedtoUserMode.

RISC-VArchitectureSummary/Porter Page� of�262 323

Page 263: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Thereisabitineach“exception”delegationregister(i.e.,medelegandsedeleg)foreachtypeofsynchronousexception,asfollows:

Bit Description0 instructionaddressmisaligned1 instructionaccessfault2 illegalinstruction3 breakpoint4 loadaddressmisaligned5 loadaccessfault6 store/atomicmemoryoperationmisaligned7 store/atomicmemoryoperationaccessfault8 environmentcallfromUMode9 environmentcallfromSMode10 (previouslyusedforHypervisorMode)11 environmentcallfromMMode

Ifthebitissetto1,thentheexceptionisdelegated,i.e.,handledbythenextlowerlevel.Ifthebitis0,thentheexceptionishandledatthislevel.

ThemidelegregistertellswhetheranasynchronousinterruptwillbehandledbytheMachineModetraphandler(whoseaddressisinthemtvecCSR),orwillbedelegatedtothenextlowerlevel.IfSupervisorModeisimplemented(insomechipsitmaynotbe),thenthesidelegregistertellswhethertheinterruptwillbehandledinSupervisorModeorfurtherdelegatedtoUserMode.

Thereisabitineach“interrupt”delegationregister(i.e.,midelegandsideleg)foreachtypeofasynchronousinterrupt,asfollows:

RISC-VArchitectureSummary/Porter Page� of�263 323

Page 264: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Bit Description0 USIP–usersoftwareinterrupt1 SSIP–supervisorsoftwareinterrupt2 (previouslyusedforHypervisorMode)3 MSIP–machinesoftwareinterrupt4 UTIP–usertimerinterrupt5 STIP–supervisortimerinterrupt6 (previouslyusedforHypervisorMode)7 MTIP–machinetimerinterrupt8 UEIP–userexternalinterrupt9 SEIP–supervisorexternalinterrupt10 (previouslyusedforHypervisorMode)11 MEIP–machineexternalinterrupt

Ifthebitissetto1,thentheinterruptisdelegated,i.e.,handledbythenextlowerlevel.Ifthebitis0,thentheexceptionishandledatthislevel.

mie–MachineModeInterruptEnablemip–MachineModeInterruptPending

Therearethreesourcesofinterrupts(i.e.,asynchronousexceptions):

•Softwareinterrupt •Timerinterrupt •ExternalInterrupt

SoftwareinterruptsareusedforoneHARTtosignalanotherHART.

Timerinterruptsoccurwhenthereal-timeclock(mtime)reachesapresetlimitvalue(mtimecmp).

Externalinterruptsareforallotherdevicestointerruptaprocessor.

Themieregistercontainsabitforeachtypeofasynchronousinterrupt,asfollows:

RISC-VArchitectureSummary/Porter Page� of�264 323

Page 265: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Bit Description0 USIE–usersoftwareinterruptenable1 SSIE–supervisor…3 MSIE–machine…

4 UTIE–usertimerinterruptenable5 STIE–supervisor…7 MTIE–machine…

8 UEIE–userexternalinterruptenable9 SEIE–supervisor…11 MEIE–machine…

Likewisemipregistercontainsabitforeachtypeofasynchronousinterrupt,withanalogousnames:

Bit Description0 USIP–usersoftwareinterruptpending1 SSIP–supervisor…3 MSIP–machine…

4 UTIP–usertimerinterruptpending5 STIP–supervisor…7 MTIP–machine…

8 UEIP–userexternalinterruptpending9 SEIP–supervisor…11 MEIP–machine…

(Bits2,6,and10wereusedforHypervisorModeandarenow“reserved”.Allremainingbitsareunused.)

Settingabitto1inthemip(interruptpending)registerindicatesthatthecorrespondinginterrupthasoccurredandshouldcauseatrapatsomepointinthefuture.Ifthesamebitisalsosetto1inthemie(interruptenable)register,thenthetrapprocessingwilloccurimmediately.

Inadditiontothependingbitsdiscussedhere,interruptsmayalsobegloballyenabledordisabled.SeetheMIE(interruptenable)bitinthemstatusregister.

RISC-VArchitectureSummary/Porter Page� of�265 323

Page 266: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

NotethatthereisabitinthestatusregistercalledMIEandaCSRwiththesamename:mie.Likewise,thereisanSIEbitinthestatusregister,aswellasaCSRcalledsie.

Moreprecisely,aninterruptwillbetaken(i.e.,thetraphandlerwillbeinvoked)ifandonlyifthecorrespondingbitinbothmieandmipregistersissetto1,andifinterruptsaregloballyenabled.

Thetimerinterruptpendingbit(MTIP)issetbyhardwarewhenthetimerexpires,i.e.,whenthecurrenttime(mtime)reachesorexceedsthetimerlimitregister(mtimecmp).

Thetimerinterruptpendingbit(MTIP)isresetbywritingtothetimercomparisonregister.Afterwritingtothetimercomparisonregister,thetimerinterruptwillnolongerbepending.Softwarecannotmodifythefollowinginterruptpendingbits;theyareread-onlyandset/clearedbyothercircuitry:

3 MSIP–machinesoftwareinterruptpending7 MTIP–machinetimerinterruptpending11 MEIP–machineexternalinterruptpending

However,softwarerunninginMachineModecanmodifythesebits:

0 USIP–usersoftwareinterruptpending1 SSIP–supervisorsoftwareinterruptpending4 UTIP–usertimerinterruptpending5 STIP–supervisortimerinterruptpending8 UEIP–userexternalinterruptpending9 SEIP–supervisorexternalinterruptpending

Ifoneofthesebitsissetandthatinterrupttypehasbeendelegatedtoalowerlevel(seethemedelegandmidelegregisters),thentheinterruptwillbesignaledatthelowerprivilegelevel.

RISC-VArchitectureSummary/Porter Page� of�266 323

Page 267: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

sip–SupervisorModeInterruptPendinguip–UserModeInterruptPending

sie–SupervisorModeInterruptEnableuie–UserModeInterruptEnable

Eachofthelowerprivilegelevelshasitsowninterruptpendingregisteranditsowninterruptenableregister.

Normallyinterruptsareprocessedatthehighestprivilegelevel.Forexample,ifatimerinterruptoccurswhilerunninginUserMode,thecorewillenterMachineModeandinvoketheMachineModetraphandler.Whencomplete,theMachineModetraphandlerwillreturntoexecutinginUserMode.

However,interruptscanbedelegatedfromahigherleveltoalowerlevel.Forexample,thetimerinterruptmightbedelegatedfromMachineModetoSupervisorMode.SoiftheinterruptoccursinUserMode,theSupervisorModetraphandlerwillbeinvokedandMachineModewillneverbeentered.

Ifaparticularinterrupttype(suchasthetimerinterrupt)hasbeendelegated(forexamplefromMachineModetoSupervisorMode),thenthecorrespondinginterruptpendingbitisshadowedfromtheMachineModeinterruptpending(mip)register.

Exactlywhatthismeansisunclear.PresumablytheMTIPbitinmipshadowedastheSTIPbit(andnottheMTIPbit)insip???

WhethertheinterruptcausesatrapinSupervisorModeornotisdeterminedbywhetheritisenabledintheSupervisorModesieregister,aswellaswhetherglobalinterruptsareenabledinSupervisorMode.

RISC-VArchitectureSummary/Porter Page� of�267 323

Page 268: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

TheStatusRegister

mstatus–MachineStatusRegistersstatus–SupervisorStatusRegisterustatus–UserStatusRegister

Conceptually,theprocessorstatusregisterisasingleregister,buttheregisterismirroredatthelowerprivilegelevels,sincesome2ieldswithinthestatusregistermustbeprotecteddifferentlyatdifferentprivilegelevels.

ThisCSRcontainsanumberof2ieldsthatcanbereadandupdated.Bymodifyingthese2ields,thesoftwarecandothingslikeenable/disableinterruptsandchangethevirtualmemorymodel.Forexample,bywritingtothisCSR,thesoftwarecanturnonvirtualmemoryandpage-tabletranslation.

Nextweshowthelayoutofthestatusregister,followedbyalistingoftheindividualbit2ields.

Asmentioned,thestatusregisterismirroredinthedifferentmodes.Some2ieldsarepresentonlyathigherprivilegelevels.(Suchbitsare2illedwithzerosatlowerprivilegelevels.)

Twoofthe2ieldsareonlyusedfor64and/or128bitmachines.Thesetwo2ieldsresideinbitspositions[35:32],sotheyarenotevenpresentin32-bitmachines.

RISC-VArchitectureSummary/Porter Page� of�268 323

Page 269: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

RISC-VArchitectureSummary/Porter Page� of�269 323

Page 270: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Name #ofbits Description

UIE 1 UserModeInterruptEnableSIE 1 SupervisorModeInterruptEnableMIE 1 MachineModeInterruptEnable

UPIE 1 UserModePreviousInterruptEnableSPIE 1 SupervisorModePreviousInterruptEnableMPIE 1 MachineModePreviousInterruptEnable

SPP 1 SupervisorMode–PreviousPrivilegeModeMPP 2 MachineMode–PreviousPrivilegeMode

FS 2 FloatingPointStatus(dirty/clean/initial/off)XS 2 UserModeExtension(dirty/clean/initial/off)

MPRV 1 ModifyPrivilege SUM 1 PermitSupervisorUserMemoryAccessMXR 1 MakeExecutableReadable

TVM 1 TrapVirtualMemoryTW 1 Time-outWaitTSR 1 TrapSRETinstruction

SD 1 (FS==11)or(XS==11)

ForRV64andRV128only:

UXL 2 Emulation:RegistersizewheninUserModeSXL 2 Emulation:RegistersizewheninSupervisorMode

MIE,SIE,UIE—InterruptsEnabled

Recalltheterm“interrupt”meansasynchronousexceptions,i.e.,TimerInterrupts,SoftwareInterrupts,andExternalInterrupts.

IftheprocessorisinMachineMode,theninterruptsareenabledwheneverMIE=1;interruptprocessingforSupervisorandUsermodeisdisabled.

RISC-VArchitectureSummary/Porter Page� of�270 323

Page 271: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

IftheprocessorisinSupervisorMode,thensupervisorinterruptsareenabledwheneverSIE=1;interruptprocessingforUsermodeisdisabledandinterruptprocessingforMachineModeisalwaysenabledwhenrunninginSupervisorMode.

IftheprocessorisinUserMode,thenuserinterruptsareenabledwheneverUIE=1;interruptprocessingforMachineandSupervisorModeisalwaysenabledwhenrunninginUserMode.

MPIE,SPIE,UPIE,MPP,SPP—InterruptProcessing&PreviousState

MPIE–PreviousvalueofMIE(MachineModeInterruptEnable) SPIE–PreviousvalueofSIE(SupervisorModeInterruptEnable) UPIE–PreviousvalueofUIE(UserModeInterruptEnable)

MPP–MachineModetrap,previousPrivilegeMode(eitherM,S,orU)SPP–SupervisorModetrap,previousPrivilegeMode(eitherSorU)

Whenaninterruptoccursandatraphandleristobeinvoked,theprivilegemodewillchangeandthe“interruptenable”bitmustbesetto0topreventadditionaltrapprocessing(atthislevel)whilethetraphandlerisexecuting.

Theprocessormaybeexecutingatoneprivilegemode(sayUser)andthetrapmaybehandledatahigherlevel(saySupervisor).IfaMachineModetrapthenoccursbeforetheSupervisortraphandleriscomplete,thentheSupervisorTraphandlerwillbeinterruptedandtheMachineModetraphandlerwillexecute.Uponreturn,theSupervisorModetraphandlerwillresumeexecution;Finally,theinterruptedUserModecodewillberesumed.

Itisnecessarytosavethemachinestateinasortof“stack”,sothatuponcompletionofatraphandler,theprocessorcanreturntothepreviousstate.Sinceatraphandlerrunningatonelevel(saySupervisor)cannotbeinterruptedatthatsamelevel,thestackneednotbetoodeep.

Whenantraphandlerruns,allweneedtosaveisthepreviousmodeandtheinterruptenablebit.Notethepreviousinterruptenablebitmaybe0or1.Regardlessofitspreviousvalue,theprocessorwillchangetheinterruptenablebitto0whenthetraphandlerisinvokedandrestoreitwhenthetraphandlerreturns.

(Ofcourse,anon-maskableinterrupt,suchasahardwarefailure,orareset-typeevent,suchaspower-on-reset,mightoccuratanytime.Suchaneventcanneverbemaskedandwillalwaysbehandledregardlessofwhetherornotinterruptsare

RISC-VArchitectureSummary/Porter Page� of�271 323

Page 272: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

enabled.Butthereisnoproblemsincenon-maskableinterruptsareprocessedinMachineModeandthepreviouslyexecutingcodeissimplyabandoned.)

ConsiderwhattheprocessorwilldoupontheoccurrenceofaninterruptwhichwillbehandledbytheMachineModetraphandler.First,theMPIEbitwillbesettoholdthepreviousvalueoftheinterruptenablebitforMachineModepriortotheinterrupt(i.e.,tothepreviousvalueofMIE).InterruptswillthenbedisabledbysettingMIEto0.Also,MPPwillbesettotellwhatprivilegeleveltheprocessorwasrunningatwhentheinterruptoccurred.TheprivilegemodeineffectwhentheinterruptoccurredmayhavebeeneitherMachine,Supervisor,orUserMode.Thustwobitsarerequiredtosavethepreviousprivilegelevel.

Itispossiblethatanexceptionwilloccurduringtheexecutionofatraphandler.Thismeansthat,whenexceptionsareprocessed,thepreviouslyvalueofthe“interruptsenabled”bitcanbeeither0or1.Forexample,itmaybethatsomeinstructionisunimplementedandmustbeemulated.Sowhenthatinstructionisencountered—eitherduringnormalcodewithinterruptsenabled,orhandlercodewithinterruptdisabled—anewtraphandlerwillbeinvoked.Itwillrunwithinterruptsdisabledand,uponcompletion,the“interruptsenabled”bitwillberestoredtoitspreviousvalue.

TheMIE(“MachineInterruptEnable”)bitdetermineswhetheramaskableinterruptwillcausetrapprocessingorwillremainpending.Ifinterruptsaredisabled,thentrapprocessingwillbesignaledbutwillremainpendinguntilinterruptsareonceagainenabled.[???]

Likewise,whenaninterruptoccursthatistobehandledbytheSupervisorModetraphandler,SPIE(“SupervisorPreviousInterruptEnable”)willbesettothepreviousvalueoftheSIEinterruptenablebitpriortotheinterrupt.ThenSIE(“SupervisorInterruptEnable”)willbesetto0todisableinterruptsatthislevel.ThenSPP(“SupervisorPreviousPrivilege”)willbesettowhicheverprivilegeleveltheprocessorwasrunningatwhentheinterruptoccurred.SincetheonlychoicesareSupervisororUser,theSPP2ieldisonlyonebit.

Similarly,whenaninterruptoccursandishandledbytheUserModetraphandler,UPIE(“UserPreviousInterruptEnable”willbesettothepreviousvalueoftheUIE(“UserInterruptEnable”)interruptenablebitpriortotheinterrupt.TheinterruptedcodemusthavebeenrunninginUserMode,sinceweneverinterruptahigherlevelmodetorunatraphandleratalowerlevel.Therefore,thereisnoneedfora

RISC-VArchitectureSummary/Porter Page� of�272 323

Page 273: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

“UPP”(“UserPreviousPrivilege”)2ieldtoholdthepreviousmode;itmusthavebeen“UserMode”.

ConsideratrapwhichoccurswhilerunninginUserModetoatraphandlerinMachineMode.MPIEissettothepreviousvalueofMIE;thenMIEissetto0;alsoMPPissetto“UserMode”.

Afterthetraphasbeenhandled,anMRETinstructionwillbeexecuted.MPPwillbeconsultedtodeterminethatthepreviousmodewasUserMode.MIEissettoMPIE,restoringitsinitialvalue;themodeisrestoredtoUserMode;MPIEandMPParesettoarbitraryvalues,sincetheirinformationiseffectively“popped”offthestack.

Notethatthehandlermaychoosenottoreturntotheinterruptedcode;thisiscommoninoperatingsystems,particularlywhenservicingatimerinterrupt;thehandlerwillcauseathreadswitchandwillnotreturntotheinterruptedcodeuntilmuchlater.Whenthishappens,thesimplehardwarestacksystemisinadequate.Thehandlermustsavetheinformationinthestatusregisterandrestoreitlater.

TSR—TrapSRETInstruction

Ifthisbitis1,thenanyattempttoexecuteanSRETinstructionwillcausean“illegalinstruction”exception.If0,thenSRETmaybeexecutedwithoutinvokingatraphandler.

ThisbitisavailableonintheMachineModestatuswordmstatus,notinSupervisororUserModes.

TheabilitytointerceptandtrapanSRETinstructionmightbeusefultohypervisorcode.

TWandTVM—Time-outWaitandTrapVirtualMemory

ThesebitsareavailableonlyintheMachineModestatuswordmstatus,notinSupervisororUserModes.

Theirfunctionalityisnotdocumentedinthecurrentspec.Perhapstheyhavebeeneliminated.???

RISC-VArchitectureSummary/Porter Page� of�273 323

Page 274: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

SXLandUXL—SupervisorandUserRegisterSize

AnRV32processoralwaysusesthe32ISA(e.g.,32-bitregisters)regardlessofwhatmodeitisin.For64-bitand128-bitmachines,thereisafacilitytoexecutecodemeantforaRISC-Vprocessorwithasmallerregistersize,i.e.,inRV32orRV64mode.

TheabilityoflargermachinestoexecutetheRV32orRV64bitISAsisnotrequired,butifpresent,theSXLandUXLbitscontrolit.Ifitisnotimplemented,thenthesetwo2ieldsareread-only.

AnRV64bitprocessoralwaysusesthe64-bitISAwhenrunninginMachineMode.However,itcanexecutein32-bitmodewhenrunningineitherSupervisororUserMode.Thisiscontrolledbywriting01totheSXLorUXLbits.

AnRV128bitprocessoralwaysusesthe128-bitISAwhenrunninginMachineMode.However,itcanexecutein32-bitor64-bitmodewhenrunningineitherSupervisororUserMode.ThisiscontrolledbywritingtotheSXLorUXLbits.

TheencodingusedfortheSXLandUXL2ieldsis:

01 32-bitISA 10 64-bitISA 11 128-bitISA

FS,XS,andSD—FloatingStatusandExtensionStatusBits

Thesebitsofthestatusregisterareconcernedwithimprovingcontextswitchingtimes.First,wedescribe“FS”2irst,whichisconcernedwiththe2loatingpointregisters.

Whenthe2loatingpointregistershavebeenused,theywillneedtobesavedwhenthekernelswitchesfromonesoftwarethreadtoanother.Butwhentheregistersareunused,we’dliketoavoidtheoverheadofsavingandrestoringthem,sincethisistimeconsuming.

TheFSbit2ieldconsistsoftwobits,encodedasfollows:

RISC-VArchitectureSummary/Porter Page� of�274 323

Page 275: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Value Meaning 00 Off 01 InitialValues 10 Clean,i.e.,updatedbutsavedinmemory 11 Dirty,i.e.,updatedbutnotsaved

Thesebitsaresetbythehardwareandcanalsobemodi2iedbysoftware.Theyalsocontrolwhethera2loatingpointinstructionwillcauseanexceptionorwillbeexecutednormally.

(Thesebitsareonlypresentifthe2loatingpointextensionisimplemented.Ifthereareno2loatingpointregisters,theFSbitsarehardwiredto00.)

TheFSbitsshouldbesetto00=“off”whenthe2loatingpointregistersarenotinuse.Anyattempttoreadormodifytheregisterswillcauseanillegalinstructionexception.Forathreadthatwillnotbeusingthe2loatingpointregisters,thisisagoodsetting,sincetheOScanavoidsavingandrestoringtheregisters.

Thecompletestateofthe2loatingpointsystemconsistsofthecontentsofofthe322loatingpointregistersandthe2loatingpointstatusregister(fcsr).Manyprogramsdon’tuse2loatingpoint,sothereismuchtobegainedbyavoidingthesave/restoreofallthisinformationoneachcontextswitch.

Whenathreadthatuses2loatingpointiscreatedandbeginslife,the2loatingpointregistersshouldbeinitializedbytheOStotheirinitialvalues(presumably+0.0)andthe2loatingpointstatusregistershouldbeinitialized(presumablytozero).

The“initial”state(01)indicatesthatthe2loatingpointregistersareusablebuttheyhavenotyetbeenalteredfromtheirinitialvalues.Thus,theyregistervaluesdonotneedtobesavedonthenextcontextswitch.

Whenevera2loatingpointregisterismodi2ied,hardwarewillchangethestateasre2lectedinFSto“11=dirty”.

The“clean”state(10)isusedtoindicatethattheregistersareinuseandhavebeenmodi2iedfromtheirinitialzerovalues,butthattheyhavenotbeenmodi2iedsincethelastcontextswitch.Inotherwords,theregisterscontainnon-zerovaluesbutthevalueslastsavedtomemoryarecurrentandaccuratelyre2lectthevaluesintheregisters.Thus,atthetimeofthenextcontextswitch,theregistersdonotneedtobe

RISC-VArchitectureSummary/Porter Page� of�275 323

Page 276: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

savedagain.(Thiswillbecommonsinceaprogramthatuses2loatingpointatsomepointwilllikelyspendmanytimesliceswithoutfurthermodifyingtheregs.)

InsomeOSes,itmaybethecasethatprocessesalwaysbeginlifewiththe2loatingpointsystemenabledbydefault,sothe“off”stateisnotparticularlyuseful.

ButinotherOSes,processeswillbeginlifewiththe2loatingpointsystemdisabled,undertheassumptionthatmostprocesseswillnotuse2loatingpoint.Thus,theOSwillsetFSto00=off.Ifanattemptismadetoreadorwritethe2loatingstate,thehardwarewillsignalanexception.

SomeOSesmayrespondbyenablingtheregisters,settingthemtotheirzerovalues,andretryingtheinstruction.OtherOSesmayrequirethe2loatingpointsystemtobeexplicitlyenabledbysomesystemcallandtreatanyaccessestodisabledstateasanerror.

Itisnecessaryforread(aswellaswrite)attemptstocauseanexceptionwhenthestateis00=off.TheOSmayoccasionallyleavetheoldstatefromanotherunrelatedthreadintheregistersandwemustpreventinformationfrombleedingoverorescapingfromonethreadtothenext.

Whenathread2inishesitstimeslice,theOScanlookatFStodeterminehowmuchstatemustbesavedatthecontextswitch:

ValueofFS Action 00=off Savingnotnecessary 01=initial Savingnotnecessary 10=clean Savingnotnecessary;previouslysavedvaluesarestillgood 11=dirty Mustsavetheregisters

Ifthestatewaspreviously11=dirty,thentheOSshouldchangeitto“clean”beforestoringthethread’sinformation,sinceitwillnowbetruethatthein-memorysavedstateisup-to-date.

WhentheOSisreadytoinitiateandscheduleanewthread,itcanlookatthestateoftheFSfromthepreviousthreadandthestateofthenewthreadtodeterminewhattodo.(Recall,thatwhentheOSpreviouslysavedtheregswhenthethreadwaslastde-scheduled,itchangedthestatefrom“dirty”to“clean”,sothestateofthenewthreadwillneverbe“dirty”atthispoint.)

RISC-VArchitectureSummary/Porter Page� of�276 323

Page 277: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

NextThread PreviousThread Action 00=off …any… Donothing 01=initial 01=initial Donothing(regsalreadyzeroed) 01=initial …other… Zerotheregs 10=clean …any… Restoreregsfrommemory 11=dirty <notpossible>

Whatwehavejustseenisthatsomesectionoftheprocessor(the2loatingpointsubsystem)requiredattentionateverycontextswitch,eithertosavestateortorestorestate.Theseactionsareexpensiveandwewishtoavoidthemwhenpossible.TheFSbitshelpdeterminewhenwecanavoidsaving/restoring.

SomeRISC-Vimplementationsmayaddnon-standardextensionswhicharenotdescribedinthespecandtheXSbitsareusedanalogouslyforthisadditionalstateinformation.

TheXSbitsworkthesamewayastheFSbitsbutcoverallotherstatethatshouldbesaved/restoreduponcontextswitch.Exactlywhatprocessorstateiscoveredby“XS”isleftas“implementationde2ined”.

Itispossiblethattherearemorethanonenon-standardextensions.TheXSbitsaremeanttoapplytoallofthem.Sothemeaningsarechangedslightlytore2lectthis.

Value Name Meaning 00 Off Allsub-systemsareoff 01 Initial Oneormoreareon;butnothingclean,nothingdirty 10 Clean Nothingisdirty,somepartsareclean 11 Dirty Atleastsomestatehasbeenmodi2ied

Itmaybethatthereareadditionalbitstodetermineexactlywhatpartsoftheprocessorstatearedirty/clean/initial,butthespecleavesthistoimplementationspeci2icdecisions.

Finally,uponcontextswitch,we’dliketobeabletoquicklydeterminewhetherwehavetodoanythingaboutthe2loatingpointregsandanynon-standardextensions.Mightweneedtosaveanything,orcanwejustignorethisissue?Thequestionwewanttoansweristhis:IsFS=11=dirty?OrisXS=11=dirty?

RISC-VArchitectureSummary/Porter Page� of�277 323

Page 278: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

TheSDbitinthestatusregisterallowstheOSsoftwaretoquicklyanswerthesequestions.Thebitissetbythehardwareandde2inedas:

SD=((FS==11)OR(XS==11))

Thisbitisplacedinthesignbitofthestatuswordsoaquickcheckcandeterminewhetheranythingisdirty.Ifso,theOSsoftwarecanlookdeepertodeterminewhatstatemustbesaved.

These2ields(FS,XS,SD)areavailableintheMachineModemstatusandSupervisorModesstatusregisters.[Thediagramofthestatusregistersincorrectlyshowstheminustatus.]

MPRV,SUM,andMXR—VirtualMemoryEnabling

TheRISC-Vchipsupportsvirtualmemory,viapagetables.Thepagetablesystemisdescribedinaseparatechapter.

Typically,user-levelcodewillrunwithvirtualmemorymappingturnedonandallmemoryaccesseswillbemappedfromvirtualaddressestophysicaladdresses.

However,user-levelcodewilloccasionallymakesystemcalls,switchingfromexecutioninatalowerprivilegelevel(e.g.,UserMode)toahigherprivilege(e.g.,MachineMode).Oftendataispassedbetweenuser-levelcodeandtheOSinmemory;forexample,theuser-levelcodemightpassanaddresspointertotheOS.ThispointerisavirtualaddressandtheOScodewhichservicesthesystemcallwillthengothememorytofetchthedata.

SincetheaddressisavirtualaddresswhiletheOSisrunningin(say)MachineModewithvirtualmemorymappingturnedoff,thereisaproblem.TheOScouldperformtheaddresstranslation(fromVirtualtoPhysical)insoftware,butthisistimeconsuminganderror-prone.

Tomakethistasksimpler,theMPRVbitcanbeused.NormallytheMPRVisclearedto0andallLOADandSTOREinstructionsdonebycoderunninginMachineModewillhavenoaddresstranslationperformed.However,theOScodecansetthisbitto1;anysubsequentLOADorSTOREinstructionswillbeperformedwithaddresstranslationturnedon.Moreprecisely,addresseswillbetranslatedasifthecurrent

RISC-VArchitectureSummary/Porter Page� of�278 323

Page 279: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

modewerethatspeci2iedbyMPP(theMachinePreviousPrivilegeModebitsinthemstatusregister),whichwasthemodefromwhichthemostrecenttraphandlerwasentered.

(SoasystemcallfromUserMode,wherepagingwason,wouldpassvirtualaddresses.Thehandler,runninginMachineMode,wouldneedaddresstranslationturnedonifitwantstoaccessthememoryusingtheseaddresses.HoweverasystemcallfromMachineModeitself(orSupervisorModewithtranslationturnedoff)wouldpassphysicaladdresses,sincetherewouldbenopaging.)

TheSUMbitisusedbycoderunninginSupervisorModeinsteadofMachineMode.IthasnoeffectoncoderunninginMachineModeorinUserMode.

Addresstranslationcanbeturnedonbywritingtothesatpregister.Ifturnedon,thenallUserModeandSupervisorModecodewillrunwithaddresstranslation.

Eachvirtualaddressspaceisdividedintobothuserandnon-userpages.Inotherwords,somepageswillbemarkedas“User”pagesandtheremainderwillbeinaccessibletoUserModecode.(TypicallytheportionoftheaddressspaceaccessibletoUserModecodeiscalledthe“bottomhalf”andtheportionaccessibleonlytoSupervisorModecodeiscalledthe“tophalf”,althoughthesetwoareasneednotbearrangedascontiguousblocksorlocatedinanyrelativeorder.)

NormallySupervisorcodewillonlyaccessnon-user(tophalf)pagesintheaddressspace.However,sometimesthesupervisorcodemustaccesstheuser(bottomhalf)portionoftheaddressspace.

TheSUMbitcanbeusedtoallowsupervisorcodetoaccesstheuserpages.IfSUM=0,thenSupervisorModecodecannotaccesspagesmarkedas“user”pages;itcanonlyaccessthenon-userportionoftheaddressspace.IfSUM=1,thenSupervisorModecodeisfreetoaddressbothuserandnon-userpages.Thisbitaffectsallloads,stores,andinstructionfetches.Normally,supervisorscodedoesnotneedtoaccessuserpages,sosupervisorwillrunwiththisbitcleartopreventinadvertentaccessestouserspace.Whenthesupervisorcodeneedstoaccessuserspace,e.g.,toretrievesyscallarguments,thisbitcanbetemporarilysetto1.

Sometimesatraphandlerwillneedtolookattheinstructionthatwasexecutingatthemomentofanexception.Forexample,iftheinstructionisunimplementedinthehardwareandmustbeemulated,thetraphandlerwillneedtoreadtheinstructiontodeterminehowtoemulateit.

RISC-VArchitectureSummary/Porter Page� of�279 323

Page 280: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Memorypagescanbemarked“readable”and/or“executable.”Apagecontaininginstructionswillnormallybemarked“executable”butwillnotbemarked“readable”.Therefore,anyattempttofetchawordfromthispagewillcauseanexception.

Inordertoallowthetraphandlertofetchtheinstruction(asdata),thisexceptionmustbesuppressed.ThisisthefunctionoftheMXRbitinthestatusregister.Whensetto1,datareadsareallowedfrompagesmarked“executable”;whensetto0,suchreadswillcauseanexception.

AdditionalCSRs

mepc–ProgramCounterforMachineModeTrapHandlermscratch–ScratchRegisterforMachineModeTrapHandlermcause–CauseCodeforMachineModeTrapHandler

WhentheMachineModetraphandlerisinvoked,thepreviousvalueoftheprogramcounterisstoredintomepc;thisvaluewillbeusedattheendofthetraphandler,whenanMRETinstructionisexecuted.Themepcregisterisread/writeandcanbewrittenbysoftware(e.g.,byanOSduringathreadswitch)tocausetheMRETtojumpintoanarbitrarythread.

WhenaMachineModetraphandlerisinvoked,thecodemustbecarefulnottooverwriteandlosethepreviousvaluesofanyregisters,sincetheywerebeingusedbytheinterruptedcode.Insteadtheregistersmustbeimmediatelysaved.However,itisvirtuallyimpossibletodoanythingwithouttheuseofatleastonegeneralpurposeregister.

Thisisthepurposeofthemscratchregister,whichisaread/writeregister.ThetraphandlercanswapmscratchwithanygeneralpurposeregisterandthenusethenormalISAinstructionstosavetheremainingregisters.

RecallthattheCSRRWinstructionwillexchangethevalueinanarbitraryCSRwithageneralpurposeregister.Thisinstructionaccomplishesthetaskofswappingmscratchwithageneralpurposeregister,therebysavingthegeneralpurposeregister.Thisgivesthesoftwarearegisterloadedwithabasevalue,whichcansubsequentlybeusedtosaveallremainingprocessorstate.

RISC-VArchitectureSummary/Porter Page� of�280 323

Page 281: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Themscratchregistercannotbechangedbytheexecutionofthecodeatlowerprivilegelevels.Thereforeitcanbereliedontocontainaknownvalue,uncorruptedbyuser-levelcode.Typicallyitmightcontainaframeorstackpointerorapointertoa“registersavearea.”Inanycase,uponentrytothetraphandler,mscratchcanbeswappedintoageneralpurposeregisterandthenusedtosavewhicheverregistersthetraphandlerwillbeneeding.

Commentary:SomeearlierISAdesignsusedageneralpurposeregisterforthisfunction.Theideawasthatoneregisterwould(byconvention)bereservedfortheOSkernel.Wheneveraninterruptoccurred,theOSkernelwouldbefreetousethisregisterinsavingthestateoftheinterruptedprocess.Therefore,thisregisterwasuselesstouser-levelcode.Thisregistercouldnotbeusedtoholdavaluesinceanytimeaninterruptoccurred,itwouldbemodi2iedbytheOStraphandler.

Thisdesignapproachreducedthenumberofavailablegeneralpurposeregistersavailabletouser-levelcode.

Furthermore,theOScouldnotdependontheregisterremainingunchangedduringuser-levelcodeexecution;thisnecessitatedloadingtheregisterwithaknownvaluebeforeitcouldbeused.

Also,ifnotconsistentlyzeroedbeforereturningfromOScodetouser-levelcode,datacouldpotentiallyleakfromtheOStouser-levelcode,presentingsecurityconcerns.

Themcauseregistercontainsanumericcodetoindicatewhatcausedthetrap.ItiswrittenbythehardwarewhenanexceptioncausestheMachineModetraphandlertobeinvoked.Herearethepossibleexceptioncausevalues:

RISC-VArchitectureSummary/Porter Page� of�281 323

Page 282: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

NumericCode Description0 Instructionaddressmisaligned1 Instructionaccessfault2 IllegalInstruction(orprivilegedinstr.violation)3 Breakpoint4 Loadaddressmisaligned5 Loadaccessfault6 Store/AMOaddressmisaligned7 Store/AMOaccessfault8 EnvironmentcallfromU-Mode9 EnvironmentcallfromS-Mode11 EnvironmentcallfromM-Mode12 Instructionpagefault13 Loadpagefault15 Store/AMOpagefault

MAX+0 0x80000000 UserSoftwareInterruptMAX+1 0x80000001 SupervisorSoftwareInterruptMAX+3 0x80000003 MachineSoftwareInterrupt

MAX+4 0x80000004 UserTimerInterruptMAX+5 0x80000005 SupervisorTimerInterruptMAX+7 0x80000007 MachineTimerInterrupt

MAX+8 0x80000008 UserExternalInterruptMAX+9 0x80000009 SupervisorExternalInterruptMAX+11 0x8000000B MachineExternalInterrupt

Details:Eachcausehasanumericcode.Thecodeschemeusesthesignbittoshowwhetherthecauseisasynchronousexception(MSB=0)oraninterrupt(MSB=1).Forasynchronousinterrupts,asmallcodenumber(0,1,2,…)isstoredinthelower-orderbits,andthesign-bitisalsoset.Sointheabovetable,MAXstandsfor231fora32-bitmachine,263fora64-bitmachine,etc.Tomakethisclear,wegivetheactualvalueinhexforRV32.

Commentary:A“privilegedinstructionviolation”isaninstructionthatexistsbutcannotbeexecutedinthecurrentprivilegemode.Anillegalinstructionisaninvalidinstructionencodingwhichdoesnotrepresentade2inedinstruction.Concerningexceptionsraisedbyeitheroftheseviolations,RISC-Vmakesno

RISC-VArchitectureSummary/Porter Page� of�282 323

Page 283: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

distinctionbetween“privilegedinstruction”and“illegalinstruction.”Bothcausethe“illegalinstruction”exception.

sepc–ProgramCounterforSupervisorModeTrapHandlersscratch–ScratchRegisterforSupervisorModeTrapHandlerscause–CauseCodeforSupervisorModeTrapHandler

uepc–ProgramCounterforUserModeTrapHandleruscratch–ScratchRegisterforUserModeTrapHandlerucause–CauseCodeforUserModeTrapHandler

Theseregistersarede2inedanalogously,tobeusedinthetraphandlersrunninginSupervisororUserMode.

Somecausevaluescannotshowupinthecauseregisters.Forexample,aSupervisorModetraphandlercannoteverbeconfrontedwithaMachineModetimerinterrupt.

Questions:Butcan’tanyinterruptbedelegated;woulditstillremainaMachineModetimerinterrupt,orwoulditbealteredtoaSupervisorModetimerinterrupt???

Thereisno“Loadaddressmisaligned”codeforthescauseregister,implyingthatsuchaexceptioncannotbehandledinSupervisorMode.Whynot???

Also,theof2icialdocumentationdoesnotincludeacausecodefor“EnvironmentcallfromS-Mode”asapossiblevalueinscause;isthisamistakeinTable4.2???

scounteren–CounterEnableRegisterforUserModemcounteren–CounterEnableRegisterforSupervisororUserMode

Coderunninginalowerprivilegelevel(suchasUserMode)mayperiodicallytrytoreadthevarioustimerandcounters.Coderunningatahigherlevel(suchasanOSorhypervisor)maywishtointerceptallaccessestocountersandtimersinordertofoolthelowerprivilegecode.Perhapsthehypervisorwantstopresenttheillusiontothelowerlevelcodethatitisrunningonabaremachinewhen,infact,itisnot.Thiscouldalsobeusefulforviruseswhichwanttohidetheirexistence.

TheseCSRregisterareusedtocontrolaccesstothefollowingcounter/timesCSRs:cycle,time,instret,hpmcounter3,hpmcounter4,…hpmcounter31.

RISC-VArchitectureSummary/Porter Page� of�283 323

Page 284: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

ThepurposeofthescounterenregisteristoallowSupervisorModecodetodeterminewhetheraccessestooneofthecounter/timerregistersbycoderunninginUserModeshouldcauseanexceptionornot.

ThepurposeofthemcounterenregisteristoallowMachineModecodetodeterminewhetheraccessestooneofthecounter/timerregistersbycoderunninginUserModeorSupervisorModeshouldcauseanexceptionornot.

Thereisonebitforeachofthe32counters/timers.0=disabled,causeanexception;1=enabled,noexception.

Emulationoftime/mtimeAccesses:AnotheruseofthisCSRisasfollows:RecallthattherealtimeclockmtimeisnotactuallyaCSR,butisexpectedtobeimplementedasamemory-mappedI/Odevice.HoweverthereisatrueCSRnamedtime,whichismeanttobeamirrorofmtime.BytrappingallreadstothetimeCSR,MachineModecodecanperformedtheI/OtoretrievethetimefrommtimeandproperlyemulatethereadtothetimeCSR.

SystemCallandReturnInstructions

EnvironmentCall(SystemCall)

GeneralForm:ECALL

Example:ECALL # no operands

Comment:Thisinstructionisusedtoinvokeatraphandler.Thisinstructioncausesoneofthefollowingexceptions,dependingonthecurrentprivilegelevel:

EnvironmentCallfromUserModeEnvironmentCallfromSupervisorModeEnvironmentCallfromMachineMode

Anyandallargumentsandreturnedvaluesmustbepassedinregisters.Encoding: ThisisavariantofanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of�284 323

Page 285: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Break(InvokeDebugger)

GeneralForm:EBREAK

Example:EBREAK # no operands

Comment:Thisinstructionisusedtoinvokeadebugger,bycausingan“Breakpoint”exception.Typicallythedebuggingsoftwarewillinsertthisinstructionatvariousplacesintheapplicationcodesequence,inordertogaincontrolfromanexecutingprogram.

Encoding: ThisisavariantofanI-typeinstruction.

MRET:MachineModeTrapHandlerReturn

GeneralForm:MRET

Example:MRET # no operands

Comment:ThisinstructionisusedtoreturnfromatraphandlerthatisexecutinginMachineMode.TheMPP2ieldofthestatusregisterwillbeconsultedtodeterminewhichmodetoreturnto(eitherm,s,oru).ThevalueintheMPIE2ieldwillbecopiedtoMIE,whichwillrestorethe“interruptsenabled”statetowhatitwaswhenthehandlerwasinvoked.ThereturnwillbeeffectedbycopyingthesavedprogramcounterfrommepctotheProgramCounter(pc).

ThisinstructionmayonlybeexecutedwhenrunninginMachineMode.Encoding: ThisisavariantofanI-typeinstruction.

SRET:SupervisorModeTrapHandlerReturn

GeneralForm:SRET

RISC-VArchitectureSummary/Porter Page� of�285 323

Page 286: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Example:SRET # no operands

Comment:ThisinstructionisnormallyusedtoreturnfromatraphandlerthatisexecutinginSupervisorMode.TheSPP2ieldofthestatusregisterwillbeconsultedtodeterminewhichmodetoreturnto(eithersoru).ThevalueintheSPIE2ieldwillbecopiedtoSIE,whichwillrestorethe“interruptsenabled”statetowhatitwaswhenthehandlerwasinvoked.ThereturnwillbeeffectedbycopyingthesavedprogramcounterfromsepctotheProgramCounter(pc).

ThisinstructionmaybeexecutedwhenrunningineitherSupervisorModeorMachineMode.

Encoding: ThisisavariantofanI-typeinstruction.

URET:UserModeTrapHandlerReturn

GeneralForm:URET

Example:URET # no operands

Comment:ThisinstructionisnormallyusedtoreturnfromatraphandlerthatisexecutinginUserMode.UserModetraphandlersalwaysreturntoUserModecode.ThevalueintheUPIE2ieldwillbecopiedtoUIE,whichwillrestorethe“interruptsenabled”statetowhatitwaswhenthehandlerwasinvoked.ThereturnwillbeeffectedbycopyingthesavedprogramcounterfromuepctotheProgramCounter(pc).

Thisinstructionmaybeexecutedinanymode.Encoding: ThisisavariantofanI-typeinstruction.

WFI:WaitForInterrupt

GeneralForm:WFI

RISC-VArchitectureSummary/Porter Page� of�286 323

Page 287: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter8:PrivilegeModes

Example:WFI # no operands

Comment:Thisinstructioncausestheprocessortosuspendinstructionexecution.Whenanasynchronousinterruptoccurs,theprocessorwillwakeupandresumeexecution.Thetraphandlerwillbeinvokedand,uponreturntothecodesequencecontainingtheWFIinstruction,thenextinstructionfollowingtheWFIwillbeexecuted.

Typically,softwarewillexecutethisinstructionwithinan“idle”loop,whichisonlyexecutedwhenthereisnothingtobedone.Suspendinginstructionexecutionmaysavepower.Inasystemwithmultiplecores,thisinstructionmayalsoprovideahinttotheinterruptcircuitrytoroutefutureinterruptstothiscore,sinceitisidling.

Thisinstructionmaybeimplementedasanop.Forsimplerprocessors,itmaybebettertojustspininanidleloop.Complexsystemsmayhaveaninexhaustiblesupplyofbackgroundprocesses,makingalow-powerwait-statepointless.

Ifinterruptsare(globally)disabledwhenaWFIinstructionisexecuted,thenthisdisablingisignored.Instructionexecutionwillresumewhenaninterruptoccurs.Seethespecfordetails.

Encoding: ThisisavariantofanI-typeinstruction.

RISC-VArchitectureSummary/Porter Page� of�287 323

Page 288: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

PhysicalMemoryAttributes(PMAs)

Mainmemoryisthelargerangeofbyte-addressablelocations.Eachlocationhasa“physicaladdress”.

Insystemswithoutvirtualmemorymapping,eachandeveryaddressgeneratedbyaninstructionisaphysicalmemoryaddress.However,manysystemswillsupportvirtualmemorymapping,wherethereisadistinctionbetweenvirtualaddressesandphysicaladdresses.Aninstruction(e.g.,inaREADorWRITEinstruction)willgenerateavirtualaddressandthememorymanagementunit(MMU)willtranslatethisvirtualaddressintoaphysicaladdress,whichisthenusedbythehardwaretosendorretrievedatato/fromthemainmemorystore.

Thephysicaladdressspacecanbepopulatedby:

•Normalmemory •Memory-mappedI/Odevices •Empty(i.e.,unpopulatedholes)

I/Odevicesarecommonly“memorymapped”,whichmeanstheysitonthesamebusasnormalmemory.Thedeviceisassignedaspeci2icphysicaladdress(orsetofaddresses).SoftwarecansenddatatoanI/Odeviceby“writing”toamemorylocationpopulatedbythedeviceandcanretrievedatafromthedeviceby“reading”fromsuchanaddress.Fromthepointofviewoftheexecutingsoftware,thedevicelooksabitlikeanyothermemorylocation,butreadingandwritingtomemory-mappedI/Olocationsisdone,nottostoredata,butforthepurposeofsendingdatato,receivingdatafrom,andotherwisecontrollinganI/Odevice.

Tohandledvariousscenarios,thephysicaladdressspacewillbepartitionedintoregions.Theregionswillbecontiguous,one-after-the-other,non-overlappingandwillcovertheentirephysicaladdressspace.Everybyteinthephysicaladdressspacewillbeinexactlyonephysicalmemoryregion.

RISC-VArchitectureSummary/Porter Page� of�288 323

Page 289: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Eachregioncanbecharacterizedbydifferentattributes.Forexample:

Istheregionpopulatedwithmainmemory?Istheregionpopulatedwithmemory-mappedI/Oregisters?

Istheregion(i.e.,unpopulated,empty)? Doestheregionsupportreads(i.e.,datafetches)? Doestheregionsupportwrites? Doestheregionsupportexecution(i.e.,instructionfetches)? Doestheregionsupportatomicoperations? Istheregioncached? Istheregionsharedwithothercores,orprivate?

Collectively,theseare“physicalmemoryattributes”(PMAs).Someregionswillhavetheirattributes2ixedandunchangeable,otherregionswillbecon2igurableuponpower-up(i.e.,cold-pluggable),andotherregionsmighthavetheirattributeschangeduringsystemoperations(i.e.,hot-pluggable).

Memoryattributesmustcheckedconstantlyduringexecution.Forexample,aREADfromanunpopulatedmemoryregionoughttocauseanexception.

WhereshallthePMAinformationbestored?Oneapproachistostoretheinformationinthevirtualmemorytables,andrequirethememorymanagementunit(MMU)tocheckandenforcetheconstraints.Theremaybeproblemswhenthepagesizeofthememorymanagementunitdoesn’tmatchthesizeoftheregions.

IntheRISC-Vdesign,thereisaseparatesubsysteminthecorewhichisconcernedwithcheckingandenforcingthephysicalmemoryattributes.Thissubsystemiscalledthe“PMAchecker”.

Whenaninstructionattemptstoaccessmemory,theaddressis2irsttranslatedfromvirtualtophysicaladdress(ifvirtualmemoryisturnedon).Then,thePMAcheckerwillverifythatthetypeofaccessislegal.Ifnot,anexceptionwillbegenerated.Thereareseveralexceptiontypesforphysicalmemoryattribute(PMA)violationsandthesearedifferentfromthevirtualmemoryfaultexceptions.

TheinformationaboutthevariousmemoryregionsandtheirassociatedPMAswillbekeptinprocessorregisters.Theseregisters(orpartsofthem)maybehardwiredforsomeregions(whichisappropriatewhentheregionitselfisprecon2iguredandnotmodi2iable,suchasanon-chipROM),ormaybereadable(forqueryingthebus

RISC-VArchitectureSummary/Porter Page� of� 289 323

Page 290: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

todeterminewhatregionsarepresentandtheirPMAs),ormaybewritable(allowingthesoftwaretoestablishandmandatePMAs).Detailsareimplementationdependent.

ConcerningtheatomicoperationsofRISC-V,itisimportanttodeterminewhetheraparticularregioncansupportaparticularinstruction.

RegionspopulatedbymainmemorymustsupportalltheAMOinstructions(AtomicMemoryOperations)andLR/SCinstructions(assumingtheseinstructionsareevenimplemented,ofcourse).

RegionspopulatedbymemorymappedI/OdevicesarenotexpectedtosupporttheLR/SCinstructions.

Memory-mappedregionsmayormaynotsupporttheAMOoperations.IfsucharegionsupportsanyAMOinstructions,thenitmustsupportAMOSWAPataminimum.TheregionmayalsooptionallysupportotherAMOinstructions.HerearethelevelsofAMOsupportthatmaybeimplemented:

•NoAMOoperationssupported •OnlyAMOSWAPsupported •AMOSWAP,AMOAND,AMOOR,AMOXOR •AllAMOoperationsaresupported

ConcerningtheFENCEinstructions,recallthataccessesareclassi2iedasbeing“memory”or“I/O”.Tosupportthis,everyregionofthephysicalmemoryaddressspacemustbeclassi2iedaseither:

•mainmemory •I/O

Itispossiblethattwodifferentcores(orotherdevicessittingonthephysicalmemorybus)mayaccessthesamememorylocationsmore-or-lesssimultaneously.Accessestomainmemoryregionsare“relaxed”,whichmeansthattheexactorderingoftheoperationsisindeterminate(unlessatomicinstructionsareused,ofcourse).Theprogrammercanuseatomicinstructionstoforceparticularorderings,butfornormalinstructions,therearenoguaranteesaboutordering.Asinglecorewillseeitsownoperationsexecutedintheordertheywereissued,butreadsandwritesfromothercoresordevicescanappearatanytime,inanyorder.

RISC-VArchitectureSummary/Porter Page� of� 290 323

Page 291: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Accesstomemory-mappedregionsmayeitherbe“relaxed”or“strong”.Iftheregionisde2inedtohaverelaxedordering,thentheorderingofoperationsisunde2ined,asjustdescribed.Iftheregionhasstrongordering,thenitwillappeartoonecore(A)thattheoperationsexecutedbysomeothercore(B)wereallexecutedintheordertheywereactuallyissuedbycore(B).

Thissummaryisasimpli2icationoftheRISC-Vspec.Thespecisdesignedtoapplytoaspectrumofdifferentkindsofsystems,fromsimpletocomplex.Moderncomplexsystemsmayhavemultipleparallelchannelsbetweencores,devices,andmainmemorystores.ImaginethatallthedevicesareconnectedbyanetworkwiththepropertiesoftheInternet:messagesaresentbetweendevices,cantakedifferentpaths,andcanbearbitrarilyreorderedintransit.Ontheotherhand,olderandsimplersystemshaveasingle,simplebusthatdoesonethingatatimeandreorderingisinconceivable;thesesystemsmayusedevicedriversthatweredesignedwithoutconcernforreorderingandonegoalistosupportsucharchitecturesandlegacyprogramming.

CachePMAs

qqqqq

PMAEnforcement

TheenforcementofPhysicalMemoryAttributes(PMAs)isoptional.

ARISC-VcoremayoptionallyprovideaPhysicalMemoryProtection(PMP)unit.PhysicalmemoryprotectionisimplementedusingasmallnumberofMachineModeCSRsthatdescribethememoryregionsandtheirPMAs.

TheCSRregistersencodethefollowinginformationforeachregion:

•Thestartingaddressoftheregion •Whethertheregionis4,8,16,32,…bytesinlength •DoestheregionsupportREADoperations? •DoestheregionsupportWRITEoperations? •DoesheregionsupportEXECUTIONfetches?

RISC-VArchitectureSummary/Porter Page� of� 291 323

Page 292: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

•Isthisregion“locked”?

Upto16regionscanbedescribedintheregisters.Thisinformationiscleverlyencodedin16registersplusacoupleofadditionalregisters.

ThePMPUnitisactivewhenrunninginSupervisororUserMode;allaccessesgeneratedarechecked.TheprotectionmayormaynotbeenforcedwhenrunninginMachineMode.

SincetheseregistersareMachineModeCSRs,theycanonlybeaccessedinMachineMode.

Thereisasingle“Lock”bitforeachoneofthe16regionsanditworksasfollows.Ifthelockbitis0,thenMachineModeaccessesarenotchecked;onlySupervisorandUserModeaccessesarechecked.

IftheLockbitisset,thenallMachineModebitsarecheckedaswell.

OnceaLockbitisset,itcannotbecleared.Alllockbitsareinitiallyclear;toclearalockbitrequiresthesystemtobefullyreset(e.g.,POWER-ON-RESET).Inotherwords,oncearegionbecomelocked,itstayslockedforever.Furthermore,allaccesses,regardlessofprivilegemode,willbechecked.

ThislockingschememightbeusefulforsecurebootstrappingcodewhichloadsOScodeintomemoryandthenlocksdowntheregioncontainingtheOS,topreventmalwarefromsubsequentlycorruptingtheOScode.

TheRISC-VspecsuggeststhattheCSRsthatdescribethememoryregionsmightgetswappedinandoutduringcontextswitches.

Forexample,oneprocess(adevicedriver)mightbeallowedtoaccessagivenmemoryregionwhileanotherprocess(auserthread)oughttobeforbiddenfromaccessingthesameregion.Asanotherexample,imagineahypervisorrunninginMachineModethatishostingtwodistinctoperatingsystems,whicharerunninginSupervisorMode.Eachoperatingsystemwillbegivenarangeofavailablemainmemorytouse.ToensurethateachOSstaysoutofthememoryregionusedbytheotherOS,theMachineModesoftwarewillusethePMPunittoenforceallaccesses.WhentimeslicingbetweentwoOSes,thehypervisorcoderunninginMachineModewillneedtoswaponesetofPMPregisterswiththeotherset.

RISC-VArchitectureSummary/Porter Page� of� 292 323

Page 293: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Todescribeeachmemoryregion,wemustsomehowspecifyastartingaddressandalength.Eachmemoryregionhascertainprotectionattributesassociatedwithitandthesemustalsobegiven.

Recallthattherecanbeupto16differentmemoryregions.Thereisessentiallyatablewith16entries,whichcandescribeupto16differentmemoryregions.

Theinformationabouteachregion(i.e.,eachtableentry)isencodedintoasinglefull-length“addressregister”(e.g.,32bits)andanadditional“con2igurationbyte”(8bits).Thelengthoftheregioniscleverlyencodedwithinthesebits,asdescribednext.

Thenamesoftheaddressregistersareasfollows.EachisafullCSR,e.g.,32bitsforRV32machines,andlongerforRV64.

pmpaddr0 pmpaddr1 … pmpaddr15

Thenamesofthecon2igurationbytesaregivennext.Eachisan8bitquantity.

pmp0cfg pmp1cfg … pmp15cfg

Thecon2igurationbytesarepackedinto4registers,withfourcon2igurationbytesperregister.(…atleastfortheRV32machines.For64-bitmachines,only2CSRsareneeded,since8con2igurationbytescanbepackedintoeachregister.)

Eachregionhasacon2igurationbyte,whereinthe8bitsareusedasfollows:

Numberofbits Meaning1 Regionisreadable1 Regioniswritable1 Regionisexecutable1 Regionislocked2 “A”2ield2 unused

RISC-VArchitectureSummary/Porter Page� of� 293 323

Page 294: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

The“A”2ieldhasthesepossiblevalues:

Value Meaning00 Thisentryisnotused01 Top-of-rangeaddressencoding10 Regionis4byteslong11 Regionis>4byteslong

Notallofthe16regionsneedbepopulated;the00valueofthe“A”2ieldindicatesanunusedentryintheregiontable.

Eachregionmusthavealengththatisapowerof2.Furthermore,thelengthmustbe4orlarger;thatis:4,8,16,32,64,…Theregionmustbenaturallyaligned,givenitslength.Forexample,aregionofsize1,024muststartonanaddressthatisevenlydivisibleby1,024.

Iftheregionhassize4,the“A”2ieldwillbe01andtheaddress2ieldwillcontainthestartingaddressoftheregion.

Iftheregionhasalargersize(suchas8,16,32,64,…),the“A”2ieldwillbe11andtheaddressregisterwillcontainboththestartingaddressandlength.Forexample,consideraregionwithsize1,024.Howisthisencodedintheaddressregister?Theregionmustbenaturallyalignedsothestartingaddresswilllooklikethis,inbinary:

XXXXXXXXXXXXXXXXXXXXXX0000000000

Thelast10bitsofthestartingaddresswillalwaysbe0.Thelengthoftheregion(1,024inthisexample)willbeencodedinthesezerobits,asfollows:

XXXXXXXXXXXXXXXXXXXXXX0111111111

Thehardware,bylookingatthevalueintheaddressregisterandbylocatingtherightmost0bit,candirectlyinferthesizeoftheregion.

Okay,it’smorecomplexthatthis.For32bitmachines,theregiontablecanactuallybeusedtodescriberegionsanywhereina34bitaddressspace.Theextra2bitsofaddressareachievedasfollows.

RISC-VArchitectureSummary/Porter Page� of� 294 323

Page 295: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Foraregionofsize4bytes,thelast2bitsmustbe0,sotheyarenotrepresented.All32bitsintheaddressregisterareusedforbits[33:2]oftheaddress,withbits[1:0]implicitlybeing00.

Foraregionofsize8bytes,theaddressregisterwillcontain:

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0

Herethereare31signi2icantbits.Addingintheimplicitbits000(sincetheregionis8-bytesinsize),gives34bitsofstartingaddress.

Foraregionofsize16bytes,theaddressregisterwillcontain:

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX01

Herethereare30signi2icantbits.Addingintheimplicitbits0000(sincetheregionis16-bytesinsize),gives34bitsofstartingaddress.

Andsoonforlargerregions.ForRV64,thissameshiftingby2bitsalsooccurs.Allphysicaladdressesarelimitedto56bits,soonlythelowerorder54bitsintheaddressregistersareused.

Thereisa2inalwayofdescribingaregion’sstartingpointandsize.Ifthe“A”2ieldis01(“top-of-range”)thentheaddressregisterisinterpreteddifferently.Thisencodingschemecanbeusedwhentheregionsarecontiguousandeachregionfollowsdirectlyafterthepreviousregion.Inthisencoding,theaddressregisterdoesnotspecifythestartoftheregion,buttheendingaddressoftheregion.(Actually,itisonepastthelastaddress.)Withthiscodeforthe“A”2ield,thestartingaddressisgivenbythepreviousaddressregister(oraddress0forthe2irsttableentry).

Whenthe“top-of-range”encodingisused,theshiftingby2bitsstilloccurs.Thus,allphysicaladdressesare34bitsandeachregionmustbeginona4bytealignedaddress.

Thereasonthat34-bitand56-bitphysicaladdressesarespeci2iedisthatthepagingsystemsusethesesizes.Inparticular,theSv32pagingschemeuses34-bitphysicaladdressesandtheSv39andSv48pagingschemesuse56-bitphysicaladdresses.

Wheneveranaccesstophysicalmemoryisattempted,theprocessorwill2irstconsulttheprotectiontable.Thelowestnumberedentrythatmatchestheaddress

RISC-VArchitectureSummary/Porter Page� of� 295 323

Page 296: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

willbeusedtodetermineiftheaccessisallowed.Iftheaccessisnotallowed,thenasynchronousexceptionwillberaised(i.e.,“loadaccessexception”,“storeaccessexception”,or“instructionaccessexception”).Seethespecforcompletedetailsonwhichentryisselected.

VirtualMemory

VirtualMemory(i.e.,pagetablesandvirtual-to-physicaladdresstranslation)ishandledinSupervisorMode.

ThereareseveralvirtualmemoryschemesdescribedintheRISC-Vspec,called“bare”,“Sv32”,“Sv39”,and“Sv48”.The“bare”optionmeansnovirtualaddresstranslationoccurs.

A32bitmachine(RV32)maysupporta2levelpagetable(butnot3levelsor4levels).A64bitmachine(RV64)maysupporta3leveltableora4leveltable(butnota2leveltable).

PageTable Virtual Physical Depth AddressSpace AddressSpace

RV32: Bare Sv32 2levels 32bits,4GiBytes 34bits,16GiBytes

RV64: Bare Sv39 3levels 39bits,512GiBytes 56bits,64PiBytesSv48 4levels 48bits,256TiBytes 56bits,64PiBytesSv57 …tobedeZinedlater… Sv64 …tobedeZinedlater…

Thesizeofthephysicaladdressesisimplementationde2ined.Thenumbersgivenabovearethemaximums,butinaparticularimplementationthephysicaladdressspacewilloftenbesmaller.

Regardlessofwhichvirtualmemoryschemeisused,thepagesizeis4,096(4KiBytes).Thus,byteoffsetsintoapagerequire12bits,since212=4,096.Allpagesmustbealignedonapageboundary,sothelower12bitsofanypageaddressarealways000000000000.

RISC-VArchitectureSummary/Porter Page� of� 296 323

Page 297: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

WhatisaPageTable?Apagetableisatreewhereeverynodeisapage.Thesepagesareeachkeptinmainmemoryandaddressedwithphysicaladdresses.

Thepagesattheleafsarecalled“datapages”andcontaintheactualbytesinthevirtualaddressspace.

Theinteriornodesconstitutethepagetableitselfandallowthedatapagestobelocated.Therootpageplustheinteriornodesconstitutethepagetableandarenotvisibletotheuserprocess.Thedatapagesconstitutethevirtualaddressspace(oratleastthepartofitthatispopulated)andcontainthedatabytesthattheuserprocesswillaccess.

Eachinteriorpagecontainsanarrayof“PageTableEntries”(PTEs).EachPTEpointstoachildnode.SomePTEsareconsider“leafPTEs”andpointdirectlytoadatapage.OtherPTEspointtolowerlevelsinthepagetabletree.

IntheSv32addressingscheme,thepagetableisonly2levelsdeep.PageTableEntriesare4byteslong.Therefore,wehave:

1rootpage,containing1,024PTEs 1,024interiorpages,with1,024PTEseach 1,024×1,024datapages,of4,096byteseach

Thus,avirtualaddressspacecontains4GiBytes.

Theterm“pagetable”isoftenusedimpreciselytomeaneitherasingle4,096byteinteriornodeinthetree,ortheentirecollectionofinteriorpagesmakingupthewholetree,notincludingthedatapages.

Thesatp(SupervisorAddressandTranslationRegister)istheCSRthatcontrolsaddresstranslation.

Thesatpregistercontainsthree2ields:

MODE Isaddresstranslationturnedonornot? ASID AddressSpaceIdenti2ier PPN PhysicalPageNumber,i.e.,addressofpagetable’srootpage

ForRV32systems,thesatpregisteris32bits,arrangedasfollows:

RISC-VArchitectureSummary/Porter Page� of� 297 323

Page 298: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

ForRV64systems,thesatpregisteris64bits,arrangedasfollows:

TheMODE2ieldindicateswhethervirtualmemoryaddresstranslationisturnedonornotand,ifso,whichpagingtableschemeisbeingused.

ForRV32systems,theMODE2ieldcanhavethesevalues:

0 addresstranslationturnedoff(i.e.,“bare”mode)1 Sv32addresstranslationisturnedon

ForRV64systems,themode2ieldcanhavethesevalues:

0000 addresstranslationturnedoff(i.e.,“bare”mode)1000 Sv39addresstranslationisturnedon1001 Sv48addresstranslationisturnedonother unused/reservedforSv57,Sv64

The“bare”modeindicatesthatthereisnovirtual-to-physicaltranslation.Alladdressesarephysical.Regardlessofmode,theenforcementofphysicalmemoryattributes(i.e.,usingthe“pmp…”registers,describedelsewhere)isstilloperative.

RISC-VArchitectureSummary/Porter Page� of� 298 323

Page 299: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

ThePhysicalPageNumber(PPN)2ieldcontainsthephysicaladdressoftherootpageofthepagetabletree.

Sinceallpagesmustbealignedona4,096byteboundary,thelast12bitsoftheaddressofanypagemustbe0.ForRV32,thePhysicalPageNumber(PPN)2ieldis22bitswide,whichissuf2icienttoaddressanypageina34bitphysicaladdressspace.ForRV64,thePPN2ieldis44bitswide,whichissuf2icienttoaddressanypageina56bitphysicaladdressspace.

Eachuserprocesswillpresumablyruninadistinctaddressspace.EachaddressspaceisgivenanAddressSpaceIdenti2ier(ASID).ForRV32machines,theASIDis9bits(allowingfor29=512differentaddressspaces).ForRV64machines,theASIDis16bits(allowingfor216=65,536differentaddressspaces).

ThesatpregistercontainstheASIDofthecurrentlyexecutingprocess.

OverviewofTLBs:Tomakeaddresstranslationfastenoughforvirtualmemorytobefeasible,pagetableentriesmustbecachedinan“addresstranslationcache”.Suchacacheistraditionallycalleda“TranslationLookasideBuffer”,andsoisabbreviatedas“TLB”.

TheTLBwillcontainasmallnumberofpagetableentries.WhenaFETCH,LOADorSTOREtomemoryoccurs,theaddresstranslationhardwarewill2irstchecktheaddresstranslationcache(TLB).IftheTLBcontainsamatchingentry,theaddresstranslationhardwarewilluseittogenerateaphysicaladdressimmediately,whichismuchfasterbecauseweavoidgoingtomainmemorytoreadthepagetabletreetolocatethedatapage.

TheTLBwhichisorganizedasanassociativememory,keyedonvirtualpagenumber.IfaTLBentryispresent,thentheentrywillsupplythephysicalpagenumber.Iftheentryisabsent,thentheaddresstranslationprocessmustgotomemorytofetchtheappropriateentryfromthepagetabletree.SomeTLBentrywillbeevictedandthenewentrywillbeplacedintheTLB.Theaddresstranslationwillthenproceed.

UpdatingtheTLB,whichincludeswalkingthein-memorypagetabletreeto2indtheappropriateentryandwritingtheevictedentrybacktomemory,isacomplexandtime-consumingtask.Forfasterperformance,TLBupdatingandpage-tablewalkingareperformedinhardwareonmanysystems.However,insimpler

RISC-VArchitectureSummary/Porter Page� of� 299 323

Page 300: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

implementations,walkingthepagetableandupdatingtheTLBareperformedbysoftware.

Thewaysoftwareupdatingworksisasfollows:WhenthereisnovalidentryintheTLBduringanaddresstranslation,anexceptionwillbesignaled.Thehardwareneverattemptstowalkthein-memorypagetables.Theexceptionishandledbyatraphandler(whichmayberunninginMachineMode).Thetraphandlerwillessentiallyemulatethemissinghardwareinsoftwareandwillwalkthein-memorypagetableandupdateoftheTLBdirectly.Howthisoccursisasoftwareissue,notcoveredintheRISC-Vspec.

Fromtimetotimecontextswitcheswilloccurandthecurrentlyexecutingprocesswillchange.Sinceeachprocesscanoccupyadifferentaddressspace,theentriesintheTLBmaynotbeappropriateforthenewlyexecutingprocess.ThepurposeoftheAddressSpaceIDistomakesurethataprocessonlyusespagetableentriesthatapplytoit.EachentryintheTLBwillbetaggedwithanASID;onlyentriesthatmatchthecurrentlyexecutingprocess’sASID(asstoredinthesatpregister)willbeused.

ThespecdoesnotrequireASIDstobeimplementedand,iftheyareimplemented,thefullrangeofvaluesneednotbesupported.

Thespecmentionsthatwritingtothesatpregistermayrequirecarefulconcurrencycontrol.ItcouldbethataddresstranslationsusingolderpagetablesoperateconcurrentlyandanupdatetosatpmayrequiretheSFENCE.VMAinstructioninordertoensurecorrectnessofcode.

Theideaisthat,aslongastherearenottoomanydistinctaddressspacesandtheASID2ieldislargeenough,thentheASIDmechanismwillpreventoneprocessfromusingpagetableentriesfromthewrongaddressspace.Butwithalargernumberofaddressspaces,itmaybenecessaryto2lushtheaddress-translationcache,i.e.,to2lushtheTLB.Wemayalsoneedto2lushtheTLBwhenchangesaremadetoarunningprocess’saddressspace,i.e.,toapagetablethatiscurrentlyinuse.Forexample,ifweremoveadatapagefromtheaddressspace,itisnotsuf2icienttosimplyinvalidatetherelevantPageTableEntryinthepagetablenodestoredinmemory.WealsoneedtoensurethattherearenoentriesstillsittingintheTLB.

Comment:NotethatbystoringtheMode,ASID,andPhysicalPageNumberoftherootofthepagetableinasingleregister,itallowsallthreetobeupdatedatomicallywithasingleCSRinstruction.

RISC-VArchitectureSummary/Porter Page� of� 300 323

Page 301: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

SFENCE.VMA:SupervisorFenceforVirtualMemory

GeneralForm:SFENCE.VMA vaddr,asid

Example:SFENCE.VMA x5,x8

Comment:Thisinstructionisusedtoimposeorderonaccessestomemoryandupdatestopagetables.Inparticular,itisdesignedto2lushtheaddresstranslationcache(TLB).

WhenevervirtualmemoryisturnedonandaFETCH,LOADorSTOREtomemoryismade,addresstranslationmustoccursotherewillalsobeimplicitREADsandWRITEstothepagetablesstoredinmemory.Normally,theaccesstothepagetableinmemorycanbeavoidedandtheaccesswillbemadetotheaddresstranslation(TLB)cacheinstead.Butwheneverchangestothepagetablesinmemoryaremade,weneedto2lushobsoleteentriesintheTLB.Thisisthepurposeofthisinstruction.

WhentheOSupdatesapagetablenode,itwillissueaWRITEtomemory.InordertomakesurethataWRITEtothepagetableisseenbeforeallsubsequentmemoryaccesses,thisinstructionmustbeexecuted.ItensuresthatallWRITEstomemorythatoccurearlierintheinstructionstreamwillbecompletedandvisiblebeforeanyimplicitREADsorWRITEStothepagetabletriggeredbysubsequentoperationsintheinstructionstream.SincetheTLBiscachingpagetableentries,thisinstructionmustalsoinvalidateaffectedentries,forcingthememorymanagementunittogobacktomemorytoretrievecurrentpagetableentries.

[ItwouldseemnecessarytoalsoforceallearlieraddresstranslationstocompletebeforetheWRITEtothepagetablesoccurs,butthespecdoesnotmentionthisorderingconstraint.???]

Toimposea2inerlevelofgranularity,thisinstructioncanspecifyaspeci2icvirtualaddressand/oraspeci2icaddressspace.Reg1containsavirtualaddress;Reg2containsanAddressSpaceID.

IftheASIDisspeci2iedasx0,thenentriesforalladdressspacesareaffected;otherwise,onlyentriesforthegivenaddressspaceareaffected.

RISC-VArchitectureSummary/Porter Page� of� 301 323

Page 302: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Ifthevirtualaddressisspeci2iedasx0,thenalllevelsofthepagetableareaffected;otherwise,onlytherelevantleafPTEinthepagetableisaffected.

IfboththeASIDandthevirtualaddressaregivenasx0,thenitcausesacomplete2lushoftheTLB.SimpleimplementationsarefreetoalwaysignoretheASIDandvirtualaddressspaceandperformaglobalTLB2lushwheneverthisinstructionisexecuted.

Itispossiblethatseveralcoresaresharingmemoryandthattwothreadsrunningondifferentcoresaresharingasingleaddressspaceandasinglepagetabledatastructureinmemory.Ideally,anyupdatetothispagetableshouldbeseenbyallthreadsonallcores.However,thisinstructiononlyaffectsasinglecore.Tosynchronizewithotherthreadsonothercores,theOSmustuseotherinstructions.

Encoding: ThisisavariantofanR-typeinstruction,wheretheRegD2ieldisunused.

Sv32(Two-LevelPageTables)

IntheSv32addresstranslationmode,avirtualaddressis32bits.Thisispartitionedintoa20-bitvirtualpagenumber(VPN)anda12bitoffset.The20-bitVirtualPageNumberistranslatedbythememorymanagementunitintoa22-bitPhysicalPageNumber.ThePhysicalPageNumberisthenconcatenatedwiththeoffsettogivethe34-bitphysicaladdress.

RISC-VArchitectureSummary/Porter Page� of� 302 323

Page 303: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Thesizeofeachpageis4,096bytes,andeachbytewithinapagecanbeaddressedby12bits,since212=4,096.Allpagesmustbeproperlyalignedona4,096byteboundary.

EachPageTableEntry(PTE)is4bytes.Thus,eachpagecanhold1,024entries.Withinapage,eachentrycanbeaddressedbya10-bitaddress,since210=1,024.

Thepagetableisorganizedasa2-leveltree,whereallnodesareinterior(i.e.,non-data)pages.Therootpageisatthetopofthetreeandcontains1,024pagetableentries.Eachentrycontainsapointertoasecondlevelpage.Thesecondlevelpagescontainpagetableentrieswhichpointtothedatapages.Thesecondlevelpagesarereferredtoas“leaf”pages;thedatapagescanbeconsideredtoliebelowtheleafpages.

APageTableEntry(PTE)is4bytesandcontainsthefollowing2ields:

FieldSize Meaning 22 PhysicalPageNumber 2 RSW(Unde2ined;ReservedforSoftware) 1 D(Dirty) 1 A(Accessed) 1 G(Globalmapping) 1 U(Useraccessible) 1 V(Valid) 3 XWR(Executable,Writable,Readable)

TheVirtualPageNumber,whichis20bits,isdividedintotwo2ieldsof10bitseach,calledVPN[1]andVPN[0].

RISC-VArchitectureSummary/Porter Page� of� 303 323

Page 304: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Theuppermost10bits(VPN[1])isusedtoselectanentryintherootpage.ThisentrycontainsaPhysicalPageNumber(PPN)2ield,whichaddressesasecondlevelpage.

Thesecond10bitsoftheVirtualPageNumber(VPN[0])isusedtoselectanentryinthesecondlevelpage.Thisyieldsthe2inalpagetableentry(calleda“leaf”pagetableentry),whichcontainsthePhysicalPageNumberofthepagecontainingtheactualdata.

Thisdiagramshowsthis.

Thereisasinglerootpageandupto1,024secondlevelpages.Eachpageinthepagetable(thatis,therootpageandallsecondlevelpages)contains1,024PageTableEntries.Theentriesatthebottomofthetreearecall“leaf”entriesandpointtothedatapages.Leafentriesaredistinguishedfromnon-leafentriesbytheXWR(Execute-Write-Read)permissionbits.IfXWR=000,thentheentryisnotaleafentry.

RISC-VArchitectureSummary/Porter Page� of� 304 323

Page 305: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Thereisafacilityfor“megapages”.Amegapageisadatapagecontaining4MiBytes.Anoffsetintoamegapagewillbe22bits,since222=4×1,024×1,024=4,194,304.Amegapagemustbealignedtoa4MiByteboundary.

TheXWRbitsinapagetableentryindicatewhetherthatentryisaleafentry(meaningthattheentrypointstoadatapage)oranon-leafentry(meaningthattheentrypointstoanotherpageinthepagetabletree).IftheXWRbitsare000,thentheentryisanon-leafentry,pointingtothenextlevelinthepagetabletree.IftheXWRbitsarenot000,thentheyindicatethattheentrypointstoadatapageandtheygivethe“execute”,“read”,and“write”permissionsforthedatapage.

Inthecaseofmegapages,thereisnosecondlevelpageinthetree.Instead,thePageTableEntryintherootpagepointsdirectlytoadatapage,i.e.,amegapage.

RISC-VArchitectureSummary/Porter Page� of� 305 323

Page 306: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Thereareacoupleofbene2itsofplacingmegapagesinavirtualaddressspacethatmayalsocontainnormal4,096bytedatapages.

First,wheneverdatafromapageis2irstreferenced,theMemoryManagementUnitmustwalkthepagetableto2indtheappropriateleafpagetableentry.Withmegapages,onlytherootpagetablemustbeconsulted,sincethereisnosecondlevelpageinvolved.Thisreducesthenumberofmemoryaccessesbyone.(ThiscouldmeansubstantialsavingsifthePTEneedstobere-loadedfrequently,duetoTLBcontentionand/orneedstobeupdatedinmemoryuponTLB2lushing).

Thesecondadvantageisthatasingleentryintheaddresstranslationcache(theTLB)willcover4MiBytes,insteadof4KiBytes.ThismeansthatfewerentriesintheTLBwillbeneededforeachaddressspace.TheTLBisasigni2icantbottleneckandreducingcontentioniscritical.

[Forexample,asimpleaddressspacemayhaveoneblockofmemorymarked“read/execute”(fortheprogramandconstants)andasecondblockmarked“read/write”(forstaticvariablesandthestack).Sinceneitherofthesewillnormallyexceed4MiBytes,only2entriesintheTLBwouldbeneededfortheentireaddressspace.TLBentriesarepreciousandfewinnumber,sothisreducespressureontheTLB.]

Howevermegapagescomewithdrawbacks.First,theywilloftenbelargerthannecessary.(Thisiscalledthe“internalfragmentationproblem”.)Theextraspaceinthemegapagesrepresentsrealphysicalmemoryand,ifnotneededbytheprocess,thismemoryiseffectivelywasted.Forexample,ifanaddressspaceonlyrequires64

RISC-VArchitectureSummary/Porter Page� of� 306 323

Page 307: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

KiBytes,butisallocatedtwomegapages,thenapproximately99%ofthememoryiswasted,since222–216≈222.Ifconcurrentprocessesaredoinglotsofcomplexdatasharingbysharingvirtualmemorypagesthisfragmentationproblemcanbecomeworse.

Second,theOSkernelmustbeabletoallocatelargeblocksofcontiguousmemory.Thisisnotaproblemifthememoryisdividedintoacollectionofequalsizedpages.Butiftherearemultiplepagesizes(e.g.,1,024and4,194,304)thenthingsgettrickier.Afterall,thisisessentiallytheproblemthatvirtualmemorywasdesignedtoaddress.(Thisiscalledthe“externalfragmentationproblem”.)

Next,welookatthemeaningofthebitsinthePageTableEntry(PTE).

TheValidbit(V)indicateswhetherthisPTEisvalidornot(1=valid,0=invalid).IfaninvalidPTEisencounteredbytheMemoryManagementUnit,thenanaccessfaultissignaled.(Theexceptionwillbeeithera“instructionaccessexception”,“readaccessexception”,or“storeaccessexception”asappropriate).IfthePTEisinvalid,theremainingbitsareignoredandcanbeusedbysoftwareasdesired.

TheUserAccessiblebit(U)controlswhetherthedatacanbeaccessedinSupervisorModeorUserMode.Typically,aprocess’saddressspacewillbedividedintotwoparts,sometimescalledthe“bottomhalf”andthe“tophalf”.ThebottomhalfcanonlybeaccessbycoderunninginUserMode;datainthetophalfcanonlybeaccessedwhenrunninginSupervisorMode.U=1meansbottomhalf;U=0meantophalf.IfUserModecodeattemptstoaccessdatainthetophalf,anaccessfaultwilloccur.IfSupervisorModecodeattemptstoaccessdatainthebottomhalf,anaccessfaultwilloccur.

However,theSUMbitintheSupervisorStatusword(sstatus)canbeusedtoallowSupervisorModecodetoaccesspagesinthebottomhalf.Normally,SupervisorcodewillrunwithSUM=0,inwhichcaseanaccessfaultwilloccur.Butoccasionally(e.g.,whendataispassedtotheOSinakernelcall)SupervisorCodewillsetSUM=1forashorttime.Duringthistime,LOADandSTOREinstructionswilloperateasifrunninginUserMode.Thisallowsthekernelroutinestoretrieveorstoreparametersinthebottomhalf(i.e.,intheuserportionofthevirtualaddressspace).

TheUbitisonlymeaningfulforleafPTEs;fornon-leafPTEs,thisbitwillbe0.

TheGlobalMappingbit(G)concernspagesthataresharedamongalladdressspaces.G=1meansgloballysharedbyalladdressspaces;G=0meanslocaltoasingle

RISC-VArchitectureSummary/Porter Page� of� 307 323

Page 308: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

addressspaceandnotshared.Ifapageisgloballyshared,itmeansthatitismappedintoalladdressspaces.Inotherwords,theAddressSpaceID(ASID)checkingissuppressed.Forexample,asetofcommonlyused,read-onlyfunctionsanddatacanbemappedintoalladdressspaces,makingtheseroutinesavailableatessentiallynocosttoalladdressspaces.Thesecouldbeuser-levelfunctionspresentinall“bottomhalves”orcouldbeSupervisorModecodethatisplacedintoall“tophalves”.

Thespecindicatesthatthisbitismeaningfulforbothleafandnon-leafPTEs.Whensetinanon-leafPTE,itmeanstheentirepagetabletreebelowisglobalandsharedbyalladdressspaces.

[???ItisnotclearwhytheGbitisspeci2iedasbeingmeaningfulinnon-leafPTEs.IfaleafPTEthatismarked“global”isstoredintheTLB,thenthatTLBentry(andthedatapageitrefersto)isshared.ButifthereisnoleafPTEintheTLB,thentheMemoryManagementUnitwillneedtowalkthepagetabletreetolocatetheleafPTE.Duringthiswalk,itwouldnotseemtomakeanydifferencewhethersomepagetablepageissharedornot;eachpageofthepagetabletreestillhastobeaccessed.PerhapstheRISC-Vspecenvisionsputtingnon-leafPTEsintheTLB.???]

TheAccessedbit(A)issetbytheMemoryManagementUnitwheneverabyteonthedatapageisread,written,orfetchedforexecution.Insometextbooksthisbitiscalledthe“referenced”bit.ThisbitisonlymeaningfulforleafPTEs;fornon-leafPTEs,thisbitwillbe0.

TheDirtybit(D)issetbytheMemoryManagementUnitwheneverabyteonthedatapageiswrittento.ThisbitisonlymeaningfulforleafPTEs;fornon-leafPTEs,thisbitwillbe0.

DetailsonUpdatingaPageTableEntry:WejustsaidthattheAccessedBitandtheDirtyBitaresetbythehardwarewheneveranybyteinthedatapageisaccessedorupdated.(ThesearetheonlybitsinthePageTableEntry(PTE)thatwouldeverbemodi2iedbyhardware,butanymodi2icationwillrequirethatthisPTE,whenevictedfromtheTLBinthefuture,mustbewrittenbacktothein-memorypagetable.)

HowevertheRISC-Vspecalsoallowsthehardwaretoworkasecondway.Inthisalternativeapproach,apagetableentryisnevermodi2iedbythehardware.Thissimpli2iesthehardwaresincePTEsneedneverbewrittenbacktomemory.Instead,the“A”and“D”bitsaremerelychecked.Iftheyarenotalreadyset

RISC-VArchitectureSummary/Porter Page� of� 308 323

Page 309: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

correctly,anexceptionissignaled.Thensoftwareinthetraphandlercantakeactionstowritetheupdatedpagetableentrybacktomemory.

Insomecases,the“A”and“D”bitsmaynotbeneeded.Forexample,ifsomedatapageisalwaysresidentinmemoryandneverswappedouttobackingstore,thenthe“A”and“D”bitscanbeignored.Oralso,ifamemory-mappedI/Odeviceismappedintosomevirtualaddressspace,thenthe“A”and“D”bitscanbeignored.Insuchcases,thebitscanbepresetto“1”toobviatetheneedforupdating.

TheExecutable(X),Writable(W),andReadable(R)bitsindicatewhetherthisPTEisaleafornon-leafentryand,forleafentries,theyindicatetheaccesspermissionsforthedataonthedatapage.IfXWR=000,thenthisPTEisnotaleafentry;thePhysicalPageNumberintheentrypointstothenextlower-levelpageinthepagetabletree.

TheRSWbitsarenotusedbythehardwareandarefreelyavailabletosoftwaretobeusedasdesired.

TheSv32PageTableAlgorithm

Wheneveranaccess(read,write,orfetch-for-execution)occurs,thefollowingstepsaretakenbytheMemoryManagementUnit,tomapavirtualaddressintoaphysicaladdress.

(TheRISC-VspecgivesthisalgorithminageneralformapplicabletoSv32,Sv39,andSv48,i.e.,two-,three-,andfour-levelpagetables.Inordertomakeiteasiertounderstand,theversiongivenheresimpli2iesandspecializesthealgorithmtotwo-leveltables.)

Inputs: va–thevirtualaddresstobetranslated(32bits),withthefollowingparts: va.VPN[1]–theuppermost10bits;offsetintothetoplevelpagetable va.VPN[0]–thefollowing10bits;offsetintothesecondlevelpagetable va.OFFSET–thelower12bits;offsetintothedatapage Thedesiredaccesstype(READ,WRITE,orFETCH) Thecurrentprivilegemode(UorS;noaddresstranslationisdoneinMmode) satp–theSupervisorAddressTranslationandProtectionCSR satp.PPN–thepagenumberoftherootpage.

RISC-VArchitectureSummary/Porter Page� of� 309 323

Page 310: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

status.SUM–the“PermitSupervisorUserMemoryAccess”bitinthestatusreg status.MXR–the“MakeExecutableReadable”bitinthestatusreg

Outputs: pa–physicaladdress(34bits) or accessexception–signalapagefaultexceptionandaborttranslation

Tempvariables: a–pointerto(i.e.,physicaladdressof)apageinmemory p–pointerto(i.e.,physicaladdressof)aPTEinmemory pte–aPageTableEntry,asfetchedfrommemory

Algorithm: Compute“a”,theaddressofroottop-levelpageinthepagetable a!satp.PPN||000000000000 Compute“p”,theaddressofthePTEinthetoplevelpage p!a+(va.VPN[1]||00) Readfrommemory“pte”,aPageTableEntryfromtheroot-levelpage pte!memory[p] IfthePhysicalMemoryProtection(PMP)systemindicatesproblems thensignalanexception IfthisPTEisnotvalid(i.e.,pte.V=0orpte.XWRisaninvalidvalue) thensignalanexception Ifpte.XWR≠000then… /*ThisPTEisaleafPTE,pointingtoamegapage.*/ Ifpte.PPN[0]≠0000000000,thensignalanexception Computethephysicaladdress pa!pte.PPN[1]||va.VPN[0]||va.OFFSET Else(i.e.,pte.XWR=000)… /*WedonothavealeafPTE,solookatthesecondlevel.*/ Compute“a”,theaddressofthesecondlevelpagetablepage a!pte.PPN[1]||pte.PPN[0]||000000000000 Compute“p”,theaddressofthePTEinthesecondlevelpage p!a+(va.VPN[0]||00) Readfrommemory“pte”,aPageTableEntryfromthesecondlevelpage pte!memory[p] IfthePhysicalMemoryProtection(PMP)systemindicatesproblems thensignalanexception IfthisPTEisnotvalid(i.e.,pte.V=0orpte.XWRisaninvalidvalue)

RISC-VArchitectureSummary/Porter Page� of� 310 323

Page 311: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

thensignalanexception Ifpte.XWR=000,thenwedonothavealeafPTE… thensignalanexception Computethephysicaladdress pa!pte.PPN[1]||pte.PPN[0]||va.OFFSET /*“pte”containstheleafPageTableEntry.*/ MakesurethisaccessisallowedbycheckingtheX,W,R,andUbits…. Ifpte.R≠1,thensignalanexception Ifpte.W≠1andthisisaWRITE,thensignalanexception Ifpte.X≠1andthisisaFETCH,thensignalanexception Ifthecurrentmodeis“User”… Ifpte.U=0,thensignalanexception Ifthecurrentmodeis“Supervisor”… IfthisisaREAD&pte.R≠1&status.MXR≠1,thensignalanexception Ifpte.U=1andstatus.SUM=0,thensignalanexception Checkthe“accessed”bit… Ifpte.A≠1,thensetit Checkthe“dirty”bit… IfthisisaWRITEandpte.D≠1,thensetit Ifthepte.Aorpte.Dbitswerechanged… Either •Signalanexception(andletsoftwareupdatethepagetable) •WritethePageTableEntrybacktomemory. Thismustbeatomicwithrespecttotheearlierreadingofthepte. IfthePMPsystemindicatesproblems,thensignalanexception

Sv39(Three-LevelPageTables)

BoththeSv39andSv48virtualaddressingschemesarenaturalextensionsoftheSv32scheme.Theprimarydifferenceisthattheysupportlargervirtualaddressspacesandpagetablesthatare3-levels(forSv39)and4-levels(forSv48),insteadof2-levels.

IfyouunderstandSv32then–forallintentsandpurposes–youalreadyunderstandSv39andSv48.

WithSv39,virtualaddressesare39bitsandphysicaladdressesare56bits.

RISC-VArchitectureSummary/Porter Page� of� 311 323

Page 312: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

WecansummarizetheSv39virtualaddressingschemeasfollows:Sv39isidenticaltoSv32,withtheseexceptions:

•ItmayonlybeimplementedonRV64systems,notonRV32systems. •Virtualaddressesare39bits. •Themaximumvirtualaddressspaceis239=512GiBytes. •Thephysicaladdressesare56bits. •Thepagesizeisunchanged(i.e.,4,096bytes). •Thepagetableshave3levels,insteadof2levels. •ThePageTableEntriesare64bits,insteadof32bits. •Eachpageofthepagetablecontains512PTEs,insteadof1,204PTEs. •Offsetsintothepagetablesare9bits(insteadof10bits)since29=512. •TheRSW,D,A,G,U,X,W,R,andVbitsworkidentically. •Megapagesare2MiBytes,insteadof4MiBytes. •“Gigapages”of1GiBytearealsosupported.

WithSv39,thepagetablewillhavethreelevels:

RISC-VArchitectureSummary/Porter Page� of� 312 323

Page 313: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

Ona64-bitmachine,addressesareinitially64bitslong,sincethisistheregisterlength.ForSv39,onlytheleastsigni2icant39bitsareusedtoaddressintothevirtualaddressspace.Theremaining25bitsmustbethesign-extension(notzero-2illed)oftheleastsigni2icantbits.Otherwise,anexceptionwillbesignaled.

Sv48(Four-LevelPageTables)

WithSv48,virtualaddressesare48bitsandphysicaladdressesare56bits.

RISC-VArchitectureSummary/Porter Page� of� 313 323

Page 314: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

WecansummarizetheSv48virtualaddressingschemeasfollows:Sv48isidenticaltoSv32andSv39,withtheseexceptions:

•ItmayonlybeimplementedonRV64systems,notonRV32systems. •IfSv48issupported,thenSv39mustalsobesupported. •Virtualaddressesare48bits. •Themaximumvirtualaddressspaceis248=256TiBytes. •Thephysicaladdressesare56bits. •Thepagesizeisunchanged(i.e.,4,096bytes). •Thepagetableshave4levels. •ThePageTableEntriesare64bits,thesameasSv39. •Eachpageofthepagetablecontains512PTEs,thesameasSv39. •Offsetsintothepagetablesare9bits,thesameasSv39. •TheRSW,D,A,G,U,X,W,R,andVbitsworkidentically. •Megapagesare2MiBytes,thesameasSv39. •Gigapagesare1GiByte,thesameasSv39. •“Terapages”of512GiBytesarealsosupported.

Ona64-bitmachine,addressesareinitially64bitslong,sincethisistheregisterlength.ForSv48,onlytheleastsigni2icant48bitsareusedtoaddressintothevirtual

RISC-VArchitectureSummary/Porter Page� of� 314 323

Page 315: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter9:VirtualMemory

addressspace.Theremaining16bitsmustbethesign-extension(notzero-2illed)oftheleastsigni2icantbits.Otherwise,anexceptionwillbesignaled.

RISC-VArchitectureSummary/Porter Page� of� 315 323

Page 316: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter10:WhatisNotCovered

Besidestheextensionswehavealreadydescribed,theRISC-Vspecmentionsseveraladditionalextensionswhichmayormaynotbeimplemented.Thespeci2icationsfortheseextensionsareinastateof2lux,sowewillsimplymentiontheirexistencehere.

TheDecimalFloatingPointExtension(“L”)

Mostprogrammersarefamiliarwithbinary2loatingpoint,butthereisalsosuchathingasarepresentationof2loatingpointnumbersusingadecimalbase.TheRISC-Vspeccontainsnothingspeci2icforthisextensionandonlymentionsitasfuturework.

TheBitManipulationExtension(“B”)

TheRISC-Vspeccontainsnothingspeci2icforthisextensionandonlymentionsitasfuturework.

TheDynamicallyTranslatedLanguagesExtension(“J”)

Highlevellanguagesoftenrequiredynamictypingchecking,garbagecollectionanddynamic(just-in-time)compiling,soincludingspecializedinstructionstosupportthesecanimproveperformance.However,theRISC-Vspeccontainsnothingspeci2icforthisextensionandonlymentionsitasfuturework.

TheTransactionalMemoryExtension(“T”)

RISC-VArchitectureSummary/Porter Page� of�316 323

Page 317: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter10:WhatisNotCovered

TheRISC-Vspeccontainsnothingspeci2icforthisextensionandonlymentionsitasfuturework.

ThePackedSIMDExtension(“P”)

RecallthatSIMDstandsfor“singleinstruction,multipledata”.Thegoalistoperformalargenumberof2loatingpointoperationssimultaneously.Thereisdataparallelism,butnotfullconcurrencysincethesameoperationisperformedonacollectionofvalues.Asingleoperation(suchas“multiply”)isperformedinparallelonNvalues,followedbythenextoperationafterallNmultiplicationshavecompleted.

Theideawiththisextensionisthateach2loatingpointregisterwillholdmorethanonevalue.Forthisextension,the2loatingpointregisterswilllikelybelargerthan32or64bits.Forexample,inoneimplementationeach2loatingpointregistermightbe1,024bitswide.Thuseachregistercanholdmorethanonevalue.Since16×64=1,024,eachregistercanhold16doubleprecisionvaluesatonce.Whenanarithmeticinstruction(suchas“add”or“multiply”)isexecuted,theoperationisperformedonall16valuessimultaneously.

Thestatusofthisextensionisunsettled.Itmaybedroppedaltogetherinfavorofthe“V”VectorSIMDExtension.

TheVectorSIMDExtension(“V”)

The“V”VectorSIMDextensionisdesignedtoaccommodatelargevectorsandSIMDoperation.Itincludesasetof32vectorregisters(v0,v1,…v31),anumberofnewcon2igurationCSRregisters,andadditionalinstructions.TheRISC-Vspeci2icationisdif2iculttounderstand.Furthermore,thisextensionissubjecttorevisionandisnotyet“frozen”.

PerformanceMonitoring

RISC-VArchitectureSummary/Porter Page� of� 317 323

Page 318: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

Chapter10:WhatisNotCovered

RISC-Vde2inesacollectionofCSRregistersdevotedtomeasuringhardwareperformance:

mhpmcounter3 mhpmcounter4 … mhpmcounter31

mhpmevent3 mhpmevent4 … mhpmevent31

Thismechanismisnotdescribedinthespec,beyondsayingthatthe“events”tobecountedcanbeselectedbysettingmhpmevent3,mhpmevent4,….

Debug/TraceMode

TheRISC-Vspecmentionsanothermodecalled“DebugMode”andseveralrelatedControlandStatusRegisters(CSRs)namedtselect,tdata1,tdata2,andtdata3,butthespecdoesnotgiveanydetail.

RISC-VArchitectureSummary/Porter Page� of� 318 323

Page 319: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

AcronymListABI ApplicationBinaryInterfaceAEE ApplicationExecutionEnvironmentAMO AtomicMemoryOperationASID AddressSpaceIDCSR ControlandStatusRegistercycle CSR[C00]:Clockcyclecounter,UserModecycleh CSR[C80]:Upperhalfofcycle,UserMode(RV32only)dcsr CSR[7B0]:Debugcontrolandstatusdpc CSR[7B1]:DebugPCdscratch CSR[7B2]:ScratchregisterDZ DivideByZero(abitwithintheFloatingPointFlags,FFLAGS)fcsr CSR[003]:FloatingPointControlandStatusReg(frm||f2lags)f2lags CSR[001]:Floatingpointing2lagsFFLAGS FloatingPointFlags(i.e.,NX,UF,OF,DZ,NV)frm CSR[002]:DynamicroundingmodeFRM FloatingPointRoundingModeFS Fieldinstatusregister;statusof2loating(clean/dirty/…)HART HardwareThreadHBI HypervisorBinaryInterfaceHEE HypervisorExecutionEnvironmentHEIP hypervisorexternalinterrupthpmcounter3 CSR[C03]:Eventcounter#3,UserModehpmcounter31 CSR[C1F]:EventCounter#31,UserModehpmcounter31h CSR[C9F]:Upperhalfofcounter,UserMode(RV32only)hpmcounter3h CSR[C83]:Upperhalfofcounter,UserMode(RV32only)hpmcounter4 CSR[C04]:EventCounter#4,UserModehpmcounter4h CSR[C84]:Upperhalfofcounter,UserMode(RV32only)HSIP hypervisorsoftwareinterruptHTIP hypervisortimerinterruptinstret CSR[C02]:Numberofinstructionsretired,UserModeinstreth CSR[C82]:Upperhalfofinstret,UserMode(RV32only)IR InstructionRegisterISA InstructionSetArchitecture LR LinkRegister marchid CSR[F12]:ArchitectureID mcause CSR[342]:Trapcausecode,MachineMode mcounteren CSR[306]:Counterenable,MachineMode

RISC-VArchitectureSummary/Porter Page� of� 319 323

Page 320: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

AcronymList

mcycle CSR[B00]:Clockcyclecounter,MachineModemcycleh CSR[B80]:Upperhalfofcycle,MachineMode(RV32only) medeleg CSR[302]:Exceptiondelegationregister,MachineMode MEIP machineexternalinterrupt mepc CSR[341]:PreviousvalueofPC,MachineModemhartid CSR[F14]:HardwarethreadID mhpmcounter3 CSR[B03]:Eventcounter#3,MachineMode mhpmcounter31 CSR[B1F]:EventCounter#31,MachineMode mhpmcounter31h CSR[B9F]:Upperhalfofcounter,MachineMode(RV32only) mhpmcounter3h CSR[B83]:Upperhalfofcounter,MachineMode(RV32only) mhpmcounter4 CSR[B04]:EventCounter#4,MachineMode mhpmcounter4h CSR[B84]:Upperhalfofcounter,MachineMode(RV32only)mhpmevent3 CSR[323]:Eventselector#3,MachineModemhpmevent3 CSR[324]:Eventselector#4,MachineModemhpmevent31 CSR[33F]:Eventselector#31,MachineModemideleg CSR[303]:Interruptdelegationregister,MachineModemie CSR[304]:Interrupt-enableregister,MachineModeMIE Bitinstatusregister;MachineModeInterruptEnablemimpid CSR[F13]:ImplementationIDminstret CSR[B02]:Numberofinstructionsretired,MachineModeminstreth CSR[B82]:Upperhalfofinstret,MachineMode(RV32only)mip CSR[344]:Interruptpending,MachineModemisa CSR[301]:ISAandextensionsMMU MemoryManagementUnitMPIE Bitinstatusregister;MachineModePreviousInterruptEnableMPP Fieldinstatusregister;MachineMode–PreviousPrivilegeModeMPRV Bitinstatusregister;ModifyPrivilege mscratch CSR[340]:Tempregisterforuseinhandler,MachineModeMSIP machinesoftwareinterruptmstatus CSR[300]:Statusregister,MachineModeMTIP machinetimerinterruptmtval CSR[343]:Badaddressorbadinstruction,MachineModemtvec CSR[305]:Traphandlerbaseaddress,MachineModemvendorid CSR[F11]:VendorIDMXR Bitinstatusregister;MakeExecutableReadableNMI Non-maskableinterruptNSE NonstandardextensionNV InvalidOperation(abitwithintheFloatingPointFlags,FFLAGS)NX Inexact(abitwithintheFloatingPointFlags,FFLAGS)OF Over2low(abitwithintheFloatingPointFlags,FFLAGS)

RISC-VArchitectureSummary/Porter Page� of� 320 323

Page 321: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

AcronymList

PC ProgramCounterPMA PhysicalMemoryAttributePMP PhysicalMemoryProtectionUnitpmpaddr0 CSR[3B0]:PMPAddress#0pmpaddr1 CSR[3B1]:PMPAddress#1pmpaddr15 CSR[3BF]:PMPAddress#15pmpcfg0 CSR[3A0]:PMPCon2igurationword#0pmpcfg1 CSR[3A1]:PMPCon2igurationword#1pmpcfg2 CSR[3A2]:PMPCon2igurationword#2pmpcfg3 CSR[3A3]:PMPCon2igurationword#3PPN PhysicalPageNumberPTE PageTableEntryRISC ReducedInstructionSetComputerRM RoundingModeBitsROM ReadOnlyMemoryRV128/RV128I Instructionspresentonlyon128bitmachinesRV32/RV32I Thebasicinstructionset,presentonallmachinesRV64/RV64I Instructionspresentonlyon64and128bitmachinessatp CSR[180]:Addresstranslationandprotectionsatp SupervisorAddressTranslationandProtectionCSRSBI SupervisorBinaryInterfacescause CSR[142]:Trapcausecode,SupervisorModescounteren CSR[106]:Counterenable,SupervisorModeSD Fieldinstatusregister,summarizesFSandXS2ieldssedeleg CSR[102]:Exceptiondelegationregister,SupervisorModeSEE SupervisorExecutionEnvironmentSEIP supervisorexternalinterruptsepc CSR[141]:PreviousvalueofPC,SupervisorModesideleg CSR[103]:Interruptdelegationregister,SupervisorModesie CSR[104]:Interrupt-enableregister,SupervisorModeSIE Bitinstatusregister;SupervisorModeInterruptEnableSIMD Singleinstruction,multipledatasip CSR[144]:Interruptpending,SupervisorModeSP StackPointerSPP Fieldinstatusregister;SupervisorMode–PrevPrivilegeModeSPIE Bitinstatusregister;SupervisorModePrevInterruptEnablesscratch CSR[140]:Tempregisterforuseinhandler,SupervisorModeSSIP supervisorsoftwareinterruptsstatus CSR[100]:Statusregister,SupervisorModeSTIP supervisortimerinterrupt

RISC-VArchitectureSummary/Porter Page� of� 321 323

Page 322: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

AcronymList

stval CSR[143]:Badaddressorbadinstruction,SupervisorModestvec CSR[105]:Traphandlerbaseaddress,SupervisorModeSUM Bitinstatusregister;PermitSupervisorUserMemoryAccessSXL Fieldinstatusregister;Emulation:RegsizewheninSModetdata1 CSR[7A1]:Triggerdata#1tdata2 CSR[7A2]:Triggerdata#2tdata3 CSR[7A3]:Triggerdata#3time CSR[C01]:Currenttimeintickstimeh CSR[C81]:Upperhalfoftime(RV32only)TLB TranslationLookasideBuffertselect CSR[7A0]:TriggerregisterselectTSR Bitinstatusregister;TrapSRETinstruction TW Bitinstatusregister;Time-outWaitTVM Bitinstatusregister;TrapVirtualMemoryucause CSR[042]:Trapcausecode,UserModeUEIP userexternalinterruptuepc CSR[041]:PreviousvalueofPC,UserModeUF Under2low(abitwithintheFloatingPointFlags,FFLAGS)uie CSR[004]:Interrupt-enableregister,UserModeUIE Bitinstatusregister;UserModeInterruptEnableuip CSR[044]:Interruptpending,UserModeUPIE Bitinstatusregister;UserModePreviousInterruptEnableuscratch CSR[040]:Tempregisterforuseinhandler,UserModeUSIP usersoftwareinterruptustatus CSR[000]:Statusregister,UserModeUTIP usertimerinterruptutval CSR[043]:Badaddressorbadinstruction,UserModeutvec CSR[005]:Traphandlerbaseaddress,UserModeUXL Fieldinstatusregister;Emulation:RegsizewheninUserModeVPN VirtualPageNumberWFI WaitForInterruptXOR Exclusive-OrXS Fieldinstatusregister;statusofextension(clean/dirty/…)

RISC-VArchitectureSummary/Porter Page� of� 322 323

Page 323: RISC-Vharry/riscv/RISCV-Summary.pdf · An Overview of the Instruction Set Architecture Harry H. Porter III Portland State University HHPorter3@gmail.com January 26, 2018 The RISC-V

AbouttheAuthorProfessorHarryH.PorterIIIteachesintheDepartmentofComputerScienceatPortlandStateUniversity.Hehasproducedseveralvideocourses,notablyontheTheoryofComputation.Recentlyhebuiltacompletecomputerusingtherelaytechnologyofthe1940s.Thecomputerhaseightgeneralpurpose8bitregisters,a16bitprogramcounter,andacompleteinstructionset,allhousedinmahoganycabinetsasshown.PorteralsodesignedandconstructedtheBLITZSystem,acollectionofsoftwaredesignedtosupportauniversity-levelcourseonOperatingSystems.Usingthesoftware,studentsimplementasmall,butcomplete,time-sliced,VM-basedoperatingsystemkernel.Porterhashabitofdesigningandimplementingprogramminglanguages,themostrecentbeingalanguagespeci2icallytargetedatkernelimplementation.

PorterholdsanSc.B.fromBrownUniversityandaPh.D.fromtheOregonGraduateCenter.

PorterlivesinPortland,Oregon.Whennottryingto2igureouthowhiscomputerworks,heskis,hikes,travels,andspendstimewithhischildrenbuildingthings.

ProfessorPorter’swebsite:www.cs.pdx.edu/~harry

RISC-VArchitectureSummary/Porter Page� of�323 323