2014-05-14 1Mill Computing Patents pending
Number eight of a series
Drinking from the Firehose
Many chips from one –
Specification in the Mill™ CPU Architecture
The Mill CPU
The Mill is a new general-purpose commercial CPU family.
The Mill has a 10x single-thread power/performance gain over conventional out-of-order superscalar architectures, yet runs the same programs, without rewrite.
This talk will explain:
• configurable architecture strategy
• attributed specification
• operation set specification
• component configuration at core/chip/board levels
• automatic tool generation
Talks in this series
1. Encoding
2. The Belt
3. Memory
4. Prediction
5. Metadata and speculation
6. Execution
7. Security
8. Specification (you are here)
9. …
Slides and videos of other talks are at:
MillComputing.com/docs
addsx(b2, b5)
The Mill Architecture
Specification and configuration
New with the Mill:
• Family members built from specifications: reusable components
• Instruction set built by composing attributes: fully regular instruction set
• Mechanically generated bit-level encoding: entropy-optimal encoding throughout
• Configuration-specific generated tool sets: asm, sim, debugger, compiler, …
• Generated hardware: Verilog from specification
Caution!
Gross over-simplification!
This talk tries to convey an intuitive understanding to the non-specialist.
The reality is more complicated.
Specification
This talk does not describe the Mill architecture
Unlike other talks in this series…
It describes how the operation set and particular family member micro-architectures are specified.
It describes, and demonstrates, some of the software tools built from the specifications.
It describes how the specification supports manual creation of Mill hardware.
The specification tools are for internal use in creation of Mill CPUs; the tools are not intended to be products.
By use of these tools, we can create new Mill chip products more quickly and at lower cost than usual.
The intended audience includes tool designers and software developers interested in advanced design.
Specification
abstract Mill CPU architecture
family members: Tin, Copper, Silver, Gold
The Mill is a family of member CPUs sharing an abstract operation set and micro-architecture.
specification driven
Members differ in concrete operation set and micro-architecture.
Designers describe a concrete member by writing a specification.
Specification
abstract Mill CPU architecture
family members: Tin, Copper, Silver, Gold
tools: compiler, asm, debugger, sim, HWgen
Software automatically creates system software, verification tests, documentation, and a hardware framework for the new member from the specification.
specification driven
data driven
C++ compiler masquerade
assembly language
The Mill assembler syntax is C++. Suitably disguised.
Two-pass assemblers
Traditional assemblers have two passes.
The first pass treats the source as a program in a meta-language, usually a macro language, and interprets that program to produce a different source program in machine language.
The second pass translates the program in machine language to binary and produces the executable file.
source file (macro language) → first pass → machine language → second pass → load module (binary)
The Mill assembler uses the C++ compiler
The first pass is the C++ compiler, which translates the assembly language source program to an executable.
The second pass is the execution of the C++ program, to emit binary and produce the executable file.
conventional: source file (macro language) → first pass → machine language → second pass → load module (binary)
Mill: source file (C++) → C++ compiler → program → execution → load module (binary)
C++ is Mill assembly language?
Each assembler operation is a C++ function call.
conventional assembler:
    add b3,b5
    jump loop
    loop:
Mill assembler:
    add(b3,b5)
    br("loop")
    L("loop")
A Mill instruction is a C++ statement comprising operations separated by commas:
    add(b0, 3), store(*b5, b7), br("loop");
Each call is an operation; the whole statement is an instruction.
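One way the "statement equals instruction" rule could be realized in plain C++ is sketched below. It is an illustration only, not the Mill assembler's actual mechanism: `Program`, `OpToken`, `emit`, `Belt` and `Mem` are hypothetical names. The sketch relies on the C++ rule that temporaries live until the end of the full-expression, so the last token destroyed at the `;` closes the instruction.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: none of these types come from the real Mill tool chain.
struct Program {
    std::vector<std::vector<std::string>> instructions;  // one inner vector per instruction
    int liveTokens = 0;   // operation temporaries alive in the current statement
    bool open = false;    // true while an instruction is being collected
};

Program prog;

// Every operation function returns a token. Temporaries survive until the end
// of the full-expression (the ';'), so the last token to die closes the instruction.
struct OpToken {
    OpToken() { ++prog.liveTokens; }
    OpToken(const OpToken&) { ++prog.liveTokens; }
    ~OpToken() {
        assert(prog.liveTokens > 0);
        if (--prog.liveTokens == 0) prog.open = false;
    }
};

// Append one operation to the current instruction, opening a new one if needed.
void emit(const std::string& text) {
    if (!prog.open) {
        prog.instructions.emplace_back();
        prog.open = true;
    }
    prog.instructions.back().push_back(text);
}

struct Belt { int pos; };                      // belt operand such as b5
struct Mem  { int pos; };                      // memory operand such as *b5
Mem operator*(Belt b) { return Mem{b.pos}; }
const Belt b0{0}, b5{5}, b7{7};

std::string name(Belt b) { return "b" + std::to_string(b.pos); }

OpToken add(Belt a, int imm) {
    emit("add " + name(a) + ", " + std::to_string(imm));
    return OpToken();
}
OpToken store(Mem addr, Belt value) {
    emit("store *b" + std::to_string(addr.pos) + ", " + name(value));
    return OpToken();
}
OpToken br(const std::string& label) {
    emit("br " + label);
    return OpToken();
}
```

With these definitions, the statement `add(b0, 3), store(*b5, b7), br("loop");` records one instruction holding three operations, and a following `add(b0, 1);` starts a second instruction.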
C++ as meta-language
Each call of an asm function emits that operation.
    for(int i = 1; i < 5; ++i) add(b0, i);
gives the same machine code as:
    add(b0, 1);
    add(b0, 2);
    add(b0, 3);
    add(b0, 4);
As in a macro assembler, in Mill assembler you can meta-program what your program will be.
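A small self-contained check of that equivalence, assuming a hypothetical `Emitter` that records each operation as text (the real assembler emits binary):

```cpp
#include <cassert>
#include <string>
#include <vector>

using Code = std::vector<std::string>;

// Hypothetical stand-in for the assembler: each call appends one operation.
struct Emitter {
    Code out;
    void add(int beltPos, int imm) {
        assert(beltPos >= 0);
        out.push_back("add b" + std::to_string(beltPos) + ", " + std::to_string(imm));
    }
};

// The meta-program: an ordinary C++ loop decides which operations are emitted.
Code viaLoop() {
    Emitter a;
    for (int i = 1; i < 5; ++i) a.add(0, i);
    return a.out;
}

// The same four operations written out by hand.
Code unrolled() {
    Emitter a;
    a.add(0, 1);
    a.add(0, 2);
    a.add(0, 3);
    a.add(0, 4);
    return a.out;
}
```

The loop runs when the assembler program executes, not on the target, so `viaLoop()` and `unrolled()` produce identical operation lists.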
Demo: extend a core
The test case contains this code fragment:
con(w(fpemu::st2bin32("3.0")));
con(w(fpemu::st2bin32("5.0")));
addb(b0, b1);
nop();
nop();
nop();
However: the Tin CPU does not support native floating point.
Demo: extend a core
Build for Tin – fail
Demo: create new “Demo” member like Tin
• Copy code tree Tin -> Demo
• Clear old files from new tree
• Tell builder tool to use new member
• Create new specification from Tin spec
Demo: update Demo member with FPU
Populate execution pipeline slot with floating point
ISA design by composition
attributes: the semantic pieces of operations
Operation attributes
A Mill operation invocation comprises a core operation and values for some number of attributes.
Specific attribute values are supplied by the operation mnemonic or by an argument to the operation function.
addus(b3, 17)
attribute        value
core operation   add
domain           unsigned integer
overflow         saturating
opand0           b3 – third belt position
imm0             17 – literal
There are ~50 attributes. Only a handful are meaningful for any particular operation.
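As a rough sketch, the `addus(b3, 17)` invocation can be pictured as a record with one field per meaningful attribute. The `Invocation` struct, the `coreOp` enum and the integer encoding of belt positions are assumptions for illustration; the `domainCode` and `overflowCode` enumerations are the ones shown in this talk.

```cpp
#include <cassert>

enum coreOp { addOp, shiftOp };  // hypothetical; only two core ops listed here
enum domainCode { binFloat, boolean, decFloat, logical, pointers, signedInt, unsignedInt };
enum overflowCode { excepting, modulo, saturating, widening };

// One operation invocation: a core operation plus the attribute values that matter for it.
struct Invocation {
    coreOp op;
    domainCode domain;      // supplied by the mnemonic ("u")
    overflowCode overflow;  // supplied by the mnemonic ("s")
    int opand0;             // supplied by the first argument: belt position
    int imm0;               // supplied by the second argument: literal
};

// The mnemonic fixes domain and overflow; the arguments fill in the operands.
Invocation addus(int beltPos, int literal) {
    assert(beltPos >= 0);
    return Invocation{addOp, unsignedInt, saturating, beltPos, literal};
}
```

Calling `addus(3, 17)` yields the attribute/value pairs in the table above.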
Attribute values
Most attributes are enumerations:
enum directionCode {leftward, rightward};
enum condSenseCode {allSense, falseSense, trueSense};
enum overflowCode {excepting, modulo, saturating, widening};
enum domainCode {binFloat, boolean, decFloat, logical, pointers, signedInt, unsignedInt};
Attribute values can be specified individually, or as bitsets with a selection of values of the same attribute.
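A bitset over one attribute's values might look like the sketch below. The trailing `domainCount` enumerator, the `DomainSet` alias, the `domains` helper and the `integers` set are assumptions added for the example; `floats` mirrors the name used later in the operation pattern.

```cpp
#include <bitset>
#include <cassert>
#include <initializer_list>

// domainCount is an added sentinel so the bitset knows how many values exist.
enum domainCode { binFloat, boolean, decFloat, logical, pointers, signedInt, unsignedInt, domainCount };

using DomainSet = std::bitset<domainCount>;

// Build a set from a selection of values of the same attribute.
DomainSet domains(std::initializer_list<domainCode> values) {
    DomainSet s;
    for (domainCode v : values) {
        assert(v < domainCount);
        s.set(v);
    }
    return s;
}

const DomainSet floats   = domains({binFloat, decFloat});
const DomainSet integers = domains({signedInt, unsignedInt});
```

Sets combine with the usual bitwise operators, so `floats | integers` selects all four numeric domains.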
Mnemonics
Each opcode and attribute value has a text string nick.
value         nick
leftward      l
rightward     r
signedInt     s
unsignedInt   u
Spec software concatenates the nicks of the opcode and attributes to make the assembler mnemonic automatically.
shiftrs: the operation is shift, right, signed
There are ~120 core ops and ~1000 mnemonics.
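The concatenation rule can be sketched as follows. Only the four nicks shown on the slide are included, and the string-keyed `nicks` table and `mnemonic` function are hypothetical, not the spec software's interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Nick table limited to the four values shown in the talk.
const std::map<std::string, std::string> nicks = {
    {"leftward", "l"}, {"rightward", "r"}, {"signedInt", "s"}, {"unsignedInt", "u"}};

// Concatenate the core operation's name with the nick of each attribute value, in order.
std::string mnemonic(const std::string& coreOp, const std::vector<std::string>& attrValues) {
    std::string result = coreOp;
    for (const std::string& value : attrValues) {
        auto it = nicks.find(value);
        assert(it != nicks.end());  // every value must have a nick
        result += it->second;
    }
    return result;
}
```

For example, `mnemonic("shift", {"rightward", "signedInt"})` builds `shiftrs`.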
Attribute semantics
Besides its type, each attribute has three choices:
how values are expressed in assembler source
• by mnemonic, based on the function name
• by parameter, based on explicit argument
• derived from other attributes, not in source
how values are encoded in target binary code
• pinned in a single bit field in all formats
• direct in different bit fields in different formats
• merged into an opcode super-field
• uncoded, for internal use only
how the set of permitted values is determined
• universal, same for all slots for all members
• by member, same for all slots on a given member
• by slot, may vary based on available entropy
A candidate operation
Semantics of the new operation:
shift, add, increment
Assembler:
Any ALU can do this in one cycle
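#include <stdint.h>
#define N 7  /* shift count; 7 is just one possible choice */
uint16_t NEWOP(uint16_t a, uint16_t b) {
    return (uint16_t)((a << N) + b + 1);  /* shift, add, increment */
}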
Pick a name: Pick a value for N:
Demo: define a new opcode
• Add new opcode – opAttr.hh
• Add printname – opAttr.cc
• Add traits – attrTraits.cc
Argument signatures
Some attributes get their value from the function arguments in the operation, rather than the mnemonic.
arg kind   meaning
exuArg     exu-side belt position
immArg     small immediate constant
bitArg     bit number
offArg     load/store offset
Argument nicks are concatenated into signatures.
signature          arguments, in order
exuBitSig          belt position, bit number
baseOffWidthfSig   address base, offset, width
exuExuExuSig       three belt positions
Ops are uniquely identified by their mnemonic and signature
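The signature names on the slide are consistent with a simple rule: join the argument nicks in camelCase and append `Sig`. The function below is a sketch under that assumption; the exact naming rule used by the spec software is not stated in the talk.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Join argument nicks in camelCase and append "Sig" (assumed naming rule).
std::string signature(const std::vector<std::string>& argNicks) {
    std::string sig;
    for (const std::string& nick : argNicks) {
        assert(!nick.empty());
        std::string part = nick;
        if (!sig.empty()) {
            // Capitalize every nick after the first.
            part[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(part[0])));
        }
        sig += part;
    }
    return sig + "Sig";
}
```

Under this rule the nicks `exu`, `bit` give `exuBitSig`, matching the table above.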
Operation patterns
An operation pattern comprises:
• the core operation and its encoding block
• the argument list
• all meaningful values for all mnemonic attributes
Each pattern defines all the operations that result from the cross-product of attribute values: the models.
There are around a thousand models.
Operations are defined as patterns, not individually.
opPattern(exuBlock, addOp) << floats << roundings << exuArg << exuArg;
This defines 12 models: six different rounding modes for each of binary and decimal floating point
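The cross-product expansion can be sketched generically. The `expand` function and the dotted model names are hypothetical, and the six rounding-mode names `r0` to `r5` in the usage note are placeholders because the talk does not name them.

```cpp
#include <cassert>
#include <string>
#include <vector>

using AttrValues = std::vector<std::string>;

// Expand a pattern into one model per combination of attribute values.
std::vector<std::string> expand(const std::string& coreOp,
                                const std::vector<AttrValues>& attrs) {
    std::vector<std::string> models{coreOp};
    for (const AttrValues& values : attrs) {
        assert(!values.empty());
        std::vector<std::string> next;
        for (const std::string& prefix : models) {
            for (const std::string& value : values) {
                next.push_back(prefix + "." + value);
            }
        }
        models = next;  // each attribute multiplies the number of models
    }
    return models;
}
```

With `floats = {"binFloat", "decFloat"}` and six placeholder roundings, `expand("add", {floats, roundings})` returns 2 × 6 = 12 models, the count given for the `addOp` pattern.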
What attributes for our new operation?
What domain?
• signedInt?
• unsignedInt?
What about overflow?
• ignore it?
• mark result as an error?
• saturate to maximal value?
• produce a double-width result?
Where to encode it?
• exuBlock?
What arguments?
• exuArg, exuArg?
Demo: define a new operation
• Add specification – opSpecs.cc
• Add sim implementation
• Build sim
Say how – or say what?
specification
Hardware development made easy.
Abstract Mill-ness
The Mill is a family of member CPUs sharing an abstract operation set and micro-architecture.
abstract Mill: operation set, micro-architecture
Specifications make concrete from abstract
The Mill is a family of member CPUs sharing an abstract operation set and micro-architecture.
abstract Mill (operation set, micro-architecture) → specifications → concrete Mill chips (Crimson, Monocore, ...)
Why specification/configuration?
Creating a CPU by hand is fabulously expensive.
Much of CPU implementation is repetitive, error-prone, tedious and wasteful.
Often the design winds up sub-optimal because it’s too much trouble to change it yet again.
The Mill team knew it lacked the resources to implement – and re-implement – a moving target from scratch.
So we got the software to do it.
Result:
• can address multiple markets efficiently
• fast pivots for new chips
• economy for company and customers
Concrete Mill chips
Each concrete chip is specified as a set of components, including cores, caches, memory controllers, etc.
concrete Mill chips: Crimson, Monocore, ...
example: the “Crimson” chip
“Crimson” chip: Copper core, Silver core, caches, ...
example: the “Copper” core
Concrete Mill cores
“Copper” core: specRegs, caches, ALUs, Belt, decoders, ...
The component cores in turn specify still more nested components.
Recursive specification
Big parts have little parts,
within each to excite ’em;
and little parts have smaller parts,
and so ad infinitum
(Apologies to Jonathan Swift)
It’s components, all the way down!
Component parameters
Components have parameters to define their function.
belt: size = 16
cache: bank count = 4, line width = 64, evict policy = LRU, way count = 4, …
predictor: exit table size = 2048, latency = 2, …
Components of the same kind but different parameter values can be collected in palettes for reuse in designing other Mills.
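A minimal sketch of parameterized, nested components and a palette, using the parameter values from this slide. The `Component` struct, the `Palette` alias, the palette entry names (`belt16`, `cache4way`, `predictor2k`) and the string-valued parameters are all assumptions for illustration; the real specification format is not shown in the talk.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A component has a kind, named parameters, and nested sub-components.
struct Component {
    std::string kind;
    std::map<std::string, std::string> params;
    std::vector<Component> parts;
};

// A palette is a named collection of ready-made components for reuse.
using Palette = std::map<std::string, Component>;

// Count a component and everything nested inside it.
int countComponents(const Component& c) {
    int n = 1;
    for (const Component& part : c.parts) n += countComponents(part);
    return n;
}

const std::string& param(const Component& c, const std::string& key) {
    auto it = c.params.find(key);
    assert(it != c.params.end());
    return it->second;
}

Palette makePalette() {
    Palette p;
    p["belt16"] = Component{"belt", {{"size", "16"}}, {}};
    p["cache4way"] = Component{"cache",
                               {{"bank count", "4"}, {"line width", "64"},
                                {"evict policy", "LRU"}, {"way count", "4"}},
                               {}};
    p["predictor2k"] = Component{"predictor",
                                 {{"exit table size", "2048"}, {"latency", "2"}},
                                 {}};
    return p;
}

// Compose a core from palette entries; a chip could nest cores the same way.
Component makeCore(const Palette& p) {
    return Component{"core", {}, {p.at("belt16"), p.at("cache4way"), p.at("predictor2k")}};
}
```

Because a component is just a value that can contain other components, the same palette entry can be dropped into several different cores or chips.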
Behind components
Behind each component kind is hand-written software:
• An emulation function sits in the simulator. It defines what the component does in the machine.
• A generator function sits in the generator. It emits the Verilog starting point for hardware.
The emulation function is definitive; if the hardware doesn’t match the simulator then the simulator is right.
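The pairing of an emulation function with a generator function can be pictured as an interface with two methods. Everything below is a hypothetical sketch: the `ComponentKind` interface, the `Adder` example and the emitted Verilog text are not taken from the Mill tools.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Each component kind supplies both halves.
struct ComponentKind {
    virtual ~ComponentKind() = default;
    // Emulation: what the component does in the simulator (the definitive behavior).
    virtual std::uint64_t emulate(std::uint64_t a, std::uint64_t b) const = 0;
    // Generation: a Verilog starting point for the hardware.
    virtual std::string generate() const = 0;
};

// Example kind: an adder whose width is a component parameter.
struct Adder : ComponentKind {
    int width;
    explicit Adder(int w) : width(w) { assert(w >= 1 && w <= 64); }

    std::uint64_t emulate(std::uint64_t a, std::uint64_t b) const override {
        const std::uint64_t mask =
            (width == 64) ? ~std::uint64_t{0} : ((std::uint64_t{1} << width) - 1);
        return (a + b) & mask;  // wraps at the configured width
    }

    std::string generate() const override {
        const std::string range = "[" + std::to_string(width - 1) + ":0]";
        return "module adder(input " + range + " a, input " + range +
               " b, output " + range + " sum);\n"
               "  assign sum = a + b;\n"
               "endmodule\n";
    }
};
```

Both methods read the same `width` parameter, which is the point of the design: one specification value drives the simulator behavior and the generated hardware text.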
Clock domains
The Mill sim is event-driven with picosecond accuracy.
All components reside in a clock domain. By default sub-components reside in the domain of their parent.
Xtal components create top-level clock domains.
PLL components link different domains. The ratio registers are in MMIO space for program control.
A simulated Mill program can use simulated MMIO to control the simulated hardware and change the simulated clock rate that it itself is running under.
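A toy version of picosecond event ordering across two clock domains is sketched below. It assumes a PLL derives its period as source period × divide / multiply and that ties go to the lower-numbered domain; `ClockDomain`, `xtal`, `pll` and `edgeOrder` are hypothetical names, and MMIO control of the ratio is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

using Picoseconds = std::int64_t;

struct ClockDomain { Picoseconds period; };

// A crystal creates a top-level domain with a fixed period.
ClockDomain xtal(Picoseconds period) {
    assert(period > 0);
    return ClockDomain{period};
}

// A PLL links a new domain to a source domain by a frequency ratio (assumed rule).
ClockDomain pll(const ClockDomain& source, int multiply, int divide) {
    assert(multiply > 0 && divide > 0);
    return ClockDomain{source.period * divide / multiply};
}

struct Event { Picoseconds time; int domain; };

// Orders the queue so the earliest event is on top; ties go to the lower domain index.
struct Later {
    bool operator()(const Event& a, const Event& b) const {
        return a.time != b.time ? a.time > b.time : a.domain > b.domain;
    }
};

// Returns the domain index of each clock edge, in time order, up to and including `until`.
std::vector<int> edgeOrder(const std::vector<ClockDomain>& domains, Picoseconds until) {
    std::priority_queue<Event, std::vector<Event>, Later> queue;
    for (int i = 0; i < static_cast<int>(domains.size()); ++i) {
        queue.push(Event{domains[i].period, i});
    }
    std::vector<int> order;
    while (!queue.empty() && queue.top().time <= until) {
        const Event e = queue.top();
        queue.pop();
        order.push_back(e.domain);
        queue.push(Event{e.time + domains[e.domain].period, e.domain});  // schedule next edge
    }
    return order;
}
```

Changing a ratio at run time would amount to updating a domain's `period` between events, which is roughly what a program writing the MMIO ratio registers would cause in the simulator.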
Memory hierarchy
Components that derive from the memLevel type can be hooked together to model the memory hierarchy.
The connections are streams of requests and responses. Each component only deals with the stream. It does not know or care what is on the other end.
The streams use predictive throttling for congestion control, similar to network message methods.
Streams run at full speed, without handshaking delay.
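The "hooked together" idea can be sketched with a base class and a link to the next level down. This is a deliberate simplification: it collapses the request and response streams into a direct function call and ignores throttling and timing. The class names, `connectBelow`, `handle` and the `servedBy` field are hypothetical; only the `memLevel` concept comes from the talk.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

struct Request  { std::uint64_t line; };
struct Response { std::uint64_t line; int servedBy; };  // servedBy: id of the answering level

// Base type for anything that can sit in the memory hierarchy.
class MemLevel {
public:
    explicit MemLevel(int id) : id_(id) {}
    virtual ~MemLevel() = default;
    void connectBelow(MemLevel* next) { below_ = next; }
    virtual Response handle(const Request& r) = 0;
protected:
    int id_;
    MemLevel* below_ = nullptr;  // the level neither knows nor cares what this is
};

// A cache answers hits itself and forwards misses to whatever is below it.
class Cache : public MemLevel {
public:
    using MemLevel::MemLevel;
    Response handle(const Request& r) override {
        if (lines_.count(r.line)) return Response{r.line, id_};
        assert(below_ != nullptr);
        const Response resp = below_->handle(r);
        lines_.insert(r.line);  // fill on the way back up
        return resp;
    }
private:
    std::set<std::uint64_t> lines_;
};

// Main memory always has the data.
class Dram : public MemLevel {
public:
    using MemLevel::MemLevel;
    Response handle(const Request& r) override { return Response{r.line, id_}; }
};
```

Wiring `l1 → l2 → dram` is two `connectBelow` calls; a different hierarchy needs only different wiring, since no level refers to the concrete type of its neighbor.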
Demo: try it out
run sim - ivan/build/testAsm.sim
Other roads…
There are other architectures that provide operation specification. These differ significantly from the Mill.
purpose:
• others: add special-purpose embedded operations
• Mill: form optimal subsets for family members
encoding:
• others: reserved bit patterns, manually selected
• Mill: automatically generated optimal-entropy
specification:
• others: one-at-a-time manual process
• Mill: pattern-based orthogonal generation
Summary:
The Mill:
Defines members by component lists
Defines operations by composing attributes
Tool produces cross-product of attributes
Recursive composition – mix and match
Compact notation expresses clock, memory
Says what connects to what; tool creates “how”.
Shameless plug
For technical info about the Mill CPU architecture:
MillComputing.com/docs
To sign up for future announcements, white papers etc.:
MillComputing.com/mailing-list