Post on 14-Dec-2015
Multi-cellular paradigm
The molecular level can support self-replication (and self-repair).
But we also need cells that can be designed to fit the specific application and at the same time able to support bio-inspired mechanisms for self-replication and fault tolerance.
Cellular differentiation
Cells adapt their physical structure to fit the “application”.
Can circuits/processors do the same?
Physically? No.
Logically? Yes, but…
Can they do it easily (dare we say, automatically)?
Instruction encoding
Instructions encode both the operation and the operands. For example, in the MIPS architecture
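As an illustration of how one instruction word carries both the operation and its operands, the sketch below packs and unpacks a MIPS32 R-type instruction using the standard field layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code). The helper names `encode_r_type` and `decode_r_type` are hypothetical and are not taken from the slides.

```python
# Minimal sketch of MIPS32 R-type encoding (opcode is 0 for R-type).
# Function names are illustrative only.

def encode_r_type(rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the operand registers and the operation selector into 32 bits."""
    fields = (("rs", rs, 5), ("rt", rt, 5), ("rd", rd, 5),
              ("shamt", shamt, 5), ("funct", funct, 6))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def decode_r_type(word: int) -> dict:
    """Split a 32-bit R-type word back into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0, $t1, $t2  ->  rd = 8 ($t0), rs = 9 ($t1), rt = 10 ($t2), funct = 0x20
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(f"{word:#010x}")  # prints 0x012a4020
```

The operation (funct) and the operands (rs, rt, rd) share a single fixed-width word, which is the property the MOVE approach later gives up in favour of explicit data transports.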
Bio-inspired processors
However, none of these “standard” architectures is quite flexible enough to implement many of the behaviours required for bio-inspired computing.
Needed: adaptable cellular architecture
That is, a processor architecture that is:
Customizable
Compact
Powerful
Easy to design and modify
Amenable to evolution and learning
Possible solution: MOVE architectures
The MOVE paradigm
One single instruction: move
Data displacements trigger operations
Architecture based around data ≠ operation centric
Regular structure: functional units + data network
Scalable and modular architecture
Example: Sum of two values
Conventional architecture:
add R1, R2, R3;
MOVE architecture:
move O(Fxxx), I1(Fsum)
move O(Fyyy), I2(Fsum)
move O(Fsum), I(Fzzz)
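The three moves above can be mimicked with a very small simulator. This is only a sketch under stated assumptions: the `Register` and `Adder` classes and the `run` helper are hypothetical, and the choice that writing port I2 of Fsum is what triggers the addition (with I1 merely latching an operand) is an assumption; the slides do not say which port is the trigger.

```python
# Toy model of the MOVE idea: the program is a list of data transports,
# and an operation happens as a side effect of writing to a unit's port.

class Register:
    """Trivial unit: input port I stores a value, output port O returns it."""
    def __init__(self, value=0):
        self.value = value

    def write(self, port, value):
        if port != "I":
            raise KeyError(port)
        self.value = value

    def read(self, port):
        if port != "O":
            raise KeyError(port)
        return self.value


class Adder:
    """Sum unit. Assumption: I1 latches an operand, I2 triggers the add."""
    def __init__(self):
        self.operand = 0
        self.result = 0

    def write(self, port, value):
        if port == "I1":
            self.operand = value
        elif port == "I2":
            self.result = self.operand + value  # the move triggers the operation
        else:
            raise KeyError(port)

    def read(self, port):
        if port != "O":
            raise KeyError(port)
        return self.result


def run(program, units):
    """Execute each move: read the source port, write the destination port."""
    for (src_unit, src_port), (dst_unit, dst_port) in program:
        units[dst_unit].write(dst_port, units[src_unit].read(src_port))


units = {"Fxxx": Register(2), "Fyyy": Register(3),
         "Fsum": Adder(), "Fzzz": Register()}
program = [
    (("Fxxx", "O"), ("Fsum", "I1")),  # move O(Fxxx), I1(Fsum)
    (("Fyyy", "O"), ("Fsum", "I2")),  # move O(Fyyy), I2(Fsum)
    (("Fsum", "O"), ("Fzzz", "I")),   # move O(Fsum), I(Fzzz)
]
run(program, units)
print(units["Fzzz"].value)  # prints 5
```

Adding a new kind of unit in this model means adding a class with `read` and `write`; `run` and the program format stay unchanged.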
Cellular differentiation
Main features:
Only one instruction (OK, maybe two) that MOVEs data to and from the CUs and FUs (dataflow architecture)
Conventional fetch/decode mechanism – compatible with bio-inspired mechanisms
No pipeline: computation carried out in specialized functional units (FU)
Communication carried out in specialized communication units (CU)
Cellular differentiation
Main advantages:
Can be easily customized by introducing application-specific functional and communication units.
Perfectly fits the requirements of systolic arrays (arbitrarily complex communication patterns).
The introduction of custom components does not affect the assembler language, the code structure, the fetch and decode units, or the transport bus.
Example – Automatic Synthesis
[Figure labels: Genotype Layer, Mapping Layer, Phenotype Layer; Genetic code, Developmental algorithm, Application-specific (parallel) functions]
What kind of applications can take advantage of this kind of system?
Complex "real-world" streaming applications:
computation is carried out sequentially
can be represented by a DAG of computation nodes
each node processes data locally, then forwards them to the next node in the graph
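The streaming model described here can be sketched in a few lines. This is a hedged illustration, not the system from the slides: the `stream` function, the node names `scale` and `offset`, and the toy arithmetic are all hypothetical stand-ins. Each node applies its own function to the incoming datum and forwards the result to its successors; nodes without successors are treated as sinks.

```python
from collections import deque


def stream(graph, funcs, source, items):
    """Push each item through the graph and collect what the sinks emit."""
    outputs = []
    for item in items:
        pending = deque([(source, item)])
        while pending:
            node, data = pending.popleft()
            result = funcs[node](data)       # local processing in the node
            successors = graph.get(node, [])
            if not successors:               # sink: record the final value
                outputs.append((node, result))
            for nxt in successors:           # forward to the next node(s)
                pending.append((nxt, result))
    return outputs


# Hypothetical four-node chain; the functions are placeholders.
graph = {"IN": ["scale"], "scale": ["offset"], "offset": ["OUT"], "OUT": []}
funcs = {
    "IN": lambda x: x,
    "scale": lambda x: x * 2,
    "offset": lambda x: x + 1,
    "OUT": lambda x: x,
}
print(stream(graph, funcs, "IN", [1, 2, 3]))
# prints [('OUT', 3), ('OUT', 5), ('OUT', 7)]
```

This sketch handles chains and fan-out only. A node with several incoming edges would fire once per edge, so a true join would need extra buffering that is left out here.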
Applications
[Figure: graph of computation nodes (×, +, ÷, ≠, FFT, DCT) between IN and OUT]
Example: JPEG
READ -> DCT -> QNTZ -> CMPR -> WRT
Specialized MOVE functional units can be designed for each of these steps
Programmable substrate
[Figure labels: ×, +, ÷, ≠, FFT, DCT; Context; IN, OUT]
Problem: task or resource allocation – i.e. how do we map the graph nodes to the array?
Specifically: dynamic allocation
Self-Scaling Stream Processing
[Figure: pipeline Source, Funct A, Funct B, Funct C, Join, shown with multiple copies of Funct A and Funct C]
SSSP
The MJPEG application consists of a four-stage computation pipeline. Each packet of data to be compressed is 192 bytes, corresponding to an 8x8 array of pixels in 24-bit colour.
The maximum achievable rate (determined by the input rate) is 700 packets per second, roughly 1 Mbit/second. With a single pipeline, performance tops out at about 60 packets per second.
SSSP
When performance peaks, the average output rate is 675 packets per second (out of a maximum of 700): this technique multiplies the throughput by a factor of 11 using 28 processors.
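The figures quoted for the MJPEG experiment can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the slides; the variable names are illustrative.

```python
# Cross-check of the quoted MJPEG numbers.
packet_bytes = 8 * 8 * (24 // 8)      # 8x8 pixels, 3 bytes per pixel
input_rate = 700                      # packets/s, limit set by the input
single_pipeline_rate = 60             # packets/s with one pipeline
peak_rate = 675                       # packets/s at peak, 28 processors

input_bitrate = input_rate * packet_bytes * 8   # bits per second
speedup = peak_rate / single_pipeline_rate
fraction_of_max = peak_rate / input_rate

print(packet_bytes)                   # prints 192
print(input_bitrate)                  # prints 1075200 (about 1 Mbit/s)
print(speedup)                        # prints 11.25
print(round(fraction_of_max, 3))      # prints 0.964
```

So the "factor of 11" is 675 / 60 = 11.25 rounded down, and the peak output reaches about 96% of the input-limited maximum.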