prof. dr. akash kumar - tu dresden · modern computing systems prof. dr. akash kumar chair for...
TRANSCRIPT
![Page 1: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/1.jpg)
Modern Computing Systems
Prof. Dr. Akash KumarChair for Processor Design
(Ack: my past and current students/PostDocs)(Some slides adapted from Koren, Krishna, Anand)
![Page 2: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/2.jpg)
© Akash Kumar
Singapore2
![Page 3: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/3.jpg)
© Akash Kumar
Eindhoven University of Technology3
![Page 4: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/4.jpg)
© Akash Kumar
Gliding4
![Page 5: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/5.jpg)
© Akash Kumar
Finished PhD!5
![Page 6: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/6.jpg)
© Akash Kumar
Life in Singapore6
![Page 7: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/7.jpg)
© Akash Kumar
Faculty of Engineering, NUS7
![Page 8: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/8.jpg)
© Akash Kumar
Outreach8
![Page 9: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/9.jpg)
© Akash Kumar
…9
![Page 10: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/10.jpg)
© Akash Kumar
Topics to be Covered
1. Real-time Embedded Systems2. Fault-Tolerance3. Approximate computing 14. Approximate computing 25. Machine learning embedded architectures 16. Machine learning embedded architectures 27. Going beyond CMOS
10
![Page 11: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/11.jpg)
© Akash Kumar
Outline
¨ Characteristics of real-time embedded systems¨ Hardware architecture for real-time embedded
systems
¨ Design flow and considerations¨ Modern challenges and solutions
11
![Page 12: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/12.jpg)
© Akash Kumar
What is a real-time system?
A system which responds to events within a finite and specifiable delay in order to change its environment
y(t+Δ)
RT system x(t)Environment
Δ is finite and specifiablePredictability!!
12
![Page 13: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/13.jpg)
© Akash Kumar
Why is real-time important?
*Courtesy – Rajesh Panicker
Original
Corrupt – 5%
Corrupt – 20%
Corrupt – 40%
13
![Page 14: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/14.jpg)
© Akash Kumar
What is an embedded system?
¨ Small device, like a cell phone¨ Small processor installed in some other device, like
a car¨ Software that controls a consumer device¨ Real-time response
My favorite:
Any system where the user doesn’t want to know that it includes a processor
14
![Page 15: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/15.jpg)
© Akash Kumar
Examples of Real time/ Embedded Systems
15
¨ Mars pathfinder
¨ Car engine
¨ Mobile phone
¨ Set-top box
¨ Car navigation
¨ Industrial control
¨ Telecom switch
¨ Global Positioning System
¨ Air Traffic Management
¨ Satellite flight manager
¨ Satellite Ground Control
¨ TV receiver
¨ Flight control system
¨ Electric shaver
¨ Toaster
¨ Washing Machine
![Page 16: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/16.jpg)
© Akash Kumar
Hardware Architectures for Embedded Systems
16
![Page 17: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/17.jpg)
© Akash Kumar
History of Hardware / VLSI
¨ Vacuum tube
(Lee De Forest, 1906)
¨ ENIAC
(1946, UPenn)
¨ Transistor
(1947, Bardeen, Brattain,
Shockley)
¨ Integrated circuit
(1958, Jack Kilby)
17
![Page 18: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/18.jpg)
© Akash Kumar
History of Hardware / VLSI
¨ Intel 4004
(1971, 1400 transistors)
n Intel Core i7 - Ivy Bridge (2012, >1.4 Billion transistors)
n Very Large Scale Integration (VLSI) – originally defined for chips having transistors in the order of 100,000. Other terms such as ULSI came along, but the usage VLSI remains dominant
18
![Page 19: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/19.jpg)
© Akash Kumarhttp://cpudb.stanford.edu/
Moore’s Law
n In 1965, Intel’s Gordon Moore predicted that the number of transistors that can be integrated on single chip would double about every two years
19
![Page 20: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/20.jpg)
© Akash Kumar
40 Years of microprocessor trend data20
![Page 21: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/21.jpg)
© Akash Kumar
Design Productivity Gap
¨ Increasing number of transistors makes it harder to design the system¤ Late launch of products directly hurts profits
21
![Page 22: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/22.jpg)
© Akash Kumar
System Design Considerations
¨ System : sensor -> processor -> actuator¨ Considerations
¤ Technology¤ Performance¤ Power consumption¤ Volume of production¤ Upgradability / ease of maintenance¤ Reliability¤ Testability¤ Availability of CAD and software tools, IP's, hardware and
software libraries¤ Cost, chip area¤ Legal and certification requirements, client specifications¤ …..
22
![Page 23: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/23.jpg)
© Akash Kumar
Digital Hardware Market Segments
n Processor, GPU
n DRAM, Flash memories
n (Co-)Processor alternativesn ASIC (application specific integrated circuit)n ASSP (application specific standard product) n FPGA (field programmable gate array)
n Convergence as System on Chip (SoC), which may also contain analog, mixed-signal, and radio-frequency functions
23
![Page 24: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/24.jpg)
© Akash Kumar
Embedded systems architecture
¨ Trend towards Multi-Processor Systems-on-chip (MPSoC)
¨ Homogeneous vs heterogeneous systems¨ Different memory models¨ Different network architectures
¤ Network-on-chip¤ Buses
24
![Page 25: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/25.jpg)
© Akash Kumar
Processor 4 Processor 5
Interconnection network
Processor 1 Processor 2 Processor 3
Memory Memory Memory
Memory Memory
Homogeneous vs heterogeneous25
![Page 26: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/26.jpg)
© Akash Kumar
Homogeneous vs heterogeneous
¨ Heterogeneity is increasing¤ Different levels of parallelism in application¤ uProc – better for control-flow¤ DSP – better for signal processing¤ Dedicated hardware blocks needed for certain parts¤ Improves efficiency and saves power
¨ Homogeneous systems ¤ Better for fault-tolerance¤ Only one compiled version of any application needed¤ Easier to design and replicate¤ Easy to support task migration
26
![Page 27: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/27.jpg)
© Akash Kumar
Memory usage27
![Page 28: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/28.jpg)
© Akash Kumar
Processor 4 Processor 5
Interconnection network
Processor 1 Processor 2 Processor 3
Memory Memory Memory
Memory Memory
Embedded systems – local memory28
Local memory is better for more predictability Network/ bus delay may be unpredictable
![Page 29: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/29.jpg)
© Akash Kumar
Embedded systems – global memory
Processor 4 Processor 5 Memory
Interconnection network
Processor 1 Processor 2 Processor 3
Arbiter
29
Global memory may be better for shared data
![Page 30: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/30.jpg)
© Akash Kumar
Processor 4 Processor 5
Interconnection network
Processor 1 Processor 2 Processor 3
Memory Memory Memory
Memory Memory
Embedded systems – combination
Memory
Arbiter
Can also be off-chip!
30
Communication pattern also determines which architecture is betterMessage passing OR Shared memory
![Page 31: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/31.jpg)
© Akash Kumar
Embedded systems – network
Processor 4 Input/ Output Memory
Interconnection network
Processor 1 Processor 2 Processor 3
Arbiter
31
![Page 32: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/32.jpg)
© Akash Kumar
Interconnection network-on-chip
Processor 4 Input/ Output Memory
Processor 1 Processor 2 Processor 3
Arbiter
NININI
NI NI
NI
RouterRouter Router
32
![Page 33: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/33.jpg)
© Akash Kumar
Interconnection network – bus
Processor 4 Input/ Output Memory
High speed bus
Processor 1 Processor 2 Processor 3
Arbiter
Arbiter
33
![Page 34: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/34.jpg)
© Akash Kumar
Point-to-point networks
Processor 4 Input/ Output Memory
Processor 1 Processor 2 Processor 3
Arbiter
34
![Page 35: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/35.jpg)
© Akash Kumar
System Design – Hw/Sw Codesign
¨ Take decisions on whether to implement in hardware or software¤ Consider the advantages vs costs
¨ If hardware, whether to use commercial off the shelf (COTS) components or custom components
System Level Specification
SoftwareModel
Hardware/Software
Partitioning
HardwareModel
Co-simulation
Compilation Synthesis
Integrationand Testing
Resources
Performance-1
Pareto Curve
35
A
BC
D
![Page 36: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/36.jpg)
© Akash Kumar
Modern Challenges
36
![Page 37: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/37.jpg)
© Akash Kumar
Modern Multimedia Embedded Systems
Large number of use-cases
Guarantee performance
Minimize power consumption
Run-time addition of appl
37
![Page 38: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/38.jpg)
© Akash Kumar
Issues and Modern Trends
¨ The communication bottleneck¤ 3D Chips¤ Optical interconnects
¨ Leakage current limiting size reduction¤ Multi-gate or gate-all-around transistors (Intel 22nm
uses 3D/tri-gate transistors)¤ Channel strain engineering, silicon-on-insulator-based
technologies, and high-k/metal gate materials
¨ One may not fit all¤ Hardware/Software Co-design¤ Fault-tolerant / reconfigurable computing
¨ Power issues¤ Multi-core and heterogeneous architectures
38
![Page 39: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/39.jpg)
© Akash Kumar
¨ Dennard scaling principles [1]
Technology Scaling
[1] R. Dennard et al. “Design of Ion-Implanted MOSFET’s with Very Small Physical Dimensions,” IEEE Journal of Solid-State Circuits, 1974.
Device Parameters Scaling Factor
Device dimension 1/k
Doping concentration 1/k
Voltage 1/k
Current 1/k
Capacitance 1/k
Delay time per circuit 1/k
Power dissipation 1/k2
Area 1/k2
Power density 1
39
![Page 40: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/40.jpg)
© Akash Kumar
¨ Digression from Dennard’s scaling beyond 65nm¤ Non-ideal voltage scaling: limit on threshold voltage scaling¤ Non-ideal gate oxide scaling
¤ Sub-threshold leakage power
¨ Power dissipation increases with technology scaling¤ Heat localization (hot spots)
¤ Higher temperature => device wear-out
Technology Scaling40
![Page 41: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/41.jpg)
© Akash Kumar
Technology Scaling and Power Density
Hot Plate
Nuclear Reactor
41
![Page 42: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/42.jpg)
© Akash Kumar
Technology Scaling and Power Density
Manufacturing defects(e.g. Imperfect Lithographic
patterning)
Transistor Scaling
Increased Variability(e.g. Random Dopant
Fluctuation)
Increasing Power Density
Increase T
42
![Page 43: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/43.jpg)
© Akash Kumar
Approximate Computing
43
![Page 44: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/44.jpg)
© Akash Kumar
20 W20 W
~200000 W
The Computational Efficiency Gap
IBM Watson playing Jeopardy, 2011
44
![Page 45: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/45.jpg)
© Akash Kumar
Humans Approximate
923 = - - .- -?21
is 923 >1.75?21
is 923 > 45?21
Task:Division
Application context dictates required accuracy of results
Accuracy
21) 923 (43848363
Effort expended increases with required accuracy
~1Petaflop/W
45
![Page 46: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/46.jpg)
© Akash Kumar
But Computers DO NOT
float x = 923;float y = 21;cout << (x/y > 45.0) ?“YES”:”NO”;
NO92321 >45
923> 1.7521
float x = 923;float y = 21;cout << (x/y > 1.75) ?“YES”:”NO”;
YES
But, I workedharder than needed
} Overkill (for many applications)
} Leads to inefficiency} Can computers be more efficient by producing “just good enough” results?
46
![Page 47: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/47.jpg)
© Akash Kumar
Intrinsic Application Resilience: Sources
IntrinsicApplicationResilience
‘Noisy’ RealWorld Inputs
Redundant Input Data
Perceptual Limitations
Statistical Probabilistic
Computations
Self-Healing
Compute distances& assign points to clusters
Update clustermeans
Repeat until convergence
} Intrinsic application resilience:Ability to produce acceptable outputs despite underlying computations being performed in an approximate manner
47
![Page 48: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/48.jpg)
© Akash Kumar
Machine Learning
48
![Page 49: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/49.jpg)
© Akash Kumar
What's Deep Learning/ Deep Neural Network?
¨ Self learning algorithms¨ Using huge data sets to learn¨ Deep: many "learning layers"¨ Brain inspired, based on neurons and synapses
(connections)¨ High classification accuracy¨ Many applications; let's look at ImageNet
classification and Tesla Autopilot
49
![Page 50: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/50.jpg)
© Akash Kumar
ImageNet Winners (top-5 classification error)
¨ ImageNet dataset: 10M images, 10000 classes
50
0%
5%
10%
15%
20%
25%
30%
2010 2011 2012 2013 2014 2015 2016 2017
top-
5 er
ror
Traditional methods Deep Learning Human
![Page 51: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/51.jpg)
© Akash Kumar
AI: Tesla Autopilot
¨ Tesla Model S demonstration of autonomous driving¨ Computing system monitors radar and several
cameras¤ Detect objects like cars, and pedestrians¤ Monitor traffic signs¤ Lane tracking and possible lane changing¤ Auto parking
51
Tesla web page: www.tesla.com/videos/ November 2016
![Page 52: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/52.jpg)
© Akash Kumar
Deep Learning and High-performance HW Architectures
52
![Page 53: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/53.jpg)
© Akash Kumar
Beyond CMOS
53
![Page 54: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/54.jpg)
© Akash Kumar
Classical CMOS Transistors54
PMOS on topNMOS at bottom
![Page 55: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/55.jpg)
© Akash Kumar
Reconfigurable Transistors:Silicon Nanowires Based Reconfigurable FETs
55
![Page 56: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/56.jpg)
© Akash Kumar
SiNW Dual-gate RFETs: Combines p-type and n-type functionality
Runtimereconfigurability
56
![Page 57: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/57.jpg)
© Akash Kumar
Conclusions
¨ Characteristics of real-time embedded systems
¨ Hardware architectures and considerations
¨ Modern trends and challenges
¨ Fault-tolerance a major problem in modern systems
57
![Page 58: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/58.jpg)
© Akash Kumar
Ongoing Research Activities
Reliability/Energy Optimization• Reconfigurable approximate computing at run-time• Optimize energy and reliability• Minimize thermal cycling and peak temperature• Task remapping and scheduling for dealing with faults
Processing Architecture Design• Determine and design appropriate system architecture• Design predictable components – network and communication assist• Partially reconfigurable tile-based heterogeneous multiprocessor systems• Task-migration module in hardware for predictable delay
Low-Power and Fault-Tolerant FPGA Designs• Improving fault-tolerance of FPGA through LUT content manipulation• Novel error-correction mechanisms for FPGAs• Leakage-aware resource management techniques• Electronic Design Automation – Place and Route for FPGAs
58
![Page 59: Prof. Dr. Akash Kumar - TU Dresden · Modern Computing Systems Prof. Dr. Akash Kumar Chair for Processor Design (Ack: my past and current students/PostDocs) (Some slides adapted from](https://reader030.vdocuments.us/reader030/viewer/2022040610/5ecfb2bb2be48a0de54ed0c9/html5/thumbnails/59.jpg)
© Akash Kumar
Chair for Processor Design59