anr reviev mars oct14 v3 withoutbackup
TRANSCRIPT
-
7/27/2019 Anr Reviev Mars Oct14 v3 Withoutbackup
MRAM-based Architecture for Reliable and low-power Systems
ANR INS 2011, October 2011, 36-month project
Grant: 890 K
http://www.ezprod.lirmm.fr/
-
SUMMARY
1 - OBJECTIVES & CONTEXT
2 - TYPICAL TARGET APPLICATION
3 - MRAM IMPACT ON PROCESSOR ARCHITECTURE
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE
- Models
- Innovative CMOS/MRAM hybrid cells
- Exploration & performance evaluations
5 - FUTURE WORKS
6 - PROJECT VALORISATION
7 - CONCLUSIONS
Technology
Architecture
-
1 - OBJECTIVES & CONTEXT: WHAT IS MRAM
Magnetic Tunnel Junction
Pinned layer (FM)
Free layer (FM)
Tunnel oxide (MgO)
350 to 90 nm
Resistance states: Rmax, Rmin
(Rmax - Rmin) / Rmin ~ 500 %
Giant magnetoresistance
Toshiba MTJ features
- Write pulse: 4 ns
- Iw < 50 uA
- Same energy as SRAM
-
Parameter             STT-MRAM cell (1T1J)   SRAM cell (6T)   STT-RAM vs SRAM
Read time (ps)        200                    200              x1
Write time (ns)       3-4                    0.2              x15 to x20
Read energy (fJ)      6-8                    10               /1.25 to /1.5
Write energy (fJ)     500                    10               x50
Leakage power (nW)    1.5                    15-30            /10 to /20
Bit cell size (F^2)   6-20                   120-160          /6 to /20
@ 40 nm technology
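The "STT-RAM vs SRAM" column can be re-derived from the two cell columns. The sketch below is only an illustration: the dictionary layout and the function names are assumptions, and the numbers are transcribed from the table.

```python
# Values transcribed from the 40 nm table; each entry is a (low, high) range.
CELLS = {
    "write time (ns)":    {"mram": (3.0, 4.0),     "sram": (0.2, 0.2)},
    "read energy (fJ)":   {"mram": (6.0, 8.0),     "sram": (10.0, 10.0)},
    "write energy (fJ)":  {"mram": (500.0, 500.0), "sram": (10.0, 10.0)},
    "leakage power (nW)": {"mram": (1.5, 1.5),     "sram": (15.0, 30.0)},
}


def ratio_range(num, den):
    """Smallest and largest num/den when both operands lie in (low, high) ranges."""
    return num[0] / den[1], num[1] / den[0]


def mram_vs_sram(metric):
    """(low, high) of the MRAM/SRAM ratio; values below 1 favour MRAM."""
    row = CELLS[metric]
    return ratio_range(row["mram"], row["sram"])


if __name__ == "__main__":
    for metric in CELLS:
        low, high = mram_vs_sram(metric)
        print(f"{metric:20s} MRAM/SRAM = {low:.2f} to {high:.2f}")
```

A ratio below 1 corresponds to a "/" entry in the table (for example 0.05 to 0.10 for leakage is /20 to /10). With these inputs the read-energy gain spans roughly /1.25 to /1.67, slightly wider than the /1.5 printed on the slide.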
MRAM vs SRAM
Density x7 vs SRAM (at bit-cell level)
Non-volatile, infinite endurance
Dynamic write energy x10 compared to SRAM
But ultra-low leakage, /10 compared to SRAM
Radiation hardened (CNES study) [1]
[1] RADECS 2007 CNES
1 - OBJECTIVES & CONTEXT: MRAM FEATURES
Density: Flash >> DRAM > MRAM >> SRAM
Speed: SRAM ~ MRAM > DRAM >> Flash
(< 10 ns)
For embedded systems:
Above-CMOS technology
Potential failure mechanisms
Perturbation by read current
STT-MRAM: stochastic effect during writing
Sensitive to TDDB (time-dependent dielectric breakdown)
-
1 - OBJECTIVES & CONTEXT
Main objectives
1/ Innovative hybrid CMOS/MRAM
2/ From technologies to system architecture (performance/energy/reliability)
3/ Exploration and validation on an embedded processor core
4/ Methodologies for other NVM technologies
Critical applications (space, for instance)
No silicon implementation, but design methodologies
Technology
Architecture
-
1 - OBJECTIVES & CONTEXT: PROJECT ORGANISATION
WP2: Case study (EADS)
WP5: Emerging technologies
Models & reliability (IEF)
WP3: Circuit and architecture exploration, performance/reliability/energy (CEA)
WP4: Performance evaluation methodology (LIRMM)
Embedded reliable system
Design CMOS/MRAM methodologies
Generic analyses/methods for emerging technologies
Project meetings every 4 to 5 months (kick-off 11/2011)
Several partnership meetings
Common seminar organisation
Network with PhD (1) / post-doc (4) students
Budget: globally 60% spent
SPIN, CILOMAG
-
1 - OBJECTIVES & CONTEXT: PROJECT ORGANISATION
Milestones: T0+6, T0+24, T0+36
Case study & scenarios
- Target application defined: DONE
- Not limited to space application: NEW
- Metrics definition: DONE
- NVM elements into processor architecture, scenarios: DONE
Models & reliability
- STT-RAM models: DONE
- Reliability analysis: TO BE COMPLETED
- Fault injection campaign: DELAYED
- NVM technology evolution: STARTED
Architectural exploration & performance methodology
- Set of innovative cells (NV register, CMOS/MRAM memory cells, ALU): DONE
- CAM case study and reliability: DONE
- Memory hierarchy exploration, performance/energy: DONE
- Memory hierarchy exploration, reliability: STARTED
Non-volatile reliable processor architecture specification
- Fault injection campaign: TO DO
- Memory hierarchy exploration, reliability: TO BE CONTINUED
- New NVM spintronic technology: STARTED
- Applied methodologies on other NVM technologies (PCRAM, CBRAM): TO DO
- NVM reliable processor model: TO DO
Current deliverables list: D1.1, D1.2, D1.3, D1.4, D2.1, D2.2, [D3.1, D4.1, D4.2], D5.1, D5.2
-
2 - TYPICAL TARGET APPLICATION
Earth observation micro-satellites
Mission: track a given object, take pictures, analyse the pictures to detect the object, send information to the ground station
Altitude < 2000 km (Low Earth Orbit)
Period: 90 minutes
Current constraints
Weight: 10 to 100 kg
Low power budget (small batteries, solar cells, a few W)
On-board computation: CPU clock 100 MHz
32-bit processor architecture
Limited memory size: 100 Kb
Boot code in ROM (no upgrade available)
High reliability: 10^-7 failures/hour (100 FIT)
Radiation tolerant to latch-up and bit-flips
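The reliability target mixes two units: a rate per hour and FIT (failures per 10^9 device-hours). A minimal sketch of the conversion follows; the function names are illustrative, and the MTBF figure assumes a constant failure rate.

```python
HOURS_PER_FIT_WINDOW = 1e9  # 1 FIT = one failure per 10^9 device-hours


def failures_per_hour_to_fit(rate_per_hour):
    """Convert a failure rate in failures/hour to FIT."""
    return rate_per_hour * HOURS_PER_FIT_WINDOW


def fit_to_mtbf_years(fit, hours_per_year=8760.0):
    """Mean time between failures in years, assuming a constant failure rate."""
    return HOURS_PER_FIT_WINDOW / fit / hours_per_year


if __name__ == "__main__":
    fit = failures_per_hour_to_fit(1e-7)
    print(f"{fit:.0f} FIT, MTBF about {fit_to_mtbf_years(fit):.0f} years")
```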
-
2 - TYPICAL TARGET APPLICATION
Current application
Image compression + correction: JPEG + CRC
Image format: variable
Program code: max 1 Mbit in ROM
Processing unit power budget < 3 W
CPU in sleep mode between two pictures (every 2 s)
Reliability: TMR for the processor, dedicated process, limited memory size
The future of the application
More complex image compression algorithm
Program code > 10 Mb, in MRAM
Processing unit power budget < 3 W
Start/stop mode, instant on/off
Reliability: rollback and checkpointing instead of TMR
-
3 - MRAM IMPACT ON PROCESSOR ARCHITECTURE
How: non-volatile cache memory?
Objectives: decrease leakage, low-performance applications
How: non-volatile memory?
Objectives: decrease leakage, high-performance applications, reliability
Non-Volatile Register File (NVM register)
Pros: critical data retention, data recovery, rollback/checkpointing, low overhead
Cons: error detection, energy?
-
3 - MRAM IMPACT ON PROCESSOR ARCHITECTURE
How: non-volatile memory?
Objectives: decrease leakage, high-performance applications, reliability
Non-Volatile Register File (NVM register)
Pros: critical data retention, data recovery, rollback/checkpointing, low overhead
Cons: error detection, energy?
Non-Volatile L1 Cache (D-cache MRAM, I-cache MRAM)
Pros: higher L1 density (x4 to x7), partial instant on/off power
Cons: energy?, write latency (x5 to x10)
Tradeoff: speed/area
-
3 - MRAM IMPACT ON PROCESSOR ARCHITECTURE
Non-Volatile Register File (NVM register)
Pros: critical data retention, data recovery, rollback/checkpointing, low overhead
Cons: error detection
Non-Volatile L1 Cache (D-cache MRAM, I-cache MRAM)
Pros: higher L1 density (x4 to x7), partial instant on/off power
Cons: write latency (x5 to x10), dynamic energy
Tradeoff: speed/area
Non-Volatile L2 Cache (L2 cache MRAM, main memory MRAM)
Pros: higher L2 density (x4 to x7), full/partial on/off power, speed/energy/area tradeoff, limited effects of radiation
Cons: dynamic energy
High-performance computing
-
4 FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: WE NEED ELECTRICAL MODELS
Compact model for the Cadence tool suite
TAS, STT planar & perpendicular
Takes into account:
- The magnetization dynamics due to spin transfer torque (STT)
- Dependence of the transport and magnetic parameters on temperature
- Dynamics of the switching
- Stochastic effects
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: AND PERFORM RELIABILITY ANALYSIS
Retention time (10 years)
Write pulse, switching time
Thermal stability - T
Energy barrier (ΔE)
Read current - IR
Switching error, read mode
10 years retention
1% read during 10 years
3 x 10^15 read cycles
Cell A, 40 nm: =0.027, Ic0 = 50 uA, ΔE = 45
Cell B, 30 nm: =0.004, Ic0 = 30 uA, ΔE = 61
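The read-cycle count and the role of the energy barrier can be illustrated with a textbook thermal-activation (Neel-Arrhenius) model. This is a sketch under stated assumptions, not the project's compact model: the 1 ns read pulse and the 1 ns attempt time tau0 are assumed, the barrier is taken in units of kB*T, and the read current ratio is a free parameter.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def read_cycles(years=10.0, read_fraction=0.01, read_pulse_s=1e-9):
    """Number of read pulses for a cell that is being read `read_fraction` of the time."""
    return years * SECONDS_PER_YEAR * read_fraction / read_pulse_s


def retention_time_s(delta, tau0_s=1e-9):
    """Mean retention time tau = tau0 * exp(delta), delta in units of kB*T."""
    return tau0_s * math.exp(delta)


def read_disturb_probability(delta, ir_over_ic0, read_pulse_s=1e-9, tau0_s=1e-9):
    """Probability that a single read pulse flips the cell.

    The read current lowers the effective barrier to delta * (1 - IR/Ic0).
    """
    rate = math.exp(-delta * (1.0 - ir_over_ic0)) / tau0_s
    return -math.expm1(-rate * read_pulse_s)  # accurate for tiny probabilities


if __name__ == "__main__":
    print(f"read cycles over 10 years: {read_cycles():.2e}")
    for name, delta in (("Cell A", 45.0), ("Cell B", 61.0)):
        p = read_disturb_probability(delta, ir_over_ic0=0.2)
        print(f"{name}: retention {retention_time_s(delta):.2e} s, "
              f"disturb probability per read {p:.2e}")
```

Under these assumptions, 1% of 10 years spent reading with 1 ns pulses gives about 3 x 10^15 read cycles, the order of magnitude quoted on the slide.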
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: AND PERFORM RELIABILITY ANALYSIS
Memory level
Memory of 1 MB to 10 MB, m = 64 bits and IR = Ic0/5
Without correction
With correction: single error correction
2 error detection + 1 error correction
[Figure: failure rate with and without correction, with 100 FIT and 1000 FIT reference levels]
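The effect of word-level correction can be sketched with a binomial model: a word fails when it holds more bit errors than the code can correct. Independent bit errors, the 8 check bits per 64-bit word (a common single-error-correcting, double-error-detecting layout) and the per-bit error probability used below are all assumptions for illustration.

```python
from math import comb


def word_failure_probability(p_bit, n_bits, correctable=0):
    """P(more than `correctable` bit errors among n_bits independent bits)."""
    return sum(
        comb(n_bits, k) * p_bit**k * (1.0 - p_bit) ** (n_bits - k)
        for k in range(correctable + 1, n_bits + 1)
    )


def expected_failing_words(memory_bytes, p_bit, data_bits=64, check_bits=0, correctable=0):
    """Expected number of uncorrectable words in a memory of `memory_bytes`."""
    n_words = memory_bytes * 8 // data_bits
    return n_words * word_failure_probability(p_bit, data_bits + check_bits, correctable)


if __name__ == "__main__":
    p = 1e-9  # hypothetical per-bit error probability over the mission
    for size_mb in (1, 10):
        raw = expected_failing_words(size_mb * 2**20, p)
        sec = expected_failing_words(size_mb * 2**20, p, check_bits=8, correctable=1)
        print(f"{size_mb} MB: without correction {raw:.2e}, with correction {sec:.2e}")
```

Summing the binomial tail directly, rather than computing 1 - (1 - p)^n, keeps the result meaningful when the per-bit probability is far below floating-point precision.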
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: IMPACT @ CIRCUIT LEVEL, HYBRID CMOS/MRAM CELLS
Low power applications: ultra-low-power magnetic flip-flop (MFF) based on power gating and self-enable mechanisms
Applications
- Multi-level error recovery: if an error occurs during the latest checkpoint, the system can use the previous local checkpoint to restore its state.
- Fast register context switch: the multi-local storage of the proposed NV FF can be used to save several contexts of a computing process, so the CPU can store and restore these contexts very quickly.

Parameter                               Proposed MFF             Conventional FF
Non-volatility                          Checkpoint (multi-bit)   No
Read energy (fJ)                        12                       8
Write energy (pJ)                       0.02-0.5                 0.01
Sensing current (uA)                    40                       ~80
Sensing delay (ns)                      0.4                      0.045
Leakage power, active mode (nW)         820                      330
Leakage power, standby mode (nW)        0 (instant on/off)       330
Area for 4 bits (um^2)                  85                       ~20

3 patents
MRAM ALU
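The multi-level error recovery and context-switch use of a multi-bit non-volatile flip-flop can be modelled in software as a small ring of checkpoint slots. The class below is a behavioural sketch only; its name, the number of slots and the rollback interface are assumptions, not the patented cell.

```python
from collections import deque


class CheckpointedRegisters:
    """Register file with a few local checkpoint slots (newest slot last)."""

    def __init__(self, n_registers=4, n_slots=2):
        self.regs = [0] * n_registers
        self._slots = deque(maxlen=n_slots)  # oldest slot is dropped when full

    def checkpoint(self):
        """Save the current register values into the newest slot."""
        self._slots.append(tuple(self.regs))

    def rollback(self, discard_latest=False):
        """Restore the newest checkpoint, or the previous one if the newest is corrupted."""
        if discard_latest and self._slots:
            self._slots.pop()
        if not self._slots:
            raise RuntimeError("no valid checkpoint left")
        self.regs = list(self._slots[-1])


if __name__ == "__main__":
    rf = CheckpointedRegisters()
    rf.regs[0] = 1
    rf.checkpoint()
    rf.regs[0] = 2
    rf.checkpoint()
    rf.regs[0] = 3                     # state lost by an error
    rf.rollback(discard_latest=True)   # latest checkpoint is suspect, use the previous one
    print(rf.regs)
```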
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: IMPACT @ MEMORY LEVEL
MRAM implementation of a CAM
Reduced power consumption compared to pure CMOS
No static consumption
Read operation by resistance sensing (no charge/discharge scheme)
Robust to radiation
Higher level of integration (x2 to x4)
CAM: searched data in; address of the searched data and hit signal out
STT-MRAM CAM memory
Proposed structure for ECC protection of TAS-MRAM based CAM

Hamming distance of the code (d)   Hamming distance between search and current word   Decision
1 or 2                             0                                                  Hit
1 or 2                             1 or more                                          Miss
3 or 4                             0 or 1                                             Hit
3 or 4                             2 or more                                          Miss
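The decision table amounts to declaring a hit when the Hamming distance between the searched word and a stored word does not exceed the number of errors the code can correct, (d - 1) // 2. The sketch below is a software model of that rule; the function names and the example code words are illustrative assumptions.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def correctable_errors(code_distance):
    """Errors tolerated by a code of minimum distance d: 0 for d = 1 or 2, 1 for d = 3 or 4."""
    return (code_distance - 1) // 2


def cam_search(stored_words, searched, code_distance):
    """Return (hit, address) for the first stored word within the tolerated distance."""
    tolerance = correctable_errors(code_distance)
    for address, word in enumerate(stored_words):
        if hamming_distance(word, searched) <= tolerance:
            return True, address
    return False, None


if __name__ == "__main__":
    stored = [0b0000000, 0b0000111]              # two code words at distance 3
    print(cam_search(stored, 0b0000101, 3))      # one bit away from the second word
    print(cam_search(stored, 0b0000101, 1))      # same search without error tolerance
```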
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: IMPACT @ ARCHITECTURE/SYSTEM LEVEL, MEMORY HIERARCHY
Comparison SRAM vs STT-MRAM: performance & energy
CACTI (HP): SRAM detailed performance evaluation (access time, power)
NVSim (Qualcomm/Penn State): STT-MRAM detailed performance evaluation (access time, power)
Input: techno file
[Figures: write dynamic energy, write energy per byte and read/write latency vs memory size (16 kB to 8 MB) for MRAM HP, SRAM HP, MRAM LOP and SRAM LOP; leakage (mW) vs memory size (16K to 128K) for MRAM and SRAM]
-
4 - FROM TECHNOLOGY TO SYSTEM LEVEL ARCHITECTURE: IMPACT @ ARCHITECTURE/SYSTEM LEVEL, MEMORY HIERARCHY
Comparison SRAM vs STT-MRAM: performance & energy
CACTI (HP): SRAM detailed performance evaluation (access time, power)
NVSim (Qualcomm/Penn State): STT-MRAM detailed performance evaluation (access time, power)
OS, applications
GEM5
- Processor architecture simulator
- Quasi cycle-accurate
- # Processors (ARM, MIPS, x86, SPARC)
- Memory hierarchy definition
- CPU time, # cycles
- # Memory hierarchy transactions (read/write, hit/miss)
- Many other parameters
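The flow above combines per-access figures from the memory estimators with transaction counts and run time from the architecture simulator. A minimal sketch of that combination step follows; the data class, the units and every number in the example are hypothetical placeholders, not CACTI, NVSim or GEM5 outputs.

```python
from dataclasses import dataclass


@dataclass
class MemoryModel:
    """Per-access and static figures, as a memory estimator might report them."""
    read_energy_nj: float
    write_energy_nj: float
    leakage_mw: float


def memory_energy_uj(model, reads, writes, runtime_s):
    """Total energy in microjoules: dynamic (per access) plus static (leakage)."""
    dynamic_uj = (reads * model.read_energy_nj + writes * model.write_energy_nj) * 1e-3
    static_uj = model.leakage_mw * runtime_s * 1e3  # mW * s = mJ = 1000 uJ
    return dynamic_uj + static_uj


if __name__ == "__main__":
    # Hypothetical figures for one cache and one simulated run.
    sram = MemoryModel(read_energy_nj=0.05, write_energy_nj=0.05, leakage_mw=50.0)
    mram = MemoryModel(read_energy_nj=0.05, write_energy_nj=0.50, leakage_mw=5.0)
    for name, model in (("SRAM", sram), ("MRAM", mram)):
        total = memory_energy_uj(model, reads=1_000_000, writes=200_000, runtime_s=0.01)
        print(f"{name}: {total:.1f} uJ")
```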
-
4 - RESULTS: IMPACT @ ARCHITECTURE/SYSTEM LEVEL, MEMORY HIERARCHY
Scenario 1: hybrid CMOS/MRAM L1 cache for low-performance applications (target application)
L1D cache, L1I cache
Assumptions
MRAM read/write latency = x3 SRAM read/write latency
L1 cache miss = 1000 cycles (typical for low-performance applications)
MRAM density = x4 SRAM density (normally between x4 and x8)
-
4 - RESULTS: IMPACT @ ARCHITECTURE/SYSTEM LEVEL, MEMORY HIERARCHY
Main assumptions:
Writing/reading process is 3x slower
Density up to 4x higher can be used to deal with this issue
Architecture exploration
MRAM is outperformed for SRAM caches > 32 KB [left side] (15%)
Delay can be compensated for SRAM caches < 32 KB [right]
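The trade-off between a 3x slower access and a 4x denser cache can be reasoned about with the usual average-memory-access-time formula: hit time plus miss rate times miss penalty. The sketch below uses the 3x latency factor and the 1000-cycle miss penalty from the assumptions; the 1-cycle SRAM hit time and the miss rates are hypothetical.

```python
def amat_cycles(hit_cycles, miss_rate, miss_penalty_cycles=1000):
    """Average memory access time in cycles."""
    return hit_cycles + miss_rate * miss_penalty_cycles


def break_even_miss_rate_gain(sram_hit_cycles=1, latency_factor=3, miss_penalty_cycles=1000):
    """Miss-rate reduction the larger MRAM cache must bring to match the SRAM cache."""
    return (latency_factor - 1) * sram_hit_cycles / miss_penalty_cycles


if __name__ == "__main__":
    gain = break_even_miss_rate_gain()
    print(f"MRAM breaks even if it lowers the miss rate by at least {gain:.4f}")
    # Hypothetical small cache: quadrupling capacity halves the miss rate.
    print(amat_cycles(1, 0.010), amat_cycles(3, 0.005))
    # Hypothetical large cache: extra capacity barely changes the miss rate.
    print(amat_cycles(1, 0.0010), amat_cycles(3, 0.0009))
```

With a 1000-cycle penalty, a small absolute drop in miss rate is enough to hide the slower access, which is consistent with the density gain paying off mainly for small caches.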
-
4 - RESULTS: IMPACT @ ARCHITECTURE/SYSTEM LEVEL, MEMORY HIERARCHY
Scenario 2: hybrid CMOS/MRAM L2 cache (high-performance embedded CPU)
L2 cache
16 KB L1 SRAM, access time 2 ns
128 KB L2 MRAM, access time 20 ns
256 MB DDR, access time 30 ns
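For the three-level hierarchy of scenario 2, the average access time can be folded from the last level back to L1. The access times below come from the slide; the miss rates are hypothetical, and the simple additive model ignores overlap and write buffering.

```python
def hierarchy_access_time_ns(levels, last_level_ns):
    """Average access time of a hierarchy.

    levels: list of (access_time_ns, miss_rate) from L1 outwards;
    last_level_ns: access time of the final level, which always hits.
    """
    total = last_level_ns
    for access_ns, miss_rate in reversed(levels):
        total = access_ns + miss_rate * total
    return total


if __name__ == "__main__":
    L1_SRAM_NS, L2_MRAM_NS, DDR_NS = 2.0, 20.0, 30.0   # from the scenario
    l1_miss, l2_miss = 0.05, 0.20                      # hypothetical miss rates
    with_l2 = hierarchy_access_time_ns([(L1_SRAM_NS, l1_miss), (L2_MRAM_NS, l2_miss)], DDR_NS)
    without_l2 = hierarchy_access_time_ns([(L1_SRAM_NS, l1_miss)], DDR_NS)
    print(f"with MRAM L2: {with_l2:.2f} ns, without L2: {without_l2:.2f} ns")
```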
-
5 - FUTURE WORKS
-
5 - FUTURE WORKS (1/2)
New NVM spintronic technology evaluation
Spin Orbit Torque Junction (SOTJ)
[Figure: SOT junction with reference, insulator and storage layers on a write line (conductor: Pt, β-Ta, β-W), applied current Japp, reading path, and static magnetic field Ba (permanent magnet or bias layer)]
Independent read/write paths
Increased reliability
- No write spin current through the MTJ
- Reliable reading of the MTJ without ever causing switching
Racetrack memory (IBM)
Storage is performed by moving data in space
Storage & communication at the same time
Digital architecture prospects: very high density memory, logic (ALU), high density CAM
But low level of maturity
-
5 - FUTURE WORKS (2/2)
NVM reliable processor model (main final objective)
Based on a real RTL 32-bit processor: open-source RISC model (LIRMM)
Error detection module by observation
Architecture: NVM register, D-cache MRAM, I-cache MRAM, main memory MRAM
Application constraints: EADS
Error detection module provided by CEA
NV register/ALU: IEF, SPINTEC, LIRMM (RTL model + electrical model)
Memory correction code & ECC model: CEA & LIRMM
Synthesis tools (SPINTEC)
RTL integration: LIRMM
Target technology evaluation: 28 nm FDSOI + 45 nm STT-MRAM (link with DIPMEM)
Fault injection evaluation at RTL level
Instant on/off capabilities
Performance evaluation at gate level and place & route + instant on/off evaluation
-
6 - PROJECT VALORISATION
A set of publications
6 journals (2 multi-partnership), 6 conferences, 1 book chapter
3 patents (CEA/Spintec)
MARS book in preparation, May 2014, Springer
A set of invitations
IEEE ISCAS, IEEE NEWCAS, Univ. Beihang (China) Spin Workshop, In_MRAM, E-NVM
National conferences: RADSOL, FETCH, GDR, etc.
A set of remarkable results
Open library of MRAM models for design
Reliability exploration on MRAM technologies
A set of innovative CMOS/MRAM hybrid cells (patent!)
An exploration tool for performance evaluation at system level
These results are disseminated by EADS Innovation Works within the EADS group (Airbus, Astrium)
-
6 - PROJECT VALORISATION
Links with other research projects (EMYR, CROCUS, SPOT ...)
A European project submitted to FP7 (> average)
EADS IW works on identifying promising technologies that could be used for future electronic developments
Start-up: ARTEMIS (GRAVIT incubator, Spintec; low-power NVM design)
MARS: tools, design methodologies, performance evaluation (circuit, architecture)
DIPMEM: NV processor architecture (ReRAM vs MRAM), low-performance reliable applications
A solution for a laser campaign with EADS
-
7 - CONCLUSIONS
In aeronautics, a large part of the computation performance (~30%) is dedicated to mitigating reliability and cosmic radiation effects.
In the future, MRAM could be a real solution for embedded systems; performance, reliability and maturity still have to be assessed.
MARS is a research project to build future embedded systems
A set of technological breakthroughs: models, IP cells, tools, reliability techniques
Toward an RTL model to evaluate performance more accurately
A key enabling technology for the future