TRANSCRIPT
Crypto Hardware Design for Embedded Applications
Dr. Amlan Chakrabarti & Mr. Suman Sau
Real Time Embedded System Research Group
A.K.Choudhury School of Information Technology, University of Calcutta
email: [email protected]
A. Chakrabarti and S. Sau, ICISS 2012
Agenda
• Insight to Embedded Systems
• High Performance Embedded System Design
– Requirements
– Challenges
• Basic Concepts of Cryptography
– Public & Private Keys
– Hashing
– Digital Signature
– Block Cipher and Stream Cipher
• Crypto Hardware for Embedded Systems
– Requirements
– Challenges
– Reconfigurable Hardware
  • Architectures
  • Design examples
• Crypto Engine Design
– Prototype Design Using FPGA
– Example
• Conclusion
Insight to Embedded Systems
• An embedded system is nearly any computing system (other than a general-purpose computer) with the following characteristics
– Single-functioned
  • Typically designed to perform a predefined function
– Tightly constrained
  • Tuned for low cost
  • Built from a single component or a few components
  • Performs functions fast enough
  • Consumes minimum power
– Reactive and real-time
  • Must continually monitor the desired environment and react to changes
– Hardware and software co-existence
Insight to Embedded Systems (2)
Insight to Embedded Systems (3)
• Typical embedded software components:
– Embedded Application Code
– Device Drivers
– A Real-Time Operating System (RTOS)
– Hardware abstraction layer(s)
– System initialization routines
Amlan Chakrabarti, PHYSTENS‐ Dec. 2012
High Performance Embedded Systems (4)
• Massive computational resources with requirements of
– Small size
– Low weight
– Very low power consumption
• Need to employ innovative, advanced system architectures
• Architectures typically feature
– Multiple processor cores
– Tiered memory structures with multi-level memory caching
– Multi-layer bus structures
– Super-pipelining and/or super-scaling
High Performance Embedded Systems (6)
• Increasing software content
– The software content of embedded systems is increasing at a phenomenal rate
– Software development and test often dominate the costs, timelines, and risks associated with today's embedded system designs.
Multi-Core Embedded Systems
Superscalar Era
• Single thread performance scaled at 50% per year
• Bandwidth increases much more slowly, but we could add additional bits or channels.
• Lack of diversity in architecture = lack of individual tuning
• Power wall has capped single thread performance
[Figure: log(Performance) versus year, 1990 to 2005, with curves labelled Processor, Memory Bandwidth and Memory Latency]
Why Multicore?
• It is reasonable to question whether multicore is worth this additional work, or whether it is possible to continue gaining improvements through single-core devices.
• Raising Clock Frequency
– Crank up the frequency
– But it has become all too apparent that pushing the frequency came at a price
– Frequency improvements penalize power consumption, which in turn generates heat that requires more advanced cooling, decreases reliability, and shortens the longevity of the device.
• Other issues
– Techniques such as parallelizing instructions, speculative execution, and pipelining cannot generally scale with the frequency
Multi-core embedded systems
• Need
– Increased computing demands from embedded systems with constrained energy and power
  • A 3G mobile handset's signal processing requires 35-40 GOPS
  • Constraints: power dissipation budget of 1 W
  • Performance efficiency required: 25 mW/GOPS, or 25 pJ/operation
– Multi-core embedded systems provide a promising solution to meet these performance and power constraints
• Multi-core embedded systems architecture
– Processor cores
– Caches
– Memory controllers
– Interconnection network
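The efficiency figure in the "Need" bullets follows from dividing the power budget by the required throughput. A minimal sketch of that arithmetic, assuming the 1 W budget and the 35-40 GOPS range quoted on the slide (the function name is illustrative, not from the talk):

```python
def energy_per_operation_pj(power_watts: float, giga_ops_per_second: float) -> float:
    """Return the energy budget per operation in picojoules."""
    ops_per_second = giga_ops_per_second * 1e9
    joules_per_op = power_watts / ops_per_second
    return joules_per_op * 1e12  # 1 J = 1e12 pJ

# Upper and lower ends of the 35-40 GOPS range under a 1 W budget.
budget_40 = energy_per_operation_pj(1.0, 40.0)
budget_35 = energy_per_operation_pj(1.0, 35.0)
print(f"{budget_40:.1f} pJ/op at 40 GOPS, {budget_35:.1f} pJ/op at 35 GOPS")
```

Because 1 mW/GOPS equals 1 pJ per operation, the two units on the slide describe the same budget; the 25 figure corresponds to the 40 GOPS end of the range.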
Multicore for Multiple Reasons
• Asymmetric multiprocessing (AMP)
– Minimizes overhead and synchronization issues
– Core 1 runs legacy OS, Core 2 runs RTOS, others do a variety of processing tasks (i.e. where applications can be optimized)
• Parallel pipelining
– Taking advantage of proximity
– The performance opportunity...
[Figure: applications running on Linux and an RTOS, with threads Thread1 to Threadn and Video, Compress and Security blocks]
Different Types of Multicore
• Homogeneous
– Describes a multicore environment in which cores are identical and execute the same instruction set
• Heterogeneous
– Describes a multicore environment in which cores are not identical and implement different instruction sets
• The current trend is to create homogeneous multicore devices, but a significant performance advantage can be obtained by using specialized cores and accelerators to offload the main cores.
Major Challenges for Multi-Core Designs
• Communication
– Memory hierarchy
– Data allocation (you have a large shared L2/L3 now)
– Interconnection network
– Scalability
– Bus bandwidth, how to get there?
• Power-Performance: win or lose?
– Borkar's multicore arguments
  • 15% per core performance drop, 50% power saving
  • Giant, single core wastes power when task is small
– How about leakage?
• Process variation and yield
• Programming Model
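The 15% / 50% figures in the Borkar bullet can be turned into a rough comparison. This is only a sketch under strong assumptions that the slide does not state: the workload splits perfectly across cores, and each slowed-down core delivers 85% of the original performance at 50% of the original power. The function name and defaults are illustrative.

```python
def multicore_vs_single(perf_drop: float = 0.15,
                        power_saving: float = 0.50,
                        cores: int = 2):
    """Return (throughput, power) relative to one full-speed core = (1.0, 1.0)."""
    per_core_perf = 1.0 - perf_drop      # each core is slightly slower
    per_core_power = 1.0 - power_saving  # but much cheaper to run
    throughput = cores * per_core_perf   # assumes ideal parallel scaling
    power = cores * per_core_power
    return throughput, power

throughput, power = multicore_vs_single()
print(f"relative throughput {throughput:.2f} at relative power {power:.2f}")
```

Under these assumptions two slower cores give about 1.7 times the throughput for the same power as one fast core; communication and synchronization overheads, listed above as challenges, reduce that gain in practice.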
Basic Concepts of Cryptography
Ciphers ==> ciphertext
• We start with plaintext: something we can read
• We apply a mathematical algorithm to the plaintext
• The algorithm is the cipher
• The plaintext is turned into ciphertext
• Almost all ciphers were secret until recently
• Creating a secure cipher is HARD
What it Looks Like
Symmetric Cipher
Private Key/Symmetric Ciphers
[Figure: cleartext is encrypted to ciphertext with key K, and the ciphertext is decrypted back to cleartext with the same key K]
The same key is used to encrypt the document before sending and to decrypt it once it is received.
Examples: DES, 3DES, AES, Blowfish, IDEA
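To make the "same key on both sides" property concrete, here is a toy Feistel-style block cipher. It is a teaching sketch only, not DES or AES and not secure; the block size, round count, and the SHA-256 based round function are all assumptions chosen for brevity.

```python
import hashlib

BLOCK = 8    # block size in bytes (illustrative choice)
ROUNDS = 4   # number of Feistel rounds (illustrative choice)

def _round(half: bytes, key: bytes, i: int) -> bytes:
    # Round function: a keyed hash truncated to half a block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(right, key, i))
    return left + right

def decrypt_block(block: bytes, key: bytes) -> bytes:
    # Same key, same round function, rounds applied in reverse order.
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(left, key, i)), left
    return left + right

key = b"shared secret K"
ciphertext = encrypt_block(b"8 bytes!", key)
recovered = decrypt_block(ciphertext, key)
```

Decryption succeeds only with the key used for encryption, which is why distributing K safely is the central problem of symmetric schemes.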
Public/Private Keys
• We generate a cipher key pair. One key is the private key, the other is the public key
• The private key remains secret and should be protected
• The public key is freely distributable. It is related mathematically to the private key, but you cannot (easily) reverse engineer the private key from the public key
• Use the public key to encrypt data
• Only someone with the private key can decrypt.
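The key-pair idea can be illustrated with textbook RSA on tiny numbers. The slides do not name an algorithm, so RSA is an assumption here, and the primes below are far too small for real use; no padding is applied.

```python
# Toy RSA key pair from tiny textbook primes (insecure, illustration only).
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent: (n, e) is the public key
d = pow(e, -1, phi)        # private exponent: (n, d) is the private key
                           # (modular inverse via pow needs Python 3.8+)

def encrypt(message: int) -> int:
    """Anyone holding the public key can encrypt."""
    return pow(message, e, n)

def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent d can decrypt."""
    return pow(ciphertext, d, n)

ciphertext = encrypt(65)
plaintext = decrypt(ciphertext)
```

Recovering d from (n, e) requires factoring n, which is trivial here but infeasible for moduli of realistic size.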
Example (Public/Private Key pair)
[Figure: cleartext is encrypted to ciphertext with k1 (public key), and the ciphertext is decrypted back to cleartext with k2 (private key)]
One key is used to encrypt the document, a different key is used to decrypt it.
This is a big deal!
Block Cipher and Stream Cipher
• Block cipher
– operates on fixed-length groups of bits, called blocks
– unvarying transformation that is specified by a symmetric key
– widely used to implement encryption of bulk data
• Stream Cipher
– plaintext digits are combined with a pseudorandom cipher digit stream (key stream)
– each plaintext digit is encrypted one at a time with the corresponding digit of the key stream
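The stream-cipher bullets can be sketched as follows. The key stream here is produced by hashing the key with a counter, which is an assumption made for a short, dependency-free example; it is not a vetted stream cipher such as those used in practice.

```python
import hashlib
from itertools import count

def keystream(key: bytes):
    """Yield an endless pseudorandom byte stream derived from the key."""
    for counter in count():
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        yield from block

def stream_xor(data: bytes, key: bytes) -> bytes:
    # Each plaintext byte is combined with the corresponding key stream byte.
    return bytes(d ^ k for d, k in zip(data, keystream(key)))

message = b"attack at dawn"
ciphertext = stream_xor(message, b"key")
recovered = stream_xor(ciphertext, b"key")   # XOR with the same stream undoes it
```

Unlike a block cipher, no padding is needed: the ciphertext has exactly the length of the plaintext, and the same operation both encrypts and decrypts.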
Hashing
One-Way Encryption
[Figure: cleartext passes through a hashing function to give a fixed-length hash or message digest]
Munging the document gives a short message digest (hash). It is not possible to go back from the digest to the original document.
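A short illustration of the fixed-length property, using SHA-256 from Python's standard `hashlib` module. The choice of SHA-256 is an assumption; the slide does not name a hash function here.

```python
import hashlib

def message_digest(document: bytes) -> bytes:
    """Return the SHA-256 digest of a document (always 32 bytes)."""
    return hashlib.sha256(document).digest()

short_digest = message_digest(b"hi")
long_digest = message_digest(b"x" * 1_000_000)

# Both digests have the same length, whatever the input size.
print(len(short_digest), len(long_digest))
```

A one-character change in the document produces an unrelated digest, which is what makes the digest useful for integrity checks.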
Protecting the Private Key
[Figure: the passphrase entered by the user is hashed* to form a key; a symmetric cipher uses this key to turn k2 (encrypted on disk) into k2 (ready for use)]
k2 = private key
*Such as SHA-1 or SHA-2
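The flow in the figure can be sketched in a few lines: hash the passphrase to get a symmetric key, then use that key to unlock the stored private key. The XOR-based `toy_cipher` and the private key bytes are hypothetical stand-ins; production systems would use a salted, deliberately slow key derivation function and an authenticated cipher instead.

```python
import hashlib

def key_from_passphrase(passphrase: str) -> bytes:
    # Hash of the passphrase becomes the symmetric key (SHA-2 family).
    return hashlib.sha256(passphrase.encode("utf-8")).digest()

def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a hash-derived pad (encrypts and decrypts)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(c ^ p for c, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

private_key = b"-----hypothetical private key bytes k2-----"

# Stored form: k2 encrypted on disk under the passphrase-derived key.
on_disk = toy_cipher(private_key, key_from_passphrase("correct horse"))

# At use time the user re-enters the passphrase to get k2 ready for use.
ready_for_use = toy_cipher(on_disk, key_from_passphrase("correct horse"))
wrong_attempt = toy_cipher(on_disk, key_from_passphrase("wrong guess"))
```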
Digital Signatures
Let's reverse the role of public and private keys. To create a digital signature on a document do:
• Munge a document.
• Encrypt the hash with your private key.
• Send the document plus the encrypted hash.
• On the other end, munge the document and decrypt the encrypted message digest with the person's public key.
• If they match, the document is authenticated.
Digital Signatures
Take a hash of the document and encrypt only that. An encrypted hash is called a "digital signature".
[Figure: the sender hashes the document and encrypts the hash with k2 (private) to form the digital signature; the receiver decrypts the signature with k1 (public) and COMPAREs the result with its own hash of the document]
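The sign-and-compare flow can be sketched with textbook RSA over a SHA-256 digest. Both choices are assumptions for illustration: the slides name no algorithm, the two Mersenne primes are used only because they are easy to write down, and real signature schemes add padding and use much larger keys.

```python
import hashlib

# Toy RSA key pair (illustration only, not a secure key size).
P = 2**61 - 1                       # Mersenne prime
Q = 2**89 - 1                       # Mersenne prime
N = P * Q
E = 65537                           # k1: public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # k2: private exponent (Python 3.8+)

def digest_int(document: bytes) -> int:
    # "Munge" the document: hash it and reduce the digest into the range of N.
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N

def sign(document: bytes) -> int:
    """Encrypt the hash with the private key."""
    return pow(digest_int(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare with a fresh hash."""
    return pow(signature, E, N) == digest_int(document)

document = b"pay 100 to Alice"
signature = sign(document)
print(verify(document, signature), verify(b"pay 900 to Alice", signature))
```

Only the short digest goes through the slow public-key operation, which is why the slide says to encrypt the hash rather than the whole document.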
Security Functions
• data confidentiality
• data integrity
• authentication
Embedded Security Pyramid
Design Challenges
Crypto Hardware Design
Hardware Implementation Benefits
• More secure implementations
• Implementing both algorithms in hardware removes bottleneck associated with
• Single hardware implementation supporting both algorithms reduces costs of separate hardware
Architectures for Security Processing
Second- and third-generation security processing architectures
• Cryptographic Hardware Accelerators
– obtained through custom hardware implementations of cryptographic (asymmetric, symmetric, hash) algorithms
– Applications
  • low-power mobile appliances and smartcards to high-performance network routers and application servers [Discretix; Safenet]
• Embedded Processor Enhancements
– Improving the security processing capabilities of general-purpose processors
– accelerating bit-level arithmetic operations such as the permutations performed in crypto algorithms
– Examples
  • SmartMIPS, ARM SecureCore family, MOSES security processor developed at NEC
Second- and third-generation security processing architectures contd.
• Security Protocol Engines
– Security protocol engines accelerate all or most of the functionality present in a security protocol
– higher efficiency than cryptographic accelerators
– these protocol engines, if programmable, can be used to execute multiple protocols efficiently
– programmable security protocol engines are being used increasingly by embedded system designers when both flexibility and efficiency are required
– Examples
  • 7811 security processor from HIFN can be used in VPNs to perform IPSec processing
HW-SW Codesign
Support multiple algorithms and protocols
Implementation Platforms
Reconfigurable Hardware and Cryptography
• Why Hardware?
– Software implementations are too slow for time-critical applications
– Hardware implementations are intrinsically more secure
• Why Reconfigurable?
Reconfigurable Hardware and Cryptography (2)
• Advantages of reconfigurable platforms
– Algorithm Agility
– Algorithm Upgradability
– Architecture Efficiency
– Resource Efficiency
– Algorithm Modification
– Throughput (Relative to software)
– Cost Efficiency (Relative to ASICs)
ASIC vs FPGA
Design With Reconfigurable Hardware
Programmable Hardware: FPGA
• Re-programmable hardware
• Enables the development of nearly all digital circuits
• Leading vendors: Xilinx, Altera, MicroSemi
• Trend to heterogeneous multicore systems out of processors, DSPs, high-speed I/O and programmable logic
• Usually a rapid prototyping platform, but increasingly exploited as an ASIC substitute
Advantages of FPGA Embedded Processor Systems
• Merge CPU and I/O functions onto a single board
• Flexible design template – optimize power, data, and form factor to match application and I/O requirements
• Tightly coupled high speed logic and control system interface on a single chip – versatile tradeoff between hardware and software task
• Advanced tools bridge software and logic development, provide BSP generation for Linux, VxWorks
Embedded Processors on FPGA
• Hard Core
– Embedded processor is a dedicated physical component of the chip, separate from the programmable logic
– E.g. Xilinx Virtex families w/ PowerPC 405
• Soft Core
– Embedded processor is built out of the programmable logic on the chip
– E.g. Xilinx MicroBlaze, Altera NIOS
Hard Core vs. Soft Core Considerations
• Both cores utilize about the same % of total chip resources
• Hard core performance = 3-4x faster than fastest soft cores
• FPGAs with hard cores are more expensive
• Soft cores more flexible
– Multiple cores can be used in a single chip
– Can be used in a chip with a hard core
Architecture of the appl. specific FPGA
Application specific FPGA: Toolflow
Crypto Engine Design
What is crypto engine?
• Designed to implement the specific cipher need
• Implementation through a library of general purpose FPGA design blocks
• Can also be configured for multiple ciphers
Crypto Engine as a Coprocessor
• Customized co-processor core as per the requirement of the algorithm
• Main processor can execute the other required application tasks concurrently
• Enables multi-tasking
• Communicates with the main processor core through a bus
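The division of labour between main processor and crypto co-processor can be modelled in software. This is only a behavioural sketch: two FIFO queues stand in for the bus links, a thread stands in for the co-processor, and SHA-256 stands in for whatever cipher the engine implements. All names are illustrative and none of this is the authors' hardware design.

```python
import hashlib
import queue
import threading

to_coproc = queue.Queue()     # main processor -> co-processor link
from_coproc = queue.Queue()   # co-processor -> main processor link

def crypto_coprocessor() -> None:
    """Consume data blocks from the bus, process them, and send results back."""
    while True:
        block = to_coproc.get()
        if block is None:          # sentinel: no more work
            break
        from_coproc.put(hashlib.sha256(block).digest())

def run(blocks):
    worker = threading.Thread(target=crypto_coprocessor)
    worker.start()
    for block in blocks:           # offload the crypto work over the bus
        to_coproc.put(block)
    to_coproc.put(None)
    other_work = sum(range(1000))  # main processor carries on with other tasks
    worker.join()
    results = [from_coproc.get() for _ in blocks]
    return other_work, results

other_work, results = run([b"block-1", b"block-2"])
```

The main thread never waits on an individual block; it only collects results once its own work is done, which mirrors the multi-tasking benefit listed above.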
Co‐processor based Hardware Design on FPGA
Co-processor using FSL bus (Internal Architecture)
Example Design of AES Crypto-Engine
• Internal Architecture of AES Core
AES Engine as coprocessor with MicroBlaze core
System Architecture
Research Issues
• Hardware design of latest crypto/hash algorithms
• Parallelization of cryptographic algorithms
– Higher throughput
• Low power design
• Hardware error detection
Conclusion
• The multi-core architecture presented provides a good trade-off between flexibility, performance and resource consumption
• Crypto-accelerators and crypto-engines can be efficiently catered for by multi-core based designs
• Synchronization is a great challenge
• Reconfigurable hardware provides new opportunities
• Prof. Ranjan Ghosh, University of Calcutta
• Mr. Rourab Paul, Research Scholar, University of Calcutta
• Mr. Sangeet Saha, M.Tech. student, University of Calcutta