Design and Implementation of a Hardened Reconfiguration Controller for Self‐Healing Systems on SRAM‐Based FPGAs

A Master Thesis Presented to Politecnico di Milano

By: Naser Derakhshan

Examiner: Axel Jantsch

Supervisor: Cristiana Bolchini

Spring 2013


IN THE NAME OF GOD


Abstract

As digital systems become larger and more complex, their dependability becomes more important, particularly in mission‐critical and safety‐critical applications. Among the various platforms available for implementing a digital system, SRAM‐based Field Programmable Gate Arrays (FPGAs) are increasingly adopted in embedded systems because of their flexibility in meeting multiple requirements, such as low cost, high performance, and fast turnaround time, compared to fixed Application Specific Integrated Circuits (ASICs). The most attractive feature of SRAM‐based FPGAs is the ability to re‐program (re‐configure) the device in a few clock cycles. This feature is further enhanced by Partial Dynamic Reconfiguration (PDR), which allows part of the device to be reconfigured on the fly while the device is operating. Nevertheless, SRAM‐based FPGAs are more susceptible to faults than other types of FPGAs and ASICs. One of these faults, which mostly occurs at high altitude (40,000 feet and above), is a bit flip in the configuration memory caused by ionizing radiation. If this bit flip alters critical bits (those bits that cause functional failure if they change state) in the configuration memory, the function of the design can be corrupted. Thus, appropriate hardening techniques should be used in order to increase device dependability.

In general, fault‐tolerant techniques are mostly based on spatial redundancy. However, these techniques can be combined with the FPGA’s reconfiguration capability for recovery. Since the complexity of systems is increasing and hardening techniques demand more resources, a single FPGA may not suffice to contain the whole system. In this case, multi‐FPGA platforms have to be taken into account.

In this thesis, a hardened generic reconfiguration controller that manages the occurrence of soft errors in self‐healing systems implemented on SRAM‐based FPGAs is demonstrated and analyzed. The controller is able to correct single event upsets (SEUs) in the configuration memory, in both static and partially reconfigurable regions, by means of the Xilinx PDR capability. Moreover, the controller itself is hardened with fault‐tolerant techniques and is able to detect and mask its own errors. The developed controller is compared with similar approaches based on a micro‐controller inside the FPGA. Finally, the presented structure is proven fully functional on the XUPV5‐LX110T evaluation board.

Preface

This report is submitted as a master thesis to fulfill the requirements for the master's degree in the System‐on‐Chip program at the ICT School of the Royal Institute of Technology (KTH). The thesis was carried out in spring 2012 at Politecnico di Milano during an exchange study.

I would like to take this opportunity to express my sincere appreciation to Prof. Cristiana Bolchini, my supervisor at Politecnico di Milano, for her constant support, motivation and guidance during this project. Further, I would like to thank Dr. Antonio Miele, Dr. Chiara Sandionigi and Matteo Carminati for their practical advice, and all MicroLAB students for their kind support during this thesis. I would also like to express my sincere gratitude to all the KTH and Politecnico staff whose names I might not remember but who helped me a lot to finish my master thesis.

Last but foremost, I wish to thank my parents, Akbar Derakhshan and Tooran Hamedmoghadam, whose dedication, spiritual support and encouragement throughout my life are beyond comparison. Moreover, I wish to thank my lovely wife, Zeinab Hassani, who interrupted her studies in Iran to accompany me during my study abroad. I really could not have finished my master study without her support.

Table of Contents

1 Introduction
2 Background and Related Work
2.1 Motivation
2.2 Working scenario
2.3 Adopted Fault Model
2.4 Self‐Healing System Architecture
2.5 SEU Mitigation Schemes
2.6 Summary
3 Proposed Controller Architecture
3.1.1 Implemented design in the Master side
3.1.2 Implemented design in the slave side
3.2 Summary
4 Design Hardening
4.1 State Machine Encoding
4.2 Internal Signal Hardening
4.3 Interface Hardening
4.4 Bitstream Memory Protection
5 Test Results
6 Conclusion and Future Works
7 Glossary
8 Works Cited
9 Appendices
9.1 Appendix A: Bitstream Scrubbing and Readback
9.2 Appendix B: Redundancy
9.3 Appendix C: Xilinx Virtex‐5 overview
9.4 Appendix D: Configuration modes in Virtex 5
9.4.1 Configuration Modes and Pins in Virtex 5 [31]
9.4.2 Serial Configuration Interface [31]

LIST OF FIGURES

FIGURE 1 BASIC PREMISE OF PARTIAL RECONFIGURATION
FIGURE 2 FT SYSTEM ON MULTI‐FPGA PLATFORM. DISTRIBUTED SOLUTION (LEFT); CENTRALIZED SOLUTION (RIGHT)
FIGURE 3 A CONFIGURATION CONTROLLER BLOCK‐DIAGRAM BASED ON MICROBLAZE
FIGURE 4 RECONFIGURATION CONTROLLER BLOCK DIAGRAM
FIGURE 5 SLAVE FPGA (LEFT) AND MASTER FPGA (RIGHT)
FIGURE 6 CONFIGURATION CONTROLLER BLOCK DIAGRAM
FIGURE 7 BLOCK DIAGRAM OF THE MASTER SIDE AND THE TOP MODULE SIGNALS
FIGURE 8 PR CONTROLLER INTERFACE
FIGURE 9 MODULES INSIDE THE TOP (MASTER SIDE)
FIGURE 10 FAULT‐CLASSIFIER INTERFACE
FIGURE 11 FAULT CLASSIFIER FINITE STATE MACHINE DIAGRAM
FIGURE 12 PR CONTROLLER INTERFACE
FIGURE 13 PR CONTROLLER FINITE STATE MACHINE DIAGRAM
FIGURE 14 COMPLETE BLOCK DIAGRAM
FIGURE 15 FULL CONFIGURATION CONTROLLER INTERFACE
FIGURE 16 FULL CONFIGURATION CONTROLLER FINITE STATE MACHINE
FIGURE 17 THE IMPLEMENTED DESIGN WITH AN EXTERNAL MEMORY FOR STORING PARTIAL BIT‐STREAM FILES
FIGURE 18 IMPLEMENTED DESIGN ‐ SLAVE SIDE
FIGURE 19 DIFFERENTIAL INPUT BUFFER PRIMITIVE (IBUFDS)
FIGURE 20 THE CONNECTION BETWEEN TWO EVALUATION BOARDS
FIGURE 21 GENERATED PR REGIONS ON THE FPGA FABRIC
FIGURE 22 A SCHEMATIC FPGA STRUCTURE. TAKEN FROM [8]
FIGURE 23 TMR BASIC PRINCIPLE
FIGURE 24 TMR ‐ DEVICE LEVEL
FIGURE 25 XILINX VIRTEX‐5 XC5VLX110T DEVICE. TAKEN FROM [44]
FIGURE 26 XILINX XUPV5‐LX110T EVALUATION PLATFORM. TAKEN FROM [46]
FIGURE 27 VIRTEX‐5 FPGA SERIAL CONFIGURATION INTERFACE. TAKEN FROM [31]
FIGURE 28 SERIAL CONFIGURATION CLOCKING SEQUENCE. TAKEN FROM [31]
FIGURE 29 MASTER SERIAL MODE CONFIGURATION. TAKEN FROM [31]

LIST OF TABLES

TABLE 1 FPGA VS. ASIC DESIGN ADVANTAGES. TAKEN FROM [10]
TABLE 2 TOP MODULE (MASTER SIDE) INTERFACE PINS
TABLE 3 FAULT‐CLASSIFIER INTERFACE PINS
TABLE 4 PR CONTROLLER PIN DESCRIPTION
TABLE 5 FULL CONFIGURATION CONTROLLER. PIN DESCRIPTION
TABLE 6 BIT ORDERING FOR ICAP 8‐BIT MODE
TABLE 7 BIT ORDERING
TABLE 8 DEVICE UTILIZATION SUMMARY FOR CONFIGURATION CONTROLLER (EXCLUDE BITSTREAM MODULE)
TABLE 9 CONFIGURATION TIMES FOR DIFFERENT PARTIAL BITSTREAMS
TABLE 10 RESOURCE UTILIZATION OF ICAP CONTROLLER
TABLE 11 VIRTEX‐5 DEVICE FRAME COUNT, FRAME LENGTH, OVERHEAD, AND BITSTREAM SIZE [31]
TABLE 12 PERFORMANCE OVERVIEW OF MITIGATION SCHEMES. PART OF THE TABLE IS TAKEN FROM [12]
TABLE 13 VIRTEX‐5 (LX110T) DEVICE SPECIFICATION TAKEN FROM [43]
TABLE 14 VIRTEX‐5 CONFIGURATION MODES
TABLE 15 VIRTEX‐5 FPGA SERIAL CONFIGURATION INTERFACE PINS

1 Introduction

As digital systems become larger and more complex, their dependability becomes more important, particularly in mission‐critical and safety‐critical applications. Among the various platforms available for implementing a digital system, SRAM‐based Field Programmable Gate Arrays (FPGAs) are increasingly adopted in embedded systems because of their flexibility in meeting multiple requirements, such as low cost, high performance, and fast turnaround time, compared to fixed Application Specific Integrated Circuits (ASICs). The most attractive feature of SRAM‐based FPGAs is the ability to re‐program (re‐configure) the device in a few clock cycles, which allows the system implemented on the FPGA to be updated during the design lifetime. This is one of the reasons why SRAM‐based FPGAs are considered for mission‐critical applications where direct maintenance is difficult. The feature is further enhanced by Partial Dynamic Reconfiguration (PDR), which allows part of the device to be reconfigured on the fly while the device is operating. Some advantages of using SRAM‐based FPGAs in space applications are discussed in [1], [2].

Nevertheless, SRAM‐based FPGAs are more susceptible to faults than other types of FPGAs and ASICs. One of these faults, which mostly occurs at high altitude (40,000 feet and above), is a bit flip in the configuration memory caused by ionizing radiation [3], [4], [5]. Ionizing radiation (such as neutrons or alpha particles emitted by natural radioactive isotopes present in device packaging) is able to induce undesired single event effects (SEEs) in most silicon devices. SEEs that result in temporary damage to the device are called soft errors. Soft errors in FPGAs often show up as bit flips in user flip‐flops, internal block memory and configuration memory. Bit flips within the configuration memory are especially challenging. If they alter critical bits (those that cause functional failure if they change state) in the configuration memory, the function of the design can be corrupted. This is clearly unacceptable for mission‐ or safety‐critical applications. Thus, appropriate hardening techniques should be applied to SRAM‐based FPGAs before they are deployed.

In general, fault‐tolerant techniques are mostly based on spatial redundancy. However, these techniques can be combined with the FPGA’s reconfiguration capability for recovery. Since the complexity of modern systems is increasing and hardening techniques demand more resources, a single FPGA may not suffice to contain the whole system. In this case, multi‐FPGA platforms have to be taken into account.

In this thesis, a generic dynamic partial reconfiguration controller for a fault‐tolerant design based on a multi‐FPGA platform is proposed. The final goal is a dependable controller that is able to recover all recoverable faults (faults that do not cause permanent damage to the FPGA fabric) by exploiting the reconfiguration capability of the FPGAs. The controller is able to correct SEUs in the configuration memory of the neighbor FPGA by means of the Xilinx PDR (Partial Dynamic Reconfiguration) capability. It can correct and classify soft errors in the configuration memory, in both static and partially reconfigurable regions. Moreover, the controller itself is hardened and is able to detect and mask its own errors.

Modern fault‐tolerant architectures using PDR often utilize microprocessors such as PowerPC or MicroBlaze embedded into the FPGA as the main processing unit of the configuration controller, like the ones presented in [6], [7]. The innovative contribution of this thesis is the implementation of all units and components necessary for the fault‐tolerant (FT) configuration controller generically on the FPGA fabric. Moreover, this thesis focuses on multi‐FPGA platforms, which are less discussed in the literature. We propose a distributed solution in which each FPGA on the multi‐FPGA platform is responsible for monitoring the neighbor FPGA on the platform and, in case of faults, recovering it. This method, which is discussed in [8], increases the overall reliability compared with a centralized solution. In addition, the proposed solution is able to correct single or multiple faults (assuming the faults are detected) inside the FPGA.

The rest of this thesis is organized as follows. Chapter 2 introduces the preliminary aspects of the problem and the background elements needed to understand the rest of the thesis. It also discusses other SEU mitigation schemes and introduces the self‐healing system architecture on which our controller is based. Chapter 3 describes the proposed controller architecture. Chapter 4 presents the design hardening of the implemented controller. Chapter 5 presents the test results. Finally, Chapter 6 draws some conclusions and gives some possible future research directions.

2 Background and Related Work

In this thesis, we propose a dependable reconfiguration controller for embedded systems on multi‐FPGA platforms. Our aim is to increase the overall reliability of the system by means of the PDR capability. The chapter is structured as follows: Section 2.1 presents the motivation for the proposed work and introduces the background elements needed to understand the rest of the thesis. Section 2.2 describes the working scenario of this thesis and its characteristics. Section 2.3 explains the adopted fault model. Section 2.4 presents the self‐healing system architecture, which we follow in the rest of the thesis. Other mitigation schemes are discussed in Section 2.5. Finally, Section 2.6 summarizes the chapter.

2.1 Motivation

Occasionally, electronic devices show erroneous behavior for no apparent reason. Through experiments and statistical analysis, scientists and engineers discovered that background radiation is the cause. These failures are generally rare and can be ignored in common applications. However, for many applications, such as mission‐critical and safety‐critical applications, it is important to consider the role of radiation in system reliability. Reliability problems due to radiation most commonly fall into the category termed single event effects (SEEs) and show up as a type of soft error called single event upsets (SEUs) [9].

Among various available platforms for implementing a digital system, SRAM‐based Field Programmable

Gate Arrays (FPGAs) are increasingly adopted in embedded systems due to their flexibility in achieving

multiple requirements such as low cost, high performance, and fast turnaround time compared to Fixed

Application Specific Integrated Circuits (ASICs). Table 1 compares FPGAs with ASICs in various aspects.

Table 1 FPGA vs. ASIC Design Advantages. Taken from [10]

FPGA Design
Advantage | Benefit
Faster time‐to‐market | No layout, masks or other manufacturing steps are needed
No upfront non‐recurring expenses (NRE) | Costs typically associated with an ASIC design
Simpler design cycle | Due to software that handles much of the routing, placement, and timing
More predictable project cycle | Due to elimination of potential re‐spins, wafer capacities, etc.
Field reprogrammability | A new bitstream can be uploaded remotely

ASIC Design
Advantage | Benefit
Full custom capability | For design since device is manufactured to design specs
Lower unit costs | For very high volume designs
Smaller form factor | Since device is manufactured to design specs

FPGA designs offer faster time to market and lower non‐recurring expenses (NRE). They also have a simpler design cycle than ASICs. However, in general, FPGA designs exhibit worse performance than ASICs in terms of logic density, circuit speed, and power consumption. In [11], the authors present empirical measurements quantifying the gap between 90 nm CMOS FPGAs and 90 nm CMOS standard‐cell ASICs. They observe that for circuits implemented entirely using LUTs and flip‐flops (logic‐only), an FPGA is on average 40 times larger and 3.2 times slower than a standard‐cell implementation. An FPGA also consumes on average 12 times more dynamic power than an equivalent ASIC.

“Although FPGAs used to be selected for lower speed, complexity, volume designs in the past, today’s FPGAs easily push the 500 MHz performance barrier. With unprecedented logic density increases and a host of other features, such as embedded processors, DSP blocks, clocking, and high‐speed serial at ever‐lower price points, FPGAs are a compelling proposition for almost any type of design” [10]. (Xilinx Zynq‐7000 technology has already passed 800 MHz.) The most attractive feature of SRAM‐based FPGAs is the ability to re‐program (re‐configure) the device in a few clock cycles, which allows the system implemented on the FPGA to be updated during the design lifetime. This is one of the reasons why SRAM‐based FPGAs are considered for mission‐critical applications where direct maintenance is difficult. The feature is further enhanced by Partial Dynamic Reconfiguration (PDR), which allows part of the device to be reconfigured on the fly while the device is operating.

In this thesis, we focus on SRAM‐based FPGAs in multi‐FPGA platforms. In an SRAM‐based FPGA, the combinational and sequential logic is implemented in programmable configurable logic blocks (CLBs), which are customized by loading configuration data (the bitstream) into the SRAM cells of the program memory [12]. Since the functionality of an SRAM‐based FPGA is determined by its configuration memory, any bit flip that alters critical bits (those bits that cause functional failure if they change state) in the configuration memory corrupts the function of the design. Thus, to have a dependable system, especially in a harsh environment, the system on the chip should be hardened using suitable FT techniques.

2.2 Working scenario

The working scenario of this thesis is space applications, where SEUs are caused by secondary particles. According to [9], these are “secondary particles liberated by the collision of a neutron with a silicon atom or from a contaminant emitting an alpha particle in an electronic device. The neutrons are generated when cosmic rays and protons from space interact with the atmosphere. The cosmic rays are from both inside (the sun) and outside (novas and supernovas) of the solar system. The neutrons range in energy from below 1 million electron volts (MeV) to more than 1,000 MeV.”

Although it is possible to protect electronic equipment against these high‐energy neutrons by means of shielding, this is not practical for most applications because the amount of material required to make the shield is prohibitive (e.g., as much as 30 meters of water for high‐energy neutrons) [9]. In addition to neutron effects, an SEU can be caused by alpha particles emitted by natural radioactive isotopes present in device material and packaging [9].

2.3 Adopted Fault Model

We can organize the effects of ionizing radiation into three main categories: transient current pulses, changes in memory values (such as bit flips or SEUs), and latch‐up. The first two categories result in recoverable (or soft) faults, while latch‐up, which can result in severe overheating, melting, or vaporization, can damage the FPGA fabric and results in non‐recoverable (or hard) faults. Because maintenance is difficult in mission‐critical applications, we also have to add aging effects to the above‐mentioned categories. Aging effects can also end in non‐recoverable faults. Since the primary concern for FPGAs is soft faults, we expand on the first two categories in this section:

1‐ Transient current pulses may change the values of internal signals or may strike the clock line. Their effect may be transient and vanish after a short time, or they may propagate to flip‐flop inputs and get registered. In both cases, they can cause an erroneous value that leads to an incorrect result at the output. A suitable error detection and masking technique is necessary to avoid the propagation of an incorrect result to other modules. Such an approach is discussed in [13], [14]. The fault can then be recovered by performing a reset.

2‐ The second type of recoverable fault is a change in memory values. SRAM‐based FPGAs have two types of memory: the user registers and block RAMs, which store the user data, and the static configuration memory, which stores the configuration bitstream. Any change in the configuration memory modifies the functionality of the system implemented inside the FPGA. The only way to recover the configuration memory is to overwrite the corrupted portion with the correct portion of the bitstream. In this work, we concentrate on hardening the design implemented inside the FPGA against upsets in the configuration memory. The proposed controller is able to correct single‐bit or multiple‐bit upsets (MBUs) in the configuration memory by partially reconfiguring the corrupted portion of the memory or, in the worst case, reconfiguring the whole FPGA.
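The recovery strategy described in item 2 can be illustrated with a small software model. The Python sketch below is only an illustration under stated assumptions, not the controller implemented in this thesis: the configuration memory is modeled as a list of fixed‐size frames, a golden copy of the bitstream is assumed to be available, and the function names (`inject_upset`, `find_corrupted_frames`, `recover`) as well as the `full_reconfig_threshold` policy parameter are hypothetical.

```python
from typing import List

Frame = List[int]  # one configuration frame, modeled as a list of 32-bit words


def inject_upset(memory: List[Frame], frame: int, word: int, bit: int) -> None:
    """Model an SEU: flip a single bit of the configuration memory in place."""
    memory[frame][word] ^= 1 << bit


def find_corrupted_frames(memory: List[Frame], golden: List[Frame]) -> List[int]:
    """Readback-and-compare: indices of frames that differ from the golden copy."""
    return [i for i, (m, g) in enumerate(zip(memory, golden)) if m != g]


def recover(memory: List[Frame], golden: List[Frame],
            full_reconfig_threshold: int = 2) -> str:
    """Rewrite corrupted frames; fall back to a full rewrite if too many are hit.

    Returns "none", "partial" or "full" to report which action was taken.
    The threshold is a hypothetical policy parameter, not taken from the thesis.
    """
    corrupted = find_corrupted_frames(memory, golden)
    if not corrupted:
        return "none"
    if len(corrupted) > full_reconfig_threshold:
        for i, frame in enumerate(golden):   # worst case: reload everything
            memory[i] = list(frame)
        return "full"
    for i in corrupted:                      # normal case: repair only bad frames
        memory[i] = list(golden[i])
    return "partial"


# Usage: four frames of two words each, one single-bit upset, then recovery.
golden = [[0xA5A5A5A5, 0x0000FFFF] for _ in range(4)]
memory = [list(f) for f in golden]
inject_upset(memory, frame=2, word=1, bit=7)
corrupted_before = find_corrupted_frames(memory, golden)
action = recover(memory, golden)
```

In this model a single upset is repaired by rewriting one frame only, which mirrors the idea of reconfiguring just the corrupted portion; the full rewrite stands in for reconfiguring the whole FPGA.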

2.4 Self‐HealingSystemArchitecture

We applied a hybrid fault‐tolerant technique to our multi‐FPGA architecture. In this architecture each

FPGA hosted a portion of the design. This portion on each FPGA is hardened with hardware redundancy

techniques and distributed among available partially reconfigurable Regions (PRR‐1 to PRR‐n).

Partitioning the system into different portion and then into n PR regions is not mentioned here since the

proposed architecture is not depended on it. The hardware redundancy techniques implemented in this

scenario

controller

Partial Re

file [15]. A

of on‐site

design. Pa

operating

configure

without c

being reco

In this sce

reconfigu

modified

contents

loading o

should no

The partia

be update

download

stored in

If these p

fault toler

other logi

Partial R

comprehe

design aft

reconfigu

12 A brief in13 Protecte

are able to

r of the neigh

configuration

According to X

e programmi

artial Reconfi

g FPGA design

s the FPGA, p

ompromising

onfigured.” [1

enario, the FP

rable (PR) re

by means o

of the partia

f a partial bi

ot) be reconfig

al BIT files (PR

ed later durin

ding one of se

an external m

partial bit files

rance by reco

cs remains fu

econfiguratio

ensive solutio

ter reconfigu

ration of a b

ntroduction toed against radia

detect, loca

bor FPGA for

n is the modif

Xilinx Partial

ng and re‐p

[…] locate and mask faults and inform the faults to the reconfiguration controller for recovery12.

12 […] hardware redundancy is available at Appendix B.

[…] modification of an operating FPGA design by loading a partial configuration […] Reconfiguration User Guide, “FPGA technology provides the flexibility of on‐site programming and re‐programming without going through re‐fabrication with a modified design. Partial Reconfiguration (PR) takes this flexibility one‐step further, allowing the modification of an operating FPGA design by loading a partial configuration file, usually a partial bit file. After a full bit file configures the FPGA, partial BIT files can be downloaded to modify reconfigurable regions in the FPGA without compromising the integrity of the applications running on those parts of the device that are not being reconfigured.” [15] The basic block diagram of Partial Reconfiguration is illustrated in Figure 1.

Figure 1 Basic Premise of Partial Reconfiguration

FPGAs are structured into two separate regions: a static region and several partial reconfigurable regions. The portion of the system that is implemented in the PR regions can be […] of partial reconfiguration controller. The reconfigurable logic is replaced by the contents of the partial bit file. The static logic remains functioning and is completely unaffected by the loading of a partial bit file. The static region contains the other parts of the design which cannot (or […] reconfigured.

The partial bit files (PR_Bit_x.bit) should be calculated offline prior the FPGA design; however, they may be updated during the design lifetime. As shown in Figure 1, each PR modules can be modified by several available partial bit files, PR_Bit_A.bit to PR_Bit_D.bit. These bit files can be […] memory.

[…] are stored in an protected memory13, partial reconfiguration can improve FPGA […] reconfiguring the faulty portion of the FPGA with a correct partial bitstream while the […] functioning and are completely unaffected.

Partial Reconfiguration can be done via JTAG, SelectMAP, Master‐Serial, or ICAP. ICAP is a comprehensive solution for PR design regarding to the capability of doing readback and verifying the design after reconfiguration. There are some status registers in ICAP which indicate an error if partial reconfiguration of a block has not been succeeded. Furthermore, it is possible to implement a CRC

checker in the PR controller to check the CRC of the received file before forwarding it to the ICAP. By using these two techniques (monitoring the ICAP registers and CRC checking), we can verify that the target FPGA has been partially reconfigured correctly.
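The CRC gate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the generic CRC‐32 from Python's zlib module and a hypothetical framing in which the checksum is appended as four bytes after the bitstream. The CRC embedded in real Xilinx bitstreams and any hardware checker are not modeled, and the names make_packet, forward_if_valid and icap_write are invented for the example.

```python
import zlib


def make_packet(bitstream: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 to a partial bitstream (hypothetical framing)."""
    crc = zlib.crc32(bitstream) & 0xFFFFFFFF
    return bitstream + crc.to_bytes(4, "big")


def forward_if_valid(packet: bytes, icap_write) -> bool:
    """Forward the payload to the ICAP writer only if the trailing CRC matches."""
    if len(packet) < 4:
        return False  # too short to even carry a checksum
    payload = packet[:-4]
    received = int.from_bytes(packet[-4:], "big")
    if (zlib.crc32(payload) & 0xFFFFFFFF) != received:
        return False  # corrupted file: nothing reaches the ICAP
    icap_write(payload)
    return True
```

In this sketch a corrupted file is simply dropped; a real controller would presumably also raise an error signal so that the transfer can be retried.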

The PR approach has both advantages and disadvantages. These include:

Advantages:

• Partial BIT files are calculated offline and stored in the FPGA prior to the FPGA design. Therefore, the controller needed for partial reconfiguration can be smaller than with the other methods.

• BIT files can be updated later during the design lifetime

• The PR flow is straightforward and can be done from beginning to end in Xilinx PlanAhead™ software

• The function of each partially reconfigurable region can be changed completely by using a different BIT file (ability to time‐multiplex hardware dynamically)

• Many interfaces exist to perform partial reconfiguration from outside

• There is no need to know the memory address of the PR modules

Disadvantages:

• Extra memory is needed to store both full configuration and partial reconfiguration BIT files

• Not all implementation options are available to the PR flow (e.g. techniques that perform optimization across the entire design). [15]

• PR design affects performance. In general, one should expect a 10% degradation in clock frequency, and should not expect to exceed 80% slice packing density. [15]

• Routing challenges may occur if the reconfigurable region is too small or is constructed of non‐rectangular shapes. [15]

We considered a distributed solution for this multi‐FPGA design, in which each FPGA is responsible for monitoring its neighbor FPGA and, in case of a fault, recovering the neighbor FPGA to a correct state14. Another approach would be a centralized solution in which a rad‐hard FPGA monitors all other FPGAs in the design. The main advantage of the distributed solution over the centralized one is that there is no need for a controller residing in a separate device; it can be implemented alongside the main system on the same FPGAs [8]. Moreover, the distributed solution is independent of the number of FPGAs, whereas in the centralized solution the number of FPGAs must be defined prior to the design. In both scenarios the original configuration bitstreams should be protected against SEUs. We will discuss this issue in section 4.4. Figure 2 illustrates the basic principle of the distributed and centralized solutions.

14 By means of a reconfiguration controller

Figure 2 FT system on Multi‐FPGA platform. Distributed solution (left); Centralized solution (right)

Like any other digital design, there is a tradeoff between software‐based and hardware‐based implementation for the above‐mentioned architecture. In a software‐based implementation, a soft or hard processor (such as MicroBlaze, PowerPC or ARM) should be embedded into the design. Then the processor, as shown in Figure 3, can manage the reconfiguration process by setting the required registers for read/write operation of the Xilinx XPS HWICAP core [16]. Although a software‐based solution gives better flexibility to the user, the processor itself is a point of failure and should be hardened. The only method for hardening a soft processor involves triplication, which could be very costly in terms of resource utilization.

Figure 3 A configuration controller block diagram based on MicroBlaze

In our proposed hardware‐based solution, the controller is implemented purely in hardware without any processors. Implementing it this way lets the designer apply any available FT technique to the controller. In addition, a hardware implementation can be optimized for space and speed.

2.5 SEU Mitigation Schemes

Any time the FPGA is powered up, all its configuration contents are refreshed. Therefore, the simplest way to recover the FPGA to a correct condition is to power cycle it. However, this method is not applicable to many applications because it will cause the FPGA to stop functioning for several seconds. This duration is not tolerable for many applications. In these applications, other mitigation techniques

should be deployed. Moreover, the state of the FPGA will be lost, and a synchronization technique should be deployed to synchronize the FPGA with the other processing elements in the design.

Another mitigation scheme is “bitstream scrubbing and readback” (or simply scrubbing), which means reading back the configuration bitstream stored in the configuration memory, comparing it with an original one and correcting any affected configuration bits. The process is performed continuously, independently of the occurrence of a soft error. Such an approach is discussed in [17], [18]. Since this approach is blind, it introduces latency in detecting a fault, and it may cause much more overhead than the other approaches because of the continuous readback and checking15. Some work has been carried out recently to make scrubbing faster and on demand. In [19] the author proposed a constraint‐driven re‐placement method to reduce the number of sensitive configuration frames and consequently the scrubbing time.
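The readback, compare and correct cycle of blind scrubbing can be illustrated with a small Python model. It is a sketch only: configuration memory and the original ("golden") bitstream are represented as plain lists of frame values, and the function name scrub is an assumption made for this example. Real scrubbers work on device‐specific frames through a configuration port.

```python
from typing import List


def scrub(config_memory: List[int], golden: List[int]) -> List[int]:
    """Compare every frame with the golden copy, rewrite mismatches in place,
    and return the indices of the frames that were repaired."""
    repaired = []
    for index, (current, reference) in enumerate(zip(config_memory, golden)):
        if current != reference:
            config_memory[index] = reference  # correct the upset frame
            repaired.append(index)
    return repaired
```

Every pass visits all frames whether or not an upset occurred, which is the source of the detection latency and overhead mentioned in the text.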

A faster, on‐demand solution is the modification of an operating FPGA design by loading a partial configuration bitstream. Partial reconfiguration is only a recovery technique, which means that soft errors must be detected (and located) first, before they can be repaired. Detection and masking can be performed by well‐known hardware redundancy techniques, either triple modular redundancy (TMR) [20], [21], [22], [23] or duplication with comparison (DWC) combined with concurrent error detection (CED) [24].
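As a brief illustration of the two redundancy schemes named above, the following Python sketch models a bitwise TMR majority voter and a DWC comparator. The function names are hypothetical, and the replica outputs are plain integers; the sketch assumes at most one faulty replica at a time.

```python
from typing import Optional


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three replica outputs (masks a single faulty replica)."""
    return (a & b) | (a & c) | (b & c)


def tmr_faulty_replica(a: int, b: int, c: int) -> Optional[int]:
    """Index (0, 1 or 2) of the first replica that disagrees with the vote, else None."""
    voted = tmr_vote(a, b, c)
    for index, value in enumerate((a, b, c)):
        if value != voted:
            return index
    return None


def dwc_error(a: int, b: int) -> bool:
    """Duplication with comparison: detects a mismatch but cannot mask or locate it."""
    return a != b
```

TMR both masks the fault and identifies the replica to repair, whereas DWC only raises an error flag, which is why the text pairs it with concurrent error detection.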

A first implementation of this kind of reconfiguration controller was presented in [25]. The author of that paper proposes a distributed mesh topology in which each FPGA monitors its neighbor FPGA in a multi‐FPGA platform and triggers the reconfiguration of the faulty portion of the neighbor FPGA. However, the solution proposed there for hardening the reconfiguration controller is based on blind readback and checking, which may introduce delay in recovery. Another work is presented in [16], where the author compares different software‐based solutions for the reconfiguration controller to achieve the minimum reconfiguration time. However, since the reconfiguration controller is implemented in the embedded processor, hardening the controller is very difficult. The latest study in this area is presented in [26], where the author implemented a hardware‐based ICAP controller for partial reconfiguration. We will compare these approaches with our proposed controller in terms of speed and resource utilization in the upcoming discussion.

15 For more information regarding scrubbing and Xilinx SEM controller please refer to Appendix A.

2.6 Summary

In this chapter, we presented the requirements for a multi‐FPGA system in a mission‐critical application. We discussed the importance of SRAM‐based FPGAs and introduced their limitations in different environments. We also included a brief comparison of performance, consumption, cost, and flexibility between SRAM‐based FPGAs and similar embedded processing units. Then, we showed that although SRAM‐based FPGAs are attractive not only in commercial markets but also in mission‐critical and safety‐critical applications, special hardening techniques must be used in a harsh environment. Moreover, we described our working scenario and its characteristics. Next, we mentioned the main types of fault that threaten electronic devices in this environment, and we discussed radiation and its effects on electronic devices in general and on SRAM‐based FPGAs in particular.

Furthermore, we introduced our self‐healing system architecture, on which our controller is based. We also discussed other possible approaches for increasing the reliability of SRAM‐based FPGA designs, and performed a brief literature analysis of similar approaches.

In the next chapter, we will discuss our proposed solution for this scenario.

3 Proposed Controller Architecture

The main problems in a fault‐tolerant system are, first, to detect an error during system operation; then, to locate the error as fast as possible; next, to recover the system to a normal condition; and last, to bring the system back to the correct state. Error detection and localization can be done by means of online checkers such as the one presented in [27], where the author presents an on‐line testing technique for TMR. Another approach is to combine 2‐rail logic and self‐checking to obtain a concurrent error detection technique like the one presented in [24].

In this thesis, we focus only on fault recovery by means of the PDR capability. Our proposed solution is based on the design methodology presented in [8]. As shown in Figure 2, each FPGA (FPGAi) in our architecture hosts a reconfiguration controller. The main responsibilities of these controllers are as follows:

1‐ The controller has to monitor the error signals of the PR regions, the static region, and the reconfiguration controller of the next FPGA (FPGAi+1) in the proposed mesh topology.

2‐ In case of any error in FPGAi+1, the controller should take appropriate action to recover FPGAi+1 to a correct condition by means of reconfiguration.

3‐ The controller itself should be hardened so that, if a fault occurs in the controller, it detects, locates and masks the fault and informs the reconfiguration controller in FPGAi‐1 so that it can perform the recovery.
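The neighbor relationship in the list above (FPGAi monitors FPGAi+1 and is itself recovered by FPGAi‐1) can be made concrete with a short Python sketch. The wrap‐around at the ends of the chain is an assumption made for this example so that every FPGA has exactly one monitor; the function name neighbors is hypothetical.

```python
from typing import Tuple


def neighbors(i: int, n: int) -> Tuple[int, int]:
    """For FPGA i in a chain of n FPGAs, return (monitored, monitor):
    the FPGA that i watches and the FPGA that watches i.
    Indices wrap around (an assumption of this sketch)."""
    if n < 2:
        raise ValueError("at least two FPGAs are needed for mutual monitoring")
    return (i + 1) % n, (i - 1) % n
```

Because the mapping only depends on i and n, adding an FPGA changes no controller logic, which matches the claim that the distributed solution is independent of the number of FPGAs.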

By considering these responsibilities, the controller can be organized into four main parts: Fault

Classifier, Partial Reconfiguration (PR) Engine, Full Reconfiguration Engine, and Bitstream Module. The

main block diagram of the controller is illustrated in Figure 4.

Figure 4 Reconfiguration Controller block diagram

The fault classifier has to monitor the error signals, which are encoded with the two‐rail coding (TRC) technique [28], [29]. If an error is detected in a PR region, the Fault Classifier initiates the PR Engine with the address of the relevant partial bitstream. Then, it monitors the error signals again to see whether the error has been corrected or not. If the error is detected inside the static region or the PR controller of FPGAi+1, the Fault Classifier will initiate the Full Reconfiguration Engine. The Full Reconfiguration Engine may also be initiated if the PR Engine could not fix an error in a PR region after a specific number of tries.

The original bitstreams in our design are stored in a rad‐hard external memory. The Bitstream Module is responsible for providing the necessary protocol for communication with this memory at the maximum possible speed. To achieve the maximum speed for reconfiguration, partial reconfiguration is done via the Internal Configuration Access Port (ICAP) at 3.2 Gbps. However, the ICAP cannot be used for full reconfiguration; for this reason, full reconfiguration is done via master‐serial configuration mode at 10 Mbps.

The controller in this thesis is implemented and tested on two FPGA platforms (Figure 5); however, it can be extended to any number of FPGAs, since the implemented solution is independent of the number of FPGAs.

In the implemented solution, the configuration controller resides in one FPGA (we call this FPGA the master) and the system that should be hardened by means of PDR resides in another FPGA (we call this FPGA the slave).

In this section, we describe the implemented controller (Figure 6) in detail. We start by explaining the components on the master side and then the components on the slave side.

Figure 5 Slave FPGA (left) and master FPGA (right)

Figure 6 Configuration Controller Block Diagram

3.1.1 Implemented design in the Master side

The configuration controller has to be able to reconfigure the neighbor FPGA (slave), fully or partially. Moreover, it should decide whether to perform full configuration or partial reconfiguration based on a set of rules. The block diagram of the Master side is shown in Figure 7.

The fault‐classifier module receives the error16 signals from the slave FPGA and sends a request to the PR Controller. Then, the PR Controller initializes the ICAP interface and sends the selected bitstream to the slave FPGA. If the error is in the static region, or the number of errors in the PR regions exceeds a specific amount (three in our case), the fault‐classifier classifies these errors as “non‐recoverable by PDR” and sends a request to the full configuration controller to download the full bitstream to the slave FPGA. The partial bitstream files are stored in the on‐chip memory17, and the full bitstream is stored on the Platform Flash. We will explain later why we route the Platform Flash to the FPGA via an onboard CPLD.
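The decision rule in the paragraph above can be summarized in a small Python sketch. It is a simplified model, not the thesis hardware: each PR region reports a two‐rail pair in which differing rails (01 or 10) mean "no error" and equal rails (00 or 11) mean "error", the static‐region error is passed in as a hypothetical boolean static_error, and the names classify and two_rail_error are invented for the example. The threshold of three errors is the one quoted in the text.

```python
from typing import Optional, Sequence, Tuple

MAX_PR_ERRORS = 3  # threshold quoted in the text ("three in our case")


def two_rail_error(pair: Tuple[int, int]) -> bool:
    """A two-rail pair is valid when the rails differ; 00 and 11 flag an error."""
    return pair[0] == pair[1]


def classify(err_pairs: Sequence[Tuple[int, int]],
             static_error: bool,
             errors_so_far: int) -> Tuple[str, Optional[int], int]:
    """Return (action, region, new_error_count).

    action is 'none', 'partial' or 'full'; region is the 1-based PR region
    to reconfigure for 'partial'. The count restarts after a full configuration.
    """
    if static_error:
        return "full", None, 0
    for region, pair in enumerate(err_pairs, start=1):
        if two_rail_error(pair):
            count = errors_so_far + 1
            if count > MAX_PR_ERRORS:
                return "full", None, 0  # classified as non-recoverable by PDR
            return "partial", region, count
    return "none", None, errors_so_far
```

For example, classify([(0, 1), (1, 1), (0, 1)], False, 0) selects partial reconfiguration of region 2, while the same error arriving after three earlier errors escalates to a full configuration.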

Figure 7 Block Diagram of the Master Side and the Top module signals

16 These errors can be in PR regions or static region. The detection of these errors is the responsibility of the user and can be done by means of FT techniques. In this thesis, we assume that there is an error detection mechanism (such as 2‐rail logic combined with self‐checking) on the slave side.

17 These bitstreams will be moved to an off‐chip memory later.

The interface of the PR controller is shown in Figure 8. Table 2 describes the top module interface.

Figure 8 PR controller interface

Table 2 Top module (Master side) interface pins

Pin Name | Type | Description
err_1(1:0) | Input | Two‐rail error signal from PR module one
err_2(1:0) | Input | Two‐rail error signal from PR module two
err_3(1:0) | Input | Two‐rail error signal from PR module three
CLK | Input | Main 100 MHz clock
RST | Input | Main reset
RAM_DIN | Input | Serial configuration data input, synchronous to rising RAM_CCLK edge. This Pin is connected to the D0 of the Platform Flash.
SLAVE_CCLK | Input | Configuration clock source for all configuration modes except JTAG. This signal is connected to the CCLK of the slave FPGA.
SLAVE_DONE | Input | Active High signal indicating full configuration is complete: 0 = Slave FPGA not configured, 1 = Slave FPGA configured
SLAVE_INIT_B | Input | Before the Mode pins are sampled, INIT_B is an input that can be held Low to delay the full configuration of the slave FPGA. After the Mode pins are sampled, INIT_B is an open‐drain active Low output indicating whether a CRC error occurred during full or partial reconfiguration: 0 = CRC error, 1 = No CRC error
ICAP_INPUT_N(15:0) | Output Differential | ICAP read data bus. The bus width depends on ICAP_WIDTH parameter. The bit ordering is identical to the SelectMAP interface.
ICAP_INPUT_P(15:0) | Output Differential | ICAP read data bus. The bus width depends on ICAP_WIDTH parameter. The bit ordering is identical to the SelectMAP interface.
number_of_err(2:0) | Output | These signals which are connected to the LEDs indicate number of errors which have happened since the most recent full‐configuration.
ICAP_CE | Output | Active‐Low ICAP interface select. Equivalent to CS_B in the SelectMAP interface.
ICAP_CLK | Output | ICAP interface clock. The data are sampled on the rising edge of this clock.
ICAP_WRITE | Output | ICAP data flow direction. 0=WRITE, 1=READ. Equivalent to the RDWR_B signal in the SelectMAP interface.
RAM_CCLK | Output | Synchronous clock for Platform Flash. Data is put on RAM_DIN on the rising edge of this clock.
RAM_CE_B | Output | Chip Enable Output. When CE is High, the Platform Flash is put into low‐power standby mode, the address counter is reset, and the DATA pins are put in a high‐impedance state.
RAM_INIT_B | Output | Corresponds to OE/RESET_B of Platform Flash. When Low, this pin holds the address counter reset and the DATA output is in a high‐impedance state
Slave_D0 | Output | Configuration DATA input pin for the slave FPGA
SLAVE_PROG_B | Output | Active‐Low asynchronous full‐chip reset. This pin is connected to the PROGRAM_B of the slave FPGA.

As it can be seen in Figure 9, we have five main components inside TOP. In the following, we discuss each component in detail.

Figure 9 Modules inside the TOP (Master Side)

3.1.1.1 Fault‐Classifier Module

The purpose of the Fault‐Classifier is to analyze the input error signals and decide what kind of configuration is needed to restore the faulty module in the slave side to bring it back to its initial state.

In this design, the master FPGA detects which part of the target FPGA is faulty (by means of error signals). These error signals indicate whether the fault is in the reconfigurable modules or in the static region of the target FPGA. Then, the fault‐classifier on the master side decides whether to perform partial reconfiguration or full configuration. The fault‐classifier can also reside on the slave side, eliminating the need for error signals. In this case, the fault‐classifier only needs to signal to the master side which bitstream has to be sent to the target FPGA. However, in this case, the fault‐classifier becomes a point of failure itself, and it should be monitored by the master periodically.

Figure 10 illustrates the fault‐classifier interface and Table 3 describes the fault‐classifier’s pins.

Figure 10 Fault‐Classifier interface

Table 3 Fault‐Classifier interface pins

Pin Name | Type | Description
CLK | Input | Main clock for fault classifier. Could be up to 100 MHz. We connect this to the 6.25 MHz DCM Clock
RST | Input | Main reset. This pin is connected to the main system reset push button
START_WRITE | Input | This signal is connected to a debounced push button and used to make a pause at the beginning of the state machine. It will be removed in the final design.
SLAVE_DONE | Input | Active High signal indicating full configuration is complete: 0 = Slave FPGA not configured, 1 = Slave FPGA configured
PR_DONE | Input | Indicates that the partial reconfiguration has been finished
Err_1(1:0) | Input | Two‐rail error signal from PR module one
Err_2(1:0) | Input | Two‐rail error signal from PR module two
Err_3(1:0) | Input | Two‐rail error signal from PR module three
Start_PR_out | Output | A request to the PR Controller to start partial reconfiguration.
Start_full | Output | A request to the Full configuration Controller to start full reconfiguration.
Number_of_err(2:0) | Output | These signals, which are connected to the LEDs, indicate the number of errors which have happened since the most recent full configuration.
PR_Select_out(1:0) | Output | Two‐bit signal that indicates which portion of the slave is faulty and which bitstream should be sent to the slave side. It is sampled on the rising edge of the start_PR_out.
Current_state(4:0) | Output | This output buffer is needed for one‐hot state encoding. For more details please refer to [30]
Err_in_classifier(1:0) | Output | This two‐rail signal indicates the presence of error in the classifier state machine

The error signals inform the fault‐classifier that there is an error in the PR or static regions of the slave FPGA. Then, by considering the type of error, the fault‐classifier performs a proper action. The FSM diagram is shown in Figure 11. For instance, if the error is in the IRA_1, we will get the stream of “11” or “00” on the err_1 signals and the fault‐classifier will go to the Init_PR state, which sends the PR request to the PR controller.

Figure 11 Fault Classifier finite state machine diagram

3.1.1.2 PR Controller

Partial Reconfiguration can be done via JTAG, SelectMAP, Master‐Serial, or ICAP. In the proposed design, partial reconfiguration is done via the Internal Configuration Access Port (ICAP). ICAP is a comprehensive solution for PR design because of its capability of doing readback and verifying the design after reconfiguration. There are some status registers in ICAP which indicate an error if partial reconfiguration of a block has not succeeded. Furthermore, we can add a CRC checker at the slave side. The CRC checker in the PR controller checks the CRC for the received file before forwarding it to the ICAP. By using these two techniques (monitoring the ICAP registers and CRC checking), we can be sure that the target FPGA is partially reconfigured correctly. In the current design, none of the above‐mentioned methods has been implemented yet. For now, we only focus on getting the overall system to work correctly. These features can be considered as future work. Figure 12 illustrates the PR Controller interface and Table 4 describes the PR Controller pins.

Figure 12 PR controller interface

Table 4 PR Controller pin description

Pin Name | Type | Description
CLK | Input | Main clock for PR Controller. It could be up to 100 MHz. We connect this to the 6.25 MHz DCM Clock.
START_WRITE | Input | This signal is connected to Start_PR_out of the fault classifier module
PR_Select(1:0) | Input | Two‐bit signal that indicates which portion of the slave is faulty and which bitstream should be sent to the slave side. It is sampled on the rising edge of the start_PR_out of the fault classifier module
PR_DONE | Output | Indicates that the partial reconfiguration has been finished
ICAP_CE | Output | Active‐Low ICAP interface select. Equivalent to CS_B in the SelectMAP interface.
ICAP_write | Output | ICAP data flow direction. 0=WRITE, 1=READ. Equivalent to the RDWR_B signal in the SelectMAP interface.
ICAP_CLK | Output | ICAP interface clock. The data are sampled on the rising edge of this clock.
Current_state(4:0) | Output | This is the output buffer, which is needed for one‐hot state encoding. For more details please refer to [30]
Err_in_PR(1:0) | Output | This two‐rail signal indicates the presence of error in the PR controller state machine
ICAP_INPUT_P(15:0) | Output | ICAP read data bus. The bus width depends on ICAP_WIDTH parameter. The bit ordering is identical to the SelectMAP interface.
ICAP_INPUT_N(15:0) | Output | ICAP read data bus. The bus width depends on ICAP_WIDTH parameter. The bit ordering is identical to the SelectMAP interface.

The FSM diagram of the PR controller is shown in Figure 13. The controller enters the first state after a rising edge on start_PR is received. To enable the ICAP, we first assert ICAP_write and then ICAP_CE after one clock cycle. In the third state, ICAP_CLK is enabled and the process of sending the bitstream to the slave FPGA is started. At the rising edge of each clock, the RAM address counter is increased and one 16‐bit word is sent to the slave FPGA. After one partial bitstream (12174 words in our case) has been sent completely, the FSM enters its final state and deactivates the ICAP primitive. Moreover, PR_DONE is asserted in the final state to inform the fault‐classifier about the completion of the partial reconfiguration. The ICAP clock can work correctly at frequencies up to 100 MHz. The controller has been tested and verified at different clock speeds from 6.25 MHz to 100 MHz.
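The PR controller sequence (assert ICAP_write, assert ICAP_CE one cycle later, stream one 16‐bit word per clock, then deactivate the ICAP and raise PR_DONE) can be mimicked with a cycle‐level Python model. This is a hedged sketch rather than the thesis HDL: the state names are invented, the active‐low polarity of ICAP_CE and the 0=WRITE convention of ICAP_write are taken from the pin descriptions, and the cycle count is a property of this model only.

```python
from typing import Dict, List, Tuple


def run_pr_controller(words: List[int]) -> Tuple[List[int], int, Dict[str, int]]:
    """Simulate one partial reconfiguration after a start request.

    Returns (words written to the ICAP, clock cycles used, final signal levels).
    """
    signals = {"ICAP_write": 1, "ICAP_CE": 1, "PR_DONE": 0}  # idle levels
    state, address, cycles = "ASSERT_WRITE", 0, 0
    sent: List[int] = []
    while not signals["PR_DONE"]:
        cycles += 1
        if state == "ASSERT_WRITE":
            signals["ICAP_write"] = 0          # 0 = WRITE
            state = "ASSERT_CE"
        elif state == "ASSERT_CE":
            signals["ICAP_CE"] = 0             # active low, one cycle later
            state = "SEND"
        elif state == "SEND":
            if address < len(words):
                sent.append(words[address] & 0xFFFF)  # one 16-bit word per clock
                address += 1                           # RAM address counter
            if address == len(words):
                state = "DONE"
        else:  # DONE: deactivate the ICAP and report completion
            signals["ICAP_CE"] = 1
            signals["ICAP_write"] = 1
            signals["PR_DONE"] = 1
    return sent, cycles, signals
```

In this model a 12174‐word bitstream takes 12177 cycles, which would correspond to roughly 122 µs at 100 MHz or about 1.95 ms at 6.25 MHz; these figures follow from the sketch and are not measurements.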

Figure 13 PR controller finite state machine diagram

3.1.1.3 Full Configuration Controller

3.1.1.3.1 Implemented circuit for full‐configuration controller

In this design, a master‐serial configuration18 interface is implemented. The slave FPGA19 is responsible for the configuration clock. A dedicated configuration pin of the slave FPGA is routed via the CPLD on the slave side, the master FPGA and, eventually, the CPLD on the master side to the Platform Flash on the master side. We used the CPLDs to access the hardwired dedicated configuration pins for full configuration. The master FPGA can initiate a full configuration by lowering the dedicated PROGRAM_B pin of the slave FPGA. After PROGRAM_B is released, the slave FPGA enters its configuration mode and starts clocking the Platform Flash and receiving the bitstream data one bit per clock. After configuration is finished, the slave FPGA signals DONE to the full configuration controller to indicate completion of the configuration procedure. A block diagram of the full configuration system is shown in Figure 14.

Figure 14 Complete block diagram

18 Information about Virtex 5 configuration modes is available in Appendix D.

19 Here the Slave FPGA means the FPGA in the slave side.

3.1.1.3.2 The Full‐Configuration‐Controller architecture

Figure 15 shows the full‐configuration‐controller interface and Table 5 shows the pin description.

Figure 15 Full Configuration Controller Interface

Table 5 Full configuration controller. Pin description

Pin Name | Type | Description
CLK | Input | Main clock for full config controller. It could be up to 100 MHz, We connect this to the 6.25 MHz DCM Clock. This clock is for internal state machine and different from configuration clock (CCLK)
RST | Input | Main reset which connected to the main reset push button
START_config | Input | This signal is connected to Start_full of the fault classifier module. The rising edge on this signal indicates the start of full configuration.
RAM_DIN | Input | Serial configuration data input, synchronous to rising RAM_CCLK edge. This Pin is connected to the D0 of the Platform Flash.
SLAVE_CCLK | Input | Configuration clock source for all configuration modes except JTAG. This signal is connected to the CCLK of the slave FPGA.
SLAVE_DONE | Input | Active High signal indicating full configuration is complete: 0 = Slave FPGA not configured, 1 = Slave FPGA configured
SLAVE_INIT_B | Input | Before the Mode pins are sampled, INIT_B is an input that can be held Low to delay the full configuration of the slave FPGA. After the Mode pins are sampled, INIT_B is an open‐drain active Low output indicating whether a CRC error occurred during full or partial reconfiguration: 0 = CRC error, 1 = No CRC error
RAM_CE_B | Output | Chip Enable Output. When CE is High, the Platform Flash is put into low‐power standby mode, the address counter is reset, and the DATA pins are put in a high‐impedance state.
RAM_INIT_B | Output | Corresponds to OE/RESET_B of Platform Flash. When Low, this pin holds the address counter reset and the DATA output is in a high‐impedance state
RAM_CCLK | Output | Synchronous clock for Platform Flash. Data is put on RAM_DIN on the rising edge of this clock.
SLAVE_D0 | Output | Configuration DATA input pin for the slave FPGA
SLAVE_PROG_B | Output | Active‐Low asynchronous full‐chip reset. This pin is connected to the PROGRAM_B of the slave FPGA.
Current_state(4:0) | Output | This is the output buffer, which is needed for one‐hot state encoding. For more details please refer to [30]
Err_in_full(1:0) | Output | This two‐rail signal indicates the presence of error in the full‐config‐controller state machine
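The full‐configuration handshake described in section 3.1.1.3.1 (pulse PROGRAM_B low, release it, let the slave clock the data in one bit per clock, then wait for DONE) can be sketched in Python with a toy model of the slave. Everything here is illustrative: SlaveModel and full_configure are hypothetical names, timing is ignored, and the model only captures the ordering of events, not real Virtex‐5 behavior.

```python
from typing import List


class SlaveModel:
    """Toy stand-in for the slave FPGA in master-serial mode (hypothetical)."""

    def __init__(self) -> None:
        self.bits: List[int] = []
        self.done = False
        self.program_b = 1  # active low, idle high

    def set_program_b(self, level: int) -> None:
        if level == 0:
            # Asserting PROGRAM_B clears the configuration and drops DONE.
            self.bits, self.done = [], False
        self.program_b = level

    def clock_in(self, flash_bits: List[int]) -> None:
        if self.program_b == 0:
            return  # still held in reset, nothing is loaded
        for bit in flash_bits:  # one bit per configuration clock
            self.bits.append(bit)
        self.done = True


def full_configure(slave: SlaveModel, flash_bits: List[int]) -> bool:
    """Run the sequence: assert PROGRAM_B, release it, load, report DONE."""
    slave.set_program_b(0)
    slave.set_program_b(1)
    slave.clock_in(flash_bits)
    return slave.done
```

The controller in this sketch only drives PROGRAM_B and observes DONE; the data path from the flash to the slave is not under its control, which mirrors the division of roles described in the text.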

The FSM diagram of the Full Configuration Controller is shown in Figure 16.

Figure 16 Full Configuration Controller finite state machine

In this controller, we control the PROGRAM_B pin of the slave FPGA. The other configuration pins are connected to the interface of the Platform Flash on the slave side20 directly. When the Start_full command is received from the fault‐classifier, the full configuration controller lowers PROGRAM_B for at least 250 ns21. Then it releases PROGRAM_B and monitors the SLAVE_DONE signal; when it is received, the controller finishes its job by entering the done state.

20 indirectly via two CPLDs and one FPGA

21 This is the minimum required time for PROGRAM_B to remain asserted

3.1.1.4 Digital Clock Manager (DCM)

The Digital Clock Manager (DCM) is a primitive in Xilinx FPGAs and can be used to implement a delay‐locked loop, digital frequency synthesizer, digital phase shifter, or digital spread spectrum. In this design, the DCM was used to reduce the 100 MHz clock frequency. In fact, the implemented design does not need any DCM, because it can work with the 100 MHz clock directly; however, if slave serial configuration is used instead of master serial configuration, the maximum clock speed should be less than 20 MHz22, and in this case a DCM is needed. In practice, the maximum clock speed should not exceed 16 MHz23 when using slave serial mode.

22 This is the maximum clock frequency of Platform Flash

23 This is the maximum clock frequency, which we have reached for XCF32P.

3.1.1.5 Debounce module

A simple debounce module is implemented to debounce the input push‐buttons for the reset and start‐write signals. The debounce module for start‐write is not shown in Figure 9.

3.1.1.6 BitStream Module

Up to now, the partial bit‐stream files are stored in on‐chip BRAMs. These files should be protected against any SEU; therefore, in the final product, these files should be moved to a radiation‐hardened memory. Since we do not have access to any rad‐hard memory in this project, we have used a simple I2C

memory to test our design. The new block diagram of the whole system with an external memory for

storing partial bit files are shown in Figure 17.

Figure 17 the implemented design with an external memory for storing partial bit‐stream files

In this design, the responsibility of the Bit Stream Module is to refresh the content of the BRAM every n minutes. This refresh interval can be changed based on the application and the environment in which the system is deployed. Another solution for the Bit Stream Module is to send the data from the external memory to the PR controller directly. This solution is suitable when the bitstream files are too large to store all of them in on‐chip memory at the same time. However, in this case, the interface speed of the external memory would limit the partial reconfiguration speed, and we could no longer benefit from the 400 MB/s24 configuration speed. Since the bit files are small enough in our project, we kept the main idea of using BRAM and added a bit stream module to refresh the content of the BRAM periodically.
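The speed argument above can be made concrete with a rough estimate. The following is only a sketch under stated assumptions, not a measurement: it models the data path as two stages and lets the slower one bound the throughput. The 24348-byte size is the partial bitstream size used in the tests of this thesis and 400 MB/s is the ICAP rate quoted above; the 400 kbit/s serial rate is an assumed figure for a simple I2C link with protocol overhead ignored, and the function name is hypothetical.

```python
def transfer_time_s(size_bytes, source_rate, icap_rate=400e6):
    """Estimate the time to stream a bitstream from a source into the ICAP.

    Rates are in bytes per second. The slower stage bounds the throughput.
    """
    if size_bytes < 0 or source_rate <= 0 or icap_rate <= 0:
        raise ValueError("size must be non-negative and rates must be positive")
    return size_bytes / min(source_rate, icap_rate)


BITSTREAM_BYTES = 24348     # partial bitstream size used in the tests
BRAM_RATE = 400e6           # ASSUMPTION: on-chip BRAM keeps up with the ICAP
I2C_RATE = 400_000 / 8      # ASSUMPTION: 400 kbit/s serial link, no overhead

t_bram = transfer_time_s(BITSTREAM_BYTES, BRAM_RATE)
t_i2c = transfer_time_s(BITSTREAM_BYTES, I2C_RATE)
print(f"from BRAM: {t_bram * 1e6:.1f} us")
print(f"from I2C:  {t_i2c * 1e3:.1f} ms")
```

Under these assumptions the direct path from the serial memory is slower by several orders of magnitude, which is consistent with the decision to keep the bitstreams in BRAM and use the external memory only for periodic refresh.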

3.1.2 Implemented design in the slave side

Figure 18 shows the block diagram of the design in the slave board. As previously mentioned, we

assume that the required system is implemented in the slave side. This part consists of partial

reconfiguration regions and a static part.

24 This is the maximum reachable ICAP speed


[Block diagram labels: Virtex‐5 Evaluation board (Slave Side); CPLD‐2 (routing full configuration interface to the dedicated FPGA‐2 configuration pins); FPGA‐2 (Target); Partial Reconfiguration interface; Full Configuration; Error Signals; Static Parts; ICAP; three Mux blocks; RM1, RM2, RM3]

Figure 18 Implemented design ‐ slave side

3.1.2.1 Static Region

The static region contains the parts that cannot or should not be reconfigured. These items could be

ICAP_VIRTEX5, I/O buffers or DCMs.

3.1.2.1.1 ICAP_VIRTEX5 [31]

The ICAP_VIRTEX5 primitive works the same way as the SelectMAP configuration interface, except that it is on the fabric side and has a separate read/write bus, as opposed to the bidirectional bus in SelectMAP. The general SelectMAP timing diagrams and the SelectMAP bitstream ordering information described in the “SelectMAP Configuration Interface” section of the user guide [31] are also applicable to ICAP. It allows the user to access configuration registers, read back configuration data, or partially reconfigure the FPGA after configuration is done. ICAP has three data width selections through the ICAP_WIDTH parameter: x8, x16, and x32. The two ICAP ports cannot be operated simultaneously. The design must start from the top ICAP, and then switch back and forth between the two.

Pin Name  Type    Description
CLK       Input   ICAP interface clock.
CE        Input   Active‐Low ICAP interface select. Equivalent to CS_B in the SelectMAP interface.
WRITE     Input   0 = WRITE, 1 = READ. Equivalent to the RDWR_B signal in the SelectMAP interface.
I[31:0]   Input   ICAP write data bus. The bus width depends on the ICAP_WIDTH parameter. The bit ordering is identical to the SelectMAP interface. See ICAP Data Ordering in [31].
O[31:0]   Output  Unregistered ICAP read data bus. The bus width depends on the ICAP_WIDTH parameter. The bit ordering is identical to the SelectMAP interface.
BUSY      Output  Active‐High busy status. Only used in read operations. BUSY remains Low during writes.

3.1.2.1.2 ICAP Data Ordering [31]

In many cases, ICAP configuration is driven by a user application residing on a microprocessor, CPLD, or in some cases another FPGA. In these applications, it is important to understand how the data ordering in the configuration data file corresponds to the data ordering expected by the FPGA. In ICAP x8 mode, configuration data is loaded at one byte per CCLK, with the MSB of each byte presented to the D0 pin. This convention (D0 = MSB, D7 = LSB) differs from many other devices. This convention can be a source of confusion when designing custom configuration solutions. Table 6 shows how to load the hexadecimal value 0xABCD into the ICAP data bus.

Table 6 Bit Ordering for ICAP 8‐Bit Mode

CCLK Cycle  HEX Equivalent  D0  D1  D2  D3  D4  D5  D6  D7(25)
1           0xAB            1   0   1   0   1   0   1   1
2           0xCD            1   1   0   0   1   1   0   1

Some applications can accommodate the non‐conventional data ordering without difficulty. For other applications, it can be more convenient for the source configuration‐data file to be bit‐swapped, meaning that the bits in each byte of the data stream are reversed. For these applications, the Xilinx PROM file generation software can generate bit‐swapped PROM. Table 7 shows the bit ordering for x8, x16, and x32 modes.

Table 7 Bit Ordering

Pin  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
x32  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
x16   -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
x8    -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  0  1  2  3  4  5  6  7

25 D[0:7] represent the ICAP DATA pins.

3.1.2.1.3 IBUFDS: differential input buffer primitive

In order to be able to use the ICAP maximum speed, we need to use differential pairs to connect the ICAP data bus between two evaluation boards. To use these pairs, we need to utilize differential IO buffers. “The usage and rules corresponding to the differential primitives are similar to the single ended SelectIO primitives. Differential SelectIO primitives have two pins to and from the device pads to show the P and N channel pins in a differential pair. N channel pins have a “B” suffix.” [32] Figure 19 shows the differential input buffer primitive.

Figure 19 Differential Input Buffer Primitive (IBUFDS)
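The data ordering of Tables 6 and 7 can be sketched in a few lines. This is a minimal illustration in Python, not part of the implemented design; the function names are hypothetical. It reproduces the pin levels of Table 6, the bit-swapping of a data stream (bits reversed within each byte, byte order kept), and the rows of Table 7.

```python
def pin_levels(byte):
    """Levels on D0..D7 for one byte when D0 carries the MSB."""
    return [(byte >> (7 - i)) & 1 for i in range(8)]


def bit_swap_byte(value):
    """Reverse the bit order within one byte (bit 7 <-> bit 0, and so on)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a byte value")
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_swap(data):
    """Bit-swap every byte of a configuration data stream."""
    return bytes(bit_swap_byte(b) for b in data)


def bit_order_row(width):
    """Bit index carried by each pin, listed from the highest pin down to pin 0."""
    return [8 * (pin // 8) + 7 - pin % 8 for pin in range(width - 1, -1, -1)]


print(pin_levels(0xAB))                      # first row of Table 6
print(bit_swap(bytes([0xAB, 0xCD])).hex())   # bit-swapped form of 0xABCD
print(bit_order_row(16))                     # x16 row of Table 7
```

Applying `bit_swap` twice returns the original stream, so the same routine converts in both directions.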


3.1.2.2 Partial Reconfiguration Regions

A system implemented on an FPGA should be divided into Partial Reconfiguration Regions (PRRs). Partially reconfigurable modules (PRMs) are the parts of the design that can be placed in the PR regions. The user may have any number of PRMs for a PRR; however, only one PRM can operate in a PRR at a given time. The minimal size of a PRM is theoretically one CLB26; however, due to the structure of the configuration memory, the configuration of a CLB is contained in several frames, and each frame contains the configuration bits of 20 CLBs. Since the frame is the smallest part of the FPGA that can be configured, every reconfiguration changes at least 20 CLBs [12]. The size of the PRMs is important for optimal performance. The authors of [8] proposed a reliability‐aware solution for selecting an optimal area for PRMs.

It is necessary to insert specific interfaces at the borders of the PRRs. These interfaces are called proxy logic in the ISE design tools. The user can place proxy logic manually, or it can be placed automatically by the design tool. In recent ISE design tools, proxy logic is also supported by the timing analysis. Therefore, it is possible to analyze the critical path between the static region and the PR regions.

The design flow can be carried out in ISE and PlanAhead. First, ISE synthesizes the VHDL or Verilog code and generates the necessary netlist files. Next, these files are imported into PlanAhead. Last, after floor planning in PlanAhead, a partial bitstream file (*.bit) is generated for each PRM. These bitstream files have a header that contains the address of the PRM. To reconfigure a PRM, the relevant partial bitstream file should be forwarded to the configuration engine by means of one of the available interfaces27.

3.2 Summary

In this chapter, we described our proposed configuration controller in detail. The implemented configuration controller and its characteristics were presented. In the next chapter, we will discuss design hardening techniques for our proposed controller.

26 Configurable Logic Block
27 JTAG, SelectMAP, Master‐Serial or ICAP


4 Design Hardening

Up to now, all components of the system implemented in our multi‐FPGA platform are hardened by a combination of hardware‐redundancy techniques and partial reconfiguration capability for fault detection, masking and recovery. In our work, each component is triplicated and each replica is placed in one PR region. By comparing the output of each replica with the others, the voter can detect and mask an error in a PR region and inform the reconfiguration controller for recovery. However, three more parts of the system still need to be protected against SEUs.
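The detect, mask and report behaviour of the voter can be sketched as follows. This is a minimal software model in Python, assuming three replicas that produce integer output words; it is not the voter implemented on the FPGA, and the function names are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of the three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def find_faulty(a, b, c):
    """Indices (0, 1, 2) of the replicas that disagree with the voted value."""
    voted = majority_vote(a, b, c)
    return [i for i, out in enumerate((a, b, c)) if out != voted]


# Replica 2 (for example, the module in the third PR region) delivers a corrupted word.
outputs = (0b1011, 0b1011, 0b0011)
voted = majority_vote(*outputs)
faulty = find_faulty(*outputs)
print(f"voted output: {voted:04b}, replicas to reconfigure: {faulty}")
```

The voted value masks the single faulty replica, while the returned index is the kind of information a reconfiguration controller needs in order to select the PR region to recover. If more than one replica is faulty at the same time, the voted value is no longer trustworthy.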

In this chapter, we will discuss the strategy for hardening the Configuration Controller, the interfaces

and the bitstream. As previously discussed, the most robust mitigation strategy is to use redundancy

techniques coupled with partial reconfiguration property. In this design, three different approaches

have been utilized to increase the overall reliability.

4.1 State Machine Encoding

Because the circuits implemented for the PR_Controller, full_config_controller, fault_classifier and bit_stream_module are based on finite state machines (FSMs), the first step to increase the robustness of the design is to choose the FSM state encoding. Much work has been carried out on optimal state encoding [30], [33], and many tools and techniques are available to apply it. Common to most of them is the goal of minimizing the number of bits required for state encoding. A poor choice of encoding technique will result in a state machine that is very costly in terms of resource utilization, very slow, or both. Moreover, the encoding must be applied in the hardware description language to ensure the reliability of the protected FSM.
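The effect of the encoding choice on fault detection can be illustrated with a small model. The sketch below is not taken from the thesis code; it is a minimal Python illustration, with hypothetical helper names, that compares a minimum-width binary encoding with a one-hot encoding (one flip-flop per state, exactly one bit set) for an assumed 8-state machine. It counts how many single-bit upsets of a valid state code land on another valid code and would therefore pass a code validity check unnoticed.

```python
def binary_width(n_states):
    """Minimum number of state bits for a binary encoding."""
    return max(1, (n_states - 1).bit_length())


def is_one_hot(vector):
    """True if exactly one bit of the state vector is set."""
    return vector != 0 and (vector & (vector - 1)) == 0


def undetected_single_flips(codes, width):
    """Count single-bit flips of valid codes that yield another valid code."""
    valid = set(codes)
    return sum(1 for code in codes for bit in range(width)
               if (code ^ (1 << bit)) in valid)


N_STATES = 8                                        # ASSUMPTION: example size
binary_codes = list(range(N_STATES))                # 3-bit binary encoding
one_hot_codes = [1 << i for i in range(N_STATES)]   # 8-bit one-hot encoding

binary_missed = undetected_single_flips(binary_codes, binary_width(N_STATES))
one_hot_missed = undetected_single_flips(one_hot_codes, N_STATES)
print(f"binary:  {binary_width(N_STATES)} bits, {binary_missed} undetectable flips")
print(f"one-hot: {N_STATES} bits, {one_hot_missed} undetectable flips")
print(all(is_one_hot(code) for code in one_hot_codes))
```

In this model every single-bit upset of a one-hot code produces a vector with zero or two bits set, so a validity check always catches it, at the price of more state flip-flops.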

In this project, an optimized one‐hot state encoding has been embedded into the hardware description language of the state machines. In one‐hot state encoding, only one bit of the state vector is set to one for any given state and all other state bits remain zero. Thus, if there are n states, then n state flip‐flops are required. State decoding is simplified, since the state bits themselves can be used directly to indicate whether the machine is in a particular state. No additional logic is required [30]. We have used one‐hot state encoding because it has the following advantages:

It maps easily into the Xilinx register‐based FPGA architecture and it is easy to apply one‐hot state encoding to a state machine. Schematics can be captured and HDL code can be written directly from the state diagram without coding a state table. [30]

One‐hot state encoding is typically faster than other state encoding techniques. Moreover, speed is independent of the number of states, and instead depends only on the number of transitions into a particular state. [30]

It is very easy to modify the design without manipulating the rest of the machine.

It can be easily synthesized from VHDL or Verilog, and it is possible to find the critical path using static timing analysis. [30]

The Xilinx tools can apply one‐hot state encoding when synthesizing the circuit; however, it is not possible to use this feature, since in this case error detection is not possible. Therefore, we have applied the one‐hot state encoding directly in the VHDL code of the FSMs.

Error detection is quite easy in this scenario. It is only necessary to check the state bits concurrently: if more than one bit is asserted in a given state vector, there is an error.

4.2 Internal Signal Hardening

In addition to state machines, there are some internal signals between components that should be hardened. An undesired value on these signals may start a component unexpectedly. Examples of such signals are stat_PR and start_full_config, which are used by the fault_classifier to inform the partial reconfiguration controller and the full configuration controller, respectively, to start their functions. If there is an undesired upset on these signals, or if they are stuck at zero or one, the functionality of the whole design may be corrupted. To prevent this situation, we have used 2‐rail logic to encode them. In this simple technique, the signal is represented by two bits: “10” represents ‘1’ and “01” represents ‘0’; “00” and “11” are not valid values. An error signal is generated if “00” or “11” occurs. Moreover, the error detection is also very simple and could be implemented with an XOR gate.

4.3 Interface Hardening

The next step is to harden the connection signals between two evaluation boards (Figure 20) or two FPGAs in a multi‐FPGA platform.

Figure 20 the connection between two evaluation boards


These signals are susceptible to faults caused mainly by radiation or electromagnetic interference. Since the ICAP data bus is 16 or 32 bits wide and operates at a 100 MHz clock frequency, cross‐talk may also occur. We have used differential pairs to prevent this. The principle of differential pairs is much the same as that of 2‐rail logic: “10” represents ‘1’, “01” represents ‘0’, and “00” and “11” are not valid values.
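The 2-rail convention described above can be sketched as follows. This is a minimal Python model with hypothetical function names, not the VHDL used in the design; a pair is written as (first rail, second rail), so that (1, 0) stands for “10”.

```python
def encode_two_rail(bit):
    """Encode one logic value as a rail pair: 1 -> (1, 0), 0 -> (0, 1)."""
    return (1, 0) if bit else (0, 1)


def is_valid_two_rail(pair):
    """A pair is valid when the rails differ, i.e. their XOR is 1."""
    first, second = pair
    return (first ^ second) == 1


def decode_two_rail(pair):
    """Return the logic value of a valid pair, or raise on "00" / "11"."""
    if not is_valid_two_rail(pair):
        raise ValueError(f"invalid 2-rail code: {pair}")
    return pair[0]


for pair in [(1, 0), (0, 1), (0, 0), (1, 1)]:
    status = decode_two_rail(pair) if is_valid_two_rail(pair) else "error"
    print(pair, "->", status)
```

A single upset or a stuck-at fault on either rail of a valid pair turns it into “00” or “11”, which the XOR check flags as an error.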

4.4 Bitstream Memory Protection

In this study, the bitstreams necessary for reconfiguring the PR regions and the whole FPGA are generated with the help of the tool chain and stored in an external non‐volatile flash memory. These flash memories are susceptible to SEUs [34]. Since the reliability of the whole system depends on the correctness of these original bitstreams, we need to protect them against radiation. In this work, we envisioned a solution to protect the original bitstreams based on using a radiation‐hardened memory. However, this is not the only solution for protection. Another possible solution is to utilize error control codes [35], [36], [37].

In particular, the author of [35] has presented encoders and decoders of error control codes for semiconductor memory systems used in the space radiation environment. In that work, widely used error control codes, such as Hamming and Reed‐Solomon (RS) codes, are compared with new classes of byte error control codes suitable for semiconductor memory systems, called spotty byte error control codes. The author concluded that the spotty byte error control codes show better performance in terms of gate counts and maximum clock frequencies. With the help of this technique, we can benefit from regular non‐volatile memories without worrying about corruption of the original version of the bitstreams.
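To illustrate how an error control code protects stored data, the sketch below implements the classic Hamming(7,4) single-error-correcting code in Python. This is only an example of the general idea: it is not the spotty byte error control code of [35], and it is not part of the implemented design. The function names and the bit layout (parity bits at positions 1, 2 and 4) are choices made for this illustration.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (list of bits, positions 1..7)."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value")
    d = [(nibble >> i) & 1 for i in range(4)]        # d[0] is the LSB
    code = [0] * 8                                   # index 0 is unused
    code[3], code[5], code[6], code[7] = d           # data positions
    code[1] = code[3] ^ code[5] ^ code[7]            # parity over 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]            # parity over 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]            # parity over 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, error_position); error_position is 0 if no error was seen."""
    code = [0] + list(bits)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position                     # XOR of positions holding a 1
    if syndrome:
        code[syndrome] ^= 1                          # correct the flipped bit
    data = (code[3], code[5], code[6], code[7])
    nibble = sum(bit << i for i, bit in enumerate(data))
    return nibble, syndrome


word = hamming74_encode(0b1010)
word[4] ^= 1                                         # emulate an SEU at position 5
print(hamming74_decode(word))
```

A code of this kind corrects one flipped bit per codeword; two flips in the same codeword are miscorrected, which is one reason why stronger codes are considered for memories exposed to radiation.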

5 Test Results

The design has been tested on two identical XUPV5‐LX110T evaluation boards (Figure 6). As previously mentioned, the controller is able to reconfigure the faulty part of the slave FPGA based on the received error signals. Error detection in the slave FPGA is the responsibility of the user; however, regardless of the error detection method used on the slave side, we assume that the error signals follow the 2‐rail checking rules28. To verify the correct functionality of the configuration controller, three PR regions have been created on the slave side. Each PR region contains two PR modules. One of them generates a string of “10” and “01”, which represents that the corresponding PR region works correctly; the other module generates a string of “11” and “00”, which represents that there is an error in the corresponding PR region. Chapter 4 in [15] describes the design steps involved when using the PlanAhead software for Partial Reconfiguration designs. Figure 21 shows the location of these PR regions after floor planning in PlanAhead.

Figure 21 generated PR regions on the FPGA fabric

The PR modules are implemented in 8x8 CLBs. For each PR module, one test bitstream file has been generated. These bitstream files have a size of 24348 bytes each. One bitstream file for each PR region, which corresponds to the correct behavior, is stored in the on‐chip BRAMs for the partial reconfiguration process.

28 “00” and “11” on error signals indicate an error in the corresponding component


The other three bitstreams, which are not stored in the BRAM, are used to simulate faults in the PR regions. These partial bitstreams are downloaded to the FPGA via JTAG using the iMPACT tool. After downloading, the corresponding PR regions start sending error signals to the configuration controller implemented on the master side. The master side then responds to the error signals by reconfiguring the corresponding PR region with a correct PR module.

The above‐mentioned process has been repeated 100 times, for each PR region and with different numbers of CLBs. The configuration controller was able to correct all of the simulated faults by performing partial reconfiguration or full configuration. Table 8 shows the device utilization summary for the configuration controller on the master side. The resource utilization of the bitstream module is not included in this summary.

Table 8 device utilization summary for the configuration controller (excluding the bitstream module)

Slice Logic Utilization Used Available Utilization

Number of Slice Registers 164 69,120 1%

Number used as Flip Flops 164

Number of Slice LUTs 239 69,120 1%

Number used as logic 236 69,120 1%

Number using O6 output only 114


Number using O5 and O6 66

Number used as exclusive route‐thru 3

Number of route‐thrus 59


Number of occupied Slices 118 17,280 1%

Number of LUT Flip Flop pairs used 259

Number with an unused Flip Flop 95 259 36%

Number with an unused LUT 20 259 7%

Number of fully used LUT‐FF pairs 144 259 55%

Number of unique control sets 18

Number of slice register sites lost to control set restrictions 20 69,120 1%

Number of bonded IOBs 61 640 9%

Number of LOCed IOBs 61 61 100%

IOB Master Pads 16

IOB Slave Pads 16

Number of Block RAM/FIFO 18 148 12%


Number using Block RAM only 18

Number of 36k Block RAM used 16

Number of 18k Block RAM used 3

Total Memory used (KB) 630 5,328 11%

Number of BUFG/BUFGCTRLs 3 32 9%

Number used as BUFGs 3

Number of DCM_ADVs 1 12 8%

Average Fan‐out of Non‐Clock Nets 3.72

Our implemented generic controller shows better performance in terms of speed compared to other generic controllers and software‐based controllers. Table 9 compares our design with another generic reconfiguration controller proposed by Ali Ebrahim in [26] and a software‐based controller based on the Xilinx XPS_HWICAP engine presented in [7].

Table 9 configuration times for different partial bitstreams

Partial Bitstream  Configuration Time (us)
Size (KB)          XPS_HWICAP   BRAM HWICAP  ICAP controller  Our ICAP Controller  Our ICAP Controller
                   (x32) [7]    (x32) [7]    (x32) [26]       (x16)                (x32)
7.7                533          28.0         21.7             39.5                 19.7
23.2               1600         66.3         62.6             118.8                59.4
47.2               3300         121.7        124.1            241.7                120.9
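The times reported for our controller in Table 9 can be cross-checked against the ideal bus transfer time. The sketch below is a back-of-the-envelope calculation in Python, assuming that 1 KB is 1024 bytes, that the ICAP is clocked at 100 MHz, and that one bus word is written on every clock cycle; the function name is hypothetical.

```python
def ideal_config_time_us(size_kb, bus_width_bits, clock_hz=100e6):
    """Ideal time to write a bitstream through the ICAP, one word per cycle."""
    if bus_width_bits not in (8, 16, 32):
        raise ValueError("ICAP width must be 8, 16 or 32 bits")
    bytes_per_cycle = bus_width_bits // 8
    cycles = size_kb * 1024 / bytes_per_cycle
    return cycles / clock_hz * 1e6


# Reported values from Table 9: size in KB -> (x16 time, x32 time) in us.
reported = {7.7: (39.5, 19.7), 23.2: (118.8, 59.4), 47.2: (241.7, 120.9)}

for size_kb, (x16_us, x32_us) in reported.items():
    ideal16 = ideal_config_time_us(size_kb, 16)
    ideal32 = ideal_config_time_us(size_kb, 32)
    print(f"{size_kb:5.1f} KB  x16: {ideal16:6.1f} vs {x16_us:6.1f}"
          f"  x32: {ideal32:6.1f} vs {x32_us:6.1f}")
```

Under these assumptions the reported times agree with the ideal values to within 0.1 us, which suggests that the controller keeps the ICAP busy on nearly every clock cycle.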

In addition, our proposed configuration controller shows better results in terms of resource utilization compared to the design proposed in [26]. Ali Ebrahim [26] implemented his controller using 609 LUTs, whereas our design uses only 239 LUTs (Table 10), which is a significant reduction in resource utilization29.

Table 10 Resource utilization of ICAP controller

Resources     XPS_HWICAP  BRAM HWICAP  ICAP controller  Our ICAP Controller
              (x32) [7]   (x32) [7]    (x32) [26]       (x32)
LUTs (total)  3275        963          609              239

29 The external memory interfaces are excluded in both designs to calculate resource utilization.


6 Conclusion and Future Works

The research presented in this thesis has proposed a dependable reconfiguration controller with the aim of recovering the faulty portion of the FPGA in a multi‐FPGA platform. Our working scenario was harsh environments, such as those of mission‐critical and safety‐critical applications, where electronic devices are susceptible to SEUs caused by ionizing radiation. In this thesis, different types of fault‐tolerant techniques for FPGA‐based systems were discussed. It was concluded that the best fault‐tolerant technique for an FPGA‐based design is to use redundancy techniques for fault detection and fault containment, and then to use a recovery technique based on Xilinx partial reconfiguration to mitigate the fault. The main innovative contributions provided by this thesis are summarized as follows:

The controller is implemented purely in hardware. This generic implementation not only increases performance in terms of higher speed and lower resource utilization, but also allows the designer to apply any available FT technique to increase the reliability of the controller.

The configuration interfaces for full configuration and partial reconfiguration are completely separated from each other. Full configuration is done via a serial interface, whereas partial reconfiguration is done via the parallel ICAP interface. This separation increases the overall reliability because, if partial reconfiguration stops functioning for any reason, there is still a configuration path for recovering the FPGA, albeit at lower speed.

Directions for future work aimed at its improvement are summarized in the following:

1‐ A comprehensive testing solution: The testing method used in this thesis is based on pre‐built bitstream files, which simulate an SEU in the corresponding module. This method is not thorough enough. There are two other recognized testing methods, which should be considered as future work: radiation testing and fault‐injection campaigns. Although radiation testing remains one of the most widely recognized and complete methods for SEU analysis, radiation may permanently damage the device under test (DUT), and it increases the testing cost both for the development of the radiation setup and for the beam operating time. Moreover, there is no control over which location the beam hits; therefore, an SEU may occur in an undesired bit. Another solution is to inject faults during the programming phase to emulate SEUs in the FPGA; however, this method requires a huge amount of time to provide consistent results. A better solution could be to calculate the critical bits and then perform fault injection based on these bits. One example of such an approach is presented in [38].

2‐ Problem with synchronization of PRMs: The synchronization of a newly reconfigured module with other modules in the FPGA, or with other FPGAs in a multi‐FPGA platform, is another issue that has not been addressed in this thesis. This is the next step after fault recovery: the newly reconfigured module must start operating from a correct state. One solution to this problem is presented in [12].


7 Glossary

Some common terms used throughout this document are defined in this section. These definitions are taken from [39].

Device: A single integrated circuit.

Failure: An unrecoverable error.

Functional Error: A logic error in the user function.

Functional Interrupt: A disruption in device operation requiring system level intervention to regain

normal functionality. Typically causes the loss of user or system data.

Multiple‐Bit Upset (MBU): An SEU that results in more than one adjacent bits flipping due to an oblique

angle strike. MBU probability steadily increases as geometries shrink. Use of maximum MBU distance

observed is useful to determine block RAM interleaving required so that even MBUs can be corrected by

the ECC.

Single‐Bit Upset (SBU): Same as SEU.

Scrubbing: The process of correcting any configuration cell upsets through FPGA partial reconfiguration.

Scrubbing does not interrupt user design function.

Single‐Event Effect (SEE): The resulting electrical disturbances caused by the direct ionization of a silicon

lattice by an energetic charged subatomic particle.

Single‐Event Functional Interrupt (SEFI): An SEE that results in the interference of the normal operation

of a complex digital circuit. SEFI is typically used to indicate a failure in a support circuit, such as loss of

configuration capability, power on reset, JTAG functionality, a region of configuration memory, or the

entire configuration.

Single‐Event Transient (SET): A signal transition caused by a SEE. Often observed as a glitch.

Single‐Event Upset (SEU): A state change (or flip) of a single data bit storage or memory cell caused by

an SEE. An SEU can affect the configuration memory cell states, the block RAM contents, a CLB DFF, a

LUTRAM, or SRL16 memory cell (which are also configuration memory cells, directly accessible to the

user).

System: An integration of multiple devices and circuit boards or modular sub‐systems.

User Function: User‐specified operational functions defined by the data stored in device configuration

memory.


8 Works Cited

[1] M. Caffrey, "A Space Based Reconfigurable Radio," Military and Aerospace Applications of Programmable Logic

Devices (MAPLD), Laurel MD, USA, 2002.

[2] A. Dawood, S. Visser and J. Williams, "Reconfigurable FPGAS for real time image processing in space," in 14th

International Conference on Digital Signal Processing, DSP2002, Santorini, Greece, 2002.

[3] D. M. Hiemstra, G. Battiston and P. Gill, "Single Event Upset Characterization of the Virtex‐5 Field Programmable

Gate Array Using Proton Irradiation," in IEEE Radiation Effects Data Workshop (REDW), Denver, CO, 2010.

[4] M. Ceschia, M. Menichelli, A. Papi, J. Wyss and A. Paccagnella, "Ion beam testing of SRAM‐based FPGA's," in

Radiation and Its Effects on Components and Systems, 2001. 6th European Conference on, 2001.

[5] E. Fuller, P. Blain, M. Caffrey and C. Carmichael, "Radiation Test Results of the Virtex FPGA and ZBT SRAM for

Space Based Reconfigurable Computing," Xilinx Inc., Los Alamos National Laboratory, 1999.

[6] L. Sterpone, M. Aguirre, J. Tombs and H. Guzmán‐Miranda, "On the design of tunable fault tolerant circuits on

SRAM‐based FPGAs for safety critical applications," in Design automation and test in Europe, Torino, Sevilla,

2008.

[7] M. Liu, W. Kuehn, Z. Lu and A. Jantsch, "Run‐Time Partial Reconfiguration Speed Investigation and Architectural

Design Space Exploration," in FPL, Giessen, Germany, 2009.

[8] C. Bolchini, A. Miele and C. Sandionigi, "A Novel Design Methodology for Implementing Reliability‐Aware Systems

on SRAM‐Based FPGAs," IEEE TRANSACTIONS ON COMPUTERS, vol. 60, no. 12, pp. 1744 ‐ 1758, 2011.

[9] J. Hussein and G. Swift, "Mitigating Single‐Event Upsets," Xilinx, 2012.

[10] "FPGA vs. ASIC," Xilinx Inc., 2012. [Online]. Available: http://www.xilinx.com/fpga/asic.htm.

[11] I. Kuon and J. Rose, "Measuring the Gap between FPGAs and ASICs," in FPGA’06, Toronto, 2006.

[12] M. Straka, J. Kastil and Z. Kotasek, "Fault Tolerant Structure for SRAM‐based FPGA via Partial Dynamic

Reconfiguration," Digital System Design: Architecture, Methods and Tools, pp. 365‐372, 2010.

[13] F. Lima, C. Carmichael, J. J. Fabula and R. Padovani, "A fault injection analysis of Virtex FPGA TMR design

methodology," in Radiation and Its Effects on Components and Systems, 2001. 6th European Conference on,

2001.


[14] K. S. Morgan, D. L. McMurtrey, B. H. Pratt and M. J. Wirthlin, "A Comparison of TMR With Alternative Fault‐

Tolerant Design Techniques for FPGAs," Nuclear Science, IEEE Transactions on, vol. 54, no. 6, pp. 2065 ‐ 2072,

2007.

[15] "Partial Reconfiguration User Guide," Xilinx Inc., 2011.

[16] L. Ming, W. Kuehn, L. Zhonghai and A. Jantsch, "Run‐Time Partial Reconfiguration Speed Investigation and

Architectural Design Space Exploration," in International Conference on Field Programmable Logic and

Applications, FPL 2009, Prague, 2009.

[17] M. Berg, C. Poivey, D. Petrick, D. Espinosa, A. Lesea, K. LaBel, M. Friendlich, H. Kim and A. Phan, "Effectiveness of

Internal Versus External SEU Scrubbing Mitigation Strategies in a Xilinx FPGA: Design, Test, and Analysis," IEEE

Transactions on Nuclear Science, vol. 55, no. 4, pp. 2259 ‐ 2266, 2008.

[18] K. Chapman, "SEU Strategies for Virtex‐5 Devices (XAPP864)," Xilinx Inc., 2010.

[19] A. Sari and M. Psarakis, "Scrubbing‐based SEU Mitigation Approach for Systems‐on‐Programmable‐Chips," in

International Conference on Field‐Programmable Technology (FPT), New Delhi, 2011.

[20] M. Niknahad, O. Sander and J. Becker, "Fine grain fault tolerance ‐ A key to high reliability for FPGAs in space," in

IEEE Aerospace Conference, Big Sky, MT, 2012.

[21] K. Kyriakoulakos and D. Pnevmatikatos, "A novel SRAM‐based FPGA architecture for efficient TMR fault

tolerance support," in International Conference on Field Programmable Logic and Applications, FPL, Prague,

2009.

[22] C. Carmichael, "Triple Module Redundancy Design Techniques for Virtex FPGAs (XAPP197)," Xilinx Inc., 2006.

[23] "TMRTool," Xilinx Inc., [Online]. Available: http://www.xilinx.com/ise/optional_prod/tmrtool.htm. [Accessed

2013].

[24] F. de Lima Kastensmidt, "Designing fault‐tolerant techniques for SRAM‐based FPGAs," vol. 21, no. 6, pp. 552‐

562, 2004.

[25] C. Bolchini, L. Fossati, D. Codinachs, A. Miele and C. Sandionigi, "A reliable reconfiguration controller for fault‐

tolerant embedded systems on multi‐FPGA platform," in IEEE 25th International Symposium on Defect and Fault

Tolerance in VLSI Systems (DFT), Kyoto, 2010.

[26] A. Ebrahim, K. Benkrid, X. Iturbe and C. Hong, "A Novel High‐Performance Fault‐Tolerant ICAP Controller,"

Edinburgh.

[27] S.‐Y. Yu and E. J. McCluskey, "On‐line Testing and Recovery in TMR Systems for Real‐Time Applications," in ITC

INTERNATIONAL TEST CONFERENCE, Stanford University, Stanford, California, 2001.


[28] D. Nikolos, "Self‐Testing Embedded Two‐Rail Checkers," Journal of Electronic Testing: Theory and Applications ‐

Special issue on On‐line testing, vol. 12, no. 1 ‐ 2, pp. 69 ‐ 79, 1998.

[29] M. Omana, D. Rossi and C. Metra, "High Speed and Highly Testable Parallel Two‐Rail Code Checker," in Design,

Automation and Test in Europe Conference and Exhibition, 2003.

[30] S. Golson, "One‐hot state machine design for FPGAs," in 3rd PLD Design Conference, Santa Clara CA, 1993.

[31] "Virtex‐5 FPGA Configuration User Guide," Xilinx Inc., 2011.

[32] "Virtex‐5 FPGA User Guide (UG190)," Xilinx Inc., 2012.

[33] M. Cassel and F. Lima, "Evaluating one‐hot encoding finite state machines for SEU reliability in SRAM‐based

FPGAs," in 12th IEEE International On‐Line Testing Symposium, IOLTS, Lake Como, 2006.

[34] . D. Nguyen, . S. Guertin and . J. Patterson, "Radiation Tests on 2Gb NAND Flash Memories," in IEEE Radiation

Effects Data Workshop, Ponte Vedra, FL, 2006.

[35] H. Kaneko, "Error Control Coding for Semiconductor Memory Systems in the Space Radiation Environment," in

20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, DFT, 2005.

[36] G. Umanesan and E. Fujiwara, "A class of random multiple bits in a byte error correcting (Stb/EC) codes for

semiconductor memory systems," in Pacific Rim International Symposium on Dependable Computing, 2002.

Proceedings, 2002.

[37] G. Umanesan and E. Fujiwara, "A class of systematic t/B‐error correcting codes for semiconductor memory

systems," in Information Theory Workshop, IEEE Proceedings., Cairns, Qld., 2001.

[38] L. Sterpone, F. Margaglia, M. Koester, J. Hagemeyer and M. Porrmann, "Analysis of SEU Effects in Partially

Reconfigurable SoPCs," in Adaptive Hardware and Systems (AHS), 2011 NASA/ESA , San Diego, CA, 2011.

[39] B. Bridgford, C. Carmichael and W. Tseng, "Single‐Event Upset Mitigation Selection Guide," Xilinx, 2008.

[40] "Early Access Partial Reconfiguration User Guide," 2006.

[41] "LogiCORE™ IP Soft Error Mitigation Controller v2.1," Xilinx Inc., 2011.

[42] E. Dubrova, FAULT TOLERANT DESIGN:AN INTRODUCTION, Stockholm: Kluwer Academic Publishers, 2008, p.

147.

[43] "Virtex‐5 Family Overview," Xilinx, 2009.

[44] "hkinventory," Xilinx, 2012. [Online]. Available:

38

http://www.hkinventory.com/public/ECatalogResultProductDetails.asp?CompanyID=104266&ProductID=27031.

[45] S. Suhail Zain and C. Hu, "NSEU Mitigation in Avionics Applications," 2010.

[46] "Xilinx University Program XUPV5‐LX110T Development System," Xilinx Inc., 2012. [Online]. Available:

http://www.xilinx.com/univ/xupv5‐lx110t.htm.

[47] "Soft Error Mitigation Controller," Xilinx Inc., 2011.

[48] "Partial Reconfiguration User Guide ‐ UG702," 2011.


9 Appendices

9.1 Appendix A: Bitstream Scrubbing and Readback

Upsets in a Xilinx FPGA can be removed by scrubbing. Scrubbing means reading back the configuration bitstream stored in configuration memory, comparing it with the original one, and correcting any affected configuration bits. The configuration management system, which detects and corrects upsets in configuration memory by means of scrubbing, can be hosted in a radiation‐hardened FPGA, an ASIC, a microcontroller, or the FPGA itself. Internal scrubbing in a Virtex‐5 FPGA reads back the frames via ICAP and uses the frame Error Correction Code (ECC) to detect single‐ or double‐bit errors in configuration frame data.
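The readback, compare and correct cycle described above can be sketched in a few lines. This is a toy model under stated assumptions, not the controller implemented in this thesis: configuration memory is modelled as a list of frames, `golden` is a hypothetical reference copy held outside the device, and `scrub` is an illustrative name.

```python
from typing import List

Frame = List[int]  # one configuration frame, modelled as a list of words


def scrub(config_memory: List[Frame], golden: List[Frame]) -> List[int]:
    """Read back every frame, compare it with the golden copy and
    rewrite frames that differ. Returns the indices of corrected frames."""
    corrected = []
    for index, (frame, reference) in enumerate(zip(config_memory, golden)):
        if frame != reference:                       # readback and compare
            config_memory[index] = list(reference)   # rewrite the whole frame
            corrected.append(index)
    return corrected


# Hypothetical toy device: 4 frames of 3 words each.
golden = [[0x0, 0x0, 0x0] for _ in range(4)]
memory = [list(frame) for frame in golden]

memory[2][1] ^= 1 << 5          # inject a single bit flip (an emulated SEU)
fixed = scrub(memory, golden)   # one scrubbing pass over all frames
print("corrected frames:", fixed)
```

A second pass over the repaired memory finds nothing to correct, which is the steady state a periodic scrubber is expected to stay in.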

Configuration management can only detect and correct errors caused by SEUs. It cannot mitigate the

SEU’s effects. Therefore, configuration management is often combined with redundant FPGA mitigation

schemes to mask the SEU’s effects in the system.

Virtex‐5 FPGA configuration memory is arranged in frames that are tiled about the device. These frames are the smallest addressable segments of the Virtex‐5 configuration memory space, and all operations must therefore act upon whole configuration frames. [31]

A frame (Figure 22) is the smallest part of the FPGA that can be reconfigured. In the Virtex‐5 a frame is 41 words of 32 bits, i.e. 1312 bits. [40] Virtex‐5 (LX110T) frame counts and configuration sizes are shown in Table 11.

Table 11 Virtex‐5 Device Frame Count, Frame Length, Overhead, and Bitstream Size [31]

Device | Non‐Configuration Frames | Configuration Frames | Total Device Frames | Frame Length in Words | Configuration Array Size in Words | Bitstream Overhead in Words (1)
LX110T | 592 | 23,712 | 24,304 | 41 | 972,192 | 272

1‐ Configuration overhead consists of commands in the bitstream that are needed to perform configuration but do not themselves program any memory cells. Configuration overhead contributes to the overall bitstream size.
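The figures in Table 11 are related by simple arithmetic, which the following sketch reproduces. The only assumption not stated in the table is the word width of 32 bits (`WORD_BITS`); the variable names are illustrative.

```python
WORD_BITS = 32  # assumption: one configuration word is 32 bits wide

# Values for the LX110T taken from Table 11.
non_config_frames = 592
config_frames = 23_712
frame_length_words = 41
overhead_words = 272

total_frames = non_config_frames + config_frames        # total device frames
frame_bits = frame_length_words * WORD_BITS             # bits in one frame
array_words = config_frames * frame_length_words        # configuration array size
bitstream_bits = (array_words + overhead_words) * WORD_BITS

print(total_frames)     # 24304
print(frame_bits)       # 1312
print(array_words)      # 972192
print(bitstream_bits)   # 31118848
```

Under this assumption the full bitstream is about 31.1 Mbit, of which the 272 overhead words contribute less than 0.03 %.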

Figure 22 A schematic FPGA structure. Taken from [8]

There are some advantages and disadvantages to this method. These include:

Advantages:

• There is no need to partition the device. Therefore, the performance remains unaffected.
• It can reconfigure a component at a finer granularity than the PR approach.
• The state of a component can be preserved.
• It needs less memory than the PR approach. Only one bitstream file is needed.
• All implementation options are available (e.g. techniques that perform optimization across the entire design).

Disadvantages:

• This method is blind, which means it reads back the configuration bitstream frame by frame; if an upset occurs, it will fix it. This method is not able to locate the error unless it reads back its relative frame.
• The recovery process may take time (it depends on the fault location).
• It cannot mitigate the SEU’s effect. Therefore, scrubbing is not able to guarantee the correct work of the design after recovery.
• It is not possible to detect SEUs in the BRAMs and the registers, since their values change in time and cannot be tested by scrubbing. Detection of errors in these parts of the system should be done by means of other techniques. [12]
• This method can only be used to reconfigure with the same bitstream. It cannot change the functionality of the components. Time multiplexing is not possible.
• It is difficult to monitor whether an error has occurred during reconfiguration of a frame.
• It is not possible to detect more than two SEUs and correct more than one SEU in a frame.

Xilinx has recently introduced a SEU mitigation controller (SEM), which is based on bitstream scrubbing. The SEM Controller implements five main functions: initialization, error injection, error detection, error correction, and error classification. All functions, except initialization and detection, are optional; desired functions are selected during the IP core configuration and generation process in the Xilinx CORE Generator. [41]


The controller initializes by bringing the integrated soft error detection capability of the FPGA into a

known state after the FPGA enters user mode. After this initialization, the controller endlessly loops,

observing the integrated soft error detection status. When an ECC or CRC error is detected, the

controller evaluates the situation to identify the Configuration Memory location involved. [41]

Once this is complete, the controller may optionally correct the soft error by repairing it or by replacing the affected bits. The repair methods use active partial reconfiguration to perform a localized correction of Configuration Memory with a read‐modify‐write scheme; they use algorithms to identify the error in need of correction. The replace method also uses active partial reconfiguration with the same goal, but with a write‐only scheme that replaces Configuration Memory with the original data. This data is provided by the implementation tools and stored outside the controller. [41]

The controller may optionally classify the soft error as essential or non‐essential using a lookup table. The lookup table is provided by the implementation tools, stored outside the controller, and fetched as required during error classification. [41]
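The difference between the repair (read‐modify‐write) and replace (write‐only) correction methods, followed by the optional classification step, can be illustrated with a toy model. This is a sketch under stated assumptions and not the SEM controller's interface: frames are modelled as integers, the error location is assumed to be already known, and `Correction`, `correct_and_classify`, `golden` and `essential` are hypothetical names.

```python
from enum import Enum
from typing import List, Set, Tuple

Location = Tuple[int, int]  # (frame index, bit index) of the located error


class Correction(Enum):
    REPAIR = "repair"    # read-modify-write: flip the located bit back
    REPLACE = "replace"  # write-only: overwrite the frame with original data


def correct_and_classify(memory: List[int], location: Location,
                         method: Correction, golden: List[int],
                         essential: Set[Location]) -> str:
    """Correct one located soft error, then classify it with a lookup set."""
    frame, bit = location
    if method is Correction.REPAIR:
        word = memory[frame]            # read the affected frame
        word ^= 1 << bit                # modify: invert the erroneous bit
        memory[frame] = word            # write the frame back
    else:
        memory[frame] = golden[frame]   # write original data, no read needed
    return "essential" if location in essential else "non-essential"


golden = [0b1010, 0b0110]   # hypothetical original frame data
memory = list(golden)
essential = {(1, 2)}        # hypothetical lookup table of essential bits

memory[1] ^= 1 << 2         # inject an error into an essential bit
verdict_repair = correct_and_classify(
    memory, (1, 2), Correction.REPAIR, golden, essential)

memory[0] ^= 1 << 0         # inject an error into a non-essential bit
verdict_replace = correct_and_classify(
    memory, (0, 0), Correction.REPLACE, golden, essential)
print(verdict_repair, verdict_replace)
```

The repair path needs only the error location, whereas the replace path needs the original frame data to be available from outside the controller, which matches the storage requirement described above.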

When the controller is idle, there is an option to accept input from the user to inject errors into

Configuration Memory. This function is useful for testing the integration of the controller into a larger

system design. Using the error injection capability, system verification and validation engineers may

construct test cases to ensure the complete system responds to soft error events as expected. [41]

The SEM controller uses ICAP for readback and for accessing the configuration memory. The ICAP interface is a point‐to‐point connection between the SEM controller and the ICAP primitive. The ICAP primitive enables read and write access to the registers inside the FPGA configuration system. For error detection, the SEM controller uses the FRAME_ECC interface. The FRAME_ECC primitive is an output‐only primitive that provides a window into the soft error detection function in the FPGA configuration system. The Virtex‐5 frame error correction code (ECC) logic is designed to detect single‐ or double‐bit errors in configuration frame data. [41] [31]
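How a code can distinguish single‐bit from double‐bit errors can be shown with a small extended Hamming code. The sketch below protects only 4 data bits with an (8,4) code and is meant as an illustration of the principle; it does not reproduce the layout of the Virtex‐5 frame ECC, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def encode(data: int) -> List[int]:
    """Encode 4 data bits into 8 code bits.
    Index 0 holds the overall parity; indices 1..7 are Hamming positions."""
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(data >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):                       # parity bits sit at powers of two
        parity = 0
        for pos in range(1, 8):
            if pos & p and pos != p:
                parity ^= code[pos]
        code[p] = parity
    code[0] = sum(code[1:]) % 2               # overall parity over positions 1..7
    return code


def check(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return ('ok', None), ('single', position) or ('double', None)."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                   # XOR of the positions of set bits
    overall_odd = sum(code) % 2 == 1
    if syndrome == 0 and not overall_odd:
        return ("ok", None)
    if overall_odd:
        return ("single", syndrome)           # syndrome points at the flipped bit
    return ("double", None)                   # detectable but not correctable


word = encode(0b1011)
clean = check(word)
word[5] ^= 1                                  # first upset
one_error = check(word)
word[2] ^= 1                                  # second upset in the same word
two_errors = check(word)
print(clean, one_error, two_errors)
```

A single error yields an odd overall parity and a syndrome equal to the position of the flipped bit, so it can be corrected; two errors leave the overall parity even with a non‐zero syndrome, so they can only be reported.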

There are some advantages and disadvantages to this method. These include:

Advantages:

• Supports different error detection and correction techniques [41]
• Completely flexible; can be used in many applications [41]
• Can perform error detection, error containment, error classification, and error correction
• Provides various status and monitor registers

Disadvantages:

• Only available for specific Xilinx series (Spartan‐6, Virtex‐6, Virtex‐7, Kintex‐7 series)
• Error detection is not optimal (it relies on the FRAME_ECC primitive while configuration memory is read frame by frame periodically)


9.2 Appendix B: Redundancy

One method to provide fault tolerance in embedded systems is through redundancy. For our purposes,

“redundancy is the provision of functional capabilities that would be unnecessary in a fault free

environment. This can be a replicated hardware component, an additional check bit attached to a string

of digital data, or a few lines of program code verifying the correctness of the program’s results.” [42]

“Two kinds of redundancy are possible: space redundancy and time redundancy. Space redundancy

provides additional components, functions, or data items that are unnecessary for a fault‐free

operation. Space redundancy is further classified into hardware, software and information redundancy,

depending on the type of redundant resources added to the system. In time redundancy the

computation or data transmission is repeated and the result is compared to a stored copy of the

previous result.” [42]
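Time redundancy as defined in the quotation can be sketched as repeating a computation and comparing the result with the stored copy of the first run. The helper names below are hypothetical, and the fault injector exists only to make the example observable.

```python
from typing import Callable


def time_redundant(compute: Callable[[int], int], x: int) -> int:
    """Run the same computation twice and compare the two results."""
    stored = compute(x)      # first run, kept as the stored copy
    repeated = compute(x)    # the computation is repeated later in time
    if repeated != stored:
        raise RuntimeError("results differ: transient fault suspected")
    return repeated


def make_flaky_square(fail_on_call: int) -> Callable[[int], int]:
    """Build a squaring function that corrupts the result of one chosen call.
    Calls are counted from 1, so fail_on_call=0 never injects a fault."""
    calls = {"count": 0}

    def square(x: int) -> int:
        calls["count"] += 1
        result = x * x
        if calls["count"] == fail_on_call:
            result ^= 1      # emulate a transient single-bit upset
        return result

    return square


print(time_redundant(make_flaky_square(0), 7))   # fault-free run
```

Comparison alone only detects the transient fault; masking it would need a third run and a vote, which is the temporal counterpart of the spatial scheme discussed next.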

In the literature, the term redundancy mostly refers to space redundancy. The most common form of space redundancy is Triple Modular Redundancy (TMR). Figure 23 shows the basic TMR principle. In TMR, the components are triplicated and a voter compares their outputs. If one module produces an error, the voter masks it. TMR can be applied at different granularities, from logic level to system level.

Figure 23 TMR basic principle
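The masking behaviour of the voter can be written as a bitwise majority function. This is a minimal software sketch of the principle, with hypothetical names; a hardware voter would be described in an HDL instead.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three module outputs."""
    return (a & b) | (a & c) | (b & c)


good = 0b1011_0010
faulty = good ^ 0b0000_1000   # one module output with a single flipped bit

masked = majority_vote(good, faulty, good)      # one faulty module
defeated = majority_vote(faulty, faulty, good)  # two modules fail identically
print(bin(masked), bin(defeated))
```

With one faulty module the voter output equals the correct value, while two modules failing in the same way outvote the healthy one. This is why TMR masks faults without removing them, and why faulty modules still have to be recovered.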

In addition to TMR, there are many other hardware redundancy techniques available (such as N‐modular redundancy, duplication with comparison, standby sparing, self‐purging redundancy and triplex‐duplex redundancy [42]). Xilinx has introduced the XTMR (Xilinx Triple Modular Redundancy) software tool to simplify the task of design triplication. According to [39] “TMRtool can partially or fully triplicate a design, insert voters, synchronize feedback path loops, and allow customized user‐triplicated module insertion. A triplicated design mitigates SEU impact on the user design.” However, XTMR is very costly in terms of resource utilization and, as a result, leads to a lower operating frequency and higher power consumption.


Redundancy can also be applied at the device level. For instance, in Figure 24 the FPGA is triplicated by adding two identical copies. However, in this design the voter is itself a single point of failure and must be implemented on a radiation‐hardened device. This design can also be very expensive.

Figure 24 TMR ‐ Device Level

To summarize, redundancy is a common technique in almost all approaches to fault‐tolerant (FT) design; however, on its own it cannot be considered a complete mitigation scheme for an FPGA design, because redundancy can only detect and mask a fault and cannot recover the faulty module. Therefore, the best mitigation schemes combine redundancy with a reconfiguration controller to detect, mask and correct faults. Table 12 summarizes the performance of the mitigation schemes.

Table 12 Performance Overview of mitigation schemes. Part of the table is taken from [12]

Mitigation Scheme | Mitigation Strength | Board Layout Complexity | Ease in Meeting Timing Constraints | Power Consumption | Component Cost | Average Recovery Speed
Power cycling | Weak | Low | Normal | Typical | Low | Lowest
XTMR | Medium | High | Reduced | ~3X typical | Low | N/A
Bitstream Scrubbing | Medium | Low | Normal | Typical | Medium | Low
PDR | Medium | Low | Reduced | Typical | Low | High
XTMR + Bitstream Scrubbing | Strong | High | Reduced | ~3X typical | Medium | Low
XTMR + PDR | Strong | High | Reduced | ~3X typical | Low | High
Redundant devices + Bitstream Scrubbing | Strongest | Medium | Normal | 2~4X typical | High | Low
Redundant devices + PDR | Strong | Medium | Reduced | 2~4X typical | High | Highest

As previously mentioned, a combination of redundancy and a reconfiguration controller is the strongest mitigation scheme. For instance, redundant devices (which could be a combination of redundancy at component and device level in multi‐FPGA platforms) plus bitstream scrubbing or PDR lead to the best mitigation result. One important difference between scrubbing and PDR is that scrubbing may recover upsets better, especially when an SEU occurs in the routing bits; however, its recovery speed could be much lower than that of PDR.

9.3 Appendix C: Xilinx Virtex‐5 overview

Xilinx Virtex‐5 FPGAs are one of the Virtex families, introduced by Xilinx in 2006, with a highest performance of 550 MHz. This family is divided into four different categories:

1‐ Virtex‐5 LX: High performance general logic applications
2‐ Virtex‐5 LXT: High performance logic with advanced serial connectivity
3‐ Virtex‐5 SXT: Signal processing applications with advanced serial connectivity
4‐ Virtex‐5 FXT: Embedded systems with advanced serial connectivity

Table 13 Virtex‐5 (LX110T) device specification taken from [43]

Resource | XC5VLX110T
CLB array (Row x Col) | 160 x 54
CLB slices (1) | 17,280
CLB max distributed RAM (kb) | 1,120
DSP48E slices (2) | 64
Block RAM blocks, 18 kb (3) | 296
Block RAM blocks, 36 kb | 148
Block RAM max (kb) | 5,328
CMT (4) | 6
PowerPC processor blocks | N/A
Ethernet MACs | 4
Max RocketIO transceivers (5), GTP | 16
Max RocketIO transceivers (5), GTX | N/A
Max user I/O (6) | 680

1‐ Virtex‐5 FPGA slices are organized differently from previous generations. Each Virtex‐5 FPGA slice contains four 6‐input LUTs and four flip‐flops (previously it was two LUTs and two flip‐flops).
2‐ Each DSP48E slice contains a 25 x 18 multiplier, an adder, and an accumulator.
3‐ Block RAMs are fundamentally 36 Kbits in size. Each block can also be used as two independent 18‐Kbit blocks.
4‐ Each Clock Management Tile (CMT) contains two DCMs and one PLL.
5‐ RocketIO GTP transceivers are designed to run from 100 Mb/s to 3.75 Gb/s. RocketIO GTX transceivers are designed to run from 150 Mb/s to 6.5 Gb/s.
6‐ This number does not include RocketIO transceivers.

Figure 25 Xilinx Virtex‐5 XC5VLX110T device. Taken from [44]

The proposed controller in this thesis is implemented on a Xilinx Virtex‐5 XC5VLX110T FPGA (Figure 25). Table 13 shows the device specification. Like all other Xilinx FPGA series, Virtex‐5 families store their configuration bitstream in SRAM‐type internal latches. Virtex‐5 devices are configured by loading application‐specific configuration data into internal memory via the configuration interface. This data contains bits that set the configuration for each LUT and flip‐flop as well as all routing connections. Moreover, the bitstream contains all necessary data for configuring the embedded elements like PowerPC, ICAP, and the initial data for BRAMs [45]. Because Xilinx configuration memory is volatile, it must be reconfigured each time it is turned on. The Virtex‐5 FPGA can be configured via several configuration interfaces. These interfaces are listed in Table 14.

Table 14 Virtex‐5 Configuration Modes

No. | Configuration Mode | Type of interface | Bus Width (bit)
1 | Master‐serial configuration mode | Serial | 1
2 | Slave‐serial configuration mode | Serial | 1
3 | Master SelectMAP configuration mode | Parallel | 8 or 16
4 | Slave SelectMAP configuration mode | Parallel | 8 or 16 or 32
5 | JTAG/Boundary‐Scan configuration mode | Serial | 1
6 | Master Serial Peripheral Interface (SPI) Flash configuration mode | Serial | 1
7 | Master Byte Peripheral Interface Up (BPI‐Up) Flash configuration mode | Parallel | 8 or 16
8 | Master Byte Peripheral Interface Down (BPI‐Down) Flash configuration mode | Parallel | 8 or 16

Among these interfaces, we have used Master‐serial configuration for full FPGA configuration, and the Internal Configuration Access Port (ICAP), which is based on the SelectMAP protocol, for partial reconfiguration.
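The bus width in Table 14 directly bounds the configuration time. As a rough sketch, the estimate below divides the full LX110T bitstream size (derived from Table 11, assuming 32‐bit words) by the bus width and the configuration clock. The 50 MHz clock is a hypothetical value chosen for illustration, and the estimate ignores start‐up sequences and memory access latency.

```python
# Full bitstream size for the LX110T: (array words + overhead words) * 32 bits.
BITSTREAM_BITS = (972_192 + 272) * 32


def config_time_ms(bus_width_bits: int, cclk_hz: float,
                   bits: int = BITSTREAM_BITS) -> float:
    """Lower bound on configuration time in milliseconds,
    assuming one bus transfer per configuration clock cycle."""
    cycles = -(-bits // bus_width_bits)   # ceiling division
    return 1000.0 * cycles / cclk_hz


CCLK_HZ = 50_000_000  # hypothetical configuration clock frequency

for width in (1, 8, 16, 32):
    print(f"{width:>2}-bit bus: {config_time_ms(width, CCLK_HZ):8.2f} ms")
```

Under these assumptions a serial interface needs roughly 622 ms for a full bitstream, while a 32‐bit interface needs about 19 ms. A partial bitstream is smaller, so its loading time shrinks in proportion to its size.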

The XUPV5‐LX110T is a feature‐rich general‐purpose evaluation and development platform with on‐board memory and industry standard connectivity interfaces. It features the Virtex‐5 XC5VLX110T device [46]. The evaluation platform (Figure 26) has the following features:

• Xilinx Virtex‐5 XC5VLX110T FPGA
• Two Xilinx XCF32P Platform Flash PROMs (32 Mb each) for storing large device configurations
• Xilinx System ACE Compact Flash configuration controller
• 64‐bit wide 256 Mbyte DDR2 small outline DIMM (SODIMM) module compatible with EDK supported IP and software drivers
• On‐board 32‐bit ZBT synchronous SRAM and Intel P30 StrataFlash
• 10/100/1000 tri‐speed Ethernet PHY supporting MII, GMII, RGMII, and SGMII interfaces
• USB host and peripheral controllers
• Programmable system clock generator
• Stereo AC97 codec with line in, line out, headphone, microphone, and SPDIF digital audio jacks
• RS‐232 port, 16x2 character LCD, and many other I/O devices and ports

Figure 26 Xilinx XUPV5‐LX110T Evaluation Platform. Taken from [46]

In this thesis, two identical XUPV5‐LX110T evaluation platforms have been used. One plays the role of a master that monitors and, in case of failure, recovers the second one, which plays the role of slave. We have utilized two Xilinx XCF32P Platform Flash PROMs for storing the full configuration bitstreams. The partial bitstream files were stored in on‐chip BRAMs and an off‐board Atmel I2C memory; however, on‐board memories (such as Compact Flash, ZBT synchronous SRAM or SPI flash) can be used as an alternative for storing partial or full bitstream files. In addition to this, we have used on‐board dipswitches, LEDs and keys for testing the system. Moreover, the expansion IOs have been used for connecting the two identical evaluation boards to each other to form a master‐slave system.

9.4 Appendix D: Configuration modes in Virtex‐5

9.4.1 Configuration Modes and Pins in Virtex‐5 [31]

Virtex®‐5 devices are configured by loading application‐specific configuration data (the bitstream) into internal memory. Because Xilinx FPGA configuration memory is volatile, it must be configured each time it is powered up. The bitstream is loaded into the device through special configuration pins. These configuration pins serve as the interface for a number of different configuration modes:

• Master‐serial configuration mode
• Slave‐serial configuration mode
• Master SelectMAP (parallel) configuration mode (x8 and x16 only)
• Slave SelectMAP (parallel) configuration mode (x8, x16, and x32)
• JTAG/Boundary‐Scan configuration mode
• Master Serial Peripheral Interface (SPI) Flash configuration mode
• Master Byte Peripheral Interface Up (BPI‐Up) Flash configuration mode (x8 and x16 only)
• Master Byte Peripheral Interface Down (BPI‐Down) Flash configuration mode (x8 and x16 only)

9.4.2 Serial Configuration Interface [31]

In serial configuration modes, the FPGA is configured by loading one configuration bit per CCLK cycle:

• In Master Serial mode, CCLK is an output.
• In Slave Serial mode, CCLK is an input.

Figure 27 shows the basic Virtex‐5 serial configuration interface. There are four methods of configuring an FPGA in serial mode:

• Master serial configuration
• Slave serial configuration
• Serial daisy‐chain configuration
• Ganged serial configuration

Figure 27 Virtex‐5 FPGA Serial Configuration Interface. Taken from [31]

Table 15 Virtex‐5 FPGA Serial Configuration Interface Pins

Pin name | Type | Dedicated or Dual Purpose | Description
M[2:0] | Input | Dedicated | Mode pins: determine the configuration mode. They can be set via special DIP‐switches on the evaluation board.
CCLK | Output | Dedicated | Configuration clock source for all configuration modes except JTAG. For Master serial it is an output.
D_IN | Input | Dedicated | Serial configuration data input, synchronous to the rising CCLK edge.
DOUT_BUSY | Output | Dedicated | Serial data output for downstream daisy‐chained devices. It is left unconnected in our design.
DONE | Bidirectional, Open‐Drain, or Active | Dedicated | Active High signal indicating configuration is complete: 0 = FPGA not configured, 1 = FPGA configured. Refer to the BitGen section of the Development System Reference Guide for software settings.
INIT_B | Input or Output, Open‐Drain | Dedicated | Before the Mode pins are sampled, INIT_B is an input that can be held Low to delay configuration. After the Mode pins are sampled, INIT_B is an open‐drain active Low output indicating whether a CRC error occurred during configuration: 0 = CRC error, 1 = No CRC error.
PROGRAM_B | Input | Dedicated | Active‐Low asynchronous full‐chip reset.

Figure 28 shows how configuration data is clocked into Virtex‐5 devices in Master Serial mode.

Figure 28 Serial Configuration Clocking Sequence. Taken from [31]

Notes relevant to Figure 28:

1. Bit 0 represents the MSB of the first byte. For example, if the first byte is 0xAA (1010_1010), bit 0 = 1, bit 1 = 0, bit 2 = 1, etc.
2. For Master configuration mode, CCLK does not transition until after the Mode pins are sampled, as indicated by the arrow.
3. CCLK can be free running in Slave serial mode.

The Master Serial mode is designed so that the FPGA can be configured from a Xilinx configuration PROM, as shown in Figure 29.

Figure 29 Master Serial Mode Configuration. Taken from [31]

Notes relevant to Figure 29:

1. The DONE pin is by default an open‐drain output requiring an external pull‐up resistor. The DONE pin has a programmable active driver. To enable it, enable the DriveDONE option in BitGen.
2. The INIT_B pin is a bidirectional, open‐drain pin. An external pull‐up resistor is required.
3. The BitGen startup clock setting must be set for CCLK for serial configuration.
4. The PROM in this diagram represents one or more Xilinx PROMs. Multiple Xilinx PROMs can be cascaded to increase the overall configuration storage capacity.
5. The BIT file must be reformatted into a PROM file before it can be stored on the Xilinx PROM.
6. On some Xilinx PROMs, the reset polarity is programmable. RESET should be configured as active Low when using this setup.
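The bit ordering stated in note 1 for Figure 28 (bit 0 is the MSB of the first byte) can be sketched as a generator that yields one bit per CCLK cycle. This is an illustrative software model with a hypothetical function name, not code from the thesis.

```python
from typing import Iterator


def serial_bits(data: bytes) -> Iterator[int]:
    """Yield the bits of a byte stream in serial configuration order:
    most significant bit of each byte first, one bit per CCLK cycle."""
    for byte in data:
        for shift in range(7, -1, -1):    # bit 7 (MSB) down to bit 0 (LSB)
            yield (byte >> shift) & 1


first_byte = list(serial_bits(b"\xAA"))
print(first_byte)   # [1, 0, 1, 0, 1, 0, 1, 0]
```

For the byte 0xAA this gives bit 0 = 1, bit 1 = 0, bit 2 = 1, which agrees with the example in the note.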
