AN AUTONOMOUS FPGA-BASED EMULATION SYSTEM FOR FAST FAULT TOLERANT EVALUATION

Celia López-Ongil, Mario García-Valderas, Marta Portela-García, Luis Entrena-Arrontes
Departamento de Tecnología Electrónica - Universidad Carlos III de Madrid
Avda. de la Universidad, 30. 28911-Leganés (Madrid, Spain)
{celia, mgvalder, mportela, entrena}@ing.uc3m.es

This work has been partially funded by the Directorate of Research of Madrid Community Government, code 07T/0052/2003 2.

0-7803-9362-7/05/$20.00 ©2005 IEEE

ABSTRACT

Platform FPGAs provide a high degree of reconfigurability and a high integration density. These features make them well suited to hardware emulation and, in particular, to fault tolerance evaluation. Several FPGA-based approaches notably enhance the fault tolerance evaluation process and achieve a significant speed-up. However, such methods are limited by the communication between the FPGA and the host computer, which manages the emulation process. To minimize this communication and therefore accelerate the overall process, this paper proposes an autonomous emulation system. The solution takes advantage of additional hardware resources available in current platform FPGAs, such as embedded RAM. In the proposed system, a complete emulation campaign and its management are embedded in the FPGA, accelerating the emulation process by up to two orders of magnitude without losing flexibility with respect to other hardware solutions.

1. INTRODUCTION

Fault tolerance is becoming a concern for an increasing number of applications due to the use of very deep submicron technologies, which can be affected by radiation that produces transient faults in memory elements even at the earth's surface [1]. The change in the value of a memory element caused by the impact of a high-energy particle is called a single event upset (SEU) [2]. In order to design fault tolerant circuits, it is necessary to perform fault tolerance evaluation, that is, to check the circuit response when faults occur. This evaluation is useful in a first stage for locating weak areas in the circuit, where tolerant structures need to be inserted, and in a final stage, once the circuit has been hardened, for evaluating the achieved fault tolerance. Fault tolerance evaluation is performed by injecting faults into the circuit while it is running a testbench, in order to observe and classify the fault effects.

There are hardware and software solutions for fault tolerance evaluation. Among the hardware approaches, the most direct one is to physically expose a circuit prototype to radiation [3]. This method gives the most realistic evaluation results, but it is unable to locate weak areas in the circuit. It also requires manufacturing prototypes, which makes the redesign cycle very expensive.

Simulation-based methods have been proposed to allow circuit hardening in early design stages [4]. These methods are based on injecting faults into the circuit description, according to a fault model, and validating the circuit response through simulation. They offer great flexibility in terms of fault campaign management and identification of weak areas in the circuit, but the simulation process is slow.

In order to accelerate the evaluation process, simulation tasks have been replaced by hardware emulation in programmable devices. FPGA-based emulation has been applied with good results in other applications such as functional validation [5]. In the case of fault tolerance evaluation, the circuit under test is programmed into an FPGA and then fault injection is performed. There are two main techniques to perform fault injection in FPGAs: reconfiguring the device to introduce the faulty behaviour [6], or modifying the original circuit by adding extra hardware that alters the state of the circuit [7]. The main drawback of these approaches is that they require very intensive interaction with the host, which controls the injection and evaluation of every fault and therefore becomes a performance bottleneck.

This paper proposes a new FPGA-based emulation system that takes advantage of programmable logic features and of the resources available on platform FPGAs to enhance the fault tolerance evaluation of digital circuits. The proposed autonomous emulation system avoids emulator-host communication bottlenecks by executing the complete emulation campaign and its management in the FPGA in a single continuous run. Interaction with the host is only required at the beginning of the process, to download the fault campaign information to the emulator, and at the end, to retrieve the fault classification. Flexibility is maintained, since the setup of the process (not its management) is performed by the host.

The paper is organized as follows. Transient fault emulation is presented in section 2. The proposed autonomous emulation system is described in section 3. Experimental results are presented in section 4 and, finally, section 5 states the conclusions.

2. TRANSIENT FAULT EMULATION

The fault tolerance evaluation process consists of checking the response of a circuit when faults are injected, by comparing its behaviour with that of the fault-free circuit. Fault tolerance evaluation is usually carried out by forcing faults into the circuit while it is running a testbench. Faults are injected according to a fault model. The bit-flip is the fault model commonly adopted for SEU effects [2], and it only affects memory elements.

Fault emulation appeared as an alternative to simulation for fault tolerance evaluation. It benefits from hardware speed while maintaining software flexibility. In fault emulation, the heaviest tasks are done in FPGA hardware and the setup of the fault injection campaign is performed by the host. The main difficulty in fault evaluation through emulation is how to inject faults while a testbench is being applied, in order to observe the fault effects in the circuit response. The main approaches are based on partial or total reconfiguration [6] and on scan-path, like the instrumented circuit technique [7]. The instrumented circuit technique has also been used for fault tolerance evaluation with other kinds of faults [8]. These approaches are analysed in more detail in the following sections.

2.1. Circuit Reconfiguration Technique

In emulation with partial or dynamic FPGA reconfiguration, the bit-flip model is applied by reconfiguring the device to modify the circuit state. The circuit is set to a faulty state and emulation is applied to analyse its response. A partial or total FPGA reconfiguration is needed to inject every fault [6].

The fault list is generated and stored in the host, which is in charge of reconfiguring the FPGA to inject every fault. Fault emulation is controlled by the host, which enables the hardware operation once a fault is injected. The testbench is applied by storing the circuit inputs in the internal memory of the FPGA.

Hardware emulation of every faulty circuit is executed from the injection cycle to the end of the testbench. It is possible to stop the fault emulation if the fault is classified before the end of the testbench, but stopping is decided and executed by the host. Fault grading is performed in the host, comparing the correct outputs, generated by simulation, with the faulty circuit outputs uploaded from the FPGA.

In this technique there is no area overhead with respect to the original circuit, but internal RAM is required to store the testbench inputs. However, injection time is very high, as reconfiguration is quite a slow process. In [6] some experimental results are provided, giving rates of 100 ms/fault, which are mainly due to run-time FPGA reconfigurations.

2.2. Instrumented Circuit Technique

In this technique [7], faults are injected by means of specific hardware located at the flip-flops of the circuit, named instruments, which are connected in a scan-path chain, named fault-mask. A fault mask is loaded into this chain in order to select the flip-flops to be affected by fault injection.

The fault list is stored in the host. The fault mask to be applied to the circuit at injection time is generated and stored in the host, and downloaded into the circuit via the scan-path. Fault injection is activated from the host and executed by the instruments located at every flip-flop. Fault emulation is controlled by the host, which enables the hardware operation before and after injecting a fault. The testbench is applied through a buffer that connects the host with the circuit.

Emulation of every fault is executed for the whole testbench, first to reach the state of the circuit at the injection time and then to emulate the behaviour of the circuit from the injection cycle. If the fault is classified, it is possible to stop the fault emulation before the end of the testbench, but stopping is decided and executed by the host. Fault grading is performed in the host, comparing the correct outputs (generated by simulation) with the faulty circuit output values uploaded from the FPGA.

Instruments introduce an area overhead of one flip-flop and some logic per original flip-flop. This emulation-based solution has given rates of about 100 µs/fault [7].
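The bit-flip model and the role of a fault mask can be illustrated with a short Python sketch. It models the circuit state as a list of bits and the fault mask as a one-hot vector that selects the flip-flop to upset. The function names `one_hot_mask` and `inject_bit_flip` are illustrative assumptions and do not come from [7]; in the real technique the instruments are hardware attached to every flip-flop and the mask is shifted in through the scan-path.

```python
from typing import List


def one_hot_mask(num_flip_flops: int, target: int) -> List[int]:
    """Build a fault mask with a single 1 at the flip-flop to be upset."""
    if not 0 <= target < num_flip_flops:
        raise ValueError("target flip-flop index out of range")
    return [1 if i == target else 0 for i in range(num_flip_flops)]


def inject_bit_flip(state: List[int], mask: List[int]) -> List[int]:
    """Bit-flip fault model: XOR every state bit with its mask bit."""
    if len(state) != len(mask):
        raise ValueError("state and mask must have the same length")
    return [s ^ m for s, m in zip(state, mask)]


# Example: upset flip-flop 2 of a hypothetical four-flip-flop circuit.
golden_state = [1, 0, 1, 1]
faulty_state = inject_bit_flip(golden_state, one_hot_mask(4, 2))
print(faulty_state)  # [1, 0, 0, 1]
```

Because the injection is an XOR, a mask with several bits set would model a multiple fault, and applying the same mask twice restores the original state.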

3. AUTONOMOUS EMULATION SYSTEM

Emulation-based techniques represent a great speed improvement over simulation techniques, but important improvements can still be made. In current solutions, the evaluation process is interrupted every time the emulator needs to wait for the host to apply the stimuli, inject a fault or check the output values. We propose an autonomous emulation system that greatly improves evaluation speed by avoiding communication between emulator and host.

Throughout the rest of the paper, the bit-flip model for SEU effects is assumed for injecting faults into circuits. Faults are classified into three categories depending on the circuit response to them. A failure is a fault that induces wrong circuit behaviour. A fault that does not alter the output values during the whole testbench, but leaves the circuit in a state different from that of the fault-free circuit, is classified as a latent fault. Finally, faults whose effects disappear completely are considered silent faults.

The fault tolerance evaluation problem is two-dimensional. If F is the number of flip-flops in a circuit and C is the number of cycles of the testbench chosen for fault tolerance evaluation, the complete set of single faults is composed of F·C faults: a fault can be injected in any of the flip-flops and, for every flip-flop, it can be injected at any clock cycle. For every fault, the testbench needs to be applied to check the circuit response. If the testbench is completely applied to check every fault, the total number of clock cycles needed to perform the whole evaluation process is F·C².

Some optimizations can reduce this number by avoiding the application of the entire testbench every time.

The testbench does not need to be applied from the beginning for every fault. It only needs to be applied from the clock cycle where the fault is injected. To apply this optimization, the circuit state required at the injection cycle must be available.

The testbench does not always need to be applied up to the end. Its application can be stopped when the fault is a failure or silent. If the circuit behaves differently from the golden run, the fault can be immediately classified as a failure. The testbench can also be stopped as soon as the fault effects have completely disappeared, and the fault can then be classified as silent.

The primary objective of the autonomous system proposed in this paper is to speed up the evaluation process by applying these optimizations and by avoiding communication between the FPGA board and the host computer during the evaluation process. Communication takes place only at the beginning of the process, to download the fault campaign information to the emulation board, and at the end, to retrieve the results. Fig. 1 shows the general structure of the emulator, which uses FPGA logic resources, embedded RAM and board RAM.
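The size of the problem can be checked with a few lines of Python. The sketch below computes the F·C fault count, the F·C² cycle count for the unoptimized case, and an upper bound for the case in which each run starts at the injection cycle. It uses the b14 figures quoted in the experimental section (215 flip-flops, 160 testbench cycles). The function names are illustrative, and the bound assumes that a fault injected at cycle t (counted from 0) still needs C - t cycles and that no run is stopped early; the paper itself does not give this expression.

```python
def single_fault_count(num_flip_flops: int, num_cycles: int) -> int:
    """Complete set of single faults: one per flip-flop and clock cycle (F*C)."""
    return num_flip_flops * num_cycles


def cycles_full_testbench(num_flip_flops: int, num_cycles: int) -> int:
    """Every fault replays the whole testbench: F*C faults times C cycles."""
    return single_fault_count(num_flip_flops, num_cycles) * num_cycles


def cycles_from_injection(num_flip_flops: int, num_cycles: int) -> int:
    """Upper bound when each run starts at the injection cycle.

    Assumption: a fault injected at cycle t (0-based) needs C - t cycles,
    and early stopping on failure or silent faults is not modelled.
    """
    per_flip_flop = sum(num_cycles - t for t in range(num_cycles))
    return num_flip_flops * per_flip_flop


F, C = 215, 160  # b14 with the 160-cycle testbench
print(single_fault_count(F, C))     # 34400
print(cycles_full_testbench(F, C))  # 5504000
print(cycles_from_injection(F, C))  # 2769200
```

Under these assumptions, starting at the injection cycle roughly halves the work; stopping early on failures and silent faults reduces it further by an amount that depends on the circuit and the testbench.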

[Figure omitted. Block labels: Emulation Board, FPGA, Embedded RAM, Board RAM, Emulation Controller, Circuit Under Evaluation, Failure detector, Silent detector.]

Fig. 1. Fault emulator circuit main structure

The proposed evaluation system uses RAM to store all the information needed to perform the evaluation process autonomously. Testbench input stimuli and expected output values are stored in the FPGA embedded RAM. Embedded RAM allows a complete vector of input and output values to be accessed in a single clock cycle, without the limit that the word width of an external memory would impose. This way, every testbench cycle can be applied in a single clock cycle. Fault injection information is stored in board RAM. The fault dictionary and any other evaluation results are also stored in board RAM by the emulator, so that uploading the results can be postponed until the evaluation process has finished.

The main blocks of the emulator circuit implemented in the FPGA are the circuit under evaluation, an emulation controller, a failure detector and a silent fault detector. The emulation controller is in charge of applying the stimuli to the circuit, injecting the faults and storing the evaluation results into memory. The fault detectors detect differences between the faulty circuit outputs and the expected outputs, or between the faulty circuit state and the golden circuit state, which allows the emulation to be stopped as soon as a fault is classified.

Three techniques have been designed to implement the circuit under evaluation, named the Time-Multiplexed, State-Scan and Mask-Scan techniques. The first one provides the best performance in terms of evaluation speed. The others are simplified versions that reduce resource usage.

3.1. Time-Multiplexed technique

In this technique, every circuit flip-flop is replaced by the logical structure shown in Fig. 2. The idea underlying this structure is to have two circuits working simultaneously, one performing a golden run and the other running a faulty circuit. In order to save resources, these two circuits share their combinational logic and each one runs in alternate clock cycles, performing time multiplexing.

[Figure omitted. Flip-flop and signal labels: GOLDEN, FAULTY, MASK, REARM, DataIn, DataOut, ScanOut, MaskQ, FaultyQ, GoldenQ, RearmQ, Inject, EnaGolden, EnaFaulty, LoadState, SaveState, Error.]

Fig. 2. Flip-flop replacement for the Time-Multiplexed emulation technique

The flip-flop replacement consists of four flip-flops, called golden, faulty, mask and rearm. The golden and faulty flip-flop pair replaces the original circuit flip-flop. The golden flip-flop is used to emulate the golden run circuit, and the faulty flip-flop is used to emulate the faulty circuit. These two circuits run in alternate clock cycles. This way, circuit states can be compared at any time, allowing the immediate detection of silent faults.

The mask flip-flop indicates whether a fault is going to be injected in the faulty flip-flop. The mask flip-flops of the circuit are joined to form a scan chain.

The rearm flip-flop is used to store the circuit state, so that every run does not need to start from the beginning. When a fault has been classified, the faulty and golden flip-flops reload their values from the rearm flip-flop.

Using this flip-flop replacement, emulation benefits from all the speed optimizations. The main disadvantage of this technique is its logic resource consumption, given that the number of flip-flops in the circuit is multiplied by four.

The emulator implemented to test this technique has been developed to evaluate the complete set of single faults. This is a particular case in which the emulation controller can generate the complete fault list by itself, without receiving any information from a host computer. This is the most favorable case for speed and gives optimistic results.
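The behaviour of the Time-Multiplexed technique can be sketched in software. The Python model below runs a golden and a faulty copy of a circuit side by side from a saved (rearm) state, stops as soon as the outputs differ (failure) or the states coincide again (silent), and reports a latent fault if the testbench ends with different states. The four-flip-flop toy circuit, the function names and the exact order of the checks are assumptions made for illustration only; the real emulator shares the combinational logic between both copies and alternates them on successive clock cycles, whereas this model simply steps both copies in the same loop iteration.

```python
from typing import List, Tuple

State = Tuple[int, int, int, int]


def next_state(state: State, x: int) -> State:
    """Toy circuit (hypothetical): ff0 samples the input, ff1 registers ff0,
    ff2 holds its own value, ff3 is overwritten by the input every cycle."""
    ff0, ff1, ff2, ff3 = state
    return (x, ff0, ff2, x)


def output(state: State) -> int:
    """Only ff1 is observable at the circuit output."""
    return state[1]


def classify(rearm: State, stimuli: List[int], start: int, target: int) -> str:
    """Run golden and faulty copies side by side from the injection cycle."""
    golden = rearm
    faulty = tuple(bit ^ (i == target) for i, bit in enumerate(rearm))
    for x in stimuli[start:]:
        if output(golden) != output(faulty):
            return "failure"  # outputs differ: stop immediately
        golden, faulty = next_state(golden, x), next_state(faulty, x)
        if golden == faulty:
            return "silent"   # fault effects have disappeared
    return "latent"           # state still differs at the end of the testbench


def run_campaign(initial: State, stimuli: List[int]) -> List[Tuple[int, int, str]]:
    """Evaluate every (cycle, flip-flop) single fault, reloading the saved state."""
    results = []
    rearm = initial  # golden state at the current injection cycle
    for cycle, x in enumerate(stimuli):
        for target in range(len(initial)):
            results.append((cycle, target, classify(rearm, stimuli, cycle, target)))
        rearm = next_state(rearm, x)  # advance the saved state by one cycle
    return results


for cycle, target, verdict in run_campaign((0, 0, 0, 0), [1, 0, 1]):
    print(cycle, target, verdict)
```

In this toy circuit a fault in ff1 is a failure at once, a fault in ff3 becomes silent after one cycle, a fault in ff2 is always latent, and a fault in ff0 is a failure unless it is injected in the last cycle, where it can no longer reach the output and is reported as latent.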

3.2. State-Scan technique

This technique is a simplification of the previous one that reduces area overhead. All memory elements in the circuit are connected through a scan-path chain, so that the whole circuit state can be downloaded into the circuit. Fault injection is carried out by downloading a circuit state with a fault already injected, which also allows multiple-fault injection. In order to detect silent faults, a golden run is executed by the emulator at the beginning of the process and the final circuit state is stored, so that the final state of the faulty circuit can be compared with the golden run state.

The fault list is processed in the host in order to generate the circuit state for every fault to be evaluated. The emulator implemented to test this technique stores the different faulty states in board RAM.

Concerning time optimizations, the testbench is applied from the fault injection cycle, but emulation is only stopped when a failure is detected. Silent faults cannot be detected until the end of the testbench application.

The main drawbacks of this technique are the performance decrease, due to the need to scan in the circuit state, and the large amount of RAM needed to store the different faulty circuit states.

3.3. Mask-Scan technique

This technique is also a simplification of the Time-Multiplexed technique, but it does not need as much RAM as the State-Scan technique. Only a mask flip-flop is added to every original flip-flop, and the mask flip-flops are joined to form a scan chain. This mechanism is quite similar to the instrumented circuit technique [7].

In terms of speed, testbench application is always performed from the beginning, and emulation is only stopped when a failure is detected.

The emulator implemented using this technique evaluates the full set of single faults. In this particular case, the fault list can be automatically generated by the emulation controller, which gives optimistic speed results.

4. EXPERIMENTAL RESULTS

Several experiments have been carried out to analyze the three proposed techniques. This section details the experimental results.

The designs have been implemented on a Celoxica RC1000 board, with a Xilinx Virtex-2000E and 8Mb of onboard RAM. It is a PCI card that allows fast communication with the host. Both the FPGA and the memory are accessible through the PCI bus.

The proposed techniques have been applied to VHDL descriptions at RT level. Leonardo Spectrum 2003 and Xilinx ISE 6.3 have been used as synthesis and back-end tools. Experiments have been simulated with ModelSim XE 5.8. Custom tools have been developed for some specific tasks. To perform automatic circuit instrumentation, a custom tool based on the AIRE/CE intermediate format and the API from FTL Systems has been used. Another tool has been developed to access the onboard memory of the RC1000 through the PCI bus.

The proposed techniques have been applied to the b14 and b12 circuits from the ITC'99 benchmark suite [9], a Viper processor and a sequence detector. A hardened version of the b14 circuit has also been used in order to check the performance of the developed system with highly fault tolerant circuits; Triple Modular Redundancy (TMR) has been applied to the outputs of the b14 circuit. The b14 circuit has 32 inputs (not including clock and reset) and 54 outputs, and the synthesized netlist has 215 flip-flops. The b12 circuit has 5 inputs and 6 outputs, and the synthesized netlist has 119 flip-flops. The hardened b14 circuit has 323 flip-flops, due to the TMR insertion in the output flip-flops.

The experiments consist of two sets of 160 and 600 stimulus vectors and the complete list of single faults. Regarding fault classification, the Mask-Scan technique only provides detected/not detected information. In the other techniques, faults are classified as detected, latent or silent.

The synthesis results for the three techniques implemented in the autonomous system are shown in Table 1 for b14, Table 2 for b12 and Table 3 for b14_hardened; all results are given for 160 and 600 testbench cycles. These tables show the required embedded and on-board RAM, and the area of the original circuits, of the circuit versions modified by each emulation technique, and of the complete emulation circuits. Area overheads are calculated relative to the original circuits.

The overhead due to the modification of the circuit under evaluation is proportional to the number of flip-flops. The minimum area overhead is achieved with the State-Scan technique, although this is not apparent in the tables, because the State-Scan technique includes an extra flip-flop for fault classification. The Mask-Scan technique would need an additional flip-flop per original flip-flop in the circuit in order to perform the same classification. The Time-Multiplexed technique, on the other hand, implies the highest area overhead both for the original circuit and for the complete emulation system.

The area overhead due to the control block depends on the number of flip-flops, the number of testbench cycles and the number of inputs and outputs of the original circuit. The Time-Multiplexed technique requires a larger area in the emulator circuit due to the complexity of this block.

Table 1. Synthesis results for B14, with 160 and 600 testbench cycles

| B14 | Board/FPGA RAM (kbits), 160 | Board/FPGA RAM (kbits), 600 | Modified CUT LUTs (overhead) | Modified CUT FFs (overhead) | Emulator LUTs (overhead), 160 | Emulator LUTs (overhead), 600 | Emulator FFs (overhead), 160 | Emulator FFs (overhead), 600 |
|---|---|---|---|---|---|---|---|---|
| Original | - | - | 1,172 | 215 | - | - | - | - |
| Mask Scan | 33 / 13.4 | 126 / 50.4 | 1,648 (41%) | 430 (100%) | 1,586 (35%) | 1,592 (36%) | 520 (142%) | 524 (144%) |
| State Scan | 7,289 / 13.4 | 26,573 / 50.4 | 1,555 (33%) | 430 (100%) | 1,684 (44%) | 1,694 (45%) | 539 (151%) | 545 (153%) |
| Time Mux | 67 / 5.3 | 252 / 18.75 | 3,945 (227%) | 860 (300%) | 4,255 (255%) | 4,291 (266%) | 1,077 (380%) | 1,035 (401%) |

Table 2. Synthesis results for B12, with 160 and 600 testbench cycles

| B12 | Board/FPGA RAM (kbits), 160 | Board/FPGA RAM (kbits), 600 | Modified CUT LUTs (overhead) | Modified CUT FFs (overhead) | Emulator LUTs (overhead), 160 | Emulator LUTs (overhead), 600 | Emulator FFs (overhead), 160 | Emulator FFs (overhead), 600 |
|---|---|---|---|---|---|---|---|---|
| Original | - | - | 362 | 119 | - | - | - | - |
| Mask Scan | 18.6 / 2 | 69.7 / 7.6 | 673 (86%) | 238 (100%) | 710 (96%) | 716 (98%) | 327 (175%) | 331 (178%) |
| State Scan | 2,200 / 2 | 8,200 / 7.6 | 622 (72%) | 238 (100%) | 758 (109%) | 773 (113%) | 365 (207%) | 372 (213%) |
| Time Mux | 37 / 0.78 | 139 / 2.9 | 1,769 (389%) | 476 (300%) | 2,441 (574%) | 2,465 (581%) | 637 (435%) | 647 (444%) |

Table 3. Synthesis results for B14_hardened, with 160 and 600 testbench cycles

| B14 hardened | Board/FPGA RAM (kbits), 160 | Board/FPGA RAM (kbits), 600 | Modified CUT LUTs (overhead) | Modified CUT FFs (overhead) | Emulator LUTs (overhead), 160 | Emulator LUTs (overhead), 600 | Emulator FFs (overhead), 160 | Emulator FFs (overhead), 600 |
|---|---|---|---|---|---|---|---|---|
| Original | - | - | 2,126 | 323 | - | - | - | - |
| Mask Scan | 50.5 / 13.4 | 189.3 / 50.4 | 2,191 (3%) | 646 (100%) | 2,073 (0%) | 2,079 (0%) | 785 (143%) | 789 (144%) |
| State Scan | 19,380 / 13.4 | 72,675 / 50.4 | 1,982 (0%) | 646 (100%) | 2,099 (0%) | 2,110 (0%) | 804 (149%) | 810 (150%) |
| Time Mux | 101 / 5.3 | 378 / 18.75 | 4,448 (109%) | 1,292 (300%) | 4,861 (129%) | 4,702 (121%) | 1,483 (359%) | 1,473 (356%) |
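The overhead percentages in the synthesis tables can be read as the relative increase of a resource count over the original circuit. The paper does not state the formula explicitly, so the following Python sketch is an assumption; it reproduces the b12 entries used in the example. The function name `overhead_percent` is illustrative.

```python
def overhead_percent(modified: int, original: int) -> float:
    """Area overhead of a modified circuit relative to the original, in percent."""
    if original <= 0:
        raise ValueError("original resource count must be positive")
    return 100.0 * (modified - original) / original


# b12 figures from Table 2: 362 LUTs and 119 flip-flops in the original circuit.
print(round(overhead_percent(673, 362)))  # Mask Scan LUTs -> 86
print(round(overhead_percent(238, 119)))  # Mask Scan FFs  -> 100
print(round(overhead_percent(476, 119)))  # Time Mux FFs   -> 300
```

The 100% and 300% flip-flop overheads follow directly from the structure of the techniques: Mask-Scan adds one flip-flop per original flip-flop, while the Time-Multiplexed replacement uses four flip-flops in place of one.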

Fault emulation results

The experiments performed include the emulation of all single faults. Fault classification results are shown in Table 4 for the developed techniques and for 160 and 600 testbench clock cycles. The #faults columns indicate the number of faults injected. The remaining columns show the fault classification for 160 and 600 cycles. The Mask-Scan technique classifies faults only as detected or not detected. The State-Scan and Time-Multiplexed techniques, on the other hand, categorize faults as failure, latent or silent. The three techniques provide the same fault classification.

When the number of testbench clock cycles is small, the fault classification can be inaccurate. For this reason, Table 4 shows differences between the emulations with 160 and with 600 clock cycles: longer emulations detect more faults and leave fewer latent faults. The conclusion is that, to obtain accurate fault classifications, it is important to apply testbenches that exercise as much functionality as possible, so that injected faults are allowed to propagate inside the circuit.

Finally, emulation times are shown in Table 5, Table 6 and Table 7 for b14, b12 and b14_hardened respectively, for the three techniques. The first two tables also include results for a simulation-based solution. In the three tables, results for 160 and 600 testbench clock cycles are given for a clock frequency of 25 MHz. Together with the execution times, speed averages are calculated and shown in these tables.

Comparing the speed averages obtained with the proposed techniques with those obtained in fault simulation (23,000 µs/fault) and in instrumented circuit emulation (100 µs/fault) [7], we conclude that our approaches are much faster.

The execution times of the proposed techniques depend on the number of circuit flip-flops and on the number of testbench clock cycles. In these tables, the results of the Time-Multiplexed technique are better for the shortest emulation (160 cycles) and for a large number of flip-flops (b14 and b14_hardened). The results of the State-Scan technique, on the other hand, are better for the longest emulation (600 cycles) and a small number of flip-flops (b12).
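The speed averages reported in the time tables are consistent with dividing the execution time by the F·C faults of the complete single-fault set. The Python sketch below makes that relation explicit for the b14 circuit with the 160-cycle testbench. The formula is inferred from the reported figures rather than stated in the paper, and the function name is illustrative.

```python
def speed_average_us(execution_time_s: float, num_flip_flops: int, num_cycles: int) -> float:
    """Average time per fault, in microseconds, for the complete single-fault set."""
    num_faults = num_flip_flops * num_cycles  # F*C single faults
    return execution_time_s * 1e6 / num_faults


# b14 (215 flip-flops), 160-cycle testbench, execution times from Table 5.
print(round(speed_average_us(141.11e-3, 215, 160), 1))  # Mask Scan
print(round(speed_average_us(386.40e-3, 215, 160), 1))  # State Scan
print(round(speed_average_us(19.90e-3, 215, 160), 2))   # Time Mux
```

The results agree with the tabulated 4.1, 11.2 and 0.57 µs/fault up to rounding in the last digit.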

Table 4. Fault classification for the b14, b12 and b14_hardened circuits (F: failure, L: latent, S: silent)

| Circuit | %F, 160 | %L, 160 | %S, 160 | #faults, 160 | %F, 600 | %L, 600 | %S, 600 | #faults, 600 |
|---|---|---|---|---|---|---|---|---|
| B14 | 49.2 | 4.4 | 46.4 | 34,400 | 59.6 | 0.1 | 40.3 | 129,000 |
| B12 | 29.8 | 55.6 | 14.6 | 19,040 | 33.3 | 53.4 | 13.3 | 71,400 |
| B14_hardened | 16.0 | 3.2 | 80.7 | 51,680 | 23.0 | 1.3 | 75.7 | 193,800 |

Table 5. Time results for the b14 circuit

| B14 | Execution time, 160 | Execution time, 600 | Speed average, 160 | Speed average, 600 |
|---|---|---|---|---|
| Simulation | 12.5 h | 47 h | 1.3 s/fault | 1.3 s/fault |
| Mask Scan | 141.11 ms | 2.2 s | 4.1 µs/fault | 17 µs/fault |
| State Scan | 386.40 ms | 2.1 s | 11.2 µs/fault | 16 µs/fault |
| Time Mux | 19.90 ms | 83.2 ms | 0.57 µs/fault | 0.64 µs/fault |

Table 6. Time results for the b12 circuit

| B12 | Execution time, 160 | Execution time, 600 | Speed average, 160 | Speed average, 600 |
|---|---|---|---|---|
| Simulation | 7.4 min | 96 min | 23 ms/fault | 81 ms/fault |
| Mask Scan | 108.7 ms | 1.45 s | 5.7 µs/fault | 20.3 µs/fault |
| State Scan | 142 ms | 771 ms | 7.5 µs/fault | 10.8 µs/fault |
| Time Mux | 80.80 ms | 971 ms | 4.2 µs/fault | 13.6 µs/fault |

Table 7. Time results for the b14_hardened circuit

| B14 hardened | Execution time, 160 | Execution time, 600 | Speed average, 160 | Speed average, 600 |
|---|---|---|---|---|
| Mask Scan | 310.7 ms | 4.2 s | 6.04 µs/fault | 21.6 µs/fault |
| State Scan | 838.9 ms | 4.4 s | 16.3 µs/fault | 23.06 µs/fault |
| Time Mux | 24 ms | 99 ms | 0.47 µs/fault | 0.51 µs/fault |

5. CONCLUSIONS

This paper presents a new solution for improving the performance of transient fault emulation in platform FPGAs. An autonomous transient fault emulation system is proposed, which executes in the FPGA most of the tasks involved in the fault injection campaign, while the remaining modules are software applications running on a host PC. Flexibility is maintained because software is still in charge of setting up the fault injection campaign, while the fault emulation process is performed autonomously within the FPGA once the fault list has been generated by the host PC. A detailed analysis of the results is possible because the emulation process stores all the required information (fault classification, fault latency, etc.) in the board RAM, which is finally uploaded to the host PC.

In this work, three fault emulation techniques have been presented and compared. The results obtained show that the proposed autonomous system provides better execution times than the simulation-based solution and previous works on fault emulation. Enhancements of three orders of magnitude have been achieved with respect to fault simulation, and of two orders of magnitude with respect to the instrumented circuit technique.

Comparing the three proposed techniques in terms of execution time, the best solution depends on the circuit under test and on the testbench length, but in most cases the Time-Multiplexed technique is one order of magnitude faster than the other two. The State-Scan technique is better for circuits with a small number of flip-flops and long testbenches, and the Mask-Scan technique is better in the opposite case. Regarding resource usage, the Time-Multiplexed technique requires the largest amount of FPGA resources and the smallest RAM size, while the State-Scan technique has the smallest area overhead in the circuit under evaluation.

The overall results show that this system is a time- and cost-effective solution for transient fault emulation, thanks to the popularization of low-cost platform FPGAs with a large amount of available resources.

6. REFERENCES

[1] International Technology Roadmap for Semiconductors, 2001 Edition.

[2] F. Alcalá, L. Berrojo, F. Ortega. "Fault Tolerant VHDL Architectures for Space Applications". Proceedings of the VHDL Users Forum. Toledo, Spain. 1997.

[3] J. Karlsson, P. Folkesson, J. Arlat, Y. Crouzet, G. Leber, J. Reisinger. "Application of Three Physical Fault Injection Techniques to the Experimental Assessment of the MARS Architecture". Int. Working Conf. on Dependable Computing for Critical Applications. Illinois, USA. Sept. 1995. pp. 150-161.

[4] E. Jenn, J. Arlat, M. Rimen, J. Ohlsson, J. Karlsson. "Fault Injection into VHDL Models: the MEFISTO Tool". FTCS-24, Int. Symp. on Fault Tolerant Computing. 1994. pp. 66-75.

[5] Helena Krupnova, G. Saucier. "FPGA-based Emulation: Industrial and Custom Prototyping Solutions". FPL 2000, Field Programmable Logic and Applications: the roadmap to reconfigurable computing. Villach, Austria. August 2000.

[6] L. Antoni, R. Leveugle, B. Feher. "Using Run-Time Reconfiguration for Fault Injection in HW Prototypes". IEEE Int. Symp. on Defect and Fault Tolerance in VLSI Systems, 2002. pp. 245-253.

[7] P. Civera, L. Macchiarulo, M. Rebaudengo, M. Sonza, M. Violante. "Exploiting FPGA for accelerating Fault Injection Experiments". Int. On-Line Testing Workshop. Taormina, Sicily (Italy). July 2001. pp. 9-13.

[8] M. Sonza Reorda, M. Violante. "Emulation-Based Analysis of Soft Errors in Deep Submicron Circuits". FPL 2003, Field Programmable Logic and Applications. Lisbon, Portugal. August 2003.

[9] F. Corno, M. Sonza Reorda, G. Squillero. "RT-Level ITC'99 benchmarks and first ATPG results". IEEE T. on Design and Test of Computers, pp. 44-53, July-August 2000.