SINGLE EVENT EFFECT (SEE) ANALYSIS, TEST & MITIGATION OF THE XILINX VIRTEX-II INPUT OUTPUT BLOCK (IOB)

Mathew Napier1, Jason Moore2, Sana Rezgui2, C. Carmichael2, J. George3, G. Swift4

1. Sandia National Laboratories, Albuquerque, NM 87122
2. Xilinx, San Jose, CA 95124
3. The Aerospace Corporation, El Segundo, CA, USA
4. Jet Propulsion Laboratory / Caltech, Pasadena, CA, USA

I. INTRODUCTION

SRAM-based Field Programmable Gate Arrays (FPGAs) are well suited to applications ranging from DSP to networking to video. Their reconfigurability makes them a natural fit for systolic processing, and their extremely high performance is rivaled only by Application Specific Integrated Circuits (ASICs).

The QPro Virtex-II Rad-Tolerant family of FPGAs extends this usability from terrestrial and avionics applications to the space environment. While these devices have excellent Total Ionizing Dose (TID, > 200 krad(Si)) and Single Event Latchup (SEL, immune to > 150 MeV-cm2/mg) performance, they are very sensitive to Single Event Upset (SEU).

This paper focuses on the SEU susceptibility and signal fidelity effects of various Triple Module Redundancy (TMR) mitigation schemes for the Input Output Block (IOB). These blocks connect the Xilinx internal fabric to outside devices that may use any of a number of I/O standards. This interconnect creates an interesting boundary: the transition from a fully TMR fabric design to a wide range of non-TMR, quasi-TMR or full-TMR interfaces.

II. DISCUSSION

The thrust to use SRAM-based FPGA systems in space has been driven predominantly by the fact that available SEU-immune devices do not have sufficient performance or flexibility to meet the demanding requirements and schedules of today’s space applications. The high-performance Xilinx Virtex-II SRAM-based FPGA, with its large gate and I/O counts, allows designers to implement systems that, cycle for cycle, are an order of magnitude more powerful, cheaper and faster than a system using conventional space-qualified parts. The major drawback, and the reason more programs are not using these parts, is that the SRAM-based configuration architecture of the Virtex-II makes it very susceptible to SEU.

A tremendous amount of work has been done to measure the cross section of the SRAM configuration bits [1,2] and the cross section of Single Event Functional Interrupts (SEFI) [2]. TMR mitigation techniques [4] have been designed, implemented and tested on the Virtex I, and testing is ongoing for the Virtex-II. This mitigation will virtually remove the SEU susceptibility of the system. The majority of the testing has focused on the internal fabric of the device; this includes, but is not limited to, Look-Up Tables (LUTs), Flip-Flops (FFs), routing and Block RAMs.

Fault Injection (FI) tests of the Virtex IOB have been conducted [5], but the main purpose of that work was to determine whether an input can be turned into an output by a single upset. The test was static and did not look at the dynamic operation of the IOB. I/O interfaces operating at speeds above 100 MHz are expected. With rise times of less than 1 ns, timing margins measured in picoseconds and input voltage thresholds of several hundred millivolts, a closer look at the overall operation of TMR output IOB structures in an SEU environment is necessary. The ability to redesign an FPGA’s internal fabric is very powerful. I/O interfaces, however, are not nearly as flexible once the board has been fabricated, so the decision on which I/O mitigation scheme to implement cannot be taken lightly.

Due to the wide range of I/O types, interfaces, timing, operating speeds, routing topologies and capacitive loading, the choice of which TMR scheme to implement involves tradeoffs that can determine the success or failure of overall system operation. These tradeoffs span SEU cross section, I/O utilization and signal fidelity, all of which need to be considered when determining which IOB mitigation to implement.

Xilinx’s TMR approach for output signals going to non-TMR structures that have only one input (for example, an address line to an SDRAM) is to output three voted copies of the signal and tie them together external to the chip. The voter circuit is able to detect if a signal is incorrect and tri-state the corrupted output so that it will not load the bus. This method falls short in that approximately 90 configuration bits lie after the voter; corrupting one of them can allow a failed I/O value to be driven onto the bus, creating contention. The reasoning behind the approach is that if one of the I/Os were to fail, the other two would overdrive the failed I/O and the circuit would continue to operate correctly. From a conceptual view this sounds very good, but from an electrical and signal fidelity view it is far from ideal. A conceptual drawing of this structure is shown in Figure 1.

Figure 1: Xilinx recommended TMR output structure
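
The voting and tri-state behavior described above, and the contention that results when a fault occurs after the voter, can be sketched in a few lines of Python. This is a simplified illustration, not the Xilinx implementation: the function names, the 3.3 V supply, the 25 ohm driver impedance and the assumption of equal pull-up and pull-down impedance are all hypothetical, chosen only to make the arithmetic concrete.

```python
from typing import List, Optional, Tuple

Driver = Tuple[int, bool]  # (logic value driven, output enabled)


def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority vote on single-bit values."""
    return (a & b) | (a & c) | (b & c)


def voted_outputs(a: int, b: int, c: int) -> List[Driver]:
    """One (value, enabled) pair per redundant output pin.

    A pin whose domain disagrees with the majority is tri-stated
    (enabled=False) so that it does not load the shared net.
    """
    m = majority(a, b, c)
    return [(v, v == m) for v in (a, b, c)]


def net_voltage(drivers: List[Driver],
                vcc: float = 3.3,      # hypothetical supply voltage
                r_out: float = 25.0    # hypothetical driver impedance, ohms
                ) -> Optional[float]:
    """DC estimate of the voltage on a net shared by the enabled drivers.

    Each enabled driver is modeled as an ideal source (vcc or 0 V) behind
    the same series resistance r_out, a deliberate simplification.
    Returns None when every driver is tri-stated (floating net).
    """
    g_total = 0.0   # summed conductance of enabled drivers
    i_total = 0.0   # summed source_voltage * conductance
    for value, enabled in drivers:
        if not enabled:
            continue
        g = 1.0 / r_out
        g_total += g
        i_total += g * (vcc if value else 0.0)
    if g_total == 0.0:
        return None
    return i_total / g_total


# Upset before the voter: domain C is wrong, so its pin is tri-stated.
pre_voter_fault = voted_outputs(1, 1, 0)
# Upset after the voter (modeled by hand): pin C is stuck enabled and low.
post_voter_fault = [(1, True), (1, True), (0, True)]
```

Under these assumptions, `net_voltage(pre_voter_fault)` stays at the full 3.3 V rail, while `net_voltage(post_voter_fault)` drops to about 2.2 V, two thirds of the rail. The two good drivers do overdrive the failed one logically, but with reduced margin and with contention current flowing, which illustrates why the electrical view is less comfortable than the conceptual one.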

This paper will outline several different studies that were conducted on the dynamic operation of this particular TMR IOB structure compared to a quasi TMR structure shown in Figure 2. The different studies include:

  • HyperLynx signal integrity simulation of the TMR output structure for various LVCMOS configurations, LVDS and LVDCI, for both normal operation and failure modes.
  • Bench tests and oscilloscope measurements of real circuits using the TMR output structures for several of the I/O types listed above, for both normal operation and failure modes.
  • Xilinx analysis of the actual IOB in the two TMR configurations to determine the exact configuration bit cross section of these failure types.
  • Beam testing of the two structures to see how they operate and compare for a select number of I/O types.
  • Fault injection testing of the two structures to see whether beam data, configuration bit cross section estimates and signal fidelity issues agree on which structure is better to use.
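
As a rough illustration of how a configuration bit count relates to a cross section, the sketch below uses the standard definition of cross section (observed events divided by particle fluence) and scales a per-bit value by the number of sensitive bits. The function names and all numeric inputs are hypothetical, not measured data. Treating every sensitive bit as equally likely to upset, and every such upset as causing a failure, is a simplifying assumption that tends toward an upper bound.

```python
def per_bit_cross_section(upsets: int, fluence: float, n_bits: int) -> float:
    """Per-bit cross section in cm^2/bit.

    upsets  : number of bit upsets observed during the exposure
    fluence : particle fluence in particles/cm^2
    n_bits  : number of bits exposed and monitored
    """
    if fluence <= 0 or n_bits <= 0:
        raise ValueError("fluence and n_bits must be positive")
    return upsets / (fluence * n_bits)


def structure_cross_section(sigma_bit: float, n_sensitive_bits: int) -> float:
    """Estimated cross section (cm^2) of a structure whose failure can be
    caused by an upset in any one of n_sensitive_bits configuration bits."""
    return sigma_bit * n_sensitive_bits


# Hypothetical inputs, for illustration only (not beam data).
sigma_bit = per_bit_cross_section(upsets=100, fluence=1e7, n_bits=1000)
# 90 matches the approximate post-voter bit count quoted in the text.
sigma_iob = structure_cross_section(sigma_bit, n_sensitive_bits=90)
```

An estimate of this kind is what the beam and fault injection results can be compared against: if the measured failure rate of an output structure is far below the bit-count estimate, many of the counted bits are evidently not critical to that failure mode.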

Figure 2: Quasi TMR Output Structure

III. CONCLUSION

The results of this paper will show that very critical tradeoffs need to be considered when determining which structure to use on any particular interface. The paper will recommend, for several different output interfaces, which TMR structure works best.

IV. REFERENCES

[1] C. Yui, G. Swift and C. Carmichael, “Single Event Upset Susceptibility Testing of the Xilinx Virtex II FPGA”, MAPLD ’02, Laurel, MD, 10-12 Sep. 2002.

[2] C. Yui, G. Swift, C. Carmichael, R. Koga and J. George, “SEU Mitigation Testing of Xilinx Virtex II FPGAs”, NSREC ’03, Monterey, USA, 21-25 Jul. 2003.

[3] R. Koga, J. George, G. Swift, C. Yui, C. Carmichael, T. Langley, P. Murray, K. Lanes and M. Napier, “Comparison of Xilinx Virtex-II FPGAs SEE sensitivities to Protons and Heavy Ions”, RADECS ’03, The Netherlands, 16-19 Sep. 2003.

[4] C. Carmichael, “XAPP197: Triple Module Redundancy Design Techniques for Virtex FPGAs”, Xilinx, 1 Nov. 2001.

[5] N. Rollins, M. Wirthlin, M. Caffrey and P. Graham, “Reliability of Programmable Input/Output Pins in the Presence of Configuration Upsets”, MAPLD ’02, Laurel, MD, 10-12 Sep. 2002.