The logic density and growing application domain of FPGAs have led to increased interest in FPGA device test and fault recovery. In this work we develop a series of logic and interconnect test approaches for commercial cluster-based FPGA architectures. If a permanent fault is located with these approaches, a series of fault recovery approaches is applied to return the device to functionality.

FPGA Test Algorithms
We have developed a series of algorithms to test both logic and interconnect in cluster-based FPGA devices. These algorithms incrementally reconfigure the FPGA to test a logic cluster and its associated interconnect, using testers located in logic that is already known to be fault-free. Our approaches have been shown to provide nearly complete fault coverage with a minimum number of device configurations.
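The incremental idea can be pictured with a toy model. The sketch below is not one of the published algorithms: the grid of clusters, the four-neighbour adjacency rule, the fault-free seed cluster, and the name `plan_incremental_test` are all assumptions made for illustration. Each simulated configuration tests the untested clusters adjacent to testers that the previous configuration proved fault-free.

```python
def plan_incremental_test(rows, cols, faulty, seed=(0, 0)):
    """Toy wavefront schedule over a rows x cols grid of logic clusters.

    `faulty` is the (hypothetical) set of faulty cluster coordinates that
    the simulated test uncovers. `seed` is assumed to be fault-free.
    Returns (number of configurations, known-good set, faulty set found).
    """
    if seed in faulty:
        raise ValueError("the seed cluster must be fault-free")
    known_good = {seed}
    found_faulty = set()
    frontier = [seed]          # testers validated by the previous step
    configurations = 0
    while frontier:
        # Collect untested clusters adjacent to the current testers.
        candidates = set()
        for r, c in frontier:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                n = (r + dr, c + dc)
                in_grid = 0 <= n[0] < rows and 0 <= n[1] < cols
                if in_grid and n not in known_good and n not in found_faulty:
                    candidates.add(n)
        if not candidates:
            break
        configurations += 1    # one reconfiguration tests all candidates
        frontier = []
        for n in sorted(candidates):
            if n in faulty:
                found_faulty.add(n)
            else:
                known_good.add(n)
                frontier.append(n)
    return configurations, known_good, found_faulty


configs, good, bad = plan_incremental_test(3, 3, faulty={(1, 1)})
print(configs, len(good), bad)
```

In this model a faulty cluster is never used as a tester, so clusters reachable only through faulty neighbours stay untested; a real test flow would need to handle that case.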
  • P. Menon, W. Xu, and R. Tessier, Design-Specific Path Delay Testing in Lookup Table-based FPGAs, in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, accepted/to appear.
  • I. G. Harris and R. Tessier, Testing and Diagnosis of Interconnect Faults in Cluster-Based FPGA Architectures, in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 11, November 2002, pp. 1337-1343.
  • I. G. Harris, P. Menon, and R. Tessier, BIST-Based Delay Path Testing in FPGA Architectures, in the Proceedings of the International Test Conference, Baltimore, Maryland, October 2001.
  • I. G. Harris and R. Tessier, Diagnosis of Interconnect Faults in Cluster-Based FPGA Architectures, in the Proceedings of the International Conference on Computer Aided Design, San Jose, California, November 2000.
  • I. G. Harris and R. Tessier, Interconnect Testing in Cluster-Based FPGA Architectures, in the Proceedings of the 37th Design Automation Conference, Los Angeles, California, June 2000.

Fault Tolerance
In recent years the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain hardware redundancy to allow continued operation in the presence of operational faults, there is a great need to recover faulty hardware and return it to full functionality quickly and efficiently. In addition to providing functional density, FPGAs provide a level of fault tolerance generally not found in mask-programmable devices: they can be reconfigured around operational faults in the field.
In this work, we describe incremental CAD techniques that allow functional recovery of FPGA design configurations in the presence of single or multiple operational faults. Our preferred approach to fault recovery takes advantage of the device routing hierarchy in architectural families such as Xilinx Virtex and Altera Apex to quickly swap unused logic and routing resources in place of faulty ones within logic clusters. These algorithms can be implemented in a straightforward way within a local fault-tolerant system, without the need to access a remote processing location. If initial recovery attempts through localized swapping fail, an incremental router based on the widely used PathFinder maze routing algorithm can be applied remotely in an attempt to form connections between newly allocated logic and interconnect, based on the history of the initial design route.
To illustrate the practicality of this approach, we have recently integrated fault recovery with networked systems. If a fault is discovered on a remote embedded system, fault information can be transferred over a network to a more computationally capable reconfiguration server. Following fault recovery, a replacement bitstream is returned to the remote system over the network. An additional recent work examines the tradeoffs between power consumption and fault tolerance for VLSI counters.
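The two-stage recovery flow described in this section (local swap first, remote rerouting as a fallback) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Cluster` record, the `recover` function, and the `remote_reroute` callback standing in for the reconfiguration server are all hypothetical names, and a cluster is reduced to sets of logic-element indices.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    used: set                 # logic elements occupied by the design
    spare: set                # unused logic elements in the same cluster
    faulty: set = field(default_factory=set)


def recover(cluster, faulty_element, remote_reroute):
    """Return (strategy, replacement) after a fault in `faulty_element`.

    Stage 1: swap in an unused, fault-free element of the same cluster.
    Stage 2: if no spare exists, hand the problem to `remote_reroute`,
    a hypothetical callback representing the networked server.
    """
    cluster.faulty.add(faulty_element)
    cluster.spare.discard(faulty_element)
    if faulty_element not in cluster.used:
        return ("unaffected", None)   # the fault hit an unused resource
    candidates = sorted(cluster.spare - cluster.faulty)
    if candidates:
        replacement = candidates[0]
        cluster.used.remove(faulty_element)
        cluster.used.add(replacement)
        cluster.spare.remove(replacement)
        return ("local", replacement)
    if remote_reroute(cluster, faulty_element):
        return ("remote", None)
    return ("failed", None)


c = Cluster(used={0, 1}, spare={2, 3})
print(recover(c, 1, remote_reroute=lambda cl, el: True))
```

Trying the local swap first keeps recovery on the embedded system whenever a spare exists; only the harder cases incur a network round trip.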
  • W. Xu, R. Ramanarayanan, and R. Tessier, Adaptive Fault Tolerance for Networked Reconfigurable Systems, in the Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, Napa, California, April 2003.
  • A. Maheshwari, W. Burleson, and R. Tessier, Trading Off Reliability and Power Consumption in Ultra-Low Power Systems, in the Proceedings of the International Symposium on Quality Electronic Design, San Jose, California, March 2002.
  • V. Lakamraju and R. Tessier, Tolerating Operational Faults in Cluster-based FPGAs, in the Proceedings of the 8th International ACM/SIGDA Symposium on Field Programmable Gate Arrays, Monterey, California, February 2000.