Concurrent Error Detection in ALU's by Recomputing.


Concurrent error detection is an important issue in fault-tolerant VLSI systems. Without this kind of protection, a single-bit error can propagate through the whole system and cause many further errors. With a concurrent error detection scheme, the system can detect errors while it continues normal operation.

Errors are usually detected with either hardware redundancy or time redundancy. Hardware redundancy means a duplex system, in which the outputs of duplicated hardware are compared. Time redundancy instead repeats the operation (for example a bit-wise logical operation or an addition) in a different way so that errors can be detected. In the first computation, the normal operands are applied. In the recomputation step, the operands are encoded, and the correct result is obtained after decoding; a mismatch between the two results indicates an error. RESO and RERO are two such schemes.

RESO-k (Re-computing with Shifted Operands by k bits) [1] is the best-known time redundancy method. Every arithmetic and logical operation is performed twice: once with the normal operands and once with the operands shifted by k bits. This method can detect k consecutive errors in logical operations and (k-1) errors in arithmetic operations.
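
The two passes of RESO-k can be sketched in Python as follows. This is a minimal illustration, not the circuit from [1]: the names alu and reso_check are hypothetical, n is assumed to be the operand width, the ALU is assumed to be n+k bits wide so the shifted operands fit, and a fault is assumed to be a single output bit stuck at 0 or 1 (a real faulty bit slice could also corrupt the carry chain).

```python
def alu(op, a, b, width, stuck_bit=None, stuck_value=0):
    """Toy ALU: op(a, b) truncated to `width` bits.

    Assumption: a fault is modeled as one output bit stuck at 0 or 1.
    """
    ops = {"and": a & b, "or": a | b, "add": a + b}
    result = ops[op] & ((1 << width) - 1)
    if stuck_bit is not None:
        if stuck_value:
            result |= 1 << stuck_bit      # force the faulty bit to 1
        else:
            result &= ~(1 << stuck_bit)   # force the faulty bit to 0
    return result


def reso_check(op, a, b, n, k, stuck_bit=None, stuck_value=0):
    """Run op on n-bit operands twice, RESO-k style.

    Returns (first_result, recomputed_result, error_detected).
    """
    width = n + k                 # room for operands shifted left by k
    low_n = (1 << n) - 1
    # Pass 1: normal operands.
    r1 = alu(op, a, b, width, stuck_bit, stuck_value) & low_n
    # Pass 2: operands shifted left by k, result shifted back right by k.
    shifted = alu(op, a << k, b << k, width, stuck_bit, stuck_value)
    r2 = (shifted >> k) & low_n
    # The same faulty bit slice hits different result bits in each pass,
    # so a visible error makes the two results disagree.
    return r1, r2, r1 != r2


if __name__ == "__main__":
    # Fault-free run, then the same operation with output bit 0 stuck at 0.
    print(reso_check("and", 0b10110111, 0b11011101, n=8, k=2))
    print(reso_check("and", 0b10110111, 0b11011101, n=8, k=2,
                     stuck_bit=0, stuck_value=0))
```

Under this simplified fault model, a stuck bit at position s corrupts bit s of the first result but bit s-k of the decoded second result, which is why the comparison exposes it.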

RERO-k (Re-computing with Rotated Operands by k bits) [2] is another method. The scheme is very similar to RESO-k, but the operands are rotated instead of shifted. This method can detect (k mod n) consecutive errors in logical operations and (k mod (n+1)-1) consecutive errors in arithmetic operations.
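
A comparable sketch for RERO-k, restricted to bit-wise logical operations, is given below. It rests on the same assumptions as before and should not be read as the design from [2]: the names rotl, logic_unit and rero_check are hypothetical, n is assumed to be the operand width, and a fault is assumed to be one output bit stuck at 0 or 1. Arithmetic operations are left out because handling the carry across the rotation boundary needs extra logic that this sketch does not model.

```python
def rotl(x, k, n):
    """Rotate the n-bit value x left by k positions (negative k rotates right)."""
    k %= n
    return ((x << k) | (x >> (n - k))) & ((1 << n) - 1)


def logic_unit(op, a, b, n, stuck_bit=None, stuck_value=0):
    """Toy n-bit logic unit with an optional stuck-at output bit (assumed fault model)."""
    result = {"and": a & b, "or": a | b, "xor": a ^ b}[op]
    result &= (1 << n) - 1
    if stuck_bit is not None:
        if stuck_value:
            result |= 1 << stuck_bit      # force the faulty bit to 1
        else:
            result &= ~(1 << stuck_bit)   # force the faulty bit to 0
    return result


def rero_check(op, a, b, n, k, stuck_bit=None, stuck_value=0):
    """Run a logical op twice, RERO-k style.

    Returns (first_result, recomputed_result, error_detected).
    """
    # Pass 1: normal operands.
    r1 = logic_unit(op, a, b, n, stuck_bit, stuck_value)
    # Pass 2: operands rotated left by k, result rotated back right by k.
    rotated = logic_unit(op, rotl(a, k, n), rotl(b, k, n), n,
                         stuck_bit, stuck_value)
    r2 = rotl(rotated, -k, n)
    return r1, r2, r1 != r2


if __name__ == "__main__":
    # Output bit 0 stuck at 0, with a rotation of 3 and then of 8 (= n).
    print(rero_check("and", 0b10110111, 0b11011101, n=8, k=3,
                     stuck_bit=0, stuck_value=0))
    print(rero_check("and", 0b10110111, 0b11011101, n=8, k=8,
                     stuck_bit=0, stuck_value=0))
```

Unlike shifting, rotation needs no extra k bits of width in this sketch. When k is a multiple of n the rotation is the identity, so both passes see the same fault and nothing is detected, which matches the k mod n term in the detection capability stated above.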