# A Fault-Tolerant Technique using Quadded Logic and Quadded Transistors

Jie Han, Member, IEEE, Eugene Leung, Leibo Liu and Fabrizio Lombardi, Fellow, IEEE

Abstract-Advances in CMOS technology have made digital circuits and systems very sensitive to manufacturing variations, aging and/or soft errors. Fault-tolerant techniques using hardware redundancy have been extensively investigated for improving reliability. Quadded Logic (QL) is an interwoven redundant logic technique that corrects errors by switching them from critical to subcritical status; however, QL cannot correct errors in the last one or two layers of a circuit. In contrast to QL, the quadded transistor (QT) technique corrects errors while performing the function of a circuit. In this paper, a technique that combines QL with QT is proposed to take advantage of both techniques. The proposed QLQT technique is evaluated and compared with other fault-tolerant techniques such as triple modular redundancy (TMR) and triple interwoven redundancy (TIR), using stochastic computational models (SCMs). Simulation results show that QLQT has a better reliability than the other fault-tolerant techniques (except in the very restrictive case of small circuits with low gate error rates and very short paths from primary inputs to primary outputs). These results provide a new insight for implementing efficient fault-tolerant techniques in the design of reliable circuits and systems.

*Index Terms*—Redundancy, Quadded Logic, Quadded Transistor, Fault Tolerance, Reliability, Soft Error.

# I. INTRODUCTION

ADVANCES in fabrication technology have made integrated circuits (ICs) more prone to manufacturing variations, aging and/or soft errors [1]. This is of significant concern in many safety-critical applications, such as medical devices and aerospace applications. Fault-tolerance has become an integral part of digital circuit and system design [2].

Hardware-redundant techniques have been investigated for mitigating the effect of failures. The most common technique is *triple modular redundancy* (TMR), as a particular case of *N*-*tuple modular redundancy* (NMR) [3]. *Quadded logic* (QL) is a fault-tolerant technique in which a quadrupled number of gates are connected in a systematic manner, such that alternating layers of logic gates correct single errors in one or two layers (see [4] for a review). However, QL is only applicable to some gates, and cannot correct errors in the last one or two layers of a circuit. The so-called *triple/N-tuple interwoven redundancy scheme* (TIR/NIR) has been proposed to allow some randomness in the interconnect pattern of a TMR/NMR [5]. A comparison of these techniques has been performed on a case study of half adders [4].

Since soft errors are likely to affect a circuit on a temporary basis, a time-redundant soft error-tolerant technique has been proposed in [6]. However, TMR has been shown to be vulnerable to multiple bit errors in FPGA devices [7], and it may not work very well for highly unreliable nanoscale technologies [8, 9]. Recently, a *quadded-transistor* (QT) technique has been proposed for tolerating permanent defects in digital circuits [10]. In the QT technique, every transistor in a circuit is replaced by four transistors and any single transistor error in the quadruple can be tolerated. However, gate capacitance is also quadrupled, thus delay is increased.

In this paper, a novel fault-tolerant technique is proposed by combining quadded logic with quadded transistors implemented at the last layer of a circuit. In this technique, quadded transistors replace every transistor in the gates that produce the circuit outputs, while quadded logic is implemented for the remaining circuit. This implementation is therefore referred to as a *QLQT* technique and it takes advantage of both QL and QT. In QLQT, QT implements the logic function of a gate and simultaneously serves as a voter or arbiter. Hence, no additional voter is needed in QLQT (whereas a QL circuit requires one for determining the correct output). The QT voters in QLQT are also fault-tolerant, i.e., they do not have the hard-core nature of the voters in TMR. The proposed QLQT technique is evaluated using stochastic computational models (SCMs) [11-13] and compared to TMR, TIR and QL through an extensive simulation of benchmark circuits. It is shown that in most cases, the proposed QLQT performs the best in terms of reliability compared to the other fault-tolerant techniques.

# II. REVIEW

## A. Triple Modular Redundancy (TMR)

Manuscript received October 17, 2013. This work was supported in part by the University of Alberta China Opportunity Fund and the ITC endowment at Northeastern University. ©2014 IEEE

J. Han and E. Leung are with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada T6G 2V4. (e-mail: {jhan8, eugenel}@ualberta.ca).

L. Liu is with the Institute of Microelectronics, Tsinghua University, Beijing 100084, China. (e-mail: liulb@tsinghua.edu.cn).

F. Lombardi is with the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA. (e-mail: lombardi@ece.neu.edu).

TMR is the most common and simplest case of NMR. In TMR, each module is replicated by three functionally identical modules and the outputs of the modules are voted through a majority voter. TMR is good at tolerating any single fault in a module (but not in the voters due to their hard core nature). For a constant component failure rate, an increase of the module size increases the probability of having multiple faults; however, a decrease in module size also results in the use of more voters and possibly a lower reliability. Hence, the reliability of TMR is dependent on the size of the modules and the voting process (inclusive of the design of the voter).
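The module-size trade-off described for TMR can be illustrated with the textbook two-out-of-three reliability expression. The sketch below is not the evaluation method used in this paper (which relies on SCMs); it assumes independent gate failures, assumes that any gate error makes a module fail, and models each voter as a single gate with the same error rate. The function names and the 120-gate example are hypothetical.

```python
def module_reliability(eps: float, gates: int) -> float:
    """Probability that none of `gates` independent gates fails (assumed model)."""
    return (1.0 - eps) ** gates


def tmr_reliability(r_module: float, r_voter: float = 1.0) -> float:
    """At least two of the three modules are correct, and the voter is correct."""
    return r_voter * (3 * r_module**2 - 2 * r_module**3)


def partitioned_tmr(eps: float, total_gates: int, modules: int) -> float:
    """Split a circuit into equal modules, each triplicated with its own voter."""
    r_mod = module_reliability(eps, total_gates // modules)
    r_vot = module_reliability(eps, 1)  # assumption: a voter fails like one gate
    return tmr_reliability(r_mod, r_vot) ** modules


# Hypothetical 120-gate circuit at a gate error rate of 1e-3.
for k in (1, 12, 120):
    print(k, "module(s):", round(partitioned_tmr(1e-3, 120, k), 4))
```

Under these assumptions, neither one large module nor one module per gate gives the best result; an intermediate partition does, which is the dependence on module size and voter count noted above.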

## B. Triple Interwoven Redundancy (TIR)

TIR [4] is a generalization of TMR with random interconnections between replicas. In general, the reliability of TIR is comparable to TMR; however in some cases, the effects of a single error can spread and affect multiple modules due to the interwoven nature of TIR. This is deleterious for reliability, because in TMR, an error is confined to a module. However, the randomness in the TIR interconnections may be beneficial for a physical implementation of nanoscale circuits.

## C. Quadded Logic (QL)

QL [4] corrects errors by switching them from a critical to a subcritical status. The classification of an error status is related to the effects of an input to the output of a gate. For example, a 1->0 error at the input of a NAND gate causes the output to be stuck at 1, therefore this is said to be a critical error. Since a 0->1 error at the input of the same gate may or may not cause an error at the output, this is said to be a subcritical error. The alternation of layers of NAND gates in a circuit can correct errors in one or two layers, while at the same time performing the logic function. The error-correction mechanism of QL is dictated by a simple interconnect rule: the interconnect pattern at the output of a quadruple must be different from the pattern of any of its input variables. Using this interconnection strategy, QL can correct any single error and many multiple errors as long as they do not interact with each other. However, any error at the last layer of the circuit and critical errors at the second-to-last layer cannot be corrected in a QL-implemented circuit. This remains a significant disadvantage of QL.
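The distinction between critical and subcritical errors can be checked on a single NAND gate. The following is a minimal sketch, assuming a simplified QL gate that receives two redundant copies of each logical input (so a 2-input NAND becomes a 4-input NAND); it does not model the interconnect patterns of a full QL circuit, and the helper names are hypothetical.

```python
def nand(*inputs: int) -> int:
    """NAND of any number of 0/1 inputs."""
    return int(not all(inputs))


def corrupted_cases(intended_a: int, faulty_a: int) -> list:
    """Values of b for which an error on ONE copy of a changes the gate output."""
    bad = []
    for b in (0, 1):
        correct = nand(intended_a, intended_a, b, b)
        faulty = nand(faulty_a, intended_a, b, b)  # only one copy of a is wrong
        if faulty != correct:
            bad.append(b)
    return bad


# Critical error (1->0): the output is forced to 1 whatever the other inputs are.
forced_outputs = {nand(0, 1, b, b) for b in (0, 1)}
critical = corrupted_cases(intended_a=1, faulty_a=0)

# Subcritical error (0->1): the healthy copy of a (still 0) masks the error.
subcritical = corrupted_cases(intended_a=0, faulty_a=1)

print("critical corrupts b in", critical, "forced outputs:", forced_outputs)
print("subcritical corrupts b in", subcritical)
```

In this simplified model the 1->0 error reaches the output whenever the correct output would have been 0, while the 0->1 error is masked by the redundant copy in every case.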

## D. Quadded Transistor (QT)

QT uses four transistors for the function of a single transistor [10]. As shown in Fig. 1, a transistor with input A is replaced by a four-transistor structure, which is logically equivalent to a function (A+A)(A+A). Therefore, an error in any single transistor can be tolerated by QT. Many double errors can also be tolerated as long as they do not occur in transistors placed in parallel. However, the gate capacitance of the QT structure is quadrupled and thus the replacement of every transistor with QT makes the circuit slower with an area overhead.



Fig. 1. (a) A transistor; (b) A quadded transistor structure [10].
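The single-error tolerance of the (A+A)(A+A) structure can be verified by exhaustive enumeration. The sketch below assumes a switch-level view in which a defective transistor is either stuck open (never conducts) or stuck short (always conducts); this fault model and all names are assumptions made for illustration.

```python
STUCK_OPEN, STUCK_SHORT = "open", "short"


def conducts(gate_on: bool, fault=None) -> bool:
    """Conduction of one transistor given its gate drive and an optional fault."""
    if fault == STUCK_OPEN:
        return False
    if fault == STUCK_SHORT:
        return True
    return gate_on


def quadded_conducts(a: bool, faults=(None, None, None, None)) -> bool:
    """(T0 parallel T1) in series with (T2 parallel T3), all driven by input a."""
    t = [conducts(a, f) for f in faults]
    return (t[0] or t[1]) and (t[2] or t[3])


def tolerated(faults) -> bool:
    """True if the faulty quad still behaves like one fault-free transistor."""
    return all(quadded_conducts(a, faults) == a for a in (False, True))


# Every single stuck-open or stuck-short fault in any of the four positions.
single_faults = [
    tuple(kind if i == pos else None for i in range(4))
    for pos in range(4)
    for kind in (STUCK_OPEN, STUCK_SHORT)
]
all_single_ok = all(tolerated(f) for f in single_faults)

print("all single faults tolerated:", all_single_ok)
print("two opens in parallel pair:", tolerated((STUCK_OPEN, STUCK_OPEN, None, None)))
print("two opens in different pairs:", tolerated((STUCK_OPEN, None, STUCK_OPEN, None)))
```

Under this model, two stuck-open transistors are tolerated only when they sit in different parallel pairs, which matches the restriction on double errors stated above.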

# III. QUADDED LOGIC WITH QUADDED TRANSISTORS (QLQT)

## A. Proposed QLQT Technique

To overcome the drawbacks of QL and QT, a hybrid design using QT in QL is proposed to enhance the gates that generate the primary outputs in a QL circuit. In a QLQT implementation of the benchmark C17 for example, the two NAND gates at the last logic layer are implemented using QT (Fig. 2).

In QLQT, any single error in the second-to-last layer of gates or in the last layer of transistors can be corrected by the QT circuits at the outputs. This provides a significant advantage over QL. However, a critical error at the third-to-last layer, which would be corrected in QL, is not necessarily corrected in a QLQT circuit; this is caused by the fanout of the subcritical errors induced at the second-to-last layer onto the last QT structures. Even so, these errors may not cause an erroneous output for two reasons: 1) the errors may propagate to two transistors that are not in parallel in QT, and 2) the errors may be corrected by other signals due to their subcritical nature. Therefore, the negative effects of QLQT are rather limited (as confirmed by simulation later). Table I shows a comparison between QL and QLQT for the effects of single errors at different layers.



Fig. 2. A QLQT implementation of C17.

TABLE I. COMPARISON BETWEEN QL AND QLQT

| Single error occurrence at      | Error type  | Error at the output? (QL) | Error at the output? (QLQT) |
|---------------------------------|-------------|---------------------------|-----------------------------|
| Last layer of gates/transistors | critical    | yes                       | no                          |
|                                 | subcritical | yes                       | no                          |
| 2nd last layer of gates         | critical    | yes                       | no                          |
|                                 | subcritical | no                        | no                          |
| 3rd last layer of gates         | critical    | no                        | maybe                       |
|                                 | subcritical | no                        | no                          |
| Are voters/arbiters needed?     | N/A         | yes                       | no                          |

## B. Comparison on Area, Power and Delay

For TMR and TIR, the area is (at least) tripled and so is the power consumption due to triplication of the gates. If voters are considered, the area and power are slightly larger than three times the original circuit, whereas the delay is only marginally larger (due to the presence of the voters).

QL requires four times as many gates as in the original circuit and twice as many interconnects, i.e., each gate in QL has twice the number of inputs as in the non-redundant circuit. Therefore, the number of transistors in QL is eight times the original circuit. If gate sizing is considered for the same delay and power, the required area is larger and the power consumption is no less than four times the original circuit.

The measures for QLQT are similar, but slightly less than for QL due to the use of quadded transistors in the last layer of the circuit. The number of transistors in QT is half of QL and so is the area (if transistor sizing is constant). Since circuit delay is dominated by the load capacitance, the delay in QL is at least twice as large as in the original circuit due to the fanout of signals into two different gates, whereas the delay in QLQT is slightly smaller than in QL. This is due to the similar delays that the quadded transistors incur as the original logic gates would have in the last layer. However, QLQT does not require additional transistors for the voters or arbiters that would be needed in a QL circuit. The impacts on area, power and delay are quantitatively summarized in gross terms in Table II.

TABLE II. AREA, POWER AND DELAY OVERHEAD OF DIFFERENT FAULT-TOLERANT TECHNIQUES COMPARED TO NONREDUNDANT DESIGN.

|      | AREA | POWER | DELAY |
|------|------|-------|-------|
| TMR  | >3×  | >3×   | >1×   |
| TIR  | >3×  | >3×   | >1×   |
| QL   | >8×  | >4×   | >2×   |
| QLQT | >8×  | >4×   | >2×   |
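The transistor counts behind the area figures can be reproduced with simple arithmetic. This sketch assumes static CMOS NAND gates with one pMOS and one nMOS transistor per input, and it ignores voters and transistor sizing; the function names are hypothetical.

```python
def cmos_transistors(fan_in: int) -> int:
    """Static CMOS NAND: one pMOS plus one nMOS per input (assumption)."""
    return 2 * fan_in


def ql_transistors(fan_in: int) -> int:
    """QL: four copies of the gate, each with twice the number of inputs."""
    return 4 * cmos_transistors(2 * fan_in)


def qt_transistors(fan_in: int) -> int:
    """QT: every transistor of the gate is replaced by four transistors."""
    return 4 * cmos_transistors(fan_in)


def tmr_transistors(fan_in: int) -> int:
    """TMR: three copies of the gate, voters not counted."""
    return 3 * cmos_transistors(fan_in)


base = cmos_transistors(2)  # a 2-input NAND
for name, count in (("TMR", tmr_transistors(2)),
                    ("QL", ql_transistors(2)),
                    ("QT", qt_transistors(2))):
    print(f"{name}: {count} transistors, {count / base:.0f}x the original gate")
```

For a 2-input NAND this gives 8x for QL and 4x for QT, consistent with the statement that a QT gate uses half as many transistors as its QL counterpart.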

# IV. EXPERIMENTAL STUDY

A number of LGSynth'91 [14] and ISCAS'85 benchmark circuits are simulated for comparison purposes to investigate the reliability of TMR, TIR, QL and QLQT. To reduce the number of fanins to the input of a gate in QL and QLQT, SIS [15] is used for synthesizing and optimizing the benchmark circuits by utilizing only inverters and 2-input NAND gates.

The voters of TMR/TIR are implemented by a majority gate. The same majority voters are also implemented for the circuit outputs in QL. These voters output a 1 (0) if there are three or more 1(0)'s at the inputs, else 0 (1). So, one of the four outputs is allowed to be faulty without affecting the correctness of the final output. However, all voters as part of the circuit are considered unreliable and subject to errors. Stochastic computational models (SCMs) [11, 13] are used for reliability evaluation. For QT, a method similar to [12] is used for evaluating the reliability of a transistor-based circuit. To be consistent with a gate-level SCM (for QL), estimates are made, such that when both pull-up and pull-down networks are on or off, the output is considered to be stuck at 1 or 0. The mapping of the transistor state to the output is given in Table III.

TABLE III. MAPPING OF PULL-UP/DOWN NETWORK STATE TO OUTPUT IN A QUADDED-TRANSISTOR STRUCTURE

| Pull-up | Pull-down | Output   | Estimated<br>Output |
|---------|-----------|----------|---------------------|
| on      | on        | unknown  | 1                   |
| on      | off       | 1        | 1                   |
| off     | off       | floating | 0                   |
| off     | on        | 0        | 0                   |
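The mapping in Table III amounts to letting the estimated output follow the state of the pull-up network. A minimal sketch of that lookup, applied to a 2-input NAND gate, is given below; the network-level fault flags are a simplification introduced here for illustration and are not the transistor-level stochastic model of [12].

```python
def estimated_output(pull_up_on: bool, pull_down_on: bool) -> int:
    """Table III: map the pull-up/pull-down network state to a logic estimate."""
    table = {
        (True, True): 1,    # both on: actual output unknown, estimated as 1
        (True, False): 1,
        (False, False): 0,  # both off: actual output floating, estimated as 0
        (False, True): 0,
    }
    return table[(pull_up_on, pull_down_on)]


def nand_networks(a: int, b: int, up_stuck_on=False, down_stuck_on=False):
    """Network states of a CMOS NAND; the stuck-on flags are hypothetical faults."""
    pull_up = bool((not a) or (not b) or up_stuck_on)   # pMOS in parallel
    pull_down = bool((a and b) or down_stuck_on)        # nMOS in series
    return pull_up, pull_down


for a in (0, 1):
    for b in (0, 1):
        healthy = estimated_output(*nand_networks(a, b))
        faulty = estimated_output(*nand_networks(a, b, down_stuck_on=True))
        print(a, b, "fault-free:", healthy, "pull-down stuck on:", faulty)
```

With this mapping a fault-free gate always produces its normal logic value, and the contested states (both networks on or both off) are resolved to a definite 1 or 0 so that they can be handled by a gate-level model.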

# V. SIMULATION RESULTS

## A. Individual Benchmark Simulation

Simulation results are reported in Figs. 3-5 in ascending order of circuit size. These circuits are equivalent to functional modules of different sizes for implementing the redundancy techniques. Reliability is defined as the joint probability that all outputs are correct for a circuit.
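As an illustration of this reliability definition, the joint probability can be computed exactly for the non-redundant C17 circuit by enumerating every gate-error pattern. This is a brute-force sketch rather than the SCM approach used in the paper; it assumes that each gate output is flipped independently with probability eps, that all 32 input vectors are equally likely, and that the netlist is the commonly distributed six-NAND version of C17.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def c17(inputs, flips=(0, 0, 0, 0, 0, 0)):
    """Evaluate C17; flips[i] = 1 inverts the output of gate i (a gate error)."""
    n1, n2, n3, n6, n7 = inputs
    n10 = nand(n1, n3) ^ flips[0]
    n11 = nand(n3, n6) ^ flips[1]
    n16 = nand(n2, n11) ^ flips[2]
    n19 = nand(n11, n7) ^ flips[3]
    n22 = nand(n10, n16) ^ flips[4]
    n23 = nand(n16, n19) ^ flips[5]
    return (n22, n23)


def joint_reliability(eps: float) -> float:
    """Probability that BOTH outputs are correct, averaged over all inputs."""
    total = 0.0
    for inputs in product((0, 1), repeat=5):
        golden = c17(inputs)
        for flips in product((0, 1), repeat=6):
            if c17(inputs, flips) == golden:
                k = sum(flips)  # number of erroneous gates in this pattern
                total += eps**k * (1.0 - eps) ** (6 - k)
    return total / 32


for eps in (1e-4, 1e-3, 1e-2):
    print(eps, joint_reliability(eps))
```

Exhaustive enumeration is only practical for a circuit of this size; it is used here solely to make the definition concrete.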

Due to the small size of C17 (with only six gates), the use of redundancy is not justified as it may result in a less reliable structure with the unreliable voters. The reliability of the count circuit (with 179 gates) is plotted in Fig. 3 (a) and (b) for lower and higher ranges of gate error rates. It can be seen that TMR and TIR do not work well at a large gate error rate (such as 0.05). QLQT has the best reliability when the gate error rate is large, whereas in some cases, QL and QLQT are less reliable than TMR and TIR. This is caused by the short data paths in this circuit, such that some errors cannot be corrected before reaching the outputs. Similar considerations also apply to the majority circuit (with 16 gates) and C1908 (with 816 gates).

Fig. 4 shows the circuit reliability of C880 (520 gates). For this circuit (as well as mux, C432 and alu2), QLQT has the best performance. In most cases QL is the second most effective technique; however, it is not as reliable as TMR when the gate error rate is lower than approximately  $10^{-4}$ . TIR performs slightly worse than TMR due to the fanouts of errors in TIR.

Fig. 5 shows the reliability of C6288 (2399 gates). QL and QLQT have a clear advantage over TMR and TIR, especially when the gate error rate is large. For a circuit of this size, QL performs very well and its reliability is very close to QLQT. For the two triple redundancy techniques, TMR improves the circuit reliability, whereas TIR deteriorates it. This is due to the interwoven nature of TIR, i.e., errors can spread, whereas errors are confined in the same module in TMR. Similar behavior in reliability has also been observed for C3540.

## B. Summary of Benchmark Simulation Results

Figs. 6-8 summarize the reliability of the benchmark circuits. Three representative gate error rates of $10^{-4}$ (relatively low), $10^{-3}$ (medium) and $10^{-2}$ (relatively high) are considered. On the x-axis, the benchmark circuits are sorted in order of increasing size. Each inset shows a zoomed-in view of the high reliability region (i.e., for small benchmark circuits). Note that the three gate error rates are larger than the typical soft error rate of CMOS circuits and smaller than the defect rates in most emerging nanotechnologies. However, the gate error rates and different sizes of circuits are effective in evaluating these fault-tolerant techniques. The observations would also be applicable when these techniques are applied to larger circuits.



Fig. 3. Reliability comparison for count for a gate error rate: (a) in  $[10^{-5}, 10^{-3}]$  and (b) in  $[10^{-3}, 0.05]$ .



Fig. 4. Reliability comparison for C880 for a gate error rate: (a) in  $[10^{-5}, 10^{-3}]$  and (b) in  $[10^{-3}, 0.05]$ .



Fig. 5. Reliability comparison for C6288 for a gate error rate: (a) in  $[10^{-5}, 10^{-3}]$  and (b) in  $[10^{-3}, 0.05]$ .

*TMR/TIR*: In general, TMR and TIR are less reliable than QL and QLQT. At a high gate error rate and for a large circuit, the reliability of TMR and TIR is either marginally higher or even lower than the non-redundant circuit. TMR and TIR are good at correcting single faults within a module; however, multiple faults are more likely to occur in a larger circuit module at a higher error rate, thus leading to the overall marginal performance of the TMR and TIR techniques.

For smaller circuits, TMR and TIR show a similar reliability; in larger circuits, TIR is less reliable than TMR. In large circuits when multiple faults are more likely, TIR even shows a lower reliability than a non-redundant circuit. This is due to the structural difference between TMR and TIR, i.e. there are no interconnections between the three replicas in TMR; so if multiple faults occur in the same replica, they can still be masked at the output. However, this is not the case in TIR; interconnections can spread an error among replicas. Also as shown in the insets of Figs. 6 and 7, TMR and TIR can have a better reliability than QL and QLQT for some of the small circuits. This is explained in detail next.

*QL/QLQT*: In most cases, QL and QLQT show better reliabilities than non-redundant, TMR and TIR circuits. However, they also incur a larger area overhead than TMR and TIR. For some small circuits (as shown in the insets of Figs. 6 and 7), QL and QLQT are not as reliable as TMR and/or TIR. QL has the ability to correct single errors in two layers, but it also may spread the error into more than one gate before correcting it. Hence for circuits containing very short paths from the primary inputs to the primary outputs, such as majority, count and C1908, QL and QLQT are not very effective. Since the probability of having single errors in a short path is high, TMR and TIR could be viable. At a higher gate error rate, however, QL and QLQT are more reliable than TMR and TIR due to their better ability in handling multiple errors.



Fig. 6. Reliability of different fault-tolerant methods at a gate error rate of  $10^{-4}$  (low).



Fig. 7. Reliability of different fault-tolerant methods at a gate error rate of  $10^{-3}$  (medium).



Fig. 8. Reliability of different fault-tolerant methods at a gate error rate of  $10^{-2}$  (high).

# VI. CONCLUSION

This paper has proposed a novel fault-tolerant technique that uses both quadded logic and quadded transistors (QLQT). In the QLQT technique, QTs are implemented at the last layer of a circuit, while the remaining circuit is implemented by QL. Simulations have shown that the proposed QLQT technique improves QL by using QTs to implement functions of both the output gates and voters. The fault-tolerant QT circuits correct faults that occur in the last two logic layers, hence leading to a better reliability. Extensive simulations reveal insights with respect to the features and application scopes of these fault-tolerant techniques for reliable circuit and system design.

# REFERENCES

- [1] International Technology Roadmap for Semiconductors, 2012.
- [2] W. Rao, C. Yang, R. Karri, and A. Orailoglu, "Toward future systems with nanoscale devices: Overcoming the reliability challenge," *Computer* 44, no. 2 (2011): 46-53.
- [3] J.A. Abraham and D.P. Siewiorek, "An algorithm for the accurate reliability evaluation of triple modular redundancy networks." *IEEE Transactions on Computers*, no. 7, pp. 682-692, 1974.
- [4] J. Han, J. Gao, Y. Qi, P. Jonker and J. A. B. Fortes, "Toward hardware-redundant, fault tolerant logic for nanoelectronics," *IEEE Design and Test of Computers*, July-August 2005, pp. 328-339.
- [5] J. Han and P. Jonker, "From Massively Parallel Image Processors to Fault-Tolerant Nanocomputers," in *Proc. ICPR*, 2004, Vol. 3, pp. 2-7.
- [6] L. Anghel, D. Alexandrescu and M. Nicolaidis, "Evaluation of a soft error tolerance technique based on time and/or space redundancy." 13th IEEE Symp. on Integrated Circuits and Systems Design, pp. 237-242, 2000.
- [7] H. Quinn, K. Morgan, P. Graham, J. Krone, M. Caffrey, and K. Lundgreen, "Domain crossing errors: Limitations on single device triple-modular redundancy circuits in Xilinx FPGAs." *IEEE Transactions on Nuclear Science*, vol. 54, no. 6 (2007): 2037-2043.
- [8] T.J. Dysart, P.M. Kogge, "Reliability Impact of N-Modular Redundancy in QCA," *IEEE Tran. Nano.*, vol. 10, no. 5, pp.1015-1022, Sept. 2011.
- [9] J. Han, E.R. Boykin, H. Chen, J. Liang, J.A.B. Fortes, "On the Reliability of Computational Structures Using Majority Logic," *IEEE Transactions on Nanotechnology*, vol. 10, no. 5, pp.1099-1112, 2011.
- [10] A. H. El-Maleh, B. M. Al-Hashimi, A. Melouki and F. Khan, "Defect-tolerant N2-transistor structure for reliable nanoelectronic designs," *IET Computer and Digital Techniques*, vol. 3, issue 6, 2009, pp. 570-580.
- [11] H. Chen and J. Han, "Stochastic computational models for accurate reliability evaluation of logic circuits," in *GLSVLSI*, 2010, pp. 61-66.
- [12] H. Chen, J. Han and F. Lombardi, "A transistor-level stochastic approach for evaluating the reliability of digital nanometric cmos circuits," in *IEEE DFT 2011*, Vancouver, BC, Canada, pp. 60-67, 2011.
- [13] J. Han, H. Chen, J. Liang, P. Zhu, Z. Yang and F. Lombardi, "A stochastic computational approach for accurate and efficient reliability evaluation," *IEEE Transactions on Computers*, 2013.
- [14] S. Yang, "Logic synthesis and optimization benchmarks user guide version 3.0," Technical Report, Microelectronics Center of North Carolina, Research Triangle Park, NC, January 1991.
- [15] E. Sentovich et al., "SIS: A system for sequential circuit synthesis," Technical Report, Dept. of EECS, UC Berkeley, 1992.