# Feedback-Based Low-Power Soft-Error-Tolerant Design for Dual-Modular Redundancy

Yan Li, Yufeng Li, Han Jie, Jianhao Hu, Fan Yang, Xuan Zeng, Bruce Cockburn, and Jie Chen

Abstract—Triple-modular redundancy (TMR), which consists of three identical modules and a voting circuit, is a common architecture for soft-error tolerance. However, the original TMR suffers from two major drawbacks: the large area overhead and the vulnerability of the voter. To overcome these drawbacks, we propose a new complementary dual-modular redundancy (CDMR) scheme for mitigating the effect of soft errors. Inspired by the Markov random field (MRF) theory, a two-stage voting system is implemented in CDMR, including a first-stage optimal MRF structure and a second-stage high-performance merging unit. The CDMR scheme can reduce the voting circuit area by 20% while saving the area of one redundant module, achieving at least 26% error-rate reduction at an ultralow supply voltage of 0.25 V with 8.33% faster timing compared to previous voter designs.

*Index Terms*—Markov random field (MRF), soft-error tolerance, triple-modular redundancy (TMR).

## I. INTRODUCTION

Triple-modular redundancy (TMR) was first proposed by von Neumann et al. [1], and has since been adopted as a technique to improve error tolerance at the cost of increased circuit area. TMR can only tolerate soft errors when the probability of three or two modules failing simultaneously is much lower than that of a single module. However, one obvious drawback is the increased area overhead. Therefore, partial TMR [2] (PTMR) was proposed to reduce the area overhead by trading off reliability. The dual-modular redundancy (DMR) scheme presented in [3] uses a three-module structure with self-feedback. Robust C-elements [4] and multiplexers [5] are used, respectively, to form voters in two different DMR designs. An algorithmic noise-tolerant (ANT) technique [6] was proposed to solve the problem of soft errors caused by voltage overscaling. Algorithmic soft-error tolerance (ASET) [7] and fine-grain soft-error tolerance (FGSET) designs [8] are both extended ANT designs. The designs in [1]-[3] and [5]-[8] suffer from two drawbacks. First, they still incur a large area overhead. Second, reliability loss is incurred by soft errors in the voting design. The reason is that redundancies in [1]-[5] and estimator-based redundancies in [6]-[8] work well only when voters never fail, which might be an unrealistic assumption if the circuits are designed using a deep-submicrometer technology or an ultralow supply voltage is used. Under such conditions, it is likely that such a failure could occur in the voting circuit, which is a main cause of TMR failure [9]. For a

Manuscript received October 25, 2017; revised January 29, 2018; accepted March 13, 2018. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada, in part by the Alberta Innovates Funding, and in part by the China Scholar Council. (*Yan Li and Yufeng Li are co-first authors.*) (*Corresponding author: Jie Chen.*)

Y. Li and J. Hu are with the University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: lpgx1962@163.com; jhhu@uestc.edu.cn).

Y. Li, H. Jie, B. Cockburn, and J. Chen are with the University of Alberta, Edmonton, AB T6H 5J5 Canada (e-mail: yufeng3@ualberta.ca; jhan8@ualberta.ca; cockburn@ualberta.ca; jchen@ece.ualberta.ca).

F. Yang and X. Zeng are with Fudan University, Shanghai 201203, China (e-mail: yangfan@fudan.edu.cn; xzeng@fudan.edu.cn).

Digital Object Identifier 10.1109/TVLSI.2018.2819896



Fig. 1. CDMR design.

multistage design, three identical voters could be used in each stage to tolerate errors that occur in one of the TMR voters, but this would add undesirable overhead to the design. Some approaches, such as generalized modular redundancy [10], approximate TMR [11], and a simulation-based synthesis scheme [12], improve the original TMR, but they either offer only an optimal implementation strategy or trade off accuracy.

A number of error-tolerant methods, such as Markov random field (MRF) [13]-[15], differential cascode voltage switch (DCVS) [16], and DCVS-MRF [17], have been proposed. In these designs, the basic elements include feedback loops that help them to achieve high soft-error tolerance. However, these implementations require higher area overhead than traditional structures. To solve soft-error issues in the voter and save area overhead, we propose a new complementary DMR (CDMR) scheme, as shown in Fig. 1. The CDMR scheme provides significant soft-error tolerance even for the voting circuit. This is achieved by separately processing one module (M1) through a structure with a stable logic "1" as output (referred to as structure A in Fig. 1), and processing another identical module (M2) through a structure with a stable logic "0" as output (shown in Fig. 1 as structure B). A second-stage feedback structure is then used to merge the stable logic "1" and stable logic "0" outputs from the first stage, ensuring the best performance from the first stage (shown in Fig. 1 as structure C). The CDMR scheme outperforms existing designs in two key aspects: 1) tolerating many soft errors propagated to the voting circuit and 2) saving area overhead.

The remaining material is organized as follows. Section II briefly reviews background. Section III describes the design of the proposed two-stage structure and explains how such a structure works together to improve the soft-error tolerance and the reliability of the voter. Section IV presents the simulation results. This brief is concluded in Section V.

## II. BACKGROUND

Distinct from the methods in Fig. 2, the proposed MRF-based design achieves soft-error tolerance by using a feedback structure based on the energy function [13], specifically the clique energy U(In, Out). In a logic circuit, the clique energy describes the energy

1063-8210 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.



Fig. 2. (a) TMR circuit [1]. (b) DMR circuit [3]. (c) ANT structure [6]. (d) ASET structure [7].

function of a clique, which refers to a subset formed by fully related logic nodes (e.g., inputs and a corresponding output) [14]. The rules for an MRF-based design ensure that the clique energy of correct logic states is lower than that of wrong logic states. First, all the input-output states should be considered, including correct and incorrect input-output combinations. Assume that there is a transformation rule from Boolean to algebraic operations: $\overline{x} \rightarrow (1 - x)$, $x_1 \wedge x_2 \rightarrow x_1 x_2$. Let $f(x_0, x_1, \dots, x_n)$ be an operation function for nodes $X = \{x_0, x_1, \dots, x_n\}$, where $f = 1$ represents a correct operation; otherwise $f = 0$. Second, define the clique energy to be $U(x_c) = -\sum_i f_i(x_0, x_1, \dots, x_n)$ over all the states of the operation, where $i$ indexes the different node values. The MRF-based elements are designed based on function $U(x_c)$, where $f_i = 1$. For example, the clique energy of an inverter is $U(x, y) = -(x\overline{y} + \overline{x}y)$, obtained by summing only the valid states. Note that valid states remain at the lower energy "-1" relative to the invalid states at "0." When circuits tend to enter and remain in the lower valid energy states, the circuit has a high probability of operating correctly despite the presence of soft errors [13]-[15].
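As a short worked illustration of the rule above (our own expansion, not taken from the cited works), consider an inverter with input $x$ and output $y$. Its valid states are $(x, y) = (0, 1)$ and $(1, 0)$. Summing the indicator terms of these two states and applying $\overline{x} \rightarrow (1 - x)$ gives

$$U(x, y) = -\left[x(1 - y) + (1 - x)y\right] = 2xy - x - y.$$

Substituting the four input-output combinations yields $U(0, 1) = U(1, 0) = -1$ for the valid states and $U(0, 0) = U(1, 1) = 0$ for the invalid states, which matches the "-1" and "0" energy levels stated above.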

## III. MRF-INSPIRED TWO-STAGE FEEDBACK DESIGN

In this section, we present an MRF-inspired two-stage feedback voter by substituting an inverting module for one of the identical modules in Fig. 1. MRF circuit design has been demonstrated to effectively stabilize the circuit into correct states to tolerate soft errors by lowering the energy of the correct states. For stage 1 in Fig. 1, we implement the MRF design and produce a NAND–NAND-based feedback structure.

Assume that an *n*-bit-input one-bit-output function M is a clique, and $y = y_{out} + \sigma(Noise)$ represents the sum of the noise-free output $y_{out}$ and environmental noise. The clique energy is $U(X_{in}, y)$, where $X_{in} = \{x_1, x_2, ..., x_n\}$ are input signals and $y_{out}$ is the output of the logic function $M(X_{in})$.

*Theorem:* Assume that $y_{out} = M(X_{in})$ is the simplest representation of the Karnaugh map simplification (canonical sum of minterms). In a noisy environment, the clique energy is

$$U(X_{\text{in}}, y) = -M(X_{\text{in}}) \cdot y - \overline{M(X_{\text{in}})} \cdot \overline{y}.$$

*Proof:* According to the findings in [13]–[15], in an MRF-based design, valid states have a lower energy than that of invalid states and so, those designs will tend to operate correctly despite the interference of soft errors from noise. Thus, the ideal output $y_{out}$ should be equal to the actual output $y$ [$y_{out} = M(X_{in}) = y$], as shown in Table I. Then, $U(X_{in}, y) = -M(X_{in}) \cdot y - \overline{M(X_{in})} \cdot \overline{y}$ is the clique energy that will

TABLE I

ENERGY TRUTH TABLE OF M

| $y_{out}$     | $y$ | State   | Clique Energy $U(X_{in}, y)$ |
|---------------|-----|---------|------------------------------|
| $M(X_{in})=0$ | 0   | Valid   | -1                           |
| $M(X_{in})=0$ | 1   | Invalid | 0                            |
| $M(X_{in})=1$ | 0   | Invalid | 0                            |
| $M(X_{in})=1$ | 1   | Valid   | -1                           |



Fig. 3. Proposed first-stage structure.

help the structure settle into the valid states which have the lower energy "-1."
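The energy levels in Table I can be checked by direct enumeration. The following is a minimal sketch, assuming the Boolean-to-algebraic rule $\overline{x} \rightarrow (1 - x)$ from Section II; the function names `clique_energy` and `energy_table` are hypothetical and used only for illustration.

```python
from itertools import product


def clique_energy(m_out, y):
    """Clique energy U = -M*y - (1 - M)*(1 - y).

    m_out is the noise-free module output M(X_in) and y is the actual
    (possibly noisy) output; Boolean NOT is mapped to (1 - x).
    """
    return -(m_out * y) - (1 - m_out) * (1 - y)


def energy_table():
    """Enumerate every (M(X_in), y) pair and label it valid or invalid."""
    rows = []
    for m_out, y in product((0, 1), repeat=2):
        state = "Valid" if m_out == y else "Invalid"
        rows.append((m_out, y, state, clique_energy(m_out, y)))
    return rows


if __name__ == "__main__":
    for row in energy_table():
        print(row)
```

Each valid row evaluates to -1 and each invalid row to 0, in agreement with Table I.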

According to the above theorem, we propose the NAND–NAND structure shown in Fig. 3 to improve soft-error tolerance in the first stage. The clique energy

$$U(X_{\rm in}, y) = -M(X_{\rm in}) \cdot M(X_{\rm in}) \cdot y - \overline{M(X_{\rm in})} \cdot \overline{M(X_{\rm in})} \cdot \overline{y} \quad (1)$$

can be inferred from  $U(X_{in}, y) = -M(X_{in}) \cdot y - \overline{M(X_{in})} \cdot \overline{y}$ . We assume that output  $x_a = M(X_{in})$  and  $\overline{x_b} = \overline{M(X_{in})}$  under noisy conditions. The structure in Fig. 3 satisfies the clique-energy requirement, and thus helps keep the circuits in the correct state.

From a probability perspective, we consider errors affecting one module at a time in this brief because, for a fair comparison, TMR only tolerates errors occurring in one module [3]. The one-error condition is defined as the condition where only one module is erroneous at the input to the voting circuit. Under this condition, we will first analyze the error tolerance of the first stage. $g_1$ in Fig. 3 has a higher tolerance to a noisy "0" at an input, since the probability of a correct output with noisy inputs {00, 01, 10} is larger than that with noisy inputs {11} in a NAND gate.

*Proof:* The probability of an input being incorrect under the effect of noise is  $p_e$  ( $0 \le p_e \le 0.5$ );  $p(y|x_1x_2)$  represents the conditional correct probability of output y when the inputs are  $x_1$  and  $x_2$  in a NAND:  $p(1|00) = 1 - p_e^2 \ge p(1|01) = p(1|10) = 1 - p_e(1 - p_e) \ge p(0|11) = (1 - p_e)^2$ .
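The three probabilities in the proof can be reproduced by enumerating all flip patterns of the two NAND inputs. This is a minimal sketch under the assumption that each input flips independently with probability $p_e$; the helper names are hypothetical.

```python
from itertools import product


def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def correct_output_probability(x1, x2, p_e):
    """Probability that a NAND gate still produces the ideal output when
    each input independently flips with probability p_e (assumed model)."""
    ideal = nand(x1, x2)
    total = 0.0
    for flip1, flip2 in product((0, 1), repeat=2):
        prob = (p_e if flip1 else 1 - p_e) * (p_e if flip2 else 1 - p_e)
        if nand(x1 ^ flip1, x2 ^ flip2) == ideal:
            total += prob
    return total


if __name__ == "__main__":
    p_e = 0.1
    for inputs in ((0, 0), (0, 1), (1, 0), (1, 1)):
        print(inputs, correct_output_probability(*inputs, p_e))
```

For any $p_e$ in $[0, 0.5]$ the enumeration gives the ordering stated in the proof: inputs containing a "0" are at least as robust as the input pair {11}.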

Assume that the correct input pair $\{x_a, \overline{x_b}\}$ for the previous redundant modules, M and $\overline{M}$, is $\{0, 1\}$. If M is corrupted by noise, the incorrect output from $(M, \overline{M})$ momentarily becomes $\{1, 1\}$. In this case, $g_1$ can still tolerate the error by the inverter as long as the output of $\overline{M}$ remains correct, while $g_2$ cannot. However, the second-stage structure in Fig. 4 can complement the loss of the error tolerance in $g_2$ for the first stage using its latching property. The proposed structure benefits from the presence of stage 2 to improve its reliability, which is a feature that TMR, DMR, or other designs lack.
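The latching property can be illustrated with a behavioral model of a generic cross-coupled NAND pair. This is only a sketch: the exact wiring of Fig. 4 is not reproduced here, and the mapping of the two latch inputs to the stage-1 outputs $x_d$ and $x_e$, as well as the function names, are assumptions made for illustration. In this model the input pair (1, 1) holds the previous output, so a "0" that is flipped to "1" by noise does not change the stored value.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def nand_latch_step(x_d, x_e, q, q_bar):
    """Settle a cross-coupled NAND pair (generic RS latch, assumed model).

    x_d = 0 sets q to 1, x_e = 0 resets q to 0, and (1, 1) holds the state.
    """
    for _ in range(4):  # a few passes are enough for the loop to settle
        new_q = nand(x_d, q_bar)
        new_q_bar = nand(x_e, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


def run(stream, q=0, q_bar=1):
    """Apply a sequence of (x_d, x_e) pairs and record the q output."""
    outputs = []
    for x_d, x_e in stream:
        q, q_bar = nand_latch_step(x_d, x_e, q, q_bar)
        outputs.append(q)
    return outputs


if __name__ == "__main__":
    # (0, 1) and (1, 0) are complementary (error-free) pairs; (1, 1) models
    # a weak "0" flipped to "1" by noise, which the latch rides through.
    print(run([(0, 1), (1, 1), (1, 0), (1, 1)]))
```

In this sketch the corrupted pairs (1, 1) leave the output at its previous correct value, which is the hold behavior the second stage relies on.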

Let us extend the single-error assumption for stage 1 by assuming that only one error can emerge from one of the complementary propagation chains at the same time. In other words, when an error occurs from stage 1, the latch structure of  $g_{3}$ - $g_{4}$  in stage 2 does not propagate errors received from stage 1. With respect to our proposed CDMR, the two redundant inputs to the voter must be complementary



Fig. 4. Proposed two-stage dual feedback structure.

TABLE II

VALUES OF $g_3$-$g_4$ FEEDBACK

| Condition            | $x_d$ | $x_e$ | State     | $g_3$-$g_4$ |
|----------------------|-------|-------|-----------|-------------|
| High "0" for $x_a$   | 1     | 0     | correct   | pass        |
| High "0" for $x_a$   | 1     | 1     | incorrect | hold        |
| High "1" for $x_b$   | 0     | 1     | correct   | pass        |
| High "1" for $x_b$   | 1     | 1     | incorrect | hold        |

and will propagate through stages 1 and 2 as complementary signals in the absence of errors. For example, an ideal input bit stream for $x_a$ ($x_a = x_b$) is $\{x_0 \sim x_4 = 0 \text{ and } x_5 \sim x_9 = 1\}$. Four bits, $x_7$ and $x_9$ of $x_d$ and $x_1$ and $x_2$ of $x_e$, are flipped by noise, as marked by the small circles in Fig. 4. Their corresponding bits in the other branch are robust "1" because of the high tolerance of noisy input bit "0" in both NAND gates $g_1$ and $g_2$. This is why we only consider the cases where errors occur in weak "0" in $x_d$ or $x_e$. This condition causes the second stage $g_3$-$g_4$ to remain in the hold state in Table II, acting as an RS latch, thus protecting the final output results from the influence of the error bits in $x_d$ and $x_e$ based on the previous correct outputs. We adopted the widely used double-exponential current source to simulate the above cases where a charged or ionizing particle hits the output "0" of the stage 1 circuit [18]

$$I(t) = \frac{Q_{\text{total}}}{\tau_f - \tau_r} (e^{-t/\tau_f} - e^{-t/\tau_r}) \quad (2)$$

where $Q_{\text{total}}$ is the total charge caused by the particle strike, and $\tau_r$ and $\tau_f$ are the rising time constant and the falling time constant, respectively. As $\tau_r$ and $\tau_f$ are generally set to 50 and 164 ps for different process technologies, we used the current source with $Q_{\text{total}} = 70$ fC in our simulation. Regardless of whether $x_a$ and $x_b$ are both high or low, when a charged particle strikes $x_d$ or $x_e$, there is one single peak in output $x_f$, as shown in Fig. 5. Compared with the much longer pulse at the output of a TMR voter when an error hits one of its inner branches, this peak can be regarded as less harmful in the proposed voter after sampling, as the error is too short to be sampled multiple times. The results in Fig. 5 confirm the error tolerance that we deduced from the proposed structure in Fig. 4. In the extended one-error condition, the output of our module can achieve correct operation as long as the two inner complementary signals are not in error at the same time. This is something that TMR, DMR, or other voting circuits are incapable of. Thus, the proposed voting design is more reliable. Assuming that the error probability of module M is $p_{\varepsilon}$ when the voting circuit never fails, the error probability of our proposed structure is $P_{\text{proposed}} = p_{\varepsilon}^2$. Comparing this with the error probability of TMR under the same assumption, $P_{\text{TMR}} = 3p_{\varepsilon}^2(1 - p_{\varepsilon}) + p_{\varepsilon}^3$ (at least two of the three modules fail),





Fig. 5. Simulation of the intermediate propagation injected by a soft error.



Fig. 6. Voting structure in multistage design. (a) TMR [1]. (b) FGSET [8]. (c) DMR [3]. (d) Proposed voting module.

we see that the proposed design has better soft-error tolerance. Therefore, the proposed voting circuit has both higher modular soft-error tolerance and reliability than those of TMR.
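The comparison can be made concrete numerically. The sketch below uses $P_{\text{proposed}} = p_{\varepsilon}^2$ from the text and, as an assumption on our part, the standard majority-voter expression for TMR with a fault-free voter and independent module errors (at least two of three modules fail). The function names are hypothetical.

```python
def p_proposed(p_eps):
    """Error probability of the proposed structure, as stated in the text."""
    return p_eps ** 2


def p_tmr(p_eps):
    """Standard TMR error probability with an ideal voter (assumed model):
    exactly two modules fail, or all three fail."""
    return 3 * p_eps ** 2 * (1 - p_eps) + p_eps ** 3


if __name__ == "__main__":
    for p_eps in (0.01, 0.05, 0.1, 0.2):
        print(p_eps, p_proposed(p_eps), p_tmr(p_eps))
```

The difference $P_{\text{TMR}} - P_{\text{proposed}} = 2p_{\varepsilon}^2(1 - p_{\varepsilon})$ is nonnegative for every $p_{\varepsilon}$ in $[0, 1]$, so under these assumptions the proposed structure never has a higher error probability than TMR.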

For multistage logic, the voter is concatenated in each stage to improve the overall system reliability, as shown in Fig. 6(a)–(d). The original TMR, FGSET, and DMR voters for multistage designs are simply duplicated [refer to Fig. 6(a)–(c)]. However, the proposed voter has enclosed feedback loops and two outputs without voting duplication between two stages, as shown in Fig. 6(d). Note that this design has two complementary outputs as references for error correction. Overall, the area overhead is reduced by at least 50% compared to the designs used in TMR and DMR.

We consider a 4-bit ripple-carry adder (RCA) as a case study for the proposed voter in Fig. 7. The input to the proposed design requires a differential input; thus, we redesigned the full adder (FA) as  $\overline{FA}$ . We present two design schemes for adders. Scheme 1 (S1) in Fig. 7 is designed for a single unit with DMR, in which the outputs of the two modules are connected to a voter. Scheme 2 (S2) in Fig. 7 is implemented as a multistage design by adding a voter at every stage.

## IV. SIMULATION RESULTS AND DISCUSSION

In this brief, we focus on transient soft errors caused by signal uncertainties inherent to nanoscale devices and near-threshold computations. We used HSPICE with the 65-nm CMOS library to simulate device performance at progressively smaller dimensions. The nominal supply voltage is 1.2 V, and threshold $V_{\text{th}}$ is 0.25 V. We added an independent Gaussian noise source, as well as correlated Gaussian noise with a weak correlation coefficient $\rho = 0.1$, medium $\rho = 0.3$,

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS



Fig. 7. Our proposed schemes for the RCA structure.



Fig. 8. Simulation results of different voting designs.

and strong $\rho = 0.5$ as a noise source to each input to inject soft errors. We also set the temperature to 50 °C instead of room temperature to simulate the presence of thermal noise, while our operating supply voltage is set to 0.25 V to achieve near-threshold computation. The injected noise is with respect to the period of the input signals, and it covers all the possible transient "double error" cases and "single error" cases occurring at different times. The aim is to show that our design can handle soft errors arising from random noise sources such as crosstalk noise, thermal noise, and particle strike noise.
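The noise injection described above can be mimicked in software for quick sanity checks. The following is a minimal sketch and not the HSPICE setup used in this brief: it assumes the common construction $n_2 = \rho n_1 + \sqrt{1 - \rho^2}\, z$ for a pair of zero-mean Gaussian sources with correlation coefficient $\rho$, and it borrows the 170-mV standard deviation quoted for the independent source in this section. All function names are hypothetical.

```python
import math
import random


def correlated_gaussian_pairs(rho, sigma, n, seed=0):
    """Generate n pairs of zero-mean Gaussian noise samples with standard
    deviation sigma and correlation coefficient rho (assumed construction)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        n1 = sigma * z1
        n2 = sigma * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        pairs.append((n1, n2))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (a, b) samples."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    for rho in (0.1, 0.3, 0.5):
        pairs = correlated_gaussian_pairs(rho, sigma=0.17, n=20000, seed=1)
        print(rho, round(sample_correlation(pairs), 3))
```

The measured sample correlation should fall close to the requested $\rho$ for each of the weak, medium, and strong settings.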

The results in Fig. 8 are an example that shows each output of the proposed structure in Fig. 4, where the noise is an independent Gaussian source with zero mean and 170-mV standard deviation. The first seven curves in Fig. 8 represent three noisy input signals (first to third curves) and four voting outputs of the proposed design in Fig. 4 (fourth to seventh curves). The remaining two curves are the eighth curve, showing the results of self-voting in Fig. 2(b), and the ninth curve, depicting the result of the TMR voter in Fig. 2(a). Both the fourth and fifth curves are the outputs from stage 1 of our design. The fourth curve shows the high probability of a correct "0" in $x_a$, while the fifth curve shows the high probability of a correct "1" in $x_b$. Both the sixth and seventh curves are outputs from stage 2 of our design. They show the same performance in Fig. 8, which is much better than the performance of self-voting [3] and TMR [1] (the eighth and ninth curves).

We simulated the performance of different voting designs in Fig. 9. Compared to TMR [1], the results show that the proposed structure achieves on average 64.5% (68.2% for  $\rho = 0.1$  and 61% for  $\rho = 0.3$ ) reduction in error rate, 20% area reduction, and 8.33% delay



Fig. 9. Results of different 65-nm voting designs under different input SNRs and different correlation coefficients  $\rho$  of coupling noise.



\*CMOS means a non-redundant design

Fig. 10. Results of 65-nm RCA with different input SNRs.

reduction according to Synopsys Design Compiler. Compared to the self-voting in [3], our design achieves a 36.3% (41.6% for $\rho = 0.1$ and 31% for $\rho = 0.3$) lower error rate, with 20% area saving and 15% delay reduction. Evidently, the proposed design presents a significant improvement over TMR and self-voting. In Fig. 9, the reduction in error rate of the proposed voter is at least 26% compared to those of other voters.

We also simulated the performance of different schemes using an RCA in Fig. 10 under different input SNR conditions at a 0.25-V supply voltage. The results show that the proposed voter achieves a reduction in the error rate by at least 3.7% for S1 and

TABLE III

COMPARISONS OF AREA, DELAY, AND POWER CONSUMPTION

| Design                        | CMOS       | TMR [1]    | DMR [3]    | PTMR [2]   | MUX [5]    | This work  | CMOS       | TMR [1]    | DMR [3]    | FGSET [8]  | MUX [5]    | This work  |
|-------------------------------|------------|------------|------------|------------|------------|------------|------------|------------|------------|------------|------------|------------|
| Process                       | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm | TSMC 65 nm |
| Scheme                        | S1         | S1         | S1         | S1         | S1         | S1         | S2         | S2         | S2         | S2         | S2         | S2         |
| Area (µm<sup>2</sup>)         | 36         | 153        | 117        | 99         | 131        | 108        | 36         | 324        | 216        | 141        | 279        | 150        |
| Delay (ns)                    | 0.42       | 0.54       | 0.55       | 0.54       | 0.59       | 0.53       | 0.42       | 1.20       | 1.21       | 1.57       | 1.81       | 1.01       |
| $Vdd_{min}$ (V)               | 0.5        | 0.29       | 0.27       | 0.28       | 0.26       | 0.25       | 0.45       | 0.28       | 0.26       | 0.29       | 0.255      | 0.25       |
| Power@Vdd<sub>min</sub> (µW)  | 13.6       | 6.5        | 3.5        | 3.7        | 4.8        | 2.4        | 10.4       | 6.9        | 3.7        | 6.5        | 6.8        | 2.6        |
| Error rate improved (%)       | 46.88      | 21.04      | 5.89       | 41.39      | 3.71       | -          | 55         | 29.96      | 15.81      | 56.20      | 12.54      | -          |

\*$Vdd_{min}$ is the minimum supply voltage when the structure has the same error rate under 0.35. S1 refers to Scheme 1 and S2 to Scheme 2.

12.5% for S2 compared to the MUX design in [5], while reducing delay by 1.8% and area by 29.4% compared to the traditional TMR design [1]. The proposed structure also features a 3.7% delay reduction and 7.6% area reduction when compared to the self-voting design [3], as shown in Table III. Finally, the proposed structure achieves a reduction in the error rate of 41% for S1 and 56% for S2, with a timing reduction of 2% for S1 and 36% for S2, at the cost of only a 9% and 6% area increase, compared to PTMR [2] and FGSET [8], respectively. The output error rate of our design is not zero because "double errors" could (rarely) occur. Here, "double errors" refer to cases in which one error occurs in one module while the output of the other is erroneous or unstable. However, "double error" cases do not happen frequently. The output error rate of the proposed method is still the lowest among all designs.
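The area and delay percentages quoted above follow directly from the entries of Table III. The sketch below recomputes a few of them; the numbers are copied from the table, while the dictionary layout and the helper names are hypothetical. The computed values agree with the quoted figures up to rounding.

```python
# Area (um^2) and delay (ns) values copied from Table III.
TABLE_III = {
    "S1": {
        "TMR": {"area": 153, "delay": 0.54},
        "DMR": {"area": 117, "delay": 0.55},
        "PTMR": {"area": 99, "delay": 0.54},
        "This work": {"area": 108, "delay": 0.53},
    },
    "S2": {
        "DMR": {"area": 216, "delay": 1.21},
        "FGSET": {"area": 141, "delay": 1.57},
        "This work": {"area": 150, "delay": 1.01},
    },
}


def percent_reduction(baseline, proposed):
    """Relative reduction in percent; a negative value means an increase."""
    return 100.0 * (baseline - proposed) / baseline


def compare(scheme, baseline, metric):
    """Reduction of `metric` for 'This work' relative to `baseline`."""
    table = TABLE_III[scheme]
    return percent_reduction(table[baseline][metric], table["This work"][metric])


if __name__ == "__main__":
    print("S1 area vs TMR:   %.1f%%" % compare("S1", "TMR", "area"))
    print("S1 delay vs TMR:  %.1f%%" % compare("S1", "TMR", "delay"))
    print("S1 area vs DMR:   %.1f%%" % compare("S1", "DMR", "area"))
    print("S1 area vs PTMR:  %.1f%%" % compare("S1", "PTMR", "area"))
    print("S2 delay vs FGSET: %.1f%%" % compare("S2", "FGSET", "delay"))
    print("S2 area vs FGSET:  %.1f%%" % compare("S2", "FGSET", "area"))
```

Negative results for PTMR and FGSET area correspond to the small area increases mentioned in the text.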

Regarding testability, the presence of latches in stages 1 and 2 could cause problems with some automatic test pattern generation (ATPG) algorithms. There is also the challenge of testing modules M and $\overline{M}$ separately from stages 1 and 2. These challenges could be solved together if two bypass multiplexers were added after the two outputs of stage 2. In the "normal" position, the two multiplexers would select the two outputs of stage 2; in the "bypass" position, the two multiplexers would select the outputs of modules M and $\overline{M}$. To permit ATPG algorithms to run, the circuit would be resynthesized with the multiplexer control signal fixed at "bypass"; this would cause stages 1 and 2 to be pruned away by the synthesis tool, leaving a circuit that could be sent to ATPG to produce the test vectors. In production, those test vectors could be applied in "bypass" mode to test modules M and $\overline{M}$ only, and then reapplied in "normal" mode to test the full circuit with DMR. The multiplexer circuits would increase the testability of the design, at the cost of the area for the multiplexers.

## V. CONCLUSION

In this brief, a novel CDMR scheme is proposed for soft-error tolerance. The proposed design combines MRF theory and the inherent error tolerance of the logic gate with traditional redundancy techniques. It avoids the higher hardware cost of previous redundancy approaches, and it also improves the reliability of the voter. The proposed two-stage voter saves at least 20% in area and 8.33% in timing compared to the conventional redundancy design, with at least 26% improvement in error tolerance at an ultralow supply voltage of 0.25 V compared to previously reported voter designs. Implemented in a 4-bit RCA, the proposed CDMR scheme achieves at least 12.5% reduction in the error rate while it saves at least 30% of the area compared with previous DMR approaches when a voter is added at every stage. In the future, when the proposed CDMR is applied to chip implementations, multiplexers could be added to increase the testability of the design.

## REFERENCES

- [1] J. von Neumann, C. E. Shannon, and J. McCarthy, "Probabilistic logics and the synthesis of reliable organisms from unreliable components," in *Automata Studies* (Annals of Mathematics Studies). Princeton, NJ, USA: Princeton Univ. Press, 1956, pp. 43–98.
- [2] R. Parhi, C. H. Kim, and K. K. Parhi, "Fault-tolerant ripple-carry binary adder using partial triple modular redundancy (PTMR)," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, Lisbon, Portugal, May 2015, pp. 41–44.
- [3] J. Teifel, "Self-voting dual-modular-redundancy circuits for single-event-transient mitigation," *IEEE Trans. Nucl. Sci.*, vol. 55, no. 6, pp. 3435–3439, Dec. 2008.
- [4] I.-C. Wey, B.-C. Wu, C.-C. Peng, C.-S. A. Gong, and C.-H. Yu, "Robust C-element design for soft-error mitigation," *IEICE Elect. Exp.*, vol. 12, no. 10, pp. 1–6, 2015.
- [5] F. Smith, "A new methodology for single event transient suppression in flash FPGAs," *Microprocess. Microsyst.*, vol. 37, no. 3, pp. 313–318, May 2013.
- [6] I.-C. Wey, C.-C. Peng, and F.-Y. Liao, "Reliable low-power multiplier design using fixed-width replica redundancy block," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 23, no. 1, pp. 78–87, Jan. 2015.
- [7] B. Shim and N. R. Shanbhag, "Energy-efficient soft error-tolerant digital signal processing," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 4, pp. 336–348, Apr. 2006.
- [8] Y.-H. Huang, "High-efficiency soft-error-tolerant digital signal processing using fine-grain subword-detection processing," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 18, no. 2, pp. 291–304, Feb. 2010.
- [9] H. Kim and K. G. Shin, "Design and analysis of an optimal instruction-retry policy for TMR controller computers," *IEEE Trans. Comput.*, vol. 45, no. 11, pp. 1217–1225, Nov. 1996.
- [10] A. H. El-Maleh and F. C. Oughali, "A generalized modular redundancy scheme for enhancing fault tolerance of combinational circuits," *Microelectron. Rel.*, vol. 54, no. 1, pp. 316–326, 2014.
- [11] A. J. Sanchez-Clemente, L. Entrena, R. Hrbacek, and L. Sekanina, "Error mitigation using approximate logic circuits: A comparison of probabilistic and evolutionary approaches," *IEEE Trans. Rel.*, vol. 65, no. 4, pp. 1871–1883, Dec. 2016.
- [12] A. H. El-Maleh and K. A. K. Daud, "Simulation-based method for synthesizing soft error tolerant combinational circuits," *IEEE Trans. Rel.*, vol. 64, no. 3, pp. 935–948, Sep. 2015.
- [13] R. I. Bahar, J. Mundy, and J. Chen, "A probabilistic-based design methodology for nanoscale computation," in *Proc. Int. Conf. Comput. Aided Des. (ICCAD)*, San Jose, CA, USA, Nov. 2003, pp. 480–486, doi: 10.1109/ICCAD.2003.159727.
- [14] K. Nepal, R. I. Bahar, J. Mundy, W. R. Patterson, and A. Zaslavsky, "Designing logic circuits for probabilistic computation in the presence of noise," in *Proc. IEEE Design Autom. Conf.*, Jun. 2005, pp. 485–490.
- [15] I. C. Wey, Y. G. Chen, C. H. Yu, A. Y. Wu, and J. Chen, "Design and implementation of cost-effective probabilistic-based noise-tolerant VLSI circuits," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 56, no. 11, pp. 2411–2424, Nov. 2009.
- [16] K. M. Chu and D. L. Pulfrey, "Design procedures for differential cascode voltage switch circuits," *IEEE J. Solid-State Circuits*, vol. SSC-21, no. 6, pp. 1082–1087, Dec. 1986.
- [17] Z. Lu, X. P. Yu, and K. S. Yeo, "Design of probabilistic-based Markov random field logic gates in 65 nm CMOS technology," in *Proc. Int. SoC Design Conf. (ISOCC)*, Nov. 2010, pp. 311–314.
- [18] G. R. Srinivasan, P. C. Murley, and H. K. Tang, "Accurate, predictive modeling of soft error rate due to cosmic rays and chip alpha radiation," in *Proc. IEEE Int. Rel. Phys. Symp.*, Apr. 1994, pp. 12–16.