# Design of a Hybrid Non-Volatile SRAM Cell for Concurrent SEU Detection and Correction

Pilin Junsangsri<sup>\*</sup>, Jie Han<sup>+</sup>, and Fabrizio Lombardi<sup>\*</sup> (Corresponding Author)

\*Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA 02115 +Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada T6G 2V4 Email: junsangsri.p@husky.neu.edu, jhan8@ualberta.ca, lombardi@ece.neu.edu

Abstract—This paper presents a hybrid non-volatile (NV) SRAM cell with a new scheme for soft error tolerance. The proposed cell consists of a 6T SRAM core and a Resistive RAM made of a transistor and a Programmable Metallization Cell. An additional transistor and a transmission gate are utilized for selecting a memory cell in the NVSRAM array. Concurrent error detection (CED) and correction capabilities are provided by connecting the NVSRAM array to a dual-rail checker: the checker performs CED, while correction is accomplished by the restore operation, in which data from the non-volatile memory element is copied back to the SRAM core. The simulation results show that the proposed scheme is efficient in terms of figures of merit such as delay and circuit complexity.

*Index Terms*—Nonvolatile SRAM Cell, Emerging technology, Programmable Metallization Cell (PMC), Single Event Upset (SEU), Hybrid memory.

#### I. INTRODUCTION

With the scaling of CMOS into the nanometric range, the technology roadmap predicted by Moore's Law is becoming difficult to meet, also at circuit level [1]. The operation of these circuits exploits the high device density while meeting other performance metrics (such as delay). Scaling of CMOS has been made possible by improved fabrication/manufacturing as well as design techniques. So-called emerging technologies have been widely reported to supersede or complement CMOS. Integration of significantly different emerging technologies with CMOS has gained attention, thus creating new possibilities for designing circuits and systems. This type of design style is commonly referred to as "hybrid" [1], because it exploits different characteristics of emerging technologies. A hybrid approach relies on partially utilizing CMOS, while introducing emerging technologies for performance improvement; this is attractive for memories, in which the modular cell-based organization is well suited to new technologies and innovative design paradigms.

One of the emerging technology-driven paradigms in memory systems is represented by the so-called *non-volatile* resistive RAM (RRAM) [2]. A novel design of a non-volatile SRAM (NVSRAM) memory cell for instant-on operation has been presented in [3], i.e. a non-volatile restore signal is utilized to erase the volatile data held in the SRAM (volatile) core and replace it with the data held in the non-volatile storage when a restore operation on power-up is performed. Non-volatile elements based on resistive switching such as the memristor [4][5] have been recently proposed for NVSRAM implementation [6][7]. Security constraints as well as multi-context configurability (i.e. the capability to store and operate under multiple sets of configuration data) require non-volatile operation in programmable chips such as FPGAs; hence non-volatile elements have been proposed as an addition to SRAMs in FPGAs [8].

Despite these advances, the reliable operation at nanometric feature sizes remains of significant concern [9]. The amount of charge stored on a circuit node is becoming increasingly smaller due to the lower supply voltage and the smaller node capacitance. Particles, electrical noise and other environmental phenomena can cause a change in the data stored in a memory cell [10], thus affecting integrity [11]. This event may result in a transient fault (TF); if a TF is latched by a sampling element (latch), then this may result in a so-called soft error (SE) [12][13]. Many approaches have been proposed to deal with a SE in storage elements, such as error correcting codes, temporal redundancy and hardened circuit design. Among them, hardening has been utilized for designs to tolerate a single SE in memories and latches [14][15]. Hardening is based on adding transistors to the original cell design such that the charge of minimum value is increased; this approach is very effective, but it yields a nearly 100% overhead in the number of transistors for single SE tolerance.

The objective of this manuscript is to propose a hybrid NVSRAM cell with a novel scheme for soft error tolerance. The proposed NVSRAM cell consists of a 6T SRAM core, a RRAM (made of a 1T and a PMC) and a selection circuit (one transistor and one transmission gate). By connecting each row of the memory array (made of proposed NVSRAM cells) with the CED circuit, concurrent error detection (CED) and correction capabilities are provided; CED is accomplished using a dual-rail checker [16], while correction is accomplished by utilizing the restore operation, such that data from the non-volatile memory element is copied back to the SRAM core. The dual-rail checker utilizes two XOR gates each made of two inverters and two ambipolar transistors, thus reducing the transistor count compared to a CMOS implementation. The implications of CED and correction are analyzed and simulated using HSPICE as well as macromodels for the PMC and the ambipolar transistors [1][17]. Extensive simulation results are provided. The simulation results show that the proposed scheme is very efficient in terms of numerous figures of merit such as delay and circuit complexity and thus applicable to designing look-up tables (LUTs) for multi-context configurability of FPGAs.

This paper is a significant extension of [18] and is organized as follows. Section II gives a brief review of the technologies and preliminaries relevant to this manuscript. The proposed NVSRAM is described in Section III. Section IV presents the novel ambipolar-based XOR gate used in the dual-rail checker for concurrent error detection (CED). Section V presents the simulation results for the assessment of the proposed NVSRAM cell as well as CED and correction. Section VI presents a comparison between the proposed NVSRAM cell and the 11T volatile memory circuit of [15]; conclusions are provided in Section VII.

#### II. REVIEW

This section reviews the technology and state-of-the-art works as relevant to the proposed hybrid approach.

#### 2.1 Programmable Metallization Cell (PMC)

The Programmable Metallization Cell (PMC), also known as the Conducting Bridge Random Access Memory (CBRAM), is a resistive switching non-volatile element based on the migration of metallic ions through a solid electrolyte and the subsequent formation and dissolution of a metallic *conductive filament* (CF) connecting the two electrodes [19].



Figure 1. Switching processes of a PMC a) the CF vertically grows prior to set process, b) the CF laterally dissolves prior to reset process [20]

The set (OFF to ON state transition) and the reset (ON to OFF state transition) processes of a PMC device are shown in Figure 1.

- Under a positive bias, the top active electrode is oxidized, and the fast metal ions (Ag<sup>+</sup> or Cu<sup>2+</sup>) drift toward the bottom electrode and form the CF. Thus, the CF vertically grows until it reaches the top electrode, at which time the set process occurs. Following the set process, the CF grows laterally and its diameter continues to increase, because more metal ions are present around it [20][21].
- For the reset process, when a negative voltage bias occurs across the PMC (Figure 1b), the CF tends to laterally dissolve, because the enhanced lateral electric field is at the top of the CF [20][22]. The reset is completed when the diameter of the conductive filament shrinks down to zero at the top electrode. After the reset, the CF vertically dissolves and its height keeps decreasing.

So, the switching process of a PMC has a *transition point* that occurs whenever the tip of the CF touches or separates from the top electrode. The resistance of a PMC depends on the CF height (h) in the OFF state and on the CF radius (r) in the ON state; these give the OFF-state and ON-state resistances ($R_{off}$ and $R_{on}$). The OFF state occurs when the tip of the conductive filament is separated from the top electrode; in this case, h is less than the film thickness of the solid electrolyte, i.e. the height of the PMC (L). Once h is found, the OFF-state resistance ($R_{off}$) is given by the sum of two resistors in series [20] as

$$R_{off} = (\rho_{on}h + \rho_{off}(L - h))/A \tag{1}$$

where $\rho_{on}$ is the CF resistivity, $\rho_{off}$ is the non-conducting solid-electrolyte resistivity, L is the film thickness of the solid electrolyte and A is the area at the bottom of the CF (on the assumption that it is cylindrical before the set process).

The ON state of a PMC occurs when the tip of the CF touches the top electrode; the ON-state resistance ($R_{on}$) is based on the CF radius (r). As the shape of the conductive filament is conical, the cell resistance of a PMC in the ON state is as follows

$$R_{\rm on} = \rho_{\rm on} L / (\pi r R) \tag{2}$$

where R is the radius at the bottom of the CF.
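As an illustration of Eqs. (1) and (2), the following Python sketch evaluates both expressions. The function names and all numerical values in the demonstration are assumptions introduced here for illustration only; they are not taken from [17] or [20]. The area in Eq. (1) is computed as $\pi R^2$ under the cylindrical-filament assumption stated above.

```python
import math


def pmc_off_resistance(h, L, area, rho_on, rho_off):
    """Eq. (1): CF segment of height h in series with the electrolyte gap (L - h)."""
    if not 0.0 <= h <= L:
        raise ValueError("CF height must satisfy 0 <= h <= L")
    return (rho_on * h + rho_off * (L - h)) / area


def pmc_on_resistance(r, R, L, rho_on):
    """Eq. (2): conical CF with top radius r and bottom radius R."""
    if r <= 0.0 or R <= 0.0:
        raise ValueError("CF radii must be positive")
    return rho_on * L / (math.pi * r * R)


if __name__ == "__main__":
    # Hypothetical values for illustration only; none are taken from the paper.
    rho_on, rho_off = 1.0e-6, 1.0e3   # ohm*m (assumed resistivities)
    L, R = 10.0e-9, 20.0e-9           # m (assumed film thickness and bottom radius)
    area = math.pi * R ** 2           # cylindrical CF assumption for Eq. (1)
    for h in (0.0, 0.5 * L, 0.9 * L):
        r_off = pmc_off_resistance(h, L, area, rho_on, rho_off)
        print(f"h = {h:.1e} m -> R_off = {r_off:.3e} ohm")
    for r in (1.0e-9, 5.0e-9, R):
        r_on = pmc_on_resistance(r, R, L, rho_on)
        print(f"r = {r:.1e} m -> R_on = {r_on:.3e} ohm")
```

With $\rho_{on} < \rho_{off}$, the sketch shows $R_{off}$ decreasing as the filament grows toward the top electrode and $R_{on}$ decreasing as the top radius widens, which is the qualitative behavior described for the set process.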

The significant advantage of a PMC is the very large resistance range compared with other resistive element technologies, such as a $Cu_xSi_yO$ RRAM ($5k\Omega$ to $1Meg\Omega$) [8], an OxRRAM ($5k\Omega$ to $1Meg\Omega$) [23], and the memristor ($100\Omega$ to $16k\Omega$) [4][5]. A PMC does not need to be reset prior to a write operation; moreover, the write operation of a PMC is simpler in execution than the write operation of an OxRRAM. The resistance of a PMC varies based on the polarity and the amplitude of the voltage difference across it; so to improve the write operation of a PMC, a voltage value that is usually larger than the supply voltage for a nanoscaled MOSFET is utilized; this voltage is denoted as V<sub>dh</sub>.

#### 2.2 Non-Volatile SRAM (instant-on configuration)

[23] has presented a novel design of a non-volatile SRAM (NVSRAM) cell that can be used for instant-on operation, i.e. a non-volatile restore signal is utilized to clear the volatile data held in the SRAM and replace it with the data held in the nonvolatile storage when a restore operation on power-up is performed. The NVSRAM cell of [23] utilizes a 6T SRAM core and a resistive RAM (a RRAM made of a MOSFET and an oxide-based resistive element). Hence, this is a 7T1R cell.

The 7T1R memory cell proposed in [23] achieves a significant reduction in power dissipation for all three operational states (i.e. write, power-down and restore) required for instant-on operation when compared with other NVSRAM cells found in the technical literature [3][6][24]. This improvement is significant especially for write and power-down. Also a substantial difference in power dissipation has been reported for "0" and "1" values; this is due to the asymmetric design of this cell (as utilizing only a RRAM connected to node D in the 6T SRAM). The power-down state offers significant advantages over the standby state of a 6T SRAM cell. As expected, the average power dissipation decreases at lower feature sizes and the 7T1R cell still remains the best among the NVSRAM schemes found in the technical literature.

#### 2.3 Ambipolar transistor

Different from a traditional (unipolar silicon CMOS) device whose behavior (either p-type or n-type) is determined at fabrication, ambipolar devices can be operated in a switched mode (from p-type to n-type, or vice versa) by changing the gate bias [25][26]. Ambipolar conduction is characterized by the superposition of electron and hole currents; this behavior has been experimentally reported in different emerging technologies such as carbon nanotubes [27], graphene [28], silicon nanowires [25][29] and organic single crystals [30]. An ambipolar transistor can be used to control the direction of the current based on the voltage at the so-called polarity gate. A four-terminal ambipolar transistor (Double Gate MOSFET, or DG-FET) is utilized in this paper. The second gate (referred to as the Polarity Gate, PG) controls its polarity, i.e. when PG is set to logic '0', the ambipolar transistor behaves like an NMOS; when PG is set to logic '1', it behaves like a PMOS [31]. The symbol and the modes of operation of the ambipolar transistor used in this paper are shown in Figure 2.



Figure 2. Ambipolar transistor, a) Symbol, b) Characteristics [32]

To the best knowledge of the authors, there is no HSPICE-compatible model in the technical literature to simulate the behavior of an ambipolar transistor; therefore, in this paper, the model of Figure 3 [32] is utilized at macroscopic level for simulating the characteristics of an ambipolar transistor by using two ideal switches and two MOSFETs.



Figure 3. Model of ambipolar transistor [32]

The behavior of the ambipolar transistor is based on the voltage at its polarity gate. If the voltage at node PG is GND, switch Sw1 is ON while Sw2 is OFF; the ambipolar transistor behaves as an NMOS. However, if the voltage at the polarity gate is $V_{DD}$, switches Sw1 and Sw2 are OFF and ON respectively, and the ambipolar transistor behaves as a PMOS. A few ambipolar-based gates (NAND and NOR) have been proposed in [31]; their performance (delay and power dissipation) has been shown to be superior to the CMOS counterparts [31]. [33] has presented the fabrication process of ambipolar transistors. Silicon nanowires are used as ambipolar devices; they are vertically stacked on the substrate. [33] has shown that this process is compatible with CMOS, thus achieving full integration.
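The switch-based macromodel described above can be summarized at logic level by the following Python sketch. It is an idealized abstraction, not the HSPICE model of [32]: the names `AmbipolarState` and `ambipolar` are hypothetical, voltages are reduced to logic 0/1, and each MOSFET is treated as an ideal switch (NMOS ON for gate '1', PMOS ON for gate '0').

```python
from typing import NamedTuple


class AmbipolarState(NamedTuple):
    sw1_on: bool    # Sw1 closed: NMOS path selected
    sw2_on: bool    # Sw2 closed: PMOS path selected
    mode: str       # "NMOS" or "PMOS"
    conducts: bool  # True if the selected MOSFET is ON for this gate value


def ambipolar(pg: int, gate: int) -> AmbipolarState:
    """Logic-level view of the two-switch, two-MOSFET macromodel."""
    if pg not in (0, 1) or gate not in (0, 1):
        raise ValueError("pg and gate must be logic 0 or 1")
    if pg == 0:
        # PG at GND: Sw1 ON, Sw2 OFF, device behaves as an NMOS.
        return AmbipolarState(True, False, "NMOS", gate == 1)
    # PG at V_DD: Sw1 OFF, Sw2 ON, device behaves as a PMOS.
    return AmbipolarState(False, True, "PMOS", gate == 0)


if __name__ == "__main__":
    for pg in (0, 1):
        for gate in (0, 1):
            print(pg, gate, ambipolar(pg, gate))
```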

#### 2.4 Soft Error (SE) and model

There is an extensive technical literature on SE-tolerant design for memories. In a memory circuit, the transient voltage change that is generated by a heavy ion strike may directly lead to a *Single Event Upset* (SEU), i.e. a state change of the memory cell [11][34]. A SEU is said to occur when the collected charge Q at a particular node is greater than the critical charge $Q_{crit}$, where $Q_{crit}$ is the *minimum charge* that needs to be deposited at the sensitive node of a storage cell to flip (change) the stored bit (data). Usually for a 6T SRAM core, $Q_{crit}$ is found at one of the storage nodes, i.e. DN or D.

Different hardening approaches have been proposed to overcome a SEU. An example of a hardening approach in memory design is commonly known as DICE [12] and uses twice the number of transistors of a standard storage cell (i.e. 12T vs. 6T). The advantage of this design is that it does not require an increase in the size of the transistors or the capacitance of some nodes. In the DICE cell, the single node that is affected by a TF can be driven back to its previous state by the other transistors. A different hardened memory design requiring 11 transistors (i.e. 11T) has been proposed in [15]; the single node affected by a TF can be driven back by using novel access and refreshing circuits. Theoretically, these two volatile schemes are immune to any amount of charge collected at any single node. However, they incur significant overhead in terms of transistors added to the 6T SRAM core.

In this paper, a single event upset (SEU) model is assumed for the memory cell; the node of least charge (i.e. the critical charge) must be considered for mitigating the SEU.

#### III. PROPOSED NVSRAM CELL

In this section, the operational principles of the proposed NVSRAM cell are presented. The proposed NVSRAM cell consists of the 7T1P NVSRAM scheme of [17] and a selection circuitry (a transistor and a transmission gate). The scheme of [17] combines a volatile 6T SRAM core with a RRAM circuitry consisting of a 1T and a 1X, where X denotes the type of resistive element. In the 7T1R cell of [23] (X=R), non-volatile data is kept as resistance in an oxide-based resistive element. In this paper, a Programmable Metallization Cell (PMC) is used as the non-volatile storage element.

It has been shown [23] that in a NVSRAM cell, the non-volatile storage node has a very large charge, so it is extremely tolerant to a SEU; this implies also that a NVSRAM cell has an inherent *redundancy* in stored data, i.e. the non-volatile (resistive) element still holds correct data if the SRAM cell is affected by a SEU. As corruption of the data stored in the non-volatile element of a cell due to a SEU is highly unlikely (if not impossible), the data stored in the RRAM is a *reliable duplicate* of the one stored in the SRAM core; moreover, the resistive element in a NVSRAM is usually placed on a different plane in the chip layout, thus ensuring that multiple upsets are highly unlikely to affect both versions of the same data and preserving data independence in the storage functions. A detailed assessment of the critical charge at the volatile and the non-volatile storage nodes has been pursued in [23]; in all cells, it has been shown that the non-volatile storage node has a charge orders of magnitude larger than the critical charge, thus making the data stored in the RRAM very reliable. The utilization of this feature however requires modifying existing operations (such as read and write) as well as introducing new ones (such as restore).



Figure 4 presents the proposed nonvolatile SRAM cell (10T1P); it consists of a 6T SRAM core, a Resistive RAM (RRAM), and a selection circuitry (a transistor (M8) and a transmission gate (T1)). The RRAM is a 1T1P circuit, i.e. it uses one transistor (M7) and a PMC element, X=P. The proposed 10T1P cell has the following operations (different from the only instant-on behavior of the cell of [23]):

- Write (Store): the data is written to both the SRAM core and the PMC.
- Read: the data is read from the SRAM core contingent upon no occurrence of an SEU.
- Restore: if an SEU occurs, it is detected by Concurrent Error Detection (CED) and this operation is invoked, such that the data stored in the PMC is transferred to the SRAM core for correction.
- *Instant-On*: the data stored in the SRAM core is volatile, i.e. it is lost when there is no power supply. Once the supply voltage is again made available, the instant-on operation is started and the data stored in the PMC is transferred also to the SRAM core.

In the 10T1P cell, non-volatile data is kept in the form of PMC resistance. The NVSRAM circuit of the proposed cell utilizes the same circuitry as in [23], but its operations are different. The NVSRAM cell of this paper utilizes the resistive element as a reliable back-up, such that a SEU affecting the SRAM cell can be detected using a novel CED circuit and correction is implemented by the restore operation. The instant-on operation is still possible; it is invoked when, following a loss of power, the data stored in the resistive element is transferred to the SRAM core upon the availability of power. The write operation of the proposed design is different from the write operation of [23]. In the proposed design, only one clock cycle is needed (the write operation of [23] requires 2 clock cycles, because the RRAM must be reset to the state of high resistance prior to executing the write operation).

Moreover, as for the occurrence of an SEU, the node of critical charge is considered [14][35][36]; as shown in a later section (and consistent with other works on NVSRAMs [23]), this node is DN. Once a SEU occurs, it results in a state change at DN, thus also causing D to change accordingly (due to the cross-coupled inverter scheme of the SRAM core).

Next, the three operations of store, restore, and instant-on are presented in more detail.

#### 3.1 Store operation

During the store (write) operation, data is written in both the PMC and the SRAM core. The voltage at node EN is at GND, while the voltage at node ENB is at V<sub>DD</sub>. Transistor M8 and the transmission gate (T1) are OFF.

• Write '0' Operation

To write '0' as data, the value of the voltage at D must be at GND, while the value of the PMC resistance must be $R_{off}$ (high resistance). The voltages at BL and BLB are at GND and V<sub>DD</sub> respectively. The memory cell is selected by setting the voltage at WL to $V_{DD}$. The voltage at D is at GND. For the write operation of a PMC, the rate of change of the resistance of the PMC is related to the voltage difference across it [17]; transistor M7 is turned ON by increasing the supply voltage and the voltage at Ctrl1 to V<sub>dh</sub> during the write operation; so the PMC is written with the data corresponding to the voltages at D and Ctrl2. As the voltage at node D is at GND and the voltage at Ctrl2 is at $V_{dh}$, a negative voltage is dropped across the PMC and its resistance is set to the OFF state (high resistance).

• Write '1' Operation

In this case, the voltage at node D must be at $V_{DD}$ while the PMC resistance must be placed in the ON state (low resistance). So, the voltage at WL must be at $V_{DD}$ for selecting the memory cell, while the voltages at BL and BLB are at $V_{DD}$ and GND respectively. The data stored in the SRAM (voltage at D) is at state '1'; both the PMC and the SRAM are written at the same time, so the supply voltage of the cell is $V_{dh}$ during the write operation. M7 must be ON to generate the voltage difference across the PMC. So, the voltage at Ctrl1 is V<sub>dh</sub>, while the voltage at Ctrl2 is at GND. A voltage difference across the PMC exists and the write '1' operation is executed.

The write operation of the 10T1P requires one clock cycle; this is better than [17][23] in which two clock cycles are needed. The write operation in the proposed cell stores data at the same time in both the RRAM and the core.
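The control-line settings for the one-cycle store can be collected in a short Python sketch. It is only a bookkeeping aid under stated assumptions: the numerical values of `V_DD` and `V_DH` are hypothetical, the function names are introduced here, and the PMC polarity is taken as the voltage at D minus the voltage at Ctrl2, a sign convention inferred from the write '0' description above (D at GND, Ctrl2 at V<sub>dh</sub>, negative drop).

```python
GND = 0.0
V_DD = 0.9   # hypothetical nominal supply voltage (V)
V_DH = 1.2   # hypothetical boosted write voltage V_dh (V), larger than V_DD


def store_signals(bit: int) -> dict:
    """Line voltages during a one-cycle store of `bit` (Section 3.1)."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return {
        "EN": GND, "ENB": V_DD,          # M8 and transmission gate T1 are OFF
        "WL": V_DD,                      # cell selected
        "BL": V_DD if bit else GND,
        "BLB": GND if bit else V_DD,
        "supply": V_DH,                  # cell supply raised to V_dh during write
        "Ctrl1": V_DH,                   # M7 turned ON
        "Ctrl2": GND if bit else V_DH,   # sets the polarity across the PMC
    }


def pmc_state_after_store(bit: int) -> str:
    """Return 'ON' (low resistance) or 'OFF' (high resistance) after the store."""
    signals = store_signals(bit)
    v_d = V_DD if bit else GND           # node D: at least V_DD for '1', GND for '0'
    v_pmc = v_d - signals["Ctrl2"]       # assumed sign convention: V(D) - V(Ctrl2)
    return "ON" if v_pmc > 0 else "OFF"


if __name__ == "__main__":
    for bit in (0, 1):
        print(bit, store_signals(bit), pmc_state_after_store(bit))
```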

#### 3.2 Restore operation

The restore operation transfers (copies) the data stored in the PMC to the SRAM cell, i.e. at node D. The data stored in the PMC is read by setting the voltages at Ctrl1 and Ctrl2 to GND and  $V_{DD}$  respectively. If a '0' ('1') is stored in the PMC, the voltage at DP is at GND ( $V_{DD}$ ).

For the restore operation, $V_{WL}$ is set at $V_{DD}$, while the voltages at BL and BLB are varied depending on the stored data, i.e. for a '0' ('1') in the PMC, the voltages at BL and BLB are given by GND and V<sub>DD</sub> (V<sub>DD</sub> and GND) respectively and the voltage at D is at GND (V<sub>DD</sub>). During the restore operation, the voltages at nodes EN and ENB are at GND and V<sub>DD</sub> respectively; transistor M8 and the transmission gate (T1) are OFF.

#### 3.3 Instant-On Operation

The proposed 10T1P cell can still operate in an instant-on mode as presented in [17]; so, when the supply voltage is lost, the voltage at D is also lost due to the volatile nature of the SRAM core. However the non-volatile element retains the stored data. The instant-on operation is employed to bring back the value stored in the resistive element to the SRAM core. During the instant-on operation, the voltages at EN and ENB are at GND and  $V_{DD}$  respectively. Transistor M8 and the transmission gate (T1) are OFF. Instant-on operation is started by setting the voltages at Ctrl1 and Ctrl2 to V<sub>DD</sub> while the voltages at lines BL, BLB and WL are at GND; so, M7 is turned ON, while the voltage at D varies depending on the value of the PMC resistance. Due to the high value of the PMC resistance in state '0', an uncertainty may exist due to a discharging node between D and DN. However, the small values of the ON state resistance of M1 and M7 (compared to the high resistance of the PMC) result in a low voltage at node D. This finally turns OFF M3, thus preventing the discharge of node DN; so, the voltage at D is at 0V. If a '1' is stored in the PMC (with a low value of resistance), the voltage at D is at  $V_{DD}$  while the voltage at DN is discharged through M3. Therefore, the data in the PMC is correctly restored to D.

#### IV. CED BY DUAL-RAIL CHECKER

Concurrent error detection (CED) is utilized for tolerating the occurrence of a SEU. The CED circuit consists of a dual-rail checker. The proposed design is hybrid in nature also in this circuit, because ambipolar transistors are used in the XOR gates of the dual-rail checker.

#### 4.1 Proposed XOR gate

A CMOS XOR gate requires at least 8 transistors, while two more inverters are needed to generate the reverse input logic. Therefore, the total number of transistors is increased to 12. Ambipolar transistors are employed in this paper to reduce the number of transistors in an XOR gate based on their characteristics to behave as either NMOS or PMOS. The reduction in the number of transistors also improves the power dissipation [31].

Figure 5 presents the proposed XOR gate using ambipolar transistors and inverters. The two input signals are given by IN1 and IN2, while the output of the XOR gate is Out. The following cases are possible in the operation of the XOR gate.



Figure 5. Proposed XOR gate using ambipolar transistors

• Both IN1 and IN2 are '0'

In this case, node IN2 is connected to the polarity gate of the ambipolar transistors; when IN2 is set to GND, both ambipolar transistors behave as NMOS. So, the ambipolar transistors operate based on the voltage at IN1. IN1 is at GND, so the ambipolar transistors AMB1 and AMB2 are ON and OFF respectively. The voltage at O1 is given by the difference between the supply voltage and the threshold voltage drop across AMB1 ( $V_{DD}-V_{th}$ ). Therefore, the output voltage ( $V_{Out}$ ) is at GND.

• IN1 and IN2 are '0' and '1' respectively

In this case, both ambipolar transistors behave as PMOS; AMB1 is OFF, while AMB2 is ON. The voltage at O1 of the proposed XOR gate is given by the threshold voltage of the ambipolar transistor ( $V_{th}$ ), so the output voltage is given by  $V_{DD}$ .

• IN1 is '1' and IN2 is '0'

In this case, both ambipolar transistors behave as NMOS; AMB1 is OFF, while AMB2 is ON. The voltage at O1 is at GND, so the voltage at Out is at  $V_{DD}$ .

• IN1 and IN2 are '1'

Both ambipolar transistors behave as PMOS. AMB1 is ON, while AMB2 is OFF. The voltage at O1 is at  $V_{DD}$  and its output voltage is at GND.

Hence, the circuit of Figure 5 correctly operates as an XOR gate.
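The four cases above can be checked with a logic-level Python sketch. The wiring used here is an assumption inferred from the ON/OFF states listed in the four cases (Figure 5 is not reproduced in the code): AMB1 is taken as the path from $V_{DD}$ to O1 with its gate driven by the inverse of IN1, AMB2 as the path from GND to O1 with its gate driven by IN1, both polarity gates tied to IN2, and Out as the inverse of O1. Threshold voltage drops at O1 are ignored, so $V_{DD}-V_{th}$ is read as logic '1' and $V_{th}$ as logic '0'.

```python
def ambipolar_on(pg: int, gate: int) -> bool:
    """Ideal ambipolar switch: NMOS-like for pg = 0, PMOS-like for pg = 1."""
    return gate == 1 if pg == 0 else gate == 0


def inverter(x: int) -> int:
    return 1 - x


def proposed_xor(in1: int, in2: int) -> int:
    """Logic-level model of the ambipolar XOR (assumed wiring, see lead-in)."""
    if in1 not in (0, 1) or in2 not in (0, 1):
        raise ValueError("inputs must be logic 0 or 1")
    amb1_on = ambipolar_on(pg=in2, gate=inverter(in1))  # assumed path V_DD -> O1
    amb2_on = ambipolar_on(pg=in2, gate=in1)            # assumed path GND -> O1
    if amb1_on == amb2_on:
        raise RuntimeError("O1 would be floating or contended")
    o1 = 1 if amb1_on else 0
    return inverter(o1)                                 # output inverter


if __name__ == "__main__":
    for in1 in (0, 1):
        for in2 in (0, 1):
            print(f"IN1={in1} IN2={in2} -> Out={proposed_xor(in1, in2)}")
```

Under these assumptions exactly one ambipolar transistor conducts for every input combination, so O1 is always driven.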

#### 4.2 Dual-rail checker

XOR gates are connected in parallel (Figure 6a) in a dual-rail checker circuit [16][37]. In the absence of an SEU, the copies of the data stored in the SRAM at D and in the RRAM at DP are the same. Two comparisons, one between nodes D and DP and one between nodes DN and DP, are executed to establish the CED feature. The dual-rail checker is connected to the proposed NVSRAM cell (Figure 4); for CED, M7 is turned OFF while the voltage at Ctrl2 is at V<sub>DD</sub>, and the voltages at the three nodes are provided as inputs to the two XOR gates.



Figure 6. a) Dual-rail checker for CED b) Ambipolar-based dual-rail checker

Table 1. Voltages at nodes D, DN, and DP of proposed 10T1P NVSRAM cell and output voltages of a dual-rail checker

| Input V<sub>D</sub> (V) | Input V<sub>DN</sub> (V) | Input V<sub>DP</sub> (V) | Output V<sub>ER1</sub> (V) | Output V<sub>ER2</sub> (V) | Status |
|-----------------|-----------------|-----------------|-----------------|-----------------|--------|
| 0               | V<sub>DD</sub>  | 0               | 0               | V<sub>DD</sub>  | No SEU |
| 0               | V<sub>DD</sub>  | V<sub>DD</sub>  | V<sub>DD</sub>  | 0               | SEU    |
| V<sub>DD</sub>  | 0               | 0               | V<sub>DD</sub>  | 0               | SEU    |
| V<sub>DD</sub>  | 0               | V<sub>DD</sub>  | 0               | V<sub>DD</sub>  | No SEU |

Table 1 shows the input and output voltages for the dual-rail checker. Every store operation writes to both the RRAM and the SRAM; so, the SRAM core and the RRAM are monitored by the dual-rail checker. As also applicable to hardened memory cells found in the technical literature [12][15], the condition of logic inversion always applies to V<sub>DN</sub> and V<sub>D</sub>. Two cases are applicable.

- If either an SEU does not cause a logic inversion in the SRAM or there is no SEU, then  $V_{DP} = V_D$ .
- If a SEU causes a logic inversion in the SRAM, then V<sub>DP</sub> = V<sub>DN</sub>.

The outputs of the dual-rail checker also ensure that a single fault occurring in the proposed memory cell will be detected as generating an invalid code at the output, i.e. this circuit is self-checking too. The restore operation therefore is required when  $V_{ER1} = V_{DD}$  and  $V_{ER2} = 0$ . As described previously, the restore operation permits the data stored in the PMC to be written back in the SRAM core, thus correcting the SEU.

Figure 6b presents the ambipolar-based dual-rail checker that utilizes the proposed XOR gates. Node DP is inserted at node IN1 of both the proposed XOR gates while nodes D and DN are connected to node IN2 of XOR1 and XOR2 respectively.
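The detection and correction flow of this section can be sketched at logic level in Python. The function names are hypothetical; ER1 and ER2 are modeled as the XOR of DP with D and with DN respectively, as in Figure 6b. Labeling the remaining output pairs (0, 0) and (1, 1) as an invalid code is an interpretation of the self-checking remark above, since they can only arise when D and DN are not complementary.

```python
def dual_rail_check(d: int, dn: int, dp: int) -> tuple:
    """Return (ER1, ER2) with ER1 = D xor DP and ER2 = DN xor DP."""
    return d ^ dp, dn ^ dp


def classify(er1: int, er2: int) -> str:
    if (er1, er2) == (0, 1):
        return "No SEU"
    if (er1, er2) == (1, 0):
        return "SEU"            # restore operation required
    return "invalid code"       # D and DN not complementary (interpretation)


def detect_and_correct(d: int, dn: int, dp: int) -> tuple:
    """Apply CED; on an SEU, restore the SRAM core from the PMC copy DP."""
    status = classify(*dual_rail_check(d, dn, dp))
    if status == "SEU":
        d, dn = dp, 1 - dp      # restore: PMC data copied back to the core
    return d, dn, status


if __name__ == "__main__":
    # Rows of Table 1 in logic values (V_DD -> 1, 0 V -> 0).
    for d, dn, dp in [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]:
        print((d, dn, dp), dual_rail_check(d, dn, dp), detect_and_correct(d, dn, dp))
```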

#### 4.3 Array Level Considerations

This section considers the connections between the NVSRAM array and the CED circuit; in the proposed scheme, a CED circuit is needed for each row of the NVSRAM array. EN and ENB are the enable lines; they are used to select an NVSRAM cell in each row of the NVSRAM array. Voltages from nodes DN and DP of a selected memory cell are connected to the CED circuit, while the voltages at nodes D and DPB are generated from the voltage at DN and DP respectively.



As shown in Figure 7, the voltages from nodes DN and DP of the proposed NVSRAM cell are provided as inputs to the CED circuit using transistor M8 and its transmission gate. A transmission gate is employed to provide the voltage from node DP of the proposed NVSRAM to the line DP1. When a NVSRAM cell is selected, the voltage at EN (ENB) of the selected cell is at V<sub>DD</sub> (GND). Transistor M8 of the selected cell and its transmission gate are ON; so the line DN1 (DP1) is at the same voltage as node DN (DP). For the unselected cells, the voltages at lines EN and ENB of the proposed NVSRAM cell are at GND and V<sub>DD</sub> respectively (so the corresponding transistor M8 and transmission gate are OFF); voltages from nodes DN and DP of these cells are not provided as inputs of the CED circuit. The voltage at node DN from the NVSRAM cell retains a full swing, so the voltage drop across M8 does not affect its value. Note that transistor M8 is connected to node DN of the proposed NVSRAM cell (instead of node D) to balance the capacitance between D and DN, such that the instant-on operation operates correctly.

Two inverters are used as drivers for the voltage at node DN to the CED circuit; the value of the voltage at node DP of the proposed NVSRAM cell is dependent on its PMC resistance. If the PMC is in state '1' (very low resistance), the voltage at node DP slowly increases from GND to  $V_{DD}$ , else the voltage at node DP remains at 0V.
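The row-level selection described in this section can be summarized by a small logic-level Python sketch. It is an abstraction under stated assumptions: the function name `checker_inputs` and the list representation of a row are hypothetical, EN is modeled as one-hot with ENB as its complement, and the driver inverters and the slow rise of the voltage at DP are not modeled.

```python
def checker_inputs(cells, selected):
    """Signals seen by the CED circuit of one row.

    cells    : list of (DN, DP) logic values, one pair per NVSRAM cell in the row
    selected : index of the cell whose EN is at V_DD (all other EN lines at GND)
    """
    if not 0 <= selected < len(cells):
        raise IndexError("selected cell is outside the row")
    en = [int(i == selected) for i in range(len(cells))]
    enb = [1 - e for e in en]                       # complementary enable lines
    # Only the cell with EN = 1 (M8 and T1 ON) drives the shared lines DN1 and DP1.
    driving = [cell for cell, e, eb in zip(cells, en, enb) if e == 1 and eb == 0]
    dn1, dp1 = driving[0]
    # D and DPB are generated from DN and DP respectively.
    return {"DN": dn1, "DP": dp1, "D": 1 - dn1, "DPB": 1 - dp1}


if __name__ == "__main__":
    row = [(1, 0), (0, 1), (1, 1)]                  # hypothetical row contents
    for idx in range(len(row)):
        print(idx, checker_inputs(row, idx))
```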

#### V. SIMULATION RESULTS

In this section, the proposed NVSRAM cell is evaluated by simulation. HSPICE is utilized as simulation tool, while the model of [17] is employed for simulating the PMC; the resistance range of the PMC is given by $30k\Omega - 100Meg\Omega$ [17]. The largest values for the CF height (L) and CF radius (R) of the PMC are given by 1.5nm and 25.2nm respectively, while the threshold CF height (h<sub>th</sub>) and the radius (r<sub>th</sub>) of the PMC [17] are selected at the values of 1.45nm and 0.225nm respectively. Therefore, the OFF state resistance of the PMC is given by 99.958Meg$\Omega$, while the ON state resistance of the PMC is given by 30.063k$\Omega$. The macroscopic model of Figure 3 is utilized for an ambipolar transistor; the transistor sizes are adjusted to generate symmetric conduction between the PMOS and NMOS behaviors at 32nm CMOS feature size [38].

#### 5.1 Ambipolar-based XOR gate

In Figure 5, two inverters and two ambipolar transistors are needed in the proposed XOR gate. Figure 8 shows the input and output voltages of an inverter at 32nm CMOS feature size; so the delay is 18.37ps for the '1' to '0' transition and 17.41ps for the '0' to '1' transition. Figure 9 shows the input and output voltages of the proposed XOR gate. Table 2 shows the delay, power dissipation and PDP for the proposed XOR gate under the four input combinations (bold entries identify the worst cases). The proposed XOR gate encounters a larger delay when the voltage at IN1 is at GND due to the threshold voltage drop across the ambipolar transistor.

Table 2. Delay, power dissipation, and Power Delay Product (PDP) of the proposed XOR gate [18]





The worst cases for the power dissipation and the power delay product (PDP) of the proposed XOR gate occur when one of the inputs is at '1'.

#### 5.2 Critical charge and SER

A Single Event Upset (SEU) in an SRAM cell occurs when a charged particle strikes the most sensitive node and flips the state of the SRAM cell, causing a change in stored data. The sensitivity of SRAM to radiation is quantified by the critical charge parameter, Q<sub>crit</sub>, defined as the least amount of charge required to change the state of the cell [13][14]. Table 3 shows the charge of the proposed NVSRAM cell at the three nodes D, DN and DP for '0' and '1' as data stored in the cell. The critical charge is given by the bold entries and always occurs at node DN. Table 3 confirms the findings of [23] [41], namely that the node at the resistive element has a very high charge and the data stored in the resistive element is not connected to the node of critical charge, i.e. it is unlikely to be affected by an SEU. The charge at DP is many orders of magnitude higher than the critical charge; this is caused by the resistance value and the voltage across the PMC.

Table 3. Charge of nodes D, DN and DP of the proposed 10T1P cell

| Node | Charge for stored data '0'   | Charge for stored data '1'   |
|------|------------------------------|------------------------------|
| D    | -2.1393*10<sup>-16</sup>     | 2.1597*10<sup>-16</sup>      |
| DN   | **1.7512*10<sup>-16</sup>**  | **1.7828*10<sup>-16</sup>**  |
| DP   | 2.3092*10<sup>-12</sup>      | -1.2581*10<sup>-13</sup>     |

Table 4. Charge of nodes D, DN, DP, DPB, O1, and O2 of the proposed CED Circuit

| Node | Charge for stored data '0' | Charge for stored data '1' |
|------|----------------------------|----------------------------|
| D    | -1.0854*10<sup>-16</sup>   | 1.3776*10<sup>-16</sup>    |
| DN   | 7.5220*10<sup>-17</sup>    | -8.3812*10<sup>-17</sup>   |
| DP   | -8.0163*10<sup>-16</sup>   | 7.9372*10<sup>-16</sup>    |
| DPB  | 9.4019*10<sup>-16</sup>    | -8.3404*10<sup>-16</sup>   |
| O1   | 1.5354*10<sup>-16</sup>    | -6.8380*10<sup>-16</sup>   |
| O2   | -6.8186*10<sup>-16</sup>   | 4.7880*10<sup>-16</sup>    |
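The identification of the critical node in Tables 3 and 4 can be reproduced with a few lines of code. The sketch below assumes that the critical node is the one with the smallest charge magnitude (the sign is ignored); the dictionary layout and the `critical_node` helper are hypothetical and serve only as an illustration.

```python
# Charge values copied from Tables 3 and 4 (units as reported in the paper).
TABLE_3 = {
    "D":  {"0": -2.1393e-16, "1": 2.1597e-16},
    "DN": {"0": 1.7512e-16, "1": 1.7828e-16},
    "DP": {"0": 2.3092e-12, "1": -1.2581e-13},
}
TABLE_4 = {
    "D":   {"0": -1.0854e-16, "1": 1.3776e-16},
    "DN":  {"0": 7.5220e-17, "1": -8.3812e-17},
    "DP":  {"0": -8.0163e-16, "1": 7.9372e-16},
    "DPB": {"0": 9.4019e-16, "1": -8.3404e-16},
    "O1":  {"0": 1.5354e-16, "1": -6.8380e-16},
    "O2":  {"0": -6.8186e-16, "1": 4.7880e-16},
}


def critical_node(table, stored):
    """Return (node, |charge|) for the node with the smallest charge magnitude."""
    node = min(table, key=lambda n: abs(table[n][stored]))
    return node, abs(table[node][stored])


for name, table in (("NVSRAM cell", TABLE_3), ("CED circuit", TABLE_4)):
    for stored in ("0", "1"):
        node, q = critical_node(table, stored)
        print(f"{name}, stored '{stored}': critical node {node}, |Q| = {q:.4e}")
```

Under this assumption, node DN is selected in all four cases, in agreement with the text.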

The critical charge of the proposed CED circuit is considered next; the simulation results of Table 4 show that the critical charge of the proposed CED circuit is at node DN. The Soft Error Rate (SER) is considered next for the proposed cell. It is derived from the critical charge by using the analytical model of [39]. In this model, the SER is given by

$$SER = K\left(A_{diff-PMOS}\exp\left(-\frac{Q_{Crit-PMOS}}{\eta_{hole}}\right) + A_{diff-NMOS}\exp\left(-\frac{Q_{Crit-NMOS}}{\eta_{elec}}\right)\right) \quad (3)$$

where K is the overall scaling factor and  $\eta$  is the measured charge collection efficiency at a given radiation. Using (3) at 32nm CMOS feature size, the SER of the proposed cell is 6.075 a.u. These results confirm the findings of [41], namely that the non-volatile storage has a higher charge than the volatile circuit (i.e. the SRAM), and hence a better SEU tolerance.
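To make the dependence of (3) on the critical charge concrete, the following sketch evaluates the model numerically. It assumes that K scales both the PMOS and the NMOS terms; the function name and all parameter values are placeholders chosen for illustration (arbitrary units), not values from [39] or from the simulations reported here.

```python
import math


def soft_error_rate(k, a_pmos, q_crit_pmos, eta_hole, a_nmos, q_crit_nmos, eta_elec):
    """Evaluate Eq. (3), assuming K scales both the PMOS and NMOS terms."""
    pmos_term = a_pmos * math.exp(-q_crit_pmos / eta_hole)
    nmos_term = a_nmos * math.exp(-q_crit_nmos / eta_elec)
    return k * (pmos_term + nmos_term)


# Placeholder parameters (NOT from the paper): arbitrary units throughout.
params = dict(k=1.0, a_pmos=1.0, eta_hole=1.0, a_nmos=1.0, eta_elec=1.0)
low_q = soft_error_rate(q_crit_pmos=1.0, q_crit_nmos=1.0, **params)
high_q = soft_error_rate(q_crit_pmos=2.0, q_crit_nmos=2.0, **params)
print(low_q > high_q)  # True: a larger critical charge gives a smaller SER
```

The exponential dependence is the reason why a node with a charge several orders of magnitude above the critical charge (such as DP) contributes negligibly to the SER.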

## 5.3 Write operation

To write data to the proposed NVSRAM cell, data must be written to node D and the PMC. As mentioned previously, the supply voltage and the voltage at Ctrl1 must be increased to  $V_{dh}$ , while the voltage at Ctrl2 must have the opposite value of the data to be stored.  $V_{dh}$  is related to the voltage difference across the PMC; in this paper,  $V_{dh}$  is 3.5V. The operational feature of requiring such a high voltage value can be attained by employing different fabrication techniques, such as strained CMOS [42], or pure high-k metal gate [43], thus ensuring correct operation and not destroying either the gate oxide or interlayer dielectrics of the CMOS devices.



Figure 10. Voltages at D and DN and PMC resistance when '0' and '1' are written to the proposed memory cell

Figure 10 shows the voltage at D and DN as well as the PMC resistance when '0' and '1' are written into the memory cell. The simulation is divided into five parts marked as follows: N/A, Write '0', N/A, Write '1', N/A. For N/A, the voltages of WL, Ctrl1 and Ctrl2 are at GND and no read or write operation is executed. When data is written into the memory cell, the voltages at D and DN are increased and the PMC resistance changes as follows:

- The PMC resistance has the highest value for '0'; after the write '0' operation, the voltage at D decreases to GND.
- The PMC resistance is switched to the lowest value for '1' and the voltage at D after the write '1' operation is at V<sub>DD</sub>.

Table 5 shows the delay, power dissipation and power delay product (PDP) of the proposed NVSRAM cell for both cases of the store (write) operation.

Table 5. Delay, power dissipation, and Power Delay Product (PDP) of the proposed 10T1P cell for write '0' and '1' operations (when the PMC resistance is 100MegΩ and 70kΩ respectively)

|                             | Write '0' | Write '1' |
|-----------------------------|-----------|-----------|
| Delay (ps)                  | 0.023     | 3.827     |
| Power dissipation (µW)      | 871.2     | 795.271   |
| PDP (*10<sup>-15</sup> J)   | 0.020038  | 3.0443    |
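As a quick check of how the PDP entries relate to the other two rows, the sketch below recomputes the PDP as the product of delay and power dissipation. It assumes that the PDP is reported in units of 10<sup>-15</sup> J; the helper name `pdp_femtojoule` is hypothetical.

```python
def pdp_femtojoule(delay_ps: float, power_uw: float) -> float:
    """Power delay product in units of 1e-15 J (1 ps * 1 uW = 1e-18 J)."""
    return delay_ps * power_uw * 1e-3


# (delay in ps, power in uW, PDP reported in Table 5 in units of 1e-15 J)
table_5 = {
    "write '0'": (0.023, 871.2, 0.020038),
    "write '1'": (3.827, 795.271, 3.0443),
}

for op, (delay, power, reported) in table_5.items():
    computed = pdp_femtojoule(delay, power)
    print(f"{op}: computed {computed:.6f}, reported {reported}")
```

The recomputed values agree with the reported ones to within rounding, which shows why the write '1' operation has the larger PDP despite its lower power dissipation: its delay is more than two orders of magnitude larger.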

The write '0' operation is faster than the write '1' operation; during the write '1' operation, the PMC resistance is reduced and the voltage difference across the PMC also decreases, so the PMC switches more slowly. The power dissipation (PDP) of the write '0' operation is higher (lower) than that of the write '1' operation for the same reasons. The write operation of DICE takes 5.011ps at 32nm feature size; despite the presence of the RRAM, the proposed cell has better performance than DICE because the SRAM core in the NVSRAM utilizes the 6T configuration rather than the feedback arrangement of [35].

#### 5.4 NVSRAM Read operation

In the proposed scheme the read operation requires reading both the SRAM core and the RRAM.

#### 5.4.1 SRAM Read

The process of precharging the bitline voltages (BL and BLB) to  $V_{DD}$  is initiated prior to the read operation; the word line voltage ( $V_{WL}$ ) of the selected memory cell is then at  $V_{DD}$ , such that the voltage stored in the SRAM cell is made available at both BL and BLB.

Table 6. Delay, power dissipation, and Power Delay Product (PDP) for read operation of the SRAM core in the proposed cell

|                             | Read '0' | Read '1' |
|-----------------------------|----------|----------|
| Delay (ps)                  | 7.81     | 8.61     |
| Power dissipation (µW)      | 9.68425  | 9.38908  |
| PDP (*10<sup>-15</sup> J)   | 0.075634 | 0.08084  |

Table 6 shows the delay, the power dissipation, and the power delay product (PDP) of this read operation; the read '0' operation has the least delay, while the read '1' operation has the least power dissipation (but the highest PDP). For comparison purposes, the read operation for DICE takes 10.041ps at 32nm, which is again higher than for the proposed scheme.

## 5.4.2 RRAM Read

For reading the RRAM, the PMC resistance is monitored as the voltage at node DP. The data stored in the PMC is found by having the voltage of node Ctrl1 at GND (to turn OFF transistor M7 and separate D and DP); also the voltage of Ctrl2 must be at  $V_{DD}$ . Figure 11 plots the voltage at node DP when the data stored in the PMC is read. If a '0' is stored in the RRAM (i.e. the PMC resistance is very large), the voltage at DP is very small; if a '1' is stored in the RRAM (so the PMC resistance is very small), the voltage at DP increases up to  $V_{DD}$  when the voltage at Ctrl2 is at  $V_{DD}$ . By measuring the voltage at DP during the read operation, the delay of the RRAM read is found to be 7.2877ps, which is smaller than the delay for reading the SRAM core.
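A first-order picture of why the two PMC states are distinguishable at DP is given by the sketch below. It assumes that node DP behaves as a single lumped capacitance charged from Ctrl2 through the PMC; this RC model, the capacitance value `C_DP` and the observation time `t_read` are assumptions made only for illustration, whereas the results reported in this paper are obtained by HSPICE simulation with the macromodel of [17].

```python
import math

R_ON = 30.063e3    # ON-state PMC resistance from Section V (ohm)
R_OFF = 99.958e6   # OFF-state PMC resistance from Section V (ohm)
C_DP = 0.1e-15     # hypothetical lumped capacitance at node DP (farad)


def dp_fraction(t_seconds: float, r_pmc: float, c_node: float = C_DP) -> float:
    """Fraction of V_DD reached at DP after t seconds (first-order RC charging)."""
    return 1.0 - math.exp(-t_seconds / (r_pmc * c_node))


t_read = 10e-12  # hypothetical observation time of 10 ps
print(f"stored '1' (low R):  {dp_fraction(t_read, R_ON):.3f} of V_DD")
print(f"stored '0' (high R): {dp_fraction(t_read, R_OFF):.6f} of V_DD")
```

With these placeholder values, the low-resistance state charges DP almost completely within the observation window, while the high-resistance state leaves DP close to GND, in line with the qualitative behaviour described for Figure 11.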



#### 5.5 Dual-rail checker

Next, the performance of the dual-rail checker is established. By using the proposed XOR gate (Figure 5) and connecting the voltage at DP to both the polarity gates of the ambipolar transistors, the results of Table 7 are found for delay, power dissipation and power delay product (PDP).

Table 7. Voltages at D, DN, and DP of 10T1P NVSRAM cell and output voltage, delay time, power dissipation, and PDP of dual-rail checker

| Input V<sub>D</sub> (V) | Input V<sub>DN</sub> (V) | Input V<sub>DP</sub> (V) | Output V<sub>ER1</sub> (V) | Output V<sub>ER2</sub> (V) | Delay (ps) | Power dissipation (µW) | PDP (*10<sup>-15</sup> J) |
|-------------------------|--------------------------|--------------------------|----------------------------|----------------------------|------------|------------------------|---------------------------|
| 0                       | V<sub>DD</sub>           | 0                        | 0                          | V<sub>DD</sub>             | 11.811     | 8.7933                 | 0.10386                   |
| 0                       | V<sub>DD</sub>           | V<sub>DD</sub>           | V<sub>DD</sub>             | 0                          | 66.88      | 35.491                 | 1.2703                    |
| V<sub>DD</sub>          | 0                        | 0                        | V<sub>DD</sub>             | 0                          | 11.808     | 8.7967                 | 0.095976                  |
| V<sub>DD</sub>          | 0                        | V<sub>DD</sub>           | 0                          | V<sub>DD</sub>             | 66.904     | 35.491                 | 2.3745                    |

The worst case for the delay, the power dissipation and the PDP occurs when a '1' is stored in the PMC and the voltage at node DP must change from GND to  $V_{DD}$ .
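At the logic level, the input/output behaviour listed in Table 7 is reproduced by two XOR functions. The mapping used in the sketch below (ER1 = D XOR DP and ER2 = DN XOR DP) is inferred from the rows of Table 7 and should be read as an assumption about the gate connections rather than as a description of the actual netlist; V<sub>DD</sub> is mapped to 1 and GND to 0.

```python
def dual_rail_checker(d: int, dn: int, dp: int):
    """Logic-level view of the checker outputs (ER1, ER2), inferred from Table 7."""
    er1 = d ^ dp    # first XOR gate (assumed inputs: D and DP)
    er2 = dn ^ dp   # second XOR gate (assumed inputs: DN and DP)
    return er1, er2


# Rows of Table 7 as (D, DN, DP) -> (ER1, ER2).
table_7 = {
    (0, 1, 0): (0, 1),
    (0, 1, 1): (1, 0),
    (1, 0, 0): (1, 0),
    (1, 0, 1): (0, 1),
}

for inputs, expected in table_7.items():
    assert dual_rail_checker(*inputs) == expected
    print(inputs, "->", dual_rail_checker(*inputs))
```

In this reading, the output pair (0, 1) corresponds to the SRAM core agreeing with the RRAM, while (1, 0) corresponds to a mismatch between the two.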

For comparison purposes, consider a CMOS implementation of a dual-rail checker. The CMOS XOR gate of [40] is used in place of the proposed ambipolar-based XOR gate. This CMOS gate requires 12 transistors, so the total number of transistors in a CMOS-based implementation of a dual-rail checker is 24. The delay, power dissipation and PDP of a CMOS-based dual-rail checker are shown in Table 8. This circuit is faster and has a better PDP; however, it incurs a larger power dissipation and requires a larger number of transistors compared with the proposed ambipolar-based implementation.

Table 8. Voltages at D, DN, and DP of a 10T1P NVSRAM cell and output voltage, delay time, power dissipation, and PDP of a dual-rail checker implemented in CMOS [18]

| Input V<sub>D</sub> (V) | Input V<sub>DN</sub> (V) | Input V<sub>DP</sub> (V) | Output V<sub>ER1</sub> (V) | Output V<sub>ER2</sub> (V) | Delay (ps) | Power dissipation (µW) | PDP (*10<sup>-15</sup> J) |
|-------------------------|--------------------------|--------------------------|----------------------------|----------------------------|------------|------------------------|---------------------------|
| 0                       | V<sub>DD</sub>           | 0                        | 0                          | V<sub>DD</sub>             | 58.46      | 22.4873                | 1.31461                   |
| 0                       | V<sub>DD</sub>           | V<sub>DD</sub>           | V<sub>DD</sub>             | 0                          | 51.12      | 15.6201                | 0.798497                  |
| V<sub>DD</sub>          | 0                        | 0                        | V<sub>DD</sub>             | 0                          | 57.92      | 22.6245                | 1.31041                   |
| V<sub>DD</sub>          | 0                        | V<sub>DD</sub>           | 0                          | V<sub>DD</sub>             | 52.82      | 15.3855                | 0.812661                  |

## 5.6 Restore operation

Data correction occurs when an SEU has affected the SRAM core and its occurrence is detected by the dual-rail checker. Following the detection of one of the two faulty cases (i.e. DN=DP), a restore operation takes place to copy the value of the data stored in the RRAM back to the SRAM core. The voltage at WL (V<sub>WL</sub>) is at V<sub>DD</sub>, while V<sub>BL</sub> and V<sub>BLB</sub> are selected depending on the value to be restored, i.e. the voltage at D is made the same as the voltage at DP.

Table 9 shows the delay, power dissipation and power delay product (PDP) of the 10T1P NVSRAM for both cases of restored data from the RRAM to the SRAM core. The worst values (bold entries) for the delay and PDP (power dissipation) are encountered when a '1' ('0') is restored.

Table 9. Delay, power dissipation, and Power Delay Product (PDP) of the restore operation in the proposed NVSRAM cell following the detection of a SEU

|                             | Restore data '0' | Restore data '1' |
|-----------------------------|------------------|------------------|
| Delay (ps)                  | 18.90            | **22.56**        |
| Power dissipation (µW)      | **25.86157**     | 21.32144         |
| PDP (*10<sup>-15</sup> J)   | 0.4678357        | **0.4810118**    |

It should be noted that, as commonly found in coding circuits [37], a dual-rail checker is used for the word output of a memory; in this arrangement, error detection and correction are invoked once a read operation is executed and the voltages at D, DN and DP are checked. Correcting an SEU takes more time with the proposed scheme than with hardening [15][35], due to the delay in the CED circuitry.
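The detect-then-restore sequence of this section can be summarized by a logic-level behavioural model. The class below is a hypothetical sketch: it abstracts all voltages to bits, models an SEU as a flip of the SRAM core only, and uses the faulty condition stated above (DN equal to DP) to trigger the restore operation.

```python
class NVSRAMCellModel:
    """Logic-level sketch of one cell: SRAM core (D, DN) plus RRAM (DP)."""

    def __init__(self, value: int):
        self.store(value)

    def store(self, value: int) -> None:
        # A write updates both the SRAM core and the PMC (Section 5.3).
        self.d, self.dn, self.dp = value, 1 - value, value

    def upset(self) -> None:
        # Hypothetical SEU: the SRAM core flips, the PMC state is unchanged.
        self.d, self.dn = self.dn, self.d

    def error_detected(self) -> bool:
        # Faulty case stated in the text: DN equals DP.
        return self.dn == self.dp

    def read_with_correction(self) -> int:
        if self.error_detected():
            # Restore: copy the RRAM value back into the SRAM core.
            self.d, self.dn = self.dp, 1 - self.dp
        return self.d


cell = NVSRAMCellModel(1)
cell.upset()
print(cell.error_detected())        # True
print(cell.read_with_correction())  # 1
print(cell.error_detected())        # False
```

The model makes explicit that correction is tied to a read: the mismatch is only observed, and the restore only triggered, when the cell is read through the checker.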

## 5.7 CMOS Feature Size

In the previous sections, the NVSRAM cell has been simulated by using the (basic) CMOS Predictive Technology Model (PTM) at a feature size of 32nm. Next the high performance (HP) PTMs at feature sizes of 16, 22, and 32nm are utilized to assess the proposed NVSRAM cell.

Table 10. Delay (ps) of each operation of the proposed NVSRAM cell using high performance (HP) CMOS PTMs at different feature sizes

| Operation (delay in ps) | 16nm, 0.7V | 22nm, 0.8V | 32nm, 0.9V | 16nm, 0.9V | 22nm, 0.9V | 32nm, 0.9V |
|-------------------------|------------|------------|------------|------------|------------|------------|
| Write '1'               | 2.843      | 0.907      | 0.791      | 2.858      | 0.896      | 0.791      |
| SRAM Read '1'           | 6.237      | 7.823      | 8.767      | 4.978      | 6.872      | 8.767      |
| SRAM Read '0'           | 5.809      | 7.318      | 8.231      | 4.653      | 6.438      | 8.231      |
| Dual-Rail Checker       | 636.23     | 595.2      | 196.4      | 840.5      | 565.05     | 196.4      |
| Restore '1'             | 10.89      | 15.28      | 20.04      | 7.97       | 13.62      | 20.04      |
| Restore '0'             | 9.54       | 14.32      | 19.13      | 7.31       | 12.98      | 19.13      |

Table 10 shows the delay of each operation of the proposed NVSRAM cell; a reduction in the CMOS feature size causes an increase of the write time and of the delay of the dual-rail checker, but a decrease in the delay of all other operations. At a lower CMOS feature size, performance is improved overall, but the reduction in supply voltage affects the write operation, i.e. the write time of the proposed NVSRAM cell is higher at a lower CMOS feature size. As the capacitance of CMOS at lower feature sizes (such as 16nm and 22nm) is less than at a larger feature size (32nm), the voltage at node DP increases at a higher rate for a lower CMOS feature size when the data stored in the PMC is read. The read '0' operation therefore requires a longer delay to allow the voltage at DP to have the correct value. Moreover, as DP is connected to the polarity gate of the ambipolar transistors and the threshold voltage of the ambipolar transistors is set to half of the value of the supply voltage, the voltage at DP for a read '0' operation increases slightly from GND towards V<sub>DD</sub>, thus degrading the ability of the dual-rail checker to compare the '0' data. So at a lower CMOS feature size, the increase of the voltage at DP affects the performance of the dual-rail checker, i.e. the dual-rail checker is slower.
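The trend stated above can be checked directly against the entries of Table 10. The sketch below uses the scaled-supply columns (0.7V, 0.8V and 0.9V at 16nm, 22nm and 32nm respectively) and lists the operations whose delay grows monotonically as the feature size shrinks; the helper name `slower_when_scaled` is hypothetical.

```python
# Delays in ps from Table 10 (scaled-supply columns), ordered 16nm, 22nm, 32nm.
delays_ps = {
    "Write '1'": (2.843, 0.907, 0.791),
    "SRAM Read '1'": (6.237, 7.823, 8.767),
    "SRAM Read '0'": (5.809, 7.318, 8.231),
    "Dual-Rail Checker": (636.23, 595.2, 196.4),
    "Restore '1'": (10.89, 15.28, 20.04),
    "Restore '0'": (9.54, 14.32, 19.13),
}


def slower_when_scaled(values) -> bool:
    """True if the delay grows monotonically as the feature size shrinks."""
    d16, d22, d32 = values
    return d16 > d22 > d32


slower = sorted(op for op, v in delays_ps.items() if slower_when_scaled(v))
print(slower)
```

Only the write operation and the dual-rail checker are reported as slower at smaller feature sizes; all other operations become faster.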

#### 5.8 Memory Array Evaluation

In this section, the proposed NVSRAM array and the CED circuit are evaluated in terms of delay, power dissipation, and power delay product (PDP).

Table 11. Delay, power dissipation, and PDP of the entire memory (NVSRAM array and CED circuit)

| Array Size | Delay (ps) | Power (µW) | PDP (*10<sup>-15</sup> J) |
|------------|------------|------------|---------------------------|
| 4x4        | 78.853     | 82.293     | 6.4890                    |
| 8x8        | 103.44     | 154.73     | 16.005                    |
| 16x16      | 147.29     | 304.45     | 44.844                    |
| 32x32      | 226.64     | 635.03     | 143.93                    |
| 64x64      | 373.22     | 1380.8     | 515.35                    |
| 128x128    | 651.16     | 3088.4     | 2011.0                    |
| 256x256    | 1078.9     | 6137.5     | 7931.6                    |

When increasing the array size, the CED delay and power dissipation increase (Table 11); this is due to the increased line capacitance and the larger number of inverter pairs between arrays (so in series) for memories with dimension larger than 256 (Figure 12).
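To quantify how the delay grows with the array size, the sketch below computes the ratio between the delays of consecutive rows of Table 11, i.e. the growth factor for each doubling of the array dimension. The variable names are hypothetical; only the delay values are taken from the table.

```python
# (array dimension N for an NxN array, delay in ps) from Table 11.
table_11_delay = [
    (4, 78.853), (8, 103.44), (16, 147.29), (32, 226.64),
    (64, 373.22), (128, 651.16), (256, 1078.9),
]

# Delay growth factor for each doubling of N (i.e. each 4x increase in cell count).
growth = [
    (n_next, d_next / d_prev)
    for (_, d_prev), (n_next, d_next) in zip(table_11_delay, table_11_delay[1:])
]

for n, factor in growth:
    print(f"{n // 2}x{n // 2} -> {n}x{n}: delay x{factor:.2f}")
```

Every growth factor lies between 1 and 2, so for the sizes in Table 11 the delay grows more slowly than the array dimension N, and far more slowly than the number of cells.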



Moreover, as shown in Figure 12, four inverters are employed for connecting to the CED circuit; the worst delay occurs when the first memory array is read. The entire memory delay (so after the CED circuit) for larger size memories is reported in Table 12.

Table 12. Delay of the entire memory (NVSRAM array and CED circuit)

| Array Size | Delay (ns) |
|------------|------------|
| 512x512    | 1.7048     |
| 1024x1024  | 2.6220     |
| 2048x2048  | 4.4435     |

## 5.9 Area

The area of the proposed NVSRAM cell is found by using Cadence to design the layout of the proposed cell while the PMC is stacked on a different plane [33].

Figure 13 shows the layout of the proposed NVSRAM cell. The PMC is stacked on a different plane than this circuit, hence only the area of the MOSFETs is considered. The total area of the proposed NVSRAM cell is  $3878.519\lambda^2$  (where  $\lambda$  denotes half of the CMOS feature size).
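For reference, the area in units of λ<sup>2</sup> can be converted to an absolute value once a feature size is chosen. The sketch below assumes the 32nm feature size used in most of the simulations of this section (so λ = 16nm); the function name `cell_area_um2` is hypothetical.

```python
def cell_area_um2(area_lambda2: float, feature_size_nm: float) -> float:
    """Convert an area in lambda^2 to um^2, with lambda = feature size / 2."""
    lam_nm = feature_size_nm / 2.0
    area_nm2 = area_lambda2 * lam_nm ** 2
    return area_nm2 * 1e-6  # 1 um^2 = 1e6 nm^2


area = cell_area_um2(3878.519, 32.0)
print(f"{area:.4f} um^2")
```

Under this assumption the MOSFET area of the cell is slightly below 1 µm<sup>2</sup>.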

#### 5.10 Circuit complexity

The proposed cell is analyzed with respect to the number of CMOS transistors (or equivalent for the non-CMOS elements such as the PMC and the ambipolar transistors) as a measure of circuit complexity. The three parts of the proposed NVSRAM cell require the following number of transistors:



Figure 13. Layout of the proposed NVSRAM

- 1. SRAM core: 6T
- 2. RRAM: 1T and 1 PMC; the PMC has a length commensurate with 1T at the feature size of 32nm (as reported in a previous section), so even though it does not appear in the layout due to the conducting bridge nature of this resistive element, the PMC is at most equivalent to 1T. Therefore, the RRAM has a circuit complexity of 2T.
- 3. CED selection circuit: it consists of 1 transistor and 1 transmission gate; hence this circuit has a complexity of 3T.

The total number of transistors in the proposed NVSRAM cell, exclusive of the CED circuitry, is therefore 11T. The CED circuit consists of 4 ambipolar transistors and 6 inverters. As stated previously, the CED circuit is provided at the output of the memory, so it is shared among all cells; hence, the overhead of the proposed scheme is mostly associated with the added non-volatile function, i.e. 2T and a selection circuit (3T). This is significantly less than (equal to) the 12T (11T) required by [35] ([15]).
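The tally above can be written out explicitly. The sketch below uses the transistor-equivalent counts of the three parts listed in this section; the dictionary keys are descriptive labels introduced only for this illustration.

```python
# Transistor-equivalent counts taken from the three parts listed in this section.
complexity = {
    "SRAM core": 6,
    "RRAM (1T + PMC counted as at most 1T)": 2,
    "CED selection circuit (1T + transmission gate)": 3,
}

total = sum(complexity.values())
overhead = total - complexity["SRAM core"]
print(f"total = {total}T, overhead over the 6T core = {overhead}T")
```

The shared CED circuit (4 ambipolar transistors and 6 inverters) is deliberately left out of the per-cell count, since it is placed at the output of the memory.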

## VI. COMPARISON

This section presents a comparative evaluation between the proposed NVSRAM cell and the 11T cell of [15]. The write and read delays as well as other metrics are considered.

Table 13. Comparison between the proposed NVSRAM cell with CED circuit and the 11T-Hardening cell [15]

|                     | Proposed cell | 11T [15] |
|---------------------|---------------|----------|
| Write Delay         | 3.827ps       | 5.011ps  |
| Read Delay          | 8.61ps        | 10.041ps |
| Circuit complexity  | 10T + 1PMC    | 11T      |
| Nonvolatile storage | Yes           | No       |
Table 13 shows the results; the proposed cell has smaller write and read delays than the 11T cell of [15]. The area of the proposed NVSRAM cell is also smaller, because the PMC (as non-volatile element) is stacked on a plane different from the MOSFETs. Moreover, the proposed NVSRAM cell provides nonvolatile storage capabilities. In this respect, the store operation of the proposed NVSRAM cell requires a higher voltage than the restore operation.

#### VII. CONCLUSION

This paper has presented a novel approach to concurrent error detection and correction of an SEU in a new memory cell. The proposed memory cell is hybrid in nature because it utilizes the following circuits: a) a 6T SRAM core, b) an RRAM consisting of 1T and a Programmable Metallization Cell (PMC) as non-volatile resistive element, c) a selection circuit consisting of a transistor and a transmission gate. Different from other SEU tolerant cells [12][15][35], the proposed memory cell is non-volatile and utilizes a dual-rail checker for concurrent error detection and the so-called restore operation for correction. Two XOR gates are employed in the dual-rail checker scheme (each XOR gate is implemented using two ambipolar transistors). The operational principles of the proposed NVSRAM have been discussed and extensive simulation results have been presented for all of its operations. In the absence of an SEU, the proposed cell has faster read and write times compared with designs using hardening [15][35]; however, the utilization of the restore operation accounts for a higher delay in SEU correction. The utilization of a PMC results in a very large resistive range, low hardware overhead (due to the bridging nature of this type of resistive element) and fast switching, but at the expense of higher voltage values for the store operation and consequently higher power dissipation and PDP. This requirement suggests that the proposed cell is best suited for memories requiring nonvolatile operation with very frequent read operations (but infrequent writes), such as the new generation of look-up tables (LUTs) in FPGAs. The implications of the proposed approach for memory operation at system level in FPGAs with multi-context configurability are under investigation.

#### REFERENCES

- [1] P. Junsangsri and F. Lombardi, "Design of a Hybrid Memory Cell Using Memristance and Ambipolarity," *IEEE Transactions on Nanotechnology*, vol. 12, no. 1, pp. 71-80, 2013.
- [2] H. Akinaga, H. Shima, "Resistive Random Access Memory (ReRAM) Based on Metal Oxides," *Proceedings of the IEEE*, pp. 2237-2251, 2010.
- [3] O. Turkyilmaz, S. Onkaraiah, M. Reyboz, F. Clermidy, H. C. Anghel, J.-M. Portal, M. Bocquet "RRAM-based FPGA for "Normally Off, Instantly On" Applications," *Proceedings of 2012 IEEE/ACM International Symposium on Nanoscale Architectures*, pp. 101-108, 2012.
- [4] D. B. Strukov, G. S. Snider, D.R. Stewart, R.S. Williams, "The missing memristor found", *Nature*, vol. 453, pp. 80-83, May 2008
- [5] L. O. Chua, "Memristor-the Missing Circuit Element," *IEEE Transactions on Circuit Theory*, vol. CT-18, no. 5, pp. 507-519, Sep. 1971.
- [6] M. F. Chang, C. H. Chuang, M. P. Chen, L.-F. Chen, H. Yamauchi, P.-F. Chiu, S.-S. Sheu, "Endurance-Aware Circuit Designs of Nonvolatile Logic and Nonvolatile SRAM Using Resistive Memory (Memristor) Device", 2012 17th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 329-334, 2012.
- [7] P.-F. Chiu, M.-F. Chang, C.-W. Wu, C.-H. Chuang, S.-S. Sheu, Y.-S. Chen, and M.-J. Tsai, "Low Store Energy, Low VDDmin, 8T2R Nonvolatile Latch and SRAM With Vertical-Stacked Resistive Memory (Memristor) Devices for Low Power Mobile Applications" IEEE Journal of Solid-State Circuits, Vol. 47, No. 6, June 2012
- [8] X. Xue, W. Jian, Y. Xie, Q. Dong, R. Yuan, Y. Lin, "Novel RRAM Programming Technology for Instant-on and High-security FPGAs," 2011 IEEE 9th International Conference on ASIC (ASICON), pp. 291-294, 2011.
- [9] S. Lin, Y.B. Kim and F. Lombardi, "Soft-Error Hardening Designs of Nanoscale CMOS Latches," *Proc. IEEE VTS* 2009, pp. 41 - 46, 2009.

- [10] I. Polian, J. P. Hayes, S. M. Reddy, and B. Becker, "Modeling and Mitigating Transient Errors in Logic Circuits" IEEE Transactions on Dependable and Secure Computing, Vol. 8, No. 4, July/August 2011
- [11] N. Seifert, X. Zhu, and L.W. Massengill, "Impact of Scaling on Soft-Error Rates in Commercial Microprocessors," *IEEE Transactions on Nuclear Science*, vol. 49, no. 6, pp. 3100 - 3106, Dec. 2002.
- [12] M. Nicolaidis, R. Perez, D. Alexandrescu, "Low-Cost Highly-Robust Hardened Cells Using Blocking Feedback Transistors," in *Proceedings* of 26th IEEE VLSI Test Symposium, 2008. pp. 371 - 376, April 27 2008-May 1 2008
- [13] Y. Sasaki, K. Namba, H. Ito, "Soft Error Masking Circuit and Latch Using Schmitt Trigger Circuit," in *Proceedings of 21st IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems*, pp. 327 - 335, Oct. 2006.
- [14] M. Omana, D. Rossi, C. Metra, "Latch Susceptibility to Transient Faults and New Hardening Approach", *IEEE Transactions on Computers*, Volume 56, Issue 9, pp. 1255 - 1268, Sept. 2007.
- [15] S. Lin, Y.B. Kim and F. Lombardi, "A 11-Transistor Nanoscale CMOS Memory Cell for Hardening to Soft Errors", *IEEE Transactions on VLSI Systems*, Volume 19, Issue 5, pp. 900 - 904, May. 2011
- [16] J. F. Wakerly, Error detecting codes, self-checking circuits and applications, North-Holland, 1978
- [17] P. Junsangsri, J. Han and F. Lombardi "HSPICE Macromodel of a Programmable Metallization Cell (PMC) and its Application to Memory Design", IEEE/ACM Int. Conference on Nanoscale Architecture (NANOARCH) 2014, pp. 45-50, Paris, July 2014
- [18] P. Junsangsri, J. Han and F. Lombardi, "A Hybrid Non-Volatile SRAM Cell with Concurrent SEU Detection and Correction," Proc. IEEE DATE, Dresden, March 2014.
- [19] M. N. Kozicki, M. Park, and M. Mitkova, "Nanoscale memory elements based on solid state electrolytes," *IEEE Trans. Nanotechnol.*, vol. 4, no. 3, pp. 331–338, May 2005.
- [20] Shimeng Yu, H.S. Philip Wong "Compact Modeling of Conducting-Bridge Random-Access Memory (CBRAM)" IEEE Trans. Electron Devices, Vol. 58, No.5, May 2011
- [21] U. Russo, D. Kamalanathan, D. Ielmini, A. L. Lacaita, and M. N. Kozicki, "Study of multilevel programming in programmable metallization cell (PMC) memory," *IEEE Trans. Electron Devices*, vol. 56, no. 5, pp. 1040– 1047, May 2009
- [22] X. Guo, C. Schindler, S. Menzel, and R. Waser, "Understanding the switching-off mechanism in Ag+ migration based resistively switching model systems," *Appl. Phys. Lett.*, vol. 91, no. 13, p. 133513, Sep. 2007.
- [23] W. Wei, J. Han, K. Namba and F. Lombardi, "Design of a Non-volatile 7T SRAM Cell for Instant-on Operation," *IEEE Transactions on Nanotechnology*, vol. 13, no. 5, pp. 905-916, 2014.
- [24] S. Yamamoto, Y. Shuto, S. Sugahara, "Nonvolatile SRAM (NV-SRAM) Using Functional MOSFET Merged with Resistive Switching Devices", *IEEE 2009 Custom Integrated Circuits Conference (CICC)*, pp. 531-534, 2009.
- [25] S.-M. Koo, Q. Li, M. D. Edelstein, C. A. Richter, E. M. Vogel "Enhanced Channel Modulation in Dual-gated Silicon Nanowire Transistors," *Nano Letters*, vol. 5, no. 12, pp. 2519–2523, 2005.
- [26] Y.-M. Lin, J. Appenzeller, J. Knoch, P. Avouris, "High-performance Carbon Nanotube field-effect Transistor with Tunable Polarities," *IEEE Trans. Nanotechnology*, vol. 4, pp. 481–489, 2005.
- [27] S. Heinze, M. Radosavljevic, J. Tersoff, Ph. Avouris, "Unexpected Scaling of the Performance of Carbon Nanotube Schottky-barrier Transistors," *Physical Review B*, vol. 68, p. 235418, 2003.
- [28] K. S. Novoselov, A. K. Geim, S. V. Morozov, D. Jiang, Y. Zhang, S. V. Dubonos, I. V. Grigorieva, A. A. Firsov "Electric field effect in atomically thin carbon films," *Science*, vol. 306, no. 5696, pp. 666–669, 2004.
- [29] A. Colli, A. Tahraoui, A. Fasoli, J. M. Kivioja, W. I. Milne, A. C. Ferrari, "Top-gated silicon nanowire transistors in a single fabrication step," ACS Nano, vol. 3, no. 6, pp. 1587–1593, 2009.
- [30] A. Dodabalapur, H. E. Katz, L. Torsi, R. C. Haddon, "Organic Heterostructure Field-effect Transistors," *Science*, vol. 269, no. 5230, pp. 1560–1562, 1995.
- [31] M. H. B. Jamaa, K. Mohanram, G. D. Micheli "An Efficient Gate Library for Ambipolar CNTFET Logic" *IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems*, Vol. 30, No.2, Feb 2011

- [32] P. Junsangsri, J. Han, F. Lombardi "Logic-in-Memory With a Nonvolatile Programmable Metallization Cell" IEEE Transactions on Very Large Scale Integration (TVLSI) Systems 2015, (to appear)
- [33] D. Sacchetto, M. H. Ben-Jamaa, S. Carrara, G. D. Micheli, Y. Leblebici "Memristive Devices Fabricated with Silicon Nanowire Schottky Barrier Transistors" *IEEE Circuits and Systems* (ISCAS 2010), Vol.1 pp. 9-12, 2010.
- [34] P.E. Dodd and L.W. Massengill, "Basic Mechanisms and Modeling of Single-Event Upset in Digital Microelectronics," *IEEE Transactions on Nuclear Science*, pp. 583 - 602, June 2003.
- [35] T. Calin, M. Nicolaidis, R. Velazco, "Upset Hardened Memory Design for Submicron CMOS Technology," *IEEE Transactions on Nuclear Science*, Volume 43, Issue 6, Part 1, pp. 2874 - 2878, Dec. 1996.
- [36] J. Gong, Y.B. Kim, J. Han and F. Lombardi "Hardening a Memory Cell for Low Power Operation by Gate Leakage Reduction," *Proc. IEEE International Symposium on DFT in VLSI and Nanotechnology Systems*, pp.73-78, Austin, October 2012.
- [37] E. Fujiwara, "Code Design for Dependable Systems: Theory and Practical Applications," Wiley-Interscience, 2006.
- [38] Predictive Technology Model, http://ptm.asu.edu/
- [39] G. Gasiot, M. Glorieux, S. Clerc, D. Soussan, F. Abouzeid, P. Roche "Experimental Soft Error Rate of Several Flip-Flop Designs Representative of Production Chip in 32nm CMOS Technology" IEEE Trans Nuclear Science, Vol. 60, Issue. 6, pp. 4226-4231, December 2013
- [40] R. J. Baker "CMOS Circuit Design, Layout, and Simulation" Wiley-IEEE Press, Revised 2nd Edition, 2011.
- [41] W. Wei, K. Namba, and F. Lombardi, "Design and Analysis of Non-Volatile Memory Cells for SEU Tolerance," *Proc. 17th IEEE Symposium* on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, pp. 69-74, Amsterdam, October 2014.
- [42] W. C. Shen, C.-E. Huang, H. OuYang, Y.-C. King, and C. J. Lin, "32nm Strained Nitride MTP Cell by Fully CMOS Logic Compatible Process" *VLSI Technology, Systems, and Applications (VLSI-TSA)*, 2012 International Symposium pp. 1-2, 2012
- [43] W. C. Shen, C. Y. Mei, Y. -D. Chih, S.-S. Sheu, M.-J. Tsai, Y.-C. King, C. J. Lin, "High-K Metal Gate Contact RRAM (CRRAM) in Pure 28nm CMOS Logic Process", *Electron Devices Meeting (IEDM)*, 2012 IEEE International, pp. 31.6.1 - 31.6.4, 2012