Synchronous Full-Adder based on Complementary Resistive Switching Memory Cells

Y. Zhang, E.Y. Deng, J.O. Klein, D. Querlioz, D. Ravelosona, C. Chappert, W.S. Zhao*
IEF, Univ. Paris-Sud, UMR 8622, CNRS, Orsay, France
*weisheng.zhao@u-psud.fr

**M. Moreau, J.M. Portal, M. Bocquet, H. Aziza, D. Deleruyelle, C. Muller
Aix-Marseille University, IM2NP – UMR CNRS 7334 Marseille, France
**mathieu.moreau@im2np.fr

Abstract—Emerging non-volatile memories (NVM) such as STT-MRAM and OxRRAM are under intense investigation by both academia and industry. They are based on resistive switching mechanisms and promise advantageous performance in terms of access speed, power consumption and endurance (i.e. >10^15 cycles), surpassing mainstream flash memories. This paper presents a non-volatile full-adder design based on complementary resistive switching memory cells and validates it with two NVM technologies, STT-MRAM and OxRRAM, on a 40 nm node. The architecture allows low power consumption: thanks to the non-volatility and 3D integration of NVM, both the standby power during the “idle” state and the data transfer power can be reduced. Programming the NVM at a low frequency also keeps its switching power under control. The complementary cells and parallel data sensing enable fast computation and high reliability.

I. INTRODUCTION

Modern computing systems suffer from rising static power due to high leakage currents, which increase exponentially with the downscaling of complementary metal–oxide–semiconductor (CMOS) technologies [1]. According to the 2012 ITRS roadmap, static power will play a predominant role in power consumption in the coming years [2]. To overcome this power issue, hybrid circuits integrating resistive switching non-volatile memory (NVM) are being investigated by both academia and industry. Owing to their advantageous performance, Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM) [3] and Oxide Resistive RAM (OxRRAM) [4] are among the most promising NVM technologies. Indeed, compared with conventional floating-gate technologies, they provide fast write/read operations, low power consumption, CMOS compatibility and high endurance.

While these NVMs are based on different physical mechanisms, they share several common features. For instance, they are two-terminal nanoscale devices; their resistance can be switched between ‘0’ and ‘1’ states; and the memory cell can be integrated into the back-end of line (BEOL) [5-8]. Recently, a number of innovative circuits based on hybrid Resistive Switching (RS) memory/CMOS technology have been proposed. For example, the magnetic look-up-table (MLUT) and non-volatile flip-flops (MFF & RS-NVFF) were introduced for reconfigurable logic circuits and normally-off electronics [9-10]. A pre-charge sense amplifier (PCSA) [11] was presented and shows a remarkable improvement in reliability compared with other SAs for RS memory cell sensing. In this context, a resistive switching non-volatile full-adder (RS-NVFA) based on hybrid technology could open the way towards ultra-low power and high density ICs [12]. Moreover, the RS-NVFA could also overcome the communication bottleneck between separate logic modules and memory blocks.

Magnetic Tunnel Junction (MTJ)-based RS-NVFAs have been proposed in recent years [13-15] and tested with conventional technology nodes. However, they suffer from either a high current for magnetic field generation or a complex sensing circuit with capacitances. This paper presents a new full-adder design based on RS-NVM cells. Our design is validated by transient simulations with two technologies, STT-MRAM and OxRRAM, on a 40 nm node. The rest of this paper is organized as follows: in the next section, the two technologies are briefly described and their compact models are introduced. The RS-NVFA architecture is detailed in Section III. Section IV is devoted to transient simulations and performance analysis.

II. EMERGING RESISTIVE SWITCHING MEMORIES

A magnetic tunnel junction (MTJ) is the basic cell of MRAM. It consists of a thin insulating barrier (e.g. MgO) separating two ferromagnetic (FM) layers (Fig. 1a). Owing to the tunnel magneto-resistance (TMR) effect [16], its resistance, RP or RAP, depends on the relative orientation, Parallel (P) or Anti-Parallel (AP), of the magnetizations in the two FM layers. STT is a switching mechanism promising high power efficiency and fast speed [2-3]. It greatly simplifies the CMOS circuitry, as only a bipolar current is required (Fig. 1b). The MTJ switches when the current passing through it exceeds the critical current $I_{C0}$: from P to AP as the electrons flow from the top ($I_{P \to AP} > I_{C0}$), or from AP to P as the electrons are injected from the bottom ($I_{AP \to P} > I_{C0}$).

Figure 1. (a) Vertical structure of an MTJ composed of CoFeB (1.3)/MgO (0.85)/CoFeB (2) thin films. (b) STT switching mechanism: the MTJ state changes either from P to AP as the electrons flow from the top ($I_{P \to AP} > I_{C0}$), or from AP to P as the electrons are injected from the bottom ($I_{AP \to P} > I_{C0}$).
A CoFeB/MgO/CoFeB PMA STT-MTJ compact model [20] taking into account related static, dynamic and stochastic behaviors was used to perform transient simulation [17-18]. Table I shows the critical parameters used in the model.

In its simplest form, an OxRRAM memory element relies on a Metal/Insulator/Metal (MIM) stack (Fig. 2a). The MIM structure is generally composed of metallic electrodes sandwiching an active layer, usually an oxygen-deficient oxide. A large number of resistive switching oxides, such as HfO₂, Ta₂O₅, NiO, TiO₂ or Cu₂O, are reported in the literature [21-23]. An interfacial layer (IL, Fig. 2a) can also appear during the electroforming process. After this initial electroforming step, the memory element can be switched reversibly between a High Resistance State (HRS or OFF state) and a Low Resistance State (LRS or ON state). In this paper, we focus on bipolar switching RS devices (Fig. 2b).

Even if OxRRAM technology is still in its "infancy", it is broadly accepted that the field-assisted motion of oxygen vacancies plays a prominent role in bipolar resistance switching [24]. The proposed OxRRAM modeling approach, derived from a unipolar model [25], relies on electric field-induced migration of oxygen vacancies within the switching layer. This model accounts continuously for both set and reset operations in a single master equation and is flexible enough to match the static (switching voltages, current levels) and dynamic behaviors of the most aggressive devices reported in the literature [26-28]. Table II summarizes the cell operation parameters for the very short programming pulses used in the OxRRAM-based NVFA transient simulations.

### Table I

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Description</th>
<th>Default Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Area</td>
<td>MTJ surface</td>
<td>40 nm x 40 nm</td>
</tr>
<tr>
<td>TMR(0)</td>
<td>TMR ratio at zero bias voltage</td>
<td>120%</td>
</tr>
<tr>
<td>Vf</td>
<td>Volume of free layer</td>
<td>surface x1.3 nm</td>
</tr>
<tr>
<td>RA</td>
<td>Resistance-area product</td>
<td>10 Ωμm²</td>
</tr>
<tr>
<td>Vw</td>
<td>Writing voltage</td>
<td>1.5 V</td>
</tr>
<tr>
<td>Vread</td>
<td>Reading voltage</td>
<td>1.2 V</td>
</tr>
<tr>
<td>Jc0</td>
<td>Critical current density</td>
<td>5.7 x 10⁶ A/cm²</td>
</tr>
</tbody>
</table>
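As a rough cross-check of Table I, the parallel and anti-parallel resistances and the critical current can be estimated from the listed parameters. The sketch below is only a back-of-the-envelope calculation: it assumes a rectangular 40 nm x 40 nm junction, the standard definitions R_P = RA / Area and TMR = (R_AP - R_P) / R_P at zero bias, and I_C0 = J_C0 x Area. The function name `mtj_estimates` and its arguments are illustrative and are not taken from the compact model [20].

```python
# Back-of-the-envelope MTJ figures derived from Table I.
# Assumptions (not from the compact model [20]): rectangular junction,
# zero-bias TMR, R_P = RA / Area, R_AP = R_P * (1 + TMR), I_C0 = J_C0 * Area.

def mtj_estimates(side_nm=40.0, ra_ohm_um2=10.0, tmr0=1.20, jc0_a_per_cm2=5.7e6):
    """Return (R_P in ohm, R_AP in ohm, I_C0 in ampere)."""
    area_um2 = (side_nm * 1e-3) ** 2   # nm -> um, then square
    area_cm2 = (side_nm * 1e-7) ** 2   # nm -> cm, then square
    r_p = ra_ohm_um2 / area_um2        # parallel-state resistance
    r_ap = r_p * (1.0 + tmr0)          # anti-parallel-state resistance
    i_c0 = jc0_a_per_cm2 * area_cm2    # critical switching current
    return r_p, r_ap, i_c0


if __name__ == "__main__":
    r_p, r_ap, i_c0 = mtj_estimates()
    print(f"R_P  ~ {r_p / 1e3:.2f} kOhm")
    print(f"R_AP ~ {r_ap / 1e3:.2f} kOhm")
    print(f"I_C0 ~ {i_c0 * 1e6:.1f} uA")
```

Under these assumptions the junction comes out at roughly 6.25 kΩ in the P state and 13.75 kΩ in the AP state, with a critical current of about 91 µA. The compact model includes bias-dependent and stochastic effects that this estimate ignores.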

III. RESISTIVE SWITCHING FULL-ADDER ARCHITECTURE

For a 1-bit full-adder, the inputs are “A”, “B”, “Cᵢ” and the outputs are “SUM”, “Cₒ”, calculated with the following Boolean equations:

\[
SUM = A \oplus B \oplus C_i = A B C_i + A \overline{B}\,\overline{C_i} + \overline{A} B \overline{C_i} + \overline{A}\,\overline{B} C_i \\
C_o = AB + AC_i + BC_i
\]
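The full-adder equations can be checked exhaustively against integer addition. The following minimal sketch uses the canonical sum-of-products form of the three-input XOR and the majority form of the carry; the function names are illustrative only.

```python
from itertools import product


def full_adder_sop(a, b, ci):
    """Sum-of-products form of the 1-bit full adder (inputs are 0 or 1)."""
    na, nb, nci = 1 - a, 1 - b, 1 - ci
    # SUM is 1 when an odd number of inputs is 1.
    s = (a & b & ci) | (a & nb & nci) | (na & b & nci) | (na & nb & ci)
    # The carry is 1 when at least two inputs are 1.
    co = (a & b) | (a & ci) | (b & ci)
    return s, co


def check_truth_table():
    """Compare the Boolean form with integer addition for all 8 input cases."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder_sop(a, b, ci)
        total = a + b + ci
        assert s == total % 2 and co == total // 2
        assert s == a ^ b ^ ci
    return True


if __name__ == "__main__":
    for a, b, ci in product((0, 1), repeat=3):
        print(a, b, ci, "->", full_adder_sop(a, b, ci))
    check_truth_table()
```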

In the proposed architecture, input “A” is volatile computing data and input “B” is critical or quasi-constant data. MOS transistors and RS-NVM cells represent “A” and “B”, respectively, in the addition. The switching frequency of “B” is much lower than that of “A” and “Cᵢ”. The RS-NVFA architecture is composed of three parts: (i) a PCSA stage to evaluate the logic value of the RS-NVFA outputs; (ii) a write logic block to program the RS memory cells; (iii) a CMOS logic tree including the RS cells, which implements the sum and carry functions. The general architecture is depicted in Fig. 3.

To evaluate the addition logic function, the PCSA circuit provides the best sensing reliability and power efficiency while keeping high-speed performance (e.g. 200 ps) [11]. As shown in Fig. 4, it consists of a pre-charge sub-circuit (MP₀₃₄,7), a discharge sub-circuit (MN₁₆₁₇) and a pair of inverters (MN₆₁₂₃₄ and MP₁₂₅₆₇), which act as a current sense amplifier. The PCSA works in two phases. During the first phase, “CLK” is ‘0’: the RS-NVFA outputs (“SUM”, “SUMb” for the sum output and “Cₒ”, “Cₒb” for the carry output) are pulled up to “VDD”, i.e. logic ‘1’, through MP₀₃₄,7 while MN₁₆₁₇ remains off. During the second phase, “CLK” is ‘1’: MP₀₃₄,7 are turned off and MN₁₆₁₇ are on. In these conditions, the RS-NVFA outputs are pulled down through the logic tree (MN₆₁₅) and the RS cells (B, B̅). Depending on the MOS states in the logic tree and the RS element states, the discharge currents differ between the two branches, and the current sense amplifier latches opposite logic values on “SUM”, “SUMb” and on “Cₒ”, “Cₒb”, respectively. Since the RS-NVFA outputs are evaluated through a differential process, the logic functions are implemented in two identical branches. However, in each branch, the MOS transistors are controlled with complementary values and the RS cells are programmed in opposite resistance states (i.e. $R_{HRS}$ and $R_{LRS}$).
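The two-phase operation of the PCSA can be summarized with a purely behavioral sketch. It is not a circuit simulation: it assumes an idealized sense amplifier in which the branch with the lower total resistance discharges faster and therefore latches to ‘0’ while the opposite output stays at ‘1’. The function name `pcsa` and its arguments are hypothetical.

```python
def pcsa(clk, r_branch, r_branch_b):
    """Return (out, out_b) of an idealized pre-charge sense amplifier.

    clk = 0: pre-charge phase, both outputs are pulled up to logic 1.
    clk = 1: evaluation phase; the lower-resistance branch is assumed to
             discharge faster, so its output latches to 0 and the
             complementary output stays at 1.
    """
    if clk == 0:
        return 1, 1
    if r_branch == r_branch_b:
        # A real amplifier would resolve this case through noise or mismatch.
        raise ValueError("branch resistances must differ for a defined decision")
    return (0, 1) if r_branch < r_branch_b else (1, 0)


if __name__ == "__main__":
    r_low, r_high = 1.1e3, 35.1e3      # example resistances in ohm
    print(pcsa(0, r_low, r_high))      # pre-charge phase
    print(pcsa(1, r_low, r_high))      # evaluation phase
    print(pcsa(1, r_high, r_low))      # evaluation with swapped branches
```

The model ignores the transistor resistances of the logic tree; in the actual circuit these add to each branch, and the decision depends on the total series resistance of each path.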

The RS devices are serially connected with a common central point. To generate the bi-directional currents needed to program the complementary RS cells, a write logic block composed of pass transistors (MN₁₆₂₃ and MP₆₅₁₃, see Fig. 4) is introduced. The pass transistors are connected respectively to the bottom (BE) and top electrodes (TE) of the serial branch and to the common point.
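The complementary storage of “B” can be illustrated with a small data-structure sketch. It only captures the logical invariant that the two serially connected cells always hold opposite resistance states; the mapping between the stored bit and the cell states, as well as the class and attribute names, are assumptions made for illustration. The resistance values are the OxRRAM figures quoted in Section IV.

```python
R_LRS = 1.1e3    # ohm, low-resistance state quoted for the OxRRAM model
R_HRS = 35.1e3   # ohm, high-resistance state quoted for the OxRRAM model


class ComplementaryPair:
    """Two RS cells in series that always store opposite resistance states."""

    def __init__(self, b=0):
        self.write(b)

    def write(self, b):
        """Program bit b into the pair (one cell is set, the other reset)."""
        if b not in (0, 1):
            raise ValueError("b must be 0 or 1")
        # Hypothetical convention: B = 1 puts cell B in LRS and cell B-bar in HRS.
        self.r_b, self.r_bb = (R_LRS, R_HRS) if b else (R_HRS, R_LRS)

    def read(self):
        """Differential read: compare the two cells instead of using a reference."""
        return 1 if self.r_b < self.r_bb else 0


if __name__ == "__main__":
    pair = ComplementaryPair()
    for bit in (1, 0, 1):
        pair.write(bit)
        print(bit, "->", pair.r_b, pair.r_bb, "read back:", pair.read())
```

Storing the bit differentially is what lets the sense amplifier compare two branches directly, at the cost of two cells per stored bit.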

![Figure 2](image1.png)

Figure 2. (a) OxRRAM memory stack with metal electrodes sandwiching an interfacial layer (IL) and an active material layer. (b) Simulated current-voltage characteristic of a bipolar OxRRAM memory cell.

![Figure 3](image2.png)

Figure 3. General architecture of the RS-NVFA, composed of a PCSA, a CMOS tree for volatile data and non-volatile memory cells for non-volatile data.
The simulation results obtained with our compact models show that, for OxRRAM cells, the switching time is about 2 ns. Using our STT-MRAM and CMOS circuits, the data sensing and electroforming of OxRRAM devices can be efficiently handled. Note that, compared with the previous structures [12-14], this new structure requires no capacitance for data sensing and no magnetic field for data programming. This design is therefore suitable for advanced technology nodes below 90 nm and allows efficient area minimization.

IV. RS-NVFA VALIDATION WITH BIPOLAR STT-MRAM AND OXRRAM CELLS

This section presents transient simulation results to prove the architectural concept of RS-NVFA using our NVM compact models, shown in section II, and CMOS STMicroelectronics 40 nm design kit [29]. In the simulations, the size of transistors in PCSA and MOS tree is minimal while write circuit transistors are designed to reach SET and RESET states of RS cells.

Using our STT-MRAM compact model [20], we simulate the RS-NVFA architecture shown in Fig. 4. Fig. 5 shows the transient simulation results of the hybrid circuit. “CLK” = ‘0’ pre-charges the outputs “SUM” and “Co” to “VDD”, i.e. logic ‘1’; the outputs are then evaluated when “CLK” is set to ‘1’. The behavior of the outputs (“SUM” and “Co”) agrees with the addition function over the whole truth table. For instance, with “A” = ‘1’, “B” = ‘0’, “CI” = ‘1’, “SUM” is ‘0’ and the carry is ‘1’; with “A” = ‘1’, “B” = ‘0’, “CI” = ‘0’, “SUM” is ‘1’ and no carry is generated. Note that the switching duration of input “B” for an STT-MRAM cell is about 2 ns [20].

Similarly, the functionality of the RS-NVFA circuit was checked through transient simulations with OxRRAM cells and the CMOS STMicroelectronics 40 nm design kit [29] (Fig. 6). For OxRRAM cells, the switching time depends on the write voltage dynamics. The simulation results obtained with our model [27] give a switching time of less than 3 ns and resistance values $R_{HRS}$ = 35.1 kΩ and $R_{LRS}$ = 1.1 kΩ. The outputs “SUM” and “Co” confirm the complete full-adder function. The sense delay is less than 60 ps for both outputs, showing the high-speed performance of the OxRRAM-based NVFA.

Table III summarizes the performance comparison between the three simulations. The delay time and dynamic power of the RS-NVFAs (with STT-MRAM and OxRRAM cells) are comparable to those of the CMOS FA taken from the STMicroelectronics standard cell library [29]. The RS-NVFA is advantageous in terms of standby power, since it can be powered off completely during the “idle” state. Therefore, although its energy-delay product (EDP) exceeds that of the CMOS full-adder by ~30%, the RS-NVFA could greatly reduce the overall consumption. Thanks to 3D integration, the die area of our design (38 MOS + 4 RS cells) is more compact than that of the CMOS full-adder (46 MOS: 28 for the asynchronous full-adder + 18 for synchronization), as the RS memories sit on top of the CMOS circuits. This integration also allows the elimination of the dynamic power dedicated to data transfer (1 pJ/mm/bit at 22 nm [2,30]) between logic units and memory array, as the distance between memory and computing unit shrinks from a few mm for CMOS logic circuits to a few μm [30].
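The EDP comparison can be reproduced approximately from the delay and dynamic power figures of Table III. The sketch below assumes that the energy per operation equals the dynamic power divided by the 500 MHz clock frequency and that the EDP is this energy multiplied by the delay; the paper does not spell out its exact EDP definition, so this is one plausible reading. The names used are illustrative.

```python
# Figures from Table III: (delay in ps, dynamic power in uW at 500 MHz).
TABLE_III = {
    "CMOS-only FA": (75.0, 2.17),
    "STT-MRAM NVFA": (87.4, 2.44),
    "OxRRAM NVFA": (52.3, 3.50),
}
F_CLK_HZ = 500e6  # clock frequency at which the dynamic power is reported


def edp_joule_second(delay_ps, power_uw, f_hz=F_CLK_HZ):
    """Assumed definition: energy per cycle (P / f) multiplied by the delay."""
    energy_j = power_uw * 1e-6 / f_hz
    return energy_j * delay_ps * 1e-12


def edp_relative_to_cmos():
    """EDP of each design normalized to the CMOS-only full adder."""
    ref = edp_joule_second(*TABLE_III["CMOS-only FA"])
    return {name: edp_joule_second(*v) / ref for name, v in TABLE_III.items()}


if __name__ == "__main__":
    for name, ratio in edp_relative_to_cmos().items():
        print(f"{name}: EDP = {ratio:.2f} x CMOS")
```

With this definition the STT-MRAM variant comes out about 31% above the CMOS cell, in line with the ~30% quoted above, and the OxRRAM variant about 12% above it.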
As mentioned above, a critical idea of the design shown in Fig. 4 is to program the RS-NVM at a frequency (e.g. 1 kHz) much lower than the computing frequency (e.g. 1 GHz). The switching power for non-volatile storage thereby becomes negligible compared with the other contributions to power consumption in a full system.
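The effect of the low programming frequency can be quantified with a one-line relation: the average power spent on writes is the energy per write multiplied by the number of writes per second. The sketch below assumes a constant energy per write, which cancels in the ratio, so no device-specific write energy is needed; the function names are illustrative.

```python
def average_switching_power(e_write_j, f_prog_hz):
    """Average write power: energy per write times writes per second."""
    return e_write_j * f_prog_hz


def reduction_factor(f_compute_hz=1e9, f_prog_hz=1e3):
    """Write-power reduction versus rewriting "B" at the computing frequency."""
    e_write = 1.0  # placeholder energy per write; it cancels in the ratio
    p_every_cycle = average_switching_power(e_write, f_compute_hz)
    p_low_rate = average_switching_power(e_write, f_prog_hz)
    return p_every_cycle / p_low_rate


if __name__ == "__main__":
    print(f"reduction factor: {reduction_factor():.0e}")
```

For the example frequencies given above (1 kHz programming, 1 GHz computing), the average switching power is six orders of magnitude lower than if “B” were rewritten on every computing cycle.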

V. CONCLUDING REMARKS

This paper presents a generic design of RS-NVFA based on complementary RS memory cells. Our RS-NVFA circuit is suitable for advanced technological nodes as there is neither capacitance nor magnetic fields compared to previous structures [11-15]. This architecture enables scaling down the die area and reducing the power as there is nearly zero standby power and low data transfer energy. It can be very useful for normally-off electronics [19]. Using our STT-MRAM and OxRRAM compact models, RS-NVFAs were successfully simulated on 40 nm node and demonstrated their functionality and performance gain.

ACKNOWLEDGMENT

The authors acknowledge support from French research agencies through projects NANOINNOV-SPIN, CNRS-PEPS-NVCPU, ANR-MARS and ANR-DIPMEM.

REFERENCES


TABLE III

<table>
<thead>
<tr>
<th>Performance</th>
<th>CMOS-ONLY FA</th>
<th>STT-MRAM NVFA</th>
<th>OxRRAM NVFA</th>
</tr>
</thead>
<tbody>
<tr>
<td>Delay time</td>
<td>75 ps</td>
<td>87.4 ps</td>
<td>52.3 ps</td>
</tr>
<tr>
<td>Dynamic power @ 500 MHz</td>
<td>2.17 µW</td>
<td>2.44 µW</td>
<td>3.50 µW</td>
</tr>
<tr>
<td>Standby power</td>
<td>71 nW</td>
<td>~0</td>
<td>~0</td>
</tr>
<tr>
<td>Data transfer energy</td>
<td>&gt;1 pJ</td>
<td>&lt;1 fJ</td>
<td>&lt;1 fJ</td>
</tr>
<tr>
<td>Die area</td>
<td>46 MOS</td>
<td>38 MOS + 4 MTJs</td>
<td>38 MOS + 4 OxRRAMs</td>
</tr>
</tbody>
</table>
