# A Fully Integrated Multistage Cross-Coupled Voltage Multiplier With No Reversion Power Loss in a Standard CMOS Process

Xiaojian Yu, Student Member, IEEE, Kambiz Moez, Senior Member, IEEE, I-Chyn Wey, Member, IEEE, Mohamad Sawan, Fellow, IEEE, and Jie Chen, Fellow, IEEE

Abstract—This brief presents a fully integrated cross-coupled voltage multiplier for boosting dc-to-dc converter applications. The proposed design applies a new cross-coupled voltage doubler (CCVD) structure and a clock scheme that together eliminate all of the reversion power loss and increase the power efficiency (PE). In addition, the design is scalable to a multiple-stage voltage doubler (voltage multiplier) because the maximum gate-to-source/drain or drain-to-source voltage does not exceed the nominal power supply $V_{dd}$. As a result, the design is compatible with a standard CMOS process without any voltage overstress. The proposed single-stage CCVD and three-stage cross-coupled voltage multiplier are implemented in the IBM 0.13-$\mu$m CMOS process with maximum PE values of 88.16% and 80.2%, respectively. The maximum voltage conversion efficiency reaches 99.8% under a supply voltage of 1.2 V.

*Index Terms*—Cross-coupled voltage doubler, dc-dc converter, reversion power loss, switched capacitor (SC).

## I. INTRODUCTION

WITH the trend of integrating different modules on a monolithic system-on-chip, the demand for integrated power management with multiple output voltages is ever increasing. For applications such as EEPROMs, SRAMs, LCD drivers, and ultrasonic transducer drivers, a higher voltage supply is required to provide enough driving capability. Among the boost converter topologies, switched-capacitor (SC)-based step-up dc–dc converters are most appropriate for full integration in CMOS technology as they do not need the large inductors required in other topologies [1], [2]. Recently, integrated step-up SC converters (also called charge pumps) have been applied in implantable devices such as microstimulators for visual prostheses [3] and neurostimulators for epilepsy [4].

X. Yu, K. Moez, and J. Chen are with University of Alberta, Edmonton, AB T6G 2R3, Canada (e-mail: xy2@ualberta.ca; kambiz@ualberta.ca; jc65@ualberta.ca).

I.-C. Wey is with the Department of Electronic Engineering, Chang Gung University, Taoyuan 33302, Taiwan (e-mail: ichynwey@gmail.com).

M. Sawan is with the Department of Electrical Engineering, Polytechnique Montréal, Montreal, QC H3T 1J4, Canada (e-mail: mohamad.sawan@polymtl.ca).

Color versions of one or more of the figures in this brief are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSII.2016.2599707


Fig. 1. (a) Conventional CCVD and six reversion loss paths. (b) Overlapped intervals existing at clock transitions in conventional CCVD.

The most commonly used step-up SC converters are Dickson's charge pump [1] and the cross-coupled voltage doubler (CCVD) [5], as shown in Fig. 1. Compared to a conventional charge pump, CCVD has several advantages. First, CCVD reduces the output voltage ripples and output voltage drop if the output buffer capacitors have the same capacitance. Second, in Dickson's charge pump, diodes or diode-connected MOSFETs are used as switches. As a result, the voltage drop across each switch is the threshold voltage  $V_{\rm th}$ . In the CCVD topology, the voltage drop is equal to the drain-to-source voltage  $V_{\rm ds}$ , which is much less than  $V_{\rm th}$ . However, the reversion loss during the clock transition time in this topology severely reduces the power efficiency (PE) [5]. As a result, these reversion losses must be eliminated to improve the performance of the CCVD structure.

The reversion loss mainly comprises three types: output loss, pump loss, and short-circuit loss, manifesting in six power-loss paths [refer to Fig. 1(a)]. All of these power losses appear at the clock transitions due to the overlapped clock signals or the timing mismatch [6], as shown in Fig. 1(b). The output loss is caused by the reverse charge flow from the output capacitor to the flying capacitor during the falling transition of *Clock A* and the rising transition of *Clock B*, and vice versa, which is illustrated as *Path 1* and *Path 4*, respectively, in Fig. 1. Similarly, the pump loss appears when the boosting nodes $V_a$ and $V_b$ transfer charge back to the input source during the rising clock transition intervals of *Clock A* or *Clock B*, which is labeled as *Path 2* and *Path 5* in Fig. 1. The short-circuit power loss is induced by the reverse current from the output capacitor to the input source because $M_{P1}$ and $M_{N1}$ are both turned on for a short time at the clock falling and rising transitions. The short-circuit loss is labeled as *Path 3* and *Path 6* in Fig. 1.

Several techniques have been previously reported to reduce or eliminate the reversion power losses [5]–[8]. However, in these designs, either extra blocking transistors and level shifters are used or potential voltage overstress exists. As a result, extra power is consumed and the PE drops, or the stability of the circuit decreases. In addition, most of the previous designs cannot be scaled to multiple stages due to the use of level shifters or voltage overstress. In this brief, an improved circuit design and control scheme are proposed to eliminate the reversion power losses. Furthermore, the gate-to-source and source-to-drain voltages of each transistor are kept within the nominal supply voltage. As a result, the design can be easily scaled to multiple stages without encountering voltage overstress issues.

Manuscript received June 17, 2016; revised August 2, 2016; accepted August 5, 2016. Date of publication August 11, 2016; date of current version June 23, 2017. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada through Discovery Grants Program and in part by China Scholarship Council (CSC). This brief was recommended by Associate Editor F. Lau.

Fig. 2. (a) Charge-transferring nMOS and pMOS must be driven separately to eliminate reversion loss. (b) Control scheme of nonoverlapping gate drive signals.

## II. PROPOSED CIRCUIT AND CONTROL SCHEME

### A. Single-Stage CCVD Topology

In order to improve the efficiency of CCVD, all six of the reversion loss paths in Fig. 1 must be removed. For instance, to remove the power loss along *Path 1*,  $M_{P1}$  must be turned off before *Clock A* goes low. Similarly,  $M_{N1}$  must be turned off before *Clock A* goes high to remove power loss along *Path 2*. To remove power loss in *Path 3*,  $M_{P1}$  and  $M_{N1}$  cannot conduct at the same time during the clock transition. Since the structure of CCVD is symmetric, the other three loss paths can be removed in the same way.

Fig. 3. (a) Proposed CCVD. (b) Control scheme for the proposed voltage doubler.

Fig. 4. (a) Nonoverlapping clock generation circuit. (b) Schematic of the delay cells. The delay time is adjusted by the capacitor.

The two charge-transfer transistors $M_{N1}$ and $M_{P1}$ must be driven separately to avoid simultaneous conduction at the clock transition and thereby eliminate the short-circuit power loss, producing the circuit shown in Fig. 2(a). The gate drive signals at nodes $V_1$, $V_2$, $G_{P1}$, and $G_{P2}$ should be nonoverlapping, as shown in Fig. 2(b), to eliminate the pump loss and output loss. Based on this control scheme, $V_2$ turns off $M_{N1}$ before CK1 goes high, and $G_{P1}$ turns off $M_{P1}$ before CK1 goes low. Therefore, the reversion losses along *Path 1* and *Path 2* are eliminated. Since $M_{N1}$ and $M_{P1}$ are never turned on at the same time according to the waveforms of nodes $G_{P1}$ and $V_2$, the reversion loss along *Path 3* is also removed. *Path 4* to *Path 6* are likewise removed owing to the symmetry of the circuit.
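The three turn-off conditions stated above can be checked mechanically. The following is a minimal sketch, not the authors' verification flow: it models the conduction window of each switch and the CK1 edges as timestamps within a single clock period and evaluates one condition per loss path. The `Window` class, the function names, and all timing values are illustrative assumptions; only the ordering rules come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Conduction window [start, stop) of one switch, in nanoseconds."""
    start: float
    stop: float


def overlaps(a: Window, b: Window) -> bool:
    """Return True if the two windows share any instant."""
    return a.start < b.stop and b.start < a.stop


def reversion_checks(mn1_on: Window, mp1_on: Window,
                     ck1_rise: float, ck1_fall: float) -> dict:
    """Evaluate the three turn-off conditions for one half of the doubler.

    Assumes all times lie within a single clock period.
    """
    return {
        # Path 1 (output loss): M_P1 must be off before CK1 goes low.
        "path1_removed": mp1_on.stop < ck1_fall,
        # Path 2 (pump loss): M_N1 must be off before CK1 goes high.
        "path2_removed": mn1_on.stop < ck1_rise,
        # Path 3 (short-circuit loss): M_N1 and M_P1 never conduct together.
        "path3_removed": not overlaps(mn1_on, mp1_on),
    }


# Hypothetical timing for one 20-ns period (50 MHz); not taken from the brief.
good = reversion_checks(Window(0.5, 9.0), Window(11.0, 19.0), 10.0, 20.0)
# Same edges, but both switches are released late and overlap each other.
bad = reversion_checks(Window(0.5, 10.5), Window(10.2, 20.3), 10.0, 20.0)
print(good)
print(bad)
```

By symmetry, the same checks applied to $M_{N2}$, $M_{P2}$, and CK2 cover *Path 4* to *Path 6*.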

Based on the previous analysis, an improved design and its clock scheme, which produce the gate drive signals in Fig. 2(b), are shown in Fig. 3(a) and (b), respectively. In this design, the bulks of the nMOS and pMOS transistors are all connected to their sources. Since the bulks of the nMOS transistors are biased at different voltages, triple-well nMOS transistors are used. The four nonoverlapping clocks, namely, CK1, CK2, CKA, and CKB, are applied to generate the four gate drive signals (referred to as $V_1$, $V_2$, $G_{P1}$, and $G_{P2}$) shown in Fig. 2(b). To generate the four nonoverlapping clocks, a clock-generation circuit is presented, as shown in Fig. 4. The delay cells in this circuit are implemented with cascaded inverters and a small capacitor, as shown in Fig. 4(b).

Fig. 5. Proposed three-stage CCVD topology.

In order to avoid any breakdown issues, the voltages $V_{gs}$, $V_{gd}$, and $V_{ds}$ of the transistors must not exceed $V_{dd}$. In the worst case, where $V_{in}$ is equal to $V_{dd}$ and the output is unloaded, both gate drive signals $G_{P1}$ and $G_{P2}$ must swing within a range from $V_{dd}$ to $2V_{dd}$ to turn $M_{P1}$ and $M_{P2}$ on or off. If the CCVD converters are cascaded to make $N$ stages, the drive signals $G_{P1}$ and $G_{P2}$ in the $n$th stage should swing within the range from $nV_{dd}$ to $(n + 1)V_{dd}$. Consequently, the level shifters in [5] and [7] are not applicable to drive $M_{P1}$ and $M_{P2}$ because they can only provide a voltage swing from 0 to $NV_{dd}$, which may cause oxide breakdown if they are used to drive the pMOS transistors $M_{P1}$ and $M_{P2}$ in Fig. 3(a). In the proposed design, two auxiliary clock signals CKA and CKB are connected to two small auxiliary capacitors $C_A$ and $C_B$ to produce driving signals $G_{P1}$ and $G_{P2}$ that swing from $V_{dd}$ to $2V_{dd}$ without using level shifters. As shown in Fig. 3(a), when CKA is low, $V_a$ is charged to $V_{dd}$ and $G_{P1}$ is charged by $V_{out}$, which is equal to $2V_{dd}$. Similarly, when CKA becomes high, $V_a$ is coupled to $2V_{dd}$ and $G_{P1}$ is discharged to $V_{in}$ (i.e., $V_{dd}$). Therefore, $G_{P1}$ swings from $V_{dd}$ to $2V_{dd}$, which does not cause voltage overstress on $M_{P1}$. The driving signal $G_{P2}$ works in the same way as $G_{P1}$. Likewise, in a multiple-stage CCVD, as shown in Fig. 5, the input and output voltages of the $n$th stage are $nV_{dd}$ and $(n+1)V_{dd}$, respectively. As a result, the driving signals $G_{P1}$ and $G_{P2}$ in the $n$th stage swing from $nV_{dd}$ to $(n+1)V_{dd}$, eliminating the overstress of the gate-to-source/drain voltage across the transistors.
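The per-stage swing argument can be illustrated numerically. The sketch below is an idealized calculation under stated assumptions, not a circuit simulation: it takes the gate levels $nV_{dd}$ and $(n+1)V_{dd}$ from the text, assumes the source and drain of the pMOS switch stay between the stage input and output voltages, and ignores load-induced droop. The function names are hypothetical, and `ground_referenced_stress` models a drive whose low level is 0 V purely for comparison.

```python
def pmos_gate_swing(stage: int, vdd: float) -> tuple:
    """Ideal (low, high) levels of G_P1/G_P2 in the given stage (1-based)."""
    if stage < 1:
        raise ValueError("stage index starts at 1")
    return stage * vdd, (stage + 1) * vdd


def worst_case_gate_stress(stage: int, vdd: float) -> float:
    """Largest ideal |V_gs| or |V_gd| across the stage's pMOS switch.

    Assumes source and drain lie between the stage input n*Vdd and the
    stage output (n+1)*Vdd, as described for the unloaded case.
    """
    low, high = pmos_gate_swing(stage, vdd)
    terminals = (stage * vdd, (stage + 1) * vdd)
    return max(abs(g - t) for g in (low, high) for t in terminals)


def ground_referenced_stress(stage: int, vdd: float) -> float:
    """Stress if the gate were instead pulled to 0 V (assumed comparison case)."""
    return (stage + 1) * vdd


VDD = 1.2  # nominal supply used in the brief
for n in (1, 2, 3):
    proposed = round(worst_case_gate_stress(n, VDD), 3)
    grounded = round(ground_referenced_stress(n, VDD), 3)
    print(f"stage {n}: proposed {proposed} V, ground-referenced {grounded} V")
```

Under these assumptions the stress of the proposed drive stays at $V_{dd}$ in every stage, whereas a ground-referenced drive grows with the stage index.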

The proposed control scheme applies four nonoverlapping clock signals to remove all of the power loss, whereas the conventional two nonoverlapping clock signals can only reduce the reversion loss [5], [8]. In addition, by applying the proposed control scheme and revised circuit, level shifters or extra blocking transistors [5]–[7] are not required.

### B. Multistage CCVD Topology

The single-stage CCVD can be cascaded to obtain a higher output voltage. Ideally, $N$ cascaded CCVD stages provide an output voltage of $(N + 1)V_{dd}$ when no load is applied at the output. Fig. 5 shows a three-stage CCVD design based on the proposed single-stage CCVD, in which three CCVDs are cascaded. Since the reversion loss in every stage has been removed, this design has a higher PE and a higher voltage conversion efficiency than the conventional design.
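As a quick illustration of the ideal unloaded relation $(N + 1)V_{dd}$, the sketch below tabulates the output of each cascaded stage. It is a simple calculation under the no-load, lossless assumption stated in the text; the function names are hypothetical.

```python
def ideal_output_voltage(n_stages: int, vdd: float) -> float:
    """Ideal unloaded output of an N-stage cascade: (N + 1) * Vdd."""
    if n_stages < 1:
        raise ValueError("at least one stage is required")
    return (n_stages + 1) * vdd


def ideal_stage_outputs(n_stages: int, vdd: float) -> list:
    """Ideal unloaded output voltage after each stage, first to last."""
    return [ideal_output_voltage(k, vdd) for k in range(1, n_stages + 1)]


VDD = 1.2  # nominal supply used in the brief
for stage, v in enumerate(ideal_stage_outputs(3, VDD), start=1):
    print(f"after stage {stage}: {v:.1f} V")
```

With a 1.2-V supply, the three-stage cascade ideally ends at 4.8 V, the reference value used later when the conversion efficiency is reported.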

A similar cascaded multiple-stage CCVD was proposed by Ker *et al.* [9]. However, the reversion losses are not eliminated in Ker's design. As a result, in Ker's design, the output voltage drop is larger than in our design, and the PE is also lower. In addition, although the gate-to-source or gate-to-drain voltage of each transistor does not exceed $V_{dd}$ in Ker's design, the source-to-drain voltage $V_{ds}$ may still exceed $V_{dd}$ during the clock transitions or under a slight timing mismatch. However, in the proposed design and control scheme shown in Figs. 3(b) and 5, the transistor $M_{P1}$ in the first stage and the transistor $M_{N2}$ in the second stage are both turned off before CK2 goes high and CK1 goes low. As a result, the source-to-drain voltage drop across both $M_{P1}$ and $M_{N2}$ does not exceed $V_{dd}$. Therefore, the proposed design is more robust than Ker's design when both the gate-oxide and source-to-drain reliabilities are taken into consideration. In addition, an output capacitor (i.e., $C_{out1}$ and $C_{out2}$ in Fig. 5) can be added at the end of each stage to obtain multiple outputs with different voltage levels.

Fig. 6. Output voltage of different CCVD designs under different loads.

## III. VERIFICATION AND DISCUSSION

### A. Single-Stage CCVD Simulation Results

The proposed single-stage voltage doubler is simulated and implemented in the 0.13-$\mu$m CMOS technology. In the proposed design, both of the flying capacitors and the output capacitor are 100 pF, and the two auxiliary capacitors are 5 pF. To ensure a fair comparison of the proposed design with other designs, all of the flying capacitors, output capacitors, and transferring switches in these designs are chosen with the same specifications. In the designs in [6] and [11], the auxiliary capacitors are also set to 5 pF. Fig. 6 compares the output voltages of the proposed design and other designs under different loads. As the design is dedicated to low-power applications (around 1–10 mW), the load is swept from 1 to 10 k$\Omega$. Compared with [8], the output voltage of the proposed design is up to 45 mV higher. The design in [11] performs well when the load resistance is high, but its performance drops dramatically when the load resistance becomes small. The proposed design works better than the other designs when the load is larger than 2 k$\Omega$. In addition, the proposed design shows a significant improvement in PE compared with previous designs due to the complete elimination of the reversion power loss. Fig. 7 shows the PE of different designs under different loads with a power supply of 1.2 V. The maximum PE of the proposed design under a load ranging from 1 to 10 k$\Omega$ is 88.55%, whereas the maximum PE values of the designs in [6], [8], and [11] and of the conventional design are 84.44%, 73.2%, 74.45%, and 74.5%, respectively. Furthermore, the PE of the proposed design stays higher than 80% over a large range of loads, whereas the PE of the other designs drops more quickly.

Fig. 7. Comparison of the PE of different designs under different load.

Fig. 8. Output voltages and power efficiency of different designs of 3-stage CCVD under different load resistance.

### B. Multistage CCVD Simulation Results

The three-stage CCVD has been implemented using IBM's 130-nm process. Each stage uses two main flying capacitors of 100 pF and two auxiliary capacitors of 5 pF, and the output capacitor is 200 pF. The clock frequency is set to 50 MHz. Two other three-stage voltage doublers based on the designs in [9] and [10], with the same transistor and capacitor sizes, are also simulated and compared with the proposed design.

Fig. 9. Chip micrograph of the single-stage and multiple-stage CCVD.

Fig. 10. Measured and simulated output voltage ($V_{\rm out}$) and power efficiency (PE) of single-stage CCVD.

The maximum unloaded output voltage of the proposed design is 4.79 V, which is 99.8% of the ideal output voltage (i.e., 4.8 V). The maximum unloaded output voltage of the design in [9] is 4.67 V, or a ratio of 97.3%. The quiescent power consumption values of the proposed design and [9] are 479 $\mu$W and 3.792 mW, respectively. Fig. 8 compares the simulated output voltages and power efficiency of the proposed design and the designs in [9] and [10] under different loads. The output voltage of the proposed design is up to 120 mV higher than that in [9]. As shown in Fig. 8, the PE of the proposed design is also much higher than that of the designs in [9] and [10]. The simulation results show that the proposed design improves the PE significantly by eliminating the reversion loss. The maximum PE values of the proposed design, [9], and [10] are 81.02%, 69.29%, and 67.18%, respectively. In addition, the PE of the proposed design is around 80% over a large range of load resistance. However, the designs based on [9] and [10] have an efficiency lower than 50% over a wide range of load resistance, especially at larger load resistances. Under a small load resistance, both designs have a low output voltage due to the relatively small on-chip capacitors. The PE also drops at small load resistance because the low output voltage in each stage prevents the transistors from being completely turned on or off. As a result, the on-resistance and leakage current of the transistors increase and thus significantly decrease the PE. However, the proposed design achieves a high PE when the load is higher than 1 k$\Omega$ and thus performs much better than the conventional designs.
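The unloaded conversion-efficiency figures quoted in this subsection follow from dividing each reported output voltage by the ideal three-stage output of 4.8 V. The sketch below reproduces that arithmetic; the function name is hypothetical, and the inputs are the simulated values reported above.

```python
def voltage_conversion_efficiency(v_out: float, n_stages: int, v_in: float) -> float:
    """Ratio of the actual output to the ideal output (N + 1) * V_in, in percent."""
    ideal = (n_stages + 1) * v_in
    return 100.0 * v_out / ideal


V_IN = 1.2   # supply voltage in volts
STAGES = 3   # three-stage multiplier

proposed = voltage_conversion_efficiency(4.79, STAGES, V_IN)
ker_based = voltage_conversion_efficiency(4.67, STAGES, V_IN)
print(f"proposed design: {proposed:.1f}%")
print(f"design in [9]:   {ker_based:.1f}%")
```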

### C. Experimental Verification

To verify the effectiveness of the proposed circuit, the single- and three-stage CCVD structures have been implemented using IBM's 0.13-$\mu$m process with active areas of 0.37 mm$^2$ and 1.07 mm$^2$, respectively, as shown in Fig. 9. All of the capacitors, including flying capacitors, auxiliary capacitors, and output capacitors, are implemented with on-chip MIM capacitors. Compared with a MOS capacitor, a MIM capacitor has a smaller bottom-plate capacitance ratio and lower parasitic capacitance and resistance [2]; moreover, it can withstand a much higher voltage across the capacitor plates. In our design, the total capacitance of the single-stage CCVD is 410 pF, including two 100-pF flying capacitors, two 5-pF auxiliary capacitors, and one 200-pF output capacitor. The total capacitance of the three-stage CCVD is 830 pF, including six 100-pF flying capacitors, six 5-pF auxiliary capacitors, and one 200-pF output capacitor.
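The capacitance totals above can be reproduced with a short tally. This is only a bookkeeping sketch: it assumes two flying and two auxiliary capacitors per stage plus a single output capacitor, as listed in the text, and the function name and default values are illustrative.

```python
def total_capacitance_pf(n_stages: int,
                         c_fly_pf: float = 100.0,
                         c_aux_pf: float = 5.0,
                         c_out_pf: float = 200.0) -> float:
    """Total on-chip capacitance in pF for an N-stage CCVD."""
    flying = 2 * n_stages * c_fly_pf      # two flying capacitors per stage
    auxiliary = 2 * n_stages * c_aux_pf   # two auxiliary capacitors per stage
    return flying + auxiliary + c_out_pf  # one shared output capacitor


print(total_capacitance_pf(1))  # single-stage CCVD
print(total_capacitance_pf(3))  # three-stage CCVD
```

The flying-plus-auxiliary subtotals (210 pF and 630 pF) correspond to the "Flying capacitor" row of Table I.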

The implemented nonoverlapping clock signals operate at 50 MHz, and the measurement sweeps the load from 1 to 10 k$\Omega$ with an input voltage of 1.2 V. Fig. 10 compares the measured and simulated output voltages, as well as the simulated and measured PE values, of the single-stage CCVD. The measurement results are slightly lower than the simulated results because of the parasitic capacitance and resistance of the chip and the printed circuit board layout. The maximum PE, around 88.16%, is achieved at a load of 2–3 k$\Omega$. The trend of the measured PE curve is also consistent with the simulation results. Fig. 11 shows the measured and simulated output voltages of the three-stage CCVD with a nominal input voltage of 1.2 V. The clock signals also operate at around 50 MHz. The measured voltage matches the simulation results very well. The measured PE is also shown in Fig. 11. The peak efficiency, around 80.2%, is achieved at a load of 5 k$\Omega$. The PE remains around 80% as the load ranges from 2 to 6 k$\Omega$.

Fig. 11. Measured and simulated output voltage ($V_{\rm out}$) and power efficiency (PE) of three-stage CCVD.

TABLE I Performance Comparison

|                              | [6]     | [8]      | [9]     | [10]    | This work (1-stage) | This work (3-stage) |
|------------------------------|---------|----------|---------|---------|---------------------|---------------------|
| CMOS process (nm)            | 46      | 350      | 350     | 45      | 130                 | 130                 |
| Number of stages             | 1       | 1        | 3       | 2       | 1                   | 3                   |
| Max voltage conversion ratio | 1.96    | >1.98    | 3.75    | 2.94    | 1.99                | 3.99                |
| Max power efficiency (%)     | 57      | 96.5     | 69.3*   | N/A     | 88.16               | 80.2                |
| Capacitor type               | on-chip | off-chip | on-chip | on-chip | on-chip             | on-chip             |
| Flying capacitor (pF)        | 48      | 2000     | 12      | 2100    | 210                 | 630                 |
| Output capacitor (pF)        | 24      | 2000     | N/A     | 500     | 200                 | 200                 |
| Switching frequency (MHz)    | 10      | 0.2      | N/A     | 60      | 50                  | 50                  |
| Area (mm$^2$)                | 0.03    | 0.49     | N/A     | N/A     | 0.37                | 1.07                |

\* Simulated power efficiency

A performance comparison is also listed in Table I. The proposed one-stage CCVD has a higher efficiency than [6] but a lower efficiency than [8]. The reason is that off-chip capacitors are used in [8], and their parasitic capacitance is much lower than that of on-chip capacitors. Compared with [9] and [10], the proposed three-stage voltage multiplier performs much better in both the voltage conversion ratio and the PE due to the elimination of the reversion power loss. For a fairer comparison, all of these designs were also implemented and simulated in the 0.13-$\mu$m CMOS process, as reported in Section III. In order to obtain a fixed output voltage and maintain the PE at different loads, a feedback regulation loop, such as PFM regulation or hysteretic control, is necessary in future designs. In addition, an interleaving technique is usually applied to reduce the output voltage ripples, at the cost of chip area. Since the designs in this brief are proposed to verify the effectiveness of the technique to remove the reversion power loss and eliminate possible voltage overstress of the CCVD structure, a regulation loop is not provided in this design.

## IV. CONCLUSION

In this brief, a new fully integrated cross-coupled voltage multiplier and control scheme have been proposed. The presented structure eliminates all reversion power losses by using four nonoverlapping clock signals and small auxiliary capacitors. A single-stage CCVD (1.2–2.4 Vdc) and a three-stage voltage multiplier (1.2–4.8 Vdc) were implemented in a 0.13-$\mu$m standard CMOS process with a nominal supply voltage of 1.2 V. The simulation and measurement results show a higher output voltage and PE compared with the latest reports. The measured maximum PE values of the proposed single-stage CCVD and multiple-stage cross-coupled voltage multiplier are 88.16% and 80.2%, respectively. Since this design does not require any extra level shifter circuit and neither the gate-to-source/drain voltage nor the drain-to-source voltage exceeds the nominal supply voltage, it is scalable to a multiple-stage voltage multiplier to obtain higher output voltages and is compatible with a low-voltage standard CMOS process.

## REFERENCES

- [1] J. F. Dickson, "On-chip high-voltage generation in MNOS integrated circuits using an improved voltage multiplier technique," *IEEE J. Solid-State Circuits*, vol. SC-11, no. 3, pp. 374–378, Jun. 1976.
- [2] L. Hanh-Phuc, "Fully integrated power conversion and the enablers," in Proc. IEEE 9th Int. Conf. Power Electron. ECCE Asia, 2015, pp. 1778–1783.
- [3] G. Kar and M. Sawan, "Low-power high-voltage charge pumps for implantable microstimulators," in *Proc. IEEE Int. Symp. Circuits Syst.*, 2012, pp. 2247–2250.
- [4] C.-Y. Lin, W.-L. Chen, and M.-D. Ker, "Implantable stimulator for epileptic seizure suppression with loading impedance adaptability," *IEEE Trans. Biomed. Circuits Syst.*, vol. 7, no. 2, pp. 196–203, Apr. 2013.
- [5] H. Lee and P. K. T. Mok, "Switching noise and shoot-through current reduction techniques for switched-capacitor voltage doubler," *IEEE J. Solid-State Circuits*, vol. 40, no. 5, pp. 1136–1146, May 2005.
- [6] J.-Y. Kim, Y.-H. Jun, and B.-S. Kong, "CMOS charge pump with transfer blocking technique for no reversion loss and relaxed clock timing restriction," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 56, no. 1, pp. 11–15, Jan. 2009.
- [7] F. Su, W.-H. Ki, and C.-Y. Tsui, "High efficiency cross-coupled doubler with no reversion loss," in *Proc. IEEE Int. Symp. Circuits Syst.*, 2006, pp. 2761–2764.
- [8] T. W. Mui, M. Ho, K. H. Mak, J. Guo, H. Chen, and K. N. Leung, "An area-efficient 96.5%-peak-efficiency cross-coupled voltage doubler with minimum supply of 0.8 V," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 61, no. 9, pp. 656–660, Sep. 2014.
- [9] M. D. Ker, S. L. Chen, and C. S. Tsai, "Design of charge pump circuit with consideration of gate-oxide reliability in low-voltage CMOS processes," *IEEE J. Solid-State Circuits*, vol. 41, no. 5, pp. 1100–1107, May 2006.
- [10] L. F. New, Z. Aziz, and M. F. Leong, "A low ripple CMOS charge pump for low-voltage application," in *Proc. IEEE 4th Int. Conf. Intell. Adv. Syst.*, 2012, pp. 784–789.
- [11] P. Ngo and D. Ma, "Integrated switched-capacitor voltage doubler with clock transition periods boosting and transfer blocking techniques," in *Proc. IEEE Annu. Appl. Power Electron. Conf. Expo.*, 2010, pp. 813–817.