

Scientific Journal of Impact Factor (SJIF):

International Journal of Advance Engineering and Research Development

## Volume 4, Issue 10, October -2017

# Low PDP Area efficient Domino logic gates

P. Koti Lakshmi<sup>1</sup> and Prof. Rameshwar Rao<sup>2</sup>

<sup>1</sup> Assistant Professor, Dept. of ECE, UCE, Osmania University, Hyderabad. <sup>2</sup> Professor (Retd), Dept. of ECE, UCE, Osmania University, Hyderabad.

**Abstract-** The demand for high speed, low power and small chip area with considerable noise immunity has become the need of this modern era, with technology scaling down to the nanometer range. In this paper we present the analysis and comparison of our proposed domino design with other classic and noise immune domino designs. The power delay product of the proposed style is 10% less than that of the footed domino technique and 23% less than that of the PS technique, and the proposed technique shows a reduction of 22% in area delay product.

Key words - Low PDP, Area Delay Product, Area efficient design, domino logic gates.

## I. INTRODUCTION

As technology scales down, power density is of major concern, especially in deep sub-micron technologies. For applications such as mobile computing, digital signal processing and multimedia, where the amount of computation is large, low power designs are preferred. Dynamic designs are preferred for such applications due to their low power consumption and smaller silicon area. The dynamic power component, which is dominant, is generally reduced by lowering the supply voltage. Over the generations of technologies, supply and threshold voltages were traded off to design low power, high performance circuits. As the device dimensions, logic levels, supply voltages and thresholds are scaled down, designs become vulnerable to noise. To improve the noise immunity of the circuit, many design techniques were presented in which the thresholds are dynamically varied, which in turn affects one or more of the other parameters such as power, area and delay of the design. Thus a design which can optimize all the parameters of concern is the need of the day. We propose a design with improved performance and noise immunity that maintains the same silicon area requirement as other noise immune designs, with a small overhead of power and area compared to the classic footless and footed domino techniques.

## II. BACKGROUND

With the aim of reducing power consumption and delay, footer and keeper transistors are added to the basic domino gate. The footer is an NMOS transistor added between the pull down network and ground. It is meant to retain the state of the output node during the pre-charge phase, reducing leakage current and thus leakage power, and to raise the source voltage of the pull down network during the evaluation phase, providing noise immunity to the system. The keeper is a PMOS device connected between the output node and the supply voltage; it charges the output node to logic 1 ($V_{DD}$) and retains this state of the output during the pre-charge phase. The size of the keeper directly affects the power dissipation, delay and noise immunity of the circuit [2]. There is always a trade-off among delay, power and noise immunity, depending on the application. Thus a domino design with low power dissipation, low delay, considerable noise immunity and little area overhead is needed to enhance the overall performance of the system.

#### 2.1. Footless Domino With Keeper

The basic domino logic stage consists of logic realized using NMOS transistors in the pull down network and a pull up network consisting of a single PMOS transistor ($M_p$) that pre-charges the dynamic node to logic high, as shown in Figure 1. The dynamic node drives a static inverter, from which the gate output is taken and can be connected to the NMOS input of the next stage [5].

The footless scheme is characterized by fast discharge of the dynamic node (Dyn\_node). The circuit operates in two phases, pre-charge and evaluation. During the pre-charge phase (clock = 0), the dynamic node is pre-charged to $V_{DD}$ through the pre-charge transistor ($M_p$). The keeper supplies the charge lost due to leakage through the pull down network, thus retaining the state of the dynamic node.



Figure 1. Footless Domino gate with conditional Keeper

Figure 2. Basic Domino with Footer and Keeper

During the evaluation phase (clock = 1), the dynamic node either retains its charge or discharges to ground, depending on the state of the pull down network (the logic implemented and the input combination applied). The size of the keeper plays an important role: it should be large enough to compensate for the charge sharing problem, yet small enough to limit the contention between the keeper and the NMOS pull down network when the PDN evaluates the dynamic node to logic zero.

#### 2.2. Footed Domino With Keeper

The basic (footed) domino logic family retains the two phases of operation, and a single clock controls both the pre-charge and the evaluation phase. The circuit incorporates a static CMOS buffer into each logic gate. An additional footer transistor ($M_n$) is added to the circuit, as shown in Figure 2. When the clock goes low, the dynamic node charges to $V_{DD}$, and because the footer transistor $M_n$ is OFF, the charge on the dynamic node is held irrespective of the input combination applied to the pull down network.



Figure 3. PS Mehar Technique

Figure 4. Proposed Domino 2-input OR gate

Thus the output (static inverter output) goes to logic 0 during this interval (pre-charge phase). When the clock goes high (evaluation phase), the pre-charge transistor ($M_p$) turns off, allowing the dynamic node to settle to a state determined by the inputs.

Based on the logic implemented, the charge on the dynamic node may be retained at logic 1, so that the output remains at logic 0, or the dynamic node may be discharged to logic 0, so that the output rises to logic 1. During the evaluation phase, when all the inputs are at logic 0, the dynamic node should remain at logic 1; however, leakage current through the PDN may discharge the dynamic node far enough to switch the output inverter to the other state, causing an error.
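The two-phase behaviour described above can be sketched as a small behavioural model. This is only an idealized illustration, assuming an OR-type pull down network and no leakage or charge sharing; the function name `footed_domino_or` and its parameters are hypothetical and are not part of the original design flow.

```python
def footed_domino_or(clock, inputs, dyn_prev=1):
    """Idealized, leakage-free model of a footed domino OR gate (illustrative only).

    clock    -- 0 for the pre-charge phase, 1 for the evaluation phase
    inputs   -- iterable of 0/1 values applied to the pull down network
    dyn_prev -- logic value held on the dynamic node before this phase
    Returns (dyn_node, output).
    """
    if clock == 0:
        # Pre-charge: M_p is ON and the footer M_n is OFF, so inputs are ignored.
        dyn_node = 1
    else:
        # Evaluation: M_p is OFF and the footer is ON. An OR pull down network
        # conducts when any input is high and discharges the dynamic node;
        # otherwise the node keeps the value it held before.
        pdn_conducts = any(inputs)
        dyn_node = 0 if pdn_conducts else dyn_prev
    # The static inverter drives the gate output.
    output = 1 - dyn_node
    return dyn_node, output


# One clock cycle: pre-charge first, then evaluate with IN1 = 1, IN2 = 0.
dyn, out = footed_domino_or(0, [1, 0])
dyn, out = footed_domino_or(1, [1, 0], dyn)
print(dyn, out)
```

The error case discussed above (leakage discharging the node while all inputs are low) is deliberately not modelled here; capturing it would need an analog simulation.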

#### 2.3. Domino With Improved Noise Immunity

Preetisudha Meher et al. [6] proposed the domino circuit scheme shown in Figure 3. Transistor M4 is used as a stacking transistor. Due to the voltage drop across M4, the gate-to-source voltage of the NMOS transistors in the PDN decreases (stacking effect [7]). This circuit differs from [8] in that it has an additional evaluation transistor M5 with its gate connected to CLK. In [8], when there is a voltage drop across M4 due to noise signals, M3 starts leaking, which causes the circuit to dissipate power and makes it less noise robust. The purpose of M5 in this scheme is to stack M3 and make its gate-to-source voltage smaller (M3 less conducting). Hence the circuit becomes more noise robust and consumes less leakage power than the diode footed domino technique.

The performance degradation caused by the stacking effect in the mirror current path can be compensated by widening M2 (high W/L) to make it more conducting. However, the stacking effect of M5 increases the rise time at the gate output, as it hinders the discharge of the dynamic node during the evaluation phase. The circuit is thus noise immune at the cost of increased delay.

## III. PROPOSED DESIGN AND OPERATION

The proposed two-input OR gate is shown in Figure 4 [9]. The PDN of the gate is driven by two inputs, $IN_1$ and $IN_2$. The circuit consists of a pre-charge transistor $M_P$, an evaluation network with two NMOS transistors, a keeper transistor, a footer transistor M1 and a pseudo-domino inverter stage.

During the pre-charge phase, transistor $M_P$ turns ON, Dyn\_node is pre-charged to $V_{DD}$ and the 'Output' node stays at logic '0'. During the evaluation phase, $M_P$ turns OFF and M1 turns ON, providing a discharge path for Dyn\_node if the PDN is conducting. When either or both of the inputs (IN<sub>1</sub>, IN<sub>2</sub>) are HIGH in the evaluation period, Dyn\_node discharges through the PDN and the footer transistor M1. Because the PDN is stacked on M1, the path resistance increases and the discharge current is reduced, so the time required to discharge Dyn\_node to ground potential increases. Equivalently, M1 raises the source potential of the PDN, which decreases the conduction current through the PDN. This adds delay to the circuit and affects the performance of the design.

The performance of the proposed design is improved by providing an additional discharge path, so that Dyn\_node can be completely discharged to logic 0 without additional delay overhead. When Dyn\_node discharges to the switching threshold of the output inverter (M4), the inverter output rises to logic 1.

The two additional transistors M2 and M3 are connected between Dyn\_node and ground. The gate of M3 is connected to the output terminal, and M2, with the clock applied to its gate, is stacked with M3. The discharge time of Dyn\_node can be considered in two phases: first, the time required to discharge Dyn\_node from $V_{DD}$ to ($V_{DD}-V_{Tn}$) of the NMOS transistor M4 of the pseudo-domino buffer, and second, the time required to discharge from $V_{Tn}$ of M4 to ground, as shown in Figure 5.



Figure 5. Fall time of a dynamic gate

During the first phase, Dyn\_node discharges through the PDN and the footer transistor M1 to the threshold voltage of M4. Let this time be $t_{f1}$, and let the time required to discharge the node from $V_{Tn}$ to ground be $t_{f2}$. When Dyn\_node reaches $V_{DD}$-$V_{Tn}$ (the threshold in the downward direction), M4 turns OFF and the Output node charges to logic 1. This turns ON transistor M3. M3, together with M2, provides an additional discharge path for Dyn\_node, since M2 is already conducting (its gate is driven by the clock), thus completely turning OFF M4. The time required to discharge Dyn\_node from $V_{Tn}$ of M4 to ground is therefore shorter than in the basic gate with a footer transistor, i.e. $t_{f2}$ is smaller for the proposed circuit, as shown in Figure 6.



Figure 6. Simulation result of Proposed Domino and Footed Domino (FD) showing the change in slope of the output

The output of the proposed circuit rises with a steeper slope than that of the footed domino (FD), as explained. This difference in slope is the factor responsible for the performance improvement of the proposed design.
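The two-phase discharge can be summarized in a first-order form. The expressions below are only a sketch that assumes constant average currents. The symbols $C_{dyn}$ (capacitance of Dyn\_node), $I_{PDN}$ (average current through the PDN and M1), $I_{aux}$ (average current through the M2-M3 path) and $V_x$ (the Dyn\_node voltage at which the output switches and M3 turns ON) are introduced here for illustration and are not part of the original analysis.

$$t_f = t_{f1} + t_{f2}, \qquad t_{f1} \approx \frac{C_{dyn}\,(V_{DD} - V_x)}{I_{PDN}}, \qquad t_{f2} \approx \frac{C_{dyn}\,V_x}{I_{PDN} + I_{aux}}$$

For the footed domino gate $I_{aux} = 0$, so under these assumptions the additional path leaves $t_{f1}$ unchanged and shortens only $t_{f2}$.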

Thus, during the evaluation phase, when the clock is high, Dyn\_node discharges through the pull down network when either or both of the inputs to the gate are driven to logic '1'. When Dyn\_node reaches the threshold voltage of the inverter, the 'Output' node switches to logic '1' and transistor M3 turns ON. M3, along with M2, aids the fast discharge of Dyn\_node and hence enhances the speed of the circuit. The Nfoot\_node is at a higher potential, which increases the threshold of the pull down network and thus the noise immunity of the gate. The source terminal of M4 is connected to the Nfoot\_node, which reduces the noise pulse amplitude at the output node and thus the input noise of the cascaded gate.

## IV. SIMULATION RESULTS

#### 4.1. Power And Delay With Variation In Supply

The proposed gate is simulated in Tanner T-Spice 15.0 with 16nm PTM files [10] and a supply of 1V. The design is exercised over a supply variation of 20% to study its effect on power and delay, with fan-in of 2, 4 and 8 inputs to study the effect of fan-in on coupling and therefore on noise immunity.

The design is simulated with different fan-in, and the corresponding power and delay of the OR gate are measured. The measured values are tabulated in Table 1. From the table it can be seen that the power increases with fan-in, and the delay increases from the 4-input to the 8-input gate, as the number of input transistors in the PDN and their corresponding input capacitance increase with fan-in. Figure 7 shows the corresponding plot of power variation with supply voltage for various fan-in.

| Supply voltage (V) | 2-input OR gate |           |          | 4-input OR gate |           |          | 8-input OR gate |           |          |
|--------------------|-----------------|-----------|----------|-----------------|-----------|----------|-----------------|-----------|----------|
|                    | Power (W)       | Delay (s) | PDP (J)  | Power (W)       | Delay (s) | PDP (J)  | Power (W)       | Delay (s) | PDP (J)  |
| 0.8                | 8.66E-07        | 1.27E-09  | 1.10E-15 | 1.04E-06        | 8.59E-10  | 8.97E-16 | 1.69E-06        | 1.15E-09  | 1.94E-15 |
| 0.9                | 1.11E-06        | 8.61E-10  | 9.56E-16 | 1.60E-06        | 6.71E-10  | 1.08E-15 | 2.52E-06        | 8.98E-10  | 2.27E-15 |
| 1.0                | 1.39E-06        | 7.18E-10  | 9.97E-16 | 2.07E-06        | 5.46E-10  | 1.13E-15 | 3.19E-06        | 8.86E-10  | 2.83E-15 |
| 1.1                | 1.74E-06        | 6.40E-10  | 1.11E-15 | 2.56E-06        | 5.19E-10  | 1.33E-15 | 3.89E-06        | 9.14E-10  | 3.56E-15 |
| 1.2                | 2.44E-06        | 2.11E-10  | 5.15E-16 | 3.17E-06        | 5.48E-10  | 1.73E-15 | 4.82E-06        | 8.92E-10  | 4.30E-15 |

Table 1. Power, Delay and PDP of proposed design for various Fan-in
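The PDP column of Table 1 is the product of the tabulated power and delay. The short sketch below recomputes it for the 2-input OR gate columns; it assumes power is in watts and delay in seconds, and the names `TABLE1_2INPUT` and `power_delay_product` are chosen here for illustration.

```python
# 2-input OR gate columns of Table 1, keyed by supply voltage in volts.
# Each value is (power, delay); power is assumed to be in watts, delay in seconds.
TABLE1_2INPUT = {
    0.8: (8.66e-07, 1.27e-09),
    0.9: (1.11e-06, 8.61e-10),
    1.0: (1.39e-06, 7.18e-10),
    1.1: (1.74e-06, 6.40e-10),
    1.2: (2.44e-06, 2.11e-10),
}


def power_delay_product(power_w, delay_s):
    """Return the power delay product in joules."""
    return power_w * delay_s


# Recompute the PDP for every supply voltage.
pdp_2input = {
    supply: power_delay_product(power, delay)
    for supply, (power, delay) in TABLE1_2INPUT.items()
}

for supply, pdp in pdp_2input.items():
    print(f"{supply:.1f} V: PDP = {pdp:.3e} J")
```

The recomputed values agree with the tabulated PDP entries to within rounding of the last printed digit.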



Figure 7. Power dissipation of the proposed OR gate for various Fan-in

#### 4.2. Area Comparison

The proposed technique and the other logic styles were implemented in various technologies with corresponding supply voltages to study the effect of technology scaling and supply scaling on the proposed design, and compared with the other techniques under consideration. The measured power, delay and area are tabulated in Table 2, from which PDP, EDP and ADP are calculated.

Figure 8 shows the area delay product of the 2-input OR gate, comparing the proposed technique with the other logic styles for the 90nm, 70nm and 50nm technology files using Microwind 2.0. By the Table 2 values, the proposed technique shows the lowest area delay product of all the techniques under consideration at 70nm and 50nm; at 90nm only footless domino is lower.

| Technology (supply) | Technique       | Power (W) | Average Delay (s) | Area (sq. m) | Power Delay Product (J) | Energy Delay Product (J·s) | Area Delay Product (sq. m·s) |
|---------------------|-----------------|-----------|-------------------|--------------|-------------------------|----------------------------|------------------------------|
| 90nm (1V)           | Footless Domino | 5.73E-05  | 2.85E-11          | 4.70E-11     | 1.63E-15                | 4.66E-26                   | 1.34E-21                     |
|                     | Footed Domino   | 2.45E-06  | 5.3E-11           | 1.07E-10     | 1.30E-16                | 6.89E-27                   | 5.68E-21                     |
|                     | Proposed Domino | 1.10E-06  | 4.1E-11           | 7.25E-11     | 4.52E-17                | 1.85E-27                   | 2.97E-21                     |
|                     | PS Mehar Domino | 9.73E-07  | 5.45E-11          | 7.25E-11     | 5.30E-17                | 2.89E-27                   | 3.95E-21                     |
| 70nm (0.7V)         | Footless Domino | 1.61E-05  | 6.75E-11          | 4.26E-11     | 1.08E-15                | 7.31E-26                   | 2.88E-21                     |
|                     | Footed Domino   | 8.40E-07  | 6.55E-11          | 5.25E-11     | 5.50E-17                | 3.60E-27                   | 3.44E-21                     |
|                     | Proposed Domino | 3.66E-07  | 5.55E-11          | 3.55E-11     | 2.03E-17                | 1.13E-27                   | 1.97E-21                     |
|                     | PS Mehar Domino | 3.34E-07  | 6.75E-11          | 3.55E-11     | 2.25E-17                | 1.52E-27                   | 2.40E-21                     |
| 50nm (0.5V)         | Footless Domino | 3.70E-06  | 2.54E-10          | 2.17E-11     | 9.39E-16                | 2.38E-25                   | 5.51E-21                     |
|                     | Footed Domino   | 2.68E-07  | 2.63E-10          | 2.68E-11     | 7.05E-17                | 1.85E-26                   | 7.05E-21                     |
|                     | Proposed Domino | 1.19E-07  | 7.75E-11          | 1.81E-11     | 9.22E-18                | 7.15E-28                   | 1.40E-21                     |
|                     | PS Mehar Domino | 1.06E-07  | 1.3E-10           | 1.81E-11     | 1.38E-17                | 1.79E-27                   | 2.35E-21                     |

Table 2. Power, Delay and Area of 2-input OR gate using MICROWIND 2.0
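The derived columns of Table 2 follow from the three measured quantities: PDP is power times delay, EDP is PDP times delay, and ADP is area times delay. The sketch below recomputes them for the 90nm rows; the `Measurement` class and its field names are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Measured quantities for one technique (names chosen for this sketch)."""
    power_w: float   # average power in watts
    delay_s: float   # average delay in seconds
    area_m2: float   # layout area in square metres

    @property
    def pdp(self):
        # Power delay product in joules.
        return self.power_w * self.delay_s

    @property
    def edp(self):
        # Energy delay product in joule-seconds.
        return self.pdp * self.delay_s

    @property
    def adp(self):
        # Area delay product in square-metre-seconds.
        return self.area_m2 * self.delay_s


# 90nm (1V) rows of Table 2.
ROWS_90NM = {
    "Footless Domino": Measurement(5.73e-05, 2.85e-11, 4.70e-11),
    "Footed Domino": Measurement(2.45e-06, 5.3e-11, 1.07e-10),
    "Proposed Domino": Measurement(1.10e-06, 4.1e-11, 7.25e-11),
    "PS Mehar Domino": Measurement(9.73e-07, 5.45e-11, 7.25e-11),
}

for name, m in ROWS_90NM.items():
    print(f"{name}: PDP={m.pdp:.3e} J, EDP={m.edp:.3e} J*s, ADP={m.adp:.3e} m^2*s")
```

The same class can be filled with the 70nm and 50nm rows to reproduce the remaining derived columns.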



Figure 8. Area Delay product of domino techniques under consideration

#### 4.3. Implementation Of Other Logic Functions Using Proposed Technique

Various logic functions were implemented using the proposed technique and compared with the other techniques under consideration for power and delay. Simulations were done using a 250MHz clock with 50% duty cycle at a 1V supply voltage.

| Logic function               | Footless Domino |           | Footed Domino |           | Proposed Domino |           | PS Mehar Domino |           |
|------------------------------|-----------------|-----------|---------------|-----------|-----------------|-----------|-----------------|-----------|
|                              | Power (W)       | Delay (s) | Power (W)     | Delay (s) | Power (W)       | Delay (s) | Power (W)       | Delay (s) |
| A+B                          | 3.09E-06        | 2.66E-10  | 1.80E-06      | 4.71E-10  | 1.26E-06        | 4.84E-10  | 1.18E-06        | 5.42E-10  |
| A+B+C                        | 3.91E-06        | 3.60E-10  | 2.35E-06      | 1.10E-09  | 1.70E-06        | 5.36E-10  | 1.57E-06        | 6.87E-10  |
| A+B+C+D                      | 4.46E-06        | 4.71E-10  | 2.73E-06      | 1.25E-09  | 2.05E-06        | 5.97E-10  | 1.89E-06        | 7.65E-10  |
| AB                           | 6.82E-07        | 8.26E-10  | 5.85E-07      | 6.92E-10  | 5.18E-07        | 8.75E-10  | 5.10E-07        | 9.49E-10  |
| ABC                          | 4.65E-07        | 1.36E-09  | 3.38E-07      | 7.65E-10  | 2.84E-07        | 9.15E-10  | 2.87E-07        | 1.03E-09  |
| ABCD                         | 2.18E-07        | 1.87E-09  | 2.07E-07      | 1.34E-09  | 1.61E-07        | 9.57E-10  | 1.55E-07        | 1.11E-09  |
| AB+CD                        | 1.98E-06        | 5.53E-10  | 1.02E-06      | 1.31E-09  | 9.29E-07        | 5.74E-10  | 8.87E-07        | 7.56E-10  |
| AB+CD: PDP (J) and EDP (J·s) | 1.09E-15        | 6.04E-25  | 1.33E-15      | 1.75E-24  | 5.33E-16        | 3.06E-25  | 6.71E-16        | 5.07E-25  |

Table 3. Power and Delay of various logic functions



Figure 8. Power Delay product of various logic functions



Figure 9. EDP of various Logic functions

The logic functions were simulated using Tanner T-Spice 15.0 with 16nm technology PTM files.

Table 3 shows the power and delay measurements for various logic functions implemented with the different design techniques along with the proposed technique. Figure 8 shows the logarithm of the power delay product as a function of logic operation for the techniques under consideration. The power delay product of the proposed technique is the lowest of all the techniques under consideration for most of the logic operations; by the tabulated power and delay values, footed domino is lower only for AB and ABC. Figure 9 shows the logarithm of the energy delay product as a function of logic operation.
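The ranking behind this comparison can be recomputed from the tabulated power and delay of each logic function. The sketch below is illustrative only: it assumes power in watts and delay in seconds, and the names `TECHNIQUES`, `TABLE3`, `pdp_by_technique` and `ranking` are introduced here, not taken from the original work.

```python
TECHNIQUES = ("Footless", "Footed", "Proposed", "PS Mehar")

# Power and delay per logic function: one (power_W, delay_s) pair
# per technique, listed in TECHNIQUES order.
TABLE3 = {
    "A+B": ((3.09e-06, 2.66e-10), (1.80e-06, 4.71e-10), (1.26e-06, 4.84e-10), (1.18e-06, 5.42e-10)),
    "A+B+C": ((3.91e-06, 3.60e-10), (2.35e-06, 1.10e-09), (1.70e-06, 5.36e-10), (1.57e-06, 6.87e-10)),
    "A+B+C+D": ((4.46e-06, 4.71e-10), (2.73e-06, 1.25e-09), (2.05e-06, 5.97e-10), (1.89e-06, 7.65e-10)),
    "AB": ((6.82e-07, 8.26e-10), (5.85e-07, 6.92e-10), (5.18e-07, 8.75e-10), (5.10e-07, 9.49e-10)),
    "ABC": ((4.65e-07, 1.36e-09), (3.38e-07, 7.65e-10), (2.84e-07, 9.15e-10), (2.87e-07, 1.03e-09)),
    "ABCD": ((2.18e-07, 1.87e-09), (2.07e-07, 1.34e-09), (1.61e-07, 9.57e-10), (1.55e-07, 1.11e-09)),
    "AB+CD": ((1.98e-06, 5.53e-10), (1.02e-06, 1.31e-09), (9.29e-07, 5.74e-10), (8.87e-07, 7.56e-10)),
}


def pdp_by_technique(function):
    """Return {technique: PDP in joules} for one logic function."""
    return {
        name: power * delay
        for name, (power, delay) in zip(TECHNIQUES, TABLE3[function])
    }


def ranking(function):
    """Return the technique names sorted from lowest to highest PDP."""
    pdp = pdp_by_technique(function)
    return sorted(pdp, key=pdp.get)


for function in TABLE3:
    print(function, ranking(function))
```

Multiplying each PDP by the corresponding delay once more gives the EDP values on which the energy delay comparison is based.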

| Logic function | Proposed design | Static CMOS | Footed Domino | Footless Domino | PS-Domino |
|----------------|-----------------|-------------|---------------|-----------------|-----------|
| OR2    | 13              | 13          | 11            | 7               | 13         |
| OR3    | 15              | 30          | 13            | 8               | 15         |
| OR4    | 17              | 71          | 15            | 9               | 17         |
| AND2   | 11              | 11          | 14            | 9               | 11         |
| AND3   | 16              | 21          | 21            | 14              | 16         |
| AND4   | 24              | 43          | 30            | 22              | 24         |
| AND-OR | 16              | 27          | 20            | 15              | 16         |

Table 4. Number of unit sized transistors used by the various techniques



Figure 10. Delay Vs Fan-in

Table 4 shows the number of unit sized transistors used for the realization of the logic functions. Unit sized transistors are those with a W/L ratio of 1. Here the proposed design is compared with static CMOS, footed domino, footless domino and the PS technique. In static CMOS the number of unit sized transistors roughly doubles or more with each additional input (for example, 13, 30 and 71 for OR2, OR3 and OR4), while for the domino techniques the increase is much smaller. This shows that the silicon requirement of the dynamic designs is much less than that of static designs. Footless domino requires the fewest unit sized devices, and footed domino requires fewer than the proposed design for the OR functions, but both suffer from larger power dissipation. The proposed and PS designs use the same number of devices.
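The growth of device count with fan-in can be quantified directly from Table 4. The following sketch computes the ratio between consecutive fan-in values for each technique; the names `UNIT_TRANSISTORS` and `growth_factors` are hypothetical and used only for this illustration.

```python
# Unit sized transistor counts from Table 4 for the OR and AND gates.
UNIT_TRANSISTORS = {
    "Proposed design": {"OR2": 13, "OR3": 15, "OR4": 17, "AND2": 11, "AND3": 16, "AND4": 24},
    "Static CMOS": {"OR2": 13, "OR3": 30, "OR4": 71, "AND2": 11, "AND3": 21, "AND4": 43},
    "Footed Domino": {"OR2": 11, "OR3": 13, "OR4": 15, "AND2": 14, "AND3": 21, "AND4": 30},
    "Footless Domino": {"OR2": 7, "OR3": 8, "OR4": 9, "AND2": 9, "AND3": 14, "AND4": 22},
    "PS-Domino": {"OR2": 13, "OR3": 15, "OR4": 17, "AND2": 11, "AND3": 16, "AND4": 24},
}


def growth_factors(counts, gates):
    """Return the count ratio between each pair of consecutive gates."""
    values = [counts[gate] for gate in gates]
    return [later / earlier for earlier, later in zip(values, values[1:])]


or_gates = ("OR2", "OR3", "OR4")
for technique, counts in UNIT_TRANSISTORS.items():
    factors = growth_factors(counts, or_gates)
    print(technique, [round(f, 2) for f in factors])
```

Passing `("AND2", "AND3", "AND4")` instead of `or_gates` gives the corresponding ratios for the AND gates.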

## V. CONCLUSION

The proposed design shows better performance in terms of speed, noise immunity, power delay product, energy delay product and area delay product compared to the other techniques under consideration.

## REFERENCES

- [1] V. G. Oklobdzija and P. G. Kovijanic, "On testability of CMOS-Domino Logic," in *Proceedings 14th International Conf. Fault-Tolerant Computing* (Orlando, FL), June 20-22, 1984.
- [2] Farshad Moradi, Ali Peiravi, Hamid Mahmoodi, "A New Leakage Tolerant Design for High Fan-in Domino circuits," *IEEE*, 2004.
- [3] Ronald J. Tocci, *Digital systems principles and applications*, 6th ed. New Delhi: PHI, 2003.
- [4] P. Larsson and C. Svensson, "Noise in digital dynamic CMOS circuits," *IEEE Journal of Solid-State Circuits*, vol. 29, pp. 655–662, June 1994.
- [5] John P. Uyemura, *CMOS Logic Circuit Design*. Springer International Edition, 2005.
- [6] Preetisudha Meher and Kamala Kanta Mahapatra, "A technique to increase noise tolerance in dynamic digital circuits," in *Asia Pacific conference in post graduate Research in Microelectronics and Electronics (PRIMEASIA)*, December 201, pp. 229-233.
- [7] G. A. Katopis, "Delta-i noise specification for a high-performance," in *IEEE proceedings*, Sept. 1985, pp. 1405-1415.
- [8] Hamid Mahmoodi-Meimand and Kaushik Roy, "Diode-footed domino: a leakage tolerant high fan-in dynamic circuit design style," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 51, no. 3, pp. 495-503, 2004.
- [9] P. Koti Lakshmi and Prof. Rameshwar Rao, "A Technique for designing high speed noise immune CMOS domino high fan-in circuits in 16nm technology," *International Journal of VLSI design & Communication systems (VLSICS)*, vol. 6, no. 5, Oct. 2015.
- [10] Predictive Technology Model. [Online]. http://ptm.asu.edu