With the increase in device density and clock frequency of chips under advanced processes, power consumption has also significantly increased. At the same time, the supply voltage and transistor threshold voltage have been reduced, leading to significant leakage current. High power consumption can cause excessive temperature during device operation, reducing reliability due to electromigration and other thermally related failure mechanisms. High power consumption also reduces the battery life of portable devices【1】.
6.1 Methods to Reduce Power Consumption
There are several different RTL and gate-level design strategies for reducing power consumption. Some methods, such as clock gating, have been widely and successfully used for many years. Others, such as dynamic voltage and frequency scaling, have not been widely adopted due to implementation difficulties. As power consumption becomes increasingly important, more methods are being utilized.
6.1.1 Reducing Supply Voltage
The most fundamental method for reducing power consumption is to lower the supply voltage. Dynamic (switching) power is proportional to the square of the supply voltage, and static (leakage) power also falls as the supply voltage is lowered. Several generations of CMOS technology have utilized progressively lower supply voltages.
Each reduction in supply voltage decreases the power consumption of each gate, but it also reduces switching speed. Additionally, the transistor threshold voltage must be lowered, leading to more noise susceptibility and sub-threshold leakage issues.
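The effect of supply voltage on switching power can be illustrated with the usual first-order model, P = α · C · Vdd² · f. The sketch below is only an illustration: the activity factor, capacitance, voltages, and frequency are hypothetical values, not figures from the reference.

```python
def dynamic_power(alpha: float, c_farads: float, vdd: float, freq_hz: float) -> float:
    """First-order switching power in watts: P = alpha * C * Vdd^2 * f."""
    return alpha * c_farads * vdd ** 2 * freq_hz

# Hypothetical block: 20% activity, 1 nF switched capacitance, 1 GHz clock.
p_high = dynamic_power(0.2, 1e-9, 1.0, 1e9)  # supply at 1.0 V
p_low = dynamic_power(0.2, 1e-9, 0.8, 1e9)   # supply lowered to 0.8 V

ratio = p_low / p_high
print(f"{ratio:.2f}")  # prints 0.64: a 20% lower supply gives about 36% less switching power
```

The model ignores leakage and the slower switching speed at the lower voltage, so it shows only the quadratic dependence of switching power on the supply.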
6.1.2 Clock Gating
Clock gating is a dynamic power reduction method that stops the clock signal of selected registers during inactive clock periods. Clock gating is useful for registers that need to maintain the same logic value over many clock cycles. The main challenge is to find the best locations to use it and to create logic that turns the clock off and on at the appropriate times. Synthesis tools like Power Compiler can detect low-throughput data paths where clock gating can provide the most benefit and automatically insert clock gating cells at the appropriate locations. Clock gating is relatively straightforward to implement because it only changes the netlist and requires no additional power supplies or changes to the power infrastructure.
Inserting clock gating circuits into existing clock networks may introduce skew that adversely affects timing. To allow synthesis tools to account for these effects during synthesis, the tools can use predefined integrated clock gating cells, which are provided as logic cells in the library. An integrated clock gating cell combines the combinational and sequential elements of a clock gate into a single library cell.
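A behavioral sketch may help show why a clock gate contains a sequential element as well as a combinational one. The model below assumes one common structure, a latch that is transparent while the clock is low followed by an AND gate; this structure and the function name `gated_clock` are assumptions for illustration, not a description of any particular library cell.

```python
def gated_clock(clk: list[int], enable: list[int]) -> list[int]:
    """Model a latch-plus-AND clock gate, one sample per half clock cycle."""
    latched = 0
    out = []
    for c, en in zip(clk, enable):
        if c == 0:
            # The latch is transparent only while the clock is low,
            # so enable changes during the high phase cannot glitch the output.
            latched = en
        out.append(c & latched)  # the AND gate passes the clock only when enabled
    return out

clk    = [0, 1, 0, 1, 0, 1, 0, 1]
enable = [1, 1, 0, 1, 0, 0, 1, 0]
print(gated_clock(clk, enable))  # [0, 1, 0, 0, 0, 0, 0, 1]
```

In the fourth sample the enable rises while the clock is already high and is ignored, and in the last sample the enable falls during the high phase yet the pulse completes. Both cases show the latch keeping the gated clock free of partial pulses.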
6.1.3 Multiple-Vt Cell
Some CMOS technologies have multiple cells that operate at different threshold voltages for the same logic function. Low-threshold cells have better speed but higher sub-threshold leakage current. Tools can select the appropriate type of cell to use based on the trade-off between speed and power consumption. Low-threshold cells are used in timing-critical paths for speed, while high-threshold cells are used elsewhere for lower leakage power.
Multiple threshold voltage libraries support two or more different threshold voltage groups for each logic gate. The threshold voltage determines the delay and leakage characteristics of the logic cell. Cells with lower threshold voltages can switch faster but have higher leakage. Cells with higher thresholds have less leakage but longer switching delays.
Little Cai says:
1. In advanced processes, generally only SVT and LVT cells are used during synthesis and place-and-route (PR). ULVT is not enabled unless timing closure is difficult, in which case a limited proportion of ULVT cells may be tried;
2. When timing fixing is close to clean, Vt swapping is performed in PT: the tool replaces cells on paths with positive timing slack with higher-Vt cells, so the resulting Vt proportions are more meaningful.
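The slack-driven Vt swap mentioned in the note above can be sketched for a single timing path. This is a simplified greedy illustration, not how any tool actually performs the swap; the per-cell delays, the relative leakage numbers, and the function name `swap_to_higher_vt` are all hypothetical.

```python
ORDER = ["LVT", "SVT", "HVT"]                        # lowest to highest threshold
DELAY_NS = {"LVT": 0.10, "SVT": 0.12, "HVT": 0.15}   # hypothetical cell delays
LEAKAGE = {"LVT": 10.0, "SVT": 3.0, "HVT": 1.0}      # hypothetical relative leakage

def swap_to_higher_vt(path_vts, slack_ns):
    """Greedily raise the Vt of cells on one path while slack stays non-negative."""
    vts = list(path_vts)
    for i, vt in enumerate(vts):
        higher = ORDER[ORDER.index(vt) + 1:]
        for candidate in reversed(higher):           # try the highest Vt first
            extra_delay = DELAY_NS[candidate] - DELAY_NS[vt]
            if extra_delay <= slack_ns + 1e-12:      # swap only if slack can absorb it
                vts[i] = candidate
                slack_ns -= extra_delay
                break
    return vts, slack_ns

before = ["LVT", "LVT", "SVT"]
after, remaining_slack = swap_to_higher_vt(before, slack_ns=0.08)
print(after)                                         # ['HVT', 'SVT', 'SVT']
print(sum(LEAKAGE[v] for v in before), sum(LEAKAGE[v] for v in after))  # 23.0 7.0
```

A real flow works on the whole timing graph, where one cell sits on many paths, so the single-path view here only conveys the trade: positive slack is spent to buy lower leakage.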
6.1.4 Multivoltage Design
Different parts of a chip may have different speed requirements. For example, the CPU and RAM may need to run faster than the peripheral blocks. To achieve maximum speed where needed while minimizing power consumption elsewhere, the CPU and RAM can operate at a higher supply voltage while the peripheral blocks operate at a lower voltage.
Providing multiple supply voltages on a single chip introduces some complexity and cost. Additional device pins are needed to supply the extra voltages, and the power network must distribute each supply separately to the appropriate modules.
In cases where logic signals cross from one power domain to another with significantly different voltages, level-shift cells are needed to generate signals with appropriate voltage swings. Level-shift cells themselves require power supplies that match the input and output supply voltages.
In multivoltage designs, level-shift cells are needed wherever a signal crosses from one power domain to another. A level-shift cell functions as a buffer with one supply voltage at the input and a different supply voltage at the output. It converts logic signals from one voltage swing to another, ideally with minimal delay from input to output.
The library description of level-shift cells must include information about the type of conversion performed (from high to low, from low to high, or both), supported voltage levels, and identification of the various power pins that must be connected to each power supply.
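As a rough sketch of how the conversion type follows from the two supply voltages, the check below classifies each domain crossing. The domain names, voltages, tolerance, and the function name `level_shifter_type` are hypothetical and chosen only for illustration.

```python
from typing import Optional

def level_shifter_type(src_v: float, dst_v: float, tolerance_v: float = 0.1) -> Optional[str]:
    """Return the conversion needed for a signal going from src to dst, or None."""
    if abs(src_v - dst_v) <= tolerance_v:
        return None  # swings are close enough that no conversion is assumed necessary
    return "low_to_high" if src_v < dst_v else "high_to_low"

# Hypothetical supply voltages per power domain.
domains = {"CPU": 1.0, "RAM": 1.0, "PERIPH": 0.7}
crossings = [("CPU", "RAM"), ("CPU", "PERIPH"), ("PERIPH", "CPU")]

needed = {(s, d): level_shifter_type(domains[s], domains[d]) for s, d in crossings}
for crossing, kind in needed.items():
    print(crossing, kind)
```

With these values the CPU-to-RAM crossing needs no shifter, CPU-to-PERIPH needs a high-to-low cell, and PERIPH-to-CPU needs a low-to-high cell. The tolerance is an assumption; the actual threshold depends on the library and the design rules.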
Little Cai says:
1. The Power Net structure of the Secondary Power Domain for level-shift cells needs to be customized; tools should not automatically route it, otherwise IR may be difficult to converge;
2. For cells with two (or more) power domains, IR checks must be performed for both power domains.
6.1.5 Power Switching
Power switching is a power-saving technique that shuts down parts of a chip while they are inactive, which reduces leakage power. In a mobile phone chip, for example, the module that performs voice processing can be turned off while the phone is in standby mode. The voice processing module must be “woken up” from the powered-off state when it is needed.
Power switching has significant potential to reduce total power consumption because it lowers both leakage power and switching power. However, it also introduces some additional challenges:
• Power Controller: A logic module that determines when to turn off and on specific modules. The powering down and powering up of modules takes some time and power cost, so the controller must determine the appropriate shutdown timing;
• Power Switching Network: A large number of high-threshold transistors connected from the always-on power rail to the power pins of the cells. The power switches are physically distributed around or within the module. When the switches are on, they connect power to the logic gates within the module; when they are off, the supply to those gates is disconnected;
• Isolation Cells: Cells inserted where signals leave a powered-off module and enter a powered-on module. While the source module is powered off, isolation cells drive a known constant logic value into the always-on module, preventing unknown or intermediate values that could lead to crowbar currents;
• Retention Registers: Registers that retain data during power-off by saving data to shadow registers (also known as bubble registers). Upon powering up, the device restores data from the shadow registers to the main registers.
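The power controller's timing decision described in the list above can be framed as a break-even calculation: shutting a module down pays off only if the leakage energy saved while it is off exceeds the energy spent powering it down and up again. The following is a minimal sketch under that assumption; all numbers and function names are hypothetical.

```python
def break_even_idle_s(leak_on_w: float, leak_off_w: float, transition_energy_j: float) -> float:
    """Idle time at which the leakage saved equals the power-down/up energy cost."""
    saved_per_second = leak_on_w - leak_off_w
    if saved_per_second <= 0:
        raise ValueError("shutdown saves no leakage power")
    return transition_energy_j / saved_per_second

def should_power_down(expected_idle_s: float, leak_on_w: float,
                      leak_off_w: float, transition_energy_j: float) -> bool:
    """Shut down only when the expected idle period exceeds the break-even time."""
    return expected_idle_s > break_even_idle_s(leak_on_w, leak_off_w, transition_energy_j)

# Hypothetical module: 5 mW leakage when on, 0.1 mW when switched off,
# and 49 uJ spent on one power-down plus power-up sequence.
t_be = break_even_idle_s(5e-3, 0.1e-3, 49e-6)
print(f"{t_be * 1e3:.1f} ms")                         # 10.0 ms
print(should_power_down(0.5, 5e-3, 0.1e-3, 49e-6))    # True
print(should_power_down(0.002, 5e-3, 0.1e-3, 49e-6))  # False
```

A real controller would also have to account for wake-up latency and for the cost of saving and restoring retention registers, which this sketch folds into a single transition energy.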
6.1.6 Dynamic Voltage and Frequency Scaling
The principles of multivoltage operation can be extended to allow voltage changes during chip operation to match the current workload. For example, a mathematical processing chip may operate at lower voltage and clock frequency during simple spreadsheet calculations, then operate at higher voltage and frequency during 3D image rendering. Changing supply voltage and operating frequency to meet workload requirements is known as dynamic voltage and frequency scaling.
Chips and voltage sources can be designed to use multiple established levels or even a continuous range. Dynamic voltage scaling requires multilevel power supplies and logic modules to determine the optimal voltage level for a given task. Since the range and combinations of voltage levels and operating frequencies must be analyzed, the design, implementation, verification, and testing of the device can be challenging.
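The logic that picks a voltage level for a given task can be sketched as a lookup over a small table of established operating points. The table values and function names below are hypothetical, and the power estimate uses only the first-order switching model in which power scales with Vdd² · f.

```python
# Hypothetical operating points as (supply voltage in V, clock frequency in Hz),
# sorted from slowest to fastest.
OPERATING_POINTS = [(0.7, 400e6), (0.9, 800e6), (1.1, 1200e6)]

def select_operating_point(required_hz: float):
    """Pick the slowest operating point that still meets the required frequency."""
    for vdd, freq in OPERATING_POINTS:
        if freq >= required_hz:
            return vdd, freq
    return OPERATING_POINTS[-1]  # workload exceeds every point: run at the maximum

def relative_dynamic_power(point, reference=OPERATING_POINTS[-1]) -> float:
    """Switching power relative to the reference point, using P ~ Vdd^2 * f."""
    vdd, freq = point
    ref_vdd, ref_freq = reference
    return (vdd ** 2 * freq) / (ref_vdd ** 2 * ref_freq)

light = select_operating_point(300e6)    # e.g. simple spreadsheet calculations
heavy = select_operating_point(1000e6)   # e.g. 3D image rendering
print(light, heavy)
print(f"{relative_dynamic_power(light):.3f}")
```

With this table the light workload runs at the lowest point and draws roughly 13 to 14 percent of the switching power of the fastest point. Leakage and the cost of changing voltage levels are left out of the estimate.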
6.1.7 Multibit Register Synthesis and Implementation
Synthesis and physical implementation tools group individual register bits into multibit registers, allowing a single clock input to drive multiple register bits. This reduces the need for clock tree resources such as buffers and wires, thereby reducing power consumption and area. Multibit cells are also more efficient than the equivalent single-bit cells because the bits share logic, power connections, and transistor wells.
To achieve the best power savings during multibit synthesis, accurate switching activity data should be provided in the SAIF file. This allows synthesis tools to map bits to multibit registers based on switching activity, minimizing total power consumption.
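The reduction in clock tree load can be estimated by counting clock pins before and after banking. The sketch below assumes every register bit can be merged into cells of one fixed width, which real designs rarely achieve; the function name `clock_sinks` is hypothetical.

```python
import math

def clock_sinks(num_bits: int, bits_per_cell: int) -> int:
    """Number of clock pins the clock tree must drive after banking."""
    return math.ceil(num_bits / bits_per_cell)

# A 64-bit register bank mapped to 1-bit, 4-bit, and 8-bit cells.
for width in (1, 4, 8):
    print(width, clock_sinks(64, width))  # 1 64, then 4 16, then 8 8
```

Fewer clock pins means fewer clock buffers and less clock wiring, which is where the power and area savings described above come from.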
References
【1】 Synopsys Multivoltage Flow User Guide, pp. 14-25