High-Power ASIC: Testing Challenges and Solutions for Processors

Chip manufacturing processes have evolved continuously, and the process node, measured in nanometers, has now entered the single-digit era. This “shrinkage” at one end has triggered an “increase” at the other: integration levels keep rising, and power consumption schemes and power supply systems keep getting more complex. The cost of testing chips built on advanced processes is rising sharply as well. Finding a balance and a breakthrough point in this contradiction of “one reduction and one increase,” so that chips can be tested effectively and product yield ensured, has become one of the major challenges engineers face today.

Trends in Chip Power Consumption

Transistors are the basic building blocks of a chip and can be viewed as tiny switches that control the flow of current within it. The faster these switches open and close, the higher the chip’s operating frequency, which translates into better performance. Improving chip performance therefore requires ever tighter control over the transistors to speed up their switching, and this is precisely the advantage that advanced manufacturing processes bring.

However, as performance improves, power consumption becomes an increasingly prominent problem. Transistor leakage grows more pronounced at the nanoscale, which wastes more energy and pushes power consumption higher still.


In recent years, as semiconductor process nodes have continued to advance, both the core supply voltage (VDD) and the I/O voltage of chips have trended downward: the core voltage has dropped from 3.3V to 0.6V or even lower, and the I/O voltage from 5V to around 1V. The drop in voltage does not offset the rise in current, however. As chip performance improves, current consumption per unit area increases rapidly, and the static power consumption caused by leakage under high current is one of the main reasons for the surge in power consumption.

From an application perspective, chip power consumption keeps climbing, especially for the high-performance processors and FPGAs used in data centers and servers. For instance, the working power consumption of some training and inference chips has reached 450W. This is a very tricky issue in mass production testing, because a higher toggle rate can multiply peak power consumption several times over, which severely challenges the power supply’s voltage regulation capability. Another challenge is that processors built on advanced processes require very complex power supply systems. Some typical FPGA products need many power phases, each with a different voltage, along with complex power-up sequences, and individual power rails can require very high currents (up to several hundred amperes).

Not only the chips under test but also the hardware that carries them, such as sockets, load boards, and probe cards, has become increasingly complex and expensive. According to Teradyne’s statistics on the complexity of testing application processors, probe cards for application processors needed only around 2000 pins a few years ago, whereas some recent ones have reached 7000 pins, with over 10,000 components and PCBs of up to 70 layers. It is easy to imagine the test costs this incurs. As overall system complexity increases and costs continue to rise, balancing functionality, accuracy, cost, and robustness has become a significant challenge for tester manufacturers.
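To put the voltage and power figures above in perspective, the relationship I = P / V can be sketched in a few lines. This is a rough illustration only: it assumes, for simplicity, that the whole 450W budget is drawn from a single rail (real devices split power across many rails), and the function name `rail_current_a` is hypothetical.

```python
def rail_current_a(power_w: float, vdd_v: float) -> float:
    """Current drawn from a rail at a given power and voltage (I = P / V)."""
    if vdd_v <= 0:
        raise ValueError("vdd_v must be positive")
    return power_w / vdd_v


# Illustrative only: 450 W is the figure quoted in the text; assigning all
# of it to one rail is a simplifying assumption.
for vdd_v in (3.3, 1.0, 0.6):
    print(f"{vdd_v:.1f} V -> {rail_current_a(450.0, vdd_v):.0f} A")
```

At constant power, lowering the supply from 3.3V to 0.6V raises the required current by a factor of 5.5, which is why falling voltages do not relieve the power supply.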

Challenges in Actual Testing

There are two very important units in a tester for application processors. One is the digital I/O used to test digital chips, which handles relatively complex tasks and captures failures; the other is the power supply. Although DC power supplies seem relatively simple, they play a very important role in testing high-power processors: they determine test quality, and the final yield is closely tied to the actual performance of the power supply.

Test parameters and application scenarios vary somewhat between test stages, so different challenges call for different solutions.

For complex power delivery, a modular power supply strategy can ease the difficulties of multi-phase supply. By allocating test resources flexibly, the supply can be broken down into small power modules, which can then be combined into units that feed different power rails. Redundant power modules can also be added to reduce the voltage regulation burden on the originally allocated modules. For example, if a VDD pin requires 30 amperes and each channel can output 5 amperes, six channels can be combined to supply it, and redundant channels can be ganged with those six to reduce the load on each. Furthermore, the power-up sequence and soft start can be set in software, which reduces the peripheral power supply circuitry needed.

For most application processors, operating frequency is generally positively correlated with VDD. In the early design validation phase, manufacturers try to find a sweet spot that lets the chip perform better within a limited power budget.
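The channel-combination arithmetic from the 30-ampere example above can be sketched as follows. This is a minimal illustration, not a tester API: the function names are hypothetical, the two redundant channels are an arbitrary choice, and the sketch assumes the ganged channels share current equally.

```python
import math


def channels_needed(required_a: float, per_channel_a: float) -> int:
    """Minimum number of ganged channels that can deliver required_a."""
    return math.ceil(required_a / per_channel_a)


def load_fraction(required_a: float, per_channel_a: float, channels: int) -> float:
    """Fraction of each channel's rating in use, assuming equal current sharing."""
    return required_a / (per_channel_a * channels)


REQUIRED_A = 30.0      # VDD pin requirement from the example in the text
PER_CHANNEL_A = 5.0    # per-channel capability from the example in the text
REDUNDANT = 2          # hypothetical number of extra channels

base = channels_needed(REQUIRED_A, PER_CHANNEL_A)
print(f"minimum channels: {base}")
print(f"load without redundancy: {load_fraction(REQUIRED_A, PER_CHANNEL_A, base):.0%}")
print(f"load with {REDUNDANT} redundant: "
      f"{load_fraction(REQUIRED_A, PER_CHANNEL_A, base + REDUNDANT):.0%}")
```

With no redundancy every channel runs at its full rating; adding spare channels leaves each one headroom to respond to load transients.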
In production testing, VDD may simply be set to a specific value to check whether the chip reaches the expected frequency at that voltage.

However, no tester is perfect, and in practice errors are always present. One approach is to program the voltage slightly above the chip’s preset value. Taking these errors and all losses into account, the voltage at the chip pins must stay above the expected value. With this method, high-quality devices still pass even when the instrument drifts to its lowest voltage, which gives a higher yield.

Another approach is to program the instrument’s output to exactly the expected value. In reality, though, some testers have limited accuracy, and in some cases their output runs slightly low, so some chips are actually tested at a voltage below the expected value.

The two methods have different drawbacks. In the first, the VDD setpoint must be raised, so the actual voltage is higher than the expected value; this causes greater thermal dissipation during testing and requires low-speed vectors to help the chip cool down. In the second, every shipped product is guaranteed to meet the expected value, but testers with larger errors cause additional yield loss. For advanced process products at 7nm and 5nm, yield is an extremely important factor. Yield on advanced processes is already very low, especially when the die area is large, so losing an additional portion of it is unacceptable in terms of device manufacturing cost.

Faced with these challenges, how should we test? What characteristics must a tester have to meet them?
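The trade-off between the two programming approaches above can be made concrete with a small worst-case calculation. The sketch below is illustrative only: the 750 mV target, the ±5 mV instrument accuracy, and the 3 mV path loss are hypothetical numbers, and `pin_voltage_window_mv` is a made-up helper rather than any tester function.

```python
def pin_voltage_window_mv(setpoint_mv: int, accuracy_mv: int, path_drop_mv: int):
    """Return (lowest, highest) voltage at the chip pins, in millivolts.

    The instrument output may be off by +/- accuracy_mv, and path_drop_mv
    is lost between the instrument and the pins.
    """
    lowest = setpoint_mv - accuracy_mv - path_drop_mv
    highest = setpoint_mv + accuracy_mv - path_drop_mv
    return lowest, highest


# Hypothetical values chosen for illustration.
TARGET_MV = 750
ACCURACY_MV = 5
PATH_DROP_MV = 3

# Approach 1: raise the setpoint so the worst case still meets the target.
guardbanded = pin_voltage_window_mv(
    TARGET_MV + ACCURACY_MV + PATH_DROP_MV, ACCURACY_MV, PATH_DROP_MV
)
# Approach 2: program the setpoint to the target itself.
direct = pin_voltage_window_mv(TARGET_MV, ACCURACY_MV, PATH_DROP_MV)

print("guardbanded window (mV):", guardbanded)
print("direct window (mV):     ", direct)
```

With these assumed numbers, the first approach never drops below the target but can overdrive the pins by up to 10 mV, while the second can undershoot by up to 8 mV. A tighter instrument accuracy shrinks both penalties, which is the practical argument for a more precise supply.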

Targeted Solutions for Different Testing Challenges

“Millivolts Matter”: every millivolt of precision is crucial. Lower core voltages demand higher output precision and better dynamic response from power supplies. Teradyne has always regarded the output voltage performance of its power supply instruments as one of the most important parameters in instrument design, and this is a distinguishing feature of Teradyne among ATE manufacturers.

During actual testing, the supply voltage is not perfectly flat. Actual power consumption depends closely on operating conditions, and the resulting supply variation can even cause a chip to lose state, leading to device failure. Such problems are difficult to predict and troubleshoot.

A Shmoo test varies the output VDD and the scan shift frequency continuously and observes the results of all test vectors; failures are more likely when VDD is lower and frequency is higher. In actual Shmoo cases, Teradyne’s UltraFLEXplus delivers a more stable supply, which means higher yield at the boundary and results closer to the chip’s true intrinsic behavior. This gives a more accurate picture of the chip’s real operating conditions in the final product, showing which conditions work and which do not. Overall, a better and more stable power supply not only improves yield but also helps in understanding how the chip behaves under real conditions.

Many chips today require very high supply current. Delivering a large current is no longer a challenge in itself, since many testers can easily supply 1000A. In multi-site testing, however, each chip’s individual power rail must reach 800-1000A. While a tester can deliver 1000A statically, whether it can handle the step from 0A to 1000A at power-up is the real challenge.
In multi-site testing, Teradyne’s solutions can meet the large supply requirements of step power-up.

Beyond the static and dynamic behavior of the power supply itself, the design of the surrounding hardware, such as sockets, probe cards, and load boards, is also closely tied to power supply performance. The dynamic response of the test instrument greatly affects the performance of the DC power supply, and an excellent power supply solution helps reduce the complexity of the peripheral power circuitry. Traditional ATE solutions typically require the board to help supply energy, with the supply mostly covering DC up to the 100kHz range. Different peripheral circuits then have to be added for the low, medium, and high frequency bands, which makes the overall circuit quite complex.

Teradyne focuses on simplifying the circuit design: the ATE itself provides output capability from low to medium frequencies, so no extra peripheral circuits are needed and the number of capacitors is minimized. In practice, only a few types of low-ESR/ESL ceramic capacitors need to be added to shape the high-frequency characteristics, allowing a single model to meet the output’s dynamic performance requirements.

The benefits of this approach are: 1) Smaller capacitor values shorten recovery time; 2) Fewer capacitors charge and discharge faster and store less energy, which shortens test time and lowers the probability of energy damage to the socket; 3) Fewer capacitor types lower the chances of circuit resonance and slow recovery.
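The first two benefits follow from two textbook relations: the energy stored in a capacitor is E = ½CV², and an RC network settles with time constant τ = RC. The sketch below compares two board configurations; the capacitance totals, the 0.8V rail, and the 1 mΩ series resistance are hypothetical values picked only to show the scaling.

```python
def stored_energy_j(capacitance_f: float, voltage_v: float) -> float:
    """Energy stored in a capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def time_constant_s(resistance_ohm: float, capacitance_f: float) -> float:
    """RC time constant: tau = R * C."""
    return resistance_ohm * capacitance_f


# Hypothetical values for illustration only.
RAIL_V = 0.8
SERIES_R_OHM = 0.001
BOARDS = {
    "bulk-heavy board": 10_000e-6,        # 10,000 uF of decoupling in total
    "reduced-capacitor board": 1_000e-6,  # 1,000 uF of decoupling in total
}

for name, cap_f in BOARDS.items():
    energy_mj = stored_energy_j(cap_f, RAIL_V) * 1e3
    tau_us = time_constant_s(SERIES_R_OHM, cap_f) * 1e6
    print(f"{name}: {energy_mj:.2f} mJ stored, tau = {tau_us:.1f} us")
```

Both quantities scale linearly with capacitance, so cutting total capacitance by a factor of ten cuts the stored energy and the charge/discharge time constant by the same factor.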

Another significant challenge lies in the units under test. High-power chips on advanced processes dissipate a lot of power, and most of the energy delivered to them ultimately turns into heat. During testing, the chip must not be allowed to heat up without limit, which could lead to it being “burned out”. Instead, we want repeatable and reproducible test parameters, keeping the chip stable during testing so that all collected data is consistent. The most direct method is to use ATC (Automatic Temperature Control) during testing. Common methods include: Option 1) DUT Power Monitor; Option 2) Die Temperature Monitor; Option 3) Package Temperature Monitor.

Each of the three methods has its pros and cons, and their time efficiency varies (as shown in the figure above). Teradyne prefers Option 1, because it can predict the chip’s next likely state earlier and intervene in advance. Teradyne’s testers support this method by reporting each DPS’s present load as a percentage along with its output voltage. Teradyne has used this monitoring method in many mass production cases, where it predicts the chip’s actual operating condition earlier than Options 2 and 3.

As chip power continues to increase, circuits become more complex. During testing, all sockets, probe cards, load boards, and similar hardware should be well monitored so that test components are not damaged by short circuits, poor contact, or other abnormal conditions. To prevent such damage, Teradyne builds real-time alarm mechanisms into most of its test boards. When an abnormality occurs, the tester issues a real-time warning without affecting the production of other devices or interrupting production, allowing early screening that avoids abnormal conditions and reduces test escapes and quality incidents.
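Why a DUT power monitor (Option 1) can react earlier than a temperature monitor can be illustrated with a toy first-order thermal model: die temperature lags behind a step in power, so a power threshold trips before a temperature threshold does. Everything in this sketch is an assumption made for illustration, including the thermal resistance, the lag constant, the limits, and the power profile; it is not Teradyne’s ATC implementation.

```python
def simulate_die_temperature(power_w, ambient_c, r_theta_c_per_w, tau_steps):
    """First-order lag: each step moves temperature toward ambient + P * R_theta."""
    temps = []
    temp_c = ambient_c
    for p in power_w:
        target_c = ambient_c + p * r_theta_c_per_w
        temp_c += (target_c - temp_c) / tau_steps
        temps.append(temp_c)
    return temps


def first_crossing(samples, limit):
    """Index of the first sample above limit, or None if never exceeded."""
    for index, value in enumerate(samples):
        if value > limit:
            return index
    return None


# Hypothetical profile: 5 samples at 50 W, then a step to 400 W.
power_w = [50.0] * 5 + [400.0] * 20
temps_c = simulate_die_temperature(
    power_w, ambient_c=25.0, r_theta_c_per_w=0.2, tau_steps=5
)

power_trip = first_crossing(power_w, limit=300.0)  # power-monitor threshold
temp_trip = first_crossing(temps_c, limit=85.0)    # temperature-monitor threshold

print("power monitor trips at sample:", power_trip)
print("temperature monitor trips at sample:", temp_trip)
```

In this toy model the power threshold is crossed on the very sample where the load steps up, while the temperature threshold is crossed only several samples later, which is the head start that lets a power-based scheme intervene in advance.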

Conclusion

Semiconductor testing is the process of measuring a semiconductor’s output response and comparing it with the expected output to determine or evaluate the functionality and performance of integrated circuits throughout design, manufacturing, packaging, and application. As the demands on semiconductor manufacturing processes increase, the importance of the test phase in semiconductor manufacturing continues to rise.

The technical core of a semiconductor tester lies in functional integration, precision and speed, cost reduction, and scalability. At Teradyne, we believe that test solutions must offer sufficiently good static precision and voltage regulation while achieving better robustness under boundary conditions to help reduce the probability of failures; simplify peripheral circuit design as much as possible to reduce operational losses and indirectly lower test costs; and, finally, incorporate alarm mechanisms to predict and prevent abnormal conditions.
