The smartphones and computers you use daily rely on a crucial yet often overlooked process: testing. A chip goes through a series of rigorous tests on its way from wafer to finished product. These tests confirm that it runs without overheating, that failure rates are low, and that every function and performance metric meets its specification. The source document on wafer and chip testing lays out the core logic and key steps of this process. Here we break it down in simple terms so you can understand the “health check” that chips undergo before leaving the factory.
1. Core Testing Objectives: Two Key Goals (“Stable Heat” + “Low Failure Rate”)
Chip testing is not just a casual check; it focuses on achieving two main objectives, which are critical for evaluating samples:
- Heat Compliance: The chip must not exceed its temperature limit during operation (for example, an illustrative limit of 45°C under full load for a mobile chip; the real limit comes from the product specification), or it may experience lag, crashes, or even damage;
- Low Failure Rate: All potential issues (such as leakage or functional failures) must be identified before leaving the factory to ensure stable operation for users.
To achieve these two goals, work is needed on both the “design” and “testing” sides at once: design must consider “how to facilitate testing,” and testing must cover every plausible problem scenario.
2. Key Testing Considerations During Chip Design
Testing should not be an afterthought; it should be integrated from the design phase with a focus on six key points:
- Top-Level Design: Clearly define testing requirements, such as “what testing methods the chip should support” and “which parameters need to be measured,” while also estimating die area and pin count to avoid issues during later testing;
- Simulation Verification: Conduct repeated simulations during the design process (such as subsystem simulations and backend simulations) to identify functional or timing issues early, reducing failures during physical testing;
- Thermal Design and Power Consumption: Calculate power consumption during design (for example, the power usage of each module) and implement thermal design (such as avoiding clustering high-power modules) to control heat generation from the source;
- Resource, Rate, and Process Matching: Design resources and rates based on process technology (for example, 28nm or 7nm) to avoid “design rates that the process cannot achieve,” which would lead to performance issues during testing;
- Coverage Requirements: Testing should cover as many scenarios as possible, tracked with metrics such as functional coverage and fault coverage. The ideal that “all input-output combinations are tested” is rarely practical for large designs, so coverage targets are used to avoid missing potential issues;
- Reserved Testing Interfaces: For example, include “boundary scan” interfaces in the design to facilitate quick internal circuit testing with testing equipment later on.
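The power and thermal point above lends itself to a quick back-of-the-envelope check. The sketch below sums a per-module power budget and estimates junction temperature with the common steady-state relation T_j = T_a + P × θ_JA. The module names, power figures, θ_JA value, and 105 °C limit are all assumptions for illustration, not values from the source document.

```python
# Hypothetical per-module power budget in watts (illustrative numbers only).
MODULE_POWER_W = {
    "cpu_cluster": 1.8,
    "gpu": 1.2,
    "modem": 0.6,
    "memory_ctrl": 0.4,
}


def total_power_w(budget: dict) -> float:
    """Sum the power drawn by every module in the budget."""
    return sum(budget.values())


def junction_temp_c(power_w: float, ambient_c: float,
                    theta_ja_c_per_w: float) -> float:
    """Steady-state estimate: T_j = T_a + P * theta_JA."""
    return ambient_c + power_w * theta_ja_c_per_w


def thermal_margin_c(budget: dict, ambient_c: float,
                     theta_ja_c_per_w: float, tj_max_c: float) -> float:
    """Positive margin means the estimate stays below the junction limit."""
    tj = junction_temp_c(total_power_w(budget), ambient_c, theta_ja_c_per_w)
    return tj_max_c - tj


if __name__ == "__main__":
    power = total_power_w(MODULE_POWER_W)
    tj = junction_temp_c(power, ambient_c=25.0, theta_ja_c_per_w=15.0)
    margin = thermal_margin_c(MODULE_POWER_W, 25.0, 15.0, tj_max_c=105.0)
    print(f"total power: {power:.1f} W")
    print(f"estimated junction temperature: {tj:.1f} C")
    print(f"margin to assumed limit: {margin:.1f} C")
```

A negative margin at this stage is a hint to rebalance the budget or spread out high-power modules before any silicon exists, which is far cheaper than finding the problem on the tester.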
3. Two Testing Stages from Wafer to Chip—Both Are Essential
Chip testing is divided into two main stages—wafer-level (before cutting into chips) and chip-level (after cutting into finished products), with targeted testing items for each stage:
1. First Stage: Wafer Testing (“Initial Health Check for the Wafer”)
When the wafer is still a round thin slice, it must undergo testing on a probe station, focusing on “basic health checks” to avoid waste after cutting:
- Contact Testing: Check whether the probe has good contact with the wafer to ensure accurate signal transmission;
- Power Consumption Testing: Measure the supply current of each die during operation to screen out dies with severe leakage;
- Input Leakage Testing: Check for abnormal leakage at input pins to avoid affecting chip stability;
- Output Level Testing: Verify whether the output signal levels meet standards (for example, for 3.3V LVTTL logic, output high ≥ 2.4V and output low ≤ 0.4V);
- Function + Dynamic Parameter Testing: Test whether basic functions are normal and whether signal timing (such as signal delay) meets requirements;
- Analog Signal Testing: For analog chips (such as power management chips), measure parameters like gain, linearity, and signal-to-noise ratio, which require high precision—such as measuring microvolt-level voltages and nanoamp-level currents, where even slight fluctuations can lead to non-compliance.
Additionally, monitor the wafer’s “process parameters,” such as transistor leakage (including the subthreshold leakage, gate leakage, and junction leakage mentioned in the document), to rule out issues caused by process defects.
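The wafer test steps above are usually run in order, and a die is rejected at the first step it fails. A minimal sketch of that idea follows. The measurement names, limits, and bin numbers are hypothetical placeholders, not values from the source document or from any real test program.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LimitCheck:
    """One step in the flow: a measurement name with lower/upper limits."""
    name: str
    low: Optional[float]    # None means "no lower limit"
    high: Optional[float]   # None means "no upper limit"
    fail_bin: int           # bin assigned to the die if this step fails


# Hypothetical flow, ordered like the list above. All limits are made up.
FLOW = [
    LimitCheck("contact_resistance_ohm", None, 5.0, fail_bin=2),
    LimitCheck("supply_current_ma", 1.0, 120.0, fail_bin=3),
    LimitCheck("input_leakage_ua", -1.0, 1.0, fail_bin=4),
    LimitCheck("output_high_v", 2.4, None, fail_bin=5),
    LimitCheck("output_low_v", None, 0.4, fail_bin=5),
    LimitCheck("prop_delay_ns", None, 8.0, fail_bin=6),
]

PASS_BIN = 1


def bin_die(measurements: dict) -> int:
    """Return PASS_BIN, or the bin of the first failing step (stop on fail)."""
    for step in FLOW:
        value = measurements[step.name]
        if step.low is not None and value < step.low:
            return step.fail_bin
        if step.high is not None and value > step.high:
            return step.fail_bin
    return PASS_BIN


def wafer_yield(bins: list) -> float:
    """Fraction of dies that landed in the pass bin."""
    return bins.count(PASS_BIN) / len(bins)


if __name__ == "__main__":
    good = {
        "contact_resistance_ohm": 1.2, "supply_current_ma": 80.0,
        "input_leakage_ua": 0.1, "output_high_v": 2.9,
        "output_low_v": 0.2, "prop_delay_ns": 6.5,
    }
    leaky = dict(good, input_leakage_ua=3.2)   # same die, leaky input pin
    bins = [bin_die(good), bin_die(leaky), bin_die(good)]
    print("bins:", bins, "yield:", round(wafer_yield(bins), 3))
```

Putting the contact check first mirrors the list: if the probe is not touching the pad properly, every later reading is meaningless, so there is no point spending tester time on it.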
2. Second Stage: Chip Testing (“Final Assessment of the Finished Product”)
After the wafer is cut into individual chips, they must undergo more comprehensive testing, primarily to “validate performance in actual usage scenarios”:
- ATE Testing: Use Automated Test Equipment (ATE) to simulate real-world usage environments, testing functionality, performance, and reliability—for example, testing the “baseband functionality during calls” and “GPU performance during gaming” for mobile chips;
- Boundary Scan Testing: Use the chip’s built-in “boundary scan chain” to check pin connections and reach internal test logic through a few dedicated test pins, so faults can be found without opening the package;
- Specialized Analog Circuit Testing: For analog chips, measure more detailed parameters, such as frequency response (signal amplification effects at different frequencies), harmonic distortion (signal deformation), and crosstalk (interference between adjacent channels), which require specialized instruments for accurate measurement.
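Two of the analog parameters above reduce to short formulas once the instrument has delivered amplitudes. The sketch below uses the standard definitions: total harmonic distortion as the root-sum-square of the harmonic amplitudes divided by the fundamental, and crosstalk as a voltage ratio in decibels. The function names and the sample amplitudes are assumptions for illustration.

```python
import math


def thd_ratio(fundamental_v: float, harmonics_v: list) -> float:
    """THD = sqrt(V2^2 + V3^2 + ...) / V1, using RMS amplitudes."""
    return math.sqrt(sum(v * v for v in harmonics_v)) / fundamental_v


def to_db(voltage_ratio: float) -> float:
    """Convert a voltage ratio to decibels (20 * log10)."""
    return 20.0 * math.log10(voltage_ratio)


def crosstalk_db(victim_v: float, aggressor_v: float) -> float:
    """Signal coupled into a quiet channel, relative to the driven channel."""
    return to_db(victim_v / aggressor_v)


if __name__ == "__main__":
    # Made-up readings: 1 V fundamental with small 2nd and 3rd harmonics.
    thd = thd_ratio(1.0, [0.003, 0.004])
    print(f"THD: {thd * 100:.2f} % ({to_db(thd):.1f} dB)")
    # Made-up readings: 1 mV seen on a quiet channel next to a 1 V signal.
    print(f"crosstalk: {crosstalk_db(0.001, 1.0):.1f} dB")
```

The arithmetic is trivial; the hard part in practice is the measurement itself, since the harmonics and coupled signals are often thousands of times smaller than the main signal.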
4. Addressing Failures: A Dual Insurance of “Design + Testing”
Chip failures can vary widely, and it is essential to anticipate and address them proactively, focusing on two key points:
1. Understand Failure Types: Issues Can Arise from Design or Process
Failures do not occur “out of nowhere” and can be categorized into two main types:
- Design-Related Failures: Such as logical errors (for example, “AND gate output is inverted”) and timing delays (signals not arriving on time);
- Process-Related Defects: Such as open or shorted transistors and wire bridging (two wires unintentionally connected together). The “stuck-at faults, delay faults, and open faults” mentioned in the document are the fault models used to represent defects in this category.
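To make the stuck-at fault model concrete, here is a small sketch that injects every single stuck-at fault into a toy circuit, y = (a AND b) OR c, and checks which test vectors expose each fault. The circuit, node names, and vector sets are invented for illustration; real fault simulators work on full netlists.

```python
from itertools import product
from typing import Optional, Tuple

NODES = ["a", "b", "c", "n1", "y"]   # n1 is the AND gate output
ALL_FAULTS = [(node, stuck) for node in NODES for stuck in (0, 1)]


def simulate(a: int, b: int, c: int,
             fault: Optional[Tuple[str, int]] = None) -> int:
    """Evaluate y = (a AND b) OR c, optionally forcing one node to 0 or 1."""
    def force(node: str, value: int) -> int:
        # A stuck-at fault overrides whatever value the node would carry.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    a, b, c = force("a", a), force("b", b), force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)


def detected_faults(vectors) -> set:
    """A fault is detected if some vector makes the output differ from good."""
    detected = set()
    for fault in ALL_FAULTS:
        for a, b, c in vectors:
            if simulate(a, b, c) != simulate(a, b, c, fault):
                detected.add(fault)
                break
    return detected


def fault_coverage(vectors) -> float:
    """Detected faults divided by all modeled faults."""
    return len(detected_faults(vectors)) / len(ALL_FAULTS)


if __name__ == "__main__":
    weak = [(1, 1, 0), (0, 0, 0)]
    compact = [(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
    exhaustive = list(product((0, 1), repeat=3))
    for name, vectors in [("weak", weak), ("compact", compact),
                          ("exhaustive", exhaustive)]:
        missed = sorted(set(ALL_FAULTS) - detected_faults(vectors))
        print(name, len(vectors), "vectors, coverage",
              fault_coverage(vectors), "missed", missed)
```

The point of the comparison is that a well-chosen handful of vectors can reach the same fault coverage as the exhaustive set, which is why test pattern generation targets faults rather than trying every input combination.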
2. Rely on “Testability Design”: Incorporate “Self-Test Functions” into Chips
Rather than struggling to find faults later, it is better to leave “backdoors” during design, with two common methods:
- Scan Path Method: Connect registers within the chip into a “scan chain” so that register states can be shifted in and read out directly during testing, enabling quick localization of logical faults;
- Built-In Self-Test (BIST): Equip the chip with a “self-test module” that can test functionality without external devices (for example, a BIST module in memory chips can automatically check whether storage units are functioning correctly).
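Both methods can be sketched in a few lines. The first class below models a scan chain as a simple shift register. The second part runs MATS+, a well-known short march test, as an example of what a memory BIST module might execute. The class names, the fault-injection dictionary, and the choice of MATS+ are assumptions for illustration; the source document does not name a specific algorithm.

```python
class ScanChain:
    """Registers wired as one shift register while scan mode is enabled."""

    def __init__(self, length: int) -> None:
        self.bits = [0] * length

    def shift(self, scan_in: int) -> int:
        """One scan clock: a new bit enters, the last bit leaves on scan-out."""
        scan_out = self.bits[-1]
        self.bits = [scan_in] + self.bits[:-1]
        return scan_out

    def load(self, pattern: list) -> None:
        """Shift a full pattern in so pattern[i] ends up in register i."""
        for bit in reversed(pattern):
            self.shift(bit)

    def unload(self) -> list:
        """Shift the current state out and return it in register order."""
        out = [self.shift(0) for _ in range(len(self.bits))]
        return out[::-1]


class Memory:
    """Tiny one-bit-wide memory; `stuck` maps address -> forced value."""

    def __init__(self, size: int, stuck: dict = None) -> None:
        self.cells = [0] * size
        self.stuck = stuck or {}

    def write(self, addr: int, value: int) -> None:
        self.cells[addr] = self.stuck.get(addr, value)

    def read(self, addr: int) -> int:
        return self.stuck.get(addr, self.cells[addr])


def mats_plus(mem: Memory, size: int) -> list:
    """MATS+ march test: (w0); up(r0, w1); down(r1, w0). Returns bad addresses."""
    fails = set()
    for addr in range(size):                # write 0 everywhere
        mem.write(addr, 0)
    for addr in range(size):                # ascending: expect 0, then write 1
        if mem.read(addr) != 0:
            fails.add(addr)
        mem.write(addr, 1)
    for addr in reversed(range(size)):      # descending: expect 1, then write 0
        if mem.read(addr) != 1:
            fails.add(addr)
        mem.write(addr, 0)
    return sorted(fails)


if __name__ == "__main__":
    chain = ScanChain(4)
    chain.load([1, 0, 1, 1])
    print("scanned out:", chain.unload())
    print("healthy memory:", mats_plus(Memory(8), 8))
    print("faulty memory:", mats_plus(Memory(8, stuck={3: 1, 5: 0}), 8))
```

In a real chip the march sequence runs in on-die logic and reports only a pass/fail flag or a failing address, which is what lets the memory check itself without an external tester driving every read and write.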
5. Testing Is Not the “Final Step” but an Integral Part of the Entire Design Process
Many people think testing is only done before leaving the factory, but in reality, testing is conducted concurrently from the design phase:
- Research Phase: Define testing requirements and metrics;
- Module Design/Implementation Phase: Conduct module simulations to verify local functionality;
- System Simulation Phase: Test the functionality and timing of the entire system;
- Post-Backend Design: Perform post-layout verification, re-checking function and timing with routing delays taken into account;
- Production Phase: Wafer testing → chip testing → finished product screening, with multiple layers of checks.
It is like building a house: seismic simulations are run during design rather than after construction is complete. Chip testing is likewise about “preventing problems before they occur,” and the earlier an issue is found, the cheaper it is to fix.
Interactive Time
Have you ever encountered “chip failures”? For example, sudden lag on your phone or a blue screen on your computer (which could be due to minor issues overlooked during chip testing)? Share your experiences in the comments, or let us know which testing details you are most interested in learning about!