Chip | How to Achieve ASILD Product Certification? (Part 1)

Recently, I have been coordinating frequently with several third-party companies to prepare projects for certification, and we have discussed a number of aspects of product certification.

Two directions need to be prepared: the process level and the technical level. The review and assessment must cover the entire lifecycle. The process side confirms whether product development strictly follows the requirements of the target ASIL. The technical side is more complex; the most important point is to confirm whether the product's fault coverage meets the requirements of that ASIL.

Below, I take the ASIL D certification of an automotive-grade chip as an example and share some of the technical aspects for reference and learning.

Hazard Analysis and ASIL Level Determination

1. Scenario Definition and Failure Mode Identification

Core Function: The chip is responsible for real-time monitoring of the battery pack's voltage, temperature, and current, as well as overvoltage/undervoltage, overtemperature, and overcurrent protection.

Potential Hazards:

    • Overcharge Risk: If the AFE chip misjudges the battery voltage, it may lead to battery overheating or even fire (S = 3, life-threatening; S3 is the highest severity class in ISO 26262).
    • Communication Interruption: A CAN bus communication failure may prevent the BMS from interacting with the vehicle controller, leading to power interruption (E = 3, high frequency exposure).
    • Sensor Failure: A short circuit in the temperature sensor may trigger incorrect actions in the cooling system (C = 3, uncontrollable).

2. Risk Matrix and ASIL Level Calculation

According to the S-E-C matrix of ISO 26262-3:

  • Overcharge Scenario: S = 3 (fatal injury) + E = 4 (high frequency exposure) + C = 3 (uncontrollable) → ASIL D.
  • Communication Interruption: S = 3 (serious injury) + E = 2 (medium frequency exposure) + C = 2 (partially controllable) → ASIL A.
  • Temperature Sensor Failure: S = 2 (moderate injury) + E = 1 (low frequency exposure) + C = 1 (controllable) → QM (no ASIL assigned).

Conclusion:

The chip must meet ASIL D, the level set by the most stringent scenario (overcharge).
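
The lookup behind this kind of classification can be sketched in a few lines. This is a minimal sketch, assuming the class ranges of ISO 26262-3 (S0 to S3, E0 to E4, C0 to C3) and the common observation that the standard's table is equivalent to thresholds on the sum S + E + C; the function name `determine_asil` is hypothetical.

```python
def determine_asil(s: int, e: int, c: int) -> str:
    """Return the ASIL for severity S0-S3, exposure E0-E4, controllability C0-C3."""
    if not (0 <= s <= 3 and 0 <= e <= 4 and 0 <= c <= 3):
        raise ValueError("expected S0-S3, E0-E4, C0-C3")
    # Assumption: class 0 in any dimension means no ASIL is assigned (QM).
    if s == 0 or e == 0 or c == 0:
        return "QM"
    # Sums of 7, 8, 9, 10 map to ASIL A, B, C, D; anything lower is QM.
    by_sum = {7: "ASIL A", 8: "ASIL B", 9: "ASIL C", 10: "ASIL D"}
    return by_sum.get(s + e + c, "QM")


if __name__ == "__main__":
    print(determine_asil(3, 4, 3))  # highest class in every dimension
    print(determine_asil(3, 3, 3))  # one exposure class lower
```

Only the combination of the highest class in all three dimensions yields ASIL D, which is why the rating of each parameter deserves careful justification in the HARA.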

Hardware Architecture Design and Fault Coverage Rate Calculation

Different safety levels have specific requirements for fault coverage rates, as shown in the table below:

| Safety Level | Single-Point Fault Metric (SPFM) | Latent Fault Metric (LFM) | Random Hardware Failure Target (PMHF) |
|---|---|---|---|
| ASIL B | ≥90% | ≥60% | <10⁻⁷/h |
| ASIL C | ≥97% | ≥80% | <10⁻⁷/h |
| ASIL D | ≥99% | ≥90% | <10⁻⁸/h |
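
These targets lend themselves to a small lookup that a metrics report can be checked against. The following is a minimal sketch, assuming the ISO 26262-5 target values as commonly cited (PMHF below 10⁻⁸/h for ASIL D and below 10⁻⁷/h for ASIL B and C); the names `Targets` and `check_metrics` and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    spfm_min: float   # minimum single-point fault metric (fraction)
    lfm_min: float    # minimum latent fault metric (fraction)
    pmhf_max: float   # PMHF upper bound, failures per hour


# Assumed target values; confirm against the edition of the standard in use.
TARGETS = {
    "B": Targets(0.90, 0.60, 1e-7),
    "C": Targets(0.97, 0.80, 1e-7),
    "D": Targets(0.99, 0.90, 1e-8),
}


def check_metrics(asil: str, spfm: float, lfm: float, pmhf: float) -> dict:
    """Return a pass/fail flag per metric for the given ASIL."""
    t = TARGETS[asil]
    return {
        "SPFM": spfm >= t.spfm_min,
        "LFM": lfm >= t.lfm_min,
        "PMHF": pmhf < t.pmhf_max,
    }


if __name__ == "__main__":
    # Hypothetical figures, not taken from any real chip.
    print(check_metrics("D", spfm=0.995, lfm=0.92, pmhf=5e-9))
```

All three flags must be true at the same time; a strong SPFM does not compensate for a latent fault metric that falls short.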

1. Redundancy and Diagnostic Mechanisms

  • Dual-Channel ADC Sampling: The voltage of each battery cell is sampled independently through primary and backup channels, and a hardware comparator checks the two results for consistency in real time. If the primary channel ADC fails, the backup channel detects the failure and triggers a safety response within 10μs, covering 99.9% of single-point faults.
  • Self-Test: A built-in periodic self-test module checks the health of registers, RAM, and communication interfaces every millisecond. The hardware self-test can detect 95% of latent faults (e.g., register bit flips).
  • Power Monitoring: A dual power redundancy design (main power + backup LDO) is used, and voltage fluctuations are monitored in real time. When the voltage exceeds the ±5% threshold, the charging circuit is cut off within 50μs.
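
The decision logic behind the channel comparison and the supply window can be modeled in software for review or simulation purposes. This is a behavioral sketch only: on the chip these checks are done in hardware, and the 5 mV agreement tolerance, the function names, and the returned action strings are assumptions made for illustration. The ±5% supply window is the figure given above.

```python
def channels_agree(primary_mv: float, backup_mv: float,
                   tolerance_mv: float = 5.0) -> bool:
    """True if both ADC channels report the same cell voltage within tolerance."""
    return abs(primary_mv - backup_mv) <= tolerance_mv


def supply_in_window(measured_v: float, nominal_v: float,
                     tolerance: float = 0.05) -> bool:
    """True if the supply voltage stays within +/- tolerance of nominal."""
    return abs(measured_v - nominal_v) <= tolerance * nominal_v


def safety_action(primary_mv: float, backup_mv: float,
                  supply_v: float, nominal_supply_v: float) -> str:
    """Pick the reaction for one monitoring cycle (hypothetical action names)."""
    if not channels_agree(primary_mv, backup_mv):
        return "SAFE_STATE"      # channel mismatch: a measurement path is suspect
    if not supply_in_window(supply_v, nominal_supply_v):
        return "CUT_CHARGING"    # supply outside the +/-5% window
    return "OK"


if __name__ == "__main__":
    print(safety_action(3700.0, 3702.0, 3.30, 3.3))
    print(safety_action(3700.0, 3750.0, 3.30, 3.3))
    print(safety_action(3700.0, 3701.0, 3.60, 3.3))
```

A model like this is useful mainly for deriving fault injection cases: each branch corresponds to a fault class that the test bench should be able to provoke.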

2. Fault Coverage Rate Metric Calculation

| Metric | X Chip Actual Value | ASIL D Requirement |
|---|---|---|
| Single Point Fault Metric (SPFM) | 99.8% | ≥99% |
| Latent Fault Metric (LFM) | 65% | ≥90% |
| Random Hardware Failure Probability (PMHF) | 8.2×10⁻⁹/h | <10⁻⁸/h |

Note: The formulas follow the standard, so they are not written out here.
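
For readers who want to reproduce such figures, the two architectural metrics can be computed from the failure rates of an FMEDA. The sketch below uses the usual ISO 26262-5 definitions as I understand them: SPFM relates single-point and residual faults to the total failure rate of the safety-related hardware, and LFM relates latent multiple-point faults to what remains after removing single-point and residual faults. The function names and the FIT values are hypothetical (1 FIT = 10⁻⁹ failures per hour).

```python
def spfm(total_fit: float, spf_rf_fit: float) -> float:
    """Single-point fault metric: 1 - (SPF + RF) / total."""
    return 1.0 - spf_rf_fit / total_fit


def lfm(total_fit: float, spf_rf_fit: float, latent_mpf_fit: float) -> float:
    """Latent fault metric: 1 - latent MPF / (total - SPF - RF)."""
    return 1.0 - latent_mpf_fit / (total_fit - spf_rf_fit)


def fit_to_per_hour(fit: float) -> float:
    """Convert FIT (failures per 1e9 device hours) to failures per hour."""
    return fit * 1e-9


if __name__ == "__main__":
    # Hypothetical FMEDA totals, in FIT.
    total, spf_rf, latent = 1000.0, 2.0, 50.0
    print(f"SPFM = {spfm(total, spf_rf):.4f}")
    print(f"LFM  = {lfm(total, spf_rf, latent):.4f}")
    print(f"8.2 FIT = {fit_to_per_hour(8.2):.2e} per hour")
```

The split matters in practice: redundancy mainly reduces the single-point and residual share, while periodic self-tests and ECC reduce the latent share.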

Design Optimization:

  1. Adding redundant ADC channels (SPFM improved from 98% to 99.8%).
  2. Introducing hardware-accelerated ECC algorithms (LFM improved from 55% to 65%).
  3. Using low-power processes (PMHF reduced from 1.2×10⁻⁸/h to 8.2×10⁻⁹/h).

Testing Validation and Third-Party Certification

Testing Methods and Toolchain

  1. Fault Injection Testing: Using VectorCAST to inject over 100 types of faults (e.g., voltage sensor short circuit, CAN communication errors) into the chip to verify the response time of the safety mechanisms. Results: the average overvoltage protection response time is 190μs (requirement: ≤200ms).
  2. Hardware-in-the-Loop (HIL) Testing: Simulating extreme temperatures from -40℃ to 125℃, 100% humidity, and other environmental conditions to test chip stability. Results: the pass rate across all test cases is 99.7%, with an error rate < 0.01%.
  3. Code Coverage Analysis: Using LDRA Testbed for unit testing, achieving 100% MC/DC coverage (highly recommended by ISO 26262-6 for ASIL D).
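
To make the MC/DC criterion concrete, the sketch below checks whether a set of test vectors shows each condition independently affecting a decision (the unique-cause form of MC/DC). The decision `enabled and (over_voltage or over_temp)` is a hypothetical protection condition chosen for illustration, not the chip's actual logic, and `mcdc_covered` is not part of any coverage tool.

```python
from itertools import combinations


def decision(enabled: bool, over_voltage: bool, over_temp: bool) -> bool:
    """Hypothetical trip condition for a protection function."""
    return enabled and (over_voltage or over_temp)


def mcdc_covered(tests, decide):
    """Return indices of conditions shown to independently affect the outcome.

    A condition counts as covered when two test vectors differ only in that
    condition and produce different decision outcomes.
    """
    covered = set()
    n = len(tests[0])
    for a, b in combinations(tests, 2):
        diff = [i for i in range(n) if a[i] != b[i]]
        if len(diff) == 1 and decide(*a) != decide(*b):
            covered.add(diff[0])
    return covered


if __name__ == "__main__":
    tests = [
        (True, True, False),
        (False, True, False),
        (True, False, False),
        (True, False, True),
    ]
    covered = mcdc_covered(tests, decision)
    print(f"conditions covered: {sorted(covered)} of 3")
```

Four vectors are enough here, which reflects the general property that MC/DC needs at least n + 1 tests for n conditions rather than all 2ⁿ combinations.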

Third-Party Certification Process

  1. Document Review: Submit the Functional Safety Concept (FSC), Hardware Architecture Report, Test Case Traceability Matrix, and over 30 other documents. The key point is to verify that the hazard analysis and ASIL assignment are reasonable, and that the safety mechanisms match the claimed fault coverage.
  2. On-Site Assessment: The third-party team reviews the chip design drawings, test benches, and defect management system on site. It is important to demonstrate the independence of the dual-channel ADC sampling and to show that the redundancy design accounts for common cause failures.
  3. Issue Rectification: Identify and rectify the findings. For example, for the "CAN communication error response time exceeding standard" finding, software interrupt priorities were optimized to reduce the response time from 80ms to 50ms.
  4. Certification Result: The result is confirmed and announced, and the relevant certificates are issued.

Some Issues Encountered and Solutions

| Issue | Solution |
|---|---|
| Cost Control under High ASIL Levels | Adopt modular design, reusing mature IP (e.g., power management modules) to reduce development costs. |
| Common Cause Failure (CCF) Risk | Reduce common cause failure probability through physical isolation (e.g., independent power domains) and different process node designs (main ADC using 28nm, backup ADC using 40nm). |
| Balancing Real-Time Performance and Diagnostic Coverage | Introduce hardware acceleration engines (e.g., dedicated circuits for ECC algorithms) to improve diagnostic coverage without affecting real-time performance. |
| Compliance of Third-Party Toolchain | Select TÜV certified tools (e.g., VectorCAST) and verify through Tool Confidence Level (TCL). |
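
The Tool Confidence Level mentioned in the last row follows from two ratings defined in ISO 26262-8: tool impact (TI1 or TI2) and tool error detection (TD1 to TD3). A minimal sketch of that lookup, assuming the classification table as I understand it; the function name is hypothetical.

```python
def tool_confidence_level(ti: int, td: int) -> str:
    """Map tool impact (1-2) and tool error detection (1-3) to a TCL."""
    if ti not in (1, 2) or td not in (1, 2, 3):
        raise ValueError("expected TI in {1, 2} and TD in {1, 2, 3}")
    # A tool that cannot introduce or mask errors needs no further confidence.
    if ti == 1:
        return "TCL1"
    # With TI2 the TCL follows the error detection class directly.
    return f"TCL{td}"


if __name__ == "__main__":
    for ti, td in [(1, 3), (2, 1), (2, 2), (2, 3)]:
        print(f"TI{ti}, TD{td} -> {tool_confidence_level(ti, td)}")
```

Tools rated TCL2 or TCL3 need qualification evidence, which is where a vendor's certification kit reduces effort.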

Summary

In summary, I have organized the key points and issues encountered during this certification from a technical perspective, in the hope of avoiding the same problems in future chip certifications.

  1. Hazard Analysis-Driven Design: Clearly define safety requirements through systematic HARA to avoid over-design or risk omissions.
  2. Quantitative Metric Verification: Hardware metrics such as SPFM/LFM/PMHF must be verified through both mathematical models and test data.
  3. Full Process Traceability: Bidirectional traceability from requirements to test cases (e.g., DOORS management) is the foundation of certification.
  4. Integration of Toolchain and Methodology: Tools such as HIL testing, fault injection, and code coverage analysis must be deeply integrated with ISO 26262 methodology.
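
The bidirectional traceability in point 3 can be checked mechanically once requirement-to-test links are exported from the requirements tool. The following is a minimal sketch, assuming the links are available as a plain mapping; the function name and the identifiers `SR-...` and `TC-...` are hypothetical.

```python
def trace_gaps(req_to_tests, all_tests):
    """Return (requirements without tests, tests without a requirement)."""
    # Forward direction: every requirement needs at least one linked test.
    untested = sorted(r for r, tests in req_to_tests.items() if not tests)
    # Backward direction: every test must trace back to some requirement.
    linked = set().union(*req_to_tests.values())
    orphans = sorted(set(all_tests) - linked)
    return untested, orphans


if __name__ == "__main__":
    links = {
        "SR-001": {"TC-01", "TC-02"},
        "SR-002": set(),
        "SR-003": {"TC-03"},
    }
    tests = {"TC-01", "TC-02", "TC-03", "TC-04"}
    untested, orphans = trace_gaps(links, tests)
    print("requirements without tests:", untested)
    print("tests without requirements:", orphans)
```

Running a check like this before each review keeps both kinds of gaps from reaching the assessor.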

…..

References

  • ISO 26262:2018 “Road vehicles – Functional safety”.

  • VectorCAST
  • LDRA Testbed
  • Silicon Labs SA631XXX chip, the first BMS AFE chip in China to pass ASIL-D certification, using dual-channel ADC redundancy design.
  • Bosch AI Safety mechanism, combining traditional V-model with data-driven engineering processes (DDE) to enhance AI system safety.
  • NVIDIA Halos system, a full-stack safety solution integrating functional safety, cybersecurity, and AI safety.

>>>>> The above is for learning reference only!
