Author: Andrew Hopkins, ARM Strategic IT Expert

Today, the automotive industry is undergoing rapid transformation, with vehicle designs, uses, and sales models all evolving quickly. Driver safety technologies, traffic congestion, environmental concerns, and the fundamental premise of the vehicle as a means of transportation are all shaping the next generation of automobiles. To address these challenges, many automakers are adding computing capability to optimize vehicle control. New requirements issued by the European New Car Assessment Programme (Euro NCAP) stipulate that safety assistance functions such as lane change support are necessary to achieve a five-star safety rating. The number of onboard processors is steadily increasing across all market segments, currently averaging 40 to 50 per vehicle, with some high-end models already equipped with nearly 120. According to Semicast Research, the market for electronic control unit (ECU) components under the engine hood will reach nearly $86 billion by 2022, up from $53 billion in 2015, a compound annual growth rate of about 7%. Semiconductor manufacturers have the opportunity to tap into a gold mine in the automotive electronics sector.

High-tech chips can reduce powertrain emissions, enhance safety performance, and use cellular networks to connect vehicles with each other and with road infrastructure. However, as systems become more complex, ensuring driver safety becomes increasingly critical. This calls for more automated, systematic, and proactive solutions, which is what we commonly refer to as “functional safety.”

What is Functional Safety?

In simple terms, the ultimate goal of functional safety is to ensure that products operate safely, even when something goes wrong. Based on this concept, ARM treats safety as a top priority rather than merely following market trends, continuously strengthening R&D and launching more products that support functional safety.

Every industry sets standards to guide future development and establish minimum entry thresholds. In the automotive electronics industry, this standard is ISO 26262, which defines functional safety as:

“Absence of unreasonable risk due to hazards caused by malfunctioning behaviour of electrical/electronic systems.”

Standards in different fields are not entirely consistent; for example, IEC 61508 for electrical and electronic systems and DO-254 for avionics hardware each have their own definitions. Notably, they all have specialized terminology and provide engineering guidance, including target parameters. It is therefore crucial to determine the target market and establish appropriate processes before starting product development, because modifying development processes midway inevitably leads to inefficiency. Figure 1 illustrates the various standards that apply to silicon IP. In practice, if multiple standards need to be met, the common ground can be identified while the differences are preserved: first list the requirements that are exclusive to each standard, then apply quality management and other general principles. Safety must be prioritized from the very beginning.

Ensuring Functional Safety in Automotive Design through IP Solutions

Figure 1: Functional Safety Standards for Silicon IP

In practice, functional safety systems must be evaluated by independent assessors to confirm that they meet the applicable safety standards. Achieving functional safety requires the ability to predict fault modes, and the system status must be assessed in real time to determine whether functionality is intact, partially impaired, or so compromised that the system must shut down for a restart or reset.
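The three outcomes described above can be pictured as a small classification step. The following is only a minimal sketch under stated assumptions: the state names, the function names, and the idea of a fixed set of critical functions are hypothetical illustrations, not part of any standard or ARM product.

```python
from enum import Enum


class SystemState(Enum):
    INTACT = "functionality intact"
    DEGRADED = "functionality partially impaired"
    RESET_REQUIRED = "shut down for restart or reset"


def assess_state(failed_functions, critical_functions):
    """Map the set of currently failed functions to one of three states.

    Assumption: any failure of a critical function forces a reset,
    while failures of other functions only degrade the system.
    """
    failed = set(failed_functions)
    if not failed:
        return SystemState.INTACT
    if failed & set(critical_functions):
        return SystemState.RESET_REQUIRED
    return SystemState.DEGRADED


# Hypothetical function names, for illustration only.
CRITICAL = {"steering_assist", "brake_control"}
print(assess_state([], CRITICAL).value)
print(assess_state(["seat_heater"], CRITICAL).value)
print(assess_state(["brake_control"], CRITICAL).value)
```

A real system would derive the set of critical functions from its hazard analysis rather than from a hard-coded list.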

Not all faults immediately lead to severe accidents. For instance, a failure in the power steering system may cause sudden erroneous steering, but because of the inherent time delays in the electrical and mechanical design, the fault does not produce immediate consequences. This delay is typically several milliseconds or more, and ISO 26262 defines it as the fault tolerant time interval (FTTI). The length of this interval depends on the potential accident type and the system design. It follows that the higher the safety requirements for a system, the more of the faults that could lead to unsafe events must be avoided.
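One practical consequence of this interval is a timing budget: a fault has to be detected and handled before the interval expires. The sketch below illustrates that check; the class name, field names, and millisecond values are assumptions chosen for illustration, not figures from ISO 26262.

```python
from dataclasses import dataclass


@dataclass
class SafetyMechanismTiming:
    """Hypothetical timing budget for one safety mechanism, in milliseconds."""
    detection_ms: float  # time needed to detect the fault
    reaction_ms: float   # time needed to reach a safe state after detection


def fits_within_interval(timing: SafetyMechanismTiming, interval_ms: float) -> bool:
    """True if detection plus reaction completes within the tolerated interval."""
    return timing.detection_ms + timing.reaction_ms <= interval_ms


# Illustrative numbers only.
steering = SafetyMechanismTiming(detection_ms=2.0, reaction_ms=5.0)
print(fits_within_interval(steering, interval_ms=10.0))
print(fits_within_interval(steering, interval_ms=5.0))
```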

Ideally, functional safety should not impact system performance; however, in reality, many existing safety measures significantly affect system performance, power, and area (PPA). How to mitigate the adverse impact on system performance and rising design and manufacturing costs while ensuring functional safety is a major challenge faced by designers.

Why is Functional Safety Needed?

The functional safety of chip IP was once a niche area, attracting interest only from a handful of chip and system developers in automotive, industrial, aerospace, and similar markets. However, the situation has changed dramatically in recent years with the rise of various automotive applications. Beyond automotive, many other industries can also benefit from an increase in electronic devices, provided that functional safety is guaranteed. Medical electronics and aerospace are two typical examples.

Autonomous driving has attracted significant attention in recent years, but it has often been shrouded in ambiguity; now, with the proliferation of Advanced Driver Assistance Systems (ADAS) and rich media In-Vehicle Infotainment (IVI) systems, although the era of highly automated driving remains distant, the prospects for autonomous vehicles have become increasingly clear. Drones of various sizes and the ever-growing Internet of Things also urgently require functional safety, and ARM’s technology will be a great support.

ARM’s Functional Safety Technology

Like other technology markets, emerging functional safety applications are driven by semiconductors. This is not just theory: rapid product innovation has sparked strong interest among ARM’s partners. Most embedded systems with functional safety requirements need two core elements: safety protection and real-time processing. ARM’s Cortex-R series processors are tailored to these needs, providing high-performance computing for embedded systems that require reliability, availability, fault tolerance, and/or deterministic real-time response. These features lay the foundation for high safety integrity in ADAS and IVI systems, enabling them to execute critical processing, respond to safety-related interrupts, communicate with other systems, and supervise more complex functions of lower integrity.

What is a Fault?

Faults can be systematic (such as human error in the specification and design processes) or related to the tools used. One way to reduce such faults is to implement stringent quality control processes, which must include detailed planning, review, and quantitative assessment. Proper planning for tool qualification is crucial, and the ability to manage and track requirement changes is equally important. ARM’s Compiler 5 has been certified by TÜV SÜD, which supports safety development because customers do not need to certify the compiler separately.

The other type of fault is the random hardware fault. These may be permanent faults, such as short circuits, as shown in Figure 2, or soft errors caused by natural radiation. Such faults can be addressed with measures integrated in both hardware and software, which makes system-level techniques equally important. For example, Built-In Self-Test (BIST) can be run at system startup and shutdown to distinguish between soft and permanent faults.
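The idea behind using a self-test to separate soft from permanent faults can be sketched as follows: a soft error disappears once the cell is rewritten, while a permanent fault keeps failing. This is a toy model under stated assumptions; `FaultyMemory`, `classify_fault`, and the test patterns are hypothetical and do not describe any real BIST engine.

```python
class FaultyMemory:
    """Toy 8-bit memory in which bits can be stuck at 1 (permanent fault)."""

    def __init__(self, size):
        self.cells = [0] * size
        self.stuck_at_one = {}  # address -> mask of bits stuck at 1

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells[addr] | self.stuck_at_one.get(addr, 0)

    def flip_bit(self, addr, bit):
        """Model a transient upset: the stored bit flips until the next write."""
        self.cells[addr] ^= 1 << bit


def classify_fault(mem, addr, expected):
    """Return 'no fault', 'soft', or 'permanent' for one memory word."""
    if mem.read(addr) == expected:
        return "no fault"
    verdict = "soft"
    # Rewrite with complementary patterns; a permanent fault fails again.
    for pattern in (0x55, 0xAA):
        mem.write(addr, pattern)
        if mem.read(addr) != pattern:
            verdict = "permanent"
            break
    mem.write(addr, expected)  # restore the intended contents
    return verdict


mem = FaultyMemory(4)
mem.write(0, 0x3C)
mem.flip_bit(0, 2)              # transient upset
print(classify_fault(mem, 0, 0x3C))
mem.stuck_at_one[1] = 0x01      # permanent stuck-at-1 on bit 0
mem.write(1, 0x3C)
print(classify_fault(mem, 1, 0x3C))
```

The model covers only stuck-at-1 faults; a fuller sketch would also handle stuck-at-0 and coupling faults.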

Figure 2: Types of Faults

Mitigation Measures

Selecting and designing fault detection and control measures is the part of the process that designers enjoy most, because they can showcase their skills with both system-level and micro-architecture-level techniques. Establishing a Failure Mode and Effects Analysis (FMEA) is a good start: it lists all possible fault modes and the severity of their consequences. With this information, along with the designer’s in-depth understanding of the complex system, the most severe fault modes can be identified and countermeasures designed.
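As a rough illustration of that first step, the sketch below lists a few fault modes with a severity score and picks out the worst ones. The entries and the 1 to 10 severity scale are hypothetical; a real FMEA follows the format and rating scheme required by the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    component: str
    mode: str
    severity: int  # assumed scale: 1 (negligible) to 10 (catastrophic)


def most_severe(modes, top_n=2):
    """Return the top_n failure modes, highest severity first."""
    return sorted(modes, key=lambda m: m.severity, reverse=True)[:top_n]


# Hypothetical entries, for illustration only.
modes = [
    FailureMode("interrupt controller", "interrupt lost", 7),
    FailureMode("bus arbiter", "wrong master granted", 8),
    FailureMode("data RAM", "single-bit upset", 4),
    FailureMode("processor core", "wrong computation result", 10),
]

for m in most_severe(modes):
    print(f"{m.severity:2d}  {m.component}: {m.mode}")
```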

There are various methods to address potential faults, and here are some of the most commonly used techniques:

·Diversity Checkers: A separate, differently implemented circuit checks whether the main circuit has failed. For example, a checker attached to an interrupt controller can act as a counter, continuously recording the total number of interrupts caused by human and system factors.

·Complete Lockstep Redundancy: This technique, used for example in Cortex-R5 processors, instantiates an IP component (such as a processor) two or more times and offsets the operation of the copies by a number of clock cycles, creating both temporal and spatial redundancy. Large memories are typically shared between the instances to reduce the required area. Although this technique is very reliable, it is also extremely costly.

·Selective Hardware Redundancy: In this scheme, only the critical parts of the hardware, such as arbiters, are duplicated.

·Software Redundancy: Hardware redundancy is not always the right answer, since it adds complexity and overhead and may not be a reasonable use of resources. An alternative to checking in hardware is to run the same calculation on multiple processor cores and check whether the results match.

·Error Detection and Correction Codes: Another well-known technique, commonly used to protect memories and buses. There are various types of codes, but the goal is the same: to achieve a high level of protection with a minimal number of additional bits, without duplicating all of the underlying data. In automotive systems, it is common to use codes that can detect two-bit errors in a memory word and support correction of single-bit errors.
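To make the last technique concrete, here is a minimal sketch of a single-error-correcting, double-error-detecting code: a Hamming(7,4) code extended with one overall parity bit. It protects only a 4-bit value and is chosen for brevity; it is an assumption for illustration, since real memories use wider codes and the article does not say which code any particular product uses.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (list of bits, index = position)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d       # data bit positions
    code[1] = code[3] ^ code[5] ^ code[7]        # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]        # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]        # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2                  # overall parity bit
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                      # XOR of positions holding a 1
    overall_error = sum(code) % 2                # 1 if an odd number of bits flipped
    if syndrome == 0 and overall_error == 0:
        status = "ok"
    elif overall_error == 1:
        code[syndrome] ^= 1                      # single-bit error: flip it back
        status = "corrected"
    else:
        return None, "uncorrectable"             # two bits flipped: detect only
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[6] ^= 1                                     # inject a single-bit error
print(decode(word))
word[2] ^= 1                                     # inject a second error
print(decode(word))
```

The cost here is four check bits for four data bits; with wider words the relative overhead falls, which is why such codes are attractive compared with full duplication.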

Fault Logging

Once a fault is detected, it must be logged so that supervisory software can assess the health and safety status of the system. Safe faults (such as corrected memory errors) and dangerous faults (such as unrecoverable hardware failures) must be logged separately.

Fault logging typically begins with fault counting. The counts can be kept at the system level, by counting signalled events (similar to interrupts), or by counters inside the IP. To understand why these events occur, it helps to look back at past events when diagnosing the current one. To support this and to facilitate debugging, some IP can capture additional information, such as the memory address being accessed when the fault was detected. Because this address is typically preserved across a soft reset, it can be read during system startup and self-test.
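A fault log along these lines might be organized as in the sketch below, which keeps separate counters for the two fault classes and retains the last captured address across a soft reset. All class and field names are hypothetical, and the choice to clear counters on a soft reset is an assumption made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FaultRecord:
    source: str                     # which checker or IP block reported it
    dangerous: bool                 # False for faults that were safely handled
    address: Optional[int] = None   # captured memory address, if any


@dataclass
class FaultLog:
    safe_counts: Counter = field(default_factory=Counter)
    dangerous_counts: Counter = field(default_factory=Counter)
    last_address: Dict[str, int] = field(default_factory=dict)

    def record(self, fault: FaultRecord) -> None:
        counts = self.dangerous_counts if fault.dangerous else self.safe_counts
        counts[fault.source] += 1
        if fault.address is not None:
            self.last_address[fault.source] = fault.address

    def soft_reset(self) -> None:
        """Assumption: counters are cleared, captured addresses are preserved."""
        self.safe_counts.clear()
        self.dangerous_counts.clear()


log = FaultLog()
log.record(FaultRecord("ram_ecc", dangerous=False, address=0x2000_0040))
log.record(FaultRecord("ram_ecc", dangerous=False, address=0x2000_0040))
log.record(FaultRecord("lockstep_compare", dangerous=True))
log.soft_reset()
print(hex(log.last_address["ram_ecc"]))  # still readable during startup self-test
```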

It is important to remember that faults can also occur within the safety architecture itself. Faults in functional hardware usually reveal themselves quickly during use, but a fault in a safety checker may go unnoticed: the checker silently loses its ability to detect dangerous failures, and nothing appears wrong until a second fault occurs. Such faults are referred to as latent faults, and testing the checkers regularly is good practice.
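One common way to test a checker is to feed it a deliberately bad input and confirm that it raises an error. The sketch below shows the idea with a simple parity checker; the function names and the broken checker are hypothetical and exist only to illustrate a latent fault.

```python
def parity_checker(word, parity_bit):
    """Return True when a parity error is detected."""
    return bin(word).count("1") % 2 != parity_bit


def broken_checker(word, parity_bit):
    """A checker with a latent fault: it never reports an error."""
    return False


def checker_self_test(checker):
    """Check that the checker flags a known-bad input and accepts a known-good one."""
    good_word, good_parity = 0b1011, 1           # three 1 bits, so parity is 1
    flags_bad = checker(good_word ^ 0b0001, good_parity)
    accepts_good = not checker(good_word, good_parity)
    return flags_bad and accepts_good


print(checker_self_test(parity_checker))
print(checker_self_test(broken_checker))
```

Without the self-test, the broken checker would look healthy for as long as no real error occurred.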

Safety Integrity Levels

The way safety levels are expressed varies across standards, but the primary purpose is to reflect the criticality of a function in an intuitive way. For example, the ECU controlling windshield wipers, airbags, or brakes must have higher integrity than the ECU controlling the speedometer or parking sensors. Forward visibility is crucial, and sudden braking or airbag deployment can have fatal consequences, putting the driver at great risk, whereas the speedometer and parking sensors are far less critical to safety.

In other words, safety integrity levels relate to the necessity and ability of people to avoid hazardous situations; the role of various standards is to guide how to define safety integrity levels and provide related parameters to help quantify system integrity.

IEC 61508 divides Safety Integrity Levels (SIL) into four levels, with level 4 being the highest integrity. Similarly, ISO 26262 defines Automotive Safety Integrity Levels (ASIL), ranging from ASIL A (lowest) to ASIL D (highest). Furthermore, as shown in Table 1, ISO 26262 provides recommended targets for the single-point fault metric, the latent fault metric, and the probabilistic metric for random hardware failures (PMHF, a failure rate often quoted in failures in time, FIT) for ASIL B to ASIL D. The proportion of faults that can be detected is referred to as diagnostic coverage.

Recommended Targets	ASIL B	ASIL C	ASIL D
Single-point Fault Metric	≥90%	≥97%	≥99%
Latent Fault Metric	≥60%	≥80%	≥90%
PMHF (Probabilistic Metric for Random Hardware Failures)	<10^-7 / h	<10^-7 / h	<10^-8 / h

Table 1. Recommended Targets from ISO 26262

Although these metrics are often treated as mandatory requirements, they are in fact recommendations, and suppliers can propose their own target values. The most important goal is to create safe products rather than merely to add a few numbers to a product specification. Returning to the earlier examples, windshield wipers, brakes, and airbags may need to reach ASIL D, while speedometers and parking sensors may be ASIL B or lower, depending on the overall system safety design.
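The recommended targets in the table lend themselves to a simple lookup, sketched below. The threshold values are taken from the table; the function names and the example figures for a design are hypothetical, and meeting these numbers alone does not establish that a product achieves a given ASIL.

```python
# Recommended targets from the table: minimum single-point fault metric,
# minimum latent fault metric, and PMHF upper bound per hour.
TARGETS = {
    "B": {"spfm": 0.90, "lfm": 0.60, "pmhf": 1e-7},
    "C": {"spfm": 0.97, "lfm": 0.80, "pmhf": 1e-7},
    "D": {"spfm": 0.99, "lfm": 0.90, "pmhf": 1e-8},
}


def meets_targets(asil, spfm, lfm, pmhf_per_hour):
    """True if all three metrics satisfy the recommended targets for this ASIL."""
    t = TARGETS[asil]
    return spfm >= t["spfm"] and lfm >= t["lfm"] and pmhf_per_hour < t["pmhf"]


def highest_asil_met(spfm, lfm, pmhf_per_hour):
    """Return 'D', 'C', 'B', or None for the highest level whose targets are met."""
    for asil in ("D", "C", "B"):
        if meets_targets(asil, spfm, lfm, pmhf_per_hour):
            return asil
    return None


# Illustrative figures for a hypothetical design.
print(highest_asil_met(spfm=0.98, lfm=0.85, pmhf_per_hour=5e-8))
```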

No matter how high the diagnostic coverage, the development of functional safety applications must follow appropriate processes—this is the greatest benefit of the standards system. Moreover, regardless of the functional safety measures employed, strict quality processes can enhance the overall quality of any application.

Design Process for Functional Safety IP

When developing IP for functional safety applications, following a defined process is crucial. This process must incorporate safety considerations from the outset and foster a culture that supports safety.

The complete development process must include the following key aspects:

·Safety Management: This covers the organizational structure of the team, in particular the definition and responsibilities of each role, together with building a safety culture, defining the safety lifecycle, and establishing the level of functional safety support. Defining the safety lifecycle includes creating a workable plan, selecting appropriate development tools, and ensuring that the team receives adequate training.

·Requirement Management and Traceability of Fault Detection and Control Measures (Countermeasures): To achieve accurate requirement traceability, the requirements themselves must be defined clearly, precisely, and uniquely. The level of traceability depends on the integrity requirements. Documents can be high-level, but the product needs to be covered in every aspect, from fault detection through to validation. Planning must not be arbitrary and must itself undergo thorough validation.

·Quality Management: This is an extension and elaboration of requirement traceability. Errata must be properly managed and used, an area in which ARM has extensive experience. Recording and communicating the processes is equally important.

Safety Document Package

IP development is a way for ARM to support partners, and our relationship does not end when clients receive the IP. For functional safety-related IP development, ARM defines two levels of safety document packages:

·Standard support up to ASIL B

·Extended support up to ASIL D

Each safety document package includes a safety manual that details the processes followed, the fault detection and control functions, the applicable scenarios, and other information. We also provide a Failure Mode and Effects Analysis (FMEA) report and case studies illustrating how to achieve higher diagnostic coverage with the IP, and we offer further support for clients’ own chip-level analyses. Additionally, the document package clearly defines the development interfaces between ARM and its licensees.

Safety Element out of Context

Building and using a safety case requires a step-by-step approach. The safety case is provided by the chip developer, who must take into account the information from all of its suppliers before delivering it to clients. Most licensed chip IP is developed as a “Safety Element out of Context” (SEooC), meaning that its designers cannot know how the chip will subsequently be used. Therefore, the safety manual must specify the IP developer’s recommendations and instructions for use of the IP in a chip, to prevent misuse. Similarly, tier-1 controller suppliers can also use the SEooC model to develop safety functions. The safety document package at the IP level can thus be used throughout the value chain and is an essential part of IP development.

Functional Safety Will Gradually Become a Mandatory Requirement

From automotive to medical to industrial devices, more and more applications rely on electronics, so functional safety is becoming more important and will become a conventional requirement. Functional safety is a requirement that IP vendors must meet and a prerequisite for the smooth operation of business models built on that IP; IP vendors therefore need to license each development result to as many chip partners as possible, and vice versa. Built on solid quality and reliability, functional safety can bring broader benefits, driving improvements in quality and reliability across the entire industry. Functional safety is the foundation on which chip designers can address higher-level automotive challenges, including driver safety, fuel economy, comfort, and in-vehicle infotainment.
