Published on the 26th of every month
TÜV Nord Industrial Services Functional Safety Discussion

Software Fault Injection Methods (Part 2)
Abstract
Software fault injection is an important technique for functional safety verification. This article provides a basic overview of software fault injection and introduces existing fault injection techniques.
04
Software Fault Injection Techniques
//
Many Software Fault Injection (SFI) techniques and tools have been developed over more than 20 years. Here, we illustrate and discuss this work by distinguishing two basic approaches: injecting fault effects (also known as error injection), where errors are introduced by perturbing the system state, and injecting actual faults, where the program code is modified to emulate software faults. The following subsections review these software fault injection techniques:
· Data error injection methods, the earliest approach, built on the hardware fault injection techniques available at the time;
· Interface error injection methods, aimed at testing the robustness of a component in its interactions with other components;
· Methods for injecting actual faults, which introduce small faulty changes into the program code.
4.1 Data Error Injection
Early methods for injecting fault effects were developed in the context of studying hardware faults through Software-Implemented Fault Injection (SWIFI). SWIFI aims to reproduce the effects of hardware faults (such as CPU, bus, and memory faults) by disturbing the state of memory or hardware registers. SWIFI replaces the contents of memory locations or registers with corrupted values according to the following criteria:
· What to inject. A single bit, byte, or word in a memory location or register is corrupted. The error types are defined by analyzing the errors produced by electrical or gate-level faults. Common error types include replacing bits with fixed values (stuck-at-0 and stuck-at-1 faults) or inverted values (bit flips).
· Where to inject. Because memory contains a very large number of locations, errors injected into memory typically target a subset of them. Injection can focus on randomly selected locations in specific memory areas (e.g., stack, heap, global data) or on user-selected locations (e.g., specific variables in memory). Errors injected into registers can target the registers accessible to software (e.g., data and address registers).
· When to inject. Error injection may be time-triggered or event-triggered. In the former case, errors are injected after a given experiment time, selected by the user or drawn from a probability distribution. In the latter case, errors are injected when a specific event occurs during execution, such as the first access or every access to the target location. In this way, three types of hardware faults can be simulated: transient faults (i.e., occasional faults), intermittent faults (i.e., repeated faults), and permanent faults. A minimal bit-flip sketch illustrating these criteria follows this list.
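As an illustration of the three criteria above, here is a minimal sketch in C. The target variable, the injection time, and the workload loop are all hypothetical choices made for this example, not part of any specific tool: one randomly chosen bit of a program variable is flipped after a user-selected point in the run, simulating a transient fault.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical target state: a variable standing in for the memory
   location chosen under the "where to inject" criterion.            */
static uint32_t sensor_reading = 0x12345678u;

/* "What to inject": flip a single, randomly chosen bit of the target. */
static void inject_bit_flip(uint32_t *target)
{
    unsigned bit = (unsigned)(rand() % 32);
    *target ^= (1u << bit);
    printf("[injector] flipped bit %u, value is now 0x%08X\n",
           bit, (unsigned)*target);
}

int main(void)
{
    srand((unsigned)time(NULL));

    /* "When to inject": time-triggered, after a fixed number of
       iterations of the simulated workload; the corruption is applied
       only once, which corresponds to a transient fault.              */
    const int inject_at_iteration = 5;

    for (int i = 0; i < 10; i++) {
        if (i == inject_at_iteration)
            inject_bit_flip(&sensor_reading);

        /* Simulated workload reading the (possibly corrupted) state. */
        printf("iteration %d: sensor_reading = 0x%08X\n",
               i, (unsigned)sensor_reading);
    }
    return 0;
}
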
It is worth noting that the hardware errors injected by SWIFI can target both program state (e.g., data and address registers, stack, and heap memory) and program code (e.g., the memory areas storing code, before or during program execution). This distinction is important for software fault injection: corruptions of program state are intended to reflect the effects of software faults, i.e., errors caused by erroneous program execution such as incorrect pointers, flags, or control flow, which SWIFI tools can introduce directly; in contrast, corruptions of program code are intended to reflect actual software faults in the code.
4.2 Interface Error Injection
Injecting errors into input parameters aims to simulate the effects of faults external to the target, including the effects of software faults in external software components, and to assess the target's ability to detect and handle corrupted inputs. Similarly, corrupting output values simulates the output of a faulty component and can be used to evaluate the impact of faults on the rest of the system.
Faults in input parameters may reveal flaws in the design and implementation of the target's error detection and recovery mechanisms (e.g., input-handling code). Interface error injection is commonly adopted in robustness testing, which evaluates the degree to which "a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions." It should be noted that the goals of robustness testing and interface error injection differ from those of functional testing techniques such as black-box testing: robustness testing aims to assess the robust behavior of a software module in the face of invalid inputs (e.g., avoiding process crashes or raising warning signals), independently of the functional correctness of the target.
Interface error injection can be performed in two ways. The first is based on a test driver linked to the target component (e.g., a program that uses the API exported by the target), which exercises the target by submitting invalid inputs. This is similar to unit testing, except that robustness, rather than functional correctness, is evaluated. The second intercepts and perturbs the interactions between the target and the rest of the system: an interceptor is triggered whenever the target component is called and modifies the original input to introduce a corrupted one. In this case, the target component is tested in the context of the integrated system as a whole. This approach is similar to SWIFI, since the original data flowing through the system (here, interface inputs) is replaced with corrupted data.
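A minimal sketch of the second, interceptor-based method follows, assuming a hypothetical component API read_sensor(); the function names and the corruption strategy are illustrative only, not taken from any specific tool.

#include <stdio.h>

/* Hypothetical target component (normally in a separate module). */
static int read_sensor_real(int channel)
{
    if (channel < 0 || channel > 7) {
        fprintf(stderr, "read_sensor: invalid channel %d\n", channel);
        return -1;              /* robust behavior: reject the invalid input */
    }
    return channel * 100;       /* dummy measurement */
}

/* Interceptor: stands between the callers and the target component.
   During an experiment it replaces the original parameter with a
   corrupted value, so the target is tested in its integrated context. */
static int fault_injection_enabled = 1;

static int read_sensor(int channel)
{
    if (fault_injection_enabled)
        channel = -1;           /* interface error: out-of-range input */
    return read_sensor_real(channel);
}

int main(void)
{
    /* The rest of the system calls the API as usual; the corruption is
       introduced transparently by the interceptor.                     */
    int value = read_sensor(3);
    printf("caller received: %d\n", value);
    return 0;
}
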

In an interface error injection experiment, even though the target API receives several input parameters and is called several times, typically only one input parameter in one call is corrupted. Three methods are commonly used to generate invalid input values (a combined sketch follows this list):
· Fuzzing: The original value is replaced with a randomly generated value.
· Bit flipping: Corrupted values are generated by flipping one or more bits of the original value.
· Type-based injection: The original value is replaced with an invalid value selected according to the type of the corrupted input parameter, where the type is derived from the API exported by the target. This method defines a pool of invalid values for each data type, selected by analyzing the type's domain (e.g., "NULL" in the case of C pointers).
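To make the three strategies concrete, the following sketch uses hypothetical, deliberately small value pools (real tools use much richer ones): it generates corrupted values for an integer parameter by fuzzing, bit flipping, and type-based selection, plus a type-based invalid value for a C pointer parameter.

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Fuzzing: replace the original value with a randomly generated one. */
static int fuzz_int(void)
{
    return rand();
}

/* Bit flipping: corrupt the original value by flipping one of its bits. */
static int flip_bit(int original, unsigned bit)
{
    return original ^ (1 << bit);
}

int main(void)
{
    srand((unsigned)time(NULL));

    /* Type-based injection: pools of invalid values defined per data
       type by analyzing the type's domain.                            */
    int  int_pool[] = { 0, -1, 1, INT_MAX, INT_MIN };   /* integers   */
    int *ptr_pool[] = { NULL, (int *)0x1 };             /* C pointers */

    int original = 42;   /* the parameter value to be corrupted */

    printf("fuzzed      : %d\n", fuzz_int());
    printf("bit-flipped : %d\n", flip_bit(original, 7));
    printf("type-based  : %d (from the integer pool)\n", int_pool[3]);
    printf("type-based  : %p (from the pointer pool)\n", (void *)ptr_pool[0]);
    return 0;
}
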
4.3 Injecting Code Changes
The previous subsections mainly discussed simulating software faults by injecting fault effects (i.e., errors) using SWIFI-style methods. A well-known open issue with these methods is the representativeness of the injected errors (such as bit flips), which do not necessarily match the errors actually produced by software faults.
To address this representativeness issue, more recent SFI research has focused on injecting changes into the program code (code changes). Injecting code changes can emulate real software faults, since the injected faults produce errors and failures similar to those generated by real software faults. In principle, code changes can be introduced by applying SWIFI techniques to the memory area where the program code is stored, or to the binary executable. It should be noted, however, that because deployed programs have usually undergone thorough testing, the residual software faults are small, localized changes, so injecting representative software faults requires specialized tools and techniques.
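As a simplified illustration of what an injected code change can look like (the function and the fault type below are hypothetical examples, not drawn from any specific tool's fault library), the injected version of the function omits a bounds check that the original performs, mimicking a "missing check" programming mistake.

#include <stdio.h>

#define BUF_SIZE 8
static int buffer[BUF_SIZE];

/* Original code: validates the index before writing. */
static int store_value(int index, int value)
{
    if (index < 0 || index >= BUF_SIZE)   /* bounds check */
        return -1;
    buffer[index] = value;
    return 0;
}

/* Injected code change: the same function with the bounds check removed,
   a small, localized change representative of a real programming mistake. */
static int store_value_faulty(int index, int value)
{
    buffer[index] = value;                /* fault: missing check on input */
    return 0;
}

int main(void)
{
    /* The original code rejects an out-of-range index. */
    printf("original: %d\n", store_value(10, 1));

    /* The faulty version would silently write out of bounds for index 10;
       it is called here with a valid index only, to keep this sketch free
       of undefined behavior.                                              */
    printf("faulty  : %d\n", store_value_faulty(3, 1));
    return 0;
}
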
05
Conclusion
//
When selecting a method for a given system, the characteristics of the fault injection methods discussed above should be taken into account.
Error injection is often used to evaluate the robustness of individual components and to improve error handling in specific parts of the code. The main reason is that error injection allows experiments to focus on specific parts of the system, since it can assess the impact of errors on a specific component interface or program variable. Indeed, error injection does not require waiting for errors to be generated and to propagate to the part of the program state under evaluation. Moreover, since error injection can be applied to individual components, it can be performed in the early stages of software verification.
In contrast, injecting code changes is aimed at evaluating the fault-tolerant system as a whole and at performing quantitative assessments and comparisons between alternative design choices. Code changes are better suited to these goals because they are based on representative models of software faults and closely emulate the behavior of faulty software. This is an important requirement for quantitative assessments and comparisons, which must account for the relative probabilities with which faults occur in order to reflect the behavior the system will exhibit in operation. This makes injecting code changes more suitable for the later stages of software verification, when the system's components have already been integrated and the developers' goal is to assess the fault tolerance the system is expected to provide over its operational lifetime (and derived measures such as availability).
Author: Zheng Wei

Functional Safety (SIL/ASIL) Certification Evaluator
Functional Safety Engineer Qualification Course Authorized Instructor & Functional Safety Expert
ASPICE Provisional Assessor
National Registered Auditor
Member of the China Industrial Process Measurement Control and Automation Standardization Technical Committee
Member of the Sub-Technical Committee on System and Functional Safety
Expert Member of the National Functional Safety Standard GB/T20438 Drafting Working Group
Leader in Organizing and Promoting the “TÜV Functional Safety Engineer” Course

About TÜV Nord
Functional Safety Engineer, Cyber (Information) Security Engineer Training
TÜV Nord Functional Safety and Cyber (Information) Security Engineer Training enjoys an excellent international reputation. The training is aimed specifically at professionals in the fields of functional safety and cyber (information) security (such as process control, machinery safety, rail transportation, automotive safety, etc.). It covers functional safety requirements from multiple industries, including IEC61508, IEC61511, ISO26262, IEC62061, and ISO13849, as well as cyber (information) security requirements from IEC62443, ISO21434, and others.
TÜV Nord's functional safety and cyber (information) security training series aims to cultivate professionals with the relevant knowledge and practical abilities in functional safety and cyber (information) security, to provide enterprises with complete solutions in the area of personnel safety, and to help enterprises keep pace with international functional safety and cyber (information) security technologies. The examination requirements stipulate that trainees must have project technical experience in the fields of functional safety and cyber (information) security and be able to handle practical projects.
In 2023, the TÜV Nord Functional Safety Engineer, Automotive Functional Safety Manager, and Cyber (Information) Security Engineer training courses will continue as planned. Early registration is welcome, to enjoy discounted prices and receive course materials in advance.

About TÜV Nord Functional Safety and Cyber (Information) Security Services
TÜV Nord Functional Safety and Cyber (Information) Security Services is mainly engaged in certification, assessment, training, and other services related to functional safety and cyber (information) security. Currently, TÜV Nord's functional safety and cyber (information) security certification services in the Greater China region mainly include functional safety and cyber (information) security training, project technical assessments, functional safety and cyber (information) security management system certification, functional safety expert certification, and other businesses.
TÜV Nord's functional safety and cyber (information) security services span various industries, including aerospace, rail transportation, automotive electronics, nuclear power instrumentation and control systems, process automation safety instrumented systems (SIS), valves, actuators, industrial machinery, elevators, escalators, smart grids, etc.
Experts in TÜV Nord's functional safety and cyber (information) security services have participated in the OPEN Alliance, the NA 052 DIN Automotive Engineering Standards Committee, the ISO/TC 22/SC 33 Working Group, the FlexRay Consortium in-vehicle network standards, AutoSAR, SAE International, SOTIF, and other standards committees or technical organizations. They have repeatedly taken part in drafting international functional safety and cyber (information) security standards such as IEC61508, ISO26262, IEC62061, ISO13849, IEC62443, ISO21434, and ISO21448, and have many years of R&D experience in safety-related systems together with a precise understanding of the standards. Well-known enterprises in various fields have chosen TÜV Nord as a partner and have highly praised TÜV Nord's professionalism and sense of responsibility.
For related services, please contact:
Zheng Wei
Phone: 13402122657
WeChat: zhengwei_SIL
Email: [email protected]


