New AI Techniques! Boosting C Language Unit Testing Efficiency and Quality!


In today’s AI-driven era, the software development process is undergoing unprecedented changes.

Cutting-edge AI models such as DeepSeek and ChatGPT are widely used for code generation, documentation writing, and requirements analysis, and they significantly improve development efficiency. C language unit testing, however, remains mired in “manual mode”. It has to deal with low-level complexities such as pointer operations, memory leak detection, and multithreaded race conditions, while traditional symbolic execution techniques are limited by path explosion, difficult constraint solving, and the lack of simulation for hardware dependencies.

Data shows that up to 70% of development time is consumed by inefficient testing processes. Manually writing test cases is time-consuming, and the complex configuration and steep learning curve of traditional testing tools deter developers.

In this context, AI-driven intelligent unit testing has emerged, redefining unit testing with disruptive innovation and opening a new chapter in intelligent and efficient testing.

Technical Competition: The Battle Between Symbolic Execution and Large Models

Symbolic execution is efficient but has its bottlenecks.

Most unit testing tools on the market rely on traditional symbolic execution techniques, which parse program paths and simulate all possible values of variables to automatically generate high-coverage test cases.

This method can accurately identify boundary conditions and exceptional paths in the code, and largely avoids the limitations of manually written tests.

However, symbolic execution may encounter path explosion issues when dealing with complex code, and the generated test cases have poor interpretability. Additionally, for code that depends on environments or external interactions, its scalability is somewhat limited.
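To make the path explosion problem concrete, here is a minimal sketch in C. The function `classify` and the helper `path_count` are hypothetical and are not taken from any tool discussed here. Each independent two-way branch doubles the number of paths a symbolic executor has to explore, so n such branches give 2^n paths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical function under test: three independent two-way branches,
 * so there are 2 * 2 * 2 = 8 distinct execution paths. */
int classify(int a, int b, int c) {
    int score = 0;
    if (a > 0) score += 1;
    if (b > 0) score += 2;
    if (c > 0) score += 4;
    return score;
}

/* Number of paths through n independent two-way branches: 2^n.
 * Illustrative helper only; real code rarely has fully independent branches. */
uint64_t path_count(unsigned n_branches) {
    assert(n_branches < 64);  /* keep the shift well defined */
    return (uint64_t)1 << n_branches;
}
```

With only 20 such branches the count already exceeds one million, which is why loops and deeply nested conditions quickly become expensive for path-based exploration.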

Large models cover a wide range but lack practicality.

In recent years, large language models (such as DeepSeek) have been widely applied to the automatic generation of unit tests.

This method typically generates test code directly based on the code context and attempts to cover as many code paths as possible.

However, because large models lack a deep understanding of compilation rules and of the specific code environment, the generated test cases often have a high compilation error rate, while line coverage and branch coverage are relatively low.

Moreover, large models tend to generate loosely structured and unconstrained test code, making it difficult to ensure the integrity and executability of the tests, which affects the actual testing results.

Structured guidance for large models achieves a win-win situation.

This method originates from the paper “STRUT: Structured Seed Case Guided Unit Test Generation for C Programs using LLMs” presented at the international conference ISSTA 2025 (see Figure 1 for the paper cover).

To our knowledge, this is the first published intelligent test case generation method for C language programs.

The method uses structured seed cases, constructed through static analysis, to guide large models in generating a high-coverage set of structured test cases. These structured test cases are then converted into high-quality executable test cases through rule-based transformation.

By introducing a structured test case pattern, large models can generate more standardized test cases within constraints, solving the problem of numerous compilation errors in test case generation by large models. The technical solution is shown in Figure 2.

Figure 1 STRUT: Structured Seed Case Guided Unit Test Generation for C Programs using LLMs Cover Page

Figure 2 Technical Solution for Intelligent Generation of Unit Test Cases Guided by Structured Seed Cases

This technical solution is mainly divided into three parts:

(1) Static analysis of the function under test to construct context information, generating structured seed cases oriented to interfaces.

Static analysis is used to parse the function under test and extract its dependencies, code blocks, and interface data. These three parts are organized into a context database file for the function under test.

At the same time, test cases are defined as structured patterns, where each test case consists of test inputs, test outputs, and stub functions. Inputs and outputs consist of expressions and values, representing the test data that need to be assigned during testing.

Stub functions consist of function names, expressions, and values, representing the assignment of certain test data in the called function. An example of the generation process is shown in Figure 3.
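The structured pattern described above can be sketched as a set of C types. This is only an illustration under stated assumptions: the type names, the string-based encoding of expressions and values, and the example functions `read_scaled` and `read_raw` are all hypothetical and are not taken from the paper or the tool.

```c
#include <assert.h>
#include <stddef.h>

/* One assignment: an expression and the value it takes in the test. */
typedef struct {
    const char *expr;
    const char *value;
} Assignment;

/* One stub assignment: which called function it belongs to, plus the
 * expression inside that function and the value it should take. */
typedef struct {
    const char *func_name;
    const char *expr;
    const char *value;
} StubAssignment;

/* A structured test case: test inputs, test outputs, and stub functions. */
typedef struct {
    const Assignment *inputs;     size_t n_inputs;
    const Assignment *outputs;    size_t n_outputs;
    const StubAssignment *stubs;  size_t n_stubs;
} StructuredTestCase;

/* Hypothetical seed case for: int read_scaled(int sensor_id, int *out_value),
 * which is assumed to call: int read_raw(int id). */
static const Assignment seed_inputs[] = {
    { "sensor_id", "0" },
    { "*out_value", "0" },
};
static const Assignment seed_outputs[] = {
    { "return", "0" },
    { "*out_value", "0" },
};
static const StubAssignment seed_stubs[] = {
    { "read_raw", "return", "0" },
};
static const StructuredTestCase seed_case = {
    seed_inputs,  sizeof seed_inputs / sizeof seed_inputs[0],
    seed_outputs, sizeof seed_outputs / sizeof seed_outputs[0],
    seed_stubs,   sizeof seed_stubs / sizeof seed_stubs[0],
};

/* Total number of assignments a test case carries. */
size_t total_assignments(const StructuredTestCase *tc) {
    assert(tc != NULL);
    return tc->n_inputs + tc->n_outputs + tc->n_stubs;
}
```

Keeping every field as an expression and value pair means a generated case can be checked mechanically against the interface before any C code is produced from it.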

Figure 3 Process of Generating Structured Seed Cases Oriented to Interfaces

(2) Test case generation and rule-based test code generation.

The prompts for generating test cases consist of the context of the function under test, code blocks, and seed cases. The prompts use a one-shot prompting method, using seed cases as examples to demonstrate the standard test case format to the LLM.
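As a rough sketch of how such a one-shot prompt might be assembled, the C function below joins the three ingredients named above into one string. The function name `build_prompt`, the wording of the prompt, and the section order are assumptions made for illustration; the actual prompts used by the method are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>

/* Assemble a one-shot prompt from the context of the function under test,
 * its code block, and one seed case that demonstrates the expected format.
 * Returns the prompt length, or -1 if the buffer is too small. */
int build_prompt(char *buf, size_t buf_size,
                 const char *context,
                 const char *code_block,
                 const char *seed_case) {
    assert(buf != NULL && buf_size > 0);
    int n = snprintf(buf, buf_size,
                     "Context of the function under test:\n%s\n\n"
                     "Code of the function under test:\n%s\n\n"
                     "Example test case (follow this format exactly):\n%s\n\n"
                     "Generate further test cases in the same format.\n",
                     context, code_block, seed_case);
    /* snprintf reports the length it wanted to write; reject truncation. */
    return (n >= 0 && (size_t)n < buf_size) ? n : -1;
}
```

The seed case plays the role of the single worked example, so the model sees the required structure once before it is asked to produce more cases.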

This method employs a rule-based code transformation approach to convert structured test cases into C test code (as shown in Figure 4), with the test code divided into three parts: data preparation, test execution, and result validation.
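The three-part layout can be illustrated with a hand-written example of what such test code might look like. Everything here is hypothetical: the function under test `read_scaled`, the stubbed dependency `read_raw`, and the use of plain `assert` for validation are assumptions for the sketch, not output of the actual tool.

```c
#include <assert.h>

/* Stub for the called function: the test sets its return value. */
static int stub_read_raw_ret;
int read_raw(int id) {
    (void)id;  /* the stub ignores its argument */
    return stub_read_raw_ret;
}

/* Hypothetical function under test. */
int read_scaled(int sensor_id, int *out_value) {
    int raw = read_raw(sensor_id);
    if (raw < 0) {
        return -1;  /* error path: the output is left untouched */
    }
    *out_value = raw * 2;
    return 0;
}

/* Test code for one structured test case (normal path). */
void test_read_scaled_case_1(void) {
    /* 1. Data preparation: test inputs and stub values */
    int sensor_id = 3;
    int out_value = 0;
    stub_read_raw_ret = 21;

    /* 2. Test execution */
    int ret = read_scaled(sensor_id, &out_value);

    /* 3. Result validation: expected test outputs */
    assert(ret == 0);
    assert(out_value == 42);
}

/* Test code for a second structured test case (error path). */
void test_read_scaled_case_2(void) {
    /* 1. Data preparation */
    int sensor_id = 3;
    int out_value = 7;
    stub_read_raw_ret = -1;

    /* 2. Test execution */
    int ret = read_scaled(sensor_id, &out_value);

    /* 3. Result validation */
    assert(ret == -1);
    assert(out_value == 7);
}
```

Because each section is filled in from one field of the structured test case, the shape of the code is fixed by the rules rather than by the model, which is where the reduction in compilation errors would be expected to come from.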

Figure 4 Rule-Based Test Code Generation

(3) Test execution and feedback-based test optimization.

During the test execution phase, the test code is first compiled and run, and the resulting test coverage determines whether optimization is needed. If the coverage of the function under test is below 80%, an optimization round is performed: relevant information is collected and used to guide the LLM to generate test cases again. The prompts for the test optimization process are shown in Figure 5.
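The feedback loop can be sketched as follows. The 80% threshold comes from the description above; the callback type, the `max_rounds` cap, and the `fake_step` stand-in for "generate, compile, run, and measure coverage" are assumptions added so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define COVERAGE_TARGET 80.0  /* threshold in percent, as stated in the text */

/* Hypothetical callback: generates (or regenerates) test cases, compiles and
 * runs them, and returns the coverage of the function under test in percent. */
typedef double (*generate_and_run_fn)(int round, void *ctx);

typedef struct {
    int rounds;       /* number of generation rounds performed */
    double coverage;  /* coverage after the last round, in percent */
} OptimizeResult;

/* Repeat generation until coverage reaches the target.
 * max_rounds is an assumed safety cap, not part of the described method. */
OptimizeResult optimize_until_covered(generate_and_run_fn step, void *ctx,
                                      int max_rounds) {
    assert(step != NULL);
    OptimizeResult r = { 0, 0.0 };
    while (r.rounds < max_rounds) {
        r.coverage = step(r.rounds, ctx);
        r.rounds++;
        if (r.coverage >= COVERAGE_TARGET) {
            break;  /* target reached, no further optimization needed */
        }
    }
    return r;
}

/* Stand-in for the real pipeline: coverage grows by 20 points per round. */
double fake_step(int round, void *ctx) {
    (void)ctx;
    return 50.0 + 20.0 * round;
}
```

With `fake_step`, the first two rounds report 50% and 70%, so a third round is triggered and the loop stops once it reports 90%.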

Figure 5 Test Optimization Prompts

The table below compares the statement coverage, branch coverage, and compilation pass rate of symbolic execution, full-version DeepSeek V3, GPT-4o, and the structured seed case generation method on 10 C projects, drawn from different types of open-source projects and from safety-critical domains.

The results indicate:

The structured seed case generation method has an average statement coverage of 77.67%, branch coverage of 63.60%, and compilation pass rate of 92.34%. This method has the highest coverage among the four methods.

Compared to the full-version DeepSeek V3, the structured seed case generation method improves statement coverage by 55.23%, branch coverage by 44.57%, and compilation pass rate by 45.67%.

Compared to GPT-4o, it improves statement coverage by 37.34%, branch coverage by 33.02%, and compilation pass rate by 39.14%.

Compared to symbolic execution, the structured seed case generation method has higher coverage in 7 out of 10 projects, with an average statement coverage improvement of 5.92% and branch coverage improvement of 6.18%.

These results show that the coverage and compilation pass rate of the structured seed case generation method far exceed what the full-version DeepSeek V3 and GPT-4o large models achieve on their own. The method also has significant advantages over traditional symbolic execution, especially in logically complex projects, where statement and branch coverage can improve by over 30%.

The above method is applied in the intelligent unit testing tool SunwiseAUnit, with all result data provided by this tool.


Disclaimer: This article is a user submission to 51Testing Software Testing Network by Xuan Yu Information, Shi Lanlan. In submitting it, the user has promised to bear independently any legal responsibility involving intellectual property rights and has assured 51Testing that the article does not contain plagiarized content. The article is published solely for learning and communication, not for any commercial use. Please do not reproduce it without authorization; otherwise the author and 51Testing have the right to pursue responsibility. If you find any content in this public account that is suspected of plagiarism, please report it by email to [email protected] and provide relevant evidence. Once verified, the suspected infringing content will be deleted immediately.
