How to Prevent Memory Leaks in Embedded Programming?

1. Introduction

Recently, several products in our department have hit problems caused by memory leaks: after several days of continuous operation, memory is exhausted and the board resets. Memory leaks are regarded as low-level errors, yet letting one slip into production can have severe consequences. Because a leaking board resets after a roughly fixed period of operation, the only remedy in the field is a bulk upgrade, which also has a significant negative impact. The recurrence of such issues, including one introduced by a long-time employee, indicates that many of us do not understand memory leaks deeply enough. This article introduces the principles behind memory leaks and the methods for inspecting code for them, with the aim of eliminating such problems at the code review stage.

Note: There are various methods to prevent memory leaks, such as strengthening code reviews, tool detection, and memory testing. This article focuses on enhancing the capabilities of developers.

2. Principles of Memory Leak Issues

2.1 Storage of Heap Memory in C Code

Memory leaks occur only with heap memory. Stack memory cannot leak because it is allocated and released automatically. In C, heap memory is requested with malloc; a typical allocation looks like this:

char *info = NULL;    /** Converted string **/
info = (char*)malloc(NB_MEM_SPD_INFO_MAX_SIZE);
if( NULL == info)
{
    (void)tdm_error("malloc error!\n");
    return NB_SA_ERR_HPI_OUT_OF_MEMORY;
}

Because malloc returns a memory address, the variable that holds heap memory must be a pointer (unless the code is written extremely poorly). This is worth repeating, because it is the key to the rest of this article: the variable that holds heap memory must be a pointer. The pointer may be single-level or multi-level.

There are many variants of and wrappers around malloc, such as g_malloc, g_malloc0, and VOS_Malloc. They typically end up calling malloc or an equivalent allocator, so the same rules apply to them.

2.2 Methods of Obtaining Heap Memory

On reading the title of this section, some readers may ask: wasn't malloc, covered in the previous section, the way to obtain heap memory? Calling malloc is indeed the most direct way, but a developer who knows only that way will easily fall into traps. Generally speaking, there are two methods for obtaining heap memory:

Method 1: Directly assigning the function return value to a pointer, typically represented as follows:

char *local_pointer_xx = NULL;
local_pointer_xx = (char*)function_xx(para_xx, ...);

Functions that involve memory allocation generally return pointer types, for example:

GSList* g_slist_append (GSList   *list, gpointer  data);

Method 2: Using the pointer address as a function return parameter to save the heap memory address, typically represented as follows:

int ret;
char *local_pointer_xx = NULL;    /** Converted string **/
ret = function_xx(..., &local_pointer_xx, ...);

Functions that involve memory allocation generally have a parameter that is a double pointer, for example:

__STDIO_INLINE _IO_ssize_t
getline (char **__lineptr, size_t *__n, FILE *__stream);

As mentioned earlier, allocating memory with malloc is a specific case of Method 1. In essence the two methods are the same: in both, memory is allocated inside a function and handed to the caller. Only the hand-off differs: Method 1 passes the pointer through the return value, while Method 2 passes it through a parameter.
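
The two methods can be compared side by side in a minimal sketch. The function names dup_by_return and dup_by_out_param are hypothetical, invented here for illustration; they are not part of any library mentioned in this article.

```c
#include <stdlib.h>
#include <string.h>

/* Method 1 (hypothetical helper): the heap address travels back to the
 * caller through the return value. Returns NULL on failure. */
char *dup_by_return(const char *src)
{
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);

    if (copy != NULL) {
        memcpy(copy, src, len);
    }
    return copy;
}

/* Method 2 (hypothetical helper): the heap address travels back through a
 * double-pointer parameter; the return value is only a status code
 * (0 on success, -1 on failure). */
int dup_by_out_param(const char *src, char **out)
{
    if (out == NULL) {
        return -1;
    }
    *out = dup_by_return(src);
    return (*out != NULL) ? 0 : -1;
}
```

In both cases the caller ends up owning a heap block and is responsible for passing it to free once it is no longer needed.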

2.3 Three Elements of Memory Leaks

The most common memory leak issues include the following three elements:

Element 1: There is a local pointer variable defined within the function;

Element 2: The local pointer has obtained memory through one of the “two methods of obtaining heap memory” mentioned in the previous section;

Element 3: Before the function returns (on any branch, normal or exceptional), the memory is neither released, nor saved to a global variable, nor returned to the upper-level function.
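
The three elements can be seen together in a small sketch. Everything here is hypothetical: counted_malloc and counted_free are illustrative wrappers that only count live blocks so the leak becomes observable, and the two name_length functions exist purely to contrast a leaking function with a corrected one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counting wrappers, used only to make the leak visible. */
long live_blocks = 0;

void *counted_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        live_blocks++;
    }
    return p;
}

void counted_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
    }
    free(p);
}

long live_block_count(void)
{
    return live_blocks;
}

/* Leaky version: all three elements are present. */
int name_length_leaky(const char *name)
{
    char *copy = NULL;                 /* Element 1: local pointer */
    size_t len = strlen(name) + 1;

    copy = counted_malloc(len);        /* Element 2: obtains heap memory */
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, name, len);

    return (int)strlen(copy);          /* Element 3: returns without releasing */
}

/* Fixed version: the block is released before the function returns. */
int name_length_fixed(const char *name)
{
    char *copy = NULL;
    size_t len = strlen(name) + 1;
    int result;

    copy = counted_malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, name, len);

    result = (int)strlen(copy);
    counted_free(copy);
    return result;
}
```

Each call to the leaky version leaves one more live block behind, which is exactly how a board running for days slowly exhausts its memory.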

2.4 Misconceptions about Memory Release

Anyone who has written C code knows that heap memory must be released once it is no longer needed. So why do memory leaks still occur so easily? Partly because of developers’ lack of experience, lack of awareness, or momentary negligence, and partly because of misconceptions about memory release. Many developers believe that the memory to be released is limited to the following two types:

1) Memory allocated directly using memory allocation functions such as malloc, g_malloc, etc.;

2) Memory returned by an interface that the developer already knows allocates memory. For example, colleagues working on iBMC should know that the memory pointed to by list must be released after calling the following interface:

dfl_get_object_list(const char* class_name, GSList **list);

A developer who writes code with this mindset will not think to release memory after calling an unfamiliar interface that allocates it, and a memory leak follows naturally.
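
As an illustration of an interface whose allocation is easy to overlook, consider POSIX getline, which was shown earlier as an example of Method 2. The sketch below assumes a POSIX system; first_line_length is a hypothetical helper written for this example. No call to malloc appears in it, yet a heap block must still be released.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the next line of the stream,
 * not counting the trailing newline, or -1 if nothing could be read. */
long first_line_length(FILE *stream)
{
    char *line = NULL;   /* getline allocates (or grows) this buffer */
    size_t cap = 0;
    long result = -1;
    ssize_t n;

    n = getline(&line, &cap, stream);
    if (n >= 0) {
        if (n > 0 && line[n - 1] == '\n') {
            n--;
        }
        result = (long)n;
    }

    /* The buffer belongs to the caller and must be freed,
     * even when getline reports failure. */
    free(line);
    return result;
}
```

A reviewer who looks only for malloc would find nothing to check in this function; the double-pointer argument is the signal that heap memory may be coming back.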

3. Methods for Inspecting Memory Leak Issues

Inspecting code for memory leaks mainly requires good code review habits. Corresponding to the three elements of a memory leak, the reviewer must do the following three things:

1) When seeing a local pointer in a function, be alert to the potential for memory leak issues and develop a habit of further investigation.

2) Analyze the assignment operations to the local pointer to determine if they belong to one of the “two methods of obtaining heap memory” mentioned earlier. If so, analyze what the returned pointer actually points to: is it global data, static data, or heap memory? For unfamiliar interfaces, find the corresponding documentation or source code for analysis; alternatively, check other parts of the code that reference this interface to see if memory has been released.

3) If it is confirmed that there is a memory allocation operation for the local pointer, analyze the destination of that memory: will it be saved in a global variable? Or will it be returned as a function return value? If neither, check all places with “return” in the function to ensure that the memory is correctly released.
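
One common way to make the last check easy, offered here as a suggestion rather than as this article's prescribed method, is to route every exit of the function through a single cleanup label. The function format_pair_length below is hypothetical and exists only to show the pattern.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: builds "key=value" in a temporary heap buffer and
 * returns its length, or -1 on any error. */
int format_pair_length(const char *key, const char *value)
{
    int result = -1;
    char *buf = NULL;
    size_t size = 0;

    if (key == NULL || value == NULL) {
        goto cleanup;                 /* error path before allocation */
    }

    size = strlen(key) + strlen(value) + 2;   /* '=' plus terminating '\0' */
    buf = malloc(size);
    if (buf == NULL) {
        goto cleanup;                 /* allocation failed */
    }

    if (snprintf(buf, size, "%s=%s", key, value) < 0) {
        goto cleanup;                 /* error path after allocation */
    }

    result = (int)strlen(buf);

cleanup:
    free(buf);                        /* free(NULL) is a harmless no-op */
    return result;
}
```

With a single return statement, the reviewer has one place to verify instead of one per branch.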

4. Common Memory Errors and Detection

After the introduction of the MMU in processors, the operating system took over memory management, responsible for the mapping of virtual space to physical space and permission management.

The memory management subsystem divides a process’s virtual space into different areas, such as code segment, data segment, BSS segment, heap, stack, mmap mapping area, kernel space, etc., each with different read, write, and execute permissions.

Memory management assigns each area specific access permissions, such as read-only, read-write, or no access. The data segment, BSS segment, heap, and stack are all read-write areas, while the code segment is a read-only area. Writing data to an address in the code segment triggers a segmentation fault.

For applications, common memory errors can generally be divided into the following types: out-of-bounds access (memory overflow), memory corruption, double free, and illegal pointers.

5. Segmentation Faults

A segmentation fault occurs when we write to a read-only region of the address space or access a prohibited address (such as the zero address). Kernel space, the zero address, and the gap between the heap and the mmap area are either occupied by the kernel or still in an “undeveloped” (unmapped) state; such memory must be requested before it can be used.

(1) When working with linked lists, we typically use a pointer to visit each node. If traversal has run past the head or tail of the list, the pointer is NULL, and accessing a member of a node through it amounts to accessing the zero address. This causes a segmentation fault; such a pointer is an illegal pointer.
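
A minimal sketch of traversal that never dereferences NULL follows. The node type and the functions list_sum and list_last are hypothetical, defined only for this example.

```c
#include <stddef.h>

/* Hypothetical singly linked list node. */
struct node {
    int value;
    struct node *next;
};

/* Sums all values; the loop condition checks the pointer before every
 * dereference, so an empty list (head == NULL) is handled safely. */
int list_sum(const struct node *head)
{
    int sum = 0;
    const struct node *cur;

    for (cur = head; cur != NULL; cur = cur->next) {
        sum += cur->value;
    }
    return sum;
}

/* Stores the value of the last node in *out.
 * Returns 0 on success, -1 if the list is empty or out is NULL. */
int list_last(const struct node *head, int *out)
{
    if (head == NULL || out == NULL) {
        return -1;
    }
    while (head->next != NULL) {   /* stop on the last node, not past it */
        head = head->next;
    }
    *out = head->value;
    return 0;
}
```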

(2) On Linux, the default stack size of a user process is typically 8MB (see ulimit -s). Defining large arrays or other large local variables inside a function can overflow the stack and cause a segmentation fault. The same applies to kernel threads, whose kernel stacks are much smaller and fixed in size (for example 8KB on 32-bit x86 and 16KB on x86-64), so care must be taken to prevent stack overflow there as well.
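
A simple way to avoid this kind of stack overflow is to place large buffers on the heap instead. The sketch below is illustrative only: SAMPLE_COUNT and sum_large_buffer are hypothetical, and the 16MB figure assumes a 4-byte int.

```c
#include <stdlib.h>

/* Hypothetical size: 4 Mi elements, about 16MB with a 4-byte int, which
 * would not fit in a typical 8MB stack if declared as a local array. */
#define SAMPLE_COUNT (4u * 1024u * 1024u)

/* Fills a large buffer with ones and returns the sum, or -1 if the
 * allocation fails. The buffer lives on the heap, not on the stack. */
long sum_large_buffer(void)
{
    int *samples = malloc(SAMPLE_COUNT * sizeof *samples);
    long sum = 0;
    size_t i;

    if (samples == NULL) {
        return -1;
    }
    for (i = 0; i < SAMPLE_COUNT; i++) {
        samples[i] = 1;
    }
    for (i = 0; i < SAMPLE_COUNT; i++) {
        sum += samples[i];
    }

    free(samples);   /* heap memory must be released explicitly */
    return sum;
}
```

The trade-off is the one this article is about: the heap buffer no longer threatens the stack, but it now has to be released on every return path.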

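(3) Freeing the same malloc() block more than once is undefined behavior. It typically corrupts the allocator's bookkeeping and can trigger a segmentation fault; glibc often detects the double free and aborts the process instead.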
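
A widely used defensive idiom is to reset a pointer to NULL immediately after freeing it, so that a second release becomes a harmless free(NULL). The macro FREE_AND_CLEAR and the function release_twice_safely below are hypothetical names for this sketch.

```c
#include <stdlib.h>

/* Hypothetical macro: frees the block and clears the pointer variable. */
#define FREE_AND_CLEAR(p) \
    do {                  \
        free(p);          \
        (p) = NULL;       \
    } while (0)

/* Releases the same pointer variable twice without a double free.
 * Returns 0 if the pointer ended up NULL, -1 if allocation failed. */
int release_twice_safely(void)
{
    char *buf = malloc(32);

    if (buf == NULL) {
        return -1;
    }
    FREE_AND_CLEAR(buf);
    FREE_AND_CLEAR(buf);   /* second call is free(NULL): a defined no-op */

    return (buf == NULL) ? 0 : 1;
}
```

This only protects the one pointer variable that is cleared; other copies of the same address elsewhere in the program remain dangling and still need review.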

Because C performs very few checks on memory access, most invalid memory operations compile without errors or warnings, which makes segmentation faults difficult to locate at run time. In such cases we can use third-party tools to locate them quickly.

6. Using Core Dump to Debug Segmentation Faults

In a Linux environment, various exceptions or bugs can cause an application to exit or be terminated. When this happens and core dumps are enabled, the system saves the program’s memory, register state, stack pointer, memory management information, and function call stack information into a core file.

Once core dumps are enabled (for example with ulimit -c unlimited), running a.out until it hits a segmentation fault produces a core file, in the current directory unless /proc/sys/kernel/core_pattern says otherwise. We can then load the core file into gdb (gdb a.out core) to locate where the program went wrong. At the GDB prompt, bt prints the call stack; if the program was compiled with -g, it also shows the source line of the fault.
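
On a board where the shell environment cannot easily be changed, the limit can also be raised from inside the program. The sketch below uses the POSIX getrlimit and setrlimit calls; the helper name enable_core_dumps is hypothetical.

```c
#include <sys/resource.h>

/* Hypothetical helper: raises the soft core-file size limit to the hard
 * limit for the current process. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return -1;
    }
    lim.rlim_cur = lim.rlim_max;   /* an unprivileged process may go this far */
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        return -1;
    }
    return 0;
}
```

Calling it early in main has the same effect as ulimit -c unlimited only if the hard limit is itself unlimited; if the hard limit is 0, no core file will be written.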

7. Memory Corruption Detection

Consider two dynamically allocated blocks of memory. If a write to one block overflows, the excess data may land in the other buffer. The system detects no error and gives no prompt at the time of the write; the problem may surface only when a buffer is released, or as inexplicable misbehavior caused by the overwritten data in the other buffer.

Memory Corruption Monitoring: mprotect

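mprotect() is a system call in the Linux environment that changes the access protection of a range of memory pages, which makes it useful for guarding memory against illegal writes. Once a region has been marked read-only (or inaccessible), any illegal access to it raises SIGSEGV at the offending instruction; by default this immediately terminates the current process and, if core dumps are enabled, generates a core dump.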
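
The following sketch shows the mechanism on Linux with glibc. It maps one page, makes it read-only with mprotect, and attempts a write. The function name write_to_read_only_page_faults is hypothetical, and the SIGSEGV handler that jumps back with siglongjmp is installed only so the demonstration can report the fault instead of dying.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* Demonstration-only handler: jump back to the sigsetjmp point. */
static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);
}

/* Returns 1 if the write to the protected page faulted, 0 if it went
 * through, -1 if the setup failed. */
int write_to_read_only_page_faults(void)
{
    long page = sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    int faulted;
    char *buf;

    if (page <= 0) {
        return -1;
    }
    /* mprotect needs a page-aligned region; mmap provides one. */
    buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return -1;
    }
    buf[0] = 'a';                       /* allowed: page is still writable */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (mprotect(buf, (size_t)page, PROT_READ) != 0 ||
        sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    if (sigsetjmp(fault_env, 1) == 0) {
        *(volatile char *)buf = 'b';    /* illegal write to a read-only page */
        faulted = 0;
    } else {
        faulted = 1;                    /* reached through the handler */
    }

    sigaction(SIGSEGV, &old, NULL);     /* restore the previous handler */
    munmap(buf, (size_t)page);
    return faulted;
}
```

When hunting a real corruption bug, the handler would normally be left out so that the default action applies: the process stops at the offending instruction and the core dump points directly at the code that performed the write.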

Memory Detection Tool: Valgrind

Valgrind is a suite of tools. One of them, the memory checker Memcheck, can detect memory overwrites and other out-of-bounds accesses, use of uninitialized values, double frees, and memory leaks.

8. Achieving High-Performance C Programs

Lower memory usage and shorter runtime lead to better overall program performance.
Making good use of the CPU cache, which exploits locality of reference, allows programs with good spatial and temporal locality to access data more efficiently.
Code inlining reduces the overhead of function call instructions by replacing a function call with the body of the called function.
The restrict keyword promises the compiler that a pointer is not aliased, which gives the compiler more opportunities to optimize the code.
Eliminating unnecessary memory references, for example by accumulating in a local variable instead of through a pointer, further improves efficiency by reducing traffic between the program and memory.
Loop unrolling allows us to further utilize the CPU’s instruction-level parallelism, making loop bodies execute faster.
Preferring conditional move instructions can, in specific scenarios, avoid the CPU cycles wasted on mispredicted conditional branches.
Using higher compiler optimization levels (such as -O2 or -O3) lets the compiler apply more advanced optimizations to our code without source changes.
Tail call optimization lets the compiler turn a tail-recursive call into a loop, avoiding the creation and destruction of a stack frame for each call and making the recursion faster.
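
Several of these techniques can be illustrated in one short sketch. All function names below are hypothetical, and whether a given compiler actually inlines, unrolls, or eliminates the tail call depends on the compiler and the optimization level, so any gain should be measured rather than assumed.

```c
#include <stddef.h>

/* Baseline: writes through *out on every iteration. Because out might
 * alias data, the compiler has to keep going back to memory. */
void sum_baseline(const int *data, size_t n, long *out)
{
    size_t i;

    *out = 0;
    for (i = 0; i < n; i++) {
        *out += data[i];
    }
}

/* restrict promises no aliasing; local accumulators remove the repeated
 * memory references; the loop body is unrolled four times by hand. */
void sum_fast(const int *restrict data, size_t n, long *restrict out)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    for (; i < n; i++) {               /* leftover elements */
        s0 += data[i];
    }
    *out = s0 + s1 + s2 + s3;          /* single store at the end */
}

/* Tail-recursive form: the recursive call is the last action, so an
 * optimizing compiler may turn it into a loop. */
long sum_tail(const int *data, size_t n, long acc)
{
    if (n == 0) {
        return acc;
    }
    return sum_tail(data + 1, n - 1, acc + data[0]);
}
```

All three functions compute the same sum; they differ only in how much freedom they give the compiler and the CPU.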