1: Memory Management in C
In C, memory is mainly divided into the following 5 storage areas:
(1) Stack: Local variables (including function parameters) inside functions are allocated and released by the compiler. When the function ends, stack variables become invalid.
(2) Heap: Allocated by the programmer using malloc/calloc/realloc, released with free. If the programmer forgets to free it, it will cause memory leaks, and this memory will be reclaimed by the OS when the program ends.
(3) Global/Static Area: Storage area for global and static variables. Its layout is fixed when the program is built, and it exists for the entire run of the program. In C, initialized global and static variables are placed in one region, and uninitialized ones in an adjacent region (commonly called .data and .bss); in C++ no such distinction is made, because the compiler automatically initializes these variables. Since global variables occupy memory for the whole run and are hard to maintain, it is recommended to use them sparingly. Released when the program ends.
(4) C-style string constant storage area: A place specifically for storing string constants, released when the program ends.
(5) Program code area: Area where program binary code is stored.
2: Memory Management in C++
C++ is similar to C, with some differences. Memory is mainly divided into the following 5 storage areas:
(1) Stack: Local variables (including function parameters) inside functions are allocated and released by the compiler. When the function ends, stack variables become invalid.
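(2) Free Store: The difference from C is here. This memory is allocated with new and released with delete (for single objects) or delete[] (for arrays).
(3) Heap: Allocated by the programmer using malloc/calloc/realloc, released with free. If the programmer forgets to free it, the memory leaks, and it is reclaimed by the OS only when the program ends. In everyday usage, and in the rest of this article, both areas are simply called "the heap", and many implementations build new on top of malloc. The two interfaces must still never be mixed: memory obtained with new is released with delete, and memory obtained with malloc is released with free.
(4) Global/Static Area: Storage area for global and static variables. Its layout is fixed when the program is built, and it exists for the entire run of the program. In C++, global and static variables without an explicit initializer are zero-initialized automatically, so no distinction is made here between initialized and uninitialized variables. Since global variables occupy memory for the whole run and are hard to maintain, it is recommended to use them sparingly. Released when the program ends.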
(5) Constant Storage: This is a special storage area, specifically for storing constants that cannot be modified.
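The automatic initialization of the global/static area can be checked with a few lines. This is only a minimal sketch: the names g_counter, g_name, next_id and check_zero_initialized are hypothetical and exist purely for illustration.

```cpp
#include <cassert>

int g_counter;         // global/static area, no initializer: zero-initialized
const char *g_name;    // global/static area: starts out as a null pointer

// Returns 1, 2, 3, ... on successive calls. `calls` lives in the
// global/static area, so it keeps its value between calls.
int next_id() {
    static int calls;  // zero-initialized once, not on every call
    return ++calls;
}

// Verifies that the variables above were zero-initialized.
void check_zero_initialized() {
    assert(g_counter == 0);
    assert(g_name == nullptr);
}
```

A local (stack) variable declared without an initializer gets no such guarantee; reading it before assigning a value is undefined behavior.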
3: Differences between Heap and Stack
3.1 Stack
Specifically, modern computers (von Neumann machines with serial execution) support the stack data structure directly at the machine-code level. There are dedicated registers that locate the stack (on x86, SS is the stack segment register and SP/ESP holds the current stack pointer), and there are dedicated machine instructions for pushing and popping data (PUSH and POP in assembly).
This mechanism is highly efficient, but the data it supports is limited: generally integers, pointers, floating-point numbers, and other data types directly supported by the system. It does not directly support other data structures (a custom stack structure can be defined to support multiple data types). Because of this efficiency, the stack is used very frequently in programs. Subroutine calls are implemented directly on the stack: the machine's call instruction implicitly pushes the return address onto the stack and then jumps to the subroutine address, while the subroutine's ret instruction implicitly pops the return address from the stack and jumps to it.
Automatic variables in C/C++ functions live directly on the stack, which is why they become invalid when the function returns. For the same reason, a function must never return a pointer or reference to its own stack memory: the caller would receive a dangling pointer, and using it is undefined behavior.
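The call/ret behavior described above can be modeled in a few lines. This is a toy sketch, not real machine code: ToyCpu, its integer "addresses", and the assumption that the return address is pc + 1 are all invented for illustration.

```cpp
#include <cassert>
#include <vector>

// Toy model of a CPU that only knows call and ret.
struct ToyCpu {
    int pc = 0;               // program counter (a made-up integer address)
    std::vector<int> stack;   // the hardware stack, holding return addresses

    // call: push the address of the next instruction, then jump.
    void call(int target) {
        stack.push_back(pc + 1);
        pc = target;
    }

    // ret: pop the most recently pushed return address and jump to it.
    void ret() {
        assert(!stack.empty());
        pc = stack.back();
        stack.pop_back();
    }
};

// Runs two nested calls and returns the final program counter.
int run_nested_calls() {
    ToyCpu cpu;
    cpu.pc = 10;
    cpu.call(100);   // stack: [11],      pc = 100
    cpu.call(200);   // stack: [11, 101], pc = 200
    cpu.ret();       // stack: [11],      pc = 101
    cpu.ret();       // stack: [],        pc = 11
    return cpu.pc;
}
```

Because return addresses are popped in the reverse order they were pushed, nested calls unwind correctly without any bookkeeping beyond the stack itself.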
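Two common ways to get data out of a function without handing back stack memory are sketched below. The function names greeting_by_value and greeting_into are hypothetical; the broken variant is left commented out so that it is never called.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// BROKEN (for contrast only): buf is an automatic variable, so the returned
// pointer would dangle as soon as the function returns.
// const char *greeting_dangling() {
//     char buf[] = "hello";
//     return buf;
// }

// Safe: the data is returned by value, so the caller owns its own copy.
std::string greeting_by_value() {
    char buf[] = "hello";       // lives on this function's stack frame
    return std::string(buf);    // copied out before the frame is destroyed
}

// Safe: the caller supplies the storage, so nothing outlives its frame.
// Copies at most cap - 1 characters and always terminates the result.
void greeting_into(char *out, std::size_t cap) {
    const char src[] = "hello";
    std::size_t i = 0;
    for (; i + 1 < cap && src[i] != '\0'; ++i) {
        out[i] = src[i];
    }
    if (cap > 0) {
        out[i] = '\0';
    }
}
```

Returning by value is usually the simpler choice in C++; the caller-supplied buffer is the traditional C approach.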
3.2 Heap
Unlike the stack, the heap data structure is not supported directly by the system (neither the machine hardware nor the operating system); it is provided by library functions. The basic functions malloc/calloc/realloc/free maintain an internal heap data structure (in C++, new/delete are additional interfaces to such a structure).
When the program calls these functions to obtain new memory, they first try to find available space in the internal heap (common allocation algorithms include first fit, next fit, best fit, and worst fit). If no suitable space is found, they use system calls to enlarge the program's data segment dynamically; the newly obtained space is first organized into the internal heap and then returned to the caller in an appropriate form. When the program releases memory, the space is returned to the internal heap structure, where it may be processed further (for example, adjacent free blocks may be merged into a larger free block) to better serve the next allocation request. This complex mechanism is essentially a buffer pool (cache) for memory allocation, and the reasons for using it include:
(1) System calls may not support memory allocation of arbitrary size. Some systems only support requests of a fixed size or multiples of it (for example, whole pages), which wastes memory when there are many small allocations.
(2) System calls for memory allocation can be expensive, because they may involve switching between user mode and kernel mode.
(3) Without management, a large number of complex allocation and release operations easily leads to memory fragmentation.
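The first-fit search and the merging of free blocks described above can be sketched with a toy pool. This is a simplified illustration under stated assumptions, not how any real malloc is implemented: FirstFitPool and kNoBlock are hypothetical names, the pool hands out offsets instead of real pointers, and it never asks the OS for more memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kNoBlock = static_cast<std::size_t>(-1);

// Toy first-fit allocator over a fixed arena of `capacity` bytes.
class FirstFitPool {
    struct Block { std::size_t offset; std::size_t size; bool free; };
    std::vector<Block> blocks_;   // always sorted by offset, covers the arena

public:
    explicit FirstFitPool(std::size_t capacity) {
        blocks_.push_back({0, capacity, true});
    }

    // First fit: take the first free block that is large enough and split
    // off the unused tail as a new free block.
    std::size_t allocate(std::size_t size) {
        if (size == 0) return kNoBlock;
        for (std::size_t i = 0; i < blocks_.size(); ++i) {
            if (!blocks_[i].free || blocks_[i].size < size) continue;
            if (blocks_[i].size > size) {
                Block tail{blocks_[i].offset + size, blocks_[i].size - size, true};
                blocks_.insert(blocks_.begin() + static_cast<std::ptrdiff_t>(i + 1), tail);
            }
            blocks_[i].size = size;
            blocks_[i].free = false;
            return blocks_[i].offset;
        }
        return kNoBlock;  // a real allocator would request more memory here
    }

    // Marks the block free and merges it with free neighbours (coalescing).
    // Returns false for an unknown offset or a double release.
    bool release(std::size_t offset) {
        for (std::size_t i = 0; i < blocks_.size(); ++i) {
            if (blocks_[i].offset != offset || blocks_[i].free) continue;
            blocks_[i].free = true;
            if (i + 1 < blocks_.size() && blocks_[i + 1].free) {
                blocks_[i].size += blocks_[i + 1].size;
                blocks_.erase(blocks_.begin() + static_cast<std::ptrdiff_t>(i + 1));
            }
            if (i > 0 && blocks_[i - 1].free) {
                blocks_[i - 1].size += blocks_[i].size;
                blocks_.erase(blocks_.begin() + static_cast<std::ptrdiff_t>(i));
            }
            return true;
        }
        return false;
    }

    // Size of the largest contiguous free block.
    std::size_t largest_free() const {
        std::size_t best = 0;
        for (const Block &b : blocks_) {
            if (b.free && b.size > best) best = b.size;
        }
        return best;
    }
};
```

With a 100-byte pool holding blocks of 40, 20 and 40 bytes, releasing the two outer blocks leaves 80 bytes free but no contiguous run longer than 40, so a 60-byte request fails. Releasing the middle block lets the pool merge everything back into one 100-byte block.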
3.3 Comparison of Stack and Heap
From the discussion above, the stack and the heap differ as follows:
(1) The stack is a facility provided by the system. It is fast and efficient, but limited, and the data it can hold is not flexible;
The heap is a facility provided by the library. It is flexible and convenient and suits a wide range of data, but it is somewhat less efficient.
(2) The stack is a system data structure, and each process/thread has exactly one of its own;
The heap is an internal data structure of the library and is not necessarily unique. Memory allocated from one heap must not be released to, or managed by, a different heap.
(3) Stack space can be allocated statically or dynamically. Static allocation is generally done by the compiler and released automatically; dynamic allocation on the stack is discouraged;
Heap allocation is always dynamic. Although all data space is returned to the system when the program ends, matching every allocation with exactly one release is a basic element of good programming.
(4) Fragmentation issues
For the heap, frequent new/delete operations tend to leave the free memory discontinuous, producing a large amount of fragmentation and reducing program efficiency. For the stack this problem does not exist, because the stack is a last-in-first-out (LIFO) structure: memory is always released in the reverse order of allocation, so no holes can form in the middle.
(5) Growth direction
On most common platforms (x86, for example), the heap grows upwards, towards increasing memory addresses, while the stack grows downwards, towards decreasing memory addresses.
(6) Allocation methods
Heap allocations are all dynamic, with no static allocation of heaps;
The stack has two allocation methods: static allocation and dynamic allocation. Static allocation is done by the compiler, as in the allocation of local variables. Dynamic allocation is done with the alloca function (a non-standard extension offered by many compilers). Unlike heap memory, dynamically allocated stack memory is released automatically when the function returns and needs no manual release.
(7) Allocation efficiency
The stack is a data structure provided by the machine itself and supported at the lowest level of the computer: dedicated registers hold the stack address, and dedicated machine instructions perform push and pop operations. All of this makes the stack highly efficient.
The heap is provided by the C/C++ library. Its mechanism is more complex, it involves a choice of allocation algorithm, and it is prone to memory fragmentation, so it is considerably less efficient than the stack.
4: Specific Instance Analysis
Example 1
Look at the following small segment of C code and carefully understand the various memory allocation mechanisms.
#include <stdlib.h>
#include <string.h>

int a = 0;    // Global initialized area; a's value is 0
char *p1;     // Global uninitialized area (zero-initialized to NULL at startup)

int main()
{
    int b;                // b is allocated on the stack
    char s[] = "abc";     // s is a char[4] array on the stack; "abc\0" is copied into it at runtime and destroyed when the function ends
    char *p2;             // p2 is allocated on the stack, uninitialized
    char *p3 = "123456";  // p3 is on the stack; it points into the string constant storage area, at an address fixed at compile time
    static int c = 0;     // c is in the global (static) initialized area; it keeps its value across calls

    p1 = (char *)malloc(10);  // p1 is in the global uninitialized area; it points to 10 bytes allocated on the heap
    p2 = (char *)malloc(20);  // p2 points to 20 bytes allocated on the heap
    strcpy(p1, "123456");     // "123456" is in the string constant storage area; the compiler may merge it with the "123456" that p3 points to

    free(p2);  // every malloc is matched by exactly one free
    free(p1);
    return 0;
}
Example 2
Look at the following small segment of code and experience the difference between heap and stack:
int foo()
{
    // Other code
    int *p = new int[5];
    // Other code
    return 0;
}
The statement int *p = new int[5]; involves both the heap and the stack. The new keyword allocates a block of heap memory, while the pointer p itself occupies stack memory (4 bytes in a 32-bit build such as the one below, 8 bytes in a typical 64-bit build). In other words, a pointer p stored on the stack points to a block of memory on the heap. The program first determines the size of the memory to allocate on the heap, then calls operator new[] to allocate it, and finally stores the returned address on the stack. In the disassembly, push 14h passes that size, 20 bytes (5 ints of 4 bytes each), to operator new[]. The assembly code is:
int foo()
{
008C1520  push  ebp
008C1521  mov   ebp,esp
008C1523  sub   esp,0D8h
008C1529  push  ebx
008C152A  push  esi
008C152B  push  edi
008C152C  lea   edi,[ebp-0D8h]
008C1532  mov   ecx,36h
008C1537  mov   eax,0CCCCCCCCh
008C153C  rep stos dword ptr es:[edi]
    int *p = new int[5];
008C153E  push  14h
008C1540  call  operator new[] (8C1258h)
008C1545  add   esp,4
008C1548  mov   dword ptr [ebp-0D4h],eax
008C154E  mov   eax,dword ptr [ebp-0D4h]
008C1554  mov   dword ptr [p],eax
    return 0;
008C1557  xor   eax,eax
}
008C1559  pop   edi
008C155A  pop   esi
008C155B  pop   ebx
008C155C  add   esp,0D8h
008C1562  cmp   ebp,esp
008C1564  call  @ILT+395(__RTC_CheckEsp) (8C1190h)
008C1569  mov   esp,ebp
008C156B  pop   ebp
008C156C  ret
To release this memory we must use delete[] p, which tells the compiler that we are deleting an array. As written, foo never does so, and the 20 bytes leak on every call.
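For comparison, here is a sketch of three ways to handle the same five-int array: releasing it manually, letting std::unique_ptr release it, and letting std::vector own it. The function names sum_manual, sum_unique_ptr and sum_vector are hypothetical; filling the array with 1..5 and summing it is just a stand-in for the "other code" in the example.

```cpp
#include <cassert>
#include <memory>
#include <numeric>
#include <vector>

// Manual version: every new[] is matched by exactly one delete[].
int sum_manual() {
    int *p = new int[5];
    for (int i = 0; i < 5; ++i) p[i] = i + 1;
    int total = 0;
    for (int i = 0; i < 5; ++i) total += p[i];
    delete[] p;   // array form; plain `delete p` would be undefined behavior
    return total;
}

// RAII version: unique_ptr<int[]> calls delete[] when p goes out of scope,
// including on early return or exception.
int sum_unique_ptr() {
    std::unique_ptr<int[]> p(new int[5]);
    for (int i = 0; i < 5; ++i) p[i] = i + 1;
    int total = 0;
    for (int i = 0; i < 5; ++i) total += p[i];
    return total;
}

// Container version: std::vector owns the heap block and frees it itself.
int sum_vector() {
    std::vector<int> v(5);
    std::iota(v.begin(), v.end(), 1);   // fills v with 1, 2, 3, 4, 5
    return std::accumulate(v.begin(), v.end(), 0);
}
```

In all three versions the pointer or container object itself sits on the stack while the five ints sit on the heap; the only difference is who is responsible for the release.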
Example 3
Look at the following small segment of code and try to find the error:
#include <iostream>
using namespace std;

int main()
{
    char a[] = "Hello";   // Allocated on stack
    a[0] = 'X';
    cout << a << endl;
    char *p = "World";    // Points to address allocated in string constant storage area
    p[0] = 'X';
    cout << p << endl;
    return 0;
}
Did you find the problem? The character array a has a capacity of 6 characters, with content "Hello\0". The content of a can be changed, as in a[0] = 'X', because the array is allocated on the stack and its content is filled in at runtime. The pointer p, however, points to the string "World" in the string constant storage area, and the content "World\0" must not be modified. Older compilers accept the statement p[0] = 'X' without complaint (since C++11, assigning a string literal to a plain char * is itself ill-formed, and most compilers at least warn about it), but at runtime the write is undefined behavior and typically causes an "access violation", that is, an illegal memory access. Declaring the pointer as const char *p turns the mistake into a compile-time error.
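A corrected variant might look like the sketch below: the literal is referenced through a const pointer and copied into writable memory before it is modified. The function names patched_copy and patched_string are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Copies the literal into a stack array and modifies the copy.
std::string patched_copy() {
    const char *p = "World";   // const: p[0] = 'X' would no longer compile
    char buf[6];               // 5 characters plus the terminating '\0'
    std::strcpy(buf, p);       // copy the literal into memory we own
    buf[0] = 'X';              // fine: buf is modifiable stack memory
    return std::string(buf);
}

// Same idea with std::string, which manages its own writable copy.
std::string patched_string() {
    std::string s = "World";
    s[0] = 'X';
    return s;
}
```

Both functions leave the string literal itself untouched and only change their private copy.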
The following function variants should also be clearly understood:
char *GetString1(void)
{
    // Wrong: p is an array on the stack and is destroyed when the function
    // returns. The caller receives a dangling pointer, so the result is
    // undefined (garbage in practice).
    char p[] = "hello,world";
    return p;
}

const char *GetString2(void)
{
    // Result: hello,world. p points into the string constant area, which
    // lives for the whole program. The pointer is const because the
    // literal must not be modified.
    const char *p = "hello,world";
    return p;
}

char *GetString3(void)
{
    // Points to memory allocated on the heap; the caller must release it
    // with free.
    char *p = (char *)malloc(20);
    return p;
}

char *GetString4(void)
{
    // Points to memory allocated on the heap; p itself is on the stack and
    // the space it points to is on the heap. The caller must release it
    // with delete[].
    char *p = new char[20];
    return p;
}
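On the caller's side, the release function has to match the allocation function. The sketch below shows one way to make that pairing automatic. It is an illustration only: make_with_malloc, make_with_new, FreeDeleter and use_both are hypothetical names, and the factories fill their buffers with "hello,world" so there is something to read.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Heap memory from malloc: must be released with free.
char *make_with_malloc() {
    char *p = static_cast<char *>(std::malloc(20));
    if (p != nullptr) std::strcpy(p, "hello,world");   // 12 bytes with '\0'
    return p;
}

// Heap memory from new[]: must be released with delete[].
char *make_with_new() {
    char *p = new char[20];
    std::strcpy(p, "hello,world");
    return p;
}

// Deleter that pairs malloc'ed memory with free.
struct FreeDeleter {
    void operator()(char *p) const { std::free(p); }
};

// Takes ownership of both buffers; each is released with the matching
// function when the smart pointers go out of scope.
std::string use_both() {
    std::unique_ptr<char, FreeDeleter> a(make_with_malloc());   // free
    std::unique_ptr<char[]> b(make_with_new());                 // delete[]
    if (!a) return std::string();                               // malloc failed
    return std::string(a.get()) + "|" + b.get();
}
```

Wrapping the raw pointer immediately after the call means the caller cannot forget the release or pick the wrong one.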