Source | Authorized repost from Xuanyuan’s Programming Universe (ID: xuanyuancoding)
Author | Xuanyuan Wind
One weekend, someone in my "reverse engineering from scratch" study group posted a question about C:

First, think about what this code prints when it runs.
I encountered this question a few years ago in a Huawei interview.
The code is very short; the main function defines a pointer variable p, then passes its address to the fun function. The fun function uses the malloc function to allocate 100 bytes of space on the heap and assigns the address of this memory to p. Back in the main function, it immediately calls the free function to release the just allocated memory.
Then there is an if statement that checks if the pointer p is not NULL, and if so, it uses strcpy to copy a “hello world” string to the memory pointed to by p, and then calls printf to print it out.
At this point, you should recognize a very typical dangling pointer problem. Note that this is not, as some people think, a wild pointer: a wild pointer is a pointer variable that was never initialized, while a dangling pointer, as here, still holds the address of memory that has already been freed because it was not set to NULL in time.
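Reconstructed from the description above, the code would look roughly like this. It is a sketch, not the original listing: the `char **` parameter and the `char *` buffer type are my assumptions, and the body of `main` is placed in a function with the hypothetical name `question_main` so the snippet can sit beside other code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocates 100 bytes on the heap and stores the address in the caller's pointer. */
void fun(char **p)
{
    *p = (char *)malloc(100);
}

/* Body of the question's main(). Calling it is undefined behavior,
   which is exactly the point of the question. */
void question_main(void)
{
    char *p = NULL;

    fun(&p);
    free(p);                      /* the block goes back to the heap ...     */
    if (p != NULL) {              /* ... but p still holds the old address   */
        strcpy(p, "hello world"); /* write through a dangling pointer        */
        printf("%s\n", p);        /* read through a dangling pointer         */
    }
}
```

Only `fun` is safe to call on its own; everything after `free(p)` in `question_main` depends on what the runtime happens to do with the freed block.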

In C, misusing pointers leads to exactly this kind of problem, which is one of the reasons many people find C pointers hard to deal with.
That is why, from the very first C lessons, people are told that a newly defined pointer must be initialized and a freed pointer must be set to NULL. Many C codebases therefore avoid calling free directly and wrap it in a macro that does both steps; programmers release memory through the macro, which to some extent avoids dangling pointers caused by careless habits.
#define FREE(p) do { \
    free(p);         \
    (p) = NULL;      \
} while (0)
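One detail matters when writing such a macro: if it expands to two separate statements, it misbehaves after an unbraced `if` (only the first statement is guarded, and an `else` no longer compiles). A common idiom is to wrap the body in `do { ... } while (0)`. Here is a minimal sketch; `SAFE_FREE` and `release_if` are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Expands to exactly one statement, so it is safe in an unbraced if/else. */
#define SAFE_FREE(p) do { free(p); (p) = NULL; } while (0)

/* Allocates a buffer, releases it only if `condition` is nonzero, and
   returns 1 if the pointer ended up NULL, 0 if not, -1 if malloc failed. */
int release_if(int condition)
{
    char *buf = malloc(100);
    int is_null;

    if (buf == NULL)
        return -1;

    if (condition)
        SAFE_FREE(buf);   /* freed and reset to NULL in one statement */
    else
        buf[0] = '\0';    /* buffer is still live on this path */

    is_null = (buf == NULL);
    free(buf);            /* free(NULL) is a no-op, so this is safe either way */
    return is_null;
}
```

The macro only resets the one pointer variable it is given; any other copy of the same address is still dangling.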
In C++, to solve this problem, smart pointers were introduced, which encapsulate pointers in a C++ object, utilizing the automatic destruction feature of objects to accomplish the above task.
Returning to the question above: set aside whether malloc can fail, because in practically every case a 100-byte allocation succeeds without issues.
After the memory is freed, the pointer variable p has not been set to NULL and still holds the old address, so the condition of the following if statement is true and the program enters the if block. However, the memory p points to has already been reclaimed, so writing to it with strcpy is undefined behavior, and what actually happens is unpredictable.
If lucky, the string can be successfully copied and printed out as “hello world”; for example, when I ran it in VS2008 in Debug mode:

If unlucky, it will report an error and output nothing. For example, in the same VS2008, switching to Release mode:

Now guess again, at which line does the crash occur?
Is it during the strcpy data writing, or during the printf output?
The answer is that it crashes during printf. We can use the WinDbg debugger to debug the run and find that strcpy did not report an error and successfully copied the string:

By examining the call stack at the time of the crash, it actually crashed in the internal call chain of the printf function:

Why is this?
The explanation is this: although the memory has been released by calling free, the release happens only at the level of the C runtime library (free is a C library function). The runtime's heap algorithm takes the block back, and at the language level this memory must no longer be accessed.
At the operating system level, however, the memory is still accessible; it still sits in a readable and writable 4KB memory page. This is because the C heap allocator does not call a system-level function (such as VirtualFree) to actually release memory pages every time a block is freed; that would be far too heavy an operation.
The free here merely tells the C runtime library: I no longer need this memory, you can take it back and manage it as you see fit.
Therefore, when calling strcpy, it can copy normally.
But just because this memory can be written does not mean you may write to it freely. At the operating system level the page is readable and writable, so the write itself does not fault.
From the perspective of the C runtime library, though, the block at this address has already been reclaimed, and its contents may now hold data that the heap manager relies on. Writing there carelessly causes trouble.
As a result, the strcpy corrupts some of the heap's bookkeeping (such as internal pointers), and when printf is called later, the damage left behind is exposed.
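To make the idea concrete, here is a toy fixed-size allocator in which a free block keeps its free-list link inside the block itself. This is only a sketch of the general technique, not the actual algorithm of the VS2008 runtime or any other C library; all names (`Block`, `toy_malloc`, `toy_free`, `toy_head_link_is_sane`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  32
#define BLOCK_COUNT 4

/* While a block is free, its first bytes hold the free-list link.
   While it is allocated, the same bytes hold user data. */
typedef union Block {
    union Block *next;
    char payload[BLOCK_SIZE];
} Block;

static Block pool[BLOCK_COUNT];
static Block *free_list;

/* Chains every block of the pool into the free list. */
void toy_init(void)
{
    free_list = NULL;
    for (int i = BLOCK_COUNT - 1; i >= 0; i--) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Pops the first free block, or returns NULL if none is left. */
void *toy_malloc(void)
{
    Block *b = free_list;
    if (b != NULL)
        free_list = b->next;
    return b;
}

/* Pushes the block back; the link is written into the block itself. */
void toy_free(void *ptr)
{
    Block *b = ptr;
    b->next = free_list;
    free_list = b;
}

/* Returns 1 if the link stored in the first free block is NULL or points
   into the pool, 0 if it has been overwritten with something else. */
int toy_head_link_is_sane(void)
{
    uintptr_t link, lo, hi;

    if (free_list == NULL)
        return 1;
    link = (uintptr_t)free_list->next;
    lo = (uintptr_t)&pool[0];
    hi = (uintptr_t)&pool[BLOCK_COUNT - 1];
    return link == 0 || (link >= lo && link <= hi);
}
```

If a caller keeps the pointer after `toy_free(p)` and then runs `strcpy(p, "hello world")`, the write succeeds, but the first bytes of the string replace the stored link. Nothing fails at that moment; the next `toy_malloc` calls are the ones that follow the bogus link, much like the later printf in the story above.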
However, if you replace printf with the MessageBox function, it will still pop up normally:

This is because MessageBox is a Win32 API function, and its call does not involve operations of the C language runtime library. The corruption of the C language heap has nothing to do with it.
However, after you click the pop-up message above, the program will still prompt you with an error. This is because after the main function returns, the program’s flow will again enter the territory of the C language runtime library, and the corruption of the heap memory will still be exposed at this time.
So why can the program run successfully in Debug mode?
This may be due to two reasons:
1. The methods of managing heap memory in the C language runtime library differ between Debug and Release modes. It is possible that the content written by strcpy did not corrupt some key data structures of the heap management algorithm.
2. It did corrupt, but the subsequent operations of the C language runtime library did not trigger this issue.
As for which specific reason, further research into the heap memory management algorithm of the C language runtime library, combined with debugging analysis, is needed to draw a conclusion.
Additionally, the same code also runs normally when compiled with default options on Linux:

So in summary, whether this code can work normally does not have a definite answer; it is related to different platforms and different compilation modes, and its running result is uncertain.
Use After Free Attack
Speaking of dangling pointers, let me extend a bit and look at the following code:

I first allocate 100 bytes for pointer p, fill the block with “hello, world”, print it, and then free the memory p points to.
Note that after freeing it, I again do not set p to NULL.
Next, I call malloc to allocate 100 bytes for pointer q and fill that block with “hello, xuanyuan”.
Here comes the fun part: I then print p, not q, and surprisingly it prints the string that was written through q.
Printing p twice gives two different outputs; why is that?
Debugging will reveal that now both pointers p and q point to the same memory address:

This is a consequence of how the C runtime library's heap allocator works: the 100 bytes that were just freed are handed out again to the new q, while p has not been set to NULL, so both p and q end up pointing to the same memory.
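Whether the freed block really is handed out again can be checked without touching freed memory, by remembering the old address as an integer. This is a minimal sketch; the function name `freed_block_is_reused` is hypothetical, and the result depends on the allocator, so reuse is likely for a same-size request but not guaranteed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a second 100-byte malloc returned the address that was just
   freed, 0 if it returned a different one, -1 if an allocation failed. */
int freed_block_is_reused(void)
{
    char *p = malloc(100);
    char *q;
    uintptr_t old_addr;
    int reused;

    if (p == NULL)
        return -1;
    old_addr = (uintptr_t)p;   /* remember the address as a plain integer */
    free(p);
    p = NULL;                  /* do not keep a dangling pointer around */

    q = malloc(100);
    if (q == NULL)
        return -1;
    reused = ((uintptr_t)q == old_addr);
    free(q);
    return reused;
}
```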
This behavior is often exploited in binary security attacks. There is a class of attack called Use After Free (UAF) that relies on exactly this trick.
Clearly, the memory is now owned by q, but p also points to it; what will happen?
Suppose p originally pointed to a structure that contains a function pointer, so that calling p->fun() invokes that function.
With the trick above, an attacker can get a fake structure placed in the same memory, with a function pointer that leads to malicious code, so that calling p->fun() through the stale p executes the malicious code!
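The effect can be simulated safely without a real heap. In the sketch below a single static slot stands in for the reused heap block, so nothing here is an actual exploit or real undefined behavior; `Handler`, `Slot`, `legit`, `planted` and `simulate_uaf_call` are hypothetical names, and `planted` merely stands in for attacker-chosen code.

```c
#include <string.h>

typedef struct Handler {
    int (*fun)(void);
} Handler;

static int legit(void)   { return 1; }  /* what the program meant to call */
static int planted(void) { return 2; }  /* stand-in for attacker code     */

/* One block of "heap": the same bytes viewed as a Handler or as raw data. */
typedef union Slot {
    Handler h;
    unsigned char raw[sizeof(Handler)];
} Slot;

static Slot slot;

/* Returns before * 10 + after, where each digit is the value returned
   by p->fun() before and after the block is "reused". */
int simulate_uaf_call(void)
{
    Handler *p = &slot.h;            /* plays the role of p = malloc(...)    */
    unsigned char *q;
    Handler fake = { planted };
    int before, after;

    p->fun = legit;
    before = p->fun();               /* calls legit()                        */

    /* "free(p)" happens here, but p is kept. The next "malloc" returns
       the same block as q, and the new owner fills it with its own data. */
    q = slot.raw;
    memcpy(q, &fake, sizeof fake);

    after = p->fun();                /* stale p now reaches planted()        */
    return before * 10 + after;
}
```

The call site `p->fun()` is identical both times; only the bytes behind p changed, which is why this kind of bug is so hard to spot by reading the calling code alone.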
A small pointer can have a complex story behind it!