Source: WeChat Official Account – Hongyang
Author: Chen Guanyou – Xiaomi
Native crashes have always been a pain point for major apps. This article goes very deep and is a generous contribution from Chen Guanyou of Xiaomi. Very little material on the topic is available online, and it is rare for someone proficient in native crash analysis to share their knowledge. Because of its depth, the article may be hard to follow; I can only grasp part of it myself, but it is well worth saving and rereading.
Now let’s start with the main content.
Common Types of Native Crashes
Signal | Code | Description
SIGSEGV | SEGV_MAPERR | Address not mapped in /proc/self/maps
SIGSEGV | SEGV_ACCERR | No access permission
SIGSEGV | SEGV_MTESERR | MTE-specific type
SIGABRT | | Program exits actively, commonly by calling abort(), raise(), etc.
SIGILL | ILL_ILLOPC | Illegal opcode
SIGILL | ILL_ILLOPN | Illegal operand
SIGILL | ILL_ILLADR | Illegal addressing mode
SIGILL | ILL_ILLTRP | Illegal trap, such as a deliberate crash through __builtin_trap()
SIGILL | ILL_PRVOPC | Illegal privileged opcode
SIGILL | ILL_PRVREG | Illegal privileged register
SIGILL | ILL_COPROC | Coprocessor error
SIGILL | ILL_BADSTK | Internal stack error
SIGBUS | BUS_ADRALN | Access address not aligned
SIGBUS | BUS_ADRERR | Access to a non-existent physical address
SIGBUS | BUS_OBJERR | Object-specific hardware error
SIGFPE | FPE_INTDIV | Integer division by 0
SIGFPE | FPE_INTOVF | Integer overflow
SIGFPE | FPE_FLTDIV | Floating-point division by 0
SIGFPE | FPE_FLTOVF | Floating-point overflow
SIGFPE | FPE_FLTUND | Floating-point underflow
SIGFPE | FPE_FLTRES | Inexact floating-point result
SIGFPE | FPE_FLTINV | Invalid floating-point operation
SIGFPE | FPE_FLTSUB | Subscript out of range
Android Logs
When a program hits a native crash, Android writes the error to the crash log buffer, so the error report can be captured with adb logcat -b crash. The log itself provides limited information: only the error stack and the registers of the crashing thread.
--------- beginning of crash
06-07 01:53:32.465 12027 12027 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
06-07 01:53:32.465 12027 12027 F DEBUG : Revision: '0'
06-07 01:53:32.466 12027 12027 F DEBUG : ABI: 'arm64'
06-07 01:53:32.466 12027 12027 F DEBUG : Timestamp: 2022-06-07 01:53:32.033409857+0800
06-07 01:53:32.466 12027 12027 F DEBUG : Process uptime: 0s
06-07 01:53:32.466 12027 12027 F DEBUG : Cmdline: mediaserver64
06-07 01:53:32.466 12027 12027 F DEBUG : pid: 1139, tid: 11981, name: NPDecoder >>> mediaserver64 <<<
06-07 01:53:32.466 12027 12027 F DEBUG : uid: 1013
06-07 01:53:32.466 12027 12027 F DEBUG : tagged_addr_ctrl: 0000000000000001
06-07 01:53:32.466 12027 12027 F DEBUG : signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x7c02d886f0
06-07 01:53:32.466 12027 12027 F DEBUG : x0 79748c5e568e2ddc x1 0000007ca13c3618 x2 0000000000000000 x3 0000007ca1291000
06-07 01:53:32.466 12027 12027 F DEBUG : x4 0000000001909705 x5 0000000000000000 x6 0000007c02d88808 x7 b60625655bf0252f
06-07 01:53:32.467 12027 12027 F DEBUG : x8 0000000000000080 x9 0000007ca126fed7 x10 0000000000000006 x11 0000007bfd0a81fc
06-07 01:53:32.467 12027 12027 F DEBUG : x12 9ef8a95ca9649dbe x13 e44782d5ac38720e x14 0000007bfd0a8030 x15 0000001e56307b5c
06-07 01:53:32.467 12027 12027 F DEBUG : x16 0000007c95dfdb70 x17 0000007c9844f118 x18 0000007bfaa28000 x19 b400007c13c246d0
06-07 01:53:32.467 12027 12027 F DEBUG : x20 0000007c02d88730 x21 b400007c13c67c00 x22 0000000000000415 x23 0000007c02d89000
06-07 01:53:32.467 12027 12027 F DEBUG : x24 0000000000000002 x25 b400007c13c246d0 x26 b400007c13c67c00 x27 0000007c02d89000
06-07 01:53:32.467 12027 12027 F DEBUG : x28 0000007ca13c2c28 x29 0000007c02d886f0
06-07 01:53:32.467 12027 12027 F DEBUG : lr 0000007c02d886f0 sp 0000007c02d886d0 pc 0000007c02d886f0 pst 0000000080001000
06-07 01:53:32.467 12027 12027 F DEBUG : backtrace:
06-07 01:53:32.467 12027 12027 F DEBUG : #00 pc 00000000000f86f0 [anon:stack_and_tls:11981]
When the stack in the log is not enough for a detailed analysis, we also need the program's memory and register information. Android's error mechanism generates a tombstone file for this purpose, saved as /data/tombstones/tombstone_xx. On devices without root access, the tombstone can be retrieved with adb bugreport.
Tombstone
The tombstone file records the architecture of the crashed program (arm, arm64, etc.), the time of the crash, the program name, the error type, the process ID and thread ID, the registers at the crash site, the stack, the memory near the addresses held in some registers, the program's memory map (/proc/self/maps), FD information, and the logs the program printed around the time of the error.
ABI: 'arm64' [ arm64 program ]
Timestamp: 2022-06-07 01:53:32.033409857+0800 [ Timestamp of the error ]
Process uptime: 0s
Cmdline: mediaserver64 [ Program name ]
pid: 1139, tid: 11981, name: NPDecoder >>> mediaserver64 <<< [ Process ID, thread ID ]
uid: 1013
Error Type
signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x7c02d886f0 [ The error type is SIGSEGV, the subtype is SEGV_ACCERR, and the fault address is 0x7c02d886f0 ]
SIGSEGV is the most common type of native crash and is usually called a segmentation fault. Here the error means that an access-denied segmentation fault occurred at PC=0x7c02d886f0.
Register Information
x0 79748c5e568e2ddc x1 0000007ca13c3618 x2 0000000000000000 x3 0000007ca1291000
x4 0000000001909705 x5 0000000000000000 x6 0000007c02d88808 x7 b60625655bf0252f
x8 0000000000000080 x9 0000007ca126fed7 x10 0000000000000006 x11 0000007bfd0a81fc
x12 9ef8a95ca9649dbe x13 e44782d5ac38720e x14 0000007bfd0a8030 x15 0000001e56307b5c
x16 0000007c95dfdb70 x17 0000007c9844f118 x18 0000007bfaa28000 x19 b400007c13c246d0
x20 0000007c02d88730 x21 b400007c13c67c00 x22 0000000000000415 x23 0000007c02d89000
x24 0000000000000002 x25 b400007c13c246d0 x26 b400007c13c67c00 x27 0000007c02d89000
x28 0000007ca13c2c28 x29 0000007c02d886f0
lr 0000007c02d886f0 sp 0000007c02d886d0 pc 0000007c02d886f0 pst 0000000080001000
Stack Information
backtrace:
#00 pc 00000000000f86f0 [anon:stack_and_tls:11981]
Memory Information
The tombstone records the memory near the address currently held in each register, 0x100 bytes at a time. The size is set by the macro MEMORY_BYTES_TO_DUMP in system/core/debuggerd/libdebuggerd/utility.cpp.
In a case like this one, where the backtrace has only a single frame, the stack memory combined with the memory map below can help us recover the stack.
memory near x1 (/system/lib64/libstagefright.so):
0000007ca13c35f0 0000000000000000 0000000000000000 …………….
0000007ca13c3600 0000000000000000 0000000000000000 …………….
0000007ca13c3610 0000000000000000 0000007ca132326c ……..l22.|…
0000007ca13c3620 0000007ca1324008 0000007ca10552e4 .@2.|….R..|…
0000007ca13c3630 0000007ca10552e8 0000007ca10552ec .R..|….R..|…
0000007ca13c3640 0000007ca10552f4 0000007ca132402c .R..|…,@2.|…
0000007ca13c3650 0000000000000000 0000000000000000 …………….
0000007ca13c3660 0000000000000000 0000000000000000 …………….
0000007ca13c3670 0000000000000000 0000007ca134ea84 ……….4.|…
0000007ca13c3680 0000007ca134ecec 0000007ca10552e4 ..4.|….R..|…
0000007ca13c3690 0000007ca10552e8 0000007ca10552ec .R..|….R..|…
0000007ca13c36a0 0000007ca10552f4 0000007ca134ed10 .R..|…..4.|…
0000007ca13c36b0 0000000000000000 0000000000000000 …………….
0000007ca13c36c0 0000000000000000 0000000000000000 …………….
0000007ca13c36d0 0000000000000000 0000007ca135d02c ……..,.5.|…
0000007ca13c36e0 0000007ca135d4b8 0000007ca10552e4 ..5.|….R..|…
memory near x29 ([anon:stack_and_tls:11981]):
0000007c02d886d0 b400007c13c246d0 0000000001909705 .F..|……….. 【SP = 0x0000007c02d886d0】
0000007c02d886e0 0000007c02d88700 6f2ab3b40fa2f8ef ….|………o
0000007c02d886f0 0000007c02d88750 0000007ca133f8e0 P…|…..3.|… 【x29 = 0x0000007c02d886f0】
0000007c02d88700 0000000000000002 0000000000000000 …………….
0000007c02d88710 0000000000000415 0000000001909705 …………….
0000007c02d88720 0000000000000000 0000007c02d88808 …………|…
0000007c02d88730 b400007c13c67c00 0000000000000000 .|..|………..
0000007c02d88740 0000007c02d89000 6f2ab3b40fa2f8ef ….|………o
0000007c02d88750 0000007c02d88830 0000007ca796ee7c 0…|…|…|…
0000007c02d88760 0000007ca79f3dd8 0000007ca79edb80 .=..|…….|…
0000007c02d88770 0000007c02d89000 b400007c13c04680 ….|….F..|…
0000007c02d88780 0000000000000000 0000000000000002 …………….
0000007c02d88790 b400007c13c67c00 0000000000000000 .|..|………..
0000007c02d887a0 0000000000000000 b400007c13c7c100 …………|…
memory near lr ([anon:stack_and_tls:11981]):
0000007c02d886d0 b400007c13c246d0 0000000001909705 .F..|………..
0000007c02d886e0 0000007c02d88700 6f2ab3b40fa2f8ef ….|………o
0000007c02d886f0 0000007c02d88750 0000007ca133f8e0 P…|…..3.|…
0000007c02d88700 0000000000000002 0000000000000000 …………….
0000007c02d88710 0000000000000415 0000000001909705 …………….
0000007c02d88720 0000000000000000 0000007c02d88808 …………|…
0000007c02d88730 b400007c13c67c00 0000000000000000 .|..|………..
0000007c02d88740 0000007c02d89000 6f2ab3b40fa2f8ef ….|………o
0000007c02d88750 0000007c02d88830 0000007ca796ee7c 0…|…|…|…
0000007c02d88760 0000007ca79f3dd8 0000007ca79edb80 .=..|…….|…
0000007c02d88770 0000007c02d89000 b400007c13c04680 ….|….F..|…
Memory Mapping Table
memory map (1146 entries):
0000005f'fabc7000-0000005f'fabc7fff r-- 0 1000 /system/bin/mediaserver64
0000005f'fabc8000-0000005f'fabc9fff r-x 1000 2000 /system/bin/mediaserver64
0000005f'fabca000-0000005f'fabcafff r-- 3000 1000 /system/bin/mediaserver64
0000007b'e79a3000-0000007b'e7d93fff --- 0 3f1000
0000007c'a120e000-0000007c'a128efff r-- 0 81000 /system/lib64/libstagefright.so
0000007c'a128f000-0000007c'a13c0fff r-x 81000 132000 /system/lib64/libstagefright.so
0000007c'a13c1000-0000007c'a13cffff r-- 1b3000 f000 /system/lib64/libstagefright.so
0000007c'a13d0000-0000007c'a13d1fff rw- 1c1000 2000 /system/lib64/libstagefright.so
0000007c'a787c000-0000007c'a78f4fff r-- 0 79000 /system/lib64/libmediaplayerservice.so
0000007c'a78f5000-0000007c'a79ecfff r-x 79000 f8000 /system/lib64/libmediaplayerservice.so
0000007c'a79ed000-0000007c'a79f8fff r-- 171000 c000 /system/lib64/libmediaplayerservice.so
0000007c'a79f9000-0000007c'a79f9fff rw- 17c000 1000 /system/lib64/libmediaplayerservice.so
FD Information
open files:
fd 0: /dev/null (unowned)
fd 1: /dev/null (unowned)
fd 2: /dev/null (unowned)
fd 3: socket:[62562] (unowned)
fd 4: /dev/binderfs/binder (unowned)
fd 5: /dev/binderfs/hwbinder (unowned)
fd 6: /sys/kernel/tracing/trace_marker (unowned)
fd 7: /dev/ashmem4945d9b6-db30-413c-88c5-e50674f154c7 (unowned)
fd 8: /dmabuf: (unowned)
fd 9: /dev/ashmem4945d9b6-db30-413c-88c5-e50674f154c7 (unowned)
fd 10: /storage/emulated/0/zapya/folder/华语音乐/IN-K&王忻辰&苏星婕 – 落日与晚风.mp3 (owned by unique_fd 0x7c13c7a498)
fd 11: /dev/ashmem4945d9b6-db30-413c-88c5-e50674f154c7 (unowned)
…
Coredump
The tombstone content above shows how limited its information is. When more memory information is needed, a coredump becomes essential, because it captures memory according to our configuration. For an introduction to core files, see:
https://man7.org/linux/man-pages/man5/core.5.html
AOSP Method
# build/envsetup.sh

# coredump_setup - enable core dumps globally for any process
#                  that has the core-file-size limit set correctly
#
# NOTE: You must call also coredump_enable for a specific process
#       if its core-file-size limit is not set already.
# NOTE: Core dumps are written to ramdisk; they will not survive a reboot!
function coredump_setup()
{
    echo "Getting root...";
    adb root;
    adb wait-for-device;

    echo "Remounting root partition read-write...";
    adb shell mount -w -o remount -t rootfs rootfs;
    sleep 1;
    adb wait-for-device;
    adb shell mkdir -p /cores;
    adb shell mount -t tmpfs tmpfs /cores;
    adb shell chmod 0777 /cores;

    echo "Granting SELinux permission to dump in /cores...";
    adb shell restorecon -R /cores;

    echo "Set core pattern.";
    adb shell 'echo /cores/core.%p > /proc/sys/kernel/core_pattern';

    echo "Done."
}

# coredump_enable - enable core dumps for the specified process
# $1 = PID of process (e.g., $(pid mediaserver))
#
# NOTE: coredump_setup must have been called as well for a core
#       dump to actually be generated.
function coredump_enable()
{
    local PID=$1;
    if [ -z "$PID" ]; then
        printf "Expecting a PID!\n";
        return;
    fi;
    echo "Setting core limit for $PID to infinite...";
    adb shell /system/bin/ulimit -P $PID -c unlimited
}
Common Methods
Configure the coredump parameters for system_server. The directory in which the target process writes its coredump is subject to SELinux, so with this method you need to check which directories the target process has SELinux read/write permission for, and then configure one of them.
adb wait-for-device
adb root
adb shell mkdir /data/cores
adb shell chmod 777 /data/cores
#adb shell setenforce 0
adb shell restorecon -R /data/cores
adb shell 'echo /data/cores/core.%e.%p > /proc/sys/kernel/core_pattern'
adb shell 'system/bin/ulimit -P `pidof system_server` -c unlimited'
#adb shell 'echo 2 > /proc/sys/fs/suid_dumpable'
Note: To rule out SELinux as the reason a core file is not generated, you can temporarily disable SELinux enforcement by running adb shell setenforce 0.
Next, configure coredump capture for com.android.settings. After the previous configuration, restorecon leaves the /data/cores directory with the following SELinux label:
drwxrwxrwx 2 root root u:object_r:system_data_file:s0 3452 2022-07-04 15:08 cores
An app is guaranteed to have read/write permission for files under its own /data/data/$PACKAGE/ directory, so we can configure it as follows:
adb wait-for-device
adb root
adb shell mkdir /data/data/com.android.settings/cores
adb shell chmod 777 /data/data/com.android.settings/cores
adb shell restorecon -R /data/data/com.android.settings/cores
adb shell 'echo /data/data/com.android.settings/cores/core.%e.%p > /proc/sys/kernel/core_pattern'
adb shell 'system/bin/ulimit -P `pidof com.android.settings` -c unlimited'
#adb shell 'echo 2 > /proc/sys/fs/suid_dumpable'
To verify on the device, simulate a crash by sending signal 11 to the app:
$ kill -11 `pidof com.android.settings`
$ ls /data/data/com.android.settings/cores/core.ndroid.settings.27946
Parameter Description
The default coredump_filter value of a process is 0x23, which captures only private anonymous, shared anonymous, and private large-page memory. To also capture private file-backed mappings, run adb shell 'echo 0x27 > /proc/$PID/coredump_filter'; setting all of the bits listed below (0x7f) captures every mapping type.
/proc/$PID/coredump_filter:
bit0: private anonymous
bit1: shared anonymous
bit2: private mapping with underlying file
bit3: shared mapping with underlying file
bit4: ELF header
bit5: private large pages
bit6: shared large pages
core_pattern controls the filename and the output location of the generated core. For example:
adb shell 'echo /data/cores/core.%p > /proc/sys/kernel/core_pattern'
/proc/sys/kernel/core_pattern:
%p: add pid
%u: add current uid
%g: add current gid
%s: add signal that caused the core
%t: add unix time when the core file was created
%h: add hostname
%e: add command name
%E: executable file path name, with slashes ('/') replaced by exclamation marks ('!')
When a program calls seteuid()/setegid() to change the effective user or group of the process, the system by default does not generate a core for it. In that case, set the suid_dumpable parameter to debug mode or safe mode.
/proc/sys/fs/suid_dumpable:
0: default mode
1: debug mode
2: safe mode
File Format
A core file is itself an ELF file, so its overall layout is the same as that of any other ELF file.
For example, the core file discussed in this case mainly consists of the VMAs listed in /proc/self/maps and the register state of each thread. The register state is stored in PT_NOTE, and each VMA is stored as a PT_LOAD. When a VMA is filtered out, it keeps its Program Header entry but has no corresponding segment data.
Offline Debugging
Note: The MTK platform's MINIDUMP is also a kind of coredump, although it saves only a limited amount of memory. Core files can be analyzed with debuggers such as GDB or lldb. This article does not explain these tools in detail.
$ ~/work/debug/gdb_arm64/gdb-12.1/output/bin/aarch64-linux-gdb
Without a symbol table, the core file alone can still be loaded:
(gdb) core-file PROCESS_MINIDUMP
With the matching symbol table, load the symbols directory as well:
(gdb) set solib-search-path symbols/
(gdb) set sysroot symbols/
Command | Description
(gdb) info sharedlibrary | Display the address range of every shared library
(gdb) info registers | Display the registers of the current frame of the current thread
(gdb) info locals | Display the local variables of the current frame
(gdb) info thread | List the available threads
(gdb) thread 2 | Switch to thread 2
(gdb) bt | Display the stack of the current thread
(gdb) thread apply all [command] | Run the same command on all threads; for example, (gdb) thread apply all bt prints the stack of every thread
(gdb) frame | Display the current frame
(gdb) frame 3 | Switch to frame #3
(gdb) print or (gdb) p | Print a variable
(gdb) ptype 'android::AHandler' | View the definition of a class or struct
(gdb) ptype /o 'android::AHandler' | View the offset and size in bytes of each member of the type
(gdb) set print pretty on | Pretty-print structures
(gdb) set logging on | Save GDB's output to a log file
(gdb) x /gx 0x7c02d886f0 | Read the memory at address 0x7c02d886f0 as 8-byte hex words. The available format letters are o (octal), x (hex), d (decimal), u (unsigned decimal), t (binary), f (float), a (address), i (instruction), c (char), s (string), z (hex, zero padded on the left)
(gdb) disassemble 0x0000007c95de6708 or (gdb) disassemble 'android::AMessage::setTarget' | Display the assembly of the function
Memory Detection Mechanism
ASAN
AddressSanitizer (ASAN) is a fast, compiler-based tool for detecting memory errors. After Android 11, AOSP master no longer supports ASAN for platform development on arm64 and uses HWASAN instead. ASAN detects the following kinds of errors:
Stack and heap buffer overflow/underflow | Access beyond either end of a stack or heap buffer
Heap use after free | Use of freed memory
Stack use outside scope | Use of a stack variable outside its scope
Double free/wild free | Freeing memory more than once, or freeing an invalid pointer
HWASAN
HWASan is only applicable to Android 10 and above, and can only be used on AArch64 hardware, with the same detection capability as ASAN. It requires tagged pointers to be supported in Linux-4.14 and above.
When building Android, set the following environment variable:
$ export SANITIZE_TARGET=hwaddress
To skip a particular module, add the following to that module's Android.bp file:
sanitize: {
    hwaddress: false,
    address: false,
},
In Android.mk, add the following:
LOCAL_NOSANITIZE := hwaddress
App builds also support HWASAN; add the following to Application.mk:
APP_STL := c++_shared
APP_CFLAGS := -fsanitize=hwaddress -fno-omit-frame-pointer
APP_LDFLAGS := -fsanitize=hwaddress
MTE
ARM's Memory Tagging Extension (MTE), supported starting with Android S, works similarly to HWASan. The biggest difference is that HWASan requires recompiling the code so that a check is instrumented before every memory access, whereas MTE performs the check entirely in hardware.
For more content, refer to the expert on Juejin – Lu Banshan:
https://juejin.cn/post/6844904111570157575
https://juejin.cn/post/7013595058125406238
The Dangers of Wild Pointers
When the object a pointer refers to has been freed or reclaimed but the pointer itself is left unchanged, it still holds the address of the reclaimed memory; such a pointer is called a wild pointer. If that memory is later allocated to another object while the wild pointer is still in use, the program's behavior becomes unpredictable.
#include <stdio.h>

class A {
public:
    virtual ~A() = default;
    virtual void foo() { printf("A:%ld\n", a); }
    long a;
};

class B {
public:
    virtual ~B() = default;
    virtual void foo() { printf("B:%ld\n", b); }
    long b;
};

int main(int /*argc*/, char** /*argv[]*/) {
    A *a = new A();
    A *a_bak = a;
    a->a = 1000L;
    printf("A ptr = %p\n", a);
    delete a;
    // At this point, the pointer a has been freed, so the pointer a_bak is a wild pointer
    B *b = new B();
    printf("B ptr = %p\n", b);
    b->b = 2000L;
    b->foo();
    a_bak->foo(); // What will happen here?
    delete b;
    return 0;
}
What will the program above output? Because B and A have the same size and both allocations happen on the same thread, the allocator is very likely to hand the just-freed address back to the new object. As a result, b and a_bak will most likely hold the same address.
# ./data/Tester64
A ptr = 0xb400007690205010
B ptr = 0xb400007690205010
B:2000
B:2000
If the program above crashed, that would actually be the better outcome. If it keeps running without an error, things become truly dangerous, because you can never know what the program will do. For example, the following program deliberately redirects control flow.
#include <stdio.h>

class A {
public:
    void *bad;
    long a;
};

class B {
public:
    virtual ~B() { printf("delete B\n"); };
    virtual void foo() { b = 2000L; printf("B:%ld\n", b); }
    long b;
};

void func1() {
    printf("Hello !!\n");
}

void func2() {
    printf("GoGoGo !!\n");
}

int main(int /*argc*/, char** /*argv[]*/) {
    A *a = new A();
    A *a_bak = a;
    printf("A ptr = %p\n", a);
    delete a;
    B *b = new B();
    printf("B ptr = %p\n", b);
    long *data = new long[4] {0x0L, (long)func2, (long)func1, 0x0L};
    a_bak->bad = data;
    printf("Test .. \n");
    b->foo();
    delete b;
    printf("Done.\n");
    return 0;
}
# ./data/Tester64
A ptr = 0xb400007ce9605010
B ptr = 0xb400007ce9605010
Test ..
Hello !!
GoGoGo !!
Done.
The result above shows that control flow was silently redirected: b->foo() and delete b ended up running func1 and func2 through the forged table instead of B's own functions, and no error was reported. The following variant, by contrast, overwrites the same slot with an invalid value and does crash.
#include <stdio.h>

class A {
public:
    long bad;
    long a;
};

class B {
public:
    virtual ~B() { printf("delete B\n"); };
    virtual void foo() { b = 2000L; printf("B:%ld\n", b); }
    long b;
};

int main(int /*argc*/, char** /*argv[]*/) {
    A *a = new A();
    A *a_bak = a;
    printf("A ptr = %p\n", a);
    delete a;
    B *b = new B();
    printf("B ptr = %p\n", b);
    a_bak->bad = 0x20L;
    b->foo();
    delete b;
    return 0;
}
This program crashes after the statement a_bak->bad = 0x20L. That write overwrites the virtual function table pointer of B, so the following virtual calls, b->foo() and delete b, look up the addresses of foo and the destructor relative to 0x20 and raise a segmentation fault.
Timestamp: 2022-07-06 14:47:50.925654058+0800
Process uptime: 0s
Cmdline: ./data/Tester64
pid: 12652, tid: 12652, name: Tester64 >>> ./data/Tester64 <<<
uid: 0
tagged_addr_ctrl: 0000000000000001
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x28
Cause: null pointer dereference
x0 b40000762f005010 x1 b40000762f01b000 x2 0000000000000007 x3 ffffffffffffffff
x4 ffffffffffffffff x5 0000000040100401 x6 b40000762f01b006 x7 3637303030303462
x8 0000000000000020 x9 a454ef76eb4317d3 x10 0000000000004001 x11 0000000000000000
x12 0000000000000000 x13 0000000000000002 x14 0000000000000010 x15 0000000000000010
x16 000000762f5a0c58 x17 000000762f5910d4 x18 000000763745c000 x19 b40000762f005010
x20 b40000762f005010 x21 0000007fe173d378 x22 0000000000000001 x23 0000000000000000
x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000
x28 0000000000000000 x29 0000007fe173d2e0
lr 0000005ac14600c8 sp 0000007fe173d2e0 pc 0000005ac14600d0 pst 0000000060001000
backtrace:
#00 pc 00000000000010d0 /data/Tester64 (main+124)
#01 pc 000000000008436c /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+100)
Array Out of Bounds Hazard
Compared with wild pointers, array out-of-bounds accesses can mostly be caught by memory detection tools such as HWASAN, although these tools can also catch wild pointer accesses. A typical symptom of an out-of-bounds write is that the first half of an object is corrupted while the second half remains intact.
#include <stdio.h>

class A {
public:
    long a = 0x55AA;
    long b = 0xDEAD;
};

int main(int /*argc*/, char** /*argv[]*/) {
    long *b = new long[2] {0x0L, 0x1L};
    A *a = new A();
    printf("A:%p\n", a);
    printf("B:%p\n", b);
    b[2] = 0xDEAD;
    printf("B2:%p\n", &b[2]);
    printf("0x%lx-0x%lx\n", a->a, a->b);
    return 0;
}
Because b has the same size as object a, the two allocations end up adjacent when the program has just started. The out-of-bounds write b[2] therefore corrupts the contents of object a. In practice a program has usually been running for a long time and memory is fragmented, so it is unclear which object's memory an out-of-bounds write such as b[2] will corrupt.
# ./data/Tester64
A:0xb400007fa7c05020
B:0xb400007fa7c05010
B2:0xb400007fa7c05020
0xdead-0xdead
Machine Code Translation
In the tombstone analyzed in this article, the PC ran away and did not land in a text segment, so a different tombstone is used here for illustration. We can compile the machine code into an ELF file and then use objdump to obtain the corresponding assembly.
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x40
Cause: null pointer dereference
x0 0000000000000020 x1 b400007bf95f0e00 x2 63692f6863726165 x3 0000000000000008
x4 b400007bf95f0e74 x5 b400007bf8ea36f4 x6 656863732f676e69 x7 732f7269645f616d
x8 db4cf552ad46f717 x9 0000000000000001 x10 0000000000004001 x11 0000000000000000
x12 0000000000000000 x13 0000000000000002 x14 0000000000000010 x15 0000000000000010
x16 000000762f5a0c58 x17 000000762f5910d4 x18 000000763745c000 x19 0000000000000020
x20 0000000000000000 x21 0000007b548e9000 x22 0000007b548e9000 x23 b400007bf946ffd0
x24 0000007b548e76c0 x25 0000007b548e9000 x26 0000007b548e7b30 x27 0000007b548e7b18
x28 0000007b548e7a10 x29 0000007b548e7410
lr 0000007fdc95f510 sp 0000007fdc95f440 pc 0000007fdc95f510 pst 0000000080001000
backtrace:
#00 pc 000000000001f510 [stack]
Analysis
Direct Cause
0000007c'02c90000-0000007c'02d8bfff rw- 0 fc000 [anon:stack_and_tls:11981]
memory near pc ([anon:stack_and_tls:11981]):
0000007c02d886d0 b400007c13c246d0 0000000001909705 .F..|……….. 【SP = 0x0000007c02d886d0】
0000007c02d886e0 0000007c02d88700 6f2ab3b40fa2f8ef ….|………o
0000007c02d886f0 0000007c02d88750 0000007ca133f8e0 P…|…..3.|… 【x29 = 0x0000007c02d886f0】
0000007c02d88700 0000000000000002 0000000000000000 …………….
0000007c02d88710 0000000000000415 0000000001909705 …………….
code.o: file format elf64-littleaarch64
Disassembly of section .text:
0000000000000000 <.text>:
0: f90023f7 str x23, [sp, #64]
4: a90557f6 stp x22, x21, [sp, #80]
8: a9064ff4 stp x20, x19, [sp, #96]
c: 9100c3fd add x29, sp, #0x30
10: d53bd056 mrs x22, tpidr_el0
14: f94016c8 ldr x8, [x22, #40]
18: aa0003f3 mov x19, x0
1c: f81f83a8 stur x8, [x29, #-8]
20: 39408008 ldrb w8, [x0, #32]
24: 350001c8 cbnz w8, 5c <.text+0x5c>
28: b0fffde2 adrp x2, fffffffffffbd000 <.text+0xfffffffffffbd000>
2c: 91181c42 add x2, x2, #0x607
30: 910023e0 add x0, sp, #0x8
34: 2a1f03e1 mov w1, wzr
38: 52808ee3 mov w3, #0x477 // #1143
3c: 910023f4 add x20, sp, #0x8
40: 9400d695 bl 35a94 <.text+0x35a94>
End of assembler dump.
Conclusion
A close look reveals one detail: the content stored at address 0x7c02d886c0 is problematic.
__cfi_slowpath(uint64_t, void*)
0x7c02d886c0: 0x0000007c02d886f0 0x0000007c95de6754
0x7c02d886d0: 0xb400007c13c246d0 0x0000000001909705
0x7c02d886e0: 0x0000007c02d88700 0x6f2ab3b40fa2f8ef
android::AMessage::setTarget(android::sp<android::AHandler const> const&)
0x7c02d886f0: 0x0000007c02d88750 0x0000007ca133f8e0
0x7c02d88700: 0x0000000000000002 0x0000000000000000
0x7c02d88710: 0x0000000000000415 0x0000000001909705
0x7c02d88720: 0x0000000000000000 0x0000007c02d88808
0x7c02d886c0: 0x0000007c02d886f0 0x0000007c95de6754
…
0x7c02d886d0: 0xb400007c13c246d0 0x0000000001909705
0x7c02d886e0: 0x0000007c02d88700 0x6f2ab3b40fa2f8ef
0x7c02d88750: 0x0000007c02d88830 0x0000007ca796ee7c
0x7c02d88760: 0x0000007ca79f3dd8 0x0000007ca79edb80
0x7c02d88770: 0x0000007c02d89000 0xb400007c13c04680
0x7c02d88780: 0x0000000000000000 0x0000000000000002
0x7c02d88790: 0xb400007c13c67c00 0x0000000000000000
0x7c02d887a0: 0x0000000000000000 0xb400007c13c7c100
0x7c02d887b0: 0x0000007c02d88830 0x0000007ca796d420
0x7c02d887c0: 0x0000007c02d88830 0x0000007ca796d460
Therefore, the tombstone information indicates that it is likely that the program encountered an error due to the corruption of the virtual function table of class B.