KASAN (4) – Inline Instrumentation Analysis

In “KASAN (1) – Simple Practice”, we did not examine the CONFIG_KASAN_INLINE option in detail and only summarized what the kernel documentation says about it. Here, we look at how KASAN's inline detection actually works, based on a practical analysis.

What is INLINE Mode

The inline mode of KASAN is achieved through instruction instrumentation. Those familiar with ftrace will know that GCC has a way to instrument functions during compilation, thereby enabling debugging features in the program. KASAN operates similarly, with instrumentation also performed by GCC. For more information on how ftrace performs instrumentation, you can refer to the article “ftrace Debugging Kernel”.

Testing Stack

To understand how INLINE mode is implemented, we will use a simple piece of code as a test.

static noinline void kmalloc_oob_right(void)
{
    char *ptr;
    size_t size = 123;

    ptr = kmalloc(size, GFP_KERNEL);
    ptr[size] = 'x';
    kfree(ptr);

    return ;
}

The call stack in the resulting KASAN report is as follows:

[   10.118159]  dump_backtrace+0x0/0x3bc
[   10.118182]  show_stack+0x1c/0x24
[   10.118204]  dump_stack_lvl+0x130/0x168
[   10.118228]  print_address_description.constprop.0+0x74/0x2b8
[   10.118251]  kasan_report+0x1e8/0x200
[   10.118273]  __asan_report_store1_noabort+0x30/0x5c
[   10.118294]  kmalloc_oob_right+0x8c/0x90
[   10.118315]  test_kasan_module_init+0x18/0x40
[   10.118335]  do_one_initcall+0xb0/0x4e0
[   10.118360]  kernel_init_freeable+0x47c/0x4e4
[   10.118380]  kernel_init+0x18/0x13c
[   10.118400]  ret_from_fork+0x10/0x18

As can be seen, in inline mode kmalloc_oob_right calls __asan_report_store1_noabort directly, even though the C source contains no such call. How is this achieved? That is what this article explores.

Disassembly

First, to rule out the possibility that the call is introduced by macro expansion, we can preprocess the file with -E. For the kernel, each object file has a corresponding .cmd file that records its build command, which is useful for debugging. The article “vDSO – Example of Implementing System Calls” also used this method to understand the expansion of syscalls. Here, we focus on the file lib/.test_kasan_kylin.mod.o.cmd, which contains the compilation command for this test module:

cmd_lib/test_kasan_kylin.mod.o := /root/kernel/roc-rk3588s-pc/kernel/scripts/gcc-wrapper.py gcc -Wp,-MMD,lib/.test_kasan_kylin.mod.o.d -nostdinc -isystem /usr/lib/gcc/aarch64-linux-gnu/10/include -I./arch/arm64/include -I./arch/arm64/include/generated -I./include -I./arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -DCC_USING_PATCHABLE_FUNCTION_ENTRY -DKASAN_SHADOW_SCALE_SHIFT=3 -fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 -mgeneral-regs-only -DCONFIG_CC_HAS_K_CONSTRAINT=1 -Wno-psabi -mabi=lp64 -fno-asynchronous-unwind-tables -fno-unwind-tables -mbranch-protection=none -Wa,-march=armv8.5-a -DARM64_ASM_ARCH='"armv8.5-a"' -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -O2 -fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector-strong -Werror -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -Wdeclaration-after-statement -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -mstack-protector-guard=sysreg -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard-offset=1344 -fsanitize=kernel-address -fasan-shadow-offset=0xdfffffd000000000 --param asan-globals=1 --param asan-instrumentation-with-call-threshold=10000 --param asan-stack=1 --param asan-instrument-allocas=1 -DMODULE 
-DKBUILD_BASENAME='"test_kasan_kylin.mod"' -DKBUILD_MODNAME='"test_kasan_kylin"' -D__KBUILD_MODNAME=kmod_test_kasan_kylin -c -o lib/test_kasan_kylin.mod.o lib/test_kasan_kylin.mod.c

With the command known, the rest is relatively simple. If we only need macro expansion, we can add -E and change the input and output files, as follows:

/root/kernel/roc-rk3588s-pc/kernel/scripts/gcc-wrapper.py gcc -Wp,-MMD,lib/.test_kasan_kylin.mod.o.d -nostdinc -isystem /usr/lib/gcc/aarch64-linux-gnu/10/include -I./arch/arm64/include -I./arch/arm64/include/generated -I./include -I./arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -DCC_USING_PATCHABLE_FUNCTION_ENTRY -DKASAN_SHADOW_SCALE_SHIFT=3 -fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 -mgeneral-regs-only -DCONFIG_CC_HAS_K_CONSTRAINT=1 -Wno-psabi -mabi=lp64 -fno-asynchronous-unwind-tables -fno-unwind-tables -mbranch-protection=none -Wa,-march=armv8.5-a -DARM64_ASM_ARCH='"armv8.5-a"' -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -O2 -fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector-strong -Werror -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -Wdeclaration-after-statement -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -mstack-protector-guard=sysreg -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard-offset=1344 -fsanitize=kernel-address -fasan-shadow-offset=0xdfffffd000000000 --param asan-globals=1 --param asan-instrumentation-with-call-threshold=10000 --param asan-stack=1 --param asan-instrument-allocas=1 -DMODULE -DKBUILD_BASENAME='"test_kasan_kylin.mod"' 
-DKBUILD_MODNAME='"test_kasan_kylin"' -D__KBUILD_MODNAME=kmod_test_kasan_kylin -c -E -o lib/test_kasan_kylin.mod.o.E lib/test_kasan_kylin.c

At this point, we can open lib/test_kasan_kylin.mod.o.E and find the function kmalloc_oob_right, as follows:

static __attribute__((__noinline__)) void kmalloc_oob_right(void)
{
    char *ptr;
    size_t size = 123;

    ptr = kmalloc(size, ((( gfp_t)(0x400u|0x800u)) | (( gfp_t)0x40u) | (( gfp_t)0x80u)));
    ptr[size] = 'x';
    kfree(ptr);

    return ;
}

As we can see, only GFP_KERNEL has been expanded; nothing in the preprocessed function refers to __asan_report_store1_noabort, so the call does not come from a macro. We can therefore go on to confirm whether it comes from inline/instruction instrumentation.

Since macro expansion is ruled out, we can use objdump for disassembly, as follows:

# objdump -d lib/test_kasan_kylin.ko
0000000000000000 <kmalloc_oob_right>:
   0:   a9be7bfd        stp     x29, x30, [sp, #-32]!
   4:   90000000        adrp    x0, 0 <kmalloc_caches>
   8:   91000000        add     x0, x0, #0x0
   c:   910003fd        mov     x29, sp
  10:   d2dffa01        mov     x1, #0xffd000000000             // #281268818280448
  14:   d343fc02        lsr     x2, x0, #3
  18:   f2fbffe1        movk    x1, #0xdfff, lsl #48
  1c:   f9000bf3        str     x19, [sp, #16]
  20:   38e16841        ldrsb   w1, [x2, x1]
  24:   34000041        cbz     w1, 2c <kmalloc_oob_right+0x2c>
  28:   94000000        bl      0 <__asan_report_load8_noabort>
  2c:   90000000        adrp    x0, 0 <kmalloc_caches>
  30:   d2800f62        mov     x2, #0x7b                       // #123
  34:   52819801        mov     w1, #0xcc0                      // #3264
  38:   f9400000        ldr     x0, [x0]
  3c:   94000000        bl      0 <kmem_cache_alloc_trace>
  40:   aa0003f3        mov     x19, x0
  44:   9101ec00        add     x0, x0, #0x7b
  48:   d2dffa01        mov     x1, #0xffd000000000             // #281268818280448
  4c:   f2fbffe1        movk    x1, #0xdfff, lsl #48
  50:   12000802        and     w2, w0, #0x7
  54:   d343fc03        lsr     x3, x0, #3
  58:   38e16861        ldrsb   w1, [x3, x1]
  5c:   7100003f        cmp     w1, #0x0
  60:   7a411041        ccmp    w2, w1, #0x1, ne  // ne = any
  64:   5400004b        b.lt    6c <kmalloc_oob_right+0x6c>  // b.tstop
  68:   94000000        bl      0 <__asan_report_store1_noabort>
  6c:   52800f01        mov     w1, #0x78                       // #120
  70:   aa1303e0        mov     x0, x19
  74:   3901ee61        strb    w1, [x19, #123]
  78:   94000000        bl      0 <kfree>
  7c:   f9400bf3        ldr     x19, [sp, #16]
  80:   a8c27bfd        ldp     x29, x30, [sp], #32
  84:   d65f03c0        ret

We can see that the compiler has inserted a sequence of instructions before the store that implements ptr[size] = 'x'; (the strb at offset 74). The inserted sequence, together with the instructions immediately around it, is as follows:

  40:   aa0003f3        mov     x19, x0
  44:   9101ec00        add     x0, x0, #0x7b
  48:   d2dffa01        mov     x1, #0xffd000000000             // #281268818280448
  4c:   f2fbffe1        movk    x1, #0xdfff, lsl #48
  50:   12000802        and     w2, w0, #0x7
  54:   d343fc03        lsr     x3, x0, #3
  58:   38e16861        ldrsb   w1, [x3, x1]
  5c:   7100003f        cmp     w1, #0x0
  60:   7a411041        ccmp    w2, w1, #0x1, ne  // ne = any
  64:   5400004b        b.lt    6c <kmalloc_oob_right+0x6c>  // b.tstop
  68:   94000000        bl      0 <__asan_report_store1_noabort>
  6c:   52800f01        mov     w1, #0x78                       // #120
  70:   aa1303e0        mov     x0, x19

The logic is as follows:

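  • 40: Save the kmalloc return value (ptr) from x0 into x19
  • 44: Compute x0 = ptr + 123, the address of ptr[size] that is about to be written (123 is the size passed to kmalloc)
  • 48: Set x1 to 0xffd000000000
  • 4c: Write 0xdfff into bits 48-63 of x1, giving 0xdfffffd000000000; this value is CONFIG_KASAN_SHADOW_OFFSET and matches -fasan-shadow-offset in the compile command
  • 50: Set w2 to w0 & 0x7, the offset of the accessed byte within its 8-byte granule
  • 54: Right shift x0 by 3 bits and store the result in x3
  • 58: x3 + x1 is the address in the shadow memory; load the signed shadow byte at that address into w1
  • 5c: Compare w1 with 0; a shadow value of 0 means the whole 8-byte granule is accessible
  • 60: If w1 is not 0, compare w2 with w1, where w2 is the offset within the granule and w1 is the shadow value (the number of accessible bytes in the granule, or a negative poison value); if w1 is 0, set the flags to 0x1 (only V set) so that the following b.lt is taken
  • 64: If w2 is less than w1 (signed), or w1 was 0, the byte is accessible, so jump to 6c and skip the report
  • 68: Otherwise an out-of-bounds (oob) access has occurred; call __asan_report_store1_noabort with the accessed address in x0
  • 6c: Set w1 to 120 ('x'), the value to be stored
  • 70: Copy ptr from x19 back into x0 as the argument for kfree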
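
The address arithmetic in steps 48 to 58 can be written out in C. This is only a user-space sketch for checking the arithmetic: the two constants come from the compile command and disassembly above, while the function names build_shadow_offset and mem_to_shadow are chosen here for illustration and are not kernel code.

```c
#include <stdint.h>

/* Values taken from the compile command and the disassembly above. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0xdfffffd000000000ULL

/* Model of "mov x1, #0xffd000000000" followed by "movk x1, #0xdfff, lsl #48". */
static uint64_t build_shadow_offset(void)
{
    uint64_t x1 = 0xffd000000000ULL;       /* mov: bits 63..48 are zero   */

    x1 &= ~(0xffffULL << 48);              /* movk keeps all other bits   */
    x1 |= 0xdfffULL << 48;                 /* and replaces bits 63..48    */
    return x1;
}

/* Model of "lsr x3, x0, #3" plus the [x3, x1] addressing used by ldrsb. */
static uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
```

Because of the shift by 3, every 8 consecutive bytes of memory share one shadow byte, so ptr + 0x78 and ptr + 0x7b map to the same shadow address when ptr is 8-byte aligned.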

In summary, the above logic is:

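  • Look up the shadow byte for the accessed address. If it is 0, or if the offset of the accessed byte within its 8-byte granule is less than the shadow value (1-7, the number of accessible bytes in the granule), no oob has occurred. Otherwise (the offset is greater than or equal to the shadow value, or the shadow value is a negative poison value), call __asan_report_store1_noabort to report the oob error.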
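
To make the cmp/ccmp/b.lt sequence easier to follow, here is a small user-space sketch of the 1-byte check in two forms: the plain condition, and a flag-level model of the three instructions. The function names are hypothetical, and the flag model only covers the value ranges that occur here (a sign-extended shadow byte and an offset of 0 to 7), where the subtraction cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* High-level form of the 1-byte check; shadow is the signed shadow byte. */
static bool access_ok_1byte(int8_t shadow, uint64_t addr)
{
    int32_t offset = (int32_t)(addr & 7);      /* and w2, w0, #0x7 */

    return shadow == 0 || offset < shadow;
}

/* Flag-level model of: cmp w1, #0 ; ccmp w2, w1, #0x1, ne ; b.lt */
static bool branch_lt_taken(int8_t shadow, uint64_t addr)
{
    int32_t w1 = shadow;                       /* ldrsb sign-extends the byte */
    int32_t w2 = (int32_t)(addr & 7);
    bool n, v;

    if (w1 != 0) {                             /* "ne" holds: real compare    */
        n = (w2 - w1) < 0;                     /* no overflow in these ranges */
        v = false;
    } else {                                   /* "ne" fails: NZCV = 0b0001   */
        n = false;
        v = true;
    }
    return n != v;                             /* b.lt is taken when N != V   */
}

/* Exhaustively compare both forms over every shadow value and offset. */
static bool models_agree(void)
{
    for (int s = -128; s <= 127; s++)
        for (uint64_t addr = 0; addr < 8; addr++)
            if (access_ok_1byte((int8_t)s, addr) !=
                branch_lt_taken((int8_t)s, addr))
                return false;
    return true;
}
```

The point of the ccmp is that the "shadow is 0" case and the "offset is below the shadow value" case end up in the same branch, so the fast path needs only one conditional jump.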

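As we can see, __asan_report_store1_noabort is reached through an ordinary function call (bl), and based on the stack, the next call is to kasan_report. The disassembly below is of __asan_report_load8_noabort, the load-side counterpart that is called at offset 28 of kmalloc_oob_right; the store variants have the same shape and differ only in the size and write-flag arguments: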

crash> dis __asan_report_load8_noabort
0xffffffd008656aa4 <__asan_report_load8_noabort>:       stp     x29, x30, [sp,#-16]!
0xffffffd008656aa8 <__asan_report_load8_noabort+4>:     adrp    x1, 0xffffffd00e994000
0xffffffd008656aac <__asan_report_load8_noabort+8>:     mov     x3, #0xffffffffffffffff         // #-1
0xffffffd008656ab0 <__asan_report_load8_noabort+12>:    hint    #0x7
0xffffffd008656ab4 <__asan_report_load8_noabort+16>:    mov     x29, sp
0xffffffd008656ab8 <__asan_report_load8_noabort+20>:    ldr     x1, [x1,#2064]
0xffffffd008656abc <__asan_report_load8_noabort+24>:    lsl     x3, x3, x1
0xffffffd008656ac0 <__asan_report_load8_noabort+28>:    tbz     x30, #55, 0xffffffd008656adc <__asan_report_load8_noabort+56>
0xffffffd008656ac4 <__asan_report_load8_noabort+32>:    orr     x3, x30, x3
0xffffffd008656ac8 <__asan_report_load8_noabort+36>:    mov     w2, #0x0                        // #0
0xffffffd008656acc <__asan_report_load8_noabort+40>:    mov     x1, #0x8                        // #8
0xffffffd008656ad0 <__asan_report_load8_noabort+44>:    bl      0xffffffd0086554e0 <kasan_report>
0xffffffd008656ad4 <__asan_report_load8_noabort+48>:    ldp     x29, x30, [sp],#16
0xffffffd008656ad8 <__asan_report_load8_noabort+52>:    ret
0xffffffd008656adc <__asan_report_load8_noabort+56>:    and     x3, x3, #0x7fffffffffffff
0xffffffd008656ae0 <__asan_report_load8_noabort+60>:    mov     w2, #0x0                        // #0
0xffffffd008656ae4 <__asan_report_load8_noabort+64>:    bic     x3, x30, x3
0xffffffd008656ae8 <__asan_report_load8_noabort+68>:    mov     x1, #0x8                        // #8
0xffffffd008656aec <__asan_report_load8_noabort+72>:    bl      0xffffffd0086554e0 <kasan_report>
0xffffffd008656af0 <__asan_report_load8_noabort+76>:    ldp     x29, x30, [sp],#16
0xffffffd008656af4 <__asan_report_load8_noabort+80>:    ret

As we can see, the function has a normal prologue and epilogue and passes its arguments in x0-x3, which conforms to the AArch64 calling convention (AAPCS64). For documentation on the AArch64 calling convention, feel free to contact me. Code that follows the calling convention in this way is an ordinary compiled function rather than inserted instrumentation. We can refer to the kernel code for easier understanding, as shown below:

#define DEFINE_ASAN_REPORT_LOAD(size)                     \
void __asan_report_load##size##_noabort(unsigned long addr) \
{                                                         \
    kasan_report(addr, size, false, _RET_IP_);    \
}                                                         \

As we can see, it directly calls kasan_report, which matches the call stack shown earlier.
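
The macro pattern can be tried out in user space. The sketch below is not the kernel implementation: kasan_report_model is a stand-in that only records its arguments, and the MODEL_ names are made up for this example. The only real pieces borrowed are the token-pasting pattern from the macro above and GCC's __builtin_return_address(0), which is what the kernel's _RET_IP_ expands to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of the most recent report, used instead of printing. */
struct report_record {
    unsigned long addr;
    size_t size;
    bool is_write;
    unsigned long ip;
    int count;
};

static struct report_record last_report;

/* Stand-in for the kernel's kasan_report(); it only records its arguments. */
static void kasan_report_model(unsigned long addr, size_t size,
                               bool is_write, unsigned long ip)
{
    last_report.addr = addr;
    last_report.size = size;
    last_report.is_write = is_write;
    last_report.ip = ip;
    last_report.count++;
}

/* Return address of the current function, i.e. the instrumented call site. */
#define MODEL_RET_IP ((unsigned long)__builtin_return_address(0))

/* One function per access size, generated by token pasting as in the kernel. */
#define DEFINE_MODEL_REPORT_LOAD(size)                            \
static __attribute__((noinline))                                  \
void model_report_load##size##_noabort(unsigned long addr)        \
{                                                                 \
    kasan_report_model(addr, size, false, MODEL_RET_IP);          \
}

#define DEFINE_MODEL_REPORT_STORE(size)                           \
static __attribute__((noinline))                                  \
void model_report_store##size##_noabort(unsigned long addr)       \
{                                                                 \
    kasan_report_model(addr, size, true, MODEL_RET_IP);           \
}

DEFINE_MODEL_REPORT_LOAD(8)
DEFINE_MODEL_REPORT_STORE(1)
```

Calling model_report_store1_noabort(addr) records a size of 1 with the write flag set, while model_report_load8_noabort(addr) records a size of 8 with the flag clear, which corresponds to the mov x1, #0x8 and mov w2, #0x0 instructions in the disassembly above.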

Conclusion

This article used disassembly to understand how KASAN detects out-of-bounds (oob) errors through inline instrumentation. The main process is summarized as follows:

  1. GCC internally instruments load and store operations.
  2. This instrumentation inserts instructions before load and store operations.
  3. The inserted instructions mainly check whether the corresponding shadow area of the accessed memory is accessible.
  4. If not accessible, they call a standard function such as __asan_report_store1_noabort or __asan_report_load8_noabort.
  5. These standard functions are implemented by the kernel and mainly call kasan_report.
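
The whole flow can be simulated in user space for the 123-byte object from the test. This is a sketch under stated assumptions, not kernel code: the arena, the shadow array, the poison value -4, and all function names are made up for illustration, and the shadow layout (15 fully accessible granules, one granule with 3 accessible bytes, then poison) is what the disassembled check implies rather than something read from a live kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE        8
#define ARENA_SIZE     256            /* hypothetical arena, multiple of GRANULE */
#define REDZONE_POISON ((int8_t)-4)   /* assumed negative poison value           */

static char arena[ARENA_SIZE];        /* offset 0 is treated as granule-aligned  */
static int8_t shadow[ARENA_SIZE / GRANULE];
static int report_count;

/* Mark the first size bytes accessible and the rest poisoned.
 * size must be smaller than ARENA_SIZE. */
static void model_alloc(size_t size)
{
    size_t full = size / GRANULE;

    memset(shadow, REDZONE_POISON, sizeof(shadow));
    memset(shadow, 0, full);                      /* fully accessible granules */
    if (size % GRANULE)
        shadow[full] = (int8_t)(size % GRANULE);  /* partially accessible one  */
}

/* Stand-in for __asan_report_store1_noabort(); it only counts reports. */
static void model_report_store1(size_t offset)
{
    (void)offset;
    report_count++;
}

/* What an instrumented 1-byte store does: check the shadow, then store. */
static void instrumented_store1(size_t offset, char value)
{
    int8_t s = shadow[offset / GRANULE];

    if (s != 0 && (int8_t)(offset % GRANULE) >= s)
        model_report_store1(offset);
    arena[offset] = value;   /* as in the disassembly, the store still happens */
}
```

After model_alloc(123), a store at offset 122 passes the check, while a store at offset 123 (the ptr[size] case) is reported and then still executed, just as the strb at offset 74 runs after the bl at offset 68 returns.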

With the inline detection logic clarified, we will next explore the working mode of outline instrumentation.