Linux Kernel Memory Detection Tool KASAN (1) – Simple Practice


I briefly introduced KASAN in my article “Debugging Memory Leaks in the Kernel”. At that time, my understanding of how KASAN is implemented was still shallow. I now plan to study KASAN thoroughly and to consolidate what I learn by writing this series of articles.

Before understanding KASAN, it is essential to understand how ASAN works. ASAN detects memory issues in code through a poisoning mechanism. I introduced the working principles and practices of ASAN in the article “Using ASAN to Debug Memory Issues”.

Configuring the Kernel

Regarding the kernel configuration for KASAN, the main options to enable are as follows:

# zcat /proc/config.gz | grep KASAN
CONFIG_KASAN_SHADOW_OFFSET=0xdfffffd000000000
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
CONFIG_KASAN_MODULE_TEST=m

Here is a brief explanation:

  1. CONFIG_KASAN_SHADOW_OFFSET This is the offset of the shadow memory region. Every 8 bytes of memory map to 1 byte of shadow memory, and the shadow address is calculated in the same way as in ASAN:
Shadow = (Mem >> 3) + offset;
  2. CONFIG_HAVE_ARCH_KASAN This indicates whether the platform supports KASAN. On ARM64 it is selected as follows:
select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)

This means that on ARM64, KASAN is not supported when 16KB pages are combined with 48-bit virtual addresses; in every other configuration it is supported.

  3. CONFIG_HAVE_ARCH_KASAN_SW_TAGS This only indicates whether the current architecture (ARM64) supports the software tag-based KASAN mode.
  4. CONFIG_HAVE_ARCH_KASAN_VMALLOC This only indicates whether the current architecture (ARM64) supports KASAN checks for vmalloc areas.
  5. CONFIG_CC_HAS_KASAN_GENERIC This indicates whether the compiler accepts the -fsanitize=kernel-address option, which is documented in the GCC manual and checked in Kconfig as follows:
https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Instrumentation-Options.html#Instrumentation-Options
config CC_HAS_KASAN_GENERIC
    def_bool $(cc-option, -fsanitize=kernel-address)
  6. CONFIG_KASAN This is the main switch for KASAN in the kernel.
  7. CONFIG_KASAN_GENERIC This selects the generic mode, which works like user-space ASAN. The software tag-based and hardware tag-based modes can be selected instead. The Kconfig entries for the three modes are as follows:
config KASAN_GENERIC
    bool "Generic mode"
    depends on HAVE_ARCH_KASAN && CC_HAS_KASAN_GENERIC
    depends on CC_HAS_WORKING_NOSANITIZE_ADDRESS
    select SLUB_DEBUG if SLUB
    select CONSTRUCTORS
    help
      Enables generic KASAN mode.

      This mode is supported in both GCC and Clang. With GCC it requires
      version 8.3.0 or later. Any supported Clang version is compatible,
      but detection of out-of-bounds accesses for global variables is
      supported only since Clang 11.

      This mode consumes about 1/8th of available memory at kernel start
      and introduces an overhead of ~x1.5 for the rest of the allocations.
      The performance slowdown is ~x3.

      Currently CONFIG_KASAN_GENERIC doesn't work with CONFIG_DEBUG_SLAB
      (the resulting kernel does not boot).

config KASAN_SW_TAGS
    bool "Software tag-based mode"
    depends on HAVE_ARCH_KASAN_SW_TAGS && CC_HAS_KASAN_SW_TAGS
    depends on CC_HAS_WORKING_NOSANITIZE_ADDRESS
    select SLUB_DEBUG if SLUB
    select CONSTRUCTORS
    help
      Enables software tag-based KASAN mode.

      This mode requires software memory tagging support in the form of
      HWASan-like compiler instrumentation.

      Currently this mode is only implemented for ARM64 CPUs and relies on
      Top Byte Ignore. This mode requires Clang.

      This mode consumes about 1/16th of available memory at kernel start
      and introduces an overhead of ~20% for the rest of the allocations.
      This mode may potentially introduce problems relating to pointer
      casting and comparison, as it embeds tags into the top byte of each
      pointer.

      Currently CONFIG_KASAN_SW_TAGS doesn't work with CONFIG_DEBUG_SLAB
      (the resulting kernel does not boot).

config KASAN_HW_TAGS
    bool "Hardware tag-based mode"
    depends on HAVE_ARCH_KASAN_HW_TAGS
    depends on SLUB
    help
      Enables hardware tag-based KASAN mode.

      This mode requires hardware memory tagging support, and can be used
      by any architecture that provides it.

      Currently this mode is only implemented for ARM64 CPUs starting from
      ARMv8.5 and relies on Memory Tagging Extension and Top Byte Ignore.

In summary:

  • The generic mode is consistent with user-space ASAN and uses shadow memory at a 1/8 ratio.
  • The sw tags mode uses ARMv8’s TBI (Top Byte Ignore) to implement KASAN.
  • The hw tags mode uses the MTE (Memory Tagging Extension) available from ARMv8.5, together with TBI, to implement KASAN.

Currently, my environment does not support the latter two modes, so for now I will use the generic mode.

  8. CONFIG_KASAN_INLINE My configuration enables INLINE mode, but OUTLINE mode can be selected instead. The two modes differ in how the compiler instruments the code, which will be explained in detail later when analyzing the code.
  • INLINE mode inserts the shadow memory check itself before each memory access.
  • OUTLINE mode inserts a function call before each memory access, and the called function performs the check.

As a result, INLINE mode produces a larger kernel code segment but runs faster, while OUTLINE mode produces a smaller one at the cost of a function call on every access.

  9. CONFIG_KASAN_STACK This enables detection of stack buffer overflows.
  10. CONFIG_KASAN_VMALLOC This enables shadow memory for vmalloc areas. The vmalloc area is a virtually contiguous address range backed by scattered physical pages, and the range is usually very large. For example, according to the kernel documentation, it is approximately 93TB:
ffffa00010000000  fffffdffbffeffff     ~93TB      vmalloc

At a 1/8 ratio, covering this whole range would take about 11TB of shadow address space, so the kernel provides a separate option to control whether memory issues in vmalloc areas are checked.

  11. CONFIG_KASAN_MODULE_TEST This builds the test module that the kernel provides for verifying KASAN.
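
The 93TB and 11TB figures quoted for the vmalloc range can be checked with a few lines of arithmetic. The following is a minimal userspace sketch, not kernel code; the helper names range_size, shadow_size and to_tib are hypothetical and exist only for this illustration. It assumes the 1/8 shadow ratio of the generic mode.

```c
#include <stdint.h>

/* Generic KASAN maps 8 bytes of memory to 1 shadow byte. */
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Size in bytes of the inclusive address range [start, end]. */
uint64_t range_size(uint64_t start, uint64_t end)
{
    return end - start + 1;
}

/* Number of shadow bytes needed to describe `size` bytes of memory. */
uint64_t shadow_size(uint64_t size)
{
    return size >> KASAN_SHADOW_SCALE_SHIFT;
}

/* Whole tebibytes (2^40 bytes) contained in a byte count, rounded down. */
uint64_t to_tib(uint64_t bytes)
{
    return bytes >> 40;
}
```

Calling range_size(0xffffa00010000000, 0xfffffdffbffeffff) gives 0x5dffafff0000 bytes, which to_tib rounds down to 93, and shadow_size of that value rounds down to 11.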

Testing and Verification

Testing is simple: run make modules to build test_kasan_module.ko, then load it with insmod. The resulting log is as follows:

[   28.681187] kasan test: copy_user_test out-of-bounds in copy_from_user()
[   28.681203] ==================================================================
[   28.681212] BUG: KASAN: slab-out-of-bounds in copy_user_test+0xc0/0x340 [test_kasan_module]
[   28.681217] Write of size 11 at addr ffffff80833f8500 by task insmod/1953
insmod: ERROR: could not insert module test_kasan_module.ko: Resource temporarily unavailable
[   28.681221] 
root@kylin:~# [   28.681227] CPU: 4 PID: 1953 Comm: insmod Tainted: G    B             5.10.198 #92
[   28.681231] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT)
[   28.681235] Call trace:
[   28.681243]  dump_backtrace+0x0/0x3bc
[   28.681248]  show_stack+0x1c/0x24
[   28.681254]  dump_stack_lvl+0x130/0x168
[   28.681260]  print_address_description.constprop.0+0x74/0x2b8
[   28.681265]  kasan_report+0x1e8/0x200
[   28.681270]  kasan_check_range+0xf4/0x1a0
[   28.681273]  __kasan_check_write+0x30/0x50
[   28.681279]  copy_user_test+0xc0/0x340 [test_kasan_module]
[   28.681284]  test_kasan_module_init+0x18/0xa78 [test_kasan_module]
[   28.681289]  do_one_initcall+0xb0/0x4e0
[   28.681294]  do_init_module+0x14c/0x600
[   28.681298]  load_module+0x5714/0x71fc
[   28.681303]  __do_sys_finit_module+0x110/0x1a0
[   28.681307]  __arm64_sys_finit_module+0x70/0xa0
[   28.681312]  el0_svc_common.constprop.0+0xf0/0x464
[   28.681317]  do_el0_svc+0x44/0x5c
[   28.681321]  el0_svc+0x1c/0x30
[   28.681325]  el0_sync_handler+0xa8/0xac
[   28.681328]  el0_sync+0x158/0x180
[   28.681331] 
[   28.681335] Allocated by task 1953:
[   28.681340]  kasan_save_stack+0x24/0x50
[   28.681343]  __kasan_kmalloc+0x88/0xb0
[   28.681347]  kmem_cache_alloc_trace+0x1d0/0x3c0
[   28.681352]  copy_user_test+0x48/0x340 [test_kasan_module]
[   28.681357]  test_kasan_module_init+0x18/0xa78 [test_kasan_module]
[   28.681361]  do_one_initcall+0xb0/0x4e0
[   28.681365]  do_init_module+0x14c/0x600
[   28.681369]  load_module+0x5714/0x71fc
[   28.681373]  __do_sys_finit_module+0x110/0x1a0
[   28.681377]  __arm64_sys_finit_module+0x70/0xa0
[   28.681381]  el0_svc_common.constprop.0+0xf0/0x464
[   28.681385]  do_el0_svc+0x44/0x5c
[   28.681388]  el0_svc+0x1c/0x30
[   28.681392]  el0_sync_handler+0xa8/0xac
[   28.681396]  el0_sync+0x158/0x180
[   28.681398] 
[   28.681402] The buggy address belongs to the object at ffffff80833f8500
[   28.681402]  which belongs to the cache kmalloc-128 of size 128
[   28.681407] The buggy address is located 0 bytes inside of
[   28.681407]  128-byte region [ffffff80833f8500, ffffff80833f8580)
[   28.681411] The buggy address belongs to the page:
[   28.681417] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8
[   28.681421] head:00000000064eb9ca order:1 compound_mapcount:0
[   28.681426] flags: 0x10200(slab|head)
[   28.681431] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80
[   28.681436] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[   28.681439] page dumped because: kasan: bad access detected
[   28.681442] 
[   28.681445] Memory state around the buggy address:
[   28.681449]  ffffff80833f8400: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681452]  ffffff80833f8480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681456] >ffffff80833f8500: 00 02 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681459]                       ^
[   28.681463]  ffffff80833f8580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681466]  ffffff80833f8600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681470] ==================================================================
......

[   28.683111] kasan test: kasan_rcu_uaf use-after-free in kasan_rcu_reclaim
[   28.683416] kasan test: kasan_workqueue_uaf use-after-free on workqueue
[   28.683433] ==================================================================
[   28.683454] BUG: KASAN: use-after-free in kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[   28.683473] Read of size 8 at addr ffffff80833f8800 by task insmod/1953
[   28.683484] 
[   28.683501] CPU: 4 PID: 1953 Comm: insmod Tainted: G    B             5.10.198 #92
[   28.683504] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT)
[   28.683508] Call trace:
[   28.683513]  dump_backtrace+0x0/0x3bc
[   28.683532]  show_stack+0x1c/0x24
[   28.683552]  dump_stack_lvl+0x130/0x168
[   28.683566]  print_address_description.constprop.0+0x74/0x2b8
[   28.683579]  kasan_report+0x1e8/0x200
[   28.683592]  __asan_report_load8_noabort+0x30/0x5c
[   28.683608]  kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[   28.683633]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   28.683646]  do_one_initcall+0xb0/0x4e0
[   28.683661]  do_init_module+0x14c/0x600
[   28.683674]  load_module+0x5714/0x71fc
[   28.683689]  __do_sys_finit_module+0x110/0x1a0
[   28.683710]  __arm64_sys_finit_module+0x70/0xa0
[   28.683724]  el0_svc_common.constprop.0+0xf0/0x464
[   28.683737]  do_el0_svc+0x44/0x5c
[   28.683750]  el0_svc+0x1c/0x30
[   28.683763]  el0_sync_handler+0xa8/0xac
[   28.683776]  el0_sync+0x158/0x180
[   28.683781] 
[   28.683789] Allocated by task 1953:
[   28.683793]  kasan_save_stack+0x24/0x50
[   28.683797]  __kasan_kmalloc+0x88/0xb0
[   28.683801]  kmem_cache_alloc_trace+0x1d0/0x3c0
[   28.683805]  kasan_workqueue_uaf+0x80/0x158 [test_kasan_module]
[   28.683810]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   28.683814]  do_one_initcall+0xb0/0x4e0
[   28.683818]  do_init_module+0x14c/0x600
[   28.683822]  load_module+0x5714/0x71fc
[   28.683825]  __do_sys_finit_module+0x110/0x1a0
[   28.683830]  __arm64_sys_finit_module+0x70/0xa0
[   28.683834]  el0_svc_common.constprop.0+0xf0/0x464
[   28.683838]  do_el0_svc+0x44/0x5c
[   28.683841]  el0_svc+0x1c/0x30
[   28.683845]  el0_sync_handler+0xa8/0xac
[   28.683848]  el0_sync+0x158/0x180
[   28.683851] 
[   28.683855] Freed by task 676:
[   28.683859]  kasan_save_stack+0x24/0x50
[   28.683868]  kasan_set_track+0x24/0x34
[   28.683872]  kasan_set_free_info+0x24/0x44
[   28.683876]  __kasan_slab_free+0xd8/0x134
[   28.683879]  kfree+0xe0/0x500
[   28.683884]  kasan_workqueue_work+0xc/0x14 [test_kasan_module]
[   28.683889]  process_one_work+0x624/0x1240
[   28.683893]  worker_thread+0x3b8/0xe90
[   28.683897]  kthread+0x2c0/0x344
[   28.683901]  ret_from_fork+0x10/0x18
[   28.683904] 
[   28.683907] Last potentially related work creation:
[   28.683910]  kasan_save_stack+0x24/0x50
[   28.683914]  kasan_record_aux_stack+0xbc/0xd0
[   28.683919]  insert_work+0x54/0x2e0
[   28.683923]  __queue_work+0x3a8/0xca0
[   28.683926]  queue_work_on+0x9c/0xd0
[   28.683931]  kasan_workqueue_uaf+0x114/0x158 [test_kasan_module]
[   28.683936]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   28.683939]  do_one_initcall+0xb0/0x4e0
[   28.683944]  do_init_module+0x14c/0x600
[   28.683949]  load_module+0x5714/0x71fc
[   28.683953]  __do_sys_finit_module+0x110/0x1a0
[   28.683962]  __arm64_sys_finit_module+0x70/0xa0
[   28.683967]  el0_svc_common.constprop.0+0xf0/0x464
[   28.683970]  do_el0_svc+0x44/0x5c
[   28.683974]  el0_svc+0x1c/0x30
[   28.683978]  el0_sync_handler+0xa8/0xac
[   28.683981]  el0_sync+0x158/0x180
[   28.683984] 
[   28.683988] The buggy address belongs to the object at ffffff80833f8800
[   28.683988]  which belongs to the cache kmalloc-128 of size 128
[   28.683992] The buggy address is located 0 bytes inside of
[   28.683992]  128-byte region [ffffff80833f8800, ffffff80833f8880)
[   28.683995] The buggy address belongs to the page:
[   28.684001] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8
[   28.684005] head:00000000064eb9ca order:1 compound_mapcount:0
[   28.684009] flags: 0x10200(slab|head)
[   28.684014] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80
[   28.684018] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[   28.684021] page dumped because: kasan: bad access detected
[   28.684026] 
[   28.684028] Memory state around the buggy address:
[   28.684036]  ffffff80833f8700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   28.684040]  ffffff80833f8780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.684044] >ffffff80833f8800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   28.684047]                    ^
[   28.684050]  ffffff80833f8880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.684054]  ffffff80833f8900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Log Explanation

The KASAN module test mainly tests two parts:

  • Out of bounds
  • Use after free

Below, I will explain the log information according to these two parts.

Out of Bounds Detection

Let’s start with the first error, the out-of-bounds report from copy_user_test that is shown in full in the log above, and go through it piece by piece.

The first piece of information is the report header. KASAN states the error type directly, slab-out-of-bounds, together with the location copy_user_test+0xc0/0x340. The offset can be resolved to a source line with gdb.

BUG: KASAN: slab-out-of-bounds in copy_user_test+0xc0/0x340 [test_kasan_module]

The second piece of information gives the faulting address ffffff80833f8500 and tells us that the access was an 11-byte write.

Write of size 11 at addr ffffff80833f8500 by task insmod/1953

The third piece of information is the call stack of the faulting access. It shows that the error was reported from the __kasan_check_write call made inside copy_user_test, so the origin of the error is very clear.

[   28.681235] Call trace:
[   28.681243]  dump_backtrace+0x0/0x3bc
[   28.681248]  show_stack+0x1c/0x24
[   28.681254]  dump_stack_lvl+0x130/0x168
[   28.681260]  print_address_description.constprop.0+0x74/0x2b8
[   28.681265]  kasan_report+0x1e8/0x200
[   28.681270]  kasan_check_range+0xf4/0x1a0
[   28.681273]  __kasan_check_write+0x30/0x50
[   28.681279]  copy_user_test+0xc0/0x340 [test_kasan_module]
[   28.681284]  test_kasan_module_init+0x18/0xa78 [test_kasan_module]
[   28.681289]  do_one_initcall+0xb0/0x4e0
[   28.681294]  do_init_module+0x14c/0x600
[   28.681298]  load_module+0x5714/0x71fc
[   28.681303]  __do_sys_finit_module+0x110/0x1a0
[   28.681307]  __arm64_sys_finit_module+0x70/0xa0
[   28.681312]  el0_svc_common.constprop.0+0xf0/0x464
[   28.681317]  do_el0_svc+0x44/0x5c
[   28.681321]  el0_svc+0x1c/0x30
[   28.681325]  el0_sync_handler+0xa8/0xac
[   28.681328]  el0_sync+0x158/0x180

The fourth piece of information is the stack of the allocation that owns the accessed memory. The __kasan_kmalloc and kmem_cache_alloc_trace frames show that this is a slab object.

[   28.681335] Allocated by task 1953:
[   28.681340]  kasan_save_stack+0x24/0x50
[   28.681343]  __kasan_kmalloc+0x88/0xb0
[   28.681347]  kmem_cache_alloc_trace+0x1d0/0x3c0
[   28.681352]  copy_user_test+0x48/0x340 [test_kasan_module]
[   28.681357]  test_kasan_module_init+0x18/0xa78 [test_kasan_module]
[   28.681361]  do_one_initcall+0xb0/0x4e0
[   28.681365]  do_init_module+0x14c/0x600
[   28.681369]  load_module+0x5714/0x71fc
[   28.681373]  __do_sys_finit_module+0x110/0x1a0
[   28.681377]  __arm64_sys_finit_module+0x70/0xa0
[   28.681381]  el0_svc_common.constprop.0+0xf0/0x464
[   28.681385]  do_el0_svc+0x44/0x5c
[   28.681388]  el0_svc+0x1c/0x30
[   28.681392]  el0_sync_handler+0xa8/0xac
[   28.681396]  el0_sync+0x158/0x180

The fifth piece of information describes the slab: the address lies in an object that belongs to the cache named kmalloc-128.

[   28.681402] The buggy address belongs to the object at ffffff80833f8500
[   28.681402]  which belongs to the cache kmalloc-128 of size 128

The sixth piece of information, similar to user ASAN, prints the memory range:

[   28.681407] The buggy address is located 0 bytes inside of
[   28.681407]  128-byte region [ffffff80833f8500, ffffff80833f8580)

The seventh piece of information describes the page: the page structure, reference count, map count, mapping, index, pfn, head page, order, compound map count, page flags, and the raw contents of the page structure, followed by the reason the page was dumped.

[   28.681411] The buggy address belongs to the page:
[   28.681417] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8
[   28.681421] head:00000000064eb9ca order:1 compound_mapcount:0
[   28.681426] flags: 0x10200(slab|head)
[   28.681431] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80
[   28.681436] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[   28.681439] page dumped because: kasan: bad access detected

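The eighth piece of information, similar to ASAN, shows the shadow bytes around the faulting address. Each shadow byte describes 8 bytes of memory because the shadow mapping is 1/8. The value 00 means that all 8 bytes are accessible, and 02 means that only the first 2 bytes of that group are accessible, so the object is 10 bytes long. The marker points at the 02 byte, which covers the 9th to 16th bytes of the object, so the invalid access lies between the 11th and 16th bytes.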

[   28.681445] Memory state around the buggy address:
[   28.681449]  ffffff80833f8400: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681452]  ffffff80833f8480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681456] >ffffff80833f8500: 00 02 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681459]                       ^
[   28.681463]  ffffff80833f8580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.681466]  ffffff80833f8600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

First, we see that the address is ffffff80833f8500, which is clearly a kernel space address. Note that this is not a shadow address: shadow addresses are computed with CONFIG_KASAN_SHADOW_OFFSET=0xdfffffd000000000. This is a difference between ASAN and KASAN. KASAN prints the memory address that each row of shadow bytes describes, while ASAN prints the shadow region address itself. Therefore, when reading a KASAN report, the address on the left is real memory and the bytes on the right are shadow values; the kernel printout has already done the conversion from the shadow memory address to the actual memory address for us.
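
To make the relationship between the two kinds of address concrete, here is a minimal userspace sketch of the Shadow = (Mem >> 3) + offset formula, using the CONFIG_KASAN_SHADOW_OFFSET value from my configuration. The function names mem_to_shadow and shadow_to_mem are chosen for this illustration only, and the code works on plain integers rather than real kernel pointers.

```c
#include <stdint.h>

/* Values from the configuration shown earlier (generic mode, 1/8 ratio). */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0xdfffffd000000000ULL

/* Shadow address that describes the 8-byte group containing addr. */
uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* First memory address described by the shadow byte at `shadow`. */
uint64_t shadow_to_mem(uint64_t shadow)
{
    return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}
```

Under these assumptions, the object address ffffff80833f8500 from the report maps to the shadow address ffffffc01067f0a0, and the 11th byte of the object (ffffff80833f850a) maps to the next shadow byte, ffffffc01067f0a1, which is the 02 entry in the dump.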

It is also important to note that KASAN and ASAN use different poison values. The value 0xfc cannot be interpreted with the ASAN legend; we need to check the definitions in the kernel code:

#ifdef CONFIG_KASAN_GENERIC
#define KASAN_FREE_PAGE         0xFF  /* page was freed */
#define KASAN_PAGE_REDZONE      0xFE  /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE   0xFC  /* redzone inside slub object */
#define KASAN_KMALLOC_FREE      0xFB  /* object was freed (kmem_cache_free/kfree) */
#define KASAN_KMALLOC_FREETRACK 0xFA  /* object was freed and has free track set */
#else
#define KASAN_FREE_PAGE         KASAN_TAG_INVALID
#define KASAN_PAGE_REDZONE      KASAN_TAG_INVALID
#define KASAN_KMALLOC_REDZONE   KASAN_TAG_INVALID
#define KASAN_KMALLOC_FREE      KASAN_TAG_INVALID
#define KASAN_KMALLOC_FREETRACK KASAN_TAG_INVALID
#endif

#define KASAN_GLOBAL_REDZONE    0xF9  /* redzone for global variable */
#define KASAN_VMALLOC_INVALID   0xF8  /* unallocated space in vmapped page */

/*
 * Stack redzone shadow values
 * (Those are compiler's ABI, don't change them)
 */
#define KASAN_STACK_LEFT        0xF1
#define KASAN_STACK_MID         0xF2
#define KASAN_STACK_RIGHT       0xF3
#define KASAN_STACK_PARTIAL     0xF4

/*
 * alloca redzone shadow values
 */
#define KASAN_ALLOCA_LEFT   0xCA
#define KASAN_ALLOCA_RIGHT  0xCB

#define KASAN_ALLOCA_REDZONE_SIZE   32

/*
 * Stack frame marker (compiler ABI).
 */
#define KASAN_CURRENT_STACK_FRAME_MAGIC 0x41B58AB3

/* Don't break randconfig/all*config builds */
#ifndef KASAN_ABI_VERSION
#define KASAN_ABI_VERSION 1
#endif

As we can see, the 0xfc poison type is KASAN_KMALLOC_REDZONE, which, as the name suggests, is the red zone for kmalloc. A quick look at the code shows:

    case KASAN_KMALLOC_REDZONE:
        bug_type = "slab-out-of-bounds";
        break;
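
The following is a simplified userspace sketch of how a shadow value could be turned into a bug type, using the generic-mode values defined above. It is not the kernel's implementation: the enum, the function names classify_shadow and bug_kind_name, and the rule of looking at the next shadow byte when the first one is in the range 1 to 7 are assumptions made for illustration. Only slab-out-of-bounds and use-after-free appear in the logs of this article; the other strings are likewise assumptions.

```c
#include <stdint.h>

/* Shadow values copied from the CONFIG_KASAN_GENERIC definitions above. */
#define KASAN_GRANULE_SIZE      8
#define KASAN_FREE_PAGE         0xFF
#define KASAN_PAGE_REDZONE      0xFE
#define KASAN_KMALLOC_REDZONE   0xFC
#define KASAN_KMALLOC_FREE      0xFB
#define KASAN_KMALLOC_FREETRACK 0xFA
#define KASAN_GLOBAL_REDZONE    0xF9
#define KASAN_VMALLOC_INVALID   0xF8
#define KASAN_STACK_LEFT        0xF1
#define KASAN_STACK_MID         0xF2
#define KASAN_STACK_RIGHT       0xF3
#define KASAN_STACK_PARTIAL     0xF4
#define KASAN_ALLOCA_LEFT       0xCA
#define KASAN_ALLOCA_RIGHT      0xCB

/* Hypothetical classification used only in this sketch. */
enum bug_kind {
    BUG_UNKNOWN, BUG_OOB, BUG_SLAB_OOB, BUG_GLOBAL_OOB,
    BUG_STACK_OOB, BUG_UAF, BUG_ALLOCA_OOB, BUG_VMALLOC_OOB
};

/*
 * first_bad is the shadow byte of the first invalid address and next is the
 * shadow byte that follows it. A value of 1..7 only says that part of the
 * 8-byte group is accessible, so the following byte decides the type.
 */
enum bug_kind classify_shadow(uint8_t first_bad, uint8_t next)
{
    uint8_t v = first_bad;

    if (v > 0 && v < KASAN_GRANULE_SIZE)
        v = next;

    switch (v) {
    case KASAN_PAGE_REDZONE:
    case KASAN_KMALLOC_REDZONE:
        return BUG_SLAB_OOB;
    case KASAN_GLOBAL_REDZONE:
        return BUG_GLOBAL_OOB;
    case KASAN_STACK_LEFT:
    case KASAN_STACK_MID:
    case KASAN_STACK_RIGHT:
    case KASAN_STACK_PARTIAL:
        return BUG_STACK_OOB;
    case KASAN_FREE_PAGE:
    case KASAN_KMALLOC_FREE:
    case KASAN_KMALLOC_FREETRACK:
        return BUG_UAF;
    case KASAN_ALLOCA_LEFT:
    case KASAN_ALLOCA_RIGHT:
        return BUG_ALLOCA_OOB;
    case KASAN_VMALLOC_INVALID:
        return BUG_VMALLOC_OOB;
    default:
        return v < KASAN_GRANULE_SIZE ? BUG_OOB : BUG_UNKNOWN;
    }
}

/* Human-readable name for a bug kind, in the style of the report header. */
const char *bug_kind_name(enum bug_kind kind)
{
    static const char *const names[] = {
        "unknown-crash", "out-of-bounds", "slab-out-of-bounds",
        "global-out-of-bounds", "stack-out-of-bounds", "use-after-free",
        "alloca-out-of-bounds", "vmalloc-out-of-bounds"
    };
    return names[kind];
}
```

With the shadow bytes from the two reports in this article, classify_shadow(0x02, 0xfc) yields BUG_SLAB_OOB and classify_shadow(0xfa, 0xfb) yields BUG_UAF, which agrees with the report headers.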

Finally, it is worth noting that the address information provided by ASAN is a shadow address, whereas the addresses in a KASAN report are actual memory addresses.

Therefore, based on the information above, we can summarize the understanding as follows:

  • KASAN detected a slab out-of-bounds access through the 0xfc redzone poison. The object belongs to the slab cache kmalloc-128; 10 bytes were requested but 11 bytes were written. The report also provides the details of the page backing this slab, the stack of the out-of-bounds access, and the allocation stack of the object.
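
The shadow row 00 02 fc fc ... in the report can be reproduced from the numbers in this summary. The sketch below is a userspace model under stated assumptions: a 128-byte slot holding a 10-byte object, with everything past the object marked as KASAN_KMALLOC_REDZONE. The function names poison_slot and first_bad_offset are hypothetical, and the byte-by-byte loop is written for readability rather than to mirror how the kernel checks a range.

```c
#include <stddef.h>
#include <stdint.h>

#define KASAN_GRANULE_SIZE    8
#define KASAN_KMALLOC_REDZONE 0xFC

/*
 * Fill shadow[] (slot_size / 8 entries) for a slab slot of slot_size bytes
 * that holds an object of obj_size bytes starting at offset 0.
 */
void poison_slot(uint8_t *shadow, size_t slot_size, size_t obj_size)
{
    size_t granules = slot_size / KASAN_GRANULE_SIZE;
    size_t full = obj_size / KASAN_GRANULE_SIZE; /* fully accessible groups */
    size_t tail = obj_size % KASAN_GRANULE_SIZE; /* bytes in the partial group */

    for (size_t i = 0; i < granules; i++) {
        if (i < full)
            shadow[i] = 0;                     /* all 8 bytes accessible */
        else if (i == full && tail != 0)
            shadow[i] = (uint8_t)tail;         /* only first `tail` bytes */
        else
            shadow[i] = KASAN_KMALLOC_REDZONE; /* redzone after the object */
    }
}

/*
 * Offset of the first invalid byte of an access covering
 * [offset, offset + len), or -1 when every byte is accessible.
 */
long first_bad_offset(const uint8_t *shadow, size_t offset, size_t len)
{
    for (size_t off = offset; off < offset + len; off++) {
        uint8_t s = shadow[off / KASAN_GRANULE_SIZE];

        if (s == 0)
            continue; /* whole group accessible */
        if (s >= KASAN_GRANULE_SIZE || off % KASAN_GRANULE_SIZE >= s)
            return (long)off;
    }
    return -1;
}
```

For a 128-byte slot and a 10-byte object, poison_slot produces 00 02 followed by fourteen fc bytes, matching the row marked in the report. A 10-byte write passes the check, while an 11-byte write fails at offset 10, the 11th byte.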

Use After Free

The use-after-free test provides two verification methods: one for RCU and another for workqueue, mainly because UAF in these two cases is relatively hidden, as follows:

  1. RCU accesses have a grace period, and there should be no error during the grace period, but errors should be reported after synchronization (to avoid false positives).
  2. Workqueue involves asynchronous tasks (which are more likely to occur in practical scenarios).

RCU Form of UAF

First, let’s look at the UAF test code as follows:

static struct kasan_rcu_info {
    int i;
    struct rcu_head rcu;
} *global_rcu_ptr;

static noinline void __init kasan_rcu_reclaim(struct rcu_head *rp)
{
    struct kasan_rcu_info *fp = container_of(rp,
                        struct kasan_rcu_info, rcu);

    kfree(fp);
    fp->i = 1;
}

static noinline void __init kasan_rcu_uaf(void)
{
    struct kasan_rcu_info *ptr;

    pr_info("use-after-free in kasan_rcu_reclaim\n");
    ptr = kmalloc(sizeof(struct kasan_rcu_info), GFP_KERNEL);
    if (!ptr) {
        pr_err("Allocation failed\n");
        return;
    }

    global_rcu_ptr = rcu_dereference_protected(ptr, NULL);
    call_rcu(&global_rcu_ptr->rcu, kasan_rcu_reclaim);
}

Here, memory is allocated with kmalloc, and rcu_dereference_protected is used to load the pointer. call_rcu then registers a callback that runs after the grace period ends (after all readers have finished), and the UAF is performed inside that callback. Unfortunately, KASAN did not detect the UAF after the RCU grace period. Instead, the rcuos/1 kernel thread hit an instruction abort (IABT). A plausible cause, which this log alone does not confirm, is that the callback kasan_rcu_reclaim is marked __init and the module load fails, so its code may already be unmapped when the grace period ends. The IABT log is clear and understandable, so I won’t elaborate further.

[  205.038853] Unable to handle kernel paging request at virtual address ffffffd003815000
[  205.039568] Mem abort info:
[  205.039826]   ESR = 0x86000007
[  205.040107]   EC = 0x21: IABT (current EL), IL = 32 bits
[  205.040581]   SET = 0, FnV = 0
[  205.040861]   EA = 0, S1PTW = 0
[  205.041154] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000004bfc000
[  205.041748] [ffffffd003815000] pgd=00000001ff7ff003, p4d=00000001ff7ff003, pud=00000001ff7ff003, pmd=000000000fa27003, pte=0000000000000000
[  205.042945] Internal error: Oops: 86000007 [#1] SMP
[  205.043380] Modules linked in:
[  205.043673] CPU: 1 PID: 22 Comm: rcuos/1 Tainted: G    B             5.10.198 #92

Workqueue Form of UAF

The workqueue UAF test case demonstrates a common UAF issue, with the relevant code as follows:

static noinline void __init kasan_workqueue_work(struct work_struct *work)
{
    kfree(work);
}

static noinline void __init kasan_workqueue_uaf(void)
{
    struct workqueue_struct *workqueue;
    struct work_struct *work;

    workqueue = create_workqueue("kasan_wq_test");
    if (!workqueue) {
        pr_err("Allocation failed\n");
        return;
    }
    work = kmalloc(sizeof(struct work_struct), GFP_KERNEL);
    if (!work) {
        pr_err("Allocation failed\n");
        return;
    }

    INIT_WORK(work, kasan_workqueue_work);
    queue_work(workqueue, work);
    destroy_workqueue(workqueue);

    pr_info("use-after-free on workqueue\n");
    ((volatile struct work_struct *)work)->data;
}

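Here the work handler frees the work_struct itself. destroy_workqueue waits for the queued work to finish, so by the time work->data is read the object has already been freed, which is a typical UAF in the workqueue scenario. The logs are as follows. Below, I will explain them one by one.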

[   37.721631] ==================================================================
[   37.721639] BUG: KASAN: use-after-free in kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[   37.721644] Read of size 8 at addr ffffff80775af000 by task insmod/2047
[   37.721646] 
[   37.721651] CPU: 5 PID: 2047 Comm: insmod Tainted: G    B             5.10.198 #92
[   37.721655] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT)
[   37.721658] Call trace:
[   37.721664]  dump_backtrace+0x0/0x3bc
[   37.721668]  show_stack+0x1c/0x24
[   37.721672]  dump_stack_lvl+0x130/0x168
[   37.721678]  print_address_description.constprop.0+0x74/0x2b8
[   37.721689]  kasan_report+0x1e8/0x200
[   37.721694]  __asan_report_load8_noabort+0x30/0x5c
[   37.721699]  kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[   37.721704]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   37.721708]  do_one_initcall+0xb0/0x4e0
[   37.721713]  do_init_module+0x14c/0x600
[   37.721717]  load_module+0x5714/0x71fc
[   37.721721]  __do_sys_finit_module+0x110/0x1a0
[   37.721725]  __arm64_sys_finit_module+0x70/0xa0
[   37.721730]  el0_svc_common.constprop.0+0xf0/0x464
[   37.721734]  do_el0_svc+0x44/0x5c
[   37.721737]  el0_svc+0x1c/0x30
[   37.721741]  el0_sync_handler+0xa8/0xac
[   37.721745]  el0_sync+0x158/0x180
[   37.721747] 
[   37.721752] Allocated by task 2047:
[   37.721756]  kasan_save_stack+0x24/0x50
[   37.721765]  __kasan_kmalloc+0x88/0xb0
[   37.721769]  kmem_cache_alloc_trace+0x1d0/0x3c0
[   37.721774]  kasan_workqueue_uaf+0x80/0x158 [test_kasan_module]
[   37.721779]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   37.721783]  do_one_initcall+0xb0/0x4e0
[   37.721787]  do_init_module+0x14c/0x600
[   37.721790]  load_module+0x5714/0x71fc
[   37.721794]  __do_sys_finit_module+0x110/0x1a0
[   37.721799]  __arm64_sys_finit_module+0x70/0xa0
[   37.721803]  el0_svc_common.constprop.0+0xf0/0x464
[   37.721807]  do_el0_svc+0x44/0x5c
[   37.721810]  el0_svc+0x1c/0x30
[   37.721814]  el0_sync_handler+0xa8/0xac
[   37.721817]  el0_sync+0x158/0x180
[   37.721820] 
[   37.721823] Freed by task 227:
[   37.721828]  kasan_save_stack+0x24/0x50
[   37.721832]  kasan_set_track+0x24/0x34
[   37.721841]  kasan_set_free_info+0x24/0x44
[   37.721844]  __kasan_slab_free+0xd8/0x134
[   37.721848]  kfree+0xe0/0x500
[   37.721853]  kasan_workqueue_work+0xc/0x14 [test_kasan_module]
[   37.721858]  process_one_work+0x624/0x1240
[   37.721861]  worker_thread+0x3b8/0xe90
[   37.721865]  kthread+0x2c0/0x344
[   37.721869]  ret_from_fork+0x10/0x18
[   37.721872] 
[   37.721874] Last potentially related work creation:
[   37.721878]  kasan_save_stack+0x24/0x50
[   37.721882]  kasan_record_aux_stack+0xbc/0xd0
[   37.721886]  insert_work+0x54/0x2e0
[   37.721890]  __queue_work+0x3a8/0xca0
[   37.721893]  queue_work_on+0x9c/0xd0
[   37.721898]  kasan_workqueue_uaf+0x114/0x158 [test_kasan_module]
[   37.721903]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   37.721908]  do_one_initcall+0xb0/0x4e0
[   37.721917]  do_init_module+0x14c/0x600
[   37.721921]  load_module+0x5714/0x71fc
[   37.721925]  __do_sys_finit_module+0x110/0x1a0
[   37.721929]  __arm64_sys_finit_module+0x70/0xa0
[   37.721933]  el0_svc_common.constprop.0+0xf0/0x464
[   37.721937]  do_el0_svc+0x44/0x5c
[   37.721940]  el0_svc+0x1c/0x30
[   37.721944]  el0_sync_handler+0xa8/0xac
[   37.721947]  el0_sync+0x158/0x180
[   37.721950] 
[   37.721954] The buggy address belongs to the object at ffffff80775af000
[   37.721954]  which belongs to the cache kmalloc-128 of size 128
[   37.721963] The buggy address is located 0 bytes inside of
[   37.721963]  128-byte region [ffffff80775af000, ffffff80775af080)
[   37.721971] The buggy address belongs to the page:
[   37.721979] page:00000000e417a6e1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x775ae
[   37.721986] head:00000000e417a6e1 order:1 compound_mapcount:0
[   37.721996] flags: 0x10200(slab|head)
[   37.722011] raw: 0000000000010200 ffffffff013e2500 0000000300000003 ffffff8007003c80
[   37.722018] raw: 0000000000000000 0000000080200020 00000001ffffffff ffffff803619b601
[   37.722025] page dumped because: kasan: bad access detected
[   37.722031] 
[   37.722043] Memory state around the buggy address:
[   37.722050]  ffffff80775aef00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   37.722056]  ffffff80775aef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   37.722063] >ffffff80775af000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   37.722069]                    ^
[   37.722078]  ffffff80775af080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   37.722090]  ffffff80775af100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   37.722096] ==================================================================

The first part of the report shows that KASAN detected a use-after-free (UAF): 8 bytes were read from address ffffff80775af000 after the object had been freed.

[   37.721639] BUG: KASAN: use-after-free in kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[   37.721644] Read of size 8 at addr ffffff80775af000 by task insmod/2047

The 8-byte read comes from accessing data, the first member of struct work_struct, which sits at offset 0 and is 8 bytes wide:

((volatile struct work_struct *)work)->data;

(gdb) ptype /o struct work_struct
/* offset    |  size */  type = struct work_struct {
/*    0      |     8 */    atomic_long_t data;
/*    8      |    16 */    struct list_head {
/*    8      |     8 */        struct list_head *next;
/*   16      |     8 */        struct list_head *prev;

                               /* total size (bytes):   16 */
                           }
/*   24      |     8 */    work_func_t func;

                           /* total size (bytes):   32 */
                         }

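The second part of the report provides four stacks: the stack where the bad access was detected, the stack that allocated the memory, the stack that freed it (here kfree called from kasan_workqueue_work in a worker thread), and the last potentially related work creation, that is, the queue_work call. The detection stack looks like this: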

[   28.683508] Call trace:
[   28.683513]  dump_backtrace+0x0/0x3bc
[   28.683532]  show_stack+0x1c/0x24
[   28.683552]  dump_stack_lvl+0x130/0x168
[   28.683566]  print_address_description.constprop.0+0x74/0x2b8
[   28.683579]  kasan_report+0x1e8/0x200
[   28.683592]  __asan_report_load8_noabort+0x30/0x5c
[   28.683608]  kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[   28.683633]  test_kasan_module_init+0x20/0xa78 [test_kasan_module]
[   28.683646]  do_one_initcall+0xb0/0x4e0
[   28.683661]  do_init_module+0x14c/0x600
[   28.683674]  load_module+0x5714/0x71fc
[   28.683689]  __do_sys_finit_module+0x110/0x1a0
[   28.683710]  __arm64_sys_finit_module+0x70/0xa0
[   28.683724]  el0_svc_common.constprop.0+0xf0/0x464
[   28.683737]  do_el0_svc+0x44/0x5c
[   28.683750]  el0_svc+0x1c/0x30
[   28.683763]  el0_sync_handler+0xa8/0xac
[   28.683776]  el0_sync+0x158/0x180

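The third part of the report describes the object and the page it lives in. It first names the slab cache (kmalloc-128) and the position of the access inside the 128-byte object, then dumps the page: the page structure, reference count, mapping count, mapping, index, pfn, head page, compound order, compound mapping count, page flags, and the raw contents of struct page, followed by the reason for the dump.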

[   28.683988] The buggy address belongs to the object at ffffff80775af000
[   28.683988]  which belongs to the cache kmalloc-128 of size 128
[   28.683992] The buggy address is located 0 bytes inside of
[   28.683992]  128-byte region [ffffff80775af000, ffffff80775af080)
[   28.683995] The buggy address belongs to the page:
[   28.684001] page:00000000e417a6e1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x775ae
[   28.684005] head:00000000e417a6e1 order:1 compound_mapcount:0
[   28.684009] flags: 0x10200(slab|head)
[   28.684014] raw: 0000000000010200 ffffffff013e2500 0000000300000003 ffffff8007003c80
[   28.684018] raw: 0000000000000000 0000000080200020 00000001ffffffff ffffff803619b601
[   28.684025] page dumped because: kasan: bad access detected

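The fourth part of the report prints the shadow memory around the buggy address. Each shadow byte describes 8 bytes of memory, so one 16-byte row covers 128 bytes. The byte under the ^ marker is fa, which corresponds to KASAN_KMALLOC_FREETRACK (the object was freed and its free track is recorded); the remaining 15 bytes of the row are fb, which is KASAN_KMALLOC_FREE. In other words, when the object is freed, KASAN poisons the shadow of the entire 128-byte object as fa/fb, and that is how the later read is detected as a UAF. The fc rows on either side are slab redzones (KASAN_KMALLOC_REDZONE).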

[   28.684028] Memory state around the buggy address:
[   28.684036]  ffffff80775aef00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   28.684040]  ffffff80775aef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.684044] >ffffff80775af000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   28.684047]                    ^
[   28.684050]  ffffff80775af080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   28.684054]  ffffff80775af100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   28.684056] ==================================================================

Conclusion

This article walked through a simple KASAN test. KASAN and ASAN share many principles, but they also differ in ways that need to be kept apart. In practice, I found that the other KASAN test cases have been merged into the KUnit testing framework. Although KUnit can be run through “Linux UML on x86”, I will keep testing on the existing hardware for convenience, since modifying code takes less time than recompiling the UML kernel. Next, I will write kernel code based on the other test cases for further practice.

Reference Links

https://www.kernel.org/doc/html/latest/translations/zh_CN/dev-tools/kasan.html
