Understanding Call Trace in ARM64 Stack Backtrace

If you work on a systems team, you will likely run into many kernel panics, and the typical output looks like the log below. Most people know only that it shows the function call flow, but how do you deduce the exact call stack from it? Some will say that running addr2line is enough, but that treats the trace as a black box. This article explains how to analyze such issues more clearly. It covers only the essentials, enough to hold your own on the team!

benshushu:oops# insmod oops1.ko
[  221.598773] oops1: loading out-of-tree module taints kernel.
[  221.612211] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[  221.627333] pc : two+0x14/0x2c [oops1]
[  221.627968] lr : one+0x14/0x28 [oops1]
[  221.628393] sp : ffff0000076c3ad0
[  221.628755] x29: ffff0000076c3ad0 x28: 0000000000000002 x27: ffff800010e97e38
...
[  221.636433] Call trace:
[  221.636752]  two+0x14/0x2c [oops1]
[  221.637163]  one+0x14/0x28 [oops1]
[  221.637511]  main_init+0x14/0x1000 [oops1]
[  221.637957]  do_one_initcall+0xf4/0x288
[  221.638557]  do_init_module+0x64/0x210
[  221.638912]  load_module+0x1d70/0x236c
[  221.639265]  __do_sys_finit_module+0xdc/0xf8
[  221.639618]  __arm64_sys_finit_module+0x20/0x28
[  221.639994]  invoke_syscall+0x88/0x118
[  221.640356]  el0_svc_common.constprop.0+0xc4/0xf8
[  221.640810]  do_el0_svc+0x84/0x8c
[  221.641142]  el0_svc+0x48/0xe8
[  221.641536]  el0t_64_sync_handler+0xb0/0x12c
[  221.641963]  el0t_64_sync+0x158/0x15c
[  221.642394] Code: d503201f a9bf7bfd d2800000 910003fd (b9800001) 

Stack Backtrace

The previous article introduced how the stack is implemented in detail; here is a brief review and summary. It is strongly recommended to read the previous article carefully and practice the deduction yourself. That article also describes how to set up the environment.

The pseudocode below shows the code flow, where lr_xxx represents the return address used once xxx finishes executing.

main {
	...
	one() // main + 0xa
	lr_one // main + 0xa + 0x4
}

one {
	...
	two() // one + 0xb
	lr_two // one + 0xb + 0x4
}

two{
	...
	panic() // two + 0xc
}

The stack frames are linked frame by frame through FP. From the FP of two you can find the stack frame of one, and from the FP of one you can find the stack frame of main. Besides this FP linkage there is another important relationship: lr_xxx = address of the instruction that calls xxx + 0x04, because every ARM64 instruction is 4 bytes long and LR holds the address of the instruction that follows the call.

These two rules give the following deductions, which are applied in the order 3 -> 2 -> 1:

  1. Compute the call site of xxx in each frame: call site of xxx = lr_xxx - 0x04;

  2. lr_xxx is stored at address FP + 0x08. FP and LR each occupy 8 bytes in the frame record, and LR sits at the higher address (see the previous article);

  3. When an exception occurs, the FP of the leaf function is known, so the FP of every stack frame can be deduced, which in turn yields the LR of every stack frame.
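The three rules can be tried out in user space. The following is a minimal sketch, not kernel code: it models each frame record as a struct holding the saved FP and the saved LR (assuming a 64-bit host, so that LR sits 8 bytes above FP), and the entry addresses ONE_ENTRY, MAIN_ENTRY and CALLER_ENTRY are made-up values chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_SIZE 4u /* every A64 instruction is 4 bytes long */

/* Hypothetical function entry addresses, for illustration only. */
#define ONE_ENTRY    0x2000u
#define MAIN_ENTRY   0x3000u
#define CALLER_ENTRY 0x4000u

/* A frame record as laid out on the stack: saved FP first, saved LR above it. */
struct frame_record {
    const struct frame_record *fp; /* caller's frame record, NULL ends the chain */
    uint64_t lr;                   /* return address into the caller */
};

/* Frame records pushed by main, one and two (two is the leaf function). */
static const struct frame_record main_rec = { NULL,      CALLER_ENTRY + 0x8 };
static const struct frame_record one_rec  = { &main_rec, MAIN_ENTRY + 0x14 };
static const struct frame_record two_rec  = { &one_rec,  ONE_ENTRY + 0x14 };

/*
 * Walk the chain starting at the leaf's FP and store the call site of each
 * frame (saved LR minus one instruction) in call_sites. Returns the count.
 */
static int walk_frames(const struct frame_record *fp, uint64_t *call_sites, int max)
{
    int n = 0;

    assert(call_sites != NULL);
    while (fp != NULL && n < max) {
        call_sites[n++] = fp->lr - INSN_SIZE; /* rule 1: call site = lr - 4 */
        fp = fp->fp;                          /* rule 3: follow the FP chain */
    }
    return n;
}
```

Starting from &two_rec, which plays the role of the leaf function's FP at the time of the exception, walk_frames yields one + 0x10, main + 0x10 and the call site in the caller of main, in that order.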

Interpreting Call Trace

  1. Interpreting the pc and lr at the exception site

In the exception context, the key information is in the pc and lr registers:

[  221.627333] pc : two+0x14/0x2c [oops1]
[  221.627968] lr : one+0x14/0x28 [oops1]

These two lines pinpoint where the exception occurred. The pc register gives the line of code where it occurred:

(gdb) l *two+0x14
0x14 is in two (xx/oops.c:14). // Exception occurred at line 14
9
10	void two (void) 
11	{       
12		size_t val = 0x0;
13		val = *(int *)0x0;
14		pr_info("val = 0x%zu\n", val); // Null pointer exception
15	}

The lr register holds the address that the faulting function returns to. That return address is 4 bytes above the instruction that calls two, so the call site is one+0x14-0x04 = one+0x10:

(gdb) l *one+0x10
0x3c is in one (xxx/oops.c:20). // Get line 20 where two function is called
15	}
16	EXPORT_SYMBOL(two);
17	
18	void one (void) 
19	{       
20		two(); // Call two
21		printk("lr two");
22	}

So the exception site tells us this: line 20 of the one function called the two function, and the exception occurred at line 14 of the two function.

  2. Interpreting the call trace

The previous step gives the most direct information about the exception, but in real projects that may not be enough to solve the problem. If the null pointer is a parameter that was passed down through several layers of calls, you need the call trace to determine which layer it came from.

[  221.636433] Call trace:
[  221.636752]  two+0x14/0x2c [oops1]
[  221.637163]  one+0x14/0x28 [oops1]
[  221.637511]  main_init+0x14/0x1000 [oops1]
...

Only the module's own frames are shown here; the common kernel part is omitted. The call trace is a frame chain printed frame by frame, so is each entry an FP chain address, an LR, a PC, or some other address? Answering this question is crucial for interpreting the call trace: if you resolve every func + offset exactly as printed, some of the results will be wrong. The answer is that the entry for the leaf function is the value of the PC register, and every other entry is an LR value.

Therefore, apart from the leaf function's exception address, the call chain is reconstructed by subtracting 4 bytes from each LR address. Resolving the entries with gdb gives:

// Subtract 4 bytes from LR address, line 28 calls one
(gdb) l *main_init + 0x10
0x64 is in main_init (xxx/oops.c:28).  
23	EXPORT_SYMBOL(one);
24	
25	static int __init main_init(void)
26	{
27	
28		one();
29		printk("lr one");
30		return 0;
31	}
// Subtract 4 bytes from LR address, line 20 calls two
(gdb) l *one+0x10
0x3c is in one (xxx/oops.c:20).
15	}
16	EXPORT_SYMBOL(two);
17	
18	void one (void) 
19	{       
20		two();
21		printk("lr two");
22	}
// Leaf function PC register, no need to correct, exception at line 14
(gdb) l *two+0x14
0x14 is in two (xxx/oops.c:14).  
9
10	void two (void) 
11	{       
12		size_t val = 0x0;
13		val = *(int *)0x0;
14		pr_info("val = 0x%zu\n", val);
15	}

From the above parsing results, we get the chain: main_init(28) -> one(20) -> two(14).
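The same correction can be scripted. The sketch below is a hypothetical helper, not part of any kernel tool: it parses one call trace entry of the form func+0xoff/0xsize (with the timestamp prefix already stripped) and produces the matching gdb list command, subtracting 4 bytes for every entry except the leaf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Convert one call trace entry such as "one+0x14/0x28 [oops1]" into a gdb
 * command. The leaf entry is a PC and is used as is; every other entry is
 * an LR, so 4 bytes are subtracted to reach the call instruction.
 * Returns 0 on success, -1 if the entry cannot be parsed or does not fit.
 */
static int trace_entry_to_gdb(const char *entry, int is_leaf,
                              char *out, size_t outlen)
{
    char func[64];
    unsigned long off, size;
    int len;

    assert(entry != NULL && out != NULL);

    /* " %63[^+]" skips leading blanks and reads the symbol name up to '+'. */
    if (sscanf(entry, " %63[^+]+%lx/%lx", func, &off, &size) != 3)
        return -1;

    if (!is_leaf) {
        if (off < 4)
            return -1; /* an LR cannot point at the function's first instruction */
        off -= 4;
    }

    len = snprintf(out, outlen, "l *%s+0x%lx", func, off);
    return (len < 0 || (size_t)len >= outlen) ? -1 : 0;
}
```

For the trace above, the one entry becomes l *one+0x10 and the leaf entry stays l *two+0x14, which are the commands used in the gdb session.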

Call Trace Principles

The following walkthrough uses kernel version 5.15. When a system exception occurs, the call trace is printed. The common scenarios are:

  1. An exception raised by code while the kernel is running;

  2. A trace triggered deliberately when a program condition is not met, such as BUG_ON();

  3. A trace triggered deliberately through /proc/sysrq-trigger.

The output is produced by the dump_backtrace function. In 5.15 it lives in linux/arch/arm64/kernel/stacktrace.c (older kernels keep it in linux/arch/arm64/kernel/traps.c).


dump_backtrace is reached through two calling paths: show_regs and show_stack. As the function names suggest, the two paths pass different regs and tsk arguments, so the implementation checks these parameters. Some scenarios:

  1. An ordinary panic caused by an exception: the register information is needed, so regs is not empty and tsk is empty;

  2. The sysrq c command (trigger a panic): both regs and tsk are empty;

  3. The sysrq w command (show blocked tasks): the blocked task has to be dumped, so tsk is not empty and regs is empty.

Here, we will explain the most common panic path:

  1. If regs is not empty and describes user mode, there is nothing to dump, so return immediately;

  2. If try_get_task_stack cannot obtain the task's kernel stack, return immediately;

  3. The first frame is initialized differently depending on whether the task is current: (a) if it is current, frame starts from the stack frame of the current function (dump_backtrace); (b) if it is not current, the task has been scheduled out by __switch_to, so the context saved in task.thread is used instead;

  4. The first frame is initialized from the fp and pc chosen in step 3. The start_backtrace function mainly assigns a value to each member of frame, so the frame structure needs explaining.


Field: Meaning

  • fp: The frame pointer of the frame being processed. Each unwind step reloads it as *(unsigned long *)(fp).

  • pc: What is actually recorded here is lr, loaded as *(unsigned long *)(fp + 8); fp + 8 is used because lr is stored 8 bytes above fp. This field is what the call trace prints.

  • stacks_done: A bitmap in which each bit indicates that the backtrace of the corresponding stack type has been completed.

  • prev_fp: The fp of the frame just processed is saved in prev_fp and serves as the prev_fp of the next frame. The backtrace starts from the child function, so the parent function's prev_fp points to the child function, and the addresses keep increasing.

  • prev_type: The type of the stack that the previous frame was on. Stacks can be nested, such as STACK -> IRQ; when the backtrace moves from one type to another, the previous type's bit in stacks_done is set to 1.

  5. The loop has two branches. When an oops occurs, regs is not empty, so the frames that belong to the exception-handling path itself are skipped until frame.fp equals the x29 value saved in regs. At that point branch 2 runs: dump_backtrace_entry is passed regs->pc, which is the pc register of the first (leaf) frame, so no 4 bytes need to be subtracted. Every later iteration takes branch 1. The key function that walks the stack frames is unwind_frame, which is analyzed separately below.

do {
	if (!skip) { // Branch 1
		dump_backtrace_entry(frame.pc, loglvl);
	} else if (frame.fp == regs->regs[29]) { // Branch 2
		skip = 0;
		dump_backtrace_entry(regs->pc, loglvl);
	}
} while (!unwind_frame(tsk, &frame));
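
To see the effect of the two branches, the loop can be replayed in user space. This is only a sketch under stated assumptions: struct frame_model and struct regs_model are hypothetical stand-ins for the kernel's frame and regs, and the frames array holds the (fp, pc) pairs that successive unwind_frame calls would have produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures used by the loop. */
struct frame_model { uint64_t fp, pc; };
struct regs_model  { uint64_t x29, pc; };

/*
 * Replay the dump loop over an already unwound list of frames.
 * With regs set, nothing is recorded until a frame's fp matches regs->x29;
 * that frame is recorded as regs->pc, and every later frame as its saved pc.
 * With regs == NULL nothing is skipped. Returns the number of entries.
 */
static int collect_trace(const struct frame_model *frames, int nframes,
                         const struct regs_model *regs,
                         uint64_t *out, int max)
{
    int skip = (regs != NULL);
    int n = 0;

    assert(frames != NULL && out != NULL);
    for (int i = 0; i < nframes && n < max; i++) {
        if (!skip) {                             /* branch 1 */
            out[n++] = frames[i].pc;
        } else if (frames[i].fp == regs->x29) {  /* branch 2 */
            skip = 0;
            out[n++] = regs->pc;
        }
    }
    return n;
}
```

With regs describing the faulting context, the entries belonging to the exception-handling path never reach the output, and the first recorded entry is the exact pc of the exception rather than a return address.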

  6. The unwind_frame function unwinds the stack frames, that is, it finds each parent stack frame step by step through the fp value. Some key points of its implementation are explained below.

1	int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
2	{
3		unsigned long fp = frame->fp;
4		struct stack_info info;
		...
5		if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
6			return -ENOENT;

7		if (fp & 0x7)
8			return -EINVAL;
		...
9		if (test_bit(info.type, frame->stacks_done))
10			return -EINVAL;

11		if (info.type == frame->prev_type) {
12			if (fp <= frame->prev_fp)
13				return -EINVAL;
14		} else {
15			set_bit(frame->prev_type, frame->stacks_done);
16		}

17		frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
18		frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
19		frame->prev_fp = fp;
20		frame->prev_type = info.type;
}
  • Line 5: pt_regs has a stackframe member that represents the final stack frame. If fp equals stackframe, the last frame has been reached and the function returns;

  • Line 7: fp & 0x7 checks the alignment of fp, which must be 8-byte aligned. If it is not, the function returns an error;

  • Line 9: Checks whether the bit for the current stack type is already set in stacks_done. If it is 1, that stack type has already been fully unwound, so the function returns an error;

  • Lines 11-12: If the current stack type equals the previous frame's type, that stack has not been fully unwound yet. If fp is less than or equal to prev_fp, the frame breaks the rule that addresses increase during the backtrace, so the function returns an error;

  • Line 15: If the current stack type differs from the previous frame's type, the stack type has changed and the previous stack type is finished, so its bit in the stacks_done bitmap is set to 1;

  • Line 17: Loads the address of the parent function's stack frame. As described in the previous article, this allows the backtrace to proceed step by step;

  • Line 18: Loads the value that the call trace prints. fp + 8 is where lr is stored, so the printed func + offset is an lr address, which has to be converted to obtain the actual call site;

  • Line 19: Saves the current frame's fp in prev_fp, to serve as the prev_fp of the next frame.
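
The checks above can be exercised with a simplified user-space model of one unwind step. This is a sketch under assumptions, not the kernel implementation: it handles a single stack only, so the stack type bookkeeping (stacks_done, prev_type) is left out, a plain bounds check stands in for the kernel's stack accessibility check, and struct frame_state and struct stack_model are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of the unwinder state. */
struct frame_state { uint64_t fp, pc, prev_fp; };

/* A fake stack: words[0] lives at address base; final_fp ends the walk. */
struct stack_model {
    const uint64_t *words;
    size_t nwords;
    uint64_t base;
    uint64_t final_fp;
};

static int unwind_step(struct frame_state *f, const struct stack_model *s)
{
    uint64_t fp;
    size_t idx;

    assert(f != NULL && s != NULL);
    fp = f->fp;

    if (fp == s->final_fp)          /* last frame reached, nothing to unwind */
        return -ENOENT;
    if (fp & 0x7)                   /* fp must be 8-byte aligned */
        return -EINVAL;
    if (fp < s->base || fp - s->base + 16 > s->nwords * 8)
        return -EINVAL;             /* the 16-byte record must lie on the stack */
    if (fp <= f->prev_fp)           /* records must move towards higher addresses */
        return -EINVAL;

    idx = (size_t)((fp - s->base) / 8);
    f->fp = s->words[idx];          /* parent's frame record */
    f->pc = s->words[idx + 1];      /* saved lr, stored 8 bytes above fp */
    f->prev_fp = fp;
    return 0;
}
```

Calling unwind_step repeatedly until it returns a non-zero value mirrors the while (!unwind_frame(tsk, &frame)) loop: -ENOENT marks a clean end of the chain, while -EINVAL stops the walk on a record that looks corrupted.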
