Translated by: Linux China / Lv Feng; original English by: Julia Evans

https://linux.cn/article-9588-1.html

This week, I discovered that I could call C functions from gdb. This seemed cool because I previously thought that gdb was just a read-only debugging tool at best.

I was surprised that gdb could call functions. As usual, I asked on Twitter how this works. I received a lot of useful answers. My favorite answer was Evan Klitzke’s example C code, which demonstrated how gdb calls functions. The code was able to run, which was exciting!

After some tracing and experimentation, I believe that the example C code works differently from how gdb actually calls functions. So in this article I will explain how gdb calls functions and how I found out.

There are many things I still don’t know about how gdb calls functions, and what I write here may be incorrect.

What Does Calling C Functions from GDB Mean?

Before explaining how this works, let me quickly explain why I found it surprising.

Suppose you are running a C program (the target program). You can run a function in the program by simply doing the following:

Pause the program (since it is already running)
Find the address of the function you want to call (using the symbol table)
Make the program (target program) jump to that address
When the function returns, restore the previous instruction pointer and registers

Finding the address of the function you want to call through the symbol table is very easy. Below is a very simple yet working piece of code that I used on Linux to explain how to find the address. This code uses the elf crate. If I want to find the address of the foo function in the process with PID 2345, I can run elf_symbol_value("/proc/2345/exe", "foo").

fn elf_symbol_value(file_name: &str, symbol_name: &str) -> Result<u64, Box<dyn std::error::Error>> {
    // Open the ELF file
    let file = elf::File::open_path(file_name).ok().ok_or("parse error")?;
    // Loop through all sections & symbols until the correct one is found
    let sections = &file.sections;
    for s in sections {
        for sym in file.get_symbols(&s).ok().ok_or("parse error")? {
            if sym.name == symbol_name {
                return Ok(sym.value);
            }
        }
    }
    // No section contained a symbol with this name
    Err("No symbol found".into())
}

This won’t quite work on its own: you also need to find the memory mapping of the file and add the symbol offset to the base address of that mapping. Finding the memory mapping is not difficult; it is listed in /proc/PID/maps.
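As a rough illustration of that last step, here is a minimal C sketch that parses one line of /proc/PID/maps and adds a symbol offset to the mapping's base. The function names maps_line_base and runtime_address are hypothetical, and the sketch assumes the symbol value is an offset from the start of the file's first mapping (file offset 0), as it is for shared libraries and position-independent executables. Paths containing spaces are not handled.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse one line of /proc/PID/maps, which has the form
 *   "start-end perms offset dev inode path"
 * If the line maps `path` at file offset 0, store the mapping's start
 * address in *base and return 1; otherwise return 0.
 * (Hypothetical helper, not part of any library.) */
int maps_line_base(const char *line, const char *path, uint64_t *base)
{
    unsigned long long start, end, offset, inode;
    char perms[8], dev[16], file[4096];
    int n;

    n = sscanf(line, "%llx-%llx %7s %llx %15s %llu %4095s",
               &start, &end, perms, &offset, dev, &inode, file);
    /* Anonymous mappings have no path, so sscanf converts only 6 fields. */
    if (n < 7 || offset != 0 || strcmp(file, path) != 0)
        return 0;
    *base = start;
    return 1;
}

/* Runtime address = base of the file's mapping + symbol offset in the file. */
uint64_t runtime_address(uint64_t base, uint64_t symbol_offset)
{
    return base + symbol_offset;
}
```

In practice you would read /proc/PID/maps line by line with fgets, pass each line to maps_line_base, and stop at the first match.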

In summary, finding the address of the function you want to call was straightforward for me, but the rest (changing the instruction pointer, restoring registers, etc.) seemed less obvious.

You Can’t Just Jump

As I said, you can’t just find the address of the function you want to run and jump to it. I tried that in gdb (jump foo), and the program crashed with a segmentation fault. That makes sense: a bare jump sets up no return address and saves no registers.

How to Call C Functions from GDB

First, it is possible. I wrote a very small C program that does nothing but sleep for 1000 seconds, and saved it as test.c:

#include <unistd.h>

int foo() {
    return 3;
}

int main() {
    sleep(1000);
}

Next, compile and run it:

$ gcc -o test test.c
$ ./test

Finally, we attach gdb to the running test program:

$ sudo gdb -p $(pgrep -f test)
(gdb) p foo()
$1 = 3
(gdb) quit

I ran p foo(), and it executed the function! This is very interesting.

What Is This Useful For?

Here are some possible uses:

It allows you to use gdb as a C REPL, which is fun and I think useful for development
Calling functions that display or navigate complex data structures while debugging in gdb (thanks to @invalidop)
Setting an arbitrary process’s namespace while it is running (my colleague nelhage was very surprised by this)
There may be many other uses I am unaware of
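To illustrate the second use in the list above, here is a minimal sketch of a helper that could be compiled into a program so that it is available to call from the debugger. The struct node type and the names list_format and list_dump are hypothetical, chosen only for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* A hypothetical linked list, standing in for a "complex data structure". */
struct node {
    int value;
    struct node *next;
};

/* Format the list as "1 -> 2 -> 3" into buf (at most size bytes, always
 * NUL-terminated when size > 0). Returns buf so the result is easy to print. */
const char *list_format(const struct node *head, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return buf;
    buf[0] = '\0';
    for (const struct node *n = head; n != NULL && used < size; n = n->next) {
        int w = snprintf(buf + used, size - used, "%s%d",
                         n == head ? "" : " -> ", n->value);
        if (w < 0)
            break;
        used += (size_t)w;  /* may reach size on truncation; the loop then stops */
    }
    return buf;
}

/* Convenience wrapper meant to be called by hand from a debugger. */
void list_dump(const struct node *head)
{
    char buf[256];
    fprintf(stderr, "%s\n", list_format(head, buf, sizeof buf));
}
```

With the program stopped in gdb, a command such as call list_dump(head) would then print the list, assuming a variable named head is in scope at that point.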

How It Works

When I asked on Twitter how to call functions from gdb, I received many useful answers. Many answers were, “You get the function’s address from the symbol table,” but that is not the complete answer.

Someone pointed me to a two-part series of articles on how gdb works: Native Debugging: Part One and Native Debugging: Part Two. Part one explains how gdb calls functions (what gdb actually does is complicated, but I will do my best to summarize it).

The steps are as follows:

Stop the process
Create a new stack frame (away from the real stack)
Save all registers
Set the register parameters for the function you want to call
Set the stack pointer to point to the new stack frame
Place a trap instruction at some location in memory
Set the return address to point at the trap instruction
Set the instruction pointer to the address of the function you want to call
Run the process again!
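The register bookkeeping in the steps above can be sketched in C as a pure computation, without touching a real process. This is not gdb's actual code: struct regs, struct call_setup and plan_call are hypothetical names, and the frame layout (skip a 128-byte red zone, align to 16 bytes, reserve one slot for the return address) is an assumption based on the x86-64 System V calling convention, so the resulting addresses need not match what gdb produces.

```c
#include <stdint.h>

/* A hypothetical, trimmed-down register set (x86-64 names). A real tracer
 * would read and write the full set with ptrace. */
struct regs {
    uint64_t rip;  /* instruction pointer */
    uint64_t rsp;  /* stack pointer */
    uint64_t rdi;  /* first integer argument */
    uint64_t rax;  /* integer return value */
};

/* What the tracer must apply to the stopped process to start the call. */
struct call_setup {
    struct regs new_regs;  /* registers to load before resuming */
    uint64_t ret_slot;     /* stack address that holds the return address */
    uint64_t ret_value;    /* value to write there: the trap's address */
};

/* Plan a call to func(arg) that returns into a trap instruction at trap_addr.
 * `saved` is the register snapshot taken when the process was stopped; it is
 * left untouched so it can be restored after the trap is hit. */
struct call_setup plan_call(const struct regs *saved, uint64_t func,
                            uint64_t arg, uint64_t trap_addr)
{
    struct call_setup s;
    uint64_t sp = saved->rsp;

    sp -= 128;             /* assumption: skip the 128-byte red zone */
    sp &= ~(uint64_t)0xf;  /* align the new frame to 16 bytes */
    sp -= 8;               /* slot for the return address, as a call would push */

    s.new_regs = *saved;
    s.new_regs.rdi = arg;     /* set the register parameter */
    s.new_regs.rsp = sp;      /* point the stack pointer at the new frame */
    s.new_regs.rip = func;    /* resume execution at the function */
    s.ret_slot = sp;
    s.ret_value = trap_addr;  /* the function "returns" into the trap */
    return s;
}
```

After the process hits the trap, the tracer would read the return value from rax and then write the saved snapshot back, so the program continues as if nothing had happened.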

(LCTT Note: If you consider the called function as a separate thread, what gdb actually does is a simple thread context switch)

I don’t know how gdb accomplishes all these tasks, but tonight I learned a few of them.

Creating a Stack Frame

If you want to run a C function, you need a stack to store its variables, and you certainly don’t want it to overwrite data on the current stack. Specifically, before gdb calls the function (by setting the instruction pointer to the function and resuming the process), it needs to set the stack pointer to... somewhere.

Here are some guesses from Twitter about how it works:

I think it constructs a new stack frame on top of the current stack for the call!

And

Are you sure? It should allocate a pseudo-stack and temporarily change the value of sp (the stack pointer register) to that stack’s address. You can try it: set a breakpoint there and see whether the stack pointer’s value is close to the current program’s stack addresses.

I did an experiment with gdb:

(gdb) p $rsp
$7 = (void *) 0x7ffea3d0bca8
(gdb) break foo
Breakpoint 1 at 0x40052a
(gdb) p foo()
Breakpoint 1, 0x000000000040052a in foo ()
(gdb) p $rsp
$8 = (void *) 0x7ffea3d0bc00

This seems to support the theory that “gdb constructs a new stack frame on top of the current stack”: the stack pointer ($rsp) changed from 0x7ffea3d0bca8 to 0x7ffea3d0bc00. The stack grows from high addresses to low ones, so 0x7ffea3d0bc00 lies just past 0x7ffea3d0bca8 in the direction of growth, only 0xa8 (168) bytes away. How interesting!

Thus, it seems that gdb simply creates a new stack frame at the current stack’s location. This surprised me!
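The arithmetic behind that conclusion is small enough to write down. The sketch below uses a hypothetical helper, bytes_below, to compute how far one stack pointer value lies below another. Applied to the two $rsp values from the experiment it gives 0xa8, which is 168 bytes. A separately allocated stack would normally be expected to sit much further away.

```c
#include <stdint.h>

/* On x86-64 the stack grows toward lower addresses, so a frame created for a
 * nested call has a smaller stack pointer than its caller. Returns how many
 * bytes inner_sp lies below outer_sp, or 0 if it is not below it.
 * (Hypothetical helper for illustration.) */
uint64_t bytes_below(uint64_t outer_sp, uint64_t inner_sp)
{
    return inner_sp < outer_sp ? outer_sp - inner_sp : 0;
}
```

For example, bytes_below(0x7ffea3d0bca8, 0x7ffea3d0bc00) evaluates to 168.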

Changing the Instruction Pointer

Let’s take a look at how gdb changes the instruction pointer!

(gdb) p $rip
$1 = (void (*)()) 0x7fae7d29a2f0 <__nanosleep_nocancel+7>
(gdb) b foo
Breakpoint 1 at 0x40052a
(gdb) p foo()
Breakpoint 1, 0x000000000040052a in foo ()
(gdb) p $rip
$3 = (void (*)()) 0x40052a <foo+4>

Indeed! The instruction pointer changed from 0x7fae7d29a2f0 to 0x40052a (an address inside the foo function, shown as foo+4).

I stared at the output for a long time but still did not understand how it changed the instruction pointer, but that doesn’t matter.

How to Set Breakpoints

Above I used break foo. I traced gdb’s system calls while it ran the program and understood very little of the output, but the breakpoint part made sense.

Below are some of the system calls gdb uses to set a breakpoint. They are quite simple. It replaces one instruction byte with cc (https://defuse.ca/online-x86-assembler.html tells us that cc is int3, which means send SIGTRAP), and once the program is interrupted, it restores the instruction to its original state.

I set the breakpoint on the function foo. The breakpoint address 0x40052a lies inside the 8-byte word at address 0x400528, which is the word gdb reads and writes.

PTRACE_POKEDATA is how gdb modifies the running program.

// Change the instruction at 0x400528
25622 ptrace(PTRACE_PEEKTEXT, 25618, 0x400528, [0x5d00000003b8e589]) = 0
25622 ptrace(PTRACE_POKEDATA, 25618, 0x400528, 0x5d00000003cce589) = 0

// Start running the program
25622 ptrace(PTRACE_CONT, 25618, 0x1, SIG_0) = 0

// Get a signal when reaching the breakpoint
25622 ptrace(PTRACE_GETSIGINFO, 25618, NULL, {si_signo=SIGTRAP, si_code=SI_KERNEL, si_value={int=-1447215360, ptr=0x7ffda9bd3f00}}) = 0

// Change the instruction at 0x400528 back to its original state
25622 ptrace(PTRACE_PEEKTEXT, 25618, 0x400528, [0x5d00000003cce589]) = 0
25622 ptrace(PTRACE_POKEDATA, 25618, 0x400528, 0x5d00000003b8e589) = 0
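The two words in the trace differ in exactly one byte. The sketch below reproduces that edit as plain arithmetic, with no ptrace involved. The helper names patch_byte and byte_at are hypothetical, and the code assumes a little-endian machine such as x86-64, where the byte at address word_addr + i occupies bits 8*i to 8*i+7 of the word.

```c
#include <stdint.h>

#define INT3 0xccu  /* x86 one-byte breakpoint instruction */

/* ptrace peeks and pokes whole 8-byte words. Return `word` (read from
 * word_addr) with the byte at address `target` replaced by `value`.
 * Assumes word_addr <= target < word_addr + 8 and little-endian layout. */
uint64_t patch_byte(uint64_t word, uint64_t word_addr, uint64_t target,
                    uint8_t value)
{
    unsigned shift = (unsigned)(target - word_addr) * 8;
    uint64_t mask = (uint64_t)0xff << shift;
    return (word & ~mask) | ((uint64_t)value << shift);
}

/* Extract the original byte at `target`, so it can be restored later. */
uint8_t byte_at(uint64_t word, uint64_t word_addr, uint64_t target)
{
    return (uint8_t)(word >> ((target - word_addr) * 8));
}
```

Calling patch_byte(0x5d00000003b8e589, 0x400528, 0x40052a, INT3) yields 0x5d00000003cce589, the value written by the first PTRACE_POKEDATA above. The byte it overwrites is b8, which is what the final PTRACE_POKEDATA puts back.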

Placing a Trap Instruction Somewhere

When gdb runs a function, it also places a trap instruction somewhere. This is one of them. It basically replaces an instruction with cc (int3).

5908 ptrace(PTRACE_PEEKTEXT, 5810, 0x7f6fa7c0b260, [0x48f389fd89485355]) = 0
5908 ptrace(PTRACE_PEEKTEXT, 5810, 0x7f6fa7c0b260, [0x48f389fd89485355]) = 0
5908 ptrace(PTRACE_POKEDATA, 5810, 0x7f6fa7c0b260, 0x48f389fd894853cc) = 0

What is 0x7f6fa7c0b260? I checked the process’s memory mappings and found that it is somewhere in /lib/x86_64-linux-gnu/libc-2.23.so. It is strange that gdb places a trap instruction in libc.
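Checking which mapping an address belongs to can also be done mechanically. Here is a minimal sketch, using a hypothetical function maps_line_contains, that tests whether an address falls inside the range at the start of one /proc/PID/maps line. Running it over every line of the file and printing the matching line would show which file, if any, backs the address.

```c
#include <stdint.h>
#include <stdio.h>

/* Return 1 if `addr` falls inside the address range at the start of a line
 * of /proc/PID/maps ("start-end perms offset dev inode path"), else 0.
 * The end address is exclusive. (Hypothetical helper for illustration.) */
int maps_line_contains(const char *line, uint64_t addr)
{
    unsigned long long start, end;

    if (sscanf(line, "%llx-%llx", &start, &end) != 2)
        return 0;
    return addr >= start && addr < end;
}
```

The mapping ranges for the process in this article are not shown, so any concrete line passed to this function would be made-up test data rather than the real layout.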

Let’s see what function is inside; it is __libc_siglongjmp. Other functions where gdb places trap instructions include __longjmp, ___longjmp_chk, dl_main, and _dl_close_worker.

Why? I don’t know! Maybe for some reason, when the function foo() returns, it calls longjmp, so gdb can control the return. I'm not sure.

Calling Functions from GDB is Complex!

I will stop here (it’s already 1 AM), but I have learned a bit more!

It seems that the answer to the question of “how gdb calls functions” is not simple. I found it interesting and made an effort to figure out some answers, and I hope you can too.

I still have many unanswered questions about how gdb accomplishes all these tasks, but that’s enough. I don’t need to know all the details about how gdb works, but I’m happy that I have gained some further understanding.
