Understanding C++ Exception Mechanism: A Small Program Example

AliMei Introduction

The author encountered C++ exceptions while investigating a bug, and takes this opportunity to clarify the C++ exception mechanism for everyone’s reference.

Recently, while investigating a bug, we ran into C++ exceptions. Since we rarely use them, this was a good opportunity to work out how the C++ exception mechanism operates. There is little material on the topic online, and most of it is dense and hard to follow, so I am writing this document as a memo.
The implementation mechanisms of C++ exceptions include SJLJ (setjmp/longjmp), DWARF CFI, and EHABI. Which one is used depends on the operating system and architecture, and it is part of the C++ ABI. Here, we will focus only on DWARF CFI, which is the default implementation on Linux for x86_64 and arm64.
The complete C++ exception mechanism requires collaboration between compiler-generated code, the C++ runtime (libstdc++ or libc++), and the unwind library. This document aims to describe the mechanism in an understandable way and does not distinguish between these three components.

Test Program

We will start with the following small program to analyze the implementation principles of C++ exceptions. This program demonstrates several key points:
1. f() allocates an exception object and throws it;
2. The runtime unwinds the stack frames, destroying the objects on the stack of g() along the way;
3. main() matches the catch statement and handles the exception.
#include <stdio.h>
struct A {
    A() { printf("A\n"); }
    ~A() { printf("~A\n"); }
};
struct E {
    E() { printf("E\n"); }
    ~E() { printf("~E\n"); }
};
void f() {
    throw E();
}
void g() {
    A a;
    f();
}
int main() {
    try {
        g();
    } catch (int n) {
        printf("catch int %d\n", n);
    } catch (const E& e) {
        printf("catch E %p\n", (const void*)&e);
    }
    return 0;
}
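To make the order of the three steps easy to check, here is a small variant of the test program that records events in a list instead of printing them. The `events` log and the `run_demo()` helper are names introduced for this illustration only; they are not part of the original program. It is a sketch that assumes the compiler constructs the exception object in place for `throw E()` (guaranteed since C++17, and the default behavior of GCC and Clang before that).

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used instead of printf so the order can be inspected.
static std::vector<std::string> events;

struct A {
    A()  { events.push_back("A"); }
    ~A() { events.push_back("~A"); }
};

struct E {
    E()  { events.push_back("E"); }
    ~E() { events.push_back("~E"); }
};

void f() { throw E(); }

void g() {
    A a;   // destroyed while the stack unwinds through g()
    f();
}

// Runs the scenario once and returns the recorded sequence of events.
std::vector<std::string> run_demo() {
    events.clear();
    try {
        g();
    } catch (int) {
        events.push_back("catch int");
    } catch (const E&) {
        events.push_back("catch E");
    }
    return events;  // the exception object was destroyed when the handler exited
}
```

Under the stated assumption, `run_demo()` returns A, E, ~A, catch E, ~E: the A object in g() is destroyed before the handler in the caller runs, and the exception object is destroyed only when the handler finishes.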

Throwing Exceptions

For ease of description, we will write the exception-related code generated by the compiler in C-like pseudocode. (A little tip: you can inspect the assembly generated by various compilers on the Compiler Explorer website.)
Let’s first look at the f() function, which throws an exception of type E and does nothing else.
void f(){
    // throw E();
    E* e = __cxa_allocate_exception(sizeof(struct E));  // Allocate exception object from heap
    e->E();                 // Construct exception object
    __cxa_throw(            // Throw exception
        e,                  // Exception object
        &typeid(struct E),  // Type of the exception object, a static object generated at compile time
        &E::~E);            // Destructor of the exception object
}

These __cxa-prefixed functions are provided by the C++ runtime library.

__cxa_allocate_exception() allocates the exception object and other internal data structures from the heap.
__cxa_throw() unwinds the stack frames, going back to g() and main().
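The two runtime calls can also be invoked by hand. The following sketch assumes an Itanium-ABI runtime (GCC or Clang on Linux) whose `<cxxabi.h>` declares `abi::__cxa_allocate_exception` and `abi::__cxa_throw`; the `Err` type and the helpers `destroy_err`, `manual_throw`, and `catch_manual_throw` are hypothetical names made up for this example. It is meant only to show that an ordinary catch clause accepts an exception raised this way, not as something to use in production code.

```cpp
#include <cxxabi.h>   // abi::__cxa_allocate_exception, abi::__cxa_throw
#include <new>        // placement new
#include <typeinfo>   // typeid, std::type_info

struct Err {
    int code;
    explicit Err(int c) : code(c) {}
};

static int destroyed = 0;  // how often the runtime called destroy_err

// Destructor thunk handed to the runtime; it is called when the
// exception object is released after the handler finishes.
static void destroy_err(void* p) {
    static_cast<Err*>(p)->~Err();
    ++destroyed;
}

// Hand-written equivalent of `throw Err(code);`.
[[noreturn]] void manual_throw(int code) {
    void* raw = abi::__cxa_allocate_exception(sizeof(Err));  // allocate
    new (raw) Err(code);                                     // construct
    abi::__cxa_throw(raw,                                    // throw
                     const_cast<std::type_info*>(&typeid(Err)),
                     destroy_err);
}

// Catches the manually thrown object with a regular catch clause.
int catch_manual_throw(int code) {
    try {
        manual_throw(code);
    } catch (const Err& e) {
        return e.code;
    }
    return -1;  // not reached: manual_throw never returns normally
}
```

Calling `catch_manual_throw(42)` returns 42, and `destroyed` is incremented once, because the runtime invokes the destructor thunk that was passed to `__cxa_throw`.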

Propagation of Exceptions

Now let’s look at g(). g() does not have a catch statement, so the exception will continue to propagate. However, there is an object a on the stack, so g() needs to destroy this object while its stack frame is being unwound.
This introduces a concept: the landing pad. In the following code, the first a.~A() and the goto that follow the call to f() are the execution path for a normal return from f(). If f() throws an exception, control instead jumps to the second a.~A(). This entry point is referred to as a landing pad. There, the object a is destroyed, and the call to _Unwind_Resume() continues unwinding to main().
void g() {
    // A a;
    A a;    // Allocate object a on the stack
    a.A();  // Construct object a
    // f();
    f();    // Call f()
    a.~A(); // If f() returns normally, execution reaches here
    goto end_of_catch;
    // If f() throws an exception, it jumps here.
    // Although g() has no catch, a still needs to be destroyed, so g() has a landing pad.
    // At this point, rax points to the exception object header, rdx indicates the action.
    void *e = rax;      // Exception object header handed over by the unwinder
    a.~A();             // Destroy object a
    _Unwind_Resume(e);  // No matching catch, continue to unwind the stack frame
end_of_catch:
    return;
}
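The effect of the landing pad can be observed from ordinary C++ code. The sketch below uses `std::uncaught_exceptions()` (C++17) inside a destructor to tell whether the destructor is running on the normal return path or during unwinding. `Guard`, `may_throw`, `middle`, and `observe` are hypothetical names chosen for this illustration.

```cpp
#include <exception>  // std::uncaught_exceptions (C++17)
#include <stdexcept>  // std::runtime_error

// Value of std::uncaught_exceptions() seen by the last Guard destructor.
static int seen_in_destructor = -1;

struct Guard {
    ~Guard() { seen_in_destructor = std::uncaught_exceptions(); }
};

void may_throw(bool fail) {
    if (fail) throw std::runtime_error("boom");
}

// Plays the role of g(): it has no catch, only a local object to clean up.
void middle(bool fail) {
    Guard guard;
    may_throw(fail);
}

// Returns what the destructor observed: 0 on the normal path,
// 1 when it ran while an exception was propagating.
int observe(bool fail) {
    seen_in_destructor = -1;
    try {
        middle(fail);
    } catch (const std::runtime_error&) {
        // swallow the exception; only the destructor's observation matters
    }
    return seen_in_destructor;
}
```

`observe(false)` returns 0 and `observe(true)` returns 1, which corresponds to the two destructor calls in the pseudocode: one on the normal path and one in the landing pad.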

Catching Exceptions

Finally, let’s look at main(). main() has catch statements, and the second catch statement matches the exception of type E.
int main(){
    // try {
    //    g();
    // }
    g();  // Call g()
    // If try { ... } has other code after g(), it will be placed here
    goto end_of_catch;  // If g() returns normally, it reaches here
    // Here is the landing pad for throw.
    // $rax points to the exception object.
    // $rdx indicates the action:
    //   0 means no catch, continue to unwind the stack frame;
    //   1 means match the first catch;
    //   2 means match the second catch.
    void *p = rax;
    int action = rdx;
    // If try { ... } has objects to destruct, destruct them here.
    // Now we start matching catch statements.
    // catch (int n) {
    //     printf("catch int %d\n", n);
    // }
    if (action == 1) {
        int n = *(int *)__cxa_begin_catch(p);
        printf("catch int %d\n", n);
        __cxa_end_catch();
        goto end_of_catch;
    }
    // catch (const E& e) {
    //     printf("catch E %p\n", &e);
    // }
    if (action == 2) {
        E *e = __cxa_begin_catch(p);
        printf("catch E %p\n", e);
        __cxa_end_catch();  // Internally destroys the exception object e
        goto end_of_catch;
    }
    _Unwind_Resume(p);  // If no matching catch, continue to unwind the stack frame.
end_of_catch:
    return 0;
}
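The action values in the pseudocode can be mirrored in plain C++. The following sketch reports which handler was selected for a few thrown types, using the same numbering as above (1 for the int handler, 2 for the E handler, 0 when neither matches). The function names are hypothetical, and the outer `catch (...)` merely stands in for a frame further up the stack.

```cpp
struct E {};

void throw_int()    { throw 7; }
void throw_E()      { throw E(); }
void throw_double() { throw 3.5; }
void throw_nothing() {}

// Returns 1 or 2 for the handler that matched, 0 when neither matched
// and the exception left the inner try block, -1 when nothing was thrown.
int selected_handler(void (*thrower)()) {
    try {
        try {
            thrower();
        } catch (int) {
            return 1;
        } catch (const E&) {
            return 2;
        }
    } catch (...) {
        return 0;  // the exception propagated past both handlers
    }
    return -1;
}
```

Note that a thrown double does not select the int handler: handlers are chosen by type, and no arithmetic conversion is applied during matching.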


Other Details

There is a hidden detail: __cxa_throw() is responsible for unwinding the stack frames and finding the landing pads. For any given PC (program counter) value, the information needed to do this is fixed at compile time. The compiler generates the .eh_frame and .gcc_except_table sections, which allow the runtime to locate the caller's stack frame and the landing pad that belongs to each call site. A detailed description would be overly complex; please refer to the links at the end of this document.
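The frame-walking half of this can be tried in isolation. The sketch below assumes the unwinder interface from `<unwind.h>` as shipped with GCC and Clang on Linux (`_Unwind_Backtrace` and `_Unwind_GetIP`); `count_frame`, `stack_depth`, and `depth_one_level_deeper` are hypothetical helper names. It only counts the frames the unwinder can reach from the current position and does not touch landing pads.

```cpp
#include <unwind.h>  // _Unwind_Backtrace, _Unwind_GetIP

// Callback invoked once per stack frame; counts frames that report a PC.
static _Unwind_Reason_Code count_frame(struct _Unwind_Context* ctx, void* arg) {
    if (_Unwind_GetIP(ctx) != 0) {
        ++*static_cast<int*>(arg);
    }
    return _URC_NO_REASON;  // keep walking towards the outermost frame
}

// Number of frames the unwinder visits starting from this call.
__attribute__((noinline)) int stack_depth() {
    int frames = 0;
    _Unwind_Backtrace(count_frame, &frames);
    return frames;
}

// Same measurement taken from one call level deeper.
__attribute__((noinline)) int depth_one_level_deeper() {
    volatile int d = stack_depth();  // volatile keeps this from becoming a tail call
    return d;
}
```

When both functions are called from the same place, `depth_one_level_deeper()` should report more frames than `stack_depth()`, since the walk passes through one additional frame. The exact counts depend on the platform's startup code.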
When the runtime examines a frame that has a landing pad, it matches the type of the thrown exception against the types named in the catch statements, using C++ RTTI information. If no suitable catch statement is found, the exception continues to propagate up the stack frames.
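Because matching is based on RTTI, a handler for a base class also accepts an exception of a derived class, and the caught object keeps its dynamic type. A minimal sketch, with `Base`, `Derived`, and the function name chosen only for this illustration:

```cpp
#include <typeinfo>  // typeid

struct Base {
    virtual ~Base() = default;  // polymorphic, so typeid reports the dynamic type
};
struct Derived : Base {};

// Throws a Derived, catches it through a Base handler, and checks that
// the object seen by the handler is still a Derived.
bool base_handler_catches_derived() {
    try {
        throw Derived();
    } catch (const Base& b) {
        return typeid(b) == typeid(Derived);
    }
    return false;  // not reached: the Base handler always matches here
}
```

`base_handler_catches_derived()` returns true: the handler is selected through the class hierarchy recorded in the type information, and catching by reference avoids slicing the object.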

References:

1. Itanium C++ ABI: Exception Handling: https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html
2. Exception Handling ABI for the Arm Architecture: https://github.com/ARM-software/abi-aa/blob/844a79fd4c77252a11342709e3b27b2c9f590cf1/ehabi32/ehabi32.rst
3. libunwind LLVM Unwinder: https://github.com/llvm/llvm-project/blob/main/libunwind/docs/index.rst
4. Linux Stack Unwinding (x86_64): https://zhuanlan.zhihu.com/p/302726082
5. .eh_frame: https://www.airs.com/blog/archives/460
6. .gcc_except_table: https://www.airs.com/blog/archives/464
