C Preprocessor (III): Advanced Techniques and Best Practices

1. The Art of Macro Programming in the Linux Kernel

The Linux kernel is a prime example of macro programming, demonstrating how to achieve elegant and efficient code using macros.

1.1 Type-Safe min / max Macros

The traditional MIN/MAX macros may evaluate an argument twice and perform no type checking:

// Dangerous traditional macros
#define MIN(a, b) ((a) < (b) ? (a) : (b))

int x = 5;
int result = MIN(x++, 10);  // x++ is executed twice! Side effect!

The kernel's min_t / max_t solution:

// Simplified kernel implementation
#define min_t(type, a, b) ({        \
    type __a = (a);                 \
    type __b = (b);                 \
    __a < __b ? __a : __b;          \
})

#define max_t(type, a, b) ({        \
    type __a = (a);                 \
    type __b = (b);                 \
    __a > __b ? __a : __b;          \
})

// Usage example
int x = 5;
int result = min_t(int, x++, 10);  // Safe! x increments only once
printf("x = %d, result = %d\n", x, result);  // x = 6, result = 5

// Type safety
long a = 100L;
int b = 50;
long larger = max_t(long, a, b);  // Explicitly specify type: both operands are converted to long

Core Technique: Use local variables __a and __b to cache parameters and avoid repeated evaluations.
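The difference between the two styles can be checked directly. The following is a minimal sketch, assuming GCC or Clang (statement expressions are a GNU extension); the helper functions x_after_plain_min and x_after_min_t are hypothetical names introduced only for this illustration.

```c
/* Traditional macro: the selected argument is evaluated a second time. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Statement-expression version: each argument is evaluated exactly once
 * and cached in a local variable of the requested type. */
#define min_t(type, a, b) ({ \
    type __a = (a);          \
    type __b = (b);          \
    __a < __b ? __a : __b;   \
})

/* Hypothetical helper: returns the final value of x after MIN(x++, 10). */
static int x_after_plain_min(int *result)
{
    int x = 5;
    *result = MIN(x++, 10);   /* (x++) < (10) ? (x++) : (10) */
    return x;                 /* x was incremented twice */
}

/* Hypothetical helper: returns the final value of x after min_t(int, x++, 10). */
static int x_after_min_t(int *result)
{
    int x = 5;
    *result = min_t(int, x++, 10);
    return x;                 /* x was incremented once */
}
```

With the plain macro the caller ends up with x == 7 and a result of 6; with min_t the values are x == 6 and a result of 5.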

1.2 typeof – GNU Extension for Type Deduction

typeof can retrieve the type of an expression, enabling more generic macros:

#define min(a, b) ({               \
    typeof(a) __a = (a);           \
    typeof(b) __b = (b);           \
    __a < __b ? __a : __b;         \
})

// Automatically adapts type
int i = min(10, 20);               // typeof(10) -> int
double d = min(3.14, 2.71);        // typeof(3.14) -> double
long l = min(100L, 200L);          // typeof(100L) -> long
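The typeof version above accepts mismatched operand types silently. Older kernel sources are known for a pointer-comparison trick that makes the compiler complain in that case; the sketch below follows that idea under stated assumptions. The name min_checked and the two wrapper functions are hypothetical, and __typeof__ is used instead of typeof so that the code also compiles in strict ISO modes of GCC and Clang.

```c
/* Sketch: comparing the addresses of the two temporaries forces the
 * compiler to check that both have the same type. For mismatched types
 * GCC and Clang typically report a "comparison of distinct pointer
 * types" diagnostic. The comparison itself has no run-time effect. */
#define min_checked(a, b) ({        \
    __typeof__(a) __a = (a);        \
    __typeof__(b) __b = (b);        \
    (void)(&__a == &__b);           \
    __a < __b ? __a : __b;          \
})

/* Hypothetical wrappers so the macro can be exercised from a test. */
static int smaller_int(int a, int b)
{
    return min_checked(a, b);
}

static double smaller_double(double a, double b)
{
    return min_checked(a, b);
}
```

A call such as min_checked(1, 2L) would mix int and long and is expected to trigger the diagnostic, which is the point of the extra line.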

Practical Example: Swap Macro

#define SWAP(a, b) do {            \
    typeof(a) __tmp = (a);         \
    (a) = (b);                     \
    (b) = __tmp;                   \
} while(0)

int x = 10, y = 20;
SWAP(x, y);
printf("x=%d, y=%d\n", x, y);  // x=20, y=10

double p = 1.5, q = 2.5;
SWAP(p, q);  // Also works for double type

1.3 container_of – The Divine Macro in the Kernel

It deduces the base address of a structure from the address of one of its members, making it one of the most ingenious macros in the kernel:

#define container_of(ptr, type, member) ({              \
    const typeof(((type *)0)->member) *__mptr = (ptr);  \
    (type *)((char *)__mptr - offsetof(type, member));  \
})

// Example: Embedded Linked List
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

struct task {
    int pid;
    char name[32];
    struct list_node list; // Embedded list node
};

// Usage
struct list_node *node_ptr = /* ... */;
struct task *task_ptr = container_of(node_ptr, struct task, list);
printf("Task: %s (PID: %d)\n", task_ptr->name, task_ptr->pid);

Principle Analysis:

// 1. typeof(((type *)0)->member) retrieves member type
// 2. offsetof(type, member) calculates member offset
// 3. Member address - offset = base address of structure

// offsetof is provided by <stddef.h>; a traditional implementation looks like this
// (GCC and Clang use __builtin_offsetof instead)
#define offsetof(type, member) ((size_t)&((type *)0)->member)
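The three steps above can be verified with a round trip: take the address of the embedded member, apply container_of, and compare the result with the address of the enclosing structure. This is a minimal sketch assuming GCC or Clang; task_of is a hypothetical helper, and offsetof comes from <stddef.h> rather than being redefined.

```c
#include <stddef.h>   /* offsetof */

/* Same shape as the macro above, using __typeof__ so it also builds in
 * strict ISO modes of GCC and Clang. */
#define container_of(ptr, type, member) ({                   \
    const __typeof__(((type *)0)->member) *__mptr = (ptr);   \
    (type *)((char *)__mptr - offsetof(type, member));       \
})

struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

struct task {
    int pid;
    char name[32];
    struct list_node list;   /* embedded list node */
};

/* Hypothetical helper: recover the task that embeds the given node. */
static struct task *task_of(struct list_node *node)
{
    return container_of(node, struct task, list);
}
```

Given struct task t, the expression task_of(&t.list) yields &t, because subtracting offsetof(struct task, list) bytes from the member address lands on the start of the structure.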

Practical Application: Intrusive Containers

#include <stddef.h>

// Custom offsetof (if needed)
#define my_offsetof(type, member) ((size_t)&((type *)0)->member)

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - my_offsetof(type, member)))

// List traversal
#define list_for_each_entry(pos, head, member)              \
    for (pos = container_of((head)->next, typeof(*pos), member); \
         &pos->member != (head);                             \
         pos = container_of(pos->member.next, typeof(*pos), member))
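To see the traversal macro at work, here is a small sketch of a circular list with a sentinel head, in the spirit of the kernel's lists. The helpers list_init, list_add_tail and sum_pids are written for this illustration and are assumptions, not copies of the kernel implementation; __typeof__ is used so the code builds in strict ISO modes of GCC and Clang.

```c
#include <stddef.h>   /* offsetof */

struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                         \
    for (pos = container_of((head)->next, __typeof__(*pos), member);   \
         &pos->member != (head);                                       \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

struct task {
    int pid;
    struct list_node list;   /* embedded list node */
};

/* An empty list is a head node that points to itself. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert node just before head, i.e. at the end of the list. */
static void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk every task on the list and add up the pids. */
static int sum_pids(struct list_node *head)
{
    struct task *pos;
    int sum = 0;

    list_for_each_entry(pos, head, list)
        sum += pos->pid;
    return sum;
}
```

The loop stops when the member address of pos equals the head again, which is why the head itself is a bare list_node and not embedded in a task.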

2. GNU Extensions: Statement Expressions

The GNU C extension allows writing multiple statements inside ({ ... }), with the value of the last expression becoming the value of the whole construct:

2.1 Basic Syntax

// Statement expression
int max = ({
    int a = 10;
    int b = 20;
    a > b ? a : b;  // The last expression is the return value
});

printf("max = %d\n", max);  // Output: max = 20

2.2 Practical Example: Safe Macro

// Safe square macro
#define SQUARE(x) ({       \
    typeof(x) __x = (x);   \
    __x * __x;             \
})

// Macro with debug information (the %d format assumes expr has type int)
#define DEBUG_CALC(expr) ({                    \
    typeof(expr) __result = (expr);            \
    printf("%s = %d\n", #expr, __result);      \
    __result;                                  \
})

int value = DEBUG_CALC(3 + 4 * 5);
// Output: 3 + 4 * 5 = 23

2.3 Best Practices for Multi-Statement Macros

#include <stdio.h>   // fprintf, perror
#include <stdlib.h>  // malloc, exit

static int error_count;

// Method 1: do-while(0) wrapping (traditional, good compatibility)
#define LOG_ERROR(msg) do {                \
    fprintf(stderr, "[ERROR] %s\n", msg);  \
    error_count++;                         \
} while(0)

// Method 2: Statement expression (GNU extension, can return value)
#define SAFE_MALLOC(size) ({               \
    void *ptr = malloc(size);              \
    if (!ptr) {                            \
        perror("malloc");                  \
        exit(1);                           \
    }                                      \
    ptr;                                   \
})

// Usage
int *array = SAFE_MALLOC(100 * sizeof(int));

Why use do-while(0)?

// Error: no do-while wrapping
#define BAD_MACRO(x) foo(); bar();

if (condition)
    BAD_MACRO(x);  // Expands to: foo(); bar();
else
    other();       // Compilation error! else has no corresponding if

// Correct: do-while(0) wrapping
#define GOOD_MACRO(x) do { foo(); bar(); } while(0)

if (condition)
    GOOD_MACRO(x);  // Expands to a single statement
else
    other();        // Correct!
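The effect of the wrapper can be confirmed with a compilable sketch. The counter calls, the functions step_one and step_two, and the macro RUN_BOTH are hypothetical names used only to make the control flow observable.

```c
static int calls;   /* records which steps ran */

static void step_one(void) { calls += 1; }
static void step_two(void) { calls += 10; }

/* Two statements wrapped so that the macro behaves like one statement
 * and still requires the usual trailing semicolon at the call site. */
#define RUN_BOTH() do { step_one(); step_two(); } while (0)

static int run(int condition)
{
    calls = 0;
    if (condition)
        RUN_BOTH();      /* both steps run, or neither */
    else
        calls = -1;      /* the else still pairs with the if */
    return calls;
}
```

Calling run(1) executes both steps and returns 11, while run(0) takes the else branch and returns -1. Without the do-while(0) wrapper the same if/else would not compile.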

3. C11 Generic Selection: _Generic

The _Generic keyword introduced in C11 provides compile-time selection based on the type of an expression, similar in spirit to function overloading in C++:

3.1 Basic Syntax

#define type_name(x) _Generic((x), \
    int: "int",                    \
    float: "float",                \
    double: "double",              \
    char*: "string",               \
    default: "unknown"             \
)

// Usage
printf("%s\n", type_name(10));      // Output: int
printf("%s\n", type_name(3.14));    // Output: double
printf("%s\n", type_name("hello")); // Output: string
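Two results of type_name tend to surprise newcomers, and a short sketch makes them visible. It assumes a C11 compiler such as GCC or Clang; the three wrapper functions are hypothetical and exist only so the selected strings can be inspected.

```c
#define type_name(x) _Generic((x), \
    int: "int",                    \
    float: "float",                \
    double: "double",              \
    char *: "string",              \
    default: "unknown"             \
)

/* In C a character constant such as 'a' has type int, not char. */
static const char *name_of_char_literal(void)
{
    return type_name('a');
}

/* long is not listed, so the default association is selected. */
static const char *name_of_long_literal(void)
{
    return type_name(10L);
}

/* A string literal decays to char * in the controlling expression. */
static const char *name_of_string_literal(void)
{
    return type_name("hello");
}
```

The selection depends only on the type of the controlling expression, which is never evaluated, so type_name(i++) would not increment i.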

3.2 Generic Print Function

#include <stdio.h>

// Select only the format string; x is passed to printf exactly once.
// Every _Generic branch is compiled, so branches such as printf("%s", x)
// with an int x would otherwise provoke format warnings.
// Types that are not listed are rejected at compile time.
#define print(x) printf(_Generic((x), \
    int: "%d",                        \
    long: "%ld",                      \
    float: "%f",                      \
    double: "%f",                     \
    char*: "%s"                       \
), (x))

// Usage example
print(42);          // Selects the "%d" format
print(3.14);        // Selects the "%f" format
print("hello");     // Selects the "%s" format

3.3 Type-Safe Math Functions

#include <math.h>    // fabs, fabsf, sqrt, sqrtf
#include <stdlib.h>  // abs, labs

#define ABS(x) _Generic((x),       \
    int: abs(x),                   \
    long: labs(x),                 \
    float: fabsf(x),               \
    double: fabs(x)                \
)

#define SQRT(x) _Generic((x),      \
    float: sqrtf(x),               \
    double: sqrt(x),               \
    default: sqrt(x)               \
)

// Usage
int i = -10;
float f = -3.14f;
double d = -2.71;

printf("%d\n", ABS(i));    // Calls abs()
printf("%f\n", ABS(f));    // Calls fabsf()
printf("%lf\n", ABS(d));   // Calls fabs()

3.4 Practical: Type-Checked Generic MAX Macro

#define MAX(a, b) _Generic((a),                          \
    int: _Generic((b),                                   \
        int: max_int,                                    \
        default: (void)0                                 \
    ),                                                   \
    double: _Generic((b),                                \
        double: max_double,                              \
        default: (void)0                                 \
    )                                                    \
)(a, b)

// Helper functions
static inline int max_int(int a, int b) {
    return a > b ? a : b;
}

static inline double max_double(double a, double b) {
    return a > b ? a : b;
}

// Type-safe usage
int i = MAX(10, 20);           // OK
double d = MAX(3.14, 2.71);    // OK
// int x = MAX(10, 3.14);      // Compilation error: type mismatch

4. Limitations and Considerations of Macros

4.1 Redefinition Rules

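ANSI C only allows a macro to be redefined when the new definition is equivalent: the same tokens in the same order, with whitespace in the same places (the amount of whitespace does not matter):

#define SIZE 100
#define SIZE 100        // OK: exactly the same

#define SIZE   100      // OK: whitespace around the replacement list is ignored
#define SIZE (100)      // Invalid: different tokens (GCC typically reports a "redefined" warning)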

// Correct approach: undefine first then redefine
#undef SIZE
#define SIZE (100)
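In practice two patterns cover most legitimate needs for changing a macro's value. The sketch below shows both; BUFFER_SIZE, RETRY_LIMIT and the accessor functions are hypothetical names chosen for illustration.

```c
/* Pattern 1: a default that can be overridden from outside, for example
 * with a -DBUFFER_SIZE=512 compiler option, without any redefinition. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

/* Pattern 2: a deliberate override. Undefining first makes the change
 * explicit and avoids the redefinition diagnostic. */
#define RETRY_LIMIT 3
#undef RETRY_LIMIT
#define RETRY_LIMIT (5)

static int buffer_size(void) { return BUFFER_SIZE; }
static int retry_limit(void) { return RETRY_LIMIT; }
```

When nothing is passed on the command line, buffer_size() returns 256 and retry_limit() returns 5.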

4.2 Macro Expansion Rules

Macros can be nested, but cannot be recursive:

// Nested expansion
#define A B
#define B C
#define C 100

int x = A;  // A -> B -> C -> 100

// Recursive protection (will not expand infinitely)
#define RECURSIVE RECURSIVE + 1
int y = RECURSIVE;  // Expands to: RECURSIVE + 1 (stops)

Expansion Order:

#define FOO(x) BAR(x)
#define BAR(x) (x + 1)

FOO(10)
// 1. FOO(10) -> BAR(10)
// 2. BAR(10) -> (10 + 1)
// Result: (10 + 1)
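Expansion order matters most when the # operator is involved, because an argument that is the operand of # is not expanded before it is stringified. The usual two-level idiom is sketched here; STR, XSTR and VERSION are conventional but hypothetical names, as are the two functions.

```c
#define STR(x)  #x        /* stringifies the argument as written */
#define XSTR(x) STR(x)    /* expands the argument first, then stringifies */
#define VERSION 3

/* STR(VERSION): VERSION is the operand of #, so it is not expanded. */
static const char *direct(void)
{
    return STR(VERSION);      /* "VERSION" */
}

/* XSTR(VERSION): the argument is fully expanded to 3 before it is
 * substituted into STR(3), which then produces "3". */
static const char *indirect(void)
{
    return XSTR(VERSION);     /* "3" */
}
```

The extra level of indirection is the same nested expansion as FOO and BAR above, used here to control when the argument gets expanded.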

4.3 Common Pitfalls

1. Side Effect Issues

#define MAX(a, b) ((a) > (b) ? (a) : (b))

int i = 5;
int m = MAX(i++, 3);  // i++ executes twice! m == 6, i == 7
// Expands: ((i++) > (3) ? (i++) : (3))

2. Precedence Issues

#define MUL(a, b) a * b
int result = MUL(2 + 3, 4 + 5);  // Expected 45, actual 19
// Expands: 2 + 3 * 4 + 5 = 2 + 12 + 5 = 19 (wrong!)

// Correct writing
#define MUL(a, b) ((a) * (b))
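Parenthesizing the parameters alone is not enough when the macro body uses an operator with lower precedence than the surrounding expression; the outer parentheses matter as well. A minimal sketch with hypothetical names ADD_UNSAFE and ADD_SAFE:

```c
/* Parameters are parenthesized, but the whole expression is not. */
#define ADD_UNSAFE(a, b) (a) + (b)

/* Both the parameters and the whole expression are parenthesized. */
#define ADD_SAFE(a, b)   ((a) + (b))

static int unsafe_scaled(void)
{
    return 2 * ADD_UNSAFE(3, 4);   /* 2 * (3) + (4) = 10 */
}

static int safe_scaled(void)
{
    return 2 * ADD_SAFE(3, 4);     /* 2 * ((3) + (4)) = 14 */
}
```

The unsafe version multiplies only the first operand, so the two functions return 10 and 14 respectively.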

3. Semicolon Trap

#define PRINT(x) printf("%d\n", x);  // Note the trailing semicolon

if (condition)
    PRINT(10);  // Expands: printf("%d\n", 10);;  (the second ; is an empty statement that ends the if)
else            // Compilation error! else has no matching if
    PRINT(20);

// Correct: do not add semicolon, let the user decide
#define PRINT(x) printf("%d\n", x)

4.4 Debugging Difficulties

Macro errors are hard to debug. Recommended practices:

  1. Check the preprocessed result:
gcc -E source.c -o preprocessed.c
  2. Use inline functions instead (C99+):
// Macro version: hard to debug
#define MAX(a, b) ((a) > (b) ? (a) : (b))

// Inline function version: debuggable, type-safe
static inline int max_int(int a, int b) {
    return a > b ? a : b;
}
  3. Conditional compilation debug macros:
#ifdef DEBUG_MACRO
    #define DEBUG_PRINT(x) printf("Debug: " #x " = %d\n", (x))
#else
    #define DEBUG_PRINT(x)
#endif

5. Summary of Best Practices

5.1 When to Use Macros

Scenarios suitable for using macros:

  • Constant definitions
  • Conditional compilation
  • Code generation
  • Compile-time calculations
  • Operations that require type independence

Scenarios where macros should be avoided:

  • Where inline functions can be used instead
  • Complex logic (more than 3 lines)
  • Where type checking is needed

5.2 Naming Conventions

// Macros use uppercase (convention)
#define MAX_SIZE 1024
#define MIN(a, b) (...)

// Functions use lowercase
static inline int max(int a, int b) { ... }

5.3 Safety Checklist

  • [ ] All parameters are enclosed in parentheses
  • [ ] The entire expression is enclosed in parentheses
  • [ ] Avoid multiple evaluations of parameters
  • [ ] Use do-while(0) wrapping for multi-statement macros
  • [ ] Do not add semicolons at the end of macro definitions
  • [ ] Use unique temporary variable names (__var)

6. Reference Resources

Official Documentation:

  • GCC Preprocessor Documentation (https://gcc.gnu.org/onlinedocs/cpp/)
  • C11 Standard (ISO/IEC 9899:2011) (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf)

Recommended Reading:

  • C Primer Plus
  • Linux Kernel Source include/linux/kernel.h
  • C Traps and Pitfalls

Online Tools:

  • Compiler Explorer (https://godbolt.org/) – View preprocessed results
  • C Preprocessor Online (https://www.tutorialspoint.com/compile_c_online.php)

Conclusion

The series on the C preprocessor concludes here. To recap:

  • First Article: Basics of Macro Definitions, Stringification, Concatenation, Variadic Macros
  • Second Article: Conditional Compilation, #pragma, Predefined Macros
  • Third Article: Kernel Techniques, Statement Expressions, _Generic, Best Practices

The preprocessor is a double-edged sword: used well, it is a powerful tool; used poorly, it is a hazard. In modern C code, it is recommended to:

  • Prefer const, enum, and inline functions over simple macros
  • Implement complex logic using functions rather than macros
  • Use macros primarily for conditional compilation and code generation
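As an illustration of the first recommendation, the sketch below replaces three typical macros with language-level constructs. All names (MAX_SIZE, SCALE, max_int, buffer, and the two accessor functions) are hypothetical and chosen only for this example.

```c
#include <stddef.h>   /* size_t */

/* Instead of: #define MAX_SIZE 1024
 * An enum constant is a true integer constant expression, so it can
 * size an array, and it is visible to the debugger. */
enum { MAX_SIZE = 1024 };

/* Instead of: #define SCALE 0.5
 * A const object has a type and obeys scope rules. */
static const double SCALE = 0.5;

/* Instead of: #define MAX(a, b) ((a) > (b) ? (a) : (b))
 * Arguments are evaluated exactly once and are type-checked. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

static char buffer[MAX_SIZE];

static size_t buffer_capacity(void) { return sizeof buffer; }
static double scaled(int n)         { return n * SCALE; }
```

The trade-off is that max_int handles only int; where several types are needed, the _Generic dispatch from section 3 can route to one such function per type.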

Mastering the preprocessor is essential to truly understanding the compilation process and underlying mechanisms of the C language!
