
On the modern stage of computing, programming languages are like dancers, and compilers are the choreographers guiding them from behind the scenes. Every piece of code is a carefully designed dance. On this stage, the C language and GCC (the GNU Compiler Collection) are a classic duo: together they perform the dance of compilation, transforming source code into executable files.
This article explores the collaboration between C and GCC and looks at how they produce the programs that ultimately run on a computer.
1. The Four Stages of C/C++ Compilation
1. The Dance Behind the Scenes
Almost every beginner in programming starts by printing <span>Hello World!</span> (the code below is a slightly modified version that also greets a name given on the command line).
#include <stdio.h>

int main(int argc, char *argv[]) {
    /* Greet the first command-line argument if one was given. */
    if (argc >= 2)
        printf("Hello %s!\n", argv[1]);
    else
        printf("Hello World!\n");
    return 0;
}
To compile this code into an executable file, you only need a simple command:
gcc hello.c -o hello
Using the <span>gcc</span> command with the <span>-o</span> option generates the executable file <span>hello</span>. However, the entire compilation process is not as simple as it seems.
One or more C/C++ files must go through preprocessing, compilation, assembly, and linking stages to become an executable file.
As shown in the figure above, preprocessing turns a source file with the extension <span>.c</span> into a temporary file with the extension <span>.i</span>. Compiling the <span>.i</span> file produces an assembly file with the extension <span>.s</span>, which is then assembled into a binary object file with the extension <span>.o</span>. Finally, the object files are linked together to form an executable file.
Next, we will use GCC to compile this code into machine-understandable instructions.
2. Setting the Stage – Preprocessing
In C/C++ source files, lines starting with <span>#</span> are preprocessing directives, such as the include directive <span>#include</span>, the macro definition directive <span>#define</span>, and the conditional compilation directives <span>#if</span>, <span>#ifndef</span>, etc. Preprocessing inserts the included files into the original file, expands macro definitions, selects the code to keep according to the conditional compilation directives, and writes the result to a temporary file with the extension <span>.i</span> for the next step.
For the previous <span>hello.c</span> example, you can use the <span>-E</span> option to make GCC stop after preprocessing and output the result:
gcc -E hello.c -o hello.i
Opening <span>hello.i</span> with <span>vim</span> shows that it is <span>hello.c</span> with all of its header files fully expanded:
The source code has only 11 lines, but after expansion it has as many as 803 lines (the exact count depends on the system’s headers). The original source code can still be seen at the end of <span>hello.i</span>.
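To see macro expansion and conditional compilation without hundreds of lines of header text in the way, it can help to preprocess a file that includes nothing. The sketch below is an illustration only: the file name <span>macro_demo.c</span> and the macros <span>GREETING</span>, <span>SUFFIX</span>, and <span>SHOUT</span> are made up for this example and are not part of <span>hello.c</span>. The <span>-P</span> option suppresses line markers, and <span>-D</span> defines a macro on the command line.

```shell
# Hypothetical demo file: no #include, so the preprocessed output stays short.
cat > macro_demo.c <<'EOF'
#define GREETING "Hello"
#ifdef SHOUT
#define SUFFIX "!!!"
#else
#define SUFFIX "."
#endif
const char *message = GREETING SUFFIX;
EOF

# Preprocess only (-E), without line markers (-P).
gcc -E -P macro_demo.c -o plain.i
# Same file, but with SHOUT defined on the command line (-D).
gcc -E -P -DSHOUT macro_demo.c -o shout.i

cat plain.i   # the #else branch was selected
cat shout.i   # the #ifdef branch was selected
```

In both outputs the macro names are gone and only the string literals remain, which is all the later stages ever see.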
3. The Dancer’s Basic Training – Compilation
What we commonly call “compilation” usually covers all four stages. Here, however, “compilation” refers specifically to the step immediately following preprocessing. The compilation stage translates the temporary <span>.i</span> file into assembly code using the tool <span>cc1</span> (yes, the tool really is called <span>cc1</span>. Each target architecture has its own build of <span>cc1</span>: x86 has one, and ARM has another).
This step can be performed with the <span>-S</span> option:
gcc -S hello.i -o hello.s
Similarly, you can open <span>hello.s</span> with <span>vim</span>:
As seen in the image above, the generated <span>hello.s</span> file contains assembly code that describes the basic operations of the program. These instructions are still human-readable, but further conversion is needed before the computer can execute them.
4. The Choreographer’s Fine-Tuning – Assembly
The assembly stage translates assembly code into machine code in a specific format, which on Linux systems is generally an ELF object file (OBJ file), using the tool <span>as</span>.
With the <span>-c</span> option, GCC stops after assembling and converts the assembly code into object code (machine code):
gcc -c hello.s -o hello.o
At this point, opening <span>hello.o</span> with <span>vim</span> reveals mostly unreadable content:
A better tool for inspecting <span>hello.o</span> is <span>hexdump</span>, a command-line tool that displays a file’s content in hexadecimal. Enter the following command:
hexdump -C hello.o
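As the output shows, the first line identifies the file format: an ELF file begins with the bytes <span>7f 45 4c 46</span>, which <span>hexdump -C</span> renders as <span>.ELF</span> in its text column. The file contains the program’s machine instructions, but the final linking has not yet been done.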
5. The Harmony of the Ensemble – Linking
The linking stage is the final step of the compilation process. It combines the program’s OBJ files with the system’s startup OBJ files and library files, ultimately generating an executable file that can run on a specific platform. GCC does this through <span>collect2</span>, a wrapper that in turn invokes the linker <span>ld</span>. Enter the following command to generate the final executable file:
gcc hello.o -o hello
At this point, a complete executable file named <span>hello</span> has been generated. This executable behaves the same as the one generated by <span>gcc hello.c -o hello</span>.
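Putting the four commands together, the whole pipeline can be replayed by hand and compared with the one-step build. This is only a sketch using the file names from this article; <span>hello_direct</span> is a made-up name for the one-step binary so that the two results can sit side by side.

```shell
# Recreate the article's hello.c so the example is self-contained.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(int argc, char *argv[]) {
    if (argc >= 2)
        printf("Hello %s!\n", argv[1]);
    else
        printf("Hello World!\n");
    return 0;
}
EOF

gcc -E hello.c -o hello.i    # 1. preprocess
gcc -S hello.i -o hello.s    # 2. compile to assembly
gcc -c hello.s -o hello.o    # 3. assemble to an object file
gcc hello.o -o hello         # 4. link into an executable

gcc hello.c -o hello_direct  # all four stages in one command

./hello
./hello GCC
./hello_direct GCC
```

Both binaries print the same greeting for the same argument, whichever route was used to build them.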

2. The Dance’s Detailed Movements – Verbose Compilation Mode
Passing the <span>-v</span> option to GCC enables verbose mode, which displays each step of the compilation along with the tools invoked and their arguments. This is very useful for debugging and for understanding the compilation process. The command is as follows:
gcc hello.c -o hello -v
Executing this command produces a lot of output, but it contains key information that confirms some of what was described earlier.

The first part of the output is the compiler’s configuration. The main lines are extracted here:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v ...... # omitted unimportant content
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
The lines are explained as follows:
• <span>Using built-in specs.</span>: Indicates that the built-in compiler specifications (specs) are being used.
• <span>COLLECT_GCC=gcc</span>: Indicates that the driver program is <span>gcc</span>.
• <span>Target: x86_64-linux-gnu</span>: The target is the 64-bit x86 architecture running a GNU/Linux system.
• <span>Configured with</span>: Displays the options GCC was configured with when it was built.
• <span>Thread model: posix</span>: Uses the POSIX thread model.
• <span>gcc version 7.5.0</span>: The compiler version is 7.5.0.
Next, as mentioned earlier, the first step should be preprocessing, but the output shows that GCC performs the preprocessing and compilation stages together, in a single invocation of the <span>cc1</span> tool:

The key lines are extracted below. Here <span>cc1</span> compiles <span>hello.c</span> into the temporary assembly file <span>/tmp/ccEwwHut.s</span>.
COLLECT_GCC_OPTIONS='-o' 'hello' '-v' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/7/cc1 hello.c -o /tmp/ccEwwHut.s # many parameters omitted
[!NOTE]
Many options are passed when cc1 is invoked, which can be difficult for beginners to understand. For readers who want more detail, here are descriptions of some of them:
• /usr/lib/gcc/x86_64-linux-gnu/7/cc1: Calls cc1, GCC’s C compiler proper, which preprocesses and parses the source and generates assembly code.
• -quiet: Reduces output information.
• -imultiarch x86_64-linux-gnu: Names the multiarch subdirectory used when searching for target-specific headers.
• -dumpbase hello.c: Sets the base name used for dump and auxiliary output files.
• -mtune=generic: Tunes the generated code for generic processors of the target architecture.
• -march=x86-64: Generates code for the x86-64 instruction set.
• -fstack-protector-strong: Enables strong stack protection.
• -Wformat: Enables warnings about format strings in calls such as printf.
• -Wformat-security: Enables warnings about format string uses that may be security problems.
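The output also includes something not mentioned earlier, the header search paths, as shown below:
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-linux-gnu/7/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/7/include-fixed
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
These are the directories the preprocessor searches for the header files named in <span>#include</span> directives. Directories in the first list are used only for the quoted form, while the second list is used for both forms. The library search paths used in the linking stage are reported separately, in the <span>LIBRARY_PATH</span> line of the same output.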
During the assembly stage, the output shows the <span>as</span> assembler being called. The <span>-v</span> option requests verbose output, and <span>--64</span> requests 64-bit object code. The result is the output object file <span>/tmp/cc8dsQ4D.o</span>, which is also a temporary file.

Finally, in the linking stage, the output shows the <span>collect2</span> linker wrapper being called. Among its arguments, in addition to the startup files and libraries, the most important is the <span>/tmp/cc8dsQ4D.o</span> file produced by the assembler. The <span>-o hello</span> option names the executable file to generate, <span>hello</span>.

[!NOTE]
The linker is also passed many options that are relatively difficult to understand, so here are some explanations for further study:
• <span>-plugin</span>: Loads a linker plugin, here the one GCC uses for link-time optimization (LTO).
• <span>-plugin-opt</span>: Options passed to the plugin.
• <span>-m elf_x86_64</span>: Selects the linker emulation, here 64-bit x86 ELF.
• <span>-dynamic-linker /lib64/ld-linux-x86-64.so.2</span>: Specifies the dynamic linker.
• <span>-pie</span>: Generates a position-independent executable.
• <span>-z now</span>: Resolves all dynamic symbols when the program is loaded rather than lazily on first use.
• <span>-z relro</span>: Makes the relocation sections read-only once relocation is complete.
• <span>-lgcc</span>, <span>-lgcc_s</span>, <span>-lc</span>: Links the necessary libraries (the GCC runtime support libraries and the C library).
3. Behind the Scenes – Various Options of GCC (Overall Options)
GCC (GNU Compiler Collection) provides many options to control the behavior of the compiler. These options can be roughly divided into several categories, including preprocessing options, compilation options, assembly options, and linking options. Below are explanations of relatively important options:
1. Preprocessing Options
• <span>-E</span>: Only perform the preprocessing stage and then stop. The output is the preprocessed source code.
• <span>-P</span>: Do not output line control information (line markers such as <span># 1 "hello.c"</span>).
• <span>-C</span>: Retain comments in the output.
• <span>-M</span>: Output a dependency rule suitable for <span>make</span>, listing every file the source depends on.
• <span>-MM</span>: Like <span>-M</span>, but omits system header files.
2. Compilation Options
• <span>-c</span>: Only compile and assemble, but do not link. Generates an object file.
• <span>-S</span>: Only compile, generating assembly code.
• <span>-E</span>: Only preprocess, do not compile.
• <span>-g</span>: Generates debugging information.
• <span>-O</span>: Sets the optimization level. <span>-O0</span> means no optimization (the default); <span>-O1</span> to <span>-O3</span> are increasing optimization levels, with <span>-O3</span> the highest.
• <span>-Os</span>: Optimize to reduce code size.
• <span>-Og</span>: Optimize while keeping the program easy to debug.
• <span>-Wall</span>: Enable a broad set of commonly useful warnings (despite the name, not every warning).
• <span>-Wextra</span>: Enable extra warnings not covered by <span>-Wall</span>.
• <span>-Werror</span>: Treat all warnings as errors.
• <span>-pedantic</span>: Issue all warnings required by the ISO C and ISO C++ standards, including for the use of language extensions they prohibit.
• <span>-pedantic-errors</span>: Similar to <span>-pedantic</span>, but reports these cases as errors instead of warnings.
• <span>-std=standard</span>: Specifies the language standard to follow (e.g., <span>-std=c99</span> or <span>-std=c++11</span>).
• <span>-fPIC</span>: Generates position-independent code (PIC), as needed for shared libraries.
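How the warning options interact can be seen on a deliberately sloppy source file. In the sketch below, <span>warn_demo.c</span> is a made-up file containing a variable that is never used, which GCC reports under <span>-Wall</span> but not by default.

```shell
# Made-up source file with one unused local variable.
cat > warn_demo.c <<'EOF'
int main(void) {
    int unused = 42;   /* never read: reported by -Wunused-variable, part of -Wall */
    return 0;
}
EOF

# Default settings: compiles without complaint.
gcc -c warn_demo.c -o warn_demo.o && echo "default: ok"

# -Wall reports the unused variable but still produces the object file.
gcc -Wall -c warn_demo.c -o warn_demo.o && echo "-Wall: ok (with a warning)"

# -Werror turns that warning into an error, so the build fails.
if gcc -Wall -Werror -c warn_demo.c -o warn_demo.o; then
    echo "-Werror: ok"
else
    echo "-Werror: build rejected"
fi
```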
3. Assembly Options
• <span>-Wa,option</span>: Passes option to the assembler.
• <span>-masm=att</span>: Selects AT&T assembly syntax (x86 targets).
• <span>-masm=intel</span>: Selects Intel assembly syntax (x86 targets).
4. Linking Options
• <span>-Ldir</span>: Adds directory dir to the linker’s library search path.
• <span>-lfoo</span>: Links the library named libfoo (for example <span>libfoo.so</span> or <span>libfoo.a</span>).
• <span>-static</span>: Produces a statically linked executable file.
• <span>-shared</span>: Generates a shared library.
• <span>-pie</span>: Generates a position-independent executable.
• <span>-fPIC</span>: Strictly a compilation option rather than a linking one. It generates position-independent code, which the object files of a shared library normally require.
• <span>-Wl,option</span>: Passes option to the linker.
• <span>-T</span>: Specifies the linker script.
• <span>-nostartfiles</span>: Does not use the standard system startup files.
• <span>-nostdlib</span>: Does not use the standard system startup files or libraries.
5. Other Options
• <span>-v</span>: Displays compiler version information and the commands run at each stage of compilation.
• <span>--version</span>: Displays the compiler version and exits. (The uppercase <span>-V</span> is not a version query.)
• <span>-Bprefix</span>: Specifies the path prefix under which to look for the compiler’s own tools, libraries, and include files.
• <span>-print-file-name=filename</span>: Prints the full path of the library file that would be used when linking.
• <span>-print-prog-name=program</span>: Prints the full path of the specified compiler component, such as <span>cc1</span>.
• <span>-print-libgcc-file-name</span>: Prints the path of libgcc.
• <span>-dumpversion</span>: Prints the version number.