Basics of GCC Compilation

This article is reprinted from: AI Algorithms and Image Processing

Data Preparation

To make the demonstration easier to follow, three simple files are prepared in advance: test.cpp, test.h, and main.cpp. Their contents are as follows:

main.cpp

#include "test.h"

int main (int argc, char **argv)
{
    Test t;
    t.hello();
    return 0;
}

test.h

//test.h
#ifndef TEST_H
#define TEST_H

class Test
{
public:
    Test();
    void hello();
    ~Test();
};
#endif  // TEST_H

test.cpp

//test.cpp
#include "test.h"
#include <iostream>
using namespace std;

Test::Test()
{

}

void Test::hello()
{
    cout << "hello" << endl;
}

Test::~Test()
{

}

The C++ Compilation Process

A complete C++ build (for example, g++ a.cpp producing an executable file) consists of the following four stages:

  • Preprocessing, also known as precompilation, which can be run on its own with g++ -E
  • Compilation, which can be run with g++ -S
  • Assembly, which can be run with as or g++ -c
  • Linking, which can be run with g++ xxx.o xxx.so xxx.a

You can keep all the intermediate files generated along the way by adding the -save-temps option to the g++ command. The following sections explain the four stages one by one.

1. Preprocessing Stage: Mainly processes included header files (#include), macros and conditional directives (#define, #ifdef …), and strips comments.

You can use g++ -E to stop after preprocessing. The result goes to standard output unless -o is given; by convention, preprocessed C++ is saved as *.ii (and preprocessed C as *.i). main.cpp contains nothing but a single #include of test.h, so its preprocessed output stays close to the source. test.cpp also includes <iostream> and is more instructive, so let’s preprocess that file. Only the .cpp file is passed to the compiler; test.h is pulled in through #include.

g++ -E test.cpp -o test.ii

You can open test.ii to see the preprocessed content of test.cpp.

After preprocessing, the content pulled in by #include has been copied into the output file, and any macros used in the file have been expanded.

  • The main tasks of preprocessing are to expand macros and to act on the other preprocessor directives.
  • #include simply pastes in the named file; there is no restriction on the type of file that can be included, so .cpp, .txt, and so on all work.
  • Interested readers can preprocess a file that uses Qt signals and inspect the result:
    • emit expands to nothing. Emitting a signal is essentially a function call;
    • In the header file, signals: is replaced with protected: (in Qt5 it is replaced with public:)
    • Other Qt macros, such as Q_OBJECT and Q_INVOKABLE, are also processed during preprocessing.
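The macro behavior described above can be observed without Qt. The following sketch assumes g++ on Linux; macro.cpp, EMIT, and SQUARE are hypothetical names chosen for illustration, with EMIT imitating the way emit expands to nothing:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# macro.cpp is a hypothetical file; EMIT mimics an empty macro like emit in Qt.
cat > macro.cpp <<'EOF'
#define EMIT            /* expands to nothing */
#define SQUARE(x) ((x) * (x))
// this comment is removed by the preprocessor
int area = SQUARE(3);
void notify();
void fire() { EMIT notify(); }
EOF

# -E stops after preprocessing; -P drops the line markers for readability.
g++ -E -P macro.cpp -o macro.ii

# The macros are expanded and the comments are gone.
cat macro.ii
```

In the output, SQUARE(3) has become ((3) * (3)) and the call in fire() is a plain notify() call.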

2. g++ Compilation Stage: C++ syntax errors are detected at this stage. If there are none, g++ translates the code into assembly language.

The -S option stops after this stage: the code is compiled but not assembled, and the assembly is written to a file. The command below assumes main.ii was produced beforehand with g++ -E main.cpp -o main.ii.

g++ -S main.ii -o main.s

The generated assembly consists of instructions specific to the CPU architecture. Different architectures use different instruction sets, so the same source produces different assembly on each.

3. g++ Assembly Stage: Generates object code (*.o)

There are two ways to do this:

  • Use g++ to generate object code from the assembly code: g++ -c xxx.s -o xxx.o
  • Use the assembler directly: as xxx.s -o xxx.o

After the compilation stage, the code is still human-readable. The assembly stage converts it into machine code stored in an object file, which is binary.

# Assemble through the g++ driver
g++ -c main.s -o main.o
# Or call the assembler directly
as main.s -o main.o

You can also run as main.s without -o, in which case the assembler writes its output to a.out. Despite the name, this a.out is still an unlinked object file rather than an executable, because as does not perform linking. Use the -o option to choose the output file name.

4. g++ Linking Stage: Generates the executable file (a .exe file on Windows)

As shown earlier, main.cpp references the Test class, which is implemented in test.cpp:

#include "test.h"

int main (int argc, char **argv)
{
    Test t;
    t.hello();
    return 0;
}

Generate object files:

  • g++ test.cpp -c -o test.o
  • g++ main.cpp -c -o main.o

Link to generate the executable file:

g++ main.o test.o -o a.out

The core work of linking is to resolve symbol references (variables, functions) between modules. In practice, besides .o files, we usually also link static and dynamic libraries to produce the executable.

Referencing a symbol ultimately means referencing its address in memory, so determining symbol addresses is an indispensable task across compilation, linking, and loading. This task is known as symbol relocation. In essence, symbol relocation solves the problem of how the current compilation unit accesses external symbols.

Next, we will first explain how to compile source files into dynamic and static libraries, and then describe how to link the compiled libraries during linking.

Compiling Dynamic and Static Libraries

In large projects, it is impractical to deliver everything as a single executable program; some modules are compiled into dynamic or static libraries.

Compile to Generate Static Library

Use the ar command to perform “archiving” (a .a file is essentially a package of object files).

ar crsv libtest.a test.o

  • r replaces existing files in the archive or adds new files (required)
  • c creates the archive without printing a warning if it does not exist yet
  • s creates an archive index
  • v outputs detailed information

Compile to Generate Dynamic Library

Use the g++ -shared option to tell the compiler to generate a dynamic library.

g++ test.cpp -fPIC -shared -Wl,-soname,libtest.so -o libtest.so.0.1

  • -shared: tells the compiler to generate a dynamic link library
  • -Wl,-soname: sets the alias (soname) of the generated dynamic link library (here it is libtest.so)
  • -o: sets the name of the file actually generated (here it is libtest.so.0.1)
  • -fPIC
    • PIC stands for Position Independent Code, and -fPIC tells the compiler to generate it. If the details are unfamiliar, just add this option; it makes the library easier for other code to reference, whereas omitting it invites hard-to-diagnose errors.
    • Code generated with -fPIC is position-independent: it can be placed anywhere in the linear address space and still execute correctly without modification. The common technique is to take the value of the instruction pointer and add an offset to obtain the address of a global variable or function.

With gcc, specifying -shared without -fPIC typically fails at link time with a relocation error asking you to recompile with -fPIC, so a non-PIC dynamic library cannot be generated; according to the original author, clang accepts it.

The addresses of functions and variables in the library are relative rather than absolute; the real addresses are fixed only when the program that uses the dynamic library loads it. A dynamic library has an alias (soname), a real name (realname), and a linker name.

  • The soname consists of the prefix lib, the library name, and .so, usually followed by a major version number, for example libQt5Core.so.5.
  • The real name is the actual file name of the dynamic library, which generally appends a minor version number to the soname, for example libQt5Core.so.5.7.1.
  • The linker name is the name of the library used when linking the program, for example -lQt5Core.
  • When a dynamic link library is installed, the library file is copied to some directory and a soft link is created to serve as the alias. When the library file is updated, only the soft link needs to be updated.

Note:

  • Library file names conventionally start with lib, because the compiler relies on that prefix when it looks up libraries given with the -l option; for example, -lpthread automatically looks for libpthread.so and libpthread.a.
  • If the generated library does not start with lib, it can still be linked, but only by naming the file explicitly on the command line, for example g++ main.o test.so
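The relationship between the real name, the soname, and the soft link can be checked with readelf. This is a sketch under the assumption of a Linux system with g++ and GNU binutils, again using a trimmed-down test.cpp:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Trimmed-down copies of the article's test.h and test.cpp.
cat > test.h <<'EOF'
class Test { public: void hello(); };
EOF
cat > test.cpp <<'EOF'
#include "test.h"
#include <iostream>
void Test::hello() { std::cout << "hello" << std::endl; }
EOF

# Real name: libtest.so.0.1, soname: libtest.so (same command as the article).
g++ test.cpp -fPIC -shared -Wl,-soname,libtest.so -o libtest.so.0.1

# The soname is stored inside the library's dynamic section.
readelf -d libtest.so.0.1 | grep SONAME

# The soft link provides the alias that -ltest and the loader look for.
ln -s libtest.so.0.1 libtest.so
ls -l libtest.so
```

Replacing the library later only requires pointing the soft link at the new real file.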

Static Compilation and Dynamic Compilation

During the linking stage, the object files produced by the assembler are linked together with the referenced libraries into the executable file. When the library code is copied into the executable, this is called static compilation (static linking). The libraries used are static libraries (*.a or *.lib), and the generated executable does not need those libraries at runtime.

  • Advantages:
    • The code loads quickly, and execution is also relatively fast.
    • It does not depend on other libraries at runtime, making it easy to port.
  • Disadvantages:
    • The program size is large.
    • Updating is inconvenient; if the static library is updated, the program must be relinked.
    • If multiple applications use the library, each carries and loads its own copy, wasting memory.

g++ main.o libtest.a

After linking, you can run a.out to see the result and use the ldd command to view its runtime dependencies; the statically linked program does not depend on the libtest library.
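The static link and the ldd check can be reproduced with the following sketch (assuming g++ and GNU binutils on Linux, with trimmed-down copies of the article's files):

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Trimmed-down copies of the article's test.h, test.cpp, and main.cpp.
cat > test.h <<'EOF'
class Test { public: void hello(); };
EOF
cat > test.cpp <<'EOF'
#include "test.h"
#include <iostream>
void Test::hello() { std::cout << "hello" << std::endl; }
EOF
cat > main.cpp <<'EOF'
#include "test.h"
int main() { Test t; t.hello(); return 0; }
EOF

g++ -c test.cpp -o test.o
g++ -c main.cpp -o main.o
ar crs libtest.a test.o

# The code from libtest.a is copied into a.out at link time.
g++ main.o libtest.a -o a.out
./a.out

# libtest does not appear among the runtime dependencies.
ldd a.out | grep libtest || echo "a.out has no libtest dependency"
```

Note that ldd still lists the system libraries such as the C++ runtime; only libtest has been linked statically here.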

Dynamic Compilation

Dynamic libraries are not copied into the executable at link time; they are loaded at runtime. Different applications can use the same library, so only one instance of the shared library needs to be in memory, which avoids wasting space.

The libraries used in dynamic compilation are dynamic libraries (*.so or *.dll).

Because dynamic libraries are loaded only at runtime, they also solve the problems that static libraries cause for updating, deploying, and releasing programs. Users only need to replace the dynamic library to get an incremental update.

Dynamic libraries raise the issue of load-time symbol relocation; interested readers can refer to the article “Dynamic Compilation Principle Analysis”.

  • Advantages:
    • Multiple applications can share the same dynamic library, so there is no need to store multiple copies on disk.
    • Dynamic and flexible, allowing for incremental updates.
  • Disadvantages:
    • Because libraries are loaded at runtime, program startup is somewhat slower.
    • A missing dynamic library prevents the program from running.

g++ main.o libtest.so

After linking, you can try to run a.out and use the ldd command to view its runtime dependencies.

gcc Link Parameters -L, -l, -rpath, -rpath-link

Running the dynamically linked a.out at this point fails with an error, because the dynamic link library libtest.so cannot be found.

There are many solutions to this error, for example:

  • LD_LIBRARY_PATH=. ./a.out

So why does the program fail to find the library at runtime even though linking succeeded? To answer this, we need a deeper understanding of how dynamic libraries are linked.

main.cpp explicitly references the Test class. During linking, the final stage of the build, an undefined reference error occurs if the Test symbols cannot be found in any of the files involved. The linker therefore has to find a file that contains them; it can be a .o, .a, or .so file.

If it is a .o or .a file, the link is static: the contents are packaged into the generated executable, which can then run on its own without further requirements. A dynamic library such as a .so file is more complicated. The linker does not package the library into the executable; it only records a reference, in effect telling the program to look up the definition of the Test symbols in libtest.so at a certain position (think of a line and column as an analogy; in reality it is a relative memory address).

In summary:

  • When main.o is linked, the linker must be able to find the dynamic library libtest.so and record the offset address of the Test symbols.
  • When the program runs, it must find libtest.so again and resolve the addresses of the Test symbols.

Compile-time Linking Libraries

The -L and -l linker options specify where to look and which library to link.

  • -l names the library to link, automatically adding the lib prefix when searching. For example, -lpthread and -lQt5Core search for libpthread.so, libpthread.a, libQt5Core.so, and libQt5Core.a.
    • If both the static and the dynamic library exist, the dynamic library takes priority.
  • -L specifies where to look for library files. For example, -L/home/threedog/test makes the linker search /home/threedog/test/libpthread.so and the other candidates in that directory first.
  • The most direct way to link a library is to write the library’s path on the command line.
  • Search Order:
    • If the full path to the library is given, that file is used directly and the search order below does not apply.
    • Directories given with -L have the highest priority.
    • Then the directories in the environment variable LIBRARY_PATH.
    • Lastly, the default directories /lib, /usr/lib, and /usr/local/lib, which are built into gcc when gcc itself is compiled.
    • If the library is found nowhere, the linker reports that the file or -lxxxx cannot be found.

So the link command above can be written in several ways:

  • g++ main.o libtest.so, or g++ main.o ./libtest.so
  • g++ main.o /home/threedog/test/libtest.so
  • g++ main.o -ltest -L., or g++ main.o -ltest -L/home/threedog/test/
  • LIBRARY_PATH=. g++ main.o -ltest, or LIBRARY_PATH=/home/threedog/test/ g++ main.o -ltest
  • Or copy libtest.so to the /usr/lib directory.

Runtime Linking Libraries

An a.out built with any of the above methods still fails when run. The ldd command shows the dynamically linked libtest.so marked as not found. This leads to the second issue: how to make the program find the library when it runs.

This is what -Wl,-rpath is for. -Wl indicates that what follows is passed to the linker, and -rpath followed by the directory containing the library records in the executable where the program should look for it at runtime. In effect, this manually adds a directory to the search path of the dynamic loader (ld.so).

Additionally, you can also successfully run the program by adding the path in the environment variable LD_LIBRARY_PATH.

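Order in which libraries are searched at runtime:

  1. The dynamic library search path recorded when linking the program (-rpath). Note that if the linker stores this path as DT_RUNPATH rather than DT_RPATH, which is the default on many current distributions, it is searched after LD_LIBRARY_PATH instead;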
  2. The dynamic library search path specified by the environment variable LD_LIBRARY_PATH;
  3. The dynamic library search path specified in the configuration file /etc/ld.so.conf;
  4. The default dynamic library search path /lib;
  5. The default dynamic library search path /usr/lib.

rpath and rpath-link

Strictly speaking, -rpath and -rpath-link are options of the linker ld rather than of gcc.

-rpath-link and -rpath look similar, but they are not closely related. Like -L, -rpath-link specifies directories used during linking; its role does not show up in our example. Suppose you link against libtest.so, and libtest.so itself requires xxx.so, which you have not named on the command line. The linker then looks for xxx.so first in the directories given by -rpath-link. Directories specified with -rpath-link have no effect at runtime.

Search Principles for C++ Header Files

The sections above covered the search order for libraries at link time and for dynamic libraries at runtime. For completeness, here is the search order for C++ header files during compilation:

  • #include <file.h> searches the directories specified by -I and then the default system include paths; it does not look in the directory of the including file.
  • #include "file.h" first searches the directory of the including file (usually the current directory) and then falls back to the directories specified by -I and the system’s default include paths.

Order:

  1. First search the current directory (for the "file.h" form only).
  2. Then search the directories specified by -I.
  3. Then search the directories in the gcc environment variable CPLUS_INCLUDE_PATH (C programs use C_INCLUDE_PATH).
  4. Finally, search the default directories of gcc, for example:
    • /usr/include
    • /usr/local/include
    • /usr/lib/gcc/x86_64-redhat-linux/4.1.1/include

That concludes the summary of the gcc options. Below are some common questions that follow from the explanations above:

Question 1: What does -l link to, a dynamic library or a static library?

  • If both .so and .a exist in the linking path, it prioritizes linking .so.

Question 2: If both static and dynamic libraries exist in the path, how to link the static library?

  • The best way is to write the full path of the static library directly in the parameters.
  • Another way is to use the -static option, which forces static linking for every library, including the C and C++ runtime libraries. The generated file runs normally, but it is a fully static executable: ldd reports that it is not a dynamic executable, and readelf -d finds no dynamic section.

Question 3: If the file does not use the corresponding library, will the compiler still link?

  • This depends on the toolchain and how it is configured. The author's local gcc 5.4 does not link a library given with -l if it is not used, whereas the author's local clang links it even when it is unused. This behavior is typically governed by the linker's --as-needed option, which some distributions enable by default for gcc.
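Both behaviors can be requested explicitly, which makes the difference reproducible on one machine. The sketch assumes g++ with GNU ld on Linux; unused.cpp is a hypothetical program that never touches the library:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Trimmed-down copies of the article's test.h and test.cpp.
cat > test.h <<'EOF'
class Test { public: void hello(); };
EOF
cat > test.cpp <<'EOF'
#include "test.h"
#include <iostream>
void Test::hello() { std::cout << "hello" << std::endl; }
EOF

# unused.cpp is hypothetical: it does not reference anything in libtest.
cat > unused.cpp <<'EOF'
int main() { return 0; }
EOF

g++ test.cpp -fPIC -shared -o libtest.so

# --no-as-needed: the library is recorded even though nothing uses it.
g++ unused.cpp -L. -Wl,--no-as-needed -ltest -o keep.out

# --as-needed: the unused library is dropped from the dependencies.
g++ unused.cpp -L. -Wl,--as-needed -ltest -o drop.out

readelf -d keep.out | grep libtest
readelf -d drop.out | grep libtest || echo "drop.out: libtest was dropped"
```

The options must appear before the -l they are meant to affect, because the linker applies them to the libraries that follow on the command line.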
