Interpretation of Google C++ Style Guide Series Part 2 (Scope)

Link to Previous Issue: Interpretation of Google C++ Style Guide Series Part 1 (Header Files)

2.1 Namespaces

Except for a few special cases, code should be placed within a namespace, which should have a unique name that includes the project name and may optionally include the file path. The use of <span>using namespace foo</span> is prohibited, as is the use of inline namespaces; see section 2.2 for information on unnamed namespaces.

Definition A namespace can divide the global scope into independent, named scopes, effectively preventing name collisions in the global scope.

Advantages Namespaces can avoid naming conflicts in large programs while allowing code to continue using short names. For example, if two projects each define a class named <span>Foo</span> in the global scope, the two symbols may collide at compile time or at run time. If each project places its code in its own namespace, <span>project1::Foo</span> and <span>project2::Foo</span> become distinct symbols that do not conflict.

Inline namespaces automatically place their names in the enclosing scope, for example:

namespace outer {
inline namespace inner {
  void foo();
}  // namespace inner
}  // namespace outer

At this point, the expression <span>outer::inner::foo()</span> is equivalent to <span>outer::foo()</span>. The primary purpose of inline namespaces is to maintain compatibility between different ABI versions.
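As a minimal sketch of that versioning use, the following assumes a hypothetical library <span>mylib</span> that ships two versions of a function <span>ApiVersion()</span>; none of these names come from the style guide. Because <span>v2</span> is inline, callers that write <span>mylib::ApiVersion()</span> get the current version, while older callers can still name <span>v1</span> explicitly.

```cpp
namespace mylib {

namespace v1 {
inline int ApiVersion() { return 1; }
}  // namespace v1

// The inline namespace makes its names visible directly in `mylib`.
inline namespace v2 {
inline int ApiVersion() { return 2; }
}  // namespace v2

}  // namespace mylib

// Hypothetical caller: the qualified name resolves to mylib::v2::ApiVersion.
inline int CurrentApiVersion() { return mylib::ApiVersion(); }
```

The guide still prohibits inline namespaces in ordinary code; the sketch only illustrates the mechanism.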

Disadvantages Namespaces can make code harder to understand, because it is not always obvious which definition a name refers to. Inline namespaces are harder still, because their names are no longer confined to the namespace that declares them; they are only useful as part of a larger versioning strategy. In some contexts, symbols must be referred to repeatedly by their fully qualified names, and deeply nested namespaces then make the code verbose.

Conclusion It is recommended to use namespaces as follows:

  • Follow Google naming conventions.
  • As in the previous example, end each namespace with a comment. (Translator’s note: the comment should state the name of the namespace.)
  • Wrap the entire source file in the namespace, after the <span>#include</span> directives and any forward declarations of classes from other namespaces:
// In the .h file
namespace mynamespace {

// All declarations are within the namespace scope.
// Notice the lack of indentation.
class MyClass {
 public:
  ...
  void Foo();
};

}  // namespace mynamespace

// In the .cc file
namespace mynamespace {

// Definition of functions is within scope of the namespace.
void MyClass::Foo() {
  ...
}

}  // namespace mynamespace

More complex <span>.cc</span> files have more details, such as flags or using declarations.

#include "a.h"

ABSL_FLAG(bool, someflag, false, "a flag");

namespace mynamespace {

using ::foo::Bar;

...code for mynamespace...    // Code goes against the left margin.

}  // namespace mynamespace
  • Do not declare anything in the <span>std</span> namespace, including forward declarations of standard library classes. Declaring entities in <span>std</span> is undefined behavior and therefore not portable. To use an entity from the standard library, include the corresponding header file.

  • Prohibit the use of using directives to introduce all symbols from a namespace.

// Prohibited: This pollutes the namespace
using namespace foo;
  • Do not use namespace aliases at namespace scope in header files, except in namespaces explicitly marked as internal, because anything introduced into a namespace in a header file becomes part of that file’s public API. Correct example:
// In .cc, use an alias to shorten commonly used names
namespace baz = ::foo::bar::baz;
// Shorten access to some commonly used names (in a .h file).
namespace librarian {
namespace internal {  // Internal, not part of the API.
namespace sidetable = ::pipeline_diagnostics::sidetable;
}  // namespace internal

inline void my_inline_function() {
  // namespace alias local to a function (or method).
  namespace baz = ::foo::bar::baz;
  ...
}
}  // namespace librarian
  • Prohibit inline namespaces.

  • If the name of a namespace contains “internal”, it indicates that users should not use these APIs.

// We shouldn't use this internal name in non-absl code.
using ::absl::container_internal::ImplementationDetail;
  • We encourage new code to use single-line nested namespace declarations, but it is not mandatory.

For example

namespace foo::bar {
...
}  // namespace foo::bar
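
The recommendations above can be combined in one short sketch. All names here (<span>project1</span>, <span>project2::util</span>, <span>myapp</span>, <span>DescribeBoth</span>) are hypothetical and chosen only for illustration. It shows two classes named <span>Foo</span> coexisting, a using-declaration that imports a single name instead of a whole namespace, and a namespace alias kept local to a function.

```cpp
#include <string>

namespace project1 {
struct Foo {
  std::string Name() const { return "project1::Foo"; }
};
}  // namespace project1

namespace project2::util {  // Single-line nested namespace declaration (C++17).
struct Foo {
  std::string Name() const { return "project2::util::Foo"; }
};
}  // namespace project2::util

namespace myapp {

using ::project1::Foo;  // Using-declaration: imports exactly one name.

std::string DescribeBoth() {
  namespace util = ::project2::util;  // Alias local to this function.
  return Foo().Name() + " and " + util::Foo().Name();
}

}  // namespace myapp
```

Both classes remain distinct symbols, so neither the compiler nor the linker sees a conflict.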

2.2 Internal Linkage

If definitions in a <span>.cc</span> file are not needed by other files, these definitions can be placed in an unnamed namespace or declared as <span>static</span> to achieve internal linkage. However, do not use these techniques in <span>.h</span> files.

Definition All declarations placed in an unnamed namespace have internal linkage. Functions and variables declared <span>static</span> also have internal linkage. This means that nothing declared this way can be accessed from another file. Even if another file declares an identical name, the two entities are completely independent.

Conclusion It is recommended that all code in <span>.cc</span> files that does not need to be used externally adopts internal linkage. Do not use internal linkage in <span>.h</span> files. Declarations in unnamed namespaces should follow the same format as named namespaces. In the closing comment, do not include the namespace name:

namespace {
...
}  // namespace
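
A minimal sketch of what this looks like inside a single <span>.cc</span> file follows. The function names are hypothetical; only <span>SquareThenDouble</span> has external linkage and can be called from other files.

```cpp
namespace mynamespace {
namespace {

// Internal linkage: invisible to other translation units.
int Square(int x) { return x * x; }

}  // namespace

// `static` gives a single function the same internal linkage.
static int Twice(int x) { return 2 * x; }

// External linkage: the only function in this file that other files can use.
int SquareThenDouble(int x) { return Twice(Square(x)); }

}  // namespace mynamespace
```

Another <span>.cc</span> file could define its own <span>Square</span> or <span>Twice</span> without causing a link-time conflict.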

2.3 Non-member Functions, Static Member Functions, and Global Functions

It is recommended to place non-member functions in namespaces; avoid using completely global functions. Do not use classes merely to group static members. Static methods of a class should be closely related to instances of the class or static data.

Advantages Non-member functions and static member functions can be useful in certain cases. Placing non-member functions in a namespace does not pollute the global namespace.

Disadvantages Sometimes non-member functions and static member functions are better suited to become members of a new class, especially when they need to access external resources or have clear dependencies.

Conclusion Sometimes we need to define a function that is independent of instances of a class. Such functions can be defined as static member functions or non-member functions. Non-member functions should not depend on external variables, and in most cases, should be located within a namespace. Do not create a new class merely to group static members; this is akin to adding a common prefix to all names, which is often unnecessary. If you define a non-member function for use only within a <span>.cc</span> file, use internal linkage to limit its scope.
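
To make the contrast concrete, here is a small sketch with hypothetical names (<span>myproject::units</span>, <span>UnitsHelper</span>). The first form groups a free function in a namespace; the second wraps the same function in a class whose only purpose is grouping, which is the pattern this section advises against.

```cpp
namespace myproject {
namespace units {

// Preferred: a non-member function grouped by a namespace.
constexpr double kMetersPerMile = 1609.344;
inline double MilesToMeters(double miles) { return miles * kMetersPerMile; }

}  // namespace units

// Discouraged: a class that exists only to hold static members.
class UnitsHelper {
 public:
  static double MilesToMeters(double miles) { return miles * 1609.344; }
};

}  // namespace myproject
```

Both forms compute the same result; the namespace version simply avoids a class that carries no state and is never instantiated.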

2.4 Local Variables

Place a function's variables in the narrowest scope possible, and initialize them in their declarations.

You can declare variables at any position in a C++ function. We advocate minimizing the scope of variables as much as possible, and declaring them as close to their first use as possible. This makes it easier for readers to find the declaration and understand the variable’s type and initial value. In particular, variables should be initialized directly rather than declared and then assigned, for example:

int i;
i = f();      // Bad: Initialization and declaration are separated.

int i = f();  // Good: Initialized at declaration.

int jobs = NumJobs();
// More code...
f(jobs);      // Bad: Initialization and usage are separated.

int jobs = NumJobs();
f(jobs);      // Good: Used immediately (or soon) after initialization.

std::vector<int> v;
v.push_back(1);  // Prefer initializing using brace initialization.
v.push_back(2);

std::vector<int> v = {1, 2};  // Good -- v starts initialized.

Variables used in <span>if</span>, <span>while</span>, and <span>for</span> statements should generally be declared within the statement to limit their scope to that statement. For example:

while (const char* p = strchr(str, '/')) str = p + 1;
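
Wrapped in a complete function, the same loop might look like the sketch below. The helper name <span>LastComponent</span> is an assumption made for this example; <span>std::strchr</span> is the standard library function from <span>&lt;cstring&gt;</span>.

```cpp
#include <cstring>

// Hypothetical helper: returns the part of `path` after the last '/'.
const char* LastComponent(const char* path) {
  // `p` is declared in the loop condition, so it exists only inside the loop.
  while (const char* p = std::strchr(path, '/')) path = p + 1;
  return path;
}
```

Because <span>p</span> is scoped to the loop, it cannot be misused after the loop ends.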

There is one caveat: if the variable is an object, its constructor is called every time it enters scope, and its destructor is called every time it goes out of scope.

// Inefficient implementation:
for (int i = 0; i < 1000000; ++i) {
  Foo f;  // My ctor and dtor get called 1000000 times each.
  f.DoSomething(i);
}

Declare such variables outside the loop scope:

Foo f;  // My ctor and dtor get called once each.
for (int i = 0; i < 1000000; ++i) {
  f.DoSomething(i);
}

Interpretation

Another point to note is the initialization of class member variables, especially pointer members:

  • Always initialize pointers explicitly: do not rely on default-initialization rules, which can leave a pointer member with an indeterminate value.

  • Prefer using in-class initialization:

class A {
    int* ptr = nullptr;  // Concise and less prone to omission
};

2.5 Static and Global Variables

Variables with static storage duration are prohibited unless they are trivially destructible. Informally, this means that the destructor does nothing, even taking member and base class destructors into account. Formally, it means that the type has no user-defined or virtual destructor and that all of its base classes and non-static members are trivially destructible. Function-local static variables may be dynamically initialized. Dynamic initialization of static class member variables and of variables at namespace scope is discouraged and is allowed only in limited circumstances; see below for details.

As a rule of thumb: a global variable meets these requirements if its declaration, considered in isolation, could be marked <span>constexpr</span>.

Definition Every object has a storage duration, which correlates with its lifetime. Objects with static storage duration live from the point of their initialization until the end of the program. Such objects are variables at namespace scope (global variables), static data members of classes, or function-local variables declared with the <span>static</span> specifier. Function-local static variables are initialized when control first passes through their declaration; all other objects with static storage duration are initialized as part of program startup. All objects with static storage duration are destroyed at program exit (this happens before any unjoined threads are terminated).

Initialization may be dynamic, meaning that something non-trivial happens during initialization (for example, a constructor that allocates memory, or a variable initialized with the current process ID). All other initialization is static. The two are not quite opposites, though: every object with static storage duration is first statically initialized (either to a given constant or to all zero bytes), and dynamic initialization happens after that, if required.
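
The timing rule for function-local statics can be observed directly. In the sketch below, all names (<span>g_init_calls</span>, <span>ExpensiveInit</span>, <span>GetValue</span>) are hypothetical. The counter is statically initialized to zero, and the dynamic initializer of <span>value</span> runs only when <span>GetValue()</span> is first called.

```cpp
// Constant-initialized global: set to zero before any code runs.
int g_init_calls = 0;

// Stands in for any non-trivial, run-time initializer.
int ExpensiveInit() {
  ++g_init_calls;
  return 42;
}

int GetValue() {
  // Dynamically initialized the first time control passes through this line.
  static int value = ExpensiveInit();
  return value;
}
```

Calling <span>GetValue()</span> repeatedly runs <span>ExpensiveInit()</span> only once; before the first call it has not run at all.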

Advantages Global or static variables are helpful in many scenarios: named constants, auxiliary data structures within a translation unit, command line flags, logging, registration mechanisms, background infrastructure, etc.

Disadvantages Global and static variables that are dynamically initialized or have non-trivial destructors add complexity and can easily lead to hard-to-find bugs. Dynamic initialization is not ordered across translation units, and neither is destruction (beyond the guarantee that destruction happens in the reverse order of initialization). When the initializer of one static variable refers to another variable with static storage duration, the access may happen before that variable’s lifetime has begun (or after it has ended). Moreover, if a program starts threads that are not joined at exit, those threads may try to access these objects after their destructors have already run.

Decisions about Destruction When destructors are trivial, their execution is not subject to ordering at all (they are effectively never “run”); otherwise we risk accessing objects after their lifetime has ended. Therefore, only objects with trivial destructors are allowed to have static storage duration. Fundamental types (such as pointers and <span>int</span>) are trivially destructible, as are arrays of trivially destructible types. Note that variables marked with <span>constexpr</span> are trivially destructible.

Allowed

const int kNum = 10;  // Allowed

struct X {int n; };
const X kX[] = {{1}, {2}, {3}};  // Allowed

void foo() {
  static const char* const kMessages[] = {"hello", "world"};  // Allowed
}

// Allowed: constexpr guarantees trivial destructor.
constexpr std::array<int, 3> kArray = {1, 2, 3};

Prohibited

// bad: non-trivial destructor
const std::string kFoo = "foo";

// Bad for the same reason, even though kBar is a reference (the
// rule also applies to lifetime-extended temporary objects).
const std::string& kBar = StrCat("a", "b", "c");

void bar() {
  // Bad: non-trivial destructor.
  static std::map<int, int> kData = {{1, 0}, {2, 0}, {3, 0}};
}
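
Whether a type qualifies can be checked at compile time with the standard trait <span>std::is_trivially_destructible</span>. The sketch below mirrors the allowed and prohibited types from the examples above; the helper <span>IsTriviallyDestructible</span> is a hypothetical convenience wrapper, not part of the style guide.

```cpp
#include <array>
#include <map>
#include <string>
#include <type_traits>

struct X { int n; };

// Hypothetical helper that wraps the standard type trait.
template <typename T>
constexpr bool IsTriviallyDestructible() {
  return std::is_trivially_destructible<T>::value;
}

// Types from the "Allowed" examples.
static_assert(IsTriviallyDestructible<int>(), "int");
static_assert(IsTriviallyDestructible<X[3]>(), "array of X");
static_assert(IsTriviallyDestructible<const char* const[2]>(), "array of pointers");
static_assert(IsTriviallyDestructible<std::array<int, 3>>(), "std::array<int, 3>");

// Types from the "Prohibited" examples.
static_assert(!IsTriviallyDestructible<std::string>(), "std::string");
static_assert(!IsTriviallyDestructible<std::map<int, int>>(), "std::map");
```

If any of these assumptions were wrong, the file would fail to compile, which makes such checks a cheap safeguard next to a static variable's declaration.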

Note that references are not objects, so they are not subject to the constraints on destructibility. The restrictions on dynamic initialization still apply to them, however. In particular, a function-local static reference of the form <span>static T& t = *new T;</span> is allowed.

Decisions about Initialization Initialization is a more complex topic because we need to consider not only the execution of constructors but also the evaluation of initialization expressions.

int n = 5;    // Fine
int m = f();  // ? (Depends on f)
Foo x;        // ? (Depends on Foo::Foo)
Bar y = g();  // ? (Depends on g and on Bar::Bar)

All statements except the first are exposed to indeterminate initialization order. The concept we need is what the C++ standard formally calls constant initialization: the initializing expression is a constant expression, and if the object is initialized by a constructor call, that constructor must be declared <span>constexpr</span> as well:

struct Foo { constexpr Foo(int) {} };

int n = 5;  // Fine, 5 is a constant expression.
Foo x(2);   // Fine, 2 is a constant expression and the chosen constructor is constexpr.
Foo a[] = { Foo(1), Foo(2), Foo(3) };  // Fine

Constant initialization is always allowed. Constant initialization of static storage duration variables should be marked with <span>constexpr</span> or <span>constinit</span>. Any static storage duration variable that is not marked this way should be presumed to be dynamically initialized and reviewed very carefully.
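
As a sketch of a user-defined type that satisfies both requirements, consider the hypothetical <span>RetryPolicy</span> below (the type, its fields, and <span>kDefaultRetry</span> are assumptions made for this example). It has a <span>constexpr</span> constructor and no user-defined destructor, so a <span>constexpr</span> global of this type is constant-initialized and trivially destructible.

```cpp
#include <type_traits>

// Hypothetical configuration type with a constexpr constructor.
struct RetryPolicy {
  constexpr RetryPolicy(int attempts, int delay)
      : max_attempts(attempts), delay_ms(delay) {}
  int max_attempts;
  int delay_ms;
};

// constexpr forces constant initialization; a dynamic initializer would not compile.
constexpr RetryPolicy kDefaultRetry(3, 250);

static_assert(std::is_trivially_destructible<RetryPolicy>::value,
              "RetryPolicy must be trivially destructible");
static_assert(kDefaultRetry.max_attempts == 3, "evaluated at compile time");

inline int TotalDelayMs(const RetryPolicy& policy) {
  return policy.max_attempts * policy.delay_ms;
}
```

Marking the variable <span>constexpr</span> turns an accidental dynamic initializer into a compile error instead of an ordering bug.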

As a counterexample, the following initialization processes have issues:

// Some declarations used below.
time_t time(time_t*);      // Not constexpr!
int f();                   // Not constexpr!
struct Bar { Bar() {} };

// Problematic initializations.
time_t m = time(nullptr);  // Initializing expression not a constant expression.
Foo y(f());                // Ditto
Bar b;                     // Chosen constructor Bar::Bar() not constexpr.

Dynamic initialization of global variables is discouraged and, in general, prohibited. It is permitted, however, when no part of the program depends on how this initialization is sequenced relative to all other initializations; in that case, a change in initialization order makes no difference. For example:

int p = getpid();  // If no other static variables will use p during initialization, this is allowed.

Recommendations

  • Global strings: If you need a named global or static string constant, use a <span>constexpr</span> <span>string_view</span> variable, a character array, or a character pointer that points to a string literal. String literals already have static storage duration, so they are usually sufficient.

  • Dynamic containers such as maps and sets: If you need to store fixed data (e.g., a lookup table) in a static variable, do not use the dynamic containers from the standard library, because they have non-trivial destructors. Consider arrays of trivial types instead, such as an array of arrays of <span>int</span> (as a map from <span>int</span> to <span>int</span>) or an array of pairs (e.g., pairs of <span>int</span> and <span>const char*</span>). For small collections, linear search is entirely sufficient, and it is efficient thanks to memory locality; common operations can be implemented with the tools in absl/algorithm/container.h. If necessary, keep the data sorted and use binary search.

  • Smart pointers (e.g., <span>std::unique_ptr</span> and <span>std::shared_ptr</span>) release resources during destruction and are therefore not allowed as static variables. Consider whether your use case fits one of the other patterns described in this section. A simple solution is to use a plain pointer to a dynamically allocated object and never delete that object (see the last item).

  • If static or constant data is of a type you define yourself, give that type a trivial destructor and a <span>constexpr</span> constructor.

  • If none of the above apply, you can create an object dynamically and never delete it, using a function-local static pointer or reference (e.g., <span>static const auto& impl = *new T(args...);</span>).
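
The container recommendation can be sketched as a sorted <span>constexpr</span> array searched with <span>std::lower_bound</span>. The table contents and the names <span>StatusEntry</span>, <span>kStatusNames</span>, and <span>StatusName</span> are hypothetical; the example assumes C++17 for <span>std::string_view</span>.

```cpp
#include <algorithm>
#include <iterator>
#include <string_view>

struct StatusEntry {
  int code;
  std::string_view name;
};

// Kept sorted by `code` so that binary search is valid.
constexpr StatusEntry kStatusNames[] = {
    {200, "OK"},
    {404, "Not Found"},
    {500, "Internal Server Error"},
};

// Returns the name for `code`, or "Unknown" if the table has no such entry.
std::string_view StatusName(int code) {
  const StatusEntry* it = std::lower_bound(
      std::begin(kStatusNames), std::end(kStatusNames), code,
      [](const StatusEntry& entry, int value) { return entry.code < value; });
  if (it != std::end(kStatusNames) && it->code == code) return it->name;
  return "Unknown";
}
```

The table needs no run-time construction and no destructor, so it is unaffected by initialization and destruction order.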

Interpretation

1. Core Restrictions for Static Storage Duration Objects

Objects with static storage duration (such as global variables, static member variables, and function-local static variables) must meet the following requirements to avoid potential issues caused by initialization and destruction order:

1.1 Destructors must be trivial

  • Definition: The type has no user-defined or virtual destructor, and all base classes and non-static members are trivially destructible.

    • In simple terms: The destructor does not perform any operations (such as releasing memory, IO, etc.).
  • Example:

    // Allowed: Basic types (trivial destructible)
    const int kNum = 10;  
    // Allowed: Struct with no user-defined destructor, members are basic types
    struct X { int n; };  
    const X kX[] = {{1}, {2}, {3}};  
    // Allowed: constexpr guarantees trivial destructor
    constexpr std::array<int, 3> kArray = {1, 2, 3};  
    // Prohibited: std::string destructor is non-trivial
    const std::string kFoo = "foo";  
    

1.2 Initialization must be constant initialization

  • The initialization expression must be a constant expression, and the constructor must be <span>constexpr</span>.

  • Example:

    // Allowed: Constant expression initialization
    int n = 5;  
    struct Foo { constexpr Foo(int) {} };  
    Foo x(2);  // Constructor is constexpr, parameter is constant  
    // Prohibited: Initialization depends on runtime function
    time_t m = time(nullptr);  // time() is not constexpr  
    Foo y(f());                // f() returns a non-compile-time constant  
    

2. Risks and Issues of Static Storage Duration Objects

Uncertainty of Dynamic Initialization Order

  1. The dynamic initialization order of global/static variables is not defined across translation units, which may lead to objects being accessed before they are initialized.
// Dangerous example: global_b depends on global_a, which lives in another translation unit
// a.cc (hypothetical file)
int global_a = f();  // Dynamic initialization
// b.cc (hypothetical file)
extern int global_a;
int global_b = global_a + 1;  // If b.cc is initialized first, global_a still holds its zero-initialized value

Destruction Order and Thread Safety Issues

  1. The destruction order is the reverse of the initialization order, but threads may still access resources after an object has been destructed (e.g., threads that have not joined).
static std::map<int, int> kData;  // Prohibited: Non-trivial destructor, may be accessed by threads after destruction  

3. Practices and Alternatives

Prioritize Compile-Time Constants

  1. Declare global variables with <span>constexpr</span> to ensure compile-time initialization and a trivial destructor.
constexpr std::string_view kMsg = "hello";  // Legal since C++17, compile-time constant
  2. Avoid dynamic containers as static variables

    If you need static collections, use arrays of basic types or compile-time computable structures

    // Alternative: Use arrays to simulate static mapping tables
    struct Entry { int key; const char* value; };
    constexpr Entry kTable[] = {{1, "a"}, {2, "b"}};

  3. Function-local Static Pointer References

    If you must use dynamic resources, use a static pointer to point to a dynamically allocated object (without releasing it):

    MyClass& GetGlobalInstance() {  
      static MyClass* instance = new MyClass();  // Risk of memory leak, but avoids destruction order issues  
      return *instance;  
    }  
    

4. Common Prohibited Scenario Examples

| Scenario | Code Example | Reason for Prohibition |
| --- | --- | --- |
| Global STL String | <span>const std::string kStr = "abc";</span> | Destructor is non-trivial |
| Static Mapping Container | <span>static std::map<int, int> kMap;</span> | Destructor is non-trivial, initialization order is uncertain |
| Dynamically Initialized Global Variable | <span>int g_var = getpid();</span> | Initialization depends on a runtime function (tolerated only if nothing else depends on its initialization order) |
| Non-constexpr Constructor | <span>struct Bar { Bar() {}; };</span> | Constructor is not <span>constexpr</span>, so a static <span>Bar</span> is dynamically initialized |

5. Summary: For Static Storage Duration Objects

  • Destructor Checks: The type must be trivially destructible (declaring the variable with <span>constexpr</span> is recommended).

  • Initialization Checks: Initial values must be constant expressions (mark the variable with <span>constexpr</span> or <span>constinit</span>).

  • Alternatives: Avoid dynamic containers/non-trivial types, prioritize basic arrays, <span>string_view</span>, and compile-time structures.

By strictly adhering to the above rules, potential bugs caused by the initialization and destruction order of static/global variables can be avoided, enhancing program stability.

2.6 thread_local Variables

<span>thread_local</span> variables defined outside functions must be initialized with compile-time constants, and <span>constinit</span> must be used to enforce this rule. Prefer <span>thread_local</span> over other ways of defining thread-local data.

Definition: We can declare variables with the <span>thread_local</span> specifier:

thread_local Foo foo = ...;

Such a variable is actually a collection of objects: each thread accesses its own instance when it uses the variable. <span>thread_local</span> variables resemble static storage duration objects in many ways. For example, they can be declared at namespace scope, inside functions, or as static class members, but not as ordinary class members.
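
The "one object per thread" behavior can be seen in a small sketch. The names <span>t_counter</span> and <span>IncrementOnNewThread</span> are hypothetical, and the example assumes the program is built with thread support (for example, <span>-pthread</span> on toolchains that require it).

```cpp
#include <thread>

// Constant-initialized; every thread gets its own copy starting at 0.
thread_local int t_counter = 0;

// Increments the new thread's copy `times` times and returns its final value.
int IncrementOnNewThread(int times) {
  int result = 0;
  std::thread worker([&result, times] {
    for (int i = 0; i < times; ++i) ++t_counter;  // Worker thread's own copy.
    result = t_counter;
  });
  worker.join();
  return result;
}
```

The worker starts from its own zero-valued copy, and the calling thread's copy of <span>t_counter</span> is left unchanged.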

<span>thread_local</span> instances are initialized much like static variables, except that they are initialized separately for each thread rather than once at program startup. This means that function-local <span>thread_local</span> variables are safe, whereas other <span>thread_local</span> variables are subject to the same initialization order issues as static variables (and more besides).

<span>thread_local</span> variables also have a subtle destruction order issue: when a thread terminates, its <span>thread_local</span> variables are destroyed in the reverse order of their initialization (as is generally the rule in C++). If code triggered by the destructor of a <span>thread_local</span> variable refers to another <span>thread_local</span> variable on that thread that has already been destroyed, the result is a particularly hard-to-debug use-after-free.

Advantages:

  • Thread-local data is inherently safe from data races (because normally only one thread can access it), which makes <span>thread_local</span> useful for concurrent programming.

  • Among the various ways of creating thread-local data, <span>thread_local</span> is the only one supported by the language standard.

Disadvantages:

  • Accessing a <span>thread_local</span> variable may trigger an unpredictable and uncontrollable amount of other code, either at thread start or on first use in a given thread.

  • <span>thread_local</span> variables are essentially global variables. Apart from thread safety, they have all the drawbacks of global variables.

  • In the worst case, the memory consumed by a <span>thread_local</span> variable is proportional to the number of running threads, which can be quite large.

  • Data members cannot be <span>thread_local</span> unless they are also <span>static</span>.

  • If a <span>thread_local</span> variable has a complex destructor, we may run into use-after-free bugs. In particular, the destructor must not (directly or indirectly) access any other <span>thread_local</span> variable that may already have been destroyed. This property is hard to verify.

  • The techniques used to avoid use-after-free in global and static variables do not work for <span>thread_local</span>. Specifically, skipping destructors for global and static variables is acceptable because their lifetimes end at program shutdown, so the operating system quickly reclaims any leaked memory and other resources. If we skip the destructors of <span>thread_local</span> variables, however, the amount of leaked resources is proportional to the total number of threads that terminate during the lifetime of the program.

Decisions <span>thread_local</span> variables at class or namespace scope may only be initialized with true compile-time constants (i.e., they must not be dynamically initialized). They must be marked with <span>constinit</span> to enforce this (<span>constexpr</span> also works, but this should be rare).

    constinit thread_local Foo foo = ...;
    

Function-local <span>thread_local</span> variables have no initialization concerns, but they still carry a risk of use-after-free at thread exit. Note that you can simulate a class-scope or namespace-scope <span>thread_local</span> variable by defining a function or static method that exposes a function-local <span>thread_local</span> variable:

    Foo& MyThreadLocalFoo() {
      thread_local Foo result = ComplicatedInitialization();
      return result;
    }
    

Note that <span>thread_local</span> variables are destroyed when their thread exits. If the destructor of such a variable refers to any other (possibly already destroyed) <span>thread_local</span> variable, the result is a hard-to-debug use-after-free. Prefer trivial types, or types that run no user-provided code in their destructors, to minimize the chance of accessing another <span>thread_local</span> variable. <span>thread_local</span> should be preferred over other mechanisms for defining thread-local data.
