The Top Ten Features of C++17 for Business Code

Author: Jinshang, Backend Developer at Tencent WXG

Since the advent of modern C++, the C++ language standard has established a convention of releasing a new version every three years: C++11 marked the beginning of modern C++, C++14 filled in the gaps of C++11 without adding many new features, and C++17, as the first major version after C++11, signifies the gradual maturation of modern C++. The WXG compiler has been upgraded to gcc7.5 for some time, and my project team has also upgraded all code to C++17. After using C++17 for over a year, I have summarized the ten most useful features of C++17 in business code.

Note 1: This article only includes features supported by wxg’s gcc7.5; features such as Execution Policy and File System that are not supported are not included.

Note 2: This article only includes features applicable to business logic; features like Fold Expression and Mathematical Special Functions that are suitable for metaprogramming and scientific computing are not included.

I have broadly categorized these features into three types: syntactic sugar, performance improvements, and type system enhancements.

Syntactic Sugar

The syntactic sugar referred to here is not limited to language-level syntax; it also includes library functions and types that make code more concise and readable:

Structured Bindings

The most convenient syntactic sugar in C++17 is structured bindings. Structured bindings allow the members of an array, tuple, or struct to be bound to a set of variables, with the most common scenario being the iteration over a map/unordered_map without needing to declare an intermediate variable:

// pre c++17
for(const auto& kv: map){
  const auto& key = kv.first;
  const auto& value = kv.second;
  // ...
}

// c++17
for(const auto& [key, value]: map){
  // ...
}

*: Strictly speaking, the result of a structured binding is not a variable; the C++17 standard treats it as a name (alias) for the bound element, and as a consequence lambdas cannot capture it. gcc is more permissive than the C++17 standard here, so the following code compiles with gcc but not with clang:

for(const auto& [key, value]: map){
    [&key, &value] {
        std::cout << key << ": " << value << std::endl;
    }();
}

In a clang environment, you can explicitly introduce a reference variable during lambda expression capture to compile:

for(const auto& [key, value]: map){
    [&key = key, &value = value] {
        std::cout << key << ": " << value << std::endl;
    }();
}

This restriction has been removed in C++20, so in the C++20 standard, both gcc and clang can capture structured binding objects.
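
Structured bindings are not limited to map iteration. The following is a minimal sketch of the other cases mentioned above (a struct, a tuple, and the pair returned by std::map::insert); Point, load_record, and bindings_demo are hypothetical names used only for illustration:

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical aggregate used only for this illustration.
struct Point {
    int x;
    int y;
};

// Hypothetical helper that returns several values at once.
std::tuple<int, std::string, double> load_record() {
    return {42, "alice", 3.5};
}

// Binds the members of a struct, the elements of a tuple, and the
// pair returned by std::map::insert, then combines them into one number.
int bindings_demo() {
    Point p{1, 2};
    auto [x, y] = p;                         // names for the members of a copy of p

    auto [id, name, score] = load_record();  // tuple elements

    std::map<std::string, int> ages;
    auto [it, inserted] = ages.insert({name, id});  // iterator + bool

    return x + y + (inserted ? it->second : 0) + static_cast<int>(score);
}
```

Binding with auto copies the source object, while auto& or const auto& binds to the original, which is why the map loop above uses const auto&.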

Implicit Type Deduction for std::tuple

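Before C++17, when constructing std::pair/std::tuple, you had to spell out the element types or use the std::make_pair/std::make_tuple functions. C++17 introduced class template argument deduction (CTAD) together with deduction guides for std::pair/std::tuple, so the element types can be deduced from the constructor arguments and no longer need to be written explicitly.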

// pre c++17
std::pair<double, std::string> p1{3.14, "pi"s};
auto p2 = std::make_pair(3.14, "pi"s);

// c++17
std::pair p3{3.14, "pi"s};
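
The same deduction applies to std::tuple and to other class templates such as std::vector. A small sketch, where ctad_demo is a hypothetical name and the static_asserts document the types I expect the compiler to deduce:

```cpp
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

using namespace std::string_literals;

// Builds a pair, a tuple, and a vector without spelling out template
// arguments, then returns the sum of their numeric parts.
double ctad_demo() {
    std::pair p{3.14, "pi"s};    // std::pair<double, std::string>
    std::tuple t{1, 2.5, "e"s};  // std::tuple<int, double, std::string>
    std::vector v{1, 2, 3};      // std::vector<int>

    static_assert(std::is_same_v<decltype(p), std::pair<double, std::string>>);
    static_assert(std::is_same_v<decltype(t), std::tuple<int, double, std::string>>);
    static_assert(std::is_same_v<decltype(v), std::vector<int>>);

    return p.first + std::get<0>(t) + std::get<1>(t) + static_cast<double>(v.size());
}
```

One thing to watch: without the s suffix, a string literal is deduced as const char* rather than std::string.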

if constexpr

The if constexpr statement is a compile-time if: the condition is evaluated at compile time, and inside a template the branch that is not taken is discarded rather than instantiated. Before C++17, compile-time conditions were often implemented through complex SFINAE mechanisms or template overloading; when that became too cumbersome, the check was simply moved to a runtime if, at some performance cost. if constexpr greatly alleviates this issue. For example, to implement a function that converts different types of input to strings, I needed three functions before C++17, while C++17 requires only one.

// pre c++17
template <typename T>
std::string convert(T input){
    return std::to_string(input);
}

// Special handling for const char* and string
std::string convert(const char* input){
    return input;
}
std::string convert(std::string input){
    return input;
}

// c++17
template <typename T>
std::string convert(T input) {
    if constexpr (std::is_same_v<T, const char*> ||
                  std::is_same_v<T, std::string>) {
        return input;
    } else {
        return std::to_string(input);
    }
}
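
As a variation on the function above, the condition can test for convertibility instead of listing exact types, so that string literals passed by reference are also covered. This is only a sketch; to_text is a hypothetical name, not part of the original code:

```cpp
#include <string>
#include <type_traits>

// Hypothetical variant of convert(): anything implicitly convertible to
// std::string (char arrays, const char*, std::string) takes the first branch.
template <typename T>
std::string to_text(const T& input) {
    if constexpr (std::is_convertible_v<const T&, std::string>) {
        return input;                  // only instantiated for string-like T
    } else {
        return std::to_string(input);  // only instantiated for the other types
    }
}
```

With an ordinary if, both branches would be instantiated for every T, and return input; would fail to compile for int. With if constexpr, the branch that does not apply is discarded, so to_text(42) and to_text("abc") both compile.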

if Initialization Statement

C++17 allows an initialization statement before the if condition, so a variable that is only used within the if statement can be declared inside it, which limits its scope and improves readability. This is particularly helpful for objects whose lifetime matters, such as RAII locks and iterators, because it makes correctness easier to ensure.

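// c++17
#include <algorithm>
#include <csignal>
#include <cstdio>
#include <iostream>
#include <map>
#include <mutex>
#include <string>

std::map<int, std::string> m;
std::mutex mx;
extern bool shared_flag; // guarded by mx
extern std::string tok;  // token being checked
// Helpers assumed to be defined elsewhere:
void unsafe_ping();
int ReadBytesWithSignal(int* signal);
void publish(int count);

int demo()
{
    if (auto it = m.find(10); it != m.end()) { return it->second.size(); }
    if (char buf[10]; std::fgets(buf, 10, stdin)) { m[0] += buf; }
    if (std::lock_guard lock(mx); shared_flag) { unsafe_ping(); shared_flag = false; }
    if (int s; int count = ReadBytesWithSignal(&s)) { publish(count); std::raise(s); }
    // std::ranges::any_of is C++20; std::any_of is the C++17 equivalent.
    if (const auto keywords = {"if", "for", "while"};
        std::any_of(keywords.begin(), keywords.end(),
                    [](const char* kw) { return tok == kw; }))
    {
        std::cerr << "Token must not be a keyword\n";
    }
    return 0;
}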
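
The initialization statement also combines well with structured bindings. A minimal sketch, assuming a hypothetical register_user helper that keeps the first id registered for a name:

```cpp
#include <map>
#include <string>

// Hypothetical helper: returns the id stored for `name`, which is the new
// id if the name was inserted and the existing id otherwise.
int register_user(std::map<std::string, int>& users,
                  const std::string& name, int id) {
    // `it` and `inserted` are visible only inside this if statement.
    if (auto [it, inserted] = users.try_emplace(name, id); !inserted) {
        return it->second;  // the existing entry wins
    }
    return id;
}
```

Because the iterator and the flag go out of scope with the if, they cannot be reused by accident later in the function.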

Performance Improvements

std::shared_mutex

shared_mutex is the native read-write lock implementation in C++, supporting both shared and exclusive lock modes, suitable for high-concurrency read scenarios, enhancing performance through shared locks for readers. Before C++17, you could only implement read-write locks yourself using exclusive locks and condition variables or use the less efficient std::shared_timed_mutex introduced in C++14. Below is a thread-safe counter implemented using shared_mutex:

// c++17
class ThreadSafeCounter {
 public:
  ThreadSafeCounter() = default;

  // Multiple threads/readers can read the counter's value at the same time.
  unsigned int get() const {
    std::shared_lock lock(mutex_);
    return value_;
  }

  // Only one thread/writer can increment/write the counter's value.
  unsigned int increment() {
    std::unique_lock lock(mutex_);
    return ++value_;
  }

  // Only one thread/writer can reset/write the counter's value.
  void reset() {
    std::unique_lock lock(mutex_);
    value_ = 0;
  }

 private:
  mutable std::shared_mutex mutex_;
  unsigned int value_ = 0;
};
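
The same pattern fits read-mostly data in general. The sketch below is not from our code base: ConfigCache and count_hits are hypothetical names, and the threads only show that several readers can hold the shared lock at the same time:

```cpp
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical read-mostly cache guarded by std::shared_mutex.
class ConfigCache {
 public:
  std::optional<std::string> get(const std::string& key) const {
    std::shared_lock lock(mutex_);  // many readers may hold this at once
    auto it = data_.find(key);
    if (it == data_.end()) return std::nullopt;
    return it->second;
  }

  void set(const std::string& key, std::string value) {
    std::unique_lock lock(mutex_);  // writers get exclusive access
    data_.insert_or_assign(key, std::move(value));
  }

 private:
  mutable std::shared_mutex mutex_;
  std::map<std::string, std::string> data_;
};

// Starts `readers` threads that each look up the same key `reads` times
// and returns how many lookups found a value.
int count_hits(const ConfigCache& cache, int readers, int reads) {
  std::vector<int> hits(readers, 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < readers; ++i) {
    threads.emplace_back([&cache, &hits, i, reads] {
      for (int n = 0; n < reads; ++n) {
        if (cache.get("region")) ++hits[i];  // each thread writes its own slot
      }
    });
  }
  for (auto& t : threads) t.join();

  int total = 0;
  for (int h : hits) total += h;
  return total;
}
```

Whether shared_mutex actually beats a plain std::mutex depends on the read/write ratio and on how long the lock is held, so it is worth measuring before switching.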

std::string_view

std::string_view is, as the name suggests, a “view” of a string: its data members are just a pointer to the characters and a length. It offers almost all of the read-only interface of std::string (c_str() is a notable exception, because a view need not be null-terminated). It does not own the string, and it can be constructed from both std::string and const char*.

Before C++17, we often used const std::string& for read-only strings. Compared with that, std::string_view has two performance advantages:

  1. It accepts both string types without conversion. If a C string (const char*) is passed to a const std::string& parameter, a temporary std::string is constructed, which copies the characters and usually allocates on the heap; std::string_view avoids both.
  2. When handling substrings, std::string::substr copies the characters into a new string, which may allocate, while std::string_view::substr only adjusts the pointer and length. This is a significant advantage when parsing large files.

// from https://stackoverflow.com/a/40129046
// author: Pavel Davydov

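// In the cited benchmark, the string_view version is 15 times faster
// than the const std::string& version.
std::string remove_prefix(const std::string &str) {
  return str.substr(3);  // copies the remaining characters into a new string
}
std::string_view remove_prefix(std::string_view str) {
  str.remove_prefix(3);  // only moves the start pointer and shrinks the length
  return str;
}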

static void BM_remove_prefix_string(benchmark::State& state) {
  std::string example{"asfaghdfgsghasfasg3423rfgasdg"};
  while (state.KeepRunning()) {
    auto res = remove_prefix(example);
    // auto res = remove_prefix(string_view(example)); for string_view
    if (res != "aghdfgsghasfasg3423rfgasdg") {
      throw std::runtime_error("bad op");
    }
  }
}
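
Both advantages can be seen in a small parsing helper. This is a sketch only; key_of is a hypothetical function name:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: returns the part of `line` before the first '='.
// Nothing is copied; the result points into the caller's buffer.
std::string_view key_of(std::string_view line) {
    const auto pos = line.find('=');
    return line.substr(0, pos);  // pos == npos yields the whole line
}
```

The function accepts a string literal, a const char*, or a std::string without constructing a temporary std::string. The price is that the returned view is only valid while the underlying buffer is alive, so it should not be stored beyond the lifetime of the string it was created from.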

std::map/unordered_map try_emplace

When inserting elements into std::map/unordered_map, we often use emplace, which inserts the element if the key does not exist and does nothing otherwise. However, even when the key already exists, emplace may still construct the element to be inserted and then destroy it immediately once it finds that no insertion is needed, which wastes a construction and a destruction. C++17 introduced try_emplace to avoid this: if the key is already present, the value is not constructed and the arguments are left untouched. In addition, try_emplace separates the key and the value in the parameter list, which makes in-place construction simpler than with emplace.

std::map<std::string, std::string> m;
// emplace's in-place construction requires std::piecewise_construct, as it directly inserts std::pair<key, value>
m.emplace(std::piecewise_construct,
           std::forward_as_tuple("c"),
           std::forward_as_tuple(10, 'c'));

// try_emplace can directly construct in place, as the key and value are separated in the parameter list.
m.try_emplace("c", 10, 'c');

Additionally, C++17 also added the insert_or_assign function to std::map/unordered_map, making it easier to implement insert or modify semantics.
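
The difference is easiest to see with a move-only value. The following sketch uses hypothetical function names and relies on the guarantee that try_emplace does not move from its arguments when no insertion happens:

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Returns true when the pointer is left untouched by a failed try_emplace.
bool try_emplace_keeps_argument() {
    std::map<std::string, std::unique_ptr<int>> m;
    m.try_emplace("a", std::make_unique<int>(1));

    auto p = std::make_unique<int>(2);
    auto [it, inserted] = m.try_emplace("a", std::move(p));  // key already exists
    return !inserted && p != nullptr && *it->second == 1;
}

// Returns the final value after an insert followed by an overwrite.
int insert_or_assign_demo() {
    std::map<std::string, int> m;
    auto [it1, inserted1] = m.insert_or_assign("k", 1);  // inserts
    auto [it2, inserted2] = m.insert_or_assign("k", 2);  // overwrites
    return (inserted1 && !inserted2) ? it2->second : -1;
}
```

Unlike operator[], insert_or_assign reports through the returned bool whether the key was newly inserted, and it does not require the mapped type to be default-constructible.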

Type System Enhancements

C++17 further completed the C++ type system, finally introducing the long-awaited type-erasure container (Type Erasure) and algebraic data types (Algebraic Data Type).

std::any

std::any is a container that can store any copyable type. In C, similar functionality is often implemented using void*. Compared to void*, std::any has two advantages:

  1. std::any is safer: when type T is converted to void*, the type information of T is lost, and when converting back to a specific type, the program cannot determine whether the current void* is truly T, which can lead to safety risks. In contrast, std::any stores type information, and std::any_cast is a safe type conversion.
  2. std::any manages the object’s lifecycle; when std::any is destructed, it will destruct the stored object, while void* requires manual memory management.

std::any should rarely be the first choice for programmers. When the type is known, std::optional, std::variant, and inheritance are more efficient and reasonable choices. std::any should only be used when the type is completely unknown, such as in dynamic type text parsing or intermediate information transfer in business logic.
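
A short sketch of the two points above, with hypothetical function names: the pointer form of std::any_cast returns nullptr on a type mismatch, and the value form throws std::bad_any_cast:

```cpp
#include <any>
#include <string>

// Describes the value held by `a` by probing a few expected types.
std::string describe(const std::any& a) {
    if (!a.has_value()) return "empty";
    if (const int* i = std::any_cast<int>(&a)) return "int:" + std::to_string(*i);
    if (const auto* s = std::any_cast<std::string>(&a)) return "string:" + *s;
    return "unknown";
}

// Returns true if a mismatched any_cast throws std::bad_any_cast.
bool wrong_cast_throws() {
    std::any a = 42;
    try {
        (void)std::any_cast<std::string>(a);  // a holds an int, not a string
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```

With void*, the equivalent wrong cast would compile and silently produce undefined behavior.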

std::optional

std::optional<T> represents a T value that may or may not be present, corresponding to Maybe in Haskell, Option in Rust, and option in OCaml; it is essentially a Sum Type. It is commonly used as the return type of functions that may fail, such as factory functions. Before C++17, T* was often used as the return type, with nullptr indicating failure and any other value pointing to the actual result. This approach blurs ownership: the caller cannot tell whether it is responsible for freeing the T*. It also risks a segmentation fault whenever the caller forgets to check for null.

// pre c++17
ReturnType* func(const std::string& in) {
    // Check before allocating so the early return does not leak.
    if (in.size() == 0)
        return nullptr;
    ReturnType* ret = new ReturnType;
    // ...
    return ret;
}

// c++17 is safer and more intuitive
std::optional<ReturnType> func(const std::string& in) {
    if (in.size() == 0)
        return std::nullopt;
    ReturnType ret;
    // ...
    return ret;
}
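
On the caller side, std::optional makes the failure case explicit. A minimal sketch, where parse_port and port_or_default are hypothetical functions and 8080 is an arbitrary default:

```cpp
#include <optional>
#include <string>

// Hypothetical parser: returns the port number, or std::nullopt when the
// text is empty, contains a non-digit, or is out of range.
std::optional<int> parse_port(const std::string& text) {
    if (text.empty() || text.size() > 5) return std::nullopt;
    int port = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return std::nullopt;
        port = port * 10 + (c - '0');
    }
    if (port > 65535) return std::nullopt;
    return port;
}

// Caller side: fall back to a default when parsing fails.
int port_or_default(const std::string& text) {
    return parse_port(text).value_or(8080);
}
```

Besides value_or, the caller can test the optional in an if condition, or call value(), which throws std::bad_optional_access when the optional is empty.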

std::variant

std::variant<T, U, ...> represents a multi-type container, where the value in the container is one of the specified types, serving as a general Sum Type, corresponding to enum in Rust. It is a type-safe union, also known as a tagged union. Compared to union, it has two advantages:

  1. It can store complex types. A union handles trivial (POD) types directly, but for types with non-trivial constructors and destructors, such as std::vector and std::string, the user has to construct and destroy the active member manually.
  2. Type safety: variant records which alternative it currently holds, so access is checked (std::get throws std::bad_variant_access on a mismatch). Before C++17, similar functionality was often implemented using union + enum.

By using std::variant<T, Err>, users can implement functionality similar to Rust’s Result type (std::result::Result), which holds a result when the function succeeds and an error when it fails. The previous example can be modified as follows:

std::variant<ReturnType, Err> func(const std::string& in) {
    if (in.size() == 0)
        return Err{"input is empty"};
    ReturnType ret;
    // ...
    return ret;
}

It is important to note that C++17 only provides variant as a library type, without a corresponding pattern matching mechanism in the language. The closest substitute, std::visit, lacks dedicated compiler optimization support, which limits how useful std::variant is in C++17 and leaves it short of the sophisticated Sum Types in Rust and functional languages. However, many proposals surrounding std::variant have been submitted to the C++ committee for discussion, including pattern matching, std::expected, and more.
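
For reference, this is roughly what handling a variant with std::visit looks like in C++17. It is a sketch under assumptions: Err is a hypothetical error type, checked_length and report are hypothetical functions, and overloaded is a helper you have to write yourself because it is not part of the standard library:

```cpp
#include <string>
#include <variant>

// Hypothetical error type, mirroring the Err used in the text.
struct Err {
    std::string message;
};

// Helper that merges several lambdas into one overload set
// (a common C++17 idiom, not a standard library type).
template <class... Ts>
struct overloaded : Ts... {
    using Ts::operator()...;
};
template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

// Hypothetical function: returns the length of the input or an error.
std::variant<int, Err> checked_length(const std::string& in) {
    if (in.empty()) return Err{"input is empty"};
    return static_cast<int>(in.size());
}

// Handles both alternatives in one place with std::visit.
std::string report(const std::variant<int, Err>& result) {
    return std::visit(overloaded{
        [](int n) { return "ok: " + std::to_string(n); },
        [](const Err& e) { return "error: " + e.message; },
    }, result);
}
```

If one of the alternatives has no matching lambda, the call to std::visit fails to compile, which gives some of the exhaustiveness checking that pattern matching provides in other languages.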

In summary, the three types introduced in C++17 bring a more modern and safer type system to C++. Their corresponding use cases are:

  • std::any is suitable for scenarios previously using void* as a generic type.
  • std::optional is suitable for scenarios previously using nullptr to represent failure states.
  • std::variant is suitable for scenarios previously using union.

Conclusion

The above are the C++17 features I use most frequently in production environments. In addition to the ten features described in this article, C++17 also adds other very useful features, such as lambda capture of *this by value, the clamp function std::clamp(), and the [[nodiscard]] attribute, which makes the compiler warn when a return value is ignored. They are not elaborated on here due to space limitations. Interested readers are welcome to explore further.
