Introduction
Asynchronous programming is an indispensable part of modern C++ and can significantly improve program performance. std::async, the standard library's function template for launching asynchronous operations, gives developers a simple yet powerful way to run tasks asynchronously. This article examines the functionality and usage of std::async and how its implementation differs across compilers, to help you understand and use this tool more effectively.
std::async Basic Usage
std::async is declared in the <future> header. Its basic job is to run a function asynchronously and return a std::future object that will hold the result of the call.
The std::async declarations (shown in their pre-C++17 form, which uses std::result_of):
template <class Fn, class... ArgTypes>
future<typename result_of<Fn(ArgTypes...)>::type>
async(Fn&& fn, ArgTypes&&... args);

template <class Fn, class... ArgTypes>
future<typename result_of<Fn(ArgTypes...)>::type>
async(launch policy, Fn&& fn, ArgTypes&&... args);
In the second form, a launch policy can be specified. std::launch is an enumeration class:
- launch::deferred: the function call is deferred until wait() or get() is called on the returned future.
- launch::async: the function is executed on a new, independent thread. (Whether that thread comes from a thread pool or is newly created depends on the implementation.)
- launch::deferred | launch::async: the default for std::async; the implementation decides whether to run the task asynchronously (on a new thread) or deferred (without creating a thread).
The basic usage is as follows:
#include <future>
#include <print>

int foo(int a) {
    return a;
}

int main() {
    // Default policy
    std::future<int> f = std::async(&foo, 10);
    // Run on a new thread
    std::future<int> f1 = std::async(std::launch::async, []() { return 0; });
    // Deferred call
    std::future<int> f2 = std::async(std::launch::deferred, []() { return 0; });

    std::println("result is: {}", f.get());
    std::println("result is: {}", f1.get());
    std::println("result is: {}", f2.get());
    return 0;
}
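To see at run time which policy actually took effect, you can poll the returned future with wait_for: a zero-duration wait on a deferred task reports std::future_status::deferred without running it. Below is a minimal sketch assuming a C++23 compiler; the describe helper is my own, not a standard facility.

#include <chrono>
#include <future>
#include <print>

// Hypothetical helper: classify a future's state without blocking on it.
template <class T>
const char* describe(std::future<T>& f) {
    switch (f.wait_for(std::chrono::seconds(0))) {
    case std::future_status::deferred: return "deferred (runs only on get()/wait())";
    case std::future_status::ready:    return "ready";
    default:                           return "running on another thread";
    }
}

int main() {
    auto fa = std::async(std::launch::async,    [] { return 1; });
    auto fd = std::async(std::launch::deferred, [] { return 2; });
    std::println("async:    {}", describe(fa));
    std::println("deferred: {}", describe(fd));
    std::println("values: {} {}", fa.get(), fd.get()); // get() runs the deferred task here
    return 0;
}

For the default policy, the status you observe depends on which branch the implementation chose.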
The basic usage needs little elaboration; next, we focus on the details of std::async.
std::async Launch Policy In-Depth Analysis
The C++ standard specifies that calling std::async without a policy behaves as if the policy were std::launch::async | std::launch::deferred, and GCC, LLVM, and MSVC all forward the default overload accordingly. What the standard leaves open is which of the two policies the implementation actually picks at run time. So what does the default policy resolve to on each platform?
GCC Platform
In GCC, the default is launch::async | launch::deferred:
/// async, potential overload
template<typename _Fn, typename... _Args>
_GLIBCXX_NODISCARD inline future<__async_result_of<_Fn, _Args...>>
async(_Fn&& __fn, _Args&&... __args)
{
return std::async(launch::async|launch::deferred,
std::forward<_Fn>(__fn),
std::forward<_Args>(__args)...);
}
In practice the chosen policy is launch::async; libstdc++ falls back to a deferred state only if thread creation fails with resource_unavailable_try_again and deferred is also part of the policy:
/// async
template<typename _Fn, typename... _Args>
_GLIBCXX_NODISCARD future<__async_result_of<_Fn, _Args...>>
async(launch __policy, _Fn&& __fn, _Args&&... __args)
{
std::shared_ptr<__future_base::_State_base> __state;
if ((__policy & launch::async) == launch::async)
{
__try
{
__state = __future_base::_S_make_async_state(
std::thread::__make_invoker(std::forward<_Fn>(__fn),
std::forward<_Args>(__args)...)
);
}
#if __cpp_exceptions
catch(const system_error& __e)
{
if (__e.code() != errc::resource_unavailable_try_again
|| (__policy & launch::deferred) != launch::deferred)
throw;
}
#endif
}
if (!__state)
{
__state = __future_base::_S_make_deferred_state(
std::thread::__make_invoker(std::forward<_Fn>(__fn),
std::forward<_Args>(__args)...));
}
return future<__async_result_of<_Fn, _Args...>>(__state);
}
LLVM
LLVM (libc++) uses a dedicated value, launch::any, for the default policy:
template <class _Fp, class... _Args>
_LIBCPP_NODISCARD_AFTER_CXX17 inline _LIBCPP_INLINE_VISIBILITY
future<typename __invoke_of<typename decay<_Fp>::type, typename decay<_Args>::type...>::type>
async(_Fp&& __f, _Args&&... __args)
{
return _VSTD::async(launch::any, _VSTD::forward<_Fp>(__f),
_VSTD::forward<_Args>(__args)...);
}
In fact, launch::any is simply the combination of launch::async and launch::deferred:
enum class launch
{
async = 1,
deferred = 2,
any = async | deferred
};
However, the policy libc++ actually selects is launch::async; it falls back to a deferred state only if creating the asynchronous state throws and deferred is allowed:
template <class _Fp, class... _Args>
_LIBCPP_NODISCARD_AFTER_CXX17
future<typename __invoke_of<typename decay<_Fp>::type, typename decay<_Args>::type...>::type>
async(launch __policy, _Fp&& __f, _Args&&... __args)
{
typedef __async_func<typename decay<_Fp>::type, typename decay<_Args>::type...> _BF;
typedef typename _BF::_Rp _Rp;
#ifndef _LIBCPP_NO_EXCEPTIONS
try
{
#endif
if (__does_policy_contain(__policy, launch::async))
return _VSTD::__make_async_assoc_state<_Rp>(_BF(__decay_copy(_VSTD::forward<_Fp>(__f)),
__decay_copy(_VSTD::forward<_Args>(__args))...));
#ifndef _LIBCPP_NO_EXCEPTIONS
}
catch ( ... ) { if (__policy == launch::async) throw ; }
#endif
if (__does_policy_contain(__policy, launch::deferred))
return _VSTD::__make_deferred_assoc_state<_Rp>(_BF(__decay_copy(_VSTD::forward<_Fp>(__f)),
__decay_copy(_VSTD::forward<_Args>(__args))...));
return future<_Rp>{};
}
MSVC
For MSVC, the default is likewise launch::async | launch::deferred:
_EXPORT_STD template <class _Fty, class... _ArgTypes>
_NODISCARD_ASYNC future<_Invoke_result_t<decay_t<_Fty>, decay_t<_ArgTypes>...>> async(
_Fty&& _Fnarg, _ArgTypes&&... _Args) {
// manages a callable object launched with default policy
return _STD async(launch::async | launch::deferred, _STD forward<_Fty>(_Fnarg), _STD forward<_ArgTypes>(_Args)...);
}
And for the default policy the selected behavior is effectively launch::async: the combined value falls through to the default case, which creates a _Task_async_state:
template <class _Ret, class _Fty>
_Associated_state<typename _P_arg_type<_Ret>::type>* _Get_associated_state(launch _Psync, _Fty&& _Fnarg) {
// construct associated asynchronous state object for the launch type
switch (_Psync) { // select launch type
case launch::deferred:
return new _Deferred_async_state<_Ret>(_STD forward<_Fty>(_Fnarg));
case launch::async: // TRANSITION, fixed in vMajorNext, should create a new thread here
default:
return new _Task_async_state<_Ret>(_STD forward<_Fty>(_Fnarg));
}
}
std::launch::async In-Depth Analysis
We know that std::launch::async means the function is executed on a new, independent thread. However, the C++ standard does not spell out whether that thread must be freshly created or may be reused from a thread pool.
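One way to probe this from user code is to compare thread IDs across several tasks; C++23 can format std::thread::id directly with std::println. This is only a sketch of a probe, and repeated IDs are a hint rather than proof, since the OS may recycle the ID of a thread that has already exited.

#include <future>
#include <print>
#include <thread>

int main() {
    std::println("main thread: {}", std::this_thread::get_id());
    for (int i = 0; i < 3; ++i) {
        auto f = std::async(std::launch::async, [] {
            return std::this_thread::get_id(); // ID of the worker that ran this task
        });
        // Repeated worker IDs across iterations hint at thread reuse (e.g. a pool);
        // a fresh ID each time is consistent with one new thread per task.
        std::println("task {} ran on: {}", i, f.get());
    }
    return 0;
}

The library source below shows what each implementation actually does.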
GCC
GCC calls __future_base::_S_make_async_state, which creates an instance of _Async_state_impl. Its constructor starts a new std::thread:
// Shared state created by std::async().
// Starts a new thread that runs a function and makes the shared state ready.
template<typename _BoundFn, typename _Res>
class __future_base::_Async_state_impl final
: public __future_base::_Async_state_commonV2
{
public:
explicit
_Async_state_impl(_BoundFn&& __fn)
: _M_result(new _Result<_Res>()), _M_fn(std::move(__fn))
{
_M_thread = std::thread{ [this] {
__try
{
_M_set_result(_S_task_setter(_M_result, _M_fn));
}
__catch (const __cxxabiv1::__forced_unwind&)
{
// make the shared state ready on thread cancellation
if (static_cast<bool>(_M_result))
this->_M_break_promise(std::move(_M_result));
__throw_exception_again;
}
} };
}
};
LLVM
LLVM calls _VSTD::__make_async_assoc_state, which also starts a new std::thread and immediately detaches it:
template <class _Rp, class _Fp>
future<_Rp>
#ifndef _LIBCPP_HAS_NO_RVALUE_REFERENCES
__make_async_assoc_state(_Fp&& __f)
#else
__make_async_assoc_state(_Fp __f)
#endif
{
unique_ptr<__async_assoc_state<_Rp, _Fp>, __release_shared_count>
__h(new __async_assoc_state<_Rp, _Fp>(_VSTD::forward<_Fp>(__f)));
_VSTD::thread(&__async_assoc_state<_Rp, _Fp>::__execute, __h.get()).detach();
return future<_Rp>(__h.get());
}
MSVC
Now for the most interesting part: MSVC creates an instance of _Task_async_state, whose constructor creates a concurrency runtime task and hands it the callable:
// CLASS TEMPLATE _Task_async_state
template <class _Rx>
class _Task_async_state : public _Packaged_state<_Rx()> {
// class for managing associated synchronous state for asynchronous execution from async
public:
using _Mybase = _Packaged_state<_Rx()>;
using _State_type = typename _Mybase::_State_type;
template <class _Fty2>
_Task_async_state(_Fty2&& _Fnarg) : _Mybase(_STD forward<_Fty2>(_Fnarg)) {
_Task = ::Concurrency::create_task([this]() { // do it now
this->_Call_immediate();
});
this->_Running = true;
}
};
::Concurrency::create_task is part of the Microsoft Parallel Patterns Library (PPL). According to Microsoft's documentation, the task class obtains its threads from the Windows thread pool rather than creating new ones.

This leads to an important caveat: with a thread-pool implementation there is no guarantee that thread_local variables are destroyed when a task completes, because threads taken from the pool are not torn down. You will therefore find that after using std::async the threads are not released: each call effectively borrows a thread from the system thread pool, that thread counts toward the process's thread count, and it stays alive, so the more std::async tasks you run, the more pool threads accumulate.
As a result, on MSVC std::async is limited by the Windows thread pool's default maximum, which is 500 threads.
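The consequence is easiest to see with a thread_local object whose constructor and destructor have visible side effects. The sketch below makes no claim about what any particular compiler will print; it only illustrates what to look for: one construction and destruction per task versus a counter that keeps growing on a reused pool thread.

#include <future>
#include <print>

struct Tracker {
    int uses = 0;
    Tracker()  { std::println("thread_local constructed"); }
    ~Tracker() { std::println("thread_local destroyed"); }
};

thread_local Tracker tls_tracker; // per-thread state; when it is created/destroyed depends on the implementation

int use_tls() {
    return ++tls_tracker.uses; // 1 on a fresh thread; can exceed 1 if a pool thread is reused
}

int main() {
    for (int i = 0; i < 3; ++i) {
        auto f = std::async(std::launch::async, use_tls);
        // One-thread-per-task implementations (GCC, libc++) should report 1 each time
        // and destroy the Tracker after every task; a pooled implementation may keep
        // the same Tracker alive and let the counter grow.
        std::println("task {} saw uses = {}", i, f.get());
    }
    return 0;
}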
In-Depth Analysis of the std::future Returned by std::async
According to cppreference:
"
If the std::future obtained from std::async is not moved from or bound to a reference, the destructor of the std::future will block at the end of the full expression until the asynchronous operation completes, effectively making code such as the following synchronous:

std::async(std::launch::async, []{ f(); }); // temporary's destructor waits for f()
std::async(std::launch::async, []{ g(); }); // does not start until f() completes

Note: the destructors of std::futures obtained by means other than a call to std::async never block.
"
In other words, the std::future returned by std::async behaves differently at destruction from one obtained from a std::promise: its destructor effectively performs a wait(), blocking until the asynchronous task finishes, which in effect joins the worker thread back before execution continues.
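The practical effect is easy to demonstrate: if the futures are discarded as temporaries, two tasks run one after the other, while keeping the futures alive lets them overlap. A rough sketch follows; timings are only indicative, and compilers will warn about the discarded nodiscard return values.

#include <chrono>
#include <future>
#include <print>
#include <thread>

using namespace std::chrono_literals;

void slow_task() { std::this_thread::sleep_for(100ms); }

int main() {
    auto t0 = std::chrono::steady_clock::now();
    // Temporaries: each destructor blocks at the end of the statement,
    // so the two tasks run sequentially (roughly 200 ms in total).
    std::async(std::launch::async, slow_task);
    std::async(std::launch::async, slow_task);
    auto sequential = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - t0);

    auto t1 = std::chrono::steady_clock::now();
    {
        // Named futures: both tasks are in flight before either destructor runs,
        // so the block takes roughly 100 ms.
        auto f1 = std::async(std::launch::async, slow_task);
        auto f2 = std::async(std::launch::async, slow_task);
    } // both destructors wait here
    auto overlapped = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - t1);

    std::println("discarded futures: {} ms", sequential.count());
    std::println("named futures:     {} ms", overlapped.count());
    return 0;
}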
Here is the relevant MSVC code:
~_Task_async_state() noexcept override {
_Wait();
}
void _Wait() override { // wait for completion
_Task.wait();
}
void WaitUntilStateChangedTo(_TaskCollectionState _State)
{
::std::unique_lock<::std::mutex> _Lock(_M_Cs);
while(_M_State < _State)
{
_M_StateChanged.wait(_Lock);
}
}
The destructor of _Task_async_state calls _Wait(), which ultimately ends up in _M_StateChanged.wait(_Lock);, i.e. a wait() on a condition variable.
Implementations differ across platforms; in GCC, for example, the shared state's destructor joins the worker thread:
~_Async_state_impl()
{
if (_M_thread.joinable())
_M_thread.join();
}
During destruction, it waits for the worker thread to finish by calling join().
Conclusion
std::async is a high-level thread abstraction in the C++ standard library; it simplifies asynchronous code and keeps it concise. Because implementations differ across compilers, however, developers need to keep these differences in mind to avoid subtle problems, particularly around thread_local variables and the blocking destructor of the returned std::future.
“
References
- Implementations of std::async and how they might Affect Applications | Dmitry Danilov
- functions | Microsoft Learn
- std::async – cppreference.com
- Asynchronous Programming with C++ (book)
