CP: Concurrency and Parallelism
The core component of concurrent and parallel programming is the thread. Threads
allow you to run multiple sections of your program independently, while sharing
the same memory. Concurrent programming is tricky for many reasons, most
importantly that it is undefined behavior to read data in one thread after it
was written by another thread, if there is no proper synchronization between
those threads. Making existing single-threaded code execute concurrently can be
as trivial as adding std::async or std::thread strategically, or it can
necessitate a full rewrite, depending on whether the original code was written
in a thread-friendly way.
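As a sketch of the "trivial" end of that spectrum: code that already splits its work into independent pieces can often be parallelized by launching one piece with std::async. The helper names and the splitting strategy below are illustrative, not part of the guideline:

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the half-open range [first, last) of v.
long long sum_range(const std::vector<int>& v, std::size_t first, std::size_t last)
{
    return std::accumulate(v.begin() + first, v.begin() + last, 0LL);
}

long long parallel_sum(const std::vector<int>& v)
{
    const std::size_t mid = v.size() / 2;
    // Each task reads a disjoint half of v and writes no shared state,
    // so no extra synchronization is needed.
    auto upper = std::async(std::launch::async, sum_range, std::cref(v), mid, v.size());
    long long lower = sum_range(v, 0, mid);
    return lower + upper.get();  // get() joins with the async task
}
```

Note that this only works because the two tasks touch disjoint data; code that mutates shared state needs the synchronization discussed below.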
The concurrency/parallelism rules in this document are designed with three goals in mind:
- To help you write code that is amenable to being used in a threaded environment
- To show clean, safe ways to use the threading primitives offered by the standard library
- To offer guidance on what to do when concurrency and parallelism aren't giving you the performance gains you need
It is also important to note that concurrency in C++ is an unfinished story. C++11 introduced many core concurrency primitives, C++14 improved on them, and it seems that there is much interest in making the writing of concurrent programs in C++ even easier. We expect some of the library-related guidance here to change significantly over time.
Concurrency and parallelism rule summary:
See also:
CP.1: Assume that your code will run as part of a multi-threaded program
Reason
It is hard to be certain that concurrency isn't used now or sometime in the future. Code gets re-used. Libraries using threads may be used from some other part of the program. Note that this applies most urgently to library code and least urgently to stand-alone applications.
Example
double cached_computation(int x)
{
static int cached_x = 0;
static double cached_result = COMPUTATION_OF_ZERO;
if (cached_x == x)
return cached_result;
double result = computation(x);
cached_x = x;
cached_result = result;
return result;
}
Although cached_computation works perfectly in a single-threaded environment, in a multi-threaded environment the two static variables result in data races and thus undefined behavior.
There are several ways that this example could be made safe for a multi-threaded environment:
- Delegate concurrency concerns upwards to the caller.
- Mark the static variables as thread_local (which might make caching less effective).
- Implement concurrency control, for example, protecting the two static variables with a static lock (which might reduce performance).
- Have the caller provide the memory to be used for the cache, thereby delegating both memory allocation and concurrency concerns upwards to the caller.
- Refuse to build and/or run in a multi-threaded environment.
- Provide two implementations, one which is used in single-threaded environments and another which is used in multi-threaded environments.
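For instance, the lock-based remedy could be sketched as follows. The body of computation and the value of COMPUTATION_OF_ZERO are stand-ins assumed for illustration:

```cpp
#include <mutex>

constexpr double COMPUTATION_OF_ZERO = 0.0;  // assumed value, for illustration only

double computation(int x)  // stand-in for the real (presumably expensive) computation
{
    return x * 2.0;
}

double cached_computation(int x)
{
    static std::mutex cache_mutex;  // guards both cache variables below
    static int cached_x = 0;
    static double cached_result = COMPUTATION_OF_ZERO;

    // All reads and writes of the cache happen under the lock,
    // so concurrent callers no longer race.
    std::lock_guard<std::mutex> lock(cache_mutex);
    if (cached_x == x)
        return cached_result;
    double result = computation(x);
    cached_x = x;
    cached_result = result;
    return result;
}
```

The lock serializes every call, which is the performance cost mentioned above; whether that is acceptable depends on how expensive computation is relative to the lock.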
Exception: There are examples where code will never be run in a multi-threaded environment. However, there are also many examples where code that was "known" to never run in a multi-threaded program was run as part of a multi-threaded program. Often years later. Typically, such programs lead to a painful effort to remove data races. Therefore, code that is never intended to run in a multi-threaded environment should be clearly labeled as such and ideally come with compile or run-time enforcement mechanisms to catch those usage bugs early.
CP.2: Avoid data races
Reason
Unless you do, nothing is guaranteed to work and subtle errors will persist.
Note
In a nutshell, if two threads can access the same named object concurrently (without synchronization), and at least one is a writer (performing a non-const operation), you have a data race. For further information on how to use synchronization well to eliminate data races, please consult a good book about concurrency.
Example
Many examples of data races exist, some of which are running in production software at this very moment. One very simple example:
int get_id() {
static int id = 1;
return id++;
}
The increment here is an example of a data race. This can go wrong in many ways, including:
- Thread A loads the value of id, the OS context switches A out for some period, during which other threads create hundreds of IDs. Thread A is then allowed to run again, and id is written back to that location as A's read of id plus one.
- Thread A and B load id and increment it simultaneously. They both get the same ID.
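One straightforward fix, sketched here, is to make the counter a std::atomic so that the read-modify-write happens as a single indivisible operation:

```cpp
#include <atomic>

int get_id()
{
    static std::atomic<int> id{1};
    // fetch_add performs the read and the increment atomically,
    // so no two threads can observe the same value.
    return id.fetch_add(1);
}
```

This removes the race for this particular pattern; more complex invariants spanning several variables still need a mutex or similar synchronization.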
Enforcement
Some detection is possible; do at least something.