Last active
January 5, 2021 16:58
memory order atomics explanation
All atomic operations are guaranteed to be atomic within themselves (the
combination of two atomic operations is not atomic as a whole!), and all
operations on any one atomic variable become visible in a single total order
that every thread agrees on. That means operations on the same atomic variable
can never, under any circumstances, appear reordered relative to each other,
but other memory operations (including, at the weaker orderings, atomic
operations on different variables) might very well be. Compilers (and CPUs)
routinely do such reordering as an optimization.

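To make the "not atomic as a whole" point concrete, here is a minimal sketch
in C++. The helper names (run_workers, increment_atomically,
increment_in_two_steps) are made up for illustration and do not come from any
library. The first increment is one indivisible read-modify-write; the second
is a load followed by a store, each atomic on its own, with a gap between them
where another thread's update can be lost.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: resets the counter, runs `threads` workers that each
// apply `step` to it `iters` times, and returns the final value.
template <typename Step>
int run_workers(std::atomic<int>& counter, int threads, int iters, Step step) {
    counter.store(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) step(counter);
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}

// One indivisible read-modify-write: no increment can be lost.
inline void increment_atomically(std::atomic<int>& c) {
    c.fetch_add(1);
}

// Two atomic operations in a row: each is atomic, the pair is not.
// Another thread may store between the load and the store below,
// and that update is then overwritten.
inline void increment_in_two_steps(std::atomic<int>& c) {
    int seen = c.load();
    c.store(seen + 1);
}
```

With four workers doing 10000 steps each, the first variant always ends at
40000; the second can end anywhere at or below that, depending on timing.
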
It also means the compiler must use whatever instructions are necessary to
guarantee that an atomic operation will see the results of every atomic
operation on the same variable that was executed before it, possibly on
another processor core (but not necessarily the results of other operations).

Now, a relaxed operation is just that, the bare minimum. It does nothing in
addition and provides no other guarantees. It is the cheapest possible
operation. For non-read-modify-write operations on strongly ordered processor
architectures (e.g. x86/amd64) this boils down to a plain, ordinary move.

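A typical place where relaxed is enough is a counter whose total is only read
after all the work is done. The sketch below assumes exactly that situation;
count_events and hits are illustrative names, not part of any API.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical event counter: nothing else is published through it, and the
// total is only read after the workers have finished, so relaxed suffices.
inline long count_events(int threads, int events_per_thread) {
    std::atomic<long> hits{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&hits, events_per_thread] {
            for (int i = 0; i < events_per_thread; ++i) {
                // Atomic increment, but no ordering with surrounding memory operations.
                hits.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() itself synchronizes with the end of each thread, so the load
    // below sees every increment even though it is relaxed.
    for (auto& th : pool) th.join();
    return hits.load(std::memory_order_relaxed);
}
```

Calling count_events(4, 25000) returns 100000: the increments are never lost,
they merely carry no ordering guarantees for anything else.
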
The sequentially consistent operation is the exact opposite: it enforces strict
ordering not only for atomic operations, but also for other memory operations
that happen before or after. Neither can cross the barrier imposed by the
atomic operation. Practically, this means lost optimization opportunities, and
fence instructions may have to be inserted. This is the most expensive
model.

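One way to see what the extra cost buys is the classic "store buffering"
test, sketched here with illustrative names (x, y, r1, r2,
both_read_zero_once). Each thread stores to one variable and then loads the
other. With sequentially consistent operations there is a single order of all
four operations, so at least one thread must see the other's store. With
weaker orderings, both loads returning 0 would be a permitted outcome.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One run of the store-buffering test. Returns true if both threads read 0,
// which sequential consistency forbids.
inline bool both_read_zero_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome showed up.
inline int count_forbidden_outcomes(int runs) {
    int forbidden = 0;
    for (int i = 0; i < runs; ++i) {
        if (both_read_zero_once()) ++forbidden;
    }
    return forbidden;
}
```

The count stays at zero no matter how many runs are made. Preventing a store
from being delayed past a later load is typically what requires the fence (or
an equivalent locked instruction) on hardware such as x86.
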
A release operation prevents ordinary loads and stores from being reordered
after the atomic operation, whereas an acquire operation prevents ordinary
loads and stores from being reordered before the atomic operation. Everything
else can still be moved around.

The combination of preventing stores being moved after, and loads being moved
before the respective atomic operation makes sure that whatever the acquiring
thread gets to see is consistent, with only a small amount of optimization
opportunity lost.

One may think of that as something like a non-existent lock that is being
released (by the writer) and acquired (by the reader). Except... there is no
lock.

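The writer/reader picture can be sketched as a small message-passing example.
The names (pass_message, payload, ready) are made up for illustration. The
writer fills in ordinary data and then "releases" it by setting a flag; the
reader "acquires" the flag and only then touches the data.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hands `value` from a writer thread to a reader thread through ordinary
// memory, using an atomic flag as the lock that isn't there.
inline int pass_message(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // the flag that publishes it
    int received = -1;

    std::thread writer([&] {
        payload = value;  // ordinary store, cannot move after the release below
        ready.store(true, std::memory_order_release);
    });
    std::thread reader([&] {
        // Spin until the flag is set; the acquire load keeps the read of
        // `payload` below from moving before it.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = payload;  // sees the writer's value
    });
    writer.join();
    reader.join();
    return received;
}
```

Once the reader observes ready == true, it is guaranteed to read the value the
writer stored in payload. With relaxed on both sides, that read would be a
data race.
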
In practice, release/acquire usually means the compiler need not use any
particularly expensive special instructions, but it cannot freely reorder loads
and stores to its liking, which may miss out on some (small) optimization
opportunities.

Finally, consume is the same operation as acquire, except that the ordering
guarantees apply only to dependent data. Dependent data would be, e.g., data
that is pointed to by an atomically modified pointer.

Arguably, that may provide for a couple of optimization opportunities that are
not present with acquire operations (since less data is subject to
restrictions), however this happens at the expense of more complex and more
error-prone code, and the non-trivial task of getting dependency chains
correct.

It is currently discouraged to use consume ordering while the specification is
being revised.
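
The pointer case that consume was designed for can still be sketched with
acquire as a stand-in, which is stronger and, as far as is generally
understood, what mainstream compilers do with consume anyway. Config,
publish_and_read and current are hypothetical names for this example only.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Config {  // hypothetical payload type
    int verbosity;
};

// Publishes a freshly built Config through an atomic pointer and reads a
// field through that pointer in another thread.
inline int publish_and_read(int verbosity) {
    std::atomic<Config*> current{nullptr};
    int seen = -1;

    std::thread writer([&] {
        Config* fresh = new Config{verbosity};            // build the object first
        current.store(fresh, std::memory_order_release);  // then publish the pointer
    });
    std::thread reader([&] {
        Config* p = nullptr;
        // This load is where memory_order_consume would go; acquire is used
        // here as the stronger ordering.
        while ((p = current.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();
        }
        // p->verbosity is reached through the loaded pointer: this is the
        // "dependent data" that consume would still cover.
        seen = p->verbosity;
    });
    writer.join();
    reader.join();
    delete current.load(std::memory_order_relaxed);
    return seen;
}
```

Anything the reader reaches through p is dependent data. A variable that is
unrelated to p would be ordered by acquire but not by consume, which is where
the dependency-chain pitfalls come from.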