nitrix · January 5, 2021 16:58
diff --git a/gistfile1.txt b/gistfile1.txt
 Every atomic operation is guaranteed to be atomic within itself (the
 combination of two atomic operations is not atomic as a whole!), and all
 threads observe the modifications of any single atomic variable in one
 agreed-upon order. That means operations on the same atomic variable are never
 seen out of order, but other memory operations might very well be reordered
 around them. Compilers (and CPUs) routinely do such reordering as an
 optimization.
 It also means the compiler must use whatever instructions are necessary to
 guarantee that an atomic operation never observes a torn or half-written value,
 even when the write came from another processor core. How atomic operations on
 different variables, and ordinary memory operations, are ordered relative to
 each other depends on the memory order requested.
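
To illustrate the remark that two atomic operations do not combine into one
atomic step, here is a minimal sketch. The names update_max and
max_of_concurrent_updates are hypothetical, chosen only for this example. A
load() followed by a separate store() leaves a gap in which another thread can
interfere, so the sketch uses a compare-exchange loop, which performs the check
and the replacement as a single read-modify-write.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: raise `target` to at least `candidate`.
// A separate load() followed by store() would be two atomic operations with
// a gap in between; another thread could store a larger value in that gap and
// have it overwritten. compare_exchange_weak closes the gap by making
// "check and replace" one atomic read-modify-write.
inline void update_max(std::atomic<int>& target, int candidate) {
    int seen = target.load();
    while (seen < candidate &&
           !target.compare_exchange_weak(seen, candidate)) {
        // On failure, `seen` has been refreshed with the current value and
        // the loop condition re-checks whether an update is still needed.
    }
}

// Hypothetical driver: thread t proposes the values
// t * per_thread + 1 ... t * per_thread + per_thread.
inline int max_of_concurrent_updates(int threads, int per_thread) {
    assert(threads > 0 && per_thread > 0);
    std::atomic<int> maximum{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&maximum, t, per_thread] {
            for (int i = 1; i <= per_thread; ++i) {
                update_max(maximum, t * per_thread + i);
            }
        });
    }
    for (auto& th : pool) th.join();
    return maximum.load();
}
```

Calling max_of_concurrent_updates(4, 1000) should return 4000, the largest
value any thread proposed, regardless of how the threads interleave.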

 Now, a relaxed operation is just that, the bare minimum. It does nothing in
 addition and provides no other guarantees. It is the cheapest possible
 operation. For non-read-modify-write operations on strongly ordered processor
 architectures (e.g. x86/amd64) this boils down to a plain, ordinary move.
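
A typical use of relaxed operations is a counter where only the final total
matters. The following is a minimal sketch under that assumption; count_events
is a hypothetical name. The increments are atomic, so none are lost, but they
order nothing else. The final read is safe because joining a thread already
synchronizes with everything that thread did.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical example: each thread bumps a shared counter `per_thread` times.
inline long count_events(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    std::atomic<long> events{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&events, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic increment with no ordering guarantees for
                // surrounding loads and stores.
                events.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with the end of each thread, so the relaxed load
    // below sees every increment.
    for (auto& th : pool) th.join();
    return events.load(std::memory_order_relaxed);
}
```

With four threads and 10000 increments each, count_events(4, 10000) should
return 40000.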

 The sequentially consistent operation is the exact opposite: it enforces strict
 ordering not only for atomic operations, but also for other memory operations
 that happen before or after. Neither kind can cross the barrier imposed by the
 atomic operation, and all sequentially consistent operations appear in one
 total order that every thread agrees on. Practically, this means lost
 optimization opportunities, and fence instructions may have to be inserted.
 This is the most expensive model.
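
The classic case where sequential consistency matters is two threads that each
store to one variable and then load the other. The sketch below uses
hypothetical names (both_read_zero_once, count_forbidden_outcomes). Because all
sequentially consistent operations fall into one total order, at least one of
the two loads must come after the other thread's store, so both threads reading
0 is not an allowed outcome. With weaker orderings that outcome would be
permitted.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: one round of the "store, then load the other
// variable" pattern. Returns true if both threads read 0.
inline bool both_read_zero_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;
    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    // r1 and r2 are plain ints; reading them after join() is race-free.
    return r1 == 0 && r2 == 0;
}

// Repeats the round and counts how often the forbidden outcome shows up.
inline int count_forbidden_outcomes(int rounds) {
    assert(rounds >= 0);
    int forbidden = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_read_zero_once()) ++forbidden;
    }
    return forbidden;
}
```

Under the stated reasoning, count_forbidden_outcomes should return 0 for any
number of rounds.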

 A release operation prevents ordinary loads and stores that precede it from
 being reordered after the atomic operation, whereas an acquire operation
 prevents ordinary loads and stores that follow it from being reordered before
 the atomic operation. Everything else can still be moved around.
 The combination of preventing stores from being moved after, and loads from
 being moved before, the respective atomic operation makes sure that whatever
 the acquiring thread gets to see is consistent, provided its acquire load
 actually reads the value written by the release store, with only a small
 amount of optimization opportunity lost.
 One may think of that as something like a non-existent lock that is being
 released (by the writer) and acquired (by the reader). Except... there is no
 lock.
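
The writer/reader picture above can be sketched as a tiny message-passing
example. Mailbox and pass_message are hypothetical names used only here. The
writer fills in an ordinary (non-atomic) payload and then does a release store
to a flag; the reader spins on an acquire load of the flag and only then reads
the payload.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example type: plain data guarded by an atomic flag.
struct Mailbox {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // the "lock that is not a lock"
};

inline int pass_message(int value) {
    Mailbox box;
    int received = -1;
    std::thread reader([&] {
        // Acquire: the read of payload below cannot move above this load.
        while (!box.ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = box.payload;
    });
    std::thread writer([&] {
        box.payload = value;
        // Release: the write to payload above cannot move below this store.
        box.ready.store(true, std::memory_order_release);
    });
    writer.join();
    reader.join();
    assert(received == value);
    return received;
}
```

pass_message(42) should return 42: once the reader sees ready == true, the
payload written before the release store is guaranteed to be visible.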

 In practice, release/acquire usually means the compiler need not use any
 particularly expensive special instructions, but it cannot freely reorder loads
 and stores to its liking, so it may miss out on some (small) optimization
 opportunities.

 Finally, consume is the same operation as acquire, with the one exception that
 the ordering guarantees only apply to dependent data. Dependent data would e.g.
 be data that is pointed to by an atomically modified pointer.
 Arguably, that may provide for a couple of optimization opportunities that are
 not present with acquire operations (since less data is subject to
 restrictions), but this happens at the expense of more complex and more
 error-prone code, and the non-trivial task of getting dependency chains
 correct.
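
For illustration only, the pointer case could look like the sketch below.
Config and publish_and_read are hypothetical names. The reader's access goes
through the loaded pointer, which is the data dependency that consume ordering
is meant to cover. As far as I am aware, mainstream compilers simply treat
consume as acquire, and newer language standards may flag
std::memory_order_consume as deprecated, so take this as a sketch of the idea
rather than a recommendation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example type: the data reached through the published pointer.
struct Config {
    int value;
};

inline int publish_and_read(int value) {
    Config config{0};
    std::atomic<Config*> published{nullptr};
    int seen = -1;
    std::thread reader([&] {
        Config* p = nullptr;
        // Consume: only accesses that depend on `p` are ordered after
        // this load.
        while ((p = published.load(std::memory_order_consume)) == nullptr) {
            std::this_thread::yield();
        }
        seen = p->value;  // dependent access through the loaded pointer
    });
    std::thread writer([&] {
        config.value = value;
        // Release: makes the write to config.value visible to a reader
        // that loads this pointer.
        published.store(&config, std::memory_order_release);
    });
    writer.join();
    reader.join();
    assert(seen == value);
    return seen;
}
```

publish_and_read(7) should return 7. Replacing memory_order_consume with
memory_order_acquire gives the same result and avoids the dependency-chain
subtleties.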

 Using consume ordering is currently discouraged while the specification is
 being revised.