.. _atomics-ref:

=========================
Atomic operations in QEMU
=========================

CPUs perform independent memory operations effectively in random order,
but this can be a problem for CPU-CPU interaction (including interactions
between QEMU and the guest).  Multi-threaded programs use various tools
to instruct the compiler and the CPU to restrict the order to something
that is consistent with the expectations of the programmer.

The most basic tool is locking.  Mutexes, condition variables and
semaphores are used in QEMU, and should be the default approach to
synchronization.  Anything else is considerably harder, but it's
also justified more often than one would like;
the most performance-critical parts of QEMU in particular require
a very low level approach to concurrency, involving memory barriers
and atomic operations.  The semantics of concurrent memory accesses are governed
by the C11 memory model.

QEMU provides a header, ``qemu/atomic.h``, which wraps C11 atomics to
provide better portability and a less verbose syntax.  ``qemu/atomic.h``
provides macros that fall in three camps:

- compiler barriers: ``barrier()``;

- weak atomic access and manual memory barriers: ``qatomic_read()``,
  ``qatomic_set()``, ``smp_rmb()``, ``smp_wmb()``, ``smp_mb()``,
  ``smp_mb_acquire()``, ``smp_mb_release()``, ``smp_read_barrier_depends()``,
  ``smp_mb__before_rmw()``, ``smp_mb__after_rmw()``;

- sequentially consistent atomic access: everything else.

In general, use of ``qemu/atomic.h`` should be wrapped with more easily
used data structures (e.g. the lock-free singly-linked list operations
``QSLIST_INSERT_HEAD_ATOMIC`` and ``QSLIST_MOVE_ATOMIC``) or synchronization
primitives (such as RCU, ``QemuEvent`` or ``QemuLockCnt``).  Bare use of
atomic operations and memory barriers should be limited to inter-thread
checking of flags and documented thoroughly.


Compiler memory barrier
=======================

``barrier()`` prevents the compiler from moving the memory accesses on
either side of it to the other side.  The compiler barrier has no direct
effect on the CPU, which may then reorder things however it wishes.

``barrier()`` is mostly used within ``qemu/atomic.h`` itself.
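
One situation where a bare compiler barrier is useful directly is code that
shares data with a signal handler: the handler runs on the same CPU as the
code it interrupts, so only compiler reordering is a concern.  This is only a
hedged sketch of that pattern; the ``req`` and ``ready`` variables and the
``consume()`` function are hypothetical, not part of QEMU::

    int req;                      /* data prepared for the handler        */
    volatile sig_atomic_t ready;  /* tells the handler that req is valid  */

    void post_request(int value)
    {
        req = value;
        barrier();    /* keep the store to req above the store to ready  */
        ready = 1;
    }

    void handler(int signum)
    {
        if (ready) {
            barrier();    /* keep the read of req below the read of ready */
            consume(req);
        }
    }
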

On some architectures, CPU guarantees are strong enough that blocking
compiler optimizations already ensures the correct order of execution.
In this case, ``qemu/atomic.h`` will reduce stronger memory barriers
to simple compiler barriers.

Still, as the sketch above shows, ``barrier()`` can be useful when writing
code that can be interrupted by signal handlers.


Sequentially consistent atomic access
=====================================

Most of the operations in the ``qemu/atomic.h`` header ensure *sequential
consistency*, where "the result of any execution is the same as if the
operations of all the processors were executed in some sequential order,
and the operations of each individual processor appear in this sequence
in the order specified by its program".

``qemu/atomic.h`` provides the following set of atomic read-modify-write
operations::

    void qatomic_inc(ptr)
    void qatomic_dec(ptr)
    void qatomic_add(ptr, val)
    void qatomic_sub(ptr, val)
    void qatomic_and(ptr, val)
    void qatomic_or(ptr, val)

    typeof(*ptr) qatomic_fetch_inc(ptr)
    typeof(*ptr) qatomic_fetch_dec(ptr)
    typeof(*ptr) qatomic_fetch_add(ptr, val)
    typeof(*ptr) qatomic_fetch_sub(ptr, val)
    typeof(*ptr) qatomic_fetch_and(ptr, val)
    typeof(*ptr) qatomic_fetch_or(ptr, val)
    typeof(*ptr) qatomic_fetch_xor(ptr, val)
    typeof(*ptr) qatomic_fetch_inc_nonzero(ptr)
    typeof(*ptr) qatomic_xchg(ptr, val)
    typeof(*ptr) qatomic_cmpxchg(ptr, old, new)

where the operations that return ``typeof(*ptr)`` return the old value
of ``*ptr``.  These operations are polymorphic; they operate on any type
that is as wide as a pointer or smaller.
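
For example, the old value returned by ``qatomic_cmpxchg()`` makes
compare-and-swap retry loops straightforward to write.  Here is a hedged
sketch of a saturating increment; the ``counter`` variable and the function
are hypothetical, not part of QEMU::

    unsigned counter;

    void counter_inc_saturating(void)
    {
        unsigned old = qatomic_read(&counter);

        while (old != UINT_MAX) {
            /* qatomic_cmpxchg() returns the value that was actually in
             * memory; if it is not 'old', another thread updated the
             * counter first, so retry with the value that was seen.
             */
            unsigned seen = qatomic_cmpxchg(&counter, old, old + 1);
            if (seen == old) {
                break;
            }
            old = seen;
        }
    }
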

Similar operations return the new value of ``*ptr``::

    typeof(*ptr) qatomic_inc_fetch(ptr)
    typeof(*ptr) qatomic_dec_fetch(ptr)
    typeof(*ptr) qatomic_add_fetch(ptr, val)
    typeof(*ptr) qatomic_sub_fetch(ptr, val)
    typeof(*ptr) qatomic_and_fetch(ptr, val)
    typeof(*ptr) qatomic_or_fetch(ptr, val)
    typeof(*ptr) qatomic_xor_fetch(ptr, val)

``qemu/atomic.h`` also provides an optimized shortcut for
``qatomic_set`` followed by ``smp_mb``::

    void qatomic_set_mb(ptr, val)


Weak atomic access and manual memory barriers
=============================================

Compared to sequentially consistent atomic access, programming with
weaker consistency models can be considerably more complicated.
The only guarantees that you can rely upon in this case are:

- atomic accesses will not cause data races (and hence undefined behavior);
  ordinary accesses instead cause data races if they are concurrent with
  other accesses of which at least one is a write.  In order to ensure this,
  the compiler will not optimize accesses out of existence, create unsolicited
  accesses, or perform other similar optimizations.

- acquire operations will appear to happen, with respect to the other
  components of the system, before all the LOAD or STORE operations
  specified afterwards.

- release operations will appear to happen, with respect to the other
  components of the system, after all the LOAD or STORE operations
  specified before.

- release operations will *synchronize with* acquire operations;
  see :ref:`acqrel` for a detailed explanation.

When using this model, variables are accessed with:

- ``qatomic_read()`` and ``qatomic_set()``; these prevent the compiler from
  optimizing accesses out of existence and creating unsolicited
  accesses, but do not otherwise impose any ordering on loads and
  stores: both the compiler and the processor are free to reorder
  them.

- ``qatomic_load_acquire()``, which guarantees the LOAD to appear to
  happen, with respect to the other components of the system,
  before all the LOAD or STORE operations specified afterwards.
  Operations coming before ``qatomic_load_acquire()`` can still be
  reordered after it.

- ``qatomic_store_release()``, which guarantees the STORE to appear to
  happen, with respect to the other components of the system,
  after all the LOAD or STORE operations specified before.
  Operations coming after ``qatomic_store_release()`` can still be
  reordered before it.

Restrictions on the ordering of accesses can also be specified
using the memory barrier macros: ``smp_rmb()``, ``smp_wmb()``, ``smp_mb()``,
``smp_mb_acquire()``, ``smp_mb_release()``, ``smp_read_barrier_depends()``.

Memory barriers control the order of references to shared memory.
They come in six kinds:

- ``smp_rmb()`` guarantees that all the LOAD operations specified before
  the barrier will appear to happen before all the LOAD operations
  specified after the barrier with respect to the other components of
  the system.

  In other words, ``smp_rmb()`` puts a partial ordering on loads, but is not
  required to have any effect on stores.

- ``smp_wmb()`` guarantees that all the STORE operations specified before
  the barrier will appear to happen before all the STORE operations
  specified after the barrier with respect to the other components of
  the system.

  In other words, ``smp_wmb()`` puts a partial ordering on stores, but is not
  required to have any effect on loads.

- ``smp_mb_acquire()`` guarantees that all the LOAD operations specified before
  the barrier will appear to happen before all the LOAD or STORE operations
  specified after the barrier with respect to the other components of
  the system.

- ``smp_mb_release()`` guarantees that all the STORE operations specified *after*
  the barrier will appear to happen after all the LOAD or STORE operations
  specified *before* the barrier with respect to the other components of
  the system.

- ``smp_mb()`` guarantees that all the LOAD and STORE operations specified
  before the barrier will appear to happen before all the LOAD and
  STORE operations specified after the barrier with respect to the other
  components of the system.

  ``smp_mb()`` puts a partial ordering on both loads and stores.  It is
  stronger than both a read and a write memory barrier; it implies both
  ``smp_mb_acquire()`` and ``smp_mb_release()``, but it also prevents STOREs
  coming before the barrier from overtaking LOADs coming after the
  barrier and vice versa.

- ``smp_read_barrier_depends()`` is a weaker kind of read barrier.  On
  most processors, whenever two loads are performed such that the
  second depends on the result of the first (e.g., the first load
  retrieves the address to which the second load will be directed),
  the processor will guarantee that the first LOAD will appear to happen
  before the second with respect to the other components of the system.
  Therefore, unlike ``smp_rmb()`` or ``qatomic_load_acquire()``,
  ``smp_read_barrier_depends()`` can be just a compiler barrier on
  weakly-ordered architectures such as Arm or PPC\ [#alpha]_.

  Note that the first load really has to have a *data* dependency and not
  a control dependency.  If the address for the second load is dependent
  on the first load, but the dependency is through a conditional rather
  than actually loading the address itself, then it's a *control*
  dependency and a full read barrier or better is required.

.. [#alpha] The DEC Alpha is an exception, because ``smp_read_barrier_depends()``
   needs a processor barrier.  On strongly-ordered architectures such
   as x86 or s390, ``smp_rmb()`` and ``qatomic_load_acquire()`` can
   also be compiler barriers only.

Memory barriers and ``qatomic_load_acquire``/``qatomic_store_release`` are
mostly used when a data structure has one thread that is always a writer
and one thread that is always a reader:

   +----------------------------------+----------------------------------+
   | thread 1                         | thread 2                         |
   +==================================+==================================+
   | ::                               | ::                               |
   |                                  |                                  |
   |   qatomic_store_release(&a, x);  |   y = qatomic_load_acquire(&b);  |
   |   qatomic_store_release(&b, y);  |   x = qatomic_load_acquire(&a);  |
   +----------------------------------+----------------------------------+

In this case, correctness is easy to check for using the "pairing"
trick that is explained below.

Sometimes, a thread is accessing many variables that are otherwise
unrelated to each other (for example because, apart from the current
thread, exactly one other thread will read or write each of these
variables).  In this case, it is possible to "hoist" the barriers
outside a loop.
For example:

   +------------------------------------------+----------------------------------+
   | before                                   | after                            |
   +==========================================+==================================+
   | ::                                       | ::                               |
   |                                          |                                  |
   |   n = 0;                                 |   n = 0;                         |
   |   for (i = 0; i < 10; i++)               |   for (i = 0; i < 10; i++)       |
   |     n += qatomic_load_acquire(&a[i]);    |     n += qatomic_read(&a[i]);    |
   |                                          |   smp_mb_acquire();              |
   +------------------------------------------+----------------------------------+
   | ::                                       | ::                               |
   |                                          |                                  |
   |                                          |   smp_mb_release();              |
   |   for (i = 0; i < 10; i++)               |   for (i = 0; i < 10; i++)       |
   |     qatomic_store_release(&a[i], false); |     qatomic_set(&a[i], false);   |
   +------------------------------------------+----------------------------------+

Splitting a loop can also be useful to reduce the number of barriers:

   +------------------------------------------+----------------------------------+
   | before                                   | after                            |
   +==========================================+==================================+
   | ::                                       | ::                               |
   |                                          |                                  |
   |   n = 0;                                 |   smp_mb_release();              |
   |   for (i = 0; i < 10; i++) {             |   for (i = 0; i < 10; i++)       |
   |     qatomic_store_release(&a[i], false); |     qatomic_set(&a[i], false);   |
   |     smp_mb();                            |   smp_mb();                      |
   |     n += qatomic_read(&b[i]);            |   n = 0;                         |
   |   }                                      |   for (i = 0; i < 10; i++)       |
   |                                          |     n += qatomic_read(&b[i]);    |
   +------------------------------------------+----------------------------------+

In this case, an ``smp_mb_release()`` is also replaced with a (possibly
cheaper, and clearer as well) ``smp_wmb()``:

   +------------------------------------------+----------------------------------+
   | before                                   | after                            |
   +==========================================+==================================+
   | ::                                       | ::                               |
   |                                          |                                  |
   |                                          |   smp_mb_release();              |
   |   for (i = 0; i < 10; i++) {             |   for (i = 0; i < 10; i++)       |
   |     qatomic_store_release(&a[i], false); |     qatomic_set(&a[i], false);   |
   |     qatomic_store_release(&b[i], false); |   smp_wmb();                     |
   |   }                                      |   for (i = 0; i < 10; i++)       |
   |                                          |     qatomic_set(&b[i], false);   |
   +------------------------------------------+----------------------------------+
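
Putting the pieces together, here is a sketch of one-shot message passing
between exactly one writer and one reader; the ``msg`` and ``msg_ready``
variables and the two functions are hypothetical.  The pairing of the
release and acquire operations is the subject of the next section::

    int msg;
    bool msg_ready;

    /* Writer thread.  */
    void send_msg(void)
    {
        msg = 42;                                 /* ordinary store...   */
        qatomic_store_release(&msg_ready, true);  /* ...published here   */
    }

    /* Reader thread.  */
    int recv_msg(void)
    {
        while (!qatomic_load_acquire(&msg_ready)) {
            /* wait; this acquire pairs with the release in send_msg() */
        }
        return msg;   /* guaranteed to observe the store of 42 */
    }
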

.. _acqrel:

Acquire/release pairing and the *synchronizes-with* relation
------------------------------------------------------------

Atomic operations other than ``qatomic_set()`` and ``qatomic_read()`` have
either *acquire* or *release* semantics\ [#rmw]_.  This has two effects:

.. [#rmw] Read-modify-write operations can have both: acquire applies to the
   read part, and release to the write.

- within a thread, they are ordered either before subsequent operations
  (for acquire) or after previous operations (for release).

- if a release operation in one thread *synchronizes with* an acquire operation
  in another thread, the ordering constraints propagate from the first to the
  second thread.  That is, everything before the release operation in the
  first thread is guaranteed to *happen before* everything after the
  acquire operation in the second thread.

The concept of acquire and release semantics is not exclusive to atomic
operations; almost all higher-level synchronization primitives also have
acquire or release semantics.  For example:

- ``pthread_mutex_lock`` has acquire semantics, ``pthread_mutex_unlock`` has
  release semantics and synchronizes with a ``pthread_mutex_lock`` for the
  same mutex.

- ``pthread_cond_signal`` and ``pthread_cond_broadcast`` have release semantics;
  ``pthread_cond_wait`` has both release semantics (synchronizing with
  ``pthread_mutex_lock``) and acquire semantics (synchronizing with
  ``pthread_mutex_unlock`` and signaling of the condition variable).

- ``pthread_create`` has release semantics and synchronizes with the start
  of the new thread; ``pthread_join`` has acquire semantics and synchronizes
  with the exiting of the thread.

- ``qemu_event_set`` has release semantics, ``qemu_event_wait`` has
  acquire semantics (see the sketch after this list).
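
For instance, the ``QemuEvent`` pairing above lets a thread publish data
without any explicit barrier.  This is only a sketch: the ``result``
variable and the two functions are hypothetical, not QEMU APIs::

    QemuEvent done_ev;  /* initialized with qemu_event_init(&done_ev, false) */
    int result;

    /* Worker thread.  */
    void finish(int value)
    {
        result = value;            /* ordinary store...                    */
        qemu_event_set(&done_ev);  /* ...made visible by this release      */
    }

    /* Consumer thread.  */
    int collect(void)
    {
        qemu_event_wait(&done_ev); /* acquire: pairs with qemu_event_set() */
        return result;             /* guaranteed to see the worker's store */
    }
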

Likewise, in the following example there are no atomic accesses, but thread 2
still relies on the *synchronizes-with* relation between ``pthread_exit``
(release) and ``pthread_join`` (acquire):

   +----------------------+-------------------------------+
   | thread 1             | thread 2                      |
   +======================+===============================+
   | ::                   | ::                            |
   |                      |                               |
   |   *a = 1;            |                               |
   |   pthread_exit(a);   |   pthread_join(thread1, &a);  |
   |                      |   x = *a;                     |
   +----------------------+-------------------------------+

Synchronization between threads basically descends from this pairing of
a release operation and an acquire operation.  Therefore, atomic operations
other than ``qatomic_set()`` and ``qatomic_read()`` will almost always be
paired with another operation of the opposite kind: an acquire operation
will pair with a release operation and vice versa.  This rule of thumb is
extremely useful; in the case of QEMU, however, note that the other
operation may actually be in a driver that runs in the guest!

``smp_read_barrier_depends()``, ``smp_rmb()``, ``smp_mb_acquire()``,
``qatomic_load_acquire()`` and ``qatomic_rcu_read()`` all count
as acquire operations.  ``smp_wmb()``, ``smp_mb_release()``,
``qatomic_store_release()`` and ``qatomic_rcu_set()`` all count as release
operations.  ``smp_mb()`` counts as both acquire and release, therefore
it can pair with any other atomic operation.  Here is an example:

   +----------------------+------------------------------+
   | thread 1             | thread 2                     |
   +======================+==============================+
   | ::                   | ::                           |
   |                      |                              |
   |   qatomic_set(&a, 1);|                              |
   |   smp_wmb();         |                              |
   |   qatomic_set(&b, 2);|   x = qatomic_read(&b);      |
   |                      |   smp_rmb();                 |
   |                      |   y = qatomic_read(&a);      |
   +----------------------+------------------------------+

Note that a load-store pair only counts if the two operations access the
same variable: that is, a store-release on a variable ``x`` *synchronizes
with* a load-acquire on a variable ``x``, while a release barrier
synchronizes with any acquire operation.
The following example shows
correct synchronization:

   +--------------------------------+--------------------------------+
   | thread 1                       | thread 2                       |
   +================================+================================+
   | ::                             | ::                             |
   |                                |                                |
   |   qatomic_set(&a, 1);          |                                |
   |   qatomic_store_release(&b, 2);|   x = qatomic_load_acquire(&b);|
   |                                |   y = qatomic_read(&a);        |
   +--------------------------------+--------------------------------+

Acquire and release semantics of higher-level primitives can also be
relied upon for the purpose of establishing the *synchronizes with*
relation.

Note that the "writing" thread is accessing the variables in the
opposite order from the "reading" thread.  This is expected: stores
before a release operation will normally match the loads after
the acquire operation, and vice versa.  In fact, this happened already
in the ``pthread_exit``/``pthread_join`` example above.

Finally, this more complex example has more than two accesses and data
dependency barriers.  It also does not use atomic accesses whenever there
cannot be a data race:

   +----------------------+------------------------------+
   | thread 1             | thread 2                     |
   +======================+==============================+
   | ::                   | ::                           |
   |                      |                              |
   |   b[2] = 1;          |                              |
   |   smp_wmb();         |                              |
   |   x->i = 2;          |                              |
   |   smp_wmb();         |                              |
   |   qatomic_set(&a, x);|   x = qatomic_read(&a);      |
   |                      |   smp_read_barrier_depends();|
   |                      |   y = x->i;                  |
   |                      |   smp_read_barrier_depends();|
   |                      |   z = b[y];                  |
   +----------------------+------------------------------+

Comparison with Linux kernel primitives
=======================================

Here is a list of differences between Linux kernel atomic operations
and memory barriers, and the equivalents in QEMU:

- atomic operations in Linux are always on a 32-bit int type and
  use a boxed ``atomic_t`` type; atomic operations in QEMU are polymorphic
  and use normal C types.

- Originally, ``atomic_read`` and ``atomic_set`` in Linux gave no guarantee
  at all.  Linux 4.1 updated them to implement volatile semantics via
  ``ACCESS_ONCE`` (or the more recent ``READ_ONCE``/``WRITE_ONCE``).

  QEMU's ``qatomic_read`` and ``qatomic_set`` implement C11 atomic relaxed
  semantics if the compiler supports it, and volatile semantics otherwise.
  Both semantics prevent the compiler from doing certain transformations;
  the difference is that atomic accesses are guaranteed to be atomic,
  while volatile accesses aren't.  Thus, in the volatile case we just cross
  our fingers hoping that the compiler will generate atomic accesses,
  since we assume the variables passed are machine-word sized and
  properly aligned.

  No barriers are implied by ``qatomic_read`` and ``qatomic_set`` in either
  Linux or QEMU.

- atomic read-modify-write operations in Linux are of three kinds:

    ===================== =========================================
    ``atomic_OP``         returns void
    ``atomic_OP_return``  returns new value of the variable
    ``atomic_fetch_OP``   returns the old value of the variable
    ``atomic_cmpxchg``    returns the old value of the variable
    ===================== =========================================

  In QEMU, the second kind is named ``qatomic_OP_fetch``.

- different atomic read-modify-write operations in Linux imply
  a different set of memory barriers.  In QEMU, all of them enforce
  sequential consistency: there is a single order in which the
  program sees them happen.

- however, according to the C11 memory model that QEMU uses, this order
  does not propagate to other memory accesses on either side of the
  read-modify-write operation.  As far as those are concerned, the
  operation consists of just a load-acquire followed by a store-release.
  Stores that precede the RMW operation, and loads that follow it, can
  still be reordered and will happen *in the middle* of the read-modify-write
  operation!

  Therefore, the following example is correct in Linux but not in QEMU:

   +----------------------------------+--------------------------------+
   | Linux (correct)                  | QEMU (incorrect)               |
   +==================================+================================+
   | ::                               | ::                             |
   |                                  |                                |
   |   a = atomic_fetch_add(2, &x);   |   a = qatomic_fetch_add(&x, 2);|
   |   b = READ_ONCE(y);              |   b = qatomic_read(&y);        |
   +----------------------------------+--------------------------------+

  because the read of ``y`` can be moved (by either the processor or the
  compiler) before the write of ``x``.

  Fixing this requires a full memory barrier between the write of ``x`` and
  the read of ``y``.  QEMU provides ``smp_mb__before_rmw()`` and
  ``smp_mb__after_rmw()``; they act both as an optimization,
  avoiding the memory barrier on processors where it is unnecessary,
  and as a clarification of this corner case of the C11 memory model:

   +--------------------------------+
   | QEMU (correct)                 |
   +================================+
   | ::                             |
   |                                |
   |   a = qatomic_fetch_add(&x, 2);|
   |   smp_mb__after_rmw();         |
   |   b = qatomic_read(&y);        |
   +--------------------------------+

  In the common case where only one thread writes ``x``, it is also possible
  to write it like this:

   +--------------------------------+
   | QEMU (correct)                 |
   +================================+
   | ::                             |
   |                                |
   |   a = qatomic_read(&x);        |
   |   qatomic_set_mb(&x, a + 2);   |
   |   b = qatomic_read(&y);        |
   +--------------------------------+

Sources
=======

- ``Documentation/memory-barriers.txt`` from the Linux kernel