Using RCU (Read-Copy-Update) for synchronization
================================================

Read-copy update (RCU) is a synchronization mechanism that is used to
protect read-mostly data structures.  RCU is very efficient and scalable
on the read side (it is wait-free), and thus can make the read paths
extremely fast.

RCU supports concurrency between a single writer and multiple readers,
thus it is not used alone.  Typically, the write-side will use a lock to
serialize multiple updates, but other approaches are possible (e.g.,
restricting updates to a single task).  In QEMU, when a lock is used,
this will often be the "iothread mutex", also known as the "big QEMU
lock" (BQL).  Also, restricting updates to a single task is done in
QEMU using the "bottom half" API.

RCU is fundamentally a "wait-to-finish" mechanism.  The read side marks
sections of code as "critical sections", and the update side will wait
for the execution of all *currently running* critical sections before
proceeding, or before asynchronously executing a callback.

The key point here is that only the currently running critical sections
are waited for; critical sections that are started _after_ the beginning
of the wait do not extend the wait, despite running concurrently with
the updater.  This is the reason why RCU is more scalable than,
for example, reader-writer locks.  It is so much more scalable that
the system will have a single instance of the RCU mechanism; a single
mechanism can be used for an arbitrary number of "things", without
having to worry about things such as contention or deadlocks.

How is this possible?  The basic idea is to split updates in two phases,
"removal" and "reclamation".  During removal, we ensure that subsequent
readers will not be able to get a reference to the old data.  After
removal has completed, a critical section will not be able to access
the old data.  Therefore, critical sections that begin after removal
do not matter; as soon as all previous critical sections have finished,
there cannot be any readers who hold references to the data structure,
and the old data can now be safely reclaimed (e.g., freed or unref'ed).

Here is a picture:

        thread 1                  thread 2                  thread 3
    -------------------    ------------------------    -------------------
    enter RCU crit.sec.
           |                finish removal phase
           |                begin wait
           |                      |                    enter RCU crit.sec.
    exit RCU crit.sec.            |                            |
                            complete wait                      |
                            begin reclamation phase            |
                                                       exit RCU crit.sec.


Note how thread 3 is still executing its critical section when thread 2
starts reclaiming data.  This is possible, because the old version of the
data structure was not accessible at the time thread 3 began executing
that critical section.
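
In code, and using the APIs described in the next section, an update of
a global, RCU-protected pointer might look like the following sketch
("global_cfg" and "new_cfg" are only illustrative names):

    /* Removal phase: critical sections entered from now on can only
     * see new_cfg.
     */
    old_cfg = global_cfg;
    qatomic_rcu_set(&global_cfg, new_cfg);

    /* Wait for the critical sections that may still see old_cfg.  */
    synchronize_rcu();

    /* Reclamation phase: no reader holds a reference to old_cfg.  */
    g_free(old_cfg);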


RCU API
=======

The core RCU API is small:

     void rcu_read_lock(void);

        Used by a reader to inform the reclaimer that the reader is
        entering an RCU read-side critical section.

     void rcu_read_unlock(void);

        Used by a reader to inform the reclaimer that the reader is
        exiting an RCU read-side critical section.  Note that RCU
        read-side critical sections may be nested and/or overlapping.

     void synchronize_rcu(void);

        Blocks until all pre-existing RCU read-side critical sections
        on all threads have completed.  This marks the end of the removal
        phase and the beginning of the reclamation phase.

        Note that another update could validly begin while
        synchronize_rcu is running.  Because of this, it is better that
        the updater releases any locks it may hold before calling
        synchronize_rcu.  If this is not possible (for example, because
        the updater is protected by the BQL), you can use call_rcu.

     void call_rcu1(struct rcu_head * head,
                    void (*func)(struct rcu_head *head));

        This function invokes func(head) after all pre-existing RCU
        read-side critical sections on all threads have completed.  This
        marks the end of the removal phase, with func taking care
        asynchronously of the reclamation phase.

        The foo struct needs to have an rcu_head structure added,
        perhaps as follows:

            struct foo {
                struct rcu_head rcu;
                int a;
                char b;
                long c;
            };

        so that the reclaimer function can fetch the struct foo address
        and free it:

            call_rcu1(&foo.rcu, foo_reclaim);

            void foo_reclaim(struct rcu_head *rp)
            {
                struct foo *fp = container_of(rp, struct foo, rcu);
                g_free(fp);
            }

        For the common case where the rcu_head member is the first field
        of the struct, you can use the following macros.

     void call_rcu(T *p,
                   void (*func)(T *p),
                   field-name);
     void g_free_rcu(T *p,
                     field-name);

        call_rcu1 is typically used through these macros, in the common
        case where the "struct rcu_head" is the first field in the
        struct.  If the callback function is g_free, in particular,
        g_free_rcu can be used.  In the above case, one could have
        written simply:

            g_free_rcu(&foo, rcu);
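
        If the callback takes a struct foo * directly, the call_rcu
        macro can be used instead of call_rcu1.  For example (a sketch
        only; foo_free is a hypothetical callback):

            void foo_free(struct foo *fp)
            {
                g_free(fp);
            }

            call_rcu(&foo, foo_free, rcu);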

     typeof(*p) qatomic_rcu_read(p);

        qatomic_rcu_read() is similar to qatomic_load_acquire(), but it
        makes some assumptions on the code that calls it.  This allows a
        more optimized implementation.

        qatomic_rcu_read assumes that whenever a single RCU critical
        section reads multiple shared data, these reads are either
        data-dependent or need no ordering.  This is almost always the
        case when using RCU, because read-side critical sections typically
        navigate one or more pointers (the pointers that are changed on
        every update) until reaching a data structure of interest,
        and then read from there.

        RCU read-side critical sections must use qatomic_rcu_read() to
        read data, unless concurrent writes are prevented by another
        synchronization mechanism.

        Furthermore, RCU read-side critical sections should traverse the
        data structure in a single direction, opposite to the direction
        in which the updater initializes it.

     void qatomic_rcu_set(p, typeof(*p) v);

        qatomic_rcu_set() is similar to qatomic_store_release(), though
        it also makes assumptions on the code that calls it in order to
        allow a more optimized implementation.

        In particular, qatomic_rcu_set() suffices for synchronization
        with readers, if the updater never mutates a field within a
        data item that is already accessible to readers.  This is the
        case when initializing a new copy of the RCU-protected data
        structure; just ensure that initialization of *p is carried out
        before qatomic_rcu_set() makes the data item visible to readers.
        If this rule is observed, writes will happen in the opposite
        order as reads in the RCU read-side critical sections (or there
        will be just one update), and there will be no need for another
        synchronization mechanism to coordinate the accesses.
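
        For example, a new item could be published like this (a sketch
        only; "foo" is assumed to be a global, RCU-protected pointer to
        a struct foo, as in the examples below):

            struct foo *new_fp = g_new0(struct foo, 1);

            /* Fully initialize the copy while it is still private.  */
            new_fp->a = 1;
            new_fp->b = 'b';
            new_fp->c = 2;

            /* Publish it; together with qatomic_rcu_read() on the read
             * side, this guarantees that readers never see a
             * half-initialized item.
             */
            qatomic_rcu_set(&foo, new_fp);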

The following APIs must be used before RCU is used in a thread:

     void rcu_register_thread(void);

        Mark a thread as taking part in the RCU mechanism.  Such a thread
        will have to report quiescent points regularly, either manually
        or through the QemuCond/QemuSemaphore/QemuEvent APIs.

     void rcu_unregister_thread(void);

        Mark a thread as not taking part anymore in the RCU mechanism.
        It is not a problem if such a thread reports quiescent points,
        either manually or by using the QemuCond/QemuSemaphore/QemuEvent
        APIs.

Note that these APIs are relatively heavyweight, and should _not_ be
nested.

Convenience macros
==================

Two macros are provided that automatically release the read lock at the
end of the scope.

     RCU_READ_LOCK_GUARD()

        Takes the lock and will release it at the end of the block it's
        used in.

     WITH_RCU_READ_LOCK_GUARD()  { code }

        Is used at the head of a block to protect the code within the block.

Note that 'goto'ing out of the guarded block will also drop the lock.
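
For example (a sketch only; foo_get_a() is a hypothetical reader and
"foo" a global RCU-protected pointer, as in the examples below):

    static int foo_get_a(void)
    {
        RCU_READ_LOCK_GUARD();
        struct foo *fp = qatomic_rcu_read(&foo);
        return fp->a;
    }

or, to cover only part of a function:

    int a;
    WITH_RCU_READ_LOCK_GUARD() {
        struct foo *fp = qatomic_rcu_read(&foo);
        a = fp->a;
    }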


DIFFERENCES WITH LINUX
======================

- Waiting on a mutex is possible, though discouraged, within an RCU critical
  section.  This is because spinlocks are rarely (if ever) used in userspace
  programming; not allowing this would prevent upgrading an RCU read-side
  critical section to become an updater.

- qatomic_rcu_read and qatomic_rcu_set replace rcu_dereference and
  rcu_assign_pointer.  They take a _pointer_ to the variable being
  accessed; see the comparison below.

- call_rcu is a macro that has an extra argument (the name of the first
  field in the struct, which must be a struct rcu_head), and expects the
  type of the callback's argument to be the type of the first argument.
  call_rcu1 is the same as Linux's call_rcu.
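
For example, with "foo" being an RCU-protected pointer, the two sets of
primitives compare as follows (sketch):

    Linux                              QEMU
    -----                              ----
    p = rcu_dereference(foo);          p = qatomic_rcu_read(&foo);
    rcu_assign_pointer(foo, new);      qatomic_rcu_set(&foo, new);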


RCU PATTERNS
============

Many patterns using reader-writer locks translate directly to RCU, with
the advantages of higher scalability and deadlock immunity.

In general, RCU can be used whenever it is possible to create a new
"version" of a data structure every time the updater runs.  This may
sound like a very strict restriction, however:

- the updater does not mean "everything that writes to a data structure",
  but rather "everything that involves a reclamation step".  See the
  array example below;

- in some cases, creating a new version of a data structure may actually
  be very cheap.  For example, modifying the "next" pointer of a singly
  linked list is effectively creating a new version of the list (see the
  sketch right after this list).
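
For instance, unlinking an element from such a list requires a single
pointer update followed by deferred reclamation.  The sketch below is
only illustrative (QEMU does not yet have RCU list primitives); "prev"
and "victim" are hypothetical node pointers, and writers are assumed to
be serialized by a lock:

    struct item {
        struct rcu_head rcu;     /* first field, so g_free_rcu works */
        struct item *next;
        /* ... payload ... */
    };

    /* Removal phase: new traversals can no longer reach "victim";
     * readers that already hold a pointer to it may keep using it.
     */
    qatomic_rcu_set(&prev->next, victim->next);

    /* Reclamation phase, deferred until those readers are done.  */
    g_free_rcu(victim, rcu);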

Here are some frequently-used RCU idioms that are worth noting.


RCU list processing
-------------------

TBD (not yet used in QEMU)


RCU reference counting
----------------------

Because grace periods are not allowed to complete while there is an RCU
read-side critical section in progress, the RCU read-side primitives
may be used as a restricted reference-counting mechanism.  For example,
consider the following code fragment:

    rcu_read_lock();
    p = qatomic_rcu_read(&foo);
    /* do something with p. */
    rcu_read_unlock();

The RCU read-side critical section ensures that the value of "p" remains
valid until after the rcu_read_unlock().  In some sense, it is acquiring
a reference to p that is later released when the critical section ends.
The write side looks simply like this (with appropriate locking):

    qemu_mutex_lock(&foo_mutex);
    old = foo;
    qatomic_rcu_set(&foo, new);
    qemu_mutex_unlock(&foo_mutex);
    synchronize_rcu();
    free(old);

If the processing cannot be done purely within the critical section, it
is possible to combine this idiom with a "real" reference count:

    rcu_read_lock();
    p = qatomic_rcu_read(&foo);
    foo_ref(p);
    rcu_read_unlock();
    /* do something with p. */
    foo_unref(p);

The write side can be like this:

    qemu_mutex_lock(&foo_mutex);
    old = foo;
    qatomic_rcu_set(&foo, new);
    qemu_mutex_unlock(&foo_mutex);
    synchronize_rcu();
    foo_unref(old);

or with call_rcu:

    qemu_mutex_lock(&foo_mutex);
    old = foo;
    qatomic_rcu_set(&foo, new);
    qemu_mutex_unlock(&foo_mutex);
    call_rcu(old, foo_unref, rcu);

In both cases, the write side only performs removal.  Reclamation
happens when the last reference to a "foo" object is dropped.
Using synchronize_rcu() for this is undesirably expensive, because
the last reference may be dropped on the read side.  Hence you can
use call_rcu() instead:

    void foo_unref(struct foo *p)
    {
        if (qatomic_fetch_dec(&p->refcount) == 1) {
            call_rcu(p, foo_destroy, rcu);
        }
    }


Note that the same idioms would be possible with reader/writer
locks:

    read_lock(&foo_rwlock);         write_mutex_lock(&foo_rwlock);
    p = foo;                        p = foo;
    /* do something with p. */      foo = new;
    read_unlock(&foo_rwlock);       write_mutex_unlock(&foo_rwlock);
                                    free(p);

    ------------------------------------------------------------------

    read_lock(&foo_rwlock);         write_mutex_lock(&foo_rwlock);
    p = foo;                        old = foo;
    foo_ref(p);                     foo = new;
    read_unlock(&foo_rwlock);       foo_unref(old);
    /* do something with p. */      write_mutex_unlock(&foo_rwlock);
    read_lock(&foo_rwlock);
    foo_unref(p);
    read_unlock(&foo_rwlock);

foo_unref could use a mechanism such as bottom halves to move deallocation
out of the write-side critical section.


RCU resizable arrays
--------------------

Resizable arrays can be used with RCU.  The expensive RCU synchronization
(or call_rcu) only needs to take place when the array is resized.
The two items to take care of are:

- ensuring that the old version of the array is available between removal
  and reclamation;

- avoiding mismatches in the read side between the array data and the
  array size.

The first problem is avoided simply by not using realloc.  Instead,
each resize will allocate a new array and copy the old data into it.
The second problem would arise if the size and the data pointers were
two members of a larger struct:

    struct mystuff {
        ...
        int data_size;
        int data_alloc;
        T   *data;
        ...
    };

Instead, we store the size of the array with the array itself:

    struct arr {
        int size;
        int alloc;
        T   data[];
    };
    struct arr *global_array;

    read side:
        rcu_read_lock();
        struct arr *array = qatomic_rcu_read(&global_array);
        x = i < array->size ? array->data[i] : -1;
        rcu_read_unlock();
        return x;

    write side (running under a lock):
        if (global_array->size == global_array->alloc) {
            /* Creating a new version.  */
            new_array = g_malloc(sizeof(struct arr) +
                                 global_array->alloc * 2 * sizeof(T));
            new_array->size = global_array->size;
            new_array->alloc = global_array->alloc * 2;
            memcpy(new_array->data, global_array->data,
                   global_array->alloc * sizeof(T));

            /* Removal phase.  */
            old_array = global_array;
            qatomic_rcu_set(&global_array, new_array);
            synchronize_rcu();

            /* Reclamation phase.  */
            g_free(old_array);
        }


SOURCES
=======

* Documentation/RCU/ from the Linux kernel