Locking
=======

Locking is well-known and the common use cases are straightforward: Any
CPU holding a given lock sees any changes previously seen or made by any
CPU before it previously released that same lock. This last sentence
is the only part of this document that most developers will need to read.

However, developers who would like to also access lock-protected shared
variables outside of their corresponding locks should continue reading.


Locking and Prior Accesses
--------------------------

The basic rule of locking is worth repeating:

        Any CPU holding a given lock sees any changes previously seen
        or made by any CPU before it previously released that same lock.

Note that this statement is a bit stronger than "Any CPU holding a
given lock sees all changes made by any CPU during the time that CPU was
previously holding this same lock". For example, consider the following
pair of code fragments:

        /* See MP+polocks.litmus. */
        void CPU0(void)
        {
                WRITE_ONCE(x, 1);
                spin_lock(&mylock);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                spin_unlock(&mylock);
                r1 = READ_ONCE(x);
        }

The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
then both r0 and r1 must be set to the value 1. This also has the
consequence that if the final value of r0 is equal to 1, then the final
value of r1 must also be equal to 1. In contrast, the weaker rule would
say nothing about the final value of r1.
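
To check this claim mechanically, the two fragments above can be wrapped
into a complete litmus test. The following is an illustrative sketch
rather than a copy of MP+polocks.litmus from the kernel tree (the
initialization block and variable names in that file may differ); it
simply asserts that the outcome "r0 == 1 && r1 == 0" is forbidden:

        C MP+polocks-sketch

        {}

        P0(int *x, int *y, spinlock_t *mylock)
        {
                WRITE_ONCE(*x, 1);
                spin_lock(mylock);
                WRITE_ONCE(*y, 1);
                spin_unlock(mylock);
        }

        P1(int *x, int *y, spinlock_t *mylock)
        {
                int r0;
                int r1;

                spin_lock(mylock);
                r0 = READ_ONCE(*y);
                spin_unlock(mylock);
                r1 = READ_ONCE(*x);
        }

        exists (1:r0=1 /\ 1:r1=0)

A herd7 run on a test of this form should report that the "exists"
clause is never satisfied, matching the basic rule's guarantee.
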

Locking and Subsequent Accesses
-------------------------------

The converse to the basic rule also holds: Any CPU holding a given
lock will not see any changes that will be made by any CPU after it
subsequently acquires this same lock. This converse statement is
illustrated by the following litmus test:

        /* See MP+porevlocks.litmus. */
        void CPU0(void)
        {
                r0 = READ_ONCE(y);
                spin_lock(&mylock);
                r1 = READ_ONCE(x);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                spin_unlock(&mylock);
                WRITE_ONCE(y, 1);
        }

This converse to the basic rule guarantees that if CPU0() acquires
mylock before CPU1(), then both r0 and r1 must be set to the value 0.
This also has the consequence that if the final value of r1 is equal
to 0, then the final value of r0 must also be equal to 0. In contrast,
the weaker rule would say nothing about the final value of r0.

These examples show only a single pair of CPUs, but the effects of the
locking basic rule extend across multiple acquisitions of a given lock
across multiple CPUs.
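
For example, the following three-CPU sketch (made up for illustration,
not one of the litmus tests shipped with the kernel) chains the basic
rule across two acquisitions of mylock: if r0 and r1 both end up equal
to 1, then CPU1() acquired mylock after CPU0() and CPU2() acquired it
after CPU1(), so r2 must also be 1.

        void CPU0(void)
        {
                WRITE_ONCE(x, 1);
                spin_lock(&mylock);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                spin_lock(&mylock);
                r1 = READ_ONCE(z);
                r2 = READ_ONCE(x);
                spin_unlock(&mylock);
        }

Here the basic rule is applied twice, once from CPU0() to CPU1() and
once from CPU1() to CPU2().
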

Double-Checked Locking
----------------------

It is well known that more than just a lock is required to make
double-checked locking work correctly. This litmus test illustrates
one incorrect approach:

        /* See Documentation/litmus-tests/locking/DCL-broken.litmus. */
        void CPU0(void)
        {
                r0 = READ_ONCE(flag);
                if (r0 == 0) {
                        spin_lock(&lck);
                        r1 = READ_ONCE(flag);
                        if (r1 == 0) {
                                WRITE_ONCE(data, 1);
                                WRITE_ONCE(flag, 1);
                        }
                        spin_unlock(&lck);
                }
                r2 = READ_ONCE(data);
        }
        /* CPU1() is exactly the same as CPU0(). */

There are two problems. First, there is no ordering between the first
READ_ONCE() of "flag" and the READ_ONCE() of "data". Second, there is
no ordering between the two WRITE_ONCE() calls. It should therefore be
no surprise that "r2" can be zero, and a quick herd7 run confirms this.

One way to fix this is to use smp_load_acquire() and smp_store_release()
as shown in this corrected version:

        /* See Documentation/litmus-tests/locking/DCL-fixed.litmus. */
        void CPU0(void)
        {
                r0 = smp_load_acquire(&flag);
                if (r0 == 0) {
                        spin_lock(&lck);
                        r1 = READ_ONCE(flag);
                        if (r1 == 0) {
                                WRITE_ONCE(data, 1);
                                smp_store_release(&flag, 1);
                        }
                        spin_unlock(&lck);
                }
                r2 = READ_ONCE(data);
        }
        /* CPU1() is exactly the same as CPU0(). */

The smp_load_acquire() guarantees that its load from "flag" will
be ordered before the READ_ONCE() from "data", thus solving the first
problem. The smp_store_release() guarantees that its store will be
ordered after the WRITE_ONCE() to "data", solving the second problem.
The smp_store_release() pairs with the smp_load_acquire(), thus ensuring
that the ordering provided by each actually takes effect. Again, a
quick herd7 run confirms this.

In short, if you access a lock-protected variable without holding the
corresponding lock, you will need to provide additional ordering, in
this case via the smp_load_acquire() and the smp_store_release().
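
The same pairing shows up in ordinary kernel-style double-checked
initialization. The following sketch is purely illustrative: the
structure, the init_thing() helper, and the other names are assumptions
made for this example, not existing kernel interfaces.

        /* Illustrative sketch only; mystruct, thing, and init_thing()
         * are made-up names. init_thing() is assumed not to sleep. */
        static struct mystruct *thing;
        static DEFINE_SPINLOCK(thing_lock);

        struct mystruct *get_thing(void)
        {
                struct mystruct *p;

                p = smp_load_acquire(&thing); /* Order later uses of *p. */
                if (p)
                        return p;

                spin_lock(&thing_lock);
                p = thing; /* Recheck under the lock. */
                if (!p) {
                        p = init_thing(); /* Initialize fully... */
                        smp_store_release(&thing, p); /* ...then publish. */
                }
                spin_unlock(&thing_lock);
                return p;
        }

As in DCL-fixed.litmus, the smp_store_release() publishing "thing" pairs
with the lockless smp_load_acquire(), so a reader that sees a non-NULL
pointer is also guaranteed to see the fully initialized structure.
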

Ordering Provided by a Lock to CPUs Not Holding That Lock
---------------------------------------------------------

It is not necessarily the case that accesses ordered by locking will be
seen as ordered by CPUs not holding that lock. Consider this example:

        /* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                WRITE_ONCE(z, 2);
                smp_mb();
                r1 = READ_ONCE(x);
        }

Counter-intuitive though it might be, it is quite possible to have
the final value of r0 be 1, the final value of z be 2, and the final
value of r1 be 0. The reason for this surprising outcome is that CPU2()
never acquired the lock, and thus did not fully benefit from the lock's
ordering properties.

Ordering can be extended to CPUs not holding the lock by careful use
of smp_mb__after_spinlock():

        /* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                smp_mb__after_spinlock();
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                WRITE_ONCE(z, 2);
                smp_mb();
                r1 = READ_ONCE(x);
        }

This addition of smp_mb__after_spinlock() strengthens the lock
acquisition sufficiently to rule out the counter-intuitive outcome.
In other words, the addition of the smp_mb__after_spinlock() prohibits
the counter-intuitive result where the final value of r0 is 1, the final
value of z is 2, and the final value of r1 is 0.
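
In litmus-test form, that outcome corresponds to an "exists" clause
along the lines of the following sketch (the clause in the actual
.litmus files may differ in detail):

        exists (1:r0=1 /\ z=2 /\ 2:r1=0)

A herd7 run should find this outcome allowed for the version without
smp_mb__after_spinlock() and forbidden for the version with it.
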

No Roach-Motel Locking!
-----------------------

This example requires familiarity with the herd7 "filter" clause, so
please read up on that topic in litmus-tests.txt.

It is tempting to allow memory-reference instructions to be pulled
into a critical section, but this cannot be allowed in the general case.
For example, consider a spin loop preceding a lock-based critical section.
Now, herd7 does not model spin loops, but we can emulate one with two
loads, with a "filter" clause to constrain the first to return the
initial value and the second to return the updated value, as shown below:

        /* See Documentation/litmus-tests/locking/RM-fixed.litmus. */
        void CPU0(void)
        {
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                WRITE_ONCE(x, 1);
                spin_unlock(&lck);
        }

        void CPU1(void)
        {
                r0 = READ_ONCE(x);
                r1 = READ_ONCE(x);
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

        filter (1:r0=0 /\ 1:r1=1)
        exists (1:r2=1)

The variable "x" is the control variable for the emulated spin loop.
CPU0() sets it to "1" while holding the lock, and CPU1() emulates the
spin loop by reading it twice, first into "1:r0" (which should get the
initial value "0") and then into "1:r1" (which should get the updated
value "1").

The "filter" clause takes this into account, constraining "1:r0" to
equal "0" and "1:r1" to equal "1".

Then the "exists" clause checks to see if CPU1() acquired its lock first,
which should not happen given the filter clause because CPU0() updates
"x" while holding the lock. And herd7 confirms this.

But suppose that the compiler was permitted to reorder the spin loop
into CPU1()'s critical section, like this:

        /* See Documentation/litmus-tests/locking/RM-broken.litmus. */
        void CPU0(void)
        {
                int r2;

                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                WRITE_ONCE(x, 1);
                spin_unlock(&lck);
        }

        void CPU1(void)
        {
                spin_lock(&lck);
                r0 = READ_ONCE(x);
                r1 = READ_ONCE(x);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

        filter (1:r0=0 /\ 1:r1=1)
        exists (1:r2=1)

If "1:r0" is equal to "0", "1:r1" can never equal "1" because CPU0()
cannot update "x" while CPU1() holds the lock. And herd7 confirms this,
showing zero executions matching the "filter" criteria.

And this is why Linux-kernel lock and unlock primitives must prevent
code from entering critical sections. It is not sufficient to only
prevent code from leaving them.
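
For reference, the two emulated loads in CPU1() stand in for a spin loop
that in real code might look something like the following sketch, which
is illustrative only and not taken from the kernel tree:

        void CPU1(void)
        {
                while (!READ_ONCE(x))
                        cpu_relax(); /* Wait for CPU0() to set "x". */
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

If lock acquisition allowed these loads to be pulled into the critical
section, CPU1() could acquire the lock before CPU0() had set "x",
spinning forever while holding the lock and defeating the purpose of the
spin loop. This is the roach-motel movement that the lock and unlock
primitives must rule out.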