Locking
=======

Locking is well-known and the common use cases are straightforward: Any
CPU holding a given lock sees any changes previously seen or made by any
CPU before it previously released that same lock.  This last sentence
is the only part of this document that most developers will need to read.

However, developers who would like to also access lock-protected shared
variables outside of their corresponding locks should continue reading.


Locking and Prior Accesses
--------------------------

The basic rule of locking is worth repeating:

	Any CPU holding a given lock sees any changes previously seen
	or made by any CPU before it previously released that same lock.

Note that this statement is a bit stronger than "Any CPU holding a
given lock sees all changes made by any CPU during the time that CPU was
previously holding this same lock".  For example, consider the following
pair of code fragments:

	/* See MP+polocks.litmus. */
	void CPU0(void)
	{
		WRITE_ONCE(x, 1);
		spin_lock(&mylock);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		r0 = READ_ONCE(y);
		spin_unlock(&mylock);
		r1 = READ_ONCE(x);
	}

The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
then both r0 and r1 must be set to the value 1.  This also has the
consequence that if the final value of r0 is equal to 1, then the final
value of r1 must also be equal to 1.  In contrast, the weaker rule would
say nothing about the final value of r1.
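
In litmus-test terms, the outcome that the basic rule rules out can be
expressed by an "exists" clause along the following lines (a sketch of
what MP+polocks.litmus checks):

	exists (1:r0=1 /\ 1:r1=0)

A quick herd7 run shows that this outcome never occurs.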


Locking and Subsequent Accesses
-------------------------------

The converse to the basic rule also holds:  Any CPU holding a given
lock will not see any changes that will be made by any CPU after it
subsequently acquires this same lock.  This converse statement is
illustrated by the following litmus test:

	/* See MP+porevlocks.litmus. */
	void CPU0(void)
	{
		r0 = READ_ONCE(y);
		spin_lock(&mylock);
		r1 = READ_ONCE(x);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		WRITE_ONCE(x, 1);
		spin_unlock(&mylock);
		WRITE_ONCE(y, 1);
	}

This converse to the basic rule guarantees that if CPU0() acquires
mylock before CPU1(), then both r0 and r1 must be set to the value 0.
This also has the consequence that if the final value of r1 is equal
to 0, then the final value of r0 must also be equal to 0.  In contrast,
the weaker rule would say nothing about the final value of r0.
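
Again in litmus-test terms, the outcome forbidden by this converse rule
can be written as follows (a sketch of what MP+porevlocks.litmus checks):

	exists (0:r0=1 /\ 0:r1=0)

Here too, a quick herd7 run shows that this outcome never occurs.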

These examples show only a single pair of CPUs, but the effects of the
locking basic rule extend across multiple acquisitions of a given lock
across multiple CPUs.
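
For example, consider the following three-CPU example (a sketch, not
one of the kernel's litmus tests), in which the lock hands ordering
from CPU0() to CPU1() and then on to CPU2():

	void CPU0(void)
	{
		spin_lock(&mylock);
		WRITE_ONCE(x, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		r0 = READ_ONCE(x);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU2(void)
	{
		spin_lock(&mylock);
		r1 = READ_ONCE(y);
		r2 = READ_ONCE(x);
		spin_unlock(&mylock);
	}

	exists (1:r0=1 /\ 2:r1=1 /\ 2:r2=0)

If both "1:r0" and "2:r1" have final value 1, the lock must have been
acquired in the order CPU0(), CPU1(), CPU2(), so the basic rule, applied
once per lock handoff, guarantees that "2:r2" is also 1.  The "exists"
clause above should therefore never be satisfied.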


Double-Checked Locking
----------------------

It is well known that more than just a lock is required to make
double-checked locking work correctly.  This litmus test illustrates
one incorrect approach:

	/* See Documentation/litmus-tests/locking/DCL-broken.litmus. */
	void CPU0(void)
	{
		r0 = READ_ONCE(flag);
		if (r0 == 0) {
			spin_lock(&lck);
			r1 = READ_ONCE(flag);
			if (r1 == 0) {
				WRITE_ONCE(data, 1);
				WRITE_ONCE(flag, 1);
			}
			spin_unlock(&lck);
		}
		r2 = READ_ONCE(data);
	}
	/* CPU1() is exactly the same as CPU0(). */

There are two problems.  First, there is no ordering between the first
READ_ONCE() of "flag" and the READ_ONCE() of "data".  Second, there is
no ordering between the two WRITE_ONCE() calls.  It should therefore be
no surprise that "r2" can be zero, and a quick herd7 run confirms this.
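
In litmus-test terms, the surprising outcome for (say) CPU1() can be
expressed along these lines (a sketch; see DCL-broken.litmus for the
exact clause used):

	exists (1:r0=1 /\ 1:r2=0)

That is, CPU1() saw "flag" already set, skipped the critical section,
and nevertheless read the initial zero value of "data".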

One way to fix this is to use smp_load_acquire() and smp_store_release()
as shown in this corrected version:

	/* See Documentation/litmus-tests/locking/DCL-fixed.litmus. */
	void CPU0(void)
	{
		r0 = smp_load_acquire(&flag);
		if (r0 == 0) {
			spin_lock(&lck);
			r1 = READ_ONCE(flag);
			if (r1 == 0) {
				WRITE_ONCE(data, 1);
				smp_store_release(&flag, 1);
			}
			spin_unlock(&lck);
		}
		r2 = READ_ONCE(data);
	}
	/* CPU1() is exactly the same as CPU0(). */

The smp_load_acquire() guarantees that its load from "flag" will
be ordered before the READ_ONCE() from "data", thus solving the first
problem.  The smp_store_release() guarantees that its store will be
ordered after the WRITE_ONCE() to "data", solving the second problem.
The smp_store_release() pairs with the smp_load_acquire(), thus ensuring
that the ordering provided by each actually takes effect.  Again, a
quick herd7 run confirms this.

In short, if you access a lock-protected variable without holding the
corresponding lock, you will need to provide additional ordering, in
this case, via the smp_load_acquire() and the smp_store_release().
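
Pulling this together, the resulting double-checked locking idiom might
look like the following kernel-style sketch, in which compute_data() is
a hypothetical stand-in for whatever initialization is being guarded:

	int get_data(void)
	{
		if (!smp_load_acquire(&flag)) {	/* Lockless first check. */
			spin_lock(&lck);
			/* Recheck under the lock in case another CPU won the race. */
			if (!READ_ONCE(flag)) {
				WRITE_ONCE(data, compute_data());
				/* Publish "flag" only after "data" is initialized. */
				smp_store_release(&flag, 1);
			}
			spin_unlock(&lck);
		}
		return READ_ONCE(data);
	}

Only callers that race on initialization ever take the lock; everyone
else pays only the cost of the smp_load_acquire().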


Ordering Provided by a Lock to CPUs Not Holding That Lock
---------------------------------------------------------

It is not necessarily the case that accesses ordered by locking will be
seen as ordered by CPUs not holding that lock.  Consider this example:

	/* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
	void CPU0(void)
	{
		spin_lock(&mylock);
		WRITE_ONCE(x, 1);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		r0 = READ_ONCE(y);
		WRITE_ONCE(z, 1);
		spin_unlock(&mylock);
	}

	void CPU2(void)
	{
		WRITE_ONCE(z, 2);
		smp_mb();
		r1 = READ_ONCE(x);
	}

Counter-intuitive though it might be, it is quite possible to have
the final value of r0 be 1, the final value of z be 2, and the final
value of r1 be 0.  The reason for this surprising outcome is that CPU2()
never acquired the lock, and thus did not fully benefit from the lock's
ordering properties.
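
In litmus-test terms, this counter-intuitive outcome corresponds to the
following "exists" clause, which a quick herd7 run shows can indeed be
satisfied:

	exists (1:r0=1 /\ z=2 /\ 2:r1=0)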

Ordering can be extended to CPUs not holding the lock by careful use
of smp_mb__after_spinlock():

	/* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
	void CPU0(void)
	{
		spin_lock(&mylock);
		WRITE_ONCE(x, 1);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		smp_mb__after_spinlock();
		r0 = READ_ONCE(y);
		WRITE_ONCE(z, 1);
		spin_unlock(&mylock);
	}

	void CPU2(void)
	{
		WRITE_ONCE(z, 2);
		smp_mb();
		r1 = READ_ONCE(x);
	}
This addition of smp_mb__after_spinlock() strengthens the lock
acquisition sufficiently to rule out the counter-intuitive outcome,
that is, the result in which the final value of r0 is 1, the final
value of z is 2, and the final value of r1 is 0.
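
With this addition, the same "exists" clause:

	exists (1:r0=1 /\ z=2 /\ 2:r1=0)

can no longer be satisfied, as a quick herd7 run confirms.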


No Roach-Motel Locking!
-----------------------

This example requires familiarity with the herd7 "filter" clause, so
please read up on that topic in litmus-tests.txt.

It is tempting to allow memory-reference instructions to be pulled
into a critical section, but this cannot be allowed in the general case.
For example, consider a spin loop preceding a lock-based critical section.
Now, herd7 does not model spin loops, but we can emulate one with two
loads, with a "filter" clause to constrain the first to return the
initial value and the second to return the updated value, as shown below:

	/* See Documentation/litmus-tests/locking/RM-fixed.litmus. */
	void CPU0(void)
	{
		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		WRITE_ONCE(x, 1);
		spin_unlock(&lck);
	}

	void CPU1(void)
	{
		r0 = READ_ONCE(x);
		r1 = READ_ONCE(x);
		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		spin_unlock(&lck);
	}

	filter (1:r0=0 /\ 1:r1=1)
	exists (1:r2=1)

The variable "x" is the control variable for the emulated spin loop.
CPU0() sets it to "1" while holding the lock, and CPU1() emulates the
spin loop by reading it twice, first into "1:r0" (which should get the
initial value "0") and then into "1:r1" (which should get the updated
value "1").
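
For reference, the spin loop being emulated might look like the
following sketch, which herd7 itself cannot model:

	void CPU1(void)
	{
		while (!READ_ONCE(x))
			cpu_relax();	/* Wait for CPU0() to store to "x". */
		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		spin_unlock(&lck);
	}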

The "filter" clause takes this into account, constraining "1:r0" to
equal "0" and "1:r1" to equal "1".

Then the "exists" clause checks to see if CPU1() acquired its lock first,
which should not happen given the filter clause because CPU0() updates
"x" while holding the lock.  And herd7 confirms this.

But suppose that the compiler was permitted to reorder the spin loop
into CPU1()'s critical section, like this:

	/* See Documentation/litmus-tests/locking/RM-broken.litmus. */
	void CPU0(void)
	{
		int r2;

		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		WRITE_ONCE(x, 1);
		spin_unlock(&lck);
	}

	void CPU1(void)
	{
		spin_lock(&lck);
		r0 = READ_ONCE(x);
		r1 = READ_ONCE(x);
		r2 = atomic_inc_return(&y);
		spin_unlock(&lck);
	}

	filter (1:r0=0 /\ 1:r1=1)
	exists (1:r2=1)

If "1:r0" is equal to "0", "1:r1" can never equal "1" because CPU0()
cannot update "x" while CPU1() holds the lock.  And herd7 confirms this,
showing zero executions matching the "filter" criteria.

And this is why Linux-kernel lock and unlock primitives must prevent
code from entering critical sections.  It is not sufficient to only
prevent code from leaving them.