1*ebb477cbSPaul E. McKenneyCONTROL DEPENDENCIES
2*ebb477cbSPaul E. McKenney====================
3*ebb477cbSPaul E. McKenney
4*ebb477cbSPaul E. McKenneyA major difficulty with control dependencies is that current compilers
5*ebb477cbSPaul E. McKenneydo not support them.  One purpose of this document is therefore to
6*ebb477cbSPaul E. McKenneyhelp you prevent your compiler from breaking your code.  However,
7*ebb477cbSPaul E. McKenneycontrol dependencies also pose other challenges, which leads to the
8*ebb477cbSPaul E. McKenneysecond purpose of this document, namely to help you to avoid breaking
9*ebb477cbSPaul E. McKenneyyour own code, even in the absence of help from your compiler.
10*ebb477cbSPaul E. McKenney
11*ebb477cbSPaul E. McKenneyOne such challenge is that control dependencies order only later stores.
12*ebb477cbSPaul E. McKenneyTherefore, a load-load control dependency will not preserve ordering
13*ebb477cbSPaul E. McKenneyunless a read memory barrier is provided.  Consider the following code:
14*ebb477cbSPaul E. McKenney
15*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
16*ebb477cbSPaul E. McKenney	if (q)
17*ebb477cbSPaul E. McKenney		p = READ_ONCE(b);
18*ebb477cbSPaul E. McKenney
19*ebb477cbSPaul E. McKenneyThis is not guaranteed to provide any ordering because some types of CPUs
20*ebb477cbSPaul E. McKenneyare permitted to predict the result of the load from "b".  This prediction
21*ebb477cbSPaul E. McKenneycan cause other CPUs to see this load as having happened before the load
22*ebb477cbSPaul E. McKenneyfrom "a".  This means that an explicit read barrier is required, for example
23*ebb477cbSPaul E. McKenneyas follows:
24*ebb477cbSPaul E. McKenney
25*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
26*ebb477cbSPaul E. McKenney	if (q) {
27*ebb477cbSPaul E. McKenney		smp_rmb();
28*ebb477cbSPaul E. McKenney		p = READ_ONCE(b);
29*ebb477cbSPaul E. McKenney	}
30*ebb477cbSPaul E. McKenney
31*ebb477cbSPaul E. McKenneyHowever, stores are not speculated.  This means that ordering is
32*ebb477cbSPaul E. McKenney(usually) guaranteed for load-store control dependencies, as in the
33*ebb477cbSPaul E. McKenneyfollowing example:
34*ebb477cbSPaul E. McKenney
35*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
36*ebb477cbSPaul E. McKenney	if (q)
37*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
38*ebb477cbSPaul E. McKenney
39*ebb477cbSPaul E. McKenneyControl dependencies can pair with each other and with other types
40*ebb477cbSPaul E. McKenneyof ordering.  But please note that neither the READ_ONCE() nor the
41*ebb477cbSPaul E. McKenneyWRITE_ONCE() are optional.  Without the READ_ONCE(), the compiler might
42*ebb477cbSPaul E. McKenneyfuse the load from "a" with other loads.  Without the WRITE_ONCE(),
43*ebb477cbSPaul E. McKenneythe compiler might fuse the store to "b" with other stores.  Worse yet,
44*ebb477cbSPaul E. McKenneythe compiler might convert the store into a load and a check followed
45*ebb477cbSPaul E. McKenneyby a store, and this compiler-generated load would not be ordered by
46*ebb477cbSPaul E. McKenneythe control dependency.
47*ebb477cbSPaul E. McKenney
48*ebb477cbSPaul E. McKenneyFurthermore, if the compiler is able to prove that the value of variable
49*ebb477cbSPaul E. McKenney"a" is always non-zero, it would be well within its rights to optimize
50*ebb477cbSPaul E. McKenneythe original example by eliminating the "if" statement as follows:
51*ebb477cbSPaul E. McKenney
52*ebb477cbSPaul E. McKenney	q = a;
53*ebb477cbSPaul E. McKenney	b = 1;  /* BUG: Compiler and CPU can both reorder!!! */
54*ebb477cbSPaul E. McKenney
55*ebb477cbSPaul E. McKenneySo don't leave out either the READ_ONCE() or the WRITE_ONCE().
56*ebb477cbSPaul E. McKenneyIn particular, although READ_ONCE() does force the compiler to emit a
57*ebb477cbSPaul E. McKenneyload, it does *not* force the compiler to actually use the loaded value.
58*ebb477cbSPaul E. McKenney
59*ebb477cbSPaul E. McKenneyIt is tempting to try use control dependencies to enforce ordering on
60*ebb477cbSPaul E. McKenneyidentical stores on both branches of the "if" statement as follows:
61*ebb477cbSPaul E. McKenney
62*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
63*ebb477cbSPaul E. McKenney	if (q) {
64*ebb477cbSPaul E. McKenney		barrier();
65*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
66*ebb477cbSPaul E. McKenney		do_something();
67*ebb477cbSPaul E. McKenney	} else {
68*ebb477cbSPaul E. McKenney		barrier();
69*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
70*ebb477cbSPaul E. McKenney		do_something_else();
71*ebb477cbSPaul E. McKenney	}
72*ebb477cbSPaul E. McKenney
73*ebb477cbSPaul E. McKenneyUnfortunately, current compilers will transform this as follows at high
74*ebb477cbSPaul E. McKenneyoptimization levels:
75*ebb477cbSPaul E. McKenney
76*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
77*ebb477cbSPaul E. McKenney	barrier();
78*ebb477cbSPaul E. McKenney	WRITE_ONCE(b, 1);  /* BUG: No ordering vs. load from a!!! */
79*ebb477cbSPaul E. McKenney	if (q) {
80*ebb477cbSPaul E. McKenney		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
81*ebb477cbSPaul E. McKenney		do_something();
82*ebb477cbSPaul E. McKenney	} else {
83*ebb477cbSPaul E. McKenney		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
84*ebb477cbSPaul E. McKenney		do_something_else();
85*ebb477cbSPaul E. McKenney	}
86*ebb477cbSPaul E. McKenney
87*ebb477cbSPaul E. McKenneyNow there is no conditional between the load from "a" and the store to
88*ebb477cbSPaul E. McKenney"b", which means that the CPU is within its rights to reorder them:  The
89*ebb477cbSPaul E. McKenneyconditional is absolutely required, and must be present in the final
90*ebb477cbSPaul E. McKenneyassembly code, after all of the compiler and link-time optimizations
91*ebb477cbSPaul E. McKenneyhave been applied.  Therefore, if you need ordering in this example,
92*ebb477cbSPaul E. McKenneyyou must use explicit memory ordering, for example, smp_store_release():
93*ebb477cbSPaul E. McKenney
94*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
95*ebb477cbSPaul E. McKenney	if (q) {
96*ebb477cbSPaul E. McKenney		smp_store_release(&b, 1);
97*ebb477cbSPaul E. McKenney		do_something();
98*ebb477cbSPaul E. McKenney	} else {
99*ebb477cbSPaul E. McKenney		smp_store_release(&b, 1);
100*ebb477cbSPaul E. McKenney		do_something_else();
101*ebb477cbSPaul E. McKenney	}
102*ebb477cbSPaul E. McKenney
103*ebb477cbSPaul E. McKenneyWithout explicit memory ordering, control-dependency-based ordering is
104*ebb477cbSPaul E. McKenneyguaranteed only when the stores differ, for example:
105*ebb477cbSPaul E. McKenney
106*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
107*ebb477cbSPaul E. McKenney	if (q) {
108*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
109*ebb477cbSPaul E. McKenney		do_something();
110*ebb477cbSPaul E. McKenney	} else {
111*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 2);
112*ebb477cbSPaul E. McKenney		do_something_else();
113*ebb477cbSPaul E. McKenney	}
114*ebb477cbSPaul E. McKenney
115*ebb477cbSPaul E. McKenneyThe initial READ_ONCE() is still required to prevent the compiler from
116*ebb477cbSPaul E. McKenneyknowing too much about the value of "a".
117*ebb477cbSPaul E. McKenney
118*ebb477cbSPaul E. McKenneyBut please note that you need to be careful what you do with the local
119*ebb477cbSPaul E. McKenneyvariable "q", otherwise the compiler might be able to guess the value
120*ebb477cbSPaul E. McKenneyand again remove the conditional branch that is absolutely required to
121*ebb477cbSPaul E. McKenneypreserve ordering.  For example:
122*ebb477cbSPaul E. McKenney
123*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
124*ebb477cbSPaul E. McKenney	if (q % MAX) {
125*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
126*ebb477cbSPaul E. McKenney		do_something();
127*ebb477cbSPaul E. McKenney	} else {
128*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 2);
129*ebb477cbSPaul E. McKenney		do_something_else();
130*ebb477cbSPaul E. McKenney	}
131*ebb477cbSPaul E. McKenney
132*ebb477cbSPaul E. McKenneyIf MAX is compile-time defined to be 1, then the compiler knows that
133*ebb477cbSPaul E. McKenney(q % MAX) must be equal to zero, regardless of the value of "q".
134*ebb477cbSPaul E. McKenneyThe compiler is therefore within its rights to transform the above code
135*ebb477cbSPaul E. McKenneyinto the following:
136*ebb477cbSPaul E. McKenney
137*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
138*ebb477cbSPaul E. McKenney	WRITE_ONCE(b, 2);
139*ebb477cbSPaul E. McKenney	do_something_else();
140*ebb477cbSPaul E. McKenney
141*ebb477cbSPaul E. McKenneyGiven this transformation, the CPU is not required to respect the ordering
142*ebb477cbSPaul E. McKenneybetween the load from variable "a" and the store to variable "b".  It is
143*ebb477cbSPaul E. McKenneytempting to add a barrier(), but this does not help.  The conditional
144*ebb477cbSPaul E. McKenneyis gone, and the barrier won't bring it back.  Therefore, if you need
145*ebb477cbSPaul E. McKenneyto relying on control dependencies to produce this ordering, you should
146*ebb477cbSPaul E. McKenneymake sure that MAX is greater than one, perhaps as follows:
147*ebb477cbSPaul E. McKenney
148*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
149*ebb477cbSPaul E. McKenney	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
150*ebb477cbSPaul E. McKenney	if (q % MAX) {
151*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
152*ebb477cbSPaul E. McKenney		do_something();
153*ebb477cbSPaul E. McKenney	} else {
154*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 2);
155*ebb477cbSPaul E. McKenney		do_something_else();
156*ebb477cbSPaul E. McKenney	}
157*ebb477cbSPaul E. McKenney
158*ebb477cbSPaul E. McKenneyPlease note once again that each leg of the "if" statement absolutely
159*ebb477cbSPaul E. McKenneymust store different values to "b".  As in previous examples, if the two
160*ebb477cbSPaul E. McKenneyvalues were identical, the compiler could pull this store outside of the
161*ebb477cbSPaul E. McKenney"if" statement, destroying the control dependency's ordering properties.
162*ebb477cbSPaul E. McKenney
163*ebb477cbSPaul E. McKenneyYou must also be careful avoid relying too much on boolean short-circuit
164*ebb477cbSPaul E. McKenneyevaluation.  Consider this example:
165*ebb477cbSPaul E. McKenney
166*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
167*ebb477cbSPaul E. McKenney	if (q || 1 > 0)
168*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
169*ebb477cbSPaul E. McKenney
170*ebb477cbSPaul E. McKenneyBecause the first condition cannot fault and the second condition is
171*ebb477cbSPaul E. McKenneyalways true, the compiler can transform this example as follows, again
172*ebb477cbSPaul E. McKenneydestroying the control dependency's ordering:
173*ebb477cbSPaul E. McKenney
174*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
175*ebb477cbSPaul E. McKenney	WRITE_ONCE(b, 1);
176*ebb477cbSPaul E. McKenney
177*ebb477cbSPaul E. McKenneyThis is yet another example showing the importance of preventing the
178*ebb477cbSPaul E. McKenneycompiler from out-guessing your code.  Again, although READ_ONCE() really
179*ebb477cbSPaul E. McKenneydoes force the compiler to emit code for a given load, the compiler is
180*ebb477cbSPaul E. McKenneywithin its rights to discard the loaded value.
181*ebb477cbSPaul E. McKenney
182*ebb477cbSPaul E. McKenneyIn addition, control dependencies apply only to the then-clause and
183*ebb477cbSPaul E. McKenneyelse-clause of the "if" statement in question.  In particular, they do
184*ebb477cbSPaul E. McKenneynot necessarily order the code following the entire "if" statement:
185*ebb477cbSPaul E. McKenney
186*ebb477cbSPaul E. McKenney	q = READ_ONCE(a);
187*ebb477cbSPaul E. McKenney	if (q) {
188*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 1);
189*ebb477cbSPaul E. McKenney	} else {
190*ebb477cbSPaul E. McKenney		WRITE_ONCE(b, 2);
191*ebb477cbSPaul E. McKenney	}
192*ebb477cbSPaul E. McKenney	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
193*ebb477cbSPaul E. McKenney
194*ebb477cbSPaul E. McKenneyIt is tempting to argue that there in fact is ordering because the
195*ebb477cbSPaul E. McKenneycompiler cannot reorder volatile accesses and also cannot reorder
196*ebb477cbSPaul E. McKenneythe writes to "b" with the condition.  Unfortunately for this line
197*ebb477cbSPaul E. McKenneyof reasoning, the compiler might compile the two writes to "b" as
198*ebb477cbSPaul E. McKenneyconditional-move instructions, as in this fanciful pseudo-assembly
199*ebb477cbSPaul E. McKenneylanguage:
200*ebb477cbSPaul E. McKenney
201*ebb477cbSPaul E. McKenney	ld r1,a
202*ebb477cbSPaul E. McKenney	cmp r1,$0
203*ebb477cbSPaul E. McKenney	cmov,ne r4,$1
204*ebb477cbSPaul E. McKenney	cmov,eq r4,$2
205*ebb477cbSPaul E. McKenney	st r4,b
206*ebb477cbSPaul E. McKenney	st $1,c
207*ebb477cbSPaul E. McKenney
208*ebb477cbSPaul E. McKenneyThe control dependencies would then extend only to the pair of cmov
209*ebb477cbSPaul E. McKenneyinstructions and the store depending on them.  This means that a weakly
210*ebb477cbSPaul E. McKenneyordered CPU would have no dependency of any sort between the load from
211*ebb477cbSPaul E. McKenney"a" and the store to "c".  In short, control dependencies provide ordering
212*ebb477cbSPaul E. McKenneyonly to the stores in the then-clause and else-clause of the "if" statement
213*ebb477cbSPaul E. McKenneyin question (including functions invoked by those two clauses), and not
214*ebb477cbSPaul E. McKenneyto code following that "if" statement.
215*ebb477cbSPaul E. McKenney
216*ebb477cbSPaul E. McKenney
217*ebb477cbSPaul E. McKenneyIn summary:
218*ebb477cbSPaul E. McKenney
219*ebb477cbSPaul E. McKenney  (*) Control dependencies can order prior loads against later stores.
220*ebb477cbSPaul E. McKenney      However, they do *not* guarantee any other sort of ordering:
221*ebb477cbSPaul E. McKenney      Not prior loads against later loads, nor prior stores against
222*ebb477cbSPaul E. McKenney      later anything.  If you need these other forms of ordering, use
223*ebb477cbSPaul E. McKenney      smp_load_acquire(), smp_store_release(), or, in the case of prior
224*ebb477cbSPaul E. McKenney      stores and later loads, smp_mb().
225*ebb477cbSPaul E. McKenney
226*ebb477cbSPaul E. McKenney  (*) If both legs of the "if" statement contain identical stores to
227*ebb477cbSPaul E. McKenney      the same variable, then you must explicitly order those stores,
228*ebb477cbSPaul E. McKenney      either by preceding both of them with smp_mb() or by using
229*ebb477cbSPaul E. McKenney      smp_store_release().  Please note that it is *not* sufficient to use
230*ebb477cbSPaul E. McKenney      barrier() at beginning and end of each leg of the "if" statement
231*ebb477cbSPaul E. McKenney      because, as shown by the example above, optimizing compilers can
232*ebb477cbSPaul E. McKenney      destroy the control dependency while respecting the letter of the
233*ebb477cbSPaul E. McKenney      barrier() law.
234*ebb477cbSPaul E. McKenney
235*ebb477cbSPaul E. McKenney  (*) Control dependencies require at least one run-time conditional
236*ebb477cbSPaul E. McKenney      between the prior load and the subsequent store, and this
237*ebb477cbSPaul E. McKenney      conditional must involve the prior load.  If the compiler is able
238*ebb477cbSPaul E. McKenney      to optimize the conditional away, it will have also optimized
239*ebb477cbSPaul E. McKenney      away the ordering.  Careful use of READ_ONCE() and WRITE_ONCE()
240*ebb477cbSPaul E. McKenney      can help to preserve the needed conditional.
241*ebb477cbSPaul E. McKenney
242*ebb477cbSPaul E. McKenney  (*) Control dependencies require that the compiler avoid reordering the
243*ebb477cbSPaul E. McKenney      dependency into nonexistence.  Careful use of READ_ONCE() or
244*ebb477cbSPaul E. McKenney      atomic{,64}_read() can help to preserve your control dependency.
245*ebb477cbSPaul E. McKenney
246*ebb477cbSPaul E. McKenney  (*) Control dependencies apply only to the then-clause and else-clause
247*ebb477cbSPaul E. McKenney      of the "if" statement containing the control dependency, including
248*ebb477cbSPaul E. McKenney      any functions that these two clauses call.  Control dependencies
249*ebb477cbSPaul E. McKenney      do *not* apply to code beyond the end of that "if" statement.
250*ebb477cbSPaul E. McKenney
251*ebb477cbSPaul E. McKenney  (*) Control dependencies pair normally with other types of barriers.
252*ebb477cbSPaul E. McKenney
253*ebb477cbSPaul E. McKenney  (*) Control dependencies do *not* provide multicopy atomicity.  If you
254*ebb477cbSPaul E. McKenney      need all the CPUs to agree on the ordering of a given store against
255*ebb477cbSPaul E. McKenney      all other accesses, use smp_mb().
256*ebb477cbSPaul E. McKenney
257*ebb477cbSPaul E. McKenney  (*) Compilers do not understand control dependencies.  It is therefore
258*ebb477cbSPaul E. McKenney      your job to ensure that they do not break your code.
259