======================================
Sequence counters and sequential locks
======================================

Introduction
============

Sequence counters are a reader-writer consistency mechanism with
lockless readers (read-only retry loops) and no writer starvation. They
are used for data that is rarely written to (e.g. system time), where
the reader wants a consistent set of information and is willing to
retry if that information changes.

A data set is consistent when the sequence count at the beginning of the
read side critical section is even and the same sequence count value is
read again at the end of the critical section. The data in the set must
be copied out inside the read side critical section. If the sequence
count has changed between the start and the end of the critical section,
the reader must retry.

Writers increment the sequence count at the start and the end of their
critical section. After starting the critical section the sequence count
is odd and indicates to the readers that an update is in progress. At
the end of the write side critical section the sequence count becomes
even again, which lets readers make progress.

A sequence counter write side critical section must never be preempted
or interrupted by read side sections. Otherwise the reader will spin for
the entire scheduler tick due to the odd sequence count value and the
interrupted writer. If that reader belongs to a real-time scheduling
class, it can spin forever and the kernel will livelock.

This mechanism cannot be used if the protected data contains pointers,
as the writer can invalidate a pointer that the reader is following.


.. _seqcount_t:

Sequence counters (``seqcount_t``)
==================================

This is the raw counting mechanism, which does not protect against
multiple writers. Write side critical sections must thus be serialized
by an external lock.
If the write serialization primitive does not implicitly disable
preemption, preemption must be explicitly disabled before entering the
write side section. If the read section can be invoked from hardirq or
softirq contexts, interrupts or bottom halves must also be respectively
disabled before entering the write section.

If it is desired to automatically handle the sequence counter
requirements of writer serialization and non-preemptibility, use
:ref:`seqlock_t` instead.

Initialization::

    /* dynamic */
    seqcount_t foo_seqcount;
    seqcount_init(&foo_seqcount);

    /* static */
    static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);

    /* C99 struct init */
    struct {
            .seq = SEQCNT_ZERO(foo.seq),
    } foo;

Write path::

    /* Serialized context with disabled preemption */

    write_seqcount_begin(&foo_seqcount);

    /* ... [[write-side critical section]] ... */

    write_seqcount_end(&foo_seqcount);

Read path::

    do {
            seq = read_seqcount_begin(&foo_seqcount);

            /* ... [[read-side critical section]] ... */

    } while (read_seqcount_retry(&foo_seqcount, seq));


.. _seqcount_locktype_t:

Sequence counters with associated locks (``seqcount_LOCKTYPE_t``)
-----------------------------------------------------------------

As discussed at :ref:`seqcount_t`, sequence count write side critical
sections must be serialized and non-preemptible. This variant of
sequence counters associates the lock used for writer serialization at
initialization time, which enables lockdep to validate that the write
side critical sections are properly serialized.

This lock association is a NOOP if lockdep is disabled and has neither
storage nor runtime overhead. If lockdep is enabled, the lock pointer is
stored in struct seqcount and lockdep's "lock is held" assertions are
injected at the beginning of the write side critical section to validate
that it is properly protected.
For lock types which do not implicitly disable preemption, preemption
protection is enforced in the write side function.

The following sequence counters with associated locks are defined:

- ``seqcount_spinlock_t``
- ``seqcount_raw_spinlock_t``
- ``seqcount_rwlock_t``
- ``seqcount_mutex_t``
- ``seqcount_ww_mutex_t``

The plain seqcount read and write APIs branch out to the specific
seqcount_LOCKTYPE_t implementation at compile-time. This avoids kernel
API explosion for each new seqcount LOCKTYPE.

Initialization (replace "LOCKTYPE" with one of the supported locks)::

    /* dynamic */
    seqcount_LOCKTYPE_t foo_seqcount;
    seqcount_LOCKTYPE_init(&foo_seqcount, &lock);

    /* static */
    static seqcount_LOCKTYPE_t foo_seqcount =
            SEQCNT_LOCKTYPE_ZERO(foo_seqcount, &lock);

    /* C99 struct init */
    struct {
            .seq = SEQCNT_LOCKTYPE_ZERO(foo.seq, &lock),
    } foo;

Write path: same as in :ref:`seqcount_t`, while running from a context
with the associated LOCKTYPE lock acquired.

Read path: same as in :ref:`seqcount_t`.


.. _seqcount_latch_t:

Latch sequence counters (``seqcount_latch_t``)
----------------------------------------------

Latch sequence counters are a multiversion concurrency control mechanism
where the embedded seqcount_t counter even/odd value is used to switch
between two copies of protected data. This allows the sequence counter
read path to safely interrupt its own write side critical section.

Use seqcount_latch_t when the write side sections cannot be protected
from interruption by readers. This is typically the case when the read
side can be invoked from NMI handlers.

Check ``raw_write_seqcount_latch()`` for more information.


.. _seqlock_t:

Sequential locks (``seqlock_t``)
================================

This contains the :ref:`seqcount_t` mechanism discussed earlier, plus an
embedded spinlock for writer serialization and non-preemptibility.

If the read side section can be invoked from hardirq or softirq context,
use the write side function variants which disable interrupts or bottom
halves respectively.

Initialization::

    /* dynamic */
    seqlock_t foo_seqlock;
    seqlock_init(&foo_seqlock);

    /* static */
    static DEFINE_SEQLOCK(foo_seqlock);

    /* C99 struct init */
    struct {
            .seql = __SEQLOCK_UNLOCKED(foo.seql)
    } foo;

Write path::

    write_seqlock(&foo_seqlock);

    /* ... [[write-side critical section]] ... */

    write_sequnlock(&foo_seqlock);

Read path, three categories:

1. Normal sequence readers, which never block a writer but must retry
   if a writer is in progress, by detecting a change in the sequence
   number. Writers do not wait for sequence readers::

    do {
            seq = read_seqbegin(&foo_seqlock);

            /* ... [[read-side critical section]] ... */

    } while (read_seqretry(&foo_seqlock, seq));

2. Locking readers, which will wait if a writer or another locking
   reader is in progress. A locking reader in progress will also block a
   writer from entering its critical section. This read lock is
   exclusive. Unlike rwlock_t, only one locking reader can acquire it::

    read_seqlock_excl(&foo_seqlock);

    /* ... [[read-side critical section]] ... */

    read_sequnlock_excl(&foo_seqlock);

3. Conditional lockless reader (as in 1), or locking reader (as in 2),
   according to a passed marker. This is used to avoid lockless reader
   starvation (too many retry loops) in case of a sharp spike in write
   activity. First, a lockless read is tried (even marker passed).
   If that trial fails (an odd sequence counter is returned, which is
   used as the next iteration marker), the lockless read is transformed
   into a full locking read and no retry loop is necessary::

    /* marker; even initialization */
    int seq = 0;
    do {
            read_seqbegin_or_lock(&foo_seqlock, &seq);

            /* ... [[read-side critical section]] ... */

    } while (need_seqretry(&foo_seqlock, seq));
    done_seqretry(&foo_seqlock, seq);


API documentation
=================

.. kernel-doc:: include/linux/seqlock.h