Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6 |
|
#
f62c43d6 |
| 03-Oct-2023 |
Frederic Weisbecker <frederic@kernel.org> |
srcu: Only accelerate on enqueue time
[ Upstream commit 8a77f38bcd28d3c22ab7dd8eff3f299d43c00411 ]
Acceleration in SRCU happens on enqueue time for each new callback. This operation is expected not
srcu: Only accelerate on enqueue time
[ Upstream commit 8a77f38bcd28d3c22ab7dd8eff3f299d43c00411 ]
Acceleration in SRCU happens on enqueue time for each new callback. This operation is expected not to fail and therefore any similar attempt from other places shouldn't find any remaining callbacks to accelerate.
Moreover accelerations performed beyond enqueue time are error prone because rcu_seq_snap() then may return the snapshot for a new grace period that is not going to be started.
Remove these dangerous and needless accelerations and introduce instead assertions reporting leaking unaccelerated callbacks beyond enqueue time.
Co-developed-by: Yong He <alexyonghe@tencent.com> Signed-off-by: Yong He <alexyonghe@tencent.com> Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Co-developed-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Reviewed-by: Like Xu <likexu@tencent.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.5.5, v6.5.4, v6.5.3, v6.5.2 |
|
#
74f6aedb |
| 04-Sep-2023 |
Denis Arefev <arefev@swemel.ru> |
srcu: Fix srcu_struct node grpmask overflow on 64-bit systems
[ Upstream commit d8d5b7bf6f2105883bbd91bbd4d5b67e4e3dff71 ]
The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) is subje
srcu: Fix srcu_struct node grpmask overflow on 64-bit systems
[ Upstream commit d8d5b7bf6f2105883bbd91bbd4d5b67e4e3dff71 ]
The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) is subject to overflow due to a failure to cast operands to a larger data type before performing the bitwise operation.
The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF Kconfig option, which on 64-bit systems defaults to 16 (resulting in a maximum shift of 15), but which can be set up as high as 64 (resulting in a maximum shift of 63). A value of 31 can result in sign extension, resulting in 0xffffffff80000000 instead of the desired 0x80000000. A value of 32 or greater triggers undefined behavior per the C standard.
This bug has not been known to cause issues because almost all kernels take the default CONFIG_RCU_FANOUT_LEAF=16. Furthermore, as long as a given compiler gives a deterministic non-zero result for 1<<N for N>=32, the code correctly invokes all SRCU callbacks, albeit wasting CPU time along the way.
This commit therefore substitutes the correct 1UL for the buggy 1.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Denis Arefev <arefev@swemel.ru> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: David Laight <David.Laight@aculab.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
51631531 |
| 03-Oct-2023 |
Frederic Weisbecker <frederic@kernel.org> |
srcu: Fix callbacks acceleration mishandling
[ Upstream commit 4a8e65b0c348e42107c64381e692e282900be361 ]
SRCU callbacks acceleration might fail if the preceding callbacks advance also fails. This
srcu: Fix callbacks acceleration mishandling
[ Upstream commit 4a8e65b0c348e42107c64381e692e282900be361 ]
SRCU callbacks acceleration might fail if the preceding callbacks advance also fails. This can happen when the following steps are met:
1) The RCU_WAIT_TAIL segment has callbacks (say for gp_num 8) and the RCU_NEXT_READY_TAIL also has callbacks (say for gp_num 12).
2) The grace period for RCU_WAIT_TAIL is observed as started but not yet completed so rcu_seq_current() returns 4 + SRCU_STATE_SCAN1 = 5.
3) This value is passed to rcu_segcblist_advance() which can't move any segment forward and fails.
4) srcu_gp_start_if_needed() still proceeds with callback acceleration. But then the call to rcu_seq_snap() observes the grace period for the RCU_WAIT_TAIL segment (gp_num 8) as completed and the subsequent one for the RCU_NEXT_READY_TAIL segment as started (ie: 8 + SRCU_STATE_SCAN1 = 9) so it returns a snapshot of the next grace period, which is 16.
5) The value of 16 is passed to rcu_segcblist_accelerate() but the freshly enqueued callback in RCU_NEXT_TAIL can't move to RCU_NEXT_READY_TAIL which already has callbacks for a previous grace period (gp_num = 12). So acceleration fails.
6) Note in all these steps, srcu_invoke_callbacks() hadn't had a chance to run srcu_invoke_callbacks().
Then some very bad outcome may happen if the following happens:
7) Some other CPU races and starts the grace period number 16 before the CPU handling previous steps had a chance. Therefore srcu_gp_start() isn't called on the latter sdp to fix the acceleration leak from previous steps with a new pair of call to advance/accelerate.
8) The grace period 16 completes and srcu_invoke_callbacks() is finally called. All the callbacks from previous grace periods (8 and 12) are correctly advanced and executed but callbacks in RCU_NEXT_READY_TAIL still remain. Then rcu_segcblist_accelerate() is called with a snaphot of 20.
9) Since nothing started the grace period number 20, callbacks stay unhandled.
This has been reported in real load:
[3144162.608392] INFO: task kworker/136:12:252684 blocked for more than 122 seconds. [3144162.615986] Tainted: G O K 5.4.203-1-tlinux4-0011.1 #1 [3144162.623053] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [3144162.631162] kworker/136:12 D 0 252684 2 0x90004000 [3144162.631189] Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm] [3144162.631192] Call Trace: [3144162.631202] __schedule+0x2ee/0x660 [3144162.631206] schedule+0x33/0xa0 [3144162.631209] schedule_timeout+0x1c4/0x340 [3144162.631214] ? update_load_avg+0x82/0x660 [3144162.631217] ? raw_spin_rq_lock_nested+0x1f/0x30 [3144162.631218] wait_for_completion+0x119/0x180 [3144162.631220] ? wake_up_q+0x80/0x80 [3144162.631224] __synchronize_srcu.part.19+0x81/0xb0 [3144162.631226] ? __bpf_trace_rcu_utilization+0x10/0x10 [3144162.631227] synchronize_srcu+0x5f/0xc0 [3144162.631236] irqfd_shutdown+0x3c/0xb0 [kvm] [3144162.631239] ? __schedule+0x2f6/0x660 [3144162.631243] process_one_work+0x19a/0x3a0 [3144162.631244] worker_thread+0x37/0x3a0 [3144162.631247] kthread+0x117/0x140 [3144162.631247] ? process_one_work+0x3a0/0x3a0 [3144162.631248] ? __kthread_cancel_work+0x40/0x40 [3144162.631250] ret_from_fork+0x1f/0x30
Fix this with taking the snapshot for acceleration _before_ the read of the current grace period number.
The only side effect of this solution is that callbacks advancing happen then _after_ the full barrier in rcu_seq_snap(). This is not a problem because that barrier only cares about:
1) Ordering accesses of the update side before call_srcu() so they don't bleed. 2) See all the accesses prior to the grace period of the current gp_num
The only things callbacks advancing need to be ordered against are carried by snp locking.
Reported-by: Yong He <alexyonghe@tencent.com> Co-developed-by:: Yong He <alexyonghe@tencent.com> Signed-off-by: Yong He <alexyonghe@tencent.com> Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Co-developed-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Link: http://lore.kernel.org/CANZk6aR+CqZaqmMWrC2eRRPY12qAZnDZLwLnHZbNi=xXMB401g@mail.gmail.com Fixes: da915ad5cf25 ("srcu: Parallelize callback handling") Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9 |
|
#
754aa642 |
| 27-Jan-2023 |
Joel Fernandes (Google) <joel@joelfernandes.org> |
srcu: Clarify comments on memory barrier "E"
There is an smp_mb() named "E" in srcu_flip() immediately before the increment (flip) of the srcu_struct structure's ->srcu_idx.
The purpose of E is to
srcu: Clarify comments on memory barrier "E"
There is an smp_mb() named "E" in srcu_flip() immediately before the increment (flip) of the srcu_struct structure's ->srcu_idx.
The purpose of E is to order the preceding scan's read of lock counters against the flipping of the ->srcu_idx, in order to prevent new readers from continuing to use the old ->srcu_idx value, which might needlessly extend the grace period.
However, this ordering is already enforced because of the control dependency between the preceding scan and the ->srcu_idx flip. This control dependency exists because atomic_long_read() is used to scan the counts, because WRITE_ONCE() is used to flip ->srcu_idx, and because ->srcu_idx is not flipped until the ->srcu_lock_count[] and ->srcu_unlock_count[] counts match. And such a match cannot happen when there is an in-flight reader that started before the flip (observation courtesy Mathieu Desnoyers).
The litmus test below (courtesy of Frederic Weisbecker, with changes for ctrldep by Boqun and Joel) shows this:
C srcu (* * bad condition: P0's first scan (SCAN1) saw P1's idx=0 LOCK count inc, though P1 saw flip. * * So basically, the ->po ordering on both P0 and P1 is enforced via ->ppo * (control deps) on both sides, and both P0 and P1 are interconnected by ->rf * relations. Combining the ->ppo with ->rf, a cycle is impossible. *)
{}
// updater P0(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1) { int lock1; int unlock1; int lock0; int unlock0;
// SCAN1 unlock1 = READ_ONCE(*UNLOCK1); smp_mb(); // A lock1 = READ_ONCE(*LOCK1);
// FLIP if (lock1 == unlock1) { // Control dep smp_mb(); // E // Remove E and still passes. WRITE_ONCE(*IDX, 1); smp_mb(); // D
// SCAN2 unlock0 = READ_ONCE(*UNLOCK0); smp_mb(); // A lock0 = READ_ONCE(*LOCK0); } }
// reader P1(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1) { int tmp; int idx1; int idx2;
// 1st reader idx1 = READ_ONCE(*IDX); if (idx1 == 0) { // Control dep tmp = READ_ONCE(*LOCK0); WRITE_ONCE(*LOCK0, tmp + 1); smp_mb(); /* B and C */ tmp = READ_ONCE(*UNLOCK0); WRITE_ONCE(*UNLOCK0, tmp + 1); } else { tmp = READ_ONCE(*LOCK1); WRITE_ONCE(*LOCK1, tmp + 1); smp_mb(); /* B and C */ tmp = READ_ONCE(*UNLOCK1); WRITE_ONCE(*UNLOCK1, tmp + 1); } }
exists (0:lock1=1 /\ 1:idx1=1)
More complicated litmus tests with multiple SRCU readers also show that memory barrier E is not needed.
This commit therefore clarifies the comment on memory barrier E.
Why not also remove that redundant smp_mb()?
Because control dependencies are quite fragile due to their not being recognized by most compilers and tools. Control dependencies therefore exact an ongoing maintenance burden, and such a burden cannot be justified in this slowpath. Therefore, that smp_mb() stays until such time as its overhead becomes a measurable problem in a real workload running on a real production system, or until such time as compilers start paying attention to this sort of control dependency.
Co-developed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Co-developed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
show more ...
|
#
cefc0a59 |
| 18-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Fix long lines in srcu_funnel_gp_start()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christ
srcu: Fix long lines in srcu_funnel_gp_start()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
6c366522 |
| 18-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Fix long lines in srcu_gp_end()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellw
srcu: Fix long lines in srcu_gp_end()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
5ff8319f |
| 18-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Fix long lines in cleanup_srcu_struct()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christo
srcu: Fix long lines in cleanup_srcu_struct()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
eabe7625 |
| 18-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Fix long lines in srcu_get_delay()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Tested-by: Sachi
srcu: Fix long lines in srcu_get_delay()
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Cc: Christoph Hellwig <hch@lst.de> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
a7bf4d7c |
| 24-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Check for readers at module-exit time
If a given statically allocated in-module srcu_struct structure was ever used for updates, srcu_module_going() will invoke cleanup_srcu_struct() at module
srcu: Check for readers at module-exit time
If a given statically allocated in-module srcu_struct structure was ever used for updates, srcu_module_going() will invoke cleanup_srcu_struct() at module-exit time. This will check for the error case of SRCU readers persisting past module-exit time. On the other hand, if this srcu_struct structure never went through a grace period, srcu_module_going() only invokes free_percpu(), which would result in strange failures if SRCU readers persisted past module-exit time.
This commit therefore adds a srcu_readers_active() check to srcu_module_going(), splatting if readers have persisted and refraining from invoking free_percpu() in that case. Better to leak memory than to suffer silent memory corruption!
[ paulmck: Apply Zhang, Qiang1 feedback on memory leak. ]
Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
fd1b3f8e |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move work-scheduling fields from srcu_struct to srcu_usage
This commit moves the ->reschedule_jiffies, ->reschedule_count, and ->work fields from the srcu_struct structure to the srcu_usage st
srcu: Move work-scheduling fields from srcu_struct to srcu_usage
This commit moves the ->reschedule_jiffies, ->reschedule_count, and ->work fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
However, this means that the container_of() calls cannot get a pointer to the srcu_struct because they are no longer in the srcu_struct. This issue is addressed by adding a ->srcu_ssp field in the srcu_usage structure that references the corresponding srcu_struct structure. And given the presence of the sup pointer to the srcu_usage structure, replace some ssp->srcu_usage-> instances with sup->.
[ paulmck Apply feedback from kernel test robot. ]
Link: https://lore.kernel.org/oe-kbuild-all/202303191400.iO5BOqka-lkp@intel.com/ Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
d20162e0 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage
This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex, ->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the
srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage
This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex, ->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
660349ac |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->sda_is_static from srcu_struct to srcu_usage
This commit moves the ->sda_is_static field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in o
srcu: Move ->sda_is_static from srcu_struct to srcu_usage
This commit moves the ->sda_is_static field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
3b46679c |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move heuristics fields from srcu_struct to srcu_usage
This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries, and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the sr
srcu: Move heuristics fields from srcu_struct to srcu_usage
This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries, and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
03200b5c |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move grace-period fields from srcu_struct to srcu_usage
This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed, ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields from
srcu: Move grace-period fields from srcu_struct to srcu_usage
This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed, ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
e3a6ab25 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage
This commit moves the ->srcu_gp_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in o
srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage
This commit moves the ->srcu_gp_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
b3fb11f7 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->lock from srcu_struct to srcu_usage
This commit moves the ->lock field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve ca
srcu: Move ->lock from srcu_struct to srcu_usage
This commit moves the ->lock field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
0839ade9 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->lock initialization after srcu_usage allocation
Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernel init
srcu: Move ->lock initialization after srcu_usage allocation
Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernel initialize the srcu_struct structure's ->lock before the srcu_usage structure has been allocated. This of course prevents the ->lock from being moved to the srcu_usage structure, so this commit moves the initialization into the init_srcu_struct_fields() after the srcu_usage structure has been allocated.
Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
574dc1a7 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage
This commit moves the ->srcu_cb_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in o
srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage
This commit moves the ->srcu_cb_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
a0d8cbd3 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->srcu_size_state from srcu_struct to srcu_usage
This commit moves the ->srcu_size_state field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former
srcu: Move ->srcu_size_state from srcu_struct to srcu_usage
This commit moves the ->srcu_size_state field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
208f41b1 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Move ->level from srcu_struct to srcu_usage
This commit moves the ->level[] array from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improv
srcu: Move ->level from srcu_struct to srcu_usage
This commit moves the ->level[] array from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
95433f72 |
| 16-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Begin offloading srcu_struct fields to srcu_update
The current srcu_struct structure is on the order of 200 bytes in size (depending on architecture and .config), which is much better than the
srcu: Begin offloading srcu_struct fields to srcu_update
The current srcu_struct structure is on the order of 200 bytes in size (depending on architecture and .config), which is much better than the old-style 26K bytes, but still all too inconvenient when one is trying to achieve good cache locality on a fastpath involving SRCU readers.
However, only a few fields in srcu_struct are used by SRCU readers. The remaining fields could be offloaded to a new srcu_update structure, thus shrinking the srcu_struct structure down to a few tens of bytes. This commit begins this noble quest, a quest that is complicated by open-coded initialization of the srcu_struct within the srcu_notifier_head structure. This complication is addressed by updating the srcu_notifier_head structure's open coding, given that there does not appear to be a straightforward way of abstracting that initialization.
This commit moves only the ->node pointer to srcu_update. Later commits will move additional fields.
[ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]
Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/ Suggested-by: Christoph Hellwig <hch@lst.de> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
#
f4d01a25 |
| 17-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Use static init for statically allocated in-module srcu_struct
Further shrinking the srcu_struct structure is eased by requiring that in-module srcu_struct structures rely more heavily on stat
srcu: Use static init for statically allocated in-module srcu_struct
Further shrinking the srcu_struct structure is eased by requiring that in-module srcu_struct structures rely more heavily on static initialization. In particular, this preserves the property that a module-load-time srcu_struct initialization can fail only due to memory-allocation failure of the per-CPU srcu_data structures. It might also slightly improve robustness by keeping the number of memory allocations that must succeed down percpu_alloc() call.
This is in preparation for splitting an srcu_usage structure out of the srcu_struct structure.
[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
Revision tags: v6.1.8, v6.1.7, v6.1.6 |
|
#
f0f44752 |
| 13-Jan-2023 |
Boqun Feng <boqun.feng@gmail.com> |
rcu: Annotate SRCU's update-side lockdep dependencies
Although all flavors of RCU readers are annotated correctly with lockdep as recursive read locks, they do not set the lock_acquire 'check' param
rcu: Annotate SRCU's update-side lockdep dependencies
Although all flavors of RCU readers are annotated correctly with lockdep as recursive read locks, they do not set the lock_acquire 'check' parameter. This means that RCU read locks are not added to the lockdep dependency graph, which in turn means that lockdep cannot detect RCU-based deadlocks. This is not a problem for RCU flavors having atomic read-side critical sections because context-based annotations can catch these deadlocks, see for example the RCU_LOCKDEP_WARN() statement in synchronize_rcu(). But context-based annotations are not helpful for sleepable RCU, especially given that it is perfectly legal to do synchronize_srcu(&srcu1) within an srcu_read_lock(&srcu2).
However, we can detect SRCU-based by: (1) Making srcu_read_lock() a 'check'ed recursive read lock and (2) Making synchronize_srcu() a empty write lock critical section. Even better, with the newly introduced lock_sync(), we can avoid false positives about irq-unsafe/safe. This commit therefore makes it so.
Note that NMI-safe SRCU read side critical sections are currently not annotated, but might be annotated in the future.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ boqun: Add comments for annotation per Waiman's suggestion ] [ boqun: Fix comment warning reported by Stephen Rothwell ] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
Revision tags: v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15 |
|
#
dafc4d16 |
| 21-Dec-2022 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Update comment after the index flip
Because there is not guaranteed to be a full memory barrier between the ->srcu_unlock_count increment of an srcu_read_unlock() and the ->srcu_lock_count inc
srcu: Update comment after the index flip
Because there is not guaranteed to be a full memory barrier between the ->srcu_unlock_count increment of an srcu_read_unlock() and the ->srcu_lock_count increment of the next srcu_read_lock(), this next srcu_read_lock() is not guaranteed to see the effect of the index flip just prior to this comment. However, this next srcu_read_lock() will execute a full memory barrier, so the srcu_read_lock() after that is guaranteed to see that index flip.
This guarantee is illustrated by the following diagram of events and the litmus test following that.
------------------------------------------------------------------------
READER UPDATER ------------- ---------- // idx is initially 0.
srcu_flip() { smp_mb(); // RSCS
srcu_read_unlock() { smp_mb(); idx++; // P smp_mb(); // QQ }
srcu_readers_unlock_idx(0) { ,--counted------------ count all unlock[0]; // Q | unlock[0]++; // X
} smp_mb(); srcu_read_lock() { READ(idx) = 0; ,---- count all lock[0]; // contributes imbalance of 1. lock[0]++; ----counted | smp_mb(); // PP } | } | | // RSCS not going to effect above scan | srcu_read_unlock() { | smp_mb(); | unlock[0]++; | } | / / srcu_read_lock() { | READ(idx); // Y -----cannot be counted because of P (has to sample idx as 1) lock[1]++; ... }
------------------------------------------------------------------------
This makes it similar to the store buffer pattern. Using X, Y, P and Q annotated above, we get:
------------------------------------------------------------------------
READER UPDATER X (write) P (write)
smp_mb(); //PP smp_mb(); //QQ
Y (read) Q (read)
------------------------------------------------------------------------
ASCII art courtesy of Joel Fernandes.
Reported-by: Joel Fernandes <joel@joelfernandes.org> Reported-by: Boqun Feng <boqun.feng@gmail.com> Reported-by: Frederic Weisbecker <frederic@kernel.org> Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
Revision tags: v6.0.14 |
|
#
0cd4b50b |
| 14-Dec-2022 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Yet more detail for srcu_readers_active_idx_check() comments
The comment in srcu_readers_active_idx_check() following the smp_mb() is out of date, hailing from a simpler time when preemption w
srcu: Yet more detail for srcu_readers_active_idx_check() comments
The comment in srcu_readers_active_idx_check() following the smp_mb() is out of date, hailing from a simpler time when preemption was disabled across the bulk of __srcu_read_lock(). The fact that preemption was disabled meant that the number of tasks that had fetched the old index but not yet incremented counters was limited by the number of CPUs.
In our more complex modern times, the number of CPUs is no longer a limit. This commit therefore updates this comment, additionally giving more memory-ordering detail.
[ paulmck: Apply Nt->Nc feedback from Joel Fernandes. ]
Reported-by: Boqun Feng <boqun.feng@gmail.com> Reported-by: Frederic Weisbecker <frederic@kernel.org> Reported-by: "Joel Fernandes (Google)" <joel@joelfernandes.org> Reported-by: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Reported-by: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|