Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6 |
|
#
d396f976 |
| 25-Oct-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
debugobjects: Stop accessing objects after releasing hash bucket lock
[ Upstream commit 9bb6362652f3f4d74a87d572a91ee1b38e673ef6 ]
After release of the hashbucket lock the tracking object can be mo
debugobjects: Stop accessing objects after releasing hash bucket lock
[ Upstream commit 9bb6362652f3f4d74a87d572a91ee1b38e673ef6 ]
After release of the hashbucket lock the tracking object can be modified or freed by a concurrent thread. Using it in such a case is error prone, even for printing the object state:
1. T1 tries to deactivate destroyed object, debugobjects detects it, hash bucket lock is released.
2. T2 preempts T1 and frees the tracking object.
3. The freed tracking object is allocated and initialized for a different to be tracked kernel object.
4. T1 resumes and reports error for wrong kernel object.
Create a local copy of the tracking object before releasing the hash bucket lock and use the local copy for reporting and fixups to prevent this.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20231025-debugobjects_fix-v3-1-2bc3bf7084c2@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33 |
|
#
8b64d420 |
| 07-Jun-2023 |
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> |
debugobjects: Recheck debug_objects_enabled before reporting
syzbot is reporting false a positive ODEBUG message immediately after ODEBUG was disabled due to OOM.
[ 1062.309646][T22911] ODEBUG: O
debugobjects: Recheck debug_objects_enabled before reporting
syzbot is reporting false a positive ODEBUG message immediately after ODEBUG was disabled due to OOM.
[ 1062.309646][T22911] ODEBUG: Out of memory. ODEBUG disabled [ 1062.886755][ T5171] ------------[ cut here ]------------ [ 1062.892770][ T5171] ODEBUG: assert_init not available (active state 0) object: ffffc900056afb20 object type: timer_list hint: process_timeout+0x0/0x40
CPU 0 [ T5171] CPU 1 [T22911] -------------- -------------- debug_object_assert_init() { if (!debug_objects_enabled) return; db = get_bucket(addr); lookup_object_or_alloc() { debug_objects_enabled = 0; return NULL; } debug_objects_oom() { pr_warn("Out of memory. ODEBUG disabled\n"); // all buckets get emptied here, and } lookup_object_or_alloc(addr, db, descr, false, true) { // this bucket is already empty. return ERR_PTR(-ENOENT); } // Emits false positive warning. debug_print_object(&o, "assert_init"); }
Recheck debug_object_enabled in debug_print_object() to avoid that.
Reported-by: syzbot <syzbot+7937ba6a50bdd00fffdf@syzkaller.appspotmail.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/492fe2ae-5141-d548-ebd5-62f5fe2e57f7@I-love.SAKURA.ne.jp Closes: https://syzkaller.appspot.com/bug?extid=7937ba6a50bdd00fffdf
show more ...
|
Revision tags: v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28 |
|
#
eb799279 |
| 11-May-2023 |
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> |
debugobjects: Don't wake up kswapd from fill_pool()
syzbot is reporting a lockdep warning in fill_pool() because the allocation from debugobjects is using GFP_ATOMIC, which is (__GFP_HIGH | __GFP_KS
debugobjects: Don't wake up kswapd from fill_pool()
syzbot is reporting a lockdep warning in fill_pool() because the allocation from debugobjects is using GFP_ATOMIC, which is (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) and therefore tries to wake up kswapd, which acquires kswapd_wait::lock.
Since fill_pool() might be called with arbitrary locks held, fill_pool() should not assume that acquiring kswapd_wait::lock is safe.
Use __GFP_HIGH instead and remove __GFP_NORETRY as it is pointless for !__GFP_DIRECT_RECLAIM allocation.
Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects") Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/6577e1fa-b6ee-f2be-2414-a2b51b1c5e30@I-love.SAKURA.ne.jp Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380
show more ...
|
Revision tags: v6.1.27, v6.1.26 |
|
#
0cce06ba |
| 25-Apr-2023 |
Peter Zijlstra <peterz@infradead.org> |
debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
There is an explicit wait-type violation in debug_object_fill_pool() for PREEMPT_RT=n kernels which allows them to more ea
debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
There is an explicit wait-type violation in debug_object_fill_pool() for PREEMPT_RT=n kernels which allows them to more easily fill the object pool and reduce the chance of allocation failures.
Lockdep's wait-type checks are designed to check the PREEMPT_RT locking rules even for PREEMPT_RT=n kernels and object to this, so create a lockdep annotation to allow this to stand.
Specifically, create a 'lock' type that overrides the inner wait-type while it is held -- allowing one to temporarily raise it, such that the violation is hidden.
Reported-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Qi Zheng <zhengqi.arch@bytedance.com> Link: https://lkml.kernel.org/r/20230429100614.GA1489784@hirez.programming.kicks-ass.net
show more ...
|
#
0af462f1 |
| 01-May-2023 |
Thomas Gleixner <tglx@linutronix.de> |
debugobject: Ensure pool refill (again)
The recent fix to ensure atomicity of lookup and allocation inadvertently broke the pool refill mechanism.
Prior to that change debug_objects_activate() and
debugobject: Ensure pool refill (again)
The recent fix to ensure atomicity of lookup and allocation inadvertently broke the pool refill mechanism.
Prior to that change debug_objects_activate() and debug_objecs_assert_init() invoked debug_objecs_init() to set up the tracking object for statically initialized objects. That's not longer the case and debug_objecs_init() is now the only place which does pool refills.
Depending on the number of statically initialized objects this can be enough to actually deplete the pool, which was observed by Ido via a debugobjects OOM warning.
Restore the old behaviour by adding explicit refill opportunities to debug_objects_activate() and debug_objecs_assert_init().
Fixes: 63a759694eed ("debugobject: Prevent init race with static objects") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/871qk05a9d.ffs@tglx
show more ...
|
Revision tags: v6.3, v6.1.25, v6.1.24 |
|
#
63a75969 |
| 12-Apr-2023 |
Thomas Gleixner <tglx@linutronix.de> |
debugobject: Prevent init race with static objects
Statically initialized objects are usually not initialized via the init() function of the subsystem. They are special cased and the subsystem provi
debugobject: Prevent init race with static objects
Statically initialized objects are usually not initialized via the init() function of the subsystem. They are special cased and the subsystem provides a function to validate whether an object which is not yet tracked by debugobjects is statically initialized. This means the object is started to be tracked on first use, e.g. activation.
This works perfectly fine, unless there are two concurrent operations on that object. Schspa decoded the problem:
T0 T1
debug_object_assert_init(addr) lock_hash_bucket() obj = lookup_object(addr); if (!obj) { unlock_hash_bucket(); - > preemption lock_subsytem_object(addr); activate_object(addr) lock_hash_bucket(); obj = lookup_object(addr); if (!obj) { unlock_hash_bucket(); if (is_static_object(addr)) init_and_track(addr); lock_hash_bucket(); obj = lookup_object(addr); obj->state = ACTIVATED; unlock_hash_bucket();
subsys function modifies content of addr, so static object detection does not longer work.
unlock_subsytem_object(addr); if (is_static_object(addr)) <- Fails
debugobject emits a warning and invokes the fixup function which reinitializes the already active object in the worst case.
This race exists forever, but was never observed until mod_timer() got a debug_object_assert_init() added which is outside of the timer base lock held section right at the beginning of the function to cover the lockless early exit points too.
Rework the code so that the lookup, the static object check and the tracking object association happens atomically under the hash bucket lock. This prevents the issue completely as all callers are serialized on the hash bucket lock and therefore cannot observe inconsistent state.
Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects") Reported-by: syzbot+5093ba19745994288b53@syzkaller.appspotmail.com Debugged-by: Schspa Shi <schspa@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://syzkaller.appspot.com/bug?id=22c8a5938eab640d1c6bcc0e3dc7be519d878462 Link: https://lore.kernel.org/lkml/20230303161906.831686-1-schspa@gmail.com Link: https://lore.kernel.org/r/87zg7dzgao.ffs@tglx
show more ...
|
Revision tags: v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18 |
|
#
c4db2d3b |
| 19-May-2022 |
Stephen Boyd <swboyd@chromium.org> |
debugobjects: Print object pointer in debug_print_object()
Delayed kobject debugging (CONFIG_DEBUG_KOBJECT_RELEASE) prints the kobject pointer that's being released in kobject_release() before sched
debugobjects: Print object pointer in debug_print_object()
Delayed kobject debugging (CONFIG_DEBUG_KOBJECT_RELEASE) prints the kobject pointer that's being released in kobject_release() before scheduling a randomly delayed work to do the actual release work.
If the caller of kobject_put() frees the kobject upon return then this will typically emit a debugobject warning about freeing an active timer.
Usually the release function is the function that does the kfree() of the struct containing the kobject.
For example the following print is seen
kobject: 'queue' (ffff888114236190): kobject_release, parent 0000000000000000 (delayed 1000) ------------[ cut here ]------------ ODEBUG: free active (active state 0) object type: timer_list hint: kobject_delayed_cleanup+0x0/0x390
but the kobject printk cannot be matched with the debug object printk because it could be any number of kobjects that was released around that time. The random delay for the work doesn't help either.
Print the address of the object being tracked to help to figure out which kobject is the problem here. Note that this does not use %px here to match the other %p usage in debugobject debugging. Due to %p usage it is required to disable pointer hashing to correlate the two pointer printks.
Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220519202201.2348343-1-swboyd@chromium.org
show more ...
|
#
eabb7f1a |
| 11-Jun-2022 |
wuchi <wuchi.zero@gmail.com> |
lib/debugobjects: fix stat count and optimize debug_objects_mem_init
1. Var debug_objects_allocated tracks valid kmem_cache_alloc calls, so track it in debug_objects_replace_static_objects. Do s
lib/debugobjects: fix stat count and optimize debug_objects_mem_init
1. Var debug_objects_allocated tracks valid kmem_cache_alloc calls, so track it in debug_objects_replace_static_objects. Do similar things in object_cpu_offline.
2. In debug_objects_mem_init, there is no need to call function cpuhp_setup_state_nocalls when debug_objects_enabled = 0 (out of memory).
Link: https://lkml.kernel.org/r/20220611130634.99741-1-wuchi.zero@gmail.com Fixes: 634d61f45d6f ("debugobjects: Percpu pool lookahead freeing/allocation") Fixes: c4b73aabd098 ("debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done") Signed-off-by: wuchi <wuchi.zero@gmail.com> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
Revision tags: v5.15.41, v5.15.40, v5.15.39 |
|
#
9e4a51ad |
| 10-May-2022 |
Thomas Gleixner <tglx@linutronix.de> |
debugobjects: Convert to SPDX license identifier
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/87v8udpy3u.ffs@tglx
|
Revision tags: v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
4bedcc28 |
| 12-Aug-2021 |
Thomas Gleixner <tglx@linutronix.de> |
debugobjects: Make them PREEMPT_RT aware
On PREEMPT_RT enabled kernels it is not possible to refill the object pool from atomic context (preemption or interrupts disabled) as the allocator might acq
debugobjects: Make them PREEMPT_RT aware
On PREEMPT_RT enabled kernels it is not possible to refill the object pool from atomic context (preemption or interrupts disabled) as the allocator might acquire 'sleeping' spinlocks.
Guard the invocation of fill_pool() accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/87sfzehdnl.ffs@tglx
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8 |
|
#
88451f2c |
| 08-Sep-2020 |
Zqiang <qiang.zhang@windriver.com> |
debugobjects: Free per CPU pool after CPU unplug
If a CPU is offlined the debug objects per CPU pool is not cleaned up. If the CPU is never onlined again then the objects in the pool are wasted.
Ad
debugobjects: Free per CPU pool after CPU unplug
If a CPU is offlined the debug objects per CPU pool is not cleaned up. If the CPU is never onlined again then the objects in the pool are wasted.
Add a CPU hotplug callback which is invoked after the CPU is dead to free the pool.
[ tglx: Massaged changelog and added comment about remote access safety ]
Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20200908062709.11441-1-qiang.zhang@windriver.com
show more ...
|
Revision tags: v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59 |
|
#
aedcade6 |
| 14-Aug-2020 |
Stephen Boyd <swboyd@chromium.org> |
debugobjects: Allow debug_obj_descr to be const
The debugobject core could be slightly harder to corrupt if the debug_obj_descr would be a pointer to const memory.
Depending on the architecture, co
debugobjects: Allow debug_obj_descr to be const
The debugobject core could be slightly harder to corrupt if the debug_obj_descr would be a pointer to const memory.
Depending on the architecture, const data structures are placed into read-only memory and thus are harder to corrupt or hijack.
This descriptor is used to fix up stuff like timers and workqueues when core kernel data structures are busted, so moving the descriptors to read-only memory will make debugobjects more resilient to something going wrong and then corrupting the function pointers inside struct debug_obj_descr.
Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200815004027.2046113-2-swboyd@chromium.org
show more ...
|
Revision tags: v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53 |
|
#
0f85c480 |
| 16-Jul-2020 |
Qinglang Miao <miaoqinglang@huawei.com> |
debugobjects: Convert to DEFINE_SHOW_ATTRIBUTE
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
[ tglx: Distangled it from the mess in -next ]
Signed-off-by: Qinglang Miao <miaoqinglang@huawe
debugobjects: Convert to DEFINE_SHOW_ATTRIBUTE
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
[ tglx: Distangled it from the mess in -next ]
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: hch@lst.de Link: https://lkml.kernel.org/r/20200716084747.8034-1-miaoqinglang@huawei.com
show more ...
|
Revision tags: v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13 |
|
#
35fd7a63 |
| 16-Jan-2020 |
Marco Elver <elver@google.com> |
debugobjects: Fix various data races
The counters obj_pool_free, and obj_nr_tofree, and the flag obj_freeing are read locklessly outside the pool_lock critical sections. If read with plain accesses,
debugobjects: Fix various data races
The counters obj_pool_free, and obj_nr_tofree, and the flag obj_freeing are read locklessly outside the pool_lock critical sections. If read with plain accesses, this would result in data races.
This is addressed as follows:
* reads outside critical sections become READ_ONCE()s (pairing with WRITE_ONCE()s added);
* writes become WRITE_ONCE()s (pairing with READ_ONCE()s added); since writes happen inside critical sections, only the write and not the read of RMWs needs to be atomic, thus WRITE_ONCE(var, var +/- X) is sufficient.
The data races were reported by KCSAN:
BUG: KCSAN: data-race in __free_object / fill_pool
write to 0xffffffff8beb04f8 of 4 bytes by interrupt on cpu 1: __free_object+0x1ee/0x8e0 lib/debugobjects.c:404 __debug_check_no_obj_freed+0x199/0x330 lib/debugobjects.c:969 debug_check_no_obj_freed+0x3c/0x44 lib/debugobjects.c:994 slab_free_hook mm/slub.c:1422 [inline]
read to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 2: fill_pool+0x3d/0x520 lib/debugobjects.c:135 __debug_object_init+0x3c/0x810 lib/debugobjects.c:536 debug_object_init lib/debugobjects.c:591 [inline] debug_object_activate+0x228/0x320 lib/debugobjects.c:677 debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]
BUG: KCSAN: data-race in __debug_object_init / fill_pool
read to 0xffffffff8beb04f8 of 4 bytes by task 10 on cpu 6: fill_pool+0x3d/0x520 lib/debugobjects.c:135 __debug_object_init+0x3c/0x810 lib/debugobjects.c:536 debug_object_init_on_stack+0x39/0x50 lib/debugobjects.c:606 init_timer_on_stack_key kernel/time/timer.c:742 [inline]
write to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 3: alloc_object lib/debugobjects.c:258 [inline] __debug_object_init+0x717/0x810 lib/debugobjects.c:544 debug_object_init lib/debugobjects.c:591 [inline] debug_object_activate+0x228/0x320 lib/debugobjects.c:677 debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]
BUG: KCSAN: data-race in free_obj_work / free_object
read to 0xffffffff9140c190 of 4 bytes by task 10 on cpu 6: free_object+0x4b/0xd0 lib/debugobjects.c:426 debug_object_free+0x190/0x210 lib/debugobjects.c:824 destroy_timer_on_stack kernel/time/timer.c:749 [inline]
write to 0xffffffff9140c190 of 4 bytes by task 93 on cpu 1: free_obj_work+0x24f/0x480 lib/debugobjects.c:313 process_one_work+0x454/0x8d0 kernel/workqueue.c:2264 worker_thread+0x9a/0x780 kernel/workqueue.c:2410
Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200116185529.11026-1-elver@google.com
show more ...
|
Revision tags: v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4 |
|
#
d5f34153 |
| 20-May-2019 |
Waiman Long <longman@redhat.com> |
debugobjects: Move printk out of db->lock critical sections
The db->lock is a raw spinlock and so the lock hold time is supposed to be short. This will not be the case when printk() is being involve
debugobjects: Move printk out of db->lock critical sections
The db->lock is a raw spinlock and so the lock hold time is supposed to be short. This will not be the case when printk() is being involved in some of the critical sections. In order to avoid the long hold time, in case some messages need to be printed, the debug_object_is_on_stack() and debug_print_object() calls are now moved out of those critical sections.
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Qian Cai <cai@gmx.us> Cc: Zhong Jiang <zhongjiang@huawei.com> Link: https://lkml.kernel.org/r/20190520141450.7575-6-longman@redhat.com
show more ...
|
#
a7344a68 |
| 20-May-2019 |
Waiman Long <longman@redhat.com> |
debugobjects: Less aggressive freeing of excess debug objects
After a system bootup and 3 parallel kernel builds, a partial output of the debug objects stats file was:
pool_free :5101 pool_pcp_
debugobjects: Less aggressive freeing of excess debug objects
After a system bootup and 3 parallel kernel builds, a partial output of the debug objects stats file was:
pool_free :5101 pool_pcp_free :4181 pool_min_free :220 pool_used :104172 pool_max_used :171920 on_free_list :0 objs_allocated:39268280 objs_freed :39160031
More than 39 millions debug objects had since been allocated and then freed. The pool_max_used, however, was only about 172k. So this is a lot of extra overhead in freeing and allocating objects from slabs. It may also causes the slabs to be more fragmented and harder to reclaim.
Make the freeing of excess debug objects less aggressive by freeing them at a maximum frequency of 10Hz and about 1k objects at each round of freeing.
With that change applied, the partial output of the debug objects stats file after similar actions became:
pool_free :5901 pool_pcp_free :3742 pool_min_free :1022 pool_used :104805 pool_max_used :168081 on_free_list :0 objs_allocated:5796864 objs_freed :5687182
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Qian Cai <cai@gmx.us> Cc: Zhong Jiang <zhongjiang@huawei.com> Link: https://lkml.kernel.org/r/20190520141450.7575-5-longman@redhat.com
show more ...
|
#
d26bf505 |
| 20-May-2019 |
Waiman Long <longman@redhat.com> |
debugobjects: Reduce number of pool_lock acquisitions in fill_pool()
In fill_pool(), the pool_lock is acquired and then released once per debug object. If many objects are to be filled, the constant
debugobjects: Reduce number of pool_lock acquisitions in fill_pool()
In fill_pool(), the pool_lock is acquired and then released once per debug object. If many objects are to be filled, the constant lock and unlock operations are extra overhead.
To reduce the overhead, batch them up and do an allocation of 4 objects per lock/unlock sequence.
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Qian Cai <cai@gmx.us> Cc: Zhong Jiang <zhongjiang@huawei.com> Link: https://lkml.kernel.org/r/20190520141450.7575-4-longman@redhat.com
show more ...
|
#
634d61f4 |
| 20-May-2019 |
Waiman Long <longman@redhat.com> |
debugobjects: Percpu pool lookahead freeing/allocation
Most workloads will allocate a bunch of memory objects, work on them and then freeing all or most of them. So just having a percpu free pool ma
debugobjects: Percpu pool lookahead freeing/allocation
Most workloads will allocate a bunch of memory objects, work on them and then freeing all or most of them. So just having a percpu free pool may not reduce the pool_lock contention significantly if large number of objects are being used.
To help those situations, we are now doing lookahead allocation and freeing of the debug objects into and out of the percpu free pool. This will hopefully reduce the number of times the pool_lock needs to be taken and hence its contention level.
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Qian Cai <cai@gmx.us> Cc: Zhong Jiang <zhongjiang@huawei.com> Link: https://lkml.kernel.org/r/20190520141450.7575-3-longman@redhat.com
show more ...
|
#
d86998b1 |
| 20-May-2019 |
Waiman Long <longman@redhat.com> |
debugobjects: Add percpu free pools
When a multi-threaded workload does a lot of small memory object allocations and deallocations, it may cause the allocation and freeing of many debug objects. Thi
debugobjects: Add percpu free pools
When a multi-threaded workload does a lot of small memory object allocations and deallocations, it may cause the allocation and freeing of many debug objects. This will make the global pool_lock a bottleneck in the performance of the workload. Since interrupts are disabled when acquiring the pool_lock, it may even cause hard lockups to happen.
To reduce contention of the global pool_lock, add a percpu debug object free pool that can be used to buffer some of the debug object allocation and freeing requests without acquiring the pool_lock. Each CPU will now have a percpu free pool that can hold up to a maximum of 64 debug objects. Allocation and freeing requests will go to the percpu free pool first. If that fails, the pool_lock will be taken and the global free pool will be used.
The presence or absence of obj_cache is used as a marker to see if the percpu cache should be used.
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Qian Cai <cai@gmx.us> Cc: Zhong Jiang <zhongjiang@huawei.com> Link: https://lkml.kernel.org/r/20190520141450.7575-2-longman@redhat.com
show more ...
|
#
fecb0d95 |
| 12-Jun-2019 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
debugobjects: No need to check return value of debugfs_create()
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic sho
debugobjects: No need to check return value of debugfs_create()
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Qian Cai <cai@gmx.us> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Waiman Long <longman@redhat.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: Zhong Jiang <zhongjiang@huawei.com> Link: https://lkml.kernel.org/r/20190612153513.GA21082@kroah.com
show more ...
|
Revision tags: v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13 |
|
#
a9ee3a63 |
| 28-Dec-2018 |
Qian Cai <cai@gmx.us> |
debugobjects: call debug_objects_mem_init eariler
The current value of the early boot static pool size, 1024 is not big enough for systems with large number of CPUs with timer or/and workqueue objec
debugobjects: call debug_objects_mem_init eariler
The current value of the early boot static pool size, 1024 is not big enough for systems with large number of CPUs with timer or/and workqueue objects selected. As the results, systems have 60+ CPUs with both timer and workqueue objects enabled could trigger "ODEBUG: Out of memory. ODEBUG disabled".
Some debug objects are allocated during the early boot. Enabling some options like timers or workqueue objects may increase the size required significantly with large number of CPUs. For example,
CONFIG_DEBUG_OBJECTS_TIMERS: No. CPUs x 2 (worker pool) objects: start_kernel workqueue_init_early init_worker_pool init_timer_key debug_object_init
plus No. CPUs objects (CONFIG_HIGH_RES_TIMERS): sched_init hrtick_rq_init hrtimer_init
CONFIG_DEBUG_OBJECTS_WORK: No. CPUs objects: vmalloc_init __init_work
plus No. CPUs x 6 (workqueue) objects: workqueue_init_early alloc_workqueue __alloc_workqueue_key alloc_and_link_pwqs init_pwq
Also, plus No. CPUs objects: perf_event_init __init_srcu_struct init_srcu_struct_fields init_srcu_struct_nodes __init_work
However, none of the things are actually used or required before debug_objects_mem_init() is invoked, so just move the call right before vmalloc_init().
According to tglx, "the reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today."
Link: http://lkml.kernel.org/r/20181126102407.1836-1-cai@gmx.us Signed-off-by: Qian Cai <cai@gmx.us> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6 |
|
#
8de456cf |
| 30-Nov-2018 |
Qian Cai <cai@gmx.us> |
debugobjects: avoid recursive calls with kmemleak
CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to recursive calls.
fill_pool kmemleak_ignore make_black_object put_
debugobjects: avoid recursive calls with kmemleak
CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to recursive calls.
fill_pool kmemleak_ignore make_black_object put_object __call_rcu (kernel/rcu/tree.c) debug_rcu_head_queue debug_object_activate debug_object_init fill_pool kmemleak_ignore make_black_object ...
So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly allocated debug objects at all.
Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us Signed-off-by: Qian Cai <cai@gmx.us> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12 |
|
#
3ff4f80a |
| 31-Jul-2018 |
Zhong Jiang <zhongjiang@huawei.com> |
debugobjects: Remove redundant NULL pointer check
kmem_cache_destroy() has a built in NULL pointer check, so the one at the call can be removed.
Signed-off-by: Zhong Jiang <zhongjiang@huawei.com> S
debugobjects: Remove redundant NULL pointer check
kmem_cache_destroy() has a built in NULL pointer check, so the one at the call can be removed.
Signed-off-by: Zhong Jiang <zhongjiang@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <longman@redhat.com> Cc: <arnd@arndb.de> Cc: <yang.shi@linux.alibaba.com> Link: https://lkml.kernel.org/r/1533054298-35824-1-git-send-email-zhongjiang@huawei.com
show more ...
|
Revision tags: v4.17.11, v4.17.10 |
|
#
fc91a3c4 |
| 23-Jul-2018 |
Joel Fernandes (Google) <joel@joelfernandes.org> |
debugobjects: Make stack check warning more informative
While debugging an issue debugobject tracking warned about an annotation issue of an object on stack. It turned out that the issue was due to
debugobjects: Make stack check warning more informative
While debugging an issue debugobject tracking warned about an annotation issue of an object on stack. It turned out that the issue was due to the object in concern being on a different stack which was due to another issue.
Thomas suggested to print the pointers and the location of the stack for the currently running task. This helped to figure out that the object was on the wrong stack.
As this is general useful information for debugging similar issues, make the error message more informative by printing the pointers.
[ tglx: Massaged changelog ]
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: kernel-team@android.com Cc: Arnd Bergmann <arnd@arndb.de> Cc: astrachan@google.com Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org
show more ...
|
Revision tags: v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4, v4.17.3, v4.17.2, v4.17.1, v4.17, v4.16 |
|
#
163cf842 |
| 13-Mar-2018 |
Arnd Bergmann <arnd@arndb.de> |
debugobjects: Avoid another unused variable warning
debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(), and only read in debug_objects_maxchecked, unfortunately both of these a
debugobjects: Avoid another unused variable warning
debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(), and only read in debug_objects_maxchecked, unfortunately both of these are optional and depend on different Kconfig symbols.
When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled this warning is emitted:
lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable]
Rather than trying to add more complex #ifdef protections, mark the variable as __maybe_unused so it can be silently dropped when usused.
Fixes: bd9dcd046509 ("debugobjects: Export max loops counter") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de
show more ...
|