Revision tags: v6.6.67 |
|
#
278002ed |
| 15-Dec-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.66' into for/openbmc/dev-6.6
This is the 6.6.66 stable release
|
Revision tags: v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1 |
|
#
01ecd269 |
| 04-Nov-2023 |
Peter Zijlstra <peterz@infradead.org> |
sched/deadline: Move bandwidth accounting into {en,de}queue_dl_entity
[ Upstream commit 2f7a0f58948d8231236e2facecc500f1930fb996 ]
In preparation of introducing !task sched_dl_entity; move the band
sched/deadline: Move bandwidth accounting into {en,de}queue_dl_entity
[ Upstream commit 2f7a0f58948d8231236e2facecc500f1930fb996 ]
In preparation of introducing !task sched_dl_entity; move the bandwidth accounting into {en.de}queue_dl_entity().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lkml.kernel.org/r/a86dccbbe44e021b8771627e1dae01a69b73466d.1699095159.git.bristot@kernel.org Stable-dep-of: 0664e2c311b9 ("sched/deadline: Fix warning in migrate_enable for boosted tasks") Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
842010e3 |
| 04-Nov-2023 |
Peter Zijlstra <peterz@infradead.org> |
sched/deadline: Collect sched_dl_entity initialization
[ Upstream commit 9e07d45c5210f5dd6701c00d55791983db7320fa ]
Create a single function that initializes a sched_dl_entity.
Signed-off-by: Pete
sched/deadline: Collect sched_dl_entity initialization
[ Upstream commit 9e07d45c5210f5dd6701c00d55791983db7320fa ]
Create a single function that initializes a sched_dl_entity.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lkml.kernel.org/r/51acc695eecf0a1a2f78f9a044e11ffd9b316bcf.1699095159.git.bristot@kernel.org Stable-dep-of: 0664e2c311b9 ("sched/deadline: Fix warning in migrate_enable for boosted tasks") Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
4db5988b |
| 04-Nov-2023 |
Peter Zijlstra <peterz@infradead.org> |
sched: Unify runtime accounting across classes
[ Upstream commit 5d69eca542ee17c618f9a55da52191d5e28b435f ]
All classes use sched_entity::exec_start to track runtime and have copies of the exact sa
sched: Unify runtime accounting across classes
[ Upstream commit 5d69eca542ee17c618f9a55da52191d5e28b435f ]
All classes use sched_entity::exec_start to track runtime and have copies of the exact same code around to compute runtime.
Collapse all that.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/54d148a144f26d9559698c4dd82d8859038a7380.1699095159.git.bristot@kernel.org Stable-dep-of: 0664e2c311b9 ("sched/deadline: Fix warning in migrate_enable for boosted tasks") Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4 |
|
#
b2f7d750 |
| 19-Sep-2023 |
Ingo Molnar <mingo@kernel.org> |
sched/fair: Rename check_preempt_curr() to wakeup_preempt()
[ Upstream commit e23edc86b09df655bf8963bbcb16647adc787395 ]
The name is a bit opaque - make it clear that this is about wakeup preemptio
sched/fair: Rename check_preempt_curr() to wakeup_preempt()
[ Upstream commit e23edc86b09df655bf8963bbcb16647adc787395 ]
The name is a bit opaque - make it clear that this is about wakeup preemption.
Also rename the ->check_preempt_curr() methods similarly.
Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Stable-dep-of: 0664e2c311b9 ("sched/deadline: Fix warning in migrate_enable for boosted tasks") Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
7e24a55b |
| 04-Aug-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.44' into for/openbmc/dev-6.6
This is the 6.6.44 stable release
|
#
7ca52974 |
| 25-Jun-2024 |
Tejun Heo <tj@kernel.org> |
sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks
commit d329605287020c3d1c3b0dadc63d8208e7251382 upstream.
When a task's weight is being changed, set_load_weight()
sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks
commit d329605287020c3d1c3b0dadc63d8208e7251382 upstream.
When a task's weight is being changed, set_load_weight() is called with @update_load set. As weight changes aren't trivial for the fair class, set_load_weight() calls fair.c::reweight_task() for fair class tasks.
However, set_load_weight() first tests task_has_idle_policy() on entry and skips calling reweight_task() for SCHED_IDLE tasks. This is buggy as SCHED_IDLE tasks are just fair tasks with a very low weight and they would incorrectly skip load, vlag and position updates.
Fix it by updating reweight_task() to take struct load_weight as idle weight can't be expressed with prio and making set_load_weight() call reweight_task() for SCHED_IDLE tasks too when @update_load is set.
Fixes: 9059393e4ec1 ("sched/fair: Use reweight_entity() for set_user_nice()") Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org # v4.15+ Link: http://lkml.kernel.org/r/20240624102331.GI31592@noisy.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
6e4f6b5e |
| 18-Jul-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.41' into dev-6.6
This is the 6.6.41 stable release
|
#
448a2500 |
| 18-Jun-2024 |
John Stultz <jstultz@google.com> |
sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath
commit ddae0ca2a8fe12d0e24ab10ba759c3fbd755ada8 upstream.
It was reported that in moving to 6.1, a larger then 10% regression
sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath
commit ddae0ca2a8fe12d0e24ab10ba759c3fbd755ada8 upstream.
It was reported that in moving to 6.1, a larger then 10% regression was seen in the performance of clock_gettime(CLOCK_THREAD_CPUTIME_ID,...).
Using a simple reproducer, I found: 5.10: 100000000 calls in 24345994193 ns => 243.460 ns per call 100000000 calls in 24288172050 ns => 242.882 ns per call 100000000 calls in 24289135225 ns => 242.891 ns per call
6.1: 100000000 calls in 28248646742 ns => 282.486 ns per call 100000000 calls in 28227055067 ns => 282.271 ns per call 100000000 calls in 28177471287 ns => 281.775 ns per call
The cause of this was finally narrowed down to the addition of psi_account_irqtime() in update_rq_clock_task(), in commit 52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure").
In my initial attempt to resolve this, I leaned towards moving all accounting work out of the clock_gettime() call path, but it wasn't very pretty, so it will have to wait for a later deeper rework. Instead, Peter shared this approach:
Rework psi_account_irqtime() to use its own psi_irq_time base for accounting, and move it out of the hotpath, calling it instead from sched_tick() and __schedule().
In testing this, we found the importance of ensuring psi_account_irqtime() is run under the rq_lock, which Johannes Weiner helpfully explained, so also add some lockdep annotations to make that requirement clear.
With this change the performance is back in-line with 5.10: 6.1+fix: 100000000 calls in 24297324597 ns => 242.973 ns per call 100000000 calls in 24318869234 ns => 243.189 ns per call 100000000 calls in 24291564588 ns => 242.916 ns per call
Reported-by: Jimmy Shiu <jimmyshiu@google.com> Originally-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Reviewed-by: Qais Yousef <qyousef@layalina.io> Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com Fixes: 52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure") [jstultz: Fixed up minor collisions w/ 6.6-stable] Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
c845428b |
| 28-Apr-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.29' into dev-6.6
This is the 6.6.29 stable release
|
#
7fce9f0f |
| 15-Apr-2024 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
sched: Add missing memory barrier in switch_mm_cid
commit fe90f3967bdb3e13f133e5f44025e15f943a99c5 upstream.
Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb() which the core sche
sched: Add missing memory barrier in switch_mm_cid
commit fe90f3967bdb3e13f133e5f44025e15f943a99c5 upstream.
Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb() which the core scheduler code has depended upon since commit:
commit 223baf9d17f25 ("sched: Fix performance regression introduced by mm_cid")
If switch_mm() doesn't call smp_mb(), sched_mm_cid_remote_clear() can unset the actively used cid when it fails to observe active task after it sets lazy_put.
There *is* a memory barrier between storing to rq->curr and _return to userspace_ (as required by membarrier), but the rseq mm_cid has stricter requirements: the barrier needs to be issued between store to rq->curr and switch_mm_cid(), which happens earlier than:
- spin_unlock(), - switch_to().
So it's fine when the architecture switch_mm() happens to have that barrier already, but less so when the architecture only provides the full barrier in switch_to() or spin_unlock().
It is a bug in the rseq switch_mm_cid() implementation. All architectures that don't have memory barriers in switch_mm(), but rather have the full barrier either in finish_lock_switch() or switch_to() have them too late for the needs of switch_mm_cid().
Introduce a new smp_mb__after_switch_mm(), defined as smp_mb() in the generic barrier.h header, and use it in switch_mm_cid() for scheduler transitions where switch_mm() is expected to provide a memory barrier.
Architectures can override smp_mb__after_switch_mm() if their switch_mm() implementation provides an implicit memory barrier. Override it with a no-op on x86 which implicitly provide this memory barrier by writing to CR3.
Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid") Reported-by: levi.yun <yeoreum.yun@arm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64 Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86 Cc: <stable@vger.kernel.org> # 6.4.x Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240415152114.59122-2-mathieu.desnoyers@efficios.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.5.3 |
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.5.2, v6.1.51, v6.5.1 |
|
#
1ac731c5 |
| 30-Aug-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.6 merge window.
|
Revision tags: v6.1.50 |
|
#
97efd283 |
| 28-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'x86-cleanups-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 cleanups from Ingo Molnar: "The following commit deserves special mention:
22dc02f81cd
Merge tag 'x86-cleanups-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 cleanups from Ingo Molnar: "The following commit deserves special mention:
22dc02f81cddd Revert "sched/fair: Move unused stub functions to header"
This is in x86/cleanups, because the revert is a re-application of a number of cleanups that got removed inadvertedly"
[ This also effectively undoes the amd_check_microcode() microcode declaration change I had done in my microcode loader merge in commit 42a7f6e3ffe0 ("Merge tag 'x86_microcode_for_v6.6_rc1' [...]").
I picked the declaration change by Arnd from this branch instead, which put it in <asm/processor.h> instead of <asm/microcode.h> like I had done in my merge resolution - Linus ]
* tag 'x86-cleanups-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Refactor code using deprecated strncpy() interface to use strscpy() x86/hpet: Refactor code using deprecated strncpy() interface to use strscpy() x86/platform/uv: Refactor code using deprecated strcpy()/strncpy() interfaces to use strscpy() x86/qspinlock-paravirt: Fix missing-prototype warning x86/paravirt: Silence unused native_pv_lock_init() function warning x86/alternative: Add a __alt_reloc_selftest() prototype x86/purgatory: Include header for warn() declaration x86/asm: Avoid unneeded __div64_32 function definition Revert "sched/fair: Move unused stub functions to header" x86/apic: Hide unused safe_smp_processor_id() on 32-bit UP x86/cpu: Fix amd_check_microcode() declaration
show more ...
|
#
3ca9a836 |
| 28-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'sched-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- The biggest change is introduction of a new iteration of the
Merge tag 'sched-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- The biggest change is introduction of a new iteration of the SCHED_FAIR interactivity code: the EEVDF ("Earliest Eligible Virtual Deadline First") scheduler
EEVDF too is a virtual-time scheduler, with two parameters (weight and relative deadline), compared to CFS that had weight only. It completely reworks the base scheduler: placement, preemption, picking -- everything
LWN.net, as usual, has a terrific writeup about EEVDF:
https://lwn.net/Articles/925371/
Preemption (both tick and wakeup) is driven by testing against a fresh pick. Because the tree is now effectively an interval tree, and the selection is no longer the 'leftmost' task, over-scheduling is less of a problem. A lot of the CFS heuristics are removed or replaced by more natural latency-space parameters & constructs
In terms of expected performance regressions: we will and can fix everything where a 'good' workload misbehaves with the new scheduler, but EEVDF inevitably changes workload scheduling in a binary fashion, hopefully for the better in the overwhelming majority of cases, but in some cases it won't, especially in adversarial loads that got lucky with the previous code, such as some variants of hackbench. We are trying hard to err on the side of fixing all performance regressions, but we expect some inevitable post-release iterations of that process
- Improve load-balancing on hybrid x86 systems: enable cluster scheduling (again)
- Improve & fix bandwidth-scheduling on nohz systems
- Improve bandwidth-throttling
- Use lock guards to simplify and de-goto-ify control flow
- Misc improvements, cleanups and fixes
* tag 'sched-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) sched/eevdf/doc: Modify the documented knob to base_slice_ns as well sched/eevdf: Curb wakeup-preemption sched: Simplify sched_core_cpu_{starting,deactivate}() sched: Simplify try_steal_cookie() sched: Simplify sched_tick_remote() sched: Simplify sched_exec() sched: Simplify ttwu() sched: Simplify wake_up_if_idle() sched: Simplify: migrate_swap_stop() sched: Simplify sysctl_sched_uclamp_handler() sched: Simplify get_nohz_timer_target() sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset sched/rt: Fix sysctl_sched_rr_timeslice intial value sched/fair: Block nohz tick_stop when cfs bandwidth in use sched, cgroup: Restore meaning to hierarchical_quota MAINTAINERS: Add Peter explicitly to the psi section sched/psi: Select KERNFS as needed sched/topology: Align group flags when removing degenerate domain sched/fair: remove util_est boosting sched/fair: Propagate enqueue flags into place_entity() ...
show more ...
|
#
b03a4342 |
| 28-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'seccomp-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
- Provide USER_NOTIFY flag for synchronous mode (Andrei Vagin, Peter
Merge tag 'seccomp-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
- Provide USER_NOTIFY flag for synchronous mode (Andrei Vagin, Peter Oskolkov). This touches the scheduler and perf but has been Acked by Peter Zijlstra.
- Fix regression in syscall skipping and restart tracing on arm32. This touches arch/arm/ but has been Acked by Arnd Bergmann.
* tag 'seccomp-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Add missing kerndoc notations ARM: ptrace: Restore syscall skipping for tracers ARM: ptrace: Restore syscall restart tracing selftests/seccomp: Handle arm32 corner cases better perf/benchmark: add a new benchmark for seccom_unotify selftest/seccomp: add a new test for the sync mode of seccomp_user_notify seccomp: add the synchronous mode for seccomp_unotify sched: add a few helpers to wake up tasks on the current cpu sched: add WF_CURRENT_CPU and externise ttwu seccomp: don't use semaphore and wait_queue together
show more ...
|
Revision tags: v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43 |
|
#
4eb054f9 |
| 01-Aug-2023 |
Peter Zijlstra <peterz@infradead.org> |
sched: Simplify wake_up_if_idle()
Use guards to reduce gotos and simplify control flow.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat
sched: Simplify wake_up_if_idle()
Use guards to reduce gotos and simplify control flow.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20230801211812.032678917@infradead.org
show more ...
|
#
5bb76f1d |
| 01-Aug-2023 |
Peter Zijlstra <peterz@infradead.org> |
sched: Simplify: migrate_swap_stop()
Use guards to reduce gotos and simplify control flow.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@red
sched: Simplify: migrate_swap_stop()
Use guards to reduce gotos and simplify control flow.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20230801211811.964370836@infradead.org
show more ...
|
#
b41bbb33 |
| 10-Aug-2023 |
Ingo Molnar <mingo@kernel.org> |
Merge branch 'sched/eevdf' into sched/core
Pick up the EEVDF work into the main branch - it's looking good so far.
Conflicts: kernel/sched/features.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
2612e3bb |
| 07-Aug-2023 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catching-up with drm-next and drm-intel-gt-next. It will unblock a code refactor around the platform definitions (names vs acronyms).
Signed-off-by: Rodrigo V
Merge drm/drm-next into drm-intel-next
Catching-up with drm-next and drm-intel-gt-next. It will unblock a code refactor around the platform definitions (names vs acronyms).
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
9f771739 |
| 07-Aug-2023 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Need to pull in b3e4aae612ec ("drm/i915/hdcp: Modify hdcp_gsc_message msg sending mechanism") as a dependency for https://patchwork.freedesktop.org/series/1
Merge drm/drm-next into drm-intel-gt-next
Need to pull in b3e4aae612ec ("drm/i915/hdcp: Modify hdcp_gsc_message msg sending mechanism") as a dependency for https://patchwork.freedesktop.org/series/121735/
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
Revision tags: v6.1.42, v6.1.41, v6.1.40, v6.1.39 |
|
#
88c56cfe |
| 12-Jul-2023 |
Phil Auld <pauld@redhat.com> |
sched/fair: Block nohz tick_stop when cfs bandwidth in use
CFS bandwidth limits and NOHZ full don't play well together. Tasks can easily run well past their quotas before a remote tick does account
sched/fair: Block nohz tick_stop when cfs bandwidth in use
CFS bandwidth limits and NOHZ full don't play well together. Tasks can easily run well past their quotas before a remote tick does accounting. This leads to long, multi-period stalls before such tasks can run again. Currently, when presented with these conflicting requirements the scheduler is favoring nohz_full and letting the tick be stopped. However, nohz tick stopping is already best-effort, there are a number of conditions that can prevent it, whereas cfs runtime bandwidth is expected to be enforced.
Make the scheduler favor bandwidth over stopping the tick by setting TICK_DEP_BIT_SCHED when the only running task is a cfs task with runtime limit enabled. We use cfs_b->hierarchical_quota to determine if the task requires the tick.
Add check in pick_next_task_fair() as well since that is where we have a handle on the task that is actually going to be running.
Add check in sched_can_stop_tick() to cover some edge cases such as nr_running going from 2->1 and the 1 remains the running task.
Reviewed-By: Ben Segall <bsegall@google.com> Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230712133357.381137-3-pauld@redhat.com
show more ...
|
#
c98c1827 |
| 14-Jul-2023 |
Phil Auld <pauld@redhat.com> |
sched, cgroup: Restore meaning to hierarchical_quota
In cgroupv2 cfs_b->hierarchical_quota is set to -1 for all task groups due to the previous fix simply taking the min. It should reflect a limit
sched, cgroup: Restore meaning to hierarchical_quota
In cgroupv2 cfs_b->hierarchical_quota is set to -1 for all task groups due to the previous fix simply taking the min. It should reflect a limit imposed at that level or by an ancestor. Even though cgroupv2 does not require child quota to be less than or equal to that of its ancestors the task group will still be constrained by such a quota so this should be shown here. Cgroupv1 continues to set this correctly.
In both cases, add initialization when a new task group is created based on the current parent's value (or RUNTIME_INF in the case of root_task_group). Otherwise, the field is wrong until a quota is changed after creation and __cfs_schedulable() is called.
Fixes: c53593e5cb69 ("sched, cgroup: Don't reject lower cpu.max on ancestors") Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ben Segall <bsegall@google.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20230714125746.812891-1-pauld@redhat.com
show more ...
|
#
22dc02f8 |
| 24-Jul-2023 |
Peter Zijlstra <peterz@infradead.org> |
Revert "sched/fair: Move unused stub functions to header"
Revert commit 7aa55f2a5902 ("sched/fair: Move unused stub functions to header"), for while it has the right Changelog, the actual patch cont
Revert "sched/fair: Move unused stub functions to header"
Revert commit 7aa55f2a5902 ("sched/fair: Move unused stub functions to header"), for while it has the right Changelog, the actual patch content a revert of the previous 4 patches:
f7df852ad6db ("sched: Make task_vruntime_update() prototype visible") c0bdfd72fbfb ("sched/fair: Hide unused init_cfs_bandwidth() stub") 378be384e01f ("sched: Add schedule_user() declaration") d55ebae3f312 ("sched: Hide unused sched_update_scaling()")
So in effect this is a revert of a revert and re-applies those patches.
Fixes: 7aa55f2a5902 ("sched/fair: Move unused stub functions to header") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
show more ...
|
#
61b73694 |
| 24-Jul-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get v6.5-rc2.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|