#
aad29a73 |
| 23-Jan-2025 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.74' into for/openbmc/dev-6.6
This is the 6.6.74 stable release
|
Revision tags: v6.6.74, v6.6.73, v6.6.72, v6.6.71, v6.12.9, v6.6.70, v6.12.8, v6.6.69, v6.12.7, v6.6.68 |
|
#
a5cbbea1 |
| 20-Dec-2024 |
Koichiro Den <koichiro.den@canonical.com> |
hrtimers: Handle CPU state correctly on hotplug
commit 2f8dea1692eef2b7ba6a256246ed82c365fdc686 upstream.
Consider a scenario where a CPU transitions from CPUHP_ONLINE to halfway through a CPU hotu
hrtimers: Handle CPU state correctly on hotplug
commit 2f8dea1692eef2b7ba6a256246ed82c365fdc686 upstream.
Consider a scenario where a CPU transitions from CPUHP_ONLINE to halfway through a CPU hotunplug down to CPUHP_HRTIMERS_PREPARE, and then back to CPUHP_ONLINE:
Since hrtimers_prepare_cpu() does not run, cpu_base.hres_active remains set to 1 throughout. However, during a CPU unplug operation, the tick and the clockevents are shut down at CPUHP_AP_TICK_DYING. On return to the online state, for instance CFS incorrectly assumes that the hrtick is already active, and the chance of the clockevent device to transition to oneshot mode is also lost forever for the CPU, unless it goes back to a lower state than CPUHP_HRTIMERS_PREPARE once.
This round-trip reveals another issue; cpu_base.online is not set to 1 after the transition, which appears as a WARN_ON_ONCE in enqueue_hrtimer().
Aside of that, the bulk of the per CPU state is not reset either, which means there are dangling pointers in the worst case.
Address this by adding a corresponding startup() callback, which resets the stale per CPU state and sets the online flag.
[ tglx: Make the new callback unconditionally available, remove the online modification in the prepare() callback and clear the remaining state in the starting callback instead of the prepare callback ]
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier") Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241220134421.3809834-1-koichiro.den@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.12.6, v6.6.67, v6.12.5, v6.6.66, v6.6.65, v6.12.4, v6.6.64, v6.12.3, v6.12.2, v6.6.63, v6.12.1, v6.12, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49 |
|
#
26d0dfbb |
| 29-Aug-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.48' into for/openbmc/dev-6.6
This is the 6.6.48 stable release
|
Revision tags: v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44 |
|
#
f17c3a37 |
| 30-Jul-2024 |
Nysal Jan K.A <nysal@linux.ibm.com> |
cpu/SMT: Enable SMT only if a core is online
[ Upstream commit 6c17ea1f3eaa330d445ac14a9428402ce4e3055e ]
If a core is offline then enabling SMT should not online CPUs of this core. By enabling SMT
cpu/SMT: Enable SMT only if a core is online
[ Upstream commit 6c17ea1f3eaa330d445ac14a9428402ce4e3055e ]
If a core is offline then enabling SMT should not online CPUs of this core. By enabling SMT, what is intended is either changing the SMT value from "off" to "on" or setting the SMT level (threads per core) from a lower to higher value.
On PowerPC the ppc64_cpu utility can be used, among other things, to perform the following functions:
ppc64_cpu --cores-on # Get the number of online cores ppc64_cpu --cores-on=X # Put exactly X cores online ppc64_cpu --offline-cores=X[,Y,...] # Put specified cores offline ppc64_cpu --smt={on|off|value} # Enable, disable or change SMT level
If the user has decided to offline certain cores, enabling SMT should not online CPUs in those cores. This patch fixes the issue and changes the behaviour as described, by introducing an arch specific function topology_is_core_online(). It is currently implemented only for PowerPC.
Fixes: 73c58e7e1412 ("powerpc: Add HOTPLUG_SMT support") Reported-by: Tyrel Datwyler <tyreld@linux.ibm.com> Closes: https://groups.google.com/g/powerpc-utils-devel/c/wrwVzAAnRlI/m/5KJSoqP4BAAJ Signed-off-by: Nysal Jan K.A <nysal@linux.ibm.com> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240731030126.956210-2-nysal@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.43, v6.6.42, v6.6.41, v6.6.40 |
|
#
ee1cd504 |
| 12-Jul-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.39' into dev-6.6
This is the 6.6.39 stable release
|
Revision tags: v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35 |
|
#
69787793 |
| 18-Jun-2024 |
Huacai Chen <chenhuacai@loongson.cn> |
cpu: Fix broken cmdline "nosmp" and "maxcpus=0"
commit 6ef8eb5125722c241fd60d7b0c872d5c2e5dd4ca upstream.
After the rework of "Parallel CPU bringup", the cmdline "nosmp" and "maxcpus=0" parameters
cpu: Fix broken cmdline "nosmp" and "maxcpus=0"
commit 6ef8eb5125722c241fd60d7b0c872d5c2e5dd4ca upstream.
After the rework of "Parallel CPU bringup", the cmdline "nosmp" and "maxcpus=0" parameters are not working anymore. These parameters set setup_max_cpus to zero and that's handed to bringup_nonboot_cpus().
The code there does a decrement before checking for zero, which brings it into the negative space and brings up all CPUs.
Add a zero check at the beginning of the function to prevent this.
[ tglx: Massaged change log ]
Fixes: 18415f33e2ac4ab382 ("cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE") Fixes: 06c6796e0304234da6 ("cpu/hotplug: Fix off by one in cpuhp_bringup_mask()") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240618081336.3996825-1-chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
f91ca89e |
| 10-Jul-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.37' into dev-6.6
This is the 6.6.37 stable release
|
Revision tags: v6.6.34, v6.6.33, v6.6.32, v6.6.31 |
|
#
52bbae42 |
| 15-May-2024 |
Yuntao Wang <ytcoode@gmail.com> |
cpu/hotplug: Fix dynstate assignment in __cpuhp_setup_state_cpuslocked()
commit 932d8476399f622aa0767a4a0a9e78e5341dc0e1 upstream.
Commit 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepa
cpu/hotplug: Fix dynstate assignment in __cpuhp_setup_state_cpuslocked()
commit 932d8476399f622aa0767a4a0a9e78e5341dc0e1 upstream.
Commit 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepare stage") added a dynamic range for the prepare states, but did not handle the assignment of the dynstate variable in __cpuhp_setup_state_cpuslocked().
This causes the corresponding startup callback not to be invoked when calling __cpuhp_setup_state_cpuslocked() with the CPUHP_BP_PREPARE_DYN parameter, even though it should be.
Currently, the users of __cpuhp_setup_state_cpuslocked(), for one reason or another, have not triggered this bug.
Fixes: 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepare stage") Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240515134554.427071-1-ytcoode@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
b181f702 |
| 12-Jun-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.33' into dev-6.6
This is the 6.6.33 stable release
|
Revision tags: v6.6.30, v6.6.29 |
|
#
976b74fa |
| 19-Apr-2024 |
Sean Christopherson <seanjc@google.com> |
cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
[ Upstream commit ce0abef6a1d540acef85068e0e82bdf1fbeeb0e9 ]
Explicitly disallow enabling mitigations at runtime for kernels that wer
cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
[ Upstream commit ce0abef6a1d540acef85068e0e82bdf1fbeeb0e9 ]
Explicitly disallow enabling mitigations at runtime for kernels that were built with CONFIG_CPU_MITIGATIONS=n, as some architectures may omit code entirely if mitigations are disabled at compile time.
E.g. on x86, a large pile of Kconfigs are buried behind CPU_MITIGATIONS, and trying to provide sane behavior for retroactively enabling mitigations is extremely difficult, bordering on impossible. E.g. page table isolation and call depth tracking require build-time support, BHI mitigations will still be off without additional kernel parameters, etc.
[ bp: Touchups. ]
Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240420000556.2645001-3-seanjc@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
c1e01cdb |
| 02-May-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.30' into dev-6.6
This is the 6.6.30 stable release
|
#
8292f4f8 |
| 19-Apr-2024 |
Sean Christopherson <seanjc@google.com> |
cpu: Re-enable CPU mitigations by default for !X86 architectures
commit fe42754b94a42d08cf9501790afc25c4f6a5f631 upstream.
Rename x86's to CPU_MITIGATIONS, define it in generic code, and force it o
cpu: Re-enable CPU mitigations by default for !X86 architectures
commit fe42754b94a42d08cf9501790afc25c4f6a5f631 upstream.
Rename x86's to CPU_MITIGATIONS, define it in generic code, and force it on for all architectures exception x86. A recent commit to turn mitigations off by default if SPECULATION_MITIGATIONS=n kinda sorta missed that "cpu_mitigations" is completely generic, whereas SPECULATION_MITIGATIONS is x86-specific.
Rename x86's SPECULATIVE_MITIGATIONS instead of keeping both and have it select CPU_MITIGATIONS, as having two configs for the same thing is unnecessary and confusing. This will also allow x86 to use the knob to manage mitigations that aren't strictly related to speculative execution.
Use another Kconfig to communicate to common code that CPU_MITIGATIONS is already defined instead of having x86's menu depend on the common CPU_MITIGATIONS. This allows keeping a single point of contact for all of x86's mitigations, and it's not clear that other architectures *want* to allow disabling mitigations at compile-time.
Fixes: f337a6a21e2f ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n") Closes: https://lkml.kernel.org/r/20240413115324.53303a68%40canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240420000556.2645001-2-seanjc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
aeddf9a2 |
| 17-Apr-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.28' into dev-6.6
This is the 6.6.28 stable release
|
Revision tags: v6.6.28, v6.6.27, v6.6.26 |
|
#
2978ee7c |
| 09-Apr-2024 |
Sean Christopherson <seanjc@google.com> |
x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n
commit f337a6a21e2fd67eadea471e93d05dd37baaa9be upstream.
Initialize cpu_mitigations to CPU_MITIGATIONS_OFF if the ke
x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n
commit f337a6a21e2fd67eadea471e93d05dd37baaa9be upstream.
Initialize cpu_mitigations to CPU_MITIGATIONS_OFF if the kernel is built with CONFIG_SPECULATION_MITIGATIONS=n, as the help text quite clearly states that disabling SPECULATION_MITIGATIONS is supposed to turn off all mitigations by default.
│ If you say N, all mitigations will be disabled. You really │ should know what you are doing to say so.
As is, the kernel still defaults to CPU_MITIGATIONS_AUTO, which results in some mitigations being enabled in spite of SPECULATION_MITIGATIONS=n.
Fixes: f43b9876e857 ("x86/retbleed: Add fine grained Kconfig knobs") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: stable@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240409175108.1512861-2-seanjc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
d0c44de2 |
| 10-Feb-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.7' into dev-6.6
This is the 6.6.7 stable release
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8 |
|
#
b97d6790 |
| 13-Dec-2023 |
Joel Stanley <joel@jms.id.au> |
Merge tag 'v6.6.6' into dev-6.6
This is the 6.6.6 stable release
Signed-off-by: Joel Stanley <joel@jms.id.au>
|
Revision tags: v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1 |
|
#
53f408ca |
| 07-Nov-2023 |
Thomas Gleixner <tglx@linutronix.de> |
hrtimers: Push pending hrtimers away from outgoing CPU earlier
[ Upstream commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 ]
2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug") sol
hrtimers: Push pending hrtimers away from outgoing CPU earlier
[ Upstream commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 ]
2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug") solved the straight forward CPU hotplug deadlock vs. the scheduler bandwidth timer. Yu discovered a more involved variant where a task which has a bandwidth timer started on the outgoing CPU holds a lock and then gets throttled. If the lock required by one of the CPU hotplug callbacks the hotplug operation deadlocks because the unthrottling timer event is not handled on the dying CPU and can only be recovered once the control CPU reaches the hotplug state which pulls the pending hrtimers from the dead CPU.
Solve this by pushing the hrtimers away from the dying CPU in the dying callbacks. Nothing can queue a hrtimer on the dying CPU at that point because all other CPUs spin in stop_machine() with interrupts disabled and once the operation is finished the CPU is marked offline.
Reported-by: Yu Liao <liaoyu15@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Liu Tie <liutie4@huawei.com> Link: https://lore.kernel.org/r/87a5rphara.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.5.10, v6.6, v6.5.9, v6.5.8 |
|
#
3073f6df |
| 17-Oct-2023 |
Ran Xiaokai <ran.xiaokai@zte.com.cn> |
cpu/hotplug: Don't offline the last non-isolated CPU
[ Upstream commit 38685e2a0476127db766f81b1c06019ddc4c9ffa ]
If a system has isolated CPUs via the "isolcpus=" command line parameter, then an a
cpu/hotplug: Don't offline the last non-isolated CPU
[ Upstream commit 38685e2a0476127db766f81b1c06019ddc4c9ffa ]
If a system has isolated CPUs via the "isolcpus=" command line parameter, then an attempt to offline the last housekeeping CPU will result in a WARN_ON() when rebuilding the scheduler domains and a subsequent panic due to and unhandled empty CPU mas in partition_sched_domains_locked().
cpuset_hotplug_workfn() rebuild_sched_domains_locked() ndoms = generate_sched_domains(&doms, &attr); cpumask_and(doms[0], top_cpuset.effective_cpus, housekeeping_cpumask(HK_FLAG_DOMAIN));
Thus results in an empty CPU mask which triggers the warning and then the subsequent crash:
WARNING: CPU: 4 PID: 80 at kernel/sched/topology.c:2366 build_sched_domains+0x120c/0x1408 Call trace: build_sched_domains+0x120c/0x1408 partition_sched_domains_locked+0x234/0x880 rebuild_sched_domains_locked+0x37c/0x798 rebuild_sched_domains+0x30/0x58 cpuset_hotplug_workfn+0x2a8/0x930
Unable to handle kernel paging request at virtual address fffe80027ab37080 partition_sched_domains_locked+0x318/0x880 rebuild_sched_domains_locked+0x37c/0x798
Aside of the resulting crash, it does not make any sense to offline the last last housekeeping CPU.
Prevent this by masking out the non-housekeeping CPUs when selecting a target CPU for initiating the CPU unplug operation via the work queue.
Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/202310171709530660462@zte.com.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46 |
|
#
60edbe8e |
| 14-Aug-2023 |
Thomas Gleixner <tglx@linutronix.de> |
cpu/SMT: Make SMT control more robust against enumeration failures
[ Upstream commit d91bdd96b55cc3ce98d883a60f133713821b80a6 ]
The SMT control mechanism got added as speculation attack vector miti
cpu/SMT: Make SMT control more robust against enumeration failures
[ Upstream commit d91bdd96b55cc3ce98d883a60f133713821b80a6 ]
The SMT control mechanism got added as speculation attack vector mitigation. The implemented logic relies on the primary thread mask to be set up properly.
This turns out to be an issue with XEN/PV guests because their CPU hotplug mechanics do not enumerate APICs and therefore the mask is never correctly populated.
This went unnoticed so far because by chance XEN/PV ends up with smp_num_siblings == 2. So smt_hotplug_control stays at its default value CPU_SMT_ENABLED and the primary thread mask is never evaluated in the context of CPU hotplug.
This stopped "working" with the upcoming overhaul of the topology evaluation which legitimately provides a fake topology for XEN/PV. That sets smp_num_siblings to 1, which causes the core CPU hot-plug core to refuse to bring up the APs.
This happens because smt_hotplug_control is set to CPU_SMT_NOT_SUPPORTED which causes cpu_smt_allowed() to evaluate the unpopulated primary thread mask with the conclusion that all non-boot CPUs are not valid to be plugged.
Make cpu_smt_allowed() more robust and take CPU_SMT_NOT_SUPPORTED and CPU_SMT_NOT_IMPLEMENTED into account. Rename it to cpu_bootable() while at it as that makes it more clear what the function is about.
The primary mask issue on x86 XEN/PV needs to be addressed separately as there are users outside of the CPU hotplug code too.
Fixes: 05736e4ac13c ("cpu/hotplug: Provide knobs to control SMT") Reported-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.149440843@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
23dfeae8 |
| 02-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'smp-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar: "Fix a CPU hotplug related deadlock between the task which initiate
Merge tag 'smp-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar: "Fix a CPU hotplug related deadlock between the task which initiates and controls a CPU hot-unplug operation vs. the CFS bandwidth timer"
* tag 'smp-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Prevent self deadlock on CPU hot-unplug
show more ...
|
#
1ac731c5 |
| 30-Aug-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.6 merge window.
|
#
2b8272ff |
| 23-Aug-2023 |
Thomas Gleixner <tglx@linutronix.de> |
cpu/hotplug: Prevent self deadlock on CPU hot-unplug
Xiongfeng reported and debugged a self deadlock of the task which initiates and controls a CPU hot-unplug operation vs. the CFS bandwidth timer.
cpu/hotplug: Prevent self deadlock on CPU hot-unplug
Xiongfeng reported and debugged a self deadlock of the task which initiates and controls a CPU hot-unplug operation vs. the CFS bandwidth timer.
CPU1 CPU2
T1 sets cfs_quota starts hrtimer cfs_bandwidth 'period_timer' T1 is migrated to CPU2 T1 initiates offlining of CPU1 Hotplug operation starts ... 'period_timer' expires and is re-enqueued on CPU1 ... take_cpu_down() CPU1 shuts down and does not handle timers anymore. They have to be migrated in the post dead hotplug steps by the control task.
T1 runs the post dead offline operation T1 is scheduled out T1 waits for 'period_timer' to expire
T1 waits there forever if it is scheduled out before it can execute the hrtimer offline callback hrtimers_dead_cpu().
Cure this by delegating the hotplug control operation to a worker thread on an online CPU. This takes the initiating user space task, which might be affected by the bandwidth timer, completely out of the picture.
Reported-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Yu Liao <liaoyu15@huawei.com> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/8e785777-03aa-99e1-d20e-e956f5685be6@huawei.com Link: https://lore.kernel.org/r/87h6oqdq0i.ffs@tglx
show more ...
|
#
6f49693a |
| 28-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'smp-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner: "Updates for the CPU hotplug core:
- Support partial SMT
Merge tag 'smp-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner: "Updates for the CPU hotplug core:
- Support partial SMT enablement.
So far the sysfs SMT control only allows to toggle between SMT on and off. That's sufficient for x86 which usually has at max two threads except for the Xeon PHI platform which has four threads per core
Though PowerPC has up to 16 threads per core and so far it's only possible to control the number of enabled threads per core via a command line option. There is some way to control this at runtime, but that lacks enforcement and the usability is awkward
This update expands the sysfs interface and the core infrastructure to accept numerical values so PowerPC can build SMT runtime control for partial SMT enablement on top
The core support has also been provided to the PowerPC maintainers who added the PowerPC related changes on top
- Minor cleanups and documentation updates"
* tag 'smp-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: core-api/cpuhotplug: Fix state names cpu/hotplug: Remove unused function declaration cpu_set_state_online() cpu/SMT: Fix cpu_smt_possible() comment cpu/SMT: Allow enabling partial SMT states via sysfs cpu/SMT: Create topology_smt_thread_allowed() cpu/SMT: Remove topology_smt_supported() cpu/SMT: Store the current/max number of threads cpu/SMT: Move smt/control simple exit cases earlier cpu/SMT: Move SMT prototypes into cpu_smt.h cpu/hotplug: Remove dependancy against cpu_primary_thread_mask
show more ...
|
#
15f63e30 |
| 14-Aug-2023 |
Michael Ellerman <mpe@ellerman.id.au> |
Merge branch 'topic/cpu-smt' into next
Merge SMT changes we are sharing with the tip tree.
|