Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8 |
|
#
fbe1bf1e |
| 15-Oct-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Revert "x86/smp: Put CPUs into INIT on shutdown if possible"
This reverts commit 45e34c8af58f23db4474e2bfe79183efec09a18b, and the two subsequent fixes to it:
3f874c9b2aae ("x86/smp: Don't send I
Revert "x86/smp: Put CPUs into INIT on shutdown if possible"
This reverts commit 45e34c8af58f23db4474e2bfe79183efec09a18b, and the two subsequent fixes to it:
3f874c9b2aae ("x86/smp: Don't send INIT to non-present and non-booted CPUs") b1472a60a584 ("x86/smp: Don't send INIT to boot CPU")
because it seems to result in hung machines at shutdown. Particularly some Dell machines, but Thomas says
"The rest seems to be Lenovo and Sony with Alderlake/Raptorlake CPUs - at least that's what I could figure out from the various bug reports.
I don't know which CPUs the DELL machines have, so I can't say it's a pattern.
I agree with the revert for now"
Ashok Raj chimes in:
"There was a report (probably this same one), and it turns out it was a bug in the BIOS SMI handler.
The client BIOS's were waiting for the lowest APICID to be the SMI rendevous master. If this is MeteorLake, the BSP wasn't the one with the lowest APIC and it triped here.
The BIOS change is also being pushed to others for assimilation :)
Server BIOS's had this correctly for a while now"
and it does look likely to be some bad interaction between SMI and the non-BSP cores having put into INIT (and thus unresponsive until reset).
Link: https://bbs.archlinux.org/viewtopic.php?pid=2124429 Link: https://www.reddit.com/r/openSUSE/comments/16qq99b/tumbleweed_shutdown_did_not_finish_completely/ Link: https://forum.artixlinux.org/index.php/topic,5997.0.html Link: https://bugzilla.redhat.com/show_bug.cgi?id=2241279 Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3 |
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.5.2, v6.1.51, v6.5.1 |
|
#
1ac731c5 |
| 30-Aug-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.6 merge window.
|
#
1687d8ac |
| 30-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Dave Hansen: "This includes a very thorough rework of the 'struct apic' handlers
Merge tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Dave Hansen: "This includes a very thorough rework of the 'struct apic' handlers. Quite a variety of them popped up over the years, especially in the 32-bit days when odd apics were much more in vogue.
The end result speaks for itself, which is a removal of a ton of code and static calls to replace indirect calls.
If there's any breakage here, it's likely to be around the 32-bit museum pieces that get light to no testing these days.
Summary:
- Rework apic callbacks, getting rid of unnecessary ones and coalescing lots of silly duplicates.
- Use static_calls() instead of indirect calls for apic->foo()
- Tons of cleanups an crap removal along the way"
* tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/apic: Turn on static calls x86/apic: Provide static call infrastructure for APIC callbacks x86/apic: Wrap IPI calls into helper functions x86/apic: Mark all hotpath APIC callback wrappers __always_inline x86/xen/apic: Mark apic __ro_after_init x86/apic: Convert other overrides to apic_update_callback() x86/apic: Replace acpi_wake_cpu_handler_update() and apic_set_eoi_cb() x86/apic: Provide apic_update_callback() x86/xen/apic: Use standard apic driver mechanism for Xen PV x86/apic: Provide common init infrastructure x86/apic: Wrap apic->native_eoi() into a helper x86/apic: Nuke ack_APIC_irq() x86/apic: Remove pointless arguments from [native_]eoi_write() x86/apic/noop: Tidy up the code x86/apic: Remove pointless NULL initializations x86/apic: Sanitize APIC ID range validation x86/apic: Prepare x2APIC for using apic::max_apic_id x86/apic: Simplify X2APIC ID validation x86/apic: Add max_apic_id member x86/apic: Wrap APIC ID validation into an inline ...
show more ...
|
#
87fa732d |
| 30-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'x86-core-2023-08-30-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Thomas Gleixner:
- Prevent kprobes on compiler generated CFI checking code.
Merge tag 'x86-core-2023-08-30-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Thomas Gleixner:
- Prevent kprobes on compiler generated CFI checking code.
The compiler generates an instruction sequence for indirect call checks. If this sequence is modified with a kprobe, then the check fails. So the instructions must be protected against probing.
- A few minor cleanups for the SMP code
* tag 'x86-core-2023-08-30-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Prohibit probing on compiler generated CFI checking code x86/smpboot: Change smp_store_boot_cpu_info() to static x86/smp: Remove a non-existent function declaration x86/smpboot: Remove a stray comment about CPU hotplug
show more ...
|
Revision tags: v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45 |
|
#
f2bb0b4f |
| 08-Aug-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/apic/32: Remove x86_cpu_to_logical_apicid
This per CPU variable is just yet another form of voodoo programming. The boot ordering is:
per_cpu(x86_cpu_to_logical_apicid, cpu) = 1U << cpu;
.
x86/apic/32: Remove x86_cpu_to_logical_apicid
This per CPU variable is just yet another form of voodoo programming. The boot ordering is:
per_cpu(x86_cpu_to_logical_apicid, cpu) = 1U << cpu;
.....
setup_apic() apic->init_apic_ldr() default_init_apic_ldr() apic_write(SET_APIC_LOGICAL_ID(1UL << smp_processor_id(), APIC_LDR);
id = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR); WARN_ON(id != per_cpu(x86_cpu_to_logical_apicid, cpu)); per_cpu(x86_cpu_to_logical_apicid, cpu) = id;
So first write the default into LDR and then validate it against the same default which was set up during early boot APIC enumeration.
Brilliant, isn't it?
The comment above the per CPU variable declaration describes it well: 'Let's keep it ugly for now.'
Remove the useless gunk and use '1U << cpu' consistently all over the place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
show more ...
|
#
a6625b47 |
| 08-Aug-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/apic: Get rid of hard_smp_processor_id()
No point in having a wrapper around read_apic_id().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.in
x86/apic: Get rid of hard_smp_processor_id()
No point in having a wrapper around read_apic_id().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
show more ...
|
#
d23c977f |
| 08-Aug-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/apic: Remove pointless x86_bios_cpu_apicid
It's a useless copy of x86_cpu_to_apicid.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
x86/apic: Remove pointless x86_bios_cpu_apicid
It's a useless copy of x86_cpu_to_apicid.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
show more ...
|
Revision tags: v6.1.44 |
|
#
2612e3bb |
| 07-Aug-2023 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catching-up with drm-next and drm-intel-gt-next. It will unblock a code refactor around the platform definitions (names vs acronyms).
Signed-off-by: Rodrigo V
Merge drm/drm-next into drm-intel-next
Catching-up with drm-next and drm-intel-gt-next. It will unblock a code refactor around the platform definitions (names vs acronyms).
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
9f771739 |
| 07-Aug-2023 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Need to pull in b3e4aae612ec ("drm/i915/hdcp: Modify hdcp_gsc_message msg sending mechanism") as a dependency for https://patchwork.freedesktop.org/series/1
Merge drm/drm-next into drm-intel-gt-next
Need to pull in b3e4aae612ec ("drm/i915/hdcp: Modify hdcp_gsc_message msg sending mechanism") as a dependency for https://patchwork.freedesktop.org/series/121735/
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
Revision tags: v6.1.43 |
|
#
d7114f83 |
| 27-Jul-2023 |
Sohil Mehta <sohil.mehta@intel.com> |
x86/smpboot: Change smp_store_boot_cpu_info() to static
The function is only used locally. Convert it to a static one.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Thomas Gleix
x86/smpboot: Change smp_store_boot_cpu_info() to static
The function is only used locally. Convert it to a static one.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230727180533.3119660-4-sohil.mehta@intel.com
show more ...
|
#
54bfd02b |
| 27-Jul-2023 |
Sohil Mehta <sohil.mehta@intel.com> |
x86/smp: Remove a non-existent function declaration
x86_idle_thread_init() does not exist anywhere. Remove its declaration from the header.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed
x86/smp: Remove a non-existent function declaration
x86_idle_thread_init() does not exist anywhere. Remove its declaration from the header.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230727180533.3119660-3-sohil.mehta@intel.com
show more ...
|
Revision tags: v6.1.42, v6.1.41 |
|
#
61b73694 |
| 24-Jul-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get v6.5-rc2.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.1.40, v6.1.39 |
|
#
50501936 |
| 17-Jul-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.4' into next
Sync up with mainline to bring in updates to shared infrastructure.
|
#
0791faeb |
| 17-Jul-2023 |
Mark Brown <broonie@kernel.org> |
ASoC: Merge v6.5-rc2
Get a similar baseline to my other branches, and fixes for people using the branch.
|
#
2f98e686 |
| 11-Jul-2023 |
Maxime Ripard <mripard@kernel.org> |
Merge v6.5-rc1 into drm-misc-fixes
Boris needs 6.5-rc1 in drm-misc-fixes to prevent a conflict.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
Revision tags: v6.1.38, v6.1.37 |
|
#
44f10dbe |
| 30-Jun-2023 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-hotfixes-stable
|
Revision tags: v6.1.36 |
|
#
e80b5003 |
| 27-Jun-2023 |
Jiri Kosina <jkosina@suse.cz> |
Merge branch 'for-6.5/apple' into for-linus
- improved support for Keychron K8 keyboard (Lasse Brun)
|
#
88afbb21 |
| 26-Jun-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'x86-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Thomas Gleixner: "A set of fixes for kexec(), reboot and shutdown issues:
Merge tag 'x86-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Thomas Gleixner: "A set of fixes for kexec(), reboot and shutdown issues:
- Ensure that the WBINVD in stop_this_cpu() has been completed before the control CPU proceedes.
stop_this_cpu() is used for kexec(), reboot and shutdown to park the APs in a HLT loop.
The control CPU sends an IPI to the APs and waits for their CPU online bits to be cleared. Once they all are marked "offline" it proceeds.
But stop_this_cpu() clears the CPU online bit before issuing WBINVD, which means there is no guarantee that the AP has reached the HLT loop.
This was reported to cause intermittent reboot/shutdown failures due to some dubious interaction with the firmware.
This is not only a problem of WBINVD. The code to actually "stop" the CPU which runs between clearing the online bit and reaching the HLT loop can cause large enough delays on its own (think virtualization). That's especially dangerous for kexec() as kexec() expects that all APs are in a safe state and not executing code while the boot CPU jumps to the new kernel. There are more issues vs kexec() which are addressed separately.
Cure this by implementing an explicit synchronization point right before the AP reaches HLT. This guarantees that the AP has completed the full stop proceedure.
- Fix the condition for WBINVD in stop_this_cpu().
The WBINVD in stop_this_cpu() is required for ensuring that when switching to or from memory encryption no dirty data is left in the cache lines which might cause a write back in the wrong more later.
This checks CPUID directly because the feature bit might have been cleared due to a command line option.
But that CPUID check accesses leaf 0x8000001f::EAX unconditionally. Intel CPUs return the content of the highest supported leaf when a non-existing leaf is read, while AMD CPUs return all zeros for unsupported leafs.
So the result of the test on Intel CPUs is lottery and on AMD its just correct by chance.
While harmless it's incorrect and causes the conditional wbinvd() to be issued where not required, which caused the above issue to be unearthed.
- Make kexec() robust against AP code execution
Ashok observed triple faults when doing kexec() on a system which had been booted with "nosmt".
It turned out that the SMT siblings which had been brought up partially are parked in mwait_play_dead() to enable power savings.
mwait_play_dead() is monitoring the thread flags of the AP's idle task, which has been chosen as it's unlikely to be written to.
But kexec() can overwrite the previous kernel text and data including page tables etc. When it overwrites the cache lines monitored by an AP that AP resumes execution after the MWAIT on eventually overwritten text, stack and page tables, which obviously might end up in a triple fault easily.
Make this more robust in several steps:
1) Use an explicit per CPU cache line for monitoring.
2) Write a command to these cache lines to kick APs out of MWAIT before proceeding with kexec(), shutdown or reboot.
The APs confirm the wakeup by writing status back and then enter a HLT loop.
3) If the system uses INIT/INIT/STARTUP for AP bringup, park the APs in INIT state.
HLT is not a guarantee that an AP won't wake up and resume execution. HLT is woken up by NMI and SMI. SMI puts the CPU back into HLT (+/- firmware bugs), but NMI is delivered to the CPU which executes the NMI handler. Same issue as the MWAIT scenario described above.
Sending an INIT/INIT sequence to the APs puts them into wait for STARTUP state, which is safe against NMI.
There is still an issue remaining which can't be fixed: #MCE
If the AP sits in HLT and receives a broadcast #MCE it will try to handle it with the obvious consequences.
INIT/INIT clears CR4.MCE in the AP which will cause a broadcast #MCE to shut down the machine.
So there is a choice between fire (HLT) and frying pan (INIT). Frying pan has been chosen as it's at least preventing the NMI issue.
On systems which are not using INIT/INIT/STARTUP there is not much which can be done right now, but at least the obvious and easy to trigger MWAIT issue has been addressed"
* tag 'x86-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Put CPUs into INIT on shutdown if possible x86/smp: Split sending INIT IPI out into a helper function x86/smp: Cure kexec() vs. mwait_play_dead() breakage x86/smp: Use dedicated cache-line for mwait_play_dead() x86/smp: Remove pointless wmb()s from native_stop_other_cpus() x86/smp: Dont access non-existing CPUID leaf x86/smp: Make stop_other_cpus() more robust
show more ...
|
#
9244724f |
| 26-Jun-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP updates from Thomas Gleixner: "A large update for SMP management:
- Parallel CPU bringup
Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP updates from Thomas Gleixner: "A large update for SMP management:
- Parallel CPU bringup
The reason why people are interested in parallel bringup is to shorten the (kexec) reboot time of cloud servers to reduce the downtime of the VM tenants.
The current fully serialized bringup does the following per AP:
1) Prepare callbacks (allocate, intialize, create threads) 2) Kick the AP alive (e.g. INIT/SIPI on x86) 3) Wait for the AP to report alive state 4) Let the AP continue through the atomic bringup 5) Let the AP run the threaded bringup to full online state
There are two significant delays:
#3 The time for an AP to report alive state in start_secondary() on x86 has been measured in the range between 350us and 3.5ms depending on vendor and CPU type, BIOS microcode size etc.
#4 The atomic bringup does the microcode update. This has been measured to take up to ~8ms on the primary threads depending on the microcode patch size to apply.
On a two socket SKL server with 56 cores (112 threads) the boot CPU spends on current mainline about 800ms busy waiting for the APs to come up and apply microcode. That's more than 80% of the actual onlining procedure.
This can be reduced significantly by splitting the bringup mechanism into two parts:
1) Run the prepare callbacks and kick the AP alive for each AP which needs to be brought up.
The APs wake up, do their firmware initialization and run the low level kernel startup code including microcode loading in parallel up to the first synchronization point. (#1 and #2 above)
2) Run the rest of the bringup code strictly serialized per CPU (#3 - #5 above) as it's done today.
Parallelizing that stage of the CPU bringup might be possible in theory, but it's questionable whether required surgery would be justified for a pretty small gain.
If the system is large enough the first AP is already waiting at the first synchronization point when the boot CPU finished the wake-up of the last AP. That reduces the AP bringup time on that SKL from ~800ms to ~80ms, i.e. by a factor ~10x.
The actual gain varies wildly depending on the system, CPU, microcode patch size and other factors. There are some opportunities to reduce the overhead further, but that needs some deep surgery in the x86 CPU bringup code.
For now this is only enabled on x86, but the core functionality obviously works for all SMP capable architectures.
- Enhancements for SMP function call tracing so it is possible to locate the scheduling and the actual execution points. That allows to measure IPI delivery time precisely"
* tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) trace,smp: Add tracepoints for scheduling remotelly called functions trace,smp: Add tracepoints around remotelly called functions MAINTAINERS: Add CPU HOTPLUG entry x86/smpboot: Fix the parallel bringup decision x86/realmode: Make stack lock work in trampoline_compat() x86/smp: Initialize cpu_primary_thread_mask late cpu/hotplug: Fix off by one in cpuhp_bringup_mask() x86/apic: Fix use of X{,2}APIC_ENABLE in asm with older binutils x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it x86/smpboot: Support parallel startup of secondary CPUs x86/smpboot: Implement a bit spinlock to protect the realmode stack x86/apic: Save the APIC virtual base address cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE x86/apic: Provide cpu_primary_thread mask x86/smpboot: Enable split CPU startup cpu/hotplug: Provide a split up CPUHP_BRINGUP mechanism cpu/hotplug: Reset task stack state in _cpu_up() cpu/hotplug: Remove unused state functions riscv: Switch to hotplug core state synchronization parisc: Switch to hotplug core state synchronization ...
show more ...
|
Revision tags: v6.4, v6.1.35 |
|
#
45e34c8a |
| 15-Jun-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/smp: Put CPUs into INIT on shutdown if possible
Parking CPUs in a HLT loop is not completely safe vs. kexec() as HLT can resume execution due to NMI, SMI and MCE, which has the same issue as the
x86/smp: Put CPUs into INIT on shutdown if possible
Parking CPUs in a HLT loop is not completely safe vs. kexec() as HLT can resume execution due to NMI, SMI and MCE, which has the same issue as the MWAIT loop.
Kicking the secondary CPUs into INIT makes this safe against NMI and SMI.
A broadcast MCE will take the machine down, but a broadcast MCE which makes HLT resume and execute overwritten text, pagetables or data will end up in a disaster too.
So chose the lesser of two evils and kick the secondary CPUs into INIT unless the system has installed special wakeup mechanisms which are not using INIT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230615193330.608657211@linutronix.de
show more ...
|
#
d7893093 |
| 15-Jun-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in mwait_play_dead() it can end up in a triple fault
x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess.
When kexec() is executed on a system with offline CPUs, which are parked in mwait_play_dead() it can end up in a triple fault during the bootup of the kexec kernel or cause hard to diagnose data corruption.
The reason is that kexec() eventually overwrites the previous kernel's text, page tables, data and stack. If it writes to the cache line which is monitored by a previously offlined CPU, MWAIT resumes execution and ends up executing the wrong text, dereferencing overwritten page tables or corrupting the kexec kernels data.
Cure this by bringing the offlined CPUs out of MWAIT into HLT.
Write to the monitored cache line of each offline CPU, which makes MWAIT resume execution. The written control word tells the offlined CPUs to issue HLT, which does not have the MWAIT problem.
That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as those make it come out of HLT.
A follow up change will put them into INIT, which protects at least against NMI and SMI.
Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case") Reported-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
show more ...
|
#
db6da59c |
| 15-Jun-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging to sync drm-misc-next-fixes with drm-misc-next.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.1.34 |
|
#
03c60192 |
| 12-Jun-2023 |
Dmitry Baryshkov <dmitry.baryshkov@linaro.org> |
Merge branch 'drm-next' of git://anongit.freedesktop.org/drm/drm into msm-next-lumag-base
Merge the drm-next tree to pick up the DRM DSC helpers (merged via drm-intel-next tree). MSM DSC v1.2 patche
Merge branch 'drm-next' of git://anongit.freedesktop.org/drm/drm into msm-next-lumag-base
Merge the drm-next tree to pick up the DRM DSC helpers (merged via drm-intel-next tree). MSM DSC v1.2 patches depend on these helpers.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
show more ...
|
Revision tags: v6.1.33 |
|
#
5c680050 |
| 06-Jun-2023 |
Miquel Raynal <miquel.raynal@bootlin.com> |
Merge tag 'v6.4-rc4' into wpan-next/staging
Linux 6.4-rc4
|