Revision tags: v6.6.26, v6.6.25, v6.6.24, v6.6.23
# 1c42ff89 | 11-Mar-2024 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/bhi: Mitigate KVM by default
commit 95a6ccbdc7199a14b71ad8901cb788ba7fb5167b upstream.
BHI mitigation mode spectre_bhi=auto does not deploy the software mitigation by default. In a cloud environment, a likely scenario is that userspace is trusted but the guests are not. Deploying a system-wide mitigation in such cases is not desirable.
Update the auto mode to unconditionally mitigate against malicious guests. In auto mode, also deploy the software sequence at VMexit when the hardware mitigation is not available. Unlike the force =on mode, the software sequence is not deployed at syscalls in auto mode.
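As a rough sketch of that policy (illustrative only; the variable, flag and helper names here are assumptions, not necessarily what the patch uses):
    /* auto mode: always mitigate guests, even without hardware support */
    if (bhi_mitigation == BHI_MITIGATION_AUTO) {
            if (boot_cpu_has(X86_FEATURE_BHI_CTRL)) {
                    /* hardware control covers syscalls and VMexit alike */
                    x86_spec_ctrl_base |= SPEC_CTRL_BHI_DIS_S;
            } else {
                    /* no hardware support: run the software sequence at
                     * VMexit only, protecting the host from guests */
                    setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
            }
    }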
Suggested-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# d414b401 | 11-Mar-2024 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/bhi: Add BHI mitigation knob
commit ec9404e40e8f36421a2b66ecb76dc2209fe7f3ef upstream.
Branch history clearing software sequences and hardware control BHI_DIS_S were defined to mitigate Branch History Injection (BHI).
Add cmdline spectre_bhi={on|off|auto} to control BHI mitigation:
auto - Deploy the hardware mitigation BHI_DIS_S, if available.
on   - Deploy the hardware mitigation BHI_DIS_S, if available, otherwise deploy the software sequence at syscall entry and VMexit.
off  - Turn off BHI mitigation.
The default is auto mode, which does not deploy the software sequence mitigation. This is because of the hardening done in the syscall dispatch path, which is the likely target of BHI.
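A minimal sketch of how such a knob is typically wired up via early_param() (the enum and handler names are assumptions):
    enum bhi_mitigations {
            BHI_MITIGATION_OFF,
            BHI_MITIGATION_ON,
            BHI_MITIGATION_AUTO,
    };
    static enum bhi_mitigations bhi_mitigation __ro_after_init = BHI_MITIGATION_AUTO;

    static int __init spectre_bhi_parse_cmdline(char *str)
    {
            if (!str)
                    return -EINVAL;

            if (!strcmp(str, "off"))
                    bhi_mitigation = BHI_MITIGATION_OFF;
            else if (!strcmp(str, "on"))
                    bhi_mitigation = BHI_MITIGATION_ON;
            else if (!strcmp(str, "auto"))
                    bhi_mitigation = BHI_MITIGATION_AUTO;
            else
                    pr_err("Ignoring unknown spectre_bhi option (%s)\n", str);

            return 0;
    }
    early_param("spectre_bhi", spectre_bhi_parse_cmdline);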
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 118794d0 | 11-Mar-2024 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/bhi: Enumerate Branch History Injection (BHI) bug
commit be482ff9500999f56093738f9219bbabc729d163 upstream.
Mitigation for BHI is selected based on the bug enumeration. Add the bits needed to enumerate the BHI bug.
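A hedged sketch of what the enumeration amounts to (BHI affects eIBRS parts; the ARCH_CAP_BHI_NO opt-out bit and the exact condition are assumptions from my reading):
    /* mark the CPU as affected unless firmware opts out via BHI_NO */
    if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) &&
        !(ia32_cap & ARCH_CAP_BHI_NO))
            setup_force_cpu_bug(X86_BUG_BHI);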
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# c6e3d590 | 13-Mar-2024 | Daniel Sneddon <daniel.sneddon@linux.intel.com>
x86/bhi: Define SPEC_CTRL_BHI_DIS_S
commit 0f4a837615ff925ba62648d280a861adf1582df7 upstream.
Newer processors support a hardware control BHI_DIS_S to mitigate Branch History Injection (BHI). Setting BHI_DIS_S protects the kernel from userspace BHI attacks without having to manually overwrite the branch history.
Define MSR_SPEC_CTRL bit BHI_DIS_S and its enumeration CPUID.BHI_CTRL. Mitigation is enabled later.
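For reference, a sketch of the MSR bit definition (the bit position, 10, and the CPUID location, CPUID.(EAX=7,ECX=2):EDX, reflect my reading and should be treated as assumptions):
    /* MSR_SPEC_CTRL bit: disable branch-history-based behavior in ring0 */
    #define SPEC_CTRL_BHI_DIS_S_SHIFT    10
    #define SPEC_CTRL_BHI_DIS_S          BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)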
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# eb36b0dc | 11-Mar-2024 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/bhi: Add support for clearing branch history at syscall entry
commit 7390db8aea0d64e9deb28b8e1ce716f5020c7ee5 upstream.
Branch History Injection (BHI) attacks may allow a malicious application to influence indirect branch prediction in the kernel by poisoning the branch history. eIBRS isolates indirect branch targets in ring0. The BHB can still influence the choice of indirect branch predictor entry, and although branch predictor entries are isolated between modes when eIBRS is enabled, the BHB itself is not isolated between modes.
Alder Lake and newer processors support a hardware control BHI_DIS_S to mitigate BHI. For older parts that don't support BHI_DIS_S, Intel has released a software sequence to clear the branch history. Add support to execute the software sequence at syscall entry and VMexit to overwrite the branch history.
For now, branch history is not cleared at interrupt entry, as malicious applications are not believed to have sufficient control over the registers, since previous register state is cleared at interrupt entry. Researchers continue to poke at this area and it may become necessary to clear at interrupt entry as well in the future.
This mitigation is only defined here. It is enabled later.
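Conceptually, the software sequence fills the branch history with benign taken branches so that attacker-controlled history is overwritten. A deliberately simplified sketch, not Intel's published sequence:
    static __always_inline void clear_branch_history_sketch(void)
    {
            /* each taken backward jnz inserts a benign BHB entry; the
             * real sequence is longer and carefully structured */
            asm volatile("mov $64, %%ecx\n\t"
                         "1: dec %%ecx\n\t"
                         "jnz 1b"
                         ::: "ecx", "cc", "memory");
    }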
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Co-developed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 55ed6c47 | 25-Mar-2024 | Sandipan Das <sandipan.das@amd.com>
perf/x86/amd/lbr: Use freeze based on availability
[ Upstream commit 598c2fafc06fe5c56a1a415fb7b544b31453d637 ]
Currently, the LBR code assumes that LBR Freeze is supported on all processors when X86_FEATURE_AMD_LBR_V2 is available i.e. CPUID leaf 0x80000022[EAX] bit 1 is set. This is incorrect as the availability of the feature is additionally dependent on CPUID leaf 0x80000022[EAX] bit 2 being set, which may not be set for all Zen 4 processors.
Define a new feature bit for LBR and PMC freeze and set the freeze enable bit (FLBRI) in DebugCtl (MSR 0x1d9) conditionally.
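Roughly, the conditional enable might look like this sketch (the feature flag and MSR bit names are assumptions based on the commit text):
    u64 dbg_ctl;

    /* set FLBRI only when the CPU actually advertises freeze support */
    if (cpu_feature_enabled(X86_FEATURE_AMD_LBR_PMC_FREEZE)) {
            rdmsrl(MSR_IA32_DEBUGCTLMSR, dbg_ctl);
            wrmsrl(MSR_IA32_DEBUGCTLMSR, dbg_ctl | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
    }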
It should still be possible to use LBR without freeze for profile-guided optimization of user programs by using a user-only branch filter during profiling. When the user-only filter is enabled, branches are no longer recorded after the transition to CPL 0 upon PMI arrival. When branch entries are read in the PMI handler, the branch stack does not change.
E.g.
$ perf record -j any,u -e ex_ret_brn_tkn ./workload
Since the feature bit is visible under flags in /proc/cpuinfo, it can be used to determine the feasibility of use-cases which require LBR Freeze to be supported by the hardware such as profile-guided optimization of kernels.
Fixes: ca5b7c0d9621 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support") Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/69a453c97cfd11c6f2584b19f937fe6df741510f.1711091584.git.sandipan.das@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
# 56e7373f | 25-Mar-2024 | Sandipan Das <sandipan.das@amd.com>
x86/cpufeatures: Add new word for scattered features
[ Upstream commit 7f274e609f3d5f45c22b1dd59053f6764458b492 ]
Add a new word for scattered features because all free bits among the existing Linux-defined auxiliary flags have been exhausted.
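Schematically, in cpufeatures.h (a sketch; the word number and flag placement are assumptions):
    /* one more u32 word worth of feature bits */
    #define NCAPINTS                        22

    /* Word 21: Linux-defined, for scattered CPUID features */
    #define X86_FEATURE_AMD_LBR_PMC_FREEZE  (21*32 + 0)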
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/8380d2a0da469a1f0ad75b8954a79fb689599ff6.1711091584.git.sandipan.das@amd.com Stable-dep-of: 598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability") Signed-off-by: Sasha Levin <sashal@kernel.org>
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4
# d2be2f87 | 02-Dec-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/CPU/AMD: Add X86_FEATURE_ZEN1
[ Upstream commit 232afb557835d6f6859c73bf610bad308c96b131 ]
Add a synthetic feature flag specifically for first-generation Zen machines. There's a need for a generic flag covering all Zen generations, so make X86_FEATURE_ZEN be that flag.
Fixes: 30fa92832f40 ("x86/CPU/AMD: Add ZenX generations flags") Suggested-by: Brian Gerst <brgerst@gmail.com> Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/dc3835e3-0731-4230-bbb9-336bbe3d042b@amd.com Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later") Signed-off-by: Sasha Levin <sashal@kernel.org>
Revision tags: v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10
# 936e59cb | 31-Oct-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/CPU/AMD: Add ZenX generations flags
[ Upstream commit 30fa92832f405d5ac9f263e99f62445fa3084008 ]
Add X86_FEATURE flags for each Zen generation. They should be used from now on instead of checking family/model/stepping (f/m/s).
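The intended usage change, sketched (zen1_quirk() and the f/m/s values are hypothetical, for illustration only):
    /* before: match on family/model/stepping */
    if (c->x86 == 0x17 && c->x86_model <= 0x0f)
            zen1_quirk(c);

    /* after: ask for the generation directly */
    if (cpu_feature_enabled(X86_FEATURE_ZEN1))
            zen1_quirk(c);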
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lore.kernel.org/r/20231120104152.13740-2-bp@alien8.de Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later") Signed-off-by: Sasha Levin <sashal@kernel.org>
# 77018fb9 | 11-Mar-2024 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/rfds: Mitigate Register File Data Sampling (RFDS)
commit 8076fcde016c9c0e0660543e67bff86cb48a7c9c upstream.
RFDS is a CPU vulnerability that may allow userspace to infer kernel stale data previously used in floating point registers, vector registers and integer registers. RFDS only affects certain Intel Atom processors.
Intel released a microcode update that uses the VERW instruction to clear the affected CPU buffers. Unlike MDS, none of the affected cores support SMT.
Add RFDS bug infrastructure and enable the VERW-based mitigation by default, which clears the affected buffers just before exiting to userspace. Also add sysfs reporting and the cmdline parameter "reg_file_data_sampling" to control the mitigation.
For details see: Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst
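For illustration, such enumeration typically hangs off IA32_ARCH_CAPABILITIES bits (a sketch; the bit positions are my recollection and should be treated as assumptions):
    #define ARCH_CAP_RFDS_NO     BIT(27)  /* CPU not affected by RFDS */
    #define ARCH_CAP_RFDS_CLEAR  BIT(28)  /* VERW clears the affected registers */

    if (!(ia32_cap & ARCH_CAP_RFDS_NO))
            setup_force_cpu_bug(X86_BUG_RFDS);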
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 48985d64 | 13-Feb-2024 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/bugs: Add asm helpers for executing VERW
commit baf8361e54550a48a7087b603313ad013cc13386 upstream.
MDS mitigation requires clearing the CPU buffers before returning to user. This needs to be done late in the exit-to-user path. Current location of VERW leaves a possibility of kernel data ending up in CPU buffers for memory accesses done after VERW such as:
1. Kernel data accessed by an NMI between VERW and return-to-user can remain in CPU buffers since NMI returning to kernel does not execute VERW to clear CPU buffers.
2. Alyssa reported that after VERW is executed, CONFIG_GCC_PLUGIN_STACKLEAK=y scrubs the stack used by a system call. Memory accesses during stack scrubbing can move kernel stack contents into CPU buffers.
3. When caller-saved registers are restored after a return from a function executing VERW, the kernel stack accesses can remain in CPU buffers (since they occur after VERW).
To fix this, VERW needs to be moved very late into the exit-to-user path.
In preparation for moving VERW to entry/exit asm code, create macros that can be used in asm. Also make VERW patching depend on a new feature flag X86_FEATURE_CLEAR_CPU_BUF.
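For context, the long-standing C-side idiom uses VERW with a memory operand pointing at a valid selector; the new macros provide the same operation in asm via alternatives. A sketch of the C helper (close to the existing kernel helper, shown here for illustration):
    static inline void mds_clear_cpu_buffers(void)
    {
            static const u16 ds = __KERNEL_DS;

            /* VERW with a memory operand has the buffer-clearing side
             * effect on affected CPUs */
            asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
    }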
Reported-by: Alyssa Milburn <alyssa.milburn@intel.com> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20240213-delay-verw-v8-1-a6216d83edb7%40linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revision tags: v6.6
# ccce12ec | 27-Oct-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/barrier: Do not serialize MSR accesses on AMD
commit 04c3024560d3a14acd18d0a51a1d0a89d29b7eb5 upstream.
AMD does not have the requirement for a synchronization barrier when accessing a certain group of MSRs. Do not incur that unnecessary penalty there.
There will be a CPUID bit which explicitly states that an MFENCE is not needed. Once that bit is added to the APM, this will be extended with it.
While at it, move to processor.h to avoid include hell. Untangling that file properly is a matter for another day.
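The shape of the change, sketched (the feature-flag name is an assumption; the point is that the fence gets patched out where it is not needed):
    static inline void weak_wrmsr_fence(void)
    {
            /* serialize the MSR write only on CPUs that require it */
            alternative("mfence; lfence", "", ALT_NOT(X86_FEATURE_APIC_MSR_FENCE));
    }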
Some notes on the performance aspect of why this is relevant, courtesy of Kishon VijayAbraham <Kishon.VijayAbraham@amd.com>:
On an AMD Zen4 system with 96 cores, a modified ipi-bench[1] on a VM shows the x2AVIC IPI rate is 3% to 4% lower than the AVIC IPI rate. The ipi-bench is modified so that the IPIs are sent between two vCPUs in the same CCX. This also requires pinning the vCPUs to physical cores to prevent any latencies. This simulates the use case of pinning vCPUs to the threads of a single CCX to avoid interrupt IPI latency.
In order to avoid run-to-run variance (for both x2AVIC and AVIC), the below configurations are done:
1) Disable Power States in BIOS (to prevent the system from going to lower power state)
2) Run the system at fixed frequency 2500MHz (to prevent the system from increasing the frequency when the load is more)
With the above configuration:
*) Performance measured using ipi-bench for AVIC: Average Latency: 1124.98ns [Time to send IPI from one vCPU to another vCPU]
Cumulative throughput: 42.6759M/s [Total number of IPIs sent in a second from 48 vCPUs simultaneously]
*) Performance measured using ipi-bench for x2AVIC: Average Latency: 1172.42ns [Time to send IPI from one vCPU to another vCPU]
Cumulative throughput: 40.9432M/s [Total number of IPIs sent in a second from 48 vCPUs simultaneously]
From above, x2AVIC latency is ~4% more than AVIC. However, the expectation is for x2AVIC performance to be better than or equivalent to AVIC. Upon analyzing the perf captures, it is observed that significant time is spent in weak_wrmsr_fence() invoked by x2apic_send_IPI().
With the fix to skip weak_wrmsr_fence():
*) Performance measured using ipi-bench for x2AVIC: Average Latency: 1117.44ns [Time to send IPI from one vCPU to another vCPU]
Cumulative throughput: 42.9608M/s [Total number of IPIs sent in a second from 48 vCPUs simultaneously]
Comparing the performance of x2AVIC with and without the fix, it can be seen the performance improves by ~4%.
Performance captured using an unmodified ipi-bench using the 'mesh-ipi' option with and without weak_wrmsr_fence() on a Zen4 system also showed significant performance improvement without weak_wrmsr_fence(). The 'mesh-ipi' option ignores CCX or CCD and just picks random vCPU.
Average throughput (10 iterations) with weak_wrmsr_fence(), Cumulative throughput: 4933374 IPI/s
Average throughput (10 iterations) without weak_wrmsr_fence(), Cumulative throughput: 6355156 IPI/s
[1] https://github.com/bytedance/kvm-utils/tree/master/microbenchmark/ipi-bench
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230622095212.20940-1-bp@alien8.de Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revision tags: v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44
# 77245f1c | 04-Aug-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/CPU/AMD: Do not leak quotient data after a division by 0
Under certain circumstances, an integer division by 0 which faults can leave stale quotient data from a previous division operation on Zen1 microarchitectures.
Do a dummy division 0/1 before returning from the #DE exception handler in order to avoid any leaks of potentially sensitive data.
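The fix amounts to something like this sketch (close to the actual helper as I recall it; the exact operand constraints are assumptions):
    static __always_inline void amd_clear_divider(void)
    {
            /* dummy 0/1 division to flush stale quotient data, patched
             * in only on parts with the erratum */
            asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
                         :: "a" (0), "d" (0), "r" (1));
    }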
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revision tags: v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39
# 54e3d943 | 18-Jul-2023 | Dave Hansen <dave.hansen@linux.intel.com>
x86/mm: Remove "INVPCID single" feature tracking
From: Dave Hansen <dave.hansen@linux.intel.com>
tl;dr: Replace a synthetic X86_FEATURE with a hardware X86_FEATURE and check of existing per-cpu state.
== Background ==
There are three features in play here:
1. Good old Page Table Isolation (PTI)
2. Process Context IDentifiers (PCIDs), which allow entries from multiple address spaces to be in the TLB at once
3. Support for the "Invalidate PCID" (INVPCID) instruction, specifically the "individual address" mode (aka. mode 0)
When all *three* of these are in place, INVPCID can and should be used to flush out individual addresses in the PTI user address space.
But there's a wrinkle or two: First, this INVPCID mode is dependent on CR4.PCIDE. Even if X86_FEATURE_INVPCID==1, the instruction may #GP without setting up CR4. Second, TLB flushing is done very early, even before CR4 is fully set up. That means even if PTI, PCID and INVPCID are supported, there is *still* a window where INVPCID can #GP.
== Problem ==
The current code seems to work, but mostly by chance and there are a bunch of ways it can go wrong. It's also somewhat hard to follow since X86_FEATURE_INVPCID_SINGLE is set far away from its lone user.
== Solution ==
Make "INVPCID single" more robust and easier to follow by placing all the logic in one place. Remove X86_FEATURE_INVPCID_SINGLE.
Make two explicit checks before using INVPCID (see the sketch below):
1. Check that the system supports INVPCID itself (boot_cpu_has()).
2. Then check the CR4.PCIDE shadow to ensure that the CPU can safely use INVPCID for individual address invalidation.
The CR4 check *always* works and is not affected by any X86_FEATURE_* twiddling or inconsistencies between the boot and secondary CPUs.
This has been tested on non-Meltdown hardware by using pti=on and then flipping PCID and INVPCID support with qemu.
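Put together, the two checks amount to a sketch like this (the helper and per-cpu names approximate the TLB-flush code and should be treated as assumptions):
    if (boot_cpu_has(X86_FEATURE_INVPCID) &&
        (this_cpu_read(cpu_tlbstate.cr4) & X86_CR4_PCIDE)) {
            /* safe: the instruction exists and CR4.PCIDE is already set
             * on this CPU, so mode 0 INVPCID cannot #GP */
            invpcid_flush_one(user_pcid(loaded_mm_asid), addr);
    }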
== Aside ==
How does this code even work today? By chance, I think. First, PTI is initialized around the same time that the boot CPU sets CR4.PCIDE=1. There are currently no TLB invalidations when PTI=1 but CR4.PCIDE=0. That means that the X86_FEATURE_INVPCID_SINGLE check is never even reached.
this_cpu_has() is also very nasty to use in this context because the boot CPU reaches here before cpu_data(0) has been initialized. It happens to work for X86_FEATURE_INVPCID_SINGLE since it's a software-defined feature but it would fall over for a hardware-derived X86_FEATURE.
Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20230718170630.7922E235%40davehans-spike.ostc.intel.com
Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35
# d1f85fbe | 15-Jun-2023 | Alexey Kardashevskiy <aik@amd.com>
KVM: SEV: Enable data breakpoints in SEV-ES
Add support for "DebugSwap for SEV-ES guests", which provides support for swapping DR[0-3] and DR[0-3]_ADDR_MASK on VMRUN and VMEXIT, i.e. allows KVM to expose debug capabilities to SEV-ES guests. Without DebugSwap support, the CPU doesn't save/load most _guest_ debug registers (except DR6/7), and KVM cannot manually context switch guest DRs due the VMSA being encrypted.
Enable DebugSwap if and only if the CPU also supports NoNestedDataBp, which causes the CPU to ignore nested #DBs, i.e. #DBs that occur when vectoring a #DB. Without NoNestedDataBp, a malicious guest can DoS the host by putting the CPU into an infinite loop of vectoring #DBs (see https://bugzilla.redhat.com/show_bug.cgi?id=1278496)
Set the features bit in sev_es_sync_vmsa(), which is the last point at which the VMSA is not yet encrypted, as sev_(es_)init_vmcb() (where most of the init happens) is called not only when the vCPU is initialised but also on intrahost migration, when the VMSA is already encrypted.
Eliminate DR7 intercepts as KVM can't modify guest DR7, and intercepting DR7 would completely defeat the purpose of enabling DebugSwap.
Make X86_FEATURE_DEBUG_SWAP appear in /proc/cpuinfo (by not adding "") to let the operator know if the VM can debug.
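The VMSA-side change boils down to a couple of lines (a sketch; SVM_SEV_FEAT_DEBUG_SWAP as a name is an assumption):
    /* must happen in sev_es_sync_vmsa(), while the VMSA is still plaintext */
    if (sev_es_debug_swap_enabled)
            save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;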
Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Link: https://lore.kernel.org/r/20230615063757.3039121-7-aik@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>
# d893832d | 07-Jul-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/srso: Add IBPB on VMEXIT
Add the option to flush IBPB only on VMEXIT in order to protect against malicious guests when one otherwise trusts the software that runs on the hypervisor.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
# 1b5277c0 | 29-Jun-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/srso: Add SRSO_NO support
Add support for the CPUID flag which denotes that the CPU is not affected by SRSO.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
# 79113e40 | 18-Jul-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/srso: Add IBPB_BRTYPE support
Add support for the synthetic CPUID flag which "if this bit is 1, it indicates that MSR 49h (PRED_CMD) bit 0 (IBPB) flushes all branch type predictions from the CPU branch predictor."
This flag is there so that this capability in guests can be detected easily (otherwise one would have to track microcode revisions which is impossible for guests).
It is also needed only for Zen3 and -4. The other two (Zen1 and -2) always flush branch type predictions by default.
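The IBPB in question is issued by writing PRED_CMD; with IBPB_BRTYPE enumerated, the very same write also flushes branch type predictions (standard kernel idiom, shown for illustration):
    /* barrier for indirect branch predictions; on IBPB_BRTYPE parts this
     * also flushes branch *type* predictions */
    wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);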
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
# fb3bd914 | 28-Jun-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/srso: Add a Speculative RAS Overflow mitigation
Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors.
The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence.
To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias() which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns.
In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to the Retbleed one: srso_untrain_ret() and srso_safe_ret().
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
# 8974eb58 | 12-Jul-2023 | Daniel Sneddon <daniel.sneddon@linux.intel.com>
x86/speculation: Add Gather Data Sampling mitigation
Gather Data Sampling (GDS) is a hardware vulnerability which allows unprivileged speculative access to data which was previously stored in vector registers.
Intel processors that support AVX2 and AVX512 have gather instructions that fetch non-contiguous data elements from memory. On vulnerable hardware, when a gather instruction is transiently executed and encounters a fault, stale data from architectural or internal vector registers may get transiently stored to the destination vector register allowing an attacker to infer the stale data using typical side channel techniques like cache timing attacks.
This mitigation is different from many earlier ones for two reasons. First, it is enabled by default and a bit must be set to *DISABLE* it. This is the opposite of normal mitigation polarity. This means GDS can be mitigated simply by updating microcode and leaving the new control bit alone.
Second, GDS has a "lock" bit. This lock bit is there because the mitigation affects the hardware security features KeyLocker and SGX. It needs to be enabled and *STAY* enabled for these features to be mitigated against GDS.
The mitigation is enabled in the microcode by default. Disable it by setting gather_data_sampling=off or by disabling all mitigations with mitigations=off. The mitigation status can be checked by reading:
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
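The control and lock bits live in an MSR; a sketch of the layout (MCU_OPT_CTRL per related code, with bit positions as assumptions):
    #define MSR_IA32_MCU_OPT_CTRL  0x00000123
    #define GDS_MITG_DIS           BIT(4)  /* set to disable the mitigation */
    #define GDS_MITG_LOCKED        BIT(5)  /* mitigation locked until reset */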
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
# 0e52740f | 08-Jul-2023 | Borislav Petkov (AMD) <bp@alien8.de>
x86/bugs: Increase the x86 bugs vector size to two u32s
There was never a doubt in my mind that they would not fit into a single u32 eventually.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Revision tags: v6.1.34
# 701fb66d | 12-Jun-2023 | Rick Edgecombe <rick.p.edgecombe@intel.com>
x86/cpufeatures: Add CPU feature flags for shadow stacks
The Control-Flow Enforcement Technology contains two related features, one of which is Shadow Stacks. Future patches will utilize this feature for shadow stack support in KVM, so add a CPU feature flag for Shadow Stacks (CPUID.(EAX=7,ECX=0):ECX[bit 7]).
To protect shadow stack state from malicious modification, the registers are only accessible in supervisor mode. This implementation context-switches the registers with XSAVES. Make X86_FEATURE_SHSTK depend on XSAVES.
The shadow stack feature, enumerated by the CPUID bit described above, encompasses both supervisor and userspace support for shadow stack. In near future patches, only userspace shadow stack will be enabled. In expectation of future supervisor shadow stack support, create a software CPU capability to enumerate kernel utilization of userspace shadow stack support. This user shadow stack bit should depend on the HW "shstk" capability and that logic will be implemented in future patches.
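The XSAVES dependency is the kind of relationship expressed in the cpuid-deps table; a sketch:
    /* X86_FEATURE_SHSTK is cleared when XSAVES is not available */
    static const struct cpuid_dep cpuid_deps[] = {
            { X86_FEATURE_SHSTK,    X86_FEATURE_XSAVES },
            {}
    };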
Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: John Allen <john.allen@amd.com> Tested-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/20230613001108.3040476-9-rick.p.edgecombe%40intel.com
Revision tags: v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9
# 3d8f61bf | 24-Jan-2023 | Sean Christopherson <seanjc@google.com>
x86: KVM: Add common feature flag for AMD's PSFD
Use a common X86_FEATURE_* flag for AMD's PSFD, and suppress it from /proc/cpuinfo via the standard method of an empty string instead of hacking in a one-off "private" #define in KVM. The request that led to KVM defining its own flag was really just that the feature not show up in /proc/cpuinfo, and additional patches+discussions in the interim have clarified that defining flags in cpufeatures.h purely so that KVM can advertise features to userspace is ok so long as the kernel already uses a word to track the associated CPUID leaf.
No functional change intended.
Link: https://lore.kernel.org/all/d1b1e0da-29f0-c443-6c86-9549bbe1c79d@redhat.como Link: https://lore.kernel.org/all/YxGZH7aOXQF7Pu5q@nazgul.tnic Link: https://lore.kernel.org/all/Y3O7UYWfOLfJkwM%2F@zn.tnic Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20230124194519.2893234-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
# 3763bf58 | 22-Mar-2023 | Sean Christopherson <seanjc@google.com>
x86/cpufeatures: Redefine synthetic virtual NMI bit as AMD's "real" vNMI
The existing X86_FEATURE_VNMI is a synthetic feature flag that exists purely to maintain /proc/cpuinfo's ABI; the "real" Intel vNMI feature flag is tracked as VMX_FEATURE_VIRTUAL_NMIS, as the feature is enumerated through VMX MSRs, not CPUID.
AMD is also gaining virtual NMI support, but in true VMX vs. SVM form, enumerates support through CPUID, i.e. wants a real feature flag for vNMI.
Redefine the synthetic X86_FEATURE_VNMI to AMD's real CPUID bit to avoid having both X86_FEATURE_VNMI and e.g. X86_FEATURE_AMD_VNMI.
Signed-off-by: Sean Christopherson <seanjc@google.com>
# 6449dcb0 | 12-Mar-2023 | Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
x86: CPUID and CR3/CR4 flags for Linear Address Masking
Enumerate Linear Address Masking and provide defines for CR3 and CR4 flags.
The new CONFIG_ADDRESS_MASKING option enables the feature support in the kernel.
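For reference, a sketch of the flag definitions (the bit positions, CR4 bit 28 and CR3 bits 61/62, are my recollection and should be treated as assumptions):
    #define X86_CR4_LAM_SUP_BIT  28             /* LAM for supervisor pointers */
    #define X86_CR4_LAM_SUP      _BITUL(X86_CR4_LAM_SUP_BIT)
    #define X86_CR3_LAM_U48      _BITULL(61)    /* user LAM, 48-bit masking */
    #define X86_CR3_LAM_U57      _BITULL(62)    /* user LAM, 57-bit masking */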
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/all/20230312112612.31869-4-kirill.shutemov%40linux.intel.com