3763bf58 | 22-Mar-2023 | Sean Christopherson <seanjc@google.com>

x86/cpufeatures: Redefine synthetic virtual NMI bit as AMD's "real" vNMI
The existing X86_FEATURE_VNMI is a synthetic feature flag that exists purely to maintain /proc/cpuinfo's ABI; the "real" Intel vNMI feature flag is tracked as VMX_FEATURE_VIRTUAL_NMIS, since the feature is enumerated through VMX MSRs, not CPUID.
AMD is also gaining virtual NMI support, but in true VMX vs. SVM form it enumerates support through CPUID, i.e. it wants a real feature flag for vNMI.
Redefine the synthetic X86_FEATURE_VNMI as AMD's real CPUID bit to avoid having both X86_FEATURE_VNMI and e.g. X86_FEATURE_AMD_VNMI.
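As a rough sketch, the redefinition might look like the following in cpufeatures.h, assuming AMD's vNMI is enumerated in the SVM feature word (CPUID 0x8000000A EDX); the word/bit position here is an assumption for illustration:

    /* Illustrative only; the committed word/bit may differ: */
    #define X86_FEATURE_VNMI	(15*32+25) /* Virtual NMI (AMD), CPUID 0x8000000A EDX */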
Signed-off-by: Sean Christopherson <seanjc@google.com>

6449dcb0 | 12-Mar-2023 | Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

x86: CPUID and CR3/CR4 flags for Linear Address Masking
Enumerate Linear Address Masking and provide defines for CR3 and CR4 flags.
The new CONFIG_ADDRESS_MASKING option enables the feature support in the kernel.
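A sketch of the new defines, with bit positions as given in Intel's LAM documentation (treat the exact macro names here as assumptions):

    #define X86_CR3_LAM_U57_BIT	62 /* Activate LAM for userspace, bits 62:57 masked */
    #define X86_CR3_LAM_U48_BIT	61 /* Activate LAM for userspace, bits 62:48 masked */
    #define X86_CR4_LAM_SUP_BIT	28 /* Activate LAM for supervisor pointers */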
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/all/20230312112612.31869-4-kirill.shutemov%40linux.intel.com

be8de49b | 09-Feb-2023 | Tom Lendacky <thomas.lendacky@amd.com>

x86/speculation: Identify processors vulnerable to SMT RSB predictions
Certain AMD processors are vulnerable to a cross-thread return address predictions bug. When running in SMT mode and one of the sibling threads transitions out of C0 state, the other sibling thread could use return target predictions from the sibling thread that transitioned out of C0.
The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB when context switching to the idle thread. However, KVM allows a VMM to prevent exiting guest mode when transitioning out of C0. A guest could act maliciously in this situation, so create a new x86 BUG that can be used to detect if the processor is vulnerable.
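A minimal sketch of how a consumer such as KVM might key off the new bug bit (illustrative; the warning text is made up here):

    if (boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible())
    	pr_warn_once("SMT on: guest RET predictions may leak across sibling threads\n");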
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <91cec885656ca1fcd4f0185ce403a53dd9edecb7.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e7862eda | 24-Jan-2023 | Kim Phillips <kim.phillips@amd.com>

x86/cpu: Support AMD Automatic IBRS
The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, like Intel's Enhanced IBRS, h/w manages its IBRS mitigation resources automatically across CPL transitions.
The feature is advertised by CPUID_Fn80000021_EAX bit 8 and is enabled by setting MSR C000_0080 (EFER) bit 21.
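A sketch of the enablement just described, assuming the flag and bit names below (they are illustrative):

    #define _EFER_AUTOIBRS	21	/* EFER bit 21, per the text above */

    if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
    	msr_set_bit(MSR_EFER, _EFER_AUTOIBRS);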
Enable Automatic IBRS by default if the CPU feature is present. It typically provides greater performance than the incumbent generic retpoline mitigation.
Reuse the SPECTRE_V2_EIBRS spectre_v2_mitigation enum. AMD Automatic IBRS and Intel Enhanced IBRS have similar enablement. Add NO_EIBRS_PBRSB to cpu_vuln_whitelist, since AMD Automatic IBRS isn't affected by PBRSB-eIBRS.
The kernel command line option spectre_v2=eibrs is used to select AMD Automatic IBRS, if available.
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20230124163319.2277355-8-kim.phillips@amd.com

faabfcb1 | 24-Jan-2023 | Kim Phillips <kim.phillips@amd.com>

x86/cpu, kvm: Add the SMM_CTL MSR not present feature
The SMM_CTL MSR not present feature was being open-coded for KVM. Add it to its newly added CPUID leaf 0x80000021 EAX proper.
Also drop the bit description comments now that the code is more self-describing.
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230124163319.2277355-7-kim.phillips@amd.com

5b909d4a | 24-Jan-2023 | Kim Phillips <kim.phillips@amd.com>

x86/cpu, kvm: Add the Null Selector Clears Base feature
The Null Selector Clears Base feature was being open-coded for KVM. Add it to its newly added native CPUID leaf 0x80000021 EAX proper.
Also drop the bit description comments now that it's more self-describing.
[ bp: Convert test in check_null_seg_clears_base() too. ]
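The converted test might look roughly like this (a sketch; the probing fallback is elided):

    static void check_null_seg_clears_base(struct cpuinfo_x86 *c)
    {
    	/* Skip the probe when hardware declares the behavior. */
    	if (cpu_has(c, X86_FEATURE_NULL_SEL_CLR_BASE))
    		return;
    	/* ... otherwise probe whether a null selector clears the base ... */
    }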
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230124163319.2277355-6-kim.phillips@amd.com

84168ae7 | 24-Jan-2023 | Kim Phillips <kim.phillips@amd.com>

x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf
The LFENCE always serializing feature bit was defined as scattered LFENCE_RDTSC and its native leaf bit position open-coded for KVM. Add it to its newly added CPUID leaf 0x80000021 EAX proper. With LFENCE_RDTSC in its proper place, the kernel's set_cpu_cap() will effectively synthesize the feature for KVM going forward.
Also, DE_CFG[1] doesn't need to be set on such CPUs anymore.
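A sketch of the resulting logic in the AMD init path (the macro names are the likely ones, but treat them as assumptions):

    /* If LFENCE already serializes, DE_CFG[1] need not be set. */
    if (!cpu_has(c, X86_FEATURE_LFENCE_RDTSC))
    	msr_set_bit(MSR_AMD64_DE_CFG, MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT);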
[ bp: Massage and merge diff from Sean. ]
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230124163319.2277355-5-kim.phillips@amd.com

a9dc9ec5 | 24-Jan-2023 | Kim Phillips <kim.phillips@amd.com>

x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature
The "Processor ignores nested data breakpoints" feature was being open-coded for KVM. Add the feature to its newly introduced CPUID leaf 0x80000021 EAX proper.
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230124163319.2277355-4-kim.phillips@amd.com

8415a748 | 10-Jan-2023 | Kim Phillips <kim.phillips@amd.com>

x86/cpu, kvm: Add support for CPUID_80000021_EAX
Add support for CPUID leaf 0x80000021, EAX. The majority of the features will be used in the kernel and thus a separate leaf is appropriate.
Include KVM's reverse_cpuid entry because features are used by VM guests, too.
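A sketch of the kernel-side enumeration, assuming the new capability word is named CPUID_8000_0021_EAX:

    /* In get_cpu_cap(), read the new leaf when the CPU advertises it: */
    if (c->extended_cpuid_level >= 0x80000021)
    	c->x86_capability[CPUID_8000_0021_EAX] = cpuid_eax(0x80000021);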
[ bp: Massage commit message. ]
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230124163319.2277355-2-kim.phillips@amd.com

f8df91e7 | 01-Sep-2022 | Jim Mattson <jmattson@google.com>

x86/cpufeatures: Add macros for Intel's new fast rep string features
KVM_GET_SUPPORTED_CPUID should reflect these host CPUID bits. The bits are already cached in word 12. Give the bits X86_FEATURE names, so that they can be easily referenced. Hide these bits from /proc/cpuinfo, since the host kernel makes no use of them at present.
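A sketch of the word-12 defines (the bit positions are assumptions here; a leading "" in the comment is the existing convention for hiding a flag from /proc/cpuinfo):

    #define X86_FEATURE_FZRM	(12*32+10) /* "" Fast zero-length REP MOVSB */
    #define X86_FEATURE_FSRS	(12*32+11) /* "" Fast short REP STOSB */
    #define X86_FEATURE_FSRC	(12*32+12) /* "" Fast short REP CMPSB/SCASB */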
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220901211811.2883855-1-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

78335aac | 13-Jan-2023 | Babu Moger <babu.moger@amd.com>

x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
Newer AMD processors support the new feature Bandwidth Monitoring Event Configuration (BMEC).
The feature support is identified via CPUID Fn8000_0020_EBX_x0[3]: EVT_CFG - Bandwidth Monitoring Event Configuration (BMEC)
The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes are set to count all the total and local reads/writes, respectively. With the introduction of slow memory, the two counters are not enough to count all the different types of memory events. Therefore, BMEC provides the option to configure mbm_total_bytes and mbm_local_bytes to count the specific type of events.
Each BMEC event has a configuration MSR which contains one field for each bandwidth type that can be used to configure the bandwidth event to track any combination of supported bandwidth types. The event will count requests from every bandwidth type bit that is set in the corresponding configuration register.
Following are the types of events supported:
    ==== ========================================================
    Bits Description
    ==== ========================================================
    6    Dirty Victims from the QOS domain to all types of memory
    5    Reads to slow memory in the non-local NUMA domain
    4    Reads to slow memory in the local NUMA domain
    3    Non-temporal writes to non-local NUMA domain
    2    Non-temporal writes to local NUMA domain
    1    Reads to memory in the non-local NUMA domain
    0    Reads to memory in the local NUMA domain
    ==== ========================================================
By default, the mbm_total_bytes configuration is set to 0x7F to count all the event types and the mbm_local_bytes configuration is set to 0x15 to count all the local memory events.
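For illustration, the event types above map onto configuration bits like this (the macro names are made up here):

    #define BMEC_DIRTY_VICTIMS_ALL_MEM	BIT(6)
    #define BMEC_READS_SLOW_MEM_REMOTE	BIT(5)
    #define BMEC_READS_SLOW_MEM_LOCAL	BIT(4)
    #define BMEC_NT_WRITES_REMOTE	BIT(3)
    #define BMEC_NT_WRITES_LOCAL	BIT(2)
    #define BMEC_READS_MEM_REMOTE	BIT(1)
    #define BMEC_READS_MEM_LOCAL	BIT(0)

    /* Defaults per the text: 0x7F counts every event type, 0x15 the local ones. */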
The feature description is available in the specification "AMD64 Technology Platform Quality of Service Extensions", Revision 1.03, at https://bugzilla.kernel.org/attachment.cgi?id=301365
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-5-babu.moger@amd.com

f334f723 | 13-Jan-2023 | Babu Moger <babu.moger@amd.com>

x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Add the new AMD feature X86_FEATURE_SMBA. With it, the QOS enforcement policies can be applied to external slow memory connected to the host. QOS enforcement is accomplished by assigning a Class Of Service (COS) to a processor and specifying allocations or limits for that COS for each resource to be allocated.
This feature is identified by the CPUID function 0x8000_0020_EBX_x0[2]: L3SBE - L3 external slow memory bandwidth enforcement.
CXL.memory is the only supported "slow" memory device. With SMBA, the hardware enables bandwidth allocation on the slow memory devices. If there are multiple slow memory devices in the system, then the throttling logic groups all the slow sources together and applies the limit on them as a whole.
The presence of the SMBA feature (with CXL.memory) is independent of whether a slow memory device is actually present in the system. If there is no slow memory in the system, setting an SMBA limit will have no impact on system performance.
Presence of CXL memory can be identified by the numactl command:
    $numactl -H
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
    node 0 size: 63678 MB
    node 0 free: 59542 MB
    node 1 cpus:
    node 1 size: 16122 MB
    node 1 free: 15627 MB
    node distances:
    node   0   1
      0:  10  50
      1:  50  10
The CPU list for the CXL memory node will be empty, and the CPU-to-CXL node distance is greater than the CPU-to-CPU distances. Node 1 has the CXL memory in this case. CXL memory can also be identified using the ACPI SRAT table and memory maps.
The feature description is available in the specification "AMD64 Technology Platform Quality of Service Extensions", Publication # 56375, Revision 1.03, Issue Date: February 2022, at https://bugzilla.kernel.org/attachment.cgi?id=301365
See also https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20230113152039.770054-3-babu.moger@amd.com

66056947 | 12-Jan-2023 | H. Peter Anvin (Intel) <hpa@zytor.com>

x86/cpufeature: Add the CPU feature bit for LKGS
Add the CPU feature bit for LKGS (Load "Kernel" GS).
The LKGS instruction is introduced with the Intel FRED (Flexible Return and Event Delivery) specification. Search for the latest FRED spec in most search engines with this search pattern:
site:intel.com FRED (flexible return and event delivery) specification
LKGS behaves like the MOV to GS instruction except that it loads the base address into the IA32_KERNEL_GS_BASE MSR instead of the GS segment’s descriptor cache, which is exactly what the Linux kernel does to load a user-level GS base. Thus, with LKGS, there is no need to SWAPGS away from the kernel GS base.
[ mingo: Minor tweaks to the description. ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112072032.35626-2-xin3.li@intel.com

a018d2e3 | 04-Jan-2023 | Kan Liang <kan.liang@linux.intel.com>

x86/cpufeatures: Add Architectural PerfMon Extension bit
CPUID.(EAX=07H, ECX=1):EAX[8] indicates whether the Architectural PerfMon Extension leaf (CPUID leaf 23) is supported.
The "X86_FEATURE_..., word 12" is already mirrored from CPUID "0x00000007:1 (EAX)". Add X86_FEATURE_ARCH_PERFMON_EXT under the "word 12" section.
The new Architectural PerfMon Extension leaf (CPUID leaf 23) will be supported in the perf_events subsystem later.
The feature will not appear in /proc/cpuinfo.
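A sketch of how a later consumer might gate reads of the new leaf on the feature bit (illustrative):

    unsigned int eax, ebx, ecx, edx;

    if (boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
    	cpuid_count(0x23, 0, &eax, &ebx, &ecx, &edx);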
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230104201349.1451191-4-kan.liang@linux.intel.com

5e85c4eb | 25-Nov-2022 | Jiaxi Chen <jiaxi.chen@linux.intel.com>

x86: KVM: Advertise AVX-IFMA CPUID to user space
AVX-IFMA is a new instruction in the latest Intel platform Sierra Forest. This instruction performs a packed multiply of unsigned 52-bit integers and adds the low/high 52-bit products to qword accumulators.
The bit definition: CPUID.(EAX=7,ECX=1):EAX[bit 23]
AVX-IFMA is on an expected-dense CPUID leaf and some other bits on this leaf have kernel usages. Given that, define this feature bit as X86_FEATURE_<name> in the kernel. Since AVX-IFMA itself has no true kernel usage and /proc/cpuinfo already has too many unreadable flags, hide this one in /proc/cpuinfo.
Advertise AVX-IFMA to KVM userspace. This is safe because there are no new VMX controls or additional host enabling required for guests to use this feature.
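A sketch of the KVM-side advertisement, assuming the flag joins KVM's existing CPUID_7_1_EAX capability mask (the exact set of neighboring flags is elided):

    kvm_cpu_cap_mask(CPUID_7_1_EAX,
    	F(AVX_VNNI) | F(AVX512_BF16) | F(AVX_IFMA)
    );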
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Message-Id: <20221125125845.1182922-6-jiaxi.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

af2872f6 | 25-Nov-2022 | Chang S. Bae <chang.seok.bae@intel.com>

x86: KVM: Advertise AMX-FP16 CPUID to user space
The latest Intel platform, Granite Rapids, introduces a new instruction, AMX-FP16, which performs dot products of two FP16 tiles and accumulates the results into a packed single-precision tile. AMX-FP16 adds FP16 capability and also allows an FP16 GPU-trained model to run faster without loss of accuracy or added SW overhead.
The bit definition: CPUID.(EAX=7,ECX=1):EAX[bit 21]
AMX-FP16 is on an expected-dense CPUID leaf and some other bits on this leaf have kernel usages. Given that, define this feature bit as X86_FEATURE_<name> in the kernel. Since AMX-FP16 itself has no true kernel usage and /proc/cpuinfo already has too many unreadable flags, hide this one in /proc/cpuinfo.
Advertise AMX-FP16 to KVM userspace. This is safe because there are no new VMX controls or additional host enabling required for guests to use this feature.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Message-Id: <20221125125845.1182922-5-jiaxi.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6a19d7aa | 25-Nov-2022 | Jiaxi Chen <jiaxi.chen@linux.intel.com>

x86: KVM: Advertise CMPccXADD CPUID to user space
CMPccXADD is a new set of instructions in the latest Intel platform Sierra Forest. This new instruction set includes a semaphore operation that can compare and add the operands if the condition is met, which can improve database performance.
The bit definition: CPUID.(EAX=7,ECX=1):EAX[bit 7]
CMPccXADD is on an expected-dense CPUID leaf and some other bits on this leaf have kernel usages. Given that, define this feature bit as X86_FEATURE_<name> in the kernel. Since CMPccXADD itself has no true kernel usage and /proc/cpuinfo already has too many unreadable flags, hide this one in /proc/cpuinfo.
Advertise CMPCCXADD to KVM userspace. This is safe because there are no new VMX controls or additional host enabling required for guests to use this feature.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Message-Id: <20221125125845.1182922-4-jiaxi.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

aaa65d17 | 15-Nov-2022 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

x86/tsx: Add a feature bit for TSX control MSR support
Support for the TSX control MSR is enumerated in MSR_IA32_ARCH_CAPABILITIES. This is different from how other CPU features are enumerated, i.e. via CPUID. Currently, a call to tsx_ctrl_is_supported() is required to enumerate the feature. In the absence of a feature bit for TSX control, any code that relies on checking feature bits directly will not work.
In preparation for adding a feature bit check in MSR save/restore during suspend/resume, set a new feature bit X86_FEATURE_TSX_CTRL when MSR_IA32_TSX_CTRL is present. Also make tsx_ctrl_is_supported() use the new feature bit to avoid any overhead of reading the MSR.
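A sketch of the enumeration just described (the helper and macro names are assumptions):

    u64 ia32_cap = x86_read_arch_cap_msr();

    if (ia32_cap & ARCH_CAP_TSX_CTRL_MSR)
    	setup_force_cpu_cap(X86_FEATURE_MSR_TSX_CTRL);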
[ bp: Remove tsx_ctrl_is_supported(), add room for two more feature bits in word 11 which are coming up in the next merge window. ]
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/de619764e1d98afbb7a5fa58424f1278ede37b45.1668539735.git.pawan.kumar.gupta@linux.intel.com

b1599915 | 06-Nov-2022 | Ingo Molnar <mingo@kernel.org>

x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
Reallocate a soft-cpufeatures bit allocated for call-depth tracking code, which clashes with this recent KVM/SGX patch being worked on:
KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest
Instead of reallocating cpufeatures bits in evil merges, make the allocation explicit.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

16a7fe37 | 31-Oct-2022 | Kai Huang <kai.huang@intel.com>

KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest
The new Asynchronous Exit (AEX) notification mechanism (AEX-notify) allows one enclave to receive a notification in the ERESUME after the enclave exit due to an AEX. EDECCSSA is a new SGX user leaf function (ENCLU[EDECCSSA]) to facilitate the AEX notification handling. The new EDECCSSA is enumerated via CPUID(EAX=0x12,ECX=0x0):EAX[11].
Besides allowing the new AEX-notify attribute to be reported to KVM guests, also allow reporting the new EDECCSSA user leaf function to them so the guest can fully utilize the AEX-notify mechanism.
Similar to existing X86_FEATURE_SGX1 and X86_FEATURE_SGX2, introduce a new scattered X86_FEATURE_SGX_EDECCSSA bit for the new EDECCSSA, and report it in KVM's supported CPUIDs.
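A sketch of what the scattered-bit declaration might look like, with the register and bit taken from the enumeration above (the table layout is an assumption):

    /* In the cpuid_bits[] table of arch/x86/kernel/cpu/scattered.c: */
    { X86_FEATURE_SGX_EDECCSSA, CPUID_EAX, 11, 0x00000012, 0 },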
Note, no additional KVM enabling is required to allow the guest to use EDECCSSA. It's impossible to trap ENCLU (without completely preventing the guest from using SGX). Advertise EDECCSSA as supported purely so that userspace doesn't need to special case EDECCSSA, i.e. doesn't need to manually check host CPUID.
The inability to trap ENCLU also means that KVM can't prevent the guest from using EDECCSSA, but that virtualization hole is benign as far as KVM is concerned. EDECCSSA is simply a fancy way to modify internal enclave state.
More background on how AEX-notify and EDECCSSA work:
SGX maintains a Current State Save Area Frame (CSSA) for each enclave thread. When an AEX happens, the enclave thread context is saved to the CSSA and the CSSA is increased by 1. For a normal ERESUME, which doesn't deliver an AEX notification, the saved thread context is restored from the previously saved SSA and the CSSA is decreased. If AEX-notify is enabled for an enclave, ERESUME acts differently: instead of restoring the saved thread context and decreasing the CSSA, it acts like EENTER, which doesn't decrease the CSSA but establishes a clean-slate thread context using the CSSA for the enclave to handle the notification. After some handling, the enclave must discard the newly established SSA and switch back to the previously saved SSA (upon AEX). Otherwise, the enclave will run out of SSA space upon further AEXs and eventually fail to run.
To solve this problem, the new EDECCSSA essentially decreases the CSSA. It can be used by the enclave notification handler to switch back to the previous saved SSA when needed, i.e. after it handles the notification.
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/all/20221101022422.858944-1-kai.huang%40intel.com

80e4c1cd | 15-Sep-2022 | Thomas Gleixner <tglx@linutronix.de>

x86/retbleed: Add X86_FEATURE_CALL_DEPTH
Intel SKL CPUs fall back to other predictors when the RSB underflows. The only microcode mitigation is IBRS, which is insanely expensive: it comes with performance drops of up to 30% depending on the workload.
A way less expensive, but nevertheless horrible mitigation is to track the call depth in software and overeagerly fill the RSB when returns underflow the software counter.
Provide a configuration symbol and a CPU misfeature bit.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111147.056176424@infradead.org

257449c6 | 11-Aug-2022 | Sandipan Das <sandipan.das@amd.com>

x86/cpufeatures: Add LbrExtV2 feature bit
CPUID leaf 0x80000022, i.e. ExtPerfMonAndDbg, advertises some new performance monitoring features for AMD processors.
Bit 1 of EAX indicates support for Last Branch Record Extension Version 2 (LbrExtV2) features. If found to be set during PMU initialization, the EBX bits of the same leaf can be used to determine the number of available LBR entries.
For better utilization of feature words, LbrExtV2 is added as a scattered feature bit.
[peterz: Rename to AMD_LBR_V2]
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/172d2b0df39306ed77221c45ee1aa62e8ae0548d.1660211399.git.sandipan.das@amd.com

7df54884 | 03-Aug-2022 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

x86/bugs: Add "unknown" reporting for MMIO Stale Data
Older Intel CPUs that are not in the affected processor list for MMIO Stale Data vulnerabilities currently report "Not affected" in sysfs, which may not be correct. Vulnerability status for these older CPUs is unknown.
Add known-not-affected CPUs to the whitelist. Report "unknown" mitigation status for CPUs that are in neither the blacklist nor the whitelist and that also don't enumerate the MSR ARCH_CAPABILITIES bits reflecting hardware immunity to MMIO Stale Data vulnerabilities.
Mitigation is not deployed when the status is unknown.
[ bp: Massage, fixup. ]
Fixes: 8d50cdf8b834 ("x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data")
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/a932c154772f2121794a5f2eded1a11013114711.1657846269.git.pawan.kumar.gupta@linux.intel.com

2b129932 | 02-Aug-2022 | Daniel Sneddon <daniel.sneddon@linux.intel.com>

x86/speculation: Add RSB VM Exit protections
tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as documented for RET instructions after VM exits. Mitigate it with a new one-entry RSB stuffing mechanism and a new LFENCE.
== Background ==
Indirect Branch Restricted Speculation (IBRS) was designed to help mitigate Branch Target Injection and Speculative Store Bypass, i.e. Spectre, attacks. IBRS prevents software run in less privileged modes from affecting branch prediction in more privileged modes. IBRS requires the MSR to be written on every privilege level change.
To overcome some of the performance issues of IBRS, Enhanced IBRS was introduced. eIBRS is an "always on" IBRS, in other words, just turn it on once instead of writing the MSR on every privilege level change. When eIBRS is enabled, more privileged modes should be protected from less privileged modes, including protecting VMMs from guests.
== Problem ==
Here's a simplification of how guests are run on Linux' KVM:
    void run_kvm_guest(void)
    {
    	// Prepare to run guest
    	VMRESUME();
    	// Clean up after guest runs
    }
The execution flow for that would look something like this to the processor:
    1. Host-side: call run_kvm_guest()
    2. Host-side: VMRESUME
    3. Guest runs, does "CALL guest_function"
    4. VM exit, host runs again
    5. Host might make some "cleanup" function calls
    6. Host-side: RET from run_kvm_guest()
Now, when back on the host, there are a couple of possible scenarios of post-guest activity the host needs to do before executing host code:
* on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not touched and Linux has to do a 32-entry stuffing.
* on eIBRS hardware, VM exit with IBRS enabled, or restoring the host IBRS=1 shortly after VM exit, has a documented side effect of flushing the RSB except in this PBRSB situation where the software needs to stuff the last RSB entry "by hand".
IOW, with eIBRS supported, host RET instructions should no longer be influenced by guest behavior after the host retires a single CALL instruction.
However, if the RET instructions are "unbalanced" with CALLs after a VM exit as is the RET in #6, it might speculatively use the address for the instruction after the CALL in #3 as an RSB prediction. This is a problem since the (untrusted) guest controls this address.
Balanced CALL/RET instruction pairs such as in step #5 are not affected.
== Solution ==
The PBRSB issue affects a wide variety of Intel processors which support eIBRS. But not all of them need mitigation. Today, X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e., eIBRS systems which enable legacy IBRS explicitly.
However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT and most of them need a new mitigation.
Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT.
The lighter-weight mitigation performs a CALL instruction which is immediately followed by a speculative execution barrier (INT3). This steers speculative execution to the barrier -- just like a retpoline -- which ensures that speculation can never reach an unbalanced RET. Then, ensure this CALL is retired before continuing execution with an LFENCE.
In other words, the window of exposure is opened at VM exit where RET behavior is troublesome. While the window is open, force RSB predictions sampling for RET targets to a dead end at the INT3. Close the window with the LFENCE.
There is a subset of eIBRS systems which are not vulnerable to PBRSB. Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB. Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.
[ bp: Massage, incorporate review comments from Andy Cooper. ]
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>

28a99e95 | 18-Jul-2022 | Peter Zijlstra <peterz@infradead.org>

x86/amd: Use IBPB for firmware calls
On AMD, IBRS does not prevent Retbleed; as such, use IBPB before a firmware call to flush the branch history state.
And because the kernel maps a whole lot of the kernel page table into the EFI page table in order to do an EFI call, do an IBPB just in case, to prevent poisoning the BTB and having an EFI call use the unprotected RET there.
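A sketch of the flush itself; the alternative()-based patching in the real change is elided and the feature-flag name is an assumption:

    if (cpu_feature_enabled(X86_FEATURE_USE_IBPB_FW))
    	wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);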
[ bp: Massage. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220715194550.793957-1-cascardo@canonical.com