# 077168e2 | 21-Mar-2020 | Wei Huang <wei.huang2@amd.com>
x86/mce/amd: Add PPIN support for AMD MCE
Newer AMD CPUs support a feature called protected processor identification number (PPIN). This feature can be detected via CPUID_Fn80000008_EBX[23].
However, CPUID alone is not enough to read the processor identification number - MSR_AMD_PPIN_CTL also needs to be configured properly. If, for any reason, MSR_AMD_PPIN_CTL[PPIN_EN] cannot be turned on, e.g. because it is disabled in the BIOS, the CPU capability bit X86_FEATURE_AMD_PPIN needs to be cleared.
When the X86_FEATURE_AMD_PPIN capability is available, the identification number is issued together with the MCE error info in order to keep track of the source of MCE errors.
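To make the detection sequence concrete, here is a minimal user-space sketch (illustrative only, not the kernel code): it checks CPUID_Fn80000008_EBX[23] and then reads the two MSRs through the msr driver. The MSR numbers 0xC00102F0 (MSR_AMD_PPIN_CTL) and 0xC00102F1 (MSR_AMD_PPIN) and the PPIN_EN bit position are assumptions taken from the kernel's msr-index.h and the commit text; reading /dev/cpu/0/msr needs root and the msr module.

    #include <cpuid.h>
    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Read one MSR on CPU 0 via the msr driver (needs root and CONFIG_X86_MSR). */
    static int rdmsr_cpu0(uint32_t msr, uint64_t *val)
    {
        int fd = open("/dev/cpu/0/msr", O_RDONLY);
        if (fd < 0)
            return -1;
        ssize_t n = pread(fd, val, sizeof(*val), msr);
        close(fd);
        return n == (ssize_t)sizeof(*val) ? 0 : -1;
    }

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;
        uint64_t ctl, ppin;

        /* CPUID_Fn80000008_EBX[23]: PPIN enumerated? */
        if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx) || !(ebx & (1u << 23))) {
            puts("PPIN not enumerated");
            return 0;
        }
        /* PPIN_EN assumed to be bit 1 of MSR_AMD_PPIN_CTL (0xC00102F0). */
        if (rdmsr_cpu0(0xC00102F0, &ctl) || !(ctl & 0x2) || rdmsr_cpu0(0xC00102F1, &ppin)) {
            puts("PPIN enumerated but not readable (PPIN_EN clear or no msr access)");
            return 0;
        }
        printf("PPIN: 0x%016" PRIx64 "\n", ppin);
        return 0;
    }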
[ bp: Massage. ]
Co-developed-by: Smita Koralahalli Channabasappa <smita.koralahallichannabasappa@amd.com> Signed-off-by: Smita Koralahalli Channabasappa <smita.koralahallichannabasappa@amd.com> Signed-off-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200321193800.3666964-1-wei.huang2@amd.com
# 753039ef | 11-Mar-2020 | Kim Phillips <kim.phillips@amd.com>
x86/cpu/amd: Call init_amd_zn() on Family 19h processors too
Family 19h CPUs are Zen-based and still share most architectural features with Family 17h CPUs, and therefore still need to call init_amd_zn() e.g., to set the RECLAIM_DISTANCE override.
init_amd_zn() also sets X86_FEATURE_ZEN, which today is only used in amd_set_core_ssb_state(), which isn't called on some late model Family 17h CPUs, nor on any Family 19h CPUs: X86_FEATURE_AMD_SSBD replaces X86_FEATURE_LS_CFG_SSBD on those later model CPUs, where the SSBD mitigation is done via the SPEC_CTRL MSR instead of the LS_CFG MSR.
Family 19h CPUs also don't have the erratum where the CPB feature bit isn't set, but that code can stay unchanged and run safely on Family 19h.
Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200311191451.13221-1-kim.phillips@amd.com
# 6650cdd9 | 26-Jan-2020 | Peter Zijlstra (Intel) <peterz@infradead.org>
x86/split_lock: Enable split lock detection by kernel
A split-lock occurs when an atomic instruction operates on data that spans two cache lines. In order to maintain atomicity the core takes a global bus lock.
This is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores (which must wait for the bus lock to be released before their memory operations can complete). For real-time systems this may mean missing deadlines. For other systems it may just be very annoying.
Some CPUs have the capability to raise an #AC trap when a split lock is attempted.
Provide a command line option to give the user choices on how to handle this:
split_lock_detect=
    off   - not enabled (no traps for split locks)
    warn  - warn once when an application does a split lock, but allow it to continue running.
    fatal - Send SIGBUS to applications that cause split lock
On systems that support split lock detection the default is "warn". Note that if the kernel hits a split lock in any mode other than "off" it will oops.
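For illustration (not part of the patch), a minimal user-space program that provokes a split lock is shown below: it performs a locked read-modify-write on a 4-byte value that straddles a 64-byte cache line. With split_lock_detect=warn it should trigger the one-time kernel warning; with split_lock_detect=fatal it should be killed with SIGBUS.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Two cache lines, 64-byte aligned. */
        char *buf = aligned_alloc(64, 128);
        if (!buf)
            return 1;

        /* A 4-byte counter placed at offset 62 straddles the line boundary. */
        uint32_t *counter = (uint32_t *)(buf + 62);
        *counter = 0;

        /* Locked RMW on data crossing a cache line => split lock
         * (#AC if split lock detection is enabled). */
        __atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);

        printf("counter = %u\n", *counter);
        free(buf);
        return 0;
    }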
One implementation wrinkle is that the MSR to control the split lock detection is per-core, not per thread. This might result in some short lived races on HT systems in "warn" mode if Linux tries to enable on one thread while disabling on the other. Race analysis by Sean Christopherson:
- Toggling of split-lock is only done in "warn" mode. Worst case scenario of a race is that a misbehaving task will generate multiple #AC exceptions on the same instruction. And this race will only occur if both siblings are running tasks that generate split-lock #ACs, e.g. a race where sibling threads are writing different values will only occur if CPUx is disabling split-lock after an #AC and CPUy is re-enabling split-lock after *its* previous task generated an #AC.
- Transitioning between off/warn/fatal modes at runtime isn't supported and disabling is tracked per task, so hardware will always reach a steady state that matches the configured mode. I.e. split-lock is guaranteed to be enabled in hardware once all _TIF_SLD threads have been scheduled out.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200126200535.GB30377@agluck-desk2.amr.corp.intel.com
# 85c17291 | 20-Dec-2019 | Sean Christopherson <sean.j.christopherson@intel.com>
x86/cpufeatures: Add flag to track whether MSR IA32_FEAT_CTL is configured
Add a new feature flag, X86_FEATURE_MSR_IA32_FEAT_CTL, to track whether IA32_FEAT_CTL has been initialized. This will allow KVM, and any future subsystems that depend on IA32_FEAT_CTL, to rely purely on cpufeatures to query platform support, e.g. allows a future patch to remove KVM's manual IA32_FEAT_CTL MSR checks.
Various features (on platforms that support IA32_FEAT_CTL) are dependent on IA32_FEAT_CTL being configured and locked, e.g. VMX and LMCE. The MSR is always configured during boot, but only if the CPU vendor is recognized by the kernel. Because CPUID doesn't incorporate the current IA32_FEAT_CTL value in its reporting of relevant features, it's possible for a feature to be reported as supported in cpufeatures but not truly enabled, e.g. if the CPU supports VMX but the kernel doesn't recognize the CPU.
As a result, without the flag, KVM would see VMX as supported even if IA32_FEAT_CTL hasn't been initialized, and so would need to manually read the MSR and check the various enabling bits to avoid taking an unexpected #GP on VMXON.
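For reference, a rough sketch of the bit test KVM would otherwise have to do by hand is shown below. The bit positions (bit 0 locks the MSR, bit 2 enables VMX outside SMX) follow the SDM, and the macro names are illustrative, not the kernel's.

    #include <stdbool.h>
    #include <stdint.h>

    #define FEAT_CTL_LOCKED          (1ULL << 0)   /* MSR write-locked */
    #define FEAT_CTL_VMX_OUTSIDE_SMX (1ULL << 2)   /* VMXON allowed outside SMX */

    /* VMXON outside SMX only works once the MSR is locked with the
     * VMX-outside-SMX enable bit set; anything else takes a #GP. */
    static bool feat_ctl_allows_vmxon(uint64_t feat_ctl)
    {
        return (feat_ctl & FEAT_CTL_LOCKED) &&
               (feat_ctl & FEAT_CTL_VMX_OUTSIDE_SMX);
    }

    int main(void)
    {
        return feat_ctl_allows_vmxon(0x5) ? 0 : 1;   /* 0x5 = locked + VMX outside SMX */
    }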
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20191221044513.21680-14-sean.j.christopherson@intel.com
# f444a5ff | 16-Dec-2019 | Tony Luck <tony.luck@intel.com>
x86/cpufeatures: Add support for fast short REP; MOVSB
From the Intel Optimization Reference Manual:
3.7.6.1 Fast Short REP MOVSB
Beginning with processors based on Ice Lake Client microarchitecture, REP MOVSB performance of short operations is enhanced. The enhancement applies to string lengths between 1 and 128 bytes long. Support for fast-short REP MOVSB is enumerated by the CPUID feature flag: CPUID.(EAX=7H, ECX=0H):EDX.FAST_SHORT_REP_MOVSB[bit 4] = 1. There is no change in the REP STOS performance.
Add an X86_FEATURE_FSRM flag for this.
memmove() avoids REP MOVSB for short (< 32 byte) copies. Check FSRM and use REP MOVSB for short copies on systems that support it.
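A hypothetical user-space check for the new bit, using the compiler's cpuid.h helper (the bit position is the one quoted above: leaf 7, subleaf 0, EDX bit 4):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* Leaf 7, subleaf 0; FSRM is EDX bit 4. */
        if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && (edx & (1u << 4)))
            puts("FSRM: fast short REP MOVSB supported");
        else
            puts("FSRM: not supported");
        return 0;
    }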
[ bp: Massage and add comment. ]
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20191216214254.26492-1-tony.luck@intel.com
# db4d30fb | 04-Nov-2019 | Vineela Tummalapalli <vineela.tummalapalli@intel.com>
x86/bugs: Add ITLB_MULTIHIT bug infrastructure
Some processors may incur a machine check error possibly resulting in an unrecoverable CPU lockup when an instruction fetch encounters a TLB multi-hit in the instruction TLB. This can occur when the page size is changed along with either the physical address or cache type. The relevant erratum can be found here:
https://bugzilla.kernel.org/show_bug.cgi?id=205195
There are other processors affected for which the erratum does not fully disclose the impact.
This issue affects both bare-metal x86 page tables and EPT.
It can be mitigated by either eliminating the use of large pages or by using careful TLB invalidations when changing the page size in the page tables.
Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which are mitigated against this issue.
Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com> Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
# 1b42f017 | 23-Oct-2019 | Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
x86/speculation/taa: Add mitigation for TSX Async Abort
TSX Async Abort (TAA) is a side channel vulnerability to the internal buffers in some Intel processors similar to Microarchitectural Data Sampling (MDS). In this case, certain loads may speculatively pass invalid data to dependent operations when an asynchronous abort condition is pending in a TSX transaction.
This includes loads with no fault or assist condition. Such loads may speculatively expose stale data from the uarch data structures as in MDS. Scope of exposure is within the same-thread and cross-thread. This issue affects all current processors that support TSX, but do not have ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES.
On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0, CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers using VERW or L1D_FLUSH, there is no additional mitigation needed for TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by disabling the Transactional Synchronization Extensions (TSX) feature.
A new MSR IA32_TSX_CTRL in future and current processors after a microcode update can be used to control the TSX feature. There are two bits in that MSR:
* TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted Transactional Memory (RTM).
* TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled with updated microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}.
The second mitigation approach is similar to MDS which is clearing the affected CPU buffers on return to user space and when entering a guest. Relevant microcode update is required for the mitigation to work. More details on this approach can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html
The TSX feature can be controlled by the "tsx" command line parameter. If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is deployed. The effective mitigation state can be read from sysfs.
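For example, a small illustrative C snippet (not part of the patch) can report that sysfs state; the path below is the vulnerabilities entry this series adds:

    #include <stdio.h>

    int main(void)
    {
        char line[256];
        FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/tsx_async_abort", "r");

        if (!f) {
            puts("tsx_async_abort: sysfs entry not present (old kernel?)");
            return 1;
        }
        if (fgets(line, sizeof(line), f))
            printf("tsx_async_abort: %s", line);  /* e.g. "Mitigation: TSX disabled" */
        fclose(f);
        return 0;
    }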
[ bp: - massage + comments cleanup
      - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh.
      - remove partial TAA mitigation in update_mds_branch_idle() - Josh.
      - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g ]
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
# 9d40b85b | 07-Oct-2019 | Babu Moger <babu.moger@amd.com>
x86/cpufeatures: Add feature bit RDPRU on AMD
AMD Zen 2 introduces a new RDPRU instruction which is used to give access to some processor registers that are typically only accessible when the privilege level is zero.
ECX is used as the implicit register to specify which register to read. RDPRU places the specified register’s value into EDX:EAX.
For example, the RDPRU instruction can be used to read MPERF and APERF at CPL > 0.
Add the feature bit so it is visible in /proc/cpuinfo.
Details are available in the AMD64 Architecture Programmer’s Manual: https://www.amd.com/system/files/TechDocs/24594.pdf
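A hedged sketch of issuing the instruction from user space: the 0F 01 FD opcode bytes and register index 0 for MPERF are taken from the manual above and should be verified there; on CPUs without RDPRU this raises #UD (SIGILL).

    #include <stdint.h>
    #include <stdio.h>

    /* RDPRU reads the register selected by ECX into EDX:EAX.
     * Emitted as raw bytes (0F 01 FD) so older assemblers also accept it. */
    static uint64_t rdpru(uint32_t reg)
    {
        uint32_t lo, hi;

        __asm__ volatile(".byte 0x0f, 0x01, 0xfd"
                         : "=a"(lo), "=d"(hi)
                         : "c"(reg));
        return ((uint64_t)hi << 32) | lo;
    }

    int main(void)
    {
        /* Register index 0 is assumed to select MPERF (1 would be APERF). */
        printf("MPERF via RDPRU: %llu\n", (unsigned long long)rdpru(0));
        return 0;
    }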
Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Aaron Lewis <aaronlewis@google.com> Cc: ak@linux.intel.com Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: robert.hu@linux.intel.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191007204839.5727.10803.stgit@localhost.localdomain
# b4dd4f6e | 28-Aug-2019 | Thomas Hellstrom <thellstrom@vmware.com>
x86/vmware: Add a header file for hypercall definitions
The new header is intended to be used by drivers using the backdoor. Follow the KVM example using alternatives self-patching to choose between vmcall, vmmcall and io instructions.
Also define two new CPU feature flags to indicate hypervisor support for vmcall- and vmmcall instructions. The new X86_FEATURE_VMW_VMMCALL flag is needed because using X86_FEATURE_VMMCALL might break QEMU/KVM setups using the vmmouse driver. They rely on X86_FEATURE_VMMCALL on AMD to get the kvm_hypercall() right. But they do not yet implement vmmcall for the VMware hypercall used by the vmmouse driver.
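As a rough illustration of the mechanism (not the new header itself), the classic I/O-port form of the backdoor looks like the sketch below. The magic 0x564D5868, port 0x5658 and command 10 (get version) are the well-known legacy values and are used here as assumptions; the header replaces the hard-coded "in" with an alternatives-patched vmcall/vmmcall/in sequence.

    #include <stdint.h>
    #include <stdio.h>

    #define VMW_BDOOR_MAGIC    0x564D5868u  /* "VMXh" (assumed legacy value) */
    #define VMW_BDOOR_PORT     0x5658u
    #define VMW_CMD_GETVERSION 10u

    int main(void)
    {
        uint32_t eax = VMW_BDOOR_MAGIC, ebx = ~0u;
        uint32_t ecx = VMW_CMD_GETVERSION, edx = VMW_BDOOR_PORT;

        /* Legacy I/O-port backdoor; inside a VMware guest the hypervisor
         * intercepts this even at CPL 3. Outside a VMware guest this is a
         * real port read and will fault without I/O privileges. */
        __asm__ volatile("inl %%dx, %%eax"
                         : "+a"(eax), "+b"(ebx), "+c"(ecx), "+d"(edx));

        printf("version=%u magic=0x%x\n", eax, ebx);
        return ebx == VMW_BDOOR_MAGIC ? 0 : 1;
    }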
[ bp: reflow hypercall %edx usage explanation comment. ]
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Doug Covelli <dcovelli@vmware.com> Cc: Aaron Lewis <aaronlewis@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-graphics-maintainer@vmware.com Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Robert Hoo <robert.hu@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: virtualization@lists.linux-foundation.org Cc: <pv-drivers@vmware.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190828080353.12658-3-thomas_os@shipmail.org
# f36cf386 | 17-Jul-2019 | Thomas Gleixner <tglx@linutronix.de>
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Intel provided the following information:
On all current Atom processors, instructions that use a segment register value (e.g. a load or store) will not speculatively execute before the last writer of that segment retires. Thus they will not use a speculatively written segment value.
That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS entry paths can be excluded from the extra LFENCE if PTI is disabled.
Create a separate bug flag for the through SWAPGS speculation and mark all out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs are excluded from the whole mitigation mess anyway.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
# be261ffc | 04-Jul-2019 | Josh Poimboeuf <jpoimboe@redhat.com>
x86: Remove X86_FEATURE_MFENCE_RDTSC
AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC). They've both had it for a long time, and AMD has had it enabled in Linux since Spectre v1 was announced.
Back then, there was a proposal to remove the serializing mfence feature bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have serializing lfence. At the time, it was (ahem) speculated that some hypervisors might not yet support its removal, so it remained for the time being.
Now a year-and-a-half later, it should be safe to remove.
I asked Andrew Cooper about whether it's still needed:
So if you're virtualised, you've got no choice in the matter. lfence is either dispatch-serialising or not on AMD, and you won't be able to change it.
Furthermore, you can't accurately tell what state the bit is in, because the MSR might not be virtualised at all, or may not reflect the true state in hardware. Worse still, attempting to set the bit may not be successful even if there isn't a fault for doing so.
Xen sets the DE_CFG bit unconditionally, as does Linux by the looks of things (see MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT). ISTR other hypervisor vendors saying the same, but I don't have any information to hand.
If you are running under a hypervisor which has been updated, then lfence will almost certainly be dispatch-serialising in practice, and you'll almost certainly see the bit already set in DE_CFG. If you're running under a hypervisor which hasn't been patched since Spectre, you've already lost in many more ways.
I'd argue that X86_FEATURE_MFENCE_RDTSC is not worth keeping.
So remove it. This will reduce some code rot, and also make it easier to hook barrier_nospec() up to a cmdline disable for performance raisins, without having to need an alternative_3() macro.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/d990aa51e40063acb9888e8c1b688e41355a9588.1562255067.git.jpoimboe@redhat.com
# 018ebca8 | 17-Jul-2019 | Gayatri Kammela <gayatri.kammela@intel.com>
x86/cpufeatures: Enable a new AVX512 CPU feature
Add a new AVX512 instruction group/feature for enumeration in /proc/cpuinfo: AVX512_VP2INTERSECT.
CPUID.(EAX=7,ECX=0):EDX[bit 8] AVX512_VP2INTERSECT
Detailed information of CPUID bits for this feature can be found in the Intel Architecture Instruction Set Extensions Programming Reference document (refer to Table 1-2). A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=204215.
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190717234632.32673-3-gayatri.kammela@intel.com
# 18ec54fd | 08-Jul-2019 | Josh Poimboeuf <jpoimboe@redhat.com>
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
Spectre v1 isn't only about array bounds checks. It can affect any conditional checks. The kernel entry code interrupt, exception, and NMI handlers all have conditional swapgs checks. Those may be problematic in the context of Spectre v1, as kernel code can speculatively run with a user GS.
For example:
    if (coming from user space)
        swapgs
    mov %gs:<percpu_offset>, %reg
    mov (%reg), %reg1
When coming from user space, the CPU can speculatively skip the swapgs, and then do a speculative percpu load using the user GS value. So the user can speculatively force a read of any kernel value. If a gadget exists which uses the percpu value as an address in another load/store, then the contents of the kernel value may become visible via an L1 side channel attack.
A similar attack exists when coming from kernel space. The CPU can speculatively do the swapgs, causing the user GS to get used for the rest of the speculative window.
The mitigation is similar to a traditional Spectre v1 mitigation, except:
a) index masking isn't possible; because the index (percpu offset) isn't user-controlled; and
b) an lfence is needed in both the "from user" swapgs path and the "from kernel" non-swapgs path (because of the two attacks described above).
The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a CR3 write when PTI is enabled. Since CR3 writes are serializing, the lfences can be skipped in those cases.
On the other hand, the kernel entry swapgs paths don't depend on PTI.
To avoid unnecessary lfences for the user entry case, create two separate features for alternative patching:
    X86_FEATURE_FENCE_SWAPGS_USER
    X86_FEATURE_FENCE_SWAPGS_KERNEL
Use these features in entry code to patch in lfences where needed.
The features aren't enabled yet, so there's no functional change.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com>
# 6dbbf5ec | 19-Jun-2019 | Fenghua Yu <fenghua.yu@intel.com>
x86/cpufeatures: Enumerate user wait instructions
umonitor, umwait, and tpause are a set of user wait instructions.
umonitor arms address monitoring hardware using an address. The address range is determined by using CPUID.0x5. A store to an address within the specified address range triggers the monitoring hardware to wake up the processor waiting in umwait.
umwait instructs the processor to enter an implementation-dependent optimized state while monitoring a range of addresses. The optimized state may be either a light-weight power/performance optimized state (C0.1 state) or an improved power/performance optimized state (C0.2 state).
tpause instructs the processor to enter an implementation-dependent optimized state C0.1 or C0.2 state and wake up when time-stamp counter reaches specified timeout.
The three instructions may be executed at any privilege level.
The instructions provide a power saving method while waiting in user space. Additionally, they can allow a sibling hyperthread to make faster progress while this thread is waiting. One example of application usage of umwait is waiting for input data from another application, such as a user level multi-threaded packet processing engine.
Availability of the user wait instructions is indicated by the presence of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5].
Detailed information on the instructions and CPUID feature WAITPKG flag can be found in the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference and Intel 64 and IA-32 Architectures Software Developer's Manual.
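A hypothetical usage sketch with the matching compiler intrinsics (build with -mwaitpkg; the TSC deadline is arbitrary and the code assumes the WAITPKG bit is set, otherwise the instructions raise #UD):

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>   /* _umonitor/_umwait/_tpause, __rdtsc */

    static volatile uint64_t flag;

    int main(void)
    {
        /* Arm the monitor on &flag, then wait (C0.2, ctrl=0) until the line
         * is written, another wake event fires, or the TSC deadline passes. */
        unsigned long long deadline = __rdtsc() + 1000000ULL;

        _umonitor((void *)&flag);
        unsigned char cf = _umwait(0, deadline);
        printf("umwait returned, CF=%u\n", cf);  /* CF encodes why the wait ended (see SDM) */

        /* tpause: timed pause with no address monitoring, lighter C0.1 state. */
        _tpause(1, __rdtsc() + 1000000ULL);
        return 0;
    }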
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: "Borislav Petkov" <bp@alien8.de> Cc: "H Peter Anvin" <hpa@zytor.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Tony Luck" <tony.luck@intel.com> Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/1560994438-235698-2-git-send-email-fenghua.yu@intel.com
# b302e4b1 | 17-Jun-2019 | Fenghua Yu <fenghua.yu@intel.com>
x86/cpufeatures: Enumerate the new AVX512 BFLOAT16 instructions
AVX512 BFLOAT16 instructions support 16-bit BFLOAT16 floating-point format (BF16) for deep learning optimization.
BF16 is a short version of 32-bit single-precision floating-point format (FP32) and has several advantages over 16-bit half-precision floating-point format (FP16). BF16 keeps FP32 accumulation after multiplication without loss of precision, offers more than enough range for deep learning training tasks, and doesn't need to handle hardware exception.
AVX512 BFLOAT16 instructions are enumerated in CPUID.7.1:EAX[bit 5] AVX512_BF16.
CPUID.7.1:EAX contains only feature bits. Reuse the currently empty word 12 as a pure features word to hold the feature bits including AVX512_BF16.
Detailed information of the CPUID bit and AVX512 BFLOAT16 instructions can be found in the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference.
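An illustrative user-space check for the bit, including the subleaf-validity test that the bp note below refers to (CPUID(7,0).EAX reports the highest supported subleaf of leaf 7):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* CPUID(7,0).EAX is the highest supported subleaf of leaf 7;
         * only query subleaf 1 if it is actually reported. */
        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) || eax < 1) {
            puts("CPUID leaf 7 subleaf 1 not available");
            return 0;
        }
        __get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx);
        printf("AVX512_BF16: %s\n", (eax & (1u << 5)) ? "yes" : "no");
        return 0;
    }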
[ bp: Check CPUID(7) subleaf validity before accessing subleaf 1. ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nadav Amit <namit@vmware.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Peter Feiner <pfeiner@google.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com> Cc: Robert Hoo <robert.hu@linux.intel.com> Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Lendacky <Thomas.Lendacky@amd.com> Cc: x86 <x86@kernel.org> Link: https://lkml.kernel.org/r/1560794416-217638-3-git-send-email-fenghua.yu@intel.com
# acec0ce0 | 19-Jun-2019 | Fenghua Yu <fenghua.yu@intel.com>
x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two whole feature bits words. To better utilize feature words, re-define word 11 to host scattered features and move the four X86_FEATURE_CQM_* features into Linux defined word 11. More scattered features can be added in word 11 in the future.
Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a Linux-defined leaf.
Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful name in the next patch when CPUID.7.1:EAX occupies word 12.
Maximum number of RMID and cache occupancy scale are retrieved from CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the code into a separate function.
KVM doesn't support resctrl now. So it's safe to move the X86_FEATURE_CQM_* features to scattered features word 11 for KVM.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Aaron Lewis <aaronlewis@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Babu Moger <babu.moger@amd.com> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Nadav Amit <namit@vmware.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Peter Feiner <pfeiner@google.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Sherry Hurwitz <sherry.hurwitz@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Lendacky <Thomas.Lendacky@amd.com> Cc: x86 <x86@kernel.org> Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
# cbb99c0f | 05-Jun-2019 | Aaron Lewis <aaronlewis@google.com>
x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS
Add the CPUID enumeration for Intel's de-feature bits to accommodate passing these de-features through to kvm guests.
These de-features are (from SDM vol 1, section 8.1.8):
- X86_FEATURE_FDP_EXCPTN_ONLY: If CPUID.(EAX=07H,ECX=0H):EBX[bit 6] = 1, the data pointer (FDP) is updated only for the x87 non-control instructions that incur unmasked x87 exceptions.
- X86_FEATURE_ZERO_FCS_FDS: If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 1, the processor deprecates FCS and FDS; it saves each as 0000H.
Signed-off-by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: marcorr@google.com Cc: Peter Feiner <pfeiner@google.com> Cc: pshier@google.com Cc: Robert Hoo <robert.hu@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Lendacky <Thomas.Lendacky@amd.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190605220252.103406-1-aaronlewis@google.com
# e261f209 | 01-Mar-2019 | Thomas Gleixner <tglx@linutronix.de>
x86/speculation/mds: Add BUG_MSBDS_ONLY
This bug bit is set on CPUs which are only affected by Microarchitectural Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
This is important because the Store Buffers are partitioned between Hyper-Threads so cross thread forwarding is not possible. But if a thread enters or exits a sleep state the store buffer is repartitioned which can expose data from one thread to the other. This transition can be mitigated.
That means that for CPUs which are only affected by MSBDS SMT can be enabled, if the CPU is not affected by other SMT sensitive vulnerabilities, e.g. L1TF. The XEON PHI variants fall into that category. Also the Silvermont/Airmont ATOMs, but for them it's not really relevant as they do not support SMT, but mark them for completeness sake.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
# ed5194c2 | 18-Jan-2019 | Andi Kleen <ak@linux.intel.com>
x86/speculation/mds: Add basic bug infrastructure for MDS
Microarchitectural Data Sampling (MDS) is a class of side channel attacks on internal buffers in Intel CPUs. The variants are:
- Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
- Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
- Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a dependent load (store-to-load forwarding) as an optimization. The forward can also happen to a faulting or assisting load operation for a different memory address, which can be exploited under certain conditions. Store buffers are partitioned between Hyper-Threads so cross thread forwarding is not possible. But if a thread enters or exits a sleep state the store buffer is repartitioned which can expose data from one thread to the other.
MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage L1 miss situations and to hold data which is returned or sent in response to a memory or I/O operation. Fill buffers can forward data to a load operation and also write data to the cache. When the fill buffer is deallocated it can retain the stale data of the preceding operations which can then be forwarded to a faulting or assisting load operation, which can be exploited under certain conditions. Fill buffers are shared between Hyper-Threads so cross thread leakage is possible.
MLPDS leaks Load Port Data. Load ports are used to perform load operations from memory or I/O. The received data is then forwarded to the register file or a subsequent operation. In some implementations the Load Port can contain stale data from a previous operation which can be forwarded to faulting or assisting loads under certain conditions, which again can be exploited eventually. Load ports are shared between Hyper-Threads so cross thread leakage is possible.
All variants have the same mitigation for single CPU thread case (SMT off), so the kernel can treat them as one MDS issue.
Add the basic infrastructure to detect if the current CPU is affected by MDS.
[ tglx: Rewrote changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
# 52f64909 | 05-Mar-2019 | Peter Zijlstra (Intel) <peterz@infradead.org>
x86: Add TSX Force Abort CPUID/MSR
Skylake systems will receive a microcode update to address a TSX erratum. This microcode will (by default) clobber PMC3 when TSX instructions are (speculatively or not) executed.
It also provides an MSR to cause all TSX transaction to abort and preserve PMC3.
Add the CPUID enumeration and MSR definition.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
# a0aea130 | 19-Dec-2018 | Robert Hoo <robert.hu@linux.intel.com>
KVM: x86: Add CPUID support for new instruction WBNOINVD
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# 20c3a2c3 | 13-Dec-2018 | Thomas Lendacky <Thomas.Lendacky@amd.com>
x86/speculation: Add support for STIBP always-on preferred mode
Different AMD processors may have different implementations of STIBP. When STIBP is conditionally enabled, some implementations would benefit from having STIBP always on instead of toggling the STIBP bit through MSR writes. This preference is advertised through a CPUID feature bit.
When conditional STIBP support is requested at boot and the CPU advertises STIBP always-on mode as preferred, switch to STIBP "on" support. To show that this transition has occurred, create a new spectre_v2_user_mitigation value and a new spectre_v2_user_strings message. The new mitigation value is used in spectre_v2_user_select_mitigation() to print the new mitigation message as well as to return a new string from stibp_state().
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20181213230352.6937.74943.stgit@tlendack-t1.amdoffice.net
# ace6485a | 24-Oct-2018 | Fenghua Yu <fenghua.yu@intel.com>
x86/cpufeatures: Enumerate MOVDIR64B instruction
MOVDIR64B moves 64-bytes as direct-store with 64-bytes write atomicity. Direct store is implemented by using write combining (WC) for writing data directly into memory without caching the data.
In low latency offload (e.g. Non-Volatile Memory, etc), MOVDIR64B writes work descriptors (and data in some cases) to device-hosted work-queues atomically without cache pollution.
Availability of the MOVDIR64B instruction is indicated by the presence of the CPUID feature flag MOVDIR64B (CPUID.0x07.0x0:ECX[bit 28]).
Please check the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference for more details on the CPUID feature MOVDIR64B flag.
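A sketch using the matching compiler intrinsic (build with -mmovdir64b): ordinary 64-byte-aligned memory stands in for a device work-queue portal here, so this only demonstrates the 64-byte direct store itself, and it assumes the CPUID bit above is set.

    #include <immintrin.h>   /* _movdir64b */
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        /* 64-byte "work descriptor"; the destination stands in for a
         * device-hosted work-queue portal and must be 64-byte aligned. */
        uint8_t desc[64];
        void *portal = aligned_alloc(64, 64);
        if (!portal)
            return 1;
        memset(desc, 0xab, sizeof(desc));

        /* One 64-byte direct store with 64-byte write atomicity; the
         * source has no alignment requirement. */
        _movdir64b(portal, desc);

        free(portal);
        return 0;
    }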
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1540418237-125817-3-git-send-email-fenghua.yu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
# 33823f4d | 24-Oct-2018 | Fenghua Yu <fenghua.yu@intel.com>
x86/cpufeatures: Enumerate MOVDIRI instruction
MOVDIRI moves doubleword or quadword from register to memory through direct store which is implemented by using write combining (WC) for writing data directly into memory without caching the data.
Programmable agents can handle streaming offload (e.g. high speed packet processing in network). Hardware implements a doorbell (tail pointer) register that is updated by software when adding new work-elements to the streaming offload work-queue.
MOVDIRI can be used as the doorbell write which is a 4-byte or 8-byte uncachable write to MMIO. MOVDIRI has lower overhead than other ways to write the doorbell.
Availability of the MOVDIRI instruction is indicated by the presence of the CPUID feature flag MOVDIRI (CPUID.0x07.0x0:ECX[bit 27]).
Please check the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference for more details on the CPUID feature MOVDIRI flag.
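And the analogous sketch for MOVDIRI (build with -mmovdiri; the 8-byte form is 64-bit only); a plain uint64_t stands in for the MMIO doorbell register and the CPUID bit above is assumed to be present.

    #include <immintrin.h>   /* _directstoreu_u64 */
    #include <stdint.h>

    int main(void)
    {
        /* Stand-in for an 8-byte MMIO doorbell (tail pointer) register. */
        static uint64_t doorbell;

        /* 8-byte direct store: write-combining, bypasses the cache. */
        _directstoreu_u64(&doorbell, 0x1234);
        return doorbell == 0x1234 ? 0 : 1;
    }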
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1540418237-125817-2-git-send-email-fenghua.yu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
# 706d5168 | 01-Aug-2018 | Sai Praneeth <sai.praneeth.prakhya@intel.com>
x86/speculation: Support Enhanced IBRS on future CPUs
Future Intel processors will support "Enhanced IBRS" which is an "always on" mode i.e. IBRS bit in SPEC_CTRL MSR is enabled once and never disabled.
From the specification [1]:
"With enhanced IBRS, the predicted targets of indirect branches executed cannot be controlled by software that was executed in a less privileged predictor mode or on another logical processor. As a result, software operating on a processor with enhanced IBRS need not use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more privileged predictor mode. Software can isolate predictor modes effectively simply by setting the bit once. Software need not disable enhanced IBRS prior to entering a sleep state such as MWAIT or HLT."
If Enhanced IBRS is supported by the processor then use it as the preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's Retpoline white paper [2] states:
"Retpoline is known to be an effective branch target injection (Spectre variant 2) mitigation on Intel processors belonging to family 6 (enumerated by the CPUID instruction) that do not have support for enhanced IBRS. On processors that support enhanced IBRS, it should be used for mitigation instead of retpoline."
The reason why Enhanced IBRS is the recommended mitigation on processors which support it is that these processors also support CET which provides a defense against ROP attacks. Retpoline is very similar to ROP techniques and might trigger false positives in the CET defense.
If Enhanced IBRS is selected as the mitigation technique for spectre v2, the IBRS bit in SPEC_CTRL MSR is set once at boot time and never cleared. Kernel also has to make sure that IBRS bit remains set after VMEXIT because the guest might have cleared the bit. This is already covered by the existing x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() speculation control functions.
Enhanced IBRS still requires IBPB for full mitigation.
[1] Speculative-Execution-Side-Channel-Mitigations.pdf [2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf Both documents are available at: https://bugzilla.kernel.org/show_bug.cgi?id=199511
Originally-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tim C Chen <tim.c.chen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/1533148945-24095-1-git-send-email-sai.praneeth.prakhya@intel.com