Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
1acbca93 |
| 22-Mar-2024 |
Adamos Ttofari <attofari@amazon.de> |
x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
[ Upstream commit 10e4b5166df9ff7a2d5316138ca668b42d004422 ]
Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and commit 8bf26758ca
x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
[ Upstream commit 10e4b5166df9ff7a2d5316138ca668b42d004422 ]
Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and commit 8bf26758ca96 ("x86/fpu: Add XFD state to fpstate") introduced a per CPU variable xfd_state to keep the MSR_IA32_XFD value cached, in order to avoid unnecessary writes to the MSR.
On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which wipes out any stale state. But the per CPU cached xfd value is not reset, which brings them out of sync.
As a consequence a subsequent xfd_update_state() might fail to update the MSR which in turn can result in XRSTOR raising a #NM in kernel space, which crashes the kernel.
To fix this, introduce xfd_set_state() to write xfd_state together with MSR_IA32_XFD, and use it in all places that set MSR_IA32_XFD.
Fixes: 672365477ae8 ("x86/fpu: Update XFD state where required") Signed-off-by: Adamos Ttofari <attofari@amazon.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240322230439.456571-1-chang.seok.bae@intel.com
Closes: https://lore.kernel.org/lkml/20230511152818.13839-1-attofari@amazon.de Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
1acbca93 |
| 22-Mar-2024 |
Adamos Ttofari <attofari@amazon.de> |
x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
[ Upstream commit 10e4b5166df9ff7a2d5316138ca668b42d004422 ]
Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and commit 8bf26758ca
x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
[ Upstream commit 10e4b5166df9ff7a2d5316138ca668b42d004422 ]
Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and commit 8bf26758ca96 ("x86/fpu: Add XFD state to fpstate") introduced a per CPU variable xfd_state to keep the MSR_IA32_XFD value cached, in order to avoid unnecessary writes to the MSR.
On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which wipes out any stale state. But the per CPU cached xfd value is not reset, which brings them out of sync.
As a consequence a subsequent xfd_update_state() might fail to update the MSR which in turn can result in XRSTOR raising a #NM in kernel space, which crashes the kernel.
To fix this, introduce xfd_set_state() to write xfd_state together with MSR_IA32_XFD, and use it in all places that set MSR_IA32_XFD.
Fixes: 672365477ae8 ("x86/fpu: Update XFD state where required") Signed-off-by: Adamos Ttofari <attofari@amazon.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240322230439.456571-1-chang.seok.bae@intel.com
Closes: https://lore.kernel.org/lkml/20230511152818.13839-1-attofari@amazon.de Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6 |
|
#
8647c52e |
| 27-Sep-2023 |
Sean Christopherson <seanjc@google.com> |
KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
Mask off xfeatures that aren't exposed to the guest only when saving guest state via KVM_GET_XSAVE{2} instead of modifying user
KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
Mask off xfeatures that aren't exposed to the guest only when saving guest state via KVM_GET_XSAVE{2} instead of modifying user_xfeatures directly. Preserving the maximal set of xfeatures in user_xfeatures restores KVM's ABI for KVM_SET_XSAVE, which prior to commit ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") allowed userspace to load xfeatures that are supported by the host, irrespective of what xfeatures are exposed to the guest.
There is no known use case where userspace *intentionally* loads xfeatures that aren't exposed to the guest, but the bug fixed by commit ad856280ddea was specifically that KVM_GET_SAVE{2} would save xfeatures that weren't exposed to the guest, e.g. would lead to userspace unintentionally loading guest-unsupported xfeatures when live migrating a VM.
Restricting KVM_SET_XSAVE to guest-supported xfeatures is especially problematic for QEMU-based setups, as QEMU has a bug where instead of terminating the VM if KVM_SET_XSAVE fails, QEMU instead simply stops loading guest state, i.e. resumes the guest after live migration with incomplete guest state, and ultimately results in guest data corruption.
Note, letting userspace restore all host-supported xfeatures does not fix setups where a VM is migrated from a host *without* commit ad856280ddea, to a target with a subset of host-supported xfeatures. However there is no way to safely address that scenario, e.g. KVM could silently drop the unsupported features, but that would be a clear violation of KVM's ABI and so would require userspace to opt-in, at which point userspace could simply be updated to sanitize the to-be-loaded XSAVE state.
Reported-by: Tyler Stachecki <stachecki.tyler@gmail.com> Closes: https://lore.kernel.org/all/20230914010003.358162-1-tstachecki@bloomberg.net Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") Cc: stable@vger.kernel.org Cc: Leonardo Bras <leobras@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Message-Id: <20230928001956.924301-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
18164f66 |
| 27-Sep-2023 |
Sean Christopherson <seanjc@google.com> |
x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
Plumb an xfeatures mask into __copy_xstate_to_uabi_buf() so that KVM can constrain which xfeatures are saved into the userspa
x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
Plumb an xfeatures mask into __copy_xstate_to_uabi_buf() so that KVM can constrain which xfeatures are saved into the userspace buffer without having to modify the user_xfeatures field in KVM's guest_fpu state.
KVM's ABI for KVM_GET_XSAVE{2} is that features that are not exposed to guest must not show up in the effective xstate_bv field of the buffer. Saving only the guest-supported xfeatures allows userspace to load the saved state on a different host with a fewer xfeatures, so long as the target host supports the xfeatures that are exposed to the guest.
KVM currently sets user_xfeatures directly to restrict KVM_GET_XSAVE{2} to the set of guest-supported xfeatures, but doing so broke KVM's historical ABI for KVM_SET_XSAVE, which allows userspace to load any xfeatures that are supported by the *host*.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230928001956.924301-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
Revision tags: v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1 |
|
#
18032b47 |
| 30-Aug-2023 |
Jim Mattson <jmattson@google.com> |
x86/fpu/xstate: Fix PKRU covert channel
When XCR0[9] is set, PKRU can be read and written from userspace with XSAVE and XRSTOR, even when CR4.PKE is clear.
Clear XCR0[9] when protection keys are di
x86/fpu/xstate: Fix PKRU covert channel
When XCR0[9] is set, PKRU can be read and written from userspace with XSAVE and XRSTOR, even when CR4.PKE is clear.
Clear XCR0[9] when protection keys are disabled.
Reported-by: Tavis Ormandy <taviso@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20230831043228.1194256-1-jmattson@google.com
show more ...
|
Revision tags: v6.1.50, v6.5, v6.1.49, v6.1.48 |
|
#
2c66ca39 |
| 23-Aug-2023 |
Feng Tang <feng.tang@intel.com> |
x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4
0-Day found a 34.6% regression in stress-ng's 'af-alg' test case, and bisected it to commit b81fac906a8f ("x86/fpu: Move FPU in
x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4
0-Day found a 34.6% regression in stress-ng's 'af-alg' test case, and bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()"), which optimizes the FPU init order, and moves the CR4_OSXSAVE enabling into a later place:
arch_cpu_finalize_init identify_boot_cpu identify_cpu generic_identify get_cpu_cap --> setup cpu capability ... fpu__init_cpu fpu__init_cpu_xstate cr4_set_bits(X86_CR4_OSXSAVE);
As the FPU is not yet initialized the CPU capability setup fails to set X86_FEATURE_OSXSAVE. Many security module like 'camellia_aesni_avx_x86_64' depend on this feature and therefore fail to load, causing the regression.
Cure this by setting X86_FEATURE_OSXSAVE feature right after OSXSAVE enabling.
[ tglx: Moved it into the actual BSP FPU initialization code and added a comment ]
Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@intel.com Link: https://lore.kernel.org/lkml/20230823065747.92257-1-feng.tang@intel.com
show more ...
|
Revision tags: v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34 |
|
#
8970ef02 |
| 12-Jun-2023 |
Rick Edgecombe <rick.p.edgecombe@intel.com> |
x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
Shadow stack register state can be managed with XSAVE. The registers can logically be separated into two groups: * Registers co
x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
Shadow stack register state can be managed with XSAVE. The registers can logically be separated into two groups: * Registers controlling user-mode operation * Registers controlling kernel-mode operation
The architecture has two new XSAVE state components: one for each group of those groups of registers. This lets an OS manage them separately if it chooses. Future patches for host userspace and KVM guests will only utilize the user-mode registers, so only configure XSAVE to save user-mode registers. This state will add 16 bytes to the xsave buffer size.
Future patches will use the user-mode XSAVE area to save guest user-mode CET state. However, VMCS includes new fields for guest CET supervisor states. KVM can use these to save and restore guest supervisor state, so host supervisor XSAVE support is not required.
Adding this exacerbates the already unwieldy if statement in check_xstate_against_struct() that handles warning about unimplemented xfeatures. So refactor these check's by having XCHECK_SZ() set a bool when it actually check's the xfeature. This ends up exceeding 80 chars, but was better on balance than other options explored. Pass the bool as pointer to make it clear that XCHECK_SZ() can change the variable.
While configuring user-mode XSAVE, clarify kernel-mode registers are not managed by XSAVE by defining the xfeature in XFEATURE_MASK_SUPERVISOR_UNSUPPORTED, like is done for XFEATURE_MASK_PT. This serves more of a documentation as code purpose, and functionally, only enables a few safety checks.
Both XSAVE state components are supervisor states, even the state controlling user-mode operation. This is a departure from earlier features like protection keys where the PKRU state is a normal user (non-supervisor) state. Having the user state be supervisor-managed ensures there is no direct, unprivileged access to it, making it harder for an attacker to subvert CET.
To facilitate this privileged access, define the two user-mode CET MSRs, and the bits defined in those MSRs relevant to future shadow stack enablement patches.
Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: John Allen <john.allen@amd.com> Tested-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/20230613001108.3040476-25-rick.p.edgecombe%40intel.com
show more ...
|
Revision tags: v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15 |
|
#
b1588884 |
| 27-Feb-2023 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
__copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer or from init_fpstate into the ptrace buffer. Dynamic
x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
__copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer or from init_fpstate into the ptrace buffer. Dynamic features, like XTILEDATA, have an all zeroes init state and are not saved in init_fpstate, which means the corresponding bit is not set in the xfeatures bitmap of the init_fpstate header.
But __copy_xstate_to_uabi_buf() retrieves addresses for both the tasks xstate and init_fpstate unconditionally via __raw_xsave_addr().
So if the tasks XSAVE buffer has a dynamic feature set, then the address retrieval for init_fpstate triggers the warning in __raw_xsave_addr() which checks the feature bit in the init_fpstate header.
Remove the address retrieval from init_fpstate for extended features. They have an all zeroes init state so init_fpstate has zeros for them. Then zeroing the user buffer for the init state is the same as copying them from init_fpstate.
Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/kvm/20230221163655.920289-2-mizhang@google.com/ Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/all/20230227210504.18520-2-chang.seok.bae%40intel.com Cc: stable@vger.kernel.org
show more ...
|
Revision tags: v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79 |
|
#
d7e5acea |
| 15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU bit is not set
The hardware XRSTOR instruction resets the PKRU register to its hardware init value (namely 0) if the PKRU bit is not set in t
x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU bit is not set
The hardware XRSTOR instruction resets the PKRU register to its hardware init value (namely 0) if the PKRU bit is not set in the xfeatures mask. Emulating that here restores the pre-5.14 behavior for PTRACE_SET_REGSET with NT_X86_XSTATE, and makes sigreturn (which still uses XRSTOR) and ptrace behave identically. KVM has never used XRSTOR and never had this behavior, so KVM opts-out of this emulation by passing a NULL pkru pointer to copy_uabi_to_xstate().
Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()") Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-6-khuey%40kylehuey.com
show more ...
|
#
4a804c4f |
| 15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Allow PKRU to be (once again) written by ptrace.
Move KVM's PKRU handling code in fpu_copy_uabi_to_guest_fpstate() to copy_uabi_to_xstate() so that it is shared with other APIs that write t
x86/fpu: Allow PKRU to be (once again) written by ptrace.
Move KVM's PKRU handling code in fpu_copy_uabi_to_guest_fpstate() to copy_uabi_to_xstate() so that it is shared with other APIs that write the XSTATE such as PTRACE_SETREGSET with NT_X86_XSTATE.
This restores the pre-5.14 behavior of ptrace. The regression can be seen by running gdb and executing `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the write to the PKRU register (which gdb performs through ptrace) is ignored.
[ dhansen: removed stable@ tag for now. The ABI was broken for long enough that this is not urgent material. Let's let it stew in tip for a few weeks before it's submitted to stable because there are so many ABIs potentially affected. ]
Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()") Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-5-khuey%40kylehuey.com
show more ...
|
#
2c87767c |
| 15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Add a pkru argument to copy_uabi_to_xstate()
In preparation for moving PKRU handling code out of fpu_copy_uabi_to_guest_fpstate() and into copy_uabi_to_xstate(), add an argument that copy_u
x86/fpu: Add a pkru argument to copy_uabi_to_xstate()
In preparation for moving PKRU handling code out of fpu_copy_uabi_to_guest_fpstate() and into copy_uabi_to_xstate(), add an argument that copy_uabi_from_kernel_to_xstate() can use to pass the canonical location of the PKRU value. For copy_sigframe_from_user_to_xstate() the kernel will actually restore the PKRU value from the fpstate, but pass in the thread_struct's pkru location anyways for consistency.
Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-4-khuey%40kylehuey.com
show more ...
|
#
1c813ce0 |
| 15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate().
Both KVM (through KVM_SET_XSTATE) and ptrace (through PTRACE_SETREGSET with NT_X86_XSTATE) ultimately call copy_uabi_from_kernel_to
x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate().
Both KVM (through KVM_SET_XSTATE) and ptrace (through PTRACE_SETREGSET with NT_X86_XSTATE) ultimately call copy_uabi_from_kernel_to_xstate(), but the canonical locations for the current PKRU value for KVM guests and processes in a ptrace stop are different (in the kvm_vcpu_arch and the thread_state structs respectively).
In preparation for eventually handling PKRU in copy_uabi_to_xstate, pass in a pointer to the PKRU location.
Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-3-khuey%40kylehuey.com
show more ...
|
#
6a877d24 |
| 15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate()
This will allow copy_sigframe_from_user_to_xstate() to grab the address of thread_struct's pkru value in a later patch.
Signed-off-
x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate()
This will allow copy_sigframe_from_user_to_xstate() to grab the address of thread_struct's pkru value in a later patch.
Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-2-khuey%40kylehuey.com
show more ...
|
Revision tags: v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60 |
|
#
48280042 |
| 10-Aug-2022 |
Andrew Cooper <andrew.cooper3@citrix.com> |
x86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics
"XSAVE consistency problem" has been reported under Xen, but that's the extent of my divination skills.
Modify XSTATE_WARN_ON() to
x86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics
"XSAVE consistency problem" has been reported under Xen, but that's the extent of my divination skills.
Modify XSTATE_WARN_ON() to force the caller to provide relevant diagnostic information, and modify each caller suitably.
For check_xstate_against_struct(), this removes a double WARN() where one will do perfectly fine.
CC stable as this has been wonky debugging for 7 years and it is good to have there too.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220810221909.12768-1-andrew.cooper3@citrix.com
show more ...
|
#
471f0aa7 |
| 21-Oct-2022 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly
When an extended state component is not present in fpstate, but in init state, the function copies from init_fpstate via copy_feature
x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly
When an extended state component is not present in fpstate, but in init state, the function copies from init_fpstate via copy_feature().
But, dynamic states are not present in init_fpstate because of all-zeros init states. Then retrieving them from init_fpstate will explode like this:
BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:memcpy_erms+0x6/0x10 ? __copy_xstate_to_uabi_buf+0x381/0x870 fpu_copy_guest_fpstate_to_uabi+0x28/0x80 kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm] ? __this_cpu_preempt_check+0x13/0x20 ? vmx_vcpu_put+0x2e/0x260 [kvm_intel] kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? __fget_light+0xd4/0x130 __x64_sys_ioctl+0xe3/0x910 ? debug_smp_processor_id+0x17/0x20 ? fpregs_assert_state_consistent+0x27/0x50 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Adjust the 'mask' to zero out the userspace buffer for the features that are not available both from fpstate and from init_fpstate.
The dynamic features depend on the compacted XSAVE format. Ensure it is enabled before reading XCOMP_BV in init_fpstate.
Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: Yuan Yao <yuan.yao@intel.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/ Link: https://lkml.kernel.org/r/20221021185844.13472-1-chang.seok.bae@intel.com
show more ...
|
#
a401f45e |
| 24-Aug-2022 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu: Exclude dynamic states from init_fpstate
== Background ==
The XSTATE init code initializes all enabled and supported components. Then, the init states are saved in the init_fpstate buffer
x86/fpu: Exclude dynamic states from init_fpstate
== Background ==
The XSTATE init code initializes all enabled and supported components. Then, the init states are saved in the init_fpstate buffer that is statically allocated in about one page.
The AMX TILE_DATA state is large (8KB) but its init state is zero. And the feature comes only with the compacted format with these established dependencies: AMX->XFD->XSAVES. So this state is excludable from init_fpstate.
== Problem ==
But the buffer is formatted to include that large state. Then, this can be the cause of a noisy splat like the below.
This came from XRSTORS for the task with init_fpstate in its XSAVE buffer. It is reproducible on AMX systems when the running kernel is built with CONFIG_DEBUG_PAGEALLOC=y and CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y:
Bad FPU state detected at restore_fpregs_from_fpstate+0x57/0xd0, reinitializing FPU registers. ... RIP: 0010:restore_fpregs_from_fpstate+0x57/0xd0 ? restore_fpregs_from_fpstate+0x45/0xd0 switch_fpu_return+0x4e/0xe0 exit_to_user_mode_prepare+0x17b/0x1b0 syscall_exit_to_user_mode+0x29/0x40 do_syscall_64+0x67/0x80 ? do_syscall_64+0x67/0x80 ? exc_page_fault+0x86/0x180 entry_SYSCALL_64_after_hwframe+0x63/0xcd
== Solution ==
Adjust init_fpstate to exclude dynamic states. XRSTORS from init_fpstate still initializes those states when their bits are set in the requested-feature bitmap.
Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: Lin X Wang <lin.x.wang@intel.com> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Lin X Wang <lin.x.wang@intel.com> Link: https://lore.kernel.org/r/20220824191223.1248-4-chang.seok.bae@intel.com
show more ...
|
#
d3e021ad |
| 24-Aug-2022 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu: Fix the init_fpstate size check with the actual size
The init_fpstate buffer is statically allocated. Thus, the sanity test was established to check whether the pre-allocated buffer is enou
x86/fpu: Fix the init_fpstate size check with the actual size
The init_fpstate buffer is statically allocated. Thus, the sanity test was established to check whether the pre-allocated buffer is enough for the calculated size or not.
The currently measured size is not strictly relevant. Fix to validate the calculated init_fpstate size with the pre-allocated area.
Also, replace the sanity check function with open code for clarity. The abstraction itself and the function naming do not tend to represent simply what it does.
Fixes: 2ae996e0c1a3 ("x86/fpu: Calculate the default sizes independently") Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220824191223.1248-3-chang.seok.bae@intel.com
show more ...
|
#
c32d7cab |
| 24-Aug-2022 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu: Configure init_fpstate attributes orderly
The init_fpstate setup code is spread out and out of order. The init image is recorded before its scoped features and the buffer size are determine
x86/fpu: Configure init_fpstate attributes orderly
The init_fpstate setup code is spread out and out of order. The init image is recorded before its scoped features and the buffer size are determined.
Determine the scope of init_fpstate components and its size before recording the init state. Also move the relevant code together.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: neelnatu@google.com Link: https://lore.kernel.org/r/20220824191223.1248-2-chang.seok.bae@intel.com
show more ...
|
Revision tags: v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40 |
|
#
f5c0b4f3 |
| 12-May-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/prctl: Remove pointless task argument
The functions invoked via do_arch_prctl_common() can only operate on the current task and none of these function uses the task argument.
Signed-off-by: Tho
x86/prctl: Remove pointless task argument
The functions invoked via do_arch_prctl_common() can only operate on the current task and none of these function uses the task argument.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/87lev7vtxj.ffs@tglx
show more ...
|
Revision tags: v5.15.39, v5.15.38 |
|
#
b91c0922 |
| 02-May-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Cleanup variable shadowing
Addresses: warning: Local variable 'mask' shadows outer variable
Remove extra variable declaration and switch the bit mask assignment to use BIT_ULL() while at i
x86/fpu: Cleanup variable shadowing
Addresses: warning: Local variable 'mask' shadows outer variable
Remove extra variable declaration and switch the bit mask assignment to use BIT_ULL() while at it.
Fixes: 522e92743b35 ("x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/202204262032.jFYKit5j-lkp@intel.com
show more ...
|
Revision tags: v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
8ad7e8f6 |
| 04-Apr-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xsave: Support XSAVEC in the kernel
XSAVEC is the user space counterpart of XSAVES which cannot save supervisor state. In virtualization scenarios the hypervisor does not expose XSAVES but X
x86/fpu/xsave: Support XSAVEC in the kernel
XSAVEC is the user space counterpart of XSAVES which cannot save supervisor state. In virtualization scenarios the hypervisor does not expose XSAVES but XSAVEC to the guest, though the kernel does not make use of it.
That's unfortunate because XSAVEC uses the compacted format of saving the XSTATE. This is more efficient in terms of storage space vs. XSAVE[OPT] as it does not create holes for XSTATE components which are not supported or enabled by the kernel but are available in hardware. There is room for further optimizations when XSAVEC/S and XGETBV1 are supported.
In order to support XSAVEC:
- Define the XSAVEC ASM macro as it's not yet supported by the required minimal toolchain.
- Create a software defined X86_FEATURE_XCOMPACTED to select the compacted XSTATE buffer format for both XSAVEC and XSAVES.
- Make XSAVEC an option in the 'XSAVE' ASM alternatives
Requested-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220404104820.598704095@linutronix.de
show more ...
|
#
d6d6d50f |
| 28-Mar-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xstate: Consolidate size calculations
Use the offset calculation to do the size calculation which avoids yet another series of CPUID instructions for each invocation.
[ Fix the FP/SSE onl
x86/fpu/xstate: Consolidate size calculations
Use the offset calculation to do the size calculation which avoids yet another series of CPUID instructions for each invocation.
[ Fix the FP/SSE only case which missed to take the xstate header into account, as Reported-by: kernel test robot <oliver.sang@intel.com> ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/87o81pgbp2.ffs@tglx
show more ...
|
Revision tags: v5.15.32 |
|
#
781c64bf |
| 24-Mar-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xstate: Handle supervisor states in XSTATE permissions
The size calculation in __xstate_request_perm() fails to take supervisor states into account because the permission bitmap is only rele
x86/fpu/xstate: Handle supervisor states in XSTATE permissions
The size calculation in __xstate_request_perm() fails to take supervisor states into account because the permission bitmap is only relevant for user states.
Up to 5.17 this does not matter because there are no supervisor states supported, but the (re-)enabling of ENQCMD makes them available.
Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.681768598@linutronix.de
show more ...
|
#
7aa5128b |
| 24-Mar-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
So far the cached fixed compacted offsets worked, but with (re-)enabling of ENQCMD this does no longer work with KVM fpstate.
x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
So far the cached fixed compacted offsets worked, but with (re-)enabling of ENQCMD this does no longer work with KVM fpstate.
KVM does not have supervisor features enabled for the guest FPU, which means that KVM has then a different XSAVE area layout than the host FPU state. This in turn breaks the copy from/to UABI functions when invoked for a guest state.
Remove the pre-calculated compacted offsets and calculate the offset of each component at runtime based on the XCOMP_BV field in the XSAVE header.
The runtime overhead is not interesting because these copy from/to UABI functions are not used in critical fast paths. KVM uses them to save and restore FPU state during migration. The host uses them for ptrace and for the slow path of 32bit signal handling.
Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.627636809@linutronix.de
show more ...
|
#
6afbb58c |
| 24-Mar-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Cache xfeature flags from CPUID
In preparation for runtime calculation of XSAVE offsets cache the feature flags for each XSTATE component during feature enumeration via CPUID(0xD).
EDX has
x86/fpu: Cache xfeature flags from CPUID
In preparation for runtime calculation of XSAVE offsets cache the feature flags for each XSTATE component during feature enumeration via CPUID(0xD).
EDX has two relevant bits: 0 Supervisor component 1 Feature storage must be 64 byte aligned
These bits are currently only evaluated during init, but the alignment bit must be cached to make runtime calculation of XSAVE offsets efficient.
Cache the full EDX content and use it for the existing alignment and supervisor checks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.573656209@linutronix.de
show more ...
|