#
405329fc |
| 16-Dec-2021 |
Michael Roth <michael.roth@amd.com> |
KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests
Normally guests will set up CR3 themselves, but some guests, such as kselftests, and potentially CONFIG_PVH guests, rely on being booted
KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests
Normally guests will set up CR3 themselves, but some guests, such as kselftests, and potentially CONFIG_PVH guests, rely on being booted with paging enabled and CR3 initialized to a pre-allocated page table.
Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this is too late, since it will have switched over to using the VMSA page prior to that point, with the VMSA CR3 copied from the VMCB initial CR3 value: 0.
Address this by sync'ing the CR3 value into the VMCB save area immediately when KVM_SET_SREGS* is issued so it will find it's way into the initial VMSA.
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-Id: <20211216171358.61140-10-michael.roth@amd.com> [Remove vmx_post_set_cr3; add a remark about kvm_set_cr3 not calling the new hook. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
c5063551 |
| 09-Dec-2021 |
Marc Orr <marcorr@google.com> |
KVM: x86: Always set kvm_run->if_flag
The kvm_run struct's if_flag is a part of the userspace/kernel API. The SEV-ES patches failed to set this flag because it's no longer needed by QEMU (according
KVM: x86: Always set kvm_run->if_flag
The kvm_run struct's if_flag is a part of the userspace/kernel API. The SEV-ES patches failed to set this flag because it's no longer needed by QEMU (according to the comment in the source code). However, other hypervisors may make use of this flag. Therefore, set the flag for guests with encrypted registers (i.e., with guest_state_protected set).
Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES") Signed-off-by: Marc Orr <marcorr@google.com> Message-Id: <20211209155257.128747-1-marcorr@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
show more ...
|
#
39b02352 |
| 20-Apr-2022 |
Mingwei Zhang <mizhang@google.com> |
KVM: SEV: add cache flush to solve SEV cache incoherency issues
commit 683412ccf61294d727ead4a73d97397396e69a6b upstream.
Flush the CPU caches when memory is reclaimed from an SEV guest (where recl
KVM: SEV: add cache flush to solve SEV cache incoherency issues
commit 683412ccf61294d727ead4a73d97397396e69a6b upstream.
Flush the CPU caches when memory is reclaimed from an SEV guest (where reclaim also includes it being unmapped from KVM's memslots). Due to lack of coherency for SEV encrypted memory, failure to flush results in silent data corruption if userspace is malicious/broken and doesn't ensure SEV guest memory is properly pinned and unpinned.
Cache coherency is not enforced across the VM boundary in SEV (AMD APM vol.2 Section 15.34.7). Confidential cachelines, generated by confidential VM guests have to be explicitly flushed on the host side. If a memory page containing dirty confidential cachelines was released by VM and reallocated to another user, the cachelines may corrupt the new user at a later time.
KVM takes a shortcut by assuming all confidential memory remain pinned until the end of VM lifetime. Therefore, KVM does not flush cache at mmu_notifier invalidation events. Because of this incorrect assumption and the lack of cache flushing, malicous userspace can crash the host kernel: creating a malicious VM and continuously allocates/releases unpinned confidential memory pages when the VM is running.
Add cache flush operations to mmu_notifier operations to ensure that any physical memory leaving the guest VM get flushed. In particular, hook mmu_notifier_invalidate_range_start and mmu_notifier_release events and flush cache accordingly. The hook after releasing the mmu lock to avoid contention with other vCPUs.
Cc: stable@vger.kernel.org Suggested-by: Sean Christpherson <seanjc@google.com> Reported-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20220421031407.2516575-4-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [OP: adjusted KVM_X86_OP_OPTIONAL() -> KVM_X86_OP_NULL, applied kvm_arch_guest_memory_reclaimed() call in kvm_set_memslot()] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
5dea76f5 |
| 09-Dec-2021 |
Marc Orr <marcorr@google.com> |
KVM: x86: Always set kvm_run->if_flag
commit c5063551bfcae4e48fec890b7bf369598b77526b upstream.
The kvm_run struct's if_flag is a part of the userspace/kernel API. The SEV-ES patches failed to set
KVM: x86: Always set kvm_run->if_flag
commit c5063551bfcae4e48fec890b7bf369598b77526b upstream.
The kvm_run struct's if_flag is a part of the userspace/kernel API. The SEV-ES patches failed to set this flag because it's no longer needed by QEMU (according to the comment in the source code). However, other hypervisors may make use of this flag. Therefore, set the flag for guests with encrypted registers (i.e., with guest_state_protected set).
Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES") Signed-off-by: Marc Orr <marcorr@google.com> Message-Id: <20211209155257.128747-1-marcorr@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
36222b11 |
| 10-Aug-2021 |
Maxim Levitsky <mlevitsk@redhat.com> |
KVM: x86: don't disable APICv memslot when inhibited
Thanks to the former patches, it is now possible to keep the APICv memslot always enabled, and it will be invisible to the guest when it is inhib
KVM: x86: don't disable APICv memslot when inhibited
Thanks to the former patches, it is now possible to keep the APICv memslot always enabled, and it will be invisible to the guest when it is inhibited
This code is based on a suggestion from Sean Christopherson: https://lkml.org/lkml/2021/7/19/2970
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-9-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43 |
|
#
ecc513e5 |
| 09-Jun-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: x86: Drop "pre_" from enter/leave_smm() helpers
Now that .post_leave_smm() is gone, drop "pre_" from the remaining helpers. The helpers aren't invoked purely before SMI/RSM processing, e.g. bo
KVM: x86: Drop "pre_" from enter/leave_smm() helpers
Now that .post_leave_smm() is gone, drop "pre_" from the remaining helpers. The helpers aren't invoked purely before SMI/RSM processing, e.g. both helpers are invoked after state is snapshotted (from regs or SMRAM), and the RSM helper is invoked after some amount of register state has been stuffed.
Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210609185619.992058-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
1ab9287a |
| 07-Jun-2021 |
Ilias Stamatis <ilstam@amazon.com> |
KVM: X86: Add vendor callbacks for writing the TSC multiplier
Currently vmx_vcpu_load_vmcs() writes the TSC_MULTIPLIER field of the VMCS every time the VMCS is loaded. Instead of doing this, set thi
KVM: X86: Add vendor callbacks for writing the TSC multiplier
Currently vmx_vcpu_load_vmcs() writes the TSC_MULTIPLIER field of the VMCS every time the VMCS is loaded. Instead of doing this, set this field from common code on initialization and whenever the scaling ratio changes.
Additionally remove vmx->current_tsc_ratio. This field is redundant as vcpu->arch.tsc_scaling_ratio already tracks the current TSC scaling ratio. The vmx->current_tsc_ratio field is only used for avoiding unnecessary writes but it is no longer needed after removing the code from the VMCS load path.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Ilias Stamatis <ilstam@amazon.com> Message-Id: <20210607105438.16541-1-ilstam@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
Revision tags: v5.10.42, v5.10.41 |
|
#
edcfe540 |
| 26-May-2021 |
Ilias Stamatis <ilstam@amazon.com> |
KVM: X86: Move write_l1_tsc_offset() logic to common code and rename it
The write_l1_tsc_offset() callback has a misleading name. It does not set L1's TSC offset, it rather updates the current TSC o
KVM: X86: Move write_l1_tsc_offset() logic to common code and rename it
The write_l1_tsc_offset() callback has a misleading name. It does not set L1's TSC offset, it rather updates the current TSC offset which might be different if a nested guest is executing. Additionally, both the vmx and svm implementations use the same logic for calculating the current TSC before writing it to hardware.
Rename the function and move the common logic to the caller. The vmx/svm specific code now merely sets the given offset to the corresponding hardware structure.
Signed-off-by: Ilias Stamatis <ilstam@amazon.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210526184418.28881-9-ilstam@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
307a94c7 |
| 26-May-2021 |
Ilias Stamatis <ilstam@amazon.com> |
KVM: X86: Add functions for retrieving L2 TSC fields from common code
In order to implement as much of the nested TSC scaling logic as possible in common code, we need these vendor callbacks for ret
KVM: X86: Add functions for retrieving L2 TSC fields from common code
In order to implement as much of the nested TSC scaling logic as possible in common code, we need these vendor callbacks for retrieving the TSC offset and the TSC multiplier that L1 has set for L2.
Signed-off-by: Ilias Stamatis <ilstam@amazon.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210526184418.28881-7-ilstam@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
Revision tags: v5.10.40 |
|
#
57ab8794 |
| 25-May-2021 |
Marcelo Tosatti <mtosatti@redhat.com> |
KVM: x86: add start_assignment hook to kvm_x86_ops
Add a start_assignment hook to kvm_x86_ops, which is called when kvm_arch_start_assignment is done.
The hook is required to update the wakeup vect
KVM: x86: add start_assignment hook to kvm_x86_ops
Add a start_assignment hook to kvm_x86_ops, which is called when kvm_arch_start_assignment is done.
The hook is required to update the wakeup vector of a sleeping vCPU when a device is assigned to the guest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20210525134321.254128742@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
Revision tags: v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16 |
|
#
a85863c2 |
| 12-Feb-2021 |
Makarand Sonare <makarandsonare@google.com> |
KVM: VMX: Dynamically enable/disable PML based on memslot dirty logging
Currently, if enable_pml=1 PML remains enabled for the entire lifetime of the VM irrespective of whether dirty logging is enab
KVM: VMX: Dynamically enable/disable PML based on memslot dirty logging
Currently, if enable_pml=1 PML remains enabled for the entire lifetime of the VM irrespective of whether dirty logging is enable or disabled. When dirty logging is disabled, all the pages of the VM are manually marked dirty, so that PML is effectively non-operational. Setting the dirty bits is an expensive operation which can cause severe MMU lock contention in a performance sensitive path when dirty logging is disabled after a failed or canceled live migration.
Manually setting dirty bits also fails to prevent PML activity if some code path clears dirty bits, which can incur unnecessary VM-Exits.
In order to avoid this extra overhead, dynamically enable/disable PML when dirty logging gets turned on/off for the first/last memslot.
Signed-off-by: Makarand Sonare <makarandsonare@google.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-12-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
a018eba5 |
| 12-Feb-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: x86: Move MMU's PML logic to common code
Drop the facade of KVM's PML logic being vendor specific and move the bits that aren't truly VMX specific into common x86 code. The MMU logic for deali
KVM: x86: Move MMU's PML logic to common code
Drop the facade of KVM's PML logic being vendor specific and move the bits that aren't truly VMX specific into common x86 code. The MMU logic for dealing with PML is tightly coupled to the feature and to VMX's implementation, bouncing through kvm_x86_ops obfuscates the code without providing any meaningful separation of concerns or encapsulation.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
6dd03800 |
| 12-Feb-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: x86/mmu: Make dirty log size hook (PML) a value, not a function
Store the vendor-specific dirty log size in a variable, there's no need to wrap it in a function since the value is constant afte
KVM: x86/mmu: Make dirty log size hook (PML) a value, not a function
Store the vendor-specific dirty log size in a variable, there's no need to wrap it in a function since the value is constant after hardware_setup() runs.
Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
Revision tags: v5.10.15, v5.10.14 |
|
#
9af5471b |
| 14-Jan-2021 |
Jason Baron <jbaron@akamai.com> |
KVM: x86: introduce definitions to support static calls for kvm_x86_ops
Use static calls to improve kvm_x86_ops performance. Introduce the definitions that will be used by a subsequent patch to actu
KVM: x86: introduce definitions to support static calls for kvm_x86_ops
Use static calls to improve kvm_x86_ops performance. Introduce the definitions that will be used by a subsequent patch to actualize the savings. Add a new kvm-x86-ops.h header that can be used for the definition of static calls. This header is also intended to be used to simplify the defition of svm_kvm_ops and vmx_x86_ops.
Note that all functions in kvm_x86_ops are covered here except for 'pmu_ops' and 'nested ops'. I think they can be covered by static calls in a simlilar manner, but were omitted from this series to reduce scope and because I don't think they have as large of a performance impact.
Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Message-Id: <e5cc82ead7ab37b2dceb0837a514f3f8bea4f8d1.1610680941.git.jbaron@akamai.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|