Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43 |

# 0e3223d8 | 28-Jul-2023 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
When attempting to allocate a shadow root for a !visible guest root gfn, e.g. that resides in MMIO space, load a dummy root that is backed by the zero page instead of immediately synthesizing a triple fault shutdown (using the zero page ensures any attempt to translate memory will generate a !PRESENT fault and thus VM-Exit).
Unless the vCPU is racing with memslot activity, KVM will inject a page fault due to not finding a visible slot in FNAME(walk_addr_generic), i.e. the end result is mostly the same, but critically KVM will inject a fault only *after* KVM runs the vCPU with the bogus root.
Waiting to inject a fault until after running the vCPU fixes a bug where KVM would bail from nested VM-Enter if L1 tried to run L2 with TDP enabled and a !visible root. Even though a bad root will *probably* lead to shutdown, (a) it's not guaranteed and (b) the CPU won't read the underlying memory until after VM-Enter succeeds. E.g. if L1 runs L2 with a VMX preemption timer value of '0', then architecturally the preemption timer VM-Exit is guaranteed to occur before the CPU executes any instruction, i.e. before the CPU needs to translate a GPA to a HPA (so long as there are no injected events with higher priority than the preemption timer).
If KVM manages to get to FNAME(fetch) with a dummy root, e.g. because userspace created a memslot between installing the dummy root and handling the page fault, simply unload the MMU to allocate a new root and retry the instruction. Use KVM_REQ_MMU_FREE_OBSOLETE_ROOTS to drop the root, as invoking kvm_mmu_free_roots() while holding mmu_lock would deadlock, and conceptually the dummy root has indeed become obsolete. The only difference versus existing usage of KVM_REQ_MMU_FREE_OBSOLETE_ROOTS is that the root has become obsolete due to memslot *creation*, not memslot deletion or movement.
Reported-by: Reima Ishii <ishiir@g.ecc.u-tokyo.ac.jp> Cc: Yu Zhang <yu.c.zhang@linux.intel.com> Link: https://lore.kernel.org/r/20230729005200.1057358-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
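A minimal sketch of the zero-page trick described above; the helper names are assumed for illustration and are not quoted from the patch (my_zero_pfn() and is_zero_pfn() are real kernel helpers):

  /* Sketch: back an unusable guest root with the shared zero page. */
  static hpa_t kvm_mmu_get_dummy_root(void)
  {
          /* The zero page guarantees all-zero, i.e. !PRESENT, PTEs. */
          return (hpa_t)my_zero_pfn(0) << PAGE_SHIFT;
  }

  static bool kvm_mmu_is_dummy_root(hpa_t shadow_page)
  {
          return is_zero_pfn(shadow_page >> PAGE_SHIFT);
  }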

# b5b359ac | 28-Jul-2023 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Disallow guest from using !visible slots for page tables
Explicitly inject a page fault if the guest attempts to use a !visible gfn as a page table. kvm_vcpu_gfn_to_hva_prot() will naturally handle the case where there is no memslot, but doesn't catch the scenario where the gfn points at a KVM-internal memslot.
Letting the guest backdoor its way into accessing KVM-internal memslots isn't dangerous on its own, e.g. at worst the guest can crash itself, but disallowing the behavior will simplify fixing how KVM handles !visible guest root gfns (immediately synthesizing a triple fault when loading the root is architecturally wrong).
Link: https://lore.kernel.org/r/20230729005200.1057358-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
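A condensed sketch of where such a visibility check can sit in the guest-PTE read path; kvm_is_visible_memslot() is the real helper that rejects both missing and KVM-internal memslots, while the function shape and error handling here are assumptions:

  static int walk_read_guest_pte(struct kvm_vcpu *vcpu, gfn_t table_gfn,
                                 u64 *pte, bool *writable)
  {
          struct kvm_memory_slot *slot;
          unsigned long hva;

          slot = kvm_vcpu_gfn_to_memslot(vcpu, table_gfn);
          /* Rejects "no memslot" and KVM-internal memslots alike. */
          if (!kvm_is_visible_memslot(slot))
                  return -EFAULT; /* caller injects a page fault */

          hva = gfn_to_hva_memslot_prot(slot, table_gfn, writable);
          if (kvm_is_error_hva(hva))
                  return -EFAULT;

          return __copy_from_user(pte, (void __user *)hva, sizeof(*pte)) ?
                 -EFAULT : 0;
  }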

# 72e2fb24 | 28-Jul-2023 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Bug the VM if a vCPU ends up in long mode without PAE enabled
Promote the ASSERT(), which is quite dead code in KVM, into a KVM_BUG_ON() for KVM's sanity check that CR4.PAE=1 if the vCPU is in long mode when performing a walk of guest page tables. The sanity check is quite cheap since neither EFER nor CR4.PAE requires a VMREAD, especially relative to the cost of walking the guest page tables.
More importantly, the sanity check would have prevented the true badness fixed by commit 112e66017bff ("KVM: nVMX: add missing consistency checks for CR0 and CR4"). The missed consistency check resulted in some versions of KVM corrupting the on-stack guest_walker structure due to KVM thinking there are 4/5 levels of page tables, but wiring up the MMU hooks to point at the paging32 implementation, which only allocates space for two levels of page tables in "struct guest_walker32".
Queue a page fault for injection if the assertion fails, as both callers, FNAME(gva_to_gpa) and FNAME(walk_addr_generic), assume that walker.fault contains sane info on a walk failure. E.g. not populating the fault info could result in KVM consuming and/or exposing uninitialized stack data before the vCPU is kicked out to userspace, which doesn't happen until KVM checks for KVM_REQ_VM_DEAD on the next enter.
Move the check below the initialization of "pte_access" so that the aforementioned to-be-injected page fault doesn't consume uninitialized stack data. The information _shouldn't_ reach the guest or userspace, but there's zero downside to being paranoid in this case.
Link: https://lore.kernel.org/r/20230729004722.1056172-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
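The promotion itself is small; a sketch of the shape described above, with the surrounding walker code elided (KVM_BUG_ON() evaluates to the condition, so it can gate the error path directly):

  if (KVM_BUG_ON(is_long_mode(vcpu) && !is_pae(vcpu), vcpu->kvm))
          goto error;     /* the error path fills walker->fault first */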

# 20ba462d | 28-Jul-2023 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Convert "runtime" WARN_ON() assertions to WARN_ON_ONCE()
Convert all "runtime" assertions, i.e. assertions that can be triggered while running vCPUs, from WARN_ON() to WARN_ON_ONCE(). Every WARN in the MMU that is tied to running vCPUs, i.e. not contained to loading and initializing KVM, is likely to fire _a lot_ when it does trigger. E.g. if KVM ends up with a bug that causes a root to be invalidated before the page fault handler is invoked, pretty much _every_ page fault VM-Exit triggers the WARN.
If a WARN is triggered frequently, the resulting spam usually causes a lot of damage of its own, e.g. consumes resources to log the WARN and pollutes the kernel log, often to the point where other useful information can be lost. In many cases, the damage caused by the spam is actually worse than the bug itself, e.g. KVM can almost always recover from an unexpectedly invalid root.
On the flip side, warning every time is rarely helpful for debug and triage, i.e. a single splat is usually sufficient to point a debugger in the right direction, and automated testing, e.g. syzkaller, typically runs with panic_on_warn=1, i.e. will never get past the first WARN anyways.
Lastly, when an assertion fails multiple times, the stack traces in KVM are almost always identical, i.e. the full splat only needs to be captured once. And _if_ there is value in capturing information about the failed assert, a ratelimited printk() is sufficient and less likely to rack up a large amount of collateral damage.
Link: https://lore.kernel.org/r/20230729004722.1056172-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
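For illustration only (the condition and helper are made up), the conversion and the ratelimited alternative mentioned above:

  WARN_ON(spte_is_bogus(spte));           /* before: fires on every hit */
  WARN_ON_ONCE(spte_is_bogus(spte));      /* after: one splat, then quiet */

  /* If per-occurrence info truly matters, ratelimit a printk instead: */
  if (unlikely(spte_is_bogus(spte)))
          pr_warn_ratelimited("KVM: bogus SPTE %llx\n", spte);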

# a98b8894 | 28-Jul-2023 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Delete pgprintk() and all its usage
Delete KVM's pgprintk() and all its usage, as the code is very prone to bitrot due to being buried behind MMU_DEBUG, and the functionality has been rendered almost entirely obsolete by the tracepoints KVM has gained over the years. And for the situations where the information provided by KVM's tracepoints is insufficient, pgprintk() rarely fills in the gaps, and is almost always far too noisy, i.e. developers end up implementing custom prints anyways.
Link: https://lore.kernel.org/r/20230729004722.1056172-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Revision tags: v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21 |

# 2fdcc1b3 | 21-Mar-2023 | Paolo Bonzini <pbonzini@redhat.com>
KVM: x86/mmu: Avoid indirect call for get_cr3
Most of the time, calls to get_guest_pgd result in calling kvm_read_cr3 (the only exception is nested TDP). Hardcode the default instead of using the get_cr3 function pointer, avoiding a retpoline when retpolines are enabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20230322013731.102955-2-minipli@grsecurity.net Signed-off-by: Sean Christopherson <seanjc@google.com>
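The devirtualized helper amounts to a guarded direct call; a sketch that is close to the patch's intent, though not quoted verbatim:

  static inline unsigned long kvm_mmu_get_guest_pgd(struct kvm_vcpu *vcpu,
                                                    struct kvm_mmu *mmu)
  {
          if (IS_ENABLED(CONFIG_RETPOLINE) && mmu->get_guest_pgd == get_guest_cr3)
                  return kvm_read_cr3(vcpu);      /* direct call */

          return mmu->get_guest_pgd(vcpu);        /* nested TDP */
  }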
Revision tags: v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2 |

# 19ace7d6 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Skip calling mmu->sync_spte() when the spte is 0
Sync the spte only when the spte is set and avoid the indirect branch.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216235321.735214-5-jiangshanlai@gmail.com [sean: add wrapper instead of open coding each check] Signed-off-by: Sean Christopherson <seanjc@google.com>
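The wrapper noted in Sean's tag boils down to a not-present check ahead of the indirect call; a sketch:

  static int kvm_sync_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int i)
  {
          /* A zero SPTE was never mapped, so there is nothing to sync. */
          if (!sp->spt[i])
                  return 0;

          return vcpu->arch.mmu->sync_spte(vcpu, sp, i);
  }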

# 91ca7672 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
kvm: x86/mmu: Remove @no_dirty_log from FNAME(prefetch_gpte)
FNAME(prefetch_gpte) is always called with @no_dirty_log=true.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216235321.735214-4-jiangshanlai@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>

# 9fd4a4e3 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Remove FNAME(invlpg) and use FNAME(sync_spte) to update vTLB instead.
In a hardware TLB, invalidating TLB entries means the translations are removed from the TLB.
In KVM's shadowed vTLB, the translations (combinations of shadow paging and the hardware TLB) are generally maintained as long as they remain "clean" when the TLB of an address space (i.e. a PCID or all) is flushed, with the help of write-protection, sp->unsync, and kvm_sync_page(), where "clean" in this context means that no updates to KVM's SPTEs are needed.
However, FNAME(invlpg) always zaps/removes the vTLB entry if the shadow page is unsync, and thus triggers a remote flush even if the original vTLB entry is clean, i.e. is usable as-is.
Besides this, FNAME(invlpg) is largely a duplicate implementation of FNAME(sync_spte) for invalidating a vTLB entry.
To address both issues, reuse FNAME(sync_spte) to share the code and slightly modify the semantics, i.e. keep the vTLB entry if it's "clean" and avoid remote TLB flush.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216235321.735214-3-jiangshanlai@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
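A condensed sketch of the unified invalidation path; the control flow is simplified, and the return-value convention (negative means the gpte changed and the SPTE must be zapped, nonzero means a flush is needed) is inferred from the changelog:

  sp = sptep_to_sp(iterator.sptep);
  if (sp->unsync) {
          int ret = mmu->sync_spte(vcpu, sp, spte_index(iterator.sptep));

          if (ret < 0)    /* gpte changed: the entry is not reusable */
                  mmu_page_zap_pte(vcpu->kvm, sp, iterator.sptep, NULL);
          if (ret)        /* flush only if the SPTE was modified */
                  kvm_flush_remote_tlbs(vcpu->kvm);
  }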

# ed335278 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Allow the roots to be invalid in FNAME(invlpg)
Don't assume the current root is valid; check it instead, and remove the WARN().
Also move the code to check if the root is valid into FNAME(invlpg) to simplify the code.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216235321.735214-2-jiangshanlai@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
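The check is the standard VALID_PAGE() guard; a sketch:

  /* An invalid root simply has nothing to invalidate, bail quietly. */
  if (!VALID_PAGE(root_hpa))
          return;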

# e6722d92 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Reduce the update to the spte in FNAME(sync_spte)
Sometimes when the guest updates its pagetable, it only adds new gptes without changing any existing ones, so there is no point in updating the sptes for those existing gptes.
Also, when the sptes for these unchanged gptes are updated, the A/D bits are also removed since make_spte() is called with prefetch=true, which might result in unneeded TLB flushing.
Just do nothing if the gpte's permissions are unchanged.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216154115.710033-7-jiangshanlai@gmail.com [sean: expand comment to call out A/D bits] Signed-off-by: Sean Christopherson <seanjc@google.com>
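A sketch of the resulting no-op fast path; the accessors exist in KVM, but the surrounding FNAME(sync_spte) context is condensed:

  /*
   * Keep the SPTE (and its A/D bits) if neither the gfn nor the
   * permissions changed; make_spte(..., prefetch=true) would clear
   * the A/D bits and could force an unnecessary TLB flush.
   */
  if (pte_access == kvm_mmu_page_get_access(sp, i) &&
      gfn == kvm_mmu_page_get_gfn(sp, i))
          return 0;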

# c3c6c9fc | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Move the code out of FNAME(sync_page)'s loop body into mmu.c
Rename mmu->sync_page to mmu->sync_spte and move the code out of FNAME(sync_page)'s loop body into mmu.c.
No functional change intended.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216154115.710033-6-jiangshanlai@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>

# 90e44470 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Move the check in FNAME(sync_page) as kvm_sync_page_check()
Prepare to check mmu->sync_page pointer before calling it.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216154115.710033-3-jiangshanlai@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>

# 753b43c9 | 16-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Use 64-bit address to invalidate to fix a subtle bug
FNAME(invlpg)() and kvm_mmu_invalidate_gva() take a gva_t, i.e. unsigned long, as the type of the address to invalidate. On 32-bit kernels, the upper 32 bits of an L2 GPA will get dropped when the GPA is invalidated in the shadowed nested TDP MMU.
Convert it to u64 to fix the problem.
Reported-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230216154115.710033-2-jiangshanlai@gmail.com [sean: tweak changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
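A hypothetical illustration of the truncation (values made up):

  u64 l2_gpa = 0x100001000ULL;    /* an L2 GPA above 4GiB */
  gva_t addr = l2_gpa;            /* gva_t == unsigned long == 32 bits
                                   * on a 32-bit kernel, so addr becomes
                                   * 0x00001000: the wrong page */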
Revision tags: v6.1.12, v6.1.11, v6.1.10 |

# 9a967700 | 02-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Remove FNAME(is_self_change_mapping)
Drop FNAME(is_self_change_mapping) and instead rely on kvm_mmu_hugepage_adjust() to adjust the hugepage accordingly. Prior to commit 4cd071d13c5c ("KVM: x86/mmu: Move calls to thp_adjust() down a level"), the hugepage adjustment was done before allocating new shadow pages, i.e. failed to restrict the hugepage sizes if a new shadow page resulted in account_shadowed() changing the disallowed hugepage tracking.
Removing FNAME(is_self_change_mapping) fixes a bug reported by Huang Hang where KVM unnecessarily forces a 4KiB page. FNAME(is_self_change_mapping) has a defect in that it blindly disables _all_ hugepage mappings rather than trying to reduce the size of the hugepage. If the guest is writing to a 1GiB page and the 1GiB is self-referential but a 2MiB page is not, then KVM can and should create a 2MiB mapping.
Add a comment above the call to kvm_mmu_hugepage_adjust() to call out the new dependency on adjusting the hugepage size after walking indirect PTEs.
Reported-by: Huang Hang <hhuang@linux.alibaba.com> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20221213125538.81209-1-jiangshanlai@gmail.com [sean: rework changelog after separating out the emulator change] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230202182817.407394-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
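A condensed sketch of the resulting ordering in the page fault path; not verbatim, the point is the comment:

  r = FNAME(walk_addr)(&walker, vcpu, fault->addr, fault->error_code);
  /* ... pfn faulted in, mmu_lock taken, etc. ... */

  /*
   * Adjust the hugepage size only after walking the indirect PTEs,
   * as the walk itself can change the disallowed-hugepage tracking
   * via account_shadowed().
   */
  kvm_mmu_hugepage_adjust(vcpu, fault);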

# 39fda5d8 | 02-Feb-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
KVM: x86/mmu: Detect write #PF to shadow pages during FNAME(fetch) walk
Move the detection of write #PF to shadow pages, i.e. a fault on a write to a page table that is being shadowed by KVM and that is used to translate the write itself, from FNAME(is_self_change_mapping) to FNAME(fetch). There is no need to detect the self-referential write before kvm_faultin_pfn() as KVM does not consume EMULTYPE_WRITE_PF_TO_SP for accesses that resolve to "error or no-slot" pfns, i.e. KVM doesn't allow retrying MMIO accesses or writes to read-only memslots.
Detecting the EMULTYPE_WRITE_PF_TO_SP scenario in FNAME(fetch) will allow dropping FNAME(is_self_change_mapping) entirely, as the hugepage interaction can be deferred to kvm_mmu_hugepage_adjust().
Cc: Huang Hang <hhuang@linux.alibaba.com> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20221213125538.81209-1-jiangshanlai@gmail.com [sean: split to separate patch, write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230202182817.407394-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
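A sketch of the detection, inferred from the changelog (the field name is an assumption): while FNAME(fetch) walks the to-be-shadowed levels, flag a write that targets one of its own translating page tables:

  if (fault->write && fault->gfn == table_gfn)
          fault->write_fault_to_shadow_pgtable = true;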

# 258d985f | 02-Feb-2023 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Use EMULTYPE flag to track write #PFs to shadow pages
Use a new EMULTYPE flag, EMULTYPE_WRITE_PF_TO_SP, to track page faults on self-changing writes to shadowed page tables instead of propagating that information to the emulator via a semi-persistent vCPU flag. Using a flag in "struct kvm_vcpu_arch" is confusing, especially as implemented, as it's not at all obvious that clearing the flag only when emulation actually occurs is correct.
E.g. if KVM sets the flag and then retries the fault without ever getting to the emulator, the flag will be left set for future calls into the emulator. But because the flag is consumed if and only if both EMULTYPE_PF and EMULTYPE_ALLOW_RETRY_PF are set, and because EMULTYPE_ALLOW_RETRY_PF is deliberately not set for direct MMUs, emulated MMIO, or while L2 is active, KVM avoids false positives on a stale flag since FNAME(page_fault) is guaranteed to be run and refresh the flag before it's ultimately consumed by the tail end of reexecute_instruction().
Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230202182817.407394-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
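A sketch of how the flag flows into the emulator; the wrapper and the bit value are assumptions, while EMULTYPE_PF, EMULTYPE_ALLOW_RETRY_PF, and x86_emulate_instruction() are real:

  #define EMULTYPE_WRITE_PF_TO_SP         (1 << 8)        /* bit assumed */

  static int emulate_write_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
                              bool write_pf_to_sp)
  {
          int emulation_type = EMULTYPE_PF | EMULTYPE_ALLOW_RETRY_PF;

          /* Derived from the current fault, so it can never go stale. */
          if (write_pf_to_sp)
                  emulation_type |= EMULTYPE_WRITE_PF_TO_SP;

          return x86_emulate_instruction(vcpu, cr2_or_gpa, emulation_type,
                                         NULL, 0);
  }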
Revision tags: v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1 |

# 1b2dc736 | 10-Oct-2022 | Hou Wenlong <houwenlong.hwl@antgroup.com>
KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range
When a spte is dropped, the start gfn of the TLB flush should be the gfn of the spte, not the base gfn of the SP that contains the spte. Also introduce a helper function to do range-based flushing when a spte is dropped, which helps prevent future buggy use of kvm_flush_remote_tlbs_with_address() in such cases.
Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.") Suggested-by: David Matlack <dmatlack@google.com> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Link: https://lore.kernel.org/r/72ac2169a261976f00c1703e88cda676dfb960f5.1665214747.git.houwenlong.hwl@antgroup.com Signed-off-by: Sean Christopherson <seanjc@google.com>
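A sketch of the described helper, close to the changelog's intent (kvm_flush_remote_tlbs_with_address() and the accessors are real):

  static void kvm_flush_remote_tlbs_sptep(struct kvm *kvm, u64 *sptep)
  {
          struct kvm_mmu_page *sp = sptep_to_sp(sptep);
          gfn_t gfn = kvm_mmu_page_get_gfn(sp, spte_index(sptep));

          /* Flush the gfn the SPTE actually maps, not the SP's base gfn. */
          kvm_flush_remote_tlbs_with_address(kvm, gfn,
                                             KVM_PAGES_PER_HPAGE(sp->role.level));
  }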

# c667a3ba | 10-Oct-2022 | Hou Wenlong <houwenlong.hwl@antgroup.com>
KVM: x86/mmu: Move round_gfn_for_level() helper into mmu_internal.h
Rounding down a GFN to a huge page size is a common pattern throughout KVM, so move the round_gfn_for_level() helper from tdp_iter.c to mmu_internal.h for common usage. Also rename it to gfn_round_for_level() to use the gfn_* prefix, and clean up the other call sites.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Link: https://lore.kernel.org/r/415c64782f27444898db650e21cf28eeb6441dfa.1665214747.git.houwenlong.hwl@antgroup.com Signed-off-by: Sean Christopherson <seanjc@google.com>
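The moved helper is a one-line mask; a sketch matching the description:

  /* Round a gfn down to the base gfn of its (huge)page at @level. */
  static inline gfn_t gfn_round_for_level(gfn_t gfn, int level)
  {
          return gfn & -KVM_PAGES_PER_HPAGE(level);
  }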

# 9e3fbdfd | 05-Jan-2023 | Lai Jiangshan <jiangshan.ljs@antgroup.com>
kvm: x86/mmu: Don't clear write flooding for direct SP
Although it does no harm, there is no point in clearing write flooding for a direct SP.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Link: https://lore.kernel.org/r/20230105100310.6700-1-jiangshanlai@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
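A sketch of the intent; whether the role check lives here or in the caller is an assumption:

  static void clear_sp_write_flooding_count(u64 *spte)
  {
          struct kvm_mmu_page *sp = sptep_to_sp(spte);

          /* Write-flood tracking only matters for SPs that shadow gptes. */
          if (!sp->role.direct)
                  __clear_sp_write_flooding_count(sp);
  }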
Revision tags: v5.15.72, v6.0, v5.15.71, v5.15.70 |

# 354c908c | 21-Sep-2022 | David Matlack <dmatlack@google.com>
KVM: x86/mmu: Handle no-slot faults in kvm_faultin_pfn()
Handle faults on GFNs that do not have a backing memslot in kvm_faultin_pfn() and drop handle_abnormal_pfn(). This eliminates duplicate code in the various page fault handlers.
Opportunistically tweak the comment about handling gfn > host.MAXPHYADDR to reflect that the effect of returning RET_PF_EMULATE at that point is to avoid creating an MMIO SPTE for such GFNs.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220921173546.2674386-7-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
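A condensed sketch of the consolidated no-slot handling (details such as MMIO-info caching are elided; KVM_PFN_NOSLOT, kvm_mmu_max_gfn(), enable_mmio_caching, and the RET_PF_* codes are real):

  static int kvm_handle_noslot_fault(struct kvm_vcpu *vcpu,
                                     struct kvm_page_fault *fault)
  {
          fault->pfn = KVM_PFN_NOSLOT;
          fault->map_writable = false;

          /* Installing an MMIO SPTE with caching disabled is a nop. */
          if (unlikely(!enable_mmio_caching))
                  return RET_PF_EMULATE;

          /*
           * Don't create an MMIO SPTE for gfns beyond host.MAXPHYADDR;
           * returning RET_PF_EMULATE sidesteps caching such gfns.
           */
          if (unlikely(fault->gfn > kvm_mmu_max_gfn()))
                  return RET_PF_EMULATE;

          return RET_PF_CONTINUE;
  }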

# ba6e3fe2 | 21-Sep-2022 | David Matlack <dmatlack@google.com>
KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()
Grab mmu_invalidate_seq in kvm_faultin_pfn() and stash it in struct kvm_page_fault. This eliminates duplicate code and reduces the number of parameters needed for is_page_fault_stale().
Preemptively split out __kvm_faultin_pfn() to a separate function for use in subsequent commits.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220921173546.2674386-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
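A sketch of the stash-then-resolve ordering; fault->mmu_seq is the new field, and the barrier mirrors the long-standing mmu_notifier pattern:

  /* Snapshot the invalidation sequence before resolving the pfn. */
  fault->mmu_seq = vcpu->kvm->mmu_invalidate_seq;
  smp_rmb();      /* pairs with the invalidation side's write barrier */

  r = __kvm_faultin_pfn(vcpu, fault);

Later, is_page_fault_stale() can compare fault->mmu_seq against the current sequence instead of taking it as a parameter.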

# f09948ec | 21-Sep-2022 | David Matlack <dmatlack@google.com>
KVM: x86/mmu: Handle no-slot faults in kvm_faultin_pfn()
Handle faults on GFNs that do not have a backing memslot in kvm_faultin_pfn() and drop handle_abnormal_pfn(). This eliminates duplicate code in the various page fault handlers.
Opportunistically tweak the comment about handling gfn > host.MAXPHYADDR to reflect that the effect of returning RET_PF_EMULATE at that point is to avoid creating an MMIO SPTE for such GFNs.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220921173546.2674386-7-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

# 90c54c19 | 21-Sep-2022 | David Matlack <dmatlack@google.com>
KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()
Grab mmu_invalidate_seq in kvm_faultin_pfn() and stash it in struct kvm_page_fault. This eliminates duplicate code and reduces the number of parameters needed for is_page_fault_stale().
Preemptively split out __kvm_faultin_pfn() to a separate function for use in subsequent commits.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220921173546.2674386-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

# 55c510e2 | 19-Oct-2022 | Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Rename NX huge pages fields/functions for consistency
Rename most of the variables/functions involved in the NX huge page mitigation to provide consistency, e.g. lpage vs huge page, and NX huge vs huge NX, and also to provide clarity, e.g. to make it obvious the flag applies only to the NX huge page mitigation, not to any condition that prevents creating a huge page.
Add a comment explaining what the newly named "possible_nx_huge_pages" tracks.
Leave the nx_lpage_splits stat alone as the name is ABI and thus set in stone.
Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20221019165618.927057-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>