487c4e9a | 16-Apr-2024 |
Claudio Imbrenda <imbrenda@linux.ibm.com> |
s390/mm: Fix clearing storage keys for huge pages
[ Upstream commit 412050af2ea39407fe43324b0be4ab641530ce88 ]
The function __storage_key_init_range() expects the end address to be the first byte o
s390/mm: Fix clearing storage keys for huge pages
[ Upstream commit 412050af2ea39407fe43324b0be4ab641530ce88 ]
The function __storage_key_init_range() expects the end address to be the first byte outside the range to be initialized. I.e. end - start should be the size of the area to be initialized.
The current code works because __storage_key_init_range() will still loop over every page in the range, but it is slower than using sske_frame().
Fixes: 3afdfca69870 ("s390/mm: Clear skeys for newly mapped huge guest pmds") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20240416114220.28489-3-imbrenda@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
7453e810 | 24-Oct-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/cmma: fix detection of DAT pages
commit 44d93045247661acbd50b1629e62f415f2747577 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pa
s390/cmma: fix detection of DAT pages
commit 44d93045247661acbd50b1629e62f415f2747577 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. The initial loop however is incorrect: only the first three of the four pages which belong to segment and region tables will be marked as being used for DAT. The last page is incorrectly marked as no-dat.
This can result in incorrect guest TLB flushes.
Fix this by simply marking all four pages.
Cc: <stable@vger.kernel.org> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
bfabe8d0 | 20-Oct-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/mm: add missing arch_set_page_dat() call to gmap allocations
commit 1954da4a2b621a3328a63382cae7e5f5e2af502c upstream.
If the cmma no-dat feature is available all pages that are not used for d
s390/mm: add missing arch_set_page_dat() call to gmap allocations
commit 1954da4a2b621a3328a63382cae7e5f5e2af502c upstream.
If the cmma no-dat feature is available all pages that are not used for dynamic address translation are marked as "no-dat" with the ESSA instruction. This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. This also means that pages which are used for dynamic address translation must not be marked as "no-dat", since the hypervisor may then incorrectly not purge guest TLB entries.
Region, segment, and page tables allocated within the gmap code are incorrectly marked as "no-dat", since an explicit call to arch_set_page_dat() is missing, which would remove the "no-dat" mark.
In order to fix this add a new gmap_alloc_crst() function which should be used to allocate region and segment tables, and which also calls arch_set_page_dat().
Also add the arch_set_page_dat() call to page_table_alloc_pgste().
Cc: <stable@vger.kernel.org> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
0c021834 | 07-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini: "ARM:
- Clean up vCPU targets, always returning generic v8 as the preferred target
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini: "ARM:
- Clean up vCPU targets, always returning generic v8 as the preferred target
- Trap forwarding infrastructure for nested virtualization (used for traps that are taken from an L2 guest and are needed by the L1 hypervisor)
- FEAT_TLBIRANGE support to only invalidate specific ranges of addresses when collapsing a table PTE to a block PTE. This avoids that the guest refills the TLBs again for addresses that aren't covered by the table PTE.
- Fix vPMU issues related to handling of PMUver.
- Don't unnecessary align non-stack allocations in the EL2 VA space
- Drop HCR_VIRT_EXCP_MASK, which was never used...
- Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu parameter instead
- Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
- Remove prototypes without implementations
RISC-V:
- Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
- Added ONE_REG interface for SATP mode
- Added ONE_REG interface to enable/disable multiple ISA extensions
- Improved error codes returned by ONE_REG interfaces
- Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
- Added get-reg-list selftest for KVM RISC-V
s390:
- PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by the firmware and once a crypto queue is "bound" to a PV VM every other entity (PV or not) looses access until it is not bound anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
x86:
- Clean up KVM's handling of Intel architectural events
- Intel bugfixes
- Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug registers and generate/handle #DBs
- Clean up LBR virtualization code
- Fix a bug where KVM fails to set the target pCPU during an IRTE update
- Fix fatal bugs in SEV-ES intrahost migration
- Fix a bug where the recent (architecturally correct) change to reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to skip it)
- Retry APIC map recalculation if a vCPU is added/enabled
- Overhaul emergency reboot code to bring SVM up to par with VMX, tie the "emergency disabling" behavior to KVM actually being loaded, and move all of the logic within KVM
- Fix user triggerable WARNs in SVM where KVM incorrectly assumes the TSC ratio MSR cannot diverge from the default when TSC scaling is disabled up related code
- Add a framework to allow "caching" feature flags so that KVM can check if the guest can use a feature without needing to search guest CPUID
- Rip out the ancient MMU_DEBUG crud and replace the useful bits with CONFIG_KVM_PROVE_MMU
- Fix KVM's handling of !visible guest roots to avoid premature triple fault injection
- Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface that is needed by external users (currently only KVMGT), and fix a variety of issues in the process
Generic:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass action specific data without needing to constantly update the main handlers.
- Drop unused function declarations
Selftests:
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts to use printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits) KVM: x86/mmu: Include mmu.h in spte.h KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots KVM: x86/mmu: Disallow guest from using !visible slots for page tables KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page KVM: x86/mmu: Harden new PGD against roots without shadow pages KVM: x86/mmu: Add helper to convert root hpa to shadow page drm/i915/gvt: Drop final dependencies on KVM internal details KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers KVM: x86/mmu: Drop @slot param from exported/external page-track APIs KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled KVM: x86/mmu: Assert that correct locks are held for page write-tracking KVM: x86/mmu: Rename page-track APIs to reflect the new reality KVM: x86/mmu: Drop infrastructure for multiple page-track modes KVM: x86/mmu: Use page-track notifiers iff there are external users KVM: x86/mmu: Move KVM-only page-track declarations to internal header KVM: x86: Remove the unused page-track hook track_flush_slot() drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region() KVM: x86: Add a new page-track hook to handle memslot deletion drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot KVM: x86: Reject memslot MOVE operations if KVMGT is attached ...
show more ...
|
06fc3b0d | 01-Sep-2023 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/vmem: do not silently ignore mapping limit
The only interface that allows drivers establishing liner mappings is vmem_add_mapping(). It does check a requested range against allowed limits and a
s390/vmem: do not silently ignore mapping limit
The only interface that allows drivers establishing liner mappings is vmem_add_mapping(). It does check a requested range against allowed limits and a call to modify_pagetable() with an invalid mapping range is impossible.
Hence, an attempt to map an address range outside of the identity mapping or vmemmap array could only be kernel bug.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
08d90f46 | 01-Sep-2023 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/mm: fix MAX_DMA_ADDRESS physical vs virtual confusion
MAX_DMA_ADDRESS is defined and treated as a physical address, whereas it should be virtual.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com
s390/mm: fix MAX_DMA_ADDRESS physical vs virtual confusion
MAX_DMA_ADDRESS is defined and treated as a physical address, whereas it should be virtual.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
a7eb2880 | 25-Aug-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/mm: use __set_memory() variants where useful
Use the __set_memory_yy() variants instead of set_memory_yy() where useful. This allows to make the code a bit more readable.
This also fixes the d
s390/mm: use __set_memory() variants where useful
Use the __set_memory_yy() variants instead of set_memory_yy() where useful. This allows to make the code a bit more readable.
This also fixes the debug pagealloc case, where set_memory_4k() might be called for an area larger than 8TB which would lead to an overflow of the num_pages parameter of set_memory_4k().
However RELOC_HIDE() has to be used for the __set_memory_4k() case for the time being, to avoid compiler warnings because of performing pointer arithmetic on a NULL pointer, which has undefined behavior. This happens because __va(0) always translates to NULL. However this will change, and as soon as this happens the RELOC_HIDE() hack can be removed again.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
850612c8 | 25-Aug-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/set_memory: add __set_memory() variant
Add a __set_memory_yy() variant for all set_memory_yy() implementations. The new variant takes start and end void pointers, which allows them to be used w
s390/set_memory: add __set_memory() variant
Add a __set_memory_yy() variant for all set_memory_yy() implementations. The new variant takes start and end void pointers, which allows them to be used without the usual unsigned long cast.
However more important: the new variant can be used for areas larger than 8TB. The old variant comes with an "int numpages" parameter, which overflows with more than 8TB. Given that for debug_pagealloc set_memory_4k() is used on the whole kernel mapping this is not only a theoretical problem, but must be fixed.
Changing all set_memory_yy() variants only on s390 to take an "unsigned long numpages" parameter is not possible, since the common module code requires an int parameter from all architectures on these functions. See module_set_memory().
Therefore change/fix this on s390 only with a new interface, and address common code later.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
a6e49f10 | 25-Aug-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/mm: improve description of mapping permissions of prefix pages
Slightly improve the description which explains why the first prefix page must be mapped executable when the BEAR-enhancement faci
s390/mm: improve description of mapping permissions of prefix pages
Slightly improve the description which explains why the first prefix page must be mapped executable when the BEAR-enhancement facility is not installed.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|