Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
10a33d1d |
| 13-Feb-2024 |
Mark Brown <broonie@kernel.org> |
arm64/sve: Lower the maximum allocation for the SVE ptrace regset
[ Upstream commit 2813926261e436d33bc74486b51cce60b76edf78 ]
Doug Anderson observed that ChromeOS crashes are being reported which
arm64/sve: Lower the maximum allocation for the SVE ptrace regset
[ Upstream commit 2813926261e436d33bc74486b51cce60b76edf78 ]
Doug Anderson observed that ChromeOS crashes are being reported which include failing allocations of order 7 during core dumps due to ptrace allocating storage for regsets:
chrome: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=urgent,mems_allowed=0 ... regset_get_alloc+0x1c/0x28 elf_core_dump+0x3d8/0xd8c do_coredump+0xeb8/0x1378
with further investigation showing that this is:
[ 66.957385] DOUG: Allocating 279584 bytes
which is the maximum size of the SVE regset. As Doug observes it is not entirely surprising that such a large allocation of contiguous memory might fail on a long running system.
The SVE regset is currently sized to hold SVE registers with a VQ of SVE_VQ_MAX which is 512, substantially more than the architectural maximum of 16 which we might see even in a system emulating the limits of the architecture. Since we don't expose the size we tell the regset core externally let's define ARCH_SVE_VQ_MAX with the actual architectural maximum and use that for the regset, we'll still overallocate most of the time but much less so which will be helpful even if the core is fixed to not require contiguous allocations.
Specify ARCH_SVE_VQ_MAX in terms of the maximum value that can be written into ZCR_ELx.LEN (where this is set in the hardware). For consistency update the maximum SME vector length to be specified in the same style while we are at it.
We could also teach the ptrace core about runtime discoverable regset sizes but that would be a more invasive change and this is being observed in practical systems.
Reported-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org> Tested-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240213-arm64-sve-ptrace-regset-size-v2-1-c7600ca74b9b@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
7c892383 |
| 13-Feb-2024 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Restore SME registers on exit from suspend
[ Upstream commit 9533864816fb4a6207c63b7a98396351ce1a9fae ]
The fields in SMCR_EL1 and SMPRI_EL1 reset to an architecturally UNKNOWN value. Si
arm64/sme: Restore SME registers on exit from suspend
[ Upstream commit 9533864816fb4a6207c63b7a98396351ce1a9fae ]
The fields in SMCR_EL1 and SMPRI_EL1 reset to an architecturally UNKNOWN value. Since we do not otherwise manage the traps configured in this register at runtime we need to reconfigure them after a suspend in case nothing else was kind enough to preserve them for us.
The vector length will be restored as part of restoring the SME state for the next SME using task.
Fixes: a1f4ccd25cc2 ("arm64/sme: Provide Kconfig for SME") Reported-by: Jackson Cooper-Driver <Jackson.Cooper-Driver@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-sme-resume-v3-1-17e05e493471@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45 |
|
#
5d0a8d2f |
| 10-Aug-2023 |
Mark Brown <broonie@kernel.org> |
arm64/ptrace: Ensure that SME is set up for target when writing SSVE state
When we use NT_ARM_SSVE to either enable streaming mode or change the vector length for a process we do not currently do an
arm64/ptrace: Ensure that SME is set up for target when writing SSVE state
When we use NT_ARM_SSVE to either enable streaming mode or change the vector length for a process we do not currently do anything to ensure that there is storage allocated for the SME specific register state. If the task had not previously used SME or we changed the vector length then the task will not have had TIF_SME set or backing storage for ZA/ZT allocated, resulting in inconsistent register sizes when saving state and spurious traps which flush the newly set register state.
We should set TIF_SME to disable traps and ensure that storage is allocated for ZA and ZT if it is not already allocated. This requires modifying sme_alloc() to make the flush of any existing register state optional so we don't disturb existing state for ZA and ZT.
Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Reported-by: David Spickett <David.Spickett@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <stable@vger.kernel.org> # 5.19.x Link: https://lore.kernel.org/r/20230810-arm64-fix-ptrace-race-v1-1-a5361fad2bd6@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7 |
|
#
ee072cf7 |
| 16-Jan-2023 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement signal handling for ZT
Add a new signal context type for ZT which is present in the signal frame when ZA is enabled and ZT is supported by the system. In order to account for th
arm64/sme: Implement signal handling for ZT
Add a new signal context type for ZT which is present in the signal frame when ZA is enabled and ZT is supported by the system. In order to account for the possible addition of further ZT registers in the future we make the number of registers variable in the ABI, though currently the only possible number is 1. We could just use a bare list head for the context since the number of registers can be inferred from the size of the context but for usability and future extensibility we define a header with the number of registers and some reserved fields in it.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-11-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
95fcec71 |
| 16-Jan-2023 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement context switching for ZT0
When the system supports SME2 the ZT0 register must be context switched as part of the floating point state. This register is stored immediately after
arm64/sme: Implement context switching for ZT0
When the system supports SME2 the ZT0 register must be context switched as part of the floating point state. This register is stored immediately after ZA in memory and is only accessible when PSTATE.ZA is set so we handle it in the same functions we use to save and restore ZA.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-10-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
d6138b4a |
| 16-Jan-2023 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Provide storage for ZT0
When the system supports SME2 there is an additional register ZT0 which we must store when the task is using SME. Since ZT0 is accessible only when PSTATE.ZA is se
arm64/sme: Provide storage for ZT0
When the system supports SME2 there is an additional register ZT0 which we must store when the task is using SME. Since ZT0 is accessible only when PSTATE.ZA is set just like ZA we allocate storage for it along with ZA, increasing the allocation size for the memory region where we store ZA and storing the data for ZT after that for ZA.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-9-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
d4913eee |
| 16-Jan-2023 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Add basic enumeration for SME2
Add basic feature detection for SME2, detecting that the feature is present and disabling traps for ZT0.
Signed-off-by: Mark Brown <broonie@kernel.org> Lin
arm64/sme: Add basic enumeration for SME2
Add basic feature detection for SME2, detecting that the feature is present and disabling traps for ZT0.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-8-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
ce514000 |
| 16-Jan-2023 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Rename za_state to sme_state
In preparation for adding support for storage for ZT0 to the thread_struct rename za_state to sme_state. Since ZT0 is accessible when PSTATE.ZA is set just li
arm64/sme: Rename za_state to sme_state
In preparation for adding support for storage for ZT0 to the thread_struct rename za_state to sme_state. Since ZT0 is accessible when PSTATE.ZA is set just like ZA itself we will extend the allocation done for ZA to cover it, avoiding the need to further expand task_struct for non-SME tasks.
No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-1-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79 |
|
#
1192b93b |
| 15-Nov-2022 |
Mark Brown <broonie@kernel.org> |
arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
For reasons that are unclear to this reader fpsimd_bind_state_to_cpu() populates the struct fpsimd_last_state_struct that it uses to
arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
For reasons that are unclear to this reader fpsimd_bind_state_to_cpu() populates the struct fpsimd_last_state_struct that it uses to store the active floating point state for KVM guests by passing an argument for each member of the structure. As the richness of the architecture increases this is resulting in a function with a rather large number of arguments which isn't ideal.
Simplify the interface by using the struct directly as the single argument for the function, renaming it as we lift the definition into the header. This could be built on further to reduce the work we do adding storage for new FP state in various places but for now it just simplifies this one interface.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-9-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
#
deeb8f9a |
| 15-Nov-2022 |
Mark Brown <broonie@kernel.org> |
arm64/fpsimd: Have KVM explicitly say which FP registers to save
In order to avoid needlessly saving and restoring the guest registers KVM relies on the host FPSMID code to save the guest registers
arm64/fpsimd: Have KVM explicitly say which FP registers to save
In order to avoid needlessly saving and restoring the guest registers KVM relies on the host FPSMID code to save the guest registers when we context switch away from the guest. This is done by binding the KVM guest state to the CPU on top of the task state that was originally there, then carefully managing the TIF_SVE flag for the task to cause the host to save the full SVE state when needed regardless of the needs of the host task. This works well enough but isn't terribly direct about what is going on and makes it much more complicated to try to optimise what we're doing with the SVE register state.
Let's instead have KVM pass in the register state it wants saving when it binds to the CPU. We introduce a new FP_STATE_CURRENT for use during normal task binding to indicate that we should base our decisions on the current task. This should not be used when actually saving. Ideally we might want to use a separate enum for the type to save but this enum and the enum values would then need to be named which has problems with clarity and ambiguity.
In order to ease any future debugging that might be required this patch does not actually update any of the decision making about what to save, it merely starts tracking the new information and warns if the requested state is not what we would otherwise have decided to save.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-4-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
#
baa85152 |
| 15-Nov-2022 |
Mark Brown <broonie@kernel.org> |
arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registe
arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum which we place together with saved floating point state in both thread_struct and the KVM guest state which explicitly states which format is active and keep it up to date when we change it.
At present we do not use this state except to verify that it has the expected value when loading the state, future patches will introduce functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
#
93ae6b01 |
| 15-Nov-2022 |
Mark Brown <broonie@kernel.org> |
KVM: arm64: Discard any SVE state when entering KVM guests
Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving) KVM has not tracked the host SVE state, relying on the fact that
KVM: arm64: Discard any SVE state when entering KVM guests
Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving) KVM has not tracked the host SVE state, relying on the fact that we currently disable SVE whenever we perform a syscall. This may not be true in future since performance optimisation may result in us keeping SVE enabled in order to avoid needing to take access traps to reenable it. Handle this by clearing TIF_SVE and converting the stored task state to FPSIMD format when preparing to run the guest. This is done with a new call fpsimd_kvm_prepare() to keep the direct state manipulation functions internal to fpsimd.c.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-2-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
Revision tags: v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62 |
|
#
826a4fdd |
| 17-Aug-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Don't flush SVE register state when allocating SME storage
Currently when taking a SME access trap we allocate storage for the SVE register state in order to be able to handle storage of
arm64/sme: Don't flush SVE register state when allocating SME storage
Currently when taking a SME access trap we allocate storage for the SVE register state in order to be able to handle storage of streaming mode SVE. Due to the original usage in a purely SVE context the SVE register state allocation this also flushes the register state for SVE if storage was already allocated but in the SME context this is not desirable. For a SME access trap to be taken the task must not be in streaming mode so either there already is SVE register state present for regular SVE mode which would be corrupted or the task does not have TIF_SVE and the flush is redundant.
Fix this by adding a flag to sve_alloc() indicating if we are in a SVE context and need to flush the state. Freshly allocated storage is always zeroed either way.
Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220817182324.638214-4-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
Revision tags: v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39 |
|
#
ec0067a6 |
| 10-May-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.h
The defines for SVCR call it SVCR_EL0 however the architecture calls the register SVCR with no _EL0 suffix. In preparation for generating th
arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.h
The defines for SVCR call it SVCR_EL0 however the architecture calls the register SVCR with no _EL0 suffix. In preparation for generating the sysreg definitions rename to match the architecture, no functional change.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
e65fc01b |
| 10-May-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Standardise bitfield names for SVCR
The bitfield definitions for SVCR have a SYS_ added to the names of the constant which will be a problem for automatic generation. Remove the prefixes,
arm64/sme: Standardise bitfield names for SVCR
The bitfield definitions for SVCR have a SYS_ added to the names of the constant which will be a problem for automatic generation. Remove the prefixes, no functional change.
Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v5.15.38 |
|
#
d158a060 |
| 05-May-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: More sensibly define the size for the ZA register set
Since the vector length configuration mechanism is identical between SVE and SME we share large elements of the code including the de
arm64/sme: More sensibly define the size for the ZA register set
Since the vector length configuration mechanism is identical between SVE and SME we share large elements of the code including the definition for the maximum vector length. Unfortunately when we were defining the ABI for SVE we included not only the actual maximum vector length of 2048 bits but also the value possible if all the bits reserved in the architecture for expansion of the LEN field were used, 16384 bits.
This starts creating problems if we try to allocate anything for the ZA matrix based on the maximum possible vector length, as we do for the regset used with ptrace during the process of generating a core dump. While the maximum potential size for ZA with the current architecture is a reasonably managable 64K with the higher reserved limit ZA would be 64M which leads to entirely reasonable complaints from the memory management code when we try to allocate a buffer of that size. Avoid these issues by defining the actual maximum vector length for the architecture and using it for the SME regsets.
Also use the full ZA_PT_SIZE() with the header rather than just the actual register payload when specifying the size, fixing support for the largest vector lengths now that we have this new, lower define. With the SVE maximum this did not cause problems due to the extra headroom we had.
While we're at it add a comment clarifying why even though ZA is a single register we tell the regset code that it is a multi-register regset.
Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Mark Brown <broonie@kernel.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://lore.kernel.org/r/20220505221517.1642014-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
Revision tags: v5.15.37, v5.15.36, v5.15.35 |
|
#
e12310a0 |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement ptrace support for streaming mode SVE registers
The streaming mode SVE registers are represented using the same data structures as for SVE but since the vector lengths supported
arm64/sme: Implement ptrace support for streaming mode SVE registers
The streaming mode SVE registers are represented using the same data structures as for SVE but since the vector lengths supported and in use may not be the same as SVE we represent them with a new type NT_ARM_SSVE. Unfortunately we only have a single 16 bit reserved field available in the header so there is no space to fit the current and maximum vector length for both standard and streaming SVE mode without redefining the structure in a way the creates a complicatd and fragile ABI. Since FFR is not present in streaming mode it is read and written as zero.
Setting NT_ARM_SSVE registers will put the task into streaming mode, similarly setting NT_ARM_SVE registers will exit it. Reads that do not correspond to the current mode of the task will return the header with no register data. For compatibility reasons on write setting no flag for the register type will be interpreted as setting SVE registers, though users can provide no register data as an alternative mechanism for doing so.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-21-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
8bd7f91c |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement traps and syscall handling for SME
By default all SME operations in userspace will trap. When this happens we allocate storage space for the SME register state, set up the SVE
arm64/sme: Implement traps and syscall handling for SME
By default all SME operations in userspace will trap. When this happens we allocate storage space for the SME register state, set up the SVE registers and disable traps. We do not need to initialize ZA since the architecture guarantees that it will be zeroed when enabled and when we trap ZA is disabled.
On syscall we exit streaming mode if we were previously in it and ensure that all but the lower 128 bits of the registers are zeroed while preserving the state of ZA. This follows the aarch64 PCS for SME, ZA state is preserved over a function call and streaming mode is exited. Since the traps for SME do not distinguish between streaming mode SVE and ZA usage if ZA is in use rather than reenabling traps we instead zero the parts of the SVE registers not shared with FPSIMD and leave SME enabled, this simplifies handling SME traps. If ZA is not in use then we reenable SME traps and fall through to normal handling of SVE.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-17-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
0033cd93 |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement ZA context switching
Allocate space for storing ZA on first access to SME and use that to save and restore ZA state when context switching. We do this by using the vector form o
arm64/sme: Implement ZA context switching
Allocate space for storing ZA on first access to SME and use that to save and restore ZA state when context switching. We do this by using the vector form of the LDR and STR ZA instructions, these do not require streaming mode and have implementation recommendations that they avoid contention issues in shared SMCU implementations.
Since ZA is architecturally guaranteed to be zeroed when enabled we do not need to explicitly zero ZA, either we will be restoring from a saved copy or trapping on first use of SME so we know that ZA must be disabled.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-16-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
af7167d6 |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement streaming SVE context switching
When in streaming mode we need to save and restore the streaming mode SVE register state rather than the regular SVE register state. This uses th
arm64/sme: Implement streaming SVE context switching
When in streaming mode we need to save and restore the streaming mode SVE register state rather than the regular SVE register state. This uses the streaming mode vector length and omits FFR but is otherwise identical, if TIF_SVE is enabled when we are in streaming mode then streaming mode takes precedence.
This does not handle use of streaming SVE state with KVM, ptrace or signals. This will be updated in further patches.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-15-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
b40c559b |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement SVCR context switching
In SME the use of both streaming SVE mode and ZA are tracked through PSTATE.SM and PSTATE.ZA, visible through the system register SVCR. In order to conte
arm64/sme: Implement SVCR context switching
In SME the use of both streaming SVE mode and ZA are tracked through PSTATE.SM and PSTATE.ZA, visible through the system register SVCR. In order to context switch the floating point state for SME we need to context switch the contents of this register as part of context switching the floating point state.
Since changing the vector length exits streaming SVE mode and disables ZA we also make sure we update SVCR appropriately when setting vector length, and similarly ensure that new threads have streaming SVE mode and ZA disabled.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-14-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
9e4ab6c8 |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Implement vector length configuration prctl()s
As for SVE provide a prctl() interface which allows processes to configure their SME vector length.
Signed-off-by: Mark Brown <broonie@kern
arm64/sme: Implement vector length configuration prctl()s
As for SVE provide a prctl() interface which allows processes to configure their SME vector length.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-12-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
b42990d3 |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Identify supported SME vector lengths at boot
The vector lengths used for SME are controlled through a similar set of registers to those for SVE and enumerated using a similar algorithm w
arm64/sme: Identify supported SME vector lengths at boot
The vector lengths used for SME are controlled through a similar set of registers to those for SVE and enumerated using a similar algorithm with some slight differences due to the fact that unlike SVE there are no restrictions on which combinations of vector lengths can be supported nor any mandatory vector lengths which must be implemented. Add a new vector type and implement support for enumerating it.
One slightly awkward feature is that we need to read the current vector length using a different instruction (or enter streaming mode which would have the same issue and be higher cost). Rather than add an ops structure we add special cases directly in the otherwise generic vec_probe_vqs() function, this is a bit inelegant but it's the only place where this is an issue.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-10-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
5e64b862 |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Basic enumeration support
This patch introduces basic cpufeature support for discovering the presence of the Scalable Matrix Extension.
Signed-off-by: Mark Brown <broonie@kernel.org> Rev
arm64/sme: Basic enumeration support
This patch introduces basic cpufeature support for discovering the presence of the Scalable Matrix Extension.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-9-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
#
ca8a4ebc |
| 19-Apr-2022 |
Mark Brown <broonie@kernel.org> |
arm64/sme: Manually encode SME instructions
As with SVE rather than impose ambitious toolchain requirements for SME we manually encode the few instructions which we require in order to perform the w
arm64/sme: Manually encode SME instructions
As with SVE rather than impose ambitious toolchain requirements for SME we manually encode the few instructions which we require in order to perform the work the kernel needs to do. The instructions used to save and restore context are provided as assembler macros while those for entering and leaving streaming mode are done in asm volatile blocks since they are expected to be used from C.
We could do the SMSTART and SMSTOP operations with read/modify/write cycles on SVCR but using the aliases provided for individual field accesses should be slightly faster. These instructions are aliases for MSR but since our minimum toolchain requirements are old enough to mean that we can't use the sX_X_cX_cX_X form and they always use xzr rather than taking a value like write_sysreg_s() wants we just use .inst.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|