a820a85a | 12-Nov-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Drop user-only special case in sve_stN_r
This path is reachable with plugins enabled, and provoked with run-plugin-catch-syscalls-with-libinline.so.
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241112141232.321354-1-richard.henderson@linaro.org>
(cherry picked from commit f27550804688da43c6e0d87b2f9e143adbf76271)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
5e29203b | 05-Nov-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed)
Our implementation of the indexed version of SVE SDOT/UDOT/USDOT got the calculation of the inner loop terminator wrong. Although we correctly account for the element size when we calculate the terminator for the first iteration:

    intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n);

we don't do that when we move it forward after the first inner loop completes. The intention is that we process the vector in 128-bit segments, which for a 64-bit element size should mean (1, 2), (3, 4), (5, 6), etc. This bug meant that we would instead iterate (1, 2), (3, 4, 5, 6), (7, 8, 9, 10), etc., applying the wrong indexed element to some of the operations and also indexing off the end of the vector.
You don't see this bug if the vector length is small enough that we don't need to iterate the outer loop, i.e. if it is only 128 bits, or if it is the 64-bit special case from AA32/AA64 AdvSIMD. If the vector length is 256 bits then we calculate the right results for the elements in the vector but do index off the end of the vector. Vector lengths greater than 256 bits see wrong answers. The instructions that produce 32-bit results behave correctly.
Fix the recalculation of 'segend' for subsequent iterations, and restore a version of the comment that was lost in the refactor of commit 7020ffd656a5 that explains why we only need to clamp segend to opr_sz_n for the first iteration, not the later ones.
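As a rough illustration, a minimal sketch of the segmented loop shape (illustrative only: the real helper accumulates four products per output element and lays out the indexed operand differently; TYPED, d, n, m, index and opr_sz_n are taken from the snippet quoted above):

    intptr_t i = 0;
    intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n);  /* clamp 1st segment only */
    do {
        TYPED mm = m[i + index];            /* indexed element for this segment */
        do {
            d[i] += n[i] * mm;              /* simplified dot-product step */
        } while (++i < segend);
        /* the buggy advance ignored sizeof(TYPED); the fix recomputes
         * the terminator per 128-bit segment: */
        segend = i + 16 / sizeof(TYPED);
    } while (i < opr_sz_n);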
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2595
Fixes: 7020ffd656a5 ("target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101185544.2130972-1-peter.maydell@linaro.org
(cherry picked from commit e6b2fa1b81ac6b05c4397237c846a295a9857920)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
6d62f309 | 05-Nov-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Add new MMU indexes for AArch32 Secure PL1&0
Our current usage of MMU indexes when EL3 is AArch32 is confused. Architecturally, when EL3 is AArch32, all Secure code runs under the Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the other privileged modes (PL1)
* code at EL0 (Secure PL0)
This is different from when EL3 is AArch64, in which case EL3 is its own translation regime, and EL1 and EL0 (whether AArch32 or AArch64) have their own regime.
We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't do anything special about Secure PL0, which meant it used the same ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug where arm_sctlr() incorrectly picked the NonSecure SCTLR as the controlling register when in Secure PL0, which meant we were spuriously generating alignment faults because we were looking at the wrong SCTLR control bits.
The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that we wouldn't honour the PAN bit for Secure PL1, because there's no equivalent _PAN mmu index for it.
Fix this by adding two new MMU indexes:
* ARMMMUIdx_E30_0 is for Secure PL0
* ARMMMUIdx_E30_3_PAN is for Secure PL1 when PAN is enabled
The existing ARMMMUIdx_E3 is used to mean "Secure PL1 without PAN" (and would be named ARMMMUIdx_E30_3 in an AArch32-centric scheme).
These extra two indexes bring us up to the maximum of 16 that the core code can currently support.
This commit:
* adds the new MMU index handling to the various places where we deal in MMU index values
* adds assertions that we aren't AArch32 EL3 in a couple of places that currently use the E10 indexes, to document why they don't also need to handle the E30 indexes
* documents in a comment why regime_has_2_ranges() doesn't need updating
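A minimal sketch of the resulting index selection for AArch32 Secure PL1&0, based only on the mapping described above (the el3_is_aarch32() predicate, pan_enabled flag and surrounding control flow are illustrative, not the actual QEMU helpers):

    ARMMMUIdx idx;
    if (secure && el3_is_aarch32(env)) {    /* hypothetical predicate */
        if (el == 0) {
            idx = ARMMMUIdx_E30_0;          /* Secure PL0 */
        } else if (pan_enabled) {
            idx = ARMMMUIdx_E30_3_PAN;      /* Secure PL1 with PAN */
        } else {
            idx = ARMMMUIdx_E3;             /* Secure PL1 without PAN */
        }
    }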
Notes for backporting: this commit depends on the preceding revert of 4c2c04746932; that revert and this commit should probably be backported to everywhere that we originally backported 4c2c04746932.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2588
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101142845.1712482-3-peter.maydell@linaro.org
(cherry picked from commit efbe180ad2ed75d4cc64dfc6fb46a015eef713d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
f147ed37 | 05-Nov-2024 |
Peter Maydell <peter.maydell@linaro.org> |
Revert "target/arm: Fix usage of MMU indexes when EL3 is AArch32"
This reverts commit 4c2c0474693229c1f533239bb983495c5427784d.
This commit tried to fix a problem with our usage of MMU indexes when EL3 is AArch32, using what it described as a "more complicated approach" where we share the same MMU index values for Secure PL1&0 and NonSecure PL1&0. In theory this should work, but the change didn't account for (at least) two things:
(1) The design change means we need to flush the TLBs at any point where the CPU state flips from one to the other. We already flush the TLB when SCR.NS is changed, but we don't flush the TLB when we take an exception from NS PL1&0 into Mon or when we return from Mon to NS PL1&0, and the commit didn't add any code to do that.
(2) The ATS12NS* address translate instructions allow Mon code (which is Secure) to do a stage 1+2 page table walk for NS. I thought this was OK because do_ats_write() does a page table walk which doesn't use the TLBs, so because it can pass both the MMU index and also an ARMSecuritySpace argument we can tell the table walk that we want NS stage1+2, not S. But that means that all the code within the ptw that needs to find e.g. the regime EL cannot do so only with an mmu_idx -- all these functions like regime_sctlr(), regime_el(), etc would need to pass both an mmu_idx and the security_space, so they can tell whether this is a translation regime controlled by EL1 or EL3 (and so whether to look at SCTLR.S or SCTLR.NS, etc).
In particular, because regime_el() wasn't updated to look at the ARMSecuritySpace it would return 1 even when the CPU was in Monitor mode (and the controlling EL is 3). This meant that page table walks in Monitor mode would look at the wrong SCTLR, TCR, etc and would generally fault when they should not.
Rather than trying to make the complicated changes needed to rescue the design of 4c2c04746932, we revert it in order to instead take the route that that commit describes as "the most straightforward" fix, where we add new MMU indexes EL30_0, EL30_3, EL30_3_PAN to correspond to "Secure PL1&0 at PL0", "Secure PL1&0 at PL1", and "Secure PL1&0 at PL1 with PAN".
This revert will re-expose the "spurious alignment faults in Secure PL0" issue #2326; we'll fix it again in the next commit.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20241101142845.1712482-2-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 056c5c90c171c4895b407af0cf3d198e1d44b40f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
10eb3721 | 29-Oct-2024 |
Ido Plat <ido.plat1@ibm.com> |
target/arm: Fix arithmetic underflow in SETM instruction
Pass the stage size to the step function callback; otherwise do_setm would hang when the size is larger than the page size, because the stage size would underflow. This fix brings do_setm more in line with do_setp.
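A rough sketch of the failure mode (names modelled on the do_setp pattern and purely illustrative, not the literal do_setm code):

    uint64_t stagesetsize = setsize & TARGET_PAGE_MASK;
    while (stagesetsize) {
        /* buggy: passing setsize here let the callback consume more than
         * this stage, so the unsigned subtractions below wrapped around
         * and the loop never terminated */
        step = stepfn(env, toaddr, stagesetsize, data, memidx, &mtedesc, ra);
        toaddr += step;
        setsize -= step;
        stagesetsize -= step;
    }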
Cc: qemu-stable@nongnu.org
Fixes: 0e92818887dee ("target/arm: Implement the SET* instructions")
Signed-off-by: Ido Plat <ido.plat1@ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025024909.799989-1-ido.plat1@ibm.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit bab209af35037b33f7eb1b8a3737085935bec3a3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
03ee5e0c | 17-Sep-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Correct ID_AA64ISAR1_EL1 value for neoverse-v1
The Neoverse-V1 TRM is a bit confused about the layout of the ID_AA64ISAR1_EL1 register, and so its table 3-6 has the wrong value for this ID register. Trust instead section 3.2.74's list of which fields are set.
This means that we stop incorrectly reporting FEAT_XS as present, and now report the presence of FEAT_BF16.
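In QEMU's register-field notation the correction amounts to something like the following sketch (the actual patch sets the whole ID register value in the neoverse-v1 init function; FIELD_DP64 and the ID_AA64ISAR1 field names are the existing QEMU definitions):

    uint64_t t = cpu->isar.id_aa64isar1;
    t = FIELD_DP64(t, ID_AA64ISAR1, XS, 0);    /* FEAT_XS: not implemented */
    t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1);  /* FEAT_BF16: present */
    cpu->isar.id_aa64isar1 = t;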
Cc: qemu-stable@nongnu.org
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240917161337.3012188-1-peter.maydell@linaro.org
(cherry picked from commit 8676007eff04bb4e454bcdf92fab3f855bcc59b3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
4c2c0474 | 09-Aug-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Fix usage of MMU indexes when EL3 is AArch32
Our current usage of MMU indexes when EL3 is AArch32 is confused. Architecturally, when EL3 is AArch32, all Secure code runs under the Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the other privileged modes (PL1)
* code at EL0 (Secure PL0)
This is different from when EL3 is AArch64, in which case EL3 is its own translation regime, and EL1 and EL0 (whether AArch32 or AArch64) have their own regime.
We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't do anything special about Secure PL0, which meant it used the same ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug where arm_sctlr() incorrectly picked the NonSecure SCTLR as the controlling register when in Secure PL0, which meant we were spuriously generating alignment faults because we were looking at the wrong SCTLR control bits.
The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that we wouldn't honour the PAN bit for Secure PL1, because there's no equivalent _PAN mmu index for it.
We could fix this in one of two ways:
* The most straightforward is to add new MMU indexes EL30_0, EL30_3, EL30_3_PAN to correspond to "Secure PL1&0 at PL0", "Secure PL1&0 at PL1", and "Secure PL1&0 at PL1 with PAN". This matches how we use indexes for the AArch64 regimes, and preserves properties like being able to determine the privilege level from an MMU index without any other information. However it would add two MMU indexes (we can share one with ARMMMUIdx_EL3), and we are already using 14 of the 16 that the core TLB code permits.
* The more complicated approach is the one we take here. We use the same MMU indexes (E10_0, E10_1, E10_1_PAN) for Secure PL1&0 as we do for NonSecure PL1&0. This saves on MMU indexes, but means we need to check in some places whether we're in the Secure PL1&0 regime or not before we interpret an MMU index.
The changes in this commit were created by auditing all the places where we use specific ARMMMUIdx_ values, and checking whether they needed to be changed to handle the new index value usage.
Note for potential stable backports: taking also the previous (comment-change-only) commit might make the backport easier.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-3-peter.maydell@linaro.org
|
8e0c9a9e | 13-Aug-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Clear high SVE elements in handle_vec_simd_wshli
AdvSIMD instructions are supposed to zero bits beyond 128. Affects SSHLL, USHLL, SSHLL2, USHLL2.
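A sketch of the shape of the fix (the loop body is elided and simplified; clear_vec_high() and write_vec_element() are the existing translate-a64 helpers, and clear_vec_high() zeroes a vector register above bit 127):

    for (int i = 0; i < elements; i++) {
        /* widen each element, apply the left shift, then: */
        write_vec_element(s, tcg_rd, rd, i, size + 1);
    }
    clear_vec_high(s, true, rd);   /* the missing call: zero bits beyond 128 */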
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240717060903.205098-15-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
64678fc4 | 09-Aug-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Fix BTI versus CF_PCREL
With pcrel, we cannot check the guarded page bit at translation time, as different mappings of the same physical page may or may not have the GP bit set.
Instead, add a couple of helpers to check the page at runtime, after all other filters that might obviate the need for the check.
The set_btype_for_br call must be moved after the gen_a64_set_pc call to ensure the current pc can still be computed.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240802003028.795476-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
55f9f4ee | 01-Aug-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Handle denormals correctly for FMOPA (widening)
The FMOPA (widening) SME instruction takes pairs of half-precision floating point values, widens them to single-precision, does a two-way dot product and accumulates the results into a single-precision destination. We don't quite correctly handle the FPCR bits FZ and FZ16 which control flushing of denormal inputs and outputs. This is because at the moment we pass a single float_status value to the helper function, which then uses that configuration for all the fp operations it does. However, because the inputs to this operation are float16 and the outputs are float32 we need to use the fp_status_f16 for the float16 input widening but the normal fp_status for everything else. Otherwise we will apply the flushing control FPCR.FZ16 to the 32-bit output rather than the FPCR.FZ control, and incorrectly flush a denormal output to zero when we should not (or vice-versa).
(In commit 207d30b5fdb5b we tried to fix the FZ handling but didn't get it right, switching from "use FPCR.FZ for everything" to "use FPCR.FZ16 for everything".)
Pass the CPU env to the sme_fmopa_h helper instead of an fp_status pointer, and have the helper pass an extra fp_status into the f16_dotadd() function so that we can use the right status for the right parts of this operation.
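A minimal sketch of the two-status f16_dotadd() (simplified from the description above, not the exact QEMU code; float16_to_float32() and float32_muladd() are the existing softfloat functions):

    static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
                              float_status *fpst, float_status *fpst_f16)
    {
        /* Widen the float16 inputs under the FZ16-controlled status... */
        float32 n0 = float16_to_float32(e1 & 0xffff, true, fpst_f16);
        float32 m0 = float16_to_float32(e2 & 0xffff, true, fpst_f16);
        float32 n1 = float16_to_float32(e1 >> 16, true, fpst_f16);
        float32 m1 = float16_to_float32(e2 >> 16, true, fpst_f16);
        /* ...but do the products and the accumulation under the
         * FZ-controlled status, so FPCR.FZ governs the float32 result. */
        sum = float32_muladd(n0, m0, sum, 0, fpst);
        return float32_muladd(n1, m1, sum, 0, fpst);
    }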
Cc: qemu-stable@nongnu.org
Fixes: 207d30b5fdb5 ("target/arm: Use FPST_F16 for SME FMOPA (widening)")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2373
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
|
76916dfa | 22-Jul-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
The function tszimm_esz() returns a shift amount, or possibly -1 in certain cases that correspond to unallocated encodings in the instruction set. We catch these later in the trans_ functions (generally with an "a->esz < 0" check), but before we do, the decodetree-generated code will also call tszimm_shr() or tszimm_shl(), which will use the tszimm_esz() return value as a shift count without checking that it is not negative, which is undefined behaviour.
Avoid the UB by checking the return value in tszimm_shr() and tszimm_shl().
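The shape of the fix, sketched for tszimm_shr() (based on the commit description; since the trans_ function rejects the encoding anyway, any negative return value works here as long as the shift itself is skipped):

    static inline int tszimm_shr(DisasContext *s, int x)
    {
        int esz = tszimm_esz(s, x);
        if (esz < 0) {
            return esz;             /* unallocated encoding: avoid UB shift */
        }
        return (16 << esz) - x;
    }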
Cc: qemu-stable@nongnu.org
Resolves: Coverity CID 1547617, 1547694
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-4-peter.maydell@linaro.org
|
ea3f5a90 | 22-Jul-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Fix UMOPA/UMOPS of 16-bit values
The UMOPA/UMOPS instructions are supposed to multiply unsigned 8 or 16 bit elements and accumulate the products into a 64-bit element. In the Arm ARM pseudocode, this is done with the usual infinite-precision signed arithmetic. However our implementation doesn't quite get it right, because in the DEF_IMOP_64() macro we do:

    sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0);
where NTYPE and MTYPE are uint16_t or int16_t. In the uint16_t case, the C usual arithmetic conversions mean the values are converted to "int" type and the multiply is done as a 32-bit multiply. This means that if the inputs are, for example, 0xffff and 0xffff then the result is 0xFFFE0001 as an int, which is then promoted to uint64_t for the accumulation into sum; this promotion incorrectly sign extends the multiply.
Avoid the incorrect sign extension by casting to int64_t before the multiply, so we do the multiply as 64-bit signed arithmetic, which is a type large enough that the multiply can never overflow into the sign bit.
(The equivalent 8-bit operations in DEF_IMOP_32() are fine, because the 8-bit multiplies can never overflow into the sign bit of a 32-bit integer.)
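The promotion bug is easy to reproduce standalone (a self-contained demo of the conversion rules, not QEMU code; note the buggy line is formally a signed-int overflow, which in practice wraps as described above):

    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        uint16_t n = 0xffff, m = 0xffff;
        uint64_t sum;

        /* Usual arithmetic conversions: both operands become int, the
         * 32-bit product 0xFFFE0001 is negative, and it sign-extends
         * when widened to uint64_t. */
        sum = (uint64_t)(n * m);
        printf("buggy: 0x%016" PRIx64 "\n", sum);  /* 0xfffffffffffe0001 */

        /* Casting to int64_t first makes the multiply 64-bit, so the
         * product can never reach the sign bit. */
        sum = (uint64_t)((int64_t)n * m);
        printf("fixed: 0x%016" PRIx64 "\n", sum);  /* 0x00000000fffe0001 */
        return 0;
    }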
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2372
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-3-peter.maydell@linaro.org
|
56f1c0db | 22-Jul-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Don't assert for 128-bit tile accesses when SVL is 128
For an instruction which accesses a 128-bit element tile when the SVL is also 128 (for example MOV z0.Q, p0/M, ZA0H.Q[w0,0]), we will assert in get_tile_rowcol():
qemu-system-aarch64: ../../tcg/tcg-op.c:926: tcg_gen_deposit_z_i32: Assertion `len > 0' failed.
This happens because we calculate

    len = ctz32(streaming_vec_reg_size(s)) - esz;

but if the SVL and the element size are the same, len is 0, and the deposit operation asserts.
In this case the ZA storage contains exactly one 128 bit element ZA tile, and the horizontal or vertical slice is just that tile. This means that regardless of the index value in the Ws register, we always access that tile. (In pseudocode terms, we calculate (index + offset) MOD 1, which is 0.)
Special case the len == 0 case to avoid hitting the assertion in tcg_gen_deposit_z_i32().
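A sketch of the special case in get_tile_rowcol() (simplified; the tmp/pos local names are illustrative):

    if (len == 0) {
        /* SVL == 128 with a 128-bit element: the ZA storage is exactly
         * one tile, so any index selects offset 0. */
        tcg_gen_movi_i32(tmp, 0);
    } else {
        tcg_gen_deposit_z_i32(tmp, tmp, pos, len);
    }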
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-2-peter.maydell@linaro.org
|
3b9991e3 | 09-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Use set/clear_helper_retaddr in SVE and SME helpers
Avoid a race condition with munmap in another thread. Use around blocks that exclusively use "host_fn". Keep the blocks as small as possible, but without setting and clearing for every operation on one page.
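The usage pattern looks roughly like this (a sketch; host_fn, reg_off and the loop bounds follow the SVE helper style but are illustrative):

    set_helper_retaddr(ra);
    do {
        /* host_fn accesses guest memory through a host pointer, so a
         * concurrent munmap can fault; the retaddr lets the SIGSEGV
         * handler unwind to the correct guest instruction. */
        host_fn(vd, reg_off, host + reg_off);
        reg_off += esize;
    } while (reg_off < reg_max);
    clear_helper_retaddr();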
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
8009519b | 09-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Use set/clear_helper_retaddr in helper-a64.c
Use these in helper_dc_dva and the FEAT_MOPS routines to avoid a race condition with munmap in another thread.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
207d30b5 | 17-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Use FPST_F16 for SME FMOPA (widening)
This operation has float16 inputs and thus must use the FZ16 control not the FZ control.
Cc: qemu-stable@nongnu.org
Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)")
Reported-by: Daniyal Khan <danikhan632@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240717060149.204788-3-richard.henderson@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2374
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
31d93fed | 17-Jul-2024 |
Daniyal Khan <danikhan632@gmail.com> |
target/arm: Use float_status copy in sme_fmopa_s
We made a copy above because the fp exception flags are not propagated back to the FPST register, but then failed to use the copy.
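The shape of the bug and fix (a sketch; variable names illustrative, float32_muladd() is the existing softfloat call):

    float_status fpst = *s;        /* the copy the commit refers to */
    /* buggy:  d = float32_muladd(n, m, a, 0, s);      -- used the original */
    d = float32_muladd(n, m, a, 0, &fpst);             /* fixed: use the copy */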
Cc: qemu-stable@nongnu.org
Fixes: 558e956c719 ("target/arm: Implement FMOPA, FMOPS (non-widening)")
Signed-off-by: Daniyal Khan <danikhan632@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20240717060149.204788-2-richard.henderson@linaro.org
[rth: Split from a larger patch]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
25489b52 | 16-Jul-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: LDAPR should honour SCTLR_ELx.nAA
In commit c1a1f80518d360b when we added the FEAT_LSE2 relaxations to the alignment requirements for atomic and ordered loads and stores, we didn't quite get it right for LDAPR/LDAPRH/LDAPRB with no immediate offset. These instructions were handled in the old decoder as part of disas_ldst_atomic(), but unlike all the other insns that function decoded (LDADD, LDCLR, etc) these insns are "ordered", not "atomic", so they should be using check_ordered_align() rather than check_atomic_align(). Commit c1a1f80518d360b used check_atomic_align() regardless for everything in disas_ldst_atomic(). We then carried that incorrect check over in the decodetree conversion, where LDAPR/LDAPRH/LDAPRB are now handled by trans_LDAPR().
The effect is that when FEAT_LSE2 is implemented, these instructions don't honour the SCTLR_ELx.nAA bit and will generate alignment faults when they should not.
(The LDAPR insns with an immediate offset were in disas_ldst_ldapr_stlr() and then in trans_LDAPR_i() and trans_STLR_i(), and have always used the correct check_ordered_align().)
Use check_ordered_align() in trans_LDAPR().
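The fix is essentially a one-liner in trans_LDAPR() (sketch; the argument list follows the upstream translate-a64.c helpers but should be treated as illustrative):

    /* before: mop = check_atomic_align(s, a->rn, a->sz); */
    mop = check_ordered_align(s, a->rn, 0, false, a->sz);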
Cc: qemu-stable@nongnu.org
Fixes: c1a1f80518d360b ("target/arm: Relax ordered/atomic alignment checks for LSE2")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240709134504.3500007-3-peter.maydell@linaro.org
|
5669d26e | 16-Jul-2024 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Fix handling of LDAPR/STLR with negative offset
When we converted the LDAPR/STLR instructions to decodetree we accidentally introduced a regression where the offset is negative. The 9-bit immediate field is signed, and the old hand decoder correctly used sextract32() to get it out of the insn word, but the ldapr_stlr_i pattern in the decode file used "imm:9" instead of "imm:s9", so it treated the field as unsigned.
Fix the pattern to treat the field as a signed immediate.
Cc: qemu-stable@nongnu.org
Fixes: 2521b6073b7 ("target/arm: Convert LDAPR/STLR (imm) to decodetree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2419
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240709134504.3500007-2-peter.maydell@linaro.org
|
7f490891 | 08-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert PMULL to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
f7a84565 | 08-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert ADDHN, SUBHN, RADDHN, RSUBHN to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
26cb9dbe | 08-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert SADDW, SSUBW, UADDW, USUBW to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
7575c571 | 08-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert SQDMULL, SQDMLAL, SQDMLSL to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
eb191187 | 08-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert SADDL, SSUBL, SABDL, SABAL, and unsigned to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
97b06ab7 | 08-Jul-2024 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert SMULL, UMULL, SMLAL, UMLAL, SMLSL, UMLSL to decodetree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240709000610.382391-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|