d5896749 | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Implement cmpsel_vec with avx512 insns
The avx512 vpblendm* instructions exactly implement cmpsel, using a predicate input. Of course this matches nicely with the avx512 predicate comparison instructions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
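As an illustration of the compare-then-blend pairing this message describes, here is a scalar C model, not the TCG backend code. The function names and the 16-bit mask type are assumptions made for this sketch; the predicate is modelled as one bit per lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a predicate compare: one mask bit per lane,
 * set when a[i] < b[i] (signed). Supports up to 16 lanes. */
static uint16_t cmp_lt_mask(const int32_t *a, const int32_t *b, size_t lanes)
{
    uint16_t k = 0;
    for (size_t i = 0; i < lanes; i++) {
        if (a[i] < b[i]) {
            k |= (uint16_t)(1u << i);
        }
    }
    return k;
}

/* Hypothetical model of a mask-driven blend: lane i takes t[i] when
 * mask bit i is set, otherwise f[i]. Together with the compare above
 * this gives cmpsel: d = (a < b) ? t : f, lane by lane. */
static void blend_by_mask(int32_t *d, uint16_t k,
                          const int32_t *t, const int32_t *f, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        d[i] = ((k >> i) & 1) ? t[i] : f[i];
    }
}
```

The point of the model is that the comparison result never has to be expanded into a full-width vector of all-ones or zero lanes; the mask feeds the select directly.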
|
c044ec0d | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Add predicate parameters to tcg_out_evex_opc
Extend tcg_out_evex_opc to handle the predicate and zero-merging parameters of the evex prefix.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
717da87d | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Implement cmp_vec with avx512 insns
The sse/avx instruction set only has EQ and GT as direct comparisons. Other signed comparisons can be generated from swapping and inversion. However unsigned comparisons are not available and must be transformed to signed comparisons by biasing the inputs.
The avx512 instruction set has a complete set of comparisons, with results placed into a predicate register. We can produce the normal cmp_vec result by using VPMOVM2*.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
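The biasing trick mentioned for the sse/avx path can be sketched in scalar C. This is only a model of the arithmetic, assuming 32-bit lanes; the function names are hypothetical and do not come from the QEMU sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the only ordered comparison assumed available: signed GT. */
static bool signed_gt(int32_t a, int32_t b)
{
    return a > b;
}

/* Unsigned a > b, built from signed GT by flipping the sign bit of both
 * inputs. Flipping the top bit maps the unsigned range onto the signed
 * range while preserving order. The uint32_t -> int32_t conversion is
 * implementation-defined in C; this assumes the usual two's-complement
 * wraparound. */
static bool unsigned_gt_via_bias(uint32_t a, uint32_t b)
{
    int32_t sa = (int32_t)(a ^ 0x80000000u);
    int32_t sb = (int32_t)(b ^ 0x80000000u);
    return signed_gt(sa, sb);
}

/* Other conditions follow by swapping operands or inverting the result. */
static bool unsigned_lt_via_bias(uint32_t a, uint32_t b)
{
    return unsigned_gt_via_bias(b, a);   /* swap */
}

static bool unsigned_le_via_bias(uint32_t a, uint32_t b)
{
    return !unsigned_gt_via_bias(a, b);  /* invert */
}
```

With a full set of predicate comparisons, as the message describes for avx512, none of this rewriting is needed.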
|
d8387f0e | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Optimize cmpsel with constant 0 operand 3.
These can be simplified to and/andc, avoiding the load of the zero into a register.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
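A per-lane C sketch of why a select against zero collapses to a single logical operation. It assumes the comparison result is available as an all-ones or all-zeros lane mask `m`, and that operand 3 is the value chosen when the condition is true; the helper names are illustrative only.

```c
#include <stdint.h>

/* General select: lane takes t where the mask is all-ones, else f. */
static uint32_t select_lane(uint32_t m, uint32_t t, uint32_t f)
{
    return (m & t) | (~m & f);
}

/* When the true-value is the constant 0, the (m & t) term vanishes and
 * only an and-with-complement (andc) remains. */
static uint32_t select_true_zero_andc(uint32_t m, uint32_t f)
{
    return f & ~m;
}

/* If the backend instead materialises the inverted condition as the
 * mask (nm == ~m), the same result is a plain and. */
static uint32_t select_true_zero_and(uint32_t nm, uint32_t f)
{
    return f & nm;
}
```

Either way the zero never needs to be loaded into a register, which is the saving the message refers to.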
|
db4121d2 | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Do not expand cmpsel_vec early
Expand during output instead of during opcode generation. Remove x86_vpblendvb_vec opcode, as this removes the only user.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
b8a56703 | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Do not expand cmp_vec early
Move most of expansion to opcode generation, leaving the conversion of unsigned to signed to be done in the early phase. Small inefficiencies, but not incorrect results, are introduced until cmpsel_vec is converted in the next patch.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
45ccdbcb | 31-May-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
host/i386: assume presence of POPCNT
QEMU now requires an x86-64-v2 host, which has the POPCNT instruction. Use it freely in TCG-generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
19517b83 | 26-Mar-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Optimize setcond of TST{EQ,NE} with 0xffffffff
This may be treated as a 32-bit EQ/NE comparison against 0, which is in turn treated as a LTU/GEU comparison against 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
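The chain of equivalences in this message can be checked with a small C model. The three helpers below are hypothetical names for the three forms of the same condition; they model the predicate only, not the emitted instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* TSTEQ with mask 0xffffffff: true when no bit in the low 32 bits is set. */
static bool tsteq_ffffffff(uint64_t x)
{
    return (x & 0xffffffffu) == 0;
}

/* The same condition as a 32-bit equality comparison against 0. */
static bool eq32_zero(uint64_t x)
{
    return (uint32_t)x == 0;
}

/* The same condition again as a 32-bit unsigned "less than 1":
 * 0 is the only unsigned value below 1. */
static bool ltu32_one(uint64_t x)
{
    return (uint32_t)x < 1u;
}
```

TSTNE follows the same pattern with NE and GEU in place of EQ and LTU.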
|
be1335db | 09-Jan-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
tcg/i386: Use TEST r,r to test 8/16/32 bits
Just like when testing against the sign bits, TEST r,r can be used when the immediate is 0xff, 0xff00, 0xffff, 0xffffffff.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
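A rough C model of the zero-flag outcome, assuming each immediate corresponds to testing a narrower sub-register against itself (the low byte for 0xff, presumably the second byte for 0xff00, the low 16 bits for 0xffff). The helper names are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Zero-flag result of testing a register against an immediate mask. */
static bool zf_test_imm(uint32_t x, uint32_t imm)
{
    return (x & imm) == 0;
}

/* Zero-flag result of testing the low byte against itself. */
static bool zf_test_low8(uint32_t x)
{
    uint8_t r = (uint8_t)x;
    return (r & r) == 0;
}

/* Zero-flag result of testing bits 8..15 against themselves. */
static bool zf_test_high8(uint32_t x)
{
    uint8_t r = (uint8_t)(x >> 8);
    return (r & r) == 0;
}

/* Zero-flag result of testing the low 16 bits against themselves. */
static bool zf_test_low16(uint32_t x)
{
    uint16_t r = (uint16_t)x;
    return (r & r) == 0;
}
```

In each case the register-register form gives the same answer as the masked form, so no immediate needs to be encoded.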
|
303214aa | 27-Oct-2023 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Support TCG_COND_TST{EQ,NE}
Merge tcg_out_testi into tcg_out_cmp and adjust the two uses.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
6749d85b | 24-Oct-2023 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Move tcg_cond_to_jcc[] into tcg_out_cmp
Return the x86 condition codes to use after the compare.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
afa37be4 | 28-Dec-2023 |
Paolo Bonzini <pbonzini@redhat.com> |
tcg/i386: use 8-bit OR or XOR for unsigned 8-bit immediates
In the case where OR or XOR has an 8-bit immediate between 128 and 255, we can operate on a low-byte register and shorten the output by two or three bytes (two if a prefix byte is needed for REX.B).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231228120524.70239-1-pbonzini@redhat.com>
[rth: Incorporate into switch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
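The reason a byte-sized operation is safe here can be shown with a short C model: an OR or XOR with an immediate in the range 128 to 255 only changes the low byte, so operating on the low byte alone yields the same full-width value. The helper names are hypothetical, and the sketch says nothing about the actual instruction encodings.

```c
#include <stdint.h>

/* Full-width OR with a zero-extended immediate. */
static uint32_t or_full(uint32_t x, uint32_t imm)
{
    return x | imm;
}

/* OR applied to the low byte only; the upper 24 bits pass through
 * untouched, as they would when writing a low-byte register. */
static uint32_t or_low_byte(uint32_t x, uint8_t imm)
{
    uint8_t lo = (uint8_t)((uint8_t)x | imm);
    return (x & ~0xffu) | lo;
}

/* The same argument holds for XOR. */
static uint32_t xor_full(uint32_t x, uint32_t imm)
{
    return x ^ imm;
}

static uint32_t xor_low_byte(uint32_t x, uint8_t imm)
{
    uint8_t lo = (uint8_t)((uint8_t)x ^ imm);
    return (x & ~0xffu) | lo;
}
```

The range matters because an 8-bit immediate on a wider operation is sign-extended, so values 128 to 255 would otherwise need a wider immediate.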
|