ce8e5f2f | 10-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/ppc: Optimize cmpsel with constant 0/-1 arguments
These can be simplified to and/or/andc/orc, avoiding the load of the constant into a register.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
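The identities behind this simplification can be sketched in plain C, treating the comparison result as a per-lane mask that is all-ones when true and zero when false. This is an illustration only, not the backend code; every helper name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* General select for one lane: mask is all-ones (true) or zero (false). */
static uint32_t select_lane(uint32_t mask, uint32_t t, uint32_t f)
{
    return (mask & t) | (~mask & f);
}

/* Hypothetical helpers, one per constant case. */
static uint32_t sel_false_is_zero(uint32_t mask, uint32_t t) { return mask & t; }   /* and  */
static uint32_t sel_true_is_ones(uint32_t mask, uint32_t f)  { return mask | f; }   /* or   */
static uint32_t sel_true_is_zero(uint32_t mask, uint32_t f)  { return f & ~mask; }  /* andc */
static uint32_t sel_false_is_ones(uint32_t mask, uint32_t t) { return t | ~mask; }  /* orc  */

/* Compare each shortcut against the general select; returns 1 on success. */
static int check_cmpsel_identities(void)
{
    const uint32_t masks[] = { 0u, UINT32_MAX };
    const uint32_t vals[] = { 0u, 1u, 0x12345678u, UINT32_MAX };

    for (int m = 0; m < 2; m++) {
        for (int v = 0; v < 4; v++) {
            uint32_t k = masks[m], x = vals[v];
            assert(select_lane(k, x, 0u) == sel_false_is_zero(k, x));
            assert(select_lane(k, UINT32_MAX, x) == sel_true_is_ones(k, x));
            assert(select_lane(k, 0u, x) == sel_true_is_zero(k, x));
            assert(select_lane(k, x, UINT32_MAX) == sel_false_is_ones(k, x));
        }
    }
    return 1;
}
```

In each case the constant operand disappears from the expression, which is why no register needs to be loaded with 0 or -1.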
|
d0dabf9e | 09-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/ppc: Implement cmpsel_vec
Do not allow cmpsel_vec to be expanded early, so that we can make the correct decision wrt the sense of the comparison.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
d5896749 | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Implement cmpsel_vec with avx512 insns
The avx512 vpblendm* instructions exactly implement cmpsel, using a predicate input. Of course this matches nicely with the avx512 predicate comparison instructions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
c044ec0d | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Add predicate parameters to tcg_out_evex_opc
Extend tcg_out_evex_opc to handle the predicate and zero-merging parameters of the evex prefix.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
717da87d | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Implement cmp_vec with avx512 insns
The sse/avx instruction set only has EQ and GT as direct comparisons. Other signed comparisons can be generated from swapping and inversion. However unsigned comparisons are not available and must be transformed to signed comparisons by biasing the inputs.
The avx512 instruction set has a complete set of comparisons, with results placed into a predicate register. We can produce the normal cmp_vec result by using VPMOVM2*.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
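The biasing trick mentioned above can be sketched for a single 32-bit lane: flipping the sign bit of both inputs maps unsigned order onto signed order, so a signed comparison then gives the unsigned answer. This is a scalar illustration under that assumption, not the vector code; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Unsigned a < b computed with a signed comparison.
 * XOR with 0x80000000 biases both inputs by 2^31.
 * The uint32_t -> int32_t conversion is implementation-defined before C23,
 * but behaves as two's complement on common compilers.
 */
static bool ltu_via_signed(uint32_t a, uint32_t b)
{
    int32_t sa = (int32_t)(a ^ 0x80000000u);
    int32_t sb = (int32_t)(b ^ 0x80000000u);
    return sa < sb;
}

/* Spot-check the biased comparison against the native unsigned one. */
static int check_ltu_via_signed(void)
{
    const uint32_t vals[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, UINT32_MAX };

    for (int i = 0; i < 5; i++) {
        for (int j = 0; j < 5; j++) {
            assert(ltu_via_signed(vals[i], vals[j]) == (vals[i] < vals[j]));
        }
    }
    return 1;
}
```

The extra XOR on each input is the cost that a complete set of native comparisons avoids.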
|
d8387f0e | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Optimize cmpsel with constant 0 operand 3.
These can be simplified to and/andc, avoiding the load of the zero into a register.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
1f106544 | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/optimize: Optimize cmp_vec and cmpsel_vec
Place immediate values second in the comparison. Place destination matches first in the true/false values. All of this mirrors what we do for integer setcond and movcond.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
141125e0 | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/optimize: Fold movcond with true and false values identical
Fold "x = cond ? y : y" to "x = y".
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
db4121d2 | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Do not expand cmpsel_vec early
Expand during output instead of during opcode generation. Remove the x86_vpblendvb_vec opcode, as this removes the only user.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
b8a56703 | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Do not expand cmp_vec early
Move most of expansion to opcode generation, leaving the conversion of unsigned to signed to be done in the early phase. Small inefficiencies, but not incorrect results, are introduced until cmpsel_vec is converted in the next patch.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
bc97b3ad | 06-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/i386: Split out tcg_out_vex_modrm_type
Helper function to handle setting of VEXL based on the type of the operation.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
8dd2ea75 | 07-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg: Export vec_gen_6
Add declaration to tcg-internal.h, making it available for use from tcg backend vector expanders.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
9d8d5a5b | 04-Sep-2024 |
TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com> |
tcg: Fix iteration step in 32-bit gvec operation
The loop in the 32-bit case of the vector compare operation was incorrectly incrementing by 8 bytes per iteration instead of 4 bytes. This caused the function to process only half of the intended elements.
Cc: qemu-stable@nongnu.org
Fixes: 9622c697d1 (tcg: Add gvec compare with immediate and scalar operand)
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240904142739.854-2-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
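The effect of the wrong step can be reproduced with a small stand-alone loop. The sketch below is not the QEMU function; cmp_eq_scalar32 and its step parameter are hypothetical, introduced only so that both step sizes can be run side by side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: for each 32-bit lane visited, write all-ones to dst
 * when the source lane equals c, else zero. oprsz and step are in bytes.
 */
static void cmp_eq_scalar32(uint32_t *dst, const uint32_t *src, uint32_t c,
                            size_t oprsz, size_t step)
{
    assert(step != 0 && oprsz % sizeof(uint32_t) == 0);

    for (size_t i = 0; i < oprsz; i += step) {
        size_t lane = i / sizeof(uint32_t);
        dst[lane] = (src[lane] == c) ? UINT32_MAX : 0u;
    }
}
```

Called on a four-lane buffer with a step of 4 bytes, every lane is written; with a step of 8 bytes only lanes 0 and 2 are visited, which matches the "only half of the intended elements" symptom described in the commit message.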
|
83ac625c | 10-Sep-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg: Propagate new TCGOp to add_as_label_use
The use of tcg_last_op does not interact well with TCGContext.emit_before_op, resulting in the label being linked to something other than the branch op.
In this case it is easier to simply collect the emitted branch op and pass it directly to add_as_label_use.
Reported-by: Elisha Hollander <just4now666666@gmail.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
a71d9dfb | 30-Jun-2024 |
Richard Henderson <richard.henderson@linaro.org> |
tcg/optimize: Fix TCG_COND_TST* simplification of setcond2
Argument ordering for setcond2 is:
output, a_low, a_high, b_low, b_high, cond
The test is supposed to be against b_low, not a_high.
Cc: qemu-stable@nongnu.org
Fixes: ceb9ee06b71 ("tcg/optimize: Handle TCG_COND_TST{EQ,NE}")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2413
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240701024623.1265028-1-richard.henderson@linaro.org>
|
4408155a | 24-May-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
meson: Drop the .fa library suffix
The non-standard .fa library suffix breaks the link source de-duplication done by Meson, so drop it.
The lack of link source de-duplication causes AddressSanitizer to complain about ODR violations, and makes GNU ld abort when combined with clang's LTO.
Fortunately, the non-standard suffix is not necessary anymore for two reasons.
First, the non-standard suffix was necessary for fork-fuzzing. Meson wraps all standard-suffixed libraries with --start-group and --end-group. This made a fork-fuzz.ld linker script wrapped as well and broke builds. Commit d2e6f9272d33 ("fuzz: remove fork-fuzzing scaffolding") dropped fork-fuzzing so we can now restore the standard suffix.
Second, the libraries are not even built anymore, because it is possible to just use the object files directly via extract_all_objects().
The occurrences of the suffix were detected and removed by performing a tree-wide search for 'fa' and .fa (note the quotes and dot).
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240524-xkb-v4-4-2de564e5c859@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|