Message ID | 20210920080451.408655-15-git@xen0n.name (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | 64-bit LoongArch port of QEMU TCG | expand |
On 9/20/21 1:04 AM, WANG Xuerui wrote: > + case INDEX_op_bswap32_i32: > + tcg_out_opc_revb_2h(s, a0, a1); > + tcg_out_opc_rotri_w(s, a0, a0, 16); > + break; > + case INDEX_op_bswap64_i64: > + tcg_out_opc_revb_d(s, a0, a1); > + break; You're missing INDEX_op_bswap32_i64, which in addition has a third argument consisting of TCG_BSWAP_* bits. I would have expected revb_2w to be the preferred implementation of bswap32. I would expect something like case INDEX_op_bswap32_i32: /* All 32-bit values are computed sign-extended in the register. */ a2 = TCG_BSWAP_OS; /* fall through */ case INDEX_op_bswap32_i64: tcg_out_opc_revb_2w(s, a0, a1); if (a2 & TCG_BSWAP_OS) { tcg_out_ext32s(s, a0, a0); } else if (a2 & TCG_BSWAP_OZ) { tcg_out_ext32u(s, a0, a0); } break; r~
On 9/20/21 8:11 AM, Richard Henderson wrote: > } else if (a2 & TCG_BSWAP_OZ) { > tcg_out_ext32u(s, a0, a0); > } Actually, if ((a2 & (TCG_BSWAP_IZ | TCG_BSWAP_OZ)) == TCG_BSWAP_OZ) If the input is zero-extended, the output of revb_2w will also be zero-extended already. r~
Hi Richard, On 9/20/21 23:11, Richard Henderson wrote: > On 9/20/21 1:04 AM, WANG Xuerui wrote: >> + case INDEX_op_bswap32_i32: >> + tcg_out_opc_revb_2h(s, a0, a1); >> + tcg_out_opc_rotri_w(s, a0, a0, 16); >> + break; >> + case INDEX_op_bswap64_i64: >> + tcg_out_opc_revb_d(s, a0, a1); >> + break; > > You're missing INDEX_op_bswap32_i64, which in addition has a third > argument consisting of TCG_BSWAP_* bits. > > I would have expected revb_2w to be the preferred implementation of > bswap32. I would expect something like > > > case INDEX_op_bswap32_i32: > /* All 32-bit values are computed sign-extended in the > register. */ > a2 = TCG_BSWAP_OS; > /* fall through */ > case INDEX_op_bswap32_i64: > tcg_out_opc_revb_2w(s, a0, a1); > if (a2 & TCG_BSWAP_OS) { > tcg_out_ext32s(s, a0, a0); > } else if (a2 & TCG_BSWAP_OZ) { > tcg_out_ext32u(s, a0, a0); > } > break; > You're right when we're supporting only 64-bit hosts. While I was writing that code I hadn't decided whether to remove support for 32-bit hosts, so I didn't make use of 64-bit instructions for the 32-bit ops. I'll fix this in v2. > > r~
diff --git a/tcg/loongarch/tcg-target.c.inc b/tcg/loongarch/tcg-target.c.inc index e5356bdaf8..d617b833e5 100644 --- a/tcg/loongarch/tcg-target.c.inc +++ b/tcg/loongarch/tcg-target.c.inc @@ -480,6 +480,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, tcg_out_opc_bstrins_d(s, a0, a2, args[3], args[3] + args[4] - 1); break; + case INDEX_op_bswap32_i32: + tcg_out_opc_revb_2h(s, a0, a1); + tcg_out_opc_rotri_w(s, a0, a0, 16); + break; + case INDEX_op_bswap64_i64: + tcg_out_opc_revb_d(s, a0, a1); + break; + case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */ case INDEX_op_mov_i64: default: @@ -511,6 +519,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) case INDEX_op_not_i64: case INDEX_op_extract_i32: case INDEX_op_extract_i64: + case INDEX_op_bswap32_i32: + case INDEX_op_bswap64_i64: return C_O1_I1(r, r); case INDEX_op_nor_i32:
Signed-off-by: WANG Xuerui <git@xen0n.name> --- tcg/loongarch/tcg-target.c.inc | 10 ++++++++++ 1 file changed, 10 insertions(+)