Message ID | 1479901039-7113-10-git-send-email-nikunj@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>
> vextublx: Vector Extract Unsigned Byte Left
> vextuhlx: Vector Extract Unsigned Halfword Left
> vextuwlx: Vector Extract Unsigned Word Left
>
> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

So, when I suggested doing these without helpers before, I had
forgotten that the non-byte versions can straddle the word boundary.
Given that the offset is in a register, not the instruction that does
make it complicated.

But, this version also relies on working 128-bit arithmetic, AFAICT
this will just fail to build if CONFIG_INT128 isn't defined.

It really shouldn't be that hard to make a helper that works just in
terms of 64-bit arithmetic - there are only 3 cases (all in the upper
word, all in the lower, and straddling).

I'd prefer to see it done that way, rather than increasing reliance on
CONFIG_INT128.
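For illustration only, the 64-bit approach described above could look roughly like the sketch below. It is not code from the patch or from QEMU. The function name extract_left64 and the hi/lo parameters are hypothetical, and it assumes the vector is held as two 64-bit halves in left-to-right element order (hi holds bytes 0..7 with byte 0 most significant, lo holds bytes 8..15). It handles the three cases named above, and additionally zero-pads when the element would run past byte 15, which mirrors what the byte-array loop in the posted patch does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: extract `nbytes` bytes (1, 2 or 4) starting at byte
 * `index` counted from the left of a 128-bit value, using only 64-bit
 * arithmetic.  `hi` is assumed to hold bytes 0..7 and `lo` bytes 8..15,
 * each with its leftmost byte in the most significant position.
 */
static uint64_t extract_left64(uint64_t hi, uint64_t lo,
                               unsigned index, unsigned nbytes)
{
    assert(nbytes == 1 || nbytes == 2 || nbytes == 4);

    unsigned bits = nbytes * 8;               /* element width in bits */
    unsigned start = (index & 0xf) * 8;       /* bit offset from the left */
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t r;

    if (start + bits <= 64) {
        /* Case 1: element lies entirely in the upper doubleword. */
        r = hi >> (64 - start - bits);
    } else if (start >= 64) {
        /* Case 2: element starts in the lower doubleword. */
        unsigned s = start - 64;
        if (s + bits <= 64) {
            r = lo >> (64 - s - bits);
        } else {
            /* Runs past byte 15: shift in zeros for the missing bytes. */
            r = lo << (s + bits - 64);
        }
    } else {
        /* Case 3: element straddles the two doublewords. */
        unsigned lo_bits = start + bits - 64; /* bits taken from lo: 8..24 */
        r = (hi << lo_bits) | (lo >> (64 - lo_bits));
    }
    return r & mask;
}
```

All shift counts stay between 0 and 56, so no shift by the full register width (undefined in C) can occur. A real helper would still have to map ppc_avr_t onto the hi/lo halves for each host endianness, which this sketch leaves out.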
David Gibson <david@gibson.dropbear.id.au> writes:

> [ Unknown signature status ]
> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>
>> vextublx: Vector Extract Unsigned Byte Left
>> vextuhlx: Vector Extract Unsigned Halfword Left
>> vextuwlx: Vector Extract Unsigned Word Left
>>
>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>
> So, when I suggested doing these without helpers before, I had
> forgotten that the non-byte versions can straddle the word boundary.
> Given that the offset is in a register, not the instruction that does
> make it complicated.
>
> But, this version also relies on working 128-bit arithmetic, AFAICT
> this will just fail to build if CONFIG_INT128 isn't defined.

It has both the implementation, just that the defines might have
confused you:

#if defined(HOST_WORDS_BIGENDIAN)

# if defined(CONFIG_INT128)
# else
# endif

#else /* !defined (HOST_WORDS_BIGENDIAN) */

# if defined(CONFIG_INT128)
# else
# endif

#endif

> It really shouldn't be that hard to make a helper that works just in
> terms of 64-bit arithmetic - there are only 3 cases (all in the upper
> word, all in the lower, and straddling).

Currently, it's being done using byte array.

+{                                                               \
+    target_ulong r = 0;                                         \
+    int i;                                                      \
+    int index = a & 0xf;                                        \
+    for (i = 0; i < elem; i++) {                                \
+        r = r << 8;                                             \
+        if (index + i <= 15) {                                  \
+            r = r | b->u8[index + i];                           \
+        }                                                       \
+    }                                                           \
+    return r;                                                   \
+}

> I'd prefer to see it done that way, rather than increasing reliance on
> CONFIG_INT128.

Regards
Nikunj
On 11/24/2016 06:53 AM, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
>
>> [ Unknown signature status ]
>> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>>
>>> vextublx: Vector Extract Unsigned Byte Left
>>> vextuhlx: Vector Extract Unsigned Halfword Left
>>> vextuwlx: Vector Extract Unsigned Word Left
>>>
>>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>
>> So, when I suggested doing these without helpers before, I had
>> forgotten that the non-byte versions can straddle the word boundary.
>> Given that the offset is in a register, not the instruction that does
>> make it complicated.
>>
>> But, this version also relies on working 128-bit arithmetic, AFAICT
>> this will just fail to build if CONFIG_INT128 isn't defined.
>
> It has both the implementation, just that the defines might have
> confused you:
>
> #if defined(HOST_WORDS_BIGENDIAN)
>
> # if defined(CONFIG_INT128)
> # else
> # endif
>
> #else /* !defined (HOST_WORDS_BIGENDIAN) */
>
> # if defined(CONFIG_INT128)
> # else
> # endif
>
> #endif

In include/qemu/int128.h, we do have int128_rshift. So you don't *really* have
to do this by hand, exactly.

r~
Richard Henderson <rth@twiddle.net> writes:

> On 11/24/2016 06:53 AM, Nikunj A Dadhania wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>>> [ Unknown signature status ]
>>> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>>>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>>>
>>>> vextublx: Vector Extract Unsigned Byte Left
>>>> vextuhlx: Vector Extract Unsigned Halfword Left
>>>> vextuwlx: Vector Extract Unsigned Word Left
>>>>
>>>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>>
>>> So, when I suggested doing these without helpers before, I had
>>> forgotten that the non-byte versions can straddle the word boundary.
>>> Given that the offset is in a register, not the instruction that does
>>> make it complicated.
>>>
>>> But, this version also relies on working 128-bit arithmetic, AFAICT
>>> this will just fail to build if CONFIG_INT128 isn't defined.
>>
>> It has both the implementation, just that the defines might have
>> confused you:
>>
>> #if defined(HOST_WORDS_BIGENDIAN)
>>
>> # if defined(CONFIG_INT128)
>> # else
>> # endif
>>
>> #else /* !defined (HOST_WORDS_BIGENDIAN) */
>>
>> # if defined(CONFIG_INT128)
>> # else
>> # endif
>>
>> #endif
>
> In include/qemu/int128.h, we do have int128_rshift. So you don't *really* have
> to do this by hand, exactly.

Sure, let me add int128_extract as well. Will be helpful.

Regards
Nikunj
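As a rough idea of what a shift-then-mask extract could look like without compiler 128-bit support, here is a standalone sketch. It does not use or reproduce QEMU's Int128 API. The type U128Sketch and the functions u128_rshift_sketch and u128_extract_sketch are hypothetical names made up for this example, and the shift here is a logical one on an unsigned pair of halves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value; not QEMU's Int128 type. */
typedef struct {
    uint64_t lo;   /* bits 0..63 */
    uint64_t hi;   /* bits 64..127 */
} U128Sketch;

/* Logical right shift by n bits, 0 <= n <= 127. */
static U128Sketch u128_rshift_sketch(U128Sketch v, unsigned n)
{
    U128Sketch r;

    assert(n < 128);
    if (n == 0) {
        r = v;                          /* avoid a shift by 64 below */
    } else if (n < 64) {
        r.lo = (v.lo >> n) | (v.hi << (64 - n));
        r.hi = v.hi >> n;
    } else {
        r.lo = v.hi >> (n - 64);
        r.hi = 0;
    }
    return r;
}

/*
 * Extract `length` bits (1..64) starting at bit `start`, counted from the
 * least significant bit.  The caller must keep start + length <= 128.
 */
static uint64_t u128_extract_sketch(U128Sketch v, unsigned start,
                                    unsigned length)
{
    assert(length >= 1 && length <= 64 && start + length <= 128);

    uint64_t mask = length == 64 ? UINT64_MAX
                                 : (UINT64_C(1) << length) - 1;
    return u128_rshift_sketch(v, start).lo & mask;
}
```

The assertion on start + length makes the out-of-range case explicit. A caller extracting an element near the end of the vector would need to clamp or zero-pad before calling, rather than pass a negative or oversized start.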
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 3b26678..d0a8fb2 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -366,6 +366,9 @@ DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumh, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumw, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
+DEF_HELPER_2(vextublx, tl, tl, avr)
+DEF_HELPER_2(vextuhlx, tl, tl, avr)
+DEF_HELPER_2(vextuwlx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index fbf477f..ce6cff1 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1805,6 +1805,71 @@ void helper_vlogefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
     }
 }
 
+#ifdef CONFIG_INT128
+#define EXTRACT128(value, start, length) \
+    ((value >> start) & (~(__uint128_t)0 >> (128 - length)))
+#endif
+
+#if defined(HOST_WORDS_BIGENDIAN)
+# if defined(CONFIG_INT128)
+# define VEXTULX_DO(name, elem) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b) \
+{ \
+    target_ulong r = 0; \
+    int index = (a & 0xf) * 8; \
+    r = EXTRACT128(b->u128, index, elem * 8); \
+    return r; \
+}
+# else
+# define VEXTULX_DO(name, elem) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b) \
+{ \
+    target_ulong r = 0; \
+    int i; \
+    int index = a & 0xf; \
+    for (i = 0; i < elem; i++) { \
+        r = r << 8; \
+        if (index + i <= 15) { \
+            r = r | b->u8[index + i]; \
+        } \
+    } \
+    return r; \
+}
+# endif
+#else
+# if defined(CONFIG_INT128)
+# define VEXTULX_DO(name, elem) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b) \
+{ \
+    target_ulong r = 0; \
+    int size = elem * 8; \
+    int index = (15 - (a & 0xf) + 1) * 8; \
+    r = EXTRACT128(b->u128, (index - size), size); \
+    return r; \
+}
+# else
+# define VEXTULX_DO(name, elem) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b) \
+{ \
+    target_ulong r = 0; \
+    int i; \
+    int index = 15 - (a & 0xf); \
+    for (i = 0; i < elem; i++) { \
+        r = r << 8; \
+        if (index - i >= 0) { \
+            r = r | b->u8[index - i]; \
+        } \
+    } \
+    return r; \
+}
+# endif
+#endif
+
+VEXTULX_DO(vextublx, 1)
+VEXTULX_DO(vextuhlx, 2)
+VEXTULX_DO(vextuwlx, 4)
+#undef VEXTULX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 7143eb3..e91d10b 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -340,6 +340,19 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
     } \
 }
 
+#define GEN_VXFORM_HETRO(name, opc2, opc3) \
+static void glue(gen_, name)(DisasContext *ctx) \
+{ \
+    TCGv_ptr rb; \
+    if (unlikely(!ctx->altivec_enabled)) { \
+        gen_exception(ctx, POWERPC_EXCP_VPU); \
+        return; \
+    } \
+    rb = gen_avr_ptr(rB(ctx->opcode)); \
+    gen_helper_##name(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)], rb); \
+    tcg_temp_free_ptr(rb); \
+}
+
 GEN_VXFORM(vaddubm, 0, 0);
 GEN_VXFORM_DUAL_EXT(vaddubm, PPC_ALTIVEC, PPC_NONE, 0, \
                 vmul10cuq, PPC_NONE, PPC2_ISA300, 0x0000F800)
@@ -525,6 +538,11 @@ GEN_VXFORM_ENV(vaddfp, 5, 0);
 GEN_VXFORM_ENV(vsubfp, 5, 1);
 GEN_VXFORM_ENV(vmaxfp, 5, 16);
 GEN_VXFORM_ENV(vminfp, 5, 17);
+GEN_VXFORM_HETRO(vextublx, 6, 24)
+GEN_VXFORM_HETRO(vextuhlx, 6, 25)
+GEN_VXFORM_HETRO(vextuwlx, 6, 26)
+GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
+                vextuwlx, PPC_NONE, PPC2_ISA300)
 
 #define GEN_VXRFORM1(opname, name, str, opc2, opc3) \
 static void glue(gen_, name)(DisasContext *ctx) \
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index f02b3be..e62e564 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -91,8 +91,10 @@ GEN_VXFORM(vmrghw, 6, 2),
 GEN_VXFORM(vmrglb, 6, 4),
 GEN_VXFORM(vmrglh, 6, 5),
 GEN_VXFORM(vmrglw, 6, 6),
+GEN_VXFORM_300(vextublx, 6, 24),
+GEN_VXFORM_300(vextuhlx, 6, 25),
+GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_207(vmrgew, 6, 30),
-GEN_VXFORM_207(vmrgow, 6, 26),
 GEN_VXFORM(vmuloub, 4, 0),
 GEN_VXFORM(vmulouh, 4, 1),
 GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),