Message ID | 1453067939-9121-6-git-send-email-haozhong.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 17.01.16 at 22:58, <haozhong.zhang@intel.com> wrote: > This patch adds several functions to take multiplication, division and > shifting involving 64-bit integers. > > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > --- > Changes in v4: > (addressing Jan Beulich's comments) > * Rewrite mul_u64_u64_shr() in assembly. Thanks, but it puzzles me that the other one didn't get converted as well. Anyway, I'm not going to make this a requirement, since at least it appears to match Linux'es variant. > +static inline u64 mul_u64_u64_shr(u64 a, u64 mul, unsigned int n) > +{ > + u64 hi, lo; > + > + asm volatile ( "mulq %2; shrdq %1,%0" > + : "=a" (lo), "=d" (hi) > + : "rm" (mul), "0" (a), "c" (n) ); SHRD formally is a 3-operand instruction, and the fact that gas' AT&T syntax supports a 2-operand "alias" is, well, odd. Please let's use the specification mandated 3-operand form properly, to avoid surprises with e.g. clang. Jan
On 02/05/16 21:36, Jan Beulich wrote: > >>> On 17.01.16 at 22:58, <haozhong.zhang@intel.com> wrote: > > This patch adds several functions to take multiplication, division and > > shifting involving 64-bit integers. > > > > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> > > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > > --- > > Changes in v4: > > (addressing Jan Beulich's comments) > > * Rewrite mul_u64_u64_shr() in assembly. > > Thanks, but it puzzles me that the other one didn't get converted > as well. Anyway, I'm not going to make this a requirement, since > at least it appears to match Linux'es variant. > I can't remember why I didn't rewrite mul_u64_u32_div(), especially when it can be easily implemented as static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor) { u64 quotient, remainder; asm volatile ( "mulq %3; divq %4" : "=a" (quotient), "=d" (remainder) : "0" (a), "rm" ((u64) mul), "c" ((u64) divisor) ); return quotient; } I'll modify it in the next version. > > +static inline u64 mul_u64_u64_shr(u64 a, u64 mul, unsigned int n) > > +{ > > + u64 hi, lo; > > + > > + asm volatile ( "mulq %2; shrdq %1,%0" > > + : "=a" (lo), "=d" (hi) > > + : "rm" (mul), "0" (a), "c" (n) ); > > SHRD formally is a 3-operand instruction, and the fact that gas' > AT&T syntax supports a 2-operand "alias" is, well, odd. Please > let's use the specification mandated 3-operand form properly, > to avoid surprises with e.g. clang. > OK, I'll change it to the 3-operand form. Haozhong
>>> On 16.02.16 at 10:02, <haozhong.zhang@intel.com> wrote: > On 02/05/16 21:36, Jan Beulich wrote: >> >>> On 17.01.16 at 22:58, <haozhong.zhang@intel.com> wrote: >> > This patch adds several functions to take multiplication, division and >> > shifting involving 64-bit integers. >> > >> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> >> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> > --- >> > Changes in v4: >> > (addressing Jan Beulich's comments) >> > * Rewrite mul_u64_u64_shr() in assembly. >> >> Thanks, but it puzzles me that the other one didn't get converted >> as well. Anyway, I'm not going to make this a requirement, since >> at least it appears to match Linux'es variant. >> > > I can't remember why I didn't rewrite mul_u64_u32_div(), especially when > it can be easily implemented as > > static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor) > { > u64 quotient, remainder; > > asm volatile ( "mulq %3; divq %4" > : "=a" (quotient), "=d" (remainder) > : "0" (a), "rm" ((u64) mul), "c" ((u64) divisor) ); > > return quotient; > } > > I'll modify it in the next version. Looks better, but the constraints aren't right (needing =& for both outputs, and "c" being too narrow). But iirc the question was anyway whether to better have even lower overhead inline assembly and single point of use. Jan
On 02/16/16 02:39, Jan Beulich wrote: > >>> On 16.02.16 at 10:02, <haozhong.zhang@intel.com> wrote: > > On 02/05/16 21:36, Jan Beulich wrote: > >> >>> On 17.01.16 at 22:58, <haozhong.zhang@intel.com> wrote: > >> > This patch adds several functions to take multiplication, division and > >> > shifting involving 64-bit integers. > >> > > >> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> > >> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > >> > --- > >> > Changes in v4: > >> > (addressing Jan Beulich's comments) > >> > * Rewrite mul_u64_u64_shr() in assembly. > >> > >> Thanks, but it puzzles me that the other one didn't get converted > >> as well. Anyway, I'm not going to make this a requirement, since > >> at least it appears to match Linux'es variant. > >> > > > > I can't remember why I didn't rewrite mul_u64_u32_div(), especially when > > it can be easily implemented as > > > > static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor) > > { > > u64 quotient, remainder; > > > > asm volatile ( "mulq %3; divq %4" > > : "=a" (quotient), "=d" (remainder) > > : "0" (a), "rm" ((u64) mul), "c" ((u64) divisor) ); > > > > return quotient; > > } > > > > I'll modify it in the next version. > > Looks better, but the constraints aren't right (needing =& for > both outputs, and "c" being too narrow). But iirc the question > was anyway whether to better have even lower overhead inline > assembly and single point of use. > Thank you for pointing out the constraint error! And indeed every function in this patch is used at single point. In my reply to your comments for patch 6, I'll modify and inline them at their use points. Haozhong
diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h new file mode 100644 index 0000000..2ddec62 --- /dev/null +++ b/xen/include/asm-x86/math64.h @@ -0,0 +1,47 @@ +#ifndef __X86_MATH64 +#define __X86_MATH64 + +/* + * (a * mul) / divisor + * + * This function is derived from Linux kernel + * (mul_u64_u32_div() in include/linux/math64.h) + */ +static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor) +{ + union { + u64 ll; + struct { + u32 low, high; + } l; + } u, rl, rh; + + u.ll = a; + rl.ll = (u64)u.l.low * mul; + rh.ll = (u64)u.l.high * mul + rl.l.high; + + /* Bits 32-63 of the result will be in rh.l.low. */ + rl.l.high = do_div(rh.ll, divisor); + + /* Bits 0-31 of the result will be in rl.l.low. */ + do_div(rl.ll, divisor); + + rl.l.high = rh.l.low; + return rl.ll; +} + +/* + * (a * mul) >> n + */ +static inline u64 mul_u64_u64_shr(u64 a, u64 mul, unsigned int n) +{ + u64 hi, lo; + + asm volatile ( "mulq %2; shrdq %1,%0" + : "=a" (lo), "=d" (hi) + : "rm" (mul), "0" (a), "c" (n) ); + + return lo; +} + +#endif