Message ID | 1585119021-46593-2-git-send-email-robert.hu@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/2] util/bufferiszero: assign length_to_accel value for each accelerator case |
On 3/25/20 1:50 AM, Robert Hoo wrote:
> By increasing avx2 length_to_accel to 128, we can simplify its logic and reduce a
> branch.
>
> The authorship of this patch actually belongs to Richard Henderson <richard.henderson@linaro.org>,

Long line; it's nice to wrap commit messages around column 70 or so (because reading 'git log' in an 80-column window adds indentation).

> I just fix a boudary case on his original patch.

boundary

>
> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> ---
> util/bufferiszero.c | 26 +++++++++-----------------
> 1 file changed, 9 insertions(+), 17 deletions(-)
>
(Don't know why my Linux-Evolution missed this mail.)

> -----Original Message-----
> From: Eric Blake <eblake@redhat.com>
> Sent: Wednesday, March 25, 2020 20:54
> To: Robert Hoo <robert.hu@linux.intel.com>; qemu-devel@nongnu.org;
> pbonzini@redhat.com; richard.henderson@linaro.org
> Cc: Hu, Robert <robert.hu@intel.com>
> Subject: Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
>
> On 3/25/20 1:50 AM, Robert Hoo wrote:
> > By increasing avx2 length_to_accel to 128, we can simplify its logic
> > and reduce a branch.
> >
> > The authorship of this patch actually belongs to Richard Henderson
> > <richard.henderson@linaro.org>,
>
> Long line; it's nice to wrap commit messages around column 70 or so (because
> reading 'git log' in an 80-column window adds indentation).
>

[Hu, Robert]
I think I set my vim on wrap. This probably escaped by paste.
I ran checkpatch.pl on the patches before sending. It escaped check but didn't
escaped your eagle eye
On 26/03/20 03:09, Hu, Robert wrote:
> BTW, do I need to resend these 2 patches?
No, thanks! I have queued them.
Paolo
On 3/25/20 9:09 PM, Hu, Robert wrote:
> (Don't know why my Linux-Evolution missed this mail.)
>> -----Original Message-----
>> Long line; it's nice to wrap commit messages around column 70 or so (because
>> reading 'git log' in an 80-column window adds indentation).
>>
> [Hu, Robert]
> I think I set my vim on wrap. This probably escaped by paste.
> I ran checkpatch.pl on the patches before sending. It escaped check but didn't
> escaped your eagle eye
On Thu, 2020-03-26 at 08:26 -0500, Eric Blake wrote:
> On 3/25/20 9:09 PM, Hu, Robert wrote:
> > (Don't know why my Linux-Evolution missed this mail.)
> > > -----Original Message-----
> > > Long line; it's nice to wrap commit messages around column 70 or
> > > so (because
> > > reading 'git log' in an 80-column window adds indentation).
> > >
> >
> > [Hu, Robert]
> > I think I set my vim on wrap. This probably escaped by paste.
> > I ran checkpatch.pl on the patches before sending. It escaped check
> > but didn't
> > escaped your eagle eye
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index b801253..695bb4c 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -158,27 +158,19 @@ buffer_zero_avx2(const void *buf, size_t len)
     __m256i *p = (__m256i *)(((uintptr_t)buf + 5 * 32) & -32);
     __m256i *e = (__m256i *)(((uintptr_t)buf + len) & -32);
 
-    if (likely(p <= e)) {
-        /* Loop over 32-byte aligned blocks of 128. */
-        do {
-            __builtin_prefetch(p);
-            if (unlikely(!_mm256_testz_si256(t, t))) {
-                return false;
-            }
-            t = p[-4] | p[-3] | p[-2] | p[-1];
-            p += 4;
-        } while (p <= e);
-    } else {
-        t |= _mm256_loadu_si256(buf + 32);
-        if (len <= 128) {
-            goto last2;
+    /* Loop over 32-byte aligned blocks of 128. */
+    while (p <= e) {
+        __builtin_prefetch(p);
+        if (unlikely(!_mm256_testz_si256(t, t))) {
+            return false;
         }
-    }
+        t = p[-4] | p[-3] | p[-2] | p[-1];
+        p += 4;
+    } ;
 
     /* Finish the last block of 128 unaligned. */
     t |= _mm256_loadu_si256(buf + len - 4 * 32);
     t |= _mm256_loadu_si256(buf + len - 3 * 32);
- last2:
     t |= _mm256_loadu_si256(buf + len - 2 * 32);
     t |= _mm256_loadu_si256(buf + len - 1 * 32);
 
@@ -263,7 +255,7 @@ static void init_accel(unsigned cache)
     }
     if (cache & CACHE_AVX2) {
         fn = buffer_zero_avx2;
-        length_to_accel = 64;
+        length_to_accel = 128;
     }
 #endif
 #ifdef CONFIG_AVX512F_OPT
By increasing avx2 length_to_accel to 128, we can simplify its logic
and reduce a branch.

The authorship of this patch actually belongs to Richard Henderson
<richard.henderson@linaro.org>; I just fixed a boundary case in his
original patch.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
---
 util/bufferiszero.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)
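
To make the reasoning behind the higher threshold easier to follow, here is a minimal scalar sketch of the simplified control flow, written in portable C rather than AVX2 intrinsics. It is not the QEMU implementation: the names `BLOCK`, `MIN_LEN`, `or_block`, `model_zero_accel` and `model_is_zero` are hypothetical, and the byte-loop fallback merely stands in for whatever non-accelerated path the caller uses. The point it illustrates is that once the caller guarantees `len >= 128`, the four overlapping tail loads can never start before the buffer, so the `else` branch and the `last2` label are no longer needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch; BLOCK models one 32-byte AVX2 vector. */
enum { BLOCK = 32, MIN_LEN = 4 * BLOCK };

/* OR together the 32 bytes starting at p; memcpy permits unaligned input. */
static uint64_t or_block(const unsigned char *p)
{
    uint64_t w[4];
    memcpy(w, p, sizeof(w));
    return w[0] | w[1] | w[2] | w[3];
}

/* Scalar model of the simplified loop. The caller must pass len >= MIN_LEN. */
static bool model_zero_accel(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    const uintptr_t base = (uintptr_t)b;
    const uintptr_t mask = ~(uintptr_t)(BLOCK - 1);
    /* Offsets from buf of the aligned cursor p and the aligned end e. */
    size_t p = (size_t)(((base + 5 * BLOCK) & mask) - base);
    size_t e = (size_t)(((base + len) & mask) - base);
    uint64_t t = or_block(b);           /* unaligned head of 32 bytes */

    assert(len >= MIN_LEN);

    /* Aligned blocks of 128 bytes; may run zero times when len < 160. */
    while (p <= e) {
        if (t != 0) {
            return false;
        }
        t = or_block(b + p - 4 * BLOCK) | or_block(b + p - 3 * BLOCK)
          | or_block(b + p - 2 * BLOCK) | or_block(b + p - 1 * BLOCK);
        p += 4 * BLOCK;
    }

    /* Last 128 bytes, unaligned; in bounds only because len >= MIN_LEN. */
    t |= or_block(b + len - 4 * BLOCK);
    t |= or_block(b + len - 3 * BLOCK);
    t |= or_block(b + len - 2 * BLOCK);
    t |= or_block(b + len - 1 * BLOCK);
    return t == 0;
}

/* Dispatcher modelling the length threshold: short buffers take a
 * plain byte loop (an assumption of this sketch). */
static bool model_is_zero(const void *buf, size_t len)
{
    const unsigned char *b = buf;

    if (len >= MIN_LEN) {
        return model_zero_accel(buf, len);
    }
    for (size_t i = 0; i < len; i++) {
        if (b[i] != 0) {
            return false;
        }
    }
    return true;
}
```

With a threshold of 64, a buffer of 64 to 127 bytes would reach the tail code, and `buf + len - 4 * 32` would point before the start of the buffer; that is the case the removed `goto last2` path used to handle.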