[4/6] xen/bitops: Introduce for_each_set_bit()

Message ID	20240625190719.788643-5-andrew.cooper3@citrix.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org> From: Andrew Cooper <andrew.cooper3@citrix.com> To: Xen-devel <xen-devel@lists.xenproject.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com>, Jan Beulich <JBeulich@suse.com>, =?utf-8?q?Roger_Pau_Monn=C3=A9?= <roger.pau@citrix.com>, Stefano Stabellini <sstabellini@kernel.org>, Julien Grall <julien@xen.org>, Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>, Bertrand Marquis <bertrand.marquis@arm.com>, Michal Orzel <michal.orzel@amd.com>, Oleksii Kurochko <oleksii.kurochko@gmail.com> Subject: [PATCH 4/6] xen/bitops: Introduce for_each_set_bit() Date: Tue, 25 Jun 2024 20:07:17 +0100 Message-Id: <20240625190719.788643-5-andrew.cooper3@citrix.com> In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com> References: <20240625190719.788643-1-andrew.cooper3@citrix.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
Series	xen: Rework for_each_set_bit() \| expand [for-4.20,0/6] xen: Rework for_each_set_bit() [1/6] x86/vmx: Rewrite vmx_sync_pir_to_irr() to be more efficient [2/6] xen/bitops: Rename for_each_set_bit() to bitmap_for_each() [for-4.19?,3/6] xen/macros: Introduce BUILD_ERROR() [4/6] xen/bitops: Introduce for_each_set_bit() [5/6] ARM/vgic: Use for_each_set_bit() in vgic_set_irqs_pending() [6/6] x86/xstate: Switch back to for_each_set_bit()

Message ID

20240625190719.788643-5-andrew.cooper3@citrix.com (mailing list archive)

State

New, archived

Headers

Errors-To: xen-devel-bounces@lists.xenproject.org
Precedence: list
Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org>
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Xen-devel <xen-devel@lists.xenproject.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
 Jan Beulich <JBeulich@suse.com>,
 =?utf-8?q?Roger_Pau_Monn=C3=A9?= <roger.pau@citrix.com>,
 Stefano Stabellini <sstabellini@kernel.org>, Julien Grall <julien@xen.org>,
 Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>,
 Bertrand Marquis <bertrand.marquis@arm.com>,
 Michal Orzel <michal.orzel@amd.com>,
 Oleksii Kurochko <oleksii.kurochko@gmail.com>
Subject: [PATCH 4/6] xen/bitops: Introduce for_each_set_bit()
Date: Tue, 25 Jun 2024 20:07:17 +0100
Message-Id: <20240625190719.788643-5-andrew.cooper3@citrix.com>
In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com>
References: <20240625190719.788643-1-andrew.cooper3@citrix.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Series

xen: Rework for_each_set_bit() | expand

Commit Message

Andrew Cooper June 25, 2024, 7:07 p.m. UTC

The prior version (renamed to bitmap_for_each()) was inefficeint when used
over a scalar, but this is the more common usage even before accounting for
the many opencoded forms.

Introduce a new version which operates on scalars only and does so without
spilling them to memory.  This in turn requires the addition of a type-generic
form of ffs().

Add testing for the new construct alongside the ffs/fls testing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Oleksii Kurochko <oleksii.kurochko@gmail.com>

The naming of ffs_g() is taken from the new compiler builtins which are using
a g suffix to mean type-generic.
---
 xen/common/bitops.c      | 29 +++++++++++++++++++++++++++++
 xen/include/xen/bitops.h | 24 ++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

Comments

Jan Beulich June 26, 2024, 10:17 a.m. UTC | #1

On 25.06.2024 21:07, Andrew Cooper wrote:
> The prior version (renamed to bitmap_for_each()) was inefficeint when used
> over a scalar, but this is the more common usage even before accounting for
> the many opencoded forms.
> 
> Introduce a new version which operates on scalars only and does so without
> spilling them to memory.  This in turn requires the addition of a type-generic
> form of ffs().
> 
> Add testing for the new construct alongside the ffs/fls testing.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with two remarks:

> The naming of ffs_g() is taken from the new compiler builtins which are using
> a g suffix to mean type-generic.

As you say, a g suffix, not a _g one; gcc14 documents __builtin_ffsg().
(Seeing it exists I wonder whether we wouldn't want to use it when
available, and only fall back to the macro for older versions.) Any
specific reason you use _g?

> --- a/xen/include/xen/bitops.h
> +++ b/xen/include/xen/bitops.h
> @@ -56,6 +56,16 @@ static always_inline __pure unsigned int ffs64(uint64_t x)
>          return !x || (uint32_t)x ? ffs(x) : ffs(x >> 32) + 32;
>  }
>  
> +/*
> + * A type-generic ffs() which picks the appropriate ffs{,l,64}() based on it's
> + * argument.
> + */
> +#define ffs_g(x)                                        \
> +    sizeof(x) <= sizeof(int) ? ffs(x) :                 \
> +        sizeof(x) <= sizeof(long) ? ffsl(x) :           \
> +        sizeof(x) <= sizeof(uint64_t) ? ffs64(x) :      \
> +        ({ BUILD_ERROR("ffs_g() Bad input type"); 0; })

Related to the patch introducing it: I can see how BUILD_ERROR_ON() may
be deemed sub-optimal here. Nevertheless I think we should be able to
find some common ground there.

Jan

Jan Beulich June 26, 2024, 12:03 p.m. UTC | #2

On 25.06.2024 21:07, Andrew Cooper wrote:
> --- a/xen/include/xen/bitops.h
> +++ b/xen/include/xen/bitops.h
> @@ -56,6 +56,16 @@ static always_inline __pure unsigned int ffs64(uint64_t x)
>          return !x || (uint32_t)x ? ffs(x) : ffs(x >> 32) + 32;
>  }
>  
> +/*
> + * A type-generic ffs() which picks the appropriate ffs{,l,64}() based on it's
> + * argument.
> + */
> +#define ffs_g(x)                                        \
> +    sizeof(x) <= sizeof(int) ? ffs(x) :                 \
> +        sizeof(x) <= sizeof(long) ? ffsl(x) :           \
> +        sizeof(x) <= sizeof(uint64_t) ? ffs64(x) :      \
> +        ({ BUILD_ERROR("ffs_g() Bad input type"); 0; })

I guess the lack of parentheses around the entire construct might be the
reason for the problems you're seeing, as that ...

> @@ -92,6 +102,20 @@ static always_inline __pure unsigned int fls64(uint64_t x)
>      }
>  }
>  
> +/*
> + * for_each_set_bit() - Iterate over all set bits in a scalar value.
> + *
> + * @iter An iterator name.  Scoped is within the loop only.
> + * @val  A scalar value to iterate over.
> + *
> + * A copy of @val is taken internally.
> + */
> +#define for_each_set_bit(iter, val)                     \
> +    for ( typeof(val) __v = (val); __v; )               \
> +        for ( unsigned int (iter);                      \
> +              __v && ((iter) = ffs_g(__v) - 1, true);   \

... affects what effect the "- 1" here has.

Jan

Andrew Cooper June 26, 2024, 8:22 p.m. UTC | #3

On 26/06/2024 1:03 pm, Jan Beulich wrote:
> On 25.06.2024 21:07, Andrew Cooper wrote:
>> --- a/xen/include/xen/bitops.h
>> +++ b/xen/include/xen/bitops.h
>> @@ -56,6 +56,16 @@ static always_inline __pure unsigned int ffs64(uint64_t x)
>>          return !x || (uint32_t)x ? ffs(x) : ffs(x >> 32) + 32;
>>  }
>>  
>> +/*
>> + * A type-generic ffs() which picks the appropriate ffs{,l,64}() based on it's
>> + * argument.
>> + */
>> +#define ffs_g(x)                                        \
>> +    sizeof(x) <= sizeof(int) ? ffs(x) :                 \
>> +        sizeof(x) <= sizeof(long) ? ffsl(x) :           \
>> +        sizeof(x) <= sizeof(uint64_t) ? ffs64(x) :      \
>> +        ({ BUILD_ERROR("ffs_g() Bad input type"); 0; })
> I guess the lack of parentheses around the entire construct might be the
> reason for the problems you're seeing, as that ...
>
>> @@ -92,6 +102,20 @@ static always_inline __pure unsigned int fls64(uint64_t x)
>>      }
>>  }
>>  
>> +/*
>> + * for_each_set_bit() - Iterate over all set bits in a scalar value.
>> + *
>> + * @iter An iterator name.  Scoped is within the loop only.
>> + * @val  A scalar value to iterate over.
>> + *
>> + * A copy of @val is taken internally.
>> + */
>> +#define for_each_set_bit(iter, val)                     \
>> +    for ( typeof(val) __v = (val); __v; )               \
>> +        for ( unsigned int (iter);                      \
>> +              __v && ((iter) = ffs_g(__v) - 1, true);   \
> ... affects what effect the "- 1" here has.

Yes, it was this.

After fixing the bracketing, everything is green. 
https://gitlab.com/xen-project/people/andyhhp/xen/-/pipelines/1349243540

~Andrew

Andrew Cooper Aug. 23, 2024, 7:46 p.m. UTC | #4

On 26/06/2024 11:17 am, Jan Beulich wrote:
> On 25.06.2024 21:07, Andrew Cooper wrote:
>> The prior version (renamed to bitmap_for_each()) was inefficeint when used
>> over a scalar, but this is the more common usage even before accounting for
>> the many opencoded forms.
>>
>> Introduce a new version which operates on scalars only and does so without
>> spilling them to memory.  This in turn requires the addition of a type-generic
>> form of ffs().
>>
>> Add testing for the new construct alongside the ffs/fls testing.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> with two remarks:
>
>> The naming of ffs_g() is taken from the new compiler builtins which are using
>> a g suffix to mean type-generic.
> As you say, a g suffix, not a _g one; gcc14 documents __builtin_ffsg().
> (Seeing it exists I wonder whether we wouldn't want to use it when
> available, and only fall back to the macro for older versions.) Any
> specific reason you use _g?

Legibility.  It's bad enough with the l() and ll() forms (which get
worse with the hweight() series, and I can't find a nice option).

The other option I considered was have ffs() be the generic one, and
have ffsi() for the int variant, but now we're in active contradiction
with the builtin scheme, rather than merely opting not to use it.

I don't expect this to get much use in practice.  Use of this in regular
code mixing ints and longs is almost certainly a bug on behalf of the
programmer.

~Andrew

diff --git a/xen/common/bitops.c b/xen/common/bitops.c
index 94a8983af99c..9e532f0d87aa 100644
--- a/xen/common/bitops.c
+++ b/xen/common/bitops.c
@@ -84,8 +84,37 @@  static void __init test_fls(void)
     CHECK(fls64, 0x8000000000000001ULL, 64);
 }
 
+static void __init test_for_each_set_bit(void)
+{
+    unsigned int  ui,  ui_res = 0;
+    unsigned long ul,  ul_res = 0;
+    uint64_t      ull, ull_res = 0;
+
+    ui = HIDE(0x80008001U);
+    for_each_set_bit ( i, ui )
+        ui_res |= 1U << i;
+
+    if ( ui != ui_res )
+        panic("for_each_set_bit(uint) expected %#x, got %#x\n", ui, ui_res);
+
+    ul = HIDE(1UL << (BITS_PER_LONG - 1) | 1);
+    for_each_set_bit ( i, ul )
+        ul_res |= 1UL << i;
+
+    if ( ul != ul_res )
+        panic("for_each_set_bit(ulong) expected %#lx, got %#lx\n", ul, ul_res);
+
+    ull = HIDE(0x8000000180000001ULL);
+    for_each_set_bit ( i, ull )
+        ull_res |= 1ULL << i;
+
+    if ( ull != ull_res )
+        panic("for_each_set_bit(uint64) expected %#"PRIx64", got %#"PRIx64"\n", ull, ull_res);
+}
+
 static void __init __constructor test_bitops(void)
 {
     test_ffs();
     test_fls();
+    test_for_each_set_bit();
 }
diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
index 24de0835b7ab..84ffcb8d57bc 100644
--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -56,6 +56,16 @@  static always_inline __pure unsigned int ffs64(uint64_t x)
         return !x || (uint32_t)x ? ffs(x) : ffs(x >> 32) + 32;
 }
 
+/*
+ * A type-generic ffs() which picks the appropriate ffs{,l,64}() based on it's
+ * argument.
+ */
+#define ffs_g(x)                                        \
+    sizeof(x) <= sizeof(int) ? ffs(x) :                 \
+        sizeof(x) <= sizeof(long) ? ffsl(x) :           \
+        sizeof(x) <= sizeof(uint64_t) ? ffs64(x) :      \
+        ({ BUILD_ERROR("ffs_g() Bad input type"); 0; })
+
 static always_inline __pure unsigned int fls(unsigned int x)
 {
     if ( __builtin_constant_p(x) )
@@ -92,6 +102,20 @@  static always_inline __pure unsigned int fls64(uint64_t x)
     }
 }
 
+/*
+ * for_each_set_bit() - Iterate over all set bits in a scalar value.
+ *
+ * @iter An iterator name.  Scoped is within the loop only.
+ * @val  A scalar value to iterate over.
+ *
+ * A copy of @val is taken internally.
+ */
+#define for_each_set_bit(iter, val)                     \
+    for ( typeof(val) __v = (val); __v; )               \
+        for ( unsigned int (iter);                      \
+              __v && ((iter) = ffs_g(__v) - 1, true);   \
+              __v &= __v - 1 )
+
 /* --------------------- Please tidy below here --------------------- */
 
 #ifndef find_next_bit

[4/6] xen/bitops: Introduce for_each_set_bit()

Commit Message

Comments

Patch