Message ID | 20250223164217.2139331-3-visitorckw@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Introduce and use generic parity32/64 helper | expand |
On 23. 02. 25, 17:42, Kuan-Wei Chiu wrote: > Several parts of the kernel open-code parity calculations using > different methods. Add a generic parity64() helper implemented with the > same efficient approach as parity8(). > > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > include/linux/bitops.h | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > index fb13dedad7aa..67677057f5e2 100644 > --- a/include/linux/bitops.h > +++ b/include/linux/bitops.h > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) > return (0x6996 >> (val & 0xf)) & 1; > } > > +/** > + * parity64 - get the parity of an u64 value > + * @value: the value to be examined > + * > + * Determine the parity of the u64 argument. > + * > + * Returns: > + * 0 for even parity, 1 for odd parity > + */ > +static inline int parity64(u64 val) > +{ > + /* > + * One explanation of this algorithm: > + * https://funloop.org/codex/problem/parity/README.html > + */ > + val ^= val >> 32; Do we need all these implementations? Can't we simply use parity64() for any 8, 16 and 32-bit values too? I.e. have one parity(). > + val ^= val >> 16; > + val ^= val >> 8; > + val ^= val >> 4; > + return (0x6996 >> (val & 0xf)) & 1; > +} > + > /** > * __ffs64 - find first set bit in a 64 bit word > * @word: The 64 bit word
On Mon, 24 Feb 2025 08:09:43 +0100 Jiri Slaby <jirislaby@kernel.org> wrote: > On 23. 02. 25, 17:42, Kuan-Wei Chiu wrote: > > Several parts of the kernel open-code parity calculations using > > different methods. Add a generic parity64() helper implemented with the > > same efficient approach as parity8(). > > > > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> > > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > > --- > > include/linux/bitops.h | 22 ++++++++++++++++++++++ > > 1 file changed, 22 insertions(+) > > > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > > index fb13dedad7aa..67677057f5e2 100644 > > --- a/include/linux/bitops.h > > +++ b/include/linux/bitops.h > > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) > > return (0x6996 >> (val & 0xf)) & 1; > > } > > > > +/** > > + * parity64 - get the parity of an u64 value > > + * @value: the value to be examined > > + * > > + * Determine the parity of the u64 argument. > > + * > > + * Returns: > > + * 0 for even parity, 1 for odd parity > > + */ > > +static inline int parity64(u64 val) > > +{ > > + /* > > + * One explanation of this algorithm: > > + * https://funloop.org/codex/problem/parity/README.html > > + */ > > + val ^= val >> 32; > > Do we need all these implementations? Can't we simply use parity64() for > any 8, 16 and 32-bit values too? I.e. have one parity(). I'm not sure you can guarantee that the compiler will optimise away the unnecessary operations. But: static inline int parity64(u64 val) { return parity32(val ^ (val >> 32)) } should be ok. It will also work on x86-32 where parity32() can just check the parity flag. Although you are unlikely to manage to use the the PF the xor sets. David > > > + val ^= val >> 16; > > + val ^= val >> 8; > > + val ^= val >> 4; > > + return (0x6996 >> (val & 0xf)) & 1; > > +} > > + > > /** > > * __ffs64 - find first set bit in a 64 bit word > > * @word: The 64 bit word > >
On Mon, Feb 24, 2025 at 01:34:31PM +0000, David Laight wrote: > On Mon, 24 Feb 2025 08:09:43 +0100 > Jiri Slaby <jirislaby@kernel.org> wrote: > > > On 23. 02. 25, 17:42, Kuan-Wei Chiu wrote: > > > Several parts of the kernel open-code parity calculations using > > > different methods. Add a generic parity64() helper implemented with the > > > same efficient approach as parity8(). > > > > > > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> > > > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> > > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > > > --- > > > include/linux/bitops.h | 22 ++++++++++++++++++++++ > > > 1 file changed, 22 insertions(+) > > > > > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > > > index fb13dedad7aa..67677057f5e2 100644 > > > --- a/include/linux/bitops.h > > > +++ b/include/linux/bitops.h > > > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) > > > return (0x6996 >> (val & 0xf)) & 1; > > > } > > > > > > +/** > > > + * parity64 - get the parity of an u64 value > > > + * @value: the value to be examined > > > + * > > > + * Determine the parity of the u64 argument. > > > + * > > > + * Returns: > > > + * 0 for even parity, 1 for odd parity > > > + */ > > > +static inline int parity64(u64 val) > > > +{ > > > + /* > > > + * One explanation of this algorithm: > > > + * https://funloop.org/codex/problem/parity/README.html > > > + */ > > > + val ^= val >> 32; > > > > Do we need all these implementations? Can't we simply use parity64() for > > any 8, 16 and 32-bit values too? I.e. have one parity(). > > I'm not sure you can guarantee that the compiler will optimise away > the unnecessary operations. Hi Jiri and David, Unless we can be certain about the compiler's optimization behavior, we prefer to follow an approach similar to hweight, distinguishing implementations based on different bit sizes. > > But: > static inline int parity64(u64 val) > { > return parity32(val ^ (val >> 32)) > } > > should be ok. We will adopt this approach, as it is indeed more concise. Thank you all for your feedback. Best regards, Yu-Chun Lin > It will also work on x86-32 where parity32() can just check the parity flag. > Although you are unlikely to manage to use the the PF the xor sets. > > David > > > > > > + val ^= val >> 16; > > > + val ^= val >> 8; > > > + val ^= val >> 4; > > > + return (0x6996 >> (val & 0xf)) & 1; > > > +} > > > + > > > /** > > > * __ffs64 - find first set bit in a 64 bit word > > > * @word: The 64 bit word > > > > >
On Mon, Feb 24, 2025 at 12:42:02AM +0800, Kuan-Wei Chiu wrote: > Several parts of the kernel open-code parity calculations using > different methods. Add a generic parity64() helper implemented with the > same efficient approach as parity8(). No reason to add parity32() and parity64() in separate patches > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > include/linux/bitops.h | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > index fb13dedad7aa..67677057f5e2 100644 > --- a/include/linux/bitops.h > +++ b/include/linux/bitops.h > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) > return (0x6996 >> (val & 0xf)) & 1; > } > > +/** > + * parity64 - get the parity of an u64 value > + * @value: the value to be examined > + * > + * Determine the parity of the u64 argument. > + * > + * Returns: > + * 0 for even parity, 1 for odd parity > + */ > +static inline int parity64(u64 val) > +{ > + /* > + * One explanation of this algorithm: > + * https://funloop.org/codex/problem/parity/README.html This is already referenced in sources. No need to spread it for more. > + */ > + val ^= val >> 32; > + val ^= val >> 16; > + val ^= val >> 8; > + val ^= val >> 4; > + return (0x6996 >> (val & 0xf)) & 1; It's better to avoid duplicating the same logic again and again. > +} > + So maybe make it a macro? From f17a28ae3429f49825d65ebc0f7717c6a191a3e2 Mon Sep 17 00:00:00 2001 From: Yury Norov <yury.norov@gmail.com> Date: Mon, 24 Feb 2025 14:14:27 -0500 Subject: [PATCH] bitops: generalize parity8() The generic parity calculation approach may be easily generalized for other standard types. Do that and drop sub-optimal implementation of parity calculation in x86 code. Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> --- arch/x86/kernel/bootflag.c | 14 +----------- include/linux/bitops.h | 47 +++++++++++++++++++++++++++----------- 2 files changed, 35 insertions(+), 26 deletions(-) diff --git a/arch/x86/kernel/bootflag.c b/arch/x86/kernel/bootflag.c index 3fed7ae58b60..4a85c69a28f8 100644 --- a/arch/x86/kernel/bootflag.c +++ b/arch/x86/kernel/bootflag.c @@ -2,6 +2,7 @@ /* * Implement 'Simple Boot Flag Specification 2.0' */ +#include <linux/bitops.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/init.h> @@ -20,19 +21,6 @@ int sbf_port __initdata = -1; /* set via acpi_boot_init() */ -static int __init parity(u8 v) -{ - int x = 0; - int i; - - for (i = 0; i < 8; i++) { - x ^= (v & 1); - v >>= 1; - } - - return x; -} - static void __init sbf_write(u8 v) { unsigned long flags; diff --git a/include/linux/bitops.h b/include/linux/bitops.h index c1cb53cf2f0f..29601434f5f4 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -230,10 +230,10 @@ static inline int get_count_order_long(unsigned long l) } /** - * parity8 - get the parity of an u8 value + * parity - get the parity of a value * @value: the value to be examined * - * Determine the parity of the u8 argument. + * Determine parity of the argument. * * Returns: * 0 for even parity, 1 for odd parity @@ -241,24 +241,45 @@ static inline int get_count_order_long(unsigned long l) * Note: This function informs you about the current parity. Example to bail * out when parity is odd: * - * if (parity8(val) == 1) + * if (parity(val) == 1) * return -EBADMSG; * * If you need to calculate a parity bit, you need to draw the conclusion from * this result yourself. Example to enforce odd parity, parity bit is bit 7: * - * if (parity8(val) == 0) + * if (parity(val) == 0) * val ^= BIT(7); + * + * One explanation of this algorithm: + * https://funloop.org/codex/problem/parity/README.html */ -static inline int parity8(u8 val) -{ - /* - * One explanation of this algorithm: - * https://funloop.org/codex/problem/parity/README.html - */ - val ^= val >> 4; - return (0x6996 >> (val & 0xf)) & 1; -} +#define parity(val) \ +({ \ + u64 __v = (val); \ + int __ret; \ + switch (BITS_PER_TYPE(val)) { \ + case 64: \ + __v ^= __v >> 32; \ + fallthrough; \ + case 32: \ + __v ^= __v >> 16; \ + fallthrough; \ + case 16: \ + __v ^= __v >> 8; \ + fallthrough; \ + case 8: \ + __v ^= __v >> 4; \ + __ret = (0x6996 >> (__v & 0xf)) & 1; \ + break; \ + default: \ + BUILD_BUG(); \ + } \ + __ret; \ +}) + +#define parity8(val) parity((u8)(val)) +#define parity32(val) parity((u32)(val)) +#define parity64(val) parity((u64)(val)) /** * __ffs64 - find first set bit in a 64 bit word
Hi Yury, On Mon, Feb 24, 2025 at 02:27:03PM -0500, Yury Norov wrote: > On Mon, Feb 24, 2025 at 12:42:02AM +0800, Kuan-Wei Chiu wrote: > > Several parts of the kernel open-code parity calculations using > > different methods. Add a generic parity64() helper implemented with the > > same efficient approach as parity8(). > > No reason to add parity32() and parity64() in separate patches Ack. > > > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> > > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > > --- > > include/linux/bitops.h | 22 ++++++++++++++++++++++ > > 1 file changed, 22 insertions(+) > > > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > > index fb13dedad7aa..67677057f5e2 100644 > > --- a/include/linux/bitops.h > > +++ b/include/linux/bitops.h > > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) > > return (0x6996 >> (val & 0xf)) & 1; > > } > > > > +/** > > + * parity64 - get the parity of an u64 value > > + * @value: the value to be examined > > + * > > + * Determine the parity of the u64 argument. > > + * > > + * Returns: > > + * 0 for even parity, 1 for odd parity > > + */ > > +static inline int parity64(u64 val) > > +{ > > + /* > > + * One explanation of this algorithm: > > + * https://funloop.org/codex/problem/parity/README.html > > This is already referenced in sources. No need to spread it for more. Ack. > > > + */ > > + val ^= val >> 32; > > + val ^= val >> 16; > > + val ^= val >> 8; > > + val ^= val >> 4; > > + return (0x6996 >> (val & 0xf)) & 1; > > It's better to avoid duplicating the same logic again and again. Ack. > > > +} > > + > > So maybe make it a macro? > > > From f17a28ae3429f49825d65ebc0f7717c6a191a3e2 Mon Sep 17 00:00:00 2001 > From: Yury Norov <yury.norov@gmail.com> > Date: Mon, 24 Feb 2025 14:14:27 -0500 > Subject: [PATCH] bitops: generalize parity8() > > The generic parity calculation approach may be easily generalized for > other standard types. Do that and drop sub-optimal implementation of > parity calculation in x86 code. > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > --- > arch/x86/kernel/bootflag.c | 14 +----------- > include/linux/bitops.h | 47 +++++++++++++++++++++++++++----------- > 2 files changed, 35 insertions(+), 26 deletions(-) > > diff --git a/arch/x86/kernel/bootflag.c b/arch/x86/kernel/bootflag.c > index 3fed7ae58b60..4a85c69a28f8 100644 > --- a/arch/x86/kernel/bootflag.c > +++ b/arch/x86/kernel/bootflag.c > @@ -2,6 +2,7 @@ > /* > * Implement 'Simple Boot Flag Specification 2.0' > */ > +#include <linux/bitops.h> > #include <linux/types.h> > #include <linux/kernel.h> > #include <linux/init.h> > @@ -20,19 +21,6 @@ > > int sbf_port __initdata = -1; /* set via acpi_boot_init() */ > > -static int __init parity(u8 v) > -{ > - int x = 0; > - int i; > - > - for (i = 0; i < 8; i++) { > - x ^= (v & 1); > - v >>= 1; > - } > - > - return x; > -} > - > static void __init sbf_write(u8 v) > { > unsigned long flags; > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > index c1cb53cf2f0f..29601434f5f4 100644 > --- a/include/linux/bitops.h > +++ b/include/linux/bitops.h > @@ -230,10 +230,10 @@ static inline int get_count_order_long(unsigned long l) > } > > /** > - * parity8 - get the parity of an u8 value > + * parity - get the parity of a value > * @value: the value to be examined > * > - * Determine the parity of the u8 argument. > + * Determine parity of the argument. > * > * Returns: > * 0 for even parity, 1 for odd parity > @@ -241,24 +241,45 @@ static inline int get_count_order_long(unsigned long l) > * Note: This function informs you about the current parity. Example to bail > * out when parity is odd: > * > - * if (parity8(val) == 1) > + * if (parity(val) == 1) > * return -EBADMSG; > * > * If you need to calculate a parity bit, you need to draw the conclusion from > * this result yourself. Example to enforce odd parity, parity bit is bit 7: > * > - * if (parity8(val) == 0) > + * if (parity(val) == 0) > * val ^= BIT(7); > + * > + * One explanation of this algorithm: > + * https://funloop.org/codex/problem/parity/README.html > */ > -static inline int parity8(u8 val) > -{ > - /* > - * One explanation of this algorithm: > - * https://funloop.org/codex/problem/parity/README.html > - */ > - val ^= val >> 4; > - return (0x6996 >> (val & 0xf)) & 1; > -} > +#define parity(val) \ > +({ \ > + u64 __v = (val); \ > + int __ret; \ > + switch (BITS_PER_TYPE(val)) { \ > + case 64: \ > + __v ^= __v >> 32; \ > + fallthrough; \ > + case 32: \ > + __v ^= __v >> 16; \ > + fallthrough; \ > + case 16: \ > + __v ^= __v >> 8; \ > + fallthrough; \ > + case 8: \ > + __v ^= __v >> 4; \ > + __ret = (0x6996 >> (__v & 0xf)) & 1; \ > + break; \ > + default: \ > + BUILD_BUG(); \ > + } \ > + __ret; \ > +}) > + > +#define parity8(val) parity((u8)(val)) > +#define parity32(val) parity((u32)(val)) > +#define parity64(val) parity((u64)(val)) > What do you think about using these inline functions instead of macros? Except for parity8(), each function is a single line and follows the same logic. I find inline functions more readable, and coding-style.rst also recommends them over macros. Regards, Kuan-Wei diff --git a/include/linux/bitops.h b/include/linux/bitops.h index c1cb53cf2f0f..d518a382f1fe 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -260,6 +260,26 @@ static inline int parity8(u8 val) return (0x6996 >> (val & 0xf)) & 1; } +static inline parity16(u16 val) +{ + return parity8(val ^ (val >> 8)); +} + +static inline parity16(u16 val) +{ + return parity8(val ^ (val >> 8)); +} + +static inline parity32(u32) +{ + return parity16(val ^ (val >> 16)); +} + +static inline parity64(u64) +{ + return parity32(val ^ (val >> 32)); +} + /** * __ffs64 - find first set bit in a 64 bit word * @word: The 64 bit word > /** > * __ffs64 - find first set bit in a 64 bit word > -- > 2.43.0 >
On Tue, Feb 25, 2025 at 09:29:51PM +0800, Kuan-Wei Chiu wrote: > Hi Yury, > > On Mon, Feb 24, 2025 at 02:27:03PM -0500, Yury Norov wrote: > > On Mon, Feb 24, 2025 at 12:42:02AM +0800, Kuan-Wei Chiu wrote: > > > Several parts of the kernel open-code parity calculations using > > > different methods. Add a generic parity64() helper implemented with the > > > same efficient approach as parity8(). > > > > No reason to add parity32() and parity64() in separate patches > > Ack. > > > > > > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> > > > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> > > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > > > --- > > > include/linux/bitops.h | 22 ++++++++++++++++++++++ > > > 1 file changed, 22 insertions(+) > > > > > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > > > index fb13dedad7aa..67677057f5e2 100644 > > > --- a/include/linux/bitops.h > > > +++ b/include/linux/bitops.h > > > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) > > > return (0x6996 >> (val & 0xf)) & 1; > > > } > > > > > > +/** > > > + * parity64 - get the parity of an u64 value > > > + * @value: the value to be examined > > > + * > > > + * Determine the parity of the u64 argument. > > > + * > > > + * Returns: > > > + * 0 for even parity, 1 for odd parity > > > + */ > > > +static inline int parity64(u64 val) > > > +{ > > > + /* > > > + * One explanation of this algorithm: > > > + * https://funloop.org/codex/problem/parity/README.html > > > > This is already referenced in sources. No need to spread it for more. > > Ack. > > > > > > + */ > > > + val ^= val >> 32; > > > + val ^= val >> 16; > > > + val ^= val >> 8; > > > + val ^= val >> 4; > > > + return (0x6996 >> (val & 0xf)) & 1; > > > > It's better to avoid duplicating the same logic again and again. > > Ack. > > > > > > +} > > > + > > > > So maybe make it a macro? > > > > > > From f17a28ae3429f49825d65ebc0f7717c6a191a3e2 Mon Sep 17 00:00:00 2001 > > From: Yury Norov <yury.norov@gmail.com> > > Date: Mon, 24 Feb 2025 14:14:27 -0500 > > Subject: [PATCH] bitops: generalize parity8() > > > > The generic parity calculation approach may be easily generalized for > > other standard types. Do that and drop sub-optimal implementation of > > parity calculation in x86 code. > > > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > --- > > arch/x86/kernel/bootflag.c | 14 +----------- > > include/linux/bitops.h | 47 +++++++++++++++++++++++++++----------- > > 2 files changed, 35 insertions(+), 26 deletions(-) > > > > diff --git a/arch/x86/kernel/bootflag.c b/arch/x86/kernel/bootflag.c > > index 3fed7ae58b60..4a85c69a28f8 100644 > > --- a/arch/x86/kernel/bootflag.c > > +++ b/arch/x86/kernel/bootflag.c > > @@ -2,6 +2,7 @@ > > /* > > * Implement 'Simple Boot Flag Specification 2.0' > > */ > > +#include <linux/bitops.h> > > #include <linux/types.h> > > #include <linux/kernel.h> > > #include <linux/init.h> > > @@ -20,19 +21,6 @@ > > > > int sbf_port __initdata = -1; /* set via acpi_boot_init() */ > > > > -static int __init parity(u8 v) > > -{ > > - int x = 0; > > - int i; > > - > > - for (i = 0; i < 8; i++) { > > - x ^= (v & 1); > > - v >>= 1; > > - } > > - > > - return x; > > -} > > - > > static void __init sbf_write(u8 v) > > { > > unsigned long flags; > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > > index c1cb53cf2f0f..29601434f5f4 100644 > > --- a/include/linux/bitops.h > > +++ b/include/linux/bitops.h > > @@ -230,10 +230,10 @@ static inline int get_count_order_long(unsigned long l) > > } > > > > /** > > - * parity8 - get the parity of an u8 value > > + * parity - get the parity of a value > > * @value: the value to be examined > > * > > - * Determine the parity of the u8 argument. > > + * Determine parity of the argument. > > * > > * Returns: > > * 0 for even parity, 1 for odd parity > > @@ -241,24 +241,45 @@ static inline int get_count_order_long(unsigned long l) > > * Note: This function informs you about the current parity. Example to bail > > * out when parity is odd: > > * > > - * if (parity8(val) == 1) > > + * if (parity(val) == 1) > > * return -EBADMSG; > > * > > * If you need to calculate a parity bit, you need to draw the conclusion from > > * this result yourself. Example to enforce odd parity, parity bit is bit 7: > > * > > - * if (parity8(val) == 0) > > + * if (parity(val) == 0) > > * val ^= BIT(7); > > + * > > + * One explanation of this algorithm: > > + * https://funloop.org/codex/problem/parity/README.html > > */ > > -static inline int parity8(u8 val) > > -{ > > - /* > > - * One explanation of this algorithm: > > - * https://funloop.org/codex/problem/parity/README.html > > - */ > > - val ^= val >> 4; > > - return (0x6996 >> (val & 0xf)) & 1; > > -} > > +#define parity(val) \ > > +({ \ > > + u64 __v = (val); \ > > + int __ret; \ > > + switch (BITS_PER_TYPE(val)) { \ > > + case 64: \ > > + __v ^= __v >> 32; \ > > + fallthrough; \ > > + case 32: \ > > + __v ^= __v >> 16; \ > > + fallthrough; \ > > + case 16: \ > > + __v ^= __v >> 8; \ > > + fallthrough; \ > > + case 8: \ > > + __v ^= __v >> 4; \ > > + __ret = (0x6996 >> (__v & 0xf)) & 1; \ > > + break; \ > > + default: \ > > + BUILD_BUG(); \ > > + } \ > > + __ret; \ > > +}) > > + > > +#define parity8(val) parity((u8)(val)) > > +#define parity32(val) parity((u32)(val)) > > +#define parity64(val) parity((u64)(val)) > > > What do you think about using these inline functions instead of macros? > Except for parity8(), each function is a single line and follows the > same logic. I find inline functions more readable, and coding-style.rst > also recommends them over macros. > > Regards, > Kuan-Wei > > diff --git a/include/linux/bitops.h b/include/linux/bitops.h > index c1cb53cf2f0f..d518a382f1fe 100644 > --- a/include/linux/bitops.h > +++ b/include/linux/bitops.h > @@ -260,6 +260,26 @@ static inline int parity8(u8 val) > return (0x6996 >> (val & 0xf)) & 1; > } > > +static inline parity16(u16 val) > +{ > + return parity8(val ^ (val >> 8)); > +} > + > +static inline parity16(u16 val) > +{ > + return parity8(val ^ (val >> 8)); > +} > + > +static inline parity32(u32) > +{ > + return parity16(val ^ (val >> 16)); > +} > + > +static inline parity64(u64) > +{ > + return parity32(val ^ (val >> 32)); > +} > + > /** > * __ffs64 - find first set bit in a 64 bit word > * @word: The 64 bit word > > Oops... I made a lot of fat-finger mistakes. Here's the correct one. diff --git a/include/linux/bitops.h b/include/linux/bitops.h index c1cb53cf2f0f..427e4c06055e 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -260,6 +260,21 @@ static inline int parity8(u8 val) return (0x6996 >> (val & 0xf)) & 1; } +static inline int parity16(u16 val) +{ + return parity8(val ^ (val >> 8)); +} + +static inline int parity32(u32 val) +{ + return parity16(val ^ (val >> 16)); +} + +static inline int parity64(u64 val) +{ + return parity32(val ^ (val >> 32)); +} + /** * __ffs64 - find first set bit in a 64 bit word * @word: The 64 bit word > > /** > > * __ffs64 - find first set bit in a 64 bit word > > -- > > 2.43.0 > >
On February 24, 2025 5:34:31 AM PST, David Laight <david.laight.linux@gmail.com> wrote: >On Mon, 24 Feb 2025 08:09:43 +0100 >Jiri Slaby <jirislaby@kernel.org> wrote: > >> On 23. 02. 25, 17:42, Kuan-Wei Chiu wrote: >> > Several parts of the kernel open-code parity calculations using >> > different methods. Add a generic parity64() helper implemented with the >> > same efficient approach as parity8(). >> > >> > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> >> > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> >> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> >> > --- >> > include/linux/bitops.h | 22 ++++++++++++++++++++++ >> > 1 file changed, 22 insertions(+) >> > >> > diff --git a/include/linux/bitops.h b/include/linux/bitops.h >> > index fb13dedad7aa..67677057f5e2 100644 >> > --- a/include/linux/bitops.h >> > +++ b/include/linux/bitops.h >> > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) >> > return (0x6996 >> (val & 0xf)) & 1; >> > } >> > >> > +/** >> > + * parity64 - get the parity of an u64 value >> > + * @value: the value to be examined >> > + * >> > + * Determine the parity of the u64 argument. >> > + * >> > + * Returns: >> > + * 0 for even parity, 1 for odd parity >> > + */ >> > +static inline int parity64(u64 val) >> > +{ >> > + /* >> > + * One explanation of this algorithm: >> > + * https://funloop.org/codex/problem/parity/README.html >> > + */ >> > + val ^= val >> 32; >> >> Do we need all these implementations? Can't we simply use parity64() for >> any 8, 16 and 32-bit values too? I.e. have one parity(). > >I'm not sure you can guarantee that the compiler will optimise away >the unnecessary operations. > >But: >static inline int parity64(u64 val) >{ > return parity32(val ^ (val >> 32)) >} > >should be ok. >It will also work on x86-32 where parity32() can just check the parity flag. >Although you are unlikely to manage to use the the PF the xor sets. > > David > >> >> > + val ^= val >> 16; >> > + val ^= val >> 8; >> > + val ^= val >> 4; >> > + return (0x6996 >> (val & 0xf)) & 1; >> > +} >> > + >> > /** >> > * __ffs64 - find first set bit in a 64 bit word >> > * @word: The 64 bit word >> >> > Sure you can; you do need an 8- and a 16-bit arch implementation though (the 16 bit one being xor %rh,%rl)
On February 24, 2025 5:34:31 AM PST, David Laight <david.laight.linux@gmail.com> wrote: >On Mon, 24 Feb 2025 08:09:43 +0100 >Jiri Slaby <jirislaby@kernel.org> wrote: > >> On 23. 02. 25, 17:42, Kuan-Wei Chiu wrote: >> > Several parts of the kernel open-code parity calculations using >> > different methods. Add a generic parity64() helper implemented with the >> > same efficient approach as parity8(). >> > >> > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com> >> > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> >> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> >> > --- >> > include/linux/bitops.h | 22 ++++++++++++++++++++++ >> > 1 file changed, 22 insertions(+) >> > >> > diff --git a/include/linux/bitops.h b/include/linux/bitops.h >> > index fb13dedad7aa..67677057f5e2 100644 >> > --- a/include/linux/bitops.h >> > +++ b/include/linux/bitops.h >> > @@ -281,6 +281,28 @@ static inline int parity32(u32 val) >> > return (0x6996 >> (val & 0xf)) & 1; >> > } >> > >> > +/** >> > + * parity64 - get the parity of an u64 value >> > + * @value: the value to be examined >> > + * >> > + * Determine the parity of the u64 argument. >> > + * >> > + * Returns: >> > + * 0 for even parity, 1 for odd parity >> > + */ >> > +static inline int parity64(u64 val) >> > +{ >> > + /* >> > + * One explanation of this algorithm: >> > + * https://funloop.org/codex/problem/parity/README.html >> > + */ >> > + val ^= val >> 32; >> >> Do we need all these implementations? Can't we simply use parity64() for >> any 8, 16 and 32-bit values too? I.e. have one parity(). > >I'm not sure you can guarantee that the compiler will optimise away >the unnecessary operations. > >But: >static inline int parity64(u64 val) >{ > return parity32(val ^ (val >> 32)) >} > >should be ok. >It will also work on x86-32 where parity32() can just check the parity flag. >Although you are unlikely to manage to use the the PF the xor sets. > > David > >> >> > + val ^= val >> 16; >> > + val ^= val >> 8; >> > + val ^= val >> 4; >> > + return (0x6996 >> (val & 0xf)) & 1; >> > +} >> > + >> > /** >> > * __ffs64 - find first set bit in a 64 bit word >> > * @word: The 64 bit word >> >> > Incidentally, in all of this, didn't anyone notice __builtin_parity()?
> Incidentally, in all of this, didn't anyone notice __builtin_parity()?
Yes. It it has done sane for a decade on x86, yet does things such as
emitting a library call on other architectures.
https://godbolt.org/z/6qG3noebq
~Andrew
On February 25, 2025 1:43:27 PM PST, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> Incidentally, in all of this, didn't anyone notice __builtin_parity()? > >Yes. It it has done sane for a decade on x86, yet does things such as >emitting a library call on other architectures. > >https://godbolt.org/z/6qG3noebq > >~Andrew And not even a smart one at that.
On 25. 02. 25, 14:29, Kuan-Wei Chiu wrote: >> +#define parity(val) \ >> +({ \ >> + u64 __v = (val); \ >> + int __ret; \ >> + switch (BITS_PER_TYPE(val)) { \ >> + case 64: \ >> + __v ^= __v >> 32; \ >> + fallthrough; \ >> + case 32: \ >> + __v ^= __v >> 16; \ >> + fallthrough; \ >> + case 16: \ >> + __v ^= __v >> 8; \ >> + fallthrough; \ >> + case 8: \ >> + __v ^= __v >> 4; \ >> + __ret = (0x6996 >> (__v & 0xf)) & 1; \ >> + break; \ >> + default: \ >> + BUILD_BUG(); \ >> + } \ >> + __ret; \ >> +}) >> + >> +#define parity8(val) parity((u8)(val)) >> +#define parity32(val) parity((u32)(val)) >> +#define parity64(val) parity((u64)(val)) >> > What do you think about using these inline functions instead of macros? > Except for parity8(), each function is a single line and follows the > same logic. I find inline functions more readable, and coding-style.rst > also recommends them over macros. Not in cases where macros are inevitable. I mean, do we need parityXX() for XX in (8, 16, 32, 64) at all? Isn't the parity() above enough for everybody? And if not, you can have all those parityXX() as inlines as you suggest, but also provide a macro such as the above to call (optimized) parityXX() as per datatype len. thanks,
Hi Jiri, On Wed, Feb 26, 2025 at 08:14:14AM +0100, Jiri Slaby wrote: > On 25. 02. 25, 14:29, Kuan-Wei Chiu wrote: > > > +#define parity(val) \ > > > +({ \ > > > + u64 __v = (val); \ > > > + int __ret; \ > > > + switch (BITS_PER_TYPE(val)) { \ > > > + case 64: \ > > > + __v ^= __v >> 32; \ > > > + fallthrough; \ > > > + case 32: \ > > > + __v ^= __v >> 16; \ > > > + fallthrough; \ > > > + case 16: \ > > > + __v ^= __v >> 8; \ > > > + fallthrough; \ > > > + case 8: \ > > > + __v ^= __v >> 4; \ > > > + __ret = (0x6996 >> (__v & 0xf)) & 1; \ > > > + break; \ > > > + default: \ > > > + BUILD_BUG(); \ > > > + } \ > > > + __ret; \ > > > +}) > > > + > > > +#define parity8(val) parity((u8)(val)) > > > +#define parity32(val) parity((u32)(val)) > > > +#define parity64(val) parity((u64)(val)) > > What do you think about using these inline functions instead of macros? > > Except for parity8(), each function is a single line and follows the > > same logic. I find inline functions more readable, and coding-style.rst > > also recommends them over macros. > > Not in cases where macros are inevitable. I mean, do we need parityXX() for > XX in (8, 16, 32, 64) at all? Isn't the parity() above enough for everybody? > And if not, you can have all those parityXX() as inlines as you suggest, but > also provide a macro such as the above to call (optimized) parityXX() as per > datatype len. > I agree that we can add a macro to call parity8/16/32/64 based on the data type size. However, I think we should still keep parity8/16/32/64. As Peter and David discussed, the x86-specific implementations of parity8() and parity16() might use different instructions instead of just XORing and calling another function, as in the generic version. My current idea is to follow David's suggestion and use __builtin_parity when there is no architecture-specific implementation. In lib/, we can provide a generic weak function implementation of __parity[sdt]i2. Any comments or suggestions are welcome! Regards, Kuan-Wei static inline parity32(u32 val) { return __builtin_const_p(val) ? _parity_const(val) : _parity32(val); } #ifndef _parity32 static inline _parity32(u32 val) { return __builtin_parity(val); } #endif int __weak __paritysi2(u32 val); int __weak __paritysi2(u32 val) { val ^= val >> 16; val ^= val >> 8; val ^= val >> 4; return (0x6996 >> (val & 0xf)) & 1; } EXPORT_SYMBOL(__paritysi2);
On Wed, Feb 26, 2025 at 08:14:14AM +0100, Jiri Slaby wrote: > On 25. 02. 25, 14:29, Kuan-Wei Chiu wrote: > > > +#define parity(val) \ > > > +({ \ > > > + u64 __v = (val); \ > > > + int __ret; \ > > > + switch (BITS_PER_TYPE(val)) { \ > > > + case 64: \ > > > + __v ^= __v >> 32; \ > > > + fallthrough; \ > > > + case 32: \ > > > + __v ^= __v >> 16; \ > > > + fallthrough; \ > > > + case 16: \ > > > + __v ^= __v >> 8; \ > > > + fallthrough; \ > > > + case 8: \ > > > + __v ^= __v >> 4; \ > > > + __ret = (0x6996 >> (__v & 0xf)) & 1; \ > > > + break; \ > > > + default: \ > > > + BUILD_BUG(); \ > > > + } \ > > > + __ret; \ > > > +}) > > > + > > > +#define parity8(val) parity((u8)(val)) > > > +#define parity32(val) parity((u32)(val)) > > > +#define parity64(val) parity((u64)(val)) > > What do you think about using these inline functions instead of macros? > > Except for parity8(), each function is a single line and follows the > > same logic. I find inline functions more readable, and coding-style.rst > > also recommends them over macros. > > Not in cases where macros are inevitable. I mean, do we need parityXX() for > XX in (8, 16, 32, 64) at all? Isn't the parity() above enough for everybody? The existing codebase has something like: int ret; ret = i3c_master_get_free_addr(m, last_addr + 1); ret |= parity8(ret) ? 0 : BIT(7) So if we'll switch it to a macro like one above, it will become a 32-bit parity. It wouldn't be an error because i3c_master_get_free_addr() returns an u8 or -ENOMEM, and the error code is checked explicitly. But if we decide to go with parity() only, some users will have to call it like parity((u8)val) explicitly. Which is not bad actually. > And if not, you can have all those parityXX() as inlines as you suggest, but > also provide a macro such as the above to call (optimized) parityXX() as per > datatype len. Yes, if we need fixed-type parity's, they should all be one-liners calling the same macro. Macros or inline functions - no preference for me. Thanks, Yury
On Mon, 24 Feb 2025 14:27:03 -0500 Yury Norov <yury.norov@gmail.com> wrote: .... > +#define parity(val) \ > +({ \ > + u64 __v = (val); \ > + int __ret; \ > + switch (BITS_PER_TYPE(val)) { \ > + case 64: \ > + __v ^= __v >> 32; \ > + fallthrough; \ > + case 32: \ > + __v ^= __v >> 16; \ > + fallthrough; \ > + case 16: \ > + __v ^= __v >> 8; \ > + fallthrough; \ > + case 8: \ > + __v ^= __v >> 4; \ > + __ret = (0x6996 >> (__v & 0xf)) & 1; \ > + break; \ > + default: \ > + BUILD_BUG(); \ > + } \ > + __ret; \ > +}) > + You really don't want to do that! gcc makes a right hash of it for x86 (32bit). See https://www.godbolt.org/z/jG8dv3cvs You do better using a __v32 after the 64bit xor. Even the 64bit version is probably sub-optimal (both gcc and clang). The whole lot ends up being a bit single register dependency chain. You want to do: mov %eax, %edx shrl $n, %eax xor %edx, %eax so that the 'mov' and 'shrl' can happen in the same clock (without relying on the register-register move being optimised out). I dropped in the arm64 for an example of where the magic shift of 6996 just adds an extra instruction. David
On 26. 02. 25, 19:33, Yury Norov wrote: >> Not in cases where macros are inevitable. I mean, do we need parityXX() for >> XX in (8, 16, 32, 64) at all? Isn't the parity() above enough for everybody? > > The existing codebase has something like: > > int ret; > > ret = i3c_master_get_free_addr(m, last_addr + 1); > ret |= parity8(ret) ? 0 : BIT(7) > > So if we'll switch it to a macro like one above, it will become a > 32-bit parity. It wouldn't be an error because i3c_master_get_free_addr() > returns an u8 or -ENOMEM, and the error code is checked explicitly. > > But if we decide to go with parity() only, some users will have to > call it like parity((u8)val) explicitly. Which is not bad actually. That cast looks ugly -- we apparently need parityXX(). (In this particular case we could do parity8(last_addr), but I assume there are more cases like this.) Thanks for looking up the case for this. >> And if not, you can have all those parityXX() as inlines as you suggest, but >> also provide a macro such as the above to call (optimized) parityXX() as per >> datatype len. > > Yes, if we need fixed-type parity's, they should all be one-liners > calling the same macro. Macros or inline functions - no preference for > me.
diff --git a/include/linux/bitops.h b/include/linux/bitops.h index fb13dedad7aa..67677057f5e2 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -281,6 +281,28 @@ static inline int parity32(u32 val) return (0x6996 >> (val & 0xf)) & 1; } +/** + * parity64 - get the parity of an u64 value + * @value: the value to be examined + * + * Determine the parity of the u64 argument. + * + * Returns: + * 0 for even parity, 1 for odd parity + */ +static inline int parity64(u64 val) +{ + /* + * One explanation of this algorithm: + * https://funloop.org/codex/problem/parity/README.html + */ + val ^= val >> 32; + val ^= val >> 16; + val ^= val >> 8; + val ^= val >> 4; + return (0x6996 >> (val & 0xf)) & 1; +} + /** * __ffs64 - find first set bit in a 64 bit word * @word: The 64 bit word