[v4,0/8] bits: Fixed-type GENMASK()/BIT()

Message-ID: 20250305-fixed-type-genmasks-v4-0-1873dcdf6723@wanadoo.fr
From: Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org>
Subject: [PATCH v4 0/8] bits: Fixed-type GENMASK()/BIT()
Date: Wed, 05 Mar 2025 22:00:12 +0900
To: Yury Norov <yury.norov@gmail.com>, Lucas De Marchi <lucas.demarchi@intel.com>, Rasmus Villemoes <linux@rasmusvillemoes.dk>, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, Tvrtko Ursulin <tursulin@ursulin.net>, David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>, Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, Andi Shyti <andi.shyti@linux.intel.com>, David Laight <David.Laight@ACULAB.COM>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org>, Andy Shevchenko <andriy.shevchenko@linux.intel.com>, Vincent Mailhol <mailhol.vincent@wanadoo.fr>, Jani Nikula <jani.nikula@intel.com>
Reply-To: mailhol.vincent@wanadoo.fr

Series: bits: Fixed-type GENMASK()/BIT()
  [v4,0/8] bits: Fixed-type GENMASK()/BIT()
  [v4,1/8] bits: fix typo 'unsigned __init128' -> 'unsigned __int128'
  [v4,2/8] bits: split the definition of the asm and non-asm GENMASK()
  [v4,3/8] bits: introduce fixed-type genmasks
  [v4,4/8] bits: introduce fixed-type BIT
  [v4,5/8] drm/i915: Convert REG_GENMASK* to fixed-width GENMASK_*
  [v4,6/8] test_bits: add tests for __GENMASK() and __GENMASK_ULL()
  [v4,7/8] test_bits: add tests for fixed-type genmasks
  [v4,8/8] test_bits: add tests for fixed-type BIT

Vincent Mailhol via B4 Relay March 5, 2025, 1 p.m. UTC

Introduce fixed-width variants of the GENMASK() and BIT() macros in
bits.h. Note that the main goal is not to get the correct type, but
rather to enforce more checks at compile time. For example:

  GENMASK_U16(16, 0)

will raise a build bug.
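
As a rough illustration of the kind of compile-time check meant here, the
following userspace sketch mimics a 16-bit genmask. The SKETCH_* names are
hypothetical and this is not the kernel implementation; it only assumes a
C11 compiler so that static_assert() is available.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for a 16-bit genmask (not the kernel macro). */
#define SKETCH_GENMASK_U16(h, l) \
    ((unsigned int)((0xffffu >> (15 - (h))) & (0xffffu << (l)) & 0xffffu))

/* Range check written out by hand for this sketch: l <= h and h is below 16. */
#define SKETCH_U16_ARGS_OK(h, l) ((l) <= (h) && (h) < 16)

/* Valid arguments pass the check and give the expected masks. */
static_assert(SKETCH_U16_ARGS_OK(15, 0), "bits 15..0 fit in 16 bits");
static_assert(SKETCH_GENMASK_U16(15, 0) == 0xffff, "full 16-bit mask");
static_assert(SKETCH_GENMASK_U16(6, 2) == 0x7c, "bits 6..2");

/* (16, 0) is rejected: asserting SKETCH_U16_ARGS_OK(16, 0) would break the build. */
static_assert(!SKETCH_U16_ARGS_OK(16, 0), "bit 16 does not fit in 16 bits");
```

In this sketch the check and the mask are separate macros; the series folds
the check into the mask macro itself so that callers cannot forget it.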

This series is a continuation of:

  https://lore.kernel.org/intel-xe/20240208074521.577076-1-lucas.demarchi@intel.com

from Lucas De Marchi. The above series is one year old. I really think
that this was a good idea and I do not want this series to die. So I
am volunteering to revive it.

Meanwhile, many changes occurred in bits.h. The most significant
change is that __GENMASK() was moved to the uapi headers.

In this v4, I introduce one big change: split the definition of the
asm and non-asm GENMASK(). I think this is controversial. In particular,
Yury commented that he did not want such a split. So I initially
implemented a first draft in which both the asm and non-asm versions
would rely on the same helper macro, i.e. adding this:

  #define __GENMASK_t(t, w, h, l)			\
  	(((t)~_ULL(0) - ((t)1 << (l)) + 1) &		\
  	 ((t)~_ULL(0) >> ((w) - 1 - (h))))

to uapi/bits.h. And then, the different GENMASK()s would look like
this:

  #define __GENMASK(h, l) __GENMASK_t(unsigned long, __BITS_PER_LONG, h, l)
    
and so on.
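
To make the draft helper easier to follow, here is a userspace sketch that
evaluates the same expression for a few arguments. It assumes ~_ULL(0) can
be spelled ~0ULL outside the kernel, and SKETCH_GENMASK_T is a hypothetical
name; the width still has to be passed by hand, which is the drawback
discussed in this mail.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace copy of the draft helper:
 *  - first operand:  all ones from bit l upwards  (~0 - (1 << l) + 1)
 *  - second operand: all ones from bit h downwards (~0 >> (w - 1 - h))
 *  - the AND keeps only bits h..l
 */
#define SKETCH_GENMASK_T(t, w, h, l)          \
    (((t)~0ULL - ((t)1 << (l)) + 1) &         \
     ((t)~0ULL >> ((w) - 1 - (h))))

/* 0xfffffff0 & 0x000000ff == 0xf0 */
static_assert(SKETCH_GENMASK_T(uint32_t, 32, 7, 4) == 0xf0u, "bits 7..4");
static_assert(SKETCH_GENMASK_T(uint32_t, 32, 31, 0) == 0xffffffffu, "all 32 bits");
static_assert(SKETCH_GENMASK_T(uint64_t, 64, 39, 32) == 0xff00000000ULL, "bits 39..32");
```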
    
I implemented it, and the final result looks quite ugly. Not only do
we need to manually provide the width each time, but the bigger concern
is that adding this to the uapi is asking for trouble. Who knows how
people are going to use this? And once it is in the uapi, there is
virtually no way back.

Finally, I do not think it makes sense to expose the fixed-width
variants to the asm. The fixed-width integer types are a C
concept. For asm, the long and long long variants seem sufficient.

And so, after implementing both, the asm and non-asm split seems much
cleaner and I think this is the best compromise. Let me know what
you think :)

Changes from v3:

        - Rebase on v6.14-rc5

        - Fix a typo in GENMASK_U128() comment.

        - Split the asm and non-asm definitions of GENMASK()

        - Replace ~0ULL by ~ULL(0)

        - Since v3, __GENMASK() was moved to the uapi and people
          started using it directly. Introduce GENMASK_t() instead.

v3: https://lore.kernel.org/intel-xe/20240208074521.577076-1-lucas.demarchi@intel.com

Changes from v2:

	- Document both in commit message and code about the strict type
	  checking and give examples how it'd break with invalid params.

v2: https://lore.kernel.org/intel-xe/20240124050205.3646390-1-lucas.demarchi@intel.com
v1: https://lore.kernel.org/intel-xe/20230509051403.2748545-1-lucas.demarchi@intel.com
--
2.43.0

---
Lucas De Marchi (3):
      bits: introduce fixed-type BIT
      drm/i915: Convert REG_GENMASK* to fixed-width GENMASK_*
      test_bits: add tests for fixed-type genmasks

Vincent Mailhol (4):
      bits: fix typo 'unsigned __init128' -> 'unsigned __int128'
      bits: split the definition of the asm and non-asm GENMASK()
      test_bits: add tests for __GENMASK() and __GENMASK_ULL()
      test_bits: add tests for fixed-type BIT

Yury Norov (1):
      bits: introduce fixed-type genmasks

 drivers/gpu/drm/i915/i915_reg_defs.h | 108 ++++-------------------------------
 include/linux/bitops.h               |   1 -
 include/linux/bits.h                 |  65 +++++++++++++++++----
 lib/test_bits.c                      |  47 +++++++++++++++
 4 files changed, 111 insertions(+), 110 deletions(-)
---
base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
change-id: 20250228-fixed-type-genmasks-8d1a555f34e8

Best regards,

Yury Norov March 5, 2025, 2:38 p.m. UTC | #1

On Wed, Mar 05, 2025 at 04:36:12PM +0200, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 09:30:20AM -0500, Yury Norov wrote:
> > On Wed, Mar 05, 2025 at 10:00:13PM +0900, Vincent Mailhol via B4 Relay wrote:
> > > From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> > > 
> > > "int" was misspelled as "init" in GENMASK_U128() comments. Fix the typo.
> > 
> > Thanks for respinning the series. I'll take this fix in bitmap-for-next, so
> > if you need v2, you'll not have to bear this thing too.
> 
> Before doing that, please read my comment first.

Already did. Yes, you're right.

Vincent, can you send the fix separately, so I'll move it in the
upcoming merge window?

Vincent Mailhol March 5, 2025, 2:48 p.m. UTC | #2

On 05/03/2025 at 23:33, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:
>> From: Lucas De Marchi <lucas.demarchi@intel.com>
>>
>> Implement fixed-type BIT to help drivers add stricter checks, like was
> 
> Here and in the Subject I would use BIT_Uxx().
> 
>> done for GENMASK().
> 
> ...
> 
>> +/*
>> + * Fixed-type variants of BIT(), with additional checks like GENMASK_t(). The
> 
> GENMASK_t() is not a well named macro.

Ack. I will rename to GENMASK_TYPE().

>> + * following examples generate compiler warnings due to shift-count-overflow:
>> + *
>> + * - BIT_U8(8)
>> + * - BIT_U32(-1)
>> + * - BIT_U32(40)
>> + */
>> +#define BIT_INPUT_CHECK(type, b) \
>> +	BUILD_BUG_ON_ZERO(const_true((b) >= BITS_PER_TYPE(type)))
>> +
>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
> 
> Why not u8 and u16? This inconsistency needs to be well justified.

Because of the C integer promotion rules, if cast to u8 or u16, the
expression will immediately become a signed integer as soon as it is
used. For example, if cast to u8,

  BIT_U8(0) + BIT_U8(1)

would be a signed integer. And that may surprise people.

David also pointed this out in v3:

https://lore.kernel.org/intel-xe/d42dc197a15649e69d459362849a37f2@AcuMS.aculab.com/

and I agree with his comment.

I explained this in the changelog below the --- cutter, but it is
probably better to make the explanation more visible. I will add a
comment in the code to explain this.
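
The promotion behaviour described above can be checked with a small
userspace sketch. The SKETCH_* macros are hypothetical stand-ins, not the
BIT_U8() from the patch, and the example assumes a C11 compiler for
_Generic and static_assert().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 for (signed) int, 0 for unsigned int, -1 otherwise. */
#define SKETCH_IS_SIGNED_INT(expr) \
    _Generic((expr), int: 1, unsigned int: 0, default: -1)

/* Two hypothetical ways of typing a single-bit constant. */
#define SKETCH_BIT_AS_U8(b)   ((uint8_t)(1u << (b)))
#define SKETCH_BIT_AS_UINT(b) ((unsigned int)(1u << (b)))

/* Both u8 operands are promoted to int, so the sum is a signed int. */
static_assert(SKETCH_IS_SIGNED_INT(SKETCH_BIT_AS_U8(0) + SKETCH_BIT_AS_U8(1)) == 1,
              "u8 + u8 has type int");

/* With an unsigned int cast, the sum stays unsigned. */
static_assert(SKETCH_IS_SIGNED_INT(SKETCH_BIT_AS_UINT(0) + SKETCH_BIT_AS_UINT(1)) == 0,
              "unsigned int + unsigned int has type unsigned int");

/* The difference in signedness is observable: 1 - 2 is negative only for the u8 variant. */
static_assert((SKETCH_BIT_AS_U8(0) - SKETCH_BIT_AS_U8(1)) < 0, "int arithmetic");
static_assert((SKETCH_BIT_AS_UINT(0) - SKETCH_BIT_AS_UINT(1)) > 0, "unsigned wrap-around");
```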

>> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
>> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
> 
> Can you also use a TAB between the parentheses for better readability?
> E.g.,
> 
> #define BIT_U64(b)	(BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))

Sure. I prefer it with a space, but no strong opinion. I will use a tab in v5.

Yours sincerely,
Vincent Mailhol

Yury Norov March 5, 2025, 3:22 p.m. UTC | #3

+ Anshuman Khandual <anshuman.khandual@arm.com>

Anshuman,

I merged your GENMASK_U128() because you said it's important for your
projects, and that it will get used in the kernel soon.

It has now been in the kernel for more than 6 months, but no users were added.
Can you clarify if you still need it, and if so why it's not used?

As you can see, people are adding other fixed-type GENMASK() macros, and their
implementation differs from GENMASK_U128().

My second concern is that __GENMASK_U128() is declared in uapi, while
the general understanding for other fixed-type genmasks is that they
are not exported to users. Do you need this macro to be exported to
userspace? Can you show how and where it is used there?

Thanks,
Yury


On Wed, Mar 05, 2025 at 10:00:15PM +0900, Vincent Mailhol via B4 Relay wrote:
> From: Yury Norov <yury.norov@gmail.com>
> 
> Add __GENMASK_t() which generalizes __GENMASK() to support different
> types, and implement fixed-types versions of GENMASK() based on it.
> The fixed-type version allows more strict checks to the min/max values
> accepted, which is useful for defining registers like implemented by
> i915 and xe drivers with their REG_GENMASK*() macros.
> 
> The strict checks rely on shift-count-overflow compiler check to fail
> the build if a number outside of the range allowed is passed.
> Example:
> 
> 	#define FOO_MASK GENMASK_U32(33, 4)
> 
> will generate a warning like:
> 
> 	../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow]
> 	   41 |          (((t)~0ULL - ((t)(1) << (l)) + 1) & \
> 	      |                               ^~
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> Acked-by: Jani Nikula <jani.nikula@intel.com>
> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> ---
> Changelog:
> 
>   v3 -> v4:
> 
>     - The v3 is one year old. Meanwhile people started using
>       __GENMASK() directly. So instead of generalizing __GENMASK() to
>       support different types, add a new GENMASK_t().
> 
>     - replace ~0ULL by ~_ULL(0). Otherwise, __GENMASK_t() would fail
>       in asm code.
> 
>     - Make GENMASK_U8() and GENMASK_U16() return an unsigned int. In
>       v3, due to the integer promotion rules, these were returning a
>       signed integer. By casting these to unsigned int, at least the
>       signedness is kept.
> ---
>  include/linux/bitops.h |  1 -
>  include/linux/bits.h   | 33 +++++++++++++++++++++++++++++----
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index c1cb53cf2f0f8662ed3e324578f74330e63f935d..9be2d50da09a417966b3d11c84092bb2f4cd0bef 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -8,7 +8,6 @@
>  
>  #include <uapi/linux/kernel.h>
>  
> -#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
>  #define BITS_TO_LONGS(nr)	__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
>  #define BITS_TO_U64(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
>  #define BITS_TO_U32(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
> diff --git a/include/linux/bits.h b/include/linux/bits.h
> index 5f68980a1b98d771426872c74d7b5c0f79e5e802..f202e46d2f4b7899c16d975120f3fa3ae41556ae 100644
> --- a/include/linux/bits.h
> +++ b/include/linux/bits.h
> @@ -12,6 +12,7 @@
>  #define BIT_ULL_MASK(nr)	(ULL(1) << ((nr) % BITS_PER_LONG_LONG))
>  #define BIT_ULL_WORD(nr)	((nr) / BITS_PER_LONG_LONG)
>  #define BITS_PER_BYTE		8
> +#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
>  
>  /*
>   * Create a contiguous bitmask starting at bit position @l and ending at
> @@ -25,14 +26,38 @@
>  
>  #define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))
>  
> -#define GENMASK(h, l) \
> -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> -#define GENMASK_ULL(h, l) \
> -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
> +/*
> + * Generate a mask for the specified type @t. Additional checks are made to
> + * guarantee the value returned fits in that type, relying on
> + * shift-count-overflow compiler check to detect incompatible arguments.
> + * For example, all these create build errors or warnings:
> + *
> + * - GENMASK(15, 20): wrong argument order
> + * - GENMASK(72, 15): doesn't fit unsigned long
> + * - GENMASK_U32(33, 15): doesn't fit in a u32
> + */
> +#define GENMASK_t(t, h, l)				\
> +	(GENMASK_INPUT_CHECK(h, l) +			\
> +	 (((t)~ULL(0) - ((t)1 << (l)) + 1) &		\
> +	  ((t)~ULL(0) >> (BITS_PER_TYPE(t) - 1 - (h)))))
> +
> +#define GENMASK(h, l) GENMASK_t(unsigned long,  h, l)
> +#define GENMASK_ULL(h, l) GENMASK_t(unsigned long long, h, l)
>  
>  /*
>   * Missing asm support
>   *
> + * __GENMASK_U*() depends on BITS_PER_TYPE() which would not work in the asm
> + * code as BITS_PER_TYPE() relies on sizeof(), something not available in
> + * asm. Nevertheless, the concept of fixed width integers is a C thing which
> + * does not apply to assembly code.
> + */
> +#define GENMASK_U8(h, l) ((unsigned int)GENMASK_t(u8,  h, l))
> +#define GENMASK_U16(h, l) ((unsigned int)GENMASK_t(u16, h, l))
> +#define GENMASK_U32(h, l) GENMASK_t(u32, h, l)
> +#define GENMASK_U64(h, l) GENMASK_t(u64, h, l)
> +
> +/*
>   * __GENMASK_U128() depends on _BIT128() which would not work
>   * in the asm code, as it shifts an 'unsigned __int128' data
>   * type instead of direct representation of 128 bit constants
> 
> -- 
> 2.45.3
>

Vincent Mailhol March 5, 2025, 4:09 p.m. UTC | #4

On 05/03/2025 at 23:38, Yury Norov wrote:
> On Wed, Mar 05, 2025 at 04:36:12PM +0200, Andy Shevchenko wrote:
>> On Wed, Mar 05, 2025 at 09:30:20AM -0500, Yury Norov wrote:
>>> On Wed, Mar 05, 2025 at 10:00:13PM +0900, Vincent Mailhol via B4 Relay wrote:
>>>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
>>>>
>>>> "int" was misspelled as "init" in GENMASK_U128() comments. Fix the typo.
>>>
>>> Thanks for respinning the series. I'll take this fix in bitmap-for-next, so
>>> if you need v2, you'll not have to bear this thing too.
>>
>> Before doing that, please read my comment first.
> 
> Already did. Yes, you're right.
> 
> Vincent, can you send the fix separately, so I'll move it in the
> upcoming merge window?

Here it is:
https://lore.kernel.org/all/20250305-fix_init128_typo-v1-1-cbe5b8e54e7d@wanadoo.fr/

As requested, I will exclude this from the v5.


Yours sincerely,
Vincent Mailhol

Vincent Mailhol March 5, 2025, 5:17 p.m. UTC | #5

On 06/03/2025 at 00:48, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:
>> On 05/03/2025 at 23:33, Andy Shevchenko wrote:
>>> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:
> 
> ...
> 
>>>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
>>>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
>>>
>>> Why not u8 and u16? This inconsistency needs to be well justified.
>>
>> Because of the C integer promotion rules, if casted to u8 or u16, the
>> expression will immediately become a signed integer as soon as it is get
>> used. For example, if casted to u8
>>
>>   BIT_U8(0) + BIT_U8(1)
>>
>> would be a signed integer. And that may surprise people.
> 
> Yes, but wouldn't be better to put it more explicitly like
> 
> #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?

OK, the final result would be unsigned. But, I do not follow how this is
more explicit.

Also, why do:

  (u8)BIT(b) + 0 + UL(0)

and not just:

  (u8)BIT(b) + UL(0)

?

What is that intermediary '+ 0' for?

I am sorry, but I am having a hard time understanding how casting to u8
and then doing an addition with an unsigned long is more explicit than
directly doing a cast to the desired type.

As I mentioned in my answer to Yury, I have a slight preference for the
unsigned int cast, but I am OK to go back to the u8/u16 cast as it was
in v3.

However, I really do not see how that '+ 0 + UL(0)' would be an improvement.

> Also, BIT_Uxx() gives different type at the end, shouldn't they all be promoted
> to unsigned long long at the end? Probably it won't work in real assembly.
> Can you add test cases which are written in assembly? (Yes, I understand that it will
> be architecture dependent, but still.)

No. I purposely guarded the definition of the BIT_Uxx() by a

  #if !defined(__ASSEMBLY__)

so that these are never visible in assembly. I actually put a comment to
explain why the GENMASK_U*() are not available in assembly. I can copy
and paste the same comment to explain why BIT_U*() are not made
available either:

  /*
   * Missing asm support
   *
   * BIT_U*() depends on BITS_PER_TYPE() which would not work in the asm
   * code as BITS_PER_TYPE() relies on sizeof(), something not available
   * in asm.  Nevertheless, the concept of fixed width integers is a C
   * thing which does not apply to assembly code.
   */

I really believe that it would be a mistake to make the GENMASK_U*() or
the BIT_U*() available to assembly.

>> David also pointed this in the v3:
>>
>> https://lore.kernel.org/intel-xe/d42dc197a15649e69d459362849a37f2@AcuMS.aculab.com/
>>
>> and I agree with his comment.
>>
>> I explained this in the changelog below the --- cutter, but it is
>> probably better to make the explanation more visible. I will add a
>> comment in the code to explain this.
>>
>>>> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
>>>> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
> 

Yours sincerely,
Vincent Mailhol

David Laight March 5, 2025, 9:13 p.m. UTC | #6

On Wed, 5 Mar 2025 17:48:05 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:
> > On 05/03/2025 at 23:33, Andy Shevchenko wrote:  
> > > On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:  
> 
> ...
> 
> > >> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> > >> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))

Why even pretend you are checking against a type - just use 8 or 16.

> > > 
> > > Why not u8 and u16? This inconsistency needs to be well justified.

What is the type of BIT(b)?
It really ought to be unsigned int (so always 32bit), but I bet it
is unsigned long (possibly historically because someone was worried
int might be 16 bits!)

> > 
> > Because of the C integer promotion rules, if casted to u8 or u16, the
> > expression will immediately become a signed integer as soon as it is get
> > used. For example, if casted to u8
> > 
> >   BIT_U8(0) + BIT_U8(1)
> > 
> > would be a signed integer. And that may surprise people.

They always get 'surprised' by that.
I found some 'dayjob' code that was doing (byte_var << 1) >> 1 in order
to get the high bit discarded.
Been like that for best part of 30 years...
I wasn't scared to fix it :-)

> Yes, but wouldn't be better to put it more explicitly like
> 
> #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?

I don't think you should force it to 'unsigned long'.
On 64bit a comparison against a 32bit 'signed int' will sign-extend the
value before making it unsigned.
While that shouldn't matter here, someone might copy it.
You just want to ensure that all the values are 'unsigned int', trying
to return u8 or u16 isn't worth the effort.

When I was doing min_unsigned() I did ((x) + 0u + 0ul + 0ull) to ensure
that values would always be zero extended.
But I was doing the same to both sides of the expression - and the compiler
optimises away all the 'known 0' extension to 64bits.
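
A small userspace sketch of the difference between the two extensions
mentioned here. The names are hypothetical, and the expected values assume
a 32-bit int and a 64-bit unsigned long long.

```c
#include <assert.h>

/* Hypothetical name; mirrors the "+ 0u + 0ul + 0ull" idiom described above. */
#define SKETCH_ZERO_EXTEND(x) ((x) + 0u + 0ul + 0ull)

/* Converting a negative int straight to a 64-bit unsigned type sign-extends it. */
static inline unsigned long long sketch_sign_extended(int v)
{
    return (unsigned long long)v;
}

/* Going through unsigned int first makes the widening step a zero extension. */
static inline unsigned long long sketch_zero_extended(int v)
{
    return SKETCH_ZERO_EXTEND(v);
}
```

With these assumptions, sketch_sign_extended(-1) gives 0xffffffffffffffff
while sketch_zero_extended(-1) gives 0x00000000ffffffff.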

> Also, BIT_Uxx() gives different type at the end, shouldn't they all be promoted
> to unsigned long long at the end? Probably it won't work in real assembly.
> Can you add test cases which are written in assembly? (Yes, I understand that it will
> be architecture dependent, but still.)

There is no point doing multiple versions for asm files.
The reason UL(x) and ULL(x) exist is because the assembler just has integers.
Both expand to (x).
There might not even be a distinction between signed and unsigned.
I'm not sure you can assume that a shift right won't replicate the sign bit.
Since the expression can only be valid for constants, something simple
like ((2 << (hi)) - (1 << (lo))) really is the best you are going to get
for GENMASK().

So just define a completely different version for asm and nuke the UL() etc
for readability.
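
For reference, the simple expression can be checked in C as well. This is
only a sketch with a hypothetical macro name; in C the operands are plain
int, so it is only meaningful for small values of hi, whereas an assembler
evaluates the expression with its own integer width.

```c
#include <assert.h>

/* Hypothetical name; the expression is the one suggested for asm above. */
#define SKETCH_ASM_GENMASK(hi, lo) ((2 << (hi)) - (1 << (lo)))

/* (2 << 6) - (1 << 2) == 128 - 4 == 0x7c */
static_assert(SKETCH_ASM_GENMASK(6, 2) == 0x7c, "bits 6..2");
/* (2 << 12) - (1 << 7) == 8192 - 128 == 0x1f80 */
static_assert(SKETCH_ASM_GENMASK(12, 7) == 0x1f80, "bits 12..7");
static_assert(SKETCH_ASM_GENMASK(0, 0) == 0x1, "bit 0 only");
```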

	David

David Laight March 5, 2025, 9:50 p.m. UTC | #7

On Wed, 5 Mar 2025 21:56:22 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Thu, Mar 06, 2025 at 02:17:18AM +0900, Vincent Mailhol wrote:
> > On 06/03/2025 at 00:48, Andy Shevchenko wrote:  
> > > On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:  
> > >> On 05/03/2025 at 23:33, Andy Shevchenko wrote:  
> > >>> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:  
> 
> ...
> 
> > >>>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> > >>>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))  
> > >>>
> > >>> Why not u8 and u16? This inconsistency needs to be well justified.  
> > >>
> > >> Because of the C integer promotion rules, if casted to u8 or u16, the
> > >> expression will immediately become a signed integer as soon as it is get
> > >> used. For example, if casted to u8
> > >>
> > >>   BIT_U8(0) + BIT_U8(1)
> > >>
> > >> would be a signed integer. And that may surprise people.  
> > > 
> > > Yes, but wouldn't be better to put it more explicitly like
> > > 
> > > #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?  
> > 
> > OK, the final result would be unsigned. But, I do not follow how this is
> > more explicit.
> > 
> > Also, why doing:
> > 
> >   (u8)BIT(b) + 0 + UL(0)
> > 
> > and not just:
> > 
> >   (u8)BIT(b) + UL(0)
> > 
> > ?
> > 
> > What is that intermediary '+ 0' for?
> > 
> > I am sorry, but I am having a hard time understanding how casting to u8
> > and then doing an addition with an unsigned long is more explicit than
> > directly doing a cast to the desired type.  
> 
> Reading this again, I think we don't need it at all. u8, aka unsigned char,
> will be promoted to int, but it will be int with a value < 256, can't be signed
> as far as I understand this correctly.

The value can't be negative, but the type will be a signed one.
Anything comparing types (and there are a few) will treat it as signed.
It really is bad practice to even pretend you can have an expression
(rather than a variable) that has a type smaller than 'int'.
It wouldn't surprise me if even an 'a = b' assignment promotes 'b' to int.

So it is even questionable whether BIT8() and BIT16() should even exist at all.
There can be reasons to return 'unsigned int' rather than 'unsigned long'.
But with the type definitions that Linux uses (and can't really be changed)
you can have BIT32() that is 'unsigned int' and BIT64() that is 'unsigned long
long'. These are then the same on 32bit and 64bit.

	David

Vincent Mailhol March 6, 2025, 9:38 a.m. UTC | #8

On 06/03/2025 at 18:12, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 09:50:27PM +0000, David Laight wrote:
>> On Wed, 5 Mar 2025 21:56:22 +0200
>> Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
>>> On Thu, Mar 06, 2025 at 02:17:18AM +0900, Vincent Mailhol wrote:
>>>> On 06/03/2025 at 00:48, Andy Shevchenko wrote:  
>>>>> On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:  
>>>>>> On 05/03/2025 at 23:33, Andy Shevchenko wrote:  
>>>>>>> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:  
> 
> ...
> 
>>>>>>>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
>>>>>>>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))  
>>>>>>>
>>>>>>> Why not u8 and u16? This inconsistency needs to be well justified.  
>>>>>>
>>>>>> Because of the C integer promotion rules, if casted to u8 or u16, the
>>>>>> expression will immediately become a signed integer as soon as it is get
>>>>>> used. For example, if casted to u8
>>>>>>
>>>>>>   BIT_U8(0) + BIT_U8(1)
>>>>>>
>>>>>> would be a signed integer. And that may surprise people.  
>>>>>
>>>>> Yes, but wouldn't be better to put it more explicitly like
>>>>>
>>>>> #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?  
>>>>
>>>> OK, the final result would be unsigned. But, I do not follow how this is
>>>> more explicit.
>>>>
>>>> Also, why doing:
>>>>
>>>>   (u8)BIT(b) + 0 + UL(0)
>>>>
>>>> and not just:
>>>>
>>>>   (u8)BIT(b) + UL(0)
>>>>
>>>> ?
>>>>
>>>> What is that intermediary '+ 0' for?
>>>>
>>>> I am sorry, but I am having a hard time understanding how casting to u8
>>>> and then doing an addition with an unsigned long is more explicit than
>>>> directly doing a cast to the desired type.  
>>>
>>> Reading this again, I think we don't need it at all. u8, aka unsigned char,
>>> will be promoted to int, but it will be int with a value < 256, can't be signed
>>> as far as I understand this correctly.
>>
>> The value can't be negative, but the type will be a signed one.
> 
> Yes, that's what I mentioned above: "int with the value < 256".
> 
>> Anything comparing types (and there are a few) will treat it as signed.
>> It really is bad practise to even pretend you can have an expression
>> (rather that a variable) that has a type smaller than 'int'.
>> It wouldn't surprise me if even an 'a = b' assignment promotes 'b' to int.
> 
> We have tons of code with u8/u16, what you are proposing here is like
> "let's get rid of those types and replace all of them by int/unsigned int".
> We have ISAs that are byte-oriented despite being 32- or 64-bit platforms.
> 
>> So it is even questionable whether BIT8() and BIT16() should even exist at all.
> 
> The point is to check the boundaries and not in the returned value per se.

+1

I will also add that this adds to the readability of the code. In a
driver, if I see:

  #define REG_FOO1_MASK GENMASK(6, 2)
  #define REG_FOO2_MASK GENMASK(12, 7)

it does not tell me much about the register. Whereas if I see:

  #define REG_FOO1_MASK GENMASK_U16(6, 2)
  #define REG_FOO2_MASK GENMASK_U16(12, 7)

then I know that this is for a 16 bit register.

>> There can be reasons to return 'unsigned int' rather than 'unsigned long'.
>> But with the type definitions that Linux uses (and can't really be changed)
>> you can have BIT32() that is 'unsigned int' and BIT64() that is 'unsigned long
>> long'. These are then the same on 32bit and 64bit.

So, in the end, my goal when introducing that unsigned int cast was not
to confuse people. It had the opposite effect: nearly all the
reviewers pointed at that cast.

I will revert this in v5. The U8 and U16 variants of both GENMASK
and BIT will return a u8 and a u16 respectively. And unless someone
manages to convince Yury otherwise, I will keep it as such.


Yours sincerely,
Vincent Mailhol

Yury Norov March 19, 2025, 1:46 a.m. UTC | #9

+ Catalin Marinas, ARM maillist

Hi Catalin and everyone,

Anshuman Khandual asked me to merge GENMASK_U128(), saying it's
important for ARM to stabilize the API. Although it was dead code, I
accepted his patch as he promised to add users shortly.

It has now been more than half a year since then. There are no users,
and no feedback from Anshuman.

Can you please tell me if you still need the macro? I don't want to
undercut your development, but if you don't need 128-bit genmasks
there's no reason to have dead code in the uapi.

Thanks,
Yury

On Wed, Mar 05, 2025 at 10:22:47AM -0500, Yury Norov wrote:
> + Anshuman Khandual <anshuman.khandual@arm.com>
> 
> Anshuman,
> 
> I merged your GENMASK_U128() because you said it's important for your
> projects, and that it will get used in the kernel soon.
> 
> Now it's in the kernel for more than 6 month, but no users were added.
> Can you clarify if you still need it, and if so why it's not used?
> 
> As you see, people add another fixed-types GENMASK() macros, and their
> implementation differ from GENMASK_U128().
> 
> My second concern is that __GENMASK_U128() is declared in uapi, while
> the general understanding for other fixed-type genmasks is that they
> are not exported to users. Do you need this macro to be exported to
> userspace? Can you show how and where it is used there?
> 
> Thanks,
> Yury
> 
> 
> On Wed, Mar 05, 2025 at 10:00:15PM +0900, Vincent Mailhol via B4 Relay wrote:
> > From: Yury Norov <yury.norov@gmail.com>
> > 
> > Add __GENMASK_t() which generalizes __GENMASK() to support different
> > types, and implement fixed-types versions of GENMASK() based on it.
> > The fixed-type version allows more strict checks to the min/max values
> > accepted, which is useful for defining registers like implemented by
> > i915 and xe drivers with their REG_GENMASK*() macros.
> > 
> > The strict checks rely on shift-count-overflow compiler check to fail
> > the build if a number outside of the range allowed is passed.
> > Example:
> > 
> > 	#define FOO_MASK GENMASK_U32(33, 4)
> > 
> > will generate a warning like:
> > 
> > 	../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow]
> > 	   41 |          (((t)~0ULL - ((t)(1) << (l)) + 1) & \
> > 	      |                               ^~
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Acked-by: Jani Nikula <jani.nikula@intel.com>
> > Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> > ---
> > Changelog:
> > 
> >   v3 -> v4:
> > 
> >     - The v3 is one year old. Meanwhile people started using
> >       __GENMASK() directly. So instead of generalizing __GENMASK() to
> >       support different types, add a new GENMASK_t().
> > 
> >     - replace ~0ULL by ~_ULL(0). Otherwise, __GENMASK_t() would fail
> >       in asm code.
> > 
> >     - Make GENMASK_U8() and GENMASK_U16() return an unsigned int. In
> >       v3, due to the integer promotion rules, these were returning a
> >       signed integer. By casting these to unsigned int, at least the
> >       signedness is kept.
> > ---
> >  include/linux/bitops.h |  1 -
> >  include/linux/bits.h   | 33 +++++++++++++++++++++++++++++----
> >  2 files changed, 29 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> > index c1cb53cf2f0f8662ed3e324578f74330e63f935d..9be2d50da09a417966b3d11c84092bb2f4cd0bef 100644
> > --- a/include/linux/bitops.h
> > +++ b/include/linux/bitops.h
> > @@ -8,7 +8,6 @@
> >  
> >  #include <uapi/linux/kernel.h>
> >  
> > -#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
> >  #define BITS_TO_LONGS(nr)	__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
> >  #define BITS_TO_U64(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
> >  #define BITS_TO_U32(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
> > diff --git a/include/linux/bits.h b/include/linux/bits.h
> > index 5f68980a1b98d771426872c74d7b5c0f79e5e802..f202e46d2f4b7899c16d975120f3fa3ae41556ae 100644
> > --- a/include/linux/bits.h
> > +++ b/include/linux/bits.h
> > @@ -12,6 +12,7 @@
> >  #define BIT_ULL_MASK(nr)	(ULL(1) << ((nr) % BITS_PER_LONG_LONG))
> >  #define BIT_ULL_WORD(nr)	((nr) / BITS_PER_LONG_LONG)
> >  #define BITS_PER_BYTE		8
> > +#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
> >  
> >  /*
> >   * Create a contiguous bitmask starting at bit position @l and ending at
> > @@ -25,14 +26,38 @@
> >  
> >  #define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))
> >  
> > -#define GENMASK(h, l) \
> > -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> > -#define GENMASK_ULL(h, l) \
> > -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
> > +/*
> > + * Generate a mask for the specified type @t. Additional checks are made to
> > + * guarantee the value returned fits in that type, relying on
> > + * shift-count-overflow compiler check to detect incompatible arguments.
> > + * For example, all these create build errors or warnings:
> > + *
> > + * - GENMASK(15, 20): wrong argument order
> > + * - GENMASK(72, 15): doesn't fit unsigned long
> > + * - GENMASK_U32(33, 15): doesn't fit in a u32
> > + */
> > +#define GENMASK_t(t, h, l)				\
> > +	(GENMASK_INPUT_CHECK(h, l) +			\
> > +	 (((t)~ULL(0) - ((t)1 << (l)) + 1) &		\
> > +	  ((t)~ULL(0) >> (BITS_PER_TYPE(t) - 1 - (h)))))
> > +
> > +#define GENMASK(h, l) GENMASK_t(unsigned long,  h, l)
> > +#define GENMASK_ULL(h, l) GENMASK_t(unsigned long long, h, l)
> >  
> >  /*
> >   * Missing asm support
> >   *
> > + * __GENMASK_U*() depends on BITS_PER_TYPE() which would not work in the asm
> > + * code as BITS_PER_TYPE() relies on sizeof(), something not available in
> > + * asm. Nethertheless, the concept of fixed width integers is a C thing which
> > + * does not apply to assembly code.
> > + */
> > +#define GENMASK_U8(h, l) ((unsigned int)GENMASK_t(u8,  h, l))
> > +#define GENMASK_U16(h, l) ((unsigned int)GENMASK_t(u16, h, l))
> > +#define GENMASK_U32(h, l) GENMASK_t(u32, h, l)
> > +#define GENMASK_U64(h, l) GENMASK_t(u64, h, l)
> > +
> > +/*
> >   * __GENMASK_U128() depends on _BIT128() which would not work
> >   * in the asm code, as it shifts an 'unsigned __int128' data
> >   * type instead of direct representation of 128 bit constants
> > 
> > -- 
> > 2.45.3
> >
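
The v4 changelog item quoted above, about GENMASK_U8() and GENMASK_U16() being cast to unsigned int, can be illustrated with a small sketch. This is not the kernel code: SKETCH_GENMASK_T(), SKETCH_GENMASK_U8() and TYPE_ID() are hypothetical names, the build-time input check is left out, and an int wider than 16 bits is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 for int, 2 for unsigned int, 0 for anything else. */
#define TYPE_ID(x) _Generic((x), int: 1, unsigned int: 2, default: 0)

/*
 * Simplified stand-in for the quoted GENMASK_t(): same arithmetic, but
 * without GENMASK_INPUT_CHECK() and with sizeof() * 8 for BITS_PER_TYPE().
 */
#define SKETCH_GENMASK_T(t, h, l)				\
	((((t)~0ULL) - ((t)1 << (l)) + 1) &			\
	 (((t)~0ULL) >> (sizeof(t) * 8 - 1 - (h))))

/* Mirrors the cast that v4 adds for the 8-bit variant. */
#define SKETCH_GENMASK_U8(h, l) \
	((unsigned int)SKETCH_GENMASK_T(uint8_t, h, l))

/* Returns 1 if the uncast 8-bit mask expression has type int. */
static inline int uncast_u8_mask_is_int(void)
{
	/* uint8_t operands are promoted to int before the arithmetic. */
	return TYPE_ID(SKETCH_GENMASK_T(uint8_t, 7, 0)) == 1;
}

/* Returns 1 if the cast 8-bit mask expression has type unsigned int. */
static inline int cast_u8_mask_is_unsigned(void)
{
	return TYPE_ID(SKETCH_GENMASK_U8(7, 0)) == 2;
}
```

Both helpers return 1 on such a platform: without the cast the mask expression has type int, with the cast it has type unsigned int, which is the "signedness is kept" point of the changelog.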

Anshuman Khandual March 19, 2025, 3:34 a.m. UTC | #10

On 3/19/25 07:16, Yury Norov wrote:
> + Catalin Marinas, ARM maillist
> 
> Hi Catalin and everyone,

Hello Yury,

> 
> Anshuman Khandual asked me to merge GENMASK_U128() saying it's
> important for ARM to stabilize API. While it's a dead code, I
> accepted his patch as he promised to add users shortly.
> 
> Now it's more than half a year since that. There's no users,
> and no feedback from Anshuman.

My apologies for having missed your email earlier. Please find my response
to the earlier email below as well.

> 
> Can you please tell if you still need the macro? I don't want to
> undercut your development, but if you don't need 128-bit genmasks
> there's no reason to have a dead code in the uapi.

The code base that specifically uses GENMASK_U128() has not been posted
upstream yet (it probably will be in the next couple of months or so),
except for the following patch, which has not been merged and is still
under review and development.

https://lore.kernel.org/lkml/20240801054436.612024-1-anshuman.khandual@arm.com/

> 
> Thanks,
> Yury
> 
> On Wed, Mar 05, 2025 at 10:22:47AM -0500, Yury Norov wrote:
>> + Anshuman Khandual <anshuman.khandual@arm.com>
>>
>> Anshuman,
>>
>> I merged your GENMASK_U128() because you said it's important for your
>> projects, and that it will get used in the kernel soon.
>>
>> Now it's in the kernel for more than 6 month, but no users were added.
>> Can you clarify if you still need it, and if so why it's not used?

We do need it, although the code using GENMASK_U128() has not been
posted upstream yet.

>>
>> As you see, people add another fixed-types GENMASK() macros, and their
>> implementation differ from GENMASK_U128().

I will take a look. Is GENMASK_U128() problematic for this new
scheme?

>>
>> My second concern is that __GENMASK_U128() is declared in uapi, while
>> the general understanding for other fixed-type genmasks is that they
>> are not exported to users. Do you need this macro to be exported to
>> userspace? Can you show how and where it is used there?

No, at least not right now.

These were moved into uapi subsequently via the following commit.

21a3a3d015aee ("tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources")

But in general, GENMASK_U128() is needed for generating 128-bit page table
entries and the related flags and masks, whether in the kernel or in user
space (for writing kernel test cases, etc.).

>>
>> Thanks,
>> Yury
>>
>>

Anshuman Khandual March 19, 2025, 4:13 a.m. UTC | #11

On 3/19/25 09:04, Anshuman Khandual wrote:
> On 3/19/25 07:16, Yury Norov wrote:
>> + Catalin Marinas, ARM maillist
>>
>> Hi Catalin and everyone,
> 
> Hello Yury,
> 
>>
>> Anshuman Khandual asked me to merge GENMASK_U128() saying it's
>> important for ARM to stabilize API. While it's a dead code, I
>> accepted his patch as he promised to add users shortly.
>>
>> Now it's more than half a year since that. There's no users,
>> and no feedback from Anshuman.
> 
> My apologies to have missed your email earlier. Please find response
> for the earlier email below as well.
> 
>>
>> Can you please tell if you still need the macro? I don't want to
>> undercut your development, but if you don't need 128-bit genmasks
>> there's no reason to have a dead code in the uapi.
> 
> The code base specifically using GENMASK_U128() has not been posted
> upstream (probably in next couple of months or so) till now, except
> the following patch which has been not been merged and still under
> review and development.
> 
> https://lore.kernel.org/lkml/20240801054436.612024-1-anshuman.khandual@arm.com/
> 
>>
>> Thanks,
>> Yury
>>
>> On Wed, Mar 05, 2025 at 10:22:47AM -0500, Yury Norov wrote:
>>> + Anshuman Khandual <anshuman.khandual@arm.com>
>>>
>>> Anshuman,
>>>
>>> I merged your GENMASK_U128() because you said it's important for your
>>> projects, and that it will get used in the kernel soon.
>>>
>>> Now it's in the kernel for more than 6 month, but no users were added.
>>> Can you clarify if you still need it, and if so why it's not used?
> 
> We would need it but although the code using GENMASK_U128() has not been
> posted upstream.
> 
>>>
>>> As you see, people add another fixed-types GENMASK() macros, and their
>>> implementation differ from GENMASK_U128().
> 
> I will take a look. Is GENMASK_U128() being problematic for the this new
> scheme ?
> 
>>>
>>> My second concern is that __GENMASK_U128() is declared in uapi, while
>>> the general understanding for other fixed-type genmasks is that they
>>> are not exported to users. Do you need this macro to be exported to
>>> userspace? Can you show how and where it is used there?
> 
> No, not atleast right now.
> 
> These were moved into uapi subsequently via the following commit.
> 
> 21a3a3d015aee ("tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources")
> 
> But in general GENMASK_U128() is needed for generating 128 bit page table
> entries, related flags and masks whether in kernel or in user space for
> writing kernel test cases etc.

In commit 947697c6f0f7 ("uapi: Define GENMASK_U128"), GENMASK_U128() is defined
using __GENMASK_U128(), which in turn calls _BIT128(). Both of these are defined in
UAPI headers under include/uapi/linux/.

Just wondering: are you suggesting moving these helpers from include/uapi/linux/ to
include/linux/bits.h instead?

> 
>>>
>>> Thanks,
>>> Yury
>>>
>>>
>

Yury Norov March 21, 2025, 5:05 p.m. UTC | #12

On Wed, Mar 19, 2025 at 09:43:06AM +0530, Anshuman Khandual wrote:
> 
> 
> On 3/19/25 09:04, Anshuman Khandual wrote:
> > On 3/19/25 07:16, Yury Norov wrote:
> >> + Catalin Marinas, ARM maillist
> >>
> >> Hi Catalin and everyone,
> > 
> > Hello Yury,
> > 
> >>
> >> Anshuman Khandual asked me to merge GENMASK_U128() saying it's
> >> important for ARM to stabilize API. While it's a dead code, I
> >> accepted his patch as he promised to add users shortly.
> >>
> >> Now it's more than half a year since that. There's no users,
> >> and no feedback from Anshuman.
> > 
> > My apologies to have missed your email earlier. Please find response
> > for the earlier email below as well.
> > 
> >>
> >> Can you please tell if you still need the macro? I don't want to
> >> undercut your development, but if you don't need 128-bit genmasks
> >> there's no reason to have a dead code in the uapi.
> > 
> > The code base specifically using GENMASK_U128() has not been posted
> > upstream (probably in next couple of months or so) till now, except
> > the following patch which has been not been merged and still under
> > review and development.
> > 
> > https://lore.kernel.org/lkml/20240801054436.612024-1-anshuman.khandual@arm.com/
> > 
> >>
> >> Thanks,
> >> Yury
> >>
> >> On Wed, Mar 05, 2025 at 10:22:47AM -0500, Yury Norov wrote:
> >>> + Anshuman Khandual <anshuman.khandual@arm.com>
> >>>
> >>> Anshuman,
> >>>
> >>> I merged your GENMASK_U128() because you said it's important for your
> >>> projects, and that it will get used in the kernel soon.
> >>>
> >>> Now it's in the kernel for more than 6 month, but no users were added.
> >>> Can you clarify if you still need it, and if so why it's not used?
> > 
> > We would need it but although the code using GENMASK_U128() has not been
> > posted upstream.
> > 
> >>>
> >>> As you see, people add another fixed-types GENMASK() macros, and their
> >>> implementation differ from GENMASK_U128().
> > 
> > I will take a look. Is GENMASK_U128() being problematic for the this new
> > scheme ?
> > 
> >>>
> >>> My second concern is that __GENMASK_U128() is declared in uapi, while
> >>> the general understanding for other fixed-type genmasks is that they
> >>> are not exported to users. Do you need this macro to be exported to
> >>> userspace? Can you show how and where it is used there?
> > 
> > No, not atleast right now.

Ok, thanks.

> > These were moved into uapi subsequently via the following commit.
> > 
> > 21a3a3d015aee ("tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources")
> > 
> > But in general GENMASK_U128() is needed for generating 128 bit page table
> > entries, related flags and masks whether in kernel or in user space for
> > writing kernel test cases etc.
> 
> In the commit 947697c6f0f7 ("uapi: Define GENMASK_U128"), GENMASK_U128() gets defined
> using __GENMASK_U128() which in turn calls __BIT128() - both of which are defined in
> UAPI headers inside (include/uapi/linux/). 
> 
> Just wondering - are you suggesting to move these helpers from include/uapi/linux/ to
> include/linux/bits.h instead ?

Vincent is working on fixed-width GENMASK_Uxx() based on GENMASK_TYPE().

https://lore.kernel.org/lkml/20250308-fixed-type-genmasks-v6-0-f59315e73c29@wanadoo.fr/T/

The series adds a general GENMASK_TYPE() in linux/bits.h. I'd like
all fixed-width genmasks to be based on it. The implementation doesn't
allow moving GENMASK_TYPE() to the uapi easily.

There was a discussion regarding that, and for now the general understanding
is that userspace doesn't need GENMASK_Uxx().

Are your proposed tests based on the in-kernel tools/ ? If so, linux/bits.h
will be available for you.

Vincent,

Can you please experiment with moving GENMASK_U128() to linux/bits.h
and switching it to a GENMASK_TYPE()-based implementation?

If it works, we can do it after merging GENMASK_TYPE() and
ancestors.

Thanks,
Yury

Vincent Mailhol March 22, 2025, 11:46 a.m. UTC | #13

On 22/03/2025 at 02:05, Yury Norov wrote:
> On Wed, Mar 19, 2025 at 09:43:06AM +0530, Anshuman Khandual wrote:
>> 
>> 
>> On 3/19/25 09:04, Anshuman Khandual wrote:
>>> On 3/19/25 07:16, Yury Norov wrote:
>>>> + Catalin Marinas, ARM maillist

(...)

>>> These were moved into uapi subsequently via the following 
>>> commit.
>>> 
>>> 21a3a3d015aee ("tools headers: Synchronize {uapi/}linux/bits.h 
>>> with the kernel sources")
>>> 
>>> But in general GENMASK_U128() is needed for generating 128 bit 
>>> page table entries, related flags and masks whether in kernel or
>>> in user space for writing kernel test cases etc.
>> 
>> In the commit 947697c6f0f7 ("uapi: Define GENMASK_U128"), 
>> GENMASK_U128() gets defined using __GENMASK_U128() which in turn 
>> calls __BIT128() - both of which are defined in UAPI headers 
>> inside (include/uapi/linux/).
>> 
>> Just wondering - are you suggesting to move these helpers from 
>> include/uapi/linux/ to include/linux/bits.h instead ?
> 
> Vincent is working on fixed-width GENMASK_Uxx() based on
> GENMASK_TYPE().
> 
> https://lore.kernel.org/lkml/20250308-fixed-type-genmasks-v6-0-f59315e73c29@wanadoo.fr/T/
> 
> The series adds a general GENMASK_TYPE() in linux/bits.h. I'd
> like all fixed-width genmasks to be based on it. The implementation
> doesn't allow moving GENMASK_TYPE() to the uapi easily.
> 
> There was a discussion regarding that, and for now the general
> understanding is that userspace doesn't need GENMASK_Uxx().
> 
> Are your proposed tests based on the in-kernel tools/ ? If so,
> linux/bits.h will be available for you.
> 
> Vincent,
> 
> Can you please experiment with moving GENMASK_U128() to
> linux/bits.h and switching it to a GENMASK_TYPE()-based implementation?
> 
> If it works, we can do it after merging GENMASK_TYPE() and
> ancestors.

I sent the new version with the split as you asked in a separate message.

I switched GENMASK_U128() from using __GENMASK_U128() to using
GENMASK_TYPE() in this patch of the second series:

https://lore.kernel.org/all/20250322-consolidate-genmask-v1-2-54bfd36c5643@wanadoo.fr/

After this, the genmask_u128_test() unit tests from lib/test_bits.c are
all green, so this looks good. Note that because the macro is not yet
used, there is not much more to test aside from that unit test.

To be precise, I am not yet *moving* it. For now, I decoupled
GENMASK_U128() from __GENMASK_U128(). To complete the move, all that is
left is to remove __GENMASK_U128() from the uapi. To be honest, I am not
keen on touching either the uapi or the asm variants myself. But, if
my work gets merged, that last step should be easy for you.

On a side note, at first glance, I was disturbed by the current
__GENMASK_U128() implementation:

  #define __GENMASK_U128(h, l) \
  	((_BIT128((h)) << 1) - (_BIT128(l)))

When calling __GENMASK_U128(127, x), the macro does:

  _BIT128(127) << 1

which expands to:

  (unsigned __int128)1 << 127 << 1

So, while (unsigned __int128)1 << 128 is undefined behaviour, doing
it in two steps (<< 127, then << 1) is well defined and gives zero. Then,
in the subtraction, the unsigned integer wraparound restores the
most significant bits, so the result is correct.
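
The two-step shift and the wraparound can be tried out on a narrower type. The sketch below is a 64-bit analogue, chosen only so that it does not depend on the unsigned __int128 compiler extension; SKETCH_BIT64() and SKETCH_GENMASK_U64() are hypothetical names that copy the shape of _BIT128() and __GENMASK_U128(), they are not kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit analogue of _BIT128(). */
#define SKETCH_BIT64(x) ((uint64_t)1 << (x))

/* Same shape as the quoted __GENMASK_U128(), scaled down to 64 bits. */
#define SKETCH_GENMASK_U64(h, l) \
	((SKETCH_BIT64(h) << 1) - SKETCH_BIT64(l))

/* Shifting by 63 and then by 1 keeps each shift count in range. */
static inline uint64_t top_bit_shifted_out(void)
{
	/* The only set bit is discarded by the second shift. */
	return SKETCH_BIT64(63) << 1;
}

/* 0 - 2^l wraps around modulo 2^64, which sets bits 63..l. */
static inline uint64_t mask_from_top(unsigned int l)
{
	return SKETCH_GENMASK_U64(63, l);
}
```

Here top_bit_shifted_out() yields zero and mask_from_top(60) yields the top four bits, matching the reasoning above for the h = 127 case.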

The same applies to all the other variants. If doing:

  #define GENMASK_TYPE(t, h, l)				\
  	((t)(GENMASK_INPUT_CHECK(h, l) +		\
  	     (((t)1 << (h) << 1) - ((t)1 << (l)))))

The unit tests pass for everything and you still get the warning if
h is out of bounds.
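
As a rough cross-check outside the kernel tree, the two formulas can be compared exhaustively for the common widths. This is only a sketch: the SKETCH_* macros are hypothetical, they drop GENMASK_INPUT_CHECK(), replace BITS_PER_TYPE() by sizeof() * 8, and add an outer cast to the first form so that both results have the same type.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic of the GENMASK_t() from the patch, minus the input check. */
#define SKETCH_MASK_CUR(t, h, l)				\
	((t)((((t)~0ULL) - ((t)1 << (l)) + 1) &			\
	     (((t)~0ULL) >> (sizeof(t) * 8 - 1 - (h)))))

/* Arithmetic of the alternative above, minus the input check. */
#define SKETCH_MASK_ALT(t, h, l)				\
	((t)(((t)1 << (h) << 1) - ((t)1 << (l))))

/* Defines agree_<t>(): 1 if both forms match for every l <= h, else 0. */
#define SKETCH_AGREE(t)						\
	static inline int agree_##t(void)			\
	{							\
		unsigned int h, l;				\
								\
		for (h = 0; h < sizeof(t) * 8; h++)		\
			for (l = 0; l <= h; l++)		\
				if (SKETCH_MASK_CUR(t, h, l) !=	\
				    SKETCH_MASK_ALT(t, h, l))	\
					return 0;		\
		return 1;					\
	}

SKETCH_AGREE(uint8_t)
SKETCH_AGREE(uint16_t)
SKETCH_AGREE(uint32_t)
SKETCH_AGREE(uint64_t)
```

Calling agree_uint8_t(), agree_uint16_t(), agree_uint32_t() and agree_uint64_t() checks every valid (h, l) pair for that width. It says nothing about code size, which is what the bloat-o-meter numbers below are about.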

But then, bloat-o-meter (x86_64, defconfig, GCC 12.4.1) shows a small
increase:

  Total: Before=22723482, After=22724586, chg +0.00%

So, probably not worth the change anyway. I am keeping the current version.

Yours sincerely,
Vincent Mailhol
