[v4,4/8] bits: introduce fixed-type BIT

Message ID 20250305-fixed-type-genmasks-v4-4-1873dcdf6723@wanadoo.fr (mailing list archive)
State New
Series bits: Fixed-type GENMASK()/BIT()

Commit Message

Vincent Mailhol via B4 Relay March 5, 2025, 1 p.m. UTC
From: Lucas De Marchi <lucas.demarchi@intel.com>

Implement fixed-type BIT to help drivers add stricter checks, like was
done for GENMASK().

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
---
Changelog:

  v3 -> v4:

    - Use const_true() to simplify BIT_INPUT_CHECK().

    - Make BIT_U8() and BIT_U16() return an unsigned int instead of a
      u8 and u16. Because of the integer promotion rules in C, a u8
      or a u16 would become a signed integer as soon as these are
      used in any expression. By casting these to unsigned ints, at
      least the signedness is kept.

    - Put the cast next to the BIT() macro.

    - In BIT_U64(): use BIT_ULL() instead of BIT().
---
 include/linux/bits.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Andy Shevchenko March 5, 2025, 2:33 p.m. UTC | #1
On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:
> From: Lucas De Marchi <lucas.demarchi@intel.com>
> 
> Implement fixed-type BIT to help drivers add stricter checks, like was

Here and in the Subject I would use BIT_Uxx().

> done for GENMASK().

...

> +/*
> + * Fixed-type variants of BIT(), with additional checks like GENMASK_t(). The

GENMASK_t() is not a well named macro.

> + * following examples generate compiler warnings due to shift-count-overflow:
> + *
> + * - BIT_U8(8)
> + * - BIT_U32(-1)
> + * - BIT_U32(40)
> + */
> +#define BIT_INPUT_CHECK(type, b) \
> +	BUILD_BUG_ON_ZERO(const_true((b) >= BITS_PER_TYPE(type)))
> +
> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))

Why not u8 and u16? This inconsistency needs to be well justified.

> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))

Can you also use a TAB between the parentheses for better readability?
E.g.,

#define BIT_U64(b)	(BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
Vincent Mailhol March 5, 2025, 2:48 p.m. UTC | #2
On 05/03/2025 at 23:33, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:
>> From: Lucas De Marchi <lucas.demarchi@intel.com>
>>
>> Implement fixed-type BIT to help drivers add stricter checks, like was
> 
> Here and in the Subject I would use BIT_Uxx().
> 
>> done for GENMASK().
> 
> ...
> 
>> +/*
>> + * Fixed-type variants of BIT(), with additional checks like GENMASK_t(). The
> 
> GENMASK_t() is not a well named macro.

Ack. I will rename to GENMASK_TYPE().

>> + * following examples generate compiler warnings due to shift-count-overflow:
>> + *
>> + * - BIT_U8(8)
>> + * - BIT_U32(-1)
>> + * - BIT_U32(40)
>> + */
>> +#define BIT_INPUT_CHECK(type, b) \
>> +	BUILD_BUG_ON_ZERO(const_true((b) >= BITS_PER_TYPE(type)))
>> +
>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
> 
> Why not u8 and u16? This inconsistency needs to be well justified.

Because of the C integer promotion rules, if cast to u8 or u16, the
expression will immediately become a signed integer as soon as it is
used. For example, if cast to u8

  BIT_U8(0) + BIT_U8(1)

would be a signed integer. And that may surprise people.

David also pointed this in the v3:

https://lore.kernel.org/intel-xe/d42dc197a15649e69d459362849a37f2@AcuMS.aculab.com/

and I agree with his comment.

I explained this in the changelog below the --- cutter, but it is
probably better to make the explanation more visible. I will add a
comment in the code to explain this.

>> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
>> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
> 
> Can you also use a TAB between the parentheses for better readability?
> E.g.,
> 
> #define BIT_U64(b)	(BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))

Sure. I prefer it with space, but no strong opinion. I will put tab in v5.

Yours sincerely,
Vincent Mailhol
Andy Shevchenko March 5, 2025, 3:48 p.m. UTC | #3
On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:
> On 05/03/2025 at 23:33, Andy Shevchenko wrote:
> > On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:

...

> >> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> >> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
> > 
> > Why not u8 and u16? This inconsistency needs to be well justified.
> 
> Because of the C integer promotion rules, if cast to u8 or u16, the
> expression will immediately become a signed integer as soon as it is
> used. For example, if cast to u8
> 
>   BIT_U8(0) + BIT_U8(1)
> 
> would be a signed integer. And that may surprise people.

Yes, but wouldn't be better to put it more explicitly like

#define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?

Also, BIT_Uxx() gives different type at the end, shouldn't they all be promoted
to unsigned long long at the end? Probably it won't work in real assembly.
Can you add test cases which are written in assembly? (Yes, I understand that it will
be architecture dependent, but still.)

> David also pointed this in the v3:
> 
> https://lore.kernel.org/intel-xe/d42dc197a15649e69d459362849a37f2@AcuMS.aculab.com/
> 
> and I agree with his comment.
> 
> I explained this in the changelog below the --- cutter, but it is
> probably better to make the explanation more visible. I will add a
> comment in the code to explain this.
> 
> >> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
> >> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
Vincent Mailhol March 5, 2025, 5:17 p.m. UTC | #4
On 06/03/2025 at 00:48, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:
>> On 05/03/2025 at 23:33, Andy Shevchenko wrote:
>>> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:
> 
> ...
> 
>>>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
>>>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
>>>
>>> Why not u8 and u16? This inconsistency needs to be well justified.
>>
>> Because of the C integer promotion rules, if cast to u8 or u16, the
>> expression will immediately become a signed integer as soon as it is
>> used. For example, if cast to u8
>>
>>   BIT_U8(0) + BIT_U8(1)
>>
>> would be a signed integer. And that may surprise people.
> 
> Yes, but wouldn't be better to put it more explicitly like
> 
> #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?

OK, the final result would be unsigned. But I do not follow how this is
more explicit.

Also, why doing:

  (u8)BIT(b) + 0 + UL(0)

and not just:

  (u8)BIT(b) + UL(0)

?

What is that intermediary '+ 0' for?

I am sorry, but I am having a hard time understanding how casting to u8
and then doing an addition with an unsigned long is more explicit than
directly doing a cast to the desired type.

As I mentioned in my answer to Yuri, I have a slight preference for the
unsigned int cast, but I am OK to go back to the u8/u16 cast as it was
in v3.

However, I really do not see how that '+ 0 + UL(0)' would be an improvement.

> Also, BIT_Uxx() gives different type at the end, shouldn't they all be promoted
> to unsigned long long at the end? Probably it won't work in real assembly.
> Can you add test cases which are written in assembly? (Yes, I understand that it will
> be architecture dependent, but still.)

No. I purposely guarded the definition of the BIT_Uxx() by a

  #if !defined(__ASSEMBLY__)

so that these are never visible in assembly. I actually put a comment to
explain why the GENMASK_U*() are not available in assembly. I can copy
and paste the same comment to explain why BIT_U*() are not made
available either:

  /*
   * Missing asm support
   *
   * BIT_U*() depends on BITS_PER_TYPE() which would not work in the asm
   * code as BITS_PER_TYPE() relies on sizeof(), something not available
   * in asm.  Nevertheless, the concept of fixed width integers is a C
   * thing which does not apply to assembly code.
   */

I really believe that it would be a mistake to make the GENMASK_U*() or
the BIT_U*() available to assembly.

>> David also pointed this in the v3:
>>
>> https://lore.kernel.org/intel-xe/d42dc197a15649e69d459362849a37f2@AcuMS.aculab.com/
>>
>> and I agree with his comment.
>>
>> I explained this in the changelog below the --- cutter, but it is
>> probably better to make the explanation more visible. I will add a
>> comment in the code to explain this.
>>
>>>> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
>>>> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
> 

Yours sincerely,
Vincent Mailhol
Andy Shevchenko March 5, 2025, 7:56 p.m. UTC | #5
On Thu, Mar 06, 2025 at 02:17:18AM +0900, Vincent Mailhol wrote:
> On 06/03/2025 at 00:48, Andy Shevchenko wrote:
> > On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:
> >> On 05/03/2025 at 23:33, Andy Shevchenko wrote:
> >>> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:

...

> >>>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> >>>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
> >>>
> >>> Why not u8 and u16? This inconsistency needs to be well justified.
> >>
> >> Because of the C integer promotion rules, if cast to u8 or u16, the
> >> expression will immediately become a signed integer as soon as it is
> >> used. For example, if cast to u8
> >>
> >>   BIT_U8(0) + BIT_U8(1)
> >>
> >> would be a signed integer. And that may surprise people.
> > 
> > Yes, but wouldn't be better to put it more explicitly like
> > 
> > #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?
> 
> OK, the final result would be unsigned. But, I do not follow how this is
> more explicit.
> 
> Also, why doing:
> 
>   (u8)BIT(b) + 0 + UL(0)
> 
> and not just:
> 
>   (u8)BIT(b) + UL(0)
> 
> ?
> 
> What is that intermediary '+ 0' for?
> 
> I am sorry, but I am having a hard time understanding how casting to u8
> and then doing an addition with an unsigned long is more explicit than
> directly doing a cast to the desired type.

Reading this again, I think we don't need it at all. u8, aka unsigned char,
will be promoted to int, but it will be int with a value < 256, can't be signed
as far as I understand this correctly.

> As I mentioned in my answer to Yuri, I have a slight preference for the
> unsigned int cast, but I am OK to go back to the u8/u16 cast as it was
> in v3.

Which means that the simplest uXX casts should suffice. In any case, we
need test cases for that.

> However, I really do not see how that '+ 0 + UL(0)' would be an improvement.
> 
> > Also, BIT_Uxx() gives different type at the end, shouldn't they all be promoted
> > to unsigned long long at the end? Probably it won't work in real assembly.
> > Can you add test cases which are written in assembly? (Yes, I understand that it will
> > be architecture dependent, but still.)
> 
> No. I purposely guarded the definition of the BIT_Uxx() by a
> 
>   #if !defined(__ASSEMBLY__)
> 
> so that these are never visible in assembly. I actually put a comment to
> explain why the GENMASK_U*() are not available in assembly. I can copy
> and paste the same comment to explain why BIT_U*() are not made
> available either:
> 
>   /*
>    * Missing asm support
>    *
>    * BIT_U*() depends on BITS_PER_TYPE() which would not work in the asm
>    * code as BITS_PER_TYPE() relies on sizeof(), something not available
>    * in asm.  Nevertheless, the concept of fixed width integers is a C
>    * thing which does not apply to assembly code.
>    */
> 
> I really believe that it would be a mistake to make the GENMASK_U*() or
> the BIT_U*() available to assembly.

Ah, okay then!

> >> David also pointed this in the v3:
> >>
> >> https://lore.kernel.org/intel-xe/d42dc197a15649e69d459362849a37f2@AcuMS.aculab.com/
> >>
> >> and I agree with his comment.

Why won't unsigned char work?

> >> I explained this in the changelog below the --- cutter, but it is
> >> probably better to make the explanation more visible. I will add a
> >> comment in the code to explain this.
> >>
> >>>> +#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
> >>>> +#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
David Laight March 5, 2025, 9:13 p.m. UTC | #6
On Wed, 5 Mar 2025 17:48:05 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:
> > On 05/03/2025 at 23:33, Andy Shevchenko wrote:  
> > > On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:  
> 
> ...
> 
> > >> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> > >> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))

Why even pretend you are checking against a type - just use 8 or 16.

> > > 
> > > Why not u8 and u16? This inconsistency needs to be well justified.

What is the type of BIT(b)?
It really ought to be unsigned int (so always 32bit), but I bet it
is unsigned long (possibly historically because someone was worried
int might be 16 bits!)

> > 
> > Because of the C integer promotion rules, if cast to u8 or u16, the
> > expression will immediately become a signed integer as soon as it is
> > used. For example, if cast to u8
> > 
> >   BIT_U8(0) + BIT_U8(1)
> > 
> > would be a signed integer. And that may surprise people.

They always get 'surprised' by that.
I found some 'dayjob' code that was doing (byte_var << 1) >> 1 in order
to get the high bit discarded.
Been like that for the best part of 30 years...
I wasn't scared to fix it :-)

> Yes, but wouldn't be better to put it more explicitly like
> 
> #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?

I don't think you should force it to 'unsigned long'.
On 64bit a comparison against a 32bit 'signed int' will sign-extend the
value before making it unsigned.
While that shouldn't matter here, someone might copy it.
You just want to ensure that all the values are 'unsigned int', trying
to return u8 or u16 isn't worth the effort.

When I was doing min_unsigned() I did ((x) + 0u + 0ul + 0ull) to ensure
that values would always be zero extended.
But I was doing the same to both sides of the expression - and the compiler
optimises away all the 'known 0' extension to 64bits.

> Also, BIT_Uxx() gives different type at the end, shouldn't they all be promoted
> to unsigned long long at the end? Probably it won't work in real assembly.
> Can you add test cases which are written in assembly? (Yes, I understand that it will
> be architecture dependent, but still.)

There is no point doing multiple versions for asm files.
The reason UL(x) and ULL(x) exist is because the assembler just has integers.
Both expand to (x).
There might not even be a distinction between signed and unsigned.
I'm not sure you can assume that a shift right won't replicate the sign bit.
Since the expression can only be valid for constants, something simple
like ((2 << (hi)) - (1 << (lo))) really is the best you are going to get
for GENMASK().

So just define a completely different version for asm and nuke the UL() etc
for readability.

	David
David Laight March 5, 2025, 9:50 p.m. UTC | #7
On Wed, 5 Mar 2025 21:56:22 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Thu, Mar 06, 2025 at 02:17:18AM +0900, Vincent Mailhol wrote:
> > On 06/03/2025 at 00:48, Andy Shevchenko wrote:  
> > > On Wed, Mar 05, 2025 at 11:48:10PM +0900, Vincent Mailhol wrote:  
> > >> On 05/03/2025 at 23:33, Andy Shevchenko wrote:  
> > >>> On Wed, Mar 05, 2025 at 10:00:16PM +0900, Vincent Mailhol via B4 Relay wrote:  
> 
> ...
> 
> > >>>> +#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
> > >>>> +#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))  
> > >>>
> > >>> Why not u8 and u16? This inconsistency needs to be well justified.  
> > >>
> > >> Because of the C integer promotion rules, if cast to u8 or u16, the
> > >> expression will immediately become a signed integer as soon as it is
> > >> used. For example, if cast to u8
> > >>
> > >>   BIT_U8(0) + BIT_U8(1)
> > >>
> > >> would be a signed integer. And that may surprise people.  
> > > 
> > > Yes, but wouldn't be better to put it more explicitly like
> > > 
> > > #define BIT_U8(b)	(BIT_INPUT_CHECK(u8, b) + (u8)BIT(b) + 0 + UL(0)) // + ULL(0) ?  
> > 
> > OK, the final result would be unsigned. But, I do not follow how this is
> > more explicit.
> > 
> > Also, why doing:
> > 
> >   (u8)BIT(b) + 0 + UL(0)
> > 
> > and not just:
> > 
> >   (u8)BIT(b) + UL(0)
> > 
> > ?
> > 
> > What is that intermediary '+ 0' for?
> > 
> > I am sorry, but I am having a hard time understanding how casting to u8
> > and then doing an addition with an unsigned long is more explicit than
> > directly doing a cast to the desired type.  
> 
> Reading this again, I think we don't need it at all. u8, aka unsigned char,
> will be promoted to int, but it will be int with a value < 256, can't be signed
> as far as I understand this correctly.

The value can't be negative, but the type will be a signed one.
Anything comparing types (and there are a few) will treat it as signed.
It really is bad practice to even pretend you can have an expression
(rather than a variable) that has a type smaller than 'int'.
It wouldn't surprise me if even an 'a = b' assignment promotes 'b' to int.

So it is questionable whether BIT8() and BIT16() should even exist at all.
There can be reasons to return 'unsigned int' rather than 'unsigned long'.
But with the type definitions that Linux uses (and can't really be changed)
you can have BIT32() that is 'unsigned int' and BIT64() that is 'unsigned long
long'. These are then the same on 32bit and 64bit.

	David
Patch

diff --git a/include/linux/bits.h b/include/linux/bits.h
index f202e46d2f4b7899c16d975120f3fa3ae41556ae..1b6f5262b79093a01aae6c14ead944e0e85821cc 100644
--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -68,6 +68,22 @@ 
 #define GENMASK_U128(h, l) \
 	(GENMASK_INPUT_CHECK(h, l) + __GENMASK_U128(h, l))
 
+/*
+ * Fixed-type variants of BIT(), with additional checks like GENMASK_t(). The
+ * following examples generate compiler warnings due to shift-count-overflow:
+ *
+ * - BIT_U8(8)
+ * - BIT_U32(-1)
+ * - BIT_U32(40)
+ */
+#define BIT_INPUT_CHECK(type, b) \
+	BUILD_BUG_ON_ZERO(const_true((b) >= BITS_PER_TYPE(type)))
+
+#define BIT_U8(b) (BIT_INPUT_CHECK(u8, b) + (unsigned int)BIT(b))
+#define BIT_U16(b) (BIT_INPUT_CHECK(u16, b) + (unsigned int)BIT(b))
+#define BIT_U32(b) (BIT_INPUT_CHECK(u32, b) + (u32)BIT(b))
+#define BIT_U64(b) (BIT_INPUT_CHECK(u64, b) + (u64)BIT_ULL(b))
+
 #else /* defined(__ASSEMBLY__) */
 
 #define GENMASK(h, l) __GENMASK(h, l)