[v4,1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW

Message ID	20231024161931.78567-2-sebastian.reichel@collabora.com (mailing list archive)
State	Changes Requested, archived
Headers	show Return-Path: <linux-clk-owner@vger.kernel.org> sender: sre) by madras.collabora.co.uk (Postfix) with ESMTPSA id 264DD66072A2; Tue, 24 Oct 2023 17:19:35 +0100 (BST) From: Sebastian Reichel <sebastian.reichel@collabora.com> To: Andy Shevchenko <andriy.shevchenko@linux.intel.com>, Michael Turquette <mturquette@baylibre.com>, Stephen Boyd <sboyd@kernel.org>, David Laight <David.Laight@ACULAB.COM>, linux-clk@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Vasily Gorbik <gor@linux.ibm.com>, Niklas Schnelle <schnelle@linux.ibm.com>, Linus Torvalds <torvalds@linux-foundation.org>, Sebastian Reichel <sebastian.reichel@collabora.com>, kernel@collabora.com Subject: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Date: Tue, 24 Oct 2023 18:18:15 +0200 Message-ID: <20231024161931.78567-2-sebastian.reichel@collabora.com> In-Reply-To: <20231024161931.78567-1-sebastian.reichel@collabora.com> References: <20231024161931.78567-1-sebastian.reichel@collabora.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Fix clock division overflow problem \| expand [v4,0/3] Fix clock division overflow problem [v4,1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW [v4,2/3] clk: divider: Fix divisor masking on 64 bit platforms [v4,3/3] clk: composite: replace open-coded abs_diff()

Sebastian Reichel Oct. 24, 2023, 4:18 p.m. UTC

Add a new DIV_ROUND_UP helper, which cannot overflow when
big numbers are being used.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
---
 include/linux/math.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

David Laight Oct. 24, 2023, 4:26 p.m. UTC | #1

From: Sebastian Reichel
> Sent: 24 October 2023 17:18
> 
> Add a new DIV_ROUND_UP helper, which cannot overflow when
> big numbers are being used.

For non-zero you can use (n - 1)/d + 1 instead of (n + d - 1)/d
So maybe add:
#define DIV_ROUND_UP_NON_ZERO(n, d) (((n) - 1)/(d) + 1)

Saves the compiler having to get the remainder (if not
generated by a divide instruction.

	David

> 
> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
> ---
>  include/linux/math.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/math.h b/include/linux/math.h
> index dd4152711de7..f80bfb375ab9 100644
> --- a/include/linux/math.h
> +++ b/include/linux/math.h
> @@ -36,6 +36,17 @@
> 
>  #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
> 
> +/**
> + * DIV_ROUND_UP_NO_OVERFLOW - divide two numbers and always round up
> + * @n: numerator / dividend
> + * @d: denominator / divisor
> + *
> + * This functions does the same as DIV_ROUND_UP, but internally uses a
> + * division and a modulo operation instead of math tricks. This way it
> + * avoids overflowing when handling big numbers.
> + */
> +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))
> +
>  #define DIV_ROUND_DOWN_ULL(ll, d) \
>  	({ unsigned long long _tmp = (ll); do_div(_tmp, d); _tmp; })
> 
> --
> 2.42.0

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Linus Torvalds Oct. 24, 2023, 7:32 p.m. UTC | #2

[ Since you're cc'ing the s390 people, I assume that means that this
all ended up being a follow-up to the s390 issue discussion ]

On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel
<sebastian.reichel@collabora.com> wrote:
>
> Add a new DIV_ROUND_UP helper, which cannot overflow when
> big numbers are being used.

This is really horrendously bad on some architectures (ie it might
require *two* divisions):

> +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))

but David Laight at least had a suggestion for that: when allowing
multiple uses, you can just do

   #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0)

so now you're back to just one division and no horrible extra expense.

But we can't allow those multiple uses in general, sadly.

I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:

 - people do use it with complex first arguments (ie function calls
etc) that we don't want to evaluate twice

 - we can't make it an inline function, because the types aren't fixed

 - we can't even use a statement expression and __auto_type, because
these things are used in type definitions etc and need to be constant
expressions

That last thing means that even the "__builtin_choose_expr()" doesn't
work, because the statement expression still needs to be _parsed_ as
such, and that's not something we can do in those contexts.

Very annoying. Let me think about this.

                     Linus

Linus Torvalds Oct. 24, 2023, 10:53 p.m. UTC | #3

On Tue, 24 Oct 2023 at 09:32, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
>
>  - people do use it with complex first arguments (ie function calls
> etc) that we don't want to evaluate twice
>
>  - we can't make it an inline function, because the types aren't fixed
>
>  - we can't even use a statement expression and __auto_type, because
> these things are used in type definitions etc and need to be constant
> expressions

Ok. I have a potential beginning of a solution.

It is unbelievably disgustingly complicated. But it might approach
being correct.

And by that "it might approach being correct" I obviously mean "this
is untested, but builds at least some kernel code".

I'm almost certain it will fail on more complex cases, because I
already found a lot of questionable stuff that was simply hidden by
the old macro just silently doing the C arithmetic type conversions,
and this thing does type handling manually.

I'm hoping that somebody will go "Linus, you're just being
*completely* silly, it's much easier to do XYZ".

            Linus

Rasmus Villemoes Oct. 25, 2023, 8:03 a.m. UTC | #4

On 25/10/2023 00.53, Linus Torvalds wrote:

> I'm hoping that somebody will go "Linus, you're just being
> *completely* silly, it's much easier to do XYZ".

I don't have better suggestions, but a few notes:

Both the existing and new implementation are simply wrong for negative
n, because / doesn't do floor(), it does round-towards-0 (I do believe
that's a bug in the C standard, but not much one can do about that). So
both the old and new say that dru(-5, 5) == 0, dru(-7, 5) == 0, dru(-10,
5) == -1. They are correct for dru(-9, 5) == -1 and other "1 mod d" cases.

But the new implementation disagrees with the old for -d+1 < n < 0: The
old implementation correctly says dru(-3, 5) == 0, but the new gives 1 (!).

Preventing signed types from being used is probably too difficult. But
it would be nice if we could at least catch a negative _value_ and do
something sensible. Can we make the build fail for a negative,
compile-time constant, n? I have no idea what to do for run-time
negative values - we already don't do the math right, but starting to
return a positive number for some negative n's really doesn't seem right.

Aside: I don't think uapi/ stuff can include non-uapi/ stuff.

Aside2: do_div(n-1, d); probably doesn't compile :)

Rasmus

David Laight Oct. 25, 2023, 8:38 a.m. UTC | #5

From: Linus Torvalds
> Sent: 24 October 2023 23:53
> 
> On Tue, 24 Oct 2023 at 09:32, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
> >
> >  - people do use it with complex first arguments (ie function calls
> > etc) that we don't want to evaluate twice
> >
> >  - we can't make it an inline function, because the types aren't fixed
> >
> >  - we can't even use a statement expression and __auto_type, because
> > these things are used in type definitions etc and need to be constant
> > expressions

Doesn't min() get around that by using is_constexpr() and
__builtin_choose_exptr() - the same could be done here.

> 
> Ok. I have a potential beginning of a solution.
> 
> It is unbelievably disgustingly complicated. But it might approach
> being correct.
> 
> And by that "it might approach being correct" I obviously mean "this
> is untested, but builds at least some kernel code".
> 
> I'm almost certain it will fail on more complex cases, because I
> already found a lot of questionable stuff that was simply hidden by
> the old macro just silently doing the C arithmetic type conversions,
> and this thing does type handling manually.
> 
> I'm hoping that somebody will go "Linus, you're just being
> *completely* silly, it's much easier to do XYZ".

> Doing a non-overflowing DIV_ROUND_UP() that is usable in all contexts is
> actually very nasty.
>
> This is a trial balloon..  The signed cases need more thought.  The best
> option would be to disallow them (by not listing them in the _Generic()
> rules). But they currently happen, often for bad reasons, ie wireless has
>
> 	DIV_ROUND_UP(interval, MSEC_PER_SEC);
>
> and while 'interval' is a proper u32, MSEC_PER_SEC is defined to be
> '1000L', so the resulting C arithmetic is done in signed 'long'.

Maybe use some of the 'stuff' from min() and convert compile-time
constant 'd' to signed int to avoid promotions.

Indeed the whole thing really only makes sense for (d > 0 && n >= 0)
so forcing an unsigned divide wouldn't be a bad thing at all.
It will also generate better code when 'd' is a power of 2.

Ignoring the n==0 case I think this always generates an unsigned
divide, never does sign extension and does a 32bit divide
for 32bit arguments.

#define CVT_ULL(x) ((x) + 0u + 0ul + 0ull)
#define DIV_ROUND_UP(n, d) ((CVT_ULL(n) + CVT_ULL(d) - 1) / CVT_ULL(d) + 1)

It should be possible to error if 'd' is a signed variable or
a non-positive constant.
I'd guess most 'd' are constants.

Erroring signed 'n' is possible but might be annoying.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Vasily Gorbik Oct. 25, 2023, 3:05 p.m. UTC | #6

On Tue, Oct 24, 2023 at 12:53:03PM -1000, Linus Torvalds wrote:
>+/*
>+ * The other cases need to be split by type.
>+ *
>+ * Signed cases seem badly defined, but do exist. We should
>+ * consider removing them from this _Generic(), and fixing any
>+ * result 'not compatible with any association' cases.
>+ */
>+#define __div_round_up(n,d) _Generic((n)+(d),		\
>+	unsigned long long: __div_round_up_ull(n,d),	\
>+	long long: __div_round_up_ll(n,d),		\
>+	unsigned long: __div_round_up_ul(n,d),		\
>+	long: __div_round_up_l(n,d),			\
>+	unsigned int: __div_round_up_u(n,d),		\
>+	int: __div_round_up_i(n,d))

You probably want

 #define __div_round_up(n,d) _Generic((n)+(d),		\
	unsigned long long: __div_round_up_ull,		\
	long long: __div_round_up_ll,			\
	unsigned long: __div_round_up_ul,		\
	long: __div_round_up_l,				\
	unsigned int: __div_round_up_u,			\
	int: __div_round_up_i)(n,d)

to avoid early type-checking for expressions that will be discarded
and prevent errors like:

drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c: In function 'sparx5_tc_flower_parse_act_police':
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '25000000000' to '3525163520' [-Werror=overflow]
  435 | #define SPX5_SDLB_GROUP_RATE_MAX 25000000000ULL
      |                                  ^~~~~~~~~~~~~~
./include/linux/div_round_up.h:68:42: note: in definition of macro '__div_round_up'
   68 |         unsigned long: __div_round_up_ul(n,d),          \
      |                                          ^
./include/linux/div_round_up.h:50:9: note: in expansion of macro '__keep_const'
   50 |         __keep_const(n, div_round_up(n,d))
      |         ^~~~~~~~~~~~
./include/linux/math.h:37:22: note: in expansion of macro '__KERNEL_DIV_ROUND_UP'
   37 | #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
      |                      ^~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c:732:38: note: in expansion of macro 'SPX5_SDLB_GROUP_RATE_MAX'
  732 |         if (pol->rate > DIV_ROUND_UP(SPX5_SDLB_GROUP_RATE_MAX, 1000)) {
      |                                      ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: overflow in conversion from 'long long unsigned int' to 'long int' changes value from '25000000000' to '-769803776' [-Werror=overflow]
...
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: conversion from 'long long unsigned int' to 'unsigned int' changes value from '25000000000' to '3525163520' [-Werror=overflow]
...
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: overflow in conversion from 'long long unsigned int' to 'int' changes value from '25000000000' to '-769803776' [-Werror=overflow]

Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm.

 static inline unsigned long long __div_round_up_ull(unsigned long long n, unsigned long d)
 {
 #ifdef CONFIG_32BIT
 	if (!n)
-		return 0
-	do_div(n-1, d);
-	return n+1;
+		return 0;
+	n--;
+	do_div(n, d);
+	return n + 1;
 #else
 	return __const_div_round_up(n,d);
 #endif

Sebastian Reichel Oct. 25, 2023, 5:28 p.m. UTC | #7

Hi,

On Tue, Oct 24, 2023 at 09:32:27AM -1000, Linus Torvalds wrote:
> [ Since you're cc'ing the s390 people, I assume that means that this
> all ended up being a follow-up to the s390 issue discussion ]

Actually I've noticed the issue in the clock framework some time
ago and Andy Shevchenko pointed me towards the s390 thread, which
is more or less about the same fundamental problem. So I added
everyone in Cc, to make sure this is solved properly and not just
s390 specific.

> On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel
> <sebastian.reichel@collabora.com> wrote:
> >
> > Add a new DIV_ROUND_UP helper, which cannot overflow when
> > big numbers are being used.
> 
> This is really horrendously bad on some architectures (ie it might
> require *two* divisions):

I came up with this helper for the issue in the clock framework,
which currently use DIV_ROUND_UP_ULL() as a workaround. That solves
the issue on 32 bit systems (input arguments are of type 'unsigned
long'). But it also means doing u64 / u32 and I expect that do be
much more more expensive on 32 bit platforms than two divisons.

On 64 bit platforms the workaround is wrong, since the second
argument is reduced from 64 bit to 32 bit. It effectively breaks
platforms trying to request a frequency above 4.3 GHz from a clock
divider. Since that's not yet a common thing I guess this has not
yet been found and fixed by somebody else. I also just found it as
bycatch.

> > +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))
> 
> but David Laight at least had a suggestion for that: when allowing
> multiple uses, you can just do
> 
>    #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0)
> 
> so now you're back to just one division and no horrible extra expense.
> 
> But we can't allow those multiple uses in general, sadly.
> 
> I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
> 
>  - people do use it with complex first arguments (ie function calls
> etc) that we don't want to evaluate twice
> 
>  - we can't make it an inline function, because the types aren't fixed
> 
>  - we can't even use a statement expression and __auto_type, because
> these things are used in type definitions etc and need to be constant
> expressions
> 
> That last thing means that even the "__builtin_choose_expr()" doesn't
> work, because the statement expression still needs to be _parsed_ as
> such, and that's not something we can do in those contexts.
> 
> Very annoying. Let me think about this.

Thanks.

-- Sebastian

Sebastian Reichel Oct. 25, 2023, 5:36 p.m. UTC | #8

Hi,

On Tue, Oct 24, 2023 at 12:53:03PM -1000, Linus Torvalds wrote:
> On Tue, 24 Oct 2023 at 09:32, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
> >
> >  - people do use it with complex first arguments (ie function calls
> > etc) that we don't want to evaluate twice
> >
> >  - we can't make it an inline function, because the types aren't fixed
> >
> >  - we can't even use a statement expression and __auto_type, because
> > these things are used in type definitions etc and need to be constant
> > expressions
> 
> Ok. I have a potential beginning of a solution.
> 
> It is unbelievably disgustingly complicated. But it might approach
> being correct.
> 
> And by that "it might approach being correct" I obviously mean "this
> is untested, but builds at least some kernel code".
> 
> I'm almost certain it will fail on more complex cases, because I
> already found a lot of questionable stuff that was simply hidden by
> the old macro just silently doing the C arithmetic type conversions,
> and this thing does type handling manually.
> 
> I'm hoping that somebody will go "Linus, you're just being
> *completely* silly, it's much easier to do XYZ".

I think your new __div_round_up_ull() should also be used by
DIV_ROUND_UP_ULL, which is defined in linux/math.h.

-- Sebastian

Linus Torvalds Oct. 25, 2023, 5:37 p.m. UTC | #9

On Tue, 24 Oct 2023 at 22:03, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
>
> Both the existing and new implementation are simply wrong for negative
> n, because / doesn't do floor(), it does round-towards-0 (I do believe
> that's a bug in the C standard, but not much one can do about that).

Honestly, while I found several cases where the arithmetic was done
with a signed type, all the ones I looked at were purely about the
_type_, not the values.

Any actual negative values would have been too random before to make a
difference. In that patch, I tried to keep the signed type around
mainly to at least get *similar* values to the old code, but honestly,
it was all broken before already if the values were actually signed.

If it were to change the result and somebody would find a bug due to
that, that would be a good thing, in other words. The negative value
would need to be fixed anyway - as you say, integer division of signed
values is simply not rgeat in the first place.

Side note: if you dislike the "round towards zero" behavior, I can
tell you that it is better than the historical alternative, which was
"implementation defined".

IIRC, the rule used to be that signed division could either round down
or towards zero, and the only rule was that (a) it had to be
documented (ie "implementation defined" rather than "undefined") and
(b) the modulus had to follow the same rules, ie regardless of
rounding, you'd have

  (a/b)*b + a%b = a

That said:

> Preventing signed types from being used is probably too difficult.

It might not be. My horror with _Generic() is certainly not pretty,
but it *does* have the advantage that we could error out for signed
cases.

I did that initially, and it didn't look horrific, particularly if you
just accept signed positive constants as unsigned values (which
requires another bit of macro magic but is not conceptually hard).

The patch I sent out was not it, though - it was meant really as a
minimal "something like this" that still compiled my defconfig.

> Aside: I don't think uapi/ stuff can include non-uapi/ stuff.

Yeah, no, I left it in that uapi header just for the same reason I
didn't bother trying to fix any signed type cases - actively avoiding
changes *outside* the DIV_ROUND_UP() case itself.

In reality, I'd just make any users of __KERNEL_DIV_ROUND_UP include
<linux/div_round_up.h> and get rid of the crazy UAPI thing entirely.

So ignore that part.

> Aside2: do_div(n-1, d); probably doesn't compile :)

.. I did say it was untested and "might approach being correct".

That 32-bit ULL case should also have some attempt at checking that
the divisor really fits in 32 bits. Because I still refuse to believe
that the full 64-by-64 divisions are sane on a 32-bit machine, and
anybody who does that needs to go to extra lengths for them. No
"silently do them without telling the programmer they are doing
something horrendously bad".

                Linus

Linus Torvalds Oct. 25, 2023, 5:41 p.m. UTC | #10

On Tue, 24 Oct 2023 at 22:38, David Laight <David.Laight@aculab.com> wrote:
>
> From: Linus Torvalds
> > >  - we can't even use a statement expression and __auto_type, because
> > > these things are used in type definitions etc and need to be constant
> > > expressions
>
> Doesn't min() get around that by using is_constexpr() and
> __builtin_choose_exptr() - the same could be done here.

Nope. I wanted to do it that way - it would have made things much
simpler and avoid the whole _Generic() thing, but try it - you cannot
use statement expressions in a non-function context even with
__builtin_choose_expr().

And no, min/max have never been usable in that context

                 Linus

Linus Torvalds Oct. 25, 2023, 5:43 p.m. UTC | #11

On Wed, 25 Oct 2023 at 05:05, Vasily Gorbik <gor@linux.ibm.com> wrote:
>
> You probably want
>
>  #define __div_round_up(n,d) _Generic((n)+(d),          \
>         unsigned long long: __div_round_up_ull,         \
>         long long: __div_round_up_ll,                   \
>         unsigned long: __div_round_up_ul,               \
>         long: __div_round_up_l,                         \
>         unsigned int: __div_round_up_u,                 \
>         int: __div_round_up_i)(n,d)
>
> to avoid early type-checking for expressions that will be discarded
> and prevent errors like:

Ack. I noticed that later when I tried to do a bigger config build -
the compiler would warn about the implicit truncation of the integer
arguments (for the cases where they weren't used).

> Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm.

Lovely.

It would have been even better if somebody told me that I was stupid
and there was some nice trick to it, but at least the _Generic()
approach doesn't seem broken - just a few tweaks needed.

               Linus

David Laight Oct. 26, 2023, 8:41 a.m. UTC | #12

From: Linus Torvalds 
> Sent: 25 October 2023 18:41
> 
> On Tue, 24 Oct 2023 at 22:38, David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Linus Torvalds
> > > >  - we can't even use a statement expression and __auto_type, because
> > > > these things are used in type definitions etc and need to be constant
> > > > expressions
> >
> > Doesn't min() get around that by using is_constexpr() and
> > __builtin_choose_exptr() - the same could be done here.
> 
> Nope. I wanted to do it that way - it would have made things much
> simpler and avoid the whole _Generic() thing, but try it - you cannot
> use statement expressions in a non-function context even with
> __builtin_choose_expr().

_Generic() has exactly the same issues as __builtin_choose_expr().
All the code has to be valid for all types.
Makes testing for negative constants a PITA.
Something like this worked:
	(__buitin_choose_expr(__is_constexpr((x) && is_signed_type(__typeof(x)), x, 1) < 0)
Although our is_signed_type() isn't constant (enough) for pointer types.
So you need another check for (typeof(x))1 being constant.

> And no, min/max have never been usable in that context

I've clearly not tested it and assumed the comment implied
that was ok.

It'd probably work in C++ - which is perfectly willing to
add functions to initialise static data - even when you'd
rather have a compile time error.

	David

> 
>                  Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

David Laight Oct. 26, 2023, 8:57 a.m. UTC | #13

From: Linus Torvalds
> Sent: 25 October 2023 18:44
> 
> On Wed, 25 Oct 2023 at 05:05, Vasily Gorbik <gor@linux.ibm.com> wrote:
> >
> > You probably want
> >
> >  #define __div_round_up(n,d) _Generic((n)+(d),          \
> >         unsigned long long: __div_round_up_ull,         \
> >         long long: __div_round_up_ll,                   \
> >         unsigned long: __div_round_up_ul,               \
> >         long: __div_round_up_l,                         \
> >         unsigned int: __div_round_up_u,                 \
> >         int: __div_round_up_i)(n,d)
> >
> > to avoid early type-checking for expressions that will be discarded
> > and prevent errors like:
> 
> Ack. I noticed that later when I tried to do a bigger config build -
> the compiler would warn about the implicit truncation of the integer
> arguments (for the cases where they weren't used).
> 
> > Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm.
> 
> Lovely.
> 
> It would have been even better if somebody told me that I was stupid
> and there was some nice trick to it, but at least the _Generic()
> approach doesn't seem broken - just a few tweaks needed.

Doesn't that version end up calling inline functions?
So won't be usable in static initialisers - the same as statement functions.

Will this trick work:
#define ZERO_UNLESS_TYPE(x, type) _Generic(x, type: x, default 0)
#define div_ru(type, fn, n, d) \
	type: fn(ZERO_UNLESS_TYPE(type, (n) + (d), n), ZERO_UNLESS_TYPE(type, (n) + (d), d)
then:
#define __div_round_up(n, d) _Generic((n)+(d),          \
	div_ru(unsigned long long, __div_round_up_ull, n, d),
	...

Which would allow the 'functions' be #defines.

But it still has the issues of doing 'big' divisions.
Although the compiler will use 32bit (and 64 by 32 bit) divisions
for 64 bit types if it knows the values are small.

clang seems to add conditionals to avoid 64x64 divides.
Probably worth it for intel x86 cpu, but not for amd ones.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Linus Torvalds Oct. 26, 2023, 4:54 p.m. UTC | #14

On Wed, 25 Oct 2023 at 22:57, David Laight <David.Laight@aculab.com> wrote:
>
> Doesn't that version end up calling inline functions?
> So won't be usable in static initialisers - the same as statement functions.

David, please either read what I write, or actually test things out,
instead of just adding noise.

It is not a problem to have a _Generic() - or a
__builtin_chooise_expr() - that has an inline function that doesn't
get used. The compiler will remove it.

The problem is literally that you can't use "__auto_type", because
*STATEMENT EXPRESSIONS* do not work outside of function definitions,
because gcc won't even *parse* them, and just says

  error: braced-group within expression allowed only inside a function

exactly like I explained in the comment of the patch that does the _Generic().

Which is why my solution ended up using those inline functions, and
splitting manually by type, instead of an expression statement and
__auto_type.

Whether you then split into different expressions with _Generic() or
by using __builtin_choose_expr() is irrelevant, although once you do
"split by type", _Generic() ends up being much more convenient because
that's what it's designed for (and you don't end up trying to figure
out the size with sizeof() and typeof tricks).

               Linus

David Laight Oct. 27, 2023, 7:24 a.m. UTC | #15

From: Linus Torvalds
> Sent: 26 October 2023 17:54
> Subject: Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
> 
> On Wed, 25 Oct 2023 at 22:57, David Laight <David.Laight@aculab.com> wrote:
> >
> > Doesn't that version end up calling inline functions?
> > So won't be usable in static initialisers - the same as statement functions.
> 
> David, please either read what I write, or actually test things out,
> instead of just adding noise.

I do test a lot of stuff.
For some reason I'd assumed that you would be able to
have a call to a function inside a static initialiser.

Clearly not enough coffee :-(

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

[v4,1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW

Commit Message

Comments

Patch