Message ID | 20231024161931.78567-2-sebastian.reichel@collabora.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | Fix clock division overflow problem | expand |
From: Sebastian Reichel > Sent: 24 October 2023 17:18 > > Add a new DIV_ROUND_UP helper, which cannot overflow when > big numbers are being used. For non-zero you can use (n - 1)/d + 1 instead of (n + d - 1)/d So maybe add: #define DIV_ROUND_UP_NON_ZERO(n, d) (((n) - 1)/(d) + 1) Saves the compiler having to get the remainder (if not generated by a divide instruction. David > > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> > --- > include/linux/math.h | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/include/linux/math.h b/include/linux/math.h > index dd4152711de7..f80bfb375ab9 100644 > --- a/include/linux/math.h > +++ b/include/linux/math.h > @@ -36,6 +36,17 @@ > > #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP > > +/** > + * DIV_ROUND_UP_NO_OVERFLOW - divide two numbers and always round up > + * @n: numerator / dividend > + * @d: denominator / divisor > + * > + * This functions does the same as DIV_ROUND_UP, but internally uses a > + * division and a modulo operation instead of math tricks. This way it > + * avoids overflowing when handling big numbers. > + */ > +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d))) > + > #define DIV_ROUND_DOWN_ULL(ll, d) \ > ({ unsigned long long _tmp = (ll); do_div(_tmp, d); _tmp; }) > > -- > 2.42.0 - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
[ Since you're cc'ing the s390 people, I assume that means that this all ended up being a follow-up to the s390 issue discussion ] On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel <sebastian.reichel@collabora.com> wrote: > > Add a new DIV_ROUND_UP helper, which cannot overflow when > big numbers are being used. This is really horrendously bad on some architectures (ie it might require *two* divisions): > +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d))) but David Laight at least had a suggestion for that: when allowing multiple uses, you can just do #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0) so now you're back to just one division and no horrible extra expense. But we can't allow those multiple uses in general, sadly. I would really prefer to just make our regular DIV_ROUND_UP() DTRT. But: - people do use it with complex first arguments (ie function calls etc) that we don't want to evaluate twice - we can't make it an inline function, because the types aren't fixed - we can't even use a statement expression and __auto_type, because these things are used in type definitions etc and need to be constant expressions That last thing means that even the "__builtin_choose_expr()" doesn't work, because the statement expression still needs to be _parsed_ as such, and that's not something we can do in those contexts. Very annoying. Let me think about this. Linus
On Tue, 24 Oct 2023 at 09:32, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > I would really prefer to just make our regular DIV_ROUND_UP() DTRT. But: > > - people do use it with complex first arguments (ie function calls > etc) that we don't want to evaluate twice > > - we can't make it an inline function, because the types aren't fixed > > - we can't even use a statement expression and __auto_type, because > these things are used in type definitions etc and need to be constant > expressions Ok. I have a potential beginning of a solution. It is unbelievably disgustingly complicated. But it might approach being correct. And by that "it might approach being correct" I obviously mean "this is untested, but builds at least some kernel code". I'm almost certain it will fail on more complex cases, because I already found a lot of questionable stuff that was simply hidden by the old macro just silently doing the C arithmetic type conversions, and this thing does type handling manually. I'm hoping that somebody will go "Linus, you're just being *completely* silly, it's much easier to do XYZ". Linus
On 25/10/2023 00.53, Linus Torvalds wrote: > I'm hoping that somebody will go "Linus, you're just being > *completely* silly, it's much easier to do XYZ". I don't have better suggestions, but a few notes: Both the existing and new implementation are simply wrong for negative n, because / doesn't do floor(), it does round-towards-0 (I do believe that's a bug in the C standard, but not much one can do about that). So both the old and new say that dru(-5, 5) == 0, dru(-7, 5) == 0, dru(-10, 5) == -1. They are correct for dru(-9, 5) == -1 and other "1 mod d" cases. But the new implementation disagrees with the old for -d+1 < n < 0: The old implementation correctly says dru(-3, 5) == 0, but the new gives 1 (!). Preventing signed types from being used is probably too difficult. But it would be nice if we could at least catch a negative _value_ and do something sensible. Can we make the build fail for a negative, compile-time constant, n? I have no idea what to do for run-time negative values - we already don't do the math right, but starting to return a positive number for some negative n's really doesn't seem right. Aside: I don't think uapi/ stuff can include non-uapi/ stuff. Aside2: do_div(n-1, d); probably doesn't compile :) Rasmus
From: Linus Torvalds > Sent: 24 October 2023 23:53 > > On Tue, 24 Oct 2023 at 09:32, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > I would really prefer to just make our regular DIV_ROUND_UP() DTRT. But: > > > > - people do use it with complex first arguments (ie function calls > > etc) that we don't want to evaluate twice > > > > - we can't make it an inline function, because the types aren't fixed > > > > - we can't even use a statement expression and __auto_type, because > > these things are used in type definitions etc and need to be constant > > expressions Doesn't min() get around that by using is_constexpr() and __builtin_choose_exptr() - the same could be done here. > > Ok. I have a potential beginning of a solution. > > It is unbelievably disgustingly complicated. But it might approach > being correct. > > And by that "it might approach being correct" I obviously mean "this > is untested, but builds at least some kernel code". > > I'm almost certain it will fail on more complex cases, because I > already found a lot of questionable stuff that was simply hidden by > the old macro just silently doing the C arithmetic type conversions, > and this thing does type handling manually. > > I'm hoping that somebody will go "Linus, you're just being > *completely* silly, it's much easier to do XYZ". > Doing a non-overflowing DIV_ROUND_UP() that is usable in all contexts is > actually very nasty. > > This is a trial balloon.. The signed cases need more thought. The best > option would be to disallow them (by not listing them in the _Generic() > rules). But they currently happen, often for bad reasons, ie wireless has > > DIV_ROUND_UP(interval, MSEC_PER_SEC); > > and while 'interval' is a proper u32, MSEC_PER_SEC is defined to be > '1000L', so the resulting C arithmetic is done in signed 'long'. Maybe use some of the 'stuff' from min() and convert compile-time constant 'd' to signed int to avoid promotions. Indeed the whole thing really only makes sense for (d > 0 && n >= 0) so forcing an unsigned divide wouldn't be a bad thing at all. It will also generate better code when 'd' is a power of 2. Ignoring the n==0 case I think this always generates an unsigned divide, never does sign extension and does a 32bit divide for 32bit arguments. #define CVT_ULL(x) ((x) + 0u + 0ul + 0ull) #define DIV_ROUND_UP(n, d) ((CVT_ULL(n) + CVT_ULL(d) - 1) / CVT_ULL(d) + 1) It should be possible to error if 'd' is a signed variable or a non-positive constant. I'd guess most 'd' are constants. Erroring signed 'n' is possible but might be annoying. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Tue, Oct 24, 2023 at 12:53:03PM -1000, Linus Torvalds wrote: >+/* >+ * The other cases need to be split by type. >+ * >+ * Signed cases seem badly defined, but do exist. We should >+ * consider removing them from this _Generic(), and fixing any >+ * result 'not compatible with any association' cases. >+ */ >+#define __div_round_up(n,d) _Generic((n)+(d), \ >+ unsigned long long: __div_round_up_ull(n,d), \ >+ long long: __div_round_up_ll(n,d), \ >+ unsigned long: __div_round_up_ul(n,d), \ >+ long: __div_round_up_l(n,d), \ >+ unsigned int: __div_round_up_u(n,d), \ >+ int: __div_round_up_i(n,d)) You probably want #define __div_round_up(n,d) _Generic((n)+(d), \ unsigned long long: __div_round_up_ull, \ long long: __div_round_up_ll, \ unsigned long: __div_round_up_ul, \ long: __div_round_up_l, \ unsigned int: __div_round_up_u, \ int: __div_round_up_i)(n,d) to avoid early type-checking for expressions that will be discarded and prevent errors like: drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c: In function 'sparx5_tc_flower_parse_act_police': drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '25000000000' to '3525163520' [-Werror=overflow] 435 | #define SPX5_SDLB_GROUP_RATE_MAX 25000000000ULL | ^~~~~~~~~~~~~~ ./include/linux/div_round_up.h:68:42: note: in definition of macro '__div_round_up' 68 | unsigned long: __div_round_up_ul(n,d), \ | ^ ./include/linux/div_round_up.h:50:9: note: in expansion of macro '__keep_const' 50 | __keep_const(n, div_round_up(n,d)) | ^~~~~~~~~~~~ ./include/linux/math.h:37:22: note: in expansion of macro '__KERNEL_DIV_ROUND_UP' 37 | #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP | ^~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c:732:38: note: in expansion of macro 'SPX5_SDLB_GROUP_RATE_MAX' 732 | if (pol->rate > DIV_ROUND_UP(SPX5_SDLB_GROUP_RATE_MAX, 1000)) { | ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: overflow in conversion from 'long long unsigned int' to 'long int' changes value from '25000000000' to '-769803776' [-Werror=overflow] ... drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: conversion from 'long long unsigned int' to 'unsigned int' changes value from '25000000000' to '3525163520' [-Werror=overflow] ... drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: overflow in conversion from 'long long unsigned int' to 'int' changes value from '25000000000' to '-769803776' [-Werror=overflow] Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm. static inline unsigned long long __div_round_up_ull(unsigned long long n, unsigned long d) { #ifdef CONFIG_32BIT if (!n) - return 0 - do_div(n-1, d); - return n+1; + return 0; + n--; + do_div(n, d); + return n + 1; #else return __const_div_round_up(n,d); #endif
Hi, On Tue, Oct 24, 2023 at 09:32:27AM -1000, Linus Torvalds wrote: > [ Since you're cc'ing the s390 people, I assume that means that this > all ended up being a follow-up to the s390 issue discussion ] Actually I've noticed the issue in the clock framework some time ago and Andy Shevchenko pointed me towards the s390 thread, which is more or less about the same fundamental problem. So I added everyone in Cc, to make sure this is solved properly and not just s390 specific. > On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel > <sebastian.reichel@collabora.com> wrote: > > > > Add a new DIV_ROUND_UP helper, which cannot overflow when > > big numbers are being used. > > This is really horrendously bad on some architectures (ie it might > require *two* divisions): I came up with this helper for the issue in the clock framework, which currently use DIV_ROUND_UP_ULL() as a workaround. That solves the issue on 32 bit systems (input arguments are of type 'unsigned long'). But it also means doing u64 / u32 and I expect that do be much more more expensive on 32 bit platforms than two divisons. On 64 bit platforms the workaround is wrong, since the second argument is reduced from 64 bit to 32 bit. It effectively breaks platforms trying to request a frequency above 4.3 GHz from a clock divider. Since that's not yet a common thing I guess this has not yet been found and fixed by somebody else. I also just found it as bycatch. > > +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d))) > > but David Laight at least had a suggestion for that: when allowing > multiple uses, you can just do > > #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0) > > so now you're back to just one division and no horrible extra expense. > > But we can't allow those multiple uses in general, sadly. > > I would really prefer to just make our regular DIV_ROUND_UP() DTRT. But: > > - people do use it with complex first arguments (ie function calls > etc) that we don't want to evaluate twice > > - we can't make it an inline function, because the types aren't fixed > > - we can't even use a statement expression and __auto_type, because > these things are used in type definitions etc and need to be constant > expressions > > That last thing means that even the "__builtin_choose_expr()" doesn't > work, because the statement expression still needs to be _parsed_ as > such, and that's not something we can do in those contexts. > > Very annoying. Let me think about this. Thanks. -- Sebastian
Hi, On Tue, Oct 24, 2023 at 12:53:03PM -1000, Linus Torvalds wrote: > On Tue, 24 Oct 2023 at 09:32, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > I would really prefer to just make our regular DIV_ROUND_UP() DTRT. But: > > > > - people do use it with complex first arguments (ie function calls > > etc) that we don't want to evaluate twice > > > > - we can't make it an inline function, because the types aren't fixed > > > > - we can't even use a statement expression and __auto_type, because > > these things are used in type definitions etc and need to be constant > > expressions > > Ok. I have a potential beginning of a solution. > > It is unbelievably disgustingly complicated. But it might approach > being correct. > > And by that "it might approach being correct" I obviously mean "this > is untested, but builds at least some kernel code". > > I'm almost certain it will fail on more complex cases, because I > already found a lot of questionable stuff that was simply hidden by > the old macro just silently doing the C arithmetic type conversions, > and this thing does type handling manually. > > I'm hoping that somebody will go "Linus, you're just being > *completely* silly, it's much easier to do XYZ". I think your new __div_round_up_ull() should also be used by DIV_ROUND_UP_ULL, which is defined in linux/math.h. -- Sebastian
On Tue, 24 Oct 2023 at 22:03, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > Both the existing and new implementation are simply wrong for negative > n, because / doesn't do floor(), it does round-towards-0 (I do believe > that's a bug in the C standard, but not much one can do about that). Honestly, while I found several cases where the arithmetic was done with a signed type, all the ones I looked at were purely about the _type_, not the values. Any actual negative values would have been too random before to make a difference. In that patch, I tried to keep the signed type around mainly to at least get *similar* values to the old code, but honestly, it was all broken before already if the values were actually signed. If it were to change the result and somebody would find a bug due to that, that would be a good thing, in other words. The negative value would need to be fixed anyway - as you say, integer division of signed values is simply not rgeat in the first place. Side note: if you dislike the "round towards zero" behavior, I can tell you that it is better than the historical alternative, which was "implementation defined". IIRC, the rule used to be that signed division could either round down or towards zero, and the only rule was that (a) it had to be documented (ie "implementation defined" rather than "undefined") and (b) the modulus had to follow the same rules, ie regardless of rounding, you'd have (a/b)*b + a%b = a That said: > Preventing signed types from being used is probably too difficult. It might not be. My horror with _Generic() is certainly not pretty, but it *does* have the advantage that we could error out for signed cases. I did that initially, and it didn't look horrific, particularly if you just accept signed positive constants as unsigned values (which requires another bit of macro magic but is not conceptually hard). The patch I sent out was not it, though - it was meant really as a minimal "something like this" that still compiled my defconfig. > Aside: I don't think uapi/ stuff can include non-uapi/ stuff. Yeah, no, I left it in that uapi header just for the same reason I didn't bother trying to fix any signed type cases - actively avoiding changes *outside* the DIV_ROUND_UP() case itself. In reality, I'd just make any users of __KERNEL_DIV_ROUND_UP include <linux/div_round_up.h> and get rid of the crazy UAPI thing entirely. So ignore that part. > Aside2: do_div(n-1, d); probably doesn't compile :) .. I did say it was untested and "might approach being correct". That 32-bit ULL case should also have some attempt at checking that the divisor really fits in 32 bits. Because I still refuse to believe that the full 64-by-64 divisions are sane on a 32-bit machine, and anybody who does that needs to go to extra lengths for them. No "silently do them without telling the programmer they are doing something horrendously bad". Linus
On Tue, 24 Oct 2023 at 22:38, David Laight <David.Laight@aculab.com> wrote: > > From: Linus Torvalds > > > - we can't even use a statement expression and __auto_type, because > > > these things are used in type definitions etc and need to be constant > > > expressions > > Doesn't min() get around that by using is_constexpr() and > __builtin_choose_exptr() - the same could be done here. Nope. I wanted to do it that way - it would have made things much simpler and avoid the whole _Generic() thing, but try it - you cannot use statement expressions in a non-function context even with __builtin_choose_expr(). And no, min/max have never been usable in that context Linus
On Wed, 25 Oct 2023 at 05:05, Vasily Gorbik <gor@linux.ibm.com> wrote: > > You probably want > > #define __div_round_up(n,d) _Generic((n)+(d), \ > unsigned long long: __div_round_up_ull, \ > long long: __div_round_up_ll, \ > unsigned long: __div_round_up_ul, \ > long: __div_round_up_l, \ > unsigned int: __div_round_up_u, \ > int: __div_round_up_i)(n,d) > > to avoid early type-checking for expressions that will be discarded > and prevent errors like: Ack. I noticed that later when I tried to do a bigger config build - the compiler would warn about the implicit truncation of the integer arguments (for the cases where they weren't used). > Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm. Lovely. It would have been even better if somebody told me that I was stupid and there was some nice trick to it, but at least the _Generic() approach doesn't seem broken - just a few tweaks needed. Linus
From: Linus Torvalds > Sent: 25 October 2023 18:41 > > On Tue, 24 Oct 2023 at 22:38, David Laight <David.Laight@aculab.com> wrote: > > > > From: Linus Torvalds > > > > - we can't even use a statement expression and __auto_type, because > > > > these things are used in type definitions etc and need to be constant > > > > expressions > > > > Doesn't min() get around that by using is_constexpr() and > > __builtin_choose_exptr() - the same could be done here. > > Nope. I wanted to do it that way - it would have made things much > simpler and avoid the whole _Generic() thing, but try it - you cannot > use statement expressions in a non-function context even with > __builtin_choose_expr(). _Generic() has exactly the same issues as __builtin_choose_expr(). All the code has to be valid for all types. Makes testing for negative constants a PITA. Something like this worked: (__buitin_choose_expr(__is_constexpr((x) && is_signed_type(__typeof(x)), x, 1) < 0) Although our is_signed_type() isn't constant (enough) for pointer types. So you need another check for (typeof(x))1 being constant. > And no, min/max have never been usable in that context I've clearly not tested it and assumed the comment implied that was ok. It'd probably work in C++ - which is perfectly willing to add functions to initialise static data - even when you'd rather have a compile time error. David > > Linus - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
From: Linus Torvalds > Sent: 25 October 2023 18:44 > > On Wed, 25 Oct 2023 at 05:05, Vasily Gorbik <gor@linux.ibm.com> wrote: > > > > You probably want > > > > #define __div_round_up(n,d) _Generic((n)+(d), \ > > unsigned long long: __div_round_up_ull, \ > > long long: __div_round_up_ll, \ > > unsigned long: __div_round_up_ul, \ > > long: __div_round_up_l, \ > > unsigned int: __div_round_up_u, \ > > int: __div_round_up_i)(n,d) > > > > to avoid early type-checking for expressions that will be discarded > > and prevent errors like: > > Ack. I noticed that later when I tried to do a bigger config build - > the compiler would warn about the implicit truncation of the integer > arguments (for the cases where they weren't used). > > > Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm. > > Lovely. > > It would have been even better if somebody told me that I was stupid > and there was some nice trick to it, but at least the _Generic() > approach doesn't seem broken - just a few tweaks needed. Doesn't that version end up calling inline functions? So won't be usable in static initialisers - the same as statement functions. Will this trick work: #define ZERO_UNLESS_TYPE(x, type) _Generic(x, type: x, default 0) #define div_ru(type, fn, n, d) \ type: fn(ZERO_UNLESS_TYPE(type, (n) + (d), n), ZERO_UNLESS_TYPE(type, (n) + (d), d) then: #define __div_round_up(n, d) _Generic((n)+(d), \ div_ru(unsigned long long, __div_round_up_ull, n, d), ... Which would allow the 'functions' be #defines. But it still has the issues of doing 'big' divisions. Although the compiler will use 32bit (and 64 by 32 bit) divisions for 64 bit types if it knows the values are small. clang seems to add conditionals to avoid 64x64 divides. Probably worth it for intel x86 cpu, but not for amd ones. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Wed, 25 Oct 2023 at 22:57, David Laight <David.Laight@aculab.com> wrote: > > Doesn't that version end up calling inline functions? > So won't be usable in static initialisers - the same as statement functions. David, please either read what I write, or actually test things out, instead of just adding noise. It is not a problem to have a _Generic() - or a __builtin_chooise_expr() - that has an inline function that doesn't get used. The compiler will remove it. The problem is literally that you can't use "__auto_type", because *STATEMENT EXPRESSIONS* do not work outside of function definitions, because gcc won't even *parse* them, and just says error: braced-group within expression allowed only inside a function exactly like I explained in the comment of the patch that does the _Generic(). Which is why my solution ended up using those inline functions, and splitting manually by type, instead of an expression statement and __auto_type. Whether you then split into different expressions with _Generic() or by using __builtin_choose_expr() is irrelevant, although once you do "split by type", _Generic() ends up being much more convenient because that's what it's designed for (and you don't end up trying to figure out the size with sizeof() and typeof tricks). Linus
From: Linus Torvalds > Sent: 26 October 2023 17:54 > Subject: Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW > > On Wed, 25 Oct 2023 at 22:57, David Laight <David.Laight@aculab.com> wrote: > > > > Doesn't that version end up calling inline functions? > > So won't be usable in static initialisers - the same as statement functions. > > David, please either read what I write, or actually test things out, > instead of just adding noise. I do test a lot of stuff. For some reason I'd assumed that you would be able to have a call to a function inside a static initialiser. Clearly not enough coffee :-( David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..f80bfb375ab9 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -36,6 +36,17 @@ #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP +/** + * DIV_ROUND_UP_NO_OVERFLOW - divide two numbers and always round up + * @n: numerator / dividend + * @d: denominator / divisor + * + * This functions does the same as DIV_ROUND_UP, but internally uses a + * division and a modulo operation instead of math tricks. This way it + * avoids overflowing when handling big numbers. + */ +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d))) + #define DIV_ROUND_DOWN_ULL(ll, d) \ ({ unsigned long long _tmp = (ll); do_div(_tmp, d); _tmp; })
Add a new DIV_ROUND_UP helper, which cannot overflow when big numbers are being used. Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> --- include/linux/math.h | 11 +++++++++++ 1 file changed, 11 insertions(+)