Message ID | 3a54a6703879d10f08cf0275a2a69297ebd2b1d4.1637592133.git.geert+renesas@glider.be (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | Non-const bitfield helper conversions | expand |
On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > The existing FIELD_{GET,PREP}() macros are limited to compile-time > constants. However, it is very common to prepare or extract bitfield > elements where the bitfield mask is not a compile-time constant. > I'm not sure it's really a good idea to add a third API here? We have the upper-case (constant) versions, and already {u32,...}_get_bits()/etc. Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit architectures (afaict), so that seems a bit awkward. Maybe we can make {u32,...}_get_bits() be doing compile-time only checks if it is indeed a constant? The __field_overflow() usage is already only done if __builtin_constant_p(v), so I guess we can do the same with __bad_mask()? johannes
On Mon, 22 Nov 2021 17:32:43 +0100 Johannes Berg wrote: > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > > The existing FIELD_{GET,PREP}() macros are limited to compile-time > > constants. However, it is very common to prepare or extract bitfield > > elements where the bitfield mask is not a compile-time constant. > > I'm not sure it's really a good idea to add a third API here? +1 > We have the upper-case (constant) versions, and already > {u32,...}_get_bits()/etc. > > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > architectures (afaict), so that seems a bit awkward. > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > if it is indeed a constant? The __field_overflow() usage is already only > done if __builtin_constant_p(v), so I guess we can do the same with > __bad_mask()? Either that or add decomposition macros. Are compilers still really bad at passing small structs by value?
On 11/22/21 10:32 AM, Johannes Berg wrote: > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: >> The existing FIELD_{GET,PREP}() macros are limited to compile-time >> constants. However, it is very common to prepare or extract bitfield >> elements where the bitfield mask is not a compile-time constant. >> > > I'm not sure it's really a good idea to add a third API here? > > We have the upper-case (constant) versions, and already > {u32,...}_get_bits()/etc. I've used these a lot (and personally prefer the lower-case ones). Your new macros don't do anything to ensure the field mask is of the right form, which is basically: (2 ^ width - 1) << shift I really like the property that the field mask must be constant. That being said, I've had to use some strange coding patterns in order to adhere to the "const only" rule in a few cases. So if you can come up with a satisfactory naming scheme I'm all for it. -Alex > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > architectures (afaict), so that seems a bit awkward. > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > if it is indeed a constant? The __field_overflow() usage is already only > done if __builtin_constant_p(v), so I guess we can do the same with > __bad_mask()? > > johannes >
Hi Johannes, On Mon, Nov 22, 2021 at 5:33 PM Johannes Berg <johannes@sipsolutions.net> wrote: > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > > The existing FIELD_{GET,PREP}() macros are limited to compile-time > > constants. However, it is very common to prepare or extract bitfield > > elements where the bitfield mask is not a compile-time constant. > > > > I'm not sure it's really a good idea to add a third API here? > > We have the upper-case (constant) versions, and already > {u32,...}_get_bits()/etc. These don't work for non-const masks. > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > architectures (afaict), so that seems a bit awkward. That's a valid comment. Can be fixed by using a wrapper macro that checks if typeof(mask) == u64, and uses an __ffs64() version when needed. > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > if it is indeed a constant? The __field_overflow() usage is already only > done if __builtin_constant_p(v), so I guess we can do the same with > __bad_mask()? Are all compilers smart enough to replace the division by field_multiplier(field) by a shift? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Jakub, On Tue, Nov 23, 2021 at 2:17 AM Jakub Kicinski <kuba@kernel.org> wrote: > On Mon, 22 Nov 2021 17:32:43 +0100 Johannes Berg wrote: > > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > > > The existing FIELD_{GET,PREP}() macros are limited to compile-time > > > constants. However, it is very common to prepare or extract bitfield > > > elements where the bitfield mask is not a compile-time constant. > > > > I'm not sure it's really a good idea to add a third API here? > > +1 Yeah, a smaller API is better. > > We have the upper-case (constant) versions, and already > > {u32,...}_get_bits()/etc. TBH, I don't like the *_get_bits() API: in general, u32_get_bits() does the same as FIELD_GET(), but the order of the parameters is different? (*_replace_bits() seems to be useful, though) That's why I picked field_{get,prep}(). > > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > > architectures (afaict), so that seems a bit awkward. > > > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > > if it is indeed a constant? The __field_overflow() usage is already only > > done if __builtin_constant_p(v), so I guess we can do the same with > > __bad_mask()? > > Either that or add decomposition macros. Are compilers still really bad > at passing small structs by value? Sorry, I don't get what you mean by adding decomposition macros. Can you please elaborate? Thanks! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Alex, On Tue, Nov 23, 2021 at 2:52 AM Alex Elder <elder@ieee.org> wrote: > On 11/22/21 10:32 AM, Johannes Berg wrote: > > On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: > >> The existing FIELD_{GET,PREP}() macros are limited to compile-time > >> constants. However, it is very common to prepare or extract bitfield > >> elements where the bitfield mask is not a compile-time constant. > > > > I'm not sure it's really a good idea to add a third API here? > > > > We have the upper-case (constant) versions, and already > > {u32,...}_get_bits()/etc. > > I've used these a lot (and personally prefer the lower-case ones). > > Your new macros don't do anything to ensure the field mask is > of the right form, which is basically: (2 ^ width - 1) << shift > I really like the property that the field mask must be constant. That's correct. How to enforce that in the non-const case? BUG()/WARN() is not an option ;-) > That being said, I've had to use some strange coding patterns > in order to adhere to the "const only" rule in a few cases. > So if you can come up with a satisfactory naming scheme I'm > all for it. There are plenty of drivers that handle masks stored in a data structure, so it would be good if they can use a suitable helper, as open-coding is prone to errors. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Tue, 2021-11-23 at 09:30 +0100, Geert Uytterhoeven wrote: > > We have the upper-case (constant) versions, and already > > {u32,...}_get_bits()/etc. > > These don't work for non-const masks. Obviously, I know that. Still, just saying. I'm actually in the opposite camp to you I guess - I much prefer the typed versions (u32_get_bits() and friends) over the FIELD_GET() macros that are more magic. Mostly though that's because the typed ones also have le32_/be32_/... variants, which are tremendously useful, and so I prefer to use them all across. In fact, I have considered in the past to just remove the upper- case macros entirely but ... no time I guess. > > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > > architectures (afaict), so that seems a bit awkward. > > That's a valid comment. Can be fixed by using a wrapper macro > that checks if typeof(mask) == u64, and uses an __ffs64() version when > needed. You can't really do a typeof()==something, but you can check the size, so yeah, that could be done. > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > > if it is indeed a constant? The __field_overflow() usage is already only > > done if __builtin_constant_p(v), so I guess we can do the same with > > __bad_mask()? > > Are all compilers smart enough to replace the division by > field_multiplier(field) by a shift? In the constant case they are, but you'd have to replace field_multiplier() with the __ffs(), including the size check discussed above. Then it's no longer a constant, and then I'm not so sure it would actually be able to translate it, even if it's "1<<__ffs64(...)". I guess you can check, or just change it to not use the division and multiplication, but shifts/masks instead manually? IOW - I would much prefer to make the type_get_bits() and friends work for non-constant masks. In fact, you have e.g. code in drivers/usb/chipidea/udc.c that does things like cpu_to_le32(mul << __ffs(...)) - though in those cases it's actually constant today, so you could already write it as le32_encode_bits(...). johannes
On Tue, 2021-11-23 at 09:36 +0100, Geert Uytterhoeven wrote: Ah, here's your comment wrt. which one is nicer :) > > > We have the upper-case (constant) versions, and already > > > {u32,...}_get_bits()/etc. > > TBH, I don't like the *_get_bits() API: in general, u32_get_bits() does > the same as FIELD_GET(), but the order of the parameters is different? I don't really see how "the order of parameters is different" is a downside? Yeah it means if you're used to FIELD_GET() then you'll retrain, but ...? > (*_replace_bits() seems to be useful, though) Indeed. Also as I said in my other mail, the le32/be32/... variants are tremendously useful, and they fundamentally cannot be expressed with the FIELD_GET() or field_get() macros. IMHO this is a clear advantage to the typed versions, and if you ask me we should get rid of the FIELD_GETand FIELD_PREP entirely - difficult now, but at least let's not propagate that? johannes
Hi Johannes, On Tue, Nov 23, 2021 at 5:21 PM Johannes Berg <johannes@sipsolutions.net> wrote: > On Tue, 2021-11-23 at 09:30 +0100, Geert Uytterhoeven wrote: > > > We have the upper-case (constant) versions, and already > > > {u32,...}_get_bits()/etc. > > > > These don't work for non-const masks. > > Obviously, I know that. Still, just saying. > > I'm actually in the opposite camp to you I guess - I much prefer the > typed versions (u32_get_bits() and friends) over the FIELD_GET() macros > that are more magic. > > Mostly though that's because the typed ones also have le32_/be32_/... > variants, which are tremendously useful, and so I prefer to use them all > across. In fact, I have considered in the past to just remove the upper- > case macros entirely but ... no time I guess. OK, I have to think a bit about this. FTR, initially I didn't like the FIELD_{GET,PREP}() macros neither ;-) > In fact, you have e.g. code in drivers/usb/chipidea/udc.c that does > things like cpu_to_le32(mul << __ffs(...)) - though in those cases it's > actually constant today, so you could already write it as > le32_encode_bits(...). Yeah, there are lots of opportunities for improvement for drivers/usb/chipidea/. I didn't include a conversion patch for that driver, as it led me too deep into the rabbit hole, and I wanted to get something posted rather sooner than later... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Tue, 23 Nov 2021 09:36:22 +0100 Geert Uytterhoeven wrote: > > > Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit > > > architectures (afaict), so that seems a bit awkward. > > > > > > Maybe we can make {u32,...}_get_bits() be doing compile-time only checks > > > if it is indeed a constant? The __field_overflow() usage is already only > > > done if __builtin_constant_p(v), so I guess we can do the same with > > > __bad_mask()? > > > > Either that or add decomposition macros. Are compilers still really bad > > at passing small structs by value? > > Sorry, I don't get what you mean by adding decomposition macros. > Can you please elaborate? #define DECOMPOSE(_mask) \ (struct bf){ .mask = _mask, .shf = __bf_shf(_mask), } Then drivers can save or pass around the mask and shift params broken apart as a small struct.
On Tue, 23 Nov 2021 17:24:15 +0100 Johannes Berg wrote: > > (*_replace_bits() seems to be useful, though) > > Indeed. > > Also as I said in my other mail, the le32/be32/... variants are > tremendously useful, and they fundamentally cannot be expressed with the > FIELD_GET() or field_get() macros. IMHO this is a clear advantage to the Can you elaborate? > typed versions, and if you ask me we should get rid of the FIELD_GETand > FIELD_PREP entirely - difficult now, but at least let's not propagate > that? I don't see why.
On Tue, 2021-11-23 at 15:49 -0800, Jakub Kicinski wrote: > On Tue, 23 Nov 2021 17:24:15 +0100 Johannes Berg wrote: > > > (*_replace_bits() seems to be useful, though) > > > > Indeed. > > > > Also as I said in my other mail, the le32/be32/... variants are > > tremendously useful, and they fundamentally cannot be expressed with the > > FIELD_GET() or field_get() macros. IMHO this is a clear advantage to the > > Can you elaborate? Well, the way I see it, the only advantage of FIELD_GET() is that it will auto-determine the type (based on the mask type.) This cannot work if you need be/le conversions, because the be/le type annotations are invisible to the compiler. So obviously you could write a BE32_FIELD_GET(), but then really that's equivalent to be32_get_bits() - note you you have to actually specify the type in the macro name. I guess in theory you could make macros where the type is an argument (like FIELD_GET_TYPE(be32, ...)), but I don't see how that gains anything. > > typed versions, and if you ask me we should get rid of the FIELD_GETand > > FIELD_PREP entirely - difficult now, but at least let's not propagate > > that? > > I don't see why. Just for being more regular, in the spirit of "there's exactly one correct way of doing it" :) johannes
Geert Uytterhoeven <geert@linux-m68k.org> writes: > Hi Johannes, > > On Mon, Nov 22, 2021 at 5:33 PM Johannes Berg <johannes@sipsolutions.net> wrote: >> On Mon, 2021-11-22 at 16:53 +0100, Geert Uytterhoeven wrote: >> > The existing FIELD_{GET,PREP}() macros are limited to compile-time >> > constants. However, it is very common to prepare or extract bitfield >> > elements where the bitfield mask is not a compile-time constant. >> > >> >> I'm not sure it's really a good idea to add a third API here? >> >> We have the upper-case (constant) versions, and already >> {u32,...}_get_bits()/etc. > > These don't work for non-const masks. > >> Also, you're using __ffs(), which doesn't work for 64-bit on 32-bit >> architectures (afaict), so that seems a bit awkward. > > That's a valid comment. Can be fixed by using a wrapper macro > that checks if typeof(mask) == u64, and uses an __ffs64() version when > needed. > >> Maybe we can make {u32,...}_get_bits() be doing compile-time only checks >> if it is indeed a constant? The __field_overflow() usage is already only >> done if __builtin_constant_p(v), so I guess we can do the same with >> __bad_mask()? > > Are all compilers smart enough to replace the division by > field_multiplier(field) by a shift? It looks like the answer is no as few weeks back I received a comment internally that a team is seeing a slow down with u32_get_bits(): "Time taken for executing both the macros/inline function (in terms of microseconds) (out of 3 Trails) FIELD_GET : 32, 31, 32 u32_get_bits : 6379, 6664, 6558" Sadly I didn't realise to ask what compiler they were using. But I still prefer {u32,...}_get_bits() over FIELD_GET(), they are just so much cleaner to use.
On Wed, 24 Nov 2021 09:03:24 +0100 Johannes Berg wrote: > On Tue, 2021-11-23 at 15:49 -0800, Jakub Kicinski wrote: > > > Indeed. > > > > > > Also as I said in my other mail, the le32/be32/... variants are > > > tremendously useful, and they fundamentally cannot be expressed with the > > > FIELD_GET() or field_get() macros. IMHO this is a clear advantage to the > > > > Can you elaborate? > > Well, the way I see it, the only advantage of FIELD_GET() is that it > will auto-determine the type (based on the mask type.) This cannot work > if you need be/le conversions, because the be/le type annotations are > invisible to the compiler. > > So obviously you could write a BE32_FIELD_GET(), but then really that's > equivalent to be32_get_bits() - note you you have to actually specify > the type in the macro name. I guess in theory you could make macros > where the type is an argument (like FIELD_GET_TYPE(be32, ...)), but I > don't see how that gains anything. Ah, that's what you meant! Thanks for spelling it out. FWIW I never found the be/le versions useful. Most of the time the data comes from bus accessors which swap or is unaligned so you have to do be/le_get_unaligned, which swaps. Plus if you access/set multiple fields you'd swap them one by one which seems wasteful. > > > typed versions, and if you ask me we should get rid of the FIELD_GETand > > > FIELD_PREP entirely - difficult now, but at least let's not propagate > > > that? > > > > I don't see why. > > Just for being more regular, in the spirit of "there's exactly one > correct way of doing it" :) Right now it seems the uppercase macros are more prevalent. Could just be because of the way the "swapping ones" are defined.
On Wed, 2021-11-24 at 05:59 -0800, Jakub Kicinski wrote: > > FWIW I never found the be/le versions useful. Most of the time the data > comes from bus accessors which swap or is unaligned so you have to do > be/le_get_unaligned, which swaps. Plus if you access/set multiple > fields you'd swap them one by one which seems wasteful. Oh, we use them all the time in wifi! I'm not sure I'm too concerned about wasteful - actually in wifi most of the time it's little endian to start with, which matches the CPU for all practical uses of wifi (**), and often we just access one field or so. And anyway if we extract more than a single bit we need to swap anyway, and I hope if it's just a single bit the compiler will optimize since the one side is a constant? But whatever ... (**) I had a fight with big-endian ARM a few years ago just to get wifi tested on big-endian ... > Right now it seems the uppercase macros are more prevalent. > Not in my world ;-) $ git grep FIELD_GET -- ... | wc -l 20 $ git grep le32_get_bits -- ... | wc -l 44 $ git grep le16_get_bits -- ... | wc -l 12 $ git grep u8_get_bits -- ... | wc -l 17 :-) johannes
diff --git a/drivers/clk/at91/clk-peripheral.c b/drivers/clk/at91/clk-peripheral.c index e14fa5ac734cead7..e2f33498139a1b8c 100644 --- a/drivers/clk/at91/clk-peripheral.c +++ b/drivers/clk/at91/clk-peripheral.c @@ -3,6 +3,7 @@ * Copyright (C) 2013 Boris BREZILLON <b.brezillon@overkiz.com> */ +#include <linux/bitfield.h> #include <linux/bitops.h> #include <linux/clk-provider.h> #include <linux/clkdev.h> diff --git a/drivers/clk/at91/pmc.h b/drivers/clk/at91/pmc.h index 3a1bf6194c287d09..1256e1ab91526a25 100644 --- a/drivers/clk/at91/pmc.h +++ b/drivers/clk/at91/pmc.h @@ -114,9 +114,6 @@ struct at91_clk_pms { unsigned int parent; }; -#define field_get(_mask, _reg) (((_reg) & (_mask)) >> (ffs(_mask) - 1)) -#define field_prep(_mask, _val) (((_val) << (ffs(_mask) - 1)) & (_mask)) - #define ndck(a, s) (a[s - 1].id + 1) #define nck(a) (a[ARRAY_SIZE(a) - 1].id + 1) struct pmc_data *pmc_data_allocate(unsigned int ncore, unsigned int nsystem, diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h index 4e035aca6f7e6000..f03b0712e4babec1 100644 --- a/include/linux/bitfield.h +++ b/include/linux/bitfield.h @@ -156,4 +156,34 @@ __MAKE_OP(64) #undef __MAKE_OP #undef ____MAKE_OP +/** + * field_prep() - prepare a bitfield element + * @_mask: shifted mask defining the field's length and position + * @_val: value to put in the field + * + * field_prep() masks and shifts up the value. The result should be + * combined with other fields of the bitfield using logical OR. + * Unlike FIELD_PREP(), @_mask is not limited to a compile-time constant. + */ +#define field_prep(_mask, _val) \ + ({ \ + typeof(_mask) ___mask = (_mask); \ + (((_val) << __ffs(___mask)) & (___mask)); \ + }) + +/** + * field_get() - extract a bitfield element + * @_mask: shifted mask defining the field's length and position + * @_reg: value of entire bitfield + * + * field_get() extracts the field specified by @_mask from the + * bitfield passed in as @_reg by masking and shifting it down. + * Unlike FIELD_GET(), @_mask is not limited to a compile-time constant. + */ +#define field_get(_mask, _reg) \ + ({ \ + typeof(_mask) ___mask = _mask; \ + (((_reg) & (___mask)) >> __ffs(___mask)); \ + }) + #endif
The existing FIELD_{GET,PREP}() macros are limited to compile-time constants. However, it is very common to prepare or extract bitfield elements where the bitfield mask is not a compile-time constant. To avoid this limitation, the AT91 clock driver already has its own field_{prep,get}() macros. Make them available for general use by moving them to <linux/bitfield.h>, and improve them slightly: 1. Avoid evaluating macro parameters more than once, 2. Replace "ffs() - 1" by "__ffs()", as the latter operates on "unsigned long", just like BIT() and GENMASK(). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> --- Using __ffs() actually reduces code size (16 bytes for each of drivers/clk/at91/clk-{generated,peripheral}.o), as __ffs() doesn't need to verify that the value passed is non-zero. --- drivers/clk/at91/clk-peripheral.c | 1 + drivers/clk/at91/pmc.h | 3 --- include/linux/bitfield.h | 30 ++++++++++++++++++++++++++++++ 3 files changed, 31 insertions(+), 3 deletions(-)