diff mbox series

[net-next,3/8] lib: packing: add pack_fields() and unpack_fields()

Message ID 20241011-packing-pack-fields-and-ice-implementation-v1-3-d9b1f7500740@intel.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series lib: packing: introduce and use (un)pack_fields | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5 this patch: 5
netdev/build_tools success Errors and warnings before: 0 (+1) this patch: 0 (+1)
netdev/cc_maintainers success CCed 4 of 4 maintainers
netdev/build_clang success Errors and warnings before: 4 this patch: 4
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch fail CHECK: Alignment should match open parenthesis ERROR: Macros with multiple statements should be enclosed in a do - while loop WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? WARNING: line length of 106 exceeds 80 columns WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 90 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns WARNING: macros should not use a trailing semicolon
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-10-12--12-00 (tests: 777)

Commit Message

Jacob Keller Oct. 11, 2024, 6:48 p.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

This is new API which caters to the following requirements:

- Pack or unpack a large number of fields to/from a buffer with a small
  code footprint. The current alternative is to open-code a large number
  of calls to pack() and unpack(), or to use packing() to reduce that
  number to half. But packing() is not const-correct.

- Use unpacked numbers stored in variables smaller than u64. This
  reduces the rodata footprint of the stored field arrays.

- Perform error checking at compile time, rather than at runtime, and
  return void from the API functions. To that end, we introduce
  CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
  fields. Note: the C preprocessor can't generate variable-length code
  (loops),  as would be required for array-style definitions of struct
  packed_field arrays. So the sanity checks use code generation at
  compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
  There are explicit macros for sanity-checking arrays of 1 packed
  field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
  fields. In practice, the sja1105 driver will actually need the variant
  with 40 fields. This isn't as bad as it seems: feeding a 39 entry
  sized array into the CHECK_PACKED_FIELDS_40() macro will actually
  generate a compilation error, so mistakes are very likely to be caught
  by the developer and thus are not a problem.

- Reduced rodata footprint for the storage of the packed field arrays.
  To that end, we have struct packed_field_s (small) and packed_field_m
  (medium). More can be added as needed (unlikely for now). On these
  types, the same generic pack_fields() and unpack_fields() API can be
  used, thanks to the new C11 _Generic() selection feature, which can
  call pack_fields_s() or pack_fields_m(), depending on the type of the
  "fields" array - a simplistic form of polymorphism. It is evaluated at
  compile time which function will actually be called.

Over time, packing() is expected to be completely replaced either with
pack() or with pack_fields().

Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/linux/packing.h  |  69 ++++++++++++++++++++++
 lib/gen_packing_checks.c |  31 ++++++++++
 lib/packing.c            | 149 ++++++++++++++++++++++++++++++++++++++++++++++-
 Kbuild                   |  13 ++++-
 4 files changed, 259 insertions(+), 3 deletions(-)

Comments

Jacob Keller Oct. 15, 2024, 7:19 p.m. UTC | #1
On 10/11/2024 11:48 AM, Jacob Keller wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> This is new API which caters to the following requirements:
> 
> - Pack or unpack a large number of fields to/from a buffer with a small
>   code footprint. The current alternative is to open-code a large number
>   of calls to pack() and unpack(), or to use packing() to reduce that
>   number to half. But packing() is not const-correct.
> 
> - Use unpacked numbers stored in variables smaller than u64. This
>   reduces the rodata footprint of the stored field arrays.
> 
> - Perform error checking at compile time, rather than at runtime, and
>   return void from the API functions. To that end, we introduce
>   CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
>   fields. Note: the C preprocessor can't generate variable-length code
>   (loops),  as would be required for array-style definitions of struct
>   packed_field arrays. So the sanity checks use code generation at
>   compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
>   There are explicit macros for sanity-checking arrays of 1 packed
>   field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
>   fields. In practice, the sja1105 driver will actually need the variant
>   with 40 fields. This isn't as bad as it seems: feeding a 39 entry
>   sized array into the CHECK_PACKED_FIELDS_40() macro will actually
>   generate a compilation error, so mistakes are very likely to be caught
>   by the developer and thus are not a problem.
> 
> - Reduced rodata footprint for the storage of the packed field arrays.
>   To that end, we have struct packed_field_s (small) and packed_field_m
>   (medium). More can be added as needed (unlikely for now). On these
>   types, the same generic pack_fields() and unpack_fields() API can be
>   used, thanks to the new C11 _Generic() selection feature, which can
>   call pack_fields_s() or pack_fields_m(), depending on the type of the
>   "fields" array - a simplistic form of polymorphism. It is evaluated at
>   compile time which function will actually be called.
> 
> Over time, packing() is expected to be completely replaced either with
> pack() or with pack_fields().
> 
> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  include/linux/packing.h  |  69 ++++++++++++++++++++++
>  lib/gen_packing_checks.c |  31 ++++++++++
>  lib/packing.c            | 149 ++++++++++++++++++++++++++++++++++++++++++++++-
>  Kbuild                   |  13 ++++-
>  4 files changed, 259 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/packing.h b/include/linux/packing.h
> index 5d36dcd06f60..eeb23d90e5e0 100644
> --- a/include/linux/packing.h
> +++ b/include/linux/packing.h
> @@ -26,4 +26,73 @@ int pack(void *pbuf, u64 uval, size_t startbit, size_t endbit, size_t pbuflen,
>  int unpack(const void *pbuf, u64 *uval, size_t startbit, size_t endbit,
>  	   size_t pbuflen, u8 quirks);
>  
> +#define GEN_PACKED_FIELD_MEMBERS(__type) \
> +	__type startbit; \
> +	__type endbit; \
> +	__type offset; \
> +	__type size;
> +
> +/* Small packed field. Use with bit offsets < 256, buffers < 32B and
> + * unpacked structures < 256B.
> + */
> +struct packed_field_s {
> +	GEN_PACKED_FIELD_MEMBERS(u8);
> +};
> +
> +/* Medium packed field. Use with bit offsets < 65536, buffers < 8KB and
> + * unpacked structures < 64KB.
> + */
> +struct packed_field_m {
> +	GEN_PACKED_FIELD_MEMBERS(u16);
> +};
> +
> +#define PACKED_FIELD(start, end, struct_name, struct_field) \
> +	{ \
> +		(start), \
> +		(end), \
> +		offsetof(struct_name, struct_field), \
> +		sizeof_field(struct_name, struct_field), \
> +	}
> +
> +#define CHECK_PACKED_FIELD(field, pbuflen) \
> +	({ typeof(field) __f = (field); typeof(pbuflen) __len = (pbuflen); \
> +	BUILD_BUG_ON(__f.startbit < __f.endbit); \
> +	BUILD_BUG_ON(__f.startbit >= BITS_PER_BYTE * __len); \
> +	BUILD_BUG_ON(__f.startbit - __f.endbit >= BITS_PER_BYTE * __f.size); \
> +	BUILD_BUG_ON(__f.size != 1 && __f.size != 2 && __f.size != 4 && __f.size != 8); })
> +
> +#define CHECK_PACKED_FIELD_OVERLAP(field1, field2) \
> +	({ typeof(field1) _f1 = (field1); typeof(field2) _f2 = (field2); \
> +	BUILD_BUG_ON(max(_f1.endbit, _f2.endbit) <=  min(_f1.startbit, _f2.startbit)); })
> +
> +#include <generated/packing-checks.h>
> +
> +void pack_fields_s(void *pbuf, size_t pbuflen, const void *ustruct,
> +		   const struct packed_field_s *fields, size_t num_fields,
> +		   u8 quirks);
> +
> +void pack_fields_m(void *pbuf, size_t pbuflen, const void *ustruct,
> +		   const struct packed_field_m *fields, size_t num_fields,
> +		   u8 quirks);
> +
> +void unpack_fields_s(const void *pbuf, size_t pbuflen, void *ustruct,
> +		     const struct packed_field_s *fields, size_t num_fields,
> +		     u8 quirks);
> +
> +void unpack_fields_m(const void *pbuf, size_t pbuflen, void *ustruct,
> +		      const struct packed_field_m *fields, size_t num_fields,
> +		      u8 quirks);
> +
> +#define pack_fields(pbuf, pbuflen, ustruct, fields, quirks) \
> +	_Generic((fields), \
> +		 const struct packed_field_s * : pack_fields_s, \
> +		 const struct packed_field_m * : pack_fields_m \
> +		)(pbuf, pbuflen, ustruct, fields, ARRAY_SIZE(fields), quirks)
> +
> +#define unpack_fields(pbuf, pbuflen, ustruct, fields, quirks) \
> +	_Generic((fields), \
> +		 const struct packed_field_s * : unpack_fields_s, \
> +		 const struct packed_field_m * : unpack_fields_m \
> +		)(pbuf, pbuflen, ustruct, fields, ARRAY_SIZE(fields), quirks)
> +
>  #endif
> diff --git a/lib/gen_packing_checks.c b/lib/gen_packing_checks.c
> new file mode 100644
> index 000000000000..3213c858c2fe
> --- /dev/null
> +++ b/lib/gen_packing_checks.c
> @@ -0,0 +1,31 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +
> +int main(int argc, char **argv)
> +{
> +	printf("/* Automatically generated - do not edit */\n\n");
> +	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
> +	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
> +
> +	for (int i = 1; i <= 50; i++) {
> +		printf("#define CHECK_PACKED_FIELDS_%d(fields, pbuflen) \\\n", i);
> +		printf("\t({ typeof(&(fields)[0]) _f = (fields); typeof(pbuflen) _len = (pbuflen); \\\n");
> +		printf("\tBUILD_BUG_ON(ARRAY_SIZE(fields) != %d); \\\n", i);
> +		for (int j = 0; j < i; j++) {
> +			int final = (i == 1);
> +
> +			printf("\tCHECK_PACKED_FIELD(_f[%d], _len);%s\n",
> +			       j, final ? " })\n" : " \\");
> +		}
> +		for (int j = 1; j < i; j++) {
> +			for (int k = 0; k < j; k++) {
> +				int final = (j == i - 1) && (k == j - 1);
> +
> +				printf("\tCHECK_PACKED_FIELD_OVERLAP(_f[%d], _f[%d]);%s\n",
> +				       k, j, final ? " })\n" : " \\");
> +			}
> +		}
> +	}
> +
> +	printf("#endif /* GENERATED_PACKING_CHECKS_H */\n");
> +}

Hi Masahiro,

The changes in this patch contains some code and Kbuild changes to
generate compile-time macro checks at build time (instead of committing
20 thousand lines of code directly to git). I'd appreciate if you could
review this change, specifically the auto-generation of packing-checks.h

The full series can be viewed on lore at:

> https://lore.kernel.org/netdev/20241011-packing-pack-fields-and-ice-implementation-v1-0-d9b1f7500740@intel.com/

> diff --git a/Kbuild b/Kbuild
> index 464b34a08f51..35a8b78b72d9 100644
> --- a/Kbuild
> +++ b/Kbuild
> @@ -34,6 +34,17 @@ arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)
>  $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
>  	$(call filechk,offsets,__ASM_OFFSETS_H__)
>  
> +# Generate packing-checks.h
> +
> +hostprogs += lib/gen_packing_checks
> +
> +packing-checks := include/generated/packing-checks.h
> +
> +filechk_gen_packing_checks = lib/gen_packing_checks
> +
> +$(packing-checks): lib/gen_packing_checks FORCE
> +	$(call filechk,gen_packing_checks)
> +
>  # Check for missing system calls
>  
>  quiet_cmd_syscalls = CALL    $<
> @@ -70,7 +81,7 @@ $(atomic-checks): $(obj)/.checked-%: include/linux/atomic/%  FORCE
>  # A phony target that depends on all the preparation targets
>  
>  PHONY += prepare
> -prepare: $(offsets-file) missing-syscalls $(atomic-checks)
> +prepare: $(offsets-file) missing-syscalls $(atomic-checks) $(packing-checks)
>  	@:
>  
>  # Ordinary directory descending
> 

In particular, I tried a variety of places to put the build-time
generation logic, and ended up having to stick it in the top level
Kbuild file as part of the prepare target. I was unable to figure out
another way to get the include dependency correct.

packing-checks.h contains the set of macros generated for checking
various sizes of the packing array, which we want to have at compile
time, so we need to generate packing-checks.h before any code which
includes <linux/packing.h> which is what ultimately includes
<generated/packing-checks.h>

Vladimir and I tried to come up with other methods of doing the compile
time checking and validation of the structures. But the limited C
pre-processor prevents us from doing loops, which is what led to the
large number of generated macros.

Thanks,
Jake
Przemek Kitszel Oct. 16, 2024, 1:02 p.m. UTC | #2
On 10/11/24 20:48, Jacob Keller wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> This is new API which caters to the following requirements:
> 
> - Pack or unpack a large number of fields to/from a buffer with a small
>    code footprint. The current alternative is to open-code a large number
>    of calls to pack() and unpack(), or to use packing() to reduce that
>    number to half. But packing() is not const-correct.
> 
> - Use unpacked numbers stored in variables smaller than u64. This
>    reduces the rodata footprint of the stored field arrays.
> 
> - Perform error checking at compile time, rather than at runtime, and
>    return void from the API functions. To that end, we introduce
>    CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
>    fields. Note: the C preprocessor can't generate variable-length code
>    (loops),  as would be required for array-style definitions of struct
>    packed_field arrays. So the sanity checks use code generation at
>    compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
>    There are explicit macros for sanity-checking arrays of 1 packed
>    field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
>    fields. In practice, the sja1105 driver will actually need the variant
>    with 40 fields. This isn't as bad as it seems: feeding a 39 entry
>    sized array into the CHECK_PACKED_FIELDS_40() macro will actually
>    generate a compilation error, so mistakes are very likely to be caught
>    by the developer and thus are not a problem.
> 
> - Reduced rodata footprint for the storage of the packed field arrays.
>    To that end, we have struct packed_field_s (small) and packed_field_m
>    (medium). More can be added as needed (unlikely for now). On these
>    types, the same generic pack_fields() and unpack_fields() API can be
>    used, thanks to the new C11 _Generic() selection feature, which can
>    call pack_fields_s() or pack_fields_m(), depending on the type of the
>    "fields" array - a simplistic form of polymorphism. It is evaluated at
>    compile time which function will actually be called.
> 
> Over time, packing() is expected to be completely replaced either with
> pack() or with pack_fields().
> 
> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>   include/linux/packing.h  |  69 ++++++++++++++++++++++
>   lib/gen_packing_checks.c |  31 ++++++++++
>   lib/packing.c            | 149 ++++++++++++++++++++++++++++++++++++++++++++++-
>   Kbuild                   |  13 ++++-
>   4 files changed, 259 insertions(+), 3 deletions(-)


> diff --git a/lib/gen_packing_checks.c b/lib/gen_packing_checks.c
> new file mode 100644
> index 000000000000..3213c858c2fe
> --- /dev/null
> +++ b/lib/gen_packing_checks.c
> @@ -0,0 +1,31 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +
> +int main(int argc, char **argv)
> +{
> +	printf("/* Automatically generated - do not edit */\n\n");
> +	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
> +	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
> +
> +	for (int i = 1; i <= 50; i++) {

either you missed my question, or I have missed your reply during
internal round of review, but:

do we need 50? that means 1MB file, while it is 10x smaller for n=27

> +		printf("#define CHECK_PACKED_FIELDS_%d(fields, pbuflen) \\\n", i);
> +		printf("\t({ typeof(&(fields)[0]) _f = (fields); typeof(pbuflen) _len = (pbuflen); \\\n");
> +		printf("\tBUILD_BUG_ON(ARRAY_SIZE(fields) != %d); \\\n", i);
> +		for (int j = 0; j < i; j++) {
> +			int final = (i == 1);

you could replace both @final variables and ternary operators from
the prints by simply moving the final "})\n" outside the loops

> +
> +			printf("\tCHECK_PACKED_FIELD(_f[%d], _len);%s\n",
> +			       j, final ? " })\n" : " \\");
> +		}
> +		for (int j = 1; j < i; j++) {
> +			for (int k = 0; k < j; k++) {
> +				int final = (j == i - 1) && (k == j - 1);
> +
> +				printf("\tCHECK_PACKED_FIELD_OVERLAP(_f[%d], _f[%d]);%s\n",
> +				       k, j, final ? " })\n" : " \\");
> +			}
> +		}
> +	}
> +
> +	printf("#endif /* GENERATED_PACKING_CHECKS_H */\n");
> +}
Vladimir Oltean Oct. 16, 2024, 1:40 p.m. UTC | #3
On Wed, Oct 16, 2024 at 03:02:38PM +0200, Przemek Kitszel wrote:
> On 10/11/24 20:48, Jacob Keller wrote:
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > 
> > This is new API which caters to the following requirements:
> > 
> > - Pack or unpack a large number of fields to/from a buffer with a small
> >    code footprint. The current alternative is to open-code a large number
> >    of calls to pack() and unpack(), or to use packing() to reduce that
> >    number to half. But packing() is not const-correct.
> > 
> > - Use unpacked numbers stored in variables smaller than u64. This
> >    reduces the rodata footprint of the stored field arrays.
> > 
> > - Perform error checking at compile time, rather than at runtime, and
> >    return void from the API functions. To that end, we introduce
> >    CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
> >    fields. Note: the C preprocessor can't generate variable-length code
> >    (loops),  as would be required for array-style definitions of struct
> >    packed_field arrays. So the sanity checks use code generation at
> >    compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
> >    There are explicit macros for sanity-checking arrays of 1 packed
> >    field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
> >    fields. In practice, the sja1105 driver will actually need the variant
> >    with 40 fields. This isn't as bad as it seems: feeding a 39 entry
> >    sized array into the CHECK_PACKED_FIELDS_40() macro will actually
> >    generate a compilation error, so mistakes are very likely to be caught
> >    by the developer and thus are not a problem.
> > 
> > - Reduced rodata footprint for the storage of the packed field arrays.
> >    To that end, we have struct packed_field_s (small) and packed_field_m
> >    (medium). More can be added as needed (unlikely for now). On these
> >    types, the same generic pack_fields() and unpack_fields() API can be
> >    used, thanks to the new C11 _Generic() selection feature, which can
> >    call pack_fields_s() or pack_fields_m(), depending on the type of the
> >    "fields" array - a simplistic form of polymorphism. It is evaluated at
> >    compile time which function will actually be called.
> > 
> > Over time, packing() is expected to be completely replaced either with
> > pack() or with pack_fields().
> > 
> > Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> >   include/linux/packing.h  |  69 ++++++++++++++++++++++
> >   lib/gen_packing_checks.c |  31 ++++++++++
> >   lib/packing.c            | 149 ++++++++++++++++++++++++++++++++++++++++++++++-
> >   Kbuild                   |  13 ++++-
> >   4 files changed, 259 insertions(+), 3 deletions(-)
> 
> 
> > diff --git a/lib/gen_packing_checks.c b/lib/gen_packing_checks.c
> > new file mode 100644
> > index 000000000000..3213c858c2fe
> > --- /dev/null
> > +++ b/lib/gen_packing_checks.c
> > @@ -0,0 +1,31 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <stdio.h>
> > +
> > +int main(int argc, char **argv)
> > +{
> > +	printf("/* Automatically generated - do not edit */\n\n");
> > +	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
> > +	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
> > +
> > +	for (int i = 1; i <= 50; i++) {
> 
> either you missed my question, or I have missed your reply during
> internal round of review, but:
> 
> do we need 50? that means 1MB file, while it is 10x smaller for n=27

The sja1105 driver will need checks for arrays of 40 fields, it's in the
commit message. Though if the file size is going to generate comments
even at this initial dimension, then maybe it isn't the best way forward...

Suggestions for how to scale up the compile-time checks?

Originally the CHECK_PACKED_FIELD_OVERLAP() weren't the cartesian
product of all array elements. I just assumed that the field array would
be ordered from MSB to LSB. But then, Jacob wondered why the order isn't
from LSB to MSB. The presence/absence of the QUIRK_LSW32_IS_FIRST quirk
seems to influence the perception of which field layout is natural.
So the full-blown current overlap check is the compromise to use the
same sanity check macros everywhere. Otherwise, we'd have to introduce
CHECK_PACKED_FIELDS_5_UP() and CHECK_PACKED_FIELDS_5_DOWN(), and
although even that would be smaller in size than the full cartesian
product, it's harder to use IMO.

> > +		printf("#define CHECK_PACKED_FIELDS_%d(fields, pbuflen) \\\n", i);
> > +		printf("\t({ typeof(&(fields)[0]) _f = (fields); typeof(pbuflen) _len = (pbuflen); \\\n");
> > +		printf("\tBUILD_BUG_ON(ARRAY_SIZE(fields) != %d); \\\n", i);
> > +		for (int j = 0; j < i; j++) {
> > +			int final = (i == 1);
> 
> you could replace both @final variables and ternary operators from
> the prints by simply moving the final "})\n" outside the loops

I don't see how, could you illustrate with some code? (assuming you're
not proposing to change the generated output?) Even if you move the
final "})\n" outside the loop, I'm not seeing how you would avoid
printing the last " \\" without keeping track of the "final" variable
anyway, point at which you're better off with the ternary than yet
another printf() call?

> > +
> > +			printf("\tCHECK_PACKED_FIELD(_f[%d], _len);%s\n",
> > +			       j, final ? " })\n" : " \\");
> > +		}
> > +		for (int j = 1; j < i; j++) {
> > +			for (int k = 0; k < j; k++) {
> > +				int final = (j == i - 1) && (k == j - 1);
> > +
> > +				printf("\tCHECK_PACKED_FIELD_OVERLAP(_f[%d], _f[%d]);%s\n",
> > +				       k, j, final ? " })\n" : " \\");
> > +			}
> > +		}
> > +	}
> > +
> > +	printf("#endif /* GENERATED_PACKING_CHECKS_H */\n");
> > +}
Jacob Keller Oct. 16, 2024, 10:31 p.m. UTC | #4
> -----Original Message-----
> From: Vladimir Oltean <olteanv@gmail.com>
> Sent: Wednesday, October 16, 2024 6:41 AM
> To: Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; linux-kernel@vger.kernel.org;
> Andrew Morton <akpm@linux-foundation.org>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>;
> netdev@vger.kernel.org; Vladimir Oltean <vladimir.oltean@nxp.com>
> Subject: Re: [PATCH net-next 3/8] lib: packing: add pack_fields() and
> unpack_fields()
> 
> On Wed, Oct 16, 2024 at 03:02:38PM +0200, Przemek Kitszel wrote:
> > On 10/11/24 20:48, Jacob Keller wrote:
> > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > >
> > > This is new API which caters to the following requirements:
> > >
> > > - Pack or unpack a large number of fields to/from a buffer with a small
> > >    code footprint. The current alternative is to open-code a large number
> > >    of calls to pack() and unpack(), or to use packing() to reduce that
> > >    number to half. But packing() is not const-correct.
> > >
> > > - Use unpacked numbers stored in variables smaller than u64. This
> > >    reduces the rodata footprint of the stored field arrays.
> > >
> > > - Perform error checking at compile time, rather than at runtime, and
> > >    return void from the API functions. To that end, we introduce
> > >    CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
> > >    fields. Note: the C preprocessor can't generate variable-length code
> > >    (loops),  as would be required for array-style definitions of struct
> > >    packed_field arrays. So the sanity checks use code generation at
> > >    compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
> > >    There are explicit macros for sanity-checking arrays of 1 packed
> > >    field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
> > >    fields. In practice, the sja1105 driver will actually need the variant
> > >    with 40 fields. This isn't as bad as it seems: feeding a 39 entry
> > >    sized array into the CHECK_PACKED_FIELDS_40() macro will actually
> > >    generate a compilation error, so mistakes are very likely to be caught
> > >    by the developer and thus are not a problem.
> > >
> > > - Reduced rodata footprint for the storage of the packed field arrays.
> > >    To that end, we have struct packed_field_s (small) and packed_field_m
> > >    (medium). More can be added as needed (unlikely for now). On these
> > >    types, the same generic pack_fields() and unpack_fields() API can be
> > >    used, thanks to the new C11 _Generic() selection feature, which can
> > >    call pack_fields_s() or pack_fields_m(), depending on the type of the
> > >    "fields" array - a simplistic form of polymorphism. It is evaluated at
> > >    compile time which function will actually be called.
> > >
> > > Over time, packing() is expected to be completely replaced either with
> > > pack() or with pack_fields().
> > >
> > > Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
> > > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > ---
> > >   include/linux/packing.h  |  69 ++++++++++++++++++++++
> > >   lib/gen_packing_checks.c |  31 ++++++++++
> > >   lib/packing.c            | 149
> ++++++++++++++++++++++++++++++++++++++++++++++-
> > >   Kbuild                   |  13 ++++-
> > >   4 files changed, 259 insertions(+), 3 deletions(-)
> >
> >
> > > diff --git a/lib/gen_packing_checks.c b/lib/gen_packing_checks.c
> > > new file mode 100644
> > > index 000000000000..3213c858c2fe
> > > --- /dev/null
> > > +++ b/lib/gen_packing_checks.c
> > > @@ -0,0 +1,31 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +#include <stdio.h>
> > > +
> > > +int main(int argc, char **argv)
> > > +{
> > > +	printf("/* Automatically generated - do not edit */\n\n");
> > > +	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
> > > +	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
> > > +
> > > +	for (int i = 1; i <= 50; i++) {
> >
> > either you missed my question, or I have missed your reply during
> > internal round of review, but:
> >
> > do we need 50? that means 1MB file, while it is 10x smaller for n=27
> 

That is partly why we generate the file instead of committing it. We could reduce this to 40, (or make it 40 once we add the sja1105 driver).

This would somewhat limit the size, at least until/unless another place in the code adds more fields to an array.

> The sja1105 driver will need checks for arrays of 40 fields, it's in the
> commit message. Though if the file size is going to generate comments
> even at this initial dimension, then maybe it isn't the best way forward...
> 
> Suggestions for how to scale up the compile-time checks?
> 
> Originally the CHECK_PACKED_FIELD_OVERLAP() weren't the cartesian
> product of all array elements. I just assumed that the field array would
> be ordered from MSB to LSB. But then, Jacob wondered why the order isn't
> from LSB to MSB. The presence/absence of the QUIRK_LSW32_IS_FIRST quirk
> seems to influence the perception of which field layout is natural.
> So the full-blown current overlap check is the compromise to use the
> same sanity check macros everywhere. Otherwise, we'd have to introduce
> CHECK_PACKED_FIELDS_5_UP() and CHECK_PACKED_FIELDS_5_DOWN(), and
> although even that would be smaller in size than the full cartesian
> product, it's harder to use IMO.
> 

Another option would be to implement something external to C to validate the fields, perhaps something in sparse? Downside being that it is less likely to be checked, so more risk that bugs creep in.

> > > +		printf("#define CHECK_PACKED_FIELDS_%d(fields, pbuflen) \\\n",
> i);
> > > +		printf("\t({ typeof(&(fields)[0]) _f = (fields); typeof(pbuflen) _len =
> (pbuflen); \\\n");
> > > +		printf("\tBUILD_BUG_ON(ARRAY_SIZE(fields) != %d); \\\n", i);
> > > +		for (int j = 0; j < i; j++) {
> > > +			int final = (i == 1);
> >
> > you could replace both @final variables and ternary operators from
> > the prints by simply moving the final "})\n" outside the loops
> 
> I don't see how, could you illustrate with some code? (assuming you're
> not proposing to change the generated output?) Even if you move the
> final "})\n" outside the loop, I'm not seeing how you would avoid
> printing the last " \\" without keeping track of the "final" variable
> anyway, point at which you're better off with the ternary than yet
> another printf() call?
> 
> > > +
> > > +			printf("\tCHECK_PACKED_FIELD(_f[%d], _len);%s\n",
> > > +			       j, final ? " })\n" : " \\");
> > > +		}
> > > +		for (int j = 1; j < i; j++) {
> > > +			for (int k = 0; k < j; k++) {
> > > +				int final = (j == i - 1) && (k == j - 1);
> > > +
> > > +				printf("\tCHECK_PACKED_FIELD_OVERLAP(_f[%d],
> _f[%d]);%s\n",
> > > +				       k, j, final ? " })\n" : " \\");
> > > +			}
> > > +		}
> > > +	}
> > > +
> > > +	printf("#endif /* GENERATED_PACKING_CHECKS_H */\n");
> > > +}
Jacob Keller Oct. 18, 2024, 9:50 p.m. UTC | #5
On 10/16/2024 3:31 PM, Keller, Jacob E wrote:
>> On Wed, Oct 16, 2024 at 03:02:38PM +0200, Przemek Kitszel wrote:
>>> On 10/11/24 20:48, Jacob Keller wrote:
>>>> From: Vladimir Oltean <vladimir.oltean@nxp.com>
>>>>
>>>> This is new API which caters to the following requirements:
>>>>
>>>> - Pack or unpack a large number of fields to/from a buffer with a small
>>>>    code footprint. The current alternative is to open-code a large number
>>>>    of calls to pack() and unpack(), or to use packing() to reduce that
>>>>    number to half. But packing() is not const-correct.
>>>>
>>>> - Use unpacked numbers stored in variables smaller than u64. This
>>>>    reduces the rodata footprint of the stored field arrays.
>>>>
>>>> - Perform error checking at compile time, rather than at runtime, and
>>>>    return void from the API functions. To that end, we introduce
>>>>    CHECK_PACKED_FIELD_*() macros to be used on the arrays of packed
>>>>    fields. Note: the C preprocessor can't generate variable-length code
>>>>    (loops),  as would be required for array-style definitions of struct
>>>>    packed_field arrays. So the sanity checks use code generation at
>>>>    compile time to $KBUILD_OUTPUT/include/generated/packing-checks.h.
>>>>    There are explicit macros for sanity-checking arrays of 1 packed
>>>>    field, 2 packed fields, 3 packed fields, ..., all the way to 50 packed
>>>>    fields. In practice, the sja1105 driver will actually need the variant
>>>>    with 40 fields. This isn't as bad as it seems: feeding a 39 entry
>>>>    sized array into the CHECK_PACKED_FIELDS_40() macro will actually
>>>>    generate a compilation error, so mistakes are very likely to be caught
>>>>    by the developer and thus are not a problem.
>>>>
>>>> - Reduced rodata footprint for the storage of the packed field arrays.
>>>>    To that end, we have struct packed_field_s (small) and packed_field_m
>>>>    (medium). More can be added as needed (unlikely for now). On these
>>>>    types, the same generic pack_fields() and unpack_fields() API can be
>>>>    used, thanks to the new C11 _Generic() selection feature, which can
>>>>    call pack_fields_s() or pack_fields_m(), depending on the type of the
>>>>    "fields" array - a simplistic form of polymorphism. It is evaluated at
>>>>    compile time which function will actually be called.
>>>>
>>>> Over time, packing() is expected to be completely replaced either with
>>>> pack() or with pack_fields().
>>>>
>>>> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
>>>> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>>>> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

<snip>

>>>> +{
>>>> +	printf("/* Automatically generated - do not edit */\n\n");
>>>> +	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
>>>> +	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
>>>> +
>>>> +	for (int i = 1; i <= 50; i++) {
>>>
>>> either you missed my question, or I have missed your reply during
>>> internal round of review, but:
>>>
>>> do we need 50? that means 1MB file, while it is 10x smaller for n=27
>>
> 
> That is partly why we generate the file instead of committing it. We could reduce this to 40, (or make it 40 once we add the sja1105 driver).
> 
> This would somewhat limit the size, at least until/unless another place in the code adds more fields to an array.
> 
>> The sja1105 driver will need checks for arrays of 40 fields, it's in the
>> commit message. Though if the file size is going to generate comments
>> even at this initial dimension, then maybe it isn't the best way forward...
>>
>> Suggestions for how to scale up the compile-time checks?
>>
>> Originally the CHECK_PACKED_FIELD_OVERLAP() weren't the cartesian
>> product of all array elements. I just assumed that the field array would
>> be ordered from MSB to LSB. But then, Jacob wondered why the order isn't
>> from LSB to MSB. The presence/absence of the QUIRK_LSW32_IS_FIRST quirk
>> seems to influence the perception of which field layout is natural.
>> So the full-blown current overlap check is the compromise to use the
>> same sanity check macros everywhere. Otherwise, we'd have to introduce
>> CHECK_PACKED_FIELDS_5_UP() and CHECK_PACKED_FIELDS_5_DOWN(), and
>> although even that would be smaller in size than the full cartesian
>> product, it's harder to use IMO.
>>
> 
> Another option would be to implement something external to C to validate the fields, perhaps something in sparse? Downside being that it is less likely to be checked, so more risk that bugs creep in.
> 
Przemek, Vladimir,

What are your thoughts on the next steps here. Do we need to go back to
the drawing board for how to handle these static checks?

Do we try to reduce the size somewhat, or try to come up with a
completely different approach to handling this? Do we revert back to
run-time checks? Investigate some alternative for static checking that
doesn't have this limitation requiring thousands of lines of macro?

I'd like to figure out what to do next.
Vladimir Oltean Oct. 19, 2024, 12:20 p.m. UTC | #6
On Fri, Oct 18, 2024 at 02:50:52PM -0700, Jacob Keller wrote:
> Przemek, Vladimir,
> 
> What are your thoughts on the next steps here. Do we need to go back to
> the drawing board for how to handle these static checks?
> 
> Do we try to reduce the size somewhat, or try to come up with a
> completely different approach to handling this? Do we revert back to
> run-time checks? Investigate some alternative for static checking that
> doesn't have this limitation requiring thousands of lines of macro?
> 
> I'd like to figure out what to do next.

Please see the attached patch for an idea on how to reduce the size
of <include/generated/packing-checks.h>, in a way that should be
satisfactory for both ice and sja1105, as well as future users.
Jacob Keller Oct. 22, 2024, 7:11 p.m. UTC | #7
On 10/19/2024 5:20 AM, Vladimir Oltean wrote:
> On Fri, Oct 18, 2024 at 02:50:52PM -0700, Jacob Keller wrote:
>> Przemek, Vladimir,
>>
>> What are your thoughts on the next steps here. Do we need to go back to
>> the drawing board for how to handle these static checks?
>>
>> Do we try to reduce the size somewhat, or try to come up with a
>> completely different approach to handling this? Do we revert back to
>> run-time checks? Investigate some alternative for static checking that
>> doesn't have this limitation requiring thousands of lines of macro?
>>
>> I'd like to figure out what to do next.
> 
> Please see the attached patch for an idea on how to reduce the size
> of <include/generated/packing-checks.h>, in a way that should be
> satisfactory for both ice and sja1105, as well as future users.

This trades off generating the macros for an increase in the config
complexity. I suppose that is slightly better than generating thousands
of lines of macro... The unused macros sit on disk in the include file,
but i don't think they would impact the deployed code...

I'm still wondering if there is a different approach we can take to
validate these structures.
Vladimir Oltean Oct. 24, 2024, 1:49 p.m. UTC | #8
On Tue, Oct 22, 2024 at 12:11:36PM -0700, Jacob Keller wrote:
> On 10/19/2024 5:20 AM, Vladimir Oltean wrote:
> > On Fri, Oct 18, 2024 at 02:50:52PM -0700, Jacob Keller wrote:
> >> Przemek, Vladimir,
> >>
> >> What are your thoughts on the next steps here. Do we need to go back to
> >> the drawing board for how to handle these static checks?
> >>
> >> Do we try to reduce the size somewhat, or try to come up with a
> >> completely different approach to handling this? Do we revert back to
> >> run-time checks? Investigate some alternative for static checking that
> >> doesn't have this limitation requiring thousands of lines of macro?
> >>
> >> I'd like to figure out what to do next.
> > 
> > Please see the attached patch for an idea on how to reduce the size
> > of <include/generated/packing-checks.h>, in a way that should be
> > satisfactory for both ice and sja1105, as well as future users.
> 
> This trades off generating the macros for an increase in the config
> complexity. I suppose that is slightly better than generating thousands
> of lines of macro... The unused macros sit on disk in the include file,
> but i don't think they would impact the deployed code...

Sorry, conflicting requirements. There will be a trade-off somewhere between
performance (having sanity checks at compile time rather than run time),
size (offer a library-level mechanism for consumer drivers to perform their
compile-time sanity checks), complexity (only generate those sanity
checks which are requested by drivers) and flexibility (support whichever
order the consumer driver desires for the arrays of packed fields).
I believe performance should not be the one which has to suffer, because
packet processing is one of the potential use cases, and I wouldn't want
to lose that through design choices. The rest.. I'm more flexible on,
but still, they have to be satisfiable in a way that I can see.

> I'm still wondering if there is a different approach we can take to
> validate these structures.

I just want to say that I don't have any alternative proposals, nor will I
explore your sparse suggestion. I don't know enough about sparse to judge
whether something as 'custom' as the packing API is in scope for its
check_call_instruction() infrastructure, how well will that solution
deal with internal kernel API changes down the line, and I don't have
the time to learn enough to prototype something to find the maintainers'
answer to these questions, either. I strongly prefer to have the static
checks inside the kernel, together with the packing() API itself, so it
can be more easily altered.

Obviously you're still free to wait for more opinions and suggestions,
or to experiment with the sparse idea yourself.

Honestly, my opinion is that if we can avoid messing too much with the
top-level Kbuild file, this pretty much enters "no one really cares"
territory, as long as the code is generated only for the pack_fields()
users. This is, in fact, one of the reasons why the patch I attached
earlier compiles and runs the code-gen only when PACKING_CHECK_FIELDS
is defined.
Jacob Keller Oct. 24, 2024, 4:38 p.m. UTC | #9
On 10/24/2024 6:49 AM, Vladimir Oltean wrote:
> I just want to say that I don't have any alternative proposals, nor will I
> explore your sparse suggestion. I don't know enough about sparse to judge
> whether something as 'custom' as the packing API is in scope for its
> check_call_instruction() infrastructure, how well will that solution
> deal with internal kernel API changes down the line, and I don't have
> the time to learn enough to prototype something to find the maintainers'
> answer to these questions, either. I strongly prefer to have the static
> checks inside the kernel, together with the packing() API itself, so it
> can be more easily altered.
> 
> Obviously you're still free to wait for more opinions and suggestions,
> or to experiment with the sparse idea yourself.
> 

I also have some thought about trying to catch this in a coccinelle
script. That has the trade-off that its only caught by running the
spatch/coccinelle scripts, but it would completely eliminate the need to
modify Kbuild at all.

I'm going to try and experiment with that direction and see if its feasible.

> Honestly, my opinion is that if we can avoid messing too much with the
> top-level Kbuild file, this pretty much enters "no one really cares"
> territory, as long as the code is generated only for the pack_fields()
> users. This is, in fact, one of the reasons why the patch I attached
> earlier compiles and runs the code-gen only when PACKING_CHECK_FIELDS
> is defined.

If I can't make that work today, I'll send a v2 with the
PACKING_CHECK_FIELDS and the other cleanups applied.
Jacob Keller Oct. 24, 2024, 8:14 p.m. UTC | #10
On 10/24/2024 9:38 AM, Jacob Keller wrote:
> I also have some thought about trying to catch this in a coccinelle
> script. That has the trade-off that its only caught by running the
> spatch/coccinelle scripts, but it would completely eliminate the need to
> modify Kbuild at all.
> 
> I'm going to try and experiment with that direction and see if its feasible.
> 

Coccinelle can't handle all of the relevant checks, since it does not
parse the macros, and can only look at the uncompiled code.

I'll have a v2 out soon with Vladimir's suggested approach.

Thanks,
Jake
Vladimir Oltean Oct. 24, 2024, 8:30 p.m. UTC | #11
On Thu, Oct 24, 2024 at 01:14:21PM -0700, Jacob Keller wrote:
> On 10/24/2024 9:38 AM, Jacob Keller wrote:
> > I also have some thought about trying to catch this in a coccinelle
> > script. That has the trade-off that its only caught by running the
> > spatch/coccinelle scripts, but it would completely eliminate the need to
> > modify Kbuild at all.
> > 
> > I'm going to try and experiment with that direction and see if its feasible.
> > 
> 
> Coccinelle can't handle all of the relevant checks, since it does not
> parse the macros, and can only look at the uncompiled code.
> 
> I'll have a v2 out soon with Vladimir's suggested approach.
> 
> Thanks,
> Jake

I should mention that my earlier patch is a big blob, but it should
really be broken down into little bits which are each squashed into
existing patches from this set, until nothing remains of it.
diff mbox series

Patch

diff --git a/include/linux/packing.h b/include/linux/packing.h
index 5d36dcd06f60..eeb23d90e5e0 100644
--- a/include/linux/packing.h
+++ b/include/linux/packing.h
@@ -26,4 +26,73 @@  int pack(void *pbuf, u64 uval, size_t startbit, size_t endbit, size_t pbuflen,
 int unpack(const void *pbuf, u64 *uval, size_t startbit, size_t endbit,
 	   size_t pbuflen, u8 quirks);
 
+#define GEN_PACKED_FIELD_MEMBERS(__type) \
+	__type startbit; \
+	__type endbit; \
+	__type offset; \
+	__type size;
+
+/* Small packed field. Use with bit offsets < 256, buffers < 32B and
+ * unpacked structures < 256B.
+ */
+struct packed_field_s {
+	GEN_PACKED_FIELD_MEMBERS(u8);
+};
+
+/* Medium packed field. Use with bit offsets < 65536, buffers < 8KB and
+ * unpacked structures < 64KB.
+ */
+struct packed_field_m {
+	GEN_PACKED_FIELD_MEMBERS(u16);
+};
+
+#define PACKED_FIELD(start, end, struct_name, struct_field) \
+	{ \
+		(start), \
+		(end), \
+		offsetof(struct_name, struct_field), \
+		sizeof_field(struct_name, struct_field), \
+	}
+
+#define CHECK_PACKED_FIELD(field, pbuflen) \
+	({ typeof(field) __f = (field); typeof(pbuflen) __len = (pbuflen); \
+	BUILD_BUG_ON(__f.startbit < __f.endbit); \
+	BUILD_BUG_ON(__f.startbit >= BITS_PER_BYTE * __len); \
+	BUILD_BUG_ON(__f.startbit - __f.endbit >= BITS_PER_BYTE * __f.size); \
+	BUILD_BUG_ON(__f.size != 1 && __f.size != 2 && __f.size != 4 && __f.size != 8); })
+
+#define CHECK_PACKED_FIELD_OVERLAP(field1, field2) \
+	({ typeof(field1) _f1 = (field1); typeof(field2) _f2 = (field2); \
+	BUILD_BUG_ON(max(_f1.endbit, _f2.endbit) <=  min(_f1.startbit, _f2.startbit)); })
+
+#include <generated/packing-checks.h>
+
+void pack_fields_s(void *pbuf, size_t pbuflen, const void *ustruct,
+		   const struct packed_field_s *fields, size_t num_fields,
+		   u8 quirks);
+
+void pack_fields_m(void *pbuf, size_t pbuflen, const void *ustruct,
+		   const struct packed_field_m *fields, size_t num_fields,
+		   u8 quirks);
+
+void unpack_fields_s(const void *pbuf, size_t pbuflen, void *ustruct,
+		     const struct packed_field_s *fields, size_t num_fields,
+		     u8 quirks);
+
+void unpack_fields_m(const void *pbuf, size_t pbuflen, void *ustruct,
+		      const struct packed_field_m *fields, size_t num_fields,
+		      u8 quirks);
+
+#define pack_fields(pbuf, pbuflen, ustruct, fields, quirks) \
+	_Generic((fields), \
+		 const struct packed_field_s * : pack_fields_s, \
+		 const struct packed_field_m * : pack_fields_m \
+		)(pbuf, pbuflen, ustruct, fields, ARRAY_SIZE(fields), quirks)
+
+#define unpack_fields(pbuf, pbuflen, ustruct, fields, quirks) \
+	_Generic((fields), \
+		 const struct packed_field_s * : unpack_fields_s, \
+		 const struct packed_field_m * : unpack_fields_m \
+		)(pbuf, pbuflen, ustruct, fields, ARRAY_SIZE(fields), quirks)
+
 #endif
diff --git a/lib/gen_packing_checks.c b/lib/gen_packing_checks.c
new file mode 100644
index 000000000000..3213c858c2fe
--- /dev/null
+++ b/lib/gen_packing_checks.c
@@ -0,0 +1,31 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+
+int main(int argc, char **argv)
+{
+	printf("/* Automatically generated - do not edit */\n\n");
+	printf("#ifndef GENERATED_PACKING_CHECKS_H\n");
+	printf("#define GENERATED_PACKING_CHECKS_H\n\n");
+
+	for (int i = 1; i <= 50; i++) {
+		printf("#define CHECK_PACKED_FIELDS_%d(fields, pbuflen) \\\n", i);
+		printf("\t({ typeof(&(fields)[0]) _f = (fields); typeof(pbuflen) _len = (pbuflen); \\\n");
+		printf("\tBUILD_BUG_ON(ARRAY_SIZE(fields) != %d); \\\n", i);
+		for (int j = 0; j < i; j++) {
+			int final = (i == 1);
+
+			printf("\tCHECK_PACKED_FIELD(_f[%d], _len);%s\n",
+			       j, final ? " })\n" : " \\");
+		}
+		for (int j = 1; j < i; j++) {
+			for (int k = 0; k < j; k++) {
+				int final = (j == i - 1) && (k == j - 1);
+
+				printf("\tCHECK_PACKED_FIELD_OVERLAP(_f[%d], _f[%d]);%s\n",
+				       k, j, final ? " })\n" : " \\");
+			}
+		}
+	}
+
+	printf("#endif /* GENERATED_PACKING_CHECKS_H */\n");
+}
diff --git a/lib/packing.c b/lib/packing.c
index 2bf81951dfc8..b7ca55269d0f 100644
--- a/lib/packing.c
+++ b/lib/packing.c
@@ -5,10 +5,37 @@ 
 #include <linux/packing.h>
 #include <linux/module.h>
 #include <linux/bitops.h>
+#include <linux/bits.h>
 #include <linux/errno.h>
 #include <linux/types.h>
 #include <linux/bitrev.h>
 
+#define __pack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks)	\
+	({									\
+		for (size_t i = 0; i < (num_fields); i++) {			\
+			typeof(&(fields)[0]) field = &(fields)[i];		\
+			u64 uval;						\
+										\
+			uval = ustruct_field_to_u64(ustruct, field->offset, field->size); \
+										\
+			__pack(pbuf, uval, field->startbit, field->endbit,	\
+			       pbuflen, quirks);				\
+		}								\
+	})
+
+#define __unpack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks)	\
+	({									\
+		for (size_t i = 0; i < (num_fields); i++) {			\
+			typeof(&(fields)[0]) field = &fields[i];		\
+			u64 uval;						\
+										\
+			__unpack(pbuf, &uval, field->startbit, field->endbit,	\
+				 pbuflen, quirks);				\
+										\
+			u64_to_ustruct_field(ustruct, field->offset, field->size, uval); \
+		}								\
+	})
+
 /**
  * calculate_box_addr - Determine physical location of byte in buffer
  * @box: Index of byte within buffer seen as a logical big-endian big number
@@ -168,8 +195,8 @@  int pack(void *pbuf, u64 uval, size_t startbit, size_t endbit, size_t pbuflen,
 }
 EXPORT_SYMBOL(pack);
 
-static void __unpack(const void *pbuf, u64 *uval, size_t startbit,
-		     size_t endbit, size_t pbuflen, u8 quirks)
+static void __unpack(const void *pbuf, u64 *uval, size_t startbit, size_t endbit,
+		     size_t pbuflen, u8 quirks)
 {
 	/* Logical byte indices corresponding to the
 	 * start and end of the field.
@@ -322,4 +349,122 @@  int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen,
 }
 EXPORT_SYMBOL(packing);
 
+static u64 ustruct_field_to_u64(const void *ustruct, size_t field_offset,
+				size_t field_size)
+{
+	switch (field_size) {
+	case 1:
+		return *((u8 *)(ustruct + field_offset));
+	case 2:
+		return *((u16 *)(ustruct + field_offset));
+	case 4:
+		return *((u32 *)(ustruct + field_offset));
+	default:
+		return *((u64 *)(ustruct + field_offset));
+	}
+}
+
+static void u64_to_ustruct_field(void *ustruct, size_t field_offset,
+				 size_t field_size, u64 uval)
+{
+	switch (field_size) {
+	case 1:
+		*((u8 *)(ustruct + field_offset)) = uval;
+		break;
+	case 2:
+		*((u16 *)(ustruct + field_offset)) = uval;
+		break;
+	case 4:
+		*((u32 *)(ustruct + field_offset)) = uval;
+		break;
+	default:
+		*((u64 *)(ustruct + field_offset)) = uval;
+		break;
+	}
+}
+
+/**
+ * pack_fields_s - Pack array of small fields
+ *
+ * @pbuf: Pointer to a buffer holding the packed value.
+ * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
+ * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
+ *	     It is expected (but not checked) that this has the same data type
+ *	     as all struct packed_field_s definitions.
+ * @fields: Array of small packed fields definition. They must not overlap.
+ * @num_fields: Length of @fields array.
+ * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
+ *	    QUIRK_MSB_ON_THE_RIGHT.
+ */
+void pack_fields_s(void *pbuf, size_t pbuflen, const void *ustruct,
+		   const struct packed_field_s *fields, size_t num_fields,
+		   u8 quirks)
+{
+	__pack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
+}
+EXPORT_SYMBOL(pack_fields_s);
+
+/**
+ * pack_fields_m - Pack array of medium fields
+ *
+ * @pbuf: Pointer to a buffer holding the packed value.
+ * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
+ * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
+ *	     It is expected (but not checked) that this has the same data type
+ *	     as all struct packed_field_s definitions.
+ * @fields: Array of medium packed fields definition. They must not overlap.
+ * @num_fields: Length of @fields array.
+ * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
+ *	    QUIRK_MSB_ON_THE_RIGHT.
+ */
+void pack_fields_m(void *pbuf, size_t pbuflen, const void *ustruct,
+		    const struct packed_field_m *fields, size_t num_fields,
+		    u8 quirks)
+{
+	__pack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
+}
+EXPORT_SYMBOL(pack_fields_m);
+
+/**
+ * unpack_fields_s - Unpack array of small fields
+ *
+ * @pbuf: Pointer to a buffer holding the packed value.
+ * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
+ * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
+ *	     It is expected (but not checked) that this has the same data type
+ *	     as all struct packed_field_s definitions.
+ * @fields: Array of small packed fields definition. They must not overlap.
+ * @num_fields: Length of @fields array.
+ * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
+ *	    QUIRK_MSB_ON_THE_RIGHT.
+ */
+void unpack_fields_s(const void *pbuf, size_t pbuflen, void *ustruct,
+		     const struct packed_field_s *fields, size_t num_fields,
+		     u8 quirks)
+{
+	__unpack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
+}
+EXPORT_SYMBOL(unpack_fields_s);
+
+/**
+ * unpack_fields_m - Unpack array of medium fields
+ *
+ * @pbuf: Pointer to a buffer holding the packed value.
+ * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
+ * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
+ *	     It is expected (but not checked) that this has the same data type
+ *	     as all struct packed_field_s definitions.
+ * @fields: Array of medium packed fields definition. They must not overlap.
+ * @num_fields: Length of @fields array.
+ * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
+ *	    QUIRK_MSB_ON_THE_RIGHT.
+ */
+void unpack_fields_m(const void *pbuf, size_t pbuflen, void *ustruct,
+		      const struct packed_field_m *fields, size_t num_fields,
+		      u8 quirks)
+{
+	__unpack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
+}
+EXPORT_SYMBOL(unpack_fields_m);
+
 MODULE_DESCRIPTION("Generic bitfield packing and unpacking");
diff --git a/Kbuild b/Kbuild
index 464b34a08f51..35a8b78b72d9 100644
--- a/Kbuild
+++ b/Kbuild
@@ -34,6 +34,17 @@  arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)
 $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
 	$(call filechk,offsets,__ASM_OFFSETS_H__)
 
+# Generate packing-checks.h
+
+hostprogs += lib/gen_packing_checks
+
+packing-checks := include/generated/packing-checks.h
+
+filechk_gen_packing_checks = lib/gen_packing_checks
+
+$(packing-checks): lib/gen_packing_checks FORCE
+	$(call filechk,gen_packing_checks)
+
 # Check for missing system calls
 
 quiet_cmd_syscalls = CALL    $<
@@ -70,7 +81,7 @@  $(atomic-checks): $(obj)/.checked-%: include/linux/atomic/%  FORCE
 # A phony target that depends on all the preparation targets
 
 PHONY += prepare
-prepare: $(offsets-file) missing-syscalls $(atomic-checks)
+prepare: $(offsets-file) missing-syscalls $(atomic-checks) $(packing-checks)
 	@:
 
 # Ordinary directory descending