[net-next,v2,0/9] lib: packing: introduce and use (un)pack_fields

Message ID 20241025-packing-pack-fields-and-ice-implementation-v2-0-734776c88e40@intel.com (mailing list archive)

Message

Jacob Keller Oct. 26, 2024, 12:04 a.m. UTC
This series improves the packing library with a new API for packing or
unpacking a large number of fields at once with minimal code footprint. The
API is then used to replace bespoke packing logic in the ice driver,
preparing it to handle unpacking in the future. Finally, the ice driver has
a few other cleanups related to the packing logic.

The pack_fields and unpack_fields functions have the following improvements
over the existing pack() and unpack() API:

 1. Packing or unpacking a large number of fields takes significantly less
    code. The .text size shrinks considerably, at the cost of a much smaller
    increase in the .data size.

 2. The unpacked data can be stored in sizes smaller than u64 variables.
    This reduces the storage requirement both for runtime data structures,
    and for the rodata defining the fields. This scales with the number of
    fields used.

 3. Most of the error checking is done at compile time via the
    CHECK_PACKED_FIELD_* macros, rather than at runtime. This avoids wasted
    computation, *and* catches errors in the field definitions immediately
    instead of only after the offending code executes.

The actual packing and unpacking code still operates on u64 variables.
However, the values are converted to the appropriate field sizes when
storing or reading the data from the buffer.
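
To illustrate the table-driven idea, here is a minimal userspace sketch. It
is not the kernel API: struct field_desc, FIELD(), pack_all(), unpack_all()
and struct demo_ctx are hypothetical names, and the bit layout is simplified
to plain little-endian bit numbering with no quirks. The loops work on a u64
value and only narrow or widen it at the struct member boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: one entry of read-only data per field. */
struct field_desc {
	uint16_t startbit; /* most significant bit in the packed buffer */
	uint16_t endbit;   /* least significant bit in the packed buffer */
	uint16_t offset;   /* offsetof() the member in the unpacked struct */
	uint8_t size;      /* sizeof() the member: 1, 2, 4 or 8 bytes */
};

#define FIELD(start, end, type, member) \
	{ (start), (end), offsetof(type, member), sizeof(((type *)0)->member) }

/* Widen a struct member of any supported size to u64. */
static uint64_t load_member(const void *ustruct, const struct field_desc *f)
{
	const unsigned char *p = (const unsigned char *)ustruct + f->offset;
	uint8_t v8; uint16_t v16; uint32_t v32; uint64_t v64;

	switch (f->size) {
	case 1: memcpy(&v8, p, 1); return v8;
	case 2: memcpy(&v16, p, 2); return v16;
	case 4: memcpy(&v32, p, 4); return v32;
	default: memcpy(&v64, p, 8); return v64;
	}
}

/* Narrow a u64 value back to the member's own size. */
static void store_member(void *ustruct, const struct field_desc *f, uint64_t val)
{
	unsigned char *p = (unsigned char *)ustruct + f->offset;
	uint8_t v8 = (uint8_t)val; uint16_t v16 = (uint16_t)val;
	uint32_t v32 = (uint32_t)val;

	switch (f->size) {
	case 1: memcpy(p, &v8, 1); break;
	case 2: memcpy(p, &v16, 2); break;
	case 4: memcpy(p, &v32, 4); break;
	default: memcpy(p, &val, 8); break;
	}
}

/* Pack every field in the table; bit i lives in byte i / 8, bit i % 8. */
static void pack_all(uint8_t *buf, size_t buflen, const void *ustruct,
		     const struct field_desc *fields, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t val = load_member(ustruct, &fields[i]);

		for (unsigned int bit = fields[i].endbit; bit <= fields[i].startbit; bit++) {
			uint8_t mask = (uint8_t)(1u << (bit % 8));

			assert(bit / 8 < buflen);
			if ((val >> (bit - fields[i].endbit)) & 1)
				buf[bit / 8] |= mask;
			else
				buf[bit / 8] &= (uint8_t)~mask;
		}
	}
}

/* Unpack every field in the table into the differently sized members. */
static void unpack_all(const uint8_t *buf, size_t buflen, void *ustruct,
		       const struct field_desc *fields, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t val = 0;

		for (unsigned int bit = fields[i].endbit; bit <= fields[i].startbit; bit++) {
			assert(bit / 8 < buflen);
			if ((buf[bit / 8] >> (bit % 8)) & 1)
				val |= 1ULL << (bit - fields[i].endbit);
		}
		store_member(ustruct, &fields[i], val);
	}
}

/* Hypothetical unpacked context with members narrower than u64. */
struct demo_ctx {
	uint8_t enable;
	uint16_t qlen;
	uint32_t base;
};

static const struct field_desc demo_fields[] = {
	FIELD(0, 0, struct demo_ctx, enable),
	FIELD(13, 1, struct demo_ctx, qlen),
	FIELD(45, 14, struct demo_ctx, base),
};
```

Adding a field in this sketch costs one table entry rather than one more
call site, which is where the .text versus .data trade-off comes from.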

One complexity is that a separate CHECK_PACKED_FIELD_* macro needs to be
defined for each possible size of the packed_fields array. This is because
we don't have a good way to handle the ordering checks otherwise: the C
pre-processor is unable to generate and run variable length loops at
compile time.
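
As a rough illustration of why one macro per array size is needed, the
sketch below spells out the checks for two and three fields by hand. These
are not the macros from this series: CHECK_FIELD, CHECK_ORDER,
CHECK_FIELDS_2, CHECK_FIELDS_3 and the enum constants are hypothetical, and
the sketch assumes fields are listed in ascending bit order.

```c
#include <assert.h>

/* A single field is sane when its start bit is not below its end bit. */
#define CHECK_FIELD(start, end) \
	_Static_assert((start) >= (end), "field start bit below end bit")

/* With ascending order, the next field must begin above the previous one. */
#define CHECK_ORDER(prev_start, next_end) \
	_Static_assert((next_end) > (prev_start), "fields overlap or are out of order")

/* Without loops, every array size needs its own fully unrolled macro. */
#define CHECK_FIELDS_2(s0, e0, s1, e1) \
	CHECK_FIELD(s0, e0); CHECK_FIELD(s1, e1); CHECK_ORDER(s0, e1)

#define CHECK_FIELDS_3(s0, e0, s1, e1, s2, e2) \
	CHECK_FIELDS_2(s0, e0, s1, e1); CHECK_FIELD(s2, e2); CHECK_ORDER(s1, e2)

/* Hypothetical layout: a 1-bit, a 13-bit and a 32-bit field. */
enum {
	EN_START = 0, EN_END = 0,
	QLEN_START = 13, QLEN_END = 1,
	BASE_START = 45, BASE_END = 14,
};

/* Swapping or overlapping any two ranges here breaks the build. */
CHECK_FIELDS_3(EN_START, EN_END, QLEN_START, QLEN_END, BASE_START, BASE_END);

/* Width in bits of a field given its inclusive start and end bits. */
static int field_width(int start, int end)
{
	assert(start >= end);
	return start - end + 1;
}
```

Each additional array size repeats the previous checks plus two more, so
the amount of macro text grows quickly with the largest supported size.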

This is a significant amount of macro code, ~22,000 lines. To ensure it is
correct and to avoid needing to store it directly in the kernel history,
the file is generated as <generated/packing-checks.h> by a small C program,
gen_packing_checks. Generating it requires updating the top-level Kbuild
process to compile gen_packing_checks and run it to produce the
packing-checks.h file.
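
The generator approach can be sketched roughly as follows. This is not the
gen_packing_checks source: emit_check_macro(), append() and the emitted
macro names are hypothetical, and the real header's contents may look quite
different. The point is only that a loop in a host program can unroll what
the pre-processor cannot.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append text to a NUL-terminated buffer without overflowing it. */
static void append(char *buf, size_t len, const char *text)
{
	size_t used = strlen(buf);

	if (used + 1 < len)
		strncat(buf, text, len - used - 1);
}

/*
 * Write the text of a hypothetical CHECK_FIELDS_<n> macro into buf:
 * one per-field check for each index, plus one ordering check for each
 * adjacent pair of indices.
 */
static void emit_check_macro(char *buf, size_t len, int n)
{
	char line[96];

	assert(n >= 1 && len > 0);
	buf[0] = '\0';

	snprintf(line, sizeof(line), "#define CHECK_FIELDS_%d(fields)", n);
	append(buf, len, line);

	for (int i = 0; i < n; i++) {
		/* Each statement starts by continuing the previous line. */
		snprintf(line, sizeof(line), " \\\n\tCHECK_FIELD(fields, %d);", i);
		append(buf, len, line);
		if (i > 0) {
			snprintf(line, sizeof(line),
				 " \\\n\tCHECK_ORDER(fields, %d, %d);", i - 1, i);
			append(buf, len, line);
		}
	}
	append(buf, len, "\n");
}
```

A real generator would call something like this in a loop over every
supported array size and write the result to the generated header.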

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Changes in v2:
- Add my missing sign-off to the first patch
- Update the descriptions for a few patches
- Only generate CHECK_PACKED_FIELDS_N when another module selects it
- Add a new patch introducing wrapper structures for the packed Tx and Rx
  queue context, suggested by Vladimir.
- Drop the now unnecessary macros in ice, thanks to the new types
- Link to v1: https://lore.kernel.org/r/20241011-packing-pack-fields-and-ice-implementation-v1-0-d9b1f7500740@intel.com

---
Jacob Keller (6):
      ice: remove int_q_state from ice_tlan_ctx
      ice: use structures to keep track of queue context size
      ice: use <linux/packing.h> for Tx and Rx queue context data
      ice: reduce size of queue context fields
      ice: move prefetch enable to ice_setup_rx_ctx
      ice: cleanup Rx queue context programming functions

Vladimir Oltean (3):
      lib: packing: create __pack() and __unpack() variants without error checking
      lib: packing: demote truncation error in pack() to a warning in __pack()
      lib: packing: add pack_fields() and unpack_fields()

 drivers/net/ethernet/intel/ice/ice_adminq_cmd.h |  11 +-
 drivers/net/ethernet/intel/ice/ice_common.h     |   5 +-
 drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h  |  49 +---
 include/linux/packing.h                         |  73 +++++
 drivers/net/dsa/sja1105/sja1105_static_config.c |   8 +-
 drivers/net/ethernet/intel/ice/ice_base.c       |   6 +-
 drivers/net/ethernet/intel/ice/ice_common.c     | 299 +++++---------------
 lib/gen_packing_checks.c                        | 193 +++++++++++++
 lib/packing.c                                   | 285 ++++++++++++++-----
 Kbuild                                          | 168 ++++++++++-
 drivers/net/ethernet/intel/Kconfig              |   3 +
 lib/Kconfig                                     | 361 +++++++++++++++++++++++-
 12 files changed, 1105 insertions(+), 356 deletions(-)
---
base-commit: 03fc07a24735e0be8646563913abf5f5cb71ad19
change-id: 20241004-packing-pack-fields-and-ice-implementation-b17c7ce8e373

Best regards,

Comments

Vladimir Oltean Oct. 29, 2024, 6:28 p.m. UTC | #1
For the set:

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>

Thanks for working on this!

Daniel Machon Oct. 31, 2024, 9:30 a.m. UTC | #2

Hi Jacob,

As for the rest of the patches:

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>

I can confirm that smatch does not complain anymore, after Dan's recent
commit that skips the macros.

/Daniel