mbox series

[net-next,v2,0/8] ice: support dynamic interrupt allocation

Message ID 20230322162530.3317238-1-piotr.raczynski@intel.com (mailing list archive)
Headers show
Series ice: support dynamic interrupt allocation | expand

Message

Piotr Raczynski March 22, 2023, 4:25 p.m. UTC
This patchset reimplements MSIX interrupt allocation logic to allow dynamic
interrupt allocation after MSIX has been initially enabled. This allows
current and future features to allocate and free interrupts as needed and
will help to drastically decrease number of initially preallocated
interrupts (even down to the API hard limit of 1). Although this patchset
does not change behavior in terms of actual number of allocated interrupts
during probe, it will be subject to change.

First few patches prepares to introduce dynamic allocation by moving
interrupt allocation code to separate file and update allocation API used
in the driver to the currently preferred one.

Due to the current contract between ice and irdma driver which is directly
accessing msix entries allocated by ice driver, even after moving away from
older pci_enable_msix_range function, still keep msix_entries array for
irdma use.

Next patches refactors and removes redundant code from SRIOV related logic
as it also make it easier to move away from static allocation scheme.

Last patches actually enables dynamic allocation of MSIX interrupts. First,
introduce functions to allocate and free interrupts individually. This sets
ground for the rest of the changes even if that patch still allocates the
interrupts from the preallocated pool. Since this patch starts to keep
interrupt details in ice_q_vector structure we can get rid of functions
that calculates base vector number and register offset for the interrupt
as it is equal to the interrupt index. Only keep separate register offset
functions for the VF VSIs.

Next, replace homegrown interrupt tracker with much simpler xarray based
approach. As new API always allocate interrupts one by one, also track
interrupts in the same manner.

Lastly, extend the interrupt tracker to deal both with preallocated and
dynamically allocated vectors and use pci_msix_alloc_irq_at and
pci_msix_free_irq functions. Since not all architecture supports dynamic
allocation, check it before trying to allocate a new interrupt.

As previously mentioned, this patchset does not change number of initially
allocated interrupts during init phase but now it can and will likely be
changed.

Patch 1-3 -> move code around and use newer API
Patch 4-5 -> refactor and remove redundant SRIOV code
Patch 6   -> allocate every interrupt individually
Patch 7   -> replace homegrown interrupt tracker with xarray
Patch 8   -> allow dynamic interrupt allocation

Change history:
v1 -> v2:
- ice: refactor VF control VSI interrupt handling
  - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on
    CONFIG_PCI_IOV)

Piotr Raczynski (8):
  ice: move interrupt related code to separate file
  ice: use pci_irq_vector helper function
  ice: use preferred MSIX allocation api
  ice: refactor VF control VSI interrupt handling
  ice: remove redundant SRIOV code
  ice: add individual interrupt allocation
  ice: track interrupt vectors with xarray
  ice: add dynamic interrupt allocation

 drivers/net/ethernet/intel/ice/Makefile      |   1 +
 drivers/net/ethernet/intel/ice/ice.h         |  24 +-
 drivers/net/ethernet/intel/ice/ice_arfs.c    |   5 +-
 drivers/net/ethernet/intel/ice/ice_base.c    |  36 +-
 drivers/net/ethernet/intel/ice/ice_ethtool.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_idc.c     |  54 ++-
 drivers/net/ethernet/intel/ice/ice_irq.c     | 377 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_irq.h     |  25 ++
 drivers/net/ethernet/intel/ice/ice_lib.c     | 321 ++--------------
 drivers/net/ethernet/intel/ice/ice_lib.h     |   7 +-
 drivers/net/ethernet/intel/ice/ice_main.c    | 268 ++-----------
 drivers/net/ethernet/intel/ice/ice_ptp.c     |   2 +-
 drivers/net/ethernet/intel/ice/ice_sriov.c   |  43 +--
 drivers/net/ethernet/intel/ice/ice_xsk.c     |   5 +-
 14 files changed, 553 insertions(+), 617 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.h

Comments

Jacob Keller March 22, 2023, 5:33 p.m. UTC | #1
On 3/22/2023 9:25 AM, Piotr Raczynski wrote:
> This patchset reimplements MSIX interrupt allocation logic to allow dynamic
> interrupt allocation after MSIX has been initially enabled. This allows
> current and future features to allocate and free interrupts as needed and
> will help to drastically decrease number of initially preallocated
> interrupts (even down to the API hard limit of 1). Although this patchset
> does not change behavior in terms of actual number of allocated interrupts
> during probe, it will be subject to change.
> 
> First few patches prepares to introduce dynamic allocation by moving
> interrupt allocation code to separate file and update allocation API used
> in the driver to the currently preferred one.
> 
> Due to the current contract between ice and irdma driver which is directly
> accessing msix entries allocated by ice driver, even after moving away from
> older pci_enable_msix_range function, still keep msix_entries array for
> irdma use.
> 
> Next patches refactors and removes redundant code from SRIOV related logic
> as it also make it easier to move away from static allocation scheme.
> 
> Last patches actually enables dynamic allocation of MSIX interrupts. First,
> introduce functions to allocate and free interrupts individually. This sets
> ground for the rest of the changes even if that patch still allocates the
> interrupts from the preallocated pool. Since this patch starts to keep
> interrupt details in ice_q_vector structure we can get rid of functions
> that calculates base vector number and register offset for the interrupt
> as it is equal to the interrupt index. Only keep separate register offset
> functions for the VF VSIs.
> 
> Next, replace homegrown interrupt tracker with much simpler xarray based
> approach. As new API always allocate interrupts one by one, also track
> interrupts in the same manner.
> 
> Lastly, extend the interrupt tracker to deal both with preallocated and
> dynamically allocated vectors and use pci_msix_alloc_irq_at and
> pci_msix_free_irq functions. Since not all architecture supports dynamic
> allocation, check it before trying to allocate a new interrupt.
> 
> As previously mentioned, this patchset does not change number of initially
> allocated interrupts during init phase but now it can and will likely be
> changed.
> 
> Patch 1-3 -> move code around and use newer API
> Patch 4-5 -> refactor and remove redundant SRIOV code
> Patch 6   -> allocate every interrupt individually
> Patch 7   -> replace homegrown interrupt tracker with xarray
> Patch 8   -> allow dynamic interrupt allocation
> 
> Change history:
> v1 -> v2:
> - ice: refactor VF control VSI interrupt handling
>   - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on
>     CONFIG_PCI_IOV)
> 

The other option would have been to make ice_vf_lib.h have a no-op
function that always returned NULL, since we generally would know that
there are no VF ctrl VSI if CONFIG_PCI_IOV is disabled.

But I'm ok with it being in ice_lib.c too.

Thanks,
Jake
Piotr Raczynski March 22, 2023, 6:40 p.m. UTC | #2
On Wed, Mar 22, 2023 at 10:33:31AM -0700, Jacob Keller wrote:
> 
> 
> On 3/22/2023 9:25 AM, Piotr Raczynski wrote:
> > This patchset reimplements MSIX interrupt allocation logic to allow dynamic
> > interrupt allocation after MSIX has been initially enabled. This allows
> > current and future features to allocate and free interrupts as needed and
> > will help to drastically decrease number of initially preallocated
> > interrupts (even down to the API hard limit of 1). Although this patchset
> > does not change behavior in terms of actual number of allocated interrupts
> > during probe, it will be subject to change.
> > 
> > First few patches prepares to introduce dynamic allocation by moving
> > interrupt allocation code to separate file and update allocation API used
> > in the driver to the currently preferred one.
> > 
> > Due to the current contract between ice and irdma driver which is directly
> > accessing msix entries allocated by ice driver, even after moving away from
> > older pci_enable_msix_range function, still keep msix_entries array for
> > irdma use.
> > 
> > Next patches refactors and removes redundant code from SRIOV related logic
> > as it also make it easier to move away from static allocation scheme.
> > 
> > Last patches actually enables dynamic allocation of MSIX interrupts. First,
> > introduce functions to allocate and free interrupts individually. This sets
> > ground for the rest of the changes even if that patch still allocates the
> > interrupts from the preallocated pool. Since this patch starts to keep
> > interrupt details in ice_q_vector structure we can get rid of functions
> > that calculates base vector number and register offset for the interrupt
> > as it is equal to the interrupt index. Only keep separate register offset
> > functions for the VF VSIs.
> > 
> > Next, replace homegrown interrupt tracker with much simpler xarray based
> > approach. As new API always allocate interrupts one by one, also track
> > interrupts in the same manner.
> > 
> > Lastly, extend the interrupt tracker to deal both with preallocated and
> > dynamically allocated vectors and use pci_msix_alloc_irq_at and
> > pci_msix_free_irq functions. Since not all architecture supports dynamic
> > allocation, check it before trying to allocate a new interrupt.
> > 
> > As previously mentioned, this patchset does not change number of initially
> > allocated interrupts during init phase but now it can and will likely be
> > changed.
> > 
> > Patch 1-3 -> move code around and use newer API
> > Patch 4-5 -> refactor and remove redundant SRIOV code
> > Patch 6   -> allocate every interrupt individually
> > Patch 7   -> replace homegrown interrupt tracker with xarray
> > Patch 8   -> allow dynamic interrupt allocation
> > 
> > Change history:
> > v1 -> v2:
> > - ice: refactor VF control VSI interrupt handling
> >   - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on
> >     CONFIG_PCI_IOV)
> > 
> 
> The other option would have been to make ice_vf_lib.h have a no-op
> function that always returned NULL, since we generally would know that
> there are no VF ctrl VSI if CONFIG_PCI_IOV is disabled.
> 
> But I'm ok with it being in ice_lib.c too.
> 
> Thanks,
> Jake

Thanks, that makes more sense, a little bit too hasty here.
Jacob Keller March 22, 2023, 7:03 p.m. UTC | #3
> -----Original Message-----
> From: Raczynski, Piotr <piotr.raczynski@intel.com>
> Sent: Wednesday, March 22, 2023 11:41 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Swiatkowski,
> Michal <michal.swiatkowski@intel.com>; Saleem, Shiraz
> <shiraz.saleem@intel.com>; Samudrala, Sridhar
> <sridhar.samudrala@intel.com>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Lobakin, Aleksander
> <aleksander.lobakin@intel.com>; Czapnik, Lukasz <lukasz.czapnik@intel.com>
> Subject: Re: [PATCH net-next v2 0/8] ice: support dynamic interrupt allocation
> 
> On Wed, Mar 22, 2023 at 10:33:31AM -0700, Jacob Keller wrote:
> >
> >
> > On 3/22/2023 9:25 AM, Piotr Raczynski wrote:
> > > This patchset reimplements MSIX interrupt allocation logic to allow dynamic
> > > interrupt allocation after MSIX has been initially enabled. This allows
> > > current and future features to allocate and free interrupts as needed and
> > > will help to drastically decrease number of initially preallocated
> > > interrupts (even down to the API hard limit of 1). Although this patchset
> > > does not change behavior in terms of actual number of allocated interrupts
> > > during probe, it will be subject to change.
> > >
> > > First few patches prepares to introduce dynamic allocation by moving
> > > interrupt allocation code to separate file and update allocation API used
> > > in the driver to the currently preferred one.
> > >
> > > Due to the current contract between ice and irdma driver which is directly
> > > accessing msix entries allocated by ice driver, even after moving away from
> > > older pci_enable_msix_range function, still keep msix_entries array for
> > > irdma use.
> > >
> > > Next patches refactors and removes redundant code from SRIOV related logic
> > > as it also make it easier to move away from static allocation scheme.
> > >
> > > Last patches actually enables dynamic allocation of MSIX interrupts. First,
> > > introduce functions to allocate and free interrupts individually. This sets
> > > ground for the rest of the changes even if that patch still allocates the
> > > interrupts from the preallocated pool. Since this patch starts to keep
> > > interrupt details in ice_q_vector structure we can get rid of functions
> > > that calculates base vector number and register offset for the interrupt
> > > as it is equal to the interrupt index. Only keep separate register offset
> > > functions for the VF VSIs.
> > >
> > > Next, replace homegrown interrupt tracker with much simpler xarray based
> > > approach. As new API always allocate interrupts one by one, also track
> > > interrupts in the same manner.
> > >
> > > Lastly, extend the interrupt tracker to deal both with preallocated and
> > > dynamically allocated vectors and use pci_msix_alloc_irq_at and
> > > pci_msix_free_irq functions. Since not all architecture supports dynamic
> > > allocation, check it before trying to allocate a new interrupt.
> > >
> > > As previously mentioned, this patchset does not change number of initially
> > > allocated interrupts during init phase but now it can and will likely be
> > > changed.
> > >
> > > Patch 1-3 -> move code around and use newer API
> > > Patch 4-5 -> refactor and remove redundant SRIOV code
> > > Patch 6   -> allocate every interrupt individually
> > > Patch 7   -> replace homegrown interrupt tracker with xarray
> > > Patch 8   -> allow dynamic interrupt allocation
> > >
> > > Change history:
> > > v1 -> v2:
> > > - ice: refactor VF control VSI interrupt handling
> > >   - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on
> > >     CONFIG_PCI_IOV)
> > >
> >
> > The other option would have been to make ice_vf_lib.h have a no-op
> > function that always returned NULL, since we generally would know that
> > there are no VF ctrl VSI if CONFIG_PCI_IOV is disabled.
> >
> > But I'm ok with it being in ice_lib.c too.
> >
> > Thanks,
> > Jake
> 
> Thanks, that makes more sense, a little bit too hasty here.

The main callers are in ice_lib.c so I think its ok to keep it there as well. I think we can go with v2 unless some other comments come up.

Thanks,
Jake
Jakub Kicinski March 22, 2023, 7:37 p.m. UTC | #4
On Wed, 22 Mar 2023 17:25:22 +0100 Piotr Raczynski wrote:
> This patchset reimplements MSIX interrupt allocation logic to allow dynamic
> interrupt allocation after MSIX has been initially enabled. This allows
> current and future features to allocate and free interrupts as needed and
> will help to drastically decrease number of initially preallocated
> interrupts (even down to the API hard limit of 1). Although this patchset
> does not change behavior in terms of actual number of allocated interrupts
> during probe, it will be subject to change.

Have you seen the mlx5 patch set? I haven't read in detail but seems
like you're working on the same stuff so could be worth cross-checking
each other's work:

https://lore.kernel.org/all/20230320175144.153187-1-saeed@kernel.org/
Jacob Keller March 22, 2023, 8:05 p.m. UTC | #5
On 3/22/2023 12:37 PM, Jakub Kicinski wrote:
> On Wed, 22 Mar 2023 17:25:22 +0100 Piotr Raczynski wrote:
>> This patchset reimplements MSIX interrupt allocation logic to allow dynamic
>> interrupt allocation after MSIX has been initially enabled. This allows
>> current and future features to allocate and free interrupts as needed and
>> will help to drastically decrease number of initially preallocated
>> interrupts (even down to the API hard limit of 1). Although this patchset
>> does not change behavior in terms of actual number of allocated interrupts
>> during probe, it will be subject to change.
> 
> Have you seen the mlx5 patch set? I haven't read in detail but seems
> like you're working on the same stuff so could be worth cross-checking
> each other's work:
> 
> https://lore.kernel.org/all/20230320175144.153187-1-saeed@kernel.org/

Thanks for the pointer. I read through that series just now, and it
looks good.

We are doing similar work, though a big difference is that MLX has
converted *all* allocations to be dynamic. This, I think, works for them
because they already have a pool allocation scheme and basically treated
the available vectors as a resource across the function. We didn't
really do that before (we reserved total based on feature and tried to
ensure we don't starve some features), and so converting everything to
dynamic in one-go wasn't done here. Instead, we open the way to allow
dynamic, with a plan to look at refactoring and removing the initial
allocations at a later stage.

In addition the MLX code already has good data structure for tracking
vectors, where as we had the mess that was our res_tracker which was
poorly implemented.

Overall I think the end goal is going to be similar: use dynamic
allocation when possible.

For ice, we need to complete this work, then follow up with some work to
make handling of vector allocation failure better in the case of things
such as vectors for default PF queues. The current code basically
determines the number of queues we create based on the number of vectors
we got. That won't work with dynamic, and we need to instead pick number
of queues, and be able to handle vector exhaustion if we happen to have
limited number of vectors available. We also need to make sure other
uses of vectors can handle failure appropriately, and deal with the
iRDMA<->ice interfaces which currently assume static allocation as well.

My long term goal would be that the driver only allocates one vector at
load which is for the control vector used for miscellaneous interrupts,
and everything else is allocated and released dynamically based on
feature use. But we have a bit of a ways to get there due to the above
limitations in current design. We need to make sure every path has a
suitable path to report failure on vector exhaustion, and that every
consumer has available knobs or parameters to allow the system to be
reconfigured to prevent exhaustion.