Message ID | 20230322162530.3317238-1-piotr.raczynski@intel.com (mailing list archive) |
---|---|
Headers | show |
Series | ice: support dynamic interrupt allocation | expand |
On 3/22/2023 9:25 AM, Piotr Raczynski wrote: > This patchset reimplements MSIX interrupt allocation logic to allow dynamic > interrupt allocation after MSIX has been initially enabled. This allows > current and future features to allocate and free interrupts as needed and > will help to drastically decrease number of initially preallocated > interrupts (even down to the API hard limit of 1). Although this patchset > does not change behavior in terms of actual number of allocated interrupts > during probe, it will be subject to change. > > First few patches prepares to introduce dynamic allocation by moving > interrupt allocation code to separate file and update allocation API used > in the driver to the currently preferred one. > > Due to the current contract between ice and irdma driver which is directly > accessing msix entries allocated by ice driver, even after moving away from > older pci_enable_msix_range function, still keep msix_entries array for > irdma use. > > Next patches refactors and removes redundant code from SRIOV related logic > as it also make it easier to move away from static allocation scheme. > > Last patches actually enables dynamic allocation of MSIX interrupts. First, > introduce functions to allocate and free interrupts individually. This sets > ground for the rest of the changes even if that patch still allocates the > interrupts from the preallocated pool. Since this patch starts to keep > interrupt details in ice_q_vector structure we can get rid of functions > that calculates base vector number and register offset for the interrupt > as it is equal to the interrupt index. Only keep separate register offset > functions for the VF VSIs. > > Next, replace homegrown interrupt tracker with much simpler xarray based > approach. As new API always allocate interrupts one by one, also track > interrupts in the same manner. > > Lastly, extend the interrupt tracker to deal both with preallocated and > dynamically allocated vectors and use pci_msix_alloc_irq_at and > pci_msix_free_irq functions. Since not all architecture supports dynamic > allocation, check it before trying to allocate a new interrupt. > > As previously mentioned, this patchset does not change number of initially > allocated interrupts during init phase but now it can and will likely be > changed. > > Patch 1-3 -> move code around and use newer API > Patch 4-5 -> refactor and remove redundant SRIOV code > Patch 6 -> allocate every interrupt individually > Patch 7 -> replace homegrown interrupt tracker with xarray > Patch 8 -> allow dynamic interrupt allocation > > Change history: > v1 -> v2: > - ice: refactor VF control VSI interrupt handling > - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on > CONFIG_PCI_IOV) > The other option would have been to make ice_vf_lib.h have a no-op function that always returned NULL, since we generally would know that there are no VF ctrl VSI if CONFIG_PCI_IOV is disabled. But I'm ok with it being in ice_lib.c too. Thanks, Jake
On Wed, Mar 22, 2023 at 10:33:31AM -0700, Jacob Keller wrote: > > > On 3/22/2023 9:25 AM, Piotr Raczynski wrote: > > This patchset reimplements MSIX interrupt allocation logic to allow dynamic > > interrupt allocation after MSIX has been initially enabled. This allows > > current and future features to allocate and free interrupts as needed and > > will help to drastically decrease number of initially preallocated > > interrupts (even down to the API hard limit of 1). Although this patchset > > does not change behavior in terms of actual number of allocated interrupts > > during probe, it will be subject to change. > > > > First few patches prepares to introduce dynamic allocation by moving > > interrupt allocation code to separate file and update allocation API used > > in the driver to the currently preferred one. > > > > Due to the current contract between ice and irdma driver which is directly > > accessing msix entries allocated by ice driver, even after moving away from > > older pci_enable_msix_range function, still keep msix_entries array for > > irdma use. > > > > Next patches refactors and removes redundant code from SRIOV related logic > > as it also make it easier to move away from static allocation scheme. > > > > Last patches actually enables dynamic allocation of MSIX interrupts. First, > > introduce functions to allocate and free interrupts individually. This sets > > ground for the rest of the changes even if that patch still allocates the > > interrupts from the preallocated pool. Since this patch starts to keep > > interrupt details in ice_q_vector structure we can get rid of functions > > that calculates base vector number and register offset for the interrupt > > as it is equal to the interrupt index. Only keep separate register offset > > functions for the VF VSIs. > > > > Next, replace homegrown interrupt tracker with much simpler xarray based > > approach. As new API always allocate interrupts one by one, also track > > interrupts in the same manner. > > > > Lastly, extend the interrupt tracker to deal both with preallocated and > > dynamically allocated vectors and use pci_msix_alloc_irq_at and > > pci_msix_free_irq functions. Since not all architecture supports dynamic > > allocation, check it before trying to allocate a new interrupt. > > > > As previously mentioned, this patchset does not change number of initially > > allocated interrupts during init phase but now it can and will likely be > > changed. > > > > Patch 1-3 -> move code around and use newer API > > Patch 4-5 -> refactor and remove redundant SRIOV code > > Patch 6 -> allocate every interrupt individually > > Patch 7 -> replace homegrown interrupt tracker with xarray > > Patch 8 -> allow dynamic interrupt allocation > > > > Change history: > > v1 -> v2: > > - ice: refactor VF control VSI interrupt handling > > - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on > > CONFIG_PCI_IOV) > > > > The other option would have been to make ice_vf_lib.h have a no-op > function that always returned NULL, since we generally would know that > there are no VF ctrl VSI if CONFIG_PCI_IOV is disabled. > > But I'm ok with it being in ice_lib.c too. > > Thanks, > Jake Thanks, that makes more sense, a little bit too hasty here.
> -----Original Message----- > From: Raczynski, Piotr <piotr.raczynski@intel.com> > Sent: Wednesday, March 22, 2023 11:41 AM > To: Keller, Jacob E <jacob.e.keller@intel.com> > Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Swiatkowski, > Michal <michal.swiatkowski@intel.com>; Saleem, Shiraz > <shiraz.saleem@intel.com>; Samudrala, Sridhar > <sridhar.samudrala@intel.com>; Brandeburg, Jesse > <jesse.brandeburg@intel.com>; Lobakin, Aleksander > <aleksander.lobakin@intel.com>; Czapnik, Lukasz <lukasz.czapnik@intel.com> > Subject: Re: [PATCH net-next v2 0/8] ice: support dynamic interrupt allocation > > On Wed, Mar 22, 2023 at 10:33:31AM -0700, Jacob Keller wrote: > > > > > > On 3/22/2023 9:25 AM, Piotr Raczynski wrote: > > > This patchset reimplements MSIX interrupt allocation logic to allow dynamic > > > interrupt allocation after MSIX has been initially enabled. This allows > > > current and future features to allocate and free interrupts as needed and > > > will help to drastically decrease number of initially preallocated > > > interrupts (even down to the API hard limit of 1). Although this patchset > > > does not change behavior in terms of actual number of allocated interrupts > > > during probe, it will be subject to change. > > > > > > First few patches prepares to introduce dynamic allocation by moving > > > interrupt allocation code to separate file and update allocation API used > > > in the driver to the currently preferred one. > > > > > > Due to the current contract between ice and irdma driver which is directly > > > accessing msix entries allocated by ice driver, even after moving away from > > > older pci_enable_msix_range function, still keep msix_entries array for > > > irdma use. > > > > > > Next patches refactors and removes redundant code from SRIOV related logic > > > as it also make it easier to move away from static allocation scheme. > > > > > > Last patches actually enables dynamic allocation of MSIX interrupts. First, > > > introduce functions to allocate and free interrupts individually. This sets > > > ground for the rest of the changes even if that patch still allocates the > > > interrupts from the preallocated pool. Since this patch starts to keep > > > interrupt details in ice_q_vector structure we can get rid of functions > > > that calculates base vector number and register offset for the interrupt > > > as it is equal to the interrupt index. Only keep separate register offset > > > functions for the VF VSIs. > > > > > > Next, replace homegrown interrupt tracker with much simpler xarray based > > > approach. As new API always allocate interrupts one by one, also track > > > interrupts in the same manner. > > > > > > Lastly, extend the interrupt tracker to deal both with preallocated and > > > dynamically allocated vectors and use pci_msix_alloc_irq_at and > > > pci_msix_free_irq functions. Since not all architecture supports dynamic > > > allocation, check it before trying to allocate a new interrupt. > > > > > > As previously mentioned, this patchset does not change number of initially > > > allocated interrupts during init phase but now it can and will likely be > > > changed. > > > > > > Patch 1-3 -> move code around and use newer API > > > Patch 4-5 -> refactor and remove redundant SRIOV code > > > Patch 6 -> allocate every interrupt individually > > > Patch 7 -> replace homegrown interrupt tracker with xarray > > > Patch 8 -> allow dynamic interrupt allocation > > > > > > Change history: > > > v1 -> v2: > > > - ice: refactor VF control VSI interrupt handling > > > - move ice_get_vf_ctrl_vsi to ice_lib.c (ice_vf_lib.c depends on > > > CONFIG_PCI_IOV) > > > > > > > The other option would have been to make ice_vf_lib.h have a no-op > > function that always returned NULL, since we generally would know that > > there are no VF ctrl VSI if CONFIG_PCI_IOV is disabled. > > > > But I'm ok with it being in ice_lib.c too. > > > > Thanks, > > Jake > > Thanks, that makes more sense, a little bit too hasty here. The main callers are in ice_lib.c so I think its ok to keep it there as well. I think we can go with v2 unless some other comments come up. Thanks, Jake
On Wed, 22 Mar 2023 17:25:22 +0100 Piotr Raczynski wrote: > This patchset reimplements MSIX interrupt allocation logic to allow dynamic > interrupt allocation after MSIX has been initially enabled. This allows > current and future features to allocate and free interrupts as needed and > will help to drastically decrease number of initially preallocated > interrupts (even down to the API hard limit of 1). Although this patchset > does not change behavior in terms of actual number of allocated interrupts > during probe, it will be subject to change. Have you seen the mlx5 patch set? I haven't read in detail but seems like you're working on the same stuff so could be worth cross-checking each other's work: https://lore.kernel.org/all/20230320175144.153187-1-saeed@kernel.org/
On 3/22/2023 12:37 PM, Jakub Kicinski wrote: > On Wed, 22 Mar 2023 17:25:22 +0100 Piotr Raczynski wrote: >> This patchset reimplements MSIX interrupt allocation logic to allow dynamic >> interrupt allocation after MSIX has been initially enabled. This allows >> current and future features to allocate and free interrupts as needed and >> will help to drastically decrease number of initially preallocated >> interrupts (even down to the API hard limit of 1). Although this patchset >> does not change behavior in terms of actual number of allocated interrupts >> during probe, it will be subject to change. > > Have you seen the mlx5 patch set? I haven't read in detail but seems > like you're working on the same stuff so could be worth cross-checking > each other's work: > > https://lore.kernel.org/all/20230320175144.153187-1-saeed@kernel.org/ Thanks for the pointer. I read through that series just now, and it looks good. We are doing similar work, though a big difference is that MLX has converted *all* allocations to be dynamic. This, I think, works for them because they already have a pool allocation scheme and basically treated the available vectors as a resource across the function. We didn't really do that before (we reserved total based on feature and tried to ensure we don't starve some features), and so converting everything to dynamic in one-go wasn't done here. Instead, we open the way to allow dynamic, with a plan to look at refactoring and removing the initial allocations at a later stage. In addition the MLX code already has good data structure for tracking vectors, where as we had the mess that was our res_tracker which was poorly implemented. Overall I think the end goal is going to be similar: use dynamic allocation when possible. For ice, we need to complete this work, then follow up with some work to make handling of vector allocation failure better in the case of things such as vectors for default PF queues. The current code basically determines the number of queues we create based on the number of vectors we got. That won't work with dynamic, and we need to instead pick number of queues, and be able to handle vector exhaustion if we happen to have limited number of vectors available. We also need to make sure other uses of vectors can handle failure appropriately, and deal with the iRDMA<->ice interfaces which currently assume static allocation as well. My long term goal would be that the driver only allocates one vector at load which is for the control vector used for miscellaneous interrupts, and everything else is allocated and released dynamically based on feature use. But we have a bit of a ways to get there due to the above limitations in current design. We need to make sure every path has a suitable path to report failure on vector exhaustion, and that every consumer has available knobs or parameters to allow the system to be reconfigured to prevent exhaustion.