Message ID | 20241114000105.703740-1-jbrandeb@kernel.org (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v1] ice: do not reserve resources for RDMA when disabled | expand |
On Wed, Nov 13, 2024 at 04:00:56PM -0800, jbrandeb@kernel.org wrote: > From: Jesse Brandeburg <jbrandeb@kernel.org> > > If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a > built-in, then don't let the driver reserve resources for RDMA. > > Do this by avoiding enabling the capability when scanning hardware > capabilities. > > Fixes: d25a0fc41c1f ("ice: Initialize RDMA support") > CC: Dave Ertman <david.m.ertman@intel.com> > Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org> > --- > drivers/net/ethernet/intel/ice/ice_common.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c > index 009716a12a26..70be07ad2c10 100644 > --- a/drivers/net/ethernet/intel/ice/ice_common.c > +++ b/drivers/net/ethernet/intel/ice/ice_common.c > @@ -2174,7 +2174,8 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps, > caps->nvm_unified_update); > break; > case ICE_AQC_CAPS_RDMA: > - caps->rdma = (number == 1); > + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) ice_eth is not dependent on CONFIG_INFINIBAND_IRDMA and can be built perfectly without irdma. So technically, you disabled RDMA for such kernels and users won't be able to load out-of-tree built module. I don't care about out-of-tree code, but it is worth to add into commit message, so users will know what to do it IRDMA stopped to work for them. Thanks > + caps->rdma = (number == 1); > ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, caps->rdma); > break; > case ICE_AQC_CAPS_MAX_MTU: > > base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623 > -- > 2.39.5 > >
> -----Original Message----- > From: jbrandeb@kernel.org <jbrandeb@kernel.org> > Sent: Wednesday, November 13, 2024 4:01 PM > To: netdev@vger.kernel.org > Cc: Brandeburg, Jesse <jbrandeburg@cloudflare.com>; Jesse Brandeburg > <jbrandeb@kernel.org>; intel-wired-lan@lists.osuosl.org; Ertman, David M > <david.m.ertman@intel.com>; Nguyen, Anthony L > <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw > <przemyslaw.kitszel@intel.com>; Andrew Lunn <andrew+netdev@lunn.ch>; > David S. Miller <davem@davemloft.net>; Eric Dumazet > <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni > <pabeni@redhat.com> > Subject: [PATCH net v1] ice: do not reserve resources for RDMA when > disabled > > From: Jesse Brandeburg <jbrandeb@kernel.org> > > If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a > built-in, then don't let the driver reserve resources for RDMA. > > Do this by avoiding enabling the capability when scanning hardware > capabilities. > > Fixes: d25a0fc41c1f ("ice: Initialize RDMA support") > CC: Dave Ertman <david.m.ertman@intel.com> > Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org> > --- > drivers/net/ethernet/intel/ice/ice_common.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/intel/ice/ice_common.c > b/drivers/net/ethernet/intel/ice/ice_common.c > index 009716a12a26..70be07ad2c10 100644 > --- a/drivers/net/ethernet/intel/ice/ice_common.c > +++ b/drivers/net/ethernet/intel/ice/ice_common.c > @@ -2174,7 +2174,8 @@ ice_parse_common_caps(struct ice_hw *hw, struct > ice_hw_common_caps *caps, > caps->nvm_unified_update); > break; > case ICE_AQC_CAPS_RDMA: > - caps->rdma = (number == 1); > + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) > + caps->rdma = (number == 1); > ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, The HW caps struct should always accurately reflect the capabilities of the HW being probed. Since this is a kernel configuration (i.e. software) consideration, the more appropriate approach would be to control the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting. Thanks, DaveE
On 11/14/24 01:00, jbrandeb@kernel.org wrote: > From: Jesse Brandeburg <jbrandeb@kernel.org> > > If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a > built-in, then don't let the driver reserve resources for RDMA. > > Do this by avoiding enabling the capability when scanning hardware > capabilities. > > Fixes: d25a0fc41c1f ("ice: Initialize RDMA support") > CC: Dave Ertman <david.m.ertman@intel.com> > Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org> > --- Hi Jesse, it's good to hear back from you :) we are already working on resolving the issue of miss-allocating too many resources (would be good to know what beyond MSI-x'es you care about) for RDMA in the default (likely non-RDMA heavy) case. Here is a series from Michal that lets user to manage it a bit: https://lore.kernel.org/netdev/20241114122009.97416-3-michal.swiatkowski@linux.intel.com/T/ and we want to post another one later that changes defaults from the current "grab a lot when there are many CPUs" policy to more resonable
On 11/14/24 10:06 AM, Ertman, David M wrote: >> case ICE_AQC_CAPS_RDMA: >> - caps->rdma = (number == 1); >> + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) >> + caps->rdma = (number == 1); >> ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, > > The HW caps struct should always accurately reflect the capabilities of the HW being probed. Since this why must it accurately reflect the capability of the hardware? The driver state and capability is a reflection of the combination of both, so I'm not sure what the point of your statement. > is a kernel configuration (i.e. software) consideration, the more appropriate approach would be to control > the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting. I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is blindly set by the LAG code, if the cap.rdma is enabled. see ice_set_rdma_cap(). This means the disable won't stick. Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as a gate and as a state, which is a design issue. This leaves no choice but to implement the way I did in this v1 patch. Do you see any other option to make a simple change that is safe for backporting to stable? Thanks, Jesse
On Fri, Nov 15, 2024 at 12:51 AM Przemek Kitszel <przemyslaw.kitszel@intel.com> wrote: > > On 11/14/24 01:00, jbrandeb@kernel.org wrote: > > From: Jesse Brandeburg <jbrandeb@kernel.org> > > > > If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a > > built-in, then don't let the driver reserve resources for RDMA. > > > > Do this by avoiding enabling the capability when scanning hardware > > capabilities. > > > > Fixes: d25a0fc41c1f ("ice: Initialize RDMA support") > > CC: Dave Ertman <david.m.ertman@intel.com> > > Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org> > > --- > > Hi Jesse, it's good to hear back from you :) Hi Przemek! You too. > we are already working on resolving the issue of miss-allocating > too many resources (would be good to know what beyond MSI-x'es > you care about) for RDMA in the default (likely non-RDMA heavy) case. > Here is a series from Michal that lets user to manage it a bit: > https://lore.kernel.org/netdev/20241114122009.97416-3-michal.swiatkowski@linux.intel.com/T/ I agree, but that whole series is far too big to backport to stable, right? > and we want to post another one later that changes defaults from the > current "grab a lot when there are many CPUs" policy to more resonable I'm looking forward to those series, and in fact had been looking to backport one of the patches from michal's series, but found that for us at least with RDMA disabled this limit seemed far simpler, and also I doubt it will conflict with the more complicated work of managing features when all are enabled. Please see my other reply to dave (and yes I'm replying from two different accounts, as I'm figuring out the best way to work here at my new job.)
On 11/15/24 10:46 AM, Jesse Brandeburg wrote: > On 11/14/24 10:06 AM, Ertman, David M wrote: >>> case ICE_AQC_CAPS_RDMA: >>> - caps->rdma = (number == 1); >>> + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) >>> + caps->rdma = (number == 1); >>> ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, >> >> The HW caps struct should always accurately reflect the capabilities >> of the HW being probed. Since this > > why must it accurately reflect the capability of the hardware? The > driver state and capability is a reflection of the combination of both, > so I'm not sure what the point of your statement. > >> is a kernel configuration (i.e. software) consideration, the more >> appropriate approach would be to control >> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting. > > I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is > blindly set by the LAG code, if the cap.rdma is enabled. see > ice_set_rdma_cap(). This means the disable won't stick. > > Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as > a gate and as a state, which is a design issue. This leaves no choice > but to implement the way I did in this v1 patch. Do you see any other > option to make a simple change that is safe for backporting to stable? Any comments here Dave?
> -----Original Message----- > From: Jesse Brandeburg <jesse.brandeburg@linux.dev> > Sent: Thursday, November 21, 2024 5:50 PM > To: Ertman, David M <david.m.ertman@intel.com>; jbrandeb@kernel.org; > netdev@vger.kernel.org > Subject: Re: [PATCH net v1] ice: do not reserve resources for RDMA when > disabled > > On 11/15/24 10:46 AM, Jesse Brandeburg wrote: > > On 11/14/24 10:06 AM, Ertman, David M wrote: > >>> case ICE_AQC_CAPS_RDMA: > >>> - caps->rdma = (number == 1); > >>> + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) > >>> + caps->rdma = (number == 1); > >>> ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, > >> > >> The HW caps struct should always accurately reflect the capabilities > >> of the HW being probed. Since this > > > > why must it accurately reflect the capability of the hardware? The > > driver state and capability is a reflection of the combination of both, > > so I'm not sure what the point of your statement. > > > >> is a kernel configuration (i.e. software) consideration, the more > >> appropriate approach would be to control > >> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting. > > > > I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is > > blindly set by the LAG code, if the cap.rdma is enabled. see > > ice_set_rdma_cap(). This means the disable won't stick. > > > > Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as > > a gate and as a state, which is a design issue. This leaves no choice > > but to implement the way I did in this v1 patch. Do you see any other > > option to make a simple change that is safe for backporting to stable? > > Any comments here Dave? Jesse, Looking at the FLAG as used in the ice driver, it is used as an ephemeral state for the enablement of RDMA (gated by the value of the capabilities struct). So, I agree that a kernel-wide config value should not be tied to it as well. Since the kernel config value will not change for the life of the driver, it should be OK to alter the capabilities struct value based on the kernel's configuration. Sorry for the thrash. DaveE Acked-by: Dave Ertman <david.m.ertman@intel.com>
[CC IWL] On 11/22/24 23:35, Ertman, David M wrote: >> -----Original Message----- >> From: Jesse Brandeburg <jesse.brandeburg@linux.dev> >> Sent: Thursday, November 21, 2024 5:50 PM >> To: Ertman, David M <david.m.ertman@intel.com>; jbrandeb@kernel.org; >> netdev@vger.kernel.org >> Subject: Re: [PATCH net v1] ice: do not reserve resources for RDMA when >> disabled >> >> On 11/15/24 10:46 AM, Jesse Brandeburg wrote: >>> On 11/14/24 10:06 AM, Ertman, David M wrote: >>>>> case ICE_AQC_CAPS_RDMA: >>>>> - caps->rdma = (number == 1); >>>>> + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) >>>>> + caps->rdma = (number == 1); >>>>> ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, >>>> >>>> The HW caps struct should always accurately reflect the capabilities >>>> of the HW being probed. Since this >>> >>> why must it accurately reflect the capability of the hardware? The >>> driver state and capability is a reflection of the combination of both, >>> so I'm not sure what the point of your statement. >>> >>>> is a kernel configuration (i.e. software) consideration, the more >>>> appropriate approach would be to control >>>> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting. >>> >>> I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is >>> blindly set by the LAG code, if the cap.rdma is enabled. see >>> ice_set_rdma_cap(). This means the disable won't stick. >>> >>> Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as >>> a gate and as a state, which is a design issue. This leaves no choice >>> but to implement the way I did in this v1 patch. Do you see any other >>> option to make a simple change that is safe for backporting to stable? >> >> Any comments here Dave? > > Jesse, > > Looking at the FLAG as used in the ice driver, it is used as an ephemeral state for > the enablement of RDMA (gated by the value of the capabilities struct). So, I agree > that a kernel-wide config value should not be tied to it as well. > > Since the kernel config value will not change for the life of the driver, it should be > OK to alter the capabilities struct value based on the kernel's configuration. > > Sorry for the thrash. > > DaveE > > Acked-by: Dave Ertman <david.m.ertman@intel.com>
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 009716a12a26..70be07ad2c10 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -2174,7 +2174,8 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps, caps->nvm_unified_update); break; case ICE_AQC_CAPS_RDMA: - caps->rdma = (number == 1); + if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA)) + caps->rdma = (number == 1); ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, caps->rdma); break; case ICE_AQC_CAPS_MAX_MTU: