diff mbox series

[net,v1] ice: do not reserve resources for RDMA when disabled

Message ID 20241114000105.703740-1-jbrandeb@kernel.org (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [net,v1] ice: do not reserve resources for RDMA when disabled | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 1 blamed authors not CCed: shiraz.saleem@intel.com; 1 maintainers not CCed: shiraz.saleem@intel.com
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 223 this patch: 223
netdev/source_inline success Was 0 now: 0

Commit Message

Jesse Brandeburg Nov. 14, 2024, midnight UTC
From: Jesse Brandeburg <jbrandeb@kernel.org>

If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a
built-in, then don't let the driver reserve resources for RDMA.

Do this by avoiding enabling the capability when scanning hardware
capabilities.

Fixes: d25a0fc41c1f ("ice: Initialize RDMA support")
CC: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org>
---
 drivers/net/ethernet/intel/ice/ice_common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623

Comments

Leon Romanovsky Nov. 14, 2024, 11:32 a.m. UTC | #1
On Wed, Nov 13, 2024 at 04:00:56PM -0800, jbrandeb@kernel.org wrote:
> From: Jesse Brandeburg <jbrandeb@kernel.org>
> 
> If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a
> built-in, then don't let the driver reserve resources for RDMA.
> 
> Do this by avoiding enabling the capability when scanning hardware
> capabilities.
> 
> Fixes: d25a0fc41c1f ("ice: Initialize RDMA support")
> CC: Dave Ertman <david.m.ertman@intel.com>
> Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org>
> ---
>  drivers/net/ethernet/intel/ice/ice_common.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> index 009716a12a26..70be07ad2c10 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -2174,7 +2174,8 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps,
>  			  caps->nvm_unified_update);
>  		break;
>  	case ICE_AQC_CAPS_RDMA:
> -		caps->rdma = (number == 1);
> +		if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))

ice_eth is not dependent on CONFIG_INFINIBAND_IRDMA and can be built
perfectly without irdma. So technically, you disabled RDMA for such
kernels and users won't be able to load out-of-tree built module.

I don't care about out-of-tree code, but it is worth to add into commit
message, so users will know what to do it IRDMA stopped to work for them.

Thanks

> +			caps->rdma = (number == 1);
>  		ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, caps->rdma);
>  		break;
>  	case ICE_AQC_CAPS_MAX_MTU:
> 
> base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623
> -- 
> 2.39.5
> 
>
Ertman, David M Nov. 14, 2024, 6:06 p.m. UTC | #2
> -----Original Message-----
> From: jbrandeb@kernel.org <jbrandeb@kernel.org>
> Sent: Wednesday, November 13, 2024 4:01 PM
> To: netdev@vger.kernel.org
> Cc: Brandeburg, Jesse <jbrandeburg@cloudflare.com>; Jesse Brandeburg
> <jbrandeb@kernel.org>; intel-wired-lan@lists.osuosl.org; Ertman, David M
> <david.m.ertman@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Andrew Lunn <andrew+netdev@lunn.ch>;
> David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>
> Subject: [PATCH net v1] ice: do not reserve resources for RDMA when
> disabled
> 
> From: Jesse Brandeburg <jbrandeb@kernel.org>
> 
> If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a
> built-in, then don't let the driver reserve resources for RDMA.
> 
> Do this by avoiding enabling the capability when scanning hardware
> capabilities.
> 
> Fixes: d25a0fc41c1f ("ice: Initialize RDMA support")
> CC: Dave Ertman <david.m.ertman@intel.com>
> Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org>
> ---
>  drivers/net/ethernet/intel/ice/ice_common.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c
> b/drivers/net/ethernet/intel/ice/ice_common.c
> index 009716a12a26..70be07ad2c10 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -2174,7 +2174,8 @@ ice_parse_common_caps(struct ice_hw *hw, struct
> ice_hw_common_caps *caps,
>  			  caps->nvm_unified_update);
>  		break;
>  	case ICE_AQC_CAPS_RDMA:
> -		caps->rdma = (number == 1);
> +		if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))
> +			caps->rdma = (number == 1);
>  		ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix,

The HW caps struct should always accurately reflect the capabilities of the HW being probed.  Since this
is a kernel configuration (i.e. software) consideration, the more appropriate approach would be to control
the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting.

Thanks,
DaveE
Przemek Kitszel Nov. 15, 2024, 8:51 a.m. UTC | #3
On 11/14/24 01:00, jbrandeb@kernel.org wrote:
> From: Jesse Brandeburg <jbrandeb@kernel.org>
> 
> If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a
> built-in, then don't let the driver reserve resources for RDMA.
> 
> Do this by avoiding enabling the capability when scanning hardware
> capabilities.
> 
> Fixes: d25a0fc41c1f ("ice: Initialize RDMA support")
> CC: Dave Ertman <david.m.ertman@intel.com>
> Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org>
> ---

Hi Jesse, it's good to hear back from you :)

we are already working on resolving the issue of miss-allocating
too many resources (would be good to know what beyond MSI-x'es
you care about) for RDMA in the default (likely non-RDMA heavy) case. 
Here is a series from Michal that lets user to manage it a bit:
https://lore.kernel.org/netdev/20241114122009.97416-3-michal.swiatkowski@linux.intel.com/T/

and we want to post another one later that changes defaults from the
current "grab a lot when there are many CPUs" policy to more resonable
Jesse Brandeburg Nov. 15, 2024, 6:46 p.m. UTC | #4
On 11/14/24 10:06 AM, Ertman, David M wrote:
>>   	case ICE_AQC_CAPS_RDMA:
>> -		caps->rdma = (number == 1);
>> +		if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))
>> +			caps->rdma = (number == 1);
>>   		ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix,
> 
> The HW caps struct should always accurately reflect the capabilities of the HW being probed.  Since this

why must it accurately reflect the capability of the hardware? The 
driver state and capability is a reflection of the combination of both, 
so I'm not sure what the point of your statement.

> is a kernel configuration (i.e. software) consideration, the more appropriate approach would be to control
> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting.

I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is 
blindly set by the LAG code, if the cap.rdma is enabled. see 
ice_set_rdma_cap(). This means the disable won't stick.

Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as 
a gate and as a state, which is a design issue. This leaves no choice 
but to implement the way I did in this v1 patch. Do you see any other 
option to make a simple change that is safe for backporting to stable?

Thanks,
Jesse
Jesse Brandeburg Nov. 15, 2024, 6:48 p.m. UTC | #5
On Fri, Nov 15, 2024 at 12:51 AM Przemek Kitszel
<przemyslaw.kitszel@intel.com> wrote:
>
> On 11/14/24 01:00, jbrandeb@kernel.org wrote:
> > From: Jesse Brandeburg <jbrandeb@kernel.org>
> >
> > If the CONFIG_INFINIBAND_IRDMA symbol is not enabled as a module or a
> > built-in, then don't let the driver reserve resources for RDMA.
> >
> > Do this by avoiding enabling the capability when scanning hardware
> > capabilities.
> >
> > Fixes: d25a0fc41c1f ("ice: Initialize RDMA support")
> > CC: Dave Ertman <david.m.ertman@intel.com>
> > Signed-off-by: Jesse Brandeburg <jbrandeb@kernel.org>
> > ---
>
> Hi Jesse, it's good to hear back from you :)


Hi Przemek! You too.

> we are already working on resolving the issue of miss-allocating
> too many resources (would be good to know what beyond MSI-x'es
> you care about) for RDMA in the default (likely non-RDMA heavy) case.
> Here is a series from Michal that lets user to manage it a bit:
> https://lore.kernel.org/netdev/20241114122009.97416-3-michal.swiatkowski@linux.intel.com/T/

I agree, but that whole series is far too big to backport to stable, right?

> and we want to post another one later that changes defaults from the
> current "grab a lot when there are many CPUs" policy to more resonable

I'm looking forward to those series, and in fact had been looking to
backport one of the patches from michal's series, but found that for
us at least with RDMA disabled this limit seemed far simpler, and also
I doubt it will conflict with the more complicated work of managing
features when all are enabled.

Please see my other reply to dave (and yes I'm replying from two
different accounts, as I'm figuring out the best way to work here at
my new job.)
Jesse Brandeburg Nov. 22, 2024, 1:50 a.m. UTC | #6
On 11/15/24 10:46 AM, Jesse Brandeburg wrote:
> On 11/14/24 10:06 AM, Ertman, David M wrote:
>>>       case ICE_AQC_CAPS_RDMA:
>>> -        caps->rdma = (number == 1);
>>> +        if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))
>>> +            caps->rdma = (number == 1);
>>>           ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix,
>>
>> The HW caps struct should always accurately reflect the capabilities 
>> of the HW being probed.  Since this
> 
> why must it accurately reflect the capability of the hardware? The 
> driver state and capability is a reflection of the combination of both, 
> so I'm not sure what the point of your statement.
> 
>> is a kernel configuration (i.e. software) consideration, the more 
>> appropriate approach would be to control
>> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting.
> 
> I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is 
> blindly set by the LAG code, if the cap.rdma is enabled. see 
> ice_set_rdma_cap(). This means the disable won't stick.
> 
> Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as 
> a gate and as a state, which is a design issue. This leaves no choice 
> but to implement the way I did in this v1 patch. Do you see any other 
> option to make a simple change that is safe for backporting to stable?

Any comments here Dave?
Ertman, David M Nov. 22, 2024, 10:35 p.m. UTC | #7
> -----Original Message-----
> From: Jesse Brandeburg <jesse.brandeburg@linux.dev>
> Sent: Thursday, November 21, 2024 5:50 PM
> To: Ertman, David M <david.m.ertman@intel.com>; jbrandeb@kernel.org;
> netdev@vger.kernel.org
> Subject: Re: [PATCH net v1] ice: do not reserve resources for RDMA when
> disabled
> 
> On 11/15/24 10:46 AM, Jesse Brandeburg wrote:
> > On 11/14/24 10:06 AM, Ertman, David M wrote:
> >>>       case ICE_AQC_CAPS_RDMA:
> >>> -        caps->rdma = (number == 1);
> >>> +        if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))
> >>> +            caps->rdma = (number == 1);
> >>>           ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix,
> >>
> >> The HW caps struct should always accurately reflect the capabilities
> >> of the HW being probed.  Since this
> >
> > why must it accurately reflect the capability of the hardware? The
> > driver state and capability is a reflection of the combination of both,
> > so I'm not sure what the point of your statement.
> >
> >> is a kernel configuration (i.e. software) consideration, the more
> >> appropriate approach would be to control
> >> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting.
> >
> > I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is
> > blindly set by the LAG code, if the cap.rdma is enabled. see
> > ice_set_rdma_cap(). This means the disable won't stick.
> >
> > Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as
> > a gate and as a state, which is a design issue. This leaves no choice
> > but to implement the way I did in this v1 patch. Do you see any other
> > option to make a simple change that is safe for backporting to stable?
> 
> Any comments here Dave? 

Jesse,

Looking at the FLAG as used in the ice driver, it is used as an ephemeral state for
the enablement of RDMA (gated by the value of the capabilities struct).  So, I agree
that a kernel-wide config value should not be tied to it as well.

Since the kernel config value will not change for the life of the driver, it should be
OK to alter the capabilities struct value based on the kernel's configuration.

Sorry for the thrash.

DaveE

Acked-by: Dave Ertman <david.m.ertman@intel.com>
Przemek Kitszel Nov. 25, 2024, 8:32 a.m. UTC | #8
[CC IWL]

On 11/22/24 23:35, Ertman, David M wrote:
>> -----Original Message-----
>> From: Jesse Brandeburg <jesse.brandeburg@linux.dev>
>> Sent: Thursday, November 21, 2024 5:50 PM
>> To: Ertman, David M <david.m.ertman@intel.com>; jbrandeb@kernel.org;
>> netdev@vger.kernel.org
>> Subject: Re: [PATCH net v1] ice: do not reserve resources for RDMA when
>> disabled
>>
>> On 11/15/24 10:46 AM, Jesse Brandeburg wrote:
>>> On 11/14/24 10:06 AM, Ertman, David M wrote:
>>>>>        case ICE_AQC_CAPS_RDMA:
>>>>> -        caps->rdma = (number == 1);
>>>>> +        if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))
>>>>> +            caps->rdma = (number == 1);
>>>>>            ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix,
>>>>
>>>> The HW caps struct should always accurately reflect the capabilities
>>>> of the HW being probed.  Since this
>>>
>>> why must it accurately reflect the capability of the hardware? The
>>> driver state and capability is a reflection of the combination of both,
>>> so I'm not sure what the point of your statement.
>>>
>>>> is a kernel configuration (i.e. software) consideration, the more
>>>> appropriate approach would be to control
>>>> the PF flag "ICE_FLAG_RDMA_ENA" based on the kernel CONFIG setting.
>>>
>>> I started making the changes you suggested, but the ICE_FLAG_RDMA_ENA is
>>> blindly set by the LAG code, if the cap.rdma is enabled. see
>>> ice_set_rdma_cap(). This means the disable won't stick.
>>>
>>> Unless I'm misunderstanding something, ICE_FLAG_RDMA_ENA is used both as
>>> a gate and as a state, which is a design issue. This leaves no choice
>>> but to implement the way I did in this v1 patch. Do you see any other
>>> option to make a simple change that is safe for backporting to stable?
>>
>> Any comments here Dave?
> 
> Jesse,
> 
> Looking at the FLAG as used in the ice driver, it is used as an ephemeral state for
> the enablement of RDMA (gated by the value of the capabilities struct).  So, I agree
> that a kernel-wide config value should not be tied to it as well.
> 
> Since the kernel config value will not change for the life of the driver, it should be
> OK to alter the capabilities struct value based on the kernel's configuration.
> 
> Sorry for the thrash.
> 
> DaveE
> 
> Acked-by: Dave Ertman <david.m.ertman@intel.com>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 009716a12a26..70be07ad2c10 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -2174,7 +2174,8 @@  ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps,
 			  caps->nvm_unified_update);
 		break;
 	case ICE_AQC_CAPS_RDMA:
-		caps->rdma = (number == 1);
+		if (IS_ENABLED(CONFIG_INFINIBAND_IRDMA))
+			caps->rdma = (number == 1);
 		ice_debug(hw, ICE_DBG_INIT, "%s: rdma = %d\n", prefix, caps->rdma);
 		break;
 	case ICE_AQC_CAPS_MAX_MTU: