[net-next] ethtool: Fail number of channels change when it conflicts with rxnfc

Message ID 20221031100016.6028-1-gal@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net-next] ethtool: Fail number of channels change when it conflicts with rxnfc

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers           warning  11 maintainers not CCed: idosch@nvidia.com alexandru.tachici@analog.com andrew@lunn.ch linux@rempel-privat.de leon@kernel.org pabeni@redhat.com bagasdotme@gmail.com amcohen@nvidia.com edumazet@google.com chenhao288@hisilicon.com lkp@intel.com
netdev/build_clang              success  Errors and warnings before: 5 this patch: 5
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2 this patch: 2
netdev/checkpatch               warning  WARNING: Block comments use a trailing */ on a separate line; WARNING: line length of 115 exceeds 80 columns
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Gal Pressman Oct. 31, 2022, 10 a.m. UTC
Similar to what we do with the hash indirection table [1], fail the
operation when network flow classification rules forward traffic to
channels at or beyond the requested number of channels.
Without this, traffic could be directed to channels which no longer
exist (and be dropped) after the number of channels is changed.

[1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
---
n.b:
Another desirable fix would be handling the fact that additional RSS
contexts could be created and point to higher channels.
---
 net/ethtool/channels.c | 19 ++++++++----
 net/ethtool/common.c   | 65 ++++++++++++++++++++++++++++++++++++++++++
 net/ethtool/common.h   |  1 +
 net/ethtool/ioctl.c    | 17 +++++++----
 4 files changed, 91 insertions(+), 11 deletions(-)
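
The check described in the commit message can be sketched in plain C as follows. This is a minimal illustration, not the kernel code: check_new_rx_count() is a hypothetical name, and the two "max in use" arguments stand in for the values the patch obtains from ethtool_get_max_rxfh_channel() and ethtool_get_max_rxnfc_channel(). Channel indices are zero-based, so the new Rx count has to be strictly greater than the highest index still referenced.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: returns 0 if the requested Rx channel count still
 * covers every channel referenced by the indirection table and by the
 * rxnfc rules, -EINVAL otherwise.
 */
static int check_new_rx_count(uint32_t combined_count, uint32_t rx_count,
			      uint32_t max_rxfh_in_use,
			      uint64_t max_rxnfc_in_use)
{
	/* widen before adding so the sum cannot wrap */
	uint64_t new_rx = (uint64_t)combined_count + rx_count;

	/* indices are zero-based: index N needs at least N + 1 channels */
	if (new_rx <= max_rxfh_in_use)
		return -EINVAL;
	if (new_rx <= max_rxnfc_in_use)
		return -EINVAL;

	return 0;
}
```

For example, with a rule steering traffic to channel 4, a request for 4 combined channels would be rejected, while a request for 5 would pass.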

Comments

Jakub Kicinski Nov. 1, 2022, 1:23 a.m. UTC | #1
On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
> Similar to what we do with the hash indirection table [1], when network
> flow classification rules are forwarding traffic to channels greater
> than the requested number of channels, fail the operation.
> Without this, traffic could be directed to channels which no longer
> exist (dropped) after changing number of channels.
> 
> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")

Have you made sure there are no magic encodings of queue numbers this
would break? I seem to recall some vendors used magic queue values to
redirect to VFs before TC and switchdev. If that's the case we'd need
to locate the drivers that do that and flag them so we can enforce this
only going forward?

Jacob Keller Nov. 1, 2022, 4:50 p.m. UTC | #2
On 10/31/2022 6:23 PM, Jakub Kicinski wrote:
> On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
>> Similar to what we do with the hash indirection table [1], when network
>> flow classification rules are forwarding traffic to channels greater
>> than the requested number of channels, fail the operation.
>> Without this, traffic could be directed to channels which no longer
>> exist (dropped) after changing number of channels.
>>
>> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")
> 
> Have you made sure there are no magic encodings of queue numbers this
> would break? I seem to recall some vendors used magic queue values to
> redirect to VFs before TC and switchdev. If that's the case we'd need
> to locate the drivers that do that and flag them so we can enforce this
> only going forward?

I believe these all use the same encoding defined by 
ethtool_get_flow_spec_ring and ethtool_get_flow_spec_vf, at least that's 
what ixgbe uses.

This sets the lower 32 bits as the queue index and the next 8 bits as 
the VF identifier as defined by ETHTOOL_RX_FLOW_SPEC_RING and 
ETHTOOL_RX_FLOW_SPEC_RING_VF.

It looks like this change should just exempt rules whose ring_cookie 
yields a non-zero ethtool_get_flow_spec_vf?
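
As a rough illustration of that encoding, here is a standalone sketch of how a ring_cookie splits into a queue index and a VF identifier. The mask values are copied from the uapi ethtool header as I understand them and redefined so the snippet builds outside the kernel tree; flow_spec_ring(), flow_spec_vf() and make_ring_cookie() are hypothetical stand-ins, not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/ethtool.h */
#define ETHTOOL_RX_FLOW_SPEC_RING        0x00000000FFFFFFFFULL
#define ETHTOOL_RX_FLOW_SPEC_RING_VF     0x000000FF00000000ULL
#define ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF 32

/* Lower 32 bits: queue index */
static inline uint64_t flow_spec_ring(uint64_t ring_cookie)
{
	return ring_cookie & ETHTOOL_RX_FLOW_SPEC_RING;
}

/* Next 8 bits: VF identifier, 0 when the rule targets the PF itself */
static inline uint64_t flow_spec_vf(uint64_t ring_cookie)
{
	return (ring_cookie & ETHTOOL_RX_FLOW_SPEC_RING_VF) >>
	       ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF;
}

/* Hypothetical helper for building a cookie in examples and tests */
static inline uint64_t make_ring_cookie(unsigned int vf, uint32_t ring)
{
	assert(vf <= 0xFF);	/* only 8 bits are reserved for the VF */
	return ((uint64_t)vf << ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF) | ring;
}
```

A cookie with a non-zero VF field compares as a very large number, so taking the maximum of raw ring_cookie values would reject channel changes that have nothing to do with the PF queues.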

We maybe ought to mark this whole thing as deprecated now given the 
advances in TC.

Thanks,
Jake

Gal Pressman Nov. 2, 2022, 12:44 p.m. UTC | #3
On 01/11/2022 18:50, Jacob Keller wrote:
>
>
> On 10/31/2022 6:23 PM, Jakub Kicinski wrote:
>> On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
>>> Similar to what we do with the hash indirection table [1], when network
>>> flow classification rules are forwarding traffic to channels greater
>>> than the requested number of channels, fail the operation.
>>> Without this, traffic could be directed to channels which no longer
>>> exist (dropped) after changing number of channels.
>>>
>>> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS
>>> doesn't conflict with GS{RXFH}")
>>
>> Have you made sure there are no magic encodings of queue numbers this
>> would break? I seem to recall some vendors used magic queue values to
>> redirect to VFs before TC and switchdev. If that's the case we'd need
>> to locate the drivers that do that and flag them so we can enforce this
>> only going forward?
>
> I believe these all use the same encoding defined by
> ethtool_get_flow_spec_ring and ethtool_get_flow_spec_vf, at least
> that's what ixgbe uses.
>
> This sets the lower 32 bits as the queue index and the next 8 bits as
> the VF identifier as defined by ETHTOOL_RX_FLOW_SPEC_RING and
> ETHTOOL_RX_FLOW_SPEC_RING_VF.
>
> It looks like this change should just exempt ring_cookie with
> ethtool_get_flow_spec_vf as non-zero?
>
> We maybe ought to mark this whole thing as deprecated now given the
> advances in TC.

Oh, I was not aware of this encoding scheme. Shouldn't VF rules be added
on the VF interface?
What is this used for?

How does the PF verify the rules are in range for the VF queues?
Anyway, I'll go ahead and verify that VF == 0 in the if statement.
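
One possible reading of "verify that VF == 0" is sketched below as a standalone loop over a rule table. It is an assumption about the shape of the follow-up, not the actual next revision: struct rule, max_pf_ring_in_use() and sample_rules are hypothetical, and the constants are redefined from the uapi header values as I understand them so the snippet builds on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/ethtool.h */
#define RX_CLS_FLOW_DISC             0xffffffffffffffffULL
#define RX_CLS_FLOW_WAKE             0xfffffffffffffffeULL
#define FLOW_RSS                     0x20000000
#define ETHTOOL_RX_FLOW_SPEC_RING    0x00000000FFFFFFFFULL
#define ETHTOOL_RX_FLOW_SPEC_RING_VF 0x000000FF00000000ULL

/* Hypothetical, trimmed-down stand-in for an rxnfc rule */
struct rule {
	uint32_t flow_type;
	uint64_t ring_cookie;
};

/* Highest PF queue index referenced by any rule; drop, wake, RSS-context
 * and VF-directed rules are ignored.
 */
static uint64_t max_pf_ring_in_use(const struct rule *rules, size_t n)
{
	uint64_t max_ring = 0;
	size_t i;

	assert(rules || n == 0);

	for (i = 0; i < n; i++) {
		uint64_t cookie = rules[i].ring_cookie;
		uint64_t ring;

		if (cookie == RX_CLS_FLOW_DISC || cookie == RX_CLS_FLOW_WAKE)
			continue;
		if (rules[i].flow_type & FLOW_RSS)
			continue;
		if (cookie & ETHTOOL_RX_FLOW_SPEC_RING_VF)
			continue;	/* VF != 0: not a PF queue */

		ring = cookie & ETHTOOL_RX_FLOW_SPEC_RING;
		if (ring > max_ring)
			max_ring = ring;
	}

	return max_ring;
}

/* Illustrative rule table only; 0x01 and 0x02 are TCP_V4_FLOW/UDP_V4_FLOW */
static const struct rule sample_rules[] = {
	{ 0x01, 2 },				/* PF queue 2 */
	{ 0x01, RX_CLS_FLOW_DISC },		/* drop rule */
	{ 0x01, (1ULL << 32) | 9 },		/* VF 1, queue 9 */
	{ 0x01 | FLOW_RSS, 12 },		/* RSS context rule */
	{ 0x02, 5 },				/* PF queue 5 */
};
```

With this table only the two PF rules count, so the highest channel in use is 5 and any request for 6 or more Rx channels would be accepted.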

Thanks for the review!

Jacob Keller Nov. 2, 2022, 7:38 p.m. UTC | #4
> -----Original Message-----
> From: Gal Pressman <gal@nvidia.com>
> Sent: Wednesday, November 2, 2022 5:45 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>; Jakub Kicinski <kuba@kernel.org>
> Cc: David S. Miller <davem@davemloft.net>; netdev@vger.kernel.org; Tariq
> Toukan <tariqt@nvidia.com>
> Subject: Re: [PATCH net-next] ethtool: Fail number of channels change when it
> conflicts with rxnfc
> 
> On 01/11/2022 18:50, Jacob Keller wrote:
> >
> >
> > On 10/31/2022 6:23 PM, Jakub Kicinski wrote:
> >> On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
> >>> Similar to what we do with the hash indirection table [1], when network
> >>> flow classification rules are forwarding traffic to channels greater
> >>> than the requested number of channels, fail the operation.
> >>> Without this, traffic could be directed to channels which no longer
> >>> exist (dropped) after changing number of channels.
> >>>
> >>> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS
> >>> doesn't conflict with GS{RXFH}")
> >>
> >> Have you made sure there are no magic encodings of queue numbers this
> >> would break? I seem to recall some vendors used magic queue values to
> >> redirect to VFs before TC and switchdev. If that's the case we'd need
> >> to locate the drivers that do that and flag them so we can enforce this
> >> only going forward?
> >
> > I believe these all use the same encoding defined by
> > ethtool_get_flow_spec_ring and ethtool_get_flow_spec_vf, at least
> > that's what ixgbe uses.
> >
> > This sets the lower 32 bits as the queue index and the next 8 bits as
> > the VF identifier as defined by ETHTOOL_RX_FLOW_SPEC_RING and
> > ETHTOOL_RX_FLOW_SPEC_RING_VF.
> >
> > It looks like this change should just exempt ring_cookie with
> > ethtool_get_flow_spec_vf as non-zero?
> >
> > We maybe ought to mark this whole thing as deprecated now given the
> > advances in TC.
> 
> Oh, I was not aware of this encoding scheme, shouldn't VF rules be added
> on the VF interface?
> What is this used for?
> 

It's rather old, and the idea was to allow forwarding traffic by rules in the host. It predates switchdev, which I think would be the modern method now.

> How does the PF verify the rules are in range for the VF queues?

I believe the PF driver has to check this, and I think it's sort of a hack that is independent of the PF queues. I think it would depend on the driver. ixgbe knows how many VF queues there are, and which ones belong to which VF.

I suspect we don't want to use this on new drivers or add it to any existing drivers that don't already have it.

> Anyway, I'll go ahead and verify that VF == 0 in the if statement.
> 
> Thanks for the review!

Patch

diff --git a/net/ethtool/channels.c b/net/ethtool/channels.c
index 403158862011..c7e37130647e 100644
--- a/net/ethtool/channels.c
+++ b/net/ethtool/channels.c
@@ -116,9 +116,10 @@  int ethnl_set_channels(struct sk_buff *skb, struct genl_info *info)
 	struct ethtool_channels channels = {};
 	struct ethnl_req_info req_info = {};
 	struct nlattr **tb = info->attrs;
-	u32 err_attr, max_rx_in_use = 0;
+	u32 err_attr, max_rxfh_in_use;
 	const struct ethtool_ops *ops;
 	struct net_device *dev;
+	u64 max_rxnfc_in_use;
 	int ret;
 
 	ret = ethnl_parse_header_dev_get(&req_info,
@@ -189,15 +190,23 @@  int ethnl_set_channels(struct sk_buff *skb, struct genl_info *info)
 	}
 
 	/* ensure the new Rx count fits within the configured Rx flow
-	 * indirection table settings
+	 * indirection table/rxnfc settings
 	 */
-	if (netif_is_rxfh_configured(dev) &&
-	    !ethtool_get_max_rxfh_channel(dev, &max_rx_in_use) &&
-	    (channels.combined_count + channels.rx_count) <= max_rx_in_use) {
+	if (ethtool_get_max_rxnfc_channel(dev, &max_rxnfc_in_use))
+		max_rxnfc_in_use = 0;
+	if (!netif_is_rxfh_configured(dev) ||
+	    ethtool_get_max_rxfh_channel(dev, &max_rxfh_in_use))
+		max_rxfh_in_use = 0;
+	if (channels.combined_count + channels.rx_count <= max_rxfh_in_use) {
 		ret = -EINVAL;
 		GENL_SET_ERR_MSG(info, "requested channel counts are too low for existing indirection table settings");
 		goto out_ops;
 	}
+	if (channels.combined_count + channels.rx_count <= max_rxnfc_in_use) {
+		ret = -EINVAL;
+		GENL_SET_ERR_MSG(info, "requested channel counts are too low for existing ntuple filter settings");
+		goto out_ops;
+	}
 
 	/* Disabling channels, query zero-copy AF_XDP sockets */
 	from_channel = channels.combined_count +
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index ee3e02da0013..c2790d29f97c 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -512,6 +512,71 @@  int __ethtool_get_link(struct net_device *dev)
 	return netif_running(dev) && dev->ethtool_ops->get_link(dev);
 }
 
+static int ethtool_get_rxnfc_rule_count(struct net_device *dev)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	struct ethtool_rxnfc info = {
+		.cmd = ETHTOOL_GRXCLSRLCNT,
+	};
+	int err;
+
+	err = ops->get_rxnfc(dev, &info, NULL);
+	if (err)
+		return err;
+
+	return info.rule_cnt;
+}
+
+int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	struct ethtool_rxnfc *info;
+	int err, i, rule_cnt;
+	u64 max_ring = 0;
+
+	if (!ops->get_rxnfc)
+		return -EOPNOTSUPP;
+
+	rule_cnt = ethtool_get_rxnfc_rule_count(dev);
+	if (rule_cnt <= 0)
+		return -EINVAL;
+
+	info = kvzalloc(struct_size(info, rule_locs, rule_cnt), GFP_KERNEL);
+	if (!info)
+		return -ENOMEM;
+
+	info->cmd = ETHTOOL_GRXCLSRLALL;
+	info->rule_cnt = rule_cnt;
+	err = ops->get_rxnfc(dev, info, info->rule_locs);
+	if (err)
+		goto err_free_info;
+
+	for (i = 0; i < rule_cnt; i++) {
+		struct ethtool_rxnfc rule_info = {
+			.cmd = ETHTOOL_GRXCLSRULE,
+			.fs.location = info->rule_locs[i],
+		};
+
+		err = ops->get_rxnfc(dev, &rule_info, NULL);
+		if (err)
+			goto err_free_info;
+
+		if (rule_info.fs.ring_cookie != RX_CLS_FLOW_DISC &&
+		    rule_info.fs.ring_cookie != RX_CLS_FLOW_WAKE &&
+		    !(rule_info.flow_type & FLOW_RSS))
+			max_ring =
+				max_t(u64, max_ring, rule_info.fs.ring_cookie);
+	}
+
+	kvfree(info);
+	*max = max_ring;
+	return 0;
+
+err_free_info:
+	kvfree(info);
+	return err;
+}
+
 int ethtool_get_max_rxfh_channel(struct net_device *dev, u32 *max)
 {
 	u32 dev_size, current_max = 0;
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index c1779657e074..b1b9db810eca 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -43,6 +43,7 @@  bool convert_legacy_settings_to_link_ksettings(
 	struct ethtool_link_ksettings *link_ksettings,
 	const struct ethtool_cmd *legacy_settings);
 int ethtool_get_max_rxfh_channel(struct net_device *dev, u32 *max);
+int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max);
 int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info);
 
 extern const struct ethtool_phy_ops *ethtool_phy_ops;
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 57e7238a4136..4831fd82796a 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1796,7 +1796,8 @@  static noinline_for_stack int ethtool_set_channels(struct net_device *dev,
 {
 	struct ethtool_channels channels, curr = { .cmd = ETHTOOL_GCHANNELS };
 	u16 from_channel, to_channel;
-	u32 max_rx_in_use = 0;
+	u64 max_rxnfc_in_use;
+	u32 max_rxfh_in_use;
 	unsigned int i;
 	int ret;
 
@@ -1827,11 +1828,15 @@  static noinline_for_stack int ethtool_set_channels(struct net_device *dev,
 		return -EINVAL;
 
 	/* ensure the new Rx count fits within the configured Rx flow
-	 * indirection table settings */
-	if (netif_is_rxfh_configured(dev) &&
-	    !ethtool_get_max_rxfh_channel(dev, &max_rx_in_use) &&
-	    (channels.combined_count + channels.rx_count) <= max_rx_in_use)
-	    return -EINVAL;
+	 * indirection table/rxnfc settings */
+	if (ethtool_get_max_rxnfc_channel(dev, &max_rxnfc_in_use))
+		max_rxnfc_in_use = 0;
+	if (!netif_is_rxfh_configured(dev) ||
+	    ethtool_get_max_rxfh_channel(dev, &max_rxfh_in_use))
+		max_rxfh_in_use = 0;
+	if (channels.combined_count + channels.rx_count <=
+	    max_t(u64, max_rxnfc_in_use, max_rxfh_in_use))
+		return -EINVAL;
 
 	/* Disabling channels, query zero-copy AF_XDP sockets */
 	from_channel = channels.combined_count +