Message ID | 20221031100016.6028-1-gal@nvidia.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net-next] ethtool: Fail number of channels change when it conflicts with rxnfc |
On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
> Similar to what we do with the hash indirection table [1], when network
> flow classification rules are forwarding traffic to channels greater
> than the requested number of channels, fail the operation.
> Without this, traffic could be directed to channels which no longer
> exist (dropped) after changing number of channels.
>
> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")

Have you made sure there are no magic encodings of queue numbers this
would break? I seem to recall some vendors used magic queue values to
redirect to VFs before TC and switchdev. If that's the case we'd need
to locate the drivers that do that and flag them so we can enforce this
only going forward?
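For readers following along, the check described in the quoted commit message can be modelled in user space roughly as below. This is only a sketch: `check_new_rx_count` and its parameters are hypothetical names, not kernel functions, and the comparison simply mirrors the `<=` test used in the patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the check, not the kernel code.
 * Ring indices are zero-based, so a count of N covers rings 0..N-1.
 * max_ring_in_use is the highest ring index any steering rule targets.
 */
static int check_new_rx_count(uint32_t combined_count, uint32_t rx_count,
			      uint64_t max_ring_in_use)
{
	uint64_t new_rx = (uint64_t)combined_count + rx_count;

	/* Reject when the highest ring in use would fall outside the
	 * new range, using the same comparison as the patch.
	 */
	if (new_rx <= max_ring_in_use)
		return -EINVAL;

	return 0;
}
```

With a rule steering to ring 5, a request for 4 or 5 Rx channels is rejected and a request for 6 is accepted.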
On 10/31/2022 6:23 PM, Jakub Kicinski wrote:
> On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
>> Similar to what we do with the hash indirection table [1], when network
>> flow classification rules are forwarding traffic to channels greater
>> than the requested number of channels, fail the operation.
>> Without this, traffic could be directed to channels which no longer
>> exist (dropped) after changing number of channels.
>>
>> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")
>
> Have you made sure there are no magic encodings of queue numbers this
> would break? I seem to recall some vendors used magic queue values to
> redirect to VFs before TC and switchdev. If that's the case we'd need
> to locate the drivers that do that and flag them so we can enforce this
> only going forward?

I believe these all use the same encoding defined by
ethtool_get_flow_spec_ring and ethtool_get_flow_spec_vf, at least
that's what ixgbe uses.

This sets the lower 32 bits as the queue index and the next 8 bits as
the VF identifier as defined by ETHTOOL_RX_FLOW_SPEC_RING and
ETHTOOL_RX_FLOW_SPEC_RING_VF.

It looks like this change should just exempt ring_cookie with
ethtool_get_flow_spec_vf as non-zero?

We maybe ought to mark this whole thing as deprecated now given the
advances in TC.

Thanks,
Jake
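The encoding described above can be illustrated with a small stand-alone sketch. The mask values are meant to mirror ETHTOOL_RX_FLOW_SPEC_RING and ETHTOOL_RX_FLOW_SPEC_RING_VF from the uapi ethtool header, but they are redefined here under local, hypothetical names (`SPEC_RING`, `cookie_ring`, and so on) so the snippet builds without kernel headers. Note that the uapi header appears to name the VF helper `ethtool_get_flow_spec_ring_vf`.

```c
#include <stdint.h>

/* Local copies of the uapi masks (assumed values): the low 32 bits hold
 * the queue index, the next 8 bits hold the VF identifier.
 */
#define SPEC_RING		0x00000000FFFFFFFFULL
#define SPEC_RING_VF		0x000000FF00000000ULL
#define SPEC_RING_VF_OFF	32

/* Queue index encoded in a ring_cookie */
static uint64_t cookie_ring(uint64_t ring_cookie)
{
	return ring_cookie & SPEC_RING;
}

/* VF identifier encoded in a ring_cookie (0 when no VF is addressed) */
static uint64_t cookie_vf(uint64_t ring_cookie)
{
	return (ring_cookie & SPEC_RING_VF) >> SPEC_RING_VF_OFF;
}

/* Hypothetical predicate: only rules with a zero VF field would count
 * against the PF channel range under the proposed exemption.
 */
static int cookie_targets_pf_ring(uint64_t ring_cookie)
{
	return cookie_vf(ring_cookie) == 0;
}
```

For example, a cookie of `(3ULL << 32) | 7` decodes to queue 7 with VF field 3, so under the suggested exemption it would not be compared against the PF channel count.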
On 01/11/2022 18:50, Jacob Keller wrote:
>
>
> On 10/31/2022 6:23 PM, Jakub Kicinski wrote:
>> On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
>>> Similar to what we do with the hash indirection table [1], when network
>>> flow classification rules are forwarding traffic to channels greater
>>> than the requested number of channels, fail the operation.
>>> Without this, traffic could be directed to channels which no longer
>>> exist (dropped) after changing number of channels.
>>>
>>> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS
>>> doesn't conflict with GS{RXFH}")
>>
>> Have you made sure there are no magic encodings of queue numbers this
>> would break? I seem to recall some vendors used magic queue values to
>> redirect to VFs before TC and switchdev. If that's the case we'd need
>> to locate the drivers that do that and flag them so we can enforce this
>> only going forward?
>
> I believe these all use the same encoding defined by
> ethtool_get_flow_spec_ring and ethtool_get_flow_spec_vf, at least
> that's what ixgbe uses.
>
> This sets the lower 32 bits as the queue index and the next 8 bits as
> the VF identifier as defined by ETHTOOL_RX_FLOW_SPEC_RING and
> ETHTOOL_RX_FLOW_SPEC_RING_VF.
>
> It looks like this change should just exempt ring_cookie with
> ethtool_get_flow_spec_vf as non-zero?
>
> We maybe ought to mark this whole thing as deprecated now given the
> advances in TC.

Oh, I was not aware of this encoding scheme, shouldn't VF rules be added
on the VF interface?
What is this used for?
How does the PF verify the rules are in range for the VF queues?

Anyway, I'll go ahead and verify that VF == 0 in the if statement.

Thanks for the review!
> -----Original Message-----
> From: Gal Pressman <gal@nvidia.com>
> Sent: Wednesday, November 2, 2022 5:45 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>; Jakub Kicinski <kuba@kernel.org>
> Cc: David S. Miller <davem@davemloft.net>; netdev@vger.kernel.org; Tariq
> Toukan <tariqt@nvidia.com>
> Subject: Re: [PATCH net-next] ethtool: Fail number of channels change when it
> conflicts with rxnfc
>
> On 01/11/2022 18:50, Jacob Keller wrote:
> >
> >
> > On 10/31/2022 6:23 PM, Jakub Kicinski wrote:
> >> On Mon, 31 Oct 2022 12:00:16 +0200 Gal Pressman wrote:
> >>> Similar to what we do with the hash indirection table [1], when network
> >>> flow classification rules are forwarding traffic to channels greater
> >>> than the requested number of channels, fail the operation.
> >>> Without this, traffic could be directed to channels which no longer
> >>> exist (dropped) after changing number of channels.
> >>>
> >>> [1] commit d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS
> >>> doesn't conflict with GS{RXFH}")
> >>
> >> Have you made sure there are no magic encodings of queue numbers this
> >> would break? I seem to recall some vendors used magic queue values to
> >> redirect to VFs before TC and switchdev. If that's the case we'd need
> >> to locate the drivers that do that and flag them so we can enforce this
> >> only going forward?
> >
> > I believe these all use the same encoding defined by
> > ethtool_get_flow_spec_ring and ethtool_get_flow_spec_vf, at least
> > that's what ixgbe uses.
> >
> > This sets the lower 32 bits as the queue index and the next 8 bits as
> > the VF identifier as defined by ETHTOOL_RX_FLOW_SPEC_RING and
> > ETHTOOL_RX_FLOW_SPEC_RING_VF.
> >
> > It looks like this change should just exempt ring_cookie with
> > ethtool_get_flow_spec_vf as non-zero?
> >
> > We maybe ought to mark this whole thing as deprecated now given the
> > advances in TC.
>
> Oh, I was not aware of this encoding scheme, shouldn't VF rules be added
> on the VF interface?
> What is this used for?
>

It's rather old, and the idea was to allow forwarding traffic by rules in the host. It predates switchdev, which I think would be the modern method now.

> How does the PF verify the rules are in range for the VF queues?

I believe the PF driver has to check this, and I think it's sort of just a hack/independent of the PF queues. I think it would depend on the driver. ixgbe knows how many VF queues there are, and which ones belong to which VF.

I suspect we don't want to use this on new drivers or add it to any existing drivers that don't already have it.

> Anyway, I'll go ahead and verify that VF == 0 in the if statement.
>
> Thanks for the review!
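One possible reading of "verify that VF == 0" is sketched below as a stand-alone scan over a rule list. It is not the follow-up patch: `struct sketch_rule`, `max_pf_ring_in_use`, and the local constants are hypothetical, with values assumed to match RX_CLS_FLOW_DISC, RX_CLS_FLOW_WAKE, FLOW_RSS, and the ETHTOOL_RX_FLOW_SPEC_RING* masks in the uapi header.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the uapi constants (assumed values) */
#define CLS_FLOW_DISC	0xffffffffffffffffULL	/* drop the packet */
#define CLS_FLOW_WAKE	0xfffffffffffffffeULL	/* wake-on-LAN rule */
#define FLOW_TYPE_RSS	0x20000000U		/* rule targets an RSS context */
#define SPEC_RING	0x00000000FFFFFFFFULL	/* queue index bits */
#define SPEC_RING_VF	0x000000FF00000000ULL	/* VF identifier bits */

/* Hypothetical, trimmed-down view of one classification rule */
struct sketch_rule {
	uint32_t flow_type;
	uint64_t ring_cookie;
};

/* Highest PF ring index referenced by any rule. Discard, wake, RSS and
 * VF-directed rules are skipped, since none of them names a PF ring.
 */
static uint64_t max_pf_ring_in_use(const struct sketch_rule *rules, size_t n)
{
	uint64_t max_ring = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t cookie = rules[i].ring_cookie;

		if (cookie == CLS_FLOW_DISC || cookie == CLS_FLOW_WAKE)
			continue;
		if (rules[i].flow_type & FLOW_TYPE_RSS)
			continue;
		if (cookie & SPEC_RING_VF)	/* VF != 0: exempt */
			continue;
		if ((cookie & SPEC_RING) > max_ring)
			max_ring = cookie & SPEC_RING;
	}

	return max_ring;
}
```

The result would then be compared against the requested Rx channel count in the same way as the indirection table maximum.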
diff --git a/net/ethtool/channels.c b/net/ethtool/channels.c
index 403158862011..c7e37130647e 100644
--- a/net/ethtool/channels.c
+++ b/net/ethtool/channels.c
@@ -116,9 +116,10 @@ int ethnl_set_channels(struct sk_buff *skb, struct genl_info *info)
 	struct ethtool_channels channels = {};
 	struct ethnl_req_info req_info = {};
 	struct nlattr **tb = info->attrs;
-	u32 err_attr, max_rx_in_use = 0;
+	u32 err_attr, max_rxfh_in_use;
 	const struct ethtool_ops *ops;
 	struct net_device *dev;
+	u64 max_rxnfc_in_use;
 	int ret;
 
 	ret = ethnl_parse_header_dev_get(&req_info,
@@ -189,15 +190,23 @@ int ethnl_set_channels(struct sk_buff *skb, struct genl_info *info)
 	}
 
 	/* ensure the new Rx count fits within the configured Rx flow
-	 * indirection table settings
+	 * indirection table/rxnfc settings
 	 */
-	if (netif_is_rxfh_configured(dev) &&
-	    !ethtool_get_max_rxfh_channel(dev, &max_rx_in_use) &&
-	    (channels.combined_count + channels.rx_count) <= max_rx_in_use) {
+	if (ethtool_get_max_rxnfc_channel(dev, &max_rxnfc_in_use))
+		max_rxnfc_in_use = 0;
+	if (!netif_is_rxfh_configured(dev) ||
+	    ethtool_get_max_rxfh_channel(dev, &max_rxfh_in_use))
+		max_rxfh_in_use = 0;
+	if (channels.combined_count + channels.rx_count <= max_rxfh_in_use) {
 		ret = -EINVAL;
 		GENL_SET_ERR_MSG(info, "requested channel counts are too low for existing indirection table settings");
 		goto out_ops;
 	}
+	if (channels.combined_count + channels.rx_count <= max_rxnfc_in_use) {
+		ret = -EINVAL;
+		GENL_SET_ERR_MSG(info, "requested channel counts are too low for existing ntuple filter settings");
+		goto out_ops;
+	}
 
 	/* Disabling channels, query zero-copy AF_XDP sockets */
 	from_channel = channels.combined_count +
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index ee3e02da0013..c2790d29f97c 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -512,6 +512,71 @@ int __ethtool_get_link(struct net_device *dev)
 	return netif_running(dev) && dev->ethtool_ops->get_link(dev);
 }
 
+static int ethtool_get_rxnfc_rule_count(struct net_device *dev)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	struct ethtool_rxnfc info = {
+		.cmd = ETHTOOL_GRXCLSRLCNT,
+	};
+	int err;
+
+	err = ops->get_rxnfc(dev, &info, NULL);
+	if (err)
+		return err;
+
+	return info.rule_cnt;
+}
+
+int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	struct ethtool_rxnfc *info;
+	int err, i, rule_cnt;
+	u64 max_ring = 0;
+
+	if (!ops->get_rxnfc)
+		return -EOPNOTSUPP;
+
+	rule_cnt = ethtool_get_rxnfc_rule_count(dev);
+	if (rule_cnt <= 0)
+		return -EINVAL;
+
+	info = kvzalloc(struct_size(info, rule_locs, rule_cnt), GFP_KERNEL);
+	if (!info)
+		return -ENOMEM;
+
+	info->cmd = ETHTOOL_GRXCLSRLALL;
+	info->rule_cnt = rule_cnt;
+	err = ops->get_rxnfc(dev, info, info->rule_locs);
+	if (err)
+		goto err_free_info;
+
+	for (i = 0; i < rule_cnt; i++) {
+		struct ethtool_rxnfc rule_info = {
+			.cmd = ETHTOOL_GRXCLSRULE,
+			.fs.location = info->rule_locs[i],
+		};
+
+		err = ops->get_rxnfc(dev, &rule_info, NULL);
+		if (err)
+			goto err_free_info;
+
+		if (rule_info.fs.ring_cookie != RX_CLS_FLOW_DISC &&
+		    rule_info.fs.ring_cookie != RX_CLS_FLOW_WAKE &&
+		    !(rule_info.flow_type & FLOW_RSS))
+			max_ring =
+				max_t(u64, max_ring, rule_info.fs.ring_cookie);
+	}
+
+	kvfree(info);
+	*max = max_ring;
+	return 0;
+
+err_free_info:
+	kvfree(info);
+	return err;
+}
+
 int ethtool_get_max_rxfh_channel(struct net_device *dev, u32 *max)
 {
 	u32 dev_size, current_max = 0;
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index c1779657e074..b1b9db810eca 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -43,6 +43,7 @@ bool convert_legacy_settings_to_link_ksettings(
 	struct ethtool_link_ksettings *link_ksettings,
 	const struct ethtool_cmd *legacy_settings);
 int ethtool_get_max_rxfh_channel(struct net_device *dev, u32 *max);
+int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max);
 int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info);
 
 extern const struct ethtool_phy_ops *ethtool_phy_ops;
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 57e7238a4136..4831fd82796a 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1796,7 +1796,8 @@ static noinline_for_stack int ethtool_set_channels(struct net_device *dev,
 {
 	struct ethtool_channels channels, curr = { .cmd = ETHTOOL_GCHANNELS };
 	u16 from_channel, to_channel;
-	u32 max_rx_in_use = 0;
+	u64 max_rxnfc_in_use;
+	u32 max_rxfh_in_use;
 	unsigned int i;
 	int ret;
 
@@ -1827,11 +1828,15 @@ static noinline_for_stack int ethtool_set_channels(struct net_device *dev,
 		return -EINVAL;
 
 	/* ensure the new Rx count fits within the configured Rx flow
-	 * indirection table settings */
-	if (netif_is_rxfh_configured(dev) &&
-	    !ethtool_get_max_rxfh_channel(dev, &max_rx_in_use) &&
-	    (channels.combined_count + channels.rx_count) <= max_rx_in_use)
-		return -EINVAL;
+	 * indirection table/rxnfc settings */
+	if (ethtool_get_max_rxnfc_channel(dev, &max_rxnfc_in_use))
+		max_rxnfc_in_use = 0;
+	if (!netif_is_rxfh_configured(dev) ||
+	    ethtool_get_max_rxfh_channel(dev, &max_rxfh_in_use))
+		max_rxfh_in_use = 0;
+	if (channels.combined_count + channels.rx_count <=
+	    max_t(u64, max_rxnfc_in_use, max_rxfh_in_use))
+		return -EINVAL;
 
 	/* Disabling channels, query zero-copy AF_XDP sockets */
 	from_channel = channels.combined_count +