| Message ID | 20240206010311.149103-1-jdamato@fastly.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | [net-next] eth: mlx5: link NAPI instances to queues and IRQs |
On Tue, 06 Feb, 2024 01:03:11 +0000 Joe Damato <jdamato@fastly.com> wrote:
> Make mlx5 compatible with the newly added netlink queue GET APIs.
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++
> 2 files changed, 9 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index 55c6ace0acd5..3f86ee1831a8 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -768,6 +768,7 @@ struct mlx5e_channel {
> 	u16 qos_sqs_size;
> 	u8 num_tc;
> 	u8 lag_port;
> +	unsigned int irq;
>
> 	/* XDP_REDIRECT */
> 	struct mlx5e_xdpsq xdpsq;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index c8e8f512803e..e1bfff1fb328 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c)
> 	mlx5e_close_tx_cqs(c);
> 	mlx5e_close_cq(&c->icosq.cq);
> 	mlx5e_close_cq(&c->async_icosq.cq);
> +
> +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL);
> +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL);

This should be set to NULL *before* actually closing the rqs, sqs, and
related cqs right? I would expect these two lines to be the first ones
called in mlx5e_close_queues. Btw, I think this should be done in
mlx5e_deactivate_channel where the NAPI is disabled.

> }
>
> static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix)
> @@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
> 	c->stats = &priv->channel_stats[ix]->ch;
> 	c->aff_mask = irq_get_effective_affinity_mask(irq);
> 	c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix);
> +	c->irq = irq;
>
> 	netif_napi_add(netdev, &c->napi, mlx5e_napi_poll);
>
> @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
> 		mlx5e_activate_xsk(c);
> 	else
> 		mlx5e_activate_rq(&c->rq);
> +
> +	netif_napi_set_irq(&c->napi, c->irq);
> +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi);
> +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi);

It's weird that netlink queue API is being configured in
mlx5e_activate_channel and deconfigured in mlx5e_close_queues. This
leads to a problem where the napi will be falsely referred to even when
we deactivate the channels in mlx5e_switch_priv_channels and may not
necessarily get to closing the channels due to an error.

Typically, we use the following clean up patterns.

mlx5e_activate_channel -> mlx5e_deactivate_channel
mlx5e_open_queues -> mlx5e_close_queues

--
Thanks,

Rahul Rameshbabu
On Mon, Feb 05, 2024 at 05:09:09PM -0800, Rahul Rameshbabu wrote:
> On Tue, 06 Feb, 2024 01:03:11 +0000 Joe Damato <jdamato@fastly.com> wrote:
> > Make mlx5 compatible with the newly added netlink queue GET APIs.
> >
> > Signed-off-by: Joe Damato <jdamato@fastly.com>

[...]

> > @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c)
> > 	mlx5e_close_tx_cqs(c);
> > 	mlx5e_close_cq(&c->icosq.cq);
> > 	mlx5e_close_cq(&c->async_icosq.cq);
> > +
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL);
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL);
>
> This should be set to NULL *before* actually closing the rqs, sqs, and
> related cqs right? I would expect these two lines to be the first ones
> called in mlx5e_close_queues. Btw, I think this should be done in
> mlx5e_deactivate_channel where the NAPI is disabled.

[...]

> > @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
> > 		mlx5e_activate_xsk(c);
> > 	else
> > 		mlx5e_activate_rq(&c->rq);
> > +
> > +	netif_napi_set_irq(&c->napi, c->irq);
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi);
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi);
>
> It's weird that netlink queue API is being configured in
> mlx5e_activate_channel and deconfigured in mlx5e_close_queues. This
> leads to a problem where the napi will be falsely referred to even when
> we deactivate the channels in mlx5e_switch_priv_channels and may not
> necessarily get to closing the channels due to an error.
>
> Typically, we use the following clean up patterns.
>
> mlx5e_activate_channel -> mlx5e_deactivate_channel
> mlx5e_open_queues -> mlx5e_close_queues

OK, I'll move it to mlx5e_deactivate_channel before the NAPI is disabled.
That makes sense to me.
On Mon, 05 Feb, 2024 17:32:47 -0800 Joe Damato <jdamato@fastly.com> wrote:
> On Mon, Feb 05, 2024 at 05:09:09PM -0800, Rahul Rameshbabu wrote:
>> On Tue, 06 Feb, 2024 01:03:11 +0000 Joe Damato <jdamato@fastly.com> wrote:
>> > Make mlx5 compatible with the newly added netlink queue GET APIs.

[...]

>> > @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
>> > 		mlx5e_activate_xsk(c);
>> > 	else
>> > 		mlx5e_activate_rq(&c->rq);
>> > +
>> > +	netif_napi_set_irq(&c->napi, c->irq);

One small comment that I missed in my previous iteration. I think the
above should be moved to mlx5e_open_channel right after netif_napi_add.
This avoids needing to save the irq in struct mlx5e_channel.

>> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi);
>> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi);
>>
>> It's weird that netlink queue API is being configured in
>> mlx5e_activate_channel and deconfigured in mlx5e_close_queues. This
>> leads to a problem where the napi will be falsely referred to even when
>> we deactivate the channels in mlx5e_switch_priv_channels and may not
>> necessarily get to closing the channels due to an error.
>>
>> Typically, we use the following clean up patterns.
>>
>> mlx5e_activate_channel -> mlx5e_deactivate_channel
>> mlx5e_open_queues -> mlx5e_close_queues
>
> OK, I'll move it to mlx5e_deactivate_channel before the NAPI is disabled.
> That makes sense to me.

Appreciated. Thank you for the patch btw.

--
Rahul Rameshbabu
On Mon, Feb 05, 2024 at 05:33:39PM -0800, Rahul Rameshbabu wrote:
> On Mon, 05 Feb, 2024 17:32:47 -0800 Joe Damato <jdamato@fastly.com> wrote:
> > On Mon, Feb 05, 2024 at 05:09:09PM -0800, Rahul Rameshbabu wrote:

[...]

> >> > @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
> >> > 		mlx5e_activate_xsk(c);
> >> > 	else
> >> > 		mlx5e_activate_rq(&c->rq);
> >> > +
> >> > +	netif_napi_set_irq(&c->napi, c->irq);
>
> One small comment that I missed in my previous iteration. I think the
> above should be moved to mlx5e_open_channel right after netif_napi_add.
> This avoids needing to save the irq in struct mlx5e_channel.

I couldn't move it to mlx5e_open_channel because of how safe_switch_params
and the mechanics around that seem to work (at least as far as I could
tell).

mlx5 seems to create a new set of channels before closing the previous
channels. So, moving this logic to open_channels and close_channels means
you end up with a flow like this:

- Create new channels (NAPI netlink API is used to set NAPIs)
- Old channels are closed (NAPI netlink API sets NULL and overwrites the
  previous NAPI netlink calls)

Now, the associations are all NULL.

I think moving the calls to activate / deactivate fixes that problem, but
requires that the irq is stored, if I am understanding the driver
correctly.

[...]

> > OK, I'll move it to mlx5e_deactivate_channel before the NAPI is disabled.
> > That makes sense to me.
>
> Appreciated. Thank you for the patch btw.

Sure, thanks for the review.
On Mon, 05 Feb, 2024 17:41:52 -0800 Joe Damato <jdamato@fastly.com> wrote:
> On Mon, Feb 05, 2024 at 05:33:39PM -0800, Rahul Rameshbabu wrote:
>> On Mon, 05 Feb, 2024 17:32:47 -0800 Joe Damato <jdamato@fastly.com> wrote:

[...]

>> One small comment that I missed in my previous iteration. I think the
>> above should be moved to mlx5e_open_channel right after netif_napi_add.
>> This avoids needing to save the irq in struct mlx5e_channel.
>
> I couldn't move it to mlx5e_open_channel because of how safe_switch_params
> and the mechanics around that seem to work (at least as far as I could
> tell).
>
> mlx5 seems to create a new set of channels before closing the previous
> channel. So, moving this logic to open_channels and close_channels means
> you end up with a flow like this:
>
> - Create new channels (NAPI netlink API is used to set NAPIs)
> - Old channels are closed (NAPI netlink API sets NULL and overwrites the
>   previous NAPI netlink calls)
>
> Now, the associations are all NULL.
>
> I think moving the calls to activate / deactivate fixes that problem, but
> requires that irq is stored, if I am understanding the driver correctly.

I believe moving the changes to activate / deactivate channels resolves
this problem because only one set of channels will be active, so you
will no longer have dangling association conflicts for the queue ->
napi. This is partially why I suggested the change in that iteration.

As for netif_napi_set_irq, that alone can be in mlx5e_open_channel (that
was the intention of my most recent comment. Not that all the other
associations should be moved as well). I agree that the other
association calls should be part of activate / deactivate channels.

[...]
On Mon, Feb 05, 2024 at 05:44:27PM -0800, Rahul Rameshbabu wrote:

[...]

> I believe moving the changes to activate / deactivate channels resolves
> this problem because only one set of channels will be active, so you
> will no longer have dangling association conflicts for the queue ->
> napi. This is partially why I suggested the change in that iteration.

As far as I can tell, it does.

> As for netif_napi_set_irq, that alone can be in mlx5e_open_channel (that
> was the intention of my most recent comment. Not that all the other
> associations should be moved as well). I agree that the other
> association calls should be part of activate / deactivate channels.

OK, sure, that makes sense. I'll make that change, too.

[...]
On Mon, 05 Feb, 2024 17:56:58 -0800 Joe Damato <jdamato@fastly.com> wrote:
> On Mon, Feb 05, 2024 at 05:44:27PM -0800, Rahul Rameshbabu wrote:

[...]

>> I believe moving the changes to activate / deactivate channels resolves
>> this problem because only one set of channels will be active, so you
>> will no longer have dangling association conflicts for the queue ->
>> napi. This is partially why I suggested the change in that iteration.
>
> As far as I can tell, it does.
>
>> As for netif_napi_set_irq, that alone can be in mlx5e_open_channel (that
>> was the intention of my most recent comment. Not that all the other
>> associations should be moved as well). I agree that the other
>> association calls should be part of activate / deactivate channels.
>
> OK, sure, that makes sense. I'll make that change, too.

Also for your v2, it would be great if you can use the commit message
subject 'net/mlx5e: link NAPI instances to queues and IRQs' rather than
'eth: mlx5: link NAPI instances to queues and IRQs'.

--
Thanks,

Rahul Rameshbabu
On Mon, Feb 05, 2024 at 06:38:41PM -0800, Rahul Rameshbabu wrote: > On Mon, 05 Feb, 2024 17:56:58 -0800 Joe Damato <jdamato@fastly.com> wrote: > > On Mon, Feb 05, 2024 at 05:44:27PM -0800, Rahul Rameshbabu wrote: > >> > >> On Mon, 05 Feb, 2024 17:41:52 -0800 Joe Damato <jdamato@fastly.com> wrote: > >> > On Mon, Feb 05, 2024 at 05:33:39PM -0800, Rahul Rameshbabu wrote: > >> >> > >> >> On Mon, 05 Feb, 2024 17:32:47 -0800 Joe Damato <jdamato@fastly.com> wrote: > >> >> > On Mon, Feb 05, 2024 at 05:09:09PM -0800, Rahul Rameshbabu wrote: > >> >> >> On Tue, 06 Feb, 2024 01:03:11 +0000 Joe Damato <jdamato@fastly.com> wrote: > >> >> >> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >> >> >> > index c8e8f512803e..e1bfff1fb328 100644 > >> >> >> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >> >> >> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >> >> >> > @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c) > >> >> >> > mlx5e_close_tx_cqs(c); > >> >> >> > mlx5e_close_cq(&c->icosq.cq); > >> >> >> > mlx5e_close_cq(&c->async_icosq.cq); > >> >> >> > + > >> >> >> > + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL); > >> >> >> > + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL); > >> >> >> > >> >> >> This should be set to NULL *before* actually closing the rqs, sqs, and > >> >> >> related cqs right? I would expect these two lines to be the first ones > >> >> >> called in mlx5e_close_queues. Btw, I think this should be done in > >> >> >> mlx5e_deactivate_channel where the NAPI is disabled. 
> >> >> >> > >> >> >> > } > >> >> >> > > >> >> >> > static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix) > >> >> >> > @@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix, > >> >> >> > c->stats = &priv->channel_stats[ix]->ch; > >> >> >> > c->aff_mask = irq_get_effective_affinity_mask(irq); > >> >> >> > c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix); > >> >> >> > + c->irq = irq; > >> >> >> > > >> >> >> > netif_napi_add(netdev, &c->napi, mlx5e_napi_poll); > >> >> >> > > >> >> >> > @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c) > >> >> >> > mlx5e_activate_xsk(c); > >> >> >> > else > >> >> >> > mlx5e_activate_rq(&c->rq); > >> >> >> > + > >> >> >> > + netif_napi_set_irq(&c->napi, c->irq); > >> >> > >> >> One small comment that I missed in my previous iteration. I think the > >> >> above should be moved to mlx5e_open_channel right after netif_napi_add. > >> >> This avoids needing to save the irq in struct mlx5e_channel. > >> > > >> > I couldn't move it to mlx5e_open_channel because of how safe_switch_params > >> > and the mechanics around that seem to work (at least as far as I could > >> > tell). > >> > > >> > mlx5 seems to create a new set of channels before closing the previous > >> > channel. So, moving this logic to open_channels and close_channels means > >> > you end up with a flow like this: > >> > > >> > - Create new channels (NAPI netlink API is used to set NAPIs) > >> > - Old channels are closed (NAPI netlink API sets NULL and overwrites the > >> > previous NAPI netlink calls) > >> > > >> > Now, the associations are all NULL. > >> > > >> > I think moving the calls to active / deactivate fixes that problem, but > >> > requires that irq is stored, if I am understanding the driver correctly. 
> >> I believe moving the changes to activate / deactivate channels resolves > >> this problem because only one set of channels will be active, so you > >> will no longer have dangling association conflicts for the queue -> > >> napi. This is partially why I suggested the change in that iteration. > > > > As far as I can tell, it does. > > > >> As for netif_napi_set_irq, that alone can be in mlx5e_open_channel (that > >> was the intention of my most recent comment. Not that all the other > >> associations should be moved as well). I agree that the other > >> association calls should be part of activate / deactivate channels. > > > > OK, sure that makes sense. I'll make that change, too. > > > > Also for your v2, it would be great if you could use the commit message > subject 'net/mlx5e: link NAPI instances to queues and IRQs' rather than > 'eth: mlx5: link NAPI instances to queues and IRQs'. Didn't see this before I sent it. If it matters that much, I can send a v3 with an updated commit message.
On 06/02/2024 3:03, Joe Damato wrote: > Make mlx5 compatible with the newly added netlink queue GET APIs. > > Signed-off-by: Joe Damato <jdamato@fastly.com> + Gal Hi Joe, Thanks for your patch. We have already prepared a similar patch; it's part of our internal submission queue and planned to be submitted soon. Please see my comments below, let us know if you're willing to respin a V2 or wait for our patch. > --- > drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + > drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++ > 2 files changed, 9 insertions(+) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h > index 55c6ace0acd5..3f86ee1831a8 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h > @@ -768,6 +768,7 @@ struct mlx5e_channel { > u16 qos_sqs_size; > u8 num_tc; > u8 lag_port; > + unsigned int irq; > > /* XDP_REDIRECT */ > struct mlx5e_xdpsq xdpsq; > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > index c8e8f512803e..e1bfff1fb328 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c) > mlx5e_close_tx_cqs(c); > mlx5e_close_cq(&c->icosq.cq); > mlx5e_close_cq(&c->async_icosq.cq); > + > + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL); > + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL); > } > > static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix) > @@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix, > c->stats = &priv->channel_stats[ix]->ch; > c->aff_mask = irq_get_effective_affinity_mask(irq); > c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix); > + c->irq = irq; > > netif_napi_add(netdev, &c->napi, mlx5e_napi_poll); > > @@ 
-2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c) > mlx5e_activate_xsk(c); > else > mlx5e_activate_rq(&c->rq); > + > + netif_napi_set_irq(&c->napi, c->irq); Can be safely moved to mlx5e_open_channel without interfering with the other existing mappings. This would save the new irq field in mlx5e_channel. > + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi); In some configurations we have multiple txqs per channel, so the txq indices are not 1-to-1 with their channel index. This should be called for each txq, with the proper txq index. It should also be done for feature-dedicated SQs (like QOS/HTB). > + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi); For consistency, I'd move this one as well, to match the TX handling. > } > > static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
On Tue, Feb 06, 2024 at 10:11:28AM +0200, Tariq Toukan wrote: > > > On 06/02/2024 3:03, Joe Damato wrote: > >Make mlx5 compatible with the newly added netlink queue GET APIs. > > > >Signed-off-by: Joe Damato <jdamato@fastly.com> > > + Gal > > Hi Joe, > Thanks for your patch. > > We have already prepared a similar patch, and it's part of our internal > submission queue, and planned to be submitted soon. > > Please see my comments below, let us know if you're willing to respin a V2 > or wait for our patch. Do you have a rough estimate on when it'll be submitted? If it's several months out I'll try again, but if it's expected to be submitted in the next few weeks I'll wait for your official change. BTW, are the per queue coalesce changes in that same queue? It was mentioned previously [1] that this feature is coming after we submit a simple attempt as an RFC. If that feature isn't planned or won't be submitted anytime soon, can you let us know and we can try to attempt an RFC v3 for it? [1]: https://lore.kernel.org/lkml/874jj34e67.fsf@nvidia.com/ > >--- > > drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + > > drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++ > > 2 files changed, 9 insertions(+) > > > >diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h > >index 55c6ace0acd5..3f86ee1831a8 100644 > >--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h > >+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h > >@@ -768,6 +768,7 @@ struct mlx5e_channel { > > u16 qos_sqs_size; > > u8 num_tc; > > u8 lag_port; > >+ unsigned int irq; > > /* XDP_REDIRECT */ > > struct mlx5e_xdpsq xdpsq; > >diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >index c8e8f512803e..e1bfff1fb328 100644 > >--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >@@ -2473,6 +2473,9 @@ static void 
mlx5e_close_queues(struct mlx5e_channel *c) > > mlx5e_close_tx_cqs(c); > > mlx5e_close_cq(&c->icosq.cq); > > mlx5e_close_cq(&c->async_icosq.cq); > >+ > >+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL); > >+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL); > > } > > static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix) > >@@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix, > > c->stats = &priv->channel_stats[ix]->ch; > > c->aff_mask = irq_get_effective_affinity_mask(irq); > > c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix); > >+ c->irq = irq; > > netif_napi_add(netdev, &c->napi, mlx5e_napi_poll); > >@@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c) > > mlx5e_activate_xsk(c); > > else > > mlx5e_activate_rq(&c->rq); > >+ > >+ netif_napi_set_irq(&c->napi, c->irq); > > Can be safely moved to mlx5e_open_channel without interfering with other > existing mapping. This would save the new irq field in mlx5e_channel. Sure, yea, I have that change queued already from last night. > >+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi); > > In some configurations we have multiple txqs per channel, so the txq indices > are not 1-to-1 with their channel index. > > This should be called per each txq, with the proper txq index. > > It should be done also for feture-dedicated SQs (like QOS/HTB). OK. I think the above makes sense and I'll look into it if I have some time this week. > >+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi); > > For consistency, I'd move this one as well, to match the TX handling. Sure. > > } > > static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
On 06/02/2024 19:12, Joe Damato wrote: > On Tue, Feb 06, 2024 at 10:11:28AM +0200, Tariq Toukan wrote: >> >> >> On 06/02/2024 3:03, Joe Damato wrote: >>> Make mlx5 compatible with the newly added netlink queue GET APIs. >>> >>> Signed-off-by: Joe Damato <jdamato@fastly.com> >> >> + Gal >> >> Hi Joe, >> Thanks for your patch. >> >> We have already prepared a similar patch, and it's part of our internal >> submission queue, and planned to be submitted soon. >> >> Please see my comments below, let us know if you're welling to respin a V2 >> or wait for our patch. > > Do you have a rough estimate on when it'll be submitted? > > If it's several months out I'll try again, but if it's expected to be > submit in the next few weeks I'll wait for your official change. It'll be in the next few weeks. > > BTW, are the per queue coalesce changes in that same queue? It was > mentioned previously [1] that this feature is coming after we submit a > simple attempt as an RFC. If that feature isn't planned or won't be submit > anytime soon, can you let us know and we can try to attempt an RFC v3 for > it? > The per queue coalesce series is going through internal code review, and is expected to also be ready in a matter of a few weeks. 
> [1]: https://lore.kernel.org/lkml/874jj34e67.fsf@nvidia.com/ > >>> --- >>> drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + >>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++ >>> 2 files changed, 9 insertions(+) >>> >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h >>> index 55c6ace0acd5..3f86ee1831a8 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h >>> @@ -768,6 +768,7 @@ struct mlx5e_channel { >>> u16 qos_sqs_size; >>> u8 num_tc; >>> u8 lag_port; >>> + unsigned int irq; >>> /* XDP_REDIRECT */ >>> struct mlx5e_xdpsq xdpsq; >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> index c8e8f512803e..e1bfff1fb328 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c) >>> mlx5e_close_tx_cqs(c); >>> mlx5e_close_cq(&c->icosq.cq); >>> mlx5e_close_cq(&c->async_icosq.cq); >>> + >>> + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL); >>> + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL); >>> } >>> static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix) >>> @@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix, >>> c->stats = &priv->channel_stats[ix]->ch; >>> c->aff_mask = irq_get_effective_affinity_mask(irq); >>> c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix); >>> + c->irq = irq; >>> netif_napi_add(netdev, &c->napi, mlx5e_napi_poll); >>> @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c) >>> mlx5e_activate_xsk(c); >>> else >>> mlx5e_activate_rq(&c->rq); >>> + >>> + netif_napi_set_irq(&c->napi, c->irq); >> >> Can be safely moved to mlx5e_open_channel without interfering with other >> existing 
mapping. This would save the new irq field in mlx5e_channel. > > Sure, yea, I have that change queued already from last night. > I see now.. I replied before noticing it. >>> + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi); >> >> In some configurations we have multiple txqs per channel, so the txq indices >> are not 1-to-1 with their channel index. >> >> This should be called per each txq, with the proper txq index. >> >> It should be done also for feture-dedicated SQs (like QOS/HTB). > > OK. I think the above makes sense and I'll look into it if I have some time > this week. > >>> + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi); >> >> For consistency, I'd move this one as well, to match the TX handling. > > Sure. > >>> } >>> static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
On Tue, Feb 06, 2024 at 09:10:27PM +0200, Tariq Toukan wrote: > > > On 06/02/2024 19:12, Joe Damato wrote: > >On Tue, Feb 06, 2024 at 10:11:28AM +0200, Tariq Toukan wrote: > >> > >> > >>On 06/02/2024 3:03, Joe Damato wrote: > >>>Make mlx5 compatible with the newly added netlink queue GET APIs. > >>> > >>>Signed-off-by: Joe Damato <jdamato@fastly.com> > >> > >>+ Gal > >> > >>Hi Joe, > >>Thanks for your patch. > >> > >>We have already prepared a similar patch, and it's part of our internal > >>submission queue, and planned to be submitted soon. > >> > >>Please see my comments below, let us know if you're welling to respin a V2 > >>or wait for our patch. > > > >Do you have a rough estimate on when it'll be submitted? > > > >If it's several months out I'll try again, but if it's expected to be > >submit in the next few weeks I'll wait for your official change. > > It'll be in the next few weeks. OK, well I tweaked the v3 I had queued based on your feedback. I am definitely not an mlx5 expert, so I have no idea if it's correct. The changes can be summed up as: - mlx5e_activate_channel and mlx5e_deactivate_channel to use netif_queue_set_napi for each mlx5e_txqsq as it is activated/deactivated. I assumed sq->txq_ix is the correct index, but I have no idea. - mlx5e_activate_qos_sq and mlx5e_deactivate_qos_sq to handle the QOS/HTB case, similar to the above. - IRQ storage removed If you think that sounds vaguely correct, I can send the v3 tomorrow when it has been >24hrs as per Rahul's request. > > > >BTW, are the per queue coalesce changes in that same queue? It was > >mentioned previously [1] that this feature is coming after we submit a > >simple attempt as an RFC. If that feature isn't planned or won't be submit > >anytime soon, can you let us know and we can try to attempt an RFC v3 for > >it? > > > > The per queue coalesce series is going through internal code review, and is > expected to also be ready in a matter of a few weeks. OK, great. 
Thanks for letting me know; we are definitely interested in using this feature. > >[1]: https://lore.kernel.org/lkml/874jj34e67.fsf@nvidia.com/ > > > >>>--- > >>> drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + > >>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++ > >>> 2 files changed, 9 insertions(+) > >>> > >>>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h > >>>index 55c6ace0acd5..3f86ee1831a8 100644 > >>>--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h > >>>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h > >>>@@ -768,6 +768,7 @@ struct mlx5e_channel { > >>> u16 qos_sqs_size; > >>> u8 num_tc; > >>> u8 lag_port; > >>>+ unsigned int irq; > >>> /* XDP_REDIRECT */ > >>> struct mlx5e_xdpsq xdpsq; > >>>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >>>index c8e8f512803e..e1bfff1fb328 100644 > >>>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >>>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > >>>@@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c) > >>> mlx5e_close_tx_cqs(c); > >>> mlx5e_close_cq(&c->icosq.cq); > >>> mlx5e_close_cq(&c->async_icosq.cq); > >>>+ > >>>+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL); > >>>+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL); > >>> } > >>> static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix) > >>>@@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix, > >>> c->stats = &priv->channel_stats[ix]->ch; > >>> c->aff_mask = irq_get_effective_affinity_mask(irq); > >>> c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix); > >>>+ c->irq = irq; > >>> netif_napi_add(netdev, &c->napi, mlx5e_napi_poll); > >>>@@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c) > >>> mlx5e_activate_xsk(c); > >>> else > >>> mlx5e_activate_rq(&c->rq); 
> >>>+ > >>>+ netif_napi_set_irq(&c->napi, c->irq); > >> > >>Can be safely moved to mlx5e_open_channel without interfering with other > >>existing mapping. This would save the new irq field in mlx5e_channel. > > > >Sure, yea, I have that change queued already from last night. > > > > I see now.. I replied before noticing it. > > >>>+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi); > >> > >>In some configurations we have multiple txqs per channel, so the txq indices > >>are not 1-to-1 with their channel index. > >> > >>This should be called per each txq, with the proper txq index. > >> > >>It should be done also for feture-dedicated SQs (like QOS/HTB). > > > >OK. I think the above makes sense and I'll look into it if I have some time > >this week. > >>>+ netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi); > >> > >>For consistency, I'd move this one as well, to match the TX handling. > > > >Sure. > > > >>> } > >>> static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
On 06/02/2024 21:23, Joe Damato wrote: >> The per queue coalesce series is going through internal code review, and is >> expected to also be ready in a matter of a few weeks. > > OK, great. Thanks for letting me know; we are definitely interested in > using this feature. Hi Joe, Can you please share some details about your usecase for this feature?
On 06/02/2024 21:23, Joe Damato wrote: > On Tue, Feb 06, 2024 at 09:10:27PM +0200, Tariq Toukan wrote: >> >> >> On 06/02/2024 19:12, Joe Damato wrote: >>> On Tue, Feb 06, 2024 at 10:11:28AM +0200, Tariq Toukan wrote: >>>> >>>> >>>> On 06/02/2024 3:03, Joe Damato wrote: >>>>> Make mlx5 compatible with the newly added netlink queue GET APIs. >>>>> >>>>> Signed-off-by: Joe Damato <jdamato@fastly.com> ... > > OK, well I tweaked the v3 I had queued based on your feedback. I am > definitiely not an mlx5 expert, so I have no idea if it's correct. > > The changes can be summed up as: > - mlx5e_activate_channel and mlx5e_deactivate_channel to use > netif_queue_set_napi for each mlx5e_txqsq as it is > activated/deactivated. I assumed sq->txq_ix is the correct index, but I > have no idea. > - mlx5e_activate_qos_sq and mlx5e_deactivate_qos_sq to handle the QOS/HTB > case, similar to the above. > - IRQ storage removed > > If you think that sounds vaguely correct, I can send the v3 tomorrow when > it has been >24hrs as per Rahul's request. > Sounds correct. Please go on and send when it's time so we can review.
On Wed, Feb 07, 2024 at 08:59:18AM +0200, Gal Pressman wrote: > On 06/02/2024 21:23, Joe Damato wrote: > >> The per queue coalesce series is going through internal code review, and is > >> expected to also be ready in a matter of a few weeks. > > > > OK, great. Thanks for letting me know; we are definitely interested in > > using this feature. > > Hi Joe, > Can you please share some details about your usecase for this feature? It was outlined in the cover letter for the RFC [1]. But, briefly: we set a number of queues (say 16) via ethtool. We then create a series of n-tuple filters directing certain flows to queues 0-7 via a custom RSS context. The remaining queues, 8-15 are for all other flows via the default RSS context. Queues 0-7 are used with busy polling from userland so we want those queues to have a larger rx/tx-usecs rx/tx-frames than queues 8-15. We implemented basic support for this in the RFC we sent to the mailing list. [1]: https://lore.kernel.org/lkml/20230823223121.58676-1-dev@nalramli.com/
On Wed, Feb 7, 2024 at 9:26 AM Joe Damato <jdamato@fastly.com> wrote: > > On Wed, Feb 07, 2024 at 08:59:18AM +0200, Gal Pressman wrote: > > On 06/02/2024 21:23, Joe Damato wrote: > > >> The per queue coalesce series is going through internal code review, and is > > >> expected to also be ready in a matter of a few weeks. > > > > > > OK, great. Thanks for letting me know; we are definitely interested in > > > using this feature. > > > > Hi Joe, > > Can you please share some details about your usecase for this feature? > > It was outlined in the cover letter for the RFC [1]. > > But, briefly: we set a number of queues (say 16) via ethtool. We then > create a series of n-tuple filters directing certain flows to queues 0-7 > via a custom RSS context. The remaining queues, 8-15 are for all other > flows via the default RSS context. > > Queues 0-7 are used with busy polling from userland so we want those queues > to have a larger rx/tx-usecs rx/tx-frames than queues 8-15. I am looking forward to trying this to chop some usec off of eBPF. I am curious as to how low you can go... > We implemented basic support for this in the RFC we sent to the mailing > list. thank you for re-citing this: > [1]: https://lore.kernel.org/lkml/20230823223121.58676-1-dev@nalramli.com/ The big feature that I hope appears in some ethernet card someday is the ability to map (say 16k) LPMs to a hw queue, as opposed to a mere tuple. It's the biggest overhead operation we have (presently in vectoring data via ebpf) to libreqos for 10k+ ISP subscribers. >
On Wed, Feb 07, 2024 at 03:23:47PM +0200, Tariq Toukan wrote: > > > On 06/02/2024 21:23, Joe Damato wrote: > >On Tue, Feb 06, 2024 at 09:10:27PM +0200, Tariq Toukan wrote: > >> > >> > >>On 06/02/2024 19:12, Joe Damato wrote: > >>>On Tue, Feb 06, 2024 at 10:11:28AM +0200, Tariq Toukan wrote: > >>>> > >>>> > >>>>On 06/02/2024 3:03, Joe Damato wrote: > >>>>>Make mlx5 compatible with the newly added netlink queue GET APIs. > >>>>> > >>>>>Signed-off-by: Joe Damato <jdamato@fastly.com> > > ... > > > > >OK, well I tweaked the v3 I had queued based on your feedback. I am > >definitiely not an mlx5 expert, so I have no idea if it's correct. > > > >The changes can be summed up as: > > - mlx5e_activate_channel and mlx5e_deactivate_channel to use > > netif_queue_set_napi for each mlx5e_txqsq as it is > > activated/deactivated. I assumed sq->txq_ix is the correct index, but I > > have no idea. > > - mlx5e_activate_qos_sq and mlx5e_deactivate_qos_sq to handle the QOS/HTB > > case, similar to the above. > > - IRQ storage removed > > > >If you think that sounds vaguely correct, I can send the v3 tomorrow when > >it has been >24hrs as per Rahul's request. > > > > Sounds correct. > Please go on and send when it's time so we can review. OK, I'll send it a bit later today. After looking at it again just now, I am wondering if the PTP txqsq case needs to be handled, as well.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 55c6ace0acd5..3f86ee1831a8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -768,6 +768,7 @@ struct mlx5e_channel { u16 qos_sqs_size; u8 num_tc; u8 lag_port; + unsigned int irq; /* XDP_REDIRECT */ struct mlx5e_xdpsq xdpsq; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index c8e8f512803e..e1bfff1fb328 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c) mlx5e_close_tx_cqs(c); mlx5e_close_cq(&c->icosq.cq); mlx5e_close_cq(&c->async_icosq.cq); + + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL); + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL); } static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix) @@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix, c->stats = &priv->channel_stats[ix]->ch; c->aff_mask = irq_get_effective_affinity_mask(irq); c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix); + c->irq = irq; netif_napi_add(netdev, &c->napi, mlx5e_napi_poll); @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c) mlx5e_activate_xsk(c); else mlx5e_activate_rq(&c->rq); + + netif_napi_set_irq(&c->napi, c->irq); + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi); + netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi); } static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
Make mlx5 compatible with the newly added netlink queue GET APIs. Signed-off-by: Joe Damato <jdamato@fastly.com> --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++ 2 files changed, 9 insertions(+)