[RFC,net-next,9/9] gve: Implement queue api

Message ID	20240418195159.3461151-10-shailend@google.com (mailing list archive)
State	RFC
Delegated to:	Netdev Maintainers
Headers	show Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B27E61BED97 for <netdev@vger.kernel.org>; Thu, 18 Apr 2024 19:52:32 +0000 (UTC) Date: Thu, 18 Apr 2024 19:51:59 +0000 In-Reply-To: <20240418195159.3461151-1-shailend@google.com> Precedence: bulk Mime-Version: 1.0 References: <20240418195159.3461151-1-shailend@google.com> Message-ID: <20240418195159.3461151-10-shailend@google.com> Subject: [RFC PATCH net-next 9/9] gve: Implement queue api From: Shailend Chand <shailend@google.com> To: netdev@vger.kernel.org Cc: almasrymina@google.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, willemb@google.com, Shailend Chand <shailend@google.com> Content-Type: text/plain; charset="UTF-8"
Series	gve: Implement netdev queue api \| expand [RFC,net-next,0/9] gve: Implement netdev queue api [RFC,net-next,1/9] queue_api: define queue api [RFC,net-next,2/9] gve: Make the RX free queue funcs idempotent [RFC,net-next,3/9] gve: Add adminq funcs to add/remove a single Rx queue [RFC,net-next,4/9] gve: Make gve_turn(up\|down) ignore stopped queues [RFC,net-next,5/9] gve: Make gve_turnup work for nonempty queues [RFC,net-next,6/9] gve: Avoid rescheduling napi if on wrong cpu [RFC,net-next,7/9] gve: Reset Rx ring state in the ring-stop funcs [RFC,net-next,8/9] gve: Account for stopped queues when reading NIC stats [RFC,net-next,9/9] gve: Implement queue api

Context	Check	Description
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for net-next, async
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 926 this patch: 926
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	warning	6 maintainers not CCed: jeroendb@google.com pkaligineedi@google.com ziweixiao@google.com rushilg@google.com horms@kernel.org hramamurthy@google.com
netdev/build_clang	success	Errors and warnings before: 940 this patch: 940
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 937 this patch: 937
netdev/checkpatch	warning	WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

Shailend Chand April 18, 2024, 7:51 p.m. UTC

An api enabling the net stack to reset driver queues is implemented for
gve.

Signed-off-by: Shailend Chand <shailend@google.com>
---
 drivers/net/ethernet/google/gve/gve.h        |   6 +
 drivers/net/ethernet/google/gve/gve_dqo.h    |   6 +
 drivers/net/ethernet/google/gve/gve_main.c   | 143 +++++++++++++++++++
 drivers/net/ethernet/google/gve/gve_rx.c     |  12 +-
 drivers/net/ethernet/google/gve/gve_rx_dqo.c |  12 +-
 5 files changed, 167 insertions(+), 12 deletions(-)

Jakub Kicinski April 19, 2024, 1:48 a.m. UTC | #1

On Thu, 18 Apr 2024 19:51:59 +0000 Shailend Chand wrote:
> +static int gve_rx_queue_stop(struct net_device *dev, int idx,
> +			     void **out_per_q_mem)
> +{
> +	struct gve_priv *priv = netdev_priv(dev);
> +	struct gve_rx_ring *rx;
> +	int err;
> +
> +	if (!priv->rx)
> +		return -EAGAIN;
> +	if (idx < 0 || idx >= priv->rx_cfg.max_queues)
> +		return -ERANGE;

A little too defensive? Core should not issue these > current real num
queues.

> +	/* Destroying queue 0 while other queues exist is not supported in DQO */
> +	if (!gve_is_gqi(priv) && idx == 0)
> +		return -ERANGE;
> +
> +	rx = kvzalloc(sizeof(*rx), GFP_KERNEL);

Why allocate in the driver rather than let the core allocate based on
the declared size ?

Mina Almasry April 19, 2024, 4:10 p.m. UTC | #2

On Thu, Apr 18, 2024 at 6:48 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 18 Apr 2024 19:51:59 +0000 Shailend Chand wrote:
> > +static int gve_rx_queue_stop(struct net_device *dev, int idx,
> > +                          void **out_per_q_mem)
> > +{
> > +     struct gve_priv *priv = netdev_priv(dev);
> > +     struct gve_rx_ring *rx;
> > +     int err;
> > +
> > +     if (!priv->rx)
> > +             return -EAGAIN;
> > +     if (idx < 0 || idx >= priv->rx_cfg.max_queues)
> > +             return -ERANGE;
>
> A little too defensive? Core should not issue these > current real num
> queues.
>
> > +     /* Destroying queue 0 while other queues exist is not supported in DQO */
> > +     if (!gve_is_gqi(priv) && idx == 0)
> > +             return -ERANGE;
> > +
> > +     rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
>
> Why allocate in the driver rather than let the core allocate based on
> the declared size ?
>

Currently the ndos don't include an interface for the driver to
declare the size, right? In theory we could add it to the ndos like
so, if I understood you correctly (untested yet, just to illustrate
what I'm thinking point):

diff --git a/drivers/net/ethernet/google/gve/gve_main.c
b/drivers/net/ethernet/google/gve/gve_main.c
index 7c38dc06a392..efe3944b529a 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -2579,11 +2579,16 @@ static void gve_write_version(u8 __iomem
*driver_version_register)
  writeb('\n', driver_version_register);
 }

+static size_t gve_rx_queue_mem_get_size(void)
+{
+ return sizeof(struct gve_rx_ring);
+}
+
 static int gve_rx_queue_stop(struct net_device *dev, int idx,
-      void **out_per_q_mem)
+      void *out_per_q_mem)
 {
  struct gve_priv *priv = netdev_priv(dev);
- struct gve_rx_ring *rx;
+ struct gve_rx_ring *rx = out_per_q_mem;
  int err;

  if (!priv->rx)
@@ -2595,9 +2600,6 @@ static int gve_rx_queue_stop(struct net_device
*dev, int idx,
  if (!gve_is_gqi(priv) && idx == 0)
  return -ERANGE;

- rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
- if (!rx)
- return -ENOMEM;
  *rx = priv->rx[idx];

  /* Single-queue destruction requires quiescence on all queues */
@@ -2606,7 +2608,6 @@ static int gve_rx_queue_stop(struct net_device
*dev, int idx,
  /* This failure will trigger a reset - no need to clean up */
  err = gve_adminq_destroy_single_rx_queue(priv, idx);
  if (err) {
- kvfree(rx);
  return err;
  }

@@ -2618,7 +2619,6 @@ static int gve_rx_queue_stop(struct net_device
*dev, int idx,
  /* Turn the unstopped queues back up */
  gve_turnup_and_check_status(priv);

- *out_per_q_mem = rx;
  return 0;
 }

@@ -2709,6 +2709,7 @@ static const struct netdev_queue_mgmt_ops
gve_queue_mgmt_ops = {
  .ndo_queue_mem_free = gve_rx_queue_mem_free,
  .ndo_queue_start = gve_rx_queue_start,
  .ndo_queue_stop = gve_rx_queue_stop,
+ .ndo_queue_mem_get_size = gve_rx_queue_mem_get_size,
 };

 static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
index 337df0860ae6..0af08414d8eb 100644
--- a/include/net/netdev_queues.h
+++ b/include/net/netdev_queues.h
@@ -75,6 +75,7 @@ struct netdev_stat_ops {
  * @ndo_queue_stop: Stop the RX queue at the specified index.
  */
 struct netdev_queue_mgmt_ops {
+ size_t (*ndo_queue_mem_get_size)(void);
  void * (*ndo_queue_mem_alloc)(struct net_device *dev,
         int idx);
  void (*ndo_queue_mem_free)(struct net_device *dev,
@@ -84,7 +85,7 @@ struct netdev_queue_mgmt_ops {
     void *queue_mem);
  int (*ndo_queue_stop)(struct net_device *dev,
    int idx,
-   void **out_queue_mem);
+   void *out_queue_mem);
 };

 /**
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 01337de7d6a4..89c90e21f083 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -60,7 +60,8 @@ static int net_devmem_restart_rx_queue(struct
net_device *dev, int rxq_idx)
  void *old_mem;
  int err;

- if (!dev->queue_mgmt_ops->ndo_queue_stop ||
+ if (!dev->queue_mgmt_ops->ndo_queue_mem_get_size ||
+     !dev->queue_mgmt_ops->ndo_queue_stop ||
      !dev->queue_mgmt_ops->ndo_queue_mem_free ||
      !dev->queue_mgmt_ops->ndo_queue_mem_alloc ||
      !dev->queue_mgmt_ops->ndo_queue_start)
@@ -70,7 +71,11 @@ static int net_devmem_restart_rx_queue(struct
net_device *dev, int rxq_idx)
  if (!new_mem)
  return -ENOMEM;

- err = dev->queue_mgmt_ops->ndo_queue_stop(dev, rxq_idx, &old_mem);
+ old_mem = kvzalloc(dev->queue_mgmt_ops->ndo_queue_mem_get_size(),
+    GFP_KERNEL);
+ BUG_ON(!old_mem); /* TODO */
+
+ err = dev->queue_mgmt_ops->ndo_queue_stop(dev, rxq_idx, old_mem);
  if (err)
  goto err_free_new_mem;

@@ -79,6 +84,7 @@ static int net_devmem_restart_rx_queue(struct
net_device *dev, int rxq_idx)
  goto err_start_queue;

  dev->queue_mgmt_ops->ndo_queue_mem_free(dev, old_mem);
+ kvfree(old_mem);

  return 0;


I think maybe if we want to apply this change to mem_stop, then we
should probably also apply this change to queue_mem_alloc as well,
right? I.e. core will allocate the pointer, and ndo_queue_mem_alloc
would allocate the actual resources and would fill in the entries of
the pointer? Is this what you're looking for here?

--
Thanks,
Mina

Shailend Chand April 19, 2024, 10:23 p.m. UTC | #3

On Thu, Apr 18, 2024 at 12:52 PM Shailend Chand <shailend@google.com> wrote:
>
> An api enabling the net stack to reset driver queues is implemented for
> gve.
>
> Signed-off-by: Shailend Chand <shailend@google.com>
> ---
>  drivers/net/ethernet/google/gve/gve.h        |   6 +
>  drivers/net/ethernet/google/gve/gve_dqo.h    |   6 +
>  drivers/net/ethernet/google/gve/gve_main.c   | 143 +++++++++++++++++++
>  drivers/net/ethernet/google/gve/gve_rx.c     |  12 +-
>  drivers/net/ethernet/google/gve/gve_rx_dqo.c |  12 +-
>  5 files changed, 167 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
> index 9f6a897c87cb..d752e525bde7 100644
> --- a/drivers/net/ethernet/google/gve/gve.h
> +++ b/drivers/net/ethernet/google/gve/gve.h
> @@ -1147,6 +1147,12 @@ bool gve_tx_clean_pending(struct gve_priv *priv, struct gve_tx_ring *tx);
>  void gve_rx_write_doorbell(struct gve_priv *priv, struct gve_rx_ring *rx);
>  int gve_rx_poll(struct gve_notify_block *block, int budget);
>  bool gve_rx_work_pending(struct gve_rx_ring *rx);
> +int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
> +                         struct gve_rx_alloc_rings_cfg *cfg,
> +                         struct gve_rx_ring *rx,
> +                         int idx);
> +void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
> +                         struct gve_rx_alloc_rings_cfg *cfg);
>  int gve_rx_alloc_rings(struct gve_priv *priv);
>  int gve_rx_alloc_rings_gqi(struct gve_priv *priv,
>                            struct gve_rx_alloc_rings_cfg *cfg);
> diff --git a/drivers/net/ethernet/google/gve/gve_dqo.h b/drivers/net/ethernet/google/gve/gve_dqo.h
> index b81584829c40..e83773fb891f 100644
> --- a/drivers/net/ethernet/google/gve/gve_dqo.h
> +++ b/drivers/net/ethernet/google/gve/gve_dqo.h
> @@ -44,6 +44,12 @@ void gve_tx_free_rings_dqo(struct gve_priv *priv,
>                            struct gve_tx_alloc_rings_cfg *cfg);
>  void gve_tx_start_ring_dqo(struct gve_priv *priv, int idx);
>  void gve_tx_stop_ring_dqo(struct gve_priv *priv, int idx);
> +int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
> +                         struct gve_rx_alloc_rings_cfg *cfg,
> +                         struct gve_rx_ring *rx,
> +                         int idx);
> +void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
> +                         struct gve_rx_alloc_rings_cfg *cfg);
>  int gve_rx_alloc_rings_dqo(struct gve_priv *priv,
>                            struct gve_rx_alloc_rings_cfg *cfg);
>  void gve_rx_free_rings_dqo(struct gve_priv *priv,
> diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
> index c348dff7cca6..5e652958f10f 100644
> --- a/drivers/net/ethernet/google/gve/gve_main.c
> +++ b/drivers/net/ethernet/google/gve/gve_main.c
> @@ -17,6 +17,7 @@
>  #include <linux/workqueue.h>
>  #include <linux/utsname.h>
>  #include <linux/version.h>
> +#include <net/netdev_queues.h>
>  #include <net/sch_generic.h>
>  #include <net/xdp_sock_drv.h>
>  #include "gve.h"
> @@ -2070,6 +2071,15 @@ static void gve_turnup(struct gve_priv *priv)
>         gve_set_napi_enabled(priv);
>  }
>
> +static void gve_turnup_and_check_status(struct gve_priv *priv)
> +{
> +       u32 status;
> +
> +       gve_turnup(priv);
> +       status = ioread32be(&priv->reg_bar0->device_status);
> +       gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
> +}
> +
>  static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
>  {
>         struct gve_notify_block *block;
> @@ -2530,6 +2540,138 @@ static void gve_write_version(u8 __iomem *driver_version_register)
>         writeb('\n', driver_version_register);
>  }
>
> +static int gve_rx_queue_stop(struct net_device *dev, int idx,
> +                            void **out_per_q_mem)
> +{
> +       struct gve_priv *priv = netdev_priv(dev);
> +       struct gve_rx_ring *rx;
> +       int err;
> +
> +       if (!priv->rx)
> +               return -EAGAIN;
> +       if (idx < 0 || idx >= priv->rx_cfg.max_queues)
> +               return -ERANGE;
> +
> +       /* Destroying queue 0 while other queues exist is not supported in DQO */
> +       if (!gve_is_gqi(priv) && idx == 0)
> +               return -ERANGE;
> +
> +       rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
> +       if (!rx)
> +               return -ENOMEM;
> +       *rx = priv->rx[idx];
> +
> +       /* Single-queue destruction requires quiescence on all queues */
> +       gve_turndown(priv);
> +
> +       /* This failure will trigger a reset - no need to clean up */
> +       err = gve_adminq_destroy_single_rx_queue(priv, idx);
> +       if (err) {
> +               kvfree(rx);
> +               return err;
> +       }
> +
> +       if (gve_is_gqi(priv))
> +               gve_rx_stop_ring_gqi(priv, idx);
> +       else
> +               gve_rx_stop_ring_dqo(priv, idx);
> +
> +       /* Turn the unstopped queues back up */
> +       gve_turnup_and_check_status(priv);
> +
> +       *out_per_q_mem = rx;
> +       return 0;
> +}
> +
> +static void gve_rx_queue_mem_free(struct net_device *dev, void *per_q_mem)
> +{
> +       struct gve_priv *priv = netdev_priv(dev);
> +       struct gve_rx_alloc_rings_cfg cfg = {0};
> +       struct gve_rx_ring *rx;
> +
> +       gve_rx_get_curr_alloc_cfg(priv, &cfg);
> +       rx = (struct gve_rx_ring *)per_q_mem;
> +       if (!rx)
> +               return;
> +
> +       if (gve_is_gqi(priv))
> +               gve_rx_free_ring_gqi(priv, rx, &cfg);
> +       else
> +               gve_rx_free_ring_dqo(priv, rx, &cfg);
> +
> +       kvfree(per_q_mem);
> +}
> +
> +static void *gve_rx_queue_mem_alloc(struct net_device *dev, int idx)
> +{
> +       struct gve_priv *priv = netdev_priv(dev);
> +       struct gve_rx_alloc_rings_cfg cfg = {0};
> +       struct gve_rx_ring *rx;
> +       int err;
> +
> +       gve_rx_get_curr_alloc_cfg(priv, &cfg);
> +       if (idx < 0 || idx >= cfg.qcfg->max_queues)
> +               return NULL;
> +
> +       rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
> +       if (!rx)
> +               return NULL;
> +
> +       if (gve_is_gqi(priv))
> +               err = gve_rx_alloc_ring_gqi(priv, &cfg, rx, idx);
> +       else
> +               err = gve_rx_alloc_ring_dqo(priv, &cfg, rx, idx);
> +
> +       if (err) {
> +               kvfree(rx);
> +               return NULL;
> +       }
> +       return rx;
> +}
> +
> +static int gve_rx_queue_start(struct net_device *dev, int idx, void *per_q_mem)
> +{
> +       struct gve_priv *priv = netdev_priv(dev);
> +       struct gve_rx_ring *rx;
> +       int err;
> +
> +       if (!priv->rx)
> +               return -EAGAIN;
> +       if (idx < 0 || idx >= priv->rx_cfg.max_queues)
> +               return -ERANGE;
> +       rx = (struct gve_rx_ring *)per_q_mem;
> +       priv->rx[idx] = *rx;
> +
> +       /* Single-queue creation requires quiescence on all queues */
> +       gve_turndown(priv);
> +
> +       if (gve_is_gqi(priv))
> +               gve_rx_start_ring_gqi(priv, idx);
> +       else
> +               gve_rx_start_ring_dqo(priv, idx);
> +
> +       /* This failure will trigger a reset - no need to clean up */
> +       err = gve_adminq_create_single_rx_queue(priv, idx);
> +       if (err)
> +               return err;
> +
> +       if (gve_is_gqi(priv))
> +               gve_rx_write_doorbell(priv, &priv->rx[idx]);
> +       else
> +               gve_rx_post_buffers_dqo(&priv->rx[idx]);
> +
> +       /* Turn the unstopped queues back up */
> +       gve_turnup_and_check_status(priv);
> +       return 0;
> +}

I realized that due to not kvfree-ing the passed-in `per_q_mem`, there
is a leak. The alloc and stop hooks kvzalloc
a temp ring struct, which means the start and free hooks ought to have
kvfreed them to keep symmetry and avoid leaking.
The free hook is doing it but I forgot to do it in the start hook.

If we go down the route of making core aware of the ring struct size,
then none of the four hooks
need to worry about the temp struct: core can just alloc and free it
for both the old and new queue.

> +
> +static const struct netdev_queue_mgmt_ops gve_queue_mgmt_ops = {
> +       .ndo_queue_mem_alloc    =       gve_rx_queue_mem_alloc,
> +       .ndo_queue_mem_free     =       gve_rx_queue_mem_free,
> +       .ndo_queue_start        =       gve_rx_queue_start,
> +       .ndo_queue_stop         =       gve_rx_queue_stop,
> +};
> +
>  static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  {
>         int max_tx_queues, max_rx_queues;
> @@ -2584,6 +2726,7 @@ static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>         pci_set_drvdata(pdev, dev);
>         dev->ethtool_ops = &gve_ethtool_ops;
>         dev->netdev_ops = &gve_netdev_ops;
> +       dev->queue_mgmt_ops = &gve_queue_mgmt_ops;
>
>         /* Set default and supported features.
>          *
> diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
> index 1d235caab4c5..307bf97d4778 100644
> --- a/drivers/net/ethernet/google/gve/gve_rx.c
> +++ b/drivers/net/ethernet/google/gve/gve_rx.c
> @@ -101,8 +101,8 @@ void gve_rx_stop_ring_gqi(struct gve_priv *priv, int idx)
>         gve_rx_reset_ring_gqi(priv, idx);
>  }
>
> -static void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
> -                                struct gve_rx_alloc_rings_cfg *cfg)
> +void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
> +                         struct gve_rx_alloc_rings_cfg *cfg)
>  {
>         struct device *dev = &priv->pdev->dev;
>         u32 slots = rx->mask + 1;
> @@ -270,10 +270,10 @@ void gve_rx_start_ring_gqi(struct gve_priv *priv, int idx)
>         gve_add_napi(priv, ntfy_idx, gve_napi_poll);
>  }
>
> -static int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
> -                                struct gve_rx_alloc_rings_cfg *cfg,
> -                                struct gve_rx_ring *rx,
> -                                int idx)
> +int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
> +                         struct gve_rx_alloc_rings_cfg *cfg,
> +                         struct gve_rx_ring *rx,
> +                         int idx)
>  {
>         struct device *hdev = &priv->pdev->dev;
>         u32 slots = cfg->ring_size;
> diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
> index dc2c6bd92e82..dcbc37118870 100644
> --- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
> +++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
> @@ -299,8 +299,8 @@ void gve_rx_stop_ring_dqo(struct gve_priv *priv, int idx)
>         gve_rx_reset_ring_dqo(priv, idx);
>  }
>
> -static void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
> -                                struct gve_rx_alloc_rings_cfg *cfg)
> +void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
> +                         struct gve_rx_alloc_rings_cfg *cfg)
>  {
>         struct device *hdev = &priv->pdev->dev;
>         size_t completion_queue_slots;
> @@ -373,10 +373,10 @@ void gve_rx_start_ring_dqo(struct gve_priv *priv, int idx)
>         gve_add_napi(priv, ntfy_idx, gve_napi_poll_dqo);
>  }
>
> -static int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
> -                                struct gve_rx_alloc_rings_cfg *cfg,
> -                                struct gve_rx_ring *rx,
> -                                int idx)
> +int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
> +                         struct gve_rx_alloc_rings_cfg *cfg,
> +                         struct gve_rx_ring *rx,
> +                         int idx)
>  {
>         struct device *hdev = &priv->pdev->dev;
>         size_t size;
> --
> 2.44.0.769.g3c40516874-goog
>

Jakub Kicinski April 20, 2024, 3:25 a.m. UTC | #4

On Fri, 19 Apr 2024 09:10:42 -0700 Mina Almasry wrote:
> Currently the ndos don't include an interface for the driver to
> declare the size, right? In theory we could add it to the ndos like
> so, if I understood you correctly (untested yet, just to illustrate
> what I'm thinking point):
> 
> diff --git a/drivers/net/ethernet/google/gve/gve_main.c
> b/drivers/net/ethernet/google/gve/gve_main.c
> index 7c38dc06a392..efe3944b529a 100644
> --- a/drivers/net/ethernet/google/gve/gve_main.c
> +++ b/drivers/net/ethernet/google/gve/gve_main.c
> @@ -2579,11 +2579,16 @@ static void gve_write_version(u8 __iomem
> *driver_version_register)
>   writeb('\n', driver_version_register);
>  }
> 
> +static size_t gve_rx_queue_mem_get_size(void)
> +{
> + return sizeof(struct gve_rx_ring);
> +}

> @@ -2709,6 +2709,7 @@ static const struct netdev_queue_mgmt_ops
> gve_queue_mgmt_ops = {
>   .ndo_queue_mem_free = gve_rx_queue_mem_free,
>   .ndo_queue_start = gve_rx_queue_start,
>   .ndo_queue_stop = gve_rx_queue_stop,
> + .ndo_queue_mem_get_size = gve_rx_queue_mem_get_size,
>  };

I don't think we need to make it a callback, even, directly:

const struct netdev_queue_mgmt_ops gve_queue_mgmt_ops = {
+	.queue_mem_size		= sizeof(struct gve_rx_ring),
 	.ndo_queue_mem_free	= gve_rx_queue_mem_free,
 	.ndo_queue_start	= gve_rx_queue_start,
 	.ndo_queue_stop		= gve_rx_queue_stop,

> I think maybe if we want to apply this change to mem_stop, then we
> should probably also apply this change to queue_mem_alloc as well,
> right? I.e. core will allocate the pointer, and ndo_queue_mem_alloc
> would allocate the actual resources and would fill in the entries of
> the pointer? Is this what you're looking for here?

Yup. But thinking about it again, this may be more natural once we also
have the open/close path use the queue API. IIUC for now the driver
allocates the queue resources on open, without going via .ndo_queue_*
If that's the case we can keep the code as you have it here.

Mina Almasry April 22, 2024, 4:58 p.m. UTC | #5

On Fri, Apr 19, 2024 at 8:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 19 Apr 2024 09:10:42 -0700 Mina Almasry wrote:
> > Currently the ndos don't include an interface for the driver to
> > declare the size, right? In theory we could add it to the ndos like
> > so, if I understood you correctly (untested yet, just to illustrate
> > what I'm thinking point):
> >
> > diff --git a/drivers/net/ethernet/google/gve/gve_main.c
> > b/drivers/net/ethernet/google/gve/gve_main.c
> > index 7c38dc06a392..efe3944b529a 100644
> > --- a/drivers/net/ethernet/google/gve/gve_main.c
> > +++ b/drivers/net/ethernet/google/gve/gve_main.c
> > @@ -2579,11 +2579,16 @@ static void gve_write_version(u8 __iomem
> > *driver_version_register)
> >   writeb('\n', driver_version_register);
> >  }
> >
> > +static size_t gve_rx_queue_mem_get_size(void)
> > +{
> > + return sizeof(struct gve_rx_ring);
> > +}
>
> > @@ -2709,6 +2709,7 @@ static const struct netdev_queue_mgmt_ops
> > gve_queue_mgmt_ops = {
> >   .ndo_queue_mem_free = gve_rx_queue_mem_free,
> >   .ndo_queue_start = gve_rx_queue_start,
> >   .ndo_queue_stop = gve_rx_queue_stop,
> > + .ndo_queue_mem_get_size = gve_rx_queue_mem_get_size,
> >  };
>
> I don't think we need to make it a callback, even, directly:
>
> const struct netdev_queue_mgmt_ops gve_queue_mgmt_ops = {
> +       .queue_mem_size         = sizeof(struct gve_rx_ring),
>         .ndo_queue_mem_free     = gve_rx_queue_mem_free,
>         .ndo_queue_start        = gve_rx_queue_start,
>         .ndo_queue_stop         = gve_rx_queue_stop,
>
> > I think maybe if we want to apply this change to mem_stop, then we
> > should probably also apply this change to queue_mem_alloc as well,
> > right? I.e. core will allocate the pointer, and ndo_queue_mem_alloc
> > would allocate the actual resources and would fill in the entries of
> > the pointer? Is this what you're looking for here?
>
> Yup. But thinking about it again, this may be more natural once we also
> have the open/close path use the queue API. IIUC for now the driver
> allocates the queue resources on open, without going via .ndo_queue_*
> If that's the case we can keep the code as you have it here.

Yes, as currently written the queue API funcs & gve_open/close use the
same internal gve helpers, but they do not go via .ndo_queue_*.

OK, sounds like you would like us to keep this bit of the code as we
have it here. Let us know if you have any other feedback. I think
we're working through final testing/polishing/code reviewing
internally and will be submitting something very similar to this as
non-RFC for review, with whatever feedback we receive in the meantime.

Jakub Kicinski April 22, 2024, 6:41 p.m. UTC | #6

On Mon, 22 Apr 2024 09:58:38 -0700 Mina Almasry wrote:
> OK, sounds like you would like us to keep this bit of the code as we
> have it here. Let us know if you have any other feedback. I think
> we're working through final testing/polishing/code reviewing
> internally and will be submitting something very similar to this as
> non-RFC for review, with whatever feedback we receive in the meantime.

No other notes, thanks!

David Wei April 23, 2024, 5:33 p.m. UTC | #7

On 2024-04-18 12:51 pm, Shailend Chand wrote:
> An api enabling the net stack to reset driver queues is implemented for
> gve.
> 
> Signed-off-by: Shailend Chand <shailend@google.com>
> ---
>  drivers/net/ethernet/google/gve/gve.h        |   6 +
>  drivers/net/ethernet/google/gve/gve_dqo.h    |   6 +
>  drivers/net/ethernet/google/gve/gve_main.c   | 143 +++++++++++++++++++
>  drivers/net/ethernet/google/gve/gve_rx.c     |  12 +-
>  drivers/net/ethernet/google/gve/gve_rx_dqo.c |  12 +-
>  5 files changed, 167 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
> index 9f6a897c87cb..d752e525bde7 100644
> --- a/drivers/net/ethernet/google/gve/gve.h
> +++ b/drivers/net/ethernet/google/gve/gve.h
> @@ -1147,6 +1147,12 @@ bool gve_tx_clean_pending(struct gve_priv *priv, struct gve_tx_ring *tx);
>  void gve_rx_write_doorbell(struct gve_priv *priv, struct gve_rx_ring *rx);
>  int gve_rx_poll(struct gve_notify_block *block, int budget);
>  bool gve_rx_work_pending(struct gve_rx_ring *rx);
> +int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
> +			  struct gve_rx_alloc_rings_cfg *cfg,
> +			  struct gve_rx_ring *rx,
> +			  int idx);
> +void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
> +			  struct gve_rx_alloc_rings_cfg *cfg);
>  int gve_rx_alloc_rings(struct gve_priv *priv);
>  int gve_rx_alloc_rings_gqi(struct gve_priv *priv,
>  			   struct gve_rx_alloc_rings_cfg *cfg);
> diff --git a/drivers/net/ethernet/google/gve/gve_dqo.h b/drivers/net/ethernet/google/gve/gve_dqo.h
> index b81584829c40..e83773fb891f 100644
> --- a/drivers/net/ethernet/google/gve/gve_dqo.h
> +++ b/drivers/net/ethernet/google/gve/gve_dqo.h
> @@ -44,6 +44,12 @@ void gve_tx_free_rings_dqo(struct gve_priv *priv,
>  			   struct gve_tx_alloc_rings_cfg *cfg);
>  void gve_tx_start_ring_dqo(struct gve_priv *priv, int idx);
>  void gve_tx_stop_ring_dqo(struct gve_priv *priv, int idx);
> +int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
> +			  struct gve_rx_alloc_rings_cfg *cfg,
> +			  struct gve_rx_ring *rx,
> +			  int idx);
> +void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
> +			  struct gve_rx_alloc_rings_cfg *cfg);
>  int gve_rx_alloc_rings_dqo(struct gve_priv *priv,
>  			   struct gve_rx_alloc_rings_cfg *cfg);
>  void gve_rx_free_rings_dqo(struct gve_priv *priv,
> diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
> index c348dff7cca6..5e652958f10f 100644
> --- a/drivers/net/ethernet/google/gve/gve_main.c
> +++ b/drivers/net/ethernet/google/gve/gve_main.c
> @@ -17,6 +17,7 @@
>  #include <linux/workqueue.h>
>  #include <linux/utsname.h>
>  #include <linux/version.h>
> +#include <net/netdev_queues.h>
>  #include <net/sch_generic.h>
>  #include <net/xdp_sock_drv.h>
>  #include "gve.h"
> @@ -2070,6 +2071,15 @@ static void gve_turnup(struct gve_priv *priv)
>  	gve_set_napi_enabled(priv);
>  }
>  
> +static void gve_turnup_and_check_status(struct gve_priv *priv)
> +{
> +	u32 status;
> +
> +	gve_turnup(priv);
> +	status = ioread32be(&priv->reg_bar0->device_status);
> +	gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
> +}
> +
>  static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
>  {
>  	struct gve_notify_block *block;
> @@ -2530,6 +2540,138 @@ static void gve_write_version(u8 __iomem *driver_version_register)
>  	writeb('\n', driver_version_register);
>  }
>  
> +static int gve_rx_queue_stop(struct net_device *dev, int idx,
> +			     void **out_per_q_mem)
> +{
> +	struct gve_priv *priv = netdev_priv(dev);
> +	struct gve_rx_ring *rx;
> +	int err;
> +
> +	if (!priv->rx)
> +		return -EAGAIN;
> +	if (idx < 0 || idx >= priv->rx_cfg.max_queues)
> +		return -ERANGE;
> +
> +	/* Destroying queue 0 while other queues exist is not supported in DQO */
> +	if (!gve_is_gqi(priv) && idx == 0)
> +		return -ERANGE;
> +
> +	rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
> +	if (!rx)
> +		return -ENOMEM;
> +	*rx = priv->rx[idx];
> +
> +	/* Single-queue destruction requires quiescence on all queues */
> +	gve_turndown(priv);
> +
> +	/* This failure will trigger a reset - no need to clean up */
> +	err = gve_adminq_destroy_single_rx_queue(priv, idx);
> +	if (err) {
> +		kvfree(rx);
> +		return err;
> +	}
> +
> +	if (gve_is_gqi(priv))
> +		gve_rx_stop_ring_gqi(priv, idx);
> +	else
> +		gve_rx_stop_ring_dqo(priv, idx);

Could these be pulled out into a helper? I see it repeated a lot.

> +
> +	/* Turn the unstopped queues back up */
> +	gve_turnup_and_check_status(priv);
> +
> +	*out_per_q_mem = rx;
> +	return 0;
> +}
> +
> +static void gve_rx_queue_mem_free(struct net_device *dev, void *per_q_mem)
> +{
> +	struct gve_priv *priv = netdev_priv(dev);
> +	struct gve_rx_alloc_rings_cfg cfg = {0};
> +	struct gve_rx_ring *rx;
> +
> +	gve_rx_get_curr_alloc_cfg(priv, &cfg);
> +	rx = (struct gve_rx_ring *)per_q_mem;
> +	if (!rx)
> +		return;

This can be checked earlier.

> +
> +	if (gve_is_gqi(priv))
> +		gve_rx_free_ring_gqi(priv, rx, &cfg);
> +	else
> +		gve_rx_free_ring_dqo(priv, rx, &cfg);
> +
> +	kvfree(per_q_mem);
> +}
> +
> +static void *gve_rx_queue_mem_alloc(struct net_device *dev, int idx)
> +{
> +	struct gve_priv *priv = netdev_priv(dev);
> +	struct gve_rx_alloc_rings_cfg cfg = {0};
> +	struct gve_rx_ring *rx;
> +	int err;
> +
> +	gve_rx_get_curr_alloc_cfg(priv, &cfg);
> +	if (idx < 0 || idx >= cfg.qcfg->max_queues)
> +		return NULL;
> +
> +	rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
> +	if (!rx)
> +		return NULL;
> +
> +	if (gve_is_gqi(priv))
> +		err = gve_rx_alloc_ring_gqi(priv, &cfg, rx, idx);
> +	else
> +		err = gve_rx_alloc_ring_dqo(priv, &cfg, rx, idx);
> +
> +	if (err) {
> +		kvfree(rx);
> +		return NULL;
> +	}
> +	return rx;
> +}
> +
> +static int gve_rx_queue_start(struct net_device *dev, int idx, void *per_q_mem)
> +{
> +	struct gve_priv *priv = netdev_priv(dev);
> +	struct gve_rx_ring *rx;
> +	int err;
> +
> +	if (!priv->rx)
> +		return -EAGAIN;
> +	if (idx < 0 || idx >= priv->rx_cfg.max_queues)
> +		return -ERANGE;
> +	rx = (struct gve_rx_ring *)per_q_mem;
> +	priv->rx[idx] = *rx;
> +
> +	/* Single-queue creation requires quiescence on all queues */
> +	gve_turndown(priv);
> +
> +	if (gve_is_gqi(priv))
> +		gve_rx_start_ring_gqi(priv, idx);
> +	else
> +		gve_rx_start_ring_dqo(priv, idx);
> +
> +	/* This failure will trigger a reset - no need to clean up */
> +	err = gve_adminq_create_single_rx_queue(priv, idx);
> +	if (err)
> +		return err;
> +
> +	if (gve_is_gqi(priv))
> +		gve_rx_write_doorbell(priv, &priv->rx[idx]);
> +	else
> +		gve_rx_post_buffers_dqo(&priv->rx[idx]);
> +
> +	/* Turn the unstopped queues back up */
> +	gve_turnup_and_check_status(priv);
> +	return 0;
> +}
> +
> +static const struct netdev_queue_mgmt_ops gve_queue_mgmt_ops = {
> +	.ndo_queue_mem_alloc	=	gve_rx_queue_mem_alloc,
> +	.ndo_queue_mem_free	=	gve_rx_queue_mem_free,
> +	.ndo_queue_start	=	gve_rx_queue_start,
> +	.ndo_queue_stop		=	gve_rx_queue_stop,
> +};
> +
>  static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  {
>  	int max_tx_queues, max_rx_queues;
> @@ -2584,6 +2726,7 @@ static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	pci_set_drvdata(pdev, dev);
>  	dev->ethtool_ops = &gve_ethtool_ops;
>  	dev->netdev_ops = &gve_netdev_ops;
> +	dev->queue_mgmt_ops = &gve_queue_mgmt_ops;
>  
>  	/* Set default and supported features.
>  	 *
> diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
> index 1d235caab4c5..307bf97d4778 100644
> --- a/drivers/net/ethernet/google/gve/gve_rx.c
> +++ b/drivers/net/ethernet/google/gve/gve_rx.c
> @@ -101,8 +101,8 @@ void gve_rx_stop_ring_gqi(struct gve_priv *priv, int idx)
>  	gve_rx_reset_ring_gqi(priv, idx);
>  }
>  
> -static void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
> -				 struct gve_rx_alloc_rings_cfg *cfg)
> +void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
> +			  struct gve_rx_alloc_rings_cfg *cfg)
>  {
>  	struct device *dev = &priv->pdev->dev;
>  	u32 slots = rx->mask + 1;
> @@ -270,10 +270,10 @@ void gve_rx_start_ring_gqi(struct gve_priv *priv, int idx)
>  	gve_add_napi(priv, ntfy_idx, gve_napi_poll);
>  }
>  
> -static int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
> -				 struct gve_rx_alloc_rings_cfg *cfg,
> -				 struct gve_rx_ring *rx,
> -				 int idx)
> +int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
> +			  struct gve_rx_alloc_rings_cfg *cfg,
> +			  struct gve_rx_ring *rx,
> +			  int idx)
>  {
>  	struct device *hdev = &priv->pdev->dev;
>  	u32 slots = cfg->ring_size;
> diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
> index dc2c6bd92e82..dcbc37118870 100644
> --- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
> +++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
> @@ -299,8 +299,8 @@ void gve_rx_stop_ring_dqo(struct gve_priv *priv, int idx)
>  	gve_rx_reset_ring_dqo(priv, idx);
>  }
>  
> -static void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
> -				 struct gve_rx_alloc_rings_cfg *cfg)
> +void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
> +			  struct gve_rx_alloc_rings_cfg *cfg)
>  {
>  	struct device *hdev = &priv->pdev->dev;
>  	size_t completion_queue_slots;
> @@ -373,10 +373,10 @@ void gve_rx_start_ring_dqo(struct gve_priv *priv, int idx)
>  	gve_add_napi(priv, ntfy_idx, gve_napi_poll_dqo);
>  }
>  
> -static int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
> -				 struct gve_rx_alloc_rings_cfg *cfg,
> -				 struct gve_rx_ring *rx,
> -				 int idx)
> +int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
> +			  struct gve_rx_alloc_rings_cfg *cfg,
> +			  struct gve_rx_ring *rx,
> +			  int idx)
>  {
>  	struct device *hdev = &priv->pdev->dev;
>  	size_t size;

David Wei April 23, 2024, 5:55 p.m. UTC | #8

On 2024-04-19 3:23 pm, Shailend Chand wrote:
> On Thu, Apr 18, 2024 at 12:52 PM Shailend Chand <shailend@google.com> wrote:
>>
>> An api enabling the net stack to reset driver queues is implemented for
>> gve.
>>
>> Signed-off-by: Shailend Chand <shailend@google.com>
>> ---
>>  drivers/net/ethernet/google/gve/gve.h        |   6 +
>>  drivers/net/ethernet/google/gve/gve_dqo.h    |   6 +
>>  drivers/net/ethernet/google/gve/gve_main.c   | 143 +++++++++++++++++++
>>  drivers/net/ethernet/google/gve/gve_rx.c     |  12 +-
>>  drivers/net/ethernet/google/gve/gve_rx_dqo.c |  12 +-
>>  5 files changed, 167 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
>> index 9f6a897c87cb..d752e525bde7 100644
>> --- a/drivers/net/ethernet/google/gve/gve.h
>> +++ b/drivers/net/ethernet/google/gve/gve.h
>> @@ -1147,6 +1147,12 @@ bool gve_tx_clean_pending(struct gve_priv *priv, struct gve_tx_ring *tx);
>>  void gve_rx_write_doorbell(struct gve_priv *priv, struct gve_rx_ring *rx);
>>  int gve_rx_poll(struct gve_notify_block *block, int budget);
>>  bool gve_rx_work_pending(struct gve_rx_ring *rx);
>> +int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
>> +                         struct gve_rx_alloc_rings_cfg *cfg,
>> +                         struct gve_rx_ring *rx,
>> +                         int idx);
>> +void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
>> +                         struct gve_rx_alloc_rings_cfg *cfg);
>>  int gve_rx_alloc_rings(struct gve_priv *priv);
>>  int gve_rx_alloc_rings_gqi(struct gve_priv *priv,
>>                            struct gve_rx_alloc_rings_cfg *cfg);
>> diff --git a/drivers/net/ethernet/google/gve/gve_dqo.h b/drivers/net/ethernet/google/gve/gve_dqo.h
>> index b81584829c40..e83773fb891f 100644
>> --- a/drivers/net/ethernet/google/gve/gve_dqo.h
>> +++ b/drivers/net/ethernet/google/gve/gve_dqo.h
>> @@ -44,6 +44,12 @@ void gve_tx_free_rings_dqo(struct gve_priv *priv,
>>                            struct gve_tx_alloc_rings_cfg *cfg);
>>  void gve_tx_start_ring_dqo(struct gve_priv *priv, int idx);
>>  void gve_tx_stop_ring_dqo(struct gve_priv *priv, int idx);
>> +int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
>> +                         struct gve_rx_alloc_rings_cfg *cfg,
>> +                         struct gve_rx_ring *rx,
>> +                         int idx);
>> +void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
>> +                         struct gve_rx_alloc_rings_cfg *cfg);
>>  int gve_rx_alloc_rings_dqo(struct gve_priv *priv,
>>                            struct gve_rx_alloc_rings_cfg *cfg);
>>  void gve_rx_free_rings_dqo(struct gve_priv *priv,
>> diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
>> index c348dff7cca6..5e652958f10f 100644
>> --- a/drivers/net/ethernet/google/gve/gve_main.c
>> +++ b/drivers/net/ethernet/google/gve/gve_main.c
>> @@ -17,6 +17,7 @@
>>  #include <linux/workqueue.h>
>>  #include <linux/utsname.h>
>>  #include <linux/version.h>
>> +#include <net/netdev_queues.h>
>>  #include <net/sch_generic.h>
>>  #include <net/xdp_sock_drv.h>
>>  #include "gve.h"
>> @@ -2070,6 +2071,15 @@ static void gve_turnup(struct gve_priv *priv)
>>         gve_set_napi_enabled(priv);
>>  }
>>
>> +static void gve_turnup_and_check_status(struct gve_priv *priv)
>> +{
>> +       u32 status;
>> +
>> +       gve_turnup(priv);
>> +       status = ioread32be(&priv->reg_bar0->device_status);
>> +       gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
>> +}
>> +
>>  static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
>>  {
>>         struct gve_notify_block *block;
>> @@ -2530,6 +2540,138 @@ static void gve_write_version(u8 __iomem *driver_version_register)
>>         writeb('\n', driver_version_register);
>>  }
>>
>> +static int gve_rx_queue_stop(struct net_device *dev, int idx,
>> +                            void **out_per_q_mem)
>> +{
>> +       struct gve_priv *priv = netdev_priv(dev);
>> +       struct gve_rx_ring *rx;
>> +       int err;
>> +
>> +       if (!priv->rx)
>> +               return -EAGAIN;
>> +       if (idx < 0 || idx >= priv->rx_cfg.max_queues)
>> +               return -ERANGE;
>> +
>> +       /* Destroying queue 0 while other queues exist is not supported in DQO */
>> +       if (!gve_is_gqi(priv) && idx == 0)
>> +               return -ERANGE;
>> +
>> +       rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
>> +       if (!rx)
>> +               return -ENOMEM;
>> +       *rx = priv->rx[idx];
>> +
>> +       /* Single-queue destruction requires quiescence on all queues */
>> +       gve_turndown(priv);
>> +
>> +       /* This failure will trigger a reset - no need to clean up */
>> +       err = gve_adminq_destroy_single_rx_queue(priv, idx);
>> +       if (err) {
>> +               kvfree(rx);
>> +               return err;
>> +       }
>> +
>> +       if (gve_is_gqi(priv))
>> +               gve_rx_stop_ring_gqi(priv, idx);
>> +       else
>> +               gve_rx_stop_ring_dqo(priv, idx);
>> +
>> +       /* Turn the unstopped queues back up */
>> +       gve_turnup_and_check_status(priv);
>> +
>> +       *out_per_q_mem = rx;
>> +       return 0;
>> +}
>> +
>> +static void gve_rx_queue_mem_free(struct net_device *dev, void *per_q_mem)
>> +{
>> +       struct gve_priv *priv = netdev_priv(dev);
>> +       struct gve_rx_alloc_rings_cfg cfg = {0};
>> +       struct gve_rx_ring *rx;
>> +
>> +       gve_rx_get_curr_alloc_cfg(priv, &cfg);
>> +       rx = (struct gve_rx_ring *)per_q_mem;
>> +       if (!rx)
>> +               return;
>> +
>> +       if (gve_is_gqi(priv))
>> +               gve_rx_free_ring_gqi(priv, rx, &cfg);
>> +       else
>> +               gve_rx_free_ring_dqo(priv, rx, &cfg);
>> +
>> +       kvfree(per_q_mem);
>> +}
>> +
>> +static void *gve_rx_queue_mem_alloc(struct net_device *dev, int idx)
>> +{
>> +       struct gve_priv *priv = netdev_priv(dev);
>> +       struct gve_rx_alloc_rings_cfg cfg = {0};
>> +       struct gve_rx_ring *rx;
>> +       int err;
>> +
>> +       gve_rx_get_curr_alloc_cfg(priv, &cfg);
>> +       if (idx < 0 || idx >= cfg.qcfg->max_queues)
>> +               return NULL;
>> +
>> +       rx = kvzalloc(sizeof(*rx), GFP_KERNEL);
>> +       if (!rx)
>> +               return NULL;
>> +
>> +       if (gve_is_gqi(priv))
>> +               err = gve_rx_alloc_ring_gqi(priv, &cfg, rx, idx);
>> +       else
>> +               err = gve_rx_alloc_ring_dqo(priv, &cfg, rx, idx);
>> +
>> +       if (err) {
>> +               kvfree(rx);
>> +               return NULL;
>> +       }
>> +       return rx;
>> +}
>> +
>> +static int gve_rx_queue_start(struct net_device *dev, int idx, void *per_q_mem)
>> +{
>> +       struct gve_priv *priv = netdev_priv(dev);
>> +       struct gve_rx_ring *rx;
>> +       int err;
>> +
>> +       if (!priv->rx)
>> +               return -EAGAIN;
>> +       if (idx < 0 || idx >= priv->rx_cfg.max_queues)
>> +               return -ERANGE;
>> +       rx = (struct gve_rx_ring *)per_q_mem;
>> +       priv->rx[idx] = *rx;
>> +
>> +       /* Single-queue creation requires quiescence on all queues */
>> +       gve_turndown(priv);
>> +
>> +       if (gve_is_gqi(priv))
>> +               gve_rx_start_ring_gqi(priv, idx);
>> +       else
>> +               gve_rx_start_ring_dqo(priv, idx);
>> +
>> +       /* This failure will trigger a reset - no need to clean up */
>> +       err = gve_adminq_create_single_rx_queue(priv, idx);
>> +       if (err)
>> +               return err;
>> +
>> +       if (gve_is_gqi(priv))
>> +               gve_rx_write_doorbell(priv, &priv->rx[idx]);
>> +       else
>> +               gve_rx_post_buffers_dqo(&priv->rx[idx]);
>> +
>> +       /* Turn the unstopped queues back up */
>> +       gve_turnup_and_check_status(priv);
>> +       return 0;
>> +}
> 
> I realized that due to not kvfree-ing the passed-in `per_q_mem`, there
> is a leak. The alloc and stop hooks kvzalloc
> a temp ring struct, which means the start and free hooks ought to have
> kvfreed them to keep symmetry and avoid leaking.
> The free hook is doing it but I forgot to do it in the start hook.
> 
> If we go down the route of making core aware of the ring struct size,
> then none of the four hooks
> need to worry about the temp struct: core can just alloc and free it
> for both the old and new queue.

Having the core own qmem is the direction to move towards. I have
patches that do this, so we could merge what you have so far (i.e. have
the ndos alloc/free qmem) then I'll build on top of it for FBNIC/bnxt.

What might be missing in this patchset is a user of
netdev_queue_mgmt_ops. I know Mina has a restart_rx_queue() somewhere
and perhaps that should be included.

> 
>> +
>> +static const struct netdev_queue_mgmt_ops gve_queue_mgmt_ops = {
>> +       .ndo_queue_mem_alloc    =       gve_rx_queue_mem_alloc,
>> +       .ndo_queue_mem_free     =       gve_rx_queue_mem_free,
>> +       .ndo_queue_start        =       gve_rx_queue_start,
>> +       .ndo_queue_stop         =       gve_rx_queue_stop,
>> +};
>> +
>>  static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  {
>>         int max_tx_queues, max_rx_queues;
>> @@ -2584,6 +2726,7 @@ static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>         pci_set_drvdata(pdev, dev);
>>         dev->ethtool_ops = &gve_ethtool_ops;
>>         dev->netdev_ops = &gve_netdev_ops;
>> +       dev->queue_mgmt_ops = &gve_queue_mgmt_ops;
>>
>>         /* Set default and supported features.
>>          *
>> diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
>> index 1d235caab4c5..307bf97d4778 100644
>> --- a/drivers/net/ethernet/google/gve/gve_rx.c
>> +++ b/drivers/net/ethernet/google/gve/gve_rx.c
>> @@ -101,8 +101,8 @@ void gve_rx_stop_ring_gqi(struct gve_priv *priv, int idx)
>>         gve_rx_reset_ring_gqi(priv, idx);
>>  }
>>
>> -static void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
>> -                                struct gve_rx_alloc_rings_cfg *cfg)
>> +void gve_rx_free_ring_gqi(struct gve_priv *priv, struct gve_rx_ring *rx,
>> +                         struct gve_rx_alloc_rings_cfg *cfg)
>>  {
>>         struct device *dev = &priv->pdev->dev;
>>         u32 slots = rx->mask + 1;
>> @@ -270,10 +270,10 @@ void gve_rx_start_ring_gqi(struct gve_priv *priv, int idx)
>>         gve_add_napi(priv, ntfy_idx, gve_napi_poll);
>>  }
>>
>> -static int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
>> -                                struct gve_rx_alloc_rings_cfg *cfg,
>> -                                struct gve_rx_ring *rx,
>> -                                int idx)
>> +int gve_rx_alloc_ring_gqi(struct gve_priv *priv,
>> +                         struct gve_rx_alloc_rings_cfg *cfg,
>> +                         struct gve_rx_ring *rx,
>> +                         int idx)
>>  {
>>         struct device *hdev = &priv->pdev->dev;
>>         u32 slots = cfg->ring_size;
>> diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
>> index dc2c6bd92e82..dcbc37118870 100644
>> --- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
>> +++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
>> @@ -299,8 +299,8 @@ void gve_rx_stop_ring_dqo(struct gve_priv *priv, int idx)
>>         gve_rx_reset_ring_dqo(priv, idx);
>>  }
>>
>> -static void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
>> -                                struct gve_rx_alloc_rings_cfg *cfg)
>> +void gve_rx_free_ring_dqo(struct gve_priv *priv, struct gve_rx_ring *rx,
>> +                         struct gve_rx_alloc_rings_cfg *cfg)
>>  {
>>         struct device *hdev = &priv->pdev->dev;
>>         size_t completion_queue_slots;
>> @@ -373,10 +373,10 @@ void gve_rx_start_ring_dqo(struct gve_priv *priv, int idx)
>>         gve_add_napi(priv, ntfy_idx, gve_napi_poll_dqo);
>>  }
>>
>> -static int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
>> -                                struct gve_rx_alloc_rings_cfg *cfg,
>> -                                struct gve_rx_ring *rx,
>> -                                int idx)
>> +int gve_rx_alloc_ring_dqo(struct gve_priv *priv,
>> +                         struct gve_rx_alloc_rings_cfg *cfg,
>> +                         struct gve_rx_ring *rx,
>> +                         int idx)
>>  {
>>         struct device *hdev = &priv->pdev->dev;
>>         size_t size;
>> --
>> 2.44.0.769.g3c40516874-goog
>>

[RFC,net-next,9/9] gve: Implement queue api

Checks

Commit Message

Comments

Patch