Message ID | 20240816090159.1967650-2-dtatulea@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vdpa/mlx5: Parallelize device suspend/resume | expand |
On 16.08.24 11:01, Dragos Tatulea wrote: > Currently, commands that qualify as throttled can't be used via the > async API. That's due to the fact that the throttle semaphore can sleep > but the async API can't. > > This patch allows throttling in the async API by using the tentative > variant of the semaphore and upon failure (semaphore at 0) returns EBUSY > to signal to the caller that they need to wait for the completion of > previously issued commands. > > Furthermore, make sure that the semaphore is released in the callback. > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > Cc: Leon Romanovsky <leonro@nvidia.com> > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Same reminder as in v1: Tariq is the maintainer for mlx5 so his review also counts as Acked-by. Thanks, Dragos > --- > drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++----- > 1 file changed, 16 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c > index 20768ef2e9d2..f69c977c1569 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c > @@ -1882,10 +1882,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out, > > throttle_op = mlx5_cmd_is_throttle_opcode(opcode); > if (throttle_op) { > - /* atomic context may not sleep */ > - if (callback) > - return -EINVAL; > - down(&dev->cmd.vars.throttle_sem); > + if (callback) { > + if (down_trylock(&dev->cmd.vars.throttle_sem)) > + return -EBUSY; > + } else { > + down(&dev->cmd.vars.throttle_sem); > + } > } > > pages_queue = is_manage_pages(in); > @@ -2091,10 +2093,19 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work) > { > struct mlx5_async_work *work = _work; > struct mlx5_async_ctx *ctx; > + struct mlx5_core_dev *dev; > + u16 opcode; > > ctx = work->ctx; > - status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out); > + dev = ctx->dev; > + opcode = work->opcode; > + status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out); > work->user_callback(status, work); > + /* Can't access "work" from this point on. It could have been freed in > + * the callback. > + */ > + if (mlx5_cmd_is_throttle_opcode(opcode)) > + up(&dev->cmd.vars.throttle_sem); > if (atomic_dec_and_test(&ctx->num_inflight)) > complete(&ctx->inflight_done); > }
On Mon, Sep 9, 2024 at 11:33 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > On 16.08.24 11:01, Dragos Tatulea wrote: > > Currently, commands that qualify as throttled can't be used via the > > async API. That's due to the fact that the throttle semaphore can sleep > > but the async API can't. > > > > This patch allows throttling in the async API by using the tentative > > variant of the semaphore and upon failure (semaphore at 0) returns EBUSY > > to signal to the caller that they need to wait for the completion of > > previously issued commands. > > > > Furthermore, make sure that the semaphore is released in the callback. > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > Cc: Leon Romanovsky <leonro@nvidia.com> > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> > Same reminder as in v1: Tariq is the maintainer for mlx5 so his review > also counts as Acked-by. > Not sure if it was the case when you send the mail, but this series is already in the maintainer's branch: * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/ * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/commit/?h=vhost&id=691fd851e1bc8ec043798e1ab337305e6291cd6b Thanks!
On 11.09.24 10:00, Eugenio Perez Martin wrote: > On Mon, Sep 9, 2024 at 11:33 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: >> >> >> >> On 16.08.24 11:01, Dragos Tatulea wrote: >>> Currently, commands that qualify as throttled can't be used via the >>> async API. That's due to the fact that the throttle semaphore can sleep >>> but the async API can't. >>> >>> This patch allows throttling in the async API by using the tentative >>> variant of the semaphore and upon failure (semaphore at 0) returns EBUSY >>> to signal to the caller that they need to wait for the completion of >>> previously issued commands. >>> >>> Furthermore, make sure that the semaphore is released in the callback. >>> >>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> >>> Cc: Leon Romanovsky <leonro@nvidia.com> >>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> >> Same reminder as in v1: Tariq is the maintainer for mlx5 so his review >> also counts as Acked-by. >> > > Not sure if it was the case when you send the mail, but this series is > already in the maintainer's branch: > * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/ > * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/commit/?h=vhost&id=691fd851e1bc8ec043798e1ab337305e6291cd6b It wasn't. Thanks for the notice! Thanks, Dragos
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 20768ef2e9d2..f69c977c1569 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -1882,10 +1882,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out, throttle_op = mlx5_cmd_is_throttle_opcode(opcode); if (throttle_op) { - /* atomic context may not sleep */ - if (callback) - return -EINVAL; - down(&dev->cmd.vars.throttle_sem); + if (callback) { + if (down_trylock(&dev->cmd.vars.throttle_sem)) + return -EBUSY; + } else { + down(&dev->cmd.vars.throttle_sem); + } } pages_queue = is_manage_pages(in); @@ -2091,10 +2093,19 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work) { struct mlx5_async_work *work = _work; struct mlx5_async_ctx *ctx; + struct mlx5_core_dev *dev; + u16 opcode; ctx = work->ctx; - status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out); + dev = ctx->dev; + opcode = work->opcode; + status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out); work->user_callback(status, work); + /* Can't access "work" from this point on. It could have been freed in + * the callback. + */ + if (mlx5_cmd_is_throttle_opcode(opcode)) + up(&dev->cmd.vars.throttle_sem); if (atomic_dec_and_test(&ctx->num_inflight)) complete(&ctx->inflight_done); }