[vfio,0/5] Improve mlx5 driver to better handle some error cases

Message ID: 20240130170227.153464-1-yishaih@nvidia.com

Yishai Hadas Jan. 30, 2024, 5:02 p.m. UTC
This series improves the mlx5 driver's handling of some error cases,
as described below.

The first two patches let the driver recognize whether the firmware
moved the tracker object to an error state. In that case, the driver
will skip/block any usage of that object.
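
To illustrate the idea, here is a minimal standalone sketch of such a
guard. It is not code from the patches: struct tracker_sketch and the
tracker_*() helpers are hypothetical names, and returning -EIO is an
assumption about which error a blocked caller would see.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's tracker state, not the real struct. */
struct tracker_sketch {
	bool is_err;        /* set once the firmware reports an error state */
	int reports_served; /* number of calls that actually used the object */
};

/* Would be called from the event path when the firmware flags the object. */
static void tracker_mark_error(struct tracker_sketch *t)
{
	t->is_err = true;
}

/* Every user of the tracker checks the flag first and bails out early. */
static int tracker_read_dirty(struct tracker_sketch *t)
{
	if (t->is_err)
		return -EIO; /* assumed error code, for illustration only */

	t->reports_served++;
	return 0;
}
```

Once tracker_mark_error() has run, later tracker_read_dirty() calls fail
without touching the object, which is the skip/block behavior described
above.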

The next two patches (#3, #4) improve the driver so that it reports the
proper firmware syndrome in dmesg when certain firmware commands fail.

The last patch follows the device specification by notifying the firmware
when the device leaves PRE_COPY and goes back to RUNNING (e.g. on an error
in the target, migration cancellation, etc.).

This lets the firmware clean up the internal resources that were turned
on upon entering PRE_COPY.
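
As a rough standalone sketch of that transition handling (again not the
patch code: the enum values, struct fw_sketch and the fw_*() helpers are
hypothetical, and the real driver would issue a firmware command where
this sketch only flips a flag):

```c
#include <stdbool.h>

/* Hypothetical states mirroring the ones named in this letter. */
enum mig_state_sketch { ST_RUNNING, ST_PRE_COPY, ST_STOP_COPY };

/* Hypothetical model of the firmware-side bookkeeping. */
struct fw_sketch {
	bool precopy_resources_on; /* resources turned on for PRE_COPY */
	int leave_notifications;   /* how many times we told the firmware */
};

static void fw_enter_pre_copy(struct fw_sketch *fw)
{
	fw->precopy_resources_on = true;
}

/* Stand-in for the firmware notification; it releases the resources. */
static void fw_notify_leave_pre_copy(struct fw_sketch *fw)
{
	fw->precopy_resources_on = false;
	fw->leave_notifications++;
}

/* Handle a single state change and return the new state. */
static enum mig_state_sketch step_state(struct fw_sketch *fw,
					enum mig_state_sketch cur,
					enum mig_state_sketch next)
{
	if (cur == ST_RUNNING && next == ST_PRE_COPY)
		fw_enter_pre_copy(fw);
	else if (cur == ST_PRE_COPY && next == ST_RUNNING)
		fw_notify_leave_pre_copy(fw); /* the arc this patch covers */

	return next;
}
```

Without the PRE_COPY to RUNNING branch, the model would stay with
precopy_resources_on set after a cancelled migration.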

Note:
As the first patch should go to net/mlx5, we may need to send it to vfio
as a pull request before the series is accepted, to avoid conflicts.

Yishai

Yishai Hadas (5):
  net/mlx5: Add the IFC related bits for query tracker
  vfio/mlx5: Add support for tracker object events
  vfio/mlx5: Handle the EREMOTEIO error upon the SAVE command
  vfio/mlx5: Block incremental query upon migf state error
  vfio/mlx5: Let firmware knows upon leaving PRE_COPY back to RUNNING

 drivers/vfio/pci/mlx5/cmd.c   | 74 ++++++++++++++++++++++++++++++++---
 drivers/vfio/pci/mlx5/cmd.h   |  5 ++-
 drivers/vfio/pci/mlx5/main.c  | 39 ++++++++++++++----
 include/linux/mlx5/mlx5_ifc.h |  5 +++
 4 files changed, 110 insertions(+), 13 deletions(-)