[v2,0/5] Enable FLR for IDXD halt

Message ID 20241122233028.2762809-1-fenghua.yu@intel.com (mailing list archive)

Message

Fenghua Yu Nov. 22, 2024, 11:30 p.m. UTC
When an IDXD device hits a hardware error, it enters the halt state and
triggers an interrupt to the IDXD driver. Currently the IDXD driver just
prints an error message in the interrupt handler.

A better way to handle the interrupt is to perform a Function Level Reset
(FLR) and restore the device's hardware and software configurations to
their previous working state. The device and software can then continue
to run after the interrupt.
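
The save, reset, and restore sequence described above can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
every name below (sketch_dev, sketch_saved, sketch_handle_halt, and so on)
is hypothetical, the "reset" is modeled as zeroing a struct, and none of it
is taken from the driver or from the patches in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_WQS 4 /* hypothetical limit, not from the driver */

/* Hypothetical stand-in for per-device state; not the driver's struct. */
struct sketch_dev {
	bool halted;
	uint32_t wq_size[SKETCH_MAX_WQS];
	uint32_t wq_priority[SKETCH_MAX_WQS];
};

/* Snapshot of the configuration that the modeled reset wipes. */
struct sketch_saved {
	uint32_t wq_size[SKETCH_MAX_WQS];
	uint32_t wq_priority[SKETCH_MAX_WQS];
};

/* Copy the current configuration into the snapshot. */
static void sketch_config_save(const struct sketch_dev *dev,
			       struct sketch_saved *saved)
{
	memcpy(saved->wq_size, dev->wq_size, sizeof(saved->wq_size));
	memcpy(saved->wq_priority, dev->wq_priority,
	       sizeof(saved->wq_priority));
}

/* Model a reset: clear the halt flag and drop all configuration. */
static void sketch_reset(struct sketch_dev *dev)
{
	memset(dev, 0, sizeof(*dev));
}

/* Write the snapshot back into the freshly reset device. */
static void sketch_config_restore(struct sketch_dev *dev,
				  const struct sketch_saved *saved)
{
	memcpy(dev->wq_size, saved->wq_size, sizeof(dev->wq_size));
	memcpy(dev->wq_priority, saved->wq_priority,
	       sizeof(dev->wq_priority));
}

/*
 * Handle a halt: save, reset, restore.
 * Returns 0 on success, -1 if the device was not halted.
 */
static int sketch_handle_halt(struct sketch_dev *dev)
{
	struct sketch_saved saved;

	if (!dev->halted)
		return -1;

	sketch_config_save(dev, &saved);
	sketch_reset(dev);
	sketch_config_restore(dev, &saved);
	assert(!dev->halted);
	return 0;
}
```

The point of the sketch is the ordering: the snapshot has to be taken before
the reset, because the reset discards the very state that needs to be
restored afterwards.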

This series enables FLR handling for IDXD devices whose WQs are all
user type. FLR handling for IDXD devices with kernel-type WQs
will be implemented in a future series.
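
The "all WQs are user type" condition amounts to a simple predicate over
the device's WQs. The following is a minimal userspace sketch of such a
check, not the driver's code: the enum, its values, and the function name
are hypothetical, and how unconfigured WQs should be treated is not covered
here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical WQ types for illustration only. */
enum sketch_wq_type {
	SKETCH_WQT_KERNEL,
	SKETCH_WQT_USER,
};

/*
 * Return true only if every WQ in the array is user type.
 * An empty array is vacuously true in this sketch.
 */
static bool sketch_all_wqs_user(const enum sketch_wq_type *types, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (types[i] != SKETCH_WQT_USER)
			return false;
	}
	return true;
}
```

A halt handler built along these lines would take the FLR path only when
the predicate holds, and otherwise fall back to the existing behavior.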

Change log:
v2:
- Patch 3: Call a free helper to free all saved configs (Dave Jiang).
- Patch 3: Replace the defined bitmap free function with the existing
  bitmap_free().

v1:
https://lore.kernel.org/lkml/20240705181519.4067507-1-fenghua.yu@intel.com/

Fenghua Yu (5):
  dmaengine: idxd: Add idxd_pci_probe_alloc() helper
  dmaengine: idxd: Binding and unbinding IDXD device and driver
  dmaengine: idxd: Add idxd_device_config_save() and
    idxd_device_config_restore() helpers
  dmaengine: idxd: Refactor halt handler
  dmaengine: idxd: Enable Function Level Reset (FLR) for halt

 drivers/dma/idxd/idxd.h |  13 ++
 drivers/dma/idxd/init.c | 479 ++++++++++++++++++++++++++++++++++++----
 drivers/dma/idxd/irq.c  |  85 ++++---
 3 files changed, 507 insertions(+), 70 deletions(-)

Comments

Dave Jiang Dec. 2, 2024, 10:15 p.m. UTC | #1
On 11/22/24 4:30 PM, Fenghua Yu wrote:
> When an IDXD device hits a hardware error, it enters the halt state and
> triggers an interrupt to the IDXD driver. Currently the IDXD driver just
> prints an error message in the interrupt handler.
> 
> A better way to handle the interrupt is to perform a Function Level Reset
> (FLR) and restore the device's hardware and software configurations to
> their previous working state. The device and software can then continue
> to run after the interrupt.
> 
> This series enables FLR handling for IDXD devices whose WQs are all
> user type. FLR handling for IDXD devices with kernel-type WQs
> will be implemented in a future series.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

For the series.

> 
> Change log:
> v2:
> - Patch 3: Call a free helper to free all saved configs (Dave Jiang).
> - Patch 3: Replace the defined bitmap free function with the existing
>   bitmap_free().
> 
> v1:
> https://lore.kernel.org/lkml/20240705181519.4067507-1-fenghua.yu@intel.com/
> 
> Fenghua Yu (5):
>   dmaengine: idxd: Add idxd_pci_probe_alloc() helper
>   dmaengine: idxd: Binding and unbinding IDXD device and driver
>   dmaengine: idxd: Add idxd_device_config_save() and
>     idxd_device_config_restore() helpers
>   dmaengine: idxd: Refactor halt handler
>   dmaengine: idxd: Enable Function Level Reset (FLR) for halt
> 
>  drivers/dma/idxd/idxd.h |  13 ++
>  drivers/dma/idxd/init.c | 479 ++++++++++++++++++++++++++++++++++++----
>  drivers/dma/idxd/irq.c  |  85 ++++---
>  3 files changed, 507 insertions(+), 70 deletions(-)
>