[0/8] crypto: qat - fix warm reboot

Message ID 20250326160116.102699-2-giovanni.cabiddu@intel.com (mailing list archive)

Message

Cabiddu, Giovanni March 26, 2025, 3:59 p.m. UTC
This series of patches addresses the warm reboot problem that affects
all QAT devices. When a reset is performed using kexec, QAT devices
fail to recover due to improper shutdown.

This series implements the shutdown() handler, which integrates with the
reboot notifier list to ensure proper device shutdown during reboots.
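
As a rough illustration of the shape of such a handler (not the code from
these patches), the sketch below models it in userspace C. The `pci_dev` and
`pci_driver` structs are cut-down stand-ins for the kernel types, and
`accel_dev`, `accel_dev_down()` and `accel_shutdown()` are hypothetical names
chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types, so the sketch compiles on its own. */
struct pci_dev {
	void *driver_data;
};

struct pci_driver {
	const char *name;
	void (*shutdown)(struct pci_dev *pdev);
};

/* Hypothetical per-device state; a real driver tracks far more than this. */
struct accel_dev {
	bool started;
};

/* Hypothetical helper: quiesce the device so the next kernel finds it idle. */
static void accel_dev_down(struct accel_dev *accel_dev)
{
	accel_dev->started = false;
}

/* Shutdown callback: invoked once per bound device when the system goes down. */
static void accel_shutdown(struct pci_dev *pdev)
{
	assert(pdev != NULL);

	struct accel_dev *accel_dev = pdev->driver_data;

	/* Nothing to do if probe never attached any state to this device. */
	if (accel_dev)
		accel_dev_down(accel_dev);
}

/* The handler is wired up through the driver's callback table. */
static struct pci_driver accel_driver = {
	.name = "accel_sketch",
	.shutdown = accel_shutdown,
};
```

In this model, calling `accel_driver.shutdown(&pdev)` on a started device
leaves `started` cleared, which is the condition the next kernel relies on.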

Each patch in this series targets a specific device driver. Since each
driver was introduced by a different commit, each patch carries a
different `Fixes` tag.

Giovanni Cabiddu (8):
  crypto: qat - add shutdown handler to qat_4xxx
  crypto: qat - add shutdown handler to qat_420xx
  crypto: qat - remove redundant prototypes in qat_dh895xcc
  crypto: qat - add shutdown handler to qat_dh895xcc
  crypto: qat - remove redundant prototypes in qat_c62x
  crypto: qat - add shutdown handler to qat_c62x
  crypto: qat - remove redundant prototypes in qat_c3xxx
  crypto: qat - add shutdown handler to qat_c3xxx

 drivers/crypto/intel/qat/qat_420xx/adf_drv.c  |  8 ++++
 drivers/crypto/intel/qat/qat_4xxx/adf_drv.c   |  8 ++++
 drivers/crypto/intel/qat/qat_c3xxx/adf_drv.c  | 41 +++++++++++--------
 drivers/crypto/intel/qat/qat_c62x/adf_drv.c   | 41 +++++++++++--------
 .../crypto/intel/qat/qat_dh895xcc/adf_drv.c   | 41 +++++++++++--------
 5 files changed, 85 insertions(+), 54 deletions(-)

Comments

Herbert Xu April 2, 2025, 2:58 a.m. UTC | #1
On Wed, Mar 26, 2025 at 03:59:45PM +0000, Giovanni Cabiddu wrote:
> This series of patches addresses the warm reboot problem that affects
> all QAT devices. When a reset is performed using kexec, QAT devices
> fail to recover due to improper shutdown.

Thanks for the quick fixes Giovanni!

As this is not a new regression, I think they should go through
the usual release cycle.

Just one comment on a possible improvement though, while it's good
to shut down the device properly, the initialisation side should
also do as much as is possible to reset a device that is in an
unknown state.

This is because the previous kernel might have had a hard crash,
in which case there is no chance for the correct shutdown sequence
to be carried out.

Of course it's not always physically possible to reset something
that is in an unknown state, but we should design the driver to be
as resilient as possible.

Cheers,
Cabiddu, Giovanni April 4, 2025, 12:12 p.m. UTC | #2
Hi Herbert,

On Wed, Apr 02, 2025 at 10:58:13AM +0800, Herbert Xu wrote:
> On Wed, Mar 26, 2025 at 03:59:45PM +0000, Giovanni Cabiddu wrote:
> > This series of patches addresses the warm reboot problem that affects
> > all QAT devices. When a reset is performed using kexec, QAT devices
> > fail to recover due to improper shutdown.
> 
> Thanks for the quick fixes Giovanni!
> 
> As this is not a new regression, I think they should go through
> the usual release cycle.

Sure, that's fine.

> Just one comment on a possible improvement though, while it's good
> to shut down the device properly, the initialisation side should
> also do as much as is possible to reset a device that is in an
> unknown state.
> 
> This is because the previous kernel might have had a hard crash,
> in which case there is no chance for the correct shutdown sequence
> to be carried out.
> 
> Of course it's not always physically possible to reset something
> that is in an unknown state, but we should design the driver to be
> as resilient as possible.

Thanks for the feedback.

I considered adding a reset in the probe, but it would slow down the
load time for each user.

One option might be to attempt to bring up the device, and if that fails,
reset the device and retry the startup process.
We can look into that.
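
A minimal sketch of that option, modelled in userspace C so it can be compiled
on its own. Everything here is an assumption made for illustration:
`accel_state`, `accel_dev_up()`, `accel_dev_reset()` and `accel_probe()` are
hypothetical names, and the `stale` flag simply stands in for whatever state a
crashed kernel may leave behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model: 'stale' mimics state left by a crashed kernel. */
struct accel_state {
	bool stale;
	bool started;
	int resets;
};

/* Hypothetical bring-up: fails while the device still holds stale state. */
static int accel_dev_up(struct accel_state *dev)
{
	if (dev->stale)
		return -1;

	dev->started = true;
	return 0;
}

/* Hypothetical reset: assumed to be slow, so it is kept off the fast path. */
static void accel_dev_reset(struct accel_state *dev)
{
	dev->stale = false;
	dev->started = false;
	dev->resets++;
}

/* Try the fast path first; reset and retry exactly once if it fails. */
static int accel_probe(struct accel_state *dev)
{
	assert(dev != NULL);

	if (accel_dev_up(dev) == 0)
		return 0;

	accel_dev_reset(dev);
	return accel_dev_up(dev);
}
```

With this structure a device in a clean state never pays for a reset, while a
device left in an unknown state is reset once before the second attempt.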

Regards,