[v3,0/5] drm/panthor: Be robust against failures in the resume path

Message ID 20241211075419.2333731-1-boris.brezillon@collabora.com (mailing list archive)

Message

Boris Brezillon Dec. 11, 2024, 7:54 a.m. UTC
Hello,

Here's a collection of patches improving robustness to failures in
the device resume/suspend path. Those failures are pretty hard to
reproduce (they happen once in a while on a deqp-vk run), so I used a
mechanism to fake them.

Faking a FW boot failure is kinda tricky though, which means the
last patch has only been partially tested:
- the fast reset path is well tested because that's the default on
  a device suspend
- the slow reset has been tested with a hack replacing fast resets
  with slow resets
- the fast -> slow reset fallback has been tested by faking boot
  failures after a fast reset, but these are not real, which means
  we can't really validate if the MCU recovers fine after a slow
  reset

On the other hand, this implementation doesn't look like it could
do more harm than the current one (the only difference is the
extra GPU soft-reset that happens between the fast and slow FW
boot).
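
To make the fast -> slow fallback concrete, here is a minimal
user-space sketch of the ordering described above: try a fast boot,
and on failure do a GPU soft-reset followed by a slow boot. This is
not the panthor code. Every name in it (struct fake_dev,
fake_fw_boot(), fake_gpu_soft_reset(), fake_fw_resume()) is
hypothetical, the error code is an arbitrary choice, and the two
*_boot_fails flags only stand in for the kind of faked boot failures
mentioned above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not a panthor structure. */
struct fake_dev {
	bool fast_boot_fails;	/* fake a FW boot failure after a fast reset */
	bool slow_boot_fails;	/* fake a FW boot failure after a slow reset */
	int fast_attempts;	/* number of fast boots attempted */
	int slow_attempts;	/* number of slow boots attempted */
	int soft_resets;	/* number of GPU soft-resets issued */
};

/* Pretend to boot the FW; returns 0 on success, -ETIMEDOUT on failure. */
static int fake_fw_boot(struct fake_dev *dev, bool fast)
{
	if (fast) {
		dev->fast_attempts++;
		return dev->fast_boot_fails ? -ETIMEDOUT : 0;
	}

	dev->slow_attempts++;
	return dev->slow_boot_fails ? -ETIMEDOUT : 0;
}

/* Pretend to soft-reset the GPU; only counts the calls. */
static void fake_gpu_soft_reset(struct fake_dev *dev)
{
	dev->soft_resets++;
}

/*
 * Fast boot first. If it fails, soft-reset the GPU and fall back to a
 * slow boot, returning whatever the slow boot returns.
 */
static int fake_fw_resume(struct fake_dev *dev)
{
	int ret = fake_fw_boot(dev, true);

	if (!ret)
		return 0;

	fake_gpu_soft_reset(dev);
	return fake_fw_boot(dev, false);
}
```

With fast_boot_fails set, fake_fw_resume() performs exactly one fast
attempt, one soft-reset and one slow attempt. Like the faked failures
in the testing above, this only exercises the control flow and says
nothing about whether a real MCU recovers after the slow reset.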

Nothing major changed in v3. Each patch contains a changelog, if
you're interested.

Regards,

Boris

Boris Brezillon (5):
  drm/panthor: Preserve the result returned by panthor_fw_resume()
  drm/panthor: Be robust against runtime PM resume failures in the
    suspend path
  drm/panthor: Ignore devfreq_{suspend,resume}_device() failures
  drm/panthor: Be robust against resume failures
  drm/panthor: Fix the fast-reset logic

 drivers/gpu/drm/panthor/panthor_devfreq.c | 12 ++--
 drivers/gpu/drm/panthor/panthor_devfreq.h |  4 +-
 drivers/gpu/drm/panthor/panthor_device.c  | 68 ++++++++++-------------
 drivers/gpu/drm/panthor/panthor_device.h  | 37 ++++++++++++
 drivers/gpu/drm/panthor/panthor_drv.c     |  2 +-
 drivers/gpu/drm/panthor/panthor_fw.c      | 68 +++++++----------------
 drivers/gpu/drm/panthor/panthor_gpu.c     | 14 +++--
 drivers/gpu/drm/panthor/panthor_mmu.c     |  3 +-
 drivers/gpu/drm/panthor/panthor_sched.c   |  4 +-
 9 files changed, 107 insertions(+), 105 deletions(-)

Comments

Boris Brezillon Dec. 11, 2024, 10:10 a.m. UTC | #1
On Wed, 11 Dec 2024 08:54:14 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> Boris Brezillon (5):
>   drm/panthor: Preserve the result returned by panthor_fw_resume()
>   drm/panthor: Be robust against runtime PM resume failures in the
>     suspend path
>   drm/panthor: Ignore devfreq_{suspend,resume}_device() failures
>   drm/panthor: Be robust against resume failures
>   drm/panthor: Fix the fast-reset logic

Queued to drm-misc-next.
