[5/9] drm/panfrost: Add HW_ISSUE_TTRX_3485 quirk

Message ID 20220211202728.6146-6-alyssa.rosenzweig@collabora.com (mailing list archive)
State New, archived
Series drm/panfrost: Initial Valhall support

Commit Message

Alyssa Rosenzweig Feb. 11, 2022, 8:27 p.m. UTC
From: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

TTRX_3485 requires the infamous "dummy job" workaround. I have this
workaround implemented in a local branch, but I have not yet hit a case
that requires it so I cannot test whether the implementation is correct.
In the meantime, add the quirk bit so we can document which platforms
may need it in the future.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_issues.h | 3 +++
 1 file changed, 3 insertions(+)

Comments

Steven Price Feb. 14, 2022, 4:23 p.m. UTC | #1
On 11/02/2022 20:27, alyssa.rosenzweig@collabora.com wrote:
> From: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> 
> TTRX_3485 requires the infamous "dummy job" workaround. I have this
> workaround implemented in a local branch, but I have not yet hit a case
> that requires it so I cannot test whether the implementation is correct.
> In the meantime, add the quirk bit so we can document which platforms
> may need it in the future.

This one is hideous ;) Although to me this isn't the 'infamous' one as
it's not the earliest example of a dummy job.

However... I believe as Panfrost currently stands this is probably not
very possible to hit. It requires a job to be stopped (soft or hard) at
a critical point during submission - which at the moment Panfrost
basically never does (the exception is if you close the fd immediately
while a job is in progress). And of course the timing has to be 'just
right' to hit the bug.

That said I think we should probably add pre-emption support sometime at
which point this could become an issue.

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_issues.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_issues.h b/drivers/gpu/drm/panfrost/panfrost_issues.h
> index 058f6a4c8435..b8865fc9efce 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_issues.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_issues.h
> @@ -132,6 +132,9 @@ enum panfrost_hw_issue {
>  	 * to hang */
>  	HW_ISSUE_TTRX_3076,
>  
> +	/* Must issue a dummy job before starting real work to prevent hangs */
> +	HW_ISSUE_TTRX_3485,
> +
>  	HW_ISSUE_END
>  };
>

Alyssa Rosenzweig Feb. 14, 2022, 5:11 p.m. UTC | #2
> > TTRX_3485 requires the infamous "dummy job" workaround. I have this
> > workaround implemented in a local branch, but I have not yet hit a case
> > that requires it so I cannot test whether the implementation is correct.
> > In the meantime, add the quirk bit so we can document which platforms
> > may need it in the future.
> 
> This one is hideous ;) Although to me this isn't the 'infamous' one as
> it's not the earliest example of a dummy job.

Terrifying. I guess we narrowly avoided the 'replay' workaround which
was far worse than this one...

> However... I believe as Panfrost currently stands this is probably not
> very possible to hit. It requires a job to be stopped (soft or hard) at
> a critical point during submission - which at the moment Panfrost
> basically never does (the exception is if you close the fd immediately
> while a job is in progress). And of course the timing has to be 'just
> right' to hit the bug.

OK, that's good to know. Still "should" be fixed but that definitely
lowers the priority of it. Frankly the multithreading bugs we have on
the CPU side would hang the machine sooner...

Patch

diff --git a/drivers/gpu/drm/panfrost/panfrost_issues.h b/drivers/gpu/drm/panfrost/panfrost_issues.h
index 058f6a4c8435..b8865fc9efce 100644
--- a/drivers/gpu/drm/panfrost/panfrost_issues.h
+++ b/drivers/gpu/drm/panfrost/panfrost_issues.h
@@ -132,6 +132,9 @@  enum panfrost_hw_issue {
 	 * to hang */
 	HW_ISSUE_TTRX_3076,
 
+	/* Must issue a dummy job before starting real work to prevent hangs */
+	HW_ISSUE_TTRX_3485,
+
 	HW_ISSUE_END
 };
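
For readers unfamiliar with how a quirk bit like this is typically consumed, here is a minimal, self-contained sketch of the general pattern: each enum entry names a bit position, a per-device mask records which issues apply, and code paths test the bit before deciding whether a workaround is needed. Everything below is an assumption made for illustration. `struct fake_gpu`, `gpu_set_issue()`, `gpu_has_issue()` and `gpu_needs_dummy_job()` are hypothetical names and are not part of this patch or of Panfrost, the enum is trimmed to the entries visible in the hunk above, and a single 64-bit word only works here because the trimmed enum is tiny.

```c
#include <stdbool.h>
#include <stdint.h>

/* Trimmed stand-in for enum panfrost_hw_issue: only the entries visible
 * in the hunk above are listed, so the numeric values differ from the
 * real header. */
enum hw_issue {
	HW_ISSUE_TTRX_3076,
	HW_ISSUE_TTRX_3485,
	HW_ISSUE_END
};

/* Hypothetical device state: one bit per issue. A 64-bit word is enough
 * for this trimmed enum only. */
struct fake_gpu {
	uint64_t hw_issues;
};

/* Record that the given issue applies to this (hypothetical) device. */
static inline void gpu_set_issue(struct fake_gpu *gpu, enum hw_issue issue)
{
	gpu->hw_issues |= UINT64_C(1) << issue;
}

/* Test whether the given issue was recorded for this device. */
static inline bool gpu_has_issue(const struct fake_gpu *gpu,
				 enum hw_issue issue)
{
	return (gpu->hw_issues >> issue) & 1;
}

/* Hypothetical helper: a submission path could consult this to decide
 * whether the dummy-job workaround applies at all. The workaround itself
 * is not shown, since the patch does not include it. */
static inline bool gpu_needs_dummy_job(const struct fake_gpu *gpu)
{
	return gpu_has_issue(gpu, HW_ISSUE_TTRX_3485);
}
```

With this shape, adding the enum entry alone changes no behaviour: nothing reports true until some device sets the bit and some code path checks it, which matches the stated intent of only documenting which platforms may need the workaround later.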