diff mbox series

[v4,3/4] drm/vc4: Check for the binner bo before handling OOM interrupt

Message ID 20190403154856.9470-4-paul.kocialkowski@bootlin.com (mailing list archive)
State New, archived
Headers show
Series drm/vc4: Binner BO management improvements | expand

Commit Message

Paul Kocialkowski April 3, 2019, 3:48 p.m. UTC
Since the OOM interrupt directly deals with the binner bo, it doesn't
make sense to try and handle it without a binner buffer registered.
The interrupt will kick again in due time, so we can safely ignore it
without a binner bo allocated.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Eric Anholt April 3, 2019, 6:58 p.m. UTC | #1
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> Since the OOM interrupt directly deals with the binner bo, it doesn't
> make sense to try and handle it without a binner buffer registered.
> The interrupt will kick again in due time, so we can safely ignore it
> without a binner bo allocated.
>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> index ffd0a4388752..723dc86b4511 100644
> --- a/drivers/gpu/drm/vc4/vc4_irq.c
> +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
>  	struct vc4_exec_info *exec;
>  	unsigned long irqflags;

Since OOM handling is tricky, could we add a comment to help the next
person try to understand it:

/* The OOM IRQ is level-triggered, so we'll see one at power-on before
 * any jobs are submitted.  The OOM IRQ is masked when this work is
 * scheduled, so we can safely return if there's no binner memory
 * (because no client is currently using 3D).  When a bin job is
 * later submitted, its tile memory allocation will end up bringing us
 * back to a non-OOM state so the OOM can be triggered again.
 */

But, actually, I don't see how the OOM IRQ will ever get re-enabled.
Paul Kocialkowski April 4, 2019, 2:33 p.m. UTC | #2
Hey,

Le mercredi 03 avril 2019 à 11:58 -0700, Eric Anholt a écrit :
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> 
> > Since the OOM interrupt directly deals with the binner bo, it doesn't
> > make sense to try and handle it without a binner buffer registered.
> > The interrupt will kick again in due time, so we can safely ignore it
> > without a binner bo allocated.
> > 
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> >  drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> > index ffd0a4388752..723dc86b4511 100644
> > --- a/drivers/gpu/drm/vc4/vc4_irq.c
> > +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> > @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
> >  	struct vc4_exec_info *exec;
> >  	unsigned long irqflags;
> 
> Since OOM handling is tricky, could we add a comment to help the next
> person try to understand it:
> 
> /* The OOM IRQ is level-triggered, so we'll see one at power-on before
>  * any jobs are submitted.  The OOM IRQ is masked when this work is
>  * scheduled, so we can safely return if there's no binner memory
>  * (because no client is currently using 3D).  When a bin job is
>  * later submitted, its tile memory allocation will end up bringing us
>  * back to a non-OOM state so the OOM can be triggered again.
>  */
> 
> But, actually, I don't see how the OOM IRQ will ever get re-enabled.

Okay so I investigated that to try and understand what's going on.
We are definitely writing the OUTOMEM bit to V3D_INTDIS just before
scheduling the workqueue, and never re-enable the IRQ when leaving
early in the workqueue because !vc4->bin_bo.

It turns out that what saves us here is vc4_irq_postinstall being
called from runtime resume at "the right time". Obviously this is more
than fragile, so we should really be re-enabling the IRQ as soon as we
have the binner bo allocated.

Since we're now allocating at the first non-dumb bo alloc, I think we
need to make sure that we did in fact get the irq and registered the
allocated BO with the workqueue before submitting the rcl. Or does the
hardware provide any mechanism to take that off our hands somehow?

What do you think?

Cheers,

Paul
Eric Anholt April 4, 2019, 8:09 p.m. UTC | #3
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> Hey,
>
> Le mercredi 03 avril 2019 à 11:58 -0700, Eric Anholt a écrit :
>> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
>> 
>> > Since the OOM interrupt directly deals with the binner bo, it doesn't
>> > make sense to try and handle it without a binner buffer registered.
>> > The interrupt will kick again in due time, so we can safely ignore it
>> > without a binner bo allocated.
>> > 
>> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
>> > ---
>> >  drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
>> >  1 file changed, 3 insertions(+)
>> > 
>> > diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
>> > index ffd0a4388752..723dc86b4511 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_irq.c
>> > +++ b/drivers/gpu/drm/vc4/vc4_irq.c
>> > @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
>> >  	struct vc4_exec_info *exec;
>> >  	unsigned long irqflags;
>> 
>> Since OOM handling is tricky, could we add a comment to help the next
>> person try to understand it:
>> 
>> /* The OOM IRQ is level-triggered, so we'll see one at power-on before
>>  * any jobs are submitted.  The OOM IRQ is masked when this work is
>>  * scheduled, so we can safely return if there's no binner memory
>>  * (because no client is currently using 3D).  When a bin job is
>>  * later submitted, its tile memory allocation will end up bringing us
>>  * back to a non-OOM state so the OOM can be triggered again.
>>  */
>> 
>> But, actually, I don't see how the OOM IRQ will ever get re-enabled.
>
> Okay so I investigated that to try and understand what's going on.
> We are definitely writing the OUTOMEM bit to V3D_INTDIS just before
> scheduling the workqueue, and never re-enable the IRQ when leaving
> early in the workqueue because !vc4->bin_bo.
>
> It turns out that what saves us here is vc4_irq_postinstall being
> called from runtime resume at "the right time". Obviously this is more
> than fragile, so we should really be re-enabling the IRQ as soon as we
> have the binner bo allocated.
>
> Since we're now allocating at the first non-dumb bo alloc, I think we
> need to make sure that we did in fact get the irq and registered the
> allocated BO with the workqueue before submitting the rcl. Or does the
> hardware provide any mechanism to take that off our hands somehow?

Maybe just enable the OOM interrupt using INTENA in the bin BO
allocation's success case?  That feels race-free, since it's a level
interrupt and even if we were racing the !bin_bo check in the work, we'd
end up re-scheduling the work?
diff mbox series

Patch

diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index ffd0a4388752..723dc86b4511 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -64,6 +64,9 @@  vc4_overflow_mem_work(struct work_struct *work)
 	struct vc4_exec_info *exec;
 	unsigned long irqflags;
 
+	if (!bo)
+		return;
+
 	bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
 	if (bin_bo_slot < 0) {
 		DRM_ERROR("Couldn't allocate binner overflow mem\n");