diff mbox

drm/i915/guc: make forwarding of VBlanks conditional

Message ID 1466170959-4062-1-git-send-email-david.s.gordon@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Gordon June 17, 2016, 1:42 p.m. UTC
Apparently we shouldn't forward VBlanks to the GuC unconditionally,
as it may trigger a race condition in the GuC's internal dispatcher.

From an internal message from Harsh Chheda <harsh.j.chheda@intel.com>
>
> If the context has switched out the [GuC] scheduler should know so
> that it can  put the context in wait state and schedule something
> else.  If the context which is waiting on Vblank is still on the CS
> then [the GuC] scheduler does not need to know. If in this case wait
> on Vblank is sent to Guc while the context is still on CS there is
> an edge condition when the Vblank may get satisfied right after
> sending interrupt to guc and context may complete. Scheduler will
> then incorrectly mark the context to be in wait state.

So we need to change this register setting to GFX_FORWARD_VBLANK_COND.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Harsh Chheda <harsh.j.chheda@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ville Syrjälä June 17, 2016, 1:51 p.m. UTC | #1
On Fri, Jun 17, 2016 at 02:42:39PM +0100, Dave Gordon wrote:
> Apparently we shouldn't forward VBlanks to the GuC unconditionally,
> as it may trigger a race condition in the GuC's internal dispatcher.
> 
> From an internal message from Harsh Chheda <harsh.j.chheda@intel.com>
> >
> > If the context has switched out the [GuC] scheduler should know so
> > that it can  put the context in wait state and schedule something
> > else.  If the context which is waiting on Vblank is still on the CS
> > then [the GuC] scheduler does not need to know. If in this case wait
> > on Vblank is sent to Guc while the context is still on CS there is
> > an edge condition when the Vblank may get satisfied right after
> > sending interrupt to guc and context may complete. Scheduler will
> > then incorrectly mark the context to be in wait state.
> 
> So we need to change this register setting to GFX_FORWARD_VBLANK_COND.

No one has ever managed to explain why the guc needs to get vblanks
in the first place. The original commit message adding this stuff is
not helpful at all.

> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Harsh Chheda <harsh.j.chheda@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> index 8fe96a2..af9f51e 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -106,7 +106,7 @@ static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
>  	u32 tmp;
>  
>  	/* tell all command streamers to forward interrupts and vblank to GuC */
> -	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
> +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_COND);
>  	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
>  	for_each_engine(engine, dev_priv)
>  		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Daniel Vetter June 17, 2016, 2:14 p.m. UTC | #2
On Fri, Jun 17, 2016 at 04:51:34PM +0300, Ville Syrjälä wrote:
> On Fri, Jun 17, 2016 at 02:42:39PM +0100, Dave Gordon wrote:
> > Apparently we shouldn't forward VBlanks to the GuC unconditionally,
> > as it may trigger a race condition in the GuC's internal dispatcher.
> > 
> > From an internal message from Harsh Chheda <harsh.j.chheda@intel.com>
> > >
> > > If the context has switched out the [GuC] scheduler should know so
> > > that it can  put the context in wait state and schedule something
> > > else.  If the context which is waiting on Vblank is still on the CS
> > > then [the GuC] scheduler does not need to know. If in this case wait
> > > on Vblank is sent to Guc while the context is still on CS there is
> > > an edge condition when the Vblank may get satisfied right after
> > > sending interrupt to guc and context may complete. Scheduler will
> > > then incorrectly mark the context to be in wait state.
> > 
> > So we need to change this register setting to GFX_FORWARD_VBLANK_COND.
> 
> No one has ever managed to explain why the guc needs to get vblanks
> in the first place. The original commit message adding this stuff is
> not helpful at all.

It's used for the guc's magic slpc so that it can try to hit flips at
vrefresh. I maintain that this design is bonghits, and we should forward
_any_ display interrupts to guc. Instead the kernel should tell guc when
it's doing a sloppy job and missing deadlines. Because the kernel actually
knows, and doesn't have to infer what might be going on by watching
vblanks and flip_complete interrupts.

Also, with atomic we won't be using cs flips any more, who knows whether
that guc magic even still works with that ...

So I'd vote to just stop telling guc this stuff and be done with it.
-Daniel

> 
> > 
> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > Cc: Harsh Chheda <harsh.j.chheda@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> > index 8fe96a2..af9f51e 100644
> > --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> > +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> > @@ -106,7 +106,7 @@ static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
> >  	u32 tmp;
> >  
> >  	/* tell all command streamers to forward interrupts and vblank to GuC */
> > -	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
> > +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_COND);
> >  	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> >  	for_each_engine(engine, dev_priv)
> >  		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> > -- 
> > 1.9.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Ville Syrjälä
> Intel OTC
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Ville Syrjälä June 17, 2016, 3 p.m. UTC | #3
On Fri, Jun 17, 2016 at 04:14:48PM +0200, Daniel Vetter wrote:
> On Fri, Jun 17, 2016 at 04:51:34PM +0300, Ville Syrjälä wrote:
> > On Fri, Jun 17, 2016 at 02:42:39PM +0100, Dave Gordon wrote:
> > > Apparently we shouldn't forward VBlanks to the GuC unconditionally,
> > > as it may trigger a race condition in the GuC's internal dispatcher.
> > > 
> > > From an internal message from Harsh Chheda <harsh.j.chheda@intel.com>
> > > >
> > > > If the context has switched out the [GuC] scheduler should know so
> > > > that it can  put the context in wait state and schedule something
> > > > else.  If the context which is waiting on Vblank is still on the CS
> > > > then [the GuC] scheduler does not need to know. If in this case wait
> > > > on Vblank is sent to Guc while the context is still on CS there is
> > > > an edge condition when the Vblank may get satisfied right after
> > > > sending interrupt to guc and context may complete. Scheduler will
> > > > then incorrectly mark the context to be in wait state.
> > > 
> > > So we need to change this register setting to GFX_FORWARD_VBLANK_COND.
> > 
> > No one has ever managed to explain why the guc needs to get vblanks
> > in the first place. The original commit message adding this stuff is
> > not helpful at all.
> 
> It's used for the guc's magic slpc so that it can try to hit flips at
> vrefresh.

Is that a documented fact, or just an assumption? There's nothing
in the commit message, there's no comment in the code, and I don't
see anything in the spec either.

> I maintain that this design is bonghits, and we should forward

I assume you mean "should not".

> _any_ display interrupts to guc. Instead the kernel should tell guc when
> it's doing a sloppy job and missing deadlines. Because the kernel actually
> knows, and doesn't have to infer what might be going on by watching
> vblanks and flip_complete interrupts.
> 
> Also, with atomic we won't be using cs flips any more, who knows whether
> that guc magic even still works with that ...
> 
> So I'd vote to just stop telling guc this stuff and be done with it.

I would have to agree.

> -Daniel
> 
> > 
> > > 
> > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > > Cc: Harsh Chheda <harsh.j.chheda@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> > > index 8fe96a2..af9f51e 100644
> > > --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> > > +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> > > @@ -106,7 +106,7 @@ static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
> > >  	u32 tmp;
> > >  
> > >  	/* tell all command streamers to forward interrupts and vblank to GuC */
> > > -	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
> > > +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_COND);
> > >  	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> > >  	for_each_engine(engine, dev_priv)
> > >  		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> > > -- 
> > > 1.9.1
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Ville Syrjälä
> > Intel OTC
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Daniel Vetter June 20, 2016, 12:57 p.m. UTC | #4
On Fri, Jun 17, 2016 at 06:00:15PM +0300, Ville Syrjälä wrote:
> On Fri, Jun 17, 2016 at 04:14:48PM +0200, Daniel Vetter wrote:
> > On Fri, Jun 17, 2016 at 04:51:34PM +0300, Ville Syrjälä wrote:
> > > On Fri, Jun 17, 2016 at 02:42:39PM +0100, Dave Gordon wrote:
> > > > Apparently we shouldn't forward VBlanks to the GuC unconditionally,
> > > > as it may trigger a race condition in the GuC's internal dispatcher.
> > > > 
> > > > From an internal message from Harsh Chheda <harsh.j.chheda@intel.com>
> > > > >
> > > > > If the context has switched out the [GuC] scheduler should know so
> > > > > that it can  put the context in wait state and schedule something
> > > > > else.  If the context which is waiting on Vblank is still on the CS
> > > > > then [the GuC] scheduler does not need to know. If in this case wait
> > > > > on Vblank is sent to Guc while the context is still on CS there is
> > > > > an edge condition when the Vblank may get satisfied right after
> > > > > sending interrupt to guc and context may complete. Scheduler will
> > > > > then incorrectly mark the context to be in wait state.
> > > > 
> > > > So we need to change this register setting to GFX_FORWARD_VBLANK_COND.
> > > 
> > > No one has ever managed to explain why the guc needs to get vblanks
> > > in the first place. The original commit message adding this stuff is
> > > not helpful at all.
> > 
> > It's used for the guc's magic slpc so that it can try to hit flips at
> > vrefresh.
> 
> Is that a documented fact, or just an assumption? There's nothing
> in the commit message, there's no comment in the code, and I don't
> see anything in the spec either.

Well there's a magic mode where we tell guc to hit a vrefresh and it does
magic. Crucially we don't tell guc when it missed or when we flip, it
seems to observe all that through interrupts it gets from the display
block. But that's all inferences indeed, I have know idea what it does for
real.

> > I maintain that this design is bonghits, and we should forward
> 
> I assume you mean "should not".

yup.
-Daniel

> > _any_ display interrupts to guc. Instead the kernel should tell guc when
> > it's doing a sloppy job and missing deadlines. Because the kernel actually
> > knows, and doesn't have to infer what might be going on by watching
> > vblanks and flip_complete interrupts.
> > 
> > Also, with atomic we won't be using cs flips any more, who knows whether
> > that guc magic even still works with that ...
> > 
> > So I'd vote to just stop telling guc this stuff and be done with it.
> 
> I would have to agree.
> 
> > -Daniel
> > 
> > > 
> > > > 
> > > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > > > Cc: Harsh Chheda <harsh.j.chheda@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> > > > index 8fe96a2..af9f51e 100644
> > > > --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> > > > +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> > > > @@ -106,7 +106,7 @@ static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
> > > >  	u32 tmp;
> > > >  
> > > >  	/* tell all command streamers to forward interrupts and vblank to GuC */
> > > > -	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
> > > > +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_COND);
> > > >  	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> > > >  	for_each_engine(engine, dev_priv)
> > > >  		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> > > > -- 
> > > > 1.9.1
> > > > 
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > 
> > > -- 
> > > Ville Syrjälä
> > > Intel OTC
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> -- 
> Ville Syrjälä
> Intel OTC
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 8fe96a2..af9f51e 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -106,7 +106,7 @@  static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
 	u32 tmp;
 
 	/* tell all command streamers to forward interrupts and vblank to GuC */
-	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
+	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_COND);
 	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
 	for_each_engine(engine, dev_priv)
 		I915_WRITE(RING_MODE_GEN7(engine), irqs);