diff mbox

[11/11,v4] drm/i915/bdw: Ensure a context is loaded before RC6

Message ID 1392877640-20588-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ben Widawsky Feb. 20, 2014, 6:27 a.m. UTC
RC6 works a lot like HW contexts in that when the GPU enters RC6 it
saves away the state to a context, and loads it upon wake.

It's to be somewhat expected that BIOS will not set up valid GPU state.
As a result, if loading bad state can cause the GPU to get angry, it
would make sense then that we need to load state first. There are two
ways in which we can do this:

1. Create 3d state in the driver, load it up, then enable RC6.
1b. Reuse a known good state, [and if needed,] just bind objects where
needed. Then enable RC6.
2. Hold off enabling RC6 until userspace has had a chance to complete
batches.

There has been discussions in the past with #1 as it has been
recommended for fixes elsewhere. I'm not opposed to it, I'd just like to
do the easy thing now to enable the platform.

This patch is a hack that implement option #2. It will defer enabling
rc6 until the first batch from userspace has been retired. It suffers
two flaws. The first is, if the driver is loaded, but a batch is not
submitted/completed, we'll never enter rc6. The other is, it expects
userspace to submit a batch with 3d state first. Both of these things
are not actual flaws for most users because most users will boot to a
graphical composited desktop. Both mesa, and X will always emit the
necessary 3d state.

Once a context is loaded and we enable rc6, the default context should
inherit the proper state because we always inhibit the restore for the
default context. This assumes certain things about the workaround/issue
itself to which I am not privy (primarily that the indirect state
objects don't actually need to exist).

With that, there are currently 4 options for BDW:
1. Don't use RC6.
2. Use RC6 and expect a hang on the first batch submitted for every
context.
3. Use RC6 and use this patch.
4. Wait for another workaround implementation.

NOTE: This patch could be used against other platforms as well.

v2: Re-add accidentally dropped hunk (Ben)

v3: Now more compilable (Ben)

v4: Use the existing enable flag for rc6. This will also make the
suspend/resume case work properly, which is broken in v3.
Disable rc6 on reset, and defer re-enabling until the first batch.

The fact that RC6 residency continues to increment, and that this patch
prevents a hang on BDW silicon has been:
Tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)

Cc: David E. Box <david.e.box@intel.com>
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

squash! drm/i915/bdw: Ensure a context is loaded before RC6
---
 drivers/gpu/drm/i915/i915_drv.c      |  4 +++-
 drivers/gpu/drm/i915/i915_gem.c      | 10 ++++++++++
 drivers/gpu/drm/i915/intel_display.c |  5 +++++
 3 files changed, 18 insertions(+), 1 deletion(-)

Comments

Daniel Vetter March 4, 2014, 2:30 p.m. UTC | #1
On Wed, Feb 19, 2014 at 10:27:20PM -0800, Ben Widawsky wrote:
> RC6 works a lot like HW contexts in that when the GPU enters RC6 it
> saves away the state to a context, and loads it upon wake.
> 
> It's to be somewhat expected that BIOS will not set up valid GPU state.
> As a result, if loading bad state can cause the GPU to get angry, it
> would make sense then that we need to load state first. There are two
> ways in which we can do this:
> 
> 1. Create 3d state in the driver, load it up, then enable RC6.
> 1b. Reuse a known good state, [and if needed,] just bind objects where
> needed. Then enable RC6.
> 2. Hold off enabling RC6 until userspace has had a chance to complete
> batches.
> 
> There has been discussions in the past with #1 as it has been
> recommended for fixes elsewhere. I'm not opposed to it, I'd just like to
> do the easy thing now to enable the platform.
> 
> This patch is a hack that implement option #2. It will defer enabling
> rc6 until the first batch from userspace has been retired. It suffers
> two flaws. The first is, if the driver is loaded, but a batch is not
> submitted/completed, we'll never enter rc6. The other is, it expects
> userspace to submit a batch with 3d state first. Both of these things
> are not actual flaws for most users because most users will boot to a
> graphical composited desktop. Both mesa, and X will always emit the
> necessary 3d state.
> 
> Once a context is loaded and we enable rc6, the default context should
> inherit the proper state because we always inhibit the restore for the
> default context. This assumes certain things about the workaround/issue
> itself to which I am not privy (primarily that the indirect state
> objects don't actually need to exist).
> 
> With that, there are currently 4 options for BDW:
> 1. Don't use RC6.
> 2. Use RC6 and expect a hang on the first batch submitted for every
> context.
> 3. Use RC6 and use this patch.
> 4. Wait for another workaround implementation.
> 
> NOTE: This patch could be used against other platforms as well.
> 
> v2: Re-add accidentally dropped hunk (Ben)
> 
> v3: Now more compilable (Ben)
> 
> v4: Use the existing enable flag for rc6. This will also make the
> suspend/resume case work properly, which is broken in v3.
> Disable rc6 on reset, and defer re-enabling until the first batch.
> 
> The fact that RC6 residency continues to increment, and that this patch
> prevents a hang on BDW silicon has been:
> Tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
> 
> Cc: David E. Box <david.e.box@intel.com>
> Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> squash! drm/i915/bdw: Ensure a context is loaded before RC6
> ---
>  drivers/gpu/drm/i915/i915_drv.c      |  4 +++-
>  drivers/gpu/drm/i915/i915_gem.c      | 10 ++++++++++
>  drivers/gpu/drm/i915/intel_display.c |  5 +++++
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 2d05d7c..7fdfc0e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -679,6 +679,8 @@ int i915_reset(struct drm_device *dev)
>  	mutex_lock(&dev->struct_mutex);
>  
>  	i915_gem_reset(dev);
> +	if (IS_BROADWELL(dev))
> +		intel_disable_gt_powersave(dev);
>  
>  	simulated = dev_priv->gpu_error.stop_rings != 0;
>  
> @@ -733,7 +735,7 @@ int i915_reset(struct drm_device *dev)
>  		 * reset and the re-install of drm irq. Skip for ironlake per
>  		 * previous concerns that it doesn't respond well to some forms
>  		 * of re-init after reset. */
> -		if (INTEL_INFO(dev)->gen > 5) {
> +		if (INTEL_INFO(dev)->gen > 5 && !IS_BROADWELL(dev)) {
>  			mutex_lock(&dev->struct_mutex);
>  			intel_enable_gt_powersave(dev);
>  			mutex_unlock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3618bb0..25a97a6 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2420,6 +2420,7 @@ void i915_gem_reset(struct drm_device *dev)
>  void
>  i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  {
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	uint32_t seqno;
>  
>  	if (list_empty(&ring->request_list))
> @@ -2443,6 +2444,15 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>  			break;
>  
> +		/* Wa: can't find the w/a name.
> +		 * This doesn't actually implement the w/a, but it a workaround
> +		 * for the workaround. It defers using rc6 until we know valid
> +		 * state exists.
> +		 */
> +		if (IS_BROADWELL(ring->dev) && intel_enable_rc6(ring->dev) &&
> +		    !dev_priv->rps.enabled && ring->id == RCS)
> +			intel_enable_gt_powersave(ring->dev);

Nope, we won't rely on userspace submitting a command to enable a power
saving feature. There's a good chance that userspace won't submitting
anything, but still wake up from suspend, e.g. for background mail
checking. Especially on Android where opportunistic is used to
aggressively conserve battery power.

If just a MI_SET_CONTEXT isn't good enough and we need to save a valid 3d
context then we need to (shock, horror) emit the relevant minimal set of
3d state commands in the kernel before enabling rc6.

Aside: Series looks good otherwise, so I'll happily merge once it's
reviewed. No opinion on the RP* bikeshed (besides that your new names for
dev_priv->rps.foo are definitely clearer) as long as the connection
between the sane names and the Bspec names is documented somewhere.
-Daniel

> +
>  		i915_gem_object_move_to_inactive(obj);
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f19e6ea..72c8e1d 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10983,6 +10983,11 @@ void intel_modeset_init_hw(struct drm_device *dev)
>  
>  	intel_reset_dpio(dev);
>  
> +	if (IS_BROADWELL(dev)) {
> +		DRM_DEBUG_DRIVER("Deferring RC6 enabling until first batch is complete\n");
> +		return;
> +	}
> +
>  	mutex_lock(&dev->struct_mutex);
>  	intel_enable_gt_powersave(dev);
>  	mutex_unlock(&dev->struct_mutex);
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Ben Widawsky March 20, 2014, 12:41 a.m. UTC | #2
On Tue, Mar 04, 2014 at 03:30:14PM +0100, Daniel Vetter wrote:
> On Wed, Feb 19, 2014 at 10:27:20PM -0800, Ben Widawsky wrote:
> > RC6 works a lot like HW contexts in that when the GPU enters RC6 it
> > saves away the state to a context, and loads it upon wake.
> > 
> > It's to be somewhat expected that BIOS will not set up valid GPU state.
> > As a result, if loading bad state can cause the GPU to get angry, it
> > would make sense then that we need to load state first. There are two
> > ways in which we can do this:
> > 
> > 1. Create 3d state in the driver, load it up, then enable RC6.
> > 1b. Reuse a known good state, [and if needed,] just bind objects where
> > needed. Then enable RC6.
> > 2. Hold off enabling RC6 until userspace has had a chance to complete
> > batches.
> > 
> > There has been discussions in the past with #1 as it has been
> > recommended for fixes elsewhere. I'm not opposed to it, I'd just like to
> > do the easy thing now to enable the platform.
> > 
> > This patch is a hack that implement option #2. It will defer enabling
> > rc6 until the first batch from userspace has been retired. It suffers
> > two flaws. The first is, if the driver is loaded, but a batch is not
> > submitted/completed, we'll never enter rc6. The other is, it expects
> > userspace to submit a batch with 3d state first. Both of these things
> > are not actual flaws for most users because most users will boot to a
> > graphical composited desktop. Both mesa, and X will always emit the
> > necessary 3d state.
> > 
> > Once a context is loaded and we enable rc6, the default context should
> > inherit the proper state because we always inhibit the restore for the
> > default context. This assumes certain things about the workaround/issue
> > itself to which I am not privy (primarily that the indirect state
> > objects don't actually need to exist).
> > 
> > With that, there are currently 4 options for BDW:
> > 1. Don't use RC6.
> > 2. Use RC6 and expect a hang on the first batch submitted for every
> > context.
> > 3. Use RC6 and use this patch.
> > 4. Wait for another workaround implementation.
> > 
> > NOTE: This patch could be used against other platforms as well.
> > 
> > v2: Re-add accidentally dropped hunk (Ben)
> > 
> > v3: Now more compilable (Ben)
> > 
> > v4: Use the existing enable flag for rc6. This will also make the
> > suspend/resume case work properly, which is broken in v3.
> > Disable rc6 on reset, and defer re-enabling until the first batch.
> > 
> > The fact that RC6 residency continues to increment, and that this patch
> > prevents a hang on BDW silicon has been:
> > Tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
> > 
> > Cc: David E. Box <david.e.box@intel.com>
> > Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > squash! drm/i915/bdw: Ensure a context is loaded before RC6
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c      |  4 +++-
> >  drivers/gpu/drm/i915/i915_gem.c      | 10 ++++++++++
> >  drivers/gpu/drm/i915/intel_display.c |  5 +++++
> >  3 files changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 2d05d7c..7fdfc0e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -679,6 +679,8 @@ int i915_reset(struct drm_device *dev)
> >  	mutex_lock(&dev->struct_mutex);
> >  
> >  	i915_gem_reset(dev);
> > +	if (IS_BROADWELL(dev))
> > +		intel_disable_gt_powersave(dev);
> >  
> >  	simulated = dev_priv->gpu_error.stop_rings != 0;
> >  
> > @@ -733,7 +735,7 @@ int i915_reset(struct drm_device *dev)
> >  		 * reset and the re-install of drm irq. Skip for ironlake per
> >  		 * previous concerns that it doesn't respond well to some forms
> >  		 * of re-init after reset. */
> > -		if (INTEL_INFO(dev)->gen > 5) {
> > +		if (INTEL_INFO(dev)->gen > 5 && !IS_BROADWELL(dev)) {
> >  			mutex_lock(&dev->struct_mutex);
> >  			intel_enable_gt_powersave(dev);
> >  			mutex_unlock(&dev->struct_mutex);
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 3618bb0..25a97a6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2420,6 +2420,7 @@ void i915_gem_reset(struct drm_device *dev)
> >  void
> >  i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  {
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	uint32_t seqno;
> >  
> >  	if (list_empty(&ring->request_list))
> > @@ -2443,6 +2444,15 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> >  			break;
> >  
> > +		/* Wa: can't find the w/a name.
> > +		 * This doesn't actually implement the w/a, but it a workaround
> > +		 * for the workaround. It defers using rc6 until we know valid
> > +		 * state exists.
> > +		 */
> > +		if (IS_BROADWELL(ring->dev) && intel_enable_rc6(ring->dev) &&
> > +		    !dev_priv->rps.enabled && ring->id == RCS)
> > +			intel_enable_gt_powersave(ring->dev);
> 
> Nope, we won't rely on userspace submitting a command to enable a power
> saving feature. There's a good chance that userspace won't submitting
> anything, but still wake up from suspend, e.g. for background mail
> checking. Especially on Android where opportunistic is used to
> aggressively conserve battery power.

Your suspend example with mail is also not great. So the thing wakes up
for a second to check mail, and GPU is not in RC6, then goes back to
sleep. Big deal. If a user sees a new mail and uses the mail app, she
will get RC6. Essentially on any composited windowing system, that the
user is actively interfacing with, you get RC6 (and deeper states).

> 
> If just a MI_SET_CONTEXT isn't good enough and we need to save a valid 3d
> context then we need to (shock, horror) emit the relevant minimal set of
> 3d state commands in the kernel before enabling rc6.
> 
> Aside: Series looks good otherwise, so I'll happily merge once it's
> reviewed. No opinion on the RP* bikeshed (besides that your new names for
> dev_priv->rps.foo are definitely clearer) as long as the connection
> between the sane names and the Bspec names is documented somewhere.
> -Daniel
> 

I can't say it's completely unexpected that this would be your response,
but I do feel like you've ignored my argument that this is better than
the current situation. Not merging this patch only keeps things bad.

So I'd like you to re-consider merging this patch instead of waiting for
the perfect solution. This patch requires a lot less review than the
real fix. It has been tested by several people (I can ask them to put
their reviewed by on it). It enables rc6 for people today, and this
includes pc7, and deeper package and C states. It's very easy to revert
if/when we get a real fix. Real users benefit from this patch. Real
users are not hurt by this patch because if userspace is submitting bad
state setup, they'll have the same or worse experience than failing RC6.

As an aside, this needs to come before I enable rc6 anyway. So the order
way bad.

> > +
> >  		i915_gem_object_move_to_inactive(obj);
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index f19e6ea..72c8e1d 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -10983,6 +10983,11 @@ void intel_modeset_init_hw(struct drm_device *dev)
> >  
> >  	intel_reset_dpio(dev);
> >  
> > +	if (IS_BROADWELL(dev)) {
> > +		DRM_DEBUG_DRIVER("Deferring RC6 enabling until first batch is complete\n");
> > +		return;
> > +	}
> > +
> >  	mutex_lock(&dev->struct_mutex);
> >  	intel_enable_gt_powersave(dev);
> >  	mutex_unlock(&dev->struct_mutex);
> > -- 
> > 1.9.0
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Daniel Vetter March 20, 2014, 1:42 p.m. UTC | #3
On Wed, Mar 19, 2014 at 05:41:37PM -0700, Ben Widawsky wrote:
> I can't say it's completely unexpected that this would be your response,
> but I do feel like you've ignored my argument that this is better than
> the current situation. Not merging this patch only keeps things bad.
> 
> So I'd like you to re-consider merging this patch instead of waiting for
> the perfect solution. This patch requires a lot less review than the
> real fix. It has been tested by several people (I can ask them to put
> their reviewed by on it). It enables rc6 for people today, and this
> includes pc7, and deeper package and C states. It's very easy to revert
> if/when we get a real fix. Real users benefit from this patch. Real
> users are not hurt by this patch because if userspace is submitting bad
> state setup, they'll have the same or worse experience than failing RC6.
> 
> As an aside, this needs to come before I enable rc6 anyway. So the order
> way bad.

I fully agree with your assessment on technical reasons. The problem is
that I've just been forced through an exercise of "merge this now because
it works, people have tested it and we really, really, really need it to
move forward" and it didn't go down well.

Which means for the foreseeable future I'll reject patches when it looks
like a few too many rolls of ducttape have been involved in their
construction. I'd prefer if we could move more towards a merge early or at
least integration-test early model, but currently that's not a viable
model for me.

Note that this is not an iron rule at all, e.g. with psr I've just told
Rodrigo that I want the current state of affairs finalized for merging
instead of trying to hunt for the perfect patches. The reason for that was
that I think the remaining issues in psr support are well-understood and
we have solid test-coverage to make sure we don't fumble things. Also one
issue with the current psr patches is that they're way too conservative in
a few cases (i.e. wasting power), but for something tricky leaning towards
correctness is actually a feature. And bad power numbers tend to grab our
project managements attention. Overall the risks are a fairly clear
quantity.

In this area of rc6 and contexts though we have track record of not really
understanding issues. Which means that the risks here are unknown and
might be fairly big.

Yours, Daniel
Jesse Barnes March 20, 2014, 5:30 p.m. UTC | #4
On Thu, 20 Mar 2014 14:42:32 +0100
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Wed, Mar 19, 2014 at 05:41:37PM -0700, Ben Widawsky wrote:
> > I can't say it's completely unexpected that this would be your response,
> > but I do feel like you've ignored my argument that this is better than
> > the current situation. Not merging this patch only keeps things bad.
> > 
> > So I'd like you to re-consider merging this patch instead of waiting for
> > the perfect solution. This patch requires a lot less review than the
> > real fix. It has been tested by several people (I can ask them to put
> > their reviewed by on it). It enables rc6 for people today, and this
> > includes pc7, and deeper package and C states. It's very easy to revert
> > if/when we get a real fix. Real users benefit from this patch. Real
> > users are not hurt by this patch because if userspace is submitting bad
> > state setup, they'll have the same or worse experience than failing RC6.
> > 
> > As an aside, this needs to come before I enable rc6 anyway. So the order
> > way bad.
> 
> I fully agree with your assessment on technical reasons. The problem is
> that I've just been forced through an exercise of "merge this now because
> it works, people have tested it and we really, really, really need it to
> move forward" and it didn't go down well.
> 
> Which means for the foreseeable future I'll reject patches when it looks
> like a few too many rolls of ducttape have been involved in their
> construction. I'd prefer if we could move more towards a merge early or at
> least integration-test early model, but currently that's not a viable
> model for me.
> 
> Note that this is not an iron rule at all, e.g. with psr I've just told
> Rodrigo that I want the current state of affairs finalized for merging
> instead of trying to hunt for the perfect patches. The reason for that was
> that I think the remaining issues in psr support are well-understood and
> we have solid test-coverage to make sure we don't fumble things. Also one
> issue with the current psr patches is that they're way too conservative in
> a few cases (i.e. wasting power), but for something tricky leaning towards
> correctness is actually a feature. And bad power numbers tend to grab our
> project managements attention. Overall the risks are a fairly clear
> quantity.
> 
> In this area of rc6 and contexts though we have track record of not really
> understanding issues. Which means that the risks here are unknown and
> might be fairly big.

Do you think this patch falls into that class of issues?  It seems like
it's a general improvement, even if it doesn't address the more
general behavior we'd like (sooner than later really).

But blocking it until we have the full primitive emission seems like
it's going to keep power consumption in a terrible state on BDW for the
forseeable future... moreover I guess this is something we need going
back forever for RC6 stability, at least according to the hw team.  So
rather than blocking this, maybe we should commit it for all platforms!
Jesse Barnes March 20, 2014, 8:12 p.m. UTC | #5
On Thu, 20 Mar 2014 10:30:32 -0700
Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> On Thu, 20 Mar 2014 14:42:32 +0100
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > On Wed, Mar 19, 2014 at 05:41:37PM -0700, Ben Widawsky wrote:
> > > I can't say it's completely unexpected that this would be your response,
> > > but I do feel like you've ignored my argument that this is better than
> > > the current situation. Not merging this patch only keeps things bad.
> > > 
> > > So I'd like you to re-consider merging this patch instead of waiting for
> > > the perfect solution. This patch requires a lot less review than the
> > > real fix. It has been tested by several people (I can ask them to put
> > > their reviewed by on it). It enables rc6 for people today, and this
> > > includes pc7, and deeper package and C states. It's very easy to revert
> > > if/when we get a real fix. Real users benefit from this patch. Real
> > > users are not hurt by this patch because if userspace is submitting bad
> > > state setup, they'll have the same or worse experience than failing RC6.
> > > 
> > > As an aside, this needs to come before I enable rc6 anyway. So the order
> > > way bad.
> > 
> > I fully agree with your assessment on technical reasons. The problem is
> > that I've just been forced through an exercise of "merge this now because
> > it works, people have tested it and we really, really, really need it to
> > move forward" and it didn't go down well.
> > 
> > Which means for the foreseeable future I'll reject patches when it looks
> > like a few too many rolls of ducttape have been involved in their
> > construction. I'd prefer if we could move more towards a merge early or at
> > least integration-test early model, but currently that's not a viable
> > model for me.
> > 
> > Note that this is not an iron rule at all, e.g. with psr I've just told
> > Rodrigo that I want the current state of affairs finalized for merging
> > instead of trying to hunt for the perfect patches. The reason for that was
> > that I think the remaining issues in psr support are well-understood and
> > we have solid test-coverage to make sure we don't fumble things. Also one
> > issue with the current psr patches is that they're way too conservative in
> > a few cases (i.e. wasting power), but for something tricky leaning towards
> > correctness is actually a feature. And bad power numbers tend to grab our
> > project managements attention. Overall the risks are a fairly clear
> > quantity.
> > 
> > In this area of rc6 and contexts though we have track record of not really
> > understanding issues. Which means that the risks here are unknown and
> > might be fairly big.
> 
> Do you think this patch falls into that class of issues?  It seems like
> it's a general improvement, even if it doesn't address the more
> general behavior we'd like (sooner than later really).
> 
> But blocking it until we have the full primitive emission seems like
> it's going to keep power consumption in a terrible state on BDW for the
> forseeable future... moreover I guess this is something we need going
> back forever for RC6 stability, at least according to the hw team.  So
> rather than blocking this, maybe we should commit it for all platforms!

Summarizing IRC discussion a bit: speaking generally of when some
future work blocks an existing fix or feature, we really need to make
sure someone is working on the broader task and make sure we track it
so it doesn't get lost, otherwise everyone loses.  The user loses
because a fix or feature isn't available, the author loses because
something gets blocked indefinitely, and upstream loses because either
a fix doesn't land or it does and the larger feature never gets
implemented because the pressure is off.

So with that, who wants to volunteer to update the 3D state emission
patch to include BDW and push it upstream?  Daniel promised to file a
JIRA task for this so our PM can track it, so someone will be
volunteered in the next week or so if we don't get it done before then.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 2d05d7c..7fdfc0e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -679,6 +679,8 @@  int i915_reset(struct drm_device *dev)
 	mutex_lock(&dev->struct_mutex);
 
 	i915_gem_reset(dev);
+	if (IS_BROADWELL(dev))
+		intel_disable_gt_powersave(dev);
 
 	simulated = dev_priv->gpu_error.stop_rings != 0;
 
@@ -733,7 +735,7 @@  int i915_reset(struct drm_device *dev)
 		 * reset and the re-install of drm irq. Skip for ironlake per
 		 * previous concerns that it doesn't respond well to some forms
 		 * of re-init after reset. */
-		if (INTEL_INFO(dev)->gen > 5) {
+		if (INTEL_INFO(dev)->gen > 5 && !IS_BROADWELL(dev)) {
 			mutex_lock(&dev->struct_mutex);
 			intel_enable_gt_powersave(dev);
 			mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3618bb0..25a97a6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2420,6 +2420,7 @@  void i915_gem_reset(struct drm_device *dev)
 void
 i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	uint32_t seqno;
 
 	if (list_empty(&ring->request_list))
@@ -2443,6 +2444,15 @@  i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
+		/* Wa: can't find the w/a name.
+		 * This doesn't actually implement the w/a, but it a workaround
+		 * for the workaround. It defers using rc6 until we know valid
+		 * state exists.
+		 */
+		if (IS_BROADWELL(ring->dev) && intel_enable_rc6(ring->dev) &&
+		    !dev_priv->rps.enabled && ring->id == RCS)
+			intel_enable_gt_powersave(ring->dev);
+
 		i915_gem_object_move_to_inactive(obj);
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f19e6ea..72c8e1d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10983,6 +10983,11 @@  void intel_modeset_init_hw(struct drm_device *dev)
 
 	intel_reset_dpio(dev);
 
+	if (IS_BROADWELL(dev)) {
+		DRM_DEBUG_DRIVER("Deferring RC6 enabling until first batch is complete\n");
+		return;
+	}
+
 	mutex_lock(&dev->struct_mutex);
 	intel_enable_gt_powersave(dev);
 	mutex_unlock(&dev->struct_mutex);