drm/i915: run intel_uncore_early_sanitize earlier on resume on non-VLV

Message ID 1413572490-2129-1-git-send-email-przanoni@gmail.com (mailing list archive)
State New, archived

Commit Message

Paulo Zanoni Oct. 17, 2014, 7:01 p.m. UTC
From: Paulo Zanoni <paulo.r.zanoni@intel.com>

As far as I understand, intel_uncore_early_sanitize() was supposed to
be run before any register access, but currently
intel_resume_prepare() is run earlier, and it does register
access. I don't think it is safe to call
I915_{READ,WRITE} without calling intel_uncore_early_sanitize() first.

One of the problems we currently have is that when we suspend/resume
BDW, the FPGA_DBG_RM_NOCLAIM bit becomes 1, so we end up printing an
"unclaimed register" message on resume, but this message doesn't
really seem to have been triggered by our driver or user space, since
the bit was not there before suspending, and gets there just after
resuming, before any of our own register accesses. So calling
intel_uncore_early_sanitize() as a first thing will allow us to stop
printing the error message, fixing the "bug".

v2: VLV is an exception to the early_sanitize() rule: it needs to do
stuff before calling early_sanitize(), so instead of calling it
earlier for every platform, we call it earlier for non-VLV by adding
the early_sanitize() call inside intel_resume_prepare(). This doesn't
look like the most-beautiful-solution-ever, but, well, at least it
fixes the bug. (Imre)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83094
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
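
The ordering problem described in the commit message can be modelled in
user space. The following is only a toy sketch, not driver code: every
sim_* name and the bit position are made up for illustration, and it
assumes that a checked register access reports (and clears) a pending
NOCLAIM flag, while early sanitize silently drops a stale one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit only; not taken from the real register layout. */
#define SIM_NOCLAIM (1u << 31)

struct sim_hw {
	uint32_t fpga_dbg;   /* simulated debug register */
	int unclaimed_msgs;  /* number of "unclaimed register" reports */
};

/* Per the commit message: the flag is already set right after resume. */
static void sim_resume_from_suspend(struct sim_hw *hw)
{
	assert(hw != NULL);
	hw->fpga_dbg |= SIM_NOCLAIM;
}

/* Stand-in for intel_uncore_early_sanitize(): drop a stale flag quietly. */
static void sim_early_sanitize(struct sim_hw *hw)
{
	assert(hw != NULL);
	hw->fpga_dbg &= ~SIM_NOCLAIM;
}

/* Stand-in for a checked I915_READ/I915_WRITE: report a pending flag. */
static void sim_checked_access(struct sim_hw *hw)
{
	assert(hw != NULL);
	if (hw->fpga_dbg & SIM_NOCLAIM) {
		printf("unclaimed register detected\n");
		hw->unclaimed_msgs++;
		hw->fpga_dbg &= ~SIM_NOCLAIM;
	}
}

/* Run one resume with the chosen ordering; return the report count. */
static int sim_resume(bool sanitize_first)
{
	struct sim_hw hw = { 0, 0 };

	sim_resume_from_suspend(&hw);
	if (sanitize_first)
		sim_early_sanitize(&hw);
	sim_checked_access(&hw);   /* e.g. an access made by resume prepare */
	if (!sanitize_first)
		sim_early_sanitize(&hw);

	return hw.unclaimed_msgs;
}
```

Calling sim_resume(false) models the current order, where the first
checked access sees the stale flag; sim_resume(true) models the order
this patch aims for on non-VLV.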

Comments

Imre Deak Oct. 20, 2014, 10:20 a.m. UTC | #1
On Fri, 2014-10-17 at 16:01 -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> As far as I understand, intel_uncore_early_sanitize() was supposed to
> be run before any register access, but currently
> intel_resume_prepare() is run earlier, and it does register
> access. I don't think it is safe to call
> I915_{READ,WRITE} without calling intel_uncore_early_sanitize() first.
> 
> One of the problems we currently have is that when we suspend/resume
> BDW, the FPGA_DBG_RM_NOCLAIM bit becomes 1, so we end up printing an
> "unclaimed register" message on resume, but this message doesn't
> really seem to have been triggered by our driver or user space, since
> the bit was not there before suspending, and gets there just after
> resuming, before any of our own register accesses. So calling
> intel_uncore_early_sanitize() as a first thing will allow us to stop
> printing the error message, fixing the "bug".
> 
> v2: VLV is an exception to the early_sanitize() rule: it needs to do
> stuff before calling early_sanitize(), so instead of calling it
> earlier for every platform, we call it earlier for non-VLV by adding
> the early_sanitize() call inside intel_resume_prepare(). This doesn't
> look like the most-beautiful-solution-ever, but, well, at least it
> fixes the bug. (Imre)
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Imre Deak <imre.deak@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83094
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index a05a1d0..f6d28f2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -669,7 +669,6 @@ static int i915_drm_thaw_early(struct drm_device *dev)
>  	if (ret)
>  		DRM_ERROR("Resume prepare failed: %d,Continuing resume\n", ret);
>  
> -	intel_uncore_early_sanitize(dev, true);
>  	intel_uncore_sanitize(dev);
>  	intel_power_domains_init_hw(dev_priv);
>  
> @@ -1049,6 +1048,8 @@ static int snb_resume_prepare(struct drm_i915_private *dev_priv,
>  
>  	if (rpm_resume)
>  		intel_init_pch_refclk(dev);
> +	else
> +		intel_uncore_early_sanitize(dev, true);
>  
>  	return 0;
>  }
> @@ -1056,6 +1057,9 @@ static int snb_resume_prepare(struct drm_i915_private *dev_priv,
>  static int hsw_resume_prepare(struct drm_i915_private *dev_priv,
>  				bool rpm_resume)
>  {
> +	if (!rpm_resume)
> +		intel_uncore_early_sanitize(dev_priv->dev, true);
> +
>  	hsw_disable_pc8(dev_priv);
>  
>  	return 0;
> @@ -1421,6 +1425,9 @@ static int vlv_resume_prepare(struct drm_i915_private *dev_priv,
>  		i915_gem_restore_fences(dev);
>  	}
>  
> +	if (!rpm_resume)
> +		intel_uncore_early_sanitize(dev, true);
> +
>  	return ret;
>  }
>  

You also need to call intel_uncore_early_sanitize() from
intel_resume_prepare() for the rest of the platforms. With that fixed:
Reviewed-by: Imre Deak <imre.deak@intel.com>
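
To illustrate the first point, here is a stand-alone model of a
per-platform dispatcher with a fallback branch for the remaining
platforms. It is a sketch under assumptions, not the driver code: all
mock_* names and the platform enum are hypothetical, and the real
intel_resume_prepare() may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform classes, for illustration only. */
enum mock_platform { MOCK_SNB, MOCK_HSW, MOCK_VLV, MOCK_OTHER };

struct mock_dev {
	enum mock_platform platform;
	int sanitize_calls;  /* how often early sanitize ran */
};

/* Stand-in for intel_uncore_early_sanitize(dev, true). */
static void mock_early_sanitize(struct mock_dev *dev)
{
	dev->sanitize_calls++;
}

/* The three platform helpers mirror the patch: sanitize on system resume only. */
static int mock_snb_resume_prepare(struct mock_dev *dev, bool rpm_resume)
{
	if (!rpm_resume)
		mock_early_sanitize(dev);
	return 0;
}

static int mock_hsw_resume_prepare(struct mock_dev *dev, bool rpm_resume)
{
	if (!rpm_resume)
		mock_early_sanitize(dev);
	return 0;
}

static int mock_vlv_resume_prepare(struct mock_dev *dev, bool rpm_resume)
{
	/* VLV would do its own early steps before this point. */
	if (!rpm_resume)
		mock_early_sanitize(dev);
	return 0;
}

static int mock_resume_prepare(struct mock_dev *dev, bool rpm_resume)
{
	assert(dev != NULL);

	switch (dev->platform) {
	case MOCK_SNB:
		return mock_snb_resume_prepare(dev, rpm_resume);
	case MOCK_HSW:
		return mock_hsw_resume_prepare(dev, rpm_resume);
	case MOCK_VLV:
		return mock_vlv_resume_prepare(dev, rpm_resume);
	default:
		/* The branch the review asks for: everyone else sanitizes too. */
		if (!rpm_resume)
			mock_early_sanitize(dev);
		return 0;
	}
}
```

Without the default branch, a platform with no helper of its own would
lose the early sanitize call once it is removed from
i915_drm_thaw_early().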

Looking at the result, I agree it's not the nicest, so yet another way
to reduce the clutter would be to have the following instead in
i915_drm_thaw_early():

intel_resume_early_prepare()
intel_uncore_early_sanitize()
intel_resume_prepare()

and do the early steps for VLV in intel_resume_early_prepare(). I'm ok
with both solutions.
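
A rough, stand-alone sketch of that three-step ordering, recording the
call order instead of touching hardware. The seq_* names are
hypothetical, and which steps VLV would actually move into the early
hook is left open here:

```c
#include <assert.h>
#include <stdbool.h>

enum seq_step { SEQ_VLV_EARLY, SEQ_EARLY_SANITIZE, SEQ_RESUME_PREPARE };

#define SEQ_MAX_STEPS 8

struct seq_trace {
	enum seq_step steps[SEQ_MAX_STEPS];
	int count;
};

static void seq_record(struct seq_trace *t, enum seq_step step)
{
	assert(t->count < SEQ_MAX_STEPS);
	t->steps[t->count++] = step;
}

/* Hypothetical early hook: only VLV has work to do before sanitize. */
static void seq_resume_early_prepare(struct seq_trace *t, bool is_vlv)
{
	if (is_vlv)
		seq_record(t, SEQ_VLV_EARLY);
}

/* Stand-in for intel_uncore_early_sanitize(). */
static void seq_uncore_early_sanitize(struct seq_trace *t)
{
	seq_record(t, SEQ_EARLY_SANITIZE);
}

/* Stand-in for the remaining per-platform resume work. */
static void seq_resume_prepare(struct seq_trace *t)
{
	seq_record(t, SEQ_RESUME_PREPARE);
}

/* The ordering proposed for i915_drm_thaw_early(). */
static void seq_thaw_early(struct seq_trace *t, bool is_vlv)
{
	seq_resume_early_prepare(t, is_vlv);
	seq_uncore_early_sanitize(t);
	seq_resume_prepare(t);
}
```

With this shape the sanitize call stays in one place, and on non-VLV it
is the first thing that runs.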

--Imre
Daniel Vetter Oct. 21, 2014, 5:05 p.m. UTC | #2
On Mon, Oct 20, 2014 at 01:20:50PM +0300, Imre Deak wrote:
> 
> You also need to call intel_uncore_early_sanitize() from
> intel_resume_prepare() for the rest of the platforms. With that fixed:
> Reviewed-by: Imre Deak <imre.deak@intel.com>
> 
> Looking at the result, I agree it's not the nicest, so yet another way
> to reduce the clutter would be to have the following instead in
> i915_drm_thaw_early():
> 
> intel_resume_early_prepare()
> intel_uncore_early_sanitize()
> intel_resume_prepare()
> 
> and do the early steps for VLV in intel_resume_early_prepare(). I'm ok
> with both solutions.

This honestly starts to smell like a giant maintenance nightmare. We kinda
started off in the wrong direction with vlv rpm and it seems to get
worse by the day. And it looks like the situation is messy enough that we
can't even lock down the ordering with copious amounts of warnings ...

But I also don't see any real solution, so just ranting for now. I'd
appreciate though if the revised version comes with a bunch of comments
attached in the code.
-Daniel
Imre Deak Oct. 22, 2014, 11:20 a.m. UTC | #3
On Tue, 2014-10-21 at 19:05 +0200, Daniel Vetter wrote:
> On Mon, Oct 20, 2014 at 01:20:50PM +0300, Imre Deak wrote:
> > 
> > You also need to call intel_uncore_early_sanitize() from
> > intel_resume_prepare() for the rest of the platforms. With that fixed:
> > Reviewed-by: Imre Deak <imre.deak@intel.com>
> > 
> > Looking at the result, I agree it's not the nicest, so yet another way
> > to reduce the clutter would be to have the following instead in
> > i915_drm_thaw_early():
> > 
> > intel_resume_early_prepare()
> > intel_uncore_early_sanitize()
> > intel_resume_prepare()
> > 
> > and do the early steps for VLV in intel_resume_early_prepare(). I'm ok
> > with both solutions.
> 
> This honestly starts to smell like a giant maintenance nightmare. We kinda
> started off into the wrong direction with vlv rpm and it seems to get
> worse by the day. And it looks like the situation is messy enough that we
> can't even lock down the ordering with copious amounts of warnings ...
> 
> But I also don't see any real solution, so just ranting for now. I'd
> appreciate though if the revised version comes with a bunch of comments
> attached in the code.

I blame it on the HW people. :) Seriously, the VLV PM code differs from
the rest of PM code in that we save/restore some HW state instead of
reinitializing it. That's where the above special casing of the ordering
stems from. I agree that it's not ideal, but I think having started with
that solution and moving towards the ideal was not that bad. In fact
s0ix doesn't yet work in the upstream kernel for reasons independent of
i915 (or at least I couldn't make it work), but we would need it to
fully validate all the suspend/resume paths.

--Imre
Paulo Zanoni Oct. 22, 2014, 7:01 p.m. UTC | #4
2014-10-22 9:20 GMT-02:00 Imre Deak <imre.deak@intel.com>:
> On Tue, 2014-10-21 at 19:05 +0200, Daniel Vetter wrote:
>> On Mon, Oct 20, 2014 at 01:20:50PM +0300, Imre Deak wrote:
>> >
>> > You also need to call intel_uncore_early_sanitize() from
>> > intel_resume_prepare() for the rest of the platforms. With that fixed:
>> > Reviewed-by: Imre Deak <imre.deak@intel.com>
>> >
>> > Looking at the result, I agree it's not the nicest, so yet another way
>> > to reduce the clutter would be to have the following instead in
>> > i915_drm_thaw_early():
>> >
>> > intel_resume_early_prepare()
>> > intel_uncore_early_sanitize()
>> > intel_resume_prepare()
>> >
>> > and do the early steps for VLV in intel_resume_early_prepare(). I'm ok
>> > with both solutions.
>>
>> This honestly starts to smell like a giant maintenance nightmare. We kinda
>> started off into the wrong direction with vlv rpm and it seems to get
>> worse by the day. And it looks like the situation is messy enough that we
>> can't even lock down the ordering with copious amounts of warnings ...
>>
>> But I also don't see any real solution, so just ranting for now. I'd
>> appreciate though if the revised version comes with a bunch of comments
>> attached in the code.
>
> I blame it on the HW people. :) Seriously, the VLV PM code differs from
> the rest of PM code in that we save/restore some HW state instead of
> reinitializing it. That's where the above special casing of the ordering
> stems from. I agree that it's not ideal, but I think having started with
> that solution and moving towards the ideal was not that bad. In fact
> s0ix doesn't yet work in the upstream kernel for reasons independent of
> i915 (or at least I couldn't make it work), but we would need it to
> fully validate all the suspend/resume paths.

On a side note, even igt/pm_rpm/rte (the basic subtest) seems to be
broken on BYT since forever (at least according to QA, bug #82939), so
do we even want RPM enabled on BYT?

>
> --Imre
>
Daniel Vetter Oct. 23, 2014, 12:16 p.m. UTC | #5
On Wed, Oct 22, 2014 at 05:01:54PM -0200, Paulo Zanoni wrote:
> 2014-10-22 9:20 GMT-02:00 Imre Deak <imre.deak@intel.com>:
> > On Tue, 2014-10-21 at 19:05 +0200, Daniel Vetter wrote:
> >> On Mon, Oct 20, 2014 at 01:20:50PM +0300, Imre Deak wrote:
> >> >
> >> > You also need to call intel_uncore_early_sanitize() from
> >> > intel_resume_prepare() for the rest of the platforms. With that fixed:
> >> > Reviewed-by: Imre Deak <imre.deak@intel.com>
> >> >
> >> > Looking at the result, I agree it's not the nicest, so yet another way
> >> > to reduce the clutter would be to have the following instead in
> >> > i915_drm_thaw_early():
> >> >
> >> > intel_resume_early_prepare()
> >> > intel_uncore_early_sanitize()
> >> > intel_resume_prepare()
> >> >
> >> > and do the early steps for VLV in intel_resume_early_prepare(). I'm ok
> >> > with both solutions.
> >>
> >> This honestly starts to smell like a giant maintenance nightmare. We kinda
> >> started off into the wrong direction with vlv rpm and it seems to get
> >> worse by the day. And it looks like the situation is messy enough that we
> >> can't even lock down the ordering with copious amounts of warnings ...
> >>
> >> But I also don't see any real solution, so just ranting for now. I'd
> >> appreciate though if the revised version comes with a bunch of comments
> >> attached in the code.
> >
> > I blame it on the HW people. :) Seriously, the VLV PM code differs from
> > the rest of PM code in that we save/restore some HW state instead of
> > reinitializing it. That's where the above special casing of the ordering
> > stems from. I agree that it's not ideal, but I think having started with
> > that solution and moving towards the ideal was not that bad. In fact
> > s0ix doesn't yet work in the upstream kernel for reasons independent of
> > i915 (or at least I couldn't make it work), but we would need it to
> > fully validate all the suspend/resume paths.
> 
> On a side note, even igt/pm_rpm/rte (the basic subtest) seems to be
> broken on BYT since forever (at least according to QA, bug #82939), so
> do we even want RPM enabled on BYT?

If it's really broken and not just the test being pedantic about
something, then yes please submit the revert. Submitting reverts is the
best way to make sure PM and engineers are aware that something didn't go
as it should have, and it's the quickest way to rack up good bug team
stats. So please bring them on.

This is btw true for regressions in general: If you don't see action on a
bug, then just send in the revert. Either the original author can come up
with fix within a few days, or I'll just merge the revert. In either case,
bug stats improve.

Cheers, Daniel
Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a05a1d0..f6d28f2 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -669,7 +669,6 @@  static int i915_drm_thaw_early(struct drm_device *dev)
 	if (ret)
 		DRM_ERROR("Resume prepare failed: %d,Continuing resume\n", ret);
 
-	intel_uncore_early_sanitize(dev, true);
 	intel_uncore_sanitize(dev);
 	intel_power_domains_init_hw(dev_priv);
 
@@ -1049,6 +1048,8 @@  static int snb_resume_prepare(struct drm_i915_private *dev_priv,
 
 	if (rpm_resume)
 		intel_init_pch_refclk(dev);
+	else
+		intel_uncore_early_sanitize(dev, true);
 
 	return 0;
 }
@@ -1056,6 +1057,9 @@  static int snb_resume_prepare(struct drm_i915_private *dev_priv,
 static int hsw_resume_prepare(struct drm_i915_private *dev_priv,
 				bool rpm_resume)
 {
+	if (!rpm_resume)
+		intel_uncore_early_sanitize(dev_priv->dev, true);
+
 	hsw_disable_pc8(dev_priv);
 
 	return 0;
@@ -1421,6 +1425,9 @@  static int vlv_resume_prepare(struct drm_i915_private *dev_priv,
 		i915_gem_restore_fences(dev);
 	}
 
+	if (!rpm_resume)
+		intel_uncore_early_sanitize(dev, true);
+
 	return ret;
 }