drm/i915: Avoid accessing the stolen address when it is unavailable

Message ID 1382632427.26153.94.camel@cliu38-desktop-build (mailing list archive)
State New, archived

Commit Message

Chuansheng Liu Oct. 24, 2013, 4:33 p.m. UTC
On our platform, we hit the stolen region initialization failure case,
as in the log below:
[drm:i915_stolen_to_physical] *ERROR* conflict detected with stolen region: [0x7b000000]

This leaves dev_priv->mm.stolen_base as NULL; in that case we should
avoid accessing it any further.

Here is a possible call trace:
intel_enable_gt_powersave -->
valleyview_enable_rps -->
valleyview_setup_pctx

Cc: Li Fei <fei.li@intel.com>
Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c |    3 +++
 1 file changed, 3 insertions(+)

Comments

Chris Wilson Oct. 24, 2013, 12:17 p.m. UTC | #1
On Fri, Oct 25, 2013 at 12:33:47AM +0800, Chuansheng Liu wrote:
> 
> In our platform, we hit the stolen region initialization failure case,
> such as below log:
> [drm:i915_stolen_to_physical] *ERROR* conflict detected with stolen region: [0x7b000000]
> 
> And it causes the dev_priv->mm.stolen_base is NULL, in this case, we should
> avoid accessing it any more.
> 
> Here is possible call trace:
> intel_enable_gt_powersave -- >
> valleyview_enable_rps -- >
> valleyview_setup_pctx

The two create_stolen routines are no-ops in that case so all that
happens instead is that we read VLV_PCBR. However, really if
i915_gem_object_create_stolen_for_preallocated() fails we should abort
loading the driver as it means we have a hardware conflict and undefined
behaviour.
-Chris
Ben Widawsky Oct. 24, 2013, 8:56 p.m. UTC | #2
On Thu, Oct 24, 2013 at 01:17:06PM +0100, Chris Wilson wrote:
> On Fri, Oct 25, 2013 at 12:33:47AM +0800, Chuansheng Liu wrote:
> > 
> > In our platform, we hit the stolen region initialization failure case,
> > such as below log:
> > [drm:i915_stolen_to_physical] *ERROR* conflict detected with stolen region: [0x7b000000]
> > 
> > And it causes the dev_priv->mm.stolen_base is NULL, in this case, we should
> > avoid accessing it any more.
> > 
> > Here is possible call trace:
> > intel_enable_gt_powersave -- >
> > valleyview_enable_rps -- >
> > valleyview_setup_pctx
> 
> The two create_stolen routines are no-ops in that case so all that
> happens instead is that we read VLV_PCBR. However, really if
> i915_gem_object_create_stolen_for_preallocated() fails we should abort
> loading the driver as it means we have a hardware conflict and undefined
> behaviour.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

I agree. We should start treating these things as errors since no
RPS/RC6 is essentially not what anyone wants.

For another immediate solution it seems you can demote the DRM_ERROR to
DRM_DEBUG_DRIVER, and add a check in valleyview_enable_rps for the pctx
value.
Chuansheng Liu Oct. 25, 2013, 12:27 a.m. UTC | #3
Hello Chris and Ben,

> -----Original Message-----
> From: Ben Widawsky [mailto:ben@bwidawsk.net]
> Sent: Friday, October 25, 2013 4:57 AM
> To: Chris Wilson; Liu, Chuansheng; daniel.vetter@ffwll.ch; airlied@linux.ie;
> intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> linux-kernel@vger.kernel.org; Li, Fei
> Subject: Re: [Intel-gfx] drm/i915: Avoid accessing the stolen address when it is
> unavailable
> 
> On Thu, Oct 24, 2013 at 01:17:06PM +0100, Chris Wilson wrote:
> > On Fri, Oct 25, 2013 at 12:33:47AM +0800, Chuansheng Liu wrote:
> > >
> > > In our platform, we hit the stolen region initialization failure case,
> > > such as below log:
> > > [drm:i915_stolen_to_physical] *ERROR* conflict detected with stolen region: [0x7b000000]
> > >
> > > And it causes the dev_priv->mm.stolen_base is NULL, in this case, we should
> > > avoid accessing it any more.
> > >
> > > Here is possible call trace:
> > > intel_enable_gt_powersave -- >
> > > valleyview_enable_rps -- >
> > > valleyview_setup_pctx
> >
> > The two create_stolen routines are no-ops in that case so all that
> > happens instead is that we read VLV_PCBR. However, really if
> > i915_gem_object_create_stolen_for_preallocated() fails we should abort
> > loading the driver as it means we have a hardware conflict and undefined
> > behaviour.
When dev_priv->mm.stolen_base == NULL and valleyview_setup_pctx() is called
for the first time, it calls i915_gem_object_create_stolen_for_preallocated(),
which should always return NULL because of the
(!drm_mm_initialized(&dev_priv->mm.stolen)) check.

After that, the same path is taken again and again, especially on every PM
operation.

This patch is meant to save some time in PM operations: when
stolen_base == NULL, we do not need to read VLV_PCBR or calculate pcbr_offset.

Does that make sense? Thanks.

> 
> I agree. We should start treating these things as errors since no
> RPS/RC6 is essentially not what anyone wants.
> 
> For another immediate solution it seems you can demote the DRM_ERROR to
> DRM_DEBUG_DRIVER, and add a check in valleyview_enable_rps for the pctx
> value.
The pctx is already checked in valleyview_disable_rps().
Do we need more checking in case of pctx == NULL?
Chris Wilson Oct. 25, 2013, 8:07 a.m. UTC | #4
On Fri, Oct 25, 2013 at 12:27:42AM +0000, Liu, Chuansheng wrote:
> Hello Chris and Ben,
> 
> > -----Original Message-----
> > From: Ben Widawsky [mailto:ben@bwidawsk.net]
> > Sent: Friday, October 25, 2013 4:57 AM
> > To: Chris Wilson; Liu, Chuansheng; daniel.vetter@ffwll.ch; airlied@linux.ie;
> > intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> > linux-kernel@vger.kernel.org; Li, Fei
> > Subject: Re: [Intel-gfx] drm/i915: Avoid accessing the stolen address when it is
> > unavailable
> > 
> > On Thu, Oct 24, 2013 at 01:17:06PM +0100, Chris Wilson wrote:
> > > On Fri, Oct 25, 2013 at 12:33:47AM +0800, Chuansheng Liu wrote:
> > > >
> > > > In our platform, we hit the stolen region initialization failure case,
> > > > such as below log:
> > > > [drm:i915_stolen_to_physical] *ERROR* conflict detected with stolen region: [0x7b000000]
> > > >
> > > > And it causes the dev_priv->mm.stolen_base is NULL, in this case, we should
> > > > avoid accessing it any more.
> > > >
> > > > Here is possible call trace:
> > > > intel_enable_gt_powersave -- >
> > > > valleyview_enable_rps -- >
> > > > valleyview_setup_pctx
> > >
> > > The two create_stolen routines are no-ops in that case so all that
> > > happens instead is that we read VLV_PCBR. However, really if
> > > i915_gem_object_create_stolen_for_preallocated() fails we should abort
> > > loading the driver as it means we have a hardware conflict and undefined
> > > behaviour.
> In case of dev_priv->mm.stolen_base == NULL, and the valleyview_setup_pctx() is called
> at the first time, it will call i915_gem_object_create_stolen_for_preallocated(), which should
> return NULL always due to (!drm_mm_initialized(&dev_priv->mm.stolen)).
> 
> After that, every time specially when doing pm operation, the above scenario will
> be called again and again.
> 
> Here this patch is to save some time for PM operation, we do not need to read
> VLV_PCBR and pcbr_offset calculation in case of stolen_base == NULL.
> 
> Is it making sense? Thanks.

I see. No, it is a pointless optimisation that leaks knowledge about
internals of another subsystem to paper over a kernel bug.
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 26c2ea3..1069b24 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3735,6 +3735,9 @@  static void valleyview_setup_pctx(struct drm_device *dev)
 	u32 pcbr;
 	int pctx_size = 24*1024;
 
+	if (!dev_priv->mm.stolen_base)
+		return;
+
 	pcbr = I915_READ(VLV_PCBR);
 	if (pcbr) {
 		/* BIOS set it up already, grab the pre-alloc'd space */
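
For readers following the thread, the behaviour under discussion can be sketched as a small userspace model. This is not kernel code and not the i915 implementation: struct model_dev and every model_* function are hypothetical stand-ins, and the real valleyview_setup_pctx() does more than is shown here. The sketch only assumes what the thread states, namely that the create_stolen routines do nothing when stolen memory is unavailable, so the unpatched path costs one VLV_PCBR read per call and the proposed guard removes that read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: none of these names exist in the i915 driver. */
struct model_dev {
	uintptr_t stolen_base;  /* 0 models the failed stolen-region init */
	uint32_t pcbr;          /* value held by the modelled VLV_PCBR */
	unsigned int reg_reads; /* number of modelled register reads so far */
	void *pctx;             /* stays NULL when no stolen object exists */
};

/* Stand-in for I915_READ(VLV_PCBR); counts every access. */
static uint32_t model_read_pcbr(struct model_dev *dev)
{
	dev->reg_reads++;
	return dev->pcbr;
}

/* Stand-in for the create_stolen routines: a no-op without stolen memory. */
static void *model_create_stolen(struct model_dev *dev)
{
	return dev->stolen_base ? (void *)dev->stolen_base : NULL;
}

/* guard == true models the function with the proposed early return. */
static void model_setup_pctx(struct model_dev *dev, bool guard)
{
	assert(dev != NULL);

	if (guard && !dev->stolen_base)
		return;

	(void)model_read_pcbr(dev);
	dev->pctx = model_create_stolen(dev);
}

/* Repeats the setup call, as repeated PM operations would, and
 * returns the total number of modelled register reads. */
static unsigned int model_pm_cycles(struct model_dev *dev, bool guard,
				    unsigned int cycles)
{
	unsigned int i;

	for (i = 0; i < cycles; i++)
		model_setup_pctx(dev, guard);

	return dev->reg_reads;
}
```

In this model, calling model_pm_cycles() on a device with stolen_base == 0 performs one register read per cycle without the guard and none with it, while pctx stays NULL either way. That matches both positions in the thread: the guard saves only the read, and the underlying stolen-region conflict is left unaddressed.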