[02/12] drm/i915: BUG_ON bad PPGTT offset

Message ID 1366784140-2670-3-git-send-email-ben@bwidawsk.net (mailing list archive)
State New, archived

Commit Message

Ben Widawsky April 24, 2013, 6:15 a.m. UTC
Because PPGTT PDEs within the GTT are calculated in cachelines
(HW guys consistency ftw) we do a divide which will wreak havoc if this
is wrong (and I know that from experience).

If/when we move to multiple PPGTTs this will have to become a WARN, and
return an error. For now however it should always be considered fatal,
and only a developer could hit it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++
 1 file changed, 2 insertions(+)
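
For readers following along, the check in this patch can be sketched in plain userspace C. The mask 0x3f covers the low six bits, so the test fires whenever pd_offset is not a multiple of 64 bytes; the 64-byte cacheline size is inferred from that mask. The helper names below are hypothetical and are not part of the driver. The sketch also shows why a misaligned offset is dangerous for the divide the commit message mentions: integer division silently drops the remainder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, as implied by the 0x3f mask in the patch. */
#define CACHELINE_SIZE 64u
#define CACHELINE_MASK (CACHELINE_SIZE - 1u) /* 0x3f */

/* Hypothetical helper: true when the offset starts on a cacheline boundary. */
static bool pd_offset_is_cacheline_aligned(uint32_t pd_offset)
{
	return (pd_offset & CACHELINE_MASK) == 0;
}

/*
 * Hypothetical helper: express a byte offset in cachelines.
 * Integer division truncates, so the result is only exact for
 * aligned offsets; a misaligned one is rounded down without notice.
 */
static uint32_t pd_offset_in_cachelines(uint32_t pd_offset)
{
	return pd_offset / CACHELINE_SIZE;
}
```

With these helpers, 0x1020 fails the alignment test, yet dividing it by 64 gives the same cacheline index as 0x1000. That is how a bad offset could go unnoticed without the assertion.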

Comments

Jesse Barnes May 2, 2013, 8:28 p.m. UTC | #1
On Tue, 23 Apr 2013 23:15:30 -0700
Ben Widawsky <ben@bwidawsk.net> wrote:

> Because PPGTT PDEs within the GTT are calculated in cachelines
> (HW guys consistency ftw) we do a divide which will wreak havoc if this
> is wrong (and I know that from experience).
> 
> If/when we move to multiple PPGTTs this will have to become a WARN, and
> return an error. For now however it should always be considered fatal,
> and only a developer could hit it.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 50df194..0503f09 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -85,6 +85,8 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
>  	uint32_t pd_entry;
>  	int i;
>  
> +	BUG_ON(ppgtt->pd_offset & 0x3f);
> +
>  	pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
>  		ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
>  	for (i = 0; i < ppgtt->num_pd_entries; i++) {

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Daniel Vetter May 6, 2013, 9:48 a.m. UTC | #2
On Thu, May 02, 2013 at 01:28:23PM -0700, Jesse Barnes wrote:
> On Tue, 23 Apr 2013 23:15:30 -0700
> Ben Widawsky <ben@bwidawsk.net> wrote:
> 
> > Because PPGTT PDEs within the GTT are calculated in cachelines
> > (HW guys consistency ftw) we do a divide which will wreak havoc if this
> > is wrong (and I know that from experience).
> > 
> > If/when we move to multiple PPGTTs this will have to become a WARN, and
> > return an error. For now however it should always be considered fatal,
> > and only a developer could hit it.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 50df194..0503f09 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -85,6 +85,8 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> >  	uint32_t pd_entry;
> >  	int i;
> >  
> > +	BUG_ON(ppgtt->pd_offset & 0x3f);
> > +
> >  	pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
> >  		ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
> >  	for (i = 0; i < ppgtt->num_pd_entries; i++) {
> 
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Queued for -next with the same s/BUG/WARN bikeshed, thanks for the patch.
-Daniel

Ben Widawsky May 6, 2013, 6:03 p.m. UTC | #3
On Mon, May 06, 2013 at 11:48:05AM +0200, Daniel Vetter wrote:
> On Thu, May 02, 2013 at 01:28:23PM -0700, Jesse Barnes wrote:
> > On Tue, 23 Apr 2013 23:15:30 -0700
> > Ben Widawsky <ben@bwidawsk.net> wrote:
> > 
> > > Because PPGTT PDEs within the GTT are calculated in cachelines
> > > (HW guys consistency ftw) we do a divide which will wreak havoc if this
> > > is wrong (and I know that from experience).
> > > 
> > > If/when we move to multiple PPGTTs this will have to become a WARN, and
> > > return an error. For now however it should always be considered fatal,
> > > and only a developer could hit it.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 50df194..0503f09 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -85,6 +85,8 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> > >  	uint32_t pd_entry;
> > >  	int i;
> > >  
> > > +	BUG_ON(ppgtt->pd_offset & 0x3f);
> > > +
> > >  	pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
> > >  		ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
> > >  	for (i = 0; i < ppgtt->num_pd_entries; i++) {
> > 
> > Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> 
> Queued for -next with the same s/BUG/WARN bikeshed, thanks for the patch.
> -Daniel

If you're going to change to WARN, please fixup the patch to return
early... if (WARN_ON(...)) return

The GPU will end up hanging if we've messed this up...
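
The early-return form suggested above might look roughly like the following. This is a userspace sketch, not the queued patch: MOCK_WARN_ON is a stand-in for the kernel's WARN_ON(), which logs a warning and evaluates to the tested condition, and both ppgtt_enable_sketch() and the choice of -EINVAL are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for WARN_ON(): report the failure, return the condition. */
static int mock_warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}
#define MOCK_WARN_ON(cond) mock_warn_on(!!(cond), #cond)

/*
 * Hypothetical enable path: bail out before touching any page
 * directory entries when the offset is not 64-byte aligned.
 * The -EINVAL return value is an assumption, not from the thread.
 */
static int ppgtt_enable_sketch(uint32_t pd_offset)
{
	if (MOCK_WARN_ON(pd_offset & 0x3f))
		return -EINVAL;

	/* ... the PDE writes would go here ... */
	return 0;
}
```

Unlike a BUG_ON-style assertion, this form leaves the caller running and able to handle the error, provided the caller actually checks the return value.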

Daniel Vetter May 6, 2013, 6:37 p.m. UTC | #4
On Mon, May 6, 2013 at 8:03 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Mon, May 06, 2013 at 11:48:05AM +0200, Daniel Vetter wrote:
>> On Thu, May 02, 2013 at 01:28:23PM -0700, Jesse Barnes wrote:
>> > On Tue, 23 Apr 2013 23:15:30 -0700
>> > Ben Widawsky <ben@bwidawsk.net> wrote:
>> >
>> > > Because PPGTT PDEs within the GTT are calculated in cachelines
>> > > (HW guys consistency ftw) we do a divide which will wreak havoc if this
>> > > is wrong (and I know that from experience).
>> > >
>> > > If/when we move to multiple PPGTTs this will have to become a WARN, and
>> > > return an error. For now however it should always be considered fatal,
>> > > and only a developer could hit it.
>> > >
>> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>> > > ---
>> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++
>> > >  1 file changed, 2 insertions(+)
>> > >
>> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> > > index 50df194..0503f09 100644
>> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> > > @@ -85,6 +85,8 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
>> > >   uint32_t pd_entry;
>> > >   int i;
>> > >
>> > > + BUG_ON(ppgtt->pd_offset & 0x3f);
>> > > +
>> > >   pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
>> > >           ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
>> > >   for (i = 0; i < ppgtt->num_pd_entries; i++) {
>> >
>> > Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
>>
>> Queued for -next with the same s/BUG/WARN bikeshed, thanks for the patch.
>> -Daniel
>
> If you're going to change to WARN, please fixup the patch to return
> early... if (WARN_ON(...)) return
>
> The GPU will end up hanging if we've messed this up...

I don't care about the gpu, only that the logs hit the disk. And I
don't see how an early return can save anything in here, since
something went horribly wrong already anyway.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Ben Widawsky May 8, 2013, 4:48 p.m. UTC | #5
On Mon, May 06, 2013 at 08:37:37PM +0200, Daniel Vetter wrote:
> On Mon, May 6, 2013 at 8:03 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Mon, May 06, 2013 at 11:48:05AM +0200, Daniel Vetter wrote:
> >> On Thu, May 02, 2013 at 01:28:23PM -0700, Jesse Barnes wrote:
> >> > On Tue, 23 Apr 2013 23:15:30 -0700
> >> > Ben Widawsky <ben@bwidawsk.net> wrote:
> >> >
> >> > > Because PPGTT PDEs within the GTT are calculated in cachelines
> >> > > (HW guys consistency ftw) we do a divide which will wreak havoc if this
> >> > > is wrong (and I know that from experience).
> >> > >
> >> > > If/when we move to multiple PPGTTs this will have to become a WARN, and
> >> > > return an error. For now however it should always be considered fatal,
> >> > > and only a developer could hit it.
> >> > >
> >> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >> > > ---
> >> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 2 ++
> >> > >  1 file changed, 2 insertions(+)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> > > index 50df194..0503f09 100644
> >> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> > > @@ -85,6 +85,8 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> >> > >   uint32_t pd_entry;
> >> > >   int i;
> >> > >
> >> > > + BUG_ON(ppgtt->pd_offset & 0x3f);
> >> > > +
> >> > >   pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
> >> > >           ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
> >> > >   for (i = 0; i < ppgtt->num_pd_entries; i++) {
> >> >
> >> > Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> >>
> >> Queued for -next with the same s/BUG/WARN bikeshed, thanks for the patch.
> >> -Daniel
> >
> > If you're going to change to WARN, please fixup the patch to return
> > early... if (WARN_ON(...)) return
> >
> > The GPU will end up hanging if we've messed this up...
> 
> I don't care about the gpu, only that the logs hit the disk. And I
> don't see how an early return can save anything in here, since
> something went horribly wrong already anyway.
> -Daniel

Seriously? Feels a bit like this has become an argument for argument's
sake, but: I'd much rather have an assertion fail than a gpu error state
from some random point in time later.

> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Daniel Vetter May 8, 2013, 5:55 p.m. UTC | #6
On Wed, May 8, 2013 at 6:48 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> >> Queued for -next with the same s/BUG/WARN bikeshed, thanks for the patch.
>> >> -Daniel
>> >
>> > If you're going to change to WARN, please fixup the patch to return
>> > early... if (WARN_ON(...)) return
>> >
>> > The GPU will end up hanging if we've messed this up...
>>
>> I don't care about the gpu, only that the logs hit the disk. And I
>> don't see how an early return can save anything in here, since
>> something went horribly wrong already anyway.
>> -Daniel
>
> Seriously? Feels a bit like this has become an argument for argument's
> sake, but: I'd much rather have an assertion fail than a gpu error state
> from some random point in time later.

You still get the assertion fail in dmesg with the WARN. The
difference to a BUG is that we don't kill X, which means you can still
vt-switch and get out of the graphical console mode and restore fbcon.
For normal users that's imo much better (and the gpu crapping out
later on, well meh).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 50df194..0503f09 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -85,6 +85,8 @@  static int gen6_ppgtt_enable(struct drm_device *dev)
 	uint32_t pd_entry;
 	int i;
 
+	BUG_ON(ppgtt->pd_offset & 0x3f);
+
 	pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
 		ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
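
The context lines around the new assertion turn a byte offset into an entry index by dividing by the entry size and adding the result to a base pointer. A minimal sketch of that arithmetic, assuming 4-byte entries (the stand-in type pte_sketch_t and both helper names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each entry is 4 bytes wide; stand-in for the driver's PTE type. */
typedef uint32_t pte_sketch_t;

/* Hypothetical helper: index of the first entry for a given byte offset. */
static size_t pd_first_entry_index(uint32_t pd_offset)
{
	return pd_offset / sizeof(pte_sketch_t);
}

/* Hypothetical helper: same computation expressed as pointer arithmetic. */
static pte_sketch_t *pd_first_entry(pte_sketch_t *gsm_base, uint32_t pd_offset)
{
	return gsm_base + pd_first_entry_index(pd_offset);
}
```

Under that assumption, a 64-byte-aligned pd_offset always lands on an entry index that is a multiple of 16.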