[3.0-rc3] i915: Fix gen6 (SNB) GPU stalling

Message ID	1308070307-2630-1-git-send-email-daniel.blueman@gmail.com (mailing list archive)
State	New, archived
Headers	show Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by demeter2.kernel.org (8.14.4/8.14.4) with ESMTP id p5EGqY4b016864 for <patchwork-intel-gfx@patchwork.kernel.org>; Tue, 14 Jun 2011 16:52:55 GMT DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:date:message-id:x-mailer; b=Kds+YzBpv0fWx1iENogoUFzrJOhLuC1oeqYtTmXGTCZ0k9Wzx1NECzWvmszRu6BatD KuoMtuzfg9zQGUKQSOQSVJkZbIpotyGiQk9TMQR2SF/lfW+CkmD7E6kta4Re5+mgB6Id 2OeF9hmY6SFZuCmsbubuT73ZoiVNvE5EeYecQ= From: Daniel J Blueman <daniel.blueman@gmail.com> To: Eric Anholt <eric@anholt.net> Date: Wed, 15 Jun 2011 00:51:47 +0800 Message-Id: <1308070307-2630-1-git-send-email-daniel.blueman@gmail.com> Cc: Daniel J Blueman <daniel.blueman@gmail.com>, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, Dave Airlie <airlied@redhat.com> Subject: [Intel-gfx] [PATCH 3.0-rc3] i915: Fix gen6 (SNB) GPU stalling Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org

Daniel J Blueman June 14, 2011, 4:51 p.m. UTC

On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
> On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
>> Hi Eric,
>>
>> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
>> whenever you hit the top-left of the screen to show all windows) are
>> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
>> I see no 'missed IRQ' kernel messages.
>>
>> As this addresses a significant usability regression, are you happy to
>> add it to the 3.0-rc queue? I think it has very good value in -stable
>> also (assuming correctness). What do you think?
>
> This one had significant performance impacts, and later hacks in this
> series worked around the problem to approximately the same level of
> success with less impact, and we don't actually have a justification of
> why any of them work.  We were still hoping to come up with some clue,
> and haven't yet.

True; that is quite heavy handed delay looping.

It's a pity the usual Intel font didn't make it to the programmer's
reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
so we don't need to read it here.

It would be good to get this into -rc4. -stable probably needs some additional
tweaks.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
---
 drivers/gpu/drm/i915/i915_irq.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

Chris Wilson June 14, 2011, 5:23 p.m. UTC | #1

On Wed, 15 Jun 2011 00:51:47 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> True; that is quite heavy handed delay looping.
> 
> It's a pity the usual Intel font didn't make it to the programmer's
> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
> so we don't need to read it here.
> 
> It would be good to get this into -rc4. -stable probably needs some additional
> tweaks.
> 
> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>

Tested-by: Chris Wilson <chris@chris-wilson.co.uk>

Should apply as is and be equally effective for stable.
-Chris

Eric Anholt June 15, 2011, 2:06 a.m. UTC | #2

On Wed, 15 Jun 2011 00:51:47 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
> > On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> >> Hi Eric,
> >>
> >> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
> >> whenever you hit the top-left of the screen to show all windows) are
> >> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
> >> I see no 'missed IRQ' kernel messages.
> >>
> >> As this addresses a significant usability regression, are you happy to
> >> add it to the 3.0-rc queue? I think it has very good value in -stable
> >> also (assuming correctness). What do you think?
> >
> > This one had significant performance impacts, and later hacks in this
> > series worked around the problem to approximately the same level of
> > success with less impact, and we don't actually have a justification of
> > why any of them work.  We were still hoping to come up with some clue,
> > and haven't yet.
> 
> True; that is quite heavy handed delay looping.
> 
> It's a pity the usual Intel font didn't make it to the programmer's
> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
> so we don't need to read it here.
> 
> It would be good to get this into -rc4. -stable probably needs some additional
> tweaks.
> 
> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>

So you're saying that our interrupts at the top-level IMR are triggered
by the write to the status page of the lower-level ring?  That's
surprising to me.  Or do you think that this write is just happening to
trigger serialization so the interrupt comes after the DWORD write of
the seqno by the ring?  (hw folks just recently indicated that our
particular code is not expected to serialize the interrupt after the
seqno store, unless we had an MI_FLUSH_DWORD in between)

This patch has now passed 7000 iterations of the testcase that had a
~.5% failure rate before.

Tested-by: Eric Anholt <eric@anholt.net>

Daniel J Blueman June 15, 2011, 3:24 a.m. UTC | #3

On 15 June 2011 10:06, Eric Anholt <eric@anholt.net> wrote:
> On Wed, 15 Jun 2011 00:51:47 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
>> On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
>> > On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
>> >> Hi Eric,
>> >>
>> >> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
>> >> whenever you hit the top-left of the screen to show all windows) are
>> >> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
>> >> I see no 'missed IRQ' kernel messages.
>> >>
>> >> As this addresses a significant usability regression, are you happy to
>> >> add it to the 3.0-rc queue? I think it has very good value in -stable
>> >> also (assuming correctness). What do you think?
>> >
>> > This one had significant performance impacts, and later hacks in this
>> > series worked around the problem to approximately the same level of
>> > success with less impact, and we don't actually have a justification of
>> > why any of them work.  We were still hoping to come up with some clue,
>> > and haven't yet.
>>
>> True; that is quite heavy handed delay looping.
>>
>> It's a pity the usual Intel font didn't make it to the programmer's
>> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
>> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
>> so we don't need to read it here.
>>
>> It would be good to get this into -rc4. -stable probably needs some additional
>> tweaks.
>>
>> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>

> So you're saying that our interrupts at the top-level IMR are triggered
> by the write to the status page of the lower-level ring?  That's
> surprising to me.  Or do you think that this write is just happening to
> trigger serialization so the interrupt comes after the DWORD write of
> the seqno by the ring?  (hw folks just recently indicated that our
> particular code is not expected to serialize the interrupt after the
> seqno store, unless we had an MI_FLUSH_DWORD in between)

When ISR bits not masked by the hardware status mask register change,
a write is generated with the ISR contents to the status page, so I
believe that the blitter command streamer wasn't generating an
interrupt when an MI_USER_INTERRUPT command was issued. The interrupt
handler would kick in for other interrupts, read all the IIRs and
notice the bit set and wake the ring interrupt queue anyway.

I guess we could test this by observing that the BLT user interrupt
IIR bit is always not set on it's own (ie another interrupt woke us)
in the interrupt handler.

Thanks,
  Daniel

> This patch has now passed 7000 iterations of the testcase that had a
> ~.5% failure rate before.
>
> Tested-by: Eric Anholt <eric@anholt.net>

Ben Widawsky June 15, 2011, 4:43 a.m. UTC | #4

On Wed, Jun 15, 2011 at 12:51:47AM +0800, Daniel J Blueman wrote:
> On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
> > On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> >> Hi Eric,
> >>
> >> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
> >> whenever you hit the top-left of the screen to show all windows) are
> >> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
> >> I see no 'missed IRQ' kernel messages.
> >>
> >> As this addresses a significant usability regression, are you happy to
> >> add it to the 3.0-rc queue? I think it has very good value in -stable
> >> also (assuming correctness). What do you think?
> >
> > This one had significant performance impacts, and later hacks in this
> > series worked around the problem to approximately the same level of
> > success with less impact, and we don't actually have a justification of
> > why any of them work.  We were still hoping to come up with some clue,
> > and haven't yet.
> 
> True; that is quite heavy handed delay looping.
> 
> It's a pity the usual Intel font didn't make it to the programmer's
> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
> so we don't need to read it here.
> 
> It would be good to get this into -rc4. -stable probably needs some additional
> tweaks.
> 
> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b9fafe3..9a98c1b 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1827,6 +1827,12 @@ int ironlake_irq_postinstall(struct drm_device *dev)
>  		ironlake_enable_display_irq(dev_priv, DE_PCU_EVENT);
>  	}
>  
> +	if (IS_GEN6(dev))
> +		/* allow blitter user interrupt to generate a MSI write from
> +		   the ISR */
> +		I915_WRITE(GEN6_BLITTER_HWSTAM,
> +			0xffffffff & ~GEN6_BLITTER_USER_INTERRUPT);
> +
>  	return 0;
>  }
>  

I wish the docs said that that this hwstam unmasked MI_USER_INTERRUPTS
parsed by the Blitter Command Parser, instead of the Render Command
parser.

I was just about to write an email about how this is just making the
same interrupt happen twice, when I realized that the docs make no
sense, and this must be a copy-paste doc bug.

This patch sounds good to me. Two small comments:
1. The HWSTAM is touched in preinstall already, why not move this there?
2. I'd prefer you read the register even though as you say it isn't
necessary. It just makes the code self-documented by doing it that way.

Ben

Daniel J Blueman June 15, 2011, 5:04 a.m. UTC | #5

On 15 June 2011 12:43, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Wed, Jun 15, 2011 at 12:51:47AM +0800, Daniel J Blueman wrote:
>> On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
>> > On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
>> >> Hi Eric,
>> >>
>> >> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
>> >> whenever you hit the top-left of the screen to show all windows) are
>> >> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
>> >> I see no 'missed IRQ' kernel messages.
>> >>
>> >> As this addresses a significant usability regression, are you happy to
>> >> add it to the 3.0-rc queue? I think it has very good value in -stable
>> >> also (assuming correctness). What do you think?
>> >
>> > This one had significant performance impacts, and later hacks in this
>> > series worked around the problem to approximately the same level of
>> > success with less impact, and we don't actually have a justification of
>> > why any of them work.  We were still hoping to come up with some clue,
>> > and haven't yet.
>>
>> True; that is quite heavy handed delay looping.
>>
>> It's a pity the usual Intel font didn't make it to the programmer's
>> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
>> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
>> so we don't need to read it here.
>>
>> It would be good to get this into -rc4. -stable probably needs some additional
>> tweaks.
>>
>> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
>> ---
>>  drivers/gpu/drm/i915/i915_irq.c |    6 ++++++
>>  1 files changed, 6 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> index b9fafe3..9a98c1b 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -1827,6 +1827,12 @@ int ironlake_irq_postinstall(struct drm_device *dev)
>>               ironlake_enable_display_irq(dev_priv, DE_PCU_EVENT);
>>       }
>>
>> +     if (IS_GEN6(dev))
>> +             /* allow blitter user interrupt to generate a MSI write from
>> +                the ISR */
>> +             I915_WRITE(GEN6_BLITTER_HWSTAM,
>> +                     0xffffffff & ~GEN6_BLITTER_USER_INTERRUPT);
>> +
>>       return 0;
>>  }
>>
>
> I wish the docs said that that this hwstam unmasked MI_USER_INTERRUPTS
> parsed by the Blitter Command Parser, instead of the Render Command
> parser.
>
> I was just about to write an email about how this is just making the
> same interrupt happen twice, when I realized that the docs make no
> sense, and this must be a copy-paste doc bug.
>
> This patch sounds good to me. Two small comments:
> 1. The HWSTAM is touched in preinstall already, why not move this there?
> 2. I'd prefer you read the register even though as you say it isn't
> necessary. It just makes the code self-documented by doing it that way.

The render HWSTAM is tweaked in preinstall, but we need to tweak the
blitter HWSTAM (new to gen6).

To me, it makes sense to reset the blitter HWSTAM register to what the
driver expects, in case anything before the i915 module loads and
adjusts it for a particular purpose (including debug); the render
HWSTAM is set this way too. I could add a comment to both perhaps?

Updating the blitter HWSTAM in the postinstall was a marginally safer
choice, as there'll not be any potential race with a blitter user
interrupt getting emitted before we're ready (which wouldn't have been
tested), but the risk is probably so low that it could just go into
the preinstall.

What do you think?

Daniel

Ben Widawsky June 15, 2011, 3:16 p.m. UTC | #6

On Wed, Jun 15, 2011 at 01:04:51PM +0800, Daniel J Blueman wrote:
> On 15 June 2011 12:43, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Wed, Jun 15, 2011 at 12:51:47AM +0800, Daniel J Blueman wrote:
> >> On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
> >> > On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> >> >> Hi Eric,
> >> >>
> >> >> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
> >> >> whenever you hit the top-left of the screen to show all windows) are
> >> >> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
> >> >> I see no 'missed IRQ' kernel messages.
> >> >>
> >> >> As this addresses a significant usability regression, are you happy to
> >> >> add it to the 3.0-rc queue? I think it has very good value in -stable
> >> >> also (assuming correctness). What do you think?
> >> >
> >> > This one had significant performance impacts, and later hacks in this
> >> > series worked around the problem to approximately the same level of
> >> > success with less impact, and we don't actually have a justification of
> >> > why any of them work. ?We were still hoping to come up with some clue,
> >> > and haven't yet.
> >>
> >> True; that is quite heavy handed delay looping.
> >>
> >> It's a pity the usual Intel font didn't make it to the programmer's
> >> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
> >> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
> >> so we don't need to read it here.
> >>
> >> It would be good to get this into -rc4. -stable probably needs some additional
> >> tweaks.
> >>
> >> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
> >> ---
> >> ?drivers/gpu/drm/i915/i915_irq.c | ? ?6 ++++++
> >> ?1 files changed, 6 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >> index b9fafe3..9a98c1b 100644
> >> --- a/drivers/gpu/drm/i915/i915_irq.c
> >> +++ b/drivers/gpu/drm/i915/i915_irq.c
> >> @@ -1827,6 +1827,12 @@ int ironlake_irq_postinstall(struct drm_device *dev)
> >> ? ? ? ? ? ? ? ironlake_enable_display_irq(dev_priv, DE_PCU_EVENT);
> >> ? ? ? }
> >>
> >> + ? ? if (IS_GEN6(dev))
> >> + ? ? ? ? ? ? /* allow blitter user interrupt to generate a MSI write from
> >> + ? ? ? ? ? ? ? ?the ISR */
> >> + ? ? ? ? ? ? I915_WRITE(GEN6_BLITTER_HWSTAM,
> >> + ? ? ? ? ? ? ? ? ? ? 0xffffffff & ~GEN6_BLITTER_USER_INTERRUPT);
> >> +
> >> ? ? ? return 0;
> >> ?}
> >>
> >
> > I wish the docs said that that this hwstam unmasked MI_USER_INTERRUPTS
> > parsed by the Blitter Command Parser, instead of the Render Command
> > parser.
> >
> > I was just about to write an email about how this is just making the
> > same interrupt happen twice, when I realized that the docs make no
> > sense, and this must be a copy-paste doc bug.
> >
> > This patch sounds good to me. Two small comments:
> > 1. The HWSTAM is touched in preinstall already, why not move this there?
> > 2. I'd prefer you read the register even though as you say it isn't
> > necessary. It just makes the code self-documented by doing it that way.
> 
> The render HWSTAM is tweaked in preinstall, but we need to tweak the
> blitter HWSTAM (new to gen6).
> 
> To me, it makes sense to reset the blitter HWSTAM register to what the
> driver expects, in case anything before the i915 module loads and
> adjusts it for a particular purpose (including debug); the render
> HWSTAM is set this way too. I could add a comment to both perhaps?

Well on that note, the docs clearly state only 1 bit can be unmasked at
a time. Not sure if that applies to masking as well, but if it does,
that would be not good.

I'm fine with a comment. Seeing the current masking doesn't make it
clear to me what is happening/why. Not sure I understand the reason for
saving the read is in this case.

> 
> Updating the blitter HWSTAM in the postinstall was a marginally safer
> choice, as there'll not be any potential race with a blitter user
> interrupt getting emitted before we're ready (which wouldn't have been
> tested), but the risk is probably so low that it could just go into
> the preinstall.
> 
> What do you think?

I think there is no risk since this command could only be executed if
the driver was up. I'd just like all HWSTAM writes in a single place. I
don't have any preference which place.

> 
> Daniel
Ben

Eric Anholt June 15, 2011, 4:38 p.m. UTC | #7

On Wed, 15 Jun 2011 13:04:51 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> The render HWSTAM is tweaked in preinstall, but we need to tweak the
> blitter HWSTAM (new to gen6).

This still doesn't *really* make sense -- HWSTAM is supposed to be
masking updates to the status page's copy of the IIR, which we never
read, and not be involved in masking updates to the MMIO I[IS]R.  So it
seems to me that this is just happening to get lucky and serialize in
the hardware for the way that we do actually read IIR (through MMIO).
And hey, we should be using the status page copy instead of MMIO some
day anyway, so that's more reason to do this patch even if we don't like
workarounds.

> To me, it makes sense to reset the blitter HWSTAM register to what the
> driver expects, in case anything before the i915 module loads and
> adjusts it for a particular purpose (including debug); the render
> HWSTAM is set this way too. I could add a comment to both perhaps?
> 
> Updating the blitter HWSTAM in the postinstall was a marginally safer
> choice, as there'll not be any potential race with a blitter user
> interrupt getting emitted before we're ready (which wouldn't have been
> tested), but the risk is probably so low that it could just go into
> the preinstall.

The GPU is idle before our driver shows up, so there's no risk (there's
a bunch of leftover paranoia in the driver from the DRI1 days, none of
which ever made much sense).

Kenneth Graunke June 15, 2011, 5:11 p.m. UTC | #8

On 06/14/2011 09:51 AM, Daniel J Blueman wrote:
> On 14 June 2011 13:23, Eric Anholt<eric@anholt.net>  wrote:
>> On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman<daniel.blueman@gmail.com>  wrote:
>>> Hi Eric,
>>>
>>> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
>>> whenever you hit the top-left of the screen to show all windows) are
>>> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
>>> I see no 'missed IRQ' kernel messages.
>>>
>>> As this addresses a significant usability regression, are you happy to
>>> add it to the 3.0-rc queue? I think it has very good value in -stable
>>> also (assuming correctness). What do you think?
>>
>> This one had significant performance impacts, and later hacks in this
>> series worked around the problem to approximately the same level of
>> success with less impact, and we don't actually have a justification of
>> why any of them work.  We were still hoping to come up with some clue,
>> and haven't yet.
>
> True; that is quite heavy handed delay looping.
>
> It's a pity the usual Intel font didn't make it to the programmer's
> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
> so we don't need to read it here.
>
> It would be good to get this into -rc4. -stable probably needs some additional
> tweaks.
>
> Signed-off-by: Daniel J Blueman<daniel.blueman@gmail.com>
> ---
>   drivers/gpu/drm/i915/i915_irq.c |    6 ++++++
>   1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b9fafe3..9a98c1b 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1827,6 +1827,12 @@ int ironlake_irq_postinstall(struct drm_device *dev)
>   		ironlake_enable_display_irq(dev_priv, DE_PCU_EVENT);
>   	}
>
> +	if (IS_GEN6(dev))
> +		/* allow blitter user interrupt to generate a MSI write from
> +		   the ISR */
> +		I915_WRITE(GEN6_BLITTER_HWSTAM,
> +			0xffffffff&  ~GEN6_BLITTER_USER_INTERRUPT);
> +
>   	return 0;
>   }
>

Tested-by: Kenneth Graunke <kenneth@whitecape.org>

With i915.semaphores=0 on my Lenovo T420s, I reliably saw missed IRQs 
from the blitter when using GNOME Shell or running GLBenchmark 
2.0/Egypt.  Applying this patch fixes the issue, making my system much 
more responsive.

Thanks, Daniel!

Daniel J Blueman June 16, 2011, 2:45 a.m. UTC | #9

On 15 June 2011 23:16, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Wed, Jun 15, 2011 at 01:04:51PM +0800, Daniel J Blueman wrote:
>> On 15 June 2011 12:43, Ben Widawsky <ben@bwidawsk.net> wrote:
>> > On Wed, Jun 15, 2011 at 12:51:47AM +0800, Daniel J Blueman wrote:
>> >> On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:
>> >> > On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
>> >> >> Hi Eric,
>> >> >>
>> >> >> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg
>> >> >> whenever you hit the top-left of the screen to show all windows) are
>> >> >> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus
>> >> >> I see no 'missed IRQ' kernel messages.
>> >> >>
>> >> >> As this addresses a significant usability regression, are you happy to
>> >> >> add it to the 3.0-rc queue? I think it has very good value in -stable
>> >> >> also (assuming correctness). What do you think?
>> >> >
>> >> > This one had significant performance impacts, and later hacks in this
>> >> > series worked around the problem to approximately the same level of
>> >> > success with less impact, and we don't actually have a justification of
>> >> > why any of them work. ?We were still hoping to come up with some clue,
>> >> > and haven't yet.
>> >>
>> >> True; that is quite heavy handed delay looping.
>> >>
>> >> It's a pity the usual Intel font didn't make it to the programmer's
>> >> reference manuals. Anyway, unmasking the blitter user interrupt in the hardware
>> >> status mask register addresses the root cause. Out of reset it's FFFFFFFFh,
>> >> so we don't need to read it here.
>> >>
>> >> It would be good to get this into -rc4. -stable probably needs some additional
>> >> tweaks.
>> >>
>> >> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
>> >> ---
>> >> ?drivers/gpu/drm/i915/i915_irq.c | ? ?6 ++++++
>> >> ?1 files changed, 6 insertions(+), 0 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> >> index b9fafe3..9a98c1b 100644
>> >> --- a/drivers/gpu/drm/i915/i915_irq.c
>> >> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> >> @@ -1827,6 +1827,12 @@ int ironlake_irq_postinstall(struct drm_device *dev)
>> >> ? ? ? ? ? ? ? ironlake_enable_display_irq(dev_priv, DE_PCU_EVENT);
>> >> ? ? ? }
>> >>
>> >> + ? ? if (IS_GEN6(dev))
>> >> + ? ? ? ? ? ? /* allow blitter user interrupt to generate a MSI write from
>> >> + ? ? ? ? ? ? ? ?the ISR */
>> >> + ? ? ? ? ? ? I915_WRITE(GEN6_BLITTER_HWSTAM,
>> >> + ? ? ? ? ? ? ? ? ? ? 0xffffffff & ~GEN6_BLITTER_USER_INTERRUPT);
>> >> +
>> >> ? ? ? return 0;
>> >> ?}
>> >>
>> >
>> > I wish the docs said that that this hwstam unmasked MI_USER_INTERRUPTS
>> > parsed by the Blitter Command Parser, instead of the Render Command
>> > parser.
>> >
>> > I was just about to write an email about how this is just making the
>> > same interrupt happen twice, when I realized that the docs make no
>> > sense, and this must be a copy-paste doc bug.
>> >
>> > This patch sounds good to me. Two small comments:
>> > 1. The HWSTAM is touched in preinstall already, why not move this there?
>> > 2. I'd prefer you read the register even though as you say it isn't
>> > necessary. It just makes the code self-documented by doing it that way.
>>
>> The render HWSTAM is tweaked in preinstall, but we need to tweak the
>> blitter HWSTAM (new to gen6).
>>
>> To me, it makes sense to reset the blitter HWSTAM register to what the
>> driver expects, in case anything before the i915 module loads and
>> adjusts it for a particular purpose (including debug); the render
>> HWSTAM is set this way too. I could add a comment to both perhaps?
>
> Well on that note, the docs clearly state only 1 bit can be unmasked at
> a time. Not sure if that applies to masking as well, but if it does,
> that would be not good.

When a bit is unmasked, logic in the silicon will generate a write
from a particular functional block; multiple writes in the same cycles
may generate undefined behaviour (though would need multiple pending
interrupts). Masking would be safe, since there is no side-effect, and
this is typical.

> I'm fine with a comment. Seeing the current masking doesn't make it
> clear to me what is happening/why. Not sure I understand the reason for
> saving the read is in this case.
>
>>
>> Updating the blitter HWSTAM in the postinstall was a marginally safer
>> choice, as there'll not be any potential race with a blitter user
>> interrupt getting emitted before we're ready (which wouldn't have been
>> tested), but the risk is probably so low that it could just go into
>> the preinstall.
>>
>> What do you think?
>
> I think there is no risk since this command could only be executed if
> the driver was up. I'd just like all HWSTAM writes in a single place. I
> don't have any preference which place.

Ok, I'll prepare a patch that will read-modify-write the register and
place with the render HWSTAM write. Note that the render HWSTAM
unmasks multiple bits with impunity (probably fine if no pending
interrupts), but fixing that is outside the scope of what I want to
get into 3.0-rc4.

Thanks,
  Daniel

Daniel J Blueman June 16, 2011, 3:45 a.m. UTC | #10

On 16 June 2011 00:38, Eric Anholt <eric@anholt.net> wrote:
> On Wed, 15 Jun 2011 13:04:51 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
>> The render HWSTAM is tweaked in preinstall, but we need to tweak the
>> blitter HWSTAM (new to gen6).
>
> This still doesn't *really* make sense -- HWSTAM is supposed to be
> masking updates to the status page's copy of the IIR, which we never
> read, and not be involved in masking updates to the MMIO I[IS]R.  So it
> seems to me that this is just happening to get lucky and serialize in
> the hardware for the way that we do actually read IIR (through MMIO).
> And hey, we should be using the status page copy instead of MMIO some
> day anyway, so that's more reason to do this patch even if we don't like
> workarounds.

I see we're checking only the MMIO IIR in the interrupt handler.
Perhaps we should come up with a way to prove that we're not
immediately seeing the correct state in the MMIO IIR when the
interrupt handler is fired without the unmasking. We could also check
if we get only one interrupt bit set (ie the interrupt occurred for
what we wanted and not something else) when we issue a
MI_USER_INTERRUPT to the blitter ring, to see if there is some
unexpected behaviour on gen6.

What do you think?

Also, perhaps I add a comment into the patch to show it's a workaround, right?

>> To me, it makes sense to reset the blitter HWSTAM register to what the
>> driver expects, in case anything before the i915 module loads and
>> adjusts it for a particular purpose (including debug); the render
>> HWSTAM is set this way too. I could add a comment to both perhaps?
>>
>> Updating the blitter HWSTAM in the postinstall was a marginally safer
>> choice, as there'll not be any potential race with a blitter user
>> interrupt getting emitted before we're ready (which wouldn't have been
>> tested), but the risk is probably so low that it could just go into
>> the preinstall.
>
> The GPU is idle before our driver shows up, so there's no risk (there's
> a bunch of leftover paranoia in the driver from the DRI1 days, none of
> which ever made much sense).

Eric Anholt June 16, 2011, 6:36 p.m. UTC | #11

On Wed, 15 Jun 2011 08:16:54 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Wed, Jun 15, 2011 at 01:04:51PM +0800, Daniel J Blueman wrote:
> > The render HWSTAM is tweaked in preinstall, but we need to tweak the
> > blitter HWSTAM (new to gen6).
> > 
> > To me, it makes sense to reset the blitter HWSTAM register to what the
> > driver expects, in case anything before the i915 module loads and
> > adjusts it for a particular purpose (including debug); the render
> > HWSTAM is set this way too. I could add a comment to both perhaps?
> 
> Well on that note, the docs clearly state only 1 bit can be unmasked at
> a time. Not sure if that applies to masking as well, but if it does,
> that would be not good.

This is because HWSTAM controls writes of the current ISR to the status
page, not IIR.  If you wanted to hear about more than one bit of
interrupts in that field, you'd potentially lose one of them because ISR
is "things interrupting at this very moment" not "things that have
interrupted since you last checked".  This is one of those few cases
where the hardware docs are telling you how to build software in order
to not fail, rather than telling you information about the hardware.

Given that we never look at that ISR field, then, it shouldn't matter
that we have more than one set for the render engine.

Yi Sun June 17, 2011, 7:52 a.m. UTC | #12

All,
Thank for Mengmeng’s testing work, now the status is as following:

The bug33394(performance regression: screen stuttered when running the demo of 3D games with compiz enabled without GPU semaphores) is fixed. The two issue(stutter and hangcheck) is gone, now.
The issue described as bug 36407 isn’t able to be reproduced.
The bug 36653 is still there.

As to the performance, the detail is listed as the table.
From the table, we can get the information that the patch make little effect to the 2D performance, but it improve 3D performance much.

?

without patch

with patch

?

dis-semaphores

en-semaphores

dis-semaphores

en-semaphores

2D-aa10text

1790k

2650k

1640k

2550k

2D-rgb10text

1380k

2380k

1100k

2320k

openarena

11

86.2

98.9

103.9 fps

urbanterror

10.5

71.4

68.5

70.9 fps

padman

12.1

100.7

92

100.3 fps

nexuiz

6

20

19.5

20 fps



Thanks
   --Yi,Sun




-----Original Message-----
From: intel-gfx-bounces+yi.sun=intel.com@lists.freedesktop.org [mailto:intel-gfx-bounces+yi.sun=intel.com@lists.freedesktop.org] On Behalf Of Daniel J Blueman

Sent: Wednesday, June 15, 2011 12:52 AM
To: Eric Anholt
Cc: Daniel J Blueman; intel-gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org; Dave Airlie
Subject: [Intel-gfx] [PATCH 3.0-rc3] i915: Fix gen6 (SNB) GPU stalling



On 14 June 2011 13:23, Eric Anholt <eric@anholt.net> wrote:

> On Tue, 14 Jun 2011 12:18:36 +0800, Daniel J Blueman <daniel.blueman@gmail.com> wrote:


>> Hi Eric,


>>


>> The frequent ~1.5s pauses I hit with SNB hardware in the gnome3 UI (eg


>> whenever you hit the top-left of the screen to show all windows) are


>> nicely addressed by your recent wake patch [1] (ported to -rc3). Thus


>> I see no 'missed IRQ' kernel messages.


>>


>> As this addresses a significant usability regression, are you happy to


>> add it to the 3.0-rc queue? I think it has very good value in -stable


>> also (assuming correctness). What do you think?


>


> This one had significant performance impacts, and later hacks in this


> series worked around the problem to approximately the same level of


> success with less impact, and we don't actually have a justification of


> why any of them work.  We were still hoping to come up with some clue,


> and haven't yet.




True; that is quite heavy handed delay looping.



It's a pity the usual Intel font didn't make it to the programmer's

reference manuals. Anyway, unmasking the blitter user interrupt in the hardware

status mask register addresses the root cause. Out of reset it's FFFFFFFFh,

so we don't need to read it here.



It would be good to get this into -rc4. -stable probably needs some additional

tweaks.



Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>


---

drivers/gpu/drm/i915/i915_irq.c |    6 ++++++

1 files changed, 6 insertions(+), 0 deletions(-)



diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c

index b9fafe3..9a98c1b 100644

--- a/drivers/gpu/drm/i915/i915_irq.c

+++ b/drivers/gpu/drm/i915/i915_irq.c

@@ -1827,6 +1827,12 @@ int ironlake_irq_postinstall(struct drm_device *dev)

                ironlake_enable_display_irq(dev_priv, DE_PCU_EVENT);

       }

+       if (IS_GEN6(dev))

+                /* allow blitter user interrupt to generate a MSI write from

+                   the ISR */

+                I915_WRITE(GEN6_BLITTER_HWSTAM,

+                          0xffffffff & ~GEN6_BLITTER_USER_INTERRUPT);

+

       return 0;

}

--

1.7.4.1



_______________________________________________

Intel-gfx mailing list

Intel-gfx@lists.freedesktop.org

http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Jesse Barnes June 17, 2011, 2:12 p.m. UTC | #13

On Fri, 17 Jun 2011 15:52:58 +0800
"Sun, Yi" <yi.sun@intel.com> wrote:

> All,
> Thank for Mengmeng’s testing work, now the status is as following:
> 
> The bug33394(performance regression: screen stuttered when running the demo of 3D games with compiz enabled without GPU semaphores) is fixed. The two issue(stutter and hangcheck) is gone, now.
> The issue described as bug 36407 isn’t able to be reproduced.
> The bug 36653 is still there.

Thanks for testing, looks like this is an important fix (36652 is what
I think you mean on the last one?).

Robert Hooker June 17, 2011, 3:29 p.m. UTC | #14

2011/6/17 Jesse Barnes <jbarnes@virtuousgeek.org>:
> On Fri, 17 Jun 2011 15:52:58 +0800
> "Sun, Yi" <yi.sun@intel.com> wrote:
>
>> All,
>> Thank for Mengmeng’s testing work, now the status is as following:
>>
>> The bug33394(performance regression: screen stuttered when running the demo of 3D games with compiz enabled without GPU semaphores) is fixed. The two issue(stutter and hangcheck) is gone, now.
>> The issue described as bug 36407 isn’t able to be reproduced.
>> The bug 36653 is still there.
>
> Thanks for testing, looks like this is an important fix (36652 is what
> I think you mean on the last one?).
>
> --
> Jesse Barnes, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>


This also fixes https://bugs.freedesktop.org/show_bug.cgi?id=36241
when applied to 2.6.38 and 2.6.39, it would be nice to see it cc:
stable.

Tested-by: Robert Hooker <robert.hooker@canonical.com>

Thanks!

[3.0-rc3] i915: Fix gen6 (SNB) GPU stalling

Commit Message

Comments

Patch