
[v3] drm/i915/hsw: Change default LLC age to 3

Message ID 1375397767-5439-1-git-send-email-ben@bwidawsk.net (mailing list archive)
State New, archived

Commit Message

Ben Widawsky Aug. 1, 2013, 10:56 p.m. UTC
The default LLC age was changed:
commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Jul 4 11:02:03 2013 -0700

drm/i915/hsw: Set correct Haswell PTE encodings.

This caused a regression in performance on certain benchmarks. While I
think a discussion still needs to happen about how the kernel should
default for both eLLC, and LLC - just revert this behavior for now.

v2: Drop the extra #define (Chad)

v3: Actually git add

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Daniel Vetter Aug. 4, 2013, 7:10 p.m. UTC | #1
On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> The default LLC age was changed:
> commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
> Author: Ben Widawsky <benjamin.widawsky@intel.com>
> Date:   Thu Jul 4 11:02:03 2013 -0700
>
> drm/i915/hsw: Set correct Haswell PTE encodings.
>
> This caused a regression in performance on certain benchmarks. While I
> think a discussion still needs to happen about how the kernel should
> default for both eLLC, and LLC - just revert this behavior for now.
>
> v2: Drop the extra #define (Chad)
>
> v3: Actually git add
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>

iris_pte_encode was split up after the regressing commit in

commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jul 4 11:02:06 2013 -0700

    drm/i915: Use eLLC/LLC by default when available

Hence I think that one should be patched, too (QA doesn't have an iris
machine afaik, so no regression report expected). And since I have
some open questions about Chris WT-for-Iris patch too I think it's
best to spin a v4 of this one here.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index e7b4204..3e7f124 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -52,6 +52,7 @@
>   */
>  #define HSW_CACHEABILITY_CONTROL(bits) ((((bits) & 0x7) << 1) | \
>                                          (((bits) & 0x8) << (11 - 3)))
> +#define HSW_WB_LLC_AGE3                        HSW_CACHEABILITY_CONTROL(0x2)
>  #define HSW_WB_LLC_AGE0                        HSW_CACHEABILITY_CONTROL(0x3)
>  #define HSW_WB_ELLC_LLC_AGE0           HSW_CACHEABILITY_CONTROL(0xb)
>
> @@ -105,7 +106,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>         pte |= HSW_PTE_ADDR_ENCODE(addr);
>
>         if (level != I915_CACHE_NONE)
> -               pte |= HSW_WB_LLC_AGE0;
> +               pte |= HSW_WB_LLC_AGE3;
>
>         return pte;
>  }
> --
> 1.8.3.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Ben Widawsky Aug. 4, 2013, 8:55 p.m. UTC | #2
On Sun, Aug 04, 2013 at 09:10:25PM +0200, Daniel Vetter wrote:
> On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > The default LLC age was changed:
> > commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
> > Date:   Thu Jul 4 11:02:03 2013 -0700
> >
> > drm/i915/hsw: Set correct Haswell PTE encodings.
> >
> > This caused a regression in performance on certain benchmarks. While I
> > think a discussion still needs to happen about how the kernel should
> > default for both eLLC, and LLC - just revert this behavior for now.
> >
> > v2: Drop the extra #define (Chad)
> >
> > v3: Actually git add
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
> 
> iris_pte_encode was split up after the regressing commit in
> 
> commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
> Author: Ben Widawsky <ben@bwidawsk.net>
> Date:   Thu Jul 4 11:02:06 2013 -0700
> 
>     drm/i915: Use eLLC/LLC by default when available
> 
> Hence I think that one should be patched, too (QA doesn't have an iris
> machine afaik, so no regression report expected). And since I have
> some open questions about Chris WT-for-Iris patch too I think it's
> best to spin a v4 of this one here.
> -Daniel
> 

We've come up with a theory as to why this was a regression (which I
think nullifies your request for the eLLC version)

On the surface, one would think that aging all objects equally, whether
they be oldest, or youngest, makes no difference. The key with LLC is
that it is shared with the CPU, and presumably the CPU follows the
normal cache aging FIFO rules (ie. everything is aged 3). On this logic,
eLLC would not share the same flaw. I think it makes sense to have
someone prove we need it for eLLC, as this does allow mesa/DDX to get 2
differently aged eLLC objects, which they cannot do if we default to 3.
On the same note, even LLC age 0 is still ideal in my opinion, but I was
asked to not do this so that mesa 9.2 doesn't have performance
regressions.


[snip]
Daniel Vetter Aug. 4, 2013, 11:03 p.m. UTC | #3
On Sun, Aug 4, 2013 at 10:55 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sun, Aug 04, 2013 at 09:10:25PM +0200, Daniel Vetter wrote:
>> On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> > The default LLC age was changed:
>> > commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
>> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
>> > Date:   Thu Jul 4 11:02:03 2013 -0700
>> >
>> > drm/i915/hsw: Set correct Haswell PTE encodings.
>> >
>> > This caused a regression in performance on certain benchmarks. While I
>> > think a discussion still needs to happen about how the kernel should
>> > default for both eLLC, and LLC - just revert this behavior for now.
>> >
>> > v2: Drop the extra #define (Chad)
>> >
>> > v3: Actually git add
>> >
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>> > Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
>>
>> iris_pte_encode was split up after the regressing commit in
>>
>> commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
>> Author: Ben Widawsky <ben@bwidawsk.net>
>> Date:   Thu Jul 4 11:02:06 2013 -0700
>>
>>     drm/i915: Use eLLC/LLC by default when available
>>
>> Hence I think that one should be patched, too (QA doesn't have an iris
>> machine afaik, so no regression report expected). And since I have
>> some open questions about Chris WT-for-Iris patch too I think it's
>> best to spin a v4 of this one here.
>> -Daniel
>>
>
> We've come up with a theory as to why this was a regression (which I
> think nullifies your request for the eLLC version)
>
> On the surface, one would think that aging all objects equally, whether
> they be oldest, or youngest, makes no difference. The key with LLC is
> that it is shared with the CPU, and presumably the CPU follows the
> normal cache aging FIFO rules (ie. everything is aged 3). On this logic,
> eLLC would not share the same flaw. I think it makes sense to have
> someone prove we need it for eLLC, as this does allow mesa/DDX to get 2
> differently aged eLLC objects, which they cannot do if we default to 3.
> On the same note, even LLC age 0 is still ideal in my opinion, but was
> asked to not do this so that mesa 9.2 doesn't have performance
> regressions.

Well, we have a get/set_caching ioctl, explicitly with room for stuff
like aging. So this can be added in any case. But yeah, it would be
good if someone could check whether we have the same regression on
iris. Can you please poke QA (or whoever has one of these boxen)?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Ben Widawsky Aug. 4, 2013, 11:22 p.m. UTC | #4
On Mon, Aug 05, 2013 at 01:03:29AM +0200, Daniel Vetter wrote:
> On Sun, Aug 4, 2013 at 10:55 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Sun, Aug 04, 2013 at 09:10:25PM +0200, Daniel Vetter wrote:
> >> On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> >> > The default LLC age was changed:
> >> > commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
> >> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
> >> > Date:   Thu Jul 4 11:02:03 2013 -0700
> >> >
> >> > drm/i915/hsw: Set correct Haswell PTE encodings.
> >> >
> >> > This caused a regression in performance on certain benchmarks. While I
> >> > think a discussion still needs to happen about how the kernel should
> >> > default for both eLLC, and LLC - just revert this behavior for now.
> >> >
> >> > v2: Drop the extra #define (Chad)
> >> >
> >> > v3: Actually git add
> >> >
> >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
> >> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >> > Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
> >>
> >> iris_pte_encode was split up after the regressing commit in
> >>
> >> commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
> >> Author: Ben Widawsky <ben@bwidawsk.net>
> >> Date:   Thu Jul 4 11:02:06 2013 -0700
> >>
> >>     drm/i915: Use eLLC/LLC by default when available
> >>
> >> Hence I think that one should be patched, too (QA doesn't have an iris
> >> machine afaik, so no regression report expected). And since I have
> >> some open questions about Chris WT-for-Iris patch too I think it's
> >> best to spin a v4 of this one here.
> >> -Daniel
> >>
> >
> > We've come up with a theory as to why this was a regression (which I
> > think nullifies your request for the eLLC version)
> >
> > On the surface, one would think that aging all objects equally, whether
> > they be oldest, or youngest, makes no difference. The key with LLC is
> > that it is shared with the CPU, and presumably the CPU follows the
> > normal cache aging FIFO rules (ie. everything is aged 3). On this logic,
> > eLLC would not share the same flaw. I think it makes sense to have
> > someone prove we need it for eLLC, as this does allow mesa/DDX to get 2
> > differently aged eLLC objects, which they cannot do if we default to 3.
> > On the same note, even LLC age 0 is still ideal in my opinion, but was
> > asked to not do this so that mesa 9.2 doesn't have performance
> > regressions.
> 
> Well we have a get/set_caching ioctl, explicitly with room for stuff
> like aging.

It's like someone wrote such patches almost a year ago, and they were
rejected as overly complicated by some maintainer.

> So this can be added in any case. But yeah if someone can
> check whether we have the same regression on iris or not would be
> good. Can you please poke QA (or whoever has one of these boxen)?
> -Daniel

Well, forgetting regressions for just one sec, what do you think makes
the most sense? If you think it should be 3, it's probably not even
worth the effort since I barely care - and I'm happy to just change it..
Daniel Vetter Aug. 5, 2013, 5:07 a.m. UTC | #5
On Mon, Aug 5, 2013 at 1:22 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Mon, Aug 05, 2013 at 01:03:29AM +0200, Daniel Vetter wrote:
>> Well we have a get/set_caching ioctl, explicitly with room for stuff
>> like aging.
>
> It's like someone wrote such patches almost a year ago, and they were
> rejected as overly complicated by some maintainer.

Rejected as "needs to be justified with performance data and a
mesa/userspace use-case", not "rejected as too complicated, period".

>> So this can be added in any case. But yeah if someone can
>> check whether we have the same regression on iris or not would be
>> good. Can you please poke QA (or whoever has one of these boxen)?
>> -Daniel
>
> Well, forgetting regressions for just one sec, what do you think makes
> the most sense? If you think it should be 3, it's probably not even
> worth the effort since I barely care - and I'm happy to just change it..

Since the iris split happened after the regression and we've just
argued that age3 seems to be the right default (as long as mesa
doesn't do anything special) I think we should argue for age3 on iris
systems, too. Imo arguing against simply propagating the copy&paste
change needs performance data.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e7b4204..3e7f124 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -52,6 +52,7 @@ 
  */
 #define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
 					 (((bits) & 0x8) << (11 - 3)))
+#define HSW_WB_LLC_AGE3			HSW_CACHEABILITY_CONTROL(0x2)
 #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 
@@ -105,7 +106,7 @@  static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 	pte |= HSW_PTE_ADDR_ENCODE(addr);
 
 	if (level != I915_CACHE_NONE)
-		pte |= HSW_WB_LLC_AGE0;
+		pte |= HSW_WB_LLC_AGE3;
 
 	return pte;
 }
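
To see what the one-line change does to the PTE bits, the macro from the hunk above can be expanded in a small userspace program. This is only an illustrative sketch: the three #defines are copied from the patch, while dump_hsw_cache_bits() is a hypothetical helper that does not exist in the driver, and the bit-position comments are read off the macro itself rather than taken from hardware documentation.

```c
#include <assert.h>
#include <stdio.h>

/* Copied from the patch: the low three control bits are shifted into
 * PTE bits 3:1, and the fourth control bit is shifted into PTE bit 11. */
#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
					 (((bits) & 0x8) << (11 - 3)))
#define HSW_WB_LLC_AGE3			HSW_CACHEABILITY_CONTROL(0x2)
#define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
#define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)

/* Hypothetical helper (not in i915): print each expanded encoding. */
void dump_hsw_cache_bits(void)
{
	printf("HSW_WB_LLC_AGE3      = 0x%03x\n", (unsigned)HSW_WB_LLC_AGE3);
	printf("HSW_WB_LLC_AGE0      = 0x%03x\n", (unsigned)HSW_WB_LLC_AGE0);
	printf("HSW_WB_ELLC_LLC_AGE0 = 0x%03x\n", (unsigned)HSW_WB_ELLC_LLC_AGE0);
}
```

Calling dump_hsw_cache_bits() from a main() shows that HSW_WB_LLC_AGE3 and HSW_WB_LLC_AGE0 expand to 0x004 and 0x006, so the patch changes a single bit of the value OR-ed into the PTE, while HSW_WB_ELLC_LLC_AGE0 (0x806) additionally sets bit 11.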