Message ID | 1375397767-5439-1-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> The default LLC age was changed:
> commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
> Author: Ben Widawsky <benjamin.widawsky@intel.com>
> Date:   Thu Jul 4 11:02:03 2013 -0700
>
>     drm/i915/hsw: Set correct Haswell PTE encodings.
>
> This caused a regression in performance on certain benchmarks. While I
> think a discussion still needs to happen about how the kernel should
> default for both eLLC, and LLC - just revert this behavior for now.
>
> v2: Drop the extra #define (Chad)
>
> v3: Actually git add
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>

iris_pte_encode was split up after the regressing commit in

commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jul 4 11:02:06 2013 -0700

    drm/i915: Use eLLC/LLC by default when available

Hence I think that one should be patched, too (QA doesn't have an iris
machine afaik, so no regression report expected). And since I have
some open questions about Chris's WT-for-Iris patch too I think it's
best to spin a v4 of this one here.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index e7b4204..3e7f124 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -52,6 +52,7 @@
>   */
>  #define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
>  					 (((bits) & 0x8) << (11 - 3)))
> +#define HSW_WB_LLC_AGE3			HSW_CACHEABILITY_CONTROL(0x2)
>  #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
>  #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
>
> @@ -105,7 +106,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>  	pte |= HSW_PTE_ADDR_ENCODE(addr);
>
>  	if (level != I915_CACHE_NONE)
> -		pte |= HSW_WB_LLC_AGE0;
> +		pte |= HSW_WB_LLC_AGE3;
>
>  	return pte;
>  }
> --
> 1.8.3.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Sun, Aug 04, 2013 at 09:10:25PM +0200, Daniel Vetter wrote:
> On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > The default LLC age was changed:
> > commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
> > Date:   Thu Jul 4 11:02:03 2013 -0700
> >
> >     drm/i915/hsw: Set correct Haswell PTE encodings.
> >
> > This caused a regression in performance on certain benchmarks. While I
> > think a discussion still needs to happen about how the kernel should
> > default for both eLLC, and LLC - just revert this behavior for now.
> >
> > v2: Drop the extra #define (Chad)
> >
> > v3: Actually git add
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
>
> iris_pte_encode was split up after the regressing commit in
>
> commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
> Author: Ben Widawsky <ben@bwidawsk.net>
> Date:   Thu Jul 4 11:02:06 2013 -0700
>
>     drm/i915: Use eLLC/LLC by default when available
>
> Hence I think that one should be patched, too (QA doesn't have an iris
> machine afaik, so no regression report expected). And since I have
> some open questions about Chris's WT-for-Iris patch too I think it's
> best to spin a v4 of this one here.
> -Daniel
>

We've come up with a theory as to why this was a regression (which I
think nullifies your request for the eLLC version).

On the surface, one would think that aging all objects equally, whether
they be oldest, or youngest, makes no difference. The key with LLC is
that it is shared with the CPU, and presumably the CPU follows the
normal cache aging FIFO rules (i.e. everything is aged 3). On this logic,
eLLC would not share the same flaw. I think it makes sense to have
someone prove we need it for eLLC, as this does allow mesa/DDX to get 2
differently aged eLLC objects, which they cannot do if we default to 3.
On the same note, even LLC age 0 is still ideal in my opinion, but I was
asked to not do this so that mesa 9.2 doesn't have performance
regressions.

[snip]
On Sun, Aug 4, 2013 at 10:55 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sun, Aug 04, 2013 at 09:10:25PM +0200, Daniel Vetter wrote:
>> On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> > The default LLC age was changed:
>> > commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
>> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
>> > Date:   Thu Jul 4 11:02:03 2013 -0700
>> >
>> >     drm/i915/hsw: Set correct Haswell PTE encodings.
>> >
>> > This caused a regression in performance on certain benchmarks. While I
>> > think a discussion still needs to happen about how the kernel should
>> > default for both eLLC, and LLC - just revert this behavior for now.
>> >
>> > v2: Drop the extra #define (Chad)
>> >
>> > v3: Actually git add
>> >
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>> > Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
>>
>> iris_pte_encode was split up after the regressing commit in
>>
>> commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
>> Author: Ben Widawsky <ben@bwidawsk.net>
>> Date:   Thu Jul 4 11:02:06 2013 -0700
>>
>>     drm/i915: Use eLLC/LLC by default when available
>>
>> Hence I think that one should be patched, too (QA doesn't have an iris
>> machine afaik, so no regression report expected). And since I have
>> some open questions about Chris's WT-for-Iris patch too I think it's
>> best to spin a v4 of this one here.
>> -Daniel
>>
>
> We've come up with a theory as to why this was a regression (which I
> think nullifies your request for the eLLC version)
>
> On the surface, one would think that aging all objects equally, whether
> they be oldest, or youngest, makes no difference. The key with LLC is
> that it is shared with the CPU, and presumably the CPU follows the
> normal cache aging FIFO rules (i.e. everything is aged 3). On this logic,
> eLLC would not share the same flaw. I think it makes sense to have
> someone prove we need it for eLLC, as this does allow mesa/DDX to get 2
> differently aged eLLC objects, which they cannot do if we default to 3.
> On the same note, even LLC age 0 is still ideal in my opinion, but was
> asked to not do this so that mesa 9.2 doesn't have performance
> regressions.

Well we have a get/set_caching ioctl, explicitly with room for stuff
like aging. So this can be added in any case. But yeah if someone can
check whether we have the same regression on iris or not would be
good. Can you please poke QA (or whoever has one of these boxen)?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Mon, Aug 05, 2013 at 01:03:29AM +0200, Daniel Vetter wrote:
> On Sun, Aug 4, 2013 at 10:55 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Sun, Aug 04, 2013 at 09:10:25PM +0200, Daniel Vetter wrote:
> >> On Fri, Aug 2, 2013 at 12:56 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> >> > The default LLC age was changed:
> >> > commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
> >> > Author: Ben Widawsky <benjamin.widawsky@intel.com>
> >> > Date:   Thu Jul 4 11:02:03 2013 -0700
> >> >
> >> >     drm/i915/hsw: Set correct Haswell PTE encodings.
> >> >
> >> > This caused a regression in performance on certain benchmarks. While I
> >> > think a discussion still needs to happen about how the kernel should
> >> > default for both eLLC, and LLC - just revert this behavior for now.
> >> >
> >> > v2: Drop the extra #define (Chad)
> >> >
> >> > v3: Actually git add
> >> >
> >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
> >> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >> > Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
> >>
> >> iris_pte_encode was split up after the regressing commit in
> >>
> >> commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
> >> Author: Ben Widawsky <ben@bwidawsk.net>
> >> Date:   Thu Jul 4 11:02:06 2013 -0700
> >>
> >>     drm/i915: Use eLLC/LLC by default when available
> >>
> >> Hence I think that one should be patched, too (QA doesn't have an iris
> >> machine afaik, so no regression report expected). And since I have
> >> some open questions about Chris's WT-for-Iris patch too I think it's
> >> best to spin a v4 of this one here.
> >> -Daniel
> >>
> >
> > We've come up with a theory as to why this was a regression (which I
> > think nullifies your request for the eLLC version)
> >
> > On the surface, one would think that aging all objects equally, whether
> > they be oldest, or youngest, makes no difference. The key with LLC is
> > that it is shared with the CPU, and presumably the CPU follows the
> > normal cache aging FIFO rules (i.e. everything is aged 3). On this logic,
> > eLLC would not share the same flaw. I think it makes sense to have
> > someone prove we need it for eLLC, as this does allow mesa/DDX to get 2
> > differently aged eLLC objects, which they cannot do if we default to 3.
> > On the same note, even LLC age 0 is still ideal in my opinion, but was
> > asked to not do this so that mesa 9.2 doesn't have performance
> > regressions.
>
> Well we have a get/set_caching ioctl, explicitly with room for stuff
> like aging.

It's like someone wrote such patches almost a year ago, and they were
rejected as overly complicated by some maintainer.

> So this can be added in any case. But yeah if someone can
> check whether we have the same regression on iris or not would be
> good. Can you please poke QA (or whoever has one of these boxen)?
> -Daniel

Well, forgetting regressions for just one sec, what do you think makes
the most sense? If you think it should be 3, it's probably not even
worth the effort since I barely care - and I'm happy to just change it.
On Mon, Aug 5, 2013 at 1:22 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Mon, Aug 05, 2013 at 01:03:29AM +0200, Daniel Vetter wrote:
>> Well we have a get/set_caching ioctl, explicitly with room for stuff
>> like aging.
>
> It's like someone wrote such patches almost a year ago, and they were
> rejected as overly complicated by some maintainer.

Rejected as "needs to be justified with performance data and a
mesa/userspace use-case", not "rejected as too complicated, period".

>> So this can be added in any case. But yeah if someone can
>> check whether we have the same regression on iris or not would be
>> good. Can you please poke QA (or whoever has one of these boxen)?
>> -Daniel
>
> Well, forgetting regressions for just one sec, what do you think makes
> the most sense? If you think it should be 3, it's probably not even
> worth the effort since I barely care - and I'm happy to just change it.

Since the iris split happened after the regression and we've just
argued that age3 seems to be the right default (as long as mesa
doesn't do anything special) I think we should argue for age3 on iris
systems, too. Imo arguing against simply propagating the copy&paste
change needs performance data.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e7b4204..3e7f124 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -52,6 +52,7 @@
  */
 #define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
 					 (((bits) & 0x8) << (11 - 3)))
+#define HSW_WB_LLC_AGE3			HSW_CACHEABILITY_CONTROL(0x2)
 #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 
@@ -105,7 +106,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 	pte |= HSW_PTE_ADDR_ENCODE(addr);
 
 	if (level != I915_CACHE_NONE)
-		pte |= HSW_WB_LLC_AGE0;
+		pte |= HSW_WB_LLC_AGE3;
 
 	return pte;
 }
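For readers following the encodings in the patch, here is a small standalone sketch of what the three macros expand to. The macro definitions are copied from the diff. The helpers show_encoding() and dump_hsw_encodings() are hypothetical names used only for this illustration and are not part of i915. As written, HSW_CACHEABILITY_CONTROL() places the low three bits of its argument at PTE bits 3:1 and the fourth bit at PTE bit 11.

```c
#include <stdint.h>
#include <stdio.h>

/* Macros copied from the patch. */
#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
					 (((bits) & 0x8) << (11 - 3)))
#define HSW_WB_LLC_AGE3		HSW_CACHEABILITY_CONTROL(0x2)
#define HSW_WB_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x3)
#define HSW_WB_ELLC_LLC_AGE0	HSW_CACHEABILITY_CONTROL(0xb)

/* Hypothetical helper (not in i915): print one encoding and its fields. */
static void show_encoding(const char *name, uint32_t pte_bits)
{
	printf("%-22s 0x%03x (bits 3:1 = 0x%x, bit 11 = %u)\n",
	       name, (unsigned)pte_bits,
	       (unsigned)((pte_bits >> 1) & 0x7),
	       (unsigned)((pte_bits >> 11) & 0x1));
}

/* Hypothetical helper (not in i915): dump the encodings from the patch. */
void dump_hsw_encodings(void)
{
	show_encoding("HSW_WB_LLC_AGE3", HSW_WB_LLC_AGE3);           /* 0x004 */
	show_encoding("HSW_WB_LLC_AGE0", HSW_WB_LLC_AGE0);           /* 0x006 */
	show_encoding("HSW_WB_ELLC_LLC_AGE0", HSW_WB_ELLC_LLC_AGE0); /* 0x806 */
}
```

Calling dump_hsw_encodings() from a main() prints the three values. The only difference between the old and new default in hsw_pte_encode() is the field at bits 3:1, which goes from 0x3 (age 0) to 0x2 (age 3); bit 11 stays clear in both.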