Message ID | 1449514270-15171-5-git-send-email-wayne.boyer@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote: > Do some further clean up based on the initial review of > drm/i915: Separate cherryview from valleyview. > > In this case, in i915_gem_alloc_context_obj() only call > i915_gem_object_set_cache_level() for Ivy Bridge devices > since later platforms don't have L3 control bits in the PTE. > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> > --- > drivers/gpu/drm/i915/i915_gem_context.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 4b1161d..e4de433 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device *dev, size_t size) > /* > * Try to make the context utilize L3 as well as LLC. > * > - * On VLV we don't have L3 controls in the PTEs so we > - * shouldn't touch the cache level, especially as that > - * would make the object snooped which might have a > - * negative performance impact. > + * This is only applicable for Ivy Bridge devices since > + * later platforms don't have L3 control bits in the PTE. > */ > - if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev)) { > + if (IS_IVYBRIDGE(dev)) { This would actually change the snoop setting on BXT, but that should be fine since BXT still has the GTT -> PAT 0 thing which means we get snooping anyway IIRC. Imre, we discussed this stuff at some point, and I hope I'm not forgetting something crucial here that would break BXT. Probably best if you sanity check my thinking here... We should probably mention this PAT 0 business in a comment somewhere here since it's a very easy thing to miss. > ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC); > /* Failure shouldn't ever happen this early */ > if (WARN_ON(ret)) { > -- > 2.6.3
On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote: > On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote: > > Do some further clean up based on the initial review of > > drm/i915: Separate cherryview from valleyview. > > > > In this case, in i915_gem_alloc_context_obj() only call > > i915_gem_object_set_cache_level() for Ivy Bridge devices > > since later platforms don't have L3 control bits in the PTE. > > > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> > > --- > > drivers/gpu/drm/i915/i915_gem_context.c | 8 +++----- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > > b/drivers/gpu/drm/i915/i915_gem_context.c > > index 4b1161d..e4de433 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device > > *dev, size_t size) > > /* > > * Try to make the context utilize L3 as well as LLC. > > * > > - * On VLV we don't have L3 controls in the PTEs so we > > - * shouldn't touch the cache level, especially as that > > - * would make the object snooped which might have a > > - * negative performance impact. > > + * This is only applicable for Ivy Bridge devices since > > + * later platforms don't have L3 control bits in the PTE. > > */ > > - if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && > > !IS_CHERRYVIEW(dev)) { > > + if (IS_IVYBRIDGE(dev)) { > > This would actually change the snoop setting on BXT, but that > should be fine since BXT still has the GTT -> PAT 0 thing which > means we get snooping anyway IIRC. > > Imre, we discussed this stuff at some point, and I hope I'm not > forgetting something crucial here that would break BXT. Probably best > if you sanity check my thinking here... Yes, AFAIU it doesn't matter what level we set, because of the PAT mapping constraint you describe above. Atm we also don't use this function on BXT, b/c of the different codepaths for LRC vs. ringbuffer mode. But that could/should be unified at one point, so yea it's better to keep things working for BXT too here. In the corresponding intel_lr_context_deferred_alloc() we don't set the cache level, which also agrees with the above, so I think this change is fine. > We should probably mention this PAT 0 business in a comment somewhere > here since it's a very easy thing to miss. Yep, a comment about this would be in order. > > > ret = i915_gem_object_set_cache_level(obj, > > I915_CACHE_L3_LLC); > > /* Failure shouldn't ever happen this early */ > > if (WARN_ON(ret)) { > > -- > > 2.6.3 >
On 12/7/15, 11:56 AM, "Deak, Imre" <imre.deak@intel.com> wrote: >On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote: >> On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote: >> > Do some further clean up based on the initial review of >> > drm/i915: Separate cherryview from valleyview. >> > >> > In this case, in i915_gem_alloc_context_obj() only call >> > i915_gem_object_set_cache_level() for Ivy Bridge devices >> > since later platforms don't have L3 control bits in the PTE. >> > >> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> >> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >> > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> >> > --- >> > drivers/gpu/drm/i915/i915_gem_context.c | 8 +++----- >> > 1 file changed, 3 insertions(+), 5 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c >> > b/drivers/gpu/drm/i915/i915_gem_context.c >> > index 4b1161d..e4de433 100644 >> > --- a/drivers/gpu/drm/i915/i915_gem_context.c >> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c >> > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device >> > *dev, size_t size) >> > /* >> > * Try to make the context utilize L3 as well as LLC. >> > * >> > - * On VLV we don't have L3 controls in the PTEs so we >> > - * shouldn't touch the cache level, especially as that >> > - * would make the object snooped which might have a >> > - * negative performance impact. >> > + * This is only applicable for Ivy Bridge devices since >> > + * later platforms don't have L3 control bits in the PTE. >> > */ >> > - if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && >> > !IS_CHERRYVIEW(dev)) { >> > + if (IS_IVYBRIDGE(dev)) { >> >> This would actually change the snoop setting on BXT, but that >> should be fine since BXT still has the GTT -> PAT 0 thing which >> means we get snooping anyway IIRC. >> >> Imre, we discussed this stuff at some point, and I hope I'm not >> forgetting something crucial here that would break BXT. Probably best >> if you sanity check my thinking here... > >Yes, AFAIU it doesn't matter what level we set, because of the PAT >mapping constraint you describe above. Atm we also don't use this >function on BXT, b/c of the different codepaths for LRC vs. ringbuffer >mode. But that could/should be unified at one point, so yea it's better >to keep things working for BXT too here. In the corresponding >intel_lr_context_deferred_alloc() we don't set the cache level, which >also agrees with the above, so I think this change is fine. > >> We should probably mention this PAT 0 business in a comment somewhere >> here since it's a very easy thing to miss. > >Yep, a comment about this would be in order. Ville, Imre, how does this look for an updated comment? /* * Try to make the context utilize L3 as well as LLC. * * On VLV we don't have L3 controls in the PTE so we * shouldn't touch the cache level, especially as that * would make the object snooped which might have a * negative performance impact. * * On CHV and BXT we set PPAT 0 in chv_setup_private_ppat() * which means we get snooping anyway. * * This is only applicable for Ivy Bridge devices since * later platforms don't have L3 control bits in the PTE. */
On Mon, Dec 07, 2015 at 10:26:15PM +0000, Boyer, Wayne wrote: > On 12/7/15, 11:56 AM, "Deak, Imre" <imre.deak@intel.com> wrote: > > > >On Mon, 2015-12-07 at 21:28 +0200, Ville Syrjälä wrote: > >> On Mon, Dec 07, 2015 at 10:51:09AM -0800, Wayne Boyer wrote: > >> > Do some further clean up based on the initial review of > >> > drm/i915: Separate cherryview from valleyview. > >> > > >> > In this case, in i915_gem_alloc_context_obj() only call > >> > i915_gem_object_set_cache_level() for Ivy Bridge devices > >> > since later platforms don't have L3 control bits in the PTE. > >> > > >> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > >> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > >> > Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> > >> > --- > >> > drivers/gpu/drm/i915/i915_gem_context.c | 8 +++----- > >> > 1 file changed, 3 insertions(+), 5 deletions(-) > >> > > >> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > >> > b/drivers/gpu/drm/i915/i915_gem_context.c > >> > index 4b1161d..e4de433 100644 > >> > --- a/drivers/gpu/drm/i915/i915_gem_context.c > >> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > >> > @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device > >> > *dev, size_t size) > >> > /* > >> > * Try to make the context utilize L3 as well as LLC. > >> > * > >> > - * On VLV we don't have L3 controls in the PTEs so we > >> > - * shouldn't touch the cache level, especially as that > >> > - * would make the object snooped which might have a > >> > - * negative performance impact. > >> > + * This is only applicable for Ivy Bridge devices since > >> > + * later platforms don't have L3 control bits in the PTE. > >> > */ > >> > - if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && > >> > !IS_CHERRYVIEW(dev)) { > >> > + if (IS_IVYBRIDGE(dev)) { > >> > >> This would actually change the snoop setting on BXT, but that > >> should be fine since BXT still has the GTT -> PAT 0 thing which > >> means we get snooping anyway IIRC. > >> > >> Imre, we discussed this stuff at some point, and I hope I'm not > >> forgetting something crucial here that would break BXT. Probably best > >> if you sanity check my thinking here... > > > >Yes, AFAIU it doesn't matter what level we set, because of the PAT > >mapping constraint you describe above. Atm we also don't use this > >function on BXT, b/c of the different codepaths for LRC vs. ringbuffer > >mode. But that could/should be unified at one point, so yea it's better > >to keep things working for BXT too here. In the corresponding > >intel_lr_context_deferred_alloc() we don't set the cache level, which > >also agrees with the above, so I think this change is fine. > > > >> We should probably mention this PAT 0 business in a comment somewhere > >> here since it's a very easy thing to miss. > > > >Yep, a comment about this would be in order. > > Ville, Imre, how does this look for an updated comment? > > /* > * Try to make the context utilize L3 as well as LLC. > * > * On VLV we don't have L3 controls in the PTE so we > * shouldn't touch the cache level, especially as that > * would make the object snooped which might have a > * negative performance impact. > * > * On CHV and BXT we set PPAT 0 in chv_setup_private_ppat() > * which means we get snooping anyway. Maybe something like this: "Snooping is required on non-llc platforms in execlist mode, but since all GGTT accesses use PAT entry 0 we get snooping anyway regardless of cache_level. Otherwise lgtm. > * > * This is only applicable for Ivy Bridge devices since > * later platforms don't have L3 control bits in the PTE. > */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 4b1161d..e4de433 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -185,12 +185,10 @@ i915_gem_alloc_context_obj(struct drm_device *dev, size_t size) /* * Try to make the context utilize L3 as well as LLC. * - * On VLV we don't have L3 controls in the PTEs so we - * shouldn't touch the cache level, especially as that - * would make the object snooped which might have a - * negative performance impact. + * This is only applicable for Ivy Bridge devices since + * later platforms don't have L3 control bits in the PTE. */ - if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev)) { + if (IS_IVYBRIDGE(dev)) { ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC); /* Failure shouldn't ever happen this early */ if (WARN_ON(ret)) {
Do some further clean up based on the initial review of drm/i915: Separate cherryview from valleyview. In this case, in i915_gem_alloc_context_obj() only call i915_gem_object_set_cache_level() for Ivy Bridge devices since later platforms don't have L3 control bits in the PTE. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> --- drivers/gpu/drm/i915/i915_gem_context.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)