
[18/30] drm/i915: Add an interface to dynamically change the cache level

Message ID 1302640318-23165-19-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson April 12, 2011, 8:31 p.m. UTC
[anholt v2: Don't forget that when going from cached to uncached, we
haven't been tracking the write domain from the CPU perspective, since
we haven't needed it for GPU coherency.]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |    6 ++++-
 drivers/gpu/drm/i915/i915_gem.c         |   40 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c     |    3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |    6 +++-
 4 files changed, 50 insertions(+), 5 deletions(-)

Comments

Daniel Vetter April 13, 2011, 6:59 p.m. UTC | #1
> @@ -3002,6 +3002,44 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  	return 0;
>  }
>  
> +int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +				    enum i915_cache_level cache_level)
> +{
> +	int ret;
> +
> +	if (obj->cache_level == cache_level)
> +		return 0;
> +
> +	if (obj->gtt_space) {
> +		ret = i915_gem_object_flush_gpu(obj);
> +		if (ret)
> +			return ret;
> +
> +		ret = i915_gem_gtt_bind_object(obj, cache_level);
> +		if (ret)
> +			return ret;

This momentarily confused me until I noticed that the fake agp driver
does the right thing and does not re-create a dmar mapping if it already
exists. So much for remembering my own code. Still, maybe extract
i915_gem_gtt_rebind_object from restore_gtt_mappings and use that one
here? That should make the intent clearer.
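
Roughly, I'm thinking of something along these lines (only a sketch; it
assumes cache_level_to_agp_type() and the dmar/non-dmar split stay exactly
as they are in restore_gtt_mappings today, so the details may need
adjusting):

void i915_gem_gtt_rebind_object(struct drm_i915_gem_object *obj,
				enum i915_cache_level cache_level)
{
	struct drm_device *dev = obj->base.dev;
	struct drm_i915_private *dev_priv = dev->dev_private;
	unsigned int agp_type = cache_level_to_agp_type(dev, cache_level);

	if (dev_priv->mm.gtt->needs_dmar) {
		/* The sg list was built when the object was first bound;
		 * rebinding only rewrites the PTEs with the new attributes,
		 * so it cannot fail. */
		BUG_ON(!obj->sg_list);

		intel_gtt_insert_sg_entries(obj->sg_list, obj->num_sg,
					    obj->gtt_space->start >> PAGE_SHIFT,
					    agp_type);
	} else
		intel_gtt_insert_pages(obj->gtt_space->start >> PAGE_SHIFT,
				       obj->base.size >> PAGE_SHIFT,
				       obj->pages, agp_type);
}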

> +		/* Ensure that we invalidate the GPU's caches and TLBs. */
> +		obj->base.read_domains &= I915_GEM_GPU_DOMAINS;

I can't make sense of this. Either we really want to ensure that the gpu
buffers get invalidated on next use. But then it's probably

		read_domains &= ~GPU_DOMAINS

and would fit better grouped together with the call to object_flush_gpu
(the rebind can't actually fail if the dmar mappings already exist). Or
this is something else and I'm blind.
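
I.e. something like this (untested, just to show the grouping I have in
mind, and assuming the rebind helper suggested above):

	if (obj->gtt_space) {
		ret = i915_gem_object_flush_gpu(obj);
		if (ret)
			return ret;

		/* Drop the GPU domains here so that the next execbuffer
		 * invalidates the GPU's caches and TLBs before it reads
		 * through the rebound PTEs. */
		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;

		i915_gem_gtt_rebind_object(obj, cache_level);
	}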

> +	}
> +
> +	if (cache_level == I915_CACHE_NONE) {
> +		/* If we're coming from LLC cached, then we haven't
> +		 * actually been tracking whether the data is in the
> +		 * CPU cache or not, since we only allow one bit set
> +		 * in obj->write_domain and have been skipping the clflushes.
> +		 * Just set it to the CPU cache for now.
> +		 */
> +		WARN_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
> +
> +		obj->base.read_domains |= I915_GEM_DOMAIN_CPU;

This breaks the invariant that write_domain != 0 implies write_domain ==
read_domains. Yes, if nothing prefetches and we clflush in due time the
caches should still be valid, but paranoid me deems that a bit fragile.
Also future patches shoot down fences, so we might as well shoot down the
gtt mapping completely. That seems required for the redirect gtt mappings
patch, too.
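
Shooting the mapping down would be along the lines of (sketch only, using
the helpers we already have; whether the fence teardown belongs here or in
the follow-up patch is the open question):

	/* Invalidate the CPU's GTT view so that stale PTEs and fences
	 * cannot be used once the caching attributes change. */
	i915_gem_release_mmap(obj);

	ret = i915_gem_object_put_fence(obj);
	if (ret)
		return ret;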

> +		obj->base.write_domain = I915_GEM_DOMAIN_CPU;

We might end up here with a write_domain == DOMAIN_GTT. Feels a tad bit
unsafe. I'd prefer either a WARN_ON and push the problem out to callers or
to call flush_gtt_write_domain somewhere in set_cache_level.

This looks like the critical part of the whole patch series so perhaps
fold the follow-up patches in here, too (like fence teardown). This way
there's just one spot that requires _really_ careful thinking.

Also, I haven't thought too hard about the uncached->cached transition on
live objects, which is not (yet) required. Maybe some more careful
handling of the gtt domain (mappings teardown) is needed for that.
-Daniel
Chris Wilson April 13, 2011, 7:21 p.m. UTC | #2
On Wed, 13 Apr 2011 20:59:46 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> > @@ -3002,6 +3002,44 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  	return 0;
> >  }
> >  
> > +int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > +				    enum i915_cache_level cache_level)
> > +{
> > +	int ret;
> > +
> > +	if (obj->cache_level == cache_level)
> > +		return 0;
> > +
> > +	if (obj->gtt_space) {
> > +		ret = i915_gem_object_flush_gpu(obj);
> > +		if (ret)
> > +			return ret;
> > +
> > +		ret = i915_gem_gtt_bind_object(obj, cache_level);
> > +		if (ret)
> > +			return ret;
> 
> This momentarily confused me until I noticed that the fake agp driver
> does the right thing and does not re-create a dmar mapping if it already
> exists. So much for remembering my own code. Still, maybe extract
> i915_gem_gtt_rebind_object from restore_gtt_mappings and use that one
> here? That should make the intent clearer.

Now that you've reminded me, I was going to ask you at one point whether we
can move the construction of the sg list to a separate function. I'm not
convinced we have the sanest of interfaces between the gtt driver and i915
yet. I think it will be worth revisiting that as our usage patterns change.

[The suggested change is good, of course.]
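
For illustration, the kind of split I mean would be a dedicated prepare
step, roughly (intel_gtt_map_memory() and the needs_dmar check are my
assumptions about the current gtt interface, so treat this as a sketch):

int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
{
	struct drm_device *dev = obj->base.dev;
	struct drm_i915_private *dev_priv = dev->dev_private;

	/* Build the scatterlist once, up front, so that binding and
	 * rebinding afterwards only rewrite PTEs and cannot fail. */
	if (dev_priv->mm.gtt->needs_dmar)
		return intel_gtt_map_memory(obj->pages,
					    obj->base.size >> PAGE_SHIFT,
					    &obj->sg_list,
					    &obj->num_sg);

	return 0;
}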

> 
> > +		/* Ensure that we invalidate the GPU's caches and TLBs. */
> > +		obj->base.read_domains &= I915_GEM_GPU_DOMAINS;
> 
> I can't make sense of this. Either we really want to ensure that the gpu
> buffers get invalidated on next use. But then it's probably
> 
> 		read_domains &= ~GPU_DOMAINS
> 
> and would fit better grouped together with the call to object_flush_gpu
> (the rebind can't actually fail if the dmar mappings already exist). Or
> this is something else and I'm blind.

Gah, typo. Even re-reading what you wrote, I thought you had gone insane.
It was only me who had. ;-)

> > +	}
> > +
> > +	if (cache_level == I915_CACHE_NONE) {
> > +		/* If we're coming from LLC cached, then we haven't
> > +		 * actually been tracking whether the data is in the
> > +		 * CPU cache or not, since we only allow one bit set
> > +		 * in obj->write_domain and have been skipping the clflushes.
> > +		 * Just set it to the CPU cache for now.
> > +		 */
> > +		WARN_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
> > +
> > +		obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
> 
> This breaks the invariant that write_domain != 0 implies write_domain ==
> read_domains. Yes, if nothing prefetches and we clflush in due time the
> caches should still be valid, but paranoid me deems that a bit fragile.
> Also future patches shoot down fences, so we might as well shoot down the
> gtt mapping completely. That seems required for the redirect gtt mappings
> patch, too.
> 
> > +		obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> 
> We might end up here with a write_domain == DOMAIN_GTT. Feels a tad bit
> unsafe. I'd prefer either a WARN_ON and push the problem out to callers or
> to call flush_gtt_write_domain somewhere in set_cache_level.

Right, we can simply flush the GTT write domain and do a complete move
into the CPU domain:

if (cache_level == I915_CACHE_NONE) {
	/* If we're coming from LLC cached, then we haven't
	 * actually been tracking whether the data is in the
	 * CPU cache or not, since we only allow one bit set
	 * in obj->write_domain and have been skipping the
	 * clflushes.
	 * Just set it to the CPU cache for now.
	 */
	WARN_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
	WARN_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);

	i915_gem_object_flush_gtt_write_domain(obj);

	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
}

That should be a no-op and cause no greater impact than having to trigger
the clflushes, and it keeps the domain tracking consistent.

However, considering the unflushed GTT bug, it does become much simpler...

> This looks like the critical part of the whole patch series so perhaps
> fold the follow-up patches in here, too (like fence teardown). This way
> there's just one spot that requires _really_ careful thinking.
> 
> Also, I haven't thought too hard about the uncached->cached transition on
> live objects, which is not (yet) required. Maybe some more careful
> handling of the gtt domain (mappings teardown) is needed for that.

Right, we've already identified that we have a bug here entirely due to
not flushing the GTT!
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2536334..2f45228 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1196,9 +1196,13 @@  void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 uint32_t
 i915_gem_get_unfenced_gtt_alignment(struct drm_i915_gem_object *obj);
 
+int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    enum i915_cache_level cache_level);
+
 /* i915_gem_gtt.c */
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
-int __must_check i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj);
+int __must_check i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
+					  enum i915_cache_level cache_level);
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 
 /* i915_gem_evict.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fa483d8..9027ee4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2831,7 +2831,7 @@  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return ret;
 	}
 
-	ret = i915_gem_gtt_bind_object(obj);
+	ret = i915_gem_gtt_bind_object(obj, obj->cache_level);
 	if (ret) {
 		i915_gem_object_put_pages_gtt(obj);
 		drm_mm_put_block(obj->gtt_space);
@@ -3002,6 +3002,44 @@  i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	return 0;
 }
 
+int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    enum i915_cache_level cache_level)
+{
+	int ret;
+
+	if (obj->cache_level == cache_level)
+		return 0;
+
+	if (obj->gtt_space) {
+		ret = i915_gem_object_flush_gpu(obj);
+		if (ret)
+			return ret;
+
+		ret = i915_gem_gtt_bind_object(obj, cache_level);
+		if (ret)
+			return ret;
+
+		/* Ensure that we invalidate the GPU's caches and TLBs. */
+		obj->base.read_domains &= I915_GEM_GPU_DOMAINS;
+	}
+
+	if (cache_level == I915_CACHE_NONE) {
+		/* If we're coming from LLC cached, then we haven't
+		 * actually been tracking whether the data is in the
+		 * CPU cache or not, since we only allow one bit set
+		 * in obj->write_domain and have been skipping the clflushes.
+		 * Just set it to the CPU cache for now.
+		 */
+		WARN_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
+
+		obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
+		obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+	}
+
+	obj->cache_level = cache_level;
+	return 0;
+}
+
 /*
  * Prepare buffer for display plane. Use uninterruptible for possible flush
  * wait, as in modesetting process we're not supposed to be interrupted.
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2a1f8f1..6505617 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -77,7 +77,8 @@  void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	intel_gtt_chipset_flush();
 }
 
-int i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj)
+int i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
+			     enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index eab2565..f15d80f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -236,7 +236,8 @@  init_pipe_control(struct intel_ring_buffer *ring)
 		ret = -ENOMEM;
 		goto err;
 	}
-	obj->cache_level = I915_CACHE_LLC;
+
+	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
 
 	ret = i915_gem_object_pin(obj, 4096, true);
 	if (ret)
@@ -759,7 +760,8 @@  static int init_status_page(struct intel_ring_buffer *ring)
 		ret = -ENOMEM;
 		goto err;
 	}
-	obj->cache_level = I915_CACHE_LLC;
+
+	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
 
 	ret = i915_gem_object_pin(obj, 4096, true);
 	if (ret != 0) {