drm/i915: Support 64b relocations

Message ID 1398730708-3278-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State New, archived

Commit Message

Ben Widawsky April 29, 2014, 12:18 a.m. UTC
All the rest of the code to enable this is in my branch. Without my
branch, hitting > 32b offsets is impossible. The code has always
"supported" 64b, but it's never actually been run or tested. This change
doesn't actually fix anything. [1] I am not sure why X won't work yet. I
do not get hangs or obvious errors.

There are 3 fixes grouped together here. First is to remove the
hardcoded 0 for the upper dword of the relocation. The next fix is to
use a 64b value for target_offset. The final fix is to not directly
apply target_offset to reloc->delta. reloc->delta is part of ABI, and so
we cannot change it. As it stands, 32b is enough to represent everything
we're interested in representing anyway. The main problem is that we
cannot add values wider than 32b to it directly.
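
To illustrate the three fixes described above, here is a minimal userspace
sketch of the arithmetic, not the driver code itself. The helper names
(lo32, hi32, reloc_value_old, reloc_value_new) are hypothetical; lo32 and
hi32 stand in for the kernel's lower_32_bits() and upper_32_bits(). It
assumes the ABI field reloc->delta is 32 bits wide, as the text above
implies.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for lower_32_bits()/upper_32_bits(). */
static inline uint32_t lo32(uint64_t v) { return (uint32_t)(v & 0xffffffffu); }
static inline uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Old behaviour (sketch): target_offset was held in a 32-bit local and
 * added into the 32-bit delta, and the upper dword was hardcoded to 0.
 */
static void reloc_value_old(uint32_t delta, uint64_t full_offset,
			    uint32_t out[2])
{
	uint32_t target_offset = (uint32_t)full_offset; /* bits 32+ dropped */

	delta += target_offset;  /* 32-bit add, may wrap */
	out[0] = delta;          /* lower dword */
	out[1] = 0;              /* upper dword, hardcoded */
}

/*
 * New behaviour (sketch): leave delta untouched, widen to 64 bits before
 * adding, then split the sum into the two dwords that get written.
 */
static void reloc_value_new(uint32_t delta, uint64_t target_offset,
			    uint32_t out[2])
{
	uint64_t value = (uint64_t)delta + target_offset;

	out[0] = lo32(value);
	out[1] = hi32(value);
}
```

With delta = 0x10 and a target offset of 0x1fffffff8, both versions produce
the same lower dword, but only the new one carries the overflow into the
upper dword instead of silently losing it.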

[1] Almost all of intel-gpu-tools is not yet ready to test 64b
relocations. There are a few places that expect 32b values for offsets
and these all won't work.

Cc: Rafael Barbalho <rafael.barbalho@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

Comments

Chris Wilson May 1, 2014, 8:04 a.m. UTC | #1
On Mon, Apr 28, 2014 at 05:18:28PM -0700, Ben Widawsky wrote:
> All the rest of the code to enable this is in my branch. Without my
> branch, hitting > 32b offsets is impossible. The code has always
> "supported" 64b, but it's never actually been run or tested. This change
> doesn't actually fix anything. [1] I am not sure why X won't work yet. I
> do not get hangs or obvious errors.
> 
> There are 3 fixes grouped together here. First is to remove the
> hardcoded 0 for the upper dword of the relocation. The next fix is to
> use a 64b value for target_offset. The final fix is to not directly
> apply target_offset to reloc->delta. reloc->delta is part of ABI, and so
> we cannot change it. As it stands, 32b is enough to represent everything
> we're interested in representing anyway. The main problem is, we cannot
> add greater than 32b values to it directly.
> 
> [1] Almost all of intel-gpu-tools is not yet ready to test 64b
> relocations. There are a few places that expect 32b values for offsets
> and these all won't work.
> 
> Cc: Rafael Barbalho <rafael.barbalho@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Seriously, we did this? I am ashamed. I was annoyed by the original
assertion that no userspace was ready in the first place, and to see
that the code was a complete farce anyway...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

> ---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 0d806fc..6ffecd2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -262,10 +262,12 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
>  
>  static int
>  relocate_entry_cpu(struct drm_i915_gem_object *obj,
> -		   struct drm_i915_gem_relocation_entry *reloc)
> +		   struct drm_i915_gem_relocation_entry *reloc,
> +		   uint64_t target_offset)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	uint32_t page_offset = offset_in_page(reloc->offset);
> +	uint64_t delta = reloc->delta + target_offset;

I would not have called the final value delta, but target_offset.
I was going to quibble over the use of a local variable instead of
reloc->delta, but you successfully argued in my head that your way was
less obtuse.
-Chris

Daniel Vetter May 5, 2014, 2:06 p.m. UTC | #2
On Thu, May 01, 2014 at 09:04:50AM +0100, Chris Wilson wrote:
> On Mon, Apr 28, 2014 at 05:18:28PM -0700, Ben Widawsky wrote:
> > All the rest of the code to enable this is in my branch. Without my
> > branch, hitting > 32b offsets is impossible. The code has always
> > "supported" 64b, but it's never actually been run or tested. This change
> > doesn't actually fix anything. [1] I am not sure why X won't work yet. I
> > do not get hangs or obvious errors.
> > 
> > There are 3 fixes grouped together here. First is to remove the
> > hardcoded 0 for the upper dword of the relocation. The next fix is to
> > use a 64b value for target_offset. The final fix is to not directly
> > apply target_offset to reloc->delta. reloc->delta is part of ABI, and so
> > we cannot change it. As it stands, 32b is enough to represent everything
> > we're interested in representing anyway. The main problem is, we cannot
> > add greater than 32b values to it directly.

Imo if you have a target_offset > 32b in a valid use-case we can bother to
look at this. But not before, since I expect that hw advances will make
this obsolete anyway.

> > [1] Almost all of intel-gpu-tools is not yet ready to test 64b
> > relocations. There are a few places that expect 32b values for offsets
> > and these all won't work.
> > 
> > Cc: Rafael Barbalho <rafael.barbalho@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Seriously, we did this? I am ashamed. I was annoyed by the original
> assertion that no userspace was ready in the first place, and to see
> that the code was a complete farce anyway...

Well my idea was that we try to prep userspace to avoid a needless abi
rev, but it was always clear to me that the kernel side (and igt) is
hopelessly broken for 64b relocs.

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Queued for -next, thanks for the patch.
-Daniel

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0d806fc..6ffecd2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -262,10 +262,12 @@  static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 
 static int
 relocate_entry_cpu(struct drm_i915_gem_object *obj,
-		   struct drm_i915_gem_relocation_entry *reloc)
+		   struct drm_i915_gem_relocation_entry *reloc,
+		   uint64_t target_offset)
 {
 	struct drm_device *dev = obj->base.dev;
 	uint32_t page_offset = offset_in_page(reloc->offset);
+	uint64_t delta = reloc->delta + target_offset;
 	char *vaddr;
 	int ret;
 
@@ -275,7 +277,7 @@  relocate_entry_cpu(struct drm_i915_gem_object *obj,
 
 	vaddr = kmap_atomic(i915_gem_object_get_page(obj,
 				reloc->offset >> PAGE_SHIFT));
-	*(uint32_t *)(vaddr + page_offset) = reloc->delta;
+	*(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta);
 
 	if (INTEL_INFO(dev)->gen >= 8) {
 		page_offset = offset_in_page(page_offset + sizeof(uint32_t));
@@ -286,7 +288,7 @@  relocate_entry_cpu(struct drm_i915_gem_object *obj,
 			    (reloc->offset + sizeof(uint32_t)) >> PAGE_SHIFT));
 		}
 
-		*(uint32_t *)(vaddr + page_offset) = 0;
+		*(uint32_t *)(vaddr + page_offset) = upper_32_bits(delta);
 	}
 
 	kunmap_atomic(vaddr);
@@ -296,10 +298,12 @@  relocate_entry_cpu(struct drm_i915_gem_object *obj,
 
 static int
 relocate_entry_gtt(struct drm_i915_gem_object *obj,
-		   struct drm_i915_gem_relocation_entry *reloc)
+		   struct drm_i915_gem_relocation_entry *reloc,
+		   uint64_t target_offset)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	uint64_t delta = reloc->delta + target_offset;
 	uint32_t __iomem *reloc_entry;
 	void __iomem *reloc_page;
 	int ret;
@@ -318,7 +322,7 @@  relocate_entry_gtt(struct drm_i915_gem_object *obj,
 			reloc->offset & PAGE_MASK);
 	reloc_entry = (uint32_t __iomem *)
 		(reloc_page + offset_in_page(reloc->offset));
-	iowrite32(reloc->delta, reloc_entry);
+	iowrite32(lower_32_bits(delta), reloc_entry);
 
 	if (INTEL_INFO(dev)->gen >= 8) {
 		reloc_entry += 1;
@@ -331,7 +335,7 @@  relocate_entry_gtt(struct drm_i915_gem_object *obj,
 			reloc_entry = reloc_page;
 		}
 
-		iowrite32(0, reloc_entry);
+		iowrite32(upper_32_bits(delta), reloc_entry);
 	}
 
 	io_mapping_unmap_atomic(reloc_page);
@@ -348,7 +352,7 @@  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	struct drm_gem_object *target_obj;
 	struct drm_i915_gem_object *target_i915_obj;
 	struct i915_vma *target_vma;
-	uint32_t target_offset;
+	uint64_t target_offset;
 	int ret;
 
 	/* we've already hold a reference to all valid objects */
@@ -427,11 +431,10 @@  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (obj->active && in_atomic())
 		return -EFAULT;
 
-	reloc->delta += target_offset;
 	if (use_cpu_reloc(obj))
-		ret = relocate_entry_cpu(obj, reloc);
+		ret = relocate_entry_cpu(obj, reloc, target_offset);
 	else
-		ret = relocate_entry_gtt(obj, reloc);
+		ret = relocate_entry_gtt(obj, reloc, target_offset);
 
 	if (ret)
 		return ret;