Message ID | 1373690756-15156-2-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote: > As we plumb the code with more VM information, it has become more > obvious that the easiest way to deal with bind and unbind is to simply > put the function pointers in the vm, and let those choose the correct > way to handle the page table updates. This change allows many places in > the code to simply be vm->bind, and not have to worry about > distinguishing PPGTT vs GGTT. > > NOTE: At some point in the future, brining back insert_entries may in > fact be desirable in order to use 1 bind/unbind for multiple generations > of PPGTT. For now however, it's just not necessary. I need to check the -internal tree again, but I'm rather sure that we need ->insert_entries. In that case I don't want to remove it here in the upstream tree since I have no intention to carry the re-add patch in -internal ;-) > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_drv.h | 9 +++++ > drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++ > 2 files changed, 81 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index e6694ae..c2a9c98 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -484,9 +484,18 @@ struct i915_address_space { > /* FIXME: Need a more generic return type */ > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr, > enum i915_cache_level level); > + > + /** Unmap an object from an address space. This usually consists of > + * setting the valid PTE entries to a reserved scratch page. */ > + void (*unbind_object)(struct i915_address_space *vm, > + struct drm_i915_gem_object *obj); void (*unbind_vma)(struct i915_vma *vma); void (*bind_vma)(struct i915_vma *vma, enum i915_cache_level cache_level); I think if you do this as a follow-up we might as well bikeshed the interface a bit. Again (I know, broken record) for me it feels semantically much cleaner to talk about binding/unbindinig a vma instead of an (obj, vm) pair ... > void (*clear_range)(struct i915_address_space *vm, > unsigned int first_entry, > unsigned int num_entries); > + /* Map an object into an address space with the given cache flags. */ > + void (*bind_object)(struct i915_address_space *vm, > + struct drm_i915_gem_object *obj, > + enum i915_cache_level cache_level); > void (*insert_entries)(struct i915_address_space *vm, > struct sg_table *st, > unsigned int first_entry, > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index c0d0223..31ff971 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -45,6 +45,12 @@ > #define GEN6_PTE_CACHE_LLC_MLC (3 << 1) > #define GEN6_PTE_ADDR_ENCODE(addr) GEN6_GTT_ADDR_ENCODE(addr) > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > + struct drm_i915_gem_object *obj, > + enum i915_cache_level cache_level); > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm, > + struct drm_i915_gem_object *obj); > + > static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, > enum i915_cache_level level) > { > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) > } > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; > ppgtt->enable = gen6_ppgtt_enable; > + ppgtt->base.unbind_object = gen6_ppgtt_unbind_object; > ppgtt->base.clear_range = gen6_ppgtt_clear_range; > + ppgtt->base.bind_object = gen6_ppgtt_bind_object; > ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; > ppgtt->base.cleanup = gen6_ppgtt_cleanup; > ppgtt->base.scratch = dev_priv->gtt.base.scratch; > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > cache_level); > } > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > + struct drm_i915_gem_object *obj, > + enum i915_cache_level cache_level) > +{ > + const unsigned long entry = i915_gem_obj_offset(obj, vm); > + > + gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT, > + cache_level); > + obj->has_aliasing_ppgtt_mapping = 1; Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a bit wrong here. Either we do the ppgtt insert_entries call as part of the global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special global gtt binding call for execbuf. Thinking about this some more we might need bind flags with #define VMA_BIND_CPU (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */ #define VMA_BIND_GPU (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */ since otherwise we can't properly encapsulate the aliasing ppgtt binding logic into vm->bind. So in the end we'd have void ggtt_bind_vma(vma, bind_flags, cache_level) { ggtt_vm = vma->vm; WARN_ON(ggtt_vm != &dev_priv->gtt.base); if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & BIND_CPU)) && !obj->has_global_gtt_mapping) { ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_leve); vma->obj->has_global_gtt_mapping = true; } if ((ggtt_vm->aliasing_ppgtt && (bind_flags & BIND_GPU)) && !obj->has_ppgtt_mapping) { ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj, vma->node.start, cache_leve); vma->obj->has_ppgtt_mapping = true; } } Obviously completely untested, but I hope I could get the idea accross. Cheers, Daniel
On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote: > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote: > > As we plumb the code with more VM information, it has become more > > obvious that the easiest way to deal with bind and unbind is to simply > > put the function pointers in the vm, and let those choose the correct > > way to handle the page table updates. This change allows many places in > > the code to simply be vm->bind, and not have to worry about > > distinguishing PPGTT vs GGTT. > > > > NOTE: At some point in the future, brining back insert_entries may in > > fact be desirable in order to use 1 bind/unbind for multiple generations > > of PPGTT. For now however, it's just not necessary. > > I need to check the -internal tree again, but I'm rather sure that we need > ->insert_entries. In that case I don't want to remove it here in the > upstream tree since I have no intention to carry the re-add patch in > -internal ;-) We do use it for i915_ppgtt_bind_object(), however it should be easily fixable since the mini-series is exactly about removing i915_ppgtt_bind_object, and making into vm->bind_object. I think it's fair if you ask me to fix this up on -internal as well, before merging it, but with that one exception - I still believe this is the right direction to go in. > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > --- > > drivers/gpu/drm/i915/i915_drv.h | 9 +++++ > > drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++ > > 2 files changed, 81 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index e6694ae..c2a9c98 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -484,9 +484,18 @@ struct i915_address_space { > > /* FIXME: Need a more generic return type */ > > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr, > > enum i915_cache_level level); > > + > > + /** Unmap an object from an address space. This usually consists of > > + * setting the valid PTE entries to a reserved scratch page. */ > > + void (*unbind_object)(struct i915_address_space *vm, > > + struct drm_i915_gem_object *obj); > > void (*unbind_vma)(struct i915_vma *vma); > void (*bind_vma)(struct i915_vma *vma, > enum i915_cache_level cache_level); > > I think if you do this as a follow-up we might as well bikeshed the > interface a bit. Again (I know, broken record) for me it feels > semantically much cleaner to talk about binding/unbindinig a vma instead > of an (obj, vm) pair ... > So as mentioned (and I haven't yet responded to the other email, but I'll be broken record there also) - I don't disagree with you. My argument is the performance difference should be negligible, and the code as is, is decently tested. Changing this requires changing so much, I'd rather do the conversion on top. See the other mail thread for more... > > void (*clear_range)(struct i915_address_space *vm, > > unsigned int first_entry, > > unsigned int num_entries); > > + /* Map an object into an address space with the given cache flags. */ > > + void (*bind_object)(struct i915_address_space *vm, > > + struct drm_i915_gem_object *obj, > > + enum i915_cache_level cache_level); > > void (*insert_entries)(struct i915_address_space *vm, > > struct sg_table *st, > > unsigned int first_entry, > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > index c0d0223..31ff971 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > @@ -45,6 +45,12 @@ > > #define GEN6_PTE_CACHE_LLC_MLC (3 << 1) > > #define GEN6_PTE_ADDR_ENCODE(addr) GEN6_GTT_ADDR_ENCODE(addr) > > > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > > + struct drm_i915_gem_object *obj, > > + enum i915_cache_level cache_level); > > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm, > > + struct drm_i915_gem_object *obj); > > + > > static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, > > enum i915_cache_level level) > > { > > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) > > } > > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; > > ppgtt->enable = gen6_ppgtt_enable; > > + ppgtt->base.unbind_object = gen6_ppgtt_unbind_object; > > ppgtt->base.clear_range = gen6_ppgtt_clear_range; > > + ppgtt->base.bind_object = gen6_ppgtt_bind_object; > > ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; > > ppgtt->base.cleanup = gen6_ppgtt_cleanup; > > ppgtt->base.scratch = dev_priv->gtt.base.scratch; > > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > cache_level); > > } > > > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > > + struct drm_i915_gem_object *obj, > > + enum i915_cache_level cache_level) > > +{ > > + const unsigned long entry = i915_gem_obj_offset(obj, vm); > > + > > + gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT, > > + cache_level); > > + obj->has_aliasing_ppgtt_mapping = 1; > > Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a > bit wrong here. Either we do the ppgtt insert_entries call as part of the > global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special > global gtt binding call for execbuf. > > Thinking about this some more we might need bind flags with > > #define VMA_BIND_CPU (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */ > #define VMA_BIND_GPU (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */ > > since otherwise we can't properly encapsulate the aliasing ppgtt binding > logic into vm->bind. So in the end we'd have > > void ggtt_bind_vma(vma, bind_flags, cache_level) > { > ggtt_vm = vma->vm; > WARN_ON(ggtt_vm != &dev_priv->gtt.base); > > if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & BIND_CPU)) && > !obj->has_global_gtt_mapping) { > ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_leve); > vma->obj->has_global_gtt_mapping = true; > } > > if ((ggtt_vm->aliasing_ppgtt && (bind_flags & BIND_GPU)) && > !obj->has_ppgtt_mapping) { > ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj, > vma->node.start, > cache_leve); > vma->obj->has_ppgtt_mapping = true; > } > } > > Obviously completely untested, but I hope I could get the idea accross. > > Cheers, Daniel To me, aliasing ppgtt is just a wart that doesn't fit well with anything. As such, my plan was to hide as much of it as possible in ggtt functions. Using some kind of flag on ggtt_bind() we can determine if the user actually wants ggtt, and if so bind to both, else just use aliasing ppgtt. None of that code appears here because I want to make the diff churn as small as possible, and hadn't completely thought it all through. Now after typing that (and this really did happen), I just looked at your function, and it seems to be more or less exactly what I just typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you really just want one flag which specifies "bind it in the global gtt, sucka" Now having just typed /that/, it was indeed my plan. So as long as nothing really bothers you with the bind/unbind() stuff, I can move forward with a patch on top to fix it.
On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote: > On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote: > > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote: > > > As we plumb the code with more VM information, it has become more > > > obvious that the easiest way to deal with bind and unbind is to simply > > > put the function pointers in the vm, and let those choose the correct > > > way to handle the page table updates. This change allows many places in > > > the code to simply be vm->bind, and not have to worry about > > > distinguishing PPGTT vs GGTT. > > > > > > NOTE: At some point in the future, brining back insert_entries may in > > > fact be desirable in order to use 1 bind/unbind for multiple generations > > > of PPGTT. For now however, it's just not necessary. > > > > I need to check the -internal tree again, but I'm rather sure that we need > > ->insert_entries. In that case I don't want to remove it here in the > > upstream tree since I have no intention to carry the re-add patch in > > -internal ;-) > > We do use it for i915_ppgtt_bind_object(), however it should be easily > fixable since the mini-series is exactly about removing > i915_ppgtt_bind_object, and making into vm->bind_object. I think it's > fair if you ask me to fix this up on -internal as well, before merging > it, but with that one exception - I still believe this is the right > direction to go in. > > > > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > --- > > > drivers/gpu/drm/i915/i915_drv.h | 9 +++++ > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 81 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index e6694ae..c2a9c98 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -484,9 +484,18 @@ struct i915_address_space { > > > /* FIXME: Need a more generic return type */ > > > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr, > > > enum i915_cache_level level); > > > + > > > + /** Unmap an object from an address space. This usually consists of > > > + * setting the valid PTE entries to a reserved scratch page. */ > > > + void (*unbind_object)(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj); > > > > void (*unbind_vma)(struct i915_vma *vma); > > void (*bind_vma)(struct i915_vma *vma, > > enum i915_cache_level cache_level); > > > > I think if you do this as a follow-up we might as well bikeshed the > > interface a bit. Again (I know, broken record) for me it feels > > semantically much cleaner to talk about binding/unbindinig a vma instead > > of an (obj, vm) pair ... > > > > So as mentioned (and I haven't yet responded to the other email, but > I'll be broken record there also) - I don't disagree with you. My > argument is the performance difference should be negligible, and the code > as is, is decently tested. Changing this requires changing so much, I'd > rather do the conversion on top. See the other mail thread for more... > > > > void (*clear_range)(struct i915_address_space *vm, > > > unsigned int first_entry, > > > unsigned int num_entries); > > > + /* Map an object into an address space with the given cache flags. */ > > > + void (*bind_object)(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj, > > > + enum i915_cache_level cache_level); > > > void (*insert_entries)(struct i915_address_space *vm, > > > struct sg_table *st, > > > unsigned int first_entry, > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > index c0d0223..31ff971 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > @@ -45,6 +45,12 @@ > > > #define GEN6_PTE_CACHE_LLC_MLC (3 << 1) > > > #define GEN6_PTE_ADDR_ENCODE(addr) GEN6_GTT_ADDR_ENCODE(addr) > > > > > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj, > > > + enum i915_cache_level cache_level); > > > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj); > > > + > > > static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, > > > enum i915_cache_level level) > > > { > > > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) > > > } > > > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; > > > ppgtt->enable = gen6_ppgtt_enable; > > > + ppgtt->base.unbind_object = gen6_ppgtt_unbind_object; > > > ppgtt->base.clear_range = gen6_ppgtt_clear_range; > > > + ppgtt->base.bind_object = gen6_ppgtt_bind_object; > > > ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; > > > ppgtt->base.cleanup = gen6_ppgtt_cleanup; > > > ppgtt->base.scratch = dev_priv->gtt.base.scratch; > > > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > > cache_level); > > > } > > > > > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj, > > > + enum i915_cache_level cache_level) > > > +{ > > > + const unsigned long entry = i915_gem_obj_offset(obj, vm); > > > + > > > + gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT, > > > + cache_level); > > > + obj->has_aliasing_ppgtt_mapping = 1; > > > > Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a > > bit wrong here. Either we do the ppgtt insert_entries call as part of the > > global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special > > global gtt binding call for execbuf. > > > > Thinking about this some more we might need bind flags with > > > > #define VMA_BIND_CPU (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */ > > #define VMA_BIND_GPU (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */ > > > > since otherwise we can't properly encapsulate the aliasing ppgtt binding > > logic into vm->bind. So in the end we'd have > > > > void ggtt_bind_vma(vma, bind_flags, cache_level) > > { > > ggtt_vm = vma->vm; > > WARN_ON(ggtt_vm != &dev_priv->gtt.base); > > > > if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & BIND_CPU)) && > > !obj->has_global_gtt_mapping) { > > ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_leve); > > vma->obj->has_global_gtt_mapping = true; > > } > > > > if ((ggtt_vm->aliasing_ppgtt && (bind_flags & BIND_GPU)) && > > !obj->has_ppgtt_mapping) { > > ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj, > > vma->node.start, > > cache_leve); > > vma->obj->has_ppgtt_mapping = true; > > } > > } > > > > Obviously completely untested, but I hope I could get the idea accross. > > > > Cheers, Daniel > > To me, aliasing ppgtt is just a wart that doesn't fit well with > anything. As such, my plan was to hide as much of it as possible in ggtt > functions. Using some kind of flag on ggtt_bind() we can determine if > the user actually wants ggtt, and if so bind to both, else just use > aliasing ppgtt. None of that code appears here because I want to make > the diff churn as small as possible, and hadn't completely thought it > all through. > > Now after typing that (and this really did happen), I just looked at > your function, and it seems to be more or less exactly what I just > typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you > really just want one flag which specifies "bind it in the global gtt, > sucka" > > Now having just typed /that/, it was indeed my plan. So as long as > nothing really bothers you with the bind/unbind() stuff, I can move > forward with a patch on top to fix it. > I changed my mind already. A patch on top doesn't make sense. I'll try to fix this one up as is.
On Mon, Jul 15, 2013 at 09:00:54PM -0700, Ben Widawsky wrote: > On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote: > > To me, aliasing ppgtt is just a wart that doesn't fit well with > > anything. As such, my plan was to hide as much of it as possible in ggtt > > functions. Using some kind of flag on ggtt_bind() we can determine if > > the user actually wants ggtt, and if so bind to both, else just use > > aliasing ppgtt. None of that code appears here because I want to make > > the diff churn as small as possible, and hadn't completely thought it > > all through. > > > > Now after typing that (and this really did happen), I just looked at > > your function, and it seems to be more or less exactly what I just > > typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you > > really just want one flag which specifies "bind it in the global gtt, > > sucka" Feel free to bikeshed the names, I tend to not be really good at that ;-) I guess a flag to switch between binding to ppgtt or ggtt would also work, I just slowly started to dislike make bool arguments and favour explicitly named flags/enums more. Better self-documenting code ... > > Now having just typed /that/, it was indeed my plan. So as long as > > nothing really bothers you with the bind/unbind() stuff, I can move > > forward with a patch on top to fix it. > > > > I changed my mind already. A patch on top doesn't make sense. I'll try > to fix this one up as is. Yeah, overall the thing that irks me is that you've added ppgtt bind functions despite that we don't yet bind anything into any real ppgtt address space. I'd recommend to just implement the ggtt bind functions (with the aliasing crap added) and then add the ppgtt bind functions once we get nearer to actually using them for real. -Daniel
On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote: > On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote: > > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote: > > > As we plumb the code with more VM information, it has become more > > > obvious that the easiest way to deal with bind and unbind is to simply > > > put the function pointers in the vm, and let those choose the correct > > > way to handle the page table updates. This change allows many places in > > > the code to simply be vm->bind, and not have to worry about > > > distinguishing PPGTT vs GGTT. > > > > > > NOTE: At some point in the future, brining back insert_entries may in > > > fact be desirable in order to use 1 bind/unbind for multiple generations > > > of PPGTT. For now however, it's just not necessary. > > > > I need to check the -internal tree again, but I'm rather sure that we need > > ->insert_entries. In that case I don't want to remove it here in the > > upstream tree since I have no intention to carry the re-add patch in > > -internal ;-) > > We do use it for i915_ppgtt_bind_object(), however it should be easily > fixable since the mini-series is exactly about removing > i915_ppgtt_bind_object, and making into vm->bind_object. I think it's > fair if you ask me to fix this up on -internal as well, before merging > it, but with that one exception - I still believe this is the right > direction to go in. My idea behind ->bind was that we could us this to hide the aliasing ppgtt stuff a bit, and otherwise keep things exactly as-is. I haven't actually looked at -internal to check whether it's as ugly as I expect ;-) So if you promise to fix up -internal if I come screaming around due to rebase breakage I'm ok with either option. > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > --- > > > drivers/gpu/drm/i915/i915_drv.h | 9 +++++ > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 81 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index e6694ae..c2a9c98 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -484,9 +484,18 @@ struct i915_address_space { > > > /* FIXME: Need a more generic return type */ > > > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr, > > > enum i915_cache_level level); > > > + > > > + /** Unmap an object from an address space. This usually consists of > > > + * setting the valid PTE entries to a reserved scratch page. */ > > > + void (*unbind_object)(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj); > > > > void (*unbind_vma)(struct i915_vma *vma); > > void (*bind_vma)(struct i915_vma *vma, > > enum i915_cache_level cache_level); > > > > I think if you do this as a follow-up we might as well bikeshed the > > interface a bit. Again (I know, broken record) for me it feels > > semantically much cleaner to talk about binding/unbindinig a vma instead > > of an (obj, vm) pair ... > > > > So as mentioned (and I haven't yet responded to the other email, but > I'll be broken record there also) - I don't disagree with you. My > argument is the performance difference should be negligible, and the code > as is, is decently tested. Changing this requires changing so much, I'd > rather do the conversion on top. See the other mail thread for more... Yeah, I agree with the testing argument, sometimes I just want the Perfect Patch a bit too much ;-) -Daniel
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e6694ae..c2a9c98 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -484,9 +484,18 @@ struct i915_address_space { /* FIXME: Need a more generic return type */ gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr, enum i915_cache_level level); + + /** Unmap an object from an address space. This usually consists of + * setting the valid PTE entries to a reserved scratch page. */ + void (*unbind_object)(struct i915_address_space *vm, + struct drm_i915_gem_object *obj); void (*clear_range)(struct i915_address_space *vm, unsigned int first_entry, unsigned int num_entries); + /* Map an object into an address space with the given cache flags. */ + void (*bind_object)(struct i915_address_space *vm, + struct drm_i915_gem_object *obj, + enum i915_cache_level cache_level); void (*insert_entries)(struct i915_address_space *vm, struct sg_table *st, unsigned int first_entry, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index c0d0223..31ff971 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -45,6 +45,12 @@ #define GEN6_PTE_CACHE_LLC_MLC (3 << 1) #define GEN6_PTE_ADDR_ENCODE(addr) GEN6_GTT_ADDR_ENCODE(addr) +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj, + enum i915_cache_level cache_level); +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj); + static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, enum i915_cache_level level) { @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) } ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; ppgtt->enable = gen6_ppgtt_enable; + ppgtt->base.unbind_object = gen6_ppgtt_unbind_object; ppgtt->base.clear_range = gen6_ppgtt_clear_range; + ppgtt->base.bind_object = gen6_ppgtt_bind_object; ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; ppgtt->base.cleanup = gen6_ppgtt_cleanup; ppgtt->base.scratch = dev_priv->gtt.base.scratch; @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, cache_level); } +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj, + enum i915_cache_level cache_level) +{ + const unsigned long entry = i915_gem_obj_offset(obj, vm); + + gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT, + cache_level); + obj->has_aliasing_ppgtt_mapping = 1; +} + void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj) { @@ -407,6 +426,16 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, obj->base.size >> PAGE_SHIFT); } +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj) +{ + const unsigned long entry = i915_gem_obj_offset(obj, vm); + + gen6_ppgtt_clear_range(vm, entry >> PAGE_SHIFT, + obj->base.size >> PAGE_SHIFT); + obj->has_aliasing_ppgtt_mapping = 0; +} + extern int intel_iommu_gfx_mapped; /* Certain Gen5 chipsets require require idling the GPU before * unmapping anything from the GTT when VT-d is enabled. @@ -555,6 +584,18 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm, } +static void i915_ggtt_bind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj, + enum i915_cache_level cache_level) +{ + const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; + unsigned int flags = (cache_level == I915_CACHE_NONE) ? + AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY; + + BUG_ON(!i915_is_ggtt(vm)); + intel_gtt_insert_sg_entries(obj->pages, entry, flags); +} + static void i915_ggtt_clear_range(struct i915_address_space *vm, unsigned int first_entry, unsigned int num_entries) @@ -562,6 +603,24 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm, intel_gtt_clear_range(first_entry, num_entries); } +static void i915_ggtt_unbind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj) +{ + const unsigned int first = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; + const unsigned int size = obj->base.size >> PAGE_SHIFT; + + BUG_ON(!i915_is_ggtt(vm)); + intel_gtt_clear_range(first, size); +} + +static void gen6_ggtt_bind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj, + enum i915_cache_level cache_level) +{ + const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; + gen6_ggtt_insert_entries(vm, obj->pages, entry, cache_level); + obj->has_global_gtt_mapping = 1; +} void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level) @@ -590,6 +649,15 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj) obj->has_global_gtt_mapping = 0; } +static void gen6_ggtt_unbind_object(struct i915_address_space *vm, + struct drm_i915_gem_object *obj) +{ + const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; + + gen6_ggtt_clear_range(vm, entry, obj->base.size >> PAGE_SHIFT); + obj->has_global_gtt_mapping = 0; +} + void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj) { struct drm_device *dev = obj->base.dev; @@ -823,7 +891,9 @@ static int gen6_gmch_probe(struct drm_device *dev, DRM_ERROR("Scratch setup failed\n"); dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range; + dev_priv->gtt.base.unbind_object = gen6_ggtt_unbind_object; dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries; + dev_priv->gtt.base.bind_object = gen6_ggtt_bind_object; return ret; } @@ -855,7 +925,9 @@ static int i915_gmch_probe(struct drm_device *dev, dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev); dev_priv->gtt.base.clear_range = i915_ggtt_clear_range; + dev_priv->gtt.base.unbind_object = i915_ggtt_unbind_object; dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries; + dev_priv->gtt.base.bind_object = i915_ggtt_bind_object; return 0; }
As we plumb the code with more VM information, it has become more obvious that the easiest way to deal with bind and unbind is to simply put the function pointers in the vm, and let those choose the correct way to handle the page table updates. This change allows many places in the code to simply be vm->bind, and not have to worry about distinguishing PPGTT vs GGTT. NOTE: At some point in the future, brining back insert_entries may in fact be desirable in order to use 1 bind/unbind for multiple generations of PPGTT. For now however, it's just not necessary. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_drv.h | 9 +++++ drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++ 2 files changed, 81 insertions(+)