Message ID | 20210412175048.51689-1-thomas.hellstrom@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] drm/i915: Simplify userptr locking | expand |
On Mon, Apr 12, 2021 at 7:51 PM Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote: > > Use an rwlock instead of spinlock for the global notifier lock > to reduce risk of contention in execbuf. > > Protect object state with the object lock whenever possible rather > than with the global notifier lock > > Don't take an explicit page_ref in userptr_submit_init() but rather > call get_pages() after obtaining the page list so that > get_pages() holds the page_ref. This means we don't need to call > userptr_submit_fini(), which is needed to avoid awkward locking > in our upcoming VM_BIND code. > > CC'ing a broader audience since userptr handling is something we > probably want to unify across drivers at some point. > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > Cc: Daniel Vetter <daniel@ffwll.ch> > Cc: Dave Airlie <airlied@gmail.com> > Cc: dri-devel@lists.freedesktop.org > Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> So I was panicking when I've seen this first because of the RFC and feared it's the big discussion about which userptr semantics exactly we should aim for. Which is something we really should nail. But looking now again it's really that you remove some optional paths, and the core bit with mmu_interval_read_retry is still there. It's just mmu_interval_check_retry that goes out. So no panic from me on this. But ideally I think needs Maarten to review the details. Wrt the panic topic, I think there's 3 legit ways to do userptr: - using pin_user_page with PIN_LONGTERM - using hmm/mmu_notifier_range, gpu page faults and tlb invalidate, with no dma_fence and no page refcounts at all - with mmu_notifier_range and dma_fence, and I think there the big question is whether we should get some page references. I think it would be great if we get as close as possible to the case with real page faults/tlb invalidate, so no page references, since holding random references. I think ideally we'd at least document this, ideally with helpers used by drivers, also I'd like a few more ponies :-) Cheers, Daniel > --- > .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 21 +++--- > drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 - > drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 72 ++++++------------- > drivers/gpu/drm/i915/i915_drv.h | 2 +- > 4 files changed, 31 insertions(+), 66 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > index 5964e67c7d36..d3604ddfa93b 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > @@ -994,7 +994,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle) > } > } > > -static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release_userptr) > +static void eb_release_vmas(struct i915_execbuffer *eb, bool final) > { > const unsigned int count = eb->buffer_count; > unsigned int i; > @@ -1008,11 +1008,6 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release > > eb_unreserve_vma(ev); > > - if (release_userptr && ev->flags & __EXEC_OBJECT_USERPTR_INIT) { > - ev->flags &= ~__EXEC_OBJECT_USERPTR_INIT; > - i915_gem_object_userptr_submit_fini(vma->obj); > - } > - > if (final) > i915_vma_put(vma); > } > @@ -1990,7 +1985,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, > } > > /* We may process another execbuffer during the unlock... */ > - eb_release_vmas(eb, false, true); > + eb_release_vmas(eb, false); > i915_gem_ww_ctx_fini(&eb->ww); > > if (rq) { > @@ -2094,7 +2089,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, > > err: > if (err == -EDEADLK) { > - eb_release_vmas(eb, false, false); > + eb_release_vmas(eb, false); > err = i915_gem_ww_ctx_backoff(&eb->ww); > if (!err) > goto repeat_validate; > @@ -2191,7 +2186,7 @@ static int eb_relocate_parse(struct i915_execbuffer *eb) > > err: > if (err == -EDEADLK) { > - eb_release_vmas(eb, false, false); > + eb_release_vmas(eb, false); > err = i915_gem_ww_ctx_backoff(&eb->ww); > if (!err) > goto retry; > @@ -2268,7 +2263,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) > > #ifdef CONFIG_MMU_NOTIFIER > if (!err && (eb->args->flags & __EXEC_USERPTR_USED)) { > - spin_lock(&eb->i915->mm.notifier_lock); > + read_lock(&eb->i915->mm.notifier_lock); > > /* > * count is always at least 1, otherwise __EXEC_USERPTR_USED > @@ -2286,7 +2281,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) > break; > } > > - spin_unlock(&eb->i915->mm.notifier_lock); > + read_unlock(&eb->i915->mm.notifier_lock); > } > #endif > > @@ -3435,7 +3430,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, > > err = eb_lookup_vmas(&eb); > if (err) { > - eb_release_vmas(&eb, true, true); > + eb_release_vmas(&eb, true); > goto err_engine; > } > > @@ -3528,7 +3523,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, > i915_request_put(eb.request); > > err_vma: > - eb_release_vmas(&eb, true, true); > + eb_release_vmas(&eb, true); > if (eb.trampoline) > i915_vma_unpin(eb.trampoline); > WARN_ON(err == -EDEADLK); > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h > index 2ebd79537aea..aea750142272 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > @@ -596,14 +596,12 @@ i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) > > int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj); > int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj); > -void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj); > int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj); > #else > static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) { return false; } > > static inline int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; } > static inline int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; } > -static inline void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); } > static inline int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; } > > #endif > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > index a657b99ec760..c361669a5491 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > @@ -67,11 +67,11 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni, > if (!mmu_notifier_range_blockable(range)) > return false; > > - spin_lock(&i915->mm.notifier_lock); > + write_lock(&i915->mm.notifier_lock); > > mmu_interval_set_seq(mni, cur_seq); > > - spin_unlock(&i915->mm.notifier_lock); > + write_unlock(&i915->mm.notifier_lock); > > /* > * We don't wait when the process is exiting. This is valid > @@ -107,16 +107,15 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj) > > static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj) > { > - struct drm_i915_private *i915 = to_i915(obj->base.dev); > struct page **pvec = NULL; > > - spin_lock(&i915->mm.notifier_lock); > + assert_object_held_shared(obj); > + > if (!--obj->userptr.page_ref) { > pvec = obj->userptr.pvec; > obj->userptr.pvec = NULL; > } > GEM_BUG_ON(obj->userptr.page_ref < 0); > - spin_unlock(&i915->mm.notifier_lock); > > if (pvec) { > const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; > @@ -128,7 +127,6 @@ static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj) > > static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj) > { > - struct drm_i915_private *i915 = to_i915(obj->base.dev); > const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; > unsigned int max_segment = i915_sg_segment_size(); > struct sg_table *st; > @@ -141,16 +139,13 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj) > if (!st) > return -ENOMEM; > > - spin_lock(&i915->mm.notifier_lock); > - if (GEM_WARN_ON(!obj->userptr.page_ref)) { > - spin_unlock(&i915->mm.notifier_lock); > - ret = -EFAULT; > + if (!obj->userptr.page_ref) { > + ret = -EAGAIN; > goto err_free; > } > > obj->userptr.page_ref++; > pvec = obj->userptr.pvec; > - spin_unlock(&i915->mm.notifier_lock); > > alloc_table: > sg = __sg_alloc_table_from_pages(st, pvec, num_pages, 0, > @@ -241,7 +236,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj, > i915_gem_object_userptr_drop_ref(obj); > } > > -static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool get_pages) > +static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj) > { > struct sg_table *pages; > int err; > @@ -259,15 +254,11 @@ static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool > if (!IS_ERR_OR_NULL(pages)) > i915_gem_userptr_put_pages(obj, pages); > > - if (get_pages) > - err = ____i915_gem_object_get_pages(obj); > - > return err; > } > > int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) > { > - struct drm_i915_private *i915 = to_i915(obj->base.dev); > const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; > struct page **pvec; > unsigned int gup_flags = 0; > @@ -277,39 +268,22 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) > if (obj->userptr.notifier.mm != current->mm) > return -EFAULT; > > + notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier); > + > ret = i915_gem_object_lock_interruptible(obj, NULL); > if (ret) > return ret; > > - /* optimistically try to preserve current pages while unlocked */ > - if (i915_gem_object_has_pages(obj) && > - !mmu_interval_check_retry(&obj->userptr.notifier, > - obj->userptr.notifier_seq)) { > - spin_lock(&i915->mm.notifier_lock); > - if (obj->userptr.pvec && > - !mmu_interval_read_retry(&obj->userptr.notifier, > - obj->userptr.notifier_seq)) { > - obj->userptr.page_ref++; > - > - /* We can keep using the current binding, this is the fastpath */ > - ret = 1; > - } > - spin_unlock(&i915->mm.notifier_lock); > + if (notifier_seq == obj->userptr.notifier_seq && obj->userptr.pvec) { > + i915_gem_object_unlock(obj); > + return 0; > } > > - if (!ret) { > - /* Make sure userptr is unbound for next attempt, so we don't use stale pages. */ > - ret = i915_gem_object_userptr_unbind(obj, false); > - } > + ret = i915_gem_object_userptr_unbind(obj); > i915_gem_object_unlock(obj); > - if (ret < 0) > + if (ret) > return ret; > > - if (ret > 0) > - return 0; > - > - notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier); > - > pvec = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL); > if (!pvec) > return -ENOMEM; > @@ -329,7 +303,9 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) > } > ret = 0; > > - spin_lock(&i915->mm.notifier_lock); > + ret = i915_gem_object_lock_interruptible(obj, NULL); > + if (ret) > + goto out; > > if (mmu_interval_read_retry(&obj->userptr.notifier, > !obj->userptr.page_ref ? notifier_seq : > @@ -341,12 +317,14 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) > if (!obj->userptr.page_ref++) { > obj->userptr.pvec = pvec; > obj->userptr.notifier_seq = notifier_seq; > - > pvec = NULL; > + ret = ____i915_gem_object_get_pages(obj); > } > > + obj->userptr.page_ref--; > + > out_unlock: > - spin_unlock(&i915->mm.notifier_lock); > + i915_gem_object_unlock(obj); > > out: > if (pvec) { > @@ -369,11 +347,6 @@ int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj) > return 0; > } > > -void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) > -{ > - i915_gem_object_userptr_drop_ref(obj); > -} > - > int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) > { > int err; > @@ -396,7 +369,6 @@ int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) > i915_gem_object_unlock(obj); > } > > - i915_gem_object_userptr_submit_fini(obj); > return err; > } > > @@ -572,7 +544,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, > int i915_gem_init_userptr(struct drm_i915_private *dev_priv) > { > #ifdef CONFIG_MMU_NOTIFIER > - spin_lock_init(&dev_priv->mm.notifier_lock); > + rwlock_init(&dev_priv->mm.notifier_lock); > #endif > > return 0; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 69e43bf91a15..37e784840864 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -560,7 +560,7 @@ struct i915_gem_mm { > * notifier_lock for mmu notifiers, memory may not be allocated > * while holding this lock. > */ > - spinlock_t notifier_lock; > + rwlock_t notifier_lock; > #endif > > /* shrinker accounting, also useful for userland debugging */ > -- > 2.30.2 >
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 5964e67c7d36..d3604ddfa93b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -994,7 +994,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle) } } -static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release_userptr) +static void eb_release_vmas(struct i915_execbuffer *eb, bool final) { const unsigned int count = eb->buffer_count; unsigned int i; @@ -1008,11 +1008,6 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release eb_unreserve_vma(ev); - if (release_userptr && ev->flags & __EXEC_OBJECT_USERPTR_INIT) { - ev->flags &= ~__EXEC_OBJECT_USERPTR_INIT; - i915_gem_object_userptr_submit_fini(vma->obj); - } - if (final) i915_vma_put(vma); } @@ -1990,7 +1985,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, } /* We may process another execbuffer during the unlock... */ - eb_release_vmas(eb, false, true); + eb_release_vmas(eb, false); i915_gem_ww_ctx_fini(&eb->ww); if (rq) { @@ -2094,7 +2089,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, err: if (err == -EDEADLK) { - eb_release_vmas(eb, false, false); + eb_release_vmas(eb, false); err = i915_gem_ww_ctx_backoff(&eb->ww); if (!err) goto repeat_validate; @@ -2191,7 +2186,7 @@ static int eb_relocate_parse(struct i915_execbuffer *eb) err: if (err == -EDEADLK) { - eb_release_vmas(eb, false, false); + eb_release_vmas(eb, false); err = i915_gem_ww_ctx_backoff(&eb->ww); if (!err) goto retry; @@ -2268,7 +2263,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) #ifdef CONFIG_MMU_NOTIFIER if (!err && (eb->args->flags & __EXEC_USERPTR_USED)) { - spin_lock(&eb->i915->mm.notifier_lock); + read_lock(&eb->i915->mm.notifier_lock); /* * count is always at least 1, otherwise __EXEC_USERPTR_USED @@ -2286,7 +2281,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) break; } - spin_unlock(&eb->i915->mm.notifier_lock); + read_unlock(&eb->i915->mm.notifier_lock); } #endif @@ -3435,7 +3430,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, err = eb_lookup_vmas(&eb); if (err) { - eb_release_vmas(&eb, true, true); + eb_release_vmas(&eb, true); goto err_engine; } @@ -3528,7 +3523,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, i915_request_put(eb.request); err_vma: - eb_release_vmas(&eb, true, true); + eb_release_vmas(&eb, true); if (eb.trampoline) i915_vma_unpin(eb.trampoline); WARN_ON(err == -EDEADLK); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 2ebd79537aea..aea750142272 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -596,14 +596,12 @@ i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj); int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj); -void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj); int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj); #else static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) { return false; } static inline int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; } static inline int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; } -static inline void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); } static inline int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; } #endif diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index a657b99ec760..c361669a5491 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -67,11 +67,11 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni, if (!mmu_notifier_range_blockable(range)) return false; - spin_lock(&i915->mm.notifier_lock); + write_lock(&i915->mm.notifier_lock); mmu_interval_set_seq(mni, cur_seq); - spin_unlock(&i915->mm.notifier_lock); + write_unlock(&i915->mm.notifier_lock); /* * We don't wait when the process is exiting. This is valid @@ -107,16 +107,15 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj) static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); struct page **pvec = NULL; - spin_lock(&i915->mm.notifier_lock); + assert_object_held_shared(obj); + if (!--obj->userptr.page_ref) { pvec = obj->userptr.pvec; obj->userptr.pvec = NULL; } GEM_BUG_ON(obj->userptr.page_ref < 0); - spin_unlock(&i915->mm.notifier_lock); if (pvec) { const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; @@ -128,7 +127,6 @@ static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj) static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; unsigned int max_segment = i915_sg_segment_size(); struct sg_table *st; @@ -141,16 +139,13 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj) if (!st) return -ENOMEM; - spin_lock(&i915->mm.notifier_lock); - if (GEM_WARN_ON(!obj->userptr.page_ref)) { - spin_unlock(&i915->mm.notifier_lock); - ret = -EFAULT; + if (!obj->userptr.page_ref) { + ret = -EAGAIN; goto err_free; } obj->userptr.page_ref++; pvec = obj->userptr.pvec; - spin_unlock(&i915->mm.notifier_lock); alloc_table: sg = __sg_alloc_table_from_pages(st, pvec, num_pages, 0, @@ -241,7 +236,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj, i915_gem_object_userptr_drop_ref(obj); } -static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool get_pages) +static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj) { struct sg_table *pages; int err; @@ -259,15 +254,11 @@ static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool if (!IS_ERR_OR_NULL(pages)) i915_gem_userptr_put_pages(obj, pages); - if (get_pages) - err = ____i915_gem_object_get_pages(obj); - return err; } int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; struct page **pvec; unsigned int gup_flags = 0; @@ -277,39 +268,22 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) if (obj->userptr.notifier.mm != current->mm) return -EFAULT; + notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier); + ret = i915_gem_object_lock_interruptible(obj, NULL); if (ret) return ret; - /* optimistically try to preserve current pages while unlocked */ - if (i915_gem_object_has_pages(obj) && - !mmu_interval_check_retry(&obj->userptr.notifier, - obj->userptr.notifier_seq)) { - spin_lock(&i915->mm.notifier_lock); - if (obj->userptr.pvec && - !mmu_interval_read_retry(&obj->userptr.notifier, - obj->userptr.notifier_seq)) { - obj->userptr.page_ref++; - - /* We can keep using the current binding, this is the fastpath */ - ret = 1; - } - spin_unlock(&i915->mm.notifier_lock); + if (notifier_seq == obj->userptr.notifier_seq && obj->userptr.pvec) { + i915_gem_object_unlock(obj); + return 0; } - if (!ret) { - /* Make sure userptr is unbound for next attempt, so we don't use stale pages. */ - ret = i915_gem_object_userptr_unbind(obj, false); - } + ret = i915_gem_object_userptr_unbind(obj); i915_gem_object_unlock(obj); - if (ret < 0) + if (ret) return ret; - if (ret > 0) - return 0; - - notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier); - pvec = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL); if (!pvec) return -ENOMEM; @@ -329,7 +303,9 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) } ret = 0; - spin_lock(&i915->mm.notifier_lock); + ret = i915_gem_object_lock_interruptible(obj, NULL); + if (ret) + goto out; if (mmu_interval_read_retry(&obj->userptr.notifier, !obj->userptr.page_ref ? notifier_seq : @@ -341,12 +317,14 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) if (!obj->userptr.page_ref++) { obj->userptr.pvec = pvec; obj->userptr.notifier_seq = notifier_seq; - pvec = NULL; + ret = ____i915_gem_object_get_pages(obj); } + obj->userptr.page_ref--; + out_unlock: - spin_unlock(&i915->mm.notifier_lock); + i915_gem_object_unlock(obj); out: if (pvec) { @@ -369,11 +347,6 @@ int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj) return 0; } -void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) -{ - i915_gem_object_userptr_drop_ref(obj); -} - int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) { int err; @@ -396,7 +369,6 @@ int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) i915_gem_object_unlock(obj); } - i915_gem_object_userptr_submit_fini(obj); return err; } @@ -572,7 +544,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, int i915_gem_init_userptr(struct drm_i915_private *dev_priv) { #ifdef CONFIG_MMU_NOTIFIER - spin_lock_init(&dev_priv->mm.notifier_lock); + rwlock_init(&dev_priv->mm.notifier_lock); #endif return 0; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 69e43bf91a15..37e784840864 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -560,7 +560,7 @@ struct i915_gem_mm { * notifier_lock for mmu notifiers, memory may not be allocated * while holding this lock. */ - spinlock_t notifier_lock; + rwlock_t notifier_lock; #endif /* shrinker accounting, also useful for userland debugging */
Use an rwlock instead of spinlock for the global notifier lock to reduce risk of contention in execbuf. Protect object state with the object lock whenever possible rather than with the global notifier lock Don't take an explicit page_ref in userptr_submit_init() but rather call get_pages() after obtaining the page list so that get_pages() holds the page_ref. This means we don't need to call userptr_submit_fini(), which is needed to avoid awkward locking in our upcoming VM_BIND code. CC'ing a broader audience since userptr handling is something we probably want to unify across drivers at some point. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Airlie <airlied@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 21 +++--- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 72 ++++++------------- drivers/gpu/drm/i915/i915_drv.h | 2 +- 4 files changed, 31 insertions(+), 66 deletions(-)