Message ID | 20221110053133.2433412-1-mani@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915: Fix unhandled deadlock in grab_vma() | expand |
On 10/11/2022 05:31, Mani Milani wrote: > At present, the gpu thread crashes at times when grab_vma() attempts to > acquire a gem object lock when in a deadlock state. > > Problems: > I identified the following 4 issues in the current code: > 1. Since grab_vma() calls i915_gem_object_trylock(), which consequently > calls ww_mutex_trylock(), to acquire lock, it does not perform any > -EDEADLK handling; And -EALREADY handling is also unreliable, > according to the description of ww_mutex_trylock(). > 2. Since the return value of grab_vma() is a boolean showing > success/failure, it does not provide any extra information on the > failure reason, and therefore does not provide any mechanism to its > caller to take any action to fix a potential deadlock. > 3. Current grab_vma() implementation produces inconsistent behaviour > depending on the refcount value, without informing the caller. If > refcount is already zero, grab_vma() neither acquires lock nor > increments the refcount, but still returns 'true' for success! This > means that grab_vma() returning true (for success) does not always > mean that the gem obj is actually safely accessible. > 4. Currently, calling "i915_gem_object_lock(obj,ww)" is meant to be > followed by a consequent "i915_gem_object_unlock(obj)" ONLY if the > original 'ww' object pointer was NULL, or otherwise not be called and > leave the houskeeping to "i915_gem_ww_ctx_fini(ww)". There are a few > issues with this: > - This is not documented anywhere in the code (that I could find), > but only explained in an older commit message. > - This produces an inconsistent usage of the lock/unlock functions, > increasing the chance of mistakes and issues. > - This is not a clean design as it requires any new code that calls > these lock/unlock functions to know their internals, as well as the > internals of the functions calling the new code being added. > > Fix: > To fix the issues above, this patch: > 1. Changes grab_vma() to call i915_gem_object_lock() instead of > i915_gem_object_trylock(), to handle -EDEADLK and -EALREADY cases. > This should not cause any issue since the PIN_NONBLOCK flag is > checked beforehand in the 2 cases grab_vma() is called. > 2. Changes grab_vma() to return the actual error code, instead of bool. > 3. Changes grab_vma() to behave consistently when returning success, by > both incrementing the refcount and acquiring lock at all times. > 4. Changes i915_gem_object_unlock() to pair with i915_gem_object_lock() > nicely in all cases and do the housekeeping without the need for the > caller to do anything other than simply calling lock and unlock. > 5. Ensures the gem obj->obj_link is initialized and deleted from the ww > list such that it can be tested for emptiness using list_empty(). > > Signed-off-by: Mani Milani <mani@chromium.org> > --- > > drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 + > drivers/gpu/drm/i915/gem/i915_gem_object.h | 10 ++++- > drivers/gpu/drm/i915/i915_gem_evict.c | 48 ++++++++++++---------- > drivers/gpu/drm/i915/i915_gem_ww.c | 8 ++-- > 4 files changed, 41 insertions(+), 27 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c > index 369006c5317f..69d013b393fb 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c > @@ -78,6 +78,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, > > INIT_LIST_HEAD(&obj->mm.link); > > + INIT_LIST_HEAD(&obj->obj_link); > + > INIT_LIST_HEAD(&obj->lut_list); > spin_lock_init(&obj->lut_lock); > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h > index 1723af9b0f6a..7e7a61bdf52c 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > @@ -219,7 +219,7 @@ static inline bool i915_gem_object_trylock(struct drm_i915_gem_object *obj, > return ww_mutex_trylock(&obj->base.resv->lock, &ww->ctx); > } > > -static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) > +static inline void __i915_gem_object_unlock(struct drm_i915_gem_object *obj) > { > if (obj->ops->adjust_lru) > obj->ops->adjust_lru(obj); > @@ -227,6 +227,14 @@ static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) > dma_resv_unlock(obj->base.resv); > } > > +static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) > +{ > + if (list_empty(&obj->obj_link)) > + __i915_gem_object_unlock(obj); > + else > + i915_gem_ww_unlock_single(obj); > +} > + > static inline void > i915_gem_object_set_readonly(struct drm_i915_gem_object *obj) > { > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > index f025ee4fa526..3eb514b4eddc 100644 > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > @@ -55,29 +55,33 @@ static int ggtt_flush(struct intel_gt *gt) > return intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); > } > > -static bool grab_vma(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) > +static int grab_vma(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) > { > + int err; > + > + /* Dead objects don't need pins */ > + if (dying_vma(vma)) > + atomic_and(~I915_VMA_PIN_MASK, &vma->flags); > + > + err = i915_gem_object_lock(vma->obj, ww); AFAIK the issue here is that we are already holding the vm->mutex, so this can potentially deadlock, which I guess is why this was trylock. We typically grab a bunch of object locks during execbuf, and then grab the vm->mutex, before binding the vma for each object. So vm->mutex is always our inner lock, and the object lock is the outer one. Using a full lock here then inverts that locking AFAICT. Like say if one process is holding object A + vm->mutex and then tries to grab object B here in grab_vma(), but another process is already holding object B + waiting to grab vm->mutex? > + > /* > * We add the extra refcount so the object doesn't drop to zero until > - * after ungrab_vma(), this way trylock is always paired with unlock. > + * after ungrab_vma(), this way lock is always paired with unlock. > */ > - if (i915_gem_object_get_rcu(vma->obj)) { > - if (!i915_gem_object_trylock(vma->obj, ww)) { > - i915_gem_object_put(vma->obj); > - return false; > - } > - } else { > - /* Dead objects don't need pins */ > - atomic_and(~I915_VMA_PIN_MASK, &vma->flags); > - } > + if (!err) > + i915_gem_object_get(vma->obj); > > - return true; > + return err; > } > > static void ungrab_vma(struct i915_vma *vma) > { > - if (dying_vma(vma)) > + if (dying_vma(vma)) { > + /* Dead objects don't need pins */ > + atomic_and(~I915_VMA_PIN_MASK, &vma->flags); > return; > + } > > i915_gem_object_unlock(vma->obj); > i915_gem_object_put(vma->obj); > @@ -93,10 +97,11 @@ mark_free(struct drm_mm_scan *scan, > if (i915_vma_is_pinned(vma)) > return false; > > - if (!grab_vma(vma, ww)) > + if (grab_vma(vma, ww)) > return false; > > list_add(&vma->evict_link, unwind); > + > return drm_mm_scan_add_block(scan, &vma->node); > } > > @@ -284,10 +289,12 @@ i915_gem_evict_something(struct i915_address_space *vm, > vma = container_of(node, struct i915_vma, node); > > /* If we find any non-objects (!vma), we cannot evict them */ > - if (vma->node.color != I915_COLOR_UNEVICTABLE && > - grab_vma(vma, ww)) { > - ret = __i915_vma_unbind(vma); > - ungrab_vma(vma); > + if (vma->node.color != I915_COLOR_UNEVICTABLE) { > + ret = grab_vma(vma, ww); > + if (!ret) { > + ret = __i915_vma_unbind(vma); > + ungrab_vma(vma); > + } > } else { > ret = -ENOSPC; > } > @@ -382,10 +389,9 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, > break; > } > > - if (!grab_vma(vma, ww)) { > - ret = -ENOSPC; > + ret = grab_vma(vma, ww); > + if (ret) > break; > - } > > /* > * Never show fear in the face of dragons! > diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c b/drivers/gpu/drm/i915/i915_gem_ww.c > index 3f6ff139478e..937b279f50fc 100644 > --- a/drivers/gpu/drm/i915/i915_gem_ww.c > +++ b/drivers/gpu/drm/i915/i915_gem_ww.c > @@ -19,16 +19,14 @@ static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww) > struct drm_i915_gem_object *obj; > > while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) { > - list_del(&obj->obj_link); > - i915_gem_object_unlock(obj); > - i915_gem_object_put(obj); > + i915_gem_ww_unlock_single(obj); > } > } > > void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj) > { > - list_del(&obj->obj_link); > - i915_gem_object_unlock(obj); > + list_del_init(&obj->obj_link); > + __i915_gem_object_unlock(obj); > i915_gem_object_put(obj); > } >
On Thu, 2022-11-10 at 14:49 +0000, Matthew Auld wrote: > On 10/11/2022 05:31, Mani Milani wrote: > > At present, the gpu thread crashes at times when grab_vma() > > attempts to > > acquire a gem object lock when in a deadlock state. > > > > Problems: > > I identified the following 4 issues in the current code: > > 1. Since grab_vma() calls i915_gem_object_trylock(), which > > consequently > > calls ww_mutex_trylock(), to acquire lock, it does not perform > > any > > -EDEADLK handling; And -EALREADY handling is also unreliable, > > according to the description of ww_mutex_trylock(). > > 2. Since the return value of grab_vma() is a boolean showing > > success/failure, it does not provide any extra information on > > the > > failure reason, and therefore does not provide any mechanism to > > its > > caller to take any action to fix a potential deadlock. > > 3. Current grab_vma() implementation produces inconsistent > > behaviour > > depending on the refcount value, without informing the caller. > > If > > refcount is already zero, grab_vma() neither acquires lock nor > > increments the refcount, but still returns 'true' for success! > > This > > means that grab_vma() returning true (for success) does not > > always > > mean that the gem obj is actually safely accessible. > > 4. Currently, calling "i915_gem_object_lock(obj,ww)" is meant to be > > followed by a consequent "i915_gem_object_unlock(obj)" ONLY if > > the > > original 'ww' object pointer was NULL, or otherwise not be > > called and > > leave the houskeeping to "i915_gem_ww_ctx_fini(ww)". There are > > a few > > issues with this: > > - This is not documented anywhere in the code (that I could > > find), > > but only explained in an older commit message. > > - This produces an inconsistent usage of the lock/unlock > > functions, > > increasing the chance of mistakes and issues. > > - This is not a clean design as it requires any new code that > > calls > > these lock/unlock functions to know their internals, as well > > as the > > internals of the functions calling the new code being added. > > > > Fix: > > To fix the issues above, this patch: > > 1. Changes grab_vma() to call i915_gem_object_lock() instead of > > i915_gem_object_trylock(), to handle -EDEADLK and -EALREADY > > cases. > > This should not cause any issue since the PIN_NONBLOCK flag is > > checked beforehand in the 2 cases grab_vma() is called. > > 2. Changes grab_vma() to return the actual error code, instead of > > bool. > > 3. Changes grab_vma() to behave consistently when returning > > success, by > > both incrementing the refcount and acquiring lock at all times. > > 4. Changes i915_gem_object_unlock() to pair with > > i915_gem_object_lock() > > nicely in all cases and do the housekeeping without the need > > for the > > caller to do anything other than simply calling lock and > > unlock. > > 5. Ensures the gem obj->obj_link is initialized and deleted from > > the ww > > list such that it can be tested for emptiness using > > list_empty(). > > > > Signed-off-by: Mani Milani <mani@chromium.org> > > --- > > > > drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 + > > drivers/gpu/drm/i915/gem/i915_gem_object.h | 10 ++++- > > drivers/gpu/drm/i915/i915_gem_evict.c | 48 ++++++++++++----- > > ----- > > drivers/gpu/drm/i915/i915_gem_ww.c | 8 ++-- > > 4 files changed, 41 insertions(+), 27 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c > > b/drivers/gpu/drm/i915/gem/i915_gem_object.c > > index 369006c5317f..69d013b393fb 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c > > @@ -78,6 +78,8 @@ void i915_gem_object_init(struct > > drm_i915_gem_object *obj, > > > > INIT_LIST_HEAD(&obj->mm.link); > > > > + INIT_LIST_HEAD(&obj->obj_link); > > + > > INIT_LIST_HEAD(&obj->lut_list); > > spin_lock_init(&obj->lut_lock); > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h > > b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > index 1723af9b0f6a..7e7a61bdf52c 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > @@ -219,7 +219,7 @@ static inline bool > > i915_gem_object_trylock(struct drm_i915_gem_object *obj, > > return ww_mutex_trylock(&obj->base.resv->lock, &ww- > > >ctx); > > } > > > > -static inline void i915_gem_object_unlock(struct > > drm_i915_gem_object *obj) > > +static inline void __i915_gem_object_unlock(struct > > drm_i915_gem_object *obj) > > { > > if (obj->ops->adjust_lru) > > obj->ops->adjust_lru(obj); > > @@ -227,6 +227,14 @@ static inline void > > i915_gem_object_unlock(struct drm_i915_gem_object *obj) > > dma_resv_unlock(obj->base.resv); > > } > > > > +static inline void i915_gem_object_unlock(struct > > drm_i915_gem_object *obj) > > +{ > > + if (list_empty(&obj->obj_link)) > > + __i915_gem_object_unlock(obj); > > + else > > + i915_gem_ww_unlock_single(obj); > > +} > > + > > static inline void > > i915_gem_object_set_readonly(struct drm_i915_gem_object *obj) > > { > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c > > b/drivers/gpu/drm/i915/i915_gem_evict.c > > index f025ee4fa526..3eb514b4eddc 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > @@ -55,29 +55,33 @@ static int ggtt_flush(struct intel_gt *gt) > > return intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); > > } > > > > -static bool grab_vma(struct i915_vma *vma, struct i915_gem_ww_ctx > > *ww) > > +static int grab_vma(struct i915_vma *vma, struct i915_gem_ww_ctx > > *ww) > > { > > + int err; > > + > > + /* Dead objects don't need pins */ > > + if (dying_vma(vma)) > > + atomic_and(~I915_VMA_PIN_MASK, &vma->flags); > > + > > + err = i915_gem_object_lock(vma->obj, ww); > > AFAIK the issue here is that we are already holding the vm->mutex, so > this can potentially deadlock, which I guess is why this was trylock. > > We typically grab a bunch of object locks during execbuf, and then > grab > the vm->mutex, before binding the vma for each object. So vm->mutex > is > always our inner lock, and the object lock is the outer one. Using a > full lock here then inverts that locking AFAICT. Like say if one > process > is holding object A + vm->mutex and then tries to grab object B here > in > grab_vma(), but another process is already holding object B + waiting > to > grab vm->mutex? Indeed. IIRC the assumption here was that a ww mutex trylock would be similar in behaviour to the previous code which instead checked for pinned vmas. One difference, though, is that we lock most objects upfront and then try to make space for VMAs avoiding evicting vmas that is needed anyway for the batch. The old code would instead evict and rebind. But there should be a catch-all evict everything and rebind if eviction fails so it would be beneficial to see the "gpu thread crash". I think there is an issue filed in gitlab where even this catch-all evict everything fails that might needs looking at. /Thomas > > > + > > /* > > * We add the extra refcount so the object doesn't drop to > > zero until > > - * after ungrab_vma(), this way trylock is always paired > > with unlock. > > + * after ungrab_vma(), this way lock is always paired with > > unlock. > > */ > > - if (i915_gem_object_get_rcu(vma->obj)) { > > - if (!i915_gem_object_trylock(vma->obj, ww)) { > > - i915_gem_object_put(vma->obj); > > - return false; > > - } > > - } else { > > - /* Dead objects don't need pins */ > > - atomic_and(~I915_VMA_PIN_MASK, &vma->flags); > > - } > > + if (!err) > > + i915_gem_object_get(vma->obj); > > > > - return true; > > + return err; > > } > > > > static void ungrab_vma(struct i915_vma *vma) > > { > > - if (dying_vma(vma)) > > + if (dying_vma(vma)) { > > + /* Dead objects don't need pins */ > > + atomic_and(~I915_VMA_PIN_MASK, &vma->flags); > > return; > > + } > > > > i915_gem_object_unlock(vma->obj); > > i915_gem_object_put(vma->obj); > > @@ -93,10 +97,11 @@ mark_free(struct drm_mm_scan *scan, > > if (i915_vma_is_pinned(vma)) > > return false; > > > > - if (!grab_vma(vma, ww)) > > + if (grab_vma(vma, ww)) > > return false; > > > > list_add(&vma->evict_link, unwind); > > + > > return drm_mm_scan_add_block(scan, &vma->node); > > } > > > > @@ -284,10 +289,12 @@ i915_gem_evict_something(struct > > i915_address_space *vm, > > vma = container_of(node, struct i915_vma, node); > > > > /* If we find any non-objects (!vma), we cannot > > evict them */ > > - if (vma->node.color != I915_COLOR_UNEVICTABLE && > > - grab_vma(vma, ww)) { > > - ret = __i915_vma_unbind(vma); > > - ungrab_vma(vma); > > + if (vma->node.color != I915_COLOR_UNEVICTABLE) { > > + ret = grab_vma(vma, ww); > > + if (!ret) { > > + ret = __i915_vma_unbind(vma); > > + ungrab_vma(vma); > > + } > > } else { > > ret = -ENOSPC; > > } > > @@ -382,10 +389,9 @@ int i915_gem_evict_for_node(struct > > i915_address_space *vm, > > break; > > } > > > > - if (!grab_vma(vma, ww)) { > > - ret = -ENOSPC; > > + ret = grab_vma(vma, ww); > > + if (ret) > > break; > > - } > > > > /* > > * Never show fear in the face of dragons! > > diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c > > b/drivers/gpu/drm/i915/i915_gem_ww.c > > index 3f6ff139478e..937b279f50fc 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_ww.c > > +++ b/drivers/gpu/drm/i915/i915_gem_ww.c > > @@ -19,16 +19,14 @@ static void i915_gem_ww_ctx_unlock_all(struct > > i915_gem_ww_ctx *ww) > > struct drm_i915_gem_object *obj; > > > > while ((obj = list_first_entry_or_null(&ww->obj_list, > > struct drm_i915_gem_object, obj_link))) { > > - list_del(&obj->obj_link); > > - i915_gem_object_unlock(obj); > > - i915_gem_object_put(obj); > > + i915_gem_ww_unlock_single(obj); > > } > > } > > > > void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj) > > { > > - list_del(&obj->obj_link); > > - i915_gem_object_unlock(obj); > > + list_del_init(&obj->obj_link); > > + __i915_gem_object_unlock(obj); > > i915_gem_object_put(obj); > > } > >
Thank you for your comments. To Thomas's point, the crash always seems to happen when the following sequence of events occurs: 1. When inside "i915_gem_evict_vm()", the call to "i915_gem_object_trylock(vma->obj, ww)" fails (due to deadlock), and eviction of a vma is skipped as a result. Basically if the code reaches here: https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/i915/i915_gem_evict.c#L468 And here is the stack dump for this scenario: Call Trace: <TASK> dump_stack_lvl+0x68/0x95 i915_gem_evict_vm+0x1d2/0x369 eb_validate_vmas+0x54a/0x6ae eb_relocate_parse+0x4b/0xdb i915_gem_execbuffer2_ioctl+0x6f5/0xab6 ? i915_gem_object_prepare_write+0xfb/0xfb drm_ioctl_kernel+0xda/0x14d drm_ioctl+0x27f/0x3b7 ? i915_gem_object_prepare_write+0xfb/0xfb __se_sys_ioctl+0x7a/0xbc do_syscall_64+0x56/0xa1 ? exit_to_user_mode_prepare+0x3d/0x8c entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x78302de5fae7 Code: c0 0f 89 74 ff ff ff 48 83 c4 08 49 c7 c4 ff ff ff ff 5b 4c 89 e0 41 5c 41 5d 5d c3 0f 1f 80 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c3 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe64b87f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 000003cc00470000 RCX: 000078302de5fae7 RDX: 00007ffe64b87fd0 RSI: 0000000040406469 RDI: 000000000000000d RBP: 00007ffe64b87fa0 R08: 0000000000000013 R09: 000003cc004d0950 R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000000d R13: 0000000000000000 R14: 00007ffe64b87fd0 R15: 0000000040406469 </TASK> It is worth noting that "i915_gem_evict_vm()" still returns success in this case. 2. After step 1 occurs, the next call to "grab_vma()" always fails (with "i915_gem_object_trylock(vma->obj, ww)" failing also due to deadlock), which then results in the crash. Here is the stack dump for this scenario: Call Trace: <TASK> dump_stack_lvl+0x68/0x95 grab_vma+0x6c/0xd0 i915_gem_evict_for_node+0x178/0x23b i915_gem_gtt_reserve+0x5a/0x82 i915_vma_insert+0x295/0x29e i915_vma_pin_ww+0x41e/0x5c7 eb_validate_vmas+0x5f5/0x6ae eb_relocate_parse+0x4b/0xdb i915_gem_execbuffer2_ioctl+0x6f5/0xab6 ? i915_gem_object_prepare_write+0xfb/0xfb drm_ioctl_kernel+0xda/0x14d drm_ioctl+0x27f/0x3b7 ? i915_gem_object_prepare_write+0xfb/0xfb __se_sys_ioctl+0x7a/0xbc do_syscall_64+0x56/0xa1 ? exit_to_user_mode_prepare+0x3d/0x8c entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x78302de5fae7 Code: c0 0f 89 74 ff ff ff 48 83 c4 08 49 c7 c4 ff ff ff ff 5b 4c 89 e0 41 5c 41 5d 5d c3 0f 1f 80 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c3 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe64b87f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 000003cc00470000 RCX: 000078302de5fae7 RDX: 00007ffe64b87fd0 RSI: 0000000040406469 RDI: 000000000000000d RBP: 00007ffe64b87fa0 R08: 0000000000000013 R09: 000003cc004d0950 R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000000d R13: 0000000000000000 R14: 00007ffe64b87fd0 R15: 0000000040406469 </TASK> My Notes: - I verified the two "i915_gem_object_trylock()" failures I mentioned above are due to deadlock by slightly modifying the code to call "i915_gem_object_lock()" only in those exact cases and subsequent to the trylock failure, only to look at the return error code. - The two cases mentioned above, are the only cases where "i915_gem_object_trylock(obj, ww)" is called with the second argument not being forced to NULL. - When in either of the two cases above (i.e. inside "grab_vma()" or "i915_gem_evict_vm") I replace calling "i915_gem_object_trylock" with "i915_gem_object_lock", the issue gets resolved (because deadlock is detected and resolved). So if this could matches the design better, another solution could be for "grab_vma" to continue to call "i915_gem_object_trylock", but for "i915_gem_evict_vm" to call "i915_gem_object_lock" instead. Further info: - Would you like any further info on the crash? If so, could you please advise 1) what exactly you need and 2) how I can share with you especially if it is big dumps? Thanks.
Hi, Mani. On 11/14/22 03:16, Mani Milani wrote: > Thank you for your comments. > > To Thomas's point, the crash always seems to happen when the following > sequence of events occurs: > > 1. When inside "i915_gem_evict_vm()", the call to > "i915_gem_object_trylock(vma->obj, ww)" fails (due to deadlock), and > eviction of a vma is skipped as a result. Basically if the code > reaches here: > https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/i915/i915_gem_evict.c#L468 > And here is the stack dump for this scenario: > Call Trace: > <TASK> > dump_stack_lvl+0x68/0x95 > i915_gem_evict_vm+0x1d2/0x369 > eb_validate_vmas+0x54a/0x6ae > eb_relocate_parse+0x4b/0xdb > i915_gem_execbuffer2_ioctl+0x6f5/0xab6 > ? i915_gem_object_prepare_write+0xfb/0xfb > drm_ioctl_kernel+0xda/0x14d > drm_ioctl+0x27f/0x3b7 > ? i915_gem_object_prepare_write+0xfb/0xfb > __se_sys_ioctl+0x7a/0xbc > do_syscall_64+0x56/0xa1 > ? exit_to_user_mode_prepare+0x3d/0x8c > entry_SYSCALL_64_after_hwframe+0x61/0xcb > RIP: 0033:0x78302de5fae7 > Code: c0 0f 89 74 ff ff ff 48 83 c4 08 49 c7 c4 ff ff ff ff 5b 4c > 89 e0 41 5c 41 5d 5d c3 0f 1f 80 00 00 00 00 b8 10 00 00 00 0f 05 <48> > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c3 0c 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffe64b87f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > RAX: ffffffffffffffda RBX: 000003cc00470000 RCX: 000078302de5fae7 > RDX: 00007ffe64b87fd0 RSI: 0000000040406469 RDI: 000000000000000d > RBP: 00007ffe64b87fa0 R08: 0000000000000013 R09: 000003cc004d0950 > R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000000d > R13: 0000000000000000 R14: 00007ffe64b87fd0 R15: 0000000040406469 > </TASK> > It is worth noting that "i915_gem_evict_vm()" still returns success in > this case. > > 2. After step 1 occurs, the next call to "grab_vma()" always fails > (with "i915_gem_object_trylock(vma->obj, ww)" failing also due to > deadlock), which then results in the crash. > Here is the stack dump for this scenario: > Call Trace: > <TASK> > dump_stack_lvl+0x68/0x95 > grab_vma+0x6c/0xd0 > i915_gem_evict_for_node+0x178/0x23b > i915_gem_gtt_reserve+0x5a/0x82 > i915_vma_insert+0x295/0x29e > i915_vma_pin_ww+0x41e/0x5c7 > eb_validate_vmas+0x5f5/0x6ae > eb_relocate_parse+0x4b/0xdb > i915_gem_execbuffer2_ioctl+0x6f5/0xab6 > ? i915_gem_object_prepare_write+0xfb/0xfb > drm_ioctl_kernel+0xda/0x14d > drm_ioctl+0x27f/0x3b7 > ? i915_gem_object_prepare_write+0xfb/0xfb > __se_sys_ioctl+0x7a/0xbc > do_syscall_64+0x56/0xa1 > ? exit_to_user_mode_prepare+0x3d/0x8c > entry_SYSCALL_64_after_hwframe+0x61/0xcb > RIP: 0033:0x78302de5fae7 > Code: c0 0f 89 74 ff ff ff 48 83 c4 08 49 c7 c4 ff ff ff ff 5b 4c > 89 e0 41 5c 41 5d 5d c3 0f 1f 80 00 00 00 00 b8 10 00 00 00 0f 05 <48> > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c3 0c 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffe64b87f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > RAX: ffffffffffffffda RBX: 000003cc00470000 RCX: 000078302de5fae7 > RDX: 00007ffe64b87fd0 RSI: 0000000040406469 RDI: 000000000000000d > RBP: 00007ffe64b87fa0 R08: 0000000000000013 R09: 000003cc004d0950 > R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000000d > R13: 0000000000000000 R14: 00007ffe64b87fd0 R15: 0000000040406469 > </TASK> > > My Notes: > - I verified the two "i915_gem_object_trylock()" failures I mentioned > above are due to deadlock by slightly modifying the code to call > "i915_gem_object_lock()" only in those exact cases and subsequent to > the trylock failure, only to look at the return error code. > - The two cases mentioned above, are the only cases where > "i915_gem_object_trylock(obj, ww)" is called with the second argument > not being forced to NULL. > - When in either of the two cases above (i.e. inside "grab_vma()" or > "i915_gem_evict_vm") I replace calling "i915_gem_object_trylock" with > "i915_gem_object_lock", the issue gets resolved (because deadlock is > detected and resolved). > > So if this could matches the design better, another solution could be > for "grab_vma" to continue to call "i915_gem_object_trylock", but for > "i915_gem_evict_vm" to call "i915_gem_object_lock" instead. No, i915_gem_object_lock() is not allowed when the vm mutex is held. > > Further info: > - Would you like any further info on the crash? If so, could you > please advise 1) what exactly you need and 2) how I can share with you > especially if it is big dumps? Yes, I would like to know how the crash manifests itself. Is it a kernel BUG or a kernel WARNING or is it the user-space application that crashes due to receiveing an -ENOSPC? Thanks, Thomas > > Thanks.
Hi Thomas, It is a user-space application that crashes due to receiving an -ENOSPC. This occurs after code reaches the line below where grab_vma() fails (due to deadlock) and consequently i915_gem_evict_for_node() returns -ENOSPC. I provided the call stack for when this happens in my previous message: https://github.com/torvalds/linux/blame/59d0d52c30d4991ac4b329f049cc37118e00f5b0/drivers/gpu/drm/i915/i915_gem_evict.c#L386 Context: This crash is happening on an intel GeminiLake chromebook, when running a video seek h264 stress test, and it is reproducible 100%. To troubleshoot, I did a git bisect and found the following commit to be the culprit (which is when grab_vma() has been introduced): https://github.com/torvalds/linux/commit/7e00897be8bf13ef9c68c95a8e386b714c29ad95 I also have crash dumps and further logs that I can send you if needed. But please let me know how to share those with you, since pasting them here does not seem reasonable to me. Thanks, Mani On Mon, Nov 14, 2022 at 11:48 PM Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote: > > Hi, Mani. > > On 11/14/22 03:16, Mani Milani wrote: > > Thank you for your comments. > > > > To Thomas's point, the crash always seems to happen when the following > > sequence of events occurs: > > > > 1. When inside "i915_gem_evict_vm()", the call to > > "i915_gem_object_trylock(vma->obj, ww)" fails (due to deadlock), and > > eviction of a vma is skipped as a result. Basically if the code > > reaches here: > > https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/i915/i915_gem_evict.c#L468 > > And here is the stack dump for this scenario: > > Call Trace: > > <TASK> > > dump_stack_lvl+0x68/0x95 > > i915_gem_evict_vm+0x1d2/0x369 > > eb_validate_vmas+0x54a/0x6ae > > eb_relocate_parse+0x4b/0xdb > > i915_gem_execbuffer2_ioctl+0x6f5/0xab6 > > ? i915_gem_object_prepare_write+0xfb/0xfb > > drm_ioctl_kernel+0xda/0x14d > > drm_ioctl+0x27f/0x3b7 > > ? i915_gem_object_prepare_write+0xfb/0xfb > > __se_sys_ioctl+0x7a/0xbc > > do_syscall_64+0x56/0xa1 > > ? exit_to_user_mode_prepare+0x3d/0x8c > > entry_SYSCALL_64_after_hwframe+0x61/0xcb > > RIP: 0033:0x78302de5fae7 > > Code: c0 0f 89 74 ff ff ff 48 83 c4 08 49 c7 c4 ff ff ff ff 5b 4c > > 89 e0 41 5c 41 5d 5d c3 0f 1f 80 00 00 00 00 b8 10 00 00 00 0f 05 <48> > > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c3 0c 00 f7 d8 64 89 01 48 > > RSP: 002b:00007ffe64b87f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > > RAX: ffffffffffffffda RBX: 000003cc00470000 RCX: 000078302de5fae7 > > RDX: 00007ffe64b87fd0 RSI: 0000000040406469 RDI: 000000000000000d > > RBP: 00007ffe64b87fa0 R08: 0000000000000013 R09: 000003cc004d0950 > > R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000000d > > R13: 0000000000000000 R14: 00007ffe64b87fd0 R15: 0000000040406469 > > </TASK> > > It is worth noting that "i915_gem_evict_vm()" still returns success in > > this case. > > > > 2. After step 1 occurs, the next call to "grab_vma()" always fails > > (with "i915_gem_object_trylock(vma->obj, ww)" failing also due to > > deadlock), which then results in the crash. > > Here is the stack dump for this scenario: > > Call Trace: > > <TASK> > > dump_stack_lvl+0x68/0x95 > > grab_vma+0x6c/0xd0 > > i915_gem_evict_for_node+0x178/0x23b > > i915_gem_gtt_reserve+0x5a/0x82 > > i915_vma_insert+0x295/0x29e > > i915_vma_pin_ww+0x41e/0x5c7 > > eb_validate_vmas+0x5f5/0x6ae > > eb_relocate_parse+0x4b/0xdb > > i915_gem_execbuffer2_ioctl+0x6f5/0xab6 > > ? i915_gem_object_prepare_write+0xfb/0xfb > > drm_ioctl_kernel+0xda/0x14d > > drm_ioctl+0x27f/0x3b7 > > ? i915_gem_object_prepare_write+0xfb/0xfb > > __se_sys_ioctl+0x7a/0xbc > > do_syscall_64+0x56/0xa1 > > ? exit_to_user_mode_prepare+0x3d/0x8c > > entry_SYSCALL_64_after_hwframe+0x61/0xcb > > RIP: 0033:0x78302de5fae7 > > Code: c0 0f 89 74 ff ff ff 48 83 c4 08 49 c7 c4 ff ff ff ff 5b 4c > > 89 e0 41 5c 41 5d 5d c3 0f 1f 80 00 00 00 00 b8 10 00 00 00 0f 05 <48> > > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c3 0c 00 f7 d8 64 89 01 48 > > RSP: 002b:00007ffe64b87f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > > RAX: ffffffffffffffda RBX: 000003cc00470000 RCX: 000078302de5fae7 > > RDX: 00007ffe64b87fd0 RSI: 0000000040406469 RDI: 000000000000000d > > RBP: 00007ffe64b87fa0 R08: 0000000000000013 R09: 000003cc004d0950 > > R10: 0000000000000200 R11: 0000000000000246 R12: 000000000000000d > > R13: 0000000000000000 R14: 00007ffe64b87fd0 R15: 0000000040406469 > > </TASK> > > > > My Notes: > > - I verified the two "i915_gem_object_trylock()" failures I mentioned > > above are due to deadlock by slightly modifying the code to call > > "i915_gem_object_lock()" only in those exact cases and subsequent to > > the trylock failure, only to look at the return error code. > > - The two cases mentioned above, are the only cases where > > "i915_gem_object_trylock(obj, ww)" is called with the second argument > > not being forced to NULL. > > - When in either of the two cases above (i.e. inside "grab_vma()" or > > "i915_gem_evict_vm") I replace calling "i915_gem_object_trylock" with > > "i915_gem_object_lock", the issue gets resolved (because deadlock is > > detected and resolved). > > > > So if this could matches the design better, another solution could be > > for "grab_vma" to continue to call "i915_gem_object_trylock", but for > > "i915_gem_evict_vm" to call "i915_gem_object_lock" instead. > > No, i915_gem_object_lock() is not allowed when the vm mutex is held. > > > > > > Further info: > > - Would you like any further info on the crash? If so, could you > > please advise 1) what exactly you need and 2) how I can share with you > > especially if it is big dumps? > > Yes, I would like to know how the crash manifests itself. Is it a kernel > BUG or a kernel WARNING or is it the user-space application that crashes > due to receiveing an -ENOSPC? > > Thanks, > > Thomas > > > > > > > Thanks.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 369006c5317f..69d013b393fb 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -78,6 +78,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, INIT_LIST_HEAD(&obj->mm.link); + INIT_LIST_HEAD(&obj->obj_link); + INIT_LIST_HEAD(&obj->lut_list); spin_lock_init(&obj->lut_lock); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 1723af9b0f6a..7e7a61bdf52c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -219,7 +219,7 @@ static inline bool i915_gem_object_trylock(struct drm_i915_gem_object *obj, return ww_mutex_trylock(&obj->base.resv->lock, &ww->ctx); } -static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) +static inline void __i915_gem_object_unlock(struct drm_i915_gem_object *obj) { if (obj->ops->adjust_lru) obj->ops->adjust_lru(obj); @@ -227,6 +227,14 @@ static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) dma_resv_unlock(obj->base.resv); } +static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) +{ + if (list_empty(&obj->obj_link)) + __i915_gem_object_unlock(obj); + else + i915_gem_ww_unlock_single(obj); +} + static inline void i915_gem_object_set_readonly(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index f025ee4fa526..3eb514b4eddc 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -55,29 +55,33 @@ static int ggtt_flush(struct intel_gt *gt) return intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); } -static bool grab_vma(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) +static int grab_vma(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) { + int err; + + /* Dead objects don't need pins */ + if (dying_vma(vma)) + atomic_and(~I915_VMA_PIN_MASK, &vma->flags); + + err = i915_gem_object_lock(vma->obj, ww); + /* * We add the extra refcount so the object doesn't drop to zero until - * after ungrab_vma(), this way trylock is always paired with unlock. + * after ungrab_vma(), this way lock is always paired with unlock. */ - if (i915_gem_object_get_rcu(vma->obj)) { - if (!i915_gem_object_trylock(vma->obj, ww)) { - i915_gem_object_put(vma->obj); - return false; - } - } else { - /* Dead objects don't need pins */ - atomic_and(~I915_VMA_PIN_MASK, &vma->flags); - } + if (!err) + i915_gem_object_get(vma->obj); - return true; + return err; } static void ungrab_vma(struct i915_vma *vma) { - if (dying_vma(vma)) + if (dying_vma(vma)) { + /* Dead objects don't need pins */ + atomic_and(~I915_VMA_PIN_MASK, &vma->flags); return; + } i915_gem_object_unlock(vma->obj); i915_gem_object_put(vma->obj); @@ -93,10 +97,11 @@ mark_free(struct drm_mm_scan *scan, if (i915_vma_is_pinned(vma)) return false; - if (!grab_vma(vma, ww)) + if (grab_vma(vma, ww)) return false; list_add(&vma->evict_link, unwind); + return drm_mm_scan_add_block(scan, &vma->node); } @@ -284,10 +289,12 @@ i915_gem_evict_something(struct i915_address_space *vm, vma = container_of(node, struct i915_vma, node); /* If we find any non-objects (!vma), we cannot evict them */ - if (vma->node.color != I915_COLOR_UNEVICTABLE && - grab_vma(vma, ww)) { - ret = __i915_vma_unbind(vma); - ungrab_vma(vma); + if (vma->node.color != I915_COLOR_UNEVICTABLE) { + ret = grab_vma(vma, ww); + if (!ret) { + ret = __i915_vma_unbind(vma); + ungrab_vma(vma); + } } else { ret = -ENOSPC; } @@ -382,10 +389,9 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, break; } - if (!grab_vma(vma, ww)) { - ret = -ENOSPC; + ret = grab_vma(vma, ww); + if (ret) break; - } /* * Never show fear in the face of dragons! diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c b/drivers/gpu/drm/i915/i915_gem_ww.c index 3f6ff139478e..937b279f50fc 100644 --- a/drivers/gpu/drm/i915/i915_gem_ww.c +++ b/drivers/gpu/drm/i915/i915_gem_ww.c @@ -19,16 +19,14 @@ static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww) struct drm_i915_gem_object *obj; while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) { - list_del(&obj->obj_link); - i915_gem_object_unlock(obj); - i915_gem_object_put(obj); + i915_gem_ww_unlock_single(obj); } } void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj) { - list_del(&obj->obj_link); - i915_gem_object_unlock(obj); + list_del_init(&obj->obj_link); + __i915_gem_object_unlock(obj); i915_gem_object_put(obj); }
At present, the gpu thread crashes at times when grab_vma() attempts to acquire a gem object lock when in a deadlock state. Problems: I identified the following 4 issues in the current code: 1. Since grab_vma() calls i915_gem_object_trylock(), which consequently calls ww_mutex_trylock(), to acquire lock, it does not perform any -EDEADLK handling; And -EALREADY handling is also unreliable, according to the description of ww_mutex_trylock(). 2. Since the return value of grab_vma() is a boolean showing success/failure, it does not provide any extra information on the failure reason, and therefore does not provide any mechanism to its caller to take any action to fix a potential deadlock. 3. Current grab_vma() implementation produces inconsistent behaviour depending on the refcount value, without informing the caller. If refcount is already zero, grab_vma() neither acquires lock nor increments the refcount, but still returns 'true' for success! This means that grab_vma() returning true (for success) does not always mean that the gem obj is actually safely accessible. 4. Currently, calling "i915_gem_object_lock(obj,ww)" is meant to be followed by a consequent "i915_gem_object_unlock(obj)" ONLY if the original 'ww' object pointer was NULL, or otherwise not be called and leave the houskeeping to "i915_gem_ww_ctx_fini(ww)". There are a few issues with this: - This is not documented anywhere in the code (that I could find), but only explained in an older commit message. - This produces an inconsistent usage of the lock/unlock functions, increasing the chance of mistakes and issues. - This is not a clean design as it requires any new code that calls these lock/unlock functions to know their internals, as well as the internals of the functions calling the new code being added. Fix: To fix the issues above, this patch: 1. Changes grab_vma() to call i915_gem_object_lock() instead of i915_gem_object_trylock(), to handle -EDEADLK and -EALREADY cases. This should not cause any issue since the PIN_NONBLOCK flag is checked beforehand in the 2 cases grab_vma() is called. 2. Changes grab_vma() to return the actual error code, instead of bool. 3. Changes grab_vma() to behave consistently when returning success, by both incrementing the refcount and acquiring lock at all times. 4. Changes i915_gem_object_unlock() to pair with i915_gem_object_lock() nicely in all cases and do the housekeeping without the need for the caller to do anything other than simply calling lock and unlock. 5. Ensures the gem obj->obj_link is initialized and deleted from the ww list such that it can be tested for emptiness using list_empty(). Signed-off-by: Mani Milani <mani@chromium.org> --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 + drivers/gpu/drm/i915/gem/i915_gem_object.h | 10 ++++- drivers/gpu/drm/i915/i915_gem_evict.c | 48 ++++++++++++---------- drivers/gpu/drm/i915/i915_gem_ww.c | 8 ++-- 4 files changed, 41 insertions(+), 27 deletions(-)