From patchwork Mon Jan 10 11:51:27 2022
X-Patchwork-Submitter: Maarten Lankhorst
X-Patchwork-Id: 12708606
From: Maarten Lankhorst
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Date: Mon, 10 Jan 2022 12:51:27 +0100
Message-Id: <20220110115133.1500718-2-maarten.lankhorst@linux.intel.com>
In-Reply-To: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com>
References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com>
Subject: [Intel-gfx] [PATCH v4 1/7] drm/i915: Call i915_gem_evict_vm in vm_fault_gtt to prevent new ENOSPC errors, v2.

Now that we can no longer unbind the currently locked object directly, because short-term pinning is being removed, we may have to unbind the object from the GTT manually, using an i915_gem_evict_vm() call.

Changes since v1:
- Remove the -ENOSPC warning; it can still happen with concurrent mmaps, where we cannot unbind the other mmap because of the lock held.
  This fixes the gem_mmap_gtt@cpuset tests.
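For readers following the fragmented diff below, the retry path this patch adds to vm_fault_gtt() looks roughly as follows. This is a condensed sketch, not the literal driver code; all identifiers come from the i915 code in the diff, and the surrounding fault handling is omitted:

	/* Pin attempt reported -ENOSPC: the mappable GGTT is full. */
	if (vma == ERR_PTR(-ENOSPC)) {
		/*
		 * Evict the whole GGTT under vm.mutex, including the
		 * object we already hold locked, then retry the pin
		 * exactly once.
		 */
		ret = mutex_lock_interruptible(&ggtt->vm.mutex);
		if (!ret) {
			ret = i915_gem_evict_vm(&ggtt->vm);
			mutex_unlock(&ggtt->vm.mutex);
		}
		if (ret)
			goto err_reset;
		vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags);
	}

The eviction is attempted once; if the pin still fails afterwards, the error is reported to the caller through the normal IS_ERR(vma) path instead of tripping the old GEM_BUG_ON.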
Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 5ac2506f4ee8..4337f3c1400c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -358,8 +358,21 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
 		vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags);
 	}
 
-	/* The entire mappable GGTT is pinned? Unexpected! */
-	GEM_BUG_ON(vma == ERR_PTR(-ENOSPC));
+	/*
+	 * The entire mappable GGTT is pinned? Unexpected!
+	 * Try to evict the object we locked too, as normally we skip it
+	 * due to lack of short term pinning inside execbuf.
+	 */
+	if (vma == ERR_PTR(-ENOSPC)) {
+		ret = mutex_lock_interruptible(&ggtt->vm.mutex);
+		if (!ret) {
+			ret = i915_gem_evict_vm(&ggtt->vm);
+			mutex_unlock(&ggtt->vm.mutex);
+		}
+		if (ret)
+			goto err_reset;
+		vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags);
+	}
 	}
 
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);

From patchwork Mon Jan 10 11:51:28 2022
X-Patchwork-Submitter: Maarten Lankhorst
X-Patchwork-Id: 12708604
From: Maarten Lankhorst
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Date: Mon, 10 Jan 2022 12:51:28 +0100
Message-Id: <20220110115133.1500718-3-maarten.lankhorst@linux.intel.com>
In-Reply-To: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com>
References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com>
Subject: [Intel-gfx] [PATCH v4 2/7] drm/i915: Add locking to i915_gem_evict_vm()

i915_gem_evict_vm() will need to be able to evict objects that are locked by the current context. By testing whether the current context already holds the object's lock, we can handle those objects correctly. This allows us to evict the entire vm even if we already hold some objects' locks.

Previously, this was spread over several commits, but it makes more sense to commit the changes to i915_gem_evict_vm() separately from the changes to i915_gem_evict_something() and i915_gem_evict_for_node().
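As a condensed sketch of the eviction loop this patch introduces (drawn from the diff below, not verbatim kernel code): VMAs whose reservation lock is already held by the caller's ww context cannot be trylocked again, so they are collected on a separate list and are not unlocked afterwards.

	list_for_each_entry(vma, &vm->bound_list, vm_link) {
		if (i915_vma_is_pinned(vma))
			continue;

		/*
		 * Already locked by our ww context? Evict it too, but
		 * skip the trylock/unlock pair; the lock belongs to the
		 * caller's transaction.
		 */
		if (ww && dma_resv_locking_ctx(vma->obj->base.resv) == &ww->ctx) {
			__i915_vma_pin(vma);
			list_add(&vma->evict_link, &locked_eviction_list);
			continue;
		}

		/* Otherwise we must win the object lock to touch the vma. */
		if (!i915_gem_object_trylock(vma->obj, ww))
			continue;

		__i915_vma_pin(vma);
		list_add(&vma->evict_link, &eviction_list);
	}

The objects on locked_eviction_list are then unbound first, before the trylocked objects on eviction_list are unbound and unlocked, matching the ordering comment in the diff.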
Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 3 +- drivers/gpu/drm/i915/i915_gem_evict.c | 30 +++++++++++++++++-- drivers/gpu/drm/i915/i915_vma.c | 7 ++++- .../gpu/drm/i915/selftests/i915_gem_evict.c | 10 +++++-- 6 files changed, 46 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 5ecc85b96a3d..da35a143af36 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -766,7 +766,7 @@ static int eb_reserve(struct i915_execbuffer *eb) case 1: /* Too fragmented, unbind everything and retry */ mutex_lock(&eb->context->vm->mutex); - err = i915_gem_evict_vm(eb->context->vm); + err = i915_gem_evict_vm(eb->context->vm, &eb->ww); mutex_unlock(&eb->context->vm->mutex); if (err) return err; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index 4337f3c1400c..4afad1604a6a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -366,7 +366,7 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) if (vma == ERR_PTR(-ENOSPC)) { ret = mutex_lock_interruptible(&ggtt->vm.mutex); if (!ret) { - ret = i915_gem_evict_vm(&ggtt->vm); + ret = i915_gem_evict_vm(&ggtt->vm, &ww); mutex_unlock(&ggtt->vm.mutex); } if (ret) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2f9336302e6c..ef121ddef418 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1729,7 +1729,8 @@ int __must_check i915_gem_evict_something(struct i915_address_space *vm, int __must_check i915_gem_evict_for_node(struct i915_address_space *vm, struct drm_mm_node *node, unsigned int flags); -int i915_gem_evict_vm(struct i915_address_space *vm); +int i915_gem_evict_vm(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww); /* i915_gem_internal.c */ struct drm_i915_gem_object * diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index 2b73ddb11c66..bfd66f539fc1 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -367,7 +367,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, * To clarify: This is for freeing up virtual address space, not for freeing * memory in e.g. the shrinker. */ -int i915_gem_evict_vm(struct i915_address_space *vm) +int i915_gem_evict_vm(struct i915_address_space *vm, struct i915_gem_ww_ctx *ww) { int ret = 0; @@ -388,24 +388,50 @@ int i915_gem_evict_vm(struct i915_address_space *vm) do { struct i915_vma *vma, *vn; LIST_HEAD(eviction_list); + LIST_HEAD(locked_eviction_list); list_for_each_entry(vma, &vm->bound_list, vm_link) { if (i915_vma_is_pinned(vma)) continue; + /* + * If we already own the lock, trylock fails. In case the resv + * is shared among multiple objects, we still need the object ref. 
+ */ + if (ww && (dma_resv_locking_ctx(vma->obj->base.resv) == &ww->ctx)) { + __i915_vma_pin(vma); + list_add(&vma->evict_link, &locked_eviction_list); + continue; + } + + if (!i915_gem_object_trylock(vma->obj, ww)) + continue; + __i915_vma_pin(vma); list_add(&vma->evict_link, &eviction_list); } - if (list_empty(&eviction_list)) + if (list_empty(&eviction_list) && list_empty(&locked_eviction_list)) break; ret = 0; + /* Unbind locked objects first, before unlocking the eviction_list */ + list_for_each_entry_safe(vma, vn, &locked_eviction_list, evict_link) { + __i915_vma_unpin(vma); + + if (ret == 0) + ret = __i915_vma_unbind(vma); + if (ret != -EINTR) /* "Get me out of here!" */ + ret = 0; + } + list_for_each_entry_safe(vma, vn, &eviction_list, evict_link) { __i915_vma_unpin(vma); if (ret == 0) ret = __i915_vma_unbind(vma); if (ret != -EINTR) /* "Get me out of here!" */ ret = 0; + + i915_gem_object_unlock(vma->obj); } } while (ret == 0); diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index be208a8f1ed0..32620f93beab 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1463,7 +1463,12 @@ static int __i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, /* Unlike i915_vma_pin, we don't take no for an answer! */ flush_idle_contexts(vm->gt); if (mutex_lock_interruptible(&vm->mutex) == 0) { - i915_gem_evict_vm(vm); + /* + * We pass NULL ww here, as we don't want to unbind + * locked objects when called from execbuf when pinning + * is removed. This would probably regress badly. + */ + i915_gem_evict_vm(vm, NULL); mutex_unlock(&vm->mutex); } } while (1); diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c index 75b709c26dd3..7c075c16a573 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c @@ -331,6 +331,7 @@ static int igt_evict_vm(void *arg) { struct intel_gt *gt = arg; struct i915_ggtt *ggtt = gt->ggtt; + struct i915_gem_ww_ctx ww; LIST_HEAD(objects); int err; @@ -342,7 +343,7 @@ static int igt_evict_vm(void *arg) /* Everything is pinned, nothing should happen */ mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_vm(&ggtt->vm); + err = i915_gem_evict_vm(&ggtt->vm, NULL); mutex_unlock(&ggtt->vm.mutex); if (err) { pr_err("i915_gem_evict_vm on a full GGTT returned err=%d]\n", @@ -352,9 +353,14 @@ static int igt_evict_vm(void *arg) unpin_ggtt(ggtt); + i915_gem_ww_ctx_init(&ww, false); mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_vm(&ggtt->vm); + err = i915_gem_evict_vm(&ggtt->vm, &ww); mutex_unlock(&ggtt->vm.mutex); + + /* no -EDEADLK handling; can't happen with vm.mutex in place */ + i915_gem_ww_ctx_fini(&ww); + if (err) { pr_err("i915_gem_evict_vm on a full GGTT returned err=%d]\n", err); From patchwork Mon Jan 10 11:51:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maarten Lankhorst X-Patchwork-Id: 12708607 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8C8AFC433EF for ; Mon, 10 Jan 2022 11:51:59 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 
CE86F14AC11; Mon, 10 Jan 2022 11:51:39 +0000 (UTC) Received: from mblankhorst.nl (mblankhorst.nl [141.105.120.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 45CF614A5A9; Mon, 10 Jan 2022 11:51:38 +0000 (UTC) From: Maarten Lankhorst To: intel-gfx@lists.freedesktop.org Date: Mon, 10 Jan 2022 12:51:29 +0100 Message-Id: <20220110115133.1500718-4-maarten.lankhorst@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 3/7] drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Because we will start to require the obj->resv lock for unbinding, ensure these shrinker functions also take the lock. This requires some function signature changes, to ensure that the ww context is passed around, but is mostly straightforward. Previously this was split up into several patches, but reworking should allow for easier bisection. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 2 +- drivers/gpu/drm/i915/gvt/aperture_gm.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem_evict.c | 34 +++++++++++++++---- drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 3 ++ drivers/gpu/drm/i915/i915_vgpu.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 9 ++--- .../gpu/drm/i915/selftests/i915_gem_evict.c | 17 +++++----- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 14 ++++---- 11 files changed, 63 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index ab6c4322dc08..e416e1f12d1a 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -504,7 +504,7 @@ static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt) GEM_BUG_ON(ggtt->vm.total <= GUC_GGTT_TOP); size = ggtt->vm.total - GUC_GGTT_TOP; - ret = i915_gem_gtt_reserve(&ggtt->vm, &ggtt->uc_fw, size, + ret = i915_gem_gtt_reserve(&ggtt->vm, NULL, &ggtt->uc_fw, size, GUC_GGTT_TOP, I915_COLOR_UNEVICTABLE, PIN_NOEVICT); if (ret) diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 15d63435ec4d..9c21b55b927b 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -1382,7 +1382,7 @@ static int evict_vma(void *data) complete(&arg->completion); mutex_lock(&vm->mutex); - err = i915_gem_evict_for_node(vm, &evict, 0); + err = i915_gem_evict_for_node(vm, NULL, &evict, 0); mutex_unlock(&vm->mutex); return err; diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c index 0d6d59871308..c08098a167e9 100644 --- a/drivers/gpu/drm/i915/gvt/aperture_gm.c +++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c @@ -63,7 +63,7 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm) mutex_lock(>->ggtt->vm.mutex); mmio_hw_access_pre(gt); - ret = i915_gem_gtt_insert(>->ggtt->vm, node, + ret = i915_gem_gtt_insert(>->ggtt->vm, NULL, node, size, I915_GTT_PAGE_SIZE, 
I915_COLOR_UNEVICTABLE, start, end, flags); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ef121ddef418..88eb203a0742 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1722,11 +1722,13 @@ i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id) /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, u64 min_size, u64 alignment, unsigned long color, u64 start, u64 end, unsigned flags); int __must_check i915_gem_evict_for_node(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, struct drm_mm_node *node, unsigned int flags); int i915_gem_evict_vm(struct i915_address_space *vm, diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index bfd66f539fc1..f502a617b35c 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -51,6 +51,7 @@ static int ggtt_flush(struct intel_gt *gt) static bool mark_free(struct drm_mm_scan *scan, + struct i915_gem_ww_ctx *ww, struct i915_vma *vma, unsigned int flags, struct list_head *unwind) @@ -58,6 +59,9 @@ mark_free(struct drm_mm_scan *scan, if (i915_vma_is_pinned(vma)) return false; + if (!i915_gem_object_trylock(vma->obj, ww)) + return false; + list_add(&vma->evict_link, unwind); return drm_mm_scan_add_block(scan, &vma->node); } @@ -98,6 +102,7 @@ static bool defer_evict(struct i915_vma *vma) */ int i915_gem_evict_something(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, u64 min_size, u64 alignment, unsigned long color, u64 start, u64 end, @@ -170,7 +175,7 @@ i915_gem_evict_something(struct i915_address_space *vm, continue; } - if (mark_free(&scan, vma, flags, &eviction_list)) + if (mark_free(&scan, ww, vma, flags, &eviction_list)) goto found; } @@ -178,6 +183,7 @@ i915_gem_evict_something(struct i915_address_space *vm, list_for_each_entry_safe(vma, next, &eviction_list, evict_link) { ret = drm_mm_scan_remove_block(&scan, &vma->node); BUG_ON(ret); + i915_gem_object_unlock(vma->obj); } /* @@ -222,10 +228,12 @@ i915_gem_evict_something(struct i915_address_space *vm, * of any of our objects, thus corrupting the list). */ list_for_each_entry_safe(vma, next, &eviction_list, evict_link) { - if (drm_mm_scan_remove_block(&scan, &vma->node)) + if (drm_mm_scan_remove_block(&scan, &vma->node)) { __i915_vma_pin(vma); - else + } else { list_del(&vma->evict_link); + i915_gem_object_unlock(vma->obj); + } } /* Unbinding will emit any required flushes */ @@ -234,16 +242,22 @@ i915_gem_evict_something(struct i915_address_space *vm, __i915_vma_unpin(vma); if (ret == 0) ret = __i915_vma_unbind(vma); + + i915_gem_object_unlock(vma->obj); } while (ret == 0 && (node = drm_mm_scan_color_evict(&scan))) { vma = container_of(node, struct i915_vma, node); + /* If we find any non-objects (!vma), we cannot evict them */ - if (vma->node.color != I915_COLOR_UNEVICTABLE) + if (vma->node.color != I915_COLOR_UNEVICTABLE && + i915_gem_object_trylock(vma->obj, ww)) { ret = __i915_vma_unbind(vma); - else - ret = -ENOSPC; /* XXX search failed, try again? */ + i915_gem_object_unlock(vma->obj); + } else { + ret = -ENOSPC; + } } return ret; @@ -261,6 +275,7 @@ i915_gem_evict_something(struct i915_address_space *vm, * memory in e.g. the shrinker. 
*/ int i915_gem_evict_for_node(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, struct drm_mm_node *target, unsigned int flags) { @@ -333,6 +348,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, break; } + if (!i915_gem_object_trylock(vma->obj, ww)) { + ret = -ENOSPC; + break; + } + /* * Never show fear in the face of dragons! * @@ -350,6 +370,8 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, __i915_vma_unpin(vma); if (ret == 0) ret = __i915_vma_unbind(vma); + + i915_gem_object_unlock(vma->obj); } return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 2f2ba7a2955d..332cff339050 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -93,6 +93,7 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj, * asked to wait for eviction and interrupted. */ int i915_gem_gtt_reserve(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, struct drm_mm_node *node, u64 size, u64 offset, unsigned long color, unsigned int flags) @@ -117,7 +118,7 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm, if (flags & PIN_NOEVICT) return -ENOSPC; - err = i915_gem_evict_for_node(vm, node, flags); + err = i915_gem_evict_for_node(vm, ww, node, flags); if (err == 0) err = drm_mm_reserve_node(&vm->mm, node); @@ -184,6 +185,7 @@ static u64 random_offset(u64 start, u64 end, u64 len, u64 align) * asked to wait for eviction and interrupted. */ int i915_gem_gtt_insert(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, struct drm_mm_node *node, u64 size, u64 alignment, unsigned long color, u64 start, u64 end, unsigned int flags) @@ -269,7 +271,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, */ offset = random_offset(start, end, size, alignment ?: I915_GTT_MIN_ALIGNMENT); - err = i915_gem_gtt_reserve(vm, node, size, offset, color, flags); + err = i915_gem_gtt_reserve(vm, ww, node, size, offset, color, flags); if (err != -ENOSPC) return err; @@ -277,7 +279,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, return -ENOSPC; /* Randomly selected placement is pinned, do a search */ - err = i915_gem_evict_something(vm, size, alignment, color, + err = i915_gem_evict_something(vm, ww, size, alignment, color, start, end, flags); if (err) return err; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index c9b0ee5e1d23..e4938aba3fe9 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -16,6 +16,7 @@ struct drm_i915_gem_object; struct i915_address_space; +struct i915_gem_ww_ctx; int __must_check i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj, struct sg_table *pages); @@ -23,11 +24,13 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj, struct sg_table *pages); int i915_gem_gtt_reserve(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, struct drm_mm_node *node, u64 size, u64 offset, unsigned long color, unsigned int flags); int i915_gem_gtt_insert(struct i915_address_space *vm, + struct i915_gem_ww_ctx *ww, struct drm_mm_node *node, u64 size, u64 alignment, unsigned long color, u64 start, u64 end, unsigned int flags); diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c index 31a105bc1792..c97323973f9b 100644 --- a/drivers/gpu/drm/i915/i915_vgpu.c +++ b/drivers/gpu/drm/i915/i915_vgpu.c @@ -197,7 +197,7 @@ static int vgt_balloon_space(struct i915_ggtt *ggtt, drm_info(&dev_priv->drm, "balloon space: range [ 0x%lx - 0x%lx 
] %lu KiB.\n", start, end, size / 1024); - ret = i915_gem_gtt_reserve(&ggtt->vm, node, + ret = i915_gem_gtt_reserve(&ggtt->vm, NULL, node, size, start, I915_COLOR_UNEVICTABLE, 0); if (!ret) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 32620f93beab..8284bf389cd5 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -651,7 +651,8 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color) * 0 on success, negative error code otherwise. */ static int -i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) +i915_vma_insert(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, + u64 size, u64 alignment, u64 flags) { unsigned long color; u64 start, end; @@ -703,7 +704,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) range_overflows(offset, size, end)) return -EINVAL; - ret = i915_gem_gtt_reserve(vma->vm, &vma->node, + ret = i915_gem_gtt_reserve(vma->vm, ww, &vma->node, size, offset, color, flags); if (ret) @@ -742,7 +743,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) size = round_up(size, I915_GTT_PAGE_SIZE_2M); } - ret = i915_gem_gtt_insert(vma->vm, &vma->node, + ret = i915_gem_gtt_insert(vma->vm, ww, &vma->node, size, alignment, color, start, end, flags); if (ret) @@ -1383,7 +1384,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, goto err_unlock; if (!(bound & I915_VMA_BIND_MASK)) { - err = i915_vma_insert(vma, size, alignment, flags); + err = i915_vma_insert(vma, ww, size, alignment, flags); if (err) goto err_active; diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c index 7c075c16a573..15b4e5631070 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c @@ -117,7 +117,7 @@ static int igt_evict_something(void *arg) /* Everything is pinned, nothing should happen */ mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_something(&ggtt->vm, + err = i915_gem_evict_something(&ggtt->vm, NULL, I915_GTT_PAGE_SIZE, 0, 0, 0, U64_MAX, 0); @@ -132,11 +132,12 @@ static int igt_evict_something(void *arg) /* Everything is unpinned, we should be able to evict something */ mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_something(&ggtt->vm, + err = i915_gem_evict_something(&ggtt->vm, NULL, I915_GTT_PAGE_SIZE, 0, 0, 0, U64_MAX, 0); mutex_unlock(&ggtt->vm.mutex); + if (err) { pr_err("i915_gem_evict_something failed on a full GGTT with err=%d\n", err); @@ -204,7 +205,7 @@ static int igt_evict_for_vma(void *arg) /* Everything is pinned, nothing should happen */ mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_for_node(&ggtt->vm, &target, 0); + err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0); mutex_unlock(&ggtt->vm.mutex); if (err != -ENOSPC) { pr_err("i915_gem_evict_for_node on a full GGTT returned err=%d\n", @@ -216,7 +217,7 @@ static int igt_evict_for_vma(void *arg) /* Everything is unpinned, we should be able to evict the node */ mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_for_node(&ggtt->vm, &target, 0); + err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0); mutex_unlock(&ggtt->vm.mutex); if (err) { pr_err("i915_gem_evict_for_node returned err=%d\n", @@ -297,7 +298,7 @@ static int igt_evict_for_cache_color(void *arg) /* Remove just the second vma */ mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_for_node(&ggtt->vm, &target, 0); + err = i915_gem_evict_for_node(&ggtt->vm, 
NULL, &target, 0); mutex_unlock(&ggtt->vm.mutex); if (err) { pr_err("[0]i915_gem_evict_for_node returned err=%d\n", err); @@ -310,7 +311,7 @@ static int igt_evict_for_cache_color(void *arg) target.color = I915_CACHE_L3_LLC; mutex_lock(&ggtt->vm.mutex); - err = i915_gem_evict_for_node(&ggtt->vm, &target, 0); + err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0); mutex_unlock(&ggtt->vm.mutex); if (!err) { pr_err("[1]i915_gem_evict_for_node returned err=%d\n", err); @@ -408,7 +409,7 @@ static int igt_evict_contexts(void *arg) /* Reserve a block so that we know we have enough to fit a few rq */ memset(&hole, 0, sizeof(hole)); mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_insert(&ggtt->vm, &hole, + err = i915_gem_gtt_insert(&ggtt->vm, NULL, &hole, PRETEND_GGTT_SIZE, 0, I915_COLOR_UNEVICTABLE, 0, ggtt->vm.total, PIN_NOEVICT); @@ -428,7 +429,7 @@ static int igt_evict_contexts(void *arg) goto out_locked; } - if (i915_gem_gtt_insert(&ggtt->vm, &r->node, + if (i915_gem_gtt_insert(&ggtt->vm, NULL, &r->node, 1ul << 20, 0, I915_COLOR_UNEVICTABLE, 0, ggtt->vm.total, PIN_NOEVICT)) { diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index 357ced0b88e7..0336c065f331 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -1378,7 +1378,7 @@ static int igt_gtt_reserve(void *arg) } mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node, + err = i915_gem_gtt_reserve(&ggtt->vm, NULL, &vma->node, obj->base.size, total, obj->cache_level, @@ -1430,7 +1430,7 @@ static int igt_gtt_reserve(void *arg) } mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node, + err = i915_gem_gtt_reserve(&ggtt->vm, NULL, &vma->node, obj->base.size, total, obj->cache_level, @@ -1477,7 +1477,7 @@ static int igt_gtt_reserve(void *arg) I915_GTT_MIN_ALIGNMENT); mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node, + err = i915_gem_gtt_reserve(&ggtt->vm, NULL, &vma->node, obj->base.size, offset, obj->cache_level, @@ -1552,7 +1552,7 @@ static int igt_gtt_insert(void *arg) /* Check a couple of obviously invalid requests */ for (ii = invalid_insert; ii->size; ii++) { mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_insert(&ggtt->vm, &tmp, + err = i915_gem_gtt_insert(&ggtt->vm, NULL, &tmp, ii->size, ii->alignment, I915_COLOR_UNEVICTABLE, ii->start, ii->end, @@ -1594,7 +1594,7 @@ static int igt_gtt_insert(void *arg) } mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_insert(&ggtt->vm, &vma->node, + err = i915_gem_gtt_insert(&ggtt->vm, NULL, &vma->node, obj->base.size, 0, obj->cache_level, 0, ggtt->vm.total, 0); @@ -1654,7 +1654,7 @@ static int igt_gtt_insert(void *arg) } mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_insert(&ggtt->vm, &vma->node, + err = i915_gem_gtt_insert(&ggtt->vm, NULL, &vma->node, obj->base.size, 0, obj->cache_level, 0, ggtt->vm.total, 0); @@ -1703,7 +1703,7 @@ static int igt_gtt_insert(void *arg) } mutex_lock(&ggtt->vm.mutex); - err = i915_gem_gtt_insert(&ggtt->vm, &vma->node, + err = i915_gem_gtt_insert(&ggtt->vm, NULL, &vma->node, obj->base.size, 0, obj->cache_level, 0, ggtt->vm.total, 0); From patchwork Mon Jan 10 11:51:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maarten Lankhorst X-Patchwork-Id: 12708608 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 26B4AC433F5 for ; Mon, 10 Jan 2022 11:52:02 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8941614AC14; Mon, 10 Jan 2022 11:51:40 +0000 (UTC) Received: from mblankhorst.nl (mblankhorst.nl [141.105.120.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4FEFA14A92E; Mon, 10 Jan 2022 11:51:38 +0000 (UTC) From: Maarten Lankhorst To: intel-gfx@lists.freedesktop.org Date: Mon, 10 Jan 2022 12:51:30 +0100 Message-Id: <20220110115133.1500718-5-maarten.lankhorst@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 4/7] drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for i915_vma_unbind, v2. X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" We want to remove more members of i915_vma, which requires the locking to be held more often. Start requiring gem object lock for i915_vma_unbind, as it's one of the callers that may unpin pages. Some special care is needed when evicting, because the last reference to the object may be held by the VMA, so after __i915_vma_unbind, vma may be garbage, and we need to cache vma->obj before unlocking. Changes since v1: - Make trylock failing a WARN. (Matt) - Remove double i915_vma_wait_for_bind() (Matt) - Move atomic_set to right before mutex_unlock(), to make it more clear they belong together. (Matt) Signed-off-by: Maarten Lankhorst Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/display/intel_fb_pin.c | 2 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 2 +- .../i915/gem/selftests/i915_gem_client_blt.c | 2 +- .../drm/i915/gem/selftests/i915_gem_mman.c | 6 +++ drivers/gpu/drm/i915/gt/intel_ggtt.c | 49 ++++++++++++++++--- drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_vma.c | 27 +++++++++- drivers/gpu/drm/i915/i915_vma.h | 1 + drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 22 ++++----- drivers/gpu/drm/i915/selftests/i915_vma.c | 8 +-- 10 files changed, 95 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_fb_pin.c b/drivers/gpu/drm/i915/display/intel_fb_pin.c index 31c15e5fca95..9c555f6d1958 100644 --- a/drivers/gpu/drm/i915/display/intel_fb_pin.c +++ b/drivers/gpu/drm/i915/display/intel_fb_pin.c @@ -47,7 +47,7 @@ intel_pin_fb_obj_dpt(struct drm_framebuffer *fb, goto err; if (i915_vma_misplaced(vma, 0, alignment, 0)) { - ret = i915_vma_unbind(vma); + ret = i915_vma_unbind_unlocked(vma); if (ret) { vma = ERR_PTR(ret); goto err; diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 11f0aa65f8a3..b14c4e0a58d8 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -647,7 +647,7 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg) * pages. 
*/ for (offset = 4096; offset < page_size; offset += 4096) { - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) goto out_unpin; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c index c08f766e6e15..c8ff8bf0986d 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c @@ -318,7 +318,7 @@ static int pin_buffer(struct i915_vma *vma, u64 addr) int err; if (drm_mm_node_allocated(&vma->node) && vma->node.start != addr) { - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) return err; } diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index f61356b72b1c..ba29767348be 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -166,7 +166,9 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj, kunmap(p); out: + i915_gem_object_lock(obj, NULL); __i915_vma_put(vma); + i915_gem_object_unlock(obj); return err; } @@ -261,7 +263,9 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj, if (err) return err; + i915_gem_object_lock(obj, NULL); __i915_vma_put(vma); + i915_gem_object_unlock(obj); if (igt_timeout(end_time, "%s: timed out after tiling=%d stride=%d\n", @@ -1352,7 +1356,9 @@ static int __igt_mmap_revoke(struct drm_i915_private *i915, * for other objects. Ergo we have to revoke the previous mmap PTE * access as it no longer points to the same object. */ + i915_gem_object_lock(obj, NULL); err = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE); + i915_gem_object_unlock(obj); if (err) { pr_err("Failed to unbind object!\n"); goto out_unmap; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index e416e1f12d1a..e73d453a0d6b 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -129,22 +129,49 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm) drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt); +retry: + i915_gem_drain_freed_objects(vm->i915); + mutex_lock(&vm->mutex); /* Skip rewriting PTE on VMA unbind. */ open = atomic_xchg(&vm->open, 0); list_for_each_entry_safe(vma, vn, &vm->bound_list, vm_link) { + struct drm_i915_gem_object *obj = vma->obj; + GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); - i915_vma_wait_for_bind(vma); - if (i915_vma_is_pinned(vma)) + if (i915_vma_is_pinned(vma) || !i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) continue; - if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) { - __i915_vma_evict(vma); - drm_mm_remove_node(&vma->node); + /* unlikely to race when GPU is idle, so no worry about slowpath.. 
*/ + if (WARN_ON(!i915_gem_object_trylock(obj, NULL))) { + /* + * No dead objects should appear here, GPU should be + * completely idle, and userspace suspended + */ + i915_gem_object_get(obj); + + atomic_set(&vm->open, open); + mutex_unlock(&vm->mutex); + + i915_gem_object_lock(obj, NULL); + open = i915_vma_unbind(vma); + i915_gem_object_unlock(obj); + + GEM_WARN_ON(open); + + i915_gem_object_put(obj); + goto retry; } + + i915_vma_wait_for_bind(vma); + + __i915_vma_evict(vma); + drm_mm_remove_node(&vma->node); + + i915_gem_object_unlock(obj); } vm->clear_range(vm, 0, vm->total); @@ -742,11 +769,21 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt) atomic_set(&ggtt->vm.open, 0); flush_workqueue(ggtt->vm.i915->wq); + i915_gem_drain_freed_objects(ggtt->vm.i915); mutex_lock(&ggtt->vm.mutex); - list_for_each_entry_safe(vma, vn, &ggtt->vm.bound_list, vm_link) + list_for_each_entry_safe(vma, vn, &ggtt->vm.bound_list, vm_link) { + struct drm_i915_gem_object *obj = vma->obj; + bool trylock; + + trylock = i915_gem_object_trylock(obj, NULL); + WARN_ON(!trylock); + WARN_ON(__i915_vma_unbind(vma)); + if (trylock) + i915_gem_object_unlock(obj); + } if (drm_mm_node_allocated(&ggtt->error_capture)) drm_mm_remove_node(&ggtt->error_capture); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e3730096abd9..b4db094fdd70 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -119,6 +119,8 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj, struct i915_vma *vma; int ret; + assert_object_held(obj); + if (list_empty(&obj->vma.list)) return 0; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 8284bf389cd5..b7f25ff65c19 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1605,8 +1605,16 @@ void i915_vma_parked(struct intel_gt *gt) struct drm_i915_gem_object *obj = vma->obj; struct i915_address_space *vm = vma->vm; - INIT_LIST_HEAD(&vma->closed_link); - __i915_vma_put(vma); + if (i915_gem_object_trylock(obj, NULL)) { + INIT_LIST_HEAD(&vma->closed_link); + __i915_vma_put(vma); + i915_gem_object_unlock(obj); + } else { + /* back you go.. 
*/ + spin_lock_irq(>->closed_lock); + list_add(&vma->closed_link, >->closed_vma); + spin_unlock_irq(>->closed_lock); + } i915_gem_object_put(obj); i915_vm_close(vm); @@ -1722,6 +1730,7 @@ int _i915_vma_move_to_active(struct i915_vma *vma, void __i915_vma_evict(struct i915_vma *vma) { GEM_BUG_ON(i915_vma_is_pinned(vma)); + assert_object_held_shared(vma->obj); if (i915_vma_is_map_and_fenceable(vma)) { /* Force a pagefault for domain tracking on next user access */ @@ -1767,6 +1776,7 @@ int __i915_vma_unbind(struct i915_vma *vma) int ret; lockdep_assert_held(&vma->vm->mutex); + assert_object_held_shared(vma->obj); if (!drm_mm_node_allocated(&vma->node)) return 0; @@ -1798,6 +1808,8 @@ int i915_vma_unbind(struct i915_vma *vma) intel_wakeref_t wakeref = 0; int err; + assert_object_held_shared(vma->obj); + /* Optimistic wait before taking the mutex */ err = i915_vma_sync(vma); if (err) @@ -1828,6 +1840,17 @@ int i915_vma_unbind(struct i915_vma *vma) return err; } +int i915_vma_unbind_unlocked(struct i915_vma *vma) +{ + int err; + + i915_gem_object_lock(vma->obj, NULL); + err = i915_vma_unbind(vma); + i915_gem_object_unlock(vma->obj); + + return err; +} + struct i915_vma *i915_vma_make_unshrinkable(struct i915_vma *vma) { i915_gem_object_make_unshrinkable(vma->obj); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 32719431b3df..da69ecb1b860 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -214,6 +214,7 @@ void i915_vma_revoke_mmap(struct i915_vma *vma); void __i915_vma_evict(struct i915_vma *vma); int __i915_vma_unbind(struct i915_vma *vma); int __must_check i915_vma_unbind(struct i915_vma *vma); +int __must_check i915_vma_unbind_unlocked(struct i915_vma *vma); void i915_vma_unlink_ctx(struct i915_vma *vma); void i915_vma_close(struct i915_vma *vma); void i915_vma_reopen(struct i915_vma *vma); diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index 0336c065f331..b7f42b21a3aa 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -385,7 +385,7 @@ static void close_object_list(struct list_head *objects, vma = i915_vma_instance(obj, vm, NULL); if (!IS_ERR(vma)) - ignored = i915_vma_unbind(vma); + ignored = i915_vma_unbind_unlocked(vma); list_del(&obj->st_link); i915_gem_object_put(obj); @@ -496,7 +496,7 @@ static int fill_hole(struct i915_address_space *vm, goto err; } - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("%s(%s) (forward) unbind of vma.node=%llx + %llx failed with err=%d\n", __func__, p->name, vma->node.start, vma->node.size, @@ -569,7 +569,7 @@ static int fill_hole(struct i915_address_space *vm, goto err; } - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("%s(%s) (backward) unbind of vma.node=%llx + %llx failed with err=%d\n", __func__, p->name, vma->node.start, vma->node.size, @@ -655,7 +655,7 @@ static int walk_hole(struct i915_address_space *vm, goto err_put; } - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("%s unbind failed at %llx + %llx with err=%d\n", __func__, addr, vma->size, err); @@ -732,13 +732,13 @@ static int pot_hole(struct i915_address_space *vm, pr_err("%s incorrect at %llx + %llx\n", __func__, addr, vma->size); i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); err = -EINVAL; goto err_obj; } i915_vma_unpin(vma); - err = 
i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); GEM_BUG_ON(err); } @@ -832,13 +832,13 @@ static int drunk_hole(struct i915_address_space *vm, pr_err("%s incorrect at %llx + %llx\n", __func__, addr, BIT_ULL(size)); i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); err = -EINVAL; goto err_obj; } i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); GEM_BUG_ON(err); if (igt_timeout(end_time, @@ -906,7 +906,7 @@ static int __shrink_hole(struct i915_address_space *vm, pr_err("%s incorrect at %llx + %llx\n", __func__, addr, size); i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); err = -EINVAL; break; } @@ -1465,7 +1465,7 @@ static int igt_gtt_reserve(void *arg) goto out; } - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("i915_vma_unbind failed with err=%d!\n", err); goto out; @@ -1647,7 +1647,7 @@ static int igt_gtt_insert(void *arg) GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); offset = vma->node.start; - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("i915_vma_unbind failed with err=%d!\n", err); goto out; diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c index de37cfa4c65f..0280605a2673 100644 --- a/drivers/gpu/drm/i915/selftests/i915_vma.c +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c @@ -340,7 +340,7 @@ static int igt_vma_pin1(void *arg) if (!err) { i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("Failed to unbind single page from GGTT, err=%d\n", err); goto out; @@ -691,7 +691,7 @@ static int igt_vma_rotate_remap(void *arg) } i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("Unbinding returned %i\n", err); goto out_object; @@ -852,7 +852,7 @@ static int igt_vma_partial(void *arg) i915_vma_unpin(vma); nvma++; - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("Unbinding returned %i\n", err); goto out_object; @@ -891,7 +891,7 @@ static int igt_vma_partial(void *arg) i915_vma_unpin(vma); - err = i915_vma_unbind(vma); + err = i915_vma_unbind_unlocked(vma); if (err) { pr_err("Unbinding returned %i\n", err); goto out_object; From patchwork Mon Jan 10 11:51:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maarten Lankhorst X-Patchwork-Id: 12708611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C7FF0C433FE for ; Mon, 10 Jan 2022 11:52:07 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 45A8814AC16; Mon, 10 Jan 2022 11:51:43 +0000 (UTC) Received: from mblankhorst.nl (mblankhorst.nl [IPv6:2a02:2308:0:7ec:e79c:4e97:b6c4:f0ae]) by gabe.freedesktop.org (Postfix) with ESMTPS id 748F614A96B; Mon, 10 Jan 2022 11:51:39 +0000 (UTC) From: Maarten Lankhorst To: intel-gfx@lists.freedesktop.org Date: Mon, 10 Jan 2022 12:51:31 +0100 Message-Id: <20220110115133.1500718-6-maarten.lankhorst@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: 
<20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 5/7] drm/i915: Remove assert_object_held_shared X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Auld , dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This duck tape workaround is no longer required, unbind and destroy are fixed to take the obj->resv mutex before destroying and obj->mm.lock has been removed, always requiring obj->resv as well. Signed-off-by: Maarten Lankhorst Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 4 ++-- drivers/gpu/drm/i915/gem/i915_gem_object.h | 14 -------------- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 +++++----- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 6 +++--- 5 files changed, 11 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index d87b508b59b1..fd34b1a115c4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -550,7 +550,7 @@ bool i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj) #ifdef CONFIG_LOCKDEP if (IS_DGFX(to_i915(obj->base.dev)) && i915_gem_object_evictable((void __force *)obj)) - assert_object_held_shared(obj); + assert_object_held(obj); #endif return obj->mem_flags & I915_BO_FLAG_STRUCT_PAGE; } @@ -569,7 +569,7 @@ bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj) #ifdef CONFIG_LOCKDEP if (IS_DGFX(to_i915(obj->base.dev)) && i915_gem_object_evictable((void __force *)obj)) - assert_object_held_shared(obj); + assert_object_held(obj); #endif return obj->mem_flags & I915_BO_FLAG_IOMEM; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index bc448f895ae8..c1cdfaf2d1e3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -157,20 +157,6 @@ i915_gem_object_put(struct drm_i915_gem_object *obj) #define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv) -/* - * If more than one potential simultaneous locker, assert held. - */ -static inline void assert_object_held_shared(const struct drm_i915_gem_object *obj) -{ - /* - * Note mm list lookup is protected by - * kref_get_unless_zero(). 
- */ - if (IS_ENABLED(CONFIG_LOCKDEP) && - kref_read(&obj->base.refcount) > 0) - assert_object_held(obj); -} - static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj, struct i915_gem_ww_ctx *ww, bool intr) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 7d2211fbe548..a1a785068779 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -19,7 +19,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, bool shrinkable; int i; - assert_object_held_shared(obj); + assert_object_held(obj); if (i915_gem_object_is_volatile(obj)) obj->mm.madv = I915_MADV_DONTNEED; @@ -95,7 +95,7 @@ int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj) struct drm_i915_private *i915 = to_i915(obj->base.dev); int err; - assert_object_held_shared(obj); + assert_object_held(obj); if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) { drm_dbg(&i915->drm, @@ -122,7 +122,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) assert_object_held(obj); - assert_object_held_shared(obj); + assert_object_held(obj); if (unlikely(!i915_gem_object_has_pages(obj))) { GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj)); @@ -191,7 +191,7 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) { struct sg_table *pages; - assert_object_held_shared(obj); + assert_object_held(obj); pages = fetch_and_zero(&obj->mm.pages); if (IS_ERR_OR_NULL(pages)) @@ -222,7 +222,7 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj) return -EBUSY; /* May be called by shrinker from within get_pages() (on another bo) */ - assert_object_held_shared(obj); + assert_object_held(obj); i915_gem_object_release_mmap_offset(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 3cc01c30dd62..8a0da441225b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -109,7 +109,7 @@ static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj) { struct page **pvec = NULL; - assert_object_held_shared(obj); + assert_object_held(obj); if (!--obj->userptr.page_ref) { pvec = obj->userptr.pvec; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index b7f25ff65c19..11e10e0d628c 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1730,7 +1730,7 @@ int _i915_vma_move_to_active(struct i915_vma *vma, void __i915_vma_evict(struct i915_vma *vma) { GEM_BUG_ON(i915_vma_is_pinned(vma)); - assert_object_held_shared(vma->obj); + assert_object_held(vma->obj); if (i915_vma_is_map_and_fenceable(vma)) { /* Force a pagefault for domain tracking on next user access */ @@ -1776,7 +1776,7 @@ int __i915_vma_unbind(struct i915_vma *vma) int ret; lockdep_assert_held(&vma->vm->mutex); - assert_object_held_shared(vma->obj); + assert_object_held(vma->obj); if (!drm_mm_node_allocated(&vma->node)) return 0; @@ -1808,7 +1808,7 @@ int i915_vma_unbind(struct i915_vma *vma) intel_wakeref_t wakeref = 0; int err; - assert_object_held_shared(vma->obj); + assert_object_held(vma->obj); /* Optimistic wait before taking the mutex */ err = i915_vma_sync(vma); From patchwork Mon Jan 10 11:51:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maarten Lankhorst X-Patchwork-Id: 12708610 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9C785C433F5 for ; Mon, 10 Jan 2022 11:52:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 46D7214AC18; Mon, 10 Jan 2022 11:51:43 +0000 (UTC) Received: from mblankhorst.nl (mblankhorst.nl [141.105.120.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id A0F1214AB58; Mon, 10 Jan 2022 11:51:39 +0000 (UTC) From: Maarten Lankhorst To: intel-gfx@lists.freedesktop.org Date: Mon, 10 Jan 2022 12:51:32 +0100 Message-Id: <20220110115133.1500718-7-maarten.lankhorst@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 6/7] drm/i915: Remove support for unlocked i915_vma unbind X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Now that we require the object lock for all ops, some code handling race conditions can be removed. This is required to not take short-term pins inside execbuf. Signed-off-by: Maarten Lankhorst Acked-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_vma.c | 55 +++++---------------------------- 1 file changed, 8 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 11e10e0d628c..8859feb7d131 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -777,7 +777,6 @@ i915_vma_detach(struct i915_vma *vma) static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) { unsigned int bound; - bool pinned = true; bound = atomic_read(&vma->flags); do { @@ -787,34 +786,10 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR))) return false; - if (!(bound & I915_VMA_PIN_MASK)) - goto unpinned; - GEM_BUG_ON(((bound + 1) & I915_VMA_PIN_MASK) == 0); } while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1)); return true; - -unpinned: - /* - * If pin_count==0, but we are bound, check under the lock to avoid - * racing with a concurrent i915_vma_unbind(). 
- */ - mutex_lock(&vma->vm->mutex); - do { - if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR))) { - pinned = false; - break; - } - - if (unlikely(flags & ~bound)) { - pinned = false; - break; - } - } while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1)); - mutex_unlock(&vma->vm->mutex); - - return pinned; } static struct scatterlist * @@ -1153,7 +1128,6 @@ static int __i915_vma_get_pages(struct i915_vma *vma) { struct sg_table *pages; - int ret; /* * The vma->pages are only valid within the lifespan of the borrowed @@ -1186,18 +1160,16 @@ __i915_vma_get_pages(struct i915_vma *vma) break; } - ret = 0; if (IS_ERR(pages)) { - ret = PTR_ERR(pages); - pages = NULL; drm_err(&vma->vm->i915->drm, - "Failed to get pages for VMA view type %u (%d)!\n", - vma->ggtt_view.type, ret); + "Failed to get pages for VMA view type %u (%ld)!\n", + vma->ggtt_view.type, PTR_ERR(pages)); + return PTR_ERR(pages); } vma->pages = pages; - return ret; + return 0; } I915_SELFTEST_EXPORT int i915_vma_get_pages(struct i915_vma *vma) @@ -1229,25 +1201,14 @@ I915_SELFTEST_EXPORT int i915_vma_get_pages(struct i915_vma *vma) static void __vma_put_pages(struct i915_vma *vma, unsigned int count) { /* We allocate under vma_get_pages, so beware the shrinker */ - struct sg_table *pages = READ_ONCE(vma->pages); - GEM_BUG_ON(atomic_read(&vma->pages_count) < count); if (atomic_sub_return(count, &vma->pages_count) == 0) { - /* - * The atomic_sub_return is a read barrier for the READ_ONCE of - * vma->pages above. - * - * READ_ONCE is safe because this is either called from the same - * function (i915_vma_pin_ww), or guarded by vma->vm->mutex. - * - * TODO: We're leaving vma->pages dangling, until vma->obj->resv - * lock is required. - */ - if (pages != vma->obj->mm.pages) { - sg_free_table(pages); - kfree(pages); + if (vma->pages != vma->obj->mm.pages) { + sg_free_table(vma->pages); + kfree(vma->pages); } + vma->pages = NULL; i915_gem_object_unpin_pages(vma->obj); } From patchwork Mon Jan 10 11:51:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maarten Lankhorst X-Patchwork-Id: 12708609 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B3AA2C433FE for ; Mon, 10 Jan 2022 11:52:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7CD3714A94F; Mon, 10 Jan 2022 11:51:41 +0000 (UTC) Received: from mblankhorst.nl (mblankhorst.nl [IPv6:2a02:2308:0:7ec:e79c:4e97:b6c4:f0ae]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9567014AB0A; Mon, 10 Jan 2022 11:51:39 +0000 (UTC) From: Maarten Lankhorst To: intel-gfx@lists.freedesktop.org Date: Mon, 10 Jan 2022 12:51:33 +0100 Message-Id: <20220110115133.1500718-8-maarten.lankhorst@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> References: <20220110115133.1500718-1-maarten.lankhorst@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 7/7] drm/i915: Remove short-term pins from execbuf, v6. 
X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Auld , dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Add a flag, PIN_VALIDATE, to indicate that we don't need to pin and are only protected by the object lock. This removes the need to unpin: releasing the lock is enough. eb_reserve is slightly reworked for readability, but the same steps are still done: - First pass pins with NONBLOCK. - Second pass unbinds all objects first, then pins. - Third pass is only called when not all objects are softpinned, and unbinds all objects, then calls i915_gem_evict_vm(), then pins. Changes since v1: - Split out eb_reserve() into separate functions for readability. Changes since v2: - Make the batch buffer mappable on platforms where only GGTT is available, to prevent moving the batch buffer during relocations. Changes since v3: - Preserve current behavior for the batch buffer; instead, be cautious when calling i915_gem_object_ggtt_pin_ww, and re-use the current batch vma if it's inside the ggtt and map-and-fenceable. - Remove impossible condition check from eb_reserve. (Matt) Changes since v5: - Do not even temporarily pin, just call i915_gem_evict_vm() and mark all vmas as unpinned. Signed-off-by: Maarten Lankhorst Reviewed-by: Matthew Auld --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 220 +++++++++--------- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 1 - drivers/gpu/drm/i915/i915_gem_gtt.h | 1 + drivers/gpu/drm/i915/i915_vma.c | 24 +- 4 files changed, 128 insertions(+), 118 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index da35a143af36..cfff194d90e7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -441,7 +441,7 @@ eb_pin_vma(struct i915_execbuffer *eb, else pin_flags = entry->offset & PIN_OFFSET_MASK; - pin_flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED; + pin_flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED | PIN_VALIDATE; if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_GTT)) pin_flags |= PIN_GLOBAL; @@ -459,17 +459,15 @@ eb_pin_vma(struct i915_execbuffer *eb, entry->pad_to_size, entry->alignment, eb_pin_flags(entry, ev->flags) | - PIN_USER | PIN_NOEVICT); + PIN_USER | PIN_NOEVICT | PIN_VALIDATE); if (unlikely(err)) return err; } if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) { err = i915_vma_pin_fence(vma); - if (unlikely(err)) { - i915_vma_unpin(vma); + if (unlikely(err)) return err; - } if (vma->fence) ev->flags |= __EXEC_OBJECT_HAS_FENCE; @@ -485,13 +483,9 @@ eb_pin_vma(struct i915_execbuffer *eb, static inline void eb_unreserve_vma(struct eb_vma *ev) { - if (!(ev->flags & __EXEC_OBJECT_HAS_PIN)) - return; - if (unlikely(ev->flags & __EXEC_OBJECT_HAS_FENCE)) __i915_vma_unpin_fence(ev->vma); - __i915_vma_unpin(ev->vma); ev->flags &= ~__EXEC_OBJECT_RESERVED; } @@ -684,10 +678,8 @@ static int eb_reserve_vma(struct i915_execbuffer *eb, if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) { err = i915_vma_pin_fence(vma); - if (unlikely(err)) { - i915_vma_unpin(vma); + if (unlikely(err)) return err; - } if (vma->fence) ev->flags |= __EXEC_OBJECT_HAS_FENCE; @@ -699,85 +691,95 @@ static int eb_reserve_vma(struct i915_execbuffer *eb, return 0; } -static int eb_reserve(struct i915_execbuffer *eb) +static bool eb_unbind(struct
i915_execbuffer *eb, bool force) { const unsigned int count = eb->buffer_count; - unsigned int pin_flags = PIN_USER | PIN_NONBLOCK; + unsigned int i; struct list_head last; + bool unpinned = false; + + /* Resort *all* the objects into priority order */ + INIT_LIST_HEAD(&eb->unbound); + INIT_LIST_HEAD(&last); + + for (i = 0; i < count; i++) { + struct eb_vma *ev = &eb->vma[i]; + unsigned int flags = ev->flags; + + if (!force && flags & EXEC_OBJECT_PINNED && + flags & __EXEC_OBJECT_HAS_PIN) + continue; + + unpinned = true; + eb_unreserve_vma(ev); + + if (flags & EXEC_OBJECT_PINNED) + /* Pinned must have their slot */ + list_add(&ev->bind_link, &eb->unbound); + else if (flags & __EXEC_OBJECT_NEEDS_MAP) + /* Map require the lowest 256MiB (aperture) */ + list_add_tail(&ev->bind_link, &eb->unbound); + else if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)) + /* Prioritise 4GiB region for restricted bo */ + list_add(&ev->bind_link, &last); + else + list_add_tail(&ev->bind_link, &last); + } + + list_splice_tail(&last, &eb->unbound); + return unpinned; +} + +static int eb_reserve(struct i915_execbuffer *eb) +{ struct eb_vma *ev; - unsigned int i, pass; + unsigned int pass; int err = 0; + bool unpinned; /* * Attempt to pin all of the buffers into the GTT. - * This is done in 3 phases: + * This is done in 2 phases: * - * 1a. Unbind all objects that do not match the GTT constraints for - * the execbuffer (fenceable, mappable, alignment etc). - * 1b. Increment pin count for already bound objects. - * 2. Bind new objects. - * 3. Decrement pin count. + * 1. Unbind all objects that do not match the GTT constraints for + * the execbuffer (fenceable, mappable, alignment etc). + * 2. Bind new objects. * * This avoid unnecessary unbinding of later objects in order to make * room for the earlier objects *unless* we need to defragment. + * + * Defragmenting is skipped if all objects are pinned at a fixed location. 
*/ - pass = 0; - do { - list_for_each_entry(ev, &eb->unbound, bind_link) { - err = eb_reserve_vma(eb, ev, pin_flags); - if (err) - break; - } - if (err != -ENOSPC) - return err; + for (pass = 0; pass <= 2; pass++) { + int pin_flags = PIN_USER | PIN_VALIDATE; - /* Resort *all* the objects into priority order */ - INIT_LIST_HEAD(&eb->unbound); - INIT_LIST_HEAD(&last); - for (i = 0; i < count; i++) { - unsigned int flags; + if (pass == 0) + pin_flags |= PIN_NONBLOCK; - ev = &eb->vma[i]; - flags = ev->flags; - if (flags & EXEC_OBJECT_PINNED && - flags & __EXEC_OBJECT_HAS_PIN) - continue; + if (pass >= 1) + unpinned = eb_unbind(eb, pass == 2); - eb_unreserve_vma(ev); - - if (flags & EXEC_OBJECT_PINNED) - /* Pinned must have their slot */ - list_add(&ev->bind_link, &eb->unbound); - else if (flags & __EXEC_OBJECT_NEEDS_MAP) - /* Map require the lowest 256MiB (aperture) */ - list_add_tail(&ev->bind_link, &eb->unbound); - else if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)) - /* Prioritise 4GiB region for restricted bo */ - list_add(&ev->bind_link, &last); - else - list_add_tail(&ev->bind_link, &last); - } - list_splice_tail(&last, &eb->unbound); - - switch (pass++) { - case 0: - break; - - case 1: - /* Too fragmented, unbind everything and retry */ - mutex_lock(&eb->context->vm->mutex); - err = i915_gem_evict_vm(eb->context->vm, &eb->ww); - mutex_unlock(&eb->context->vm->mutex); + if (pass == 2) { + err = mutex_lock_interruptible(&eb->context->vm->mutex); + if (!err) { + err = i915_gem_evict_vm(eb->context->vm, &eb->ww); + mutex_unlock(&eb->context->vm->mutex); + } if (err) return err; - break; + } - default: - return -ENOSPC; + list_for_each_entry(ev, &eb->unbound, bind_link) { + err = eb_reserve_vma(eb, ev, pin_flags); + if (err) + break; } - pin_flags = PIN_USER; - } while (1); + if (err != -ENOSPC) + break; + } + + return err; } static int eb_select_context(struct i915_execbuffer *eb) @@ -1225,10 +1227,11 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj, return vaddr; } -static void *reloc_iomap(struct drm_i915_gem_object *obj, +static void *reloc_iomap(struct i915_vma *batch, struct i915_execbuffer *eb, unsigned long page) { + struct drm_i915_gem_object *obj = batch->obj; struct reloc_cache *cache = &eb->reloc_cache; struct i915_ggtt *ggtt = cache_to_ggtt(cache); unsigned long offset; @@ -1238,7 +1241,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, intel_gt_flush_ggtt_writes(ggtt->vm.gt); io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr)); } else { - struct i915_vma *vma; + struct i915_vma *vma = ERR_PTR(-ENODEV); int err; if (i915_gem_object_is_tiled(obj)) @@ -1251,10 +1254,23 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, if (err) return ERR_PTR(err); - vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, - PIN_MAPPABLE | - PIN_NONBLOCK /* NOWARN */ | - PIN_NOEVICT); + /* + * i915_gem_object_ggtt_pin_ww may attempt to remove the batch + * VMA from the object list because we no longer pin. + * + * Only attempt to pin the batch buffer to ggtt if the current batch + * is not inside ggtt, or the batch buffer is not misplaced. 
+ */ + if (!i915_is_ggtt(batch->vm)) { + vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, + PIN_MAPPABLE | + PIN_NONBLOCK /* NOWARN */ | + PIN_NOEVICT); + } else if (i915_vma_is_map_and_fenceable(batch)) { + __i915_vma_pin(batch); + vma = batch; + } + if (vma == ERR_PTR(-EDEADLK)) return vma; @@ -1292,7 +1308,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, return vaddr; } -static void *reloc_vaddr(struct drm_i915_gem_object *obj, +static void *reloc_vaddr(struct i915_vma *vma, struct i915_execbuffer *eb, unsigned long page) { @@ -1304,9 +1320,9 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj, } else { vaddr = NULL; if ((cache->vaddr & KMAP) == 0) - vaddr = reloc_iomap(obj, eb, page); + vaddr = reloc_iomap(vma, eb, page); if (!vaddr) - vaddr = reloc_kmap(obj, cache, page); + vaddr = reloc_kmap(vma->obj, cache, page); } return vaddr; @@ -1347,7 +1363,7 @@ relocate_entry(struct i915_vma *vma, void *vaddr; repeat: - vaddr = reloc_vaddr(vma->obj, eb, + vaddr = reloc_vaddr(vma, eb, offset >> PAGE_SHIFT); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); @@ -2209,7 +2225,7 @@ shadow_batch_pin(struct i915_execbuffer *eb, if (IS_ERR(vma)) return vma; - err = i915_vma_pin_ww(vma, &eb->ww, 0, 0, flags); + err = i915_vma_pin_ww(vma, &eb->ww, 0, 0, flags | PIN_VALIDATE); if (err) return ERR_PTR(err); @@ -2223,7 +2239,7 @@ static struct i915_vma *eb_dispatch_secure(struct i915_execbuffer *eb, struct i9 * batch" bit. Hence we need to pin secure batches into the global gtt. * hsw should have this fixed, but bdw mucks it up again. */ if (eb->batch_flags & I915_DISPATCH_SECURE) - return i915_gem_object_ggtt_pin_ww(vma->obj, &eb->ww, NULL, 0, 0, 0); + return i915_gem_object_ggtt_pin_ww(vma->obj, &eb->ww, NULL, 0, 0, PIN_VALIDATE); return NULL; } @@ -2274,13 +2290,12 @@ static int eb_parse(struct i915_execbuffer *eb) err = i915_gem_object_lock(pool->obj, &eb->ww); if (err) - goto err; + return err; shadow = shadow_batch_pin(eb, pool->obj, eb->context->vm, PIN_USER); - if (IS_ERR(shadow)) { - err = PTR_ERR(shadow); - goto err; - } + if (IS_ERR(shadow)) + return PTR_ERR(shadow); + intel_gt_buffer_pool_mark_used(pool); i915_gem_object_set_readonly(shadow->obj); shadow->private = pool; @@ -2292,25 +2307,21 @@ static int eb_parse(struct i915_execbuffer *eb) shadow = shadow_batch_pin(eb, pool->obj, &eb->gt->ggtt->vm, PIN_GLOBAL); - if (IS_ERR(shadow)) { - err = PTR_ERR(shadow); - shadow = trampoline; - goto err_shadow; - } + if (IS_ERR(shadow)) + return PTR_ERR(shadow); + shadow->private = pool; eb->batch_flags |= I915_DISPATCH_SECURE; } batch = eb_dispatch_secure(eb, shadow); - if (IS_ERR(batch)) { - err = PTR_ERR(batch); - goto err_trampoline; - } + if (IS_ERR(batch)) + return PTR_ERR(batch); err = dma_resv_reserve_shared(shadow->obj->base.resv, 1); if (err) - goto err_trampoline; + return err; err = intel_engine_cmd_parser(eb->context->engine, eb->batches[0]->vma, @@ -2318,7 +2329,7 @@ static int eb_parse(struct i915_execbuffer *eb) eb->batch_len[0], shadow, trampoline); if (err) - goto err_unpin_batch; + return err; eb->batches[0] = &eb->vma[eb->buffer_count++]; eb->batches[0]->vma = i915_vma_get(shadow); @@ -2337,17 +2348,6 @@ static int eb_parse(struct i915_execbuffer *eb) eb->batches[0]->vma = i915_vma_get(batch); } return 0; - -err_unpin_batch: - if (batch) - i915_vma_unpin(batch); -err_trampoline: - if (trampoline) - i915_vma_unpin(trampoline); -err_shadow: - i915_vma_unpin(shadow); -err: - return err; } static int eb_request_submit(struct i915_execbuffer *eb, @@ -3468,8 
+3468,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_vma: eb_release_vmas(&eb, true); - if (eb.trampoline) - i915_vma_unpin(eb.trampoline); WARN_ON(err == -EDEADLK); i915_gem_ww_ctx_fini(&eb.ww); diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index beabf3bc9b75..c52d255e8ef3 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -425,7 +425,6 @@ int i915_vma_pin_fence(struct i915_vma *vma) * must keep the device awake whilst using the fence. */ assert_rpm_wakelock_held(vma->vm->gt->uncore->rpm); - GEM_BUG_ON(!i915_vma_is_pinned(vma)); GEM_BUG_ON(!i915_vma_is_ggtt(vma)); err = mutex_lock_interruptible(&vma->vm->mutex); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index e4938aba3fe9..8c2f57eb5dda 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -44,6 +44,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, #define PIN_HIGH BIT_ULL(5) #define PIN_OFFSET_BIAS BIT_ULL(6) #define PIN_OFFSET_FIXED BIT_ULL(7) +#define PIN_VALIDATE BIT_ULL(8) /* validate placement only, no need to call unpin() */ #define PIN_GLOBAL BIT_ULL(10) /* I915_VMA_GLOBAL_BIND */ #define PIN_USER BIT_ULL(11) /* I915_VMA_LOCAL_BIND */ diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 8859feb7d131..5f43100638b5 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -779,6 +779,15 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) unsigned int bound; bound = atomic_read(&vma->flags); + + if (flags & PIN_VALIDATE) { + flags &= I915_VMA_BIND_MASK; + + return (flags & bound) == flags; + } + + /* with the lock mandatory for unbind, we don't race here */ + flags &= I915_VMA_BIND_MASK; do { if (unlikely(flags & ~bound)) return false; @@ -1254,7 +1263,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, GEM_BUG_ON(!(flags & (PIN_USER | PIN_GLOBAL))); /* First try and grab the pin without rebinding the vma */ - if (try_qad_pin(vma, flags & I915_VMA_BIND_MASK)) + if (try_qad_pin(vma, flags)) return 0; err = i915_vma_get_pages(vma); @@ -1336,7 +1345,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, } if (unlikely(!(flags & ~bound & I915_VMA_BIND_MASK))) { - __i915_vma_pin(vma); + if (!(flags & PIN_VALIDATE)) + __i915_vma_pin(vma); goto err_unlock; } @@ -1365,8 +1375,10 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, atomic_add(I915_VMA_PAGES_ACTIVE, &vma->pages_count); list_move_tail(&vma->vm_link, &vma->vm->bound_list); - __i915_vma_pin(vma); - GEM_BUG_ON(!i915_vma_is_pinned(vma)); + if (!(flags & PIN_VALIDATE)) { + __i915_vma_pin(vma); + GEM_BUG_ON(!i915_vma_is_pinned(vma)); + } GEM_BUG_ON(!i915_vma_is_bound(vma, flags)); GEM_BUG_ON(i915_vma_misplaced(vma, size, alignment, flags)); @@ -1626,8 +1638,6 @@ static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request * { int err; - GEM_BUG_ON(!i915_vma_is_pinned(vma)); - /* Wait for the vma to be bound before we start! */ err = __i915_request_await_bind(rq, vma); if (err) @@ -1646,6 +1656,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma, assert_object_held(obj); + GEM_BUG_ON(!vma->pages); + err = __i915_vma_move_to_active(vma, rq); if (unlikely(err)) return err;
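
As background for the PIN_VALIDATE fastpath that this series adds to try_qad_pin(), a minimal userspace sketch of the same pattern follows. It is an illustration only: the names, bit layout, and the try_quick_pin() helper are simplified assumptions, not the driver's actual definitions. A validate-only caller just checks that the required binding bits are present in the vma's flags word and takes no reference; a normal caller atomically bumps the pin count embedded in the same word.

    /*
     * Minimal userspace sketch of the try_qad_pin()/PIN_VALIDATE pattern.
     * Names and bit layout are simplified assumptions for illustration,
     * not the driver's real definitions.
     * Build with: cc -std=c11 -o qadpin qadpin.c
     */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define VMA_PIN_MASK     0xffu        /* low bits: pin count */
    #define PIN_VALIDATE     (1u << 8)    /* request flag only, never stored */
    #define VMA_GLOBAL_BIND  (1u << 10)
    #define VMA_LOCAL_BIND   (1u << 11)
    #define VMA_BIND_MASK    (VMA_GLOBAL_BIND | VMA_LOCAL_BIND)

    /* One flags word holds both the binding bits and the pin count. */
    static _Atomic unsigned int vma_flags = VMA_LOCAL_BIND;

    static bool try_quick_pin(unsigned int flags)
    {
        unsigned int bound = atomic_load(&vma_flags);
        unsigned int want = flags & VMA_BIND_MASK;

        /*
         * Validate-only: the caller holds the object lock and unbind
         * requires that same lock, so a single check of the binding
         * bits is race-free and no reference needs to be taken.
         */
        if (flags & PIN_VALIDATE)
            return (bound & want) == want;

        /* Normal pin: retry the cmpxchg until the count is bumped. */
        do {
            if (want & ~bound)
                return false; /* not bound the way the caller needs */
        } while (!atomic_compare_exchange_weak(&vma_flags, &bound,
                                               bound + 1));

        return true;
    }

    int main(void)
    {
        printf("validate:  %d\n", try_quick_pin(VMA_LOCAL_BIND | PIN_VALIDATE));
        printf("pin:       %d\n", try_quick_pin(VMA_LOCAL_BIND));
        printf("pin count: %u\n", atomic_load(&vma_flags) & VMA_PIN_MASK);
        return 0;
    }

The validate-only branch is safe for the reason the patch's own comment gives ("with the lock mandatory for unbind, we don't race here"): after patch 6/7 every unbind path takes the object lock that the caller already holds. This is also why the error paths in eb_parse() no longer need i915_vma_unpin(): a PIN_VALIDATE pin holds no reference, so dropping the ww lock is all the cleanup required.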