From patchwork Wed Sep 9 14:07:09 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Micha=C5=82_Winiarski?= X-Patchwork-Id: 7147091 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 2F8799F1D3 for ; Wed, 9 Sep 2015 14:08:52 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id CECA72097E for ; Wed, 9 Sep 2015 14:08:50 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 7C22D20987 for ; Wed, 9 Sep 2015 14:08:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 82C756EB42; Wed, 9 Sep 2015 07:08:47 -0700 (PDT) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTP id C97716EB3C; Wed, 9 Sep 2015 07:08:45 -0700 (PDT) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP; 09 Sep 2015 07:07:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.17,496,1437462000"; d="scan'208";a="800883229" Received: from irsmsx101.ger.corp.intel.com ([163.33.3.153]) by orsmga002.jf.intel.com with ESMTP; 09 Sep 2015 07:07:50 -0700 Received: from mwiniars-desk1.igk.intel.com (172.28.173.140) by IRSMSX101.ger.corp.intel.com (163.33.3.153) with Microsoft SMTP Server id 14.3.224.2; Wed, 9 Sep 2015 15:07:49 +0100 From: =?UTF-8?q?Micha=C5=82=20Winiarski?= To: Date: Wed, 9 Sep 2015 16:07:09 +0200 Message-ID: <1441807630-20072-3-git-send-email-michal.winiarski@intel.com> X-Mailer: git-send-email 2.4.3 In-Reply-To: <1441807630-20072-1-git-send-email-michal.winiarski@intel.com> References: <1441807630-20072-1-git-send-email-michal.winiarski@intel.com> MIME-Version: 1.0 X-Originating-IP: [172.28.173.140] Cc: =?UTF-8?q?Kristian=20H=C3=B8gsberg?= , dri-devel@lists.freedesktop.org, Akash Goel , Vinay Belgaumkar , mesa-dev@lists.freedesktop.org Subject: [Intel-gfx] [PATCH v6 2/2] drm/i915: Add soft-pinning API for execbuffer X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Wilson Userspace can pass in an offset that it presumes the object is located at. The kernel will then do its utmost to fit the object into that location. The assumption is that userspace is handling its own object locations (for example along with full-ppgtt) and that the kernel will rarely have to make space for the user's requests. v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris Wilson. Fixed incorrect error paths causing crash found by Michal Winiarski. (Not published externally) v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed eviction to allow eviction of soft-pinned objects when another soft-pinned object used by a subsequent execbuffer overlaps reported by Michal Winiarski. (Not published externally) v4: Moved soft-pinned objects to the front of ordered_vmas so that they are pinned first after an address conflict happens to avoid repeated conflicts in rare cases (Suggested by Chris Wilson). Expanded comment on drm_i915_gem_exec_object2.offset to cover this new API. v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability (Kristian). Added check for multiple pinnings on eviction (Akash). Made sure buffers are not considered misplaced without the user specifying EXEC_OBJECT_SUPPORTS_48BBADDRESS. User must assume responsibility for any addressing workarounds. Updated object2.offset field comment again to clarify NO_RELOC case (Chris). checkpatch cleanup. v6: Rebase on top of current nightly. Dropped any references to EXEC_OBJECT_SUPPORTS_48BBADDRESS since those don't exist in upstream yet, and are not a dependency. Cc: Chris Wilson Cc: Akash Goel Cc: Vinay Belgaumkar Cc: Michal Winiarski Cc: Zou Nanhai Cc: Kristian Høgsberg Signed-off-by: Chris Wilson Signed-off-by: Thomas Daniel Signed-off-by: Micha? Winiarski --- drivers/gpu/drm/i915/i915_dma.c | 3 ++ drivers/gpu/drm/i915/i915_drv.h | 3 ++ drivers/gpu/drm/i915/i915_gem.c | 52 ++++++++++++++++++++++-------- drivers/gpu/drm/i915/i915_gem_evict.c | 39 ++++++++++++++++++++++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++++++-- include/uapi/drm/i915_drm.h | 10 +++++- 6 files changed, 106 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 066a0ef..bd619af 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -170,6 +170,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_RESOURCE_STREAMER: value = HAS_RESOURCE_STREAMER(dev); break; + case I915_PARAM_HAS_EXEC_SOFTPIN: + value = 1; + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2ef83c5..8eb01e0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2815,6 +2815,7 @@ void i915_gem_vma_destroy(struct i915_vma *vma); #define PIN_OFFSET_BIAS (1<<3) #define PIN_USER (1<<4) #define PIN_UPDATE (1<<5) +#define PIN_OFFSET_FIXED (1<<6) #define PIN_OFFSET_MASK (~4095) int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, @@ -3157,6 +3158,8 @@ int __must_check i915_gem_evict_something(struct drm_device *dev, unsigned long start, unsigned long end, unsigned flags); +int __must_check +i915_gem_evict_for_vma(struct i915_vma *target); int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle); int i915_gem_evict_everything(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 41263cd..bb2ff81 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3437,22 +3437,42 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, if (IS_ERR(vma)) goto err_unpin; + if (flags & PIN_OFFSET_FIXED) { + uint64_t offset = flags & PIN_OFFSET_MASK; + + if (offset & (alignment - 1)) { + ret = -EINVAL; + goto err_free_vma; + } + vma->node.start = offset; + vma->node.size = size; + vma->node.color = obj->cache_level; + ret = drm_mm_reserve_node(&vm->mm, &vma->node); + if (ret) { + ret = i915_gem_evict_for_vma(vma); + if (ret == 0) + ret = drm_mm_reserve_node(&vm->mm, &vma->node); + } + if (ret) + goto err_free_vma; + } else { search_free: - ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, - size, alignment, - obj->cache_level, - start, end, - DRM_MM_SEARCH_DEFAULT, - DRM_MM_CREATE_DEFAULT); - if (ret) { - ret = i915_gem_evict_something(dev, vm, size, alignment, - obj->cache_level, - start, end, - flags); - if (ret == 0) - goto search_free; + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, + size, alignment, + obj->cache_level, + start, end, + DRM_MM_SEARCH_DEFAULT, + DRM_MM_CREATE_DEFAULT); + if (ret) { + ret = i915_gem_evict_something(dev, vm, size, alignment, + obj->cache_level, + start, end, + flags); + if (ret == 0) + goto search_free; - goto err_free_vma; + goto err_free_vma; + } } if (WARN_ON(!i915_gem_valid_gtt_space(vma, obj->cache_level))) { ret = -EINVAL; @@ -3986,6 +4006,10 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t alignment, uint64_t flags) vma->node.start < (flags & PIN_OFFSET_MASK)) return true; + if (flags & PIN_OFFSET_FIXED && + vma->node.start != (flags & PIN_OFFSET_MASK)) + return true; + return false; } diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index d09e35e..bc5b91c 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -199,6 +199,45 @@ found: return ret; } +int +i915_gem_evict_for_vma(struct i915_vma *target) +{ + struct drm_mm_node *node, *next; + + list_for_each_entry_safe(node, next, + &target->vm->mm.head_node.node_list, + node_list) { + struct i915_vma *vma; + int ret; + + if (node->start + node->size <= target->node.start) + continue; + if (node->start >= target->node.start + target->node.size) + break; + + vma = container_of(node, typeof(*vma), node); + + if (vma->pin_count) { + if (!vma->exec_entry || (vma->pin_count > 1)) + /* Object is pinned for some other use */ + return -EBUSY; + + /* We need to evict a buffer in the same batch */ + if (vma->exec_entry->flags & EXEC_OBJECT_PINNED) + /* Overlapping fixed objects in the same batch */ + return -EINVAL; + + return -ENOSPC; + } + + ret = i915_vma_unbind(vma); + if (ret) + return ret; + } + + return 0; +} + /** * i915_gem_evict_vm - Evict all idle vmas from a vm * @vm: Address space to cleanse diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a953d49..49ef155 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -594,6 +594,8 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, flags |= PIN_GLOBAL | PIN_MAPPABLE; if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS) flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS; + if (entry->flags & EXEC_OBJECT_PINNED) + flags |= entry->offset | PIN_OFFSET_FIXED; } ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags); @@ -663,6 +665,10 @@ eb_vma_misplaced(struct i915_vma *vma) vma->node.start & (entry->alignment - 1)) return true; + if (entry->flags & EXEC_OBJECT_PINNED && + vma->node.start != entry->offset) + return true; + if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS && vma->node.start < BATCH_OFFSET_BIAS) return true; @@ -684,6 +690,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, struct i915_vma *vma; struct i915_address_space *vm; struct list_head ordered_vmas; + struct list_head pinned_vmas; bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4; int retry; @@ -692,6 +699,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm; INIT_LIST_HEAD(&ordered_vmas); + INIT_LIST_HEAD(&pinned_vmas); while (!list_empty(vmas)) { struct drm_i915_gem_exec_object2 *entry; bool need_fence, need_mappable; @@ -710,7 +718,9 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(vma); - if (need_mappable) { + if (entry->flags & EXEC_OBJECT_PINNED) + list_move_tail(&vma->exec_list, &pinned_vmas); + else if (need_mappable) { entry->flags |= __EXEC_OBJECT_NEEDS_MAP; list_move(&vma->exec_list, &ordered_vmas); } else @@ -720,6 +730,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring, obj->base.pending_write_domain = 0; } list_splice(&ordered_vmas, vmas); + list_splice(&pinned_vmas, vmas); /* Attempt to pin all of the buffers into the GTT. * This is done in 3 phases: @@ -1398,7 +1409,8 @@ eb_get_batch(struct eb_vmas *eb) * Note that actual hangs have only been observed on gen7, but for * paranoia do it everywhere. */ - vma->exec_entry->flags |= __EXEC_OBJECT_NEEDS_BIAS; + if ((vma->exec_entry->flags & EXEC_OBJECT_PINNED) == 0) + vma->exec_entry->flags |= __EXEC_OBJECT_NEEDS_BIAS; return vma->obj; } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index fd5aa47..0a0e781 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -356,6 +356,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_EU_TOTAL 34 #define I915_PARAM_HAS_GPU_RESET 35 #define I915_PARAM_HAS_RESOURCE_STREAMER 36 +#define I915_PARAM_HAS_EXEC_SOFTPIN 37 typedef struct drm_i915_getparam { __s32 param; @@ -684,13 +685,20 @@ struct drm_i915_gem_exec_object2 { /** * Returned value of the updated offset of the object, for future * presumed_offset writes. + * When the EXEC_OBJECT_PINNED flag is specified this is populated by + * the user with the GTT offset at which this object will be pinned. + * When the I915_EXEC_NO_RELOC flag is specified this must contain the + * presumed_offset of the object. + * During execbuffer2 the kernel populates it with the value of the + * current GTT offset of the object, for future presumed_offset writes. */ __u64 offset; #define EXEC_OBJECT_NEEDS_FENCE (1<<0) #define EXEC_OBJECT_NEEDS_GTT (1<<1) #define EXEC_OBJECT_WRITE (1<<2) -#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1) +#define EXEC_OBJECT_PINNED (1<<3) +#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_PINNED<<1) __u64 flags; __u64 rsvd1;