From patchwork Thu Jun 24 08:42:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12341557 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5132CC49EA5 for ; Thu, 24 Jun 2021 08:43:16 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F0E19613D8 for ; Thu, 24 Jun 2021 08:43:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F0E19613D8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A50DB6EB29; Thu, 24 Jun 2021 08:43:09 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id 15A176EB24; Thu, 24 Jun 2021 08:43:08 +0000 (UTC) IronPort-SDR: eMZPo9GP6VLkMcQIKyy0HBBsItWuzMJqqI27dzfQ14CNujiZ7R5CeRc1B/PzTDdf1Tt/Dpqe55 hJdpE7u/XMcg== X-IronPort-AV: E=McAfee;i="6200,9189,10024"; a="205601338" X-IronPort-AV: E=Sophos;i="5.83,296,1616482800"; d="scan'208";a="205601338" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2021 01:43:07 -0700 IronPort-SDR: mmq+f4tzdaTQXPj4AmXBExJGRbptiSlUKwl5NlFAXKf998M7roe5l5Ewna1KddFf4Rja2n+Fl5 Azo2h8Jtp8hQ== X-IronPort-AV: E=Sophos;i="5.83,296,1616482800"; d="scan'208";a="453344917" Received: from cmutgix-mobl.gar.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.20]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2021 01:43:05 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Thu, 24 Jun 2021 10:42:38 +0200 Message-Id: <20210624084240.270219-2-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210624084240.270219-1-thomas.hellstrom@linux.intel.com> References: <20210624084240.270219-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v10 1/3] drm/i915: Update object placement flags to be mutable X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" The object ops i915_GEM_OBJECT_HAS_IOMEM and the object I915_BO_ALLOC_STRUCT_PAGE flags are considered immutable by much of our code. Introduce a new mem_flags member to hold these and make sure checks for these flags being set are either done under the object lock or with pages properly pinned. The flags will change during migration under the object lock. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- v2: - Unconditionally set VM_IO on our VMAs in line with the rest core gem and TTM. Since the bo might be migrated while the VMA is still alive, there is no sense, whether or not it maps iomem might change. v6: - Introduce a __i915_gem_object_is_lmem() to be used in situations where we know that a fence that can't currently signal keeps the object from being migrated or evicted. - Move a couple of shmem warnings for DGFX to a later patch where we actually move system memory to TTM. v10: - Add some comments around the gem object "mem_flags" field, and use a full unsigned int for it. (Suggested by Daniel Vetter). - Add an object locked section while checking mem_flags in the live mman selftest. --- drivers/gpu/drm/i915/gem/i915_gem_internal.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 22 +++++++++++ drivers/gpu/drm/i915/gem/i915_gem_lmem.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_mman.c | 12 +++--- drivers/gpu/drm/i915/gem/i915_gem_object.c | 38 +++++++++++++++++++ drivers/gpu/drm/i915/gem/i915_gem_object.h | 14 ++----- .../gpu/drm/i915/gem/i915_gem_object_types.h | 27 ++++++++----- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 7 ++-- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 4 +- .../drm/i915/gem/selftests/huge_gem_object.c | 4 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 5 +-- .../drm/i915/gem/selftests/i915_gem_mman.c | 25 ++++++++---- .../drm/i915/gem/selftests/i915_gem_phys.c | 3 +- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- 17 files changed, 123 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c index ce6b664b10aa..13b217f75055 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c @@ -177,8 +177,8 @@ i915_gem_object_create_internal(struct drm_i915_private *i915, return ERR_PTR(-ENOMEM); drm_gem_private_object_init(&i915->drm, &obj->base, size); - i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class, - I915_BO_ALLOC_STRUCT_PAGE); + i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class, 0); + obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE; /* * Mark the object as volatile, such that the pages are marked as diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c index d539dffa1554..41d5182cd367 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c @@ -71,6 +71,28 @@ bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj) mr->type == INTEL_MEMORY_STOLEN_LOCAL); } +/** + * __i915_gem_object_is_lmem - Whether the object is resident in + * lmem while in the fence signaling critical path. + * @obj: The object to check. + * + * This function is intended to be called from within the fence signaling + * path where the fence keeps the object from being migrated. For example + * during gpu reset or similar. + * + * Return: Whether the object is resident in lmem. + */ +bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj) +{ + struct intel_memory_region *mr = READ_ONCE(obj->mm.region); + +#ifdef CONFIG_LOCKDEP + GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, true)); +#endif + return mr && (mr->type == INTEL_MEMORY_LOCAL || + mr->type == INTEL_MEMORY_STOLEN_LOCAL); +} + struct drm_i915_gem_object * i915_gem_object_create_lmem(struct drm_i915_private *i915, resource_size_t size, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h index ea76fd11ccb0..27a611deba47 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h @@ -21,6 +21,8 @@ i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj, bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj); +bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj); + struct drm_i915_gem_object * i915_gem_object_create_lmem(struct drm_i915_private *i915, resource_size_t size, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index 2fd155742bd2..6497a2dbdab9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -684,7 +684,7 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj, if (mmap_type != I915_MMAP_TYPE_GTT && !i915_gem_object_has_struct_page(obj) && - !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)) + !i915_gem_object_has_iomem(obj)) return -ENODEV; mmo = mmap_offset_attach(obj, mmap_type, file); @@ -708,7 +708,12 @@ __assign_mmap_offset_handle(struct drm_file *file, if (!obj) return -ENOENT; + err = i915_gem_object_lock_interruptible(obj, NULL); + if (err) + goto out_put; err = __assign_mmap_offset(obj, mmap_type, offset, file); + i915_gem_object_unlock(obj); +out_put: i915_gem_object_put(obj); return err; } @@ -932,10 +937,7 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma) return PTR_ERR(anon); } - vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; - - if (i915_gem_object_has_iomem(obj)) - vma->vm_flags |= VM_IO; + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; /* * We keep the ref on mmo->obj, not vm_file, but we require diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index cf18c430d51f..07e8ff9a8aae 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -475,6 +475,44 @@ bool i915_gem_object_migratable(struct drm_i915_gem_object *obj) return obj->mm.n_placements > 1; } +/** + * i915_gem_object_has_struct_page - Whether the object is page-backed + * @obj: The object to query. + * + * This function should only be called while the object is locked or pinned, + * otherwise the page backing may change under the caller. + * + * Return: True if page-backed, false otherwise. + */ +bool i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj) +{ +#ifdef CONFIG_LOCKDEP + if (IS_DGFX(to_i915(obj->base.dev)) && + i915_gem_object_evictable((void __force *)obj)) + assert_object_held_shared(obj); +#endif + return obj->mem_flags & I915_BO_FLAG_STRUCT_PAGE; +} + +/** + * i915_gem_object_has_iomem - Whether the object is iomem-backed + * @obj: The object to query. + * + * This function should only be called while the object is locked or pinned, + * otherwise the iomem backing may change under the caller. + * + * Return: True if iomem-backed, false otherwise. + */ +bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj) +{ +#ifdef CONFIG_LOCKDEP + if (IS_DGFX(to_i915(obj->base.dev)) && + i915_gem_object_evictable((void __force *)obj)) + assert_object_held_shared(obj); +#endif + return obj->mem_flags & I915_BO_FLAG_IOMEM; +} + void i915_gem_init__objects(struct drm_i915_private *i915) { INIT_WORK(&i915->mm.free_work, __i915_gem_free_work); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 7bf4dd46d8d2..ea3224a480c4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -148,7 +148,7 @@ i915_gem_object_put(struct drm_i915_gem_object *obj) /* * If more than one potential simultaneous locker, assert held. */ -static inline void assert_object_held_shared(struct drm_i915_gem_object *obj) +static inline void assert_object_held_shared(const struct drm_i915_gem_object *obj) { /* * Note mm list lookup is protected by @@ -266,17 +266,9 @@ i915_gem_object_type_has(const struct drm_i915_gem_object *obj, return obj->ops->flags & flags; } -static inline bool -i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj) -{ - return obj->flags & I915_BO_ALLOC_STRUCT_PAGE; -} +bool i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj); -static inline bool -i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj) -{ - return i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM); -} +bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj); static inline bool i915_gem_object_is_shrinkable(const struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 3a2d9ecf8e03..441f913c87e6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -33,10 +33,9 @@ struct i915_lut_handle { struct drm_i915_gem_object_ops { unsigned int flags; -#define I915_GEM_OBJECT_HAS_IOMEM BIT(1) -#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(2) -#define I915_GEM_OBJECT_IS_PROXY BIT(3) -#define I915_GEM_OBJECT_NO_MMAP BIT(4) +#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(1) +#define I915_GEM_OBJECT_IS_PROXY BIT(2) +#define I915_GEM_OBJECT_NO_MMAP BIT(3) /* Interface between the GEM object and its backing storage. * get_pages() is called once prior to the use of the associated set @@ -201,17 +200,25 @@ struct drm_i915_gem_object { unsigned long flags; #define I915_BO_ALLOC_CONTIGUOUS BIT(0) #define I915_BO_ALLOC_VOLATILE BIT(1) -#define I915_BO_ALLOC_STRUCT_PAGE BIT(2) -#define I915_BO_ALLOC_CPU_CLEAR BIT(3) -#define I915_BO_ALLOC_USER BIT(4) +#define I915_BO_ALLOC_CPU_CLEAR BIT(2) +#define I915_BO_ALLOC_USER BIT(3) #define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \ I915_BO_ALLOC_VOLATILE | \ - I915_BO_ALLOC_STRUCT_PAGE | \ I915_BO_ALLOC_CPU_CLEAR | \ I915_BO_ALLOC_USER) -#define I915_BO_READONLY BIT(5) -#define I915_TILING_QUIRK_BIT 6 /* unknown swizzling; do not release! */ +#define I915_BO_READONLY BIT(4) +#define I915_TILING_QUIRK_BIT 5 /* unknown swizzling; do not release! */ + /** + * @mem_flags - Mutable placement-related flags + * + * These are flags that indicate specifics of the memory region + * the object is currently in. As such they are only stable + * either under the object lock or if the object is pinned. + */ + unsigned int mem_flags; +#define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */ +#define I915_BO_FLAG_IOMEM BIT(1) /* Object backed by IO memory */ /* * Is the object to be mapped as read-only to the GPU * Only honoured if hardware has relevant pte bit diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 086005c1c7ea..f2f850e31b8e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -351,7 +351,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj, int err; if (!i915_gem_object_has_struct_page(obj) && - !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)) + !i915_gem_object_has_iomem(obj)) return ERR_PTR(-ENXIO); assert_object_held(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c index be72ad0634ba..7986612f48fa 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c @@ -76,7 +76,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) intel_gt_chipset_flush(&to_i915(obj->base.dev)->gt); /* We're no longer struct page backed */ - obj->flags &= ~I915_BO_ALLOC_STRUCT_PAGE; + obj->mem_flags &= ~I915_BO_FLAG_STRUCT_PAGE; __i915_gem_object_set_pages(obj, st, sg->length); return 0; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 5d16c4462fda..7aa1c95c7b7d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -444,7 +444,7 @@ shmem_pread(struct drm_i915_gem_object *obj, static void shmem_release(struct drm_i915_gem_object *obj) { - if (obj->flags & I915_BO_ALLOC_STRUCT_PAGE) + if (i915_gem_object_has_struct_page(obj)) i915_gem_object_release_memory_region(obj); fput(obj->base.filp); @@ -513,9 +513,8 @@ static int shmem_object_init(struct intel_memory_region *mem, mapping_set_gfp_mask(mapping, mask); GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); - i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class, - I915_BO_ALLOC_STRUCT_PAGE); - + i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class, 0); + obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE; obj->write_domain = I915_GEM_DOMAIN_CPU; obj->read_domains = I915_GEM_DOMAIN_CPU; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index c5deb8b7227c..b5dd3b7037f4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -732,7 +732,6 @@ static u64 i915_ttm_mmap_offset(struct drm_i915_gem_object *obj) const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .name = "i915_gem_object_ttm", - .flags = I915_GEM_OBJECT_HAS_IOMEM, .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages, @@ -777,6 +776,7 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, i915_gem_object_init_memory_region(obj, mem); i915_gem_object_make_unshrinkable(obj); obj->read_domains = I915_GEM_DOMAIN_WC | I915_GEM_DOMAIN_GTT; + obj->mem_flags |= I915_BO_FLAG_IOMEM; i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->ttm.get_io_page.lock); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 4b0acc7eaa27..56edfeff8c02 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -510,8 +510,8 @@ i915_gem_userptr_ioctl(struct drm_device *dev, return -ENOMEM; drm_gem_private_object_init(dev, &obj->base, args->user_size); - i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class, - I915_BO_ALLOC_STRUCT_PAGE); + i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class, 0); + obj->mem_flags = I915_BO_FLAG_STRUCT_PAGE; obj->read_domains = I915_GEM_DOMAIN_CPU; obj->write_domain = I915_GEM_DOMAIN_CPU; i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC); diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c index 0c8ecfdf5405..f963b8e1e37b 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c @@ -114,8 +114,8 @@ huge_gem_object(struct drm_i915_private *i915, return ERR_PTR(-ENOMEM); drm_gem_private_object_init(&i915->drm, &obj->base, dma_size); - i915_gem_object_init(obj, &huge_ops, &lock_class, - I915_BO_ALLOC_STRUCT_PAGE); + i915_gem_object_init(obj, &huge_ops, &lock_class, 0); + obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE; obj->read_domains = I915_GEM_DOMAIN_CPU; obj->write_domain = I915_GEM_DOMAIN_CPU; diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index dadd485bc52f..ccc67ed1a84b 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -167,9 +167,8 @@ huge_pages_object(struct drm_i915_private *i915, return ERR_PTR(-ENOMEM); drm_gem_private_object_init(&i915->drm, &obj->base, size); - i915_gem_object_init(obj, &huge_page_ops, &lock_class, - I915_BO_ALLOC_STRUCT_PAGE); - + i915_gem_object_init(obj, &huge_page_ops, &lock_class, 0); + obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE; i915_gem_object_set_volatile(obj); obj->write_domain = I915_GEM_DOMAIN_CPU; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 44b5de06ce64..607b7d2d4c29 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -831,16 +831,19 @@ static int wc_check(struct drm_i915_gem_object *obj) static bool can_mmap(struct drm_i915_gem_object *obj, enum i915_mmap_type type) { + bool no_map; + if (type == I915_MMAP_TYPE_GTT && !i915_ggtt_has_aperture(&to_i915(obj->base.dev)->ggtt)) return false; - if (type != I915_MMAP_TYPE_GTT && - !i915_gem_object_has_struct_page(obj) && - !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)) - return false; + i915_gem_object_lock(obj, NULL); + no_map = (type != I915_MMAP_TYPE_GTT && + !i915_gem_object_has_struct_page(obj) && + !i915_gem_object_has_iomem(obj)); + i915_gem_object_unlock(obj); - return true; + return !no_map; } static void object_set_placements(struct drm_i915_gem_object *obj, @@ -988,10 +991,16 @@ static const char *repr_mmap_type(enum i915_mmap_type type) } } -static bool can_access(const struct drm_i915_gem_object *obj) +static bool can_access(struct drm_i915_gem_object *obj) { - return i915_gem_object_has_struct_page(obj) || - i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM); + bool access; + + i915_gem_object_lock(obj, NULL); + access = i915_gem_object_has_struct_page(obj) || + i915_gem_object_has_iomem(obj); + i915_gem_object_unlock(obj); + + return access; } static int __igt_mmap_access(struct drm_i915_private *i915, diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c index 3a6ce87f8b52..d43d8dae0f69 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c @@ -25,13 +25,14 @@ static int mock_phys_object(void *arg) goto out; } + i915_gem_object_lock(obj, NULL); if (!i915_gem_object_has_struct_page(obj)) { + i915_gem_object_unlock(obj); err = -EINVAL; pr_err("shmem has no struct page\n"); goto out_obj; } - i915_gem_object_lock(obj, NULL); err = i915_gem_object_attach_phys(obj, PAGE_SIZE); i915_gem_object_unlock(obj); if (err) { diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index cb182c6d265a..a2c58b54a592 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1039,7 +1039,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, if (ret) break; } - } else if (i915_gem_object_is_lmem(vma->obj)) { + } else if (__i915_gem_object_is_lmem(vma->obj)) { struct intel_memory_region *mem = vma->obj->mm.region; dma_addr_t dma; From patchwork Thu Jun 24 08:42:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12341559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88592C49EA6 for ; Thu, 24 Jun 2021 08:43:17 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0C080613D8 for ; Thu, 24 Jun 2021 08:43:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0C080613D8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CD2C56EB25; Thu, 24 Jun 2021 08:43:11 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id 371066EB2B; Thu, 24 Jun 2021 08:43:10 +0000 (UTC) IronPort-SDR: wl4Rr8UUeuZyvQUQHudvb6Gn2Q4F2N+Z9ByhC0lAn0WYulx5CpQ8VeGAZg+r5ee2e6NFNaX3fm pj3lA+x2rl9g== X-IronPort-AV: E=McAfee;i="6200,9189,10024"; a="205601340" X-IronPort-AV: E=Sophos;i="5.83,296,1616482800"; d="scan'208";a="205601340" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2021 01:43:09 -0700 IronPort-SDR: ZACg4uFEnF6eKV727w3DFpesBPdmh91RnnwB/gppwShkyOhDxJohpLhXGhFgHimXOAL/+TDYV+ IfSu27njsHMA== X-IronPort-AV: E=Sophos;i="5.83,296,1616482800"; d="scan'208";a="453344922" Received: from cmutgix-mobl.gar.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.20]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2021 01:43:07 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Thu, 24 Jun 2021 10:42:39 +0200 Message-Id: <20210624084240.270219-3-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210624084240.270219-1-thomas.hellstrom@linux.intel.com> References: <20210624084240.270219-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v10 2/3] drm/i915/ttm: Adjust gem flags and caching settings after a move X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" After a TTM move or object init we need to update the i915 gem flags and caching settings to reflect the new placement. Currently caching settings are not changed during the lifetime of an object, although that might change moving forward if we run into performance issues or issues with WC system page allocations. Also introduce gpu_binds_iomem() and cpu_maps_iomem() to clean up the various ways we previously used to detect this. Finally, initialize the TTM object reserved to be able to update flags and caching before anyone else gets hold of the object. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- v6: - Rebase on accelerated ttm moves. v8: - Reinstate alignment at ttm_bo_init_reserved() time. (Reported by Matthew Auld). --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 144 ++++++++++++++++++------ 1 file changed, 108 insertions(+), 36 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index b5dd3b7037f4..a1dc592c11bc 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -91,6 +91,26 @@ static int i915_ttm_err_to_gem(int err) return err; } +static bool gpu_binds_iomem(struct ttm_resource *mem) +{ + return mem->mem_type != TTM_PL_SYSTEM; +} + +static bool cpu_maps_iomem(struct ttm_resource *mem) +{ + /* Once / if we support GGTT, this is also false for cached ttm_tts */ + return mem->mem_type != TTM_PL_SYSTEM; +} + +static enum i915_cache_level +i915_ttm_cache_level(struct drm_i915_private *i915, struct ttm_resource *res, + struct ttm_tt *ttm) +{ + return ((HAS_LLC(i915) || HAS_SNOOP(i915)) && !gpu_binds_iomem(res) && + ttm->caching == ttm_cached) ? I915_CACHE_LLC : + I915_CACHE_NONE; +} + static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj); static enum ttm_caching @@ -248,6 +268,35 @@ static void i915_ttm_free_cached_io_st(struct drm_i915_gem_object *obj) obj->ttm.cached_io_st = NULL; } +static void +i915_ttm_adjust_domains_after_move(struct drm_i915_gem_object *obj) +{ + struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + + if (cpu_maps_iomem(bo->resource) || bo->ttm->caching != ttm_cached) { + obj->write_domain = I915_GEM_DOMAIN_WC; + obj->read_domains = I915_GEM_DOMAIN_WC; + } else { + obj->write_domain = I915_GEM_DOMAIN_CPU; + obj->read_domains = I915_GEM_DOMAIN_CPU; + } +} + +static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj) +{ + struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + unsigned int cache_level; + + obj->mem_flags &= ~(I915_BO_FLAG_STRUCT_PAGE | I915_BO_FLAG_IOMEM); + + obj->mem_flags |= cpu_maps_iomem(bo->resource) ? I915_BO_FLAG_IOMEM : + I915_BO_FLAG_STRUCT_PAGE; + + cache_level = i915_ttm_cache_level(to_i915(bo->base.dev), bo->resource, + bo->ttm); + i915_gem_object_set_cache_coherency(obj, cache_level); +} + static void i915_ttm_purge(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); @@ -263,8 +312,10 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj) /* TTM's purge interface. Note that we might be reentering. */ ret = ttm_bo_validate(bo, &place, &ctx); - if (!ret) { + obj->write_domain = 0; + obj->read_domains = 0; + i915_ttm_adjust_gem_after_move(obj); i915_ttm_free_cached_io_st(obj); obj->mm.madv = __I915_MADV_PURGED; } @@ -347,12 +398,15 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj, struct ttm_resource *res) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); - struct ttm_resource_manager *man = - ttm_manager_type(bo->bdev, res->mem_type); - if (man->use_tt) + if (!gpu_binds_iomem(res)) return i915_ttm_tt_get_st(bo->ttm); + /* + * If CPU mapping differs, we need to add the ttm_tt pages to + * the resulting st. Might make sense for GGTT. + */ + GEM_WARN_ON(!cpu_maps_iomem(res)); return intel_region_ttm_resource_to_st(obj->mm.region, res); } @@ -367,23 +421,25 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo, struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); struct sg_table *src_st; struct i915_request *rq; + struct ttm_tt *ttm = bo->ttm; + enum i915_cache_level src_level, dst_level; int ret; if (!i915->gt.migrate.context) return -EINVAL; - if (!bo->ttm || !ttm_tt_is_populated(bo->ttm)) { + dst_level = i915_ttm_cache_level(i915, dst_mem, ttm); + if (!ttm || !ttm_tt_is_populated(ttm)) { if (bo->type == ttm_bo_type_kernel) return -EINVAL; - if (bo->ttm && - !(bo->ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)) + if (ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)) return 0; intel_engine_pm_get(i915->gt.migrate.context->engine); ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL, - dst_st->sgl, I915_CACHE_NONE, - dst_mem->mem_type >= I915_PL_LMEM0, + dst_st->sgl, dst_level, + gpu_binds_iomem(dst_mem), 0, &rq); if (!ret && rq) { @@ -392,15 +448,16 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo, } intel_engine_pm_put(i915->gt.migrate.context->engine); } else { - src_st = src_man->use_tt ? i915_ttm_tt_get_st(bo->ttm) : - obj->ttm.cached_io_st; + src_st = src_man->use_tt ? i915_ttm_tt_get_st(ttm) : + obj->ttm.cached_io_st; + src_level = i915_ttm_cache_level(i915, bo->resource, ttm); intel_engine_pm_get(i915->gt.migrate.context->engine); ret = intel_context_migrate_copy(i915->gt.migrate.context, - NULL, src_st->sgl, I915_CACHE_NONE, - bo->resource->mem_type >= I915_PL_LMEM0, - dst_st->sgl, I915_CACHE_NONE, - dst_mem->mem_type >= I915_PL_LMEM0, + NULL, src_st->sgl, src_level, + gpu_binds_iomem(bo->resource), + dst_st->sgl, dst_level, + gpu_binds_iomem(dst_mem), &rq); if (!ret && rq) { i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT); @@ -420,8 +477,6 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); struct ttm_resource_manager *dst_man = ttm_manager_type(bo->bdev, dst_mem->mem_type); - struct ttm_resource_manager *src_man = - ttm_manager_type(bo->bdev, bo->resource->mem_type); struct intel_memory_region *dst_reg, *src_reg; union { struct ttm_kmap_iter_tt tt; @@ -465,12 +520,12 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, ret = i915_ttm_accel_move(bo, dst_mem, dst_st); if (ret) { /* If we start mapping GGTT, we can no longer use man::use_tt here. */ - dst_iter = dst_man->use_tt ? + dst_iter = !cpu_maps_iomem(dst_mem) ? ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) : ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap, dst_st, dst_reg->region.start); - src_iter = src_man->use_tt ? + src_iter = !cpu_maps_iomem(bo->resource) ? ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) : ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap, obj->ttm.cached_io_st, @@ -478,21 +533,24 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter); } + /* Below dst_mem becomes bo->resource. */ ttm_bo_move_sync_cleanup(bo, dst_mem); + i915_ttm_adjust_domains_after_move(obj); i915_ttm_free_cached_io_st(obj); - if (!dst_man->use_tt) { + if (gpu_binds_iomem(dst_mem) || cpu_maps_iomem(dst_mem)) { obj->ttm.cached_io_st = dst_st; obj->ttm.get_io_page.sg_pos = dst_st->sgl; obj->ttm.get_io_page.sg_idx = 0; } + i915_ttm_adjust_gem_after_move(obj); return 0; } static int i915_ttm_io_mem_reserve(struct ttm_device *bdev, struct ttm_resource *mem) { - if (mem->mem_type < I915_PL_LMEM0) + if (!cpu_maps_iomem(mem)) return 0; mem->bus.caching = ttm_write_combined; @@ -590,6 +648,16 @@ static int i915_ttm_get_pages(struct drm_i915_gem_object *obj) return i915_ttm_err_to_gem(ret); } + i915_ttm_adjust_lru(obj); + if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) { + ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx); + if (ret) + return ret; + + i915_ttm_adjust_domains_after_move(obj); + i915_ttm_adjust_gem_after_move(obj); + } + /* Object either has a page vector or is an iomem object */ st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st; if (IS_ERR(st)) @@ -597,8 +665,6 @@ static int i915_ttm_get_pages(struct drm_i915_gem_object *obj) __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); - i915_ttm_adjust_lru(obj); - return ret; } @@ -768,6 +834,10 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, { static struct lock_class_key lock_class; struct drm_i915_private *i915 = mem->i915; + struct ttm_operation_ctx ctx = { + .interruptible = true, + .no_wait_gpu = false, + }; enum ttm_bo_type bo_type; int ret; @@ -775,14 +845,13 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags); i915_gem_object_init_memory_region(obj, mem); i915_gem_object_make_unshrinkable(obj); - obj->read_domains = I915_GEM_DOMAIN_WC | I915_GEM_DOMAIN_GTT; - obj->mem_flags |= I915_BO_FLAG_IOMEM; - i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->ttm.get_io_page.lock); bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device : ttm_bo_type_kernel; + obj->base.vma_node.driver_private = i915_gem_to_ttm(obj); + /* * If this function fails, it will call the destructor, but * our caller still owns the object. So no freeing in the @@ -790,14 +859,17 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, * Similarly, in delayed_destroy, we can't call ttm_bo_put() * until successful initialization. */ - obj->base.vma_node.driver_private = i915_gem_to_ttm(obj); - ret = ttm_bo_init(&i915->bdev, i915_gem_to_ttm(obj), size, - bo_type, &i915_sys_placement, - mem->min_page_size >> PAGE_SHIFT, - true, NULL, NULL, i915_ttm_bo_destroy); - if (!ret) - obj->ttm.created = true; - - /* i915 wants -ENXIO when out of memory region space. */ - return i915_ttm_err_to_gem(ret); + ret = ttm_bo_init_reserved(&i915->bdev, i915_gem_to_ttm(obj), size, + bo_type, &i915_sys_placement, + mem->min_page_size >> PAGE_SHIFT, + &ctx, NULL, NULL, i915_ttm_bo_destroy); + if (ret) + return i915_ttm_err_to_gem(ret); + + obj->ttm.created = true; + i915_ttm_adjust_domains_after_move(obj); + i915_ttm_adjust_gem_after_move(obj); + i915_gem_object_unlock(obj); + + return 0; } From patchwork Thu Jun 24 08:42:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12341561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1141C49EA6 for ; Thu, 24 Jun 2021 08:43:23 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 70908613DC for ; Thu, 24 Jun 2021 08:43:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 70908613DC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 894276EB3C; Thu, 24 Jun 2021 08:43:16 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id 17CFE6EB28; Thu, 24 Jun 2021 08:43:12 +0000 (UTC) IronPort-SDR: 9UXtcsh4qPEu4vTcicSrN2CwztDqnjAxS3Bg59PF/jRPmVDIQTkwBhSaRNu6zyUBoJ0S8eWZeL R8YzfrjUhsQw== X-IronPort-AV: E=McAfee;i="6200,9189,10024"; a="205601347" X-IronPort-AV: E=Sophos;i="5.83,296,1616482800"; d="scan'208";a="205601347" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2021 01:43:11 -0700 IronPort-SDR: iHttE/23LaTdzLjzYIP0jx8bIhoLESXknTpVYalBQNEHi9eTzX8sMUUpSoEcuuXBhQtz128Efj xTTtQ1aaCJAQ== X-IronPort-AV: E=Sophos;i="5.83,296,1616482800"; d="scan'208";a="453344929" Received: from cmutgix-mobl.gar.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.20]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2021 01:43:09 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Thu, 24 Jun 2021 10:42:40 +0200 Message-Id: <20210624084240.270219-4-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210624084240.270219-1-thomas.hellstrom@linux.intel.com> References: <20210624084240.270219-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v10 3/3] drm/i915/ttm: Use TTM for system memory X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" For discrete, use TTM for both cached and WC system memory. That means we currently rely on the TTM memory accounting / shrinker. For cached system memory we should consider remaining shmem-backed, which can be implemented from our ttm_tt_populate callback. We can then also reuse our own very elaborate shrinker for that memory. If an object is evicted to a gem allowable region, we will now consider the object migrated, and we flip the gem region and move the object to a different region list. Since we are now changing gem regions, we can't any longer rely on the CONTIGUOUS flag being set based on the region min page size, so remove that flag update. If we want to reintroduce it, we need to put it in the mutable flags. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- v2: - Fix IS_ERR_OR_NULL() check to IS_ERR() (Reported by Matthew Auld) v3: - Commit message typo fix v6: - Fix TODO:s for supporting system memory with TTM. - Update the object GEM region after a TTM move if compatible. - Add a couple of warnings for shmem on DGFX. v8: - When changing regions, also move the object to the new region list (Reported by Matthew Auld). v9: - Remove a DGFX warning in __i915_gem_object_release_shmem since it is called from the userptr code. --- drivers/gpu/drm/i915/gem/i915_gem_region.c | 4 -- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 2 + drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 52 +++++++++++++++++----- drivers/gpu/drm/i915/i915_drv.h | 3 -- drivers/gpu/drm/i915/intel_memory_region.c | 7 ++- drivers/gpu/drm/i915/intel_memory_region.h | 8 ++++ 6 files changed, 58 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c index d1f1840540dd..4925563018b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_region.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c @@ -13,11 +13,7 @@ void i915_gem_object_init_memory_region(struct drm_i915_gem_object *obj, { obj->mm.region = intel_memory_region_get(mem); - if (obj->base.size <= mem->min_page_size) - obj->flags |= I915_BO_ALLOC_CONTIGUOUS; - mutex_lock(&mem->objects.lock); - list_add(&obj->mm.region_link, &mem->objects.list); mutex_unlock(&mem->objects.lock); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 7aa1c95c7b7d..e9ac913b923a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -302,6 +302,7 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_ struct pagevec pvec; struct page *page; + GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev))); __i915_gem_object_release_shmem(obj, pages, true); i915_gem_gtt_finish_pages(obj, pages); @@ -560,6 +561,7 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv, resource_size_t offset; int err; + GEM_WARN_ON(IS_DGFX(dev_priv)); obj = i915_gem_object_create_shmem(dev_priv, round_up(size, PAGE_SIZE)); if (IS_ERR(obj)) return obj; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index a1dc592c11bc..3641a6c7b3d9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -286,6 +286,26 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); unsigned int cache_level; + unsigned int i; + + /* + * If object was moved to an allowable region, update the object + * region to consider it migrated. Note that if it's currently not + * in an allowable region, it's evicted and we don't update the + * object region. + */ + if (intel_region_to_ttm_type(obj->mm.region) != bo->resource->mem_type) { + for (i = 0; i < obj->mm.n_placements; ++i) { + struct intel_memory_region *mr = obj->mm.placements[i]; + + if (intel_region_to_ttm_type(mr) == bo->resource->mem_type && + mr != obj->mm.region) { + i915_gem_object_release_memory_region(obj); + i915_gem_object_init_memory_region(obj, mr); + break; + } + } + } obj->mem_flags &= ~(I915_BO_FLAG_STRUCT_PAGE | I915_BO_FLAG_IOMEM); @@ -615,13 +635,6 @@ static int i915_ttm_get_pages(struct drm_i915_gem_object *obj) /* Move to the requested placement. */ i915_ttm_placement_from_obj(obj, &requested, busy, &placement); - /* - * For now we support LMEM only with TTM. - * TODO: Remove with system support - */ - GEM_BUG_ON(requested.mem_type < I915_PL_LMEM0 || - busy[0].mem_type < I915_PL_LMEM0); - /* First try only the requested placement. No eviction. */ real_num_busy = fetch_and_zero(&placement.num_busy_placement); ret = ttm_bo_validate(bo, &placement, &ctx); @@ -635,9 +648,6 @@ static int i915_ttm_get_pages(struct drm_i915_gem_object *obj) ret == -EAGAIN) return ret; - /* TODO: Remove this when we support system as TTM. */ - real_num_busy = 1; - /* * If the initial attempt fails, allow all accepted placements, * evicting if necessary. @@ -873,3 +883,25 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, return 0; } + +static const struct intel_memory_region_ops ttm_system_region_ops = { + .init_object = __i915_gem_ttm_object_init, +}; + +struct intel_memory_region * +i915_gem_ttm_system_setup(struct drm_i915_private *i915, + u16 type, u16 instance) +{ + struct intel_memory_region *mr; + + mr = intel_memory_region_create(i915, 0, + totalram_pages() << PAGE_SHIFT, + PAGE_SIZE, 0, + type, instance, + &ttm_system_region_ops); + if (IS_ERR(mr)) + return mr; + + intel_memory_region_set_name(mr, "system-ttm"); + return mr; +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 01e11fe38642..bfbfbae57573 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1751,9 +1751,6 @@ void i915_gem_cleanup_userptr(struct drm_i915_private *dev_priv); void i915_gem_init_early(struct drm_i915_private *dev_priv); void i915_gem_cleanup_early(struct drm_i915_private *dev_priv); -struct intel_memory_region *i915_gem_shmem_setup(struct drm_i915_private *i915, - u16 type, u16 instance); - static inline void i915_gem_drain_freed_objects(struct drm_i915_private *i915) { /* diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c index df59f884d37c..779eb2fa90b6 100644 --- a/drivers/gpu/drm/i915/intel_memory_region.c +++ b/drivers/gpu/drm/i915/intel_memory_region.c @@ -173,7 +173,12 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915) instance = intel_region_map[i].instance; switch (type) { case INTEL_MEMORY_SYSTEM: - mem = i915_gem_shmem_setup(i915, type, instance); + if (IS_DGFX(i915)) + mem = i915_gem_ttm_system_setup(i915, type, + instance); + else + mem = i915_gem_shmem_setup(i915, type, + instance); break; case INTEL_MEMORY_STOLEN_LOCAL: mem = i915_gem_stolen_lmem_setup(i915, type, instance); diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h index 2be8433d373a..b1b9e461d53b 100644 --- a/drivers/gpu/drm/i915/intel_memory_region.h +++ b/drivers/gpu/drm/i915/intel_memory_region.h @@ -125,4 +125,12 @@ intel_memory_region_set_name(struct intel_memory_region *mem, int intel_memory_region_reserve(struct intel_memory_region *mem, resource_size_t offset, resource_size_t size); + +struct intel_memory_region * +i915_gem_ttm_system_setup(struct drm_i915_private *i915, + u16 type, u16 instance); +struct intel_memory_region * +i915_gem_shmem_setup(struct drm_i915_private *i915, + u16 type, u16 instance); + #endif