From patchwork Tue Aug 22 16:21:33 2023
X-Patchwork-Id: 13361173
From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH v2 1/4] drm/xe/vm: Use onion unwind for xe_vma_userptr_pin_pages()
Date: Tue, 22 Aug 2023 18:21:33 +0200
Message-ID: <20230822162136.25895-2-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>
References: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>

Use onion error unwind since that makes the function easier to read
and extend. No functional change.
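For readers unfamiliar with the idiom, here is a minimal, self-contained
sketch of onion error unwind (hypothetical resource handling, not code
from xe_vm.c): each acquisition gets a matching unwind label, failures
jump to the label that releases exactly what is held so far, and the
labels run in reverse order of acquisition:

#include <stdio.h>
#include <stdlib.h>

/* Onion unwind sketch: acquire a buffer, then a file; errors release
 * only what has been acquired, in reverse order of acquisition. */
static int process_file(const char *path)
{
	char *buf;
	FILE *f;
	int ret;

	buf = malloc(4096);
	if (!buf)
		return -1;		/* nothing to unwind yet */

	f = fopen(path, "r");
	if (!f) {
		ret = -1;
		goto out_free_buf;	/* unwind: the allocation only */
	}

	if (fread(buf, 1, 4096, f) == 0 && ferror(f)) {
		ret = -1;
		goto out_close_file;	/* unwind: file, then allocation */
	}

	/* ... use buf ... */
	fclose(f);
	free(buf);
	return 0;			/* success returns before the labels */

out_close_file:
	fclose(f);
out_free_buf:
	free(buf);
	return ret;
}

int main(void)
{
	return process_file("/etc/hostname") ? 1 : 0;
}

The patch below applies the same structure: the success path returns 0
before the labels, and out_free_sg releases the sg table before falling
through to out_release_pages, which releases the page array.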
Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 2e99f865d7ec..8bf7f62e6548 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -116,19 +116,17 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 		kthread_unuse_mm(vma->userptr.notifier.mm);
 		mmput(vma->userptr.notifier.mm);
 	}
-mm_closed:
 	if (ret)
-		goto out;
+		goto out_release_pages;
 
 	ret = sg_alloc_table_from_pages_segment(&vma->userptr.sgt, pages,
 						pinned, 0,
 						(u64)pinned << PAGE_SHIFT,
 						xe_sg_segment_size(xe->drm.dev),
 						GFP_KERNEL);
-	if (ret) {
-		vma->userptr.sg = NULL;
-		goto out;
-	}
+	if (ret)
+		goto out_release_pages;
+
 	vma->userptr.sg = &vma->userptr.sgt;
 	ret = dma_map_sgtable(xe->drm.dev, vma->userptr.sg,
@@ -136,11 +134,8 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 			      DMA_BIDIRECTIONAL,
 			      DMA_ATTR_SKIP_CPU_SYNC |
 			      DMA_ATTR_NO_KERNEL_MAPPING);
-	if (ret) {
-		sg_free_table(vma->userptr.sg);
-		vma->userptr.sg = NULL;
-		goto out;
-	}
+	if (ret)
+		goto out_free_sg;
 
 	for (i = 0; i < pinned; ++i) {
 		if (!read_only) {
@@ -152,17 +147,23 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 		mark_page_accessed(pages[i]);
 	}
 
-out:
 	release_pages(pages, pinned);
 	kvfree(pages);
 
-	if (!(ret < 0)) {
-		vma->userptr.notifier_seq = notifier_seq;
-		if (xe_vma_userptr_check_repin(vma) == -EAGAIN)
-			goto retry;
-	}
+	vma->userptr.notifier_seq = notifier_seq;
+	if (xe_vma_userptr_check_repin(vma) == -EAGAIN)
+		goto retry;
+
+	return 0;
 
-	return ret < 0 ? ret : 0;
+out_free_sg:
+	sg_free_table(vma->userptr.sg);
+	vma->userptr.sg = NULL;
+out_release_pages:
+	release_pages(pages, pinned);
+mm_closed:
+	kvfree(pages);
+	return ret;
 }
 
 static bool preempt_fences_waiting(struct xe_vm *vm)

From patchwork Tue Aug 22 16:21:34 2023
X-Patchwork-Id: 13361174
From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH v2 2/4] drm/xe/vm: Implement userptr page pinning
Date: Tue, 22 Aug 2023 18:21:34 +0200
Message-ID: <20230822162136.25895-3-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>
References: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>

Implement pinning of userptrs between VM_BIND and VM_UNBIND, which will
facilitate avoiding long hangs on non-preemptible workloads. But don't
hook it up to userspace just yet.

v2:
- Avoid marking userptr VMAs as invalid in the mmu invalidation
  notifier. (Matthew Brost)
- Add a WARN that we don't try to repin userptr pages. (Matthew Brost)
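The core of this change is switching, for VMAs with XE_VMA_PINNED set,
from transient page references (get_user_pages_fast(), dropped with
release_pages()) to long-term pins (pin_user_pages_fast(), dropped with
unpin_user_pages()). The following condensed sketch of the two call
paths assumes the standard linux/mm.h prototypes; the helper name and
its exact error handling are illustrative, not the driver function:

#include <linux/mm.h>

/*
 * Illustrative sketch, not the driver code: pinned VMAs take a
 * FOLL_PIN-style reference that keeps the physical pages in place,
 * while unpinned userptrs take a transient reference and rely on the
 * MMU notifier to trigger revalidation.
 */
static int userptr_acquire_pages(unsigned long uaddr,
				 unsigned long num_pages, bool write,
				 bool pin, struct page **pages)
{
	unsigned int flags = write ? FOLL_WRITE : 0;
	unsigned long done = 0;
	int ret;

	while (done < num_pages) {
		unsigned long addr = uaddr + done * PAGE_SIZE;

		if (pin)
			/* Dropped later with unpin_user_pages(). */
			ret = pin_user_pages_fast(addr, num_pages - done,
						  flags, &pages[done]);
		else
			/* Dropped later with release_pages(). */
			ret = get_user_pages_fast(addr, num_pages - done,
						  flags, &pages[done]);
		if (ret <= 0)
			return ret < 0 ? ret : -EFAULT;

		done += ret;
	}

	return 0;
}

Because a pinned page cannot move under the GPU, the MMU notifier
(vma_userptr_invalidate() below) can return early for pinned VMAs
instead of sequencing an invalidation and rebind.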
Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c       | 80 +++++++++++++++++++++++---------
 drivers/gpu/drm/xe/xe_vm.h       |  9 ++++
 drivers/gpu/drm/xe/xe_vm_types.h | 12 +++++
 3 files changed, 79 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 8bf7f62e6548..037ac42f74a5 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -74,10 +74,6 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	if (notifier_seq == vma->userptr.notifier_seq)
 		return 0;
 
-	pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
-	if (!pages)
-		return -ENOMEM;
-
 	if (vma->userptr.sg) {
 		dma_unmap_sgtable(xe->drm.dev,
 				  vma->userptr.sg,
@@ -87,6 +83,18 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 		vma->userptr.sg = NULL;
 	}
 
+	/* TODO: Convert to xe_assert() */
+	if (XE_WARN_ON(vma->userptr.pinned_pages)) {
+		unpin_user_pages_dirty_lock(vma->userptr.pinned_pages,
+					    vma->userptr.num_pinned,
+					    !read_only);
+		pages = vma->userptr.pinned_pages;
+	} else {
+		pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
+		if (!pages)
+			return -ENOMEM;
+	}
+
 	pinned = ret = 0;
 	if (in_kthread) {
 		if (!mmget_not_zero(vma->userptr.notifier.mm)) {
@@ -97,11 +105,18 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	}
 
 	while (pinned < num_pages) {
-		ret = get_user_pages_fast(xe_vma_userptr(vma) +
-					  pinned * PAGE_SIZE,
-					  num_pages - pinned,
-					  read_only ? 0 : FOLL_WRITE,
-					  &pages[pinned]);
+		if (xe_vma_is_pinned(vma))
+			ret = pin_user_pages_fast(xe_vma_userptr(vma) +
+						  pinned * PAGE_SIZE,
+						  num_pages - pinned,
+						  read_only ? 0 : FOLL_WRITE,
+						  &pages[pinned]);
+		else
+			ret = get_user_pages_fast(xe_vma_userptr(vma) +
+						  pinned * PAGE_SIZE,
+						  num_pages - pinned,
+						  read_only ? 0 : FOLL_WRITE,
+						  &pages[pinned]);
 		if (ret < 0) {
 			if (in_kthread)
 				ret = 0;
@@ -137,19 +152,24 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	if (ret)
 		goto out_free_sg;
 
-	for (i = 0; i < pinned; ++i) {
-		if (!read_only) {
-			lock_page(pages[i]);
-			set_page_dirty(pages[i]);
-			unlock_page(pages[i]);
+	if (!xe_vma_is_pinned(vma)) {
+		for (i = 0; i < pinned; ++i) {
+			if (!read_only) {
+				lock_page(pages[i]);
+				set_page_dirty(pages[i]);
+				unlock_page(pages[i]);
+			}
+
+			mark_page_accessed(pages[i]);
 		}
 
-		mark_page_accessed(pages[i]);
+		release_pages(pages, pinned);
+		kvfree(pages);
+	} else {
+		vma->userptr.pinned_pages = pages;
+		vma->userptr.num_pinned = pinned;
 	}
 
-	release_pages(pages, pinned);
-	kvfree(pages);
-
 	vma->userptr.notifier_seq = notifier_seq;
 	if (xe_vma_userptr_check_repin(vma) == -EAGAIN)
 		goto retry;
@@ -160,9 +180,14 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	sg_free_table(vma->userptr.sg);
 	vma->userptr.sg = NULL;
 out_release_pages:
-	release_pages(pages, pinned);
+	if (!xe_vma_is_pinned(vma))
+		release_pages(pages, pinned);
+	else
+		unpin_user_pages(pages, pinned);
+	vma->userptr.num_pinned = 0;
 mm_closed:
 	kvfree(pages);
+	vma->userptr.pinned_pages = NULL;
 	return ret;
 }
 
@@ -718,6 +743,11 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
 		return false;
 
 	down_write(&vm->userptr.notifier_lock);
+	if (xe_vma_is_pinned(vma)) {
+		up_write(&vm->userptr.notifier_lock);
+		return true;
+	}
+
 	mmu_interval_set_seq(mni, cur_seq);
 
 	/* No need to stop gpu access if the userptr is not yet bound. */
@@ -976,10 +1006,16 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		vma->userptr.sg = NULL;
 	}
 
+	if (vma->userptr.pinned_pages) {
+		unpin_user_pages_dirty_lock(vma->userptr.pinned_pages,
+					    vma->userptr.num_pinned,
+					    !read_only);
+		kvfree(vma->userptr.pinned_pages);
+	}
+
 	/*
-	 * Since userptr pages are not pinned, we can't remove
-	 * the notifer until we're sure the GPU is not accessing
-	 * them anymore
+	 * We can't remove the notifier until we're sure the GPU is
+	 * not accessing the pages anymore
 	 */
 	mmu_interval_notifier_remove(&vma->userptr.notifier);
 	xe_vm_put(vm);
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 6de6e3edb24a..913544d7d995 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -139,6 +139,15 @@ static inline bool xe_vma_is_userptr(struct xe_vma *vma)
 	return xe_vma_has_no_bo(vma) && !xe_vma_is_null(vma);
 }
 
+/**
+ * xe_vma_is_pinned() - User has requested the backing store of this vma
+ * to be pinned.
+ */
+static inline bool xe_vma_is_pinned(struct xe_vma *vma)
+{
+	return xe_vma_is_userptr(vma) && (vma->gpuva.flags & XE_VMA_PINNED);
+}
+
 #define xe_vm_assert_held(vm) dma_resv_assert_held(&(vm)->resv)
 
 u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile);
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 3681a5ff588b..9b90e649cd69 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -33,6 +33,8 @@ struct xe_vm;
 #define XE_VMA_PTE_4K		(DRM_GPUVA_USERBITS << 5)
 #define XE_VMA_PTE_2M		(DRM_GPUVA_USERBITS << 6)
 #define XE_VMA_PTE_1G		(DRM_GPUVA_USERBITS << 7)
+/* User requested backing store to be pinned */
+#define XE_VMA_PINNED		(DRM_GPUVA_USERBITS << 8)
 
 /** struct xe_userptr - User pointer */
 struct xe_userptr {
@@ -54,6 +56,16 @@ struct xe_userptr {
 	 * read: vm->userptr.notifier_lock in write mode or vm->resv held.
	 */
 	bool initial_bind;
+	/**
+	 * @pinned_pages: List of pinned pages if xe_vma_is_pinned(),
+	 * NULL otherwise. Protected by the vm lock.
+	 */
+	struct page **pinned_pages;
+	/**
+	 * @num_pinned: Number of pointers to pinned pages in @pinned_pages.
+	 * Protected by the vm lock.
+	 */
+	unsigned long num_pinned;
 #if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
 	u32 divisor;
 #endif

From patchwork Tue Aug 22 16:21:35 2023
X-Patchwork-Id: 13361176
From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH v2 3/4] drm/xe/vm: Perform accounting of userptr pinned pages
Date: Tue, 22 Aug 2023 18:21:35 +0200
Message-ID: <20230822162136.25895-4-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>
References: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>

Account these pages against RLIMIT_MEMLOCK, following how RDMA does
this, with CAP_IPC_LOCK bypassing the limit.

v2:
- Change the naming of the accounting functions and WARN if we try to
  account anything but userptr pages. (Matthew Brost)
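The accounting model mirrors RDMA's: pinned pages are charged to
mm->pinned_vm and checked against RLIMIT_MEMLOCK unless the caller
holds CAP_IPC_LOCK. To see the budget this check enforces for a given
shell, a small userspace probe (illustrative, not part of this series)
suffices:

#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>

/* Print the RLIMIT_MEMLOCK budget that pinned userptr pages are
 * charged against (bypassed when the process holds CAP_IPC_LOCK). */
int main(void)
{
	struct rlimit rl;
	long page_size = sysconf(_SC_PAGESIZE);

	if (getrlimit(RLIMIT_MEMLOCK, &rl)) {
		perror("getrlimit");
		return 1;
	}

	if (rl.rlim_cur == RLIM_INFINITY)
		printf("RLIMIT_MEMLOCK: unlimited\n");
	else
		printf("RLIMIT_MEMLOCK: %llu bytes (%llu pages)\n",
		       (unsigned long long)rl.rlim_cur,
		       (unsigned long long)rl.rlim_cur / page_size);
	return 0;
}

With the patch applied, a bind that would exceed this budget fails the
reserve step with -ENOMEM before any pages are pinned.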
Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c | 52 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 037ac42f74a5..a645cfa131ca 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -34,6 +34,41 @@
 
 #define TEST_VM_ASYNC_OPS_ERROR
 
+/*
+ * Perform userptr PIN accounting against RLIMIT_MEMLOCK for now, similarly
+ * to how RDMA does this.
+ */
+static int
+xe_vma_userptr_mlock_reserve(struct xe_vma *vma, unsigned long num_pages)
+{
+	unsigned long lock_limit, new_pinned;
+	struct mm_struct *mm = vma->userptr.notifier.mm;
+
+	/* TODO: Convert to xe_assert() */
+	XE_WARN_ON(!xe_vma_is_userptr(vma));
+
+	if (!can_do_mlock())
+		return -EPERM;
+
+	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	new_pinned = atomic64_add_return(num_pages, &mm->pinned_vm);
+	if (new_pinned > lock_limit && !capable(CAP_IPC_LOCK)) {
+		atomic64_sub(num_pages, &mm->pinned_vm);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void
+xe_vma_userptr_mlock_release(struct xe_vma *vma, unsigned long num_pages)
+{
+	/* TODO: Convert to xe_assert() */
+	XE_WARN_ON(!xe_vma_is_userptr(vma));
+
+	atomic64_sub(num_pages, &vma->userptr.notifier.mm->pinned_vm);
+}
+
 /**
  * xe_vma_userptr_check_repin() - Advisory check for repin needed
  * @vma: The userptr vma
@@ -90,9 +125,17 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 					    !read_only);
 		pages = vma->userptr.pinned_pages;
 	} else {
+		if (xe_vma_is_pinned(vma)) {
+			ret = xe_vma_userptr_mlock_reserve(vma, num_pages);
+			if (ret)
+				return ret;
+		}
+
 		pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
-		if (!pages)
-			return -ENOMEM;
+		if (!pages) {
+			ret = -ENOMEM;
+			goto out_account;
+		}
 	}
 
 	pinned = ret = 0;
@@ -188,6 +231,9 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 mm_closed:
 	kvfree(pages);
 	vma->userptr.pinned_pages = NULL;
+out_account:
+	if (xe_vma_is_pinned(vma))
+		xe_vma_userptr_mlock_release(vma, num_pages);
 	return ret;
 }
 
@@ -1010,6 +1056,8 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		unpin_user_pages_dirty_lock(vma->userptr.pinned_pages,
 					    vma->userptr.num_pinned,
 					    !read_only);
+		xe_vma_userptr_mlock_release(vma, xe_vma_size(vma) >>
+					     PAGE_SHIFT);
 		kvfree(vma->userptr.pinned_pages);
 	}
 

From patchwork Tue Aug 22 16:21:36 2023
X-Patchwork-Id: 13361175
From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH v2 4/4] drm/xe/uapi: Support pinning of userptr vmas
Date: Tue, 22 Aug 2023 18:21:36 +0200
Message-ID: <20230822162136.25895-5-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>
References: <20230822162136.25895-1-thomas.hellstrom@linux.intel.com>

Support pinning of vmas using XE_VM_BIND_FLAG_PIN, initially for
userptr only. Pinned memory is accounted against RLIMIT_MEMLOCK, and
processes with CAP_IPC_LOCK bypass the limit. This is pretty similar to
mlock()'ing userptr memory, with the added benefit that the driver is
aware and can ignore some actions in the MMU invalidation notifier.

This will initially become useful for compute VMs on hardware without
mid-thread-preemption capability, since with pinned pages the MMU
invalidation notifier never tries to preempt a running compute kernel.
If that were the only usage, we could restrict this to a flag that
always pins userptr VMAs on compute VMs on such hardware, but there are
indications that it may become needed in other situations as well.

From a more general point of view, the usage pattern of a system may be
such that in most cases it only ever runs a single workload per system,
and then the sysadmin would want to configure the system to allow
extensive pinning for performance reasons. Hence we might want to
extend the pinning capability to bo-backed VMAs as well. How that
pinning will be accounted remains an open question, but building on the
current DRM cgroup work would be an option.
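For userspace, requesting a pinned binding is a matter of OR'ing the new
flag into the op field of a userptr bind operation. A hypothetical
sketch follows; XE_VM_BIND_FLAG_PIN's value is taken from the patch
below, while the OP value and everything around it are illustrative
placeholders for the pre-merge xe uapi:

#include <stdint.h>

#define XE_VM_BIND_OP_MAP_USERPTR	0x2		/* illustrative value */
#define XE_VM_BIND_FLAG_PIN		(0x1 << 20)	/* from the patch below */

/*
 * Compose the 32-bit op word: operation in the low 16 bits, flags in
 * the upper bits. In this series pinning is only accepted for
 * OP_MAP_USERPTR; other ops fail with -EINVAL, and the pin itself may
 * fail with -EPERM (pinning not permitted) or -ENOMEM (over the
 * RLIMIT_MEMLOCK budget, unless CAP_IPC_LOCK is held).
 */
static uint32_t pinned_userptr_bind_op(void)
{
	return XE_VM_BIND_OP_MAP_USERPTR | XE_VM_BIND_FLAG_PIN;
}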
Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c       | 33 +++++++++++++++++++++++-------
 drivers/gpu/drm/xe/xe_vm_types.h |  2 ++
 include/uapi/drm/xe_drm.h        | 18 +++++++++++++++++
 3 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index a645cfa131ca..fdfe5a411386 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -950,6 +950,7 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 				    u64 start, u64 end,
 				    bool read_only,
 				    bool is_null,
+				    bool pin,
 				    u8 tile_mask)
 {
 	struct xe_vma *vma;
@@ -981,6 +982,8 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 		vma->gpuva.flags |= XE_VMA_READ_ONLY;
 	if (is_null)
 		vma->gpuva.flags |= DRM_GPUVA_SPARSE;
+	if (pin)
+		vma->gpuva.flags |= XE_VMA_PINNED;
 
 	if (tile_mask) {
 		vma->tile_mask = tile_mask;
@@ -2382,6 +2385,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			op->map.read_only =
 				operation & XE_VM_BIND_FLAG_READONLY;
 			op->map.is_null = operation & XE_VM_BIND_FLAG_NULL;
+			op->map.pin = operation & XE_VM_BIND_FLAG_PIN;
 		}
 		break;
 	case XE_VM_BIND_OP_UNMAP:
@@ -2446,7 +2450,8 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 }
 
 static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
-			      u8 tile_mask, bool read_only, bool is_null)
+			      u8 tile_mask, bool read_only, bool is_null,
+			      bool pin)
 {
 	struct xe_bo *bo = op->gem.obj ? gem_to_xe_bo(op->gem.obj) : NULL;
 	struct xe_vma *vma;
@@ -2462,7 +2467,7 @@ static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
 	}
 	vma = xe_vma_create(vm, bo, op->gem.offset,
 			    op->va.addr, op->va.addr +
-			    op->va.range - 1, read_only, is_null,
+			    op->va.range - 1, read_only, is_null, pin,
 			    tile_mask);
 	if (bo)
 		xe_bo_unlock(bo, &ww);
@@ -2577,7 +2582,7 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 			vma = new_vma(vm, &op->base.map,
 				      op->tile_mask, op->map.read_only,
-				      op->map.is_null);
+				      op->map.is_null, op->map.pin);
 			if (IS_ERR(vma)) {
 				err = PTR_ERR(vma);
 				goto free_fence;
@@ -2602,10 +2607,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 				bool is_null =
 					op->base.remap.unmap->va->flags &
 					DRM_GPUVA_SPARSE;
+				bool pin =
+					op->base.remap.unmap->va->flags &
+					XE_VMA_PINNED;
 
 				vma = new_vma(vm, op->base.remap.prev,
 					      op->tile_mask, read_only,
-					      is_null);
+					      is_null, pin);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -2638,10 +2646,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 				bool is_null =
 					op->base.remap.unmap->va->flags &
 					DRM_GPUVA_SPARSE;
+				bool pin =
+					op->base.remap.unmap->va->flags &
+					XE_VMA_PINNED;
 
 				vma = new_vma(vm, op->base.remap.next,
 					      op->tile_mask, read_only,
-					      is_null);
+					      is_null, pin);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -3146,11 +3157,12 @@ static void vm_bind_ioctl_ops_unwind(struct xe_vm *vm,
 #define SUPPORTED_FLAGS	\
 	(FORCE_ASYNC_OP_ERROR | XE_VM_BIND_FLAG_ASYNC | \
 	 XE_VM_BIND_FLAG_READONLY | XE_VM_BIND_FLAG_IMMEDIATE | \
-	 XE_VM_BIND_FLAG_NULL | 0xffff)
+	 XE_VM_BIND_FLAG_NULL | XE_VM_BIND_FLAG_PIN | 0xffff)
 #else
 #define SUPPORTED_FLAGS	\
 	(XE_VM_BIND_FLAG_ASYNC | XE_VM_BIND_FLAG_READONLY | \
-	 XE_VM_BIND_FLAG_IMMEDIATE | XE_VM_BIND_FLAG_NULL | 0xffff)
+	 XE_VM_BIND_FLAG_IMMEDIATE | XE_VM_BIND_FLAG_NULL | \
+	 XE_VM_BIND_FLAG_PIN | 0xffff)
 #endif
 #define XE_64K_PAGE_MASK 0xffffull
 
@@ -3220,6 +3232,13 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 			goto free_bind_ops;
 		}
 
+		/* TODO: Support OP_PREFETCH, OP_MAP */
+		if (XE_IOCTL_DBG(xe, (op & XE_VM_BIND_FLAG_PIN) &&
+				 VM_BIND_OP(op) != XE_VM_BIND_OP_MAP_USERPTR)) {
+			err = -EINVAL;
+			goto free_bind_ops;
+		}
+
 		if (XE_IOCTL_DBG(xe, VM_BIND_OP(op) >
 				 XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_DBG(xe, op & ~SUPPORTED_FLAGS) ||
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 9b90e649cd69..024ccabadd12 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -360,6 +360,8 @@ struct xe_vma_op_map {
 	bool read_only;
 	/** @is_null: is NULL binding */
 	bool is_null;
+	/** @pin: pin underlying memory */
+	bool pin;
 };
 
 /** struct xe_vma_op_remap - VMA remap operation */
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 86f16d50e9cc..fc3d9cd4f8d0 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -631,6 +631,24 @@ struct drm_xe_vm_bind_op {
 	 * intended to implement VK sparse bindings.
	 */
 #define XE_VM_BIND_FLAG_NULL		(0x1 << 19)
+	/*
+	 * When the PIN flag is set, the user requests the underlying
+	 * backing store of the vma to be pinned, that is, it will be
+	 * resident while bound and the underlying physical memory
+	 * will not change. For userptr VMAs this means that if the
+	 * user performs an operation that changes the underlying
+	 * pages of the CPU virtual space, the corresponding pinned
+	 * GPU virtual space will not pick up the new memory unless
+	 * an OP_UNMAP followed by an OP_MAP_USERPTR is performed.
+	 * Pinned userptr memory is accounted in the same way as
+	 * mlock(2), and if pinning fails the following error codes
+	 * may be returned:
+	 * -EINVAL: The memory region does not support pinning.
+	 * -EPERM: The process is not permitted to pin.
+	 * -ENOMEM: The pinning limit does not allow pinning.
+	 * For userptr memory, CAP_IPC_LOCK will bypass the limit checking.
+	 */
+#define XE_VM_BIND_FLAG_PIN		(0x1 << 20)
 
 	/** @op: Operation to perform (lower 16 bits) and flags (upper 16 bits) */
 	__u32 op;