From patchwork Thu Dec 21 04:37:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501068 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CBA57C46CD8 for ; Thu, 21 Dec 2023 04:28:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C198310E656; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id B385110E646; Thu, 21 Dec 2023 04:28:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132901; x=1734668901; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=IgTg5rpRtePLupWg3aD5I7Gh2zRaE7GKcIVij7uNhNM=; b=iJJRmQ+grismCWu3m4V90GuY5CmNHN4eowfWIHaAF+AVpbL8wzOik/vH p9nSHW0vDLf+9iGXKe26O4OvFzmiRS2nSiSeR6dQu3W3chUa1JKHe5uIR U8aLzXCkmgFTMbcmkkRxQglbujE+ujotvpgdiCnJtpf9wua9F/8KQEKHX AKCWr82AfaKS4mZuzoW48TYO1LcIuOEjMgUsgIOhyoBrdH1UAnjFY7ScF t1kvuy344bdVHR4hJgLPdLH9GXpbUR8YTxqIk9PoM5Gjum339sAlHVjQs UfbNXRHVjO6Dr6NZgTEifjmswxM3ub86tlV1o5jewPBpqNdyweNMvf5cy g==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069760" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069760" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481340" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481340" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 01/22] drm/xe/svm: Add SVM document Date: Wed, 20 Dec 2023 23:37:51 -0500 Message-Id: <20231221043812.3783313-2-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add shared virtual memory document. 
Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe_svm.rst | 8 +++ drivers/gpu/drm/xe/xe_svm_doc.h | 121 ++++++++++++++++++++++++++++++++ 3 files changed, 130 insertions(+) create mode 100644 Documentation/gpu/xe/xe_svm.rst create mode 100644 drivers/gpu/drm/xe/xe_svm_doc.h diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst index c224ecaee81e..106b60aba1f0 100644 --- a/Documentation/gpu/xe/index.rst +++ b/Documentation/gpu/xe/index.rst @@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver. xe_firmware xe_tile xe_debugging + xe_svm diff --git a/Documentation/gpu/xe/xe_svm.rst b/Documentation/gpu/xe/xe_svm.rst new file mode 100644 index 000000000000..62954ba1c6f8 --- /dev/null +++ b/Documentation/gpu/xe/xe_svm.rst @@ -0,0 +1,8 @@ +.. SPDX-License-Identifier: (GPL-2.0+ OR MIT) + +============= +Shared virtual memory +============= + +.. kernel-doc:: drivers/gpu/drm/xe/xe_svm_doc.h + :doc: Shared virtual memory diff --git a/drivers/gpu/drm/xe/xe_svm_doc.h b/drivers/gpu/drm/xe/xe_svm_doc.h new file mode 100644 index 000000000000..de38ee3585e4 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_svm_doc.h @@ -0,0 +1,121 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef _XE_SVM_DOC_H_ +#define _XE_SVM_DOC_H_ + +/** + * DOC: Shared virtual memory + * + * Shared Virtual Memory (SVM) allows the programmer to use a single virtual + * address space shared between threads executing on CPUs and GPUs. It abstracts + * away from the user the location of the backing memory, and hence simplifies + * the user programming model. In a non-SVM memory model, user need to explicitly + * decide memory placement such as device or system memory, also user need to + * explicitly migrate memory b/t device and system memory. + * + * Interface + * ========= + * + * SVM makes use of default OS memory allocation and mapping interface such as + * malloc() and mmap(). The pointer returned from malloc() and mmap() can be + * directly used on both CPU and GPU program. + * + * SVM also provides API to set virtual address range based memory attributes + * such as preferred memory location, memory migration granularity, and memory + * atomic attributes etc. This is similar to Linux madvise API. + * + * Basic implementation + * ============== + * + * XeKMD implementation is based on Linux kernel Heterogeneous Memory Management + * (HMM) framework. HMM’s address space mirroring support allows sharing of the + * address space by duplicating sections of CPU page tables in the device page + * tables. This enables both CPU and GPU access a physical memory location using + * the same virtual address. + * + * Linux kernel also provides the ability to plugin device memory to the system + * (as a special ZONE_DEVICE type) and allocates struct page for each device memory + * page. + * + * HMM also provides a mechanism to migrate pages from host to device memory and + * vice versa. + * + * More information on HMM can be found here. + * https://www.kernel.org/doc/Documentation/vm/hmm.rst + * + * Unlike the non-SVM memory allocator (such as gem_create, vm_bind etc), there + * is no buffer object (BO, such as struct ttm_buffer_object, struct drm_gem_object), + * in our SVM implementation. 
We deliberately choose this implementation option + * to achieve page granularity memory placement, validation, eviction and migration. + * + * The SVM layer directly allocates device memory from the drm buddy subsystem. The + * memory is organized as many blocks, each of which has 2^n pages. The SVM subsystem + * then marks the usage of each page using a simple bitmap. When all pages in a + * block are no longer used, SVM returns this block back to the drm buddy subsystem. + * + * There are 3 events which can trigger the SVM subsystem into action: + * + * 1. An mmu notifier callback + * + * Since SVM needs to mirror the program's CPU virtual address space on the GPU side, + * when the program's CPU address space changes, SVM needs to make an identical change + * on the GPU side. SVM/HMM use the mmu interval notifier to achieve this. SVM registers + * an mmu interval notifier callback function with core mm, and whenever a CPU side + * virtual address space is changed (i.e., when a virtual address range is unmapped + * from the CPU by calling munmap), the registered callback function will be called from + * core mm. SVM then mirrors the CPU address space change on the GPU side, i.e., unmaps + * or invalidates the virtual address range from the GPU page table. + * + * 2. A GPU page fault + * + * At the very beginning of a process's life, no virtual address of the process + * is mapped in the GPU page table. So when the GPU accesses any virtual address of the process, + * a GPU page fault is triggered. SVM then decides the best memory location of the + * fault address (mainly from performance considerations; sometimes also from + * correctness requirements such as whether the GPU can perform atomic operations on + * a certain memory location), migrates memory if necessary, and maps the fault address + * into the GPU page table. + * + * 3. A CPU page fault + * + * A CPU page fault is usually managed by Linux core mm. But in a mixed CPU and GPU + * programming environment, the backing store of a virtual address range + * can be in the GPU's local memory which is not visible to the CPU (DEVICE_PRIVATE), + * so the CPU page fault handler needs to migrate such pages to system memory for the + * CPU to be able to access them. Such memory migration is device specific. + * HMM has a callback function (the migrate_to_ram function of the dev_pagemap_ops) + * for the device driver to implement. + * + * + * Memory hints: TBD + * ================= + * + * Memory eviction: TBD + * =============== + * + * Lock design + * =========== + * + * https://www.kernel.org/doc/Documentation/vm/hmm.rst section "Address space mirroring + * implementation and API" describes the locking scheme that the driver writer has to + * respect. There are 3 lock mechanisms involved in this scheme: + * + * 1. Use mmap_read/write_lock to protect VMAs and CPU page table operations. Operations such + * as munmap/mmap and page table updates during NUMA balancing must hold this lock. hmm_range_fault + * is a helper function provided by HMM to populate the CPU page table, so it must be called + * with this lock held. + * + * 2. Use xe_svm::mutex to protect device side page table operations. Any attempt to bind an + * address range to the GPU, or invalidate an address range from the GPU, should hold this device lock. + * + * 3. In the GPU page fault handler, during the device page table update, we hold the xe_svm::mutex, + * but we don't hold the mmap_read/write_lock. So the program's address space can change during + * the GPU page table update. The mmu notifier seq# is used to determine whether an unmap happened + * during the device page table update; if yes, then retry.
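[Editor's note] To make the retry scheme above concrete, below is a minimal sketch of the driver-side fault-handling loop that hmm.rst describes. It is illustrative only and not part of this patch: xe_bind_svm_range() and the xe_svm/xe_svm_range fields are the ones introduced later in this series, while the function name and the fixed pfn array size are made up for illustration.

static int example_svm_populate_and_bind(struct xe_svm *svm,
					 struct xe_svm_range *svm_range,
					 struct xe_tile *tile)
{
	/* Illustration only: assume the range covers at most 16 pages. */
	unsigned long hmm_pfns[16];
	struct hmm_range range = {
		.notifier = &svm_range->notifier,
		.start = svm_range->start,
		.end = svm_range->end,
		.hmm_pfns = hmm_pfns,
		.default_flags = HMM_PFN_REQ_FAULT,
	};
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(&svm_range->notifier);

	/* Lock 1: the CPU page table is populated under mmap_read_lock. */
	mmap_read_lock(svm->mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(svm->mm);
	if (ret == -EBUSY)
		goto again;
	if (ret)
		return ret;

	/*
	 * Locks 2 and 3: xe_bind_svm_range() takes xe_svm::mutex, rechecks
	 * the notifier seq# under that mutex and returns -EAGAIN if the CPU
	 * address space changed in the meantime, in which case we retry.
	 */
	ret = xe_bind_svm_range(svm->vm, tile, &range, 0);
	if (ret == -EAGAIN)
		goto again;
	return ret;
}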
+ * + */ + +#endif From patchwork Thu Dec 21 04:37:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BC80FC46CD8 for ; Thu, 21 Dec 2023 04:28:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5D78F10E66D; Thu, 21 Dec 2023 04:28:25 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id D6AA210E638; Thu, 21 Dec 2023 04:28:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132901; x=1734668901; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=61zBSdrJ80mEY4m4E27nOmJa1Z6npZKoFz/hC/ENZ5c=; b=MWhU94Qf+Zbek55pH0vKpxEPdTQ89FsuwB8hwKES4q7qDNcd0d4PlRsd VZRWLHFRBMMliFsf59mTTwRGpqORW8e9+ViHnc5Z8580lN7LYb28fYrnt e4vzwPbTtccxya5lHixi7mjg6LjCxjWbmwACoPpUsPuX58kz93x7ijiFg fHDgbYjpbM+IoB90keb9ARlPucLc5j1DAW5B9KyChathi3wA5miaGaRTg kJD/6Ztl40INhlKHBZI+PlfE4/G2kth0bCEpJBwVRVx6DlNcBEu+gzFFz Wz/h7AnYbUHkg2p79Jnrac6045+tMf5bJQ8iw0O78mzmj6JM8PJYEKIw2 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069761" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069761" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481343" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481343" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 02/22] drm/xe/svm: Add svm key data structures Date: Wed, 20 Dec 2023 23:37:52 -0500 Message-Id: <20231221043812.3783313-3-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add xe_svm and xe_svm_range data structure. Each xe_svm represents a svm address space and it maps 1:1 to the process's mm_struct. It also maps 1:1 to the gpu xe_vm struct. Each xe_svm_range represent a virtual address range inside a svm address space. It is similar to CPU's vm_area_struct, or to the GPU xe_vma struct. It contains data to synchronize this address range to CPU's virtual address range, using mmu notifier mechanism. It can also hold this range's memory attributes set by user, such as preferred memory location etc - this is TBD. 
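[Editor's note] For illustration, a minimal sketch (not part of this patch) of how such a range could be wired into the mmu interval notifier API referred to above; the callback and registration helper names are hypothetical, while mmu_interval_notifier_insert(), mmu_interval_set_seq() and the xe_svm/xe_svm_range fields are the real ones this patch introduces.

static bool example_svm_range_invalidate(struct mmu_interval_notifier *mni,
					 const struct mmu_notifier_range *range,
					 unsigned long cur_seq)
{
	struct xe_svm_range *svm_range =
		container_of(mni, struct xe_svm_range, notifier);

	/* Record the new notifier sequence number... */
	mmu_interval_set_seq(mni, cur_seq);

	/*
	 * ...then unmap/invalidate [svm_range->start, svm_range->end) from
	 * the GPU page table, e.g. with a helper such as the
	 * xe_invalidate_svm_range() added later in this series.
	 */
	return true;
}

static const struct mmu_interval_notifier_ops example_svm_mni_ops = {
	.invalidate = example_svm_range_invalidate,
};

/* Hypothetical registration helper: hook a new range into core mm. */
static int example_svm_range_register(struct xe_svm *svm,
				      struct xe_svm_range *range)
{
	return mmu_interval_notifier_insert(&range->notifier, svm->mm,
					    range->start,
					    range->end - range->start,
					    &example_svm_mni_ops);
}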
Each svm address space is made of many svm virtual address range. All address ranges are maintained in xe_svm's interval tree. Also add a xe_svm pointer to xe_vm data structure. So we have a 1:1 mapping b/t xe_svm and xe_vm. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.h | 59 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_vm_types.h | 2 ++ 2 files changed, 61 insertions(+) create mode 100644 drivers/gpu/drm/xe/xe_svm.h diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h new file mode 100644 index 000000000000..ba301a331f59 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#ifndef __XE_SVM_H +#define __XE_SVM_H +#include +#include +#include +#include +#include +#include + +struct xe_vm; +struct mm_struct; + +/** + * struct xe_svm - data structure to represent a shared + * virtual address space from device side. xe_svm, xe_vm + * and mm_struct has a 1:1:1 relationship. + */ +struct xe_svm { + /** @vm: The xe_vm address space corresponding to this xe_svm */ + struct xe_vm *vm; + /** @mm: The mm_struct corresponding to this xe_svm */ + struct mm_struct *mm; + /** + * @mutex: A lock used by svm subsystem. It protects: + * 1. below range_tree + * 2. GPU page table update. Serialize all SVM GPU page table updates + */ + struct mutex mutex; + /** + * @range_tree: Interval tree of all svm ranges in this svm + */ + struct rb_root_cached range_tree; +}; + +/** + * struct xe_svm_range - Represents a shared virtual address range. + */ +struct xe_svm_range { + /** @notifier: The mmu interval notifer used to keep track of CPU + * side address range change. 
Driver will get a callback with this + * notifier if anything changed from CPU side, such as range is + * unmapped from CPU + */ + struct mmu_interval_notifier notifier; + /** @start: start address of this range, inclusive */ + u64 start; + /** @end: end address of this range, exclusive */ + u64 end; + /** @unregister_notifier_work: A worker used to unregister this notifier */ + struct work_struct unregister_notifier_work; + /** @inode: used to link this range to svm's range_tree */ + struct interval_tree_node inode; +}; +#endif diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index 63e8a50b88e9..037fb7168c63 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -17,6 +17,7 @@ #include "xe_pt_types.h" #include "xe_range_fence.h" +struct xe_svm; struct xe_bo; struct xe_sync_entry; struct xe_vm; @@ -279,6 +280,7 @@ struct xe_vm { bool batch_invalidate_tlb; /** @xef: XE file handle for tracking this VM's drm client */ struct xe_file *xef; + struct xe_svm *svm; }; /** struct xe_vma_op_map - VMA map operation */ From patchwork Thu Dec 21 04:37:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501075 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 97503C35274 for ; Thu, 21 Dec 2023 04:28:39 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E0AB410E665; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 04BF110E646; Thu, 21 Dec 2023 04:28:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SnAGZEWVouqbngtFRpLMRdoRcj8w7Sa3qFCN3m4IV0s=; b=XV1kV2Wv1eaBs9DcLImF5q1n++Sj87KFO89YwaFV/tGdcdUL1kWL2nUu JHihqkTi4q1WkL1ZhS+MNxUJECY2DTf684XCwdSss5PVTcGDUjwLRM5tk AwCqkHTu099XlOW8Pb7pOBasVznp76niNwsJHVeG6eFrDIv8DYYjRt41d jDu4JNrhIh50HCJ76XgVEq8yI7avZpKN0qx3I9rGqRC8yav2LMDaaE91H SHUPz+io3YNlgUrPrYQIw7R4tTaFVQYTRxXoVEfrTtswN1yqPhvlzUVjk AXr4jj482C7huIaB5OwM+IQfHFRoPTL0eNG8n0QvtRlGqT4GBcYXCVvoC w==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069762" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069762" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481346" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481346" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 03/22] drm/xe/svm: create xe svm during vm creation Date: Wed, 20 Dec 2023 23:37:53 -0500 Message-Id: <20231221043812.3783313-4-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: 
<20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Create the xe_svm struct during xe_vm creation. Add xe_svm to a global hash table so later on we can retrieve xe_svm using mm_struct (the key). Destroy svm process during xe_vm close. Also add a helper function to retrieve svm struct from mm struct Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.c | 63 +++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_svm.h | 11 +++++++ drivers/gpu/drm/xe/xe_vm.c | 5 +++ 3 files changed, 79 insertions(+) create mode 100644 drivers/gpu/drm/xe/xe_svm.c diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c new file mode 100644 index 000000000000..559188471949 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#include +#include +#include "xe_svm.h" + +DEFINE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); + +/** + * xe_destroy_svm() - destroy a svm process + * + * @svm: the xe_svm to destroy + */ +void xe_destroy_svm(struct xe_svm *svm) +{ + hash_del_rcu(&svm->hnode); + mutex_destroy(&svm->mutex); + kfree(svm); +} + +/** + * xe_create_svm() - create a svm process + * + * @vm: the xe_vm that we create svm process for + * + * Return the created xe svm struct + */ +struct xe_svm *xe_create_svm(struct xe_vm *vm) +{ + struct mm_struct *mm = current->mm; + struct xe_svm *svm; + + svm = kzalloc(sizeof(struct xe_svm), GFP_KERNEL); + svm->mm = mm; + svm->vm = vm; + mutex_init(&svm->mutex); + /** Add svm to global xe_svm_table hash table + * use mm as key so later we can retrieve svm using mm + */ + hash_add_rcu(xe_svm_table, &svm->hnode, (uintptr_t)mm); + return svm; +} + +/** + * xe_lookup_svm_by_mm() - retrieve xe_svm from mm struct + * + * @mm: the mm struct of the svm to retrieve + * + * Return the xe_svm struct pointer, or NULL if fail + */ +struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm) +{ + struct xe_svm *svm; + + hash_for_each_possible_rcu(xe_svm_table, svm, hnode, (uintptr_t)mm) + if (svm->mm == mm) + return svm; + + return NULL; +} diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index ba301a331f59..cd3cf92f3784 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -11,10 +11,15 @@ #include #include #include +#include +#include struct xe_vm; struct mm_struct; +#define XE_MAX_SVM_PROCESS 5 /* Maximumly support 32 SVM process*/ +extern DECLARE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); + /** * struct xe_svm - data structure to represent a shared * virtual address space from device side. 
xe_svm, xe_vm @@ -35,6 +40,8 @@ struct xe_svm { * @range_tree: Interval tree of all svm ranges in this svm */ struct rb_root_cached range_tree; + /** @hnode: used to add this svm to a global xe_svm_hash table*/ + struct hlist_node hnode; }; /** @@ -56,4 +63,8 @@ struct xe_svm_range { /** @inode: used to link this range to svm's range_tree */ struct interval_tree_node inode; }; + +void xe_destroy_svm(struct xe_svm *svm); +struct xe_svm *xe_create_svm(struct xe_vm *vm); +struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm); #endif diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 1ca917b8315c..3c301a5c7325 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -36,6 +36,7 @@ #include "xe_trace.h" #include "generated/xe_wa_oob.h" #include "xe_wa.h" +#include "xe_svm.h" #define TEST_VM_ASYNC_OPS_ERROR @@ -1375,6 +1376,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) xe->usm.num_vm_in_non_fault_mode++; mutex_unlock(&xe->usm.lock); + vm->svm = xe_create_svm(vm); trace_xe_vm_create(vm); return vm; @@ -1495,6 +1497,9 @@ void xe_vm_close_and_put(struct xe_vm *vm) for_each_tile(tile, xe, id) xe_range_fence_tree_fini(&vm->rftree[id]); + if (vm->svm) + xe_destroy_svm(vm->svm); + xe_vm_put(vm); } From patchwork Thu Dec 21 04:37:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501069 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 55D0BC35274 for ; Thu, 21 Dec 2023 04:28:28 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 508A010E658; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2AB0810E638; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7Dq1kI5bDtk1FZgTD5fpOED/D+/k0o0UZYIn9HvhAPM=; b=Z3whQpVsjlY4IgJKiCiABMSBuPeqXcPq5p7buDITUPJgW9coBt0B1Uq9 HtUWyu4XF7EhuklWkcmoHP8XACGV8JWzHHrUcrmYap1uIIKhKOkCYvTpN a5FdwNB4Y/GQ9p4vYPc8PlHcB9XdpYuOOdj7XTraYz7RR7J1dhQqETmsN J58wL9rVdU53cz7nBVbIRXC+97i2/85DTBa8pTW43DDriKtclw1djkvjm 4MySwZLPi4UYeORbOAZ5Bt4HbzDXXAqHSOZ5n0m/SZF8vAs40djcQUBIN /iFIaFwQT0wTFdJRvdHowhFVyYZEJLtloKi9char5Nfi3jvqN9MSBLckw Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069763" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069763" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481349" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481349" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 04/22] drm/xe/svm: Trace svm creation 
Date: Wed, 20 Dec 2023 23:37:54 -0500 Message-Id: <20231221043812.3783313-5-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" xe_vm tracepoint is extended to also print svm. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_trace.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_trace.h b/drivers/gpu/drm/xe/xe_trace.h index 95163c303f3e..63867c0fa848 100644 --- a/drivers/gpu/drm/xe/xe_trace.h +++ b/drivers/gpu/drm/xe/xe_trace.h @@ -467,15 +467,17 @@ DECLARE_EVENT_CLASS(xe_vm, TP_STRUCT__entry( __field(u64, vm) __field(u32, asid) + __field(u64, svm) ), TP_fast_assign( __entry->vm = (unsigned long)vm; __entry->asid = vm->usm.asid; + __entry->svm = (unsigned long)vm->svm; ), - TP_printk("vm=0x%016llx, asid=0x%05x", __entry->vm, - __entry->asid) + TP_printk("vm=0x%016llx, asid=0x%05x, svm=0x%016llx", __entry->vm, + __entry->asid, __entry->svm) ); DEFINE_EVENT(xe_vm, xe_vm_kill, From patchwork Thu Dec 21 04:37:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501070 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B9FFDC35274 for ; Thu, 21 Dec 2023 04:28:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2142F10E652; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4CA0810E646; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fBy/nGHk4g7eyh/WjuYu7v2iQ0vWUflQsL/txH5D3k4=; b=eUP1Je9Ot6bq/HFOq+D2aAv3XPg07z3tvRw4mPqe3+8JBbKs3kQNQKy0 ofk712zAC3O4rO9SEgB/aFClrxGwI31PB0+ln7RZLUw0ITePZH0uuZbhf 4EJyaNuZK1VTkGiq6A0KOEVW/hVR75cfERZ+ttfE4qqzzm3oS4IEUU0U4 DaIUtS7Zxdu6dsXTYdaJP6RUgodogqk5WTHeEKFOdwRsfSIhu2b+b4bnr U4PokjHIEXhWyMe3GlIYubwPqi4PDVQsVxvz9VUOs4sfMguQZPEA2+dsj a5BZyeHgmGpdmWabtJ4ekB9oOZRwjBm7YtdyRMsR4G4kUeTMZMcUBoB4q Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069764" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069764" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481351" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481351" 
Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 05/22] drm/xe/svm: add helper to retrieve svm range from address Date: Wed, 20 Dec 2023 23:37:55 -0500 Message-Id: <20231221043812.3783313-6-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" All valid virtual address range are maintained in svm's range_tree. This functions iterate svm's range tree and return the svm range that contains specific address. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.h | 2 ++ drivers/gpu/drm/xe/xe_svm_range.c | 32 +++++++++++++++++++++++++++++++ 2 files changed, 34 insertions(+) create mode 100644 drivers/gpu/drm/xe/xe_svm_range.c diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index cd3cf92f3784..3ed106ecc02b 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -67,4 +67,6 @@ struct xe_svm_range { void xe_destroy_svm(struct xe_svm *svm); struct xe_svm *xe_create_svm(struct xe_vm *vm); struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm); +struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, + unsigned long addr); #endif diff --git a/drivers/gpu/drm/xe/xe_svm_range.c b/drivers/gpu/drm/xe/xe_svm_range.c new file mode 100644 index 000000000000..d8251d38f65e --- /dev/null +++ b/drivers/gpu/drm/xe/xe_svm_range.c @@ -0,0 +1,32 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#include +#include +#include +#include "xe_svm.h" + +/** + * xe_svm_range_from_addr() - retrieve svm_range contains a virtual address + * + * @svm: svm that the virtual address belongs to + * @addr: the virtual address to retrieve svm_range for + * + * return the svm range found, + * or NULL if no range found + */ +struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, + unsigned long addr) +{ + struct interval_tree_node *node; + + mutex_lock(&svm->mutex); + node = interval_tree_iter_first(&svm->range_tree, addr, addr); + mutex_unlock(&svm->mutex); + if (!node) + return NULL; + + return container_of(node, struct xe_svm_range, inode); +} From patchwork Thu Dec 21 04:37:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501088 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D9A2EC46CD8 for ; Thu, 21 Dec 2023 04:28:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org 
(Postfix) with ESMTP id F189D10E67D; Thu, 21 Dec 2023 04:28:31 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 67C8F10E652; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2kCKEQM0zRAF0jTpWpSPfvvYcsEoKW6S90DuXGUCuuo=; b=PAyE+rPI7vA67UsGcnqzCXsZlT34bys+EuX2MhErxFcbH4WMpClwJsKV xvdonAJnDyfijOetdg1DJXr4Wl9R8ewgj0SnEnN8PPqeRrvH+4cXvsDwG PjPZdZVqR9BXBe/qbRsQB7BSW7YKRHkqy/can5oGC72xXqbUFx1EG5E6D K527pycxDqgZWASNV69Q2NiipUVyOpC/W0qzYBDluHsXHjiclZdHHwQzp OXMw4wh+HZiTy+Hs5U0tHy0H6czzcxDz9T1xsXvu3M5D5cpFPu+lf/5Y3 0KCOR3KobY40HMHBcbkHFuzde3QEfJ4wuH8Ejjf+q0Oseb0TxfS7zSv7G Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069765" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069765" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481355" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481355" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 06/22] drm/xe/svm: Introduce a helper to build sg table from hmm range Date: Wed, 20 Dec 2023 23:37:56 -0500 Message-Id: <20231221043812.3783313-7-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Introduce xe_svm_build_sg helper function to build a scatter gather table from a hmm_range struct. This is prepare work for binding hmm range to gpu. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.c | 52 +++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_svm.h | 3 +++ 2 files changed, 55 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 559188471949..ab3cc2121869 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -6,6 +6,8 @@ #include #include #include "xe_svm.h" +#include +#include DEFINE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); @@ -61,3 +63,53 @@ struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm) return NULL; } + +/** + * xe_svm_build_sg() - build a scatter gather table for all the physical pages/pfn + * in a hmm_range. + * + * @range: the hmm range that we build the sg table from. range->hmm_pfns[] + * has the pfn numbers of pages that back up this hmm address range. + * @st: pointer to the sg table. 
+ * + * All the contiguous pfns will be collapsed into one entry in + * the scatter gather table. This is for the convenience of + * later on operations to bind address range to GPU page table. + * + * This function allocates the storage of the sg table. It is + * caller's responsibility to free it calling sg_free_table. + * + * Returns 0 if successful; -ENOMEM if fails to allocate memory + */ +int xe_svm_build_sg(struct hmm_range *range, + struct sg_table *st) +{ + struct scatterlist *sg; + u64 i, npages; + + sg = NULL; + st->nents = 0; + npages = ((range->end - 1) >> PAGE_SHIFT) - (range->start >> PAGE_SHIFT) + 1; + + if (unlikely(sg_alloc_table(st, npages, GFP_KERNEL))) + return -ENOMEM; + + for (i = 0; i < npages; i++) { + unsigned long addr = range->hmm_pfns[i]; + + if (sg && (addr == (sg_dma_address(sg) + sg->length))) { + sg->length += PAGE_SIZE; + sg_dma_len(sg) += PAGE_SIZE; + continue; + } + + sg = sg ? sg_next(sg) : st->sgl; + sg_dma_address(sg) = addr; + sg_dma_len(sg) = PAGE_SIZE; + sg->length = PAGE_SIZE; + st->nents++; + } + + sg_mark_end(sg); + return 0; +} diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 3ed106ecc02b..191bce6425db 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -13,6 +13,8 @@ #include #include #include +#include +#include "xe_device_types.h" struct xe_vm; struct mm_struct; @@ -69,4 +71,5 @@ struct xe_svm *xe_create_svm(struct xe_vm *vm); struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm); struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, unsigned long addr); +int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); #endif From patchwork Thu Dec 21 04:37:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501076 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EC2C4C47070 for ; Thu, 21 Dec 2023 04:28:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1DC6510E66A; Thu, 21 Dec 2023 04:28:24 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7665010E653; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kFTI+4vG3BRrNxkKNlzHPA05bP2aYXoRLJZ0TZub+CI=; b=BbGTmMDEDOU6tY+F396cFQmCC2ozUAGAEBUdCTVz4dyC7Njf767KDcYI 5L9AgNXp+30g0j7xBzeKvOHMG8333+E/djqDatbsxMzTXFA05Go7XrU0v mznhc6EllwegivTb/8lD8exgSoO2mmjz+3HFJQ1EY87IPxHR9DJuVS6Z2 zSmY6zsuPNR4K1r8RCVzDCuUlPcWyPEEg5R3KKHfPjy29XWDPCYnyBW0d Y813gcQOEI9xolHaJxAqE12T/CBkXAd10rstxVbYV8Hn3ZOkYuf7QbubU xmLTWiGXmEvZ/iSv1nccHNHn0/ASdRyDnu5miSnmqTzIKY3W42JTfKB9l Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069766" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069766" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: 
E=McAfee;i="6600,9927,10930"; a="805481358" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481358" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 07/22] drm/xe/svm: Add helper for binding hmm range to gpu Date: Wed, 20 Dec 2023 23:37:57 -0500 Message-Id: <20231221043812.3783313-8-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add helper function xe_bind_svm_range to bind a svm range to gpu. A temporary xe_vma is created locally to re-use existing page table update functions which are vma-based. The svm page table update lock design is different from userptr and bo page table update. A xe_pt_svm_pre_commit function is introduced for svm range pre-commitment. A hmm_range pointer is added to xe_vma struct. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_pt.c | 101 ++++++++++++++++++++++++++++++- drivers/gpu/drm/xe/xe_pt.h | 4 ++ drivers/gpu/drm/xe/xe_vm_types.h | 10 +++ 3 files changed, 113 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c index de1030a47588..65cfac88ab2f 100644 --- a/drivers/gpu/drm/xe/xe_pt.c +++ b/drivers/gpu/drm/xe/xe_pt.c @@ -17,6 +17,7 @@ #include "xe_trace.h" #include "xe_ttm_stolen_mgr.h" #include "xe_vm.h" +#include "xe_svm.h" struct xe_pt_dir { struct xe_pt pt; @@ -617,7 +618,10 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma, xe_bo_assert_held(bo); if (!xe_vma_is_null(vma)) { - if (xe_vma_is_userptr(vma)) + if (vma->svm_sg) + xe_res_first_sg(vma->svm_sg, 0, xe_vma_size(vma), + &curs); + else if (xe_vma_is_userptr(vma)) xe_res_first_sg(vma->userptr.sg, 0, xe_vma_size(vma), &curs); else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo)) @@ -1046,6 +1050,28 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update) return 0; } +static int xe_pt_svm_pre_commit(struct xe_migrate_pt_update *pt_update) +{ + struct xe_vma *vma = pt_update->vma; + struct hmm_range *range = vma->hmm_range; + + if (mmu_interval_read_retry(range->notifier, + range->notifier_seq)) { + /* + * FIXME: is this really necessary? We didn't update GPU + * page table yet... + */ + xe_vm_invalidate_vma(vma); + return -EAGAIN; + } + return 0; +} + +static const struct xe_migrate_pt_update_ops svm_bind_ops = { + .populate = xe_vm_populate_pgtable, + .pre_commit = xe_pt_svm_pre_commit, +}; + static const struct xe_migrate_pt_update_ops bind_ops = { .populate = xe_vm_populate_pgtable, .pre_commit = xe_pt_pre_commit, @@ -1197,7 +1223,8 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue struct xe_vm_pgtable_update entries[XE_VM_MAX_LEVEL * 2 + 1]; struct xe_pt_migrate_pt_update bind_pt_update = { .base = { - .ops = xe_vma_is_userptr(vma) ? 
&userptr_bind_ops : &bind_ops, + .ops = vma->svm_sg ? &svm_bind_ops : + (xe_vma_is_userptr(vma) ? &userptr_bind_ops : &bind_ops), .vma = vma, .tile_id = tile->id, }, @@ -1651,3 +1678,73 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queu return fence; } + +/** + * xe_bind_svm_range() - bind an address range to vm + * + * @vm: the vm to bind this address range + * @tile: the tile to bind this address range to + * @range: a hmm_range which includes all the information + * needed for binding: virtual address range and physical + * pfns to back up this virtual address range. + * @flags: the binding flags to set in pte + * + * This is a helper function used by svm sub-system + * to bind a svm range to gpu vm. svm sub-system + * doesn't have xe_vma, thus helpers such as + * __xe_pt_bind_vma can't be used directly. So this + * helper is written for svm sub-system to use. + * + * This is a synchronous function. When this function + * returns, either the svm range is bound to GPU, or + * error happened. + * + * Return: 0 for success or error code for failure + * If -EAGAIN returns, it means mmu notifier was called ( + * aka there was concurrent cpu page table update) during + * this function, caller has to retry hmm_range_fault + */ +int xe_bind_svm_range(struct xe_vm *vm, struct xe_tile *tile, + struct hmm_range *range, u64 flags) +{ + struct dma_fence *fence = NULL; + struct xe_svm *svm = vm->svm; + int ret = 0; + /* + * Create a temp vma to reuse page table helpers such as + * __xe_pt_bind_vma + */ + struct xe_vma vma = { + .gpuva = { + .va = { + .addr = range->start, + .range = range->end - range->start + 1, + }, + .vm = &vm->gpuvm, + .flags = flags, + }, + .tile_mask = 0x1 << tile->id, + .hmm_range = range, + }; + + xe_svm_build_sg(range, &vma.svm_sgt); + vma.svm_sg = &vma.svm_sgt; + + mutex_lock(&svm->mutex); + if (mmu_interval_read_retry(range->notifier, range->notifier_seq)) { + ret = -EAGAIN; + goto unlock; + } + fence = __xe_pt_bind_vma(tile, &vma, vm->q[tile->id], NULL, 0, false); + +unlock: + mutex_unlock(&svm->mutex); + sg_free_table(vma.svm_sg); + + if (IS_ERR(fence)) + return PTR_ERR(fence); + + dma_fence_wait(fence, false); + dma_fence_put(fence); + return ret; +} diff --git a/drivers/gpu/drm/xe/xe_pt.h b/drivers/gpu/drm/xe/xe_pt.h index 71a4fbfcff43..775d08707466 100644 --- a/drivers/gpu/drm/xe/xe_pt.h +++ b/drivers/gpu/drm/xe/xe_pt.h @@ -17,6 +17,8 @@ struct xe_sync_entry; struct xe_tile; struct xe_vm; struct xe_vma; +struct xe_svm; +struct hmm_range; /* Largest huge pte is currently 1GiB. May become device dependent. 
*/ #define MAX_HUGEPTE_LEVEL 2 @@ -45,4 +47,6 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queu bool xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma); +int xe_bind_svm_range(struct xe_vm *vm, struct xe_tile *tile, + struct hmm_range *range, u64 flags); #endif diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index 037fb7168c63..deefe2364667 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -21,6 +21,7 @@ struct xe_svm; struct xe_bo; struct xe_sync_entry; struct xe_vm; +struct hmm_range; #define TEST_VM_ASYNC_OPS_ERROR #define FORCE_ASYNC_OP_ERROR BIT(31) @@ -112,6 +113,15 @@ struct xe_vma { * user pointers */ struct xe_userptr userptr; + + /** + * @svm_sgt: a scatter gather table to save svm virtual address range's + * pfns + */ + struct sg_table svm_sgt; + struct sg_table *svm_sg; + /** hmm range of this pt update, used by svm */ + struct hmm_range *hmm_range; }; struct xe_device; From patchwork Thu Dec 21 04:37:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501073 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AF984C46CD3 for ; Thu, 21 Dec 2023 04:28:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DBC2F10E663; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 91DD010E646; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WQqPRntErOjKUxjbw9Wty6YMAMzApez6E8/rs70LTlA=; b=ia6ViTzTaUHKFG1ttsUgoMkRV0zst3HW4Tefkg+VxSo4At9egHANSM1W tq1lYer3rajPGCd07rlPS2e2HIb2ZE0QRZnyT+rYgrvfcrVwcpFfojbcd 60wByK8wnuI/eAOFgwWWGq9CDPOXDP7jIdU+b1JDHXliKoh3oOQwftdzB lRX54lASg5ODoA0AWahbO4Eh8kipfy2QpAZGkLkd56AQiMSRSSqvwz3wH 18odF/MpNYNbXd6wxjy1froEWBQx3AhjEVfSfS7IJIV8un7wAZFlc27EN k+51jA0ARGlFrS7G5JyIb5JNtMD4nCPEBwzRW/7YU2Osi3tlxRFIv0N+N g==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069767" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069767" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481360" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481360" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 08/22] drm/xe/svm: Add helper to invalidate svm range from GPU Date: Wed, 20 Dec 2023 23:37:58 -0500 Message-Id: <20231221043812.3783313-9-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: 
<20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" A svm subsystem friendly function is added for svm range invalidation purpose. svm subsystem doesn't maintain xe_vma, so a temporary xe_vma is used to call function xe_vma_invalidate_vma Not sure whether this works or not. Will have to test. if a temporary vma doesn't work, we will have to call the zap_pte/tlb_inv functions directly. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_pt.c | 33 +++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_pt.h | 1 + 2 files changed, 34 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c index 65cfac88ab2f..9805b402ebca 100644 --- a/drivers/gpu/drm/xe/xe_pt.c +++ b/drivers/gpu/drm/xe/xe_pt.c @@ -1748,3 +1748,36 @@ int xe_bind_svm_range(struct xe_vm *vm, struct xe_tile *tile, dma_fence_put(fence); return ret; } + +/** + * xe_invalidate_svm_range() - a helper to invalidate a svm address range + * + * @vm: The vm that the address range belongs to + * @start: start of the virtual address range + * @size: size of the virtual address range + * + * This is a helper function supposed to be used by svm subsystem. + * svm subsystem doesn't maintain xe_vma, so we create a temporary + * xe_vma structure so we can reuse xe_vm_invalidate_vma(). + */ +void xe_invalidate_svm_range(struct xe_vm *vm, u64 start, u64 size) +{ + struct xe_vma vma = { + .gpuva = { + .va = { + .addr = start, + .range = size, + }, + .vm = &vm->gpuvm, + }, + /** invalidate from all tiles + * FIXME: We used temporary vma in xe_bind_svm_range, so + * we lost track of which tile we are bound to. Does + * setting tile_present to all tiles cause a problem + * in xe_vm_invalidate_vma()? 
+ */ + .tile_present = BIT(vm->xe->info.tile_count) - 1, + }; + + xe_vm_invalidate_vma(&vma); +} diff --git a/drivers/gpu/drm/xe/xe_pt.h b/drivers/gpu/drm/xe/xe_pt.h index 775d08707466..42d495997635 100644 --- a/drivers/gpu/drm/xe/xe_pt.h +++ b/drivers/gpu/drm/xe/xe_pt.h @@ -49,4 +49,5 @@ bool xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma); int xe_bind_svm_range(struct xe_vm *vm, struct xe_tile *tile, struct hmm_range *range, u64 flags); +void xe_invalidate_svm_range(struct xe_vm *vm, u64 start, u64 size); #endif From patchwork Thu Dec 21 04:37:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501078 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7959BC35274 for ; Thu, 21 Dec 2023 04:28:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2A1D010E668; Thu, 21 Dec 2023 04:28:25 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9CDAD10E654; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=o7zJ3ty+jyk7jcCM7EPJoXwr0WaTeOVcPcRQczuvpS8=; b=fa3V+uzy7GznfyLPs8ycB38RW4v0QunsZOWF+Leq3YM5W49nlviSMU7B Qch2cWRYY4AwLXQeBxcSPQVFeFX5LNEzKar3uJ+6hPET4JrzxMlV5df3h DNqixG05d3RrUd88hk0c8dDrJhJx1WaYOM8X6GYOp4C1JtMmePcyi3PsU 7hFBiDR9afFyuD0VXlhuhTrWkHI+PqOJV5l+p08ETZL+UxdvOXo1fFrRZ CTBWw7Md8AV4/jTpgpXzY+GO0k2M1qkcrIyPGn8rX3C5PwRPwokEu4WtW grC80Cn9CPNNfPwUD+ukVv02pGmWUzwLvlhTtlz9Lc8yXIZL4Kur2cle2 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069768" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069768" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481363" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481363" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 09/22] drm/xe/svm: Remap and provide memmap backing for GPU vram Date: Wed, 20 Dec 2023 23:37:59 -0500 Message-Id: <20231221043812.3783313-10-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Memory remap GPU vram using 
devm_memremap_pages, so each GPU vram page is backed by a struct page. Those struct pages are created to allow hmm migrate buffer b/t GPU vram and CPU system memory using existing Linux migration mechanism (i.e., migrating b/t CPU system memory and hard disk). This is prepare work to enable svm (shared virtual memory) through Linux kernel hmm framework. The memory remap's page map type is set to MEMORY_DEVICE_PRIVATE for now. This means even though each GPU vram page get a struct page and can be mapped in CPU page table, but such pages are treated as GPU's private resource, so CPU can't access them. If CPU access such page, a page fault is triggered and page will be migrate to system memory. For GPU device which supports coherent memory protocol b/t CPU and GPU (such as CXL and CAPI protocol), we can remap device memory as MEMORY_DEVICE_COHERENT. This is TBD. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_device_types.h | 8 +++ drivers/gpu/drm/xe/xe_mmio.c | 7 +++ drivers/gpu/drm/xe/xe_svm.h | 2 + drivers/gpu/drm/xe/xe_svm_devmem.c | 87 ++++++++++++++++++++++++++++ 4 files changed, 104 insertions(+) create mode 100644 drivers/gpu/drm/xe/xe_svm_devmem.c diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 71f23ac365e6..c67c28f04d2f 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -99,6 +99,14 @@ struct xe_mem_region { resource_size_t actual_physical_size; /** @mapping: pointer to VRAM mappable space */ void *__iomem mapping; + /** @pagemap: Used to remap device memory as ZONE_DEVICE */ + struct dev_pagemap pagemap; + /** + * @hpa_base: base host physical address + * + * This is generated when remap device memory as ZONE_DEVICE + */ + resource_size_t hpa_base; }; /** diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c index f660cfb79f50..cfe25a3c7059 100644 --- a/drivers/gpu/drm/xe/xe_mmio.c +++ b/drivers/gpu/drm/xe/xe_mmio.c @@ -21,6 +21,7 @@ #include "xe_macros.h" #include "xe_module.h" #include "xe_tile.h" +#include "xe_svm.h" #define XEHP_MTCFG_ADDR XE_REG(0x101800) #define TILE_COUNT REG_GENMASK(15, 8) @@ -285,6 +286,7 @@ int xe_mmio_probe_vram(struct xe_device *xe) } io_size -= min_t(u64, tile_size, io_size); + xe_svm_devm_add(tile, &tile->mem.vram); } xe->mem.vram.actual_physical_size = total_size; @@ -353,10 +355,15 @@ void xe_mmio_probe_tiles(struct xe_device *xe) static void mmio_fini(struct drm_device *drm, void *arg) { struct xe_device *xe = arg; + struct xe_tile *tile; + u8 id; pci_iounmap(to_pci_dev(xe->drm.dev), xe->mmio.regs); if (xe->mem.vram.mapping) iounmap(xe->mem.vram.mapping); + for_each_tile(tile, xe, id) { + xe_svm_devm_remove(xe, &tile->mem.vram); + } } static int xe_verify_lmem_ready(struct xe_device *xe) diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 191bce6425db..b54f7714a1fc 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -72,4 +72,6 @@ struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm); struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, unsigned long addr); int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); +int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem); +void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mem); #endif diff --git a/drivers/gpu/drm/xe/xe_svm_devmem.c 
b/drivers/gpu/drm/xe/xe_svm_devmem.c new file mode 100644 index 000000000000..cf7882830247 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_svm_devmem.c @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#include +#include + +#include "xe_device_types.h" +#include "xe_trace.h" + + +static vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf) +{ + return 0; +} + +static void xe_devm_page_free(struct page *page) +{ +} + +static const struct dev_pagemap_ops xe_devm_pagemap_ops = { + .page_free = xe_devm_page_free, + .migrate_to_ram = xe_devm_migrate_to_ram, +}; + +/** + * xe_svm_devm_add: Remap and provide memmap backing for device memory + * @tile: tile that the memory region blongs to + * @mr: memory region to remap + * + * This remap device memory to host physical address space and create + * struct page to back device memory + * + * Return: 0 on success standard error code otherwise + */ +int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mr) +{ + struct device *dev = &to_pci_dev(tile->xe->drm.dev)->dev; + struct resource *res; + void *addr; + int ret; + + res = devm_request_free_mem_region(dev, &iomem_resource, + mr->usable_size); + if (IS_ERR(res)) { + ret = PTR_ERR(res); + return ret; + } + + mr->pagemap.type = MEMORY_DEVICE_PRIVATE; + mr->pagemap.range.start = res->start; + mr->pagemap.range.end = res->end; + mr->pagemap.nr_range = 1; + mr->pagemap.ops = &xe_devm_pagemap_ops; + mr->pagemap.owner = tile->xe->drm.dev; + addr = devm_memremap_pages(dev, &mr->pagemap); + if (IS_ERR(addr)) { + devm_release_mem_region(dev, res->start, resource_size(res)); + ret = PTR_ERR(addr); + drm_err(&tile->xe->drm, "Failed to remap tile %d memory, errno %d\n", + tile->id, ret); + return ret; + } + mr->hpa_base = res->start; + + drm_info(&tile->xe->drm, "Added tile %d memory [%llx-%llx] to devm, remapped to %pr\n", + tile->id, mr->io_start, mr->io_start + mr->usable_size, res); + return 0; +} + +/** + * xe_svm_devm_remove: Unmap device memory and free resources + * @xe: xe device + * @mr: memory region to remove + */ +void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mr) +{ + struct device *dev = &to_pci_dev(xe->drm.dev)->dev; + + if (mr->hpa_base) { + devm_memunmap_pages(dev, &mr->pagemap); + devm_release_mem_region(dev, mr->pagemap.range.start, + mr->pagemap.range.end - mr->pagemap.range.start +1); + } +} + From patchwork Thu Dec 21 04:38:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501082 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D222FC46CD2 for ; Thu, 21 Dec 2023 04:28:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5AF2210E66C; Thu, 21 Dec 2023 04:28:25 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id BCF2910E652; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=CRd9mzkVVErXnHEn4RtHqNseTZ1Uw+rVOrQ+69kVv1M=; b=VPRp5m0tvYibwmqhXGy3ZPMKUfWdU/ynN5fbbytqPOIgYmKfjN2JzZeh dxB6fFkTViUSLkQGOm/0c8t3SYhJlGRnkFR0R2NbJb76OuImJig6p4Cqc ByUV14RIW1/adV7qjunsQqtpYNm8TA8auI7PQZ4hhlRMnhtHsPxYZ2AJf sJIPcOMhb4zpoNGGmhh+2AymJZLKsgRDl/HxADsmX2ixzbg72aXSpKHvR jcHhQ8J1B5lmx5KmmA7+/4BLlBBrnnJ74B2G+7zsftA+YuQCIbvlEBrMQ 4CTaAsKEW6Ym97AL5karwgoaIwIrLpsy5jzn9ZWfvak6imKplbmIAtQ3B w==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069770" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069770" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481366" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481366" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 10/22] drm/xe/svm: Introduce svm migration function Date: Wed, 20 Dec 2023 23:38:00 -0500 Message-Id: <20231221043812.3783313-11-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Introduce xe_migrate_svm function for data migration. This function is similar to xe_migrate_copy function but has different parameters. Instead of BO and ttm resource parameters, it has source and destination buffer's dpa address as parameter. This function is intended to be used by svm sub-system which doesn't have BO and TTM concept. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_migrate.c | 213 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_migrate.h | 7 ++ 2 files changed, 220 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index adf1dab5eba2..425de8e44deb 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -387,6 +387,37 @@ static u64 xe_migrate_res_sizes(struct xe_device *xe, struct xe_res_cursor *cur) cur->remaining); } +/** + * pte_update_cmd_size() - calculate the batch buffer command size + * to update a flat page table. + * + * @size: The virtual address range size of the page table to update + * + * The page table to update is supposed to be a flat 1 level page + * table with all entries pointing to 4k pages. + * + * Return the number of dwords of the update command + */ +static u32 pte_update_cmd_size(u64 size) +{ + u32 dword; + u64 entries = DIV_ROUND_UP(size, XE_PAGE_SIZE); + + XE_WARN_ON(size > MAX_PREEMPTDISABLE_TRANSFER); + /* + * MI_STORE_DATA_IMM command is used to update page table. Each + * instruction can update maximumly 0x1ff pte entries. 
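+	 * As a worked example (illustrative numbers): a 2M update covers
+	 * 512 4K PTEs, so it takes two MI_STORE_DATA_IMM commands; with the
+	 * per-command overhead listed below that works out to
+	 * 2 * (1 + 2) + 512 * 2 = 1030 dwords.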
To update + * n (n <= 0x1ff) pte entries, we need: + * 1 dword for the MI_STORE_DATA_IMM command header (opcode etc) + * 2 dword for the page table's physical location + * 2*n dword for value of pte to fill (each pte entry is 2 dwords) + */ + dword = (1 + 2) * DIV_ROUND_UP(entries, 0x1ff); + dword += entries * 2; + + return dword; +} + static u32 pte_update_size(struct xe_migrate *m, bool is_vram, struct ttm_resource *res, @@ -492,6 +523,48 @@ static void emit_pte(struct xe_migrate *m, } } +/** + * build_pt_update_batch_sram() - build batch buffer commands to update + * migration vm page table for system memory + * + * @m: The migration context + * @bb: The batch buffer which hold the page table update commands + * @pt_offset: The offset of page table to update, in byte + * @dpa: device physical address you want the page table to point to + * @size: size of the virtual address space you want the page table to cover + */ +static void build_pt_update_batch_sram(struct xe_migrate *m, + struct xe_bb *bb, u32 pt_offset, + u64 dpa, u32 size) +{ + u16 pat_index = tile_to_xe(m->tile)->pat.idx[XE_CACHE_WB]; + u32 ptes; + + ptes = DIV_ROUND_UP(size, XE_PAGE_SIZE); + while (ptes) { + u32 chunk = min(0x1ffU, ptes); + + bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk); + bb->cs[bb->len++] = pt_offset; + bb->cs[bb->len++] = 0; + + pt_offset += chunk * 8; + ptes -= chunk; + + while (chunk--) { + u64 addr; + + addr = dpa & PAGE_MASK; + addr = m->q->vm->pt_ops->pte_encode_addr(m->tile->xe, + addr, pat_index, + 0, false, 0); + bb->cs[bb->len++] = lower_32_bits(addr); + bb->cs[bb->len++] = upper_32_bits(addr); + dpa += XE_PAGE_SIZE; + } + } +} + #define EMIT_COPY_CCS_DW 5 static void emit_copy_ccs(struct xe_gt *gt, struct xe_bb *bb, u64 dst_ofs, bool dst_is_indirect, @@ -808,6 +881,146 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, return fence; } +/** + * xe_migrate_svm() - A migrate function used by SVM subsystem + * + * @m: The migration context + * @src_dpa: device physical start address of source, from GPU's point of view + * @src_is_vram: True if source buffer is in vram. + * @dst_dpa: device physical start address of destination, from GPU's point of view + * @dst_is_vram: True if destination buffer is in vram. + * @size: The size of data to copy. + * + * Copy @size bytes of data from @src_dpa to @dst_dpa. The functionality + * and behavior of this function is similar to xe_migrate_copy function, but + * the interface is different. This function is a helper function supposed to + * be used by SVM subsytem. Since in SVM subsystem there is no buffer object + * and ttm, there is no src/dst bo as function input. Instead, we directly use + * src/dst's physical address as function input. + * + * Since the back store of any user malloc'ed or mmap'ed memory can be placed in + * system memory, it can not be compressed. Thus this function doesn't need + * to consider copy CCS (compression control surface) data as xe_migrate_copy did. + * + * This function assumes the source buffer and destination buffer are all physically + * contiguous. + * + * We use gpu blitter to copy data. Source and destination are first mapped to + * migration vm which is a flat one level (L0) page table, then blitter is used to + * perform the copy. + * + * Return: Pointer to a dma_fence representing the last copy batch, or + * an error pointer on failure. If there is a failure, any copy operation + * started by the function call has been synced. 
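+ *
+ * A minimal usage sketch (illustrative only; the source/destination
+ * addresses are placeholders and error handling is omitted):
+ *
+ *	fence = xe_migrate_svm(tile->migrate, src_dpa, true,
+ *			       dst_dma_addr, false, PAGE_SIZE);
+ *	if (!IS_ERR(fence)) {
+ *		dma_fence_wait(fence, false);
+ *		dma_fence_put(fence);
+ *	}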
+ */ +struct dma_fence *xe_migrate_svm(struct xe_migrate *m, + u64 src_dpa, + bool src_is_vram, + u64 dst_dpa, + bool dst_is_vram, + u64 size) +{ +#define NUM_PT_PER_BLIT (MAX_PREEMPTDISABLE_TRANSFER / SZ_2M) + struct xe_gt *gt = m->tile->primary_gt; + struct xe_device *xe = gt_to_xe(gt); + struct dma_fence *fence = NULL; + u64 src_L0_ofs, dst_L0_ofs; + u64 round_update_size; + /* A slot is a 4K page of page table, covers 2M virtual address*/ + u32 pt_slot; + int err; + + while (size) { + u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */ + struct xe_sched_job *job; + struct xe_bb *bb; + u32 update_idx; + + /* Maximumly copy MAX_PREEMPTDISABLE_TRANSFER bytes. Why?*/ + round_update_size = min_t(u64, size, MAX_PREEMPTDISABLE_TRANSFER); + + /* src pte update*/ + if (!src_is_vram) + batch_size += pte_update_cmd_size(round_update_size); + /* dst pte update*/ + if (!dst_is_vram) + batch_size += pte_update_cmd_size(round_update_size); + + /* Copy command size*/ + batch_size += EMIT_COPY_DW; + + bb = xe_bb_new(gt, batch_size, true); + if (IS_ERR(bb)) { + err = PTR_ERR(bb); + goto err_sync; + } + + if (!src_is_vram) { + pt_slot = 0; + build_pt_update_batch_sram(m, bb, pt_slot * XE_PAGE_SIZE, + src_dpa, round_update_size); + src_L0_ofs = xe_migrate_vm_addr(pt_slot, 0); + } + else + src_L0_ofs = xe_migrate_vram_ofs(xe, src_dpa); + + if (!dst_is_vram) { + pt_slot = NUM_PT_PER_BLIT; + build_pt_update_batch_sram(m, bb, pt_slot * XE_PAGE_SIZE, + dst_dpa, round_update_size); + dst_L0_ofs = xe_migrate_vm_addr(pt_slot, 0); + } + else + dst_L0_ofs = xe_migrate_vram_ofs(xe, dst_dpa); + + + bb->cs[bb->len++] = MI_BATCH_BUFFER_END; + update_idx = bb->len; + + emit_copy(gt, bb, src_L0_ofs, dst_L0_ofs, round_update_size, + XE_PAGE_SIZE); + + mutex_lock(&m->job_mutex); + job = xe_bb_create_migration_job(m->q, bb, + xe_migrate_batch_base(m, true), + update_idx); + if (IS_ERR(job)) { + err = PTR_ERR(job); + goto err; + } + + xe_sched_job_add_migrate_flush(job, 0); + xe_sched_job_arm(job); + dma_fence_put(fence); + fence = dma_fence_get(&job->drm.s_fence->finished); + xe_sched_job_push(job); + dma_fence_put(m->fence); + m->fence = dma_fence_get(fence); + + mutex_unlock(&m->job_mutex); + + xe_bb_free(bb, fence); + size -= round_update_size; + src_dpa += round_update_size; + dst_dpa += round_update_size; + continue; + +err: + mutex_unlock(&m->job_mutex); + xe_bb_free(bb, NULL); + +err_sync: + /* Sync partial copy if any. FIXME: under job_mutex? 
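+	 * (As currently written, err_sync is reached either before job_mutex
+	 * is taken or after the err: path has already dropped it, so the wait
+	 * below does not run under job_mutex.)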
*/ + if (fence) { + dma_fence_wait(fence, false); + dma_fence_put(fence); + } + + return ERR_PTR(err); + } + + return fence; +} static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs, u32 size, u32 pitch) { diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h index 951f19318ea4..a532760ae1fa 100644 --- a/drivers/gpu/drm/xe/xe_migrate.h +++ b/drivers/gpu/drm/xe/xe_migrate.h @@ -88,6 +88,13 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, struct ttm_resource *dst, bool copy_only_ccs); +struct dma_fence *xe_migrate_svm(struct xe_migrate *m, + u64 src_dpa, + bool src_is_vram, + u64 dst_dpa, + bool dst_is_vram, + u64 size); + struct dma_fence *xe_migrate_clear(struct xe_migrate *m, struct xe_bo *bo, struct ttm_resource *dst); From patchwork Thu Dec 21 04:38:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501074 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30A8DC4706E for ; Thu, 21 Dec 2023 04:28:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 23BDB10E653; Thu, 21 Dec 2023 04:28:24 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id BE2F210E653; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=h53Z4lotbrDr8C4EGWcUwwWlUSfdukAs71sZw2Y37nk=; b=ZfGARM0ITZgJC3nZ5bhntAFIdNLl/qQKsUXMETAapQdEGV5cr6OiFutz 2F086v9erARYFgKOIpr0H+eEhptW4dA6IxsRCaGqzfkP3+/6wn8uYAKMD 6jiLp2tMnlIBl5xTGarOUrm/uFo0KdiPEoY7k4HQlhJ5Sc4nKwXr1+49J 7Ae6iv6L9wIrwlLCmLiwZK9jOuAmolRTgJqnl7MzNX3cU60tehDZFzVaA 24ThbHUT63ido1BfoLs2d0hqTNDRsdNQH5YO94PPpyy8GA2yEfa96c+UC 15OL7jyfvuiab8Oyi9x3dV6Q51VOpj3s44q5ujIsoVP15lwOfJXRxyzZX A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069771" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069771" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481370" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481370" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 11/22] drm/xe/svm: implement functions to allocate and free device memory Date: Wed, 20 Dec 2023 23:38:01 -0500 Message-Id: <20231221043812.3783313-12-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Function xe_devm_alloc_pages allocate pages from drm buddy and perform house keeping work for all the pages allocated, such as get a page refcount, keep a bitmap of all pages to denote whether a page is in use, put pages to a drm lru list for eviction purpose. Function xe_devm_free_blocks return all memory blocks to drm buddy allocator. Function xe_devm_free_page is a call back function from hmm layer. It is called whenever a page's refcount reaches to 1. This function clears the bit of this page in the bitmap. If all the bits in the bitmap is cleared, it means all the pages have been freed, we return all the pages in this memory block back to drm buddy. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.h | 9 ++ drivers/gpu/drm/xe/xe_svm_devmem.c | 146 ++++++++++++++++++++++++++++- 2 files changed, 154 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index b54f7714a1fc..8551df2b9780 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -74,4 +74,13 @@ struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem); void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mem); + + +int xe_devm_alloc_pages(struct xe_tile *tile, + unsigned long npages, + struct list_head *blocks, + unsigned long *pfn); + +void xe_devm_free_blocks(struct list_head *blocks); +void xe_devm_page_free(struct page *page); #endif diff --git a/drivers/gpu/drm/xe/xe_svm_devmem.c b/drivers/gpu/drm/xe/xe_svm_devmem.c index cf7882830247..445e0e1bc3b4 100644 --- a/drivers/gpu/drm/xe/xe_svm_devmem.c +++ b/drivers/gpu/drm/xe/xe_svm_devmem.c @@ -5,18 +5,162 @@ #include #include +#include +#include +#include +#include +#include +#include +#include #include "xe_device_types.h" #include "xe_trace.h" +#include "xe_migrate.h" +#include "xe_ttm_vram_mgr_types.h" +#include "xe_assert.h" +/** + * struct xe_svm_block_meta - svm uses this data structure to manage each + * block allocated from drm buddy. This will be set to the drm_buddy_block's + * private field. + * + * @lru: used to link this block to drm's lru lists. This will be replace + * with struct drm_lru_entity later. + * @tile: tile from which we allocated this block + * @bitmap: A bitmap of each page in this block. 1 means this page is used, + * 0 means this page is idle. When all bits of this block are 0, it is time + * to return this block to drm buddy subsystem. + */ +struct xe_svm_block_meta { + struct list_head lru; + struct xe_tile *tile; + unsigned long bitmap[]; +}; + +static u64 block_offset_to_pfn(struct xe_mem_region *mr, u64 offset) +{ + /** DRM buddy's block offset is 0-based*/ + offset += mr->hpa_base; + + return PHYS_PFN(offset); +} + +/** + * xe_devm_alloc_pages() - allocate device pages from buddy allocator + * + * @xe_tile: which tile to allocate device memory from + * @npages: how many pages to allocate + * @blocks: used to return the allocated blocks + * @pfn: used to return the pfn of all allocated pages. 
Must be big enough + * to hold at @npages entries. + * + * This function allocate blocks of memory from drm buddy allocator, and + * performs initialization work: set struct page::zone_device_data to point + * to the memory block; set/initialize drm_buddy_block::private field; + * lock_page for each page allocated; add memory block to lru managers lru + * list - this is TBD. + * + * return: 0 on success + * error code otherwise + */ +int xe_devm_alloc_pages(struct xe_tile *tile, + unsigned long npages, + struct list_head *blocks, + unsigned long *pfn) +{ + struct drm_buddy *mm = &tile->mem.vram_mgr->mm; + struct drm_buddy_block *block, *tmp; + u64 size = npages << PAGE_SHIFT; + int ret = 0, i, j = 0; + + ret = drm_buddy_alloc_blocks(mm, 0, mm->size, size, PAGE_SIZE, + blocks, DRM_BUDDY_TOPDOWN_ALLOCATION); + + if (unlikely(ret)) + return ret; + + list_for_each_entry_safe(block, tmp, blocks, link) { + struct xe_mem_region *mr = &tile->mem.vram; + u64 block_pfn_first, pages_per_block; + struct xe_svm_block_meta *meta; + u32 meta_size; + + size = drm_buddy_block_size(mm, block); + pages_per_block = size >> PAGE_SHIFT; + meta_size = BITS_TO_BYTES(pages_per_block) + + sizeof(struct xe_svm_block_meta); + meta = kzalloc(meta_size, GFP_KERNEL); + bitmap_fill(meta->bitmap, pages_per_block); + meta->tile = tile; + block->private = meta; + block_pfn_first = + block_offset_to_pfn(mr, drm_buddy_block_offset(block)); + for(i = 0; i < pages_per_block; i++) { + struct page *page; + + pfn[j++] = block_pfn_first + i; + page = pfn_to_page(block_pfn_first + i); + /**Lock page per hmm requirement, see hmm.rst.*/ + zone_device_page_init(page); + page->zone_device_data = block; + } + } + + return ret; +} + +/** FIXME: we locked page by calling zone_device_page_init + * inxe_dev_alloc_pages. Should we unlock pages here? 
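+ * (Assumption, to be confirmed: when these pages are handed out as
+ * migration destinations, migrate_vma_finalize() on the fault path is
+ * expected to unlock them, so no extra unlock should be needed here.)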
+ */ +static void free_block(struct drm_buddy_block *block) +{ + struct xe_svm_block_meta *meta = + (struct xe_svm_block_meta *)block->private; + struct xe_tile *tile = meta->tile; + struct drm_buddy *mm = &tile->mem.vram_mgr->mm; + + kfree(block->private); + drm_buddy_free_block(mm, block); +} + +/** + * xe_devm_free_blocks() - free all memory blocks + * + * @blocks: memory blocks list head + */ +void xe_devm_free_blocks(struct list_head *blocks) +{ + struct drm_buddy_block *block, *tmp; + + list_for_each_entry_safe(block, tmp, blocks, link) + free_block(block); +} static vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf) { return 0; } -static void xe_devm_page_free(struct page *page) +void xe_devm_page_free(struct page *page) { + struct drm_buddy_block *block = + (struct drm_buddy_block *)page->zone_device_data; + struct xe_svm_block_meta *meta = + (struct xe_svm_block_meta *)block->private; + struct xe_tile *tile = meta->tile; + struct xe_mem_region *mr = &tile->mem.vram; + struct drm_buddy *mm = &tile->mem.vram_mgr->mm; + u64 size = drm_buddy_block_size(mm, block); + u64 pages_per_block = size >> PAGE_SHIFT; + u64 block_pfn_first = + block_offset_to_pfn(mr, drm_buddy_block_offset(block)); + u64 page_pfn = page_to_pfn(page); + u64 i = page_pfn - block_pfn_first; + + xe_assert(tile->xe, i < pages_per_block); + clear_bit(i, meta->bitmap); + if (bitmap_empty(meta->bitmap, pages_per_block)) + free_block(block); } static const struct dev_pagemap_ops xe_devm_pagemap_ops = { From patchwork Thu Dec 21 04:38:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501072 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E33A1C4706C for ; Thu, 21 Dec 2023 04:28:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EBD4C10E667; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id E379410E654; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=f40+dO32XfCQXtj0bJ18zsnoVP4q5bzdJvLh70FVUGk=; b=fiLhHv252zYocMHIcUWrpHBEpa8ZL2qGdWN2B/mnzJgIkjCVtnefDous URTfVkEHyOfb4bOTH2X2cdK3tyxUTRpLnqoDW8ML+Y2hPX+1lv0xdKFXh I8nBNl0m466IwwEly1QnDDWWZSh+uxFfHM/iJu95fkknLBNyRaDELrs4Z 4eBvmVdcytXBxTcor7LWhu23mOHqWmSuH4k5WbGm703h+dUK9J3LkkoF4 +Tqr7/H27P5WE9XrKBZBdQQhA+sxwRZRWXV+IxgP36hWlcb9erbiOmuej kne+cQlZYG5X6DseXtDz2YA+FH+AlNiqArYQenKLkx+lzWCc3Y0cqjfyy A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069772" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069772" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481372" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481372" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by 
orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 12/22] drm/xe/svm: Trace buddy block allocation and free Date: Wed, 20 Dec 2023 23:38:02 -0500 Message-Id: <20231221043812.3783313-13-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm_devmem.c | 5 ++++- drivers/gpu/drm/xe/xe_trace.h | 35 ++++++++++++++++++++++++++++++ 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_svm_devmem.c b/drivers/gpu/drm/xe/xe_svm_devmem.c index 445e0e1bc3b4..5cd54dde4a9d 100644 --- a/drivers/gpu/drm/xe/xe_svm_devmem.c +++ b/drivers/gpu/drm/xe/xe_svm_devmem.c @@ -95,6 +95,7 @@ int xe_devm_alloc_pages(struct xe_tile *tile, block->private = meta; block_pfn_first = block_offset_to_pfn(mr, drm_buddy_block_offset(block)); + trace_xe_buddy_block_alloc(block, size, block_pfn_first); for(i = 0; i < pages_per_block; i++) { struct page *page; @@ -159,8 +160,10 @@ void xe_devm_page_free(struct page *page) xe_assert(tile->xe, i < pages_per_block); clear_bit(i, meta->bitmap); - if (bitmap_empty(meta->bitmap, pages_per_block)) + if (bitmap_empty(meta->bitmap, pages_per_block)) { free_block(block); + trace_xe_buddy_block_free(block, size, block_pfn_first); + } } static const struct dev_pagemap_ops xe_devm_pagemap_ops = { diff --git a/drivers/gpu/drm/xe/xe_trace.h b/drivers/gpu/drm/xe/xe_trace.h index 63867c0fa848..50380f5173ca 100644 --- a/drivers/gpu/drm/xe/xe_trace.h +++ b/drivers/gpu/drm/xe/xe_trace.h @@ -11,6 +11,7 @@ #include #include +#include #include "xe_bo_types.h" #include "xe_exec_queue_types.h" @@ -600,6 +601,40 @@ DEFINE_EVENT_PRINT(xe_guc_ctb, xe_guc_ctb_g2h, ); +DECLARE_EVENT_CLASS(xe_buddy_block, + TP_PROTO(struct drm_buddy_block *block, u64 size, u64 pfn), + TP_ARGS(block, size, pfn), + + TP_STRUCT__entry( + __field(u64, block) + __field(u64, header) + __field(u64, size) + __field(u64, pfn) + ), + + TP_fast_assign( + __entry->block = (u64)block; + __entry->header = block->header; + __entry->size = size; + __entry->pfn = pfn; + ), + + TP_printk("xe svm: allocated block %llx, block header %llx, size %llx, pfn %llx\n", + __entry->block, __entry->header, __entry->size, __entry->pfn) +); + + +DEFINE_EVENT(xe_buddy_block, xe_buddy_block_alloc, + TP_PROTO(struct drm_buddy_block *block, u64 size, u64 pfn), + TP_ARGS(block, size, pfn) +); + + +DEFINE_EVENT(xe_buddy_block, xe_buddy_block_free, + TP_PROTO(struct drm_buddy_block *block, u64 size, u64 pfn), + TP_ARGS(block, size, pfn) +); + #endif /* This part must be outside protection */ From patchwork Thu Dec 21 04:38:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501086 Return-Path: X-Spam-Checker-Version: 
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30911C46CD2 for ; Thu, 21 Dec 2023 04:28:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7048510E679; Thu, 21 Dec 2023 04:28:31 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id ECE6D10E658; Thu, 21 Dec 2023 04:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132902; x=1734668902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=468vXP6XMA+Ped90HBw+ujY2GwFKDU6DLNg9OR9HBxA=; b=WH5s/lQ8yPWVvxZ0F88tYxL77jafZhXHtckxeMGeBGn3KrR55kGF69NL vs3jLBUP1QsgEWgrgQQzMZaJ1BrBwQle/SGM5cUp9LhTBfXUluHiy28vK GWg5500ancfQXqsgyMaORXRm6fbyrKqjK5I8iUHSVZIlCqEruW5gaEXWQ Li75p4Le0rt2e1C1RXIW0J8PgB5BOUBMW56z21Czp4tFR3+id4cQzm27N X79q/RQVPNps7nw2kCV/VWK+8U/Fv7fqFbvldNUJTtWeB4769IwRlJjzU vY9nVqGjS/ztc+F7GAURKy5HJLbPhj50FpyU65kjoY4y79HOK1ZkPOSjY A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069773" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069773" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481376" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481376" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 13/22] drm/xe/svm: Handle CPU page fault Date: Wed, 20 Dec 2023 23:38:03 -0500 Message-Id: <20231221043812.3783313-14-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Under the picture of svm, CPU and GPU program share one same virtual address space. The backing store of this virtual address space can be either in system memory or device memory. Since GPU device memory is remaped as DEVICE_PRIVATE, CPU can't access it. Any CPU access to device memory causes a page fault. Implement a page fault handler to migrate memory back to system memory and map it to CPU page table so the CPU program can proceed. 
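As an illustration of the intended behavior (the gpu_compute() helper below is hypothetical and only stands in for any GPU submission that touches the buffer), the same malloc'ed pointer is used on both sides; the CPU loop is what ends up in the fault handler added by this patch once the pages are resident in device-private vram:

	#include <stdlib.h>

	/* Hypothetical helper: submits GPU work that touches buf, faulting
	 * the backing pages into device-private vram. */
	extern void gpu_compute(float *buf, size_t n);

	static float sum_on_cpu(size_t n)
	{
		float *buf = malloc(n * sizeof(*buf));
		float sum = 0.0f;
		size_t i;

		if (!buf)
			return 0.0f;

		gpu_compute(buf, n);		/* pages migrate to vram */

		/* The first CPU touch of each vram-resident page faults here;
		 * the handler migrates the page back to system memory and the
		 * load then proceeds normally. */
		for (i = 0; i < n; i++)
			sum += buf[i];

		free(buf);
		return sum;
	}
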
Also unbind this page from GPU side, and free the original GPU device page Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_device_types.h | 12 ++ drivers/gpu/drm/xe/xe_svm.h | 8 +- drivers/gpu/drm/xe/xe_svm_devmem.c | 10 +- drivers/gpu/drm/xe/xe_svm_migrate.c | 230 +++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_svm_range.c | 27 ++++ 5 files changed, 280 insertions(+), 7 deletions(-) create mode 100644 drivers/gpu/drm/xe/xe_svm_migrate.c diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index c67c28f04d2f..ac77996bebe6 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -555,4 +555,16 @@ struct xe_file { struct xe_drm_client *client; }; +static inline struct xe_tile *mem_region_to_tile(struct xe_mem_region *mr) +{ + return container_of(mr, struct xe_tile, mem.vram); +} + +static inline u64 vram_pfn_to_dpa(struct xe_mem_region *mr, u64 pfn) +{ + u64 dpa; + u64 offset = (pfn << PAGE_SHIFT) - mr->hpa_base; + dpa = mr->dpa_base + offset; + return dpa; +} #endif diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 8551df2b9780..6b93055934f8 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -12,8 +12,10 @@ #include #include #include +#include #include #include +#include #include "xe_device_types.h" struct xe_vm; @@ -66,16 +68,20 @@ struct xe_svm_range { struct interval_tree_node inode; }; +vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf); void xe_destroy_svm(struct xe_svm *svm); struct xe_svm *xe_create_svm(struct xe_vm *vm); struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm); struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, unsigned long addr); +bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, + struct xe_svm_range *range, + struct vm_area_struct *vma); + int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem); void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mem); - int xe_devm_alloc_pages(struct xe_tile *tile, unsigned long npages, struct list_head *blocks, diff --git a/drivers/gpu/drm/xe/xe_svm_devmem.c b/drivers/gpu/drm/xe/xe_svm_devmem.c index 5cd54dde4a9d..01f8385ebb5b 100644 --- a/drivers/gpu/drm/xe/xe_svm_devmem.c +++ b/drivers/gpu/drm/xe/xe_svm_devmem.c @@ -11,13 +11,16 @@ #include #include #include +#include +#include #include - #include "xe_device_types.h" #include "xe_trace.h" #include "xe_migrate.h" #include "xe_ttm_vram_mgr_types.h" #include "xe_assert.h" +#include "xe_pt.h" +#include "xe_svm.h" /** * struct xe_svm_block_meta - svm uses this data structure to manage each @@ -137,11 +140,6 @@ void xe_devm_free_blocks(struct list_head *blocks) free_block(block); } -static vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf) -{ - return 0; -} - void xe_devm_page_free(struct page *page) { struct drm_buddy_block *block = diff --git a/drivers/gpu/drm/xe/xe_svm_migrate.c b/drivers/gpu/drm/xe/xe_svm_migrate.c new file mode 100644 index 000000000000..3be26da33aa3 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_svm_migrate.c @@ -0,0 +1,230 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2023 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "xe_device_types.h" +#include "xe_trace.h" +#include "xe_migrate.h" 
+#include "xe_ttm_vram_mgr_types.h" +#include "xe_assert.h" +#include "xe_pt.h" +#include "xe_svm.h" + + +/** + * alloc_host_page() - allocate one host page for the fault vma + * + * @dev: (GPU) device that will access the allocated page + * @vma: the fault vma that we need allocate page for + * @addr: the fault address. The allocated page is for this address + * @dma_addr: used to output the dma address of the allocated page. + * This dma address will be used for gpu to access this page. GPU + * access host page through a dma mapped address. + * @pfn: used to output the pfn of the allocated page. + * + * This function allocate one host page for the specified vma. It + * also does some prepare work for GPU to access this page, such + * as map this page to iommu (by calling dma_map_page). + * + * When this function returns, the page is locked. + * + * Return struct page pointer when success + * NULL otherwise + */ +static struct page *alloc_host_page(struct device *dev, + struct vm_area_struct *vma, + unsigned long addr, + dma_addr_t *dma_addr, + unsigned long *pfn) +{ + struct page *page; + + page = alloc_page_vma(GFP_HIGHUSER, vma, addr); + if (unlikely(!page)) + return NULL; + + /**Lock page per hmm requirement, see hmm.rst*/ + lock_page(page); + *dma_addr = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE); + if (unlikely(dma_mapping_error(dev, *dma_addr))) { + unlock_page(page); + __free_page(page); + return NULL; + } + + *pfn = migrate_pfn(page_to_pfn(page)); + return page; +} + +static void free_host_page(struct page *page) +{ + unlock_page(page); + put_page(page); +} + +static inline struct xe_mem_region *page_to_mem_region(struct page *page) +{ + return container_of(page->pgmap, struct xe_mem_region, pagemap); +} + +/** + * migrate_page_vram_to_ram() - migrate one page from vram to ram + * + * @vma: The vma that the page is mapped to + * @addr: The virtual address that the page is mapped to + * @src_pfn: src page's page frame number + * @dst_pfn: used to return dstination page (in system ram)'s pfn + * + * Allocate one page in system ram and copy memory from device memory + * to system ram. + * + * Return: 0 if this page is already in sram (no need to migrate) + * 1: successfully migrated this page from vram to sram. 
+ * error code otherwise + */ +static int migrate_page_vram_to_ram(struct vm_area_struct *vma, unsigned long addr, + unsigned long src_pfn, unsigned long *dst_pfn) +{ + struct xe_mem_region *mr; + struct xe_tile *tile; + struct xe_device *xe; + struct device *dev; + dma_addr_t dma_addr = 0; + struct dma_fence *fence; + struct page *host_page; + struct page *src_page; + u64 src_dpa; + + src_page = migrate_pfn_to_page(src_pfn); + if (unlikely(!src_page || !(src_pfn & MIGRATE_PFN_MIGRATE))) + return 0; + + mr = page_to_mem_region(src_page); + tile = mem_region_to_tile(mr); + xe = tile_to_xe(tile); + dev = xe->drm.dev; + + src_dpa = vram_pfn_to_dpa(mr, src_pfn); + host_page = alloc_host_page(dev, vma, addr, &dma_addr, dst_pfn); + if (!host_page) + return -ENOMEM; + + fence = xe_migrate_svm(tile->migrate, src_dpa, true, + dma_addr, false, PAGE_SIZE); + if (IS_ERR(fence)) { + dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_FROM_DEVICE); + free_host_page(host_page); + return PTR_ERR(fence); + } + + dma_fence_wait(fence, false); + dma_fence_put(fence); + dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_FROM_DEVICE); + return 1; +} + +/** + * xe_devmem_migrate_to_ram() - Migrate memory back to sram on CPU page fault + * + * @vmf: cpu vm fault structure, contains fault information such as vma etc. + * + * Note, this is in CPU's vm fault handler, caller holds the mmap read lock. + * FIXME: relook the lock design here. Is there any deadlock? + * + * This function migrate one svm range which contains the fault address to sram. + * We try to maintain a 1:1 mapping b/t the vma and svm_range (i.e., create one + * svm range for one vma initially and try not to split it). So this scheme end + * up migrate at the vma granularity. This might not be the best performant scheme + * when GPU is in the picture. + * + * This can be tunned with a migration granularity for performance, for example, + * migration 2M for each CPU page fault, or let user specify how much to migrate. + * But this is more complicated as this scheme requires vma and svm_range splitting. + * + * This function should also update GPU page table, so the fault virtual address + * points to the same sram location from GPU side. This is TBD. 
+ * + * Return: + * 0 on success + * VM_FAULT_SIGBUS: failed to migrate page to system memory, application + * will be signaled a SIGBUG + */ +vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf) +{ + struct xe_mem_region *mr = page_to_mem_region(vmf->page); + struct xe_tile *tile = mem_region_to_tile(mr); + struct xe_device *xe = tile_to_xe(tile); + struct vm_area_struct *vma = vmf->vma; + struct mm_struct *mm = vma->vm_mm; + struct xe_svm *svm = xe_lookup_svm_by_mm(mm); + struct xe_svm_range *range = xe_svm_range_from_addr(svm, vmf->address); + struct xe_vm *vm = svm->vm; + u64 npages = (range->end - range->start) >> PAGE_SHIFT; + unsigned long addr = range->start; + vm_fault_t ret = 0; + void *buf; + int i; + + struct migrate_vma migrate_vma = { + .vma = vmf->vma, + .start = range->start, + .end = range->end, + .pgmap_owner = xe->drm.dev, + .flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE, + .fault_page = vmf->page, + }; + + xe_assert(xe, IS_ALIGNED(vmf->address, PAGE_SIZE)); + xe_assert(xe, IS_ALIGNED(range->start, PAGE_SIZE)); + xe_assert(xe, IS_ALIGNED(range->end, PAGE_SIZE)); + /**FIXME: in case of vma split, svm range might not belongs to one vma*/ + xe_assert(xe, xe_svm_range_belongs_to_vma(mm, range, vma)); + + buf = kvcalloc(npages, 2* sizeof(*migrate_vma.src), GFP_KERNEL); + migrate_vma.src = buf; + migrate_vma.dst = buf + npages; + if (migrate_vma_setup(&migrate_vma) < 0) { + ret = VM_FAULT_SIGBUS; + goto free_buf; + } + + if (!migrate_vma.cpages) + goto free_buf; + + for (i = 0; i < npages; i++) { + ret = migrate_page_vram_to_ram(vma, addr, migrate_vma.src[i], + migrate_vma.dst + i); + if (ret < 0) { + ret = VM_FAULT_SIGBUS; + break; + } + + /** Migration has been successful, unbind src page from gpu, + * and free source page + */ + if (ret == 1) { + struct page *src_page = migrate_pfn_to_page(migrate_vma.src[i]); + + xe_invalidate_svm_range(vm, addr, PAGE_SIZE); + xe_devm_page_free(src_page); + } + + addr += PAGE_SIZE; + } + + migrate_vma_pages(&migrate_vma); + migrate_vma_finalize(&migrate_vma); +free_buf: + kvfree(buf); + return 0; +} diff --git a/drivers/gpu/drm/xe/xe_svm_range.c b/drivers/gpu/drm/xe/xe_svm_range.c index d8251d38f65e..b32c32f60315 100644 --- a/drivers/gpu/drm/xe/xe_svm_range.c +++ b/drivers/gpu/drm/xe/xe_svm_range.c @@ -5,7 +5,9 @@ #include #include +#include #include +#include #include "xe_svm.h" /** @@ -30,3 +32,28 @@ struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, return container_of(node, struct xe_svm_range, inode); } + +/** + * xe_svm_range_belongs_to_vma() - determine a virtual address range + * belongs to a vma or not + * + * @mm: the mm of the virtual address range + * @range: the svm virtual address range + * @vma: the vma to determine the range + * + * Returns true if range belongs to vma + * false otherwise + */ +bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, + struct xe_svm_range *range, + struct vm_area_struct *vma) +{ + struct vm_area_struct *vma1, *vma2; + unsigned long start = range->start; + unsigned long end = range->end; + + vma1 = find_vma_intersection(mm, start, start + 4); + vma2 = find_vma_intersection(mm, end - 4, end); + + return (vma1 == vma) && (vma2 == vma); +} From patchwork Thu Dec 21 04:38:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501080 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org 
(gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E3B9BC4706C for ; Thu, 21 Dec 2023 04:28:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5ECD610E66E; Thu, 21 Dec 2023 04:28:25 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1571F10E653; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=a0CyKgI90QVtJzTmA2DuaoGo2T/vpJ7XqCsA6jXeFbo=; b=XpaSkZFyFkvcAdSl0sksFKyOI4E2xNW2n6ZD4B0pOnv6agv0hRKWhoUV xIlDU79JVhIIG4my6x2d/WlhXmWhnahlvEt1aW2ruSZyoC+Rz6Klv3Vtz qpT2f4pBmtnjLPOXQ+beXVoK9PXGCxbdTNwd/0j/tr5X1pT9v80hKh+zl MGjYv//nka0XYBY39iVg7EM7OBtBzP7ZRuxB10oc5z1p+NrzxpuP/ocl6 its10y4ngxFsOh6y1Z4Rnk10HTLRTpY6NIfSpras5aY1wggJKP4DBq/W7 RnxoynEswvWr3DMo8HuJCvPMqg/CuEoiKd/jHy4M4eqDtu1nlSSuaITjx A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069774" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069774" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481378" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481378" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:20 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 14/22] drm/xe/svm: trace svm range migration Date: Wed, 20 Dec 2023 23:38:04 -0500 Message-Id: <20231221043812.3783313-15-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add function to trace svm range migration, either from vram to sram, or sram to vram Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm_migrate.c | 1 + drivers/gpu/drm/xe/xe_trace.h | 30 +++++++++++++++++++++++++++++ 2 files changed, 31 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_svm_migrate.c b/drivers/gpu/drm/xe/xe_svm_migrate.c index 3be26da33aa3..b4df411e04f3 100644 --- a/drivers/gpu/drm/xe/xe_svm_migrate.c +++ b/drivers/gpu/drm/xe/xe_svm_migrate.c @@ -201,6 +201,7 @@ vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf) if (!migrate_vma.cpages) goto free_buf; + trace_xe_svm_migrate_vram_to_sram(range); for (i = 0; i < npages; i++) { ret = migrate_page_vram_to_ram(vma, addr, migrate_vma.src[i], migrate_vma.dst + i); diff --git a/drivers/gpu/drm/xe/xe_trace.h b/drivers/gpu/drm/xe/xe_trace.h index 50380f5173ca..960eec38aee5 
100644 --- a/drivers/gpu/drm/xe/xe_trace.h +++ b/drivers/gpu/drm/xe/xe_trace.h @@ -21,6 +21,7 @@ #include "xe_guc_exec_queue_types.h" #include "xe_sched_job.h" #include "xe_vm.h" +#include "xe_svm.h" DECLARE_EVENT_CLASS(xe_gt_tlb_invalidation_fence, TP_PROTO(struct xe_gt_tlb_invalidation_fence *fence), @@ -601,6 +602,35 @@ DEFINE_EVENT_PRINT(xe_guc_ctb, xe_guc_ctb_g2h, ); +DECLARE_EVENT_CLASS(xe_svm_migrate, + TP_PROTO(struct xe_svm_range *range), + TP_ARGS(range), + + TP_STRUCT__entry( + __field(u64, start) + __field(u64, end) + ), + + TP_fast_assign( + __entry->start = range->start; + __entry->end = range->end; + ), + + TP_printk("Migrate svm range [0x%016llx,0x%016llx)", __entry->start, + __entry->end) +); + +DEFINE_EVENT(xe_svm_migrate, xe_svm_migrate_vram_to_sram, + TP_PROTO(struct xe_svm_range *range), + TP_ARGS(range) +); + + +DEFINE_EVENT(xe_svm_migrate, xe_svm_migrate_sram_to_vram, + TP_PROTO(struct xe_svm_range *range), + TP_ARGS(range) +); + DECLARE_EVENT_CLASS(xe_buddy_block, TP_PROTO(struct drm_buddy_block *block, u64 size, u64 pfn), TP_ARGS(block, size, pfn), From patchwork Thu Dec 21 04:38:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501081 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F0097C35274 for ; Thu, 21 Dec 2023 04:28:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EB0C210E65F; Thu, 21 Dec 2023 04:28:24 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 287DB10E659; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1rEGLaIa6Lp3+PyRExYWC9Us9ul7CVpa3VN8xsGdX1g=; b=ZxYrzAe1HvOMPdGJWQdxchIFHshswZzV2+lgt+66MZvZKwvL9lm0grt+ TNQAKA6dBGKBahMKn+HfE0UUKdAHMNH++823NaJ6S6mK03o1G4ykSWFVt 2w+8134YoPfUYJ61GlTPShcqAIuAj1ChWDtiu6VnkxZftEeTytBkMylrR eDgp7ouVHSSdV6Jv9d93z2Ew0BIXiTMY8srlmqR/XuY5RJarmyLuyRFju VRO7lPAXkaSdBUlgWs6AvbhWlx+bsdY6r8fmyHvtW9MtyT6BMhGudv7As 5ehKUjhyipUijoKCWa+aFKw6b8/O/LyJhBJyVlDeGqOy4KS0YukINNnVN A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069775" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069775" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481382" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481382" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 15/22] drm/xe/svm: Implement functions to register and unregister mmu notifier Date: Wed, 20 Dec 2023 23:38:05 -0500 Message-Id: <20231221043812.3783313-16-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 
In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" xe driver register mmu interval notifier to core mm to monitor vma change. We register mmu interval notifier for each svm range. mmu interval notifier should be unregistered in a worker (see next patch in this series), so also initialize kernel worker to unregister mmu interval notifier. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.h | 14 ++++++ drivers/gpu/drm/xe/xe_svm_range.c | 73 +++++++++++++++++++++++++++++++ 2 files changed, 87 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 6b93055934f8..90e665f2bfc6 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -52,16 +52,28 @@ struct xe_svm { * struct xe_svm_range - Represents a shared virtual address range. */ struct xe_svm_range { + /** @svm: pointer of the xe_svm that this range belongs to */ + struct xe_svm *svm; + /** @notifier: The mmu interval notifer used to keep track of CPU * side address range change. Driver will get a callback with this * notifier if anything changed from CPU side, such as range is * unmapped from CPU */ struct mmu_interval_notifier notifier; + bool mmu_notifier_registered; /** @start: start address of this range, inclusive */ u64 start; /** @end: end address of this range, exclusive */ u64 end; + /** @vma: the corresponding vma of this svm range + * The relationship b/t vma and svm range is 1:N, + * which means one vma can be splitted into multiple + * @xe_svm_range while one @xe_svm_range can have + * only one vma. A N:N mapping means some complication + * in codes. Lets assume 1:N for now. 
+ */ + struct vm_area_struct *vma; /** @unregister_notifier_work: A worker used to unregister this notifier */ struct work_struct unregister_notifier_work; /** @inode: used to link this range to svm's range_tree */ @@ -77,6 +89,8 @@ struct xe_svm_range *xe_svm_range_from_addr(struct xe_svm *svm, bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, struct xe_svm_range *range, struct vm_area_struct *vma); +void xe_svm_range_unregister_mmu_notifier(struct xe_svm_range *range); +int xe_svm_range_register_mmu_notifier(struct xe_svm_range *range); int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem); diff --git a/drivers/gpu/drm/xe/xe_svm_range.c b/drivers/gpu/drm/xe/xe_svm_range.c index b32c32f60315..286d5f7d6ecd 100644 --- a/drivers/gpu/drm/xe/xe_svm_range.c +++ b/drivers/gpu/drm/xe/xe_svm_range.c @@ -4,6 +4,7 @@ */ #include +#include #include #include #include @@ -57,3 +58,75 @@ bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, return (vma1 == vma) && (vma2 == vma); } + +static const struct mmu_interval_notifier_ops xe_svm_mni_ops = { + .invalidate = NULL, +}; + +/** + * unregister a mmu interval notifier for a svm range + * + * @range: svm range + * + */ +void xe_svm_range_unregister_mmu_notifier(struct xe_svm_range *range) +{ + if (!range->mmu_notifier_registered) + return; + + mmu_interval_notifier_remove(&range->notifier); + range->mmu_notifier_registered = false; +} + +static void xe_svm_unregister_notifier_work(struct work_struct *work) +{ + struct xe_svm_range *range; + + range = container_of(work, struct xe_svm_range, unregister_notifier_work); + + xe_svm_range_unregister_mmu_notifier(range); + + /** + * This is called from mmu notifier MUNMAP event. When munmap is called, + * this range is not valid any more. Remove it. + */ + mutex_lock(&range->svm->mutex); + interval_tree_remove(&range->inode, &range->svm->range_tree); + mutex_unlock(&range->svm->mutex); + kfree(range); +} + +/** + * register a mmu interval notifier to monitor vma change + * + * @range: svm range to monitor + * + * This has to be called inside a mmap_read_lock + */ +int xe_svm_range_register_mmu_notifier(struct xe_svm_range *range) +{ + struct vm_area_struct *vma = range->vma; + struct mm_struct *mm = range->svm->mm; + u64 start, length; + int ret = 0; + + if (range->mmu_notifier_registered) + return 0; + + start = range->start; + length = range->end - start; + /** We are inside a mmap_read_lock, but it requires a mmap_write_lock + * to register mmu notifier. 
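+	 * (Caution: once the read lock is dropped below, the vma can be
+	 * unmapped or split before the write lock is taken, so strictly
+	 * speaking the range/vma would need to be revalidated afterwards.)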
+ */ + mmap_read_unlock(mm); + mmap_write_lock(mm); + ret = mmu_interval_notifier_insert_locked(&range->notifier, vma->vm_mm, + start, length, &xe_svm_mni_ops); + mmap_write_downgrade(mm); + if (ret) + return ret; + + INIT_WORK(&range->unregister_notifier_work, xe_svm_unregister_notifier_work); + range->mmu_notifier_registered = true; + return ret; +} From patchwork Thu Dec 21 04:38:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501090 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AAC39C35274 for ; Thu, 21 Dec 2023 04:28:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 80E1810E654; Thu, 21 Dec 2023 04:28:37 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4C8FD10E654; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8g4drANwyulLrmFls9iiZYQ/UPta8PRTj4VT7Sr2nDg=; b=MVi0rwGOeeWB37XolY2qk6jcqd/O+ash4ifYiE2slvvOirBHMu8ZN/Uu qgP30B9wO9fBdxsIR3SPbWJp7lD68pAKzi3LRfa3ym8bw2tdhS2QqrMs7 GM/DwjE43yrZQkVj0ZXv7HbnrmKIakSXfIDgChPr1ZkMHhyEhTePhx11u t0ZX9i/AXlFEPIHYJjiheGFe8p7CV5q3yncJw17j94EFzgW208Zuu+oBO f05Lw4bqMWkVIL+w5SHgvPpsDhl0NHImCfu35nhKhLIEOk6nYtyuLhVha Ny2QHjjwUa6sMCAqIj4YXme979Wxy0/vMcPC9YbQRpJGC9w5J0JStVYuY A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069776" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069776" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481385" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481385" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 16/22] drm/xe/svm: Implement the mmu notifier range invalidate callback Date: Wed, 20 Dec 2023 23:38:06 -0500 Message-Id: <20231221043812.3783313-17-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To mirror the CPU page table from GPU side, we register a mmu interval notifier (in the coming patch of this series). 
Core mm call back to GPU driver whenever there is a change to certain virtual address range, i.e., range is released or unmapped by user etc. This patch implemented the GPU driver callback function for such mmu interval notifier. In the callback function we unbind the address range from GPU if it is unmapped from CPU side, thus we mirror the CPU page table change. We also unregister the mmu interval notifier from core mm in the case of munmap event. But we can't unregister mmu notifier directly from the mmu notifier range invalidation callback function. The reason is, during a munmap (see kernel function vm_munmap), a mmap_write_lock is held, but unregister mmu notifier (calling mmu_interval_notifier_remove) also requires a mmap_write_lock of the current process. Thus, we start a kernel worker to unregister mmu interval notifier on a MMU_NOTIFY_UNMAP event. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.c | 1 + drivers/gpu/drm/xe/xe_svm.h | 1 - drivers/gpu/drm/xe/xe_svm_range.c | 37 ++++++++++++++++++++++++++++++- 3 files changed, 37 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index ab3cc2121869..6393251c0051 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -8,6 +8,7 @@ #include "xe_svm.h" #include #include +#include "xe_pt.h" DEFINE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 90e665f2bfc6..0038f98c0cc7 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -54,7 +54,6 @@ struct xe_svm { struct xe_svm_range { /** @svm: pointer of the xe_svm that this range belongs to */ struct xe_svm *svm; - /** @notifier: The mmu interval notifer used to keep track of CPU * side address range change. Driver will get a callback with this * notifier if anything changed from CPU side, such as range is diff --git a/drivers/gpu/drm/xe/xe_svm_range.c b/drivers/gpu/drm/xe/xe_svm_range.c index 286d5f7d6ecd..53dd3be7ab9f 100644 --- a/drivers/gpu/drm/xe/xe_svm_range.c +++ b/drivers/gpu/drm/xe/xe_svm_range.c @@ -10,6 +10,7 @@ #include #include #include "xe_svm.h" +#include "xe_pt.h" /** * xe_svm_range_from_addr() - retrieve svm_range contains a virtual address @@ -59,8 +60,42 @@ bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, return (vma1 == vma) && (vma2 == vma); } +static bool xe_svm_range_invalidate(struct mmu_interval_notifier *mni, + const struct mmu_notifier_range *range, + unsigned long cur_seq) +{ + struct xe_svm_range *svm_range = + container_of(mni, struct xe_svm_range, notifier); + struct xe_svm *svm = svm_range->svm; + unsigned long length = range->end - range->start; + + /* + * MMU_NOTIFY_RELEASE is called upon process exit to notify driver + * to release any process resources, such as zap GPU page table + * mapping or unregister mmu notifier etc. We already clear GPU + * page table and unregister mmu notifier in in xe_destroy_svm, + * upon process exit. So just simply return here. 
+ */ + if (range->event == MMU_NOTIFY_RELEASE) + return true; + + if (mmu_notifier_range_blockable(range)) + mutex_lock(&svm->mutex); + else if (!mutex_trylock(&svm->mutex)) + return false; + + mmu_interval_set_seq(mni, cur_seq); + xe_invalidate_svm_range(svm->vm, range->start, length); + mutex_unlock(&svm->mutex); + + if (range->event == MMU_NOTIFY_UNMAP) + queue_work(system_unbound_wq, &svm_range->unregister_notifier_work); + + return true; +} + static const struct mmu_interval_notifier_ops xe_svm_mni_ops = { - .invalidate = NULL, + .invalidate = xe_svm_range_invalidate, }; /** From patchwork Thu Dec 21 04:38:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501087 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 05B14C35274 for ; Thu, 21 Dec 2023 04:28:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5B31E10E676; Thu, 21 Dec 2023 04:28:31 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 39BF010E65B; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GJo+44jDv+g6nOzMQZgGJ5rlhefDr6WjxKA4AakuTqs=; b=JIkzV1YEU+Z2+PTVJ2FAbhayj78NMf8kD6tWrZsXOOEjincd+D88TUop DCfnSBaRy9CbvgcV2fYbFA/+zwDRz49hxhMbSOhxBiLIhqAGBeLeHmUQh BNV5reu2siVgT74rrO8zBVyVXID2Iy5UXRjJLM01j9/ZyCzhmSb4q7Js2 AQBPR3FWGkNAw0TE1X4ibDEkedkSEjVMr/sRsuKpRk+y4e+LP3CiEkZuy qn1gXdXkAyPYkMzLb3tibCG+nT+uUTNFTWeVINPL/w7Y7jv7P18MUUaJY MinLDx6ivT7M4LpLIxiwqxDQG3OmdlzGsfHJnnGACx0q/wYHP+mFax9jj Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069777" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069777" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481388" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481388" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 17/22] drm/xe/svm: clean up svm range during process exit Date: Wed, 20 Dec 2023 23:38:07 -0500 Message-Id: <20231221043812.3783313-18-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: 
"dri-devel" Clean up svm range during process exit: Zap GPU page table of the svm process on process exit; unregister all the mmu interval notifiers which are registered before; free svm range and svm data structure. Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.c | 24 ++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_svm.h | 1 + drivers/gpu/drm/xe/xe_svm_range.c | 17 +++++++++++++++++ 3 files changed, 42 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 6393251c0051..5772bfcf7da4 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -9,6 +9,8 @@ #include #include #include "xe_pt.h" +#include "xe_assert.h" +#include "xe_vm_types.h" DEFINE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); @@ -19,9 +21,31 @@ DEFINE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); */ void xe_destroy_svm(struct xe_svm *svm) { +#define MAX_SVM_RANGE (1024*1024) + struct xe_svm_range **range_array; + struct interval_tree_node *node; + struct xe_svm_range *range; + int i = 0; + + range_array = kzalloc(sizeof(struct xe_svm_range *) * MAX_SVM_RANGE, + GFP_KERNEL); + node = interval_tree_iter_first(&svm->range_tree, 0, ~0ULL); + while (node) { + range = container_of(node, struct xe_svm_range, inode); + xe_svm_range_prepare_destroy(range); + node = interval_tree_iter_next(node, 0, ~0ULL); + xe_assert(svm->vm->xe, i < MAX_SVM_RANGE); + range_array[i++] = range; + } + + /** Free range (thus range->inode) while traversing above is not safe */ + for(; i >= 0; i--) + kfree(range_array[i]); + hash_del_rcu(&svm->hnode); mutex_destroy(&svm->mutex); kfree(svm); + kfree(range_array); } /** diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 0038f98c0cc7..5b3bd2c064f5 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -90,6 +90,7 @@ bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, struct vm_area_struct *vma); void xe_svm_range_unregister_mmu_notifier(struct xe_svm_range *range); int xe_svm_range_register_mmu_notifier(struct xe_svm_range *range); +void xe_svm_range_prepare_destroy(struct xe_svm_range *range); int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem); diff --git a/drivers/gpu/drm/xe/xe_svm_range.c b/drivers/gpu/drm/xe/xe_svm_range.c index 53dd3be7ab9f..dfb4660dc26f 100644 --- a/drivers/gpu/drm/xe/xe_svm_range.c +++ b/drivers/gpu/drm/xe/xe_svm_range.c @@ -165,3 +165,20 @@ int xe_svm_range_register_mmu_notifier(struct xe_svm_range *range) range->mmu_notifier_registered = true; return ret; } + +/** + * xe_svm_range_prepare_destroy() - prepare work to destroy a svm range + * + * @range: the svm range to destroy + * + * prepare for a svm range destroy: Zap this range from GPU, unregister mmu + * notifier. 
+ */
+void xe_svm_range_prepare_destroy(struct xe_svm_range *range)
+{
+	struct xe_vm *vm = range->svm->vm;
+	unsigned long length = range->end - range->start;
+
+	xe_invalidate_svm_range(vm, range->start, length);
+	xe_svm_range_unregister_mmu_notifier(range);
+}

From patchwork Thu Dec 21 04:38:08 2023
From: Oak Zeng
To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: [PATCH 18/22] drm/xe/svm: Move a few structures to xe_gt.h
Date: Wed, 20 Dec 2023 23:38:08 -0500
Message-Id: <20231221043812.3783313-19-oak.zeng@intel.com>
In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com>
References: <20231221043812.3783313-1-oak.zeng@intel.com>
Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com

Move the access_type enum and the pagefault struct to a header file so
they can be shared with the svm sub-system. This is preparation work for
enabling page faults for svm.
Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_gt.h | 20 ++++++++++++++++++++ drivers/gpu/drm/xe/xe_gt_pagefault.c | 21 --------------------- 2 files changed, 20 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h index 4486e083f5ef..51dd288cf1cf 100644 --- a/drivers/gpu/drm/xe/xe_gt.h +++ b/drivers/gpu/drm/xe/xe_gt.h @@ -17,6 +17,26 @@ xe_hw_engine_is_valid((hwe__))) #define CCS_MASK(gt) (((gt)->info.engine_mask & XE_HW_ENGINE_CCS_MASK) >> XE_HW_ENGINE_CCS0) +enum access_type { + ACCESS_TYPE_READ = 0, + ACCESS_TYPE_WRITE = 1, + ACCESS_TYPE_ATOMIC = 2, + ACCESS_TYPE_RESERVED = 3, +}; + +struct pagefault { + u64 page_addr; + u32 asid; + u16 pdata; + u8 vfid; + u8 access_type; + u8 fault_type; + u8 fault_level; + u8 engine_class; + u8 engine_instance; + u8 fault_unsuccessful; + bool trva_fault; +}; #ifdef CONFIG_FAULT_INJECTION #include /* XXX: fault-inject.h is broken */ diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c index 4489aadc7a52..6de1ff195aaa 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -23,27 +23,6 @@ #include "xe_trace.h" #include "xe_vm.h" -struct pagefault { - u64 page_addr; - u32 asid; - u16 pdata; - u8 vfid; - u8 access_type; - u8 fault_type; - u8 fault_level; - u8 engine_class; - u8 engine_instance; - u8 fault_unsuccessful; - bool trva_fault; -}; - -enum access_type { - ACCESS_TYPE_READ = 0, - ACCESS_TYPE_WRITE = 1, - ACCESS_TYPE_ATOMIC = 2, - ACCESS_TYPE_RESERVED = 3, -}; - enum fault_type { NOT_PRESENT = 0, WRITE_ACCESS_VIOLATION = 1, From patchwork Thu Dec 21 04:38:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30F3BC4706F for ; Thu, 21 Dec 2023 04:28:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 11F9210E660; Thu, 21 Dec 2023 04:28:25 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 704D510E653; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TFuLDl5rjE5Z8M92BmQP1BBm1VJQTh1JMd04l6WaM4I=; b=RkoeQ8Ggw4L1+L7rsEAwIfkGupVdUgYKfeXrYIFq567TkJ6LNoEROMDw Ugj0cWdL2AQkHSiw2XNzdCrypzES1/azqidwa5+JJlX8vgP4BJNF32Hsq L7HB/aN85z+cFrgpyBJtEGrDGEDhiOelTBTW26q4zdhOWYn5E2sM0hKO9 Z50B8BAPnB4sMpDCzen2yhL+CiotjeJ/Q9cJrrRdsjvyaZuKvucjJZNVY vUNixJn2VcpK4YC60Mi4TsjhLMpwAu17nfFA+qq0POE87Sn2QmgpFMLHc gMI/py0QxbTj0HW9WDCzHO0imviLp1GmhcdI0AQRSwFM4iX72Jy/1YlWN A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069779" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069779" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:23 -0800 
X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481394" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481394" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 19/22] drm/xe/svm: migrate svm range to vram Date: Wed, 20 Dec 2023 23:38:09 -0500 Message-Id: <20231221043812.3783313-20-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Since the source pages of the svm range can be physically none contiguous, and the destination vram pages can also be none contiguous, there is no easy way to migrate multiple pages per blitter command. We do page by page migration for now. Migration is best effort. Even if we fail to migrate some pages, we will try to migrate the rest pages. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.c | 7 ++ drivers/gpu/drm/xe/xe_svm.h | 3 + drivers/gpu/drm/xe/xe_svm_migrate.c | 114 ++++++++++++++++++++++++++++ 3 files changed, 124 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 5772bfcf7da4..44d4f4216a93 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -5,12 +5,19 @@ #include #include +#include +#include +#include +#include #include "xe_svm.h" #include #include #include "xe_pt.h" #include "xe_assert.h" #include "xe_vm_types.h" +#include "xe_gt.h" +#include "xe_migrate.h" +#include "xe_trace.h" DEFINE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 5b3bd2c064f5..659bcb7927d6 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -80,6 +80,9 @@ struct xe_svm_range { }; vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf); +int svm_migrate_range_to_vram(struct xe_svm_range *range, + struct vm_area_struct *vma, + struct xe_tile *tile); void xe_destroy_svm(struct xe_svm *svm); struct xe_svm *xe_create_svm(struct xe_vm *vm); struct xe_svm *xe_lookup_svm_by_mm(struct mm_struct *mm); diff --git a/drivers/gpu/drm/xe/xe_svm_migrate.c b/drivers/gpu/drm/xe/xe_svm_migrate.c index b4df411e04f3..3724ad6c7aea 100644 --- a/drivers/gpu/drm/xe/xe_svm_migrate.c +++ b/drivers/gpu/drm/xe/xe_svm_migrate.c @@ -229,3 +229,117 @@ vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf) kvfree(buf); return 0; } + + +/** + * svm_migrate_range_to_vram() - migrate backing store of a va range to vram + * Must be called with mmap_read_lock(mm) held. + * @range: the va range to migrate. Range should only belong to one vma. + * @vma: the vma that this range belongs to. @range can cover whole @vma + * or a sub-range of @vma. 
+ * @tile: the destination tile which holds the new backing store of the range + * + * Returns: negative errno on faiure, 0 on success + */ +int svm_migrate_range_to_vram(struct xe_svm_range *range, + struct vm_area_struct *vma, + struct xe_tile *tile) +{ + struct mm_struct *mm = range->svm->mm; + unsigned long start = range->start; + unsigned long end = range->end; + unsigned long npages = (end - start) >> PAGE_SHIFT; + struct xe_mem_region *mr = &tile->mem.vram; + struct migrate_vma migrate = { + .vma = vma, + .start = start, + .end = end, + .pgmap_owner = tile->xe->drm.dev, + .flags = MIGRATE_VMA_SELECT_SYSTEM, + }; + struct device *dev = tile->xe->drm.dev; + dma_addr_t *src_dma_addr; + struct dma_fence *fence; + struct page *src_page; + LIST_HEAD(blocks); + int ret = 0, i; + u64 dst_dpa; + void *buf; + + mmap_assert_locked(mm); + xe_assert(tile->xe, xe_svm_range_belongs_to_vma(mm, range, vma)); + + buf = kvcalloc(npages, 2* sizeof(*migrate.src) + sizeof(*src_dma_addr), + GFP_KERNEL); + if(!buf) + return -ENOMEM; + migrate.src = buf; + migrate.dst = migrate.src + npages; + src_dma_addr = (dma_addr_t *) (migrate.dst + npages); + ret = xe_devm_alloc_pages(tile, npages, &blocks, migrate.dst); + if (ret) + goto kfree_buf; + + ret = migrate_vma_setup(&migrate); + if (ret) { + drm_err(&tile->xe->drm, "vma setup returned %d for range [%lx - %lx]\n", + ret, start, end); + goto free_dst_pages; + } + + trace_xe_svm_migrate_sram_to_vram(range); + /**FIXME: partial migration of a range + * print a warning for now. If this message + * is printed, we need to fall back to page by page + * migration: only migrate pages with MIGRATE_PFN_MIGRATE + */ + if (migrate.cpages != npages) + drm_warn(&tile->xe->drm, "Partial migration for range [%lx - %lx], range is %ld pages, migrate only %ld pages\n", + start, end, npages, migrate.cpages); + + /**Migrate page by page for now. + * Both source pages and destination pages can physically not contiguous, + * there is no good way to migrate multiple pages per blitter command. + */ + for (i = 0; i < npages; i++) { + src_page = migrate_pfn_to_page(migrate.src[i]); + if (unlikely(!src_page || !(migrate.src[i] & MIGRATE_PFN_MIGRATE))) + goto free_dst_page; + + xe_assert(tile->xe, !is_zone_device_page(src_page)); + src_dma_addr[i] = dma_map_page(dev, src_page, 0, PAGE_SIZE, DMA_TO_DEVICE); + if (unlikely(dma_mapping_error(dev, src_dma_addr[i]))) { + drm_warn(&tile->xe->drm, "dma map error for host pfn %lx\n", migrate.src[i]); + goto free_dst_page; + } + dst_dpa = vram_pfn_to_dpa(mr, migrate.dst[i]); + fence = xe_migrate_svm(tile->migrate, src_dma_addr[i], false, + dst_dpa, true, PAGE_SIZE); + if (IS_ERR(fence)) { + drm_warn(&tile->xe->drm, "migrate host page (pfn: %lx) to vram failed\n", + migrate.src[i]); + /**Migration is best effort. Even we failed here, we continue*/ + goto free_dst_page; + } + /**FIXME: Use the first migration's out fence as the second migration's input fence, + * and so on. Only wait the out fence of last migration? 
+ */ + dma_fence_wait(fence, false); + dma_fence_put(fence); +free_dst_page: + xe_devm_page_free(pfn_to_page(migrate.dst[i])); + } + + for (i = 0; i < npages; i++) + if (!(dma_mapping_error(dev, src_dma_addr[i]))) + dma_unmap_page(dev, src_dma_addr[i], PAGE_SIZE, DMA_TO_DEVICE); + + migrate_vma_pages(&migrate); + migrate_vma_finalize(&migrate); +free_dst_pages: + if (ret) + xe_devm_free_blocks(&blocks); +kfree_buf: + kfree(buf); + return ret; +} From patchwork Thu Dec 21 04:38:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501083 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AC0A9C46CD3 for ; Thu, 21 Dec 2023 04:28:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0C28510E670; Thu, 21 Dec 2023 04:28:26 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8F75A10E65D; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=a0vZtfrN9UpAC/iAia0pPxlZcWsaf9Idjir8E7djVZM=; b=A8DSnDx4uDgPDe1ITmdq7Ae7Nt81SIQ/HLw9sk4ex+Hxd9EK8H+H7e6b lqC41SkU4pNzMSNTQtmGYabwkrYBI6Uio6GRFzQsxbrfjQLIciEAhCYlc PjnnXKLhRFdr96a+0NlgbL0wNuRRqiemegiM0JyiAGVWg+8W4yH8mop3f h5V0muziVMNylXrt9FOGZoIIGYhwu+Tk1cTm1LCBqWLkaeIJCgpphEuB7 BoLmm4RXFY0cMtHXUtQD1+L2hV8st2QRmR3vgGb/98s5TSbBVJNcNhzKO 5W3DR2eC9H2JReolDEMCnRFJettW3y5Q+xZoSMLydhZ7e6vI2smZZBKtw g==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069780" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069780" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481397" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481397" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 20/22] drm/xe/svm: Populate svm range Date: Wed, 20 Dec 2023 23:38:10 -0500 Message-Id: <20231221043812.3783313-21-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add a helper function svm_populate_range to populate a svm range. 
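(The helper described here follows the hmm_range_fault() pattern documented in Documentation/mm/hmm.rst. A generic sketch of that pattern is shown below; the names my_populate and driver_lock are illustrative, and the real helper in this patch additionally bounds the retry loop with a jiffies timeout.)

#include <linux/hmm.h>
#include <linux/mmap_lock.h>
#include <linux/mmu_notifier.h>
#include <linux/mutex.h>

/*
 * Snapshot the notifier sequence, fault the CPU pages into the pfns array,
 * then check for a concurrent invalidation before committing device PTEs.
 * @pfns must hold (end - start) >> PAGE_SHIFT entries.
 */
static int my_populate(struct mmu_interval_notifier *notifier,
		       struct mm_struct *mm, unsigned long start,
		       unsigned long end, unsigned long *pfns,
		       struct mutex *driver_lock)
{
	struct hmm_range range = {
		.notifier = notifier,
		.start = start,
		.end = end,
		.hmm_pfns = pfns,
		.default_flags = HMM_PFN_REQ_FAULT,
	};
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(notifier);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)	/* raced with an invalidation */
			goto again;
		return ret;
	}

	mutex_lock(driver_lock);
	if (mmu_interval_read_retry(notifier, range.notifier_seq)) {
		mutex_unlock(driver_lock);
		goto again;
	}
	/* ... program the device page table from pfns here ... */
	mutex_unlock(driver_lock);
	return 0;
}

In this series the validity check is effectively deferred to the bind step: patch 21 treats -EAGAIN from xe_bind_svm_range() as a concurrent CPU page table update and simply retries on the next GPU fault.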
This functions calls hmm_range_fault to read CPU page tables and populate all pfns of this virtual address range into an array, saved in hmm_range:: hmm_pfns. This is prepare work to bind a svm range to GPU. The hmm_pfns array will be used for the GPU binding. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_svm.c | 61 +++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 44d4f4216a93..0c13690a19f5 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -145,3 +145,64 @@ int xe_svm_build_sg(struct hmm_range *range, sg_mark_end(sg); return 0; } + +/** Populate physical pages of a virtual address range + * This function also read mmu notifier sequence # ( + * mmu_interval_read_begin), for the purpose of later + * comparison (through mmu_interval_read_retry). + * This must be called with mmap read or write lock held. + * + * This function alloates hmm_range->hmm_pfns, it is caller's + * responsibility to free it. + * + * @svm_range: The svm range to populate + * @hmm_range: pointer to hmm_range struct. hmm_rang->hmm_pfns + * will hold the populated pfns. + * @write: populate pages with write permission + * + * returns: 0 for succuss; negative error no on failure + */ +static int svm_populate_range(struct xe_svm_range *svm_range, + struct hmm_range *hmm_range, bool write) +{ + unsigned long timeout = + jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); + unsigned long *pfns, flags = HMM_PFN_REQ_FAULT; + u64 npages; + int ret; + + mmap_assert_locked(svm_range->svm->mm); + + npages = ((svm_range->end - 1) >> PAGE_SHIFT) - + (svm_range->start >> PAGE_SHIFT) + 1; + pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL); + if (unlikely(!pfns)) + return -ENOMEM; + + if (write) + flags |= HMM_PFN_REQ_WRITE; + + memset64((u64 *)pfns, (u64)flags, npages); + hmm_range->hmm_pfns = pfns; + hmm_range->notifier_seq = mmu_interval_read_begin(&svm_range->notifier); + hmm_range->notifier = &svm_range->notifier; + hmm_range->start = svm_range->start; + hmm_range->end = svm_range->end; + hmm_range->pfn_flags_mask = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE; + hmm_range->dev_private_owner = svm_range->svm->vm->xe->drm.dev; + + while (true) { + ret = hmm_range_fault(hmm_range); + if (time_after(jiffies, timeout)) + goto free_pfns; + + if (ret == -EBUSY) + continue; + break; + } + +free_pfns: + if (ret) + kvfree(pfns); + return ret; +} From patchwork Thu Dec 21 04:38:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501089 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C5787C4706C for ; Thu, 21 Dec 2023 04:28:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E35AD10E680; Thu, 21 Dec 2023 04:28:32 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id A21B510E660; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; 
From: Oak Zeng
To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: [PATCH 21/22] drm/xe/svm: GPU page fault support
Date: Wed, 20 Dec 2023 23:38:11 -0500
Message-Id: <20231221043812.3783313-22-oak.zeng@intel.com>
In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com>
References: <20231221043812.3783313-1-oak.zeng@intel.com>
Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com

On a GPU page fault at a virtual address, try to fault the containing
virtual address range into the GPU page table and let the HW retry the
faulting access. For now we always migrate the whole vma containing the
fault address to GPU vram. This is subject to change once a more
sophisticated migration policy is introduced: deciding whether to migrate
memory to GPU or to map it in place in CPU memory, and at what granularity
to migrate. The locking strategy in this patch is rather involved; see the
lock design section of xe_svm_doc.h for more details.
Signed-off-by: Oak Zeng Cc: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/xe_gt_pagefault.c | 7 ++ drivers/gpu/drm/xe/xe_svm.c | 116 +++++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_svm.h | 6 ++ drivers/gpu/drm/xe/xe_svm_range.c | 43 ++++++++++ 4 files changed, 172 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c index 6de1ff195aaa..0afd312ff154 100644 --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c @@ -22,6 +22,7 @@ #include "xe_pt.h" #include "xe_trace.h" #include "xe_vm.h" +#include "xe_svm.h" enum fault_type { NOT_PRESENT = 0, @@ -131,6 +132,11 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf) if (!vm || !xe_vm_in_fault_mode(vm)) return -EINVAL; + if (vm->svm) { + ret = xe_svm_handle_gpu_fault(vm, gt, pf); + goto put_vm; + } + retry_userptr: /* * TODO: Avoid exclusive lock if VM doesn't have userptrs, or @@ -219,6 +225,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf) if (ret >= 0) ret = 0; } +put_vm: xe_vm_put(vm); return ret; diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 0c13690a19f5..1ade8d7f0ab2 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -12,6 +12,7 @@ #include "xe_svm.h" #include #include +#include #include "xe_pt.h" #include "xe_assert.h" #include "xe_vm_types.h" @@ -206,3 +207,118 @@ static int svm_populate_range(struct xe_svm_range *svm_range, kvfree(pfns); return ret; } + +/** + * svm_access_allowed() - Determine whether read or/and write to vma is allowed + * + * @write: true means a read and write access; false: read only access + */ +static bool svm_access_allowed(struct vm_area_struct *vma, bool write) +{ + unsigned long access = VM_READ; + + if (write) + access |= VM_WRITE; + + return (vma->vm_flags & access) == access; +} + +/** + * svm_should_migrate() - Determine whether we should migrate a range to + * a destination memory region + * + * @range: The svm memory range to consider + * @dst_region: target destination memory region + * @is_atomic_fault: Is the intended migration triggered by a atomic access? + * On some platform, we have to migrate memory to guarantee atomic correctness. + */ +static bool svm_should_migrate(struct xe_svm_range *range, + struct xe_mem_region *dst_region, bool is_atomic_fault) +{ + return true; +} + +/** + * xe_svm_handle_gpu_fault() - gpu page fault handler for svm subsystem + * + * @vm: The vm of the fault. + * @gt: The gt hardware on which the fault happens. + * @pf: page fault descriptor + * + * Workout a backing memory for the fault address, migrate memory from + * system memory to gpu vram if nessary, and map the fault address to + * GPU so GPU HW can retry the last operation which has caused the GPU + * page fault. 
+ */ +int xe_svm_handle_gpu_fault(struct xe_vm *vm, + struct xe_gt *gt, + struct pagefault *pf) +{ + u8 access_type = pf->access_type; + u64 page_addr = pf->page_addr; + struct hmm_range hmm_range; + struct vm_area_struct *vma; + struct xe_svm_range *range; + struct mm_struct *mm; + struct xe_svm *svm; + int ret = 0; + + svm = vm->svm; + if (!svm) + return -EINVAL; + + mm = svm->mm; + mmap_read_lock(mm); + vma = find_vma_intersection(mm, page_addr, page_addr + 4); + if (!vma) { + mmap_read_unlock(mm); + return -ENOENT; + } + + if (!svm_access_allowed (vma, access_type != ACCESS_TYPE_READ)) { + mmap_read_unlock(mm); + return -EPERM; + } + + range = xe_svm_range_from_addr(svm, page_addr); + if (!range) { + range = xe_svm_range_create(svm, vma); + if (!range) { + mmap_read_unlock(mm); + return -ENOMEM; + } + } + + if (svm_should_migrate(range, >->tile->mem.vram, + access_type == ACCESS_TYPE_ATOMIC)) + /** Migrate whole svm range for now. + * This is subject to change once we introduce a migration granularity + * parameter for user to select. + * + * Migration is best effort. If we failed to migrate to vram, + * we just map that range to gpu in system memory. For cases + * such as gpu atomic operation which requires memory to be + * resident in vram, we will fault again and retry migration. + */ + svm_migrate_range_to_vram(range, vma, gt->tile); + + ret = svm_populate_range(range, &hmm_range, vma->vm_flags & VM_WRITE); + mmap_read_unlock(mm); + /** There is no need to destroy this range. Range can be reused later */ + if (ret) + goto free_pfns; + + /**FIXME: set the DM, AE flags in PTE*/ + ret = xe_bind_svm_range(vm, gt->tile, &hmm_range, + !(vma->vm_flags & VM_WRITE) ? DRM_XE_VM_BIND_FLAG_READONLY : 0); + /** Concurrent cpu page table update happened, + * Return successfully so we will retry everything + * on next gpu page fault. 
+ */ + if (ret == -EAGAIN) + ret = 0; + +free_pfns: + kvfree(hmm_range.hmm_pfns); + return ret; +} diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 659bcb7927d6..a8ff4957a9b8 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -20,6 +20,7 @@ struct xe_vm; struct mm_struct; +struct pagefault; #define XE_MAX_SVM_PROCESS 5 /* Maximumly support 32 SVM process*/ extern DECLARE_HASHTABLE(xe_svm_table, XE_MAX_SVM_PROCESS); @@ -94,6 +95,8 @@ bool xe_svm_range_belongs_to_vma(struct mm_struct *mm, void xe_svm_range_unregister_mmu_notifier(struct xe_svm_range *range); int xe_svm_range_register_mmu_notifier(struct xe_svm_range *range); void xe_svm_range_prepare_destroy(struct xe_svm_range *range); +struct xe_svm_range *xe_svm_range_create(struct xe_svm *svm, + struct vm_area_struct *vma); int xe_svm_build_sg(struct hmm_range *range, struct sg_table *st); int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem); @@ -106,4 +109,7 @@ int xe_devm_alloc_pages(struct xe_tile *tile, void xe_devm_free_blocks(struct list_head *blocks); void xe_devm_page_free(struct page *page); +int xe_svm_handle_gpu_fault(struct xe_vm *vm, + struct xe_gt *gt, + struct pagefault *pf); #endif diff --git a/drivers/gpu/drm/xe/xe_svm_range.c b/drivers/gpu/drm/xe/xe_svm_range.c index dfb4660dc26f..05c088dddc2d 100644 --- a/drivers/gpu/drm/xe/xe_svm_range.c +++ b/drivers/gpu/drm/xe/xe_svm_range.c @@ -182,3 +182,46 @@ void xe_svm_range_prepare_destroy(struct xe_svm_range *range) xe_invalidate_svm_range(vm, range->start, length); xe_svm_range_unregister_mmu_notifier(range); } + +static void add_range_to_svm(struct xe_svm_range *range) +{ + range->inode.start = range->start; + range->inode.last = range->end; + mutex_lock(&range->svm->mutex); + interval_tree_insert(&range->inode, &range->svm->range_tree); + mutex_unlock(&range->svm->mutex); +} + +/** + * xe_svm_range_create() - create and initialize a svm range + * + * @svm: the svm that the range belongs to + * @vma: the corresponding vma of the range + * + * Create range, add it to svm's interval tree. Regiter a mmu + * interval notifier for this range. 
+ * + * Return the pointer of the created svm range + * or NULL if fail + */ +struct xe_svm_range *xe_svm_range_create(struct xe_svm *svm, + struct vm_area_struct *vma) +{ + struct xe_svm_range *range = kzalloc(sizeof(*range), GFP_KERNEL); + + if (!range) + return NULL; + + range->start = vma->vm_start; + range->end = vma->vm_end; + range->vma = vma; + range->svm = svm; + + if (xe_svm_range_register_mmu_notifier(range)){ + kfree(range); + return NULL; + } + + add_range_to_svm(range); + return range; +} From patchwork Thu Dec 21 04:38:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zeng, Oak" X-Patchwork-Id: 13501084 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 80CBDC35274 for ; Thu, 21 Dec 2023 04:28:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 66F3610E673; Thu, 21 Dec 2023 04:28:26 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 94E6810E65E; Thu, 21 Dec 2023 04:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703132903; x=1734668903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=JAifNztFGJdHbvn7TBEiRLW9QX1gnP6AASCF9p0qNJc=; b=PZ+EeN4Wafeq/LNKrq2cFodB+ZWIBzEZ9YLI7SLVmKxMaey6LmSL8JI0 CqzWejMMylM9u71g+KrAWBMdmWVrJzjHsm1eL6ymgSLgqYq1Rf68eOQo1 GVSiNtvHEXQLjOCFeGOUf2T9tj2hNYwdoRCjMGQlSxNQtSlK2VlQZZvMd CmaT+p3U/93hGHJXqVhzn9e0l7b+NU/gQUIDYFwQg+WBoSnbWBsLpxLo2 rOm2HNWtC/ke7+PxIZkO0878nqyjkTDiH7ZRGb1etnuQpNqn7/U+KP/ZA 5iuQVGn+OkWDG5bshbBi+hG+icRxiIj/vd86Psd3JgMyQAlujGTE5dySz A==; X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="427069782" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="427069782" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10930"; a="805481402" X-IronPort-AV: E=Sophos;i="6.04,293,1695711600"; d="scan'208";a="805481402" Received: from szeng-desk.jf.intel.com ([10.165.21.149]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2023 20:28:21 -0800 From: Oak Zeng To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH 22/22] drm/xe/svm: Add DRM_XE_SVM kernel config entry Date: Wed, 20 Dec 2023 23:38:12 -0500 Message-Id: <20231221043812.3783313-23-oak.zeng@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20231221043812.3783313-1-oak.zeng@intel.com> References: <20231221043812.3783313-1-oak.zeng@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com, niranjana.vishwanathapura@intel.com, brian.welty@intel.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" DRM_XE_SVM kernel config entry is added so xe svm feature can be 
configured before kernel compilation. Signed-off-by: Oak Zeng Co-developed-by: Niranjana Vishwanathapura Signed-off-by: Niranjana Vishwanathapura Cc: Matthew Brost Cc: Thomas Hellström Cc: Brian Welty --- drivers/gpu/drm/xe/Kconfig | 22 ++++++++++++++++++++++ drivers/gpu/drm/xe/Makefile | 5 +++++ drivers/gpu/drm/xe/xe_mmio.c | 5 +++++ drivers/gpu/drm/xe/xe_vm.c | 2 ++ 4 files changed, 34 insertions(+) diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig index 5b3da06e7ba3..a57f0972e9ae 100644 --- a/drivers/gpu/drm/xe/Kconfig +++ b/drivers/gpu/drm/xe/Kconfig @@ -83,6 +83,28 @@ config DRM_XE_FORCE_PROBE Use "!*" to block the probe of the driver for all known devices. +config DRM_XE_SVM + bool "Enable Shared Virtual Memory support in xe" + depends on DRM_XE + depends on ARCH_ENABLE_MEMORY_HOTPLUG + depends on ARCH_ENABLE_MEMORY_HOTREMOVE + depends on MEMORY_HOTPLUG + depends on MEMORY_HOTREMOVE + depends on ARCH_HAS_PTE_DEVMAP + depends on SPARSEMEM_VMEMMAP + depends on ZONE_DEVICE + depends on DEVICE_PRIVATE + depends on MMU + select HMM_MIRROR + select MMU_NOTIFIER + default y + help + Choose this option if you want Shared Virtual Memory (SVM) + support in xe. With SVM, virtual address space is shared + between CPU and GPU. This means any virtual address such + as malloc or mmap returns, variables on stack, or global + memory pointers, can be used for GPU transparently. + menu "drm/Xe Debugging" depends on DRM_XE depends on EXPERT diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index df8601d6a59f..b75bdbc5e42c 100644 --- a/drivers/gpu/drm/xe/Makefile +++ b/drivers/gpu/drm/xe/Makefile @@ -282,6 +282,11 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \ i915-display/skl_universal_plane.o \ i915-display/skl_watermark.o +xe-$(CONFIG_DRM_XE_SVM) += xe_svm.o \ + xe_svm_devmem.o \ + xe_svm_range.o \ + xe_svm_migrate.o + ifeq ($(CONFIG_ACPI),y) xe-$(CONFIG_DRM_XE_DISPLAY) += \ i915-display/intel_acpi.o \ diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c index cfe25a3c7059..7c95f675ed92 100644 --- a/drivers/gpu/drm/xe/xe_mmio.c +++ b/drivers/gpu/drm/xe/xe_mmio.c @@ -286,7 +286,9 @@ int xe_mmio_probe_vram(struct xe_device *xe) } io_size -= min_t(u64, tile_size, io_size); +#if IS_ENABLED(CONFIG_DRM_XE_SVM) xe_svm_devm_add(tile, &tile->mem.vram); +#endif } xe->mem.vram.actual_physical_size = total_size; @@ -361,8 +363,11 @@ static void mmio_fini(struct drm_device *drm, void *arg) pci_iounmap(to_pci_dev(xe->drm.dev), xe->mmio.regs); if (xe->mem.vram.mapping) iounmap(xe->mem.vram.mapping); + +#if IS_ENABLED(CONFIG_DRM_XE_SVM) for_each_tile(tile, xe, id) { xe_svm_devm_remove(xe, &tile->mem.vram); +#endif } } diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 3c301a5c7325..12d82f2fc195 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -1376,7 +1376,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags) xe->usm.num_vm_in_non_fault_mode++; mutex_unlock(&xe->usm.lock); +#if IS_ENABLED(CONFIG_DRM_XE_SVM) vm->svm = xe_create_svm(vm); +#endif trace_xe_vm_create(vm); return vm;
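(The xe_mmio.c and xe_vm.c hunks above guard each call site with IS_ENABLED(CONFIG_DRM_XE_SVM) preprocessor blocks. A common alternative is to provide static inline stubs in the header when the option is disabled, so call sites compile unchanged with no #if around them. The fragment below is only a sketch of that idiom, assuming the declarations live in xe_svm.h and reusing the signatures seen elsewhere in the series; the stub bodies are illustrative and not part of these patches.)

#if IS_ENABLED(CONFIG_DRM_XE_SVM)
struct xe_svm *xe_create_svm(struct xe_vm *vm);
int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem);
void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mem);
#else
static inline struct xe_svm *xe_create_svm(struct xe_vm *vm)
{
	return NULL;	/* vm->svm stays NULL, so faults take the non-SVM path */
}
static inline int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mem)
{
	return 0;	/* no device memory to hotplug when SVM is compiled out */
}
static inline void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mem)
{
}
#endif

With stubs like these, the for_each_tile() loop in mmio_fini() would not need the #if/#endif inside its body, and xe_vm_create() could call xe_create_svm() unconditionally.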