Message ID: 20230118061256.2689-1-dakr@redhat.com
Series:     DRM GPUVA Manager & Nouveau VM_BIND UAPI
On 1/18/23 07:12, Danilo Krummrich wrote:
> This patch series provides a new UAPI for the Nouveau driver in order to
> support Vulkan features, such as sparse bindings and sparse residency.
>
> Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to
> keep track of GPU virtual address (VA) mappings in a more generic way.
>
> The DRM GPUVA manager is intended to help drivers implement userspace-manageable
> GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it
> serves the following purposes in this context.
>
>     1) Provide a dedicated range allocator to track GPU VA allocations and
>        mappings, making use of the drm_mm range allocator.

This means that the ranges are allocated by the kernel? If yes that's a
really really bad idea.

Regards,
Christian.

>
>     2) Generically connect GPU VA mappings to their backing buffers, in
>        particular DRM GEM objects.
>
>     3) Provide a common implementation to perform more complex mapping
>        operations on the GPU VA space. In particular splitting and merging
>        of GPU VA mappings, e.g. for intersecting mapping requests or partial
>        unmap requests.
>
> The new VM_BIND Nouveau UAPI builds on top of the DRM GPUVA manager, itself
> providing the following new interfaces.
>
>     1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
>        for UMDs to specify the portion of VA space managed by the kernel and
>        userspace, respectively.
>
>     2) Allocate and free a VA space region as well as bind and unbind memory
>        to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.
>
>     3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>
> Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC make use of the DRM
> scheduler to queue jobs and support asynchronous processing with DRM syncobjs
> as synchronization mechanism.
>
> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing;
> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
>
> The new VM_BIND UAPI for Nouveau also makes use of drm_exec (execution context
> for GEM buffers) by Christian König. Since the patch implementing drm_exec was
> not yet merged into drm-next it is part of this series, as well as a small fix
> for this patch, which was found while testing this series.
>
> This patch series is also available at [1].
>
> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
> corresponding userspace parts for this series.
>
> The Vulkan CTS test suite passes the sparse binding and sparse residency test
> cases for the new UAPI together with Dave's Mesa work.
>
> There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
> and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU
> VA manager's logic through Nouveau's new UAPI and should be considered just as
> helpers for the implementation.
>
> However, I absolutely intend to change those test cases to proper kunit test
> cases for the DRM GPUVA manager, once and if we agree on its usefulness and
> design.
>
> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
>     https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
> [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
>
> I also want to give credit to Dave Airlie, who contributed a lot of ideas to
> this patch series.
>
> Christian König (1):
>   drm: execution context for GEM buffers
>
> Danilo Krummrich (13):
>   drm/exec: fix memory leak in drm_exec_prepare_obj()
>   drm: manager to keep track of GPUs VA mappings
>   drm: debugfs: provide infrastructure to dump a DRM GPU VA space
>   drm/nouveau: new VM_BIND uapi interfaces
>   drm/nouveau: get vmm via nouveau_cli_vmm()
>   drm/nouveau: bo: initialize GEM GPU VA interface
>   drm/nouveau: move usercopy helpers to nouveau_drv.h
>   drm/nouveau: fence: fail to emit when fence context is killed
>   drm/nouveau: chan: provide nouveau_channel_kill()
>   drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
>   drm/nouveau: implement uvmm for user mode bindings
>   drm/nouveau: implement new VM_BIND UAPI
>   drm/nouveau: debugfs: implement DRM GPU VA debugfs
>
>  Documentation/gpu/driver-uapi.rst | 11 +
>  Documentation/gpu/drm-mm.rst | 43 +
>  drivers/gpu/drm/Kconfig | 6 +
>  drivers/gpu/drm/Makefile | 3 +
>  drivers/gpu/drm/amd/amdgpu/Kconfig | 1 +
>  drivers/gpu/drm/drm_debugfs.c | 56 +
>  drivers/gpu/drm/drm_exec.c | 294 ++++
>  drivers/gpu/drm/drm_gem.c | 3 +
>  drivers/gpu/drm/drm_gpuva_mgr.c | 1323 +++++++++++++++++
>  drivers/gpu/drm/nouveau/Kbuild | 3 +
>  drivers/gpu/drm/nouveau/Kconfig | 2 +
>  drivers/gpu/drm/nouveau/include/nvif/if000c.h | 23 +-
>  drivers/gpu/drm/nouveau/include/nvif/vmm.h | 17 +-
>  .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 10 +
>  drivers/gpu/drm/nouveau/nouveau_abi16.c | 23 +
>  drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 152 +-
>  drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
>  drivers/gpu/drm/nouveau/nouveau_chan.c | 16 +-
>  drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
>  drivers/gpu/drm/nouveau/nouveau_debugfs.c | 24 +
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 25 +-
>  drivers/gpu/drm/nouveau/nouveau_drv.h | 92 +-
>  drivers/gpu/drm/nouveau/nouveau_exec.c | 310 ++++
>  drivers/gpu/drm/nouveau/nouveau_exec.h | 55 +
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 7 +
>  drivers/gpu/drm/nouveau/nouveau_fence.h | 2 +-
>  drivers/gpu/drm/nouveau/nouveau_gem.c | 83 +-
>  drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
>  drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
>  drivers/gpu/drm/nouveau/nouveau_sched.c | 780 ++++++++++
>  drivers/gpu/drm/nouveau/nouveau_sched.h | 98 ++
>  drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
>  drivers/gpu/drm/nouveau/nouveau_uvmm.c | 575 +++++++
>  drivers/gpu/drm/nouveau/nouveau_uvmm.h | 68 +
>  drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +-
>  drivers/gpu/drm/nouveau/nvif/vmm.c | 73 +-
>  .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 168 ++-
>  .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h | 1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 32 +-
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 3 +
>  include/drm/drm_debugfs.h | 25 +
>  include/drm/drm_drv.h | 6 +
>  include/drm/drm_exec.h | 144 ++
>  include/drm/drm_gem.h | 75 +
>  include/drm/drm_gpuva_mgr.h | 527 +++++++
>  include/uapi/drm/nouveau_drm.h | 216 +++
>  47 files changed, 5266 insertions(+), 126 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_exec.c
>  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
>  create mode 100644 include/drm/drm_exec.h
>  create mode 100644 include/drm/drm_gpuva_mgr.h
>
>
> base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d
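For readers skimming the archive, the call flow the cover letter describes might look roughly like the sketch below. The ioctl names come from the cover letter, but every struct layout, field name, and ioctl number in the sketch is a placeholder invented purely for illustration; the real definitions live in the series' include/uapi/drm/nouveau_drm.h.

```c
/*
 * Illustrative-only userspace flow for the three new ioctls described in the
 * cover letter. The struct layouts, field names and ioctl numbers below are
 * placeholders, NOT the real uapi from the series.
 */
#include <stdint.h>
#include <sys/ioctl.h>

/* Placeholder definitions standing in for the real uapi. */
struct demo_vm_init { uint64_t kernel_managed_start, kernel_managed_size; };
struct demo_vm_bind { uint64_t addr, range, bo_handle, bo_offset, flags; };
struct demo_exec    { uint64_t push_ptr, push_count, syncobj_out; };
#define DEMO_IOCTL_VM_INIT 0	/* stands in for DRM_IOCTL_NOUVEAU_VM_INIT */
#define DEMO_IOCTL_VM_BIND 1	/* stands in for DRM_IOCTL_NOUVEAU_VM_BIND */
#define DEMO_IOCTL_EXEC    2	/* stands in for DRM_IOCTL_NOUVEAU_EXEC */

static void demo_flow(int drm_fd)
{
	/* 1) Tell the kernel which portion of the VA space it manages itself. */
	struct demo_vm_init init = { .kernel_managed_start = 0,
				     .kernel_managed_size  = 1ull << 30 };
	ioctl(drm_fd, DEMO_IOCTL_VM_INIT, &init);

	/* 2) Bind a BO at a userspace-chosen address (synchronous by default). */
	struct demo_vm_bind bind = { .addr = 1ull << 30, .range = 1ull << 20,
				     .bo_handle = 42, .bo_offset = 0, .flags = 0 };
	ioctl(drm_fd, DEMO_IOCTL_VM_BIND, &bind);

	/* 3) Submit push buffers; completion is signalled via DRM syncobjs. */
	struct demo_exec exec = { .push_ptr = 0, .push_count = 0,
				  .syncobj_out = 0 };
	ioctl(drm_fd, DEMO_IOCTL_EXEC, &exec);
}
```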
Hi Christian,

On 1/18/23 09:53, Christian König wrote:
> On 1/18/23 07:12, Danilo Krummrich wrote:
>> [...]
>>     1) Provide a dedicated range allocator to track GPU VA allocations and
>>        mappings, making use of the drm_mm range allocator.
>
> This means that the ranges are allocated by the kernel? If yes that's a
> really really bad idea.

No, it's just for keeping track of the ranges userspace has allocated.

- Danilo
On 1/18/23 16:34, Danilo Krummrich wrote:
> On 1/18/23 09:53, Christian König wrote:
>> This means that the ranges are allocated by the kernel? If yes that's
>> a really really bad idea.
>
> No, it's just for keeping track of the ranges userspace has allocated.

Ok, that makes more sense.

So basically you have an IOCTL which asks the kernel for a free range? Or
what exactly is the drm_mm used for here?

Regards,
Christian.
On 1/18/23 16:37, Christian König wrote:
> [...]
> So basically you have an IOCTL which asks the kernel for a free range? Or
> what exactly is the drm_mm used for here?

Not even that, userspace provides both the base address and the range,
the kernel really just keeps track of things. Though, writing a UAPI on
top of the GPUVA manager asking for a free range instead would be
possible by just adding the corresponding wrapper functions to get a
free hole.

Currently, and that's what I think I read out of your question, the main
benefit of using drm_mm over simply stuffing the entries into a list or
something boils down to easier collision detection and iterating
sub-ranges of the whole VA space.
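To make the "the kernel only keeps track" point concrete, here is a minimal kernel-side sketch of that pattern written directly against the existing drm_mm API. Everything except the drm_mm calls (the demo_* names) is hypothetical and is not the series' actual drm_gpuva_mgr code.

```c
/*
 * Hypothetical sketch: track userspace-chosen VA ranges with drm_mm.
 * Userspace picks addr/range; the kernel only records them, so
 * drm_mm_reserve_node() is used (it fails on collision) instead of
 * letting drm_mm pick a hole.
 */
#include <linux/types.h>
#include <drm/drm_mm.h>

struct demo_va_space {
	struct drm_mm mm;		/* tracks which VA ranges are in use */
};

struct demo_mapping {
	struct drm_mm_node node;	/* node.start/node.size = VA range */
};

static void demo_va_space_init(struct demo_va_space *vas, u64 start, u64 size)
{
	drm_mm_init(&vas->mm, start, size);
}

/* Record a userspace-provided [addr, addr + range) mapping. */
static int demo_va_track(struct demo_va_space *vas, struct demo_mapping *m,
			 u64 addr, u64 range)
{
	m->node.start = addr;
	m->node.size = range;
	m->node.color = 0;

	/* Collision detection comes for free: -ENOSPC if it overlaps. */
	return drm_mm_reserve_node(&vas->mm, &m->node);
}

/* Iterate all mappings intersecting a sub-range of the VA space. */
static void demo_va_walk(struct demo_va_space *vas, u64 start, u64 end)
{
	struct drm_mm_node *node;

	drm_mm_for_each_node_in_range(node, &vas->mm, start, end) {
		/* e.g. collect mappings to split/unmap here */
	}
}
```

drm_mm_reserve_node() is what gives collision detection, and drm_mm_for_each_node_in_range() is what makes iterating a sub-range cheap, which is presumably the benefit Danilo refers to above.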
On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <dakr@redhat.com> wrote:
> [...]
> Not even that, userspace provides both the base address and the range,
> the kernel really just keeps track of things. Though, writing a UAPI on
> top of the GPUVA manager asking for a free range instead would be
> possible by just adding the corresponding wrapper functions to get a
> free hole.
>
> Currently, and that's what I think I read out of your question, the main
> benefit of using drm_mm over simply stuffing the entries into a list or
> something boils down to easier collision detection and iterating
> sub-ranges of the whole VA space.

Why not just do this in userspace?  We have a range manager in
libdrm_amdgpu that you could lift out into libdrm or some other
helper.

Alex
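As a rough illustration of the userspace-only alternative Alex mentions, libdrm_amdgpu already exposes a VA range manager. The sketch below is written from memory, so the exact signatures should be checked against libdrm's amdgpu.h; error handling is minimal.

```c
/*
 * Hedged sketch: allocating a GPU VA range through libdrm_amdgpu's
 * userspace range manager. Signatures recalled from libdrm's amdgpu.h
 * and should be double-checked there.
 */
#include <amdgpu.h>
#include <inttypes.h>
#include <stdio.h>

static void alloc_va_example(amdgpu_device_handle dev)
{
	uint64_t va_base = 0;
	amdgpu_va_handle va_handle;

	/* Ask the userspace range manager for a 1 MiB hole, 4 KiB aligned. */
	if (amdgpu_va_range_alloc(dev, amdgpu_gpu_va_range_general,
				  1 << 20,	/* size */
				  1 << 12,	/* alignment */
				  0,		/* no required base */
				  &va_base, &va_handle, 0))
		return;

	printf("got GPU VA range at 0x%" PRIx64 "\n", va_base);

	/* ... map BOs into [va_base, va_base + 1 MiB) via the VM ioctls ... */

	amdgpu_va_range_free(va_handle);
}
```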
On 1/18/23 17:30, Alex Deucher wrote:
> [...]
> Why not just do this in userspace?  We have a range manager in
> libdrm_amdgpu that you could lift out into libdrm or some other
> helper.

The kernel still needs to keep track of the mappings within the various
VA spaces, e.g. it silently needs to unmap mappings that are backed by
BOs that get evicted and remap them once they're validated (or swapped
back in).
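A hedged sketch of why that bookkeeping has to live in the kernel: on eviction the driver must find every VA mapping backed by the evicted BO and tear it down, then restore it after validation. All demo_* names below are hypothetical and merely stand in for whatever the GPUVA manager actually provides.

```c
/*
 * Hypothetical sketch: GPU VA mappings linked to their backing GEM object,
 * so eviction/validation can walk them. Not the actual drm_gpuva_mgr API.
 */
#include <linux/types.h>
#include <linux/list.h>
#include <drm/drm_gem.h>

struct demo_gpuva {
	u64 addr;			/* GPU virtual address */
	u64 range;			/* size of the mapping */
	u64 offset;			/* offset into the backing BO */
	struct drm_gem_object *obj;	/* backing buffer */
	struct list_head gem_link;	/* entry in the BO's mapping list */
};

struct demo_bo {
	struct drm_gem_object base;
	struct list_head gpuva_list;	/* all demo_gpuva mapping this BO */
};

/* Hypothetical hardware hooks the driver would fill in. */
static void demo_hw_unmap(u64 addr, u64 range) { }
static void demo_hw_map(u64 addr, u64 range,
			struct drm_gem_object *obj, u64 offset) { }

/* Called when the BO is evicted: tear down its GPU page table entries. */
static void demo_bo_evict(struct demo_bo *bo)
{
	struct demo_gpuva *va;

	list_for_each_entry(va, &bo->gpuva_list, gem_link)
		demo_hw_unmap(va->addr, va->range);
}

/* Called after the BO was validated/swapped back in: restore the mappings. */
static void demo_bo_validate(struct demo_bo *bo)
{
	struct demo_gpuva *va;

	list_for_each_entry(va, &bo->gpuva_list, gem_link)
		demo_hw_map(va->addr, va->range, va->obj, va->offset);
}
```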
On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <dakr@redhat.com> wrote:
> [...]
> The kernel still needs to keep track of the mappings within the various
> VA spaces, e.g. it silently needs to unmap mappings that are backed by
> BOs that get evicted and remap them once they're validated (or swapped
> back in).

Ok, you are just using this for maintaining the GPU VM space in the kernel.

Alex
On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@gmail.com> wrote: > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <dakr@redhat.com> wrote: > > > > > > > > On 1/18/23 17:30, Alex Deucher wrote: > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <dakr@redhat.com> wrote: > > >> > > >> On 1/18/23 16:37, Christian König wrote: > > >>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich: > > >>>> Hi Christian, > > >>>> > > >>>> On 1/18/23 09:53, Christian König wrote: > > >>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich: > > >>>>>> This patch series provides a new UAPI for the Nouveau driver in > > >>>>>> order to > > >>>>>> support Vulkan features, such as sparse bindings and sparse residency. > > >>>>>> > > >>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core > > >>>>>> feature to > > >>>>>> keep track of GPU virtual address (VA) mappings in a more generic way. > > >>>>>> > > >>>>>> The DRM GPUVA manager is indented to help drivers implement > > >>>>>> userspace-manageable > > >>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve > > >>>>>> this goal it > > >>>>>> serves the following purposes in this context. > > >>>>>> > > >>>>>> 1) Provide a dedicated range allocator to track GPU VA > > >>>>>> allocations and > > >>>>>> mappings, making use of the drm_mm range allocator. > > >>>>> > > >>>>> This means that the ranges are allocated by the kernel? If yes that's > > >>>>> a really really bad idea. > > >>>> > > >>>> No, it's just for keeping track of the ranges userspace has allocated. > > >>> > > >>> Ok, that makes more sense. > > >>> > > >>> So basically you have an IOCTL which asks kernel for a free range? Or > > >>> what exactly is the drm_mm used for here? > > >> > > >> Not even that, userspace provides both the base address and the range, > > >> the kernel really just keeps track of things. Though, writing a UAPI on > > >> top of the GPUVA manager asking for a free range instead would be > > >> possible by just adding the corresponding wrapper functions to get a > > >> free hole. > > >> > > >> Currently, and that's what I think I read out of your question, the main > > >> benefit of using drm_mm over simply stuffing the entries into a list or > > >> something boils down to easier collision detection and iterating > > >> sub-ranges of the whole VA space. > > > > > > Why not just do this in userspace? We have a range manager in > > > libdrm_amdgpu that you could lift out into libdrm or some other > > > helper. > > > > The kernel still needs to keep track of the mappings within the various > > VA spaces, e.g. it silently needs to unmap mappings that are backed by > > BOs that get evicted and remap them once they're validated (or swapped > > back in). > > Ok, you are just using this for maintaining the GPU VM space in the kernel. > Yes the idea behind having common code wrapping drm_mm for this is to allow us to make the rules consistent across drivers. Userspace (generally Vulkan, some compute) has interfaces that pretty much dictate a lot of how VMA tracking works, esp around lifetimes, sparse mappings and splitting/merging underlying page tables, I'd really like this to be more consistent across drivers, because already I think we've seen with freedreno some divergence from amdgpu and we also have i915/xe to deal with. I'd like to at least have one place that we can say this is how it should work, since this is something that *should* be consistent across drivers mostly, as it is more about how the uapi is exposed. Dave.
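For illustration, the tracking-only use of drm_mm described above (userspace supplies both base address and size; the kernel merely records the range) boils down to something like the following sketch. The structure and function names are invented for this sketch and are not the drm_gpuva_manager code; locking and most error handling are omitted.

#include <drm/drm_mm.h>

/* Sketch only: a VA space that merely tracks userspace-chosen ranges. */
struct sketch_uvmm {
	struct drm_mm va_space;		/* set up once with drm_mm_init() */
};

/* Record a mapping at the exact base/size userspace asked for. The node is
 * assumed to be zero-initialized by the caller. */
static int sketch_track_mapping(struct sketch_uvmm *uvmm,
				struct drm_mm_node *node,
				u64 addr, u64 range)
{
	node->start = addr;
	node->size = range;

	/* Fails with -ENOSPC if the range collides with an existing node. */
	return drm_mm_reserve_node(&uvmm->va_space, node);
}

/* Walk only the mappings overlapping [start, end), e.g. to find everything
 * backed by a BO that just got evicted. */
static void sketch_walk_range(struct sketch_uvmm *uvmm, u64 start, u64 end)
{
	struct drm_mm_node *node;

	drm_mm_for_each_node_in_range(node, &uvmm->va_space, start, end) {
		/* driver-specific work per mapping */
	}
}

drm_mm_reserve_node() succeeding only when the requested range falls into a free hole is the collision detection mentioned above, and drm_mm_for_each_node_in_range() provides the sub-range iteration; no address is ever chosen by the kernel.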
Am 18.01.23 um 20:17 schrieb Dave Airlie: > On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@gmail.com> wrote: >> On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <dakr@redhat.com> wrote: >>> >>> >>> On 1/18/23 17:30, Alex Deucher wrote: >>>> On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <dakr@redhat.com> wrote: >>>>> On 1/18/23 16:37, Christian König wrote: >>>>>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich: >>>>>>> Hi Christian, >>>>>>> >>>>>>> On 1/18/23 09:53, Christian König wrote: >>>>>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich: >>>>>>>>> This patch series provides a new UAPI for the Nouveau driver in >>>>>>>>> order to >>>>>>>>> support Vulkan features, such as sparse bindings and sparse residency. >>>>>>>>> >>>>>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core >>>>>>>>> feature to >>>>>>>>> keep track of GPU virtual address (VA) mappings in a more generic way. >>>>>>>>> >>>>>>>>> The DRM GPUVA manager is indented to help drivers implement >>>>>>>>> userspace-manageable >>>>>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve >>>>>>>>> this goal it >>>>>>>>> serves the following purposes in this context. >>>>>>>>> >>>>>>>>> 1) Provide a dedicated range allocator to track GPU VA >>>>>>>>> allocations and >>>>>>>>> mappings, making use of the drm_mm range allocator. >>>>>>>> This means that the ranges are allocated by the kernel? If yes that's >>>>>>>> a really really bad idea. >>>>>>> No, it's just for keeping track of the ranges userspace has allocated. >>>>>> Ok, that makes more sense. >>>>>> >>>>>> So basically you have an IOCTL which asks kernel for a free range? Or >>>>>> what exactly is the drm_mm used for here? >>>>> Not even that, userspace provides both the base address and the range, >>>>> the kernel really just keeps track of things. Though, writing a UAPI on >>>>> top of the GPUVA manager asking for a free range instead would be >>>>> possible by just adding the corresponding wrapper functions to get a >>>>> free hole. >>>>> >>>>> Currently, and that's what I think I read out of your question, the main >>>>> benefit of using drm_mm over simply stuffing the entries into a list or >>>>> something boils down to easier collision detection and iterating >>>>> sub-ranges of the whole VA space. >>>> Why not just do this in userspace? We have a range manager in >>>> libdrm_amdgpu that you could lift out into libdrm or some other >>>> helper. >>> The kernel still needs to keep track of the mappings within the various >>> VA spaces, e.g. it silently needs to unmap mappings that are backed by >>> BOs that get evicted and remap them once they're validated (or swapped >>> back in). >> Ok, you are just using this for maintaining the GPU VM space in the kernel. >> > Yes the idea behind having common code wrapping drm_mm for this is to > allow us to make the rules consistent across drivers. > > Userspace (generally Vulkan, some compute) has interfaces that pretty > much dictate a lot of how VMA tracking works, esp around lifetimes, > sparse mappings and splitting/merging underlying page tables, I'd > really like this to be more consistent across drivers, because already > I think we've seen with freedreno some divergence from amdgpu and we > also have i915/xe to deal with. I'd like to at least have one place > that we can say this is how it should work, since this is something > that *should* be consistent across drivers mostly, as it is more about > how the uapi is exposed. 
That's a really good idea, but the implementation with drm_mm won't work like that. We have Vulkan applications which use the sparse feature to create literally millions of mappings. That's why I have fine-tuned the mapping structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle possible when handling them. A drm_mm_node is more in the range of ~200 bytes and certainly not suitable for this kind of job. I strongly suggest instead using a good chunk of the amdgpu VM code as a blueprint for the common infrastructure. Regards, Christian. > > Dave.
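To put rough numbers on that, an illustrative guess at what such a lean per-mapping node can look like when it is keyed by an interval tree rather than a drm_mm_node follows; the field names are made up and do not reproduce the actual amdgpu structures.

#include <linux/interval_tree.h>
#include <linux/list.h>
#include <linux/types.h>

/* Sketch of a minimal per-mapping tracking node. */
struct sketch_va_mapping {
	struct interval_tree_node it;	/* rb_node + start/last: ~48 bytes on 64-bit */
	struct list_head bo_list;	/* link into the backing BO's mapping list   */
	u64 offset;			/* offset into the backing BO                */
	u64 flags;			/* read/write/sparse style flags             */
};

On 64-bit this lands in the ~80 byte range, while a drm_mm_node additionally carries hole-tracking metadata (list heads plus several rbtree nodes) on top of start/size, which is where the roughly 200 byte figure above comes from.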
On 1/18/23 20:48, Christian König wrote: > Am 18.01.23 um 20:17 schrieb Dave Airlie: >> On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@gmail.com> wrote: >>> On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <dakr@redhat.com> >>> wrote: >>>> >>>> >>>> On 1/18/23 17:30, Alex Deucher wrote: >>>>> On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <dakr@redhat.com> >>>>> wrote: >>>>>> On 1/18/23 16:37, Christian König wrote: >>>>>>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich: >>>>>>>> Hi Christian, >>>>>>>> >>>>>>>> On 1/18/23 09:53, Christian König wrote: >>>>>>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich: >>>>>>>>>> This patch series provides a new UAPI for the Nouveau driver in >>>>>>>>>> order to >>>>>>>>>> support Vulkan features, such as sparse bindings and sparse >>>>>>>>>> residency. >>>>>>>>>> >>>>>>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM >>>>>>>>>> core >>>>>>>>>> feature to >>>>>>>>>> keep track of GPU virtual address (VA) mappings in a more >>>>>>>>>> generic way. >>>>>>>>>> >>>>>>>>>> The DRM GPUVA manager is indented to help drivers implement >>>>>>>>>> userspace-manageable >>>>>>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve >>>>>>>>>> this goal it >>>>>>>>>> serves the following purposes in this context. >>>>>>>>>> >>>>>>>>>> 1) Provide a dedicated range allocator to track GPU VA >>>>>>>>>> allocations and >>>>>>>>>> mappings, making use of the drm_mm range allocator. >>>>>>>>> This means that the ranges are allocated by the kernel? If yes >>>>>>>>> that's >>>>>>>>> a really really bad idea. >>>>>>>> No, it's just for keeping track of the ranges userspace has >>>>>>>> allocated. >>>>>>> Ok, that makes more sense. >>>>>>> >>>>>>> So basically you have an IOCTL which asks kernel for a free >>>>>>> range? Or >>>>>>> what exactly is the drm_mm used for here? >>>>>> Not even that, userspace provides both the base address and the >>>>>> range, >>>>>> the kernel really just keeps track of things. Though, writing a >>>>>> UAPI on >>>>>> top of the GPUVA manager asking for a free range instead would be >>>>>> possible by just adding the corresponding wrapper functions to get a >>>>>> free hole. >>>>>> >>>>>> Currently, and that's what I think I read out of your question, >>>>>> the main >>>>>> benefit of using drm_mm over simply stuffing the entries into a >>>>>> list or >>>>>> something boils down to easier collision detection and iterating >>>>>> sub-ranges of the whole VA space. >>>>> Why not just do this in userspace? We have a range manager in >>>>> libdrm_amdgpu that you could lift out into libdrm or some other >>>>> helper. >>>> The kernel still needs to keep track of the mappings within the various >>>> VA spaces, e.g. it silently needs to unmap mappings that are backed by >>>> BOs that get evicted and remap them once they're validated (or swapped >>>> back in). >>> Ok, you are just using this for maintaining the GPU VM space in the >>> kernel. >>> >> Yes the idea behind having common code wrapping drm_mm for this is to >> allow us to make the rules consistent across drivers. >> >> Userspace (generally Vulkan, some compute) has interfaces that pretty >> much dictate a lot of how VMA tracking works, esp around lifetimes, >> sparse mappings and splitting/merging underlying page tables, I'd >> really like this to be more consistent across drivers, because already >> I think we've seen with freedreno some divergence from amdgpu and we >> also have i915/xe to deal with. 
I'd like to at least have one place >> that we can say this is how it should work, since this is something >> that *should* be consistent across drivers mostly, as it is more about >> how the uapi is exposed. > > That's a really good idea, but the implementation with drm_mm won't work > like that. > > We have Vulkan applications which use the sparse feature to create > literally millions of mappings. That's why I have fine tuned the mapping > structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle > possible in the handling of that. That's valuable information. Can you recommend such an application for testing / benchmarking? Your optimization effort sounds great. Might it be worth thinking about generalizing your approach by itself and stacking the drm_gpuva_manager on top of it? > > A drm_mm_node is more in the range of ~200 bytes and certainly not > suitable for this kind of job. > > I strongly suggest to rather use a good bunch of the amdgpu VM code as > blueprint for the common infrastructure. I will definitely have a look. > > Regards, > Christian. > >> >> Dave. >
On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote: > On 1/18/23 20:48, Christian König wrote: > > Am 18.01.23 um 20:17 schrieb Dave Airlie: > > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@gmail.com> wrote: > > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich > > > > <dakr@redhat.com> wrote: > > > > > > > > > > > > > > > On 1/18/23 17:30, Alex Deucher wrote: > > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich > > > > > > <dakr@redhat.com> wrote: > > > > > > > On 1/18/23 16:37, Christian König wrote: > > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich: > > > > > > > > > Hi Christian, > > > > > > > > > > > > > > > > > > On 1/18/23 09:53, Christian König wrote: > > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich: > > > > > > > > > > > This patch series provides a new UAPI for the Nouveau driver in > > > > > > > > > > > order to > > > > > > > > > > > support Vulkan features, such as > > > > > > > > > > > sparse bindings and sparse > > > > > > > > > > > residency. > > > > > > > > > > > > > > > > > > > > > > Furthermore, with the DRM GPUVA > > > > > > > > > > > manager it provides a new DRM core > > > > > > > > > > > feature to > > > > > > > > > > > keep track of GPU virtual address > > > > > > > > > > > (VA) mappings in a more generic way. > > > > > > > > > > > > > > > > > > > > > > The DRM GPUVA manager is indented to help drivers implement > > > > > > > > > > > userspace-manageable > > > > > > > > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve > > > > > > > > > > > this goal it > > > > > > > > > > > serves the following purposes in this context. > > > > > > > > > > > > > > > > > > > > > > 1) Provide a dedicated range allocator to track GPU VA > > > > > > > > > > > allocations and > > > > > > > > > > > mappings, making use of the drm_mm range allocator. > > > > > > > > > > This means that the ranges are allocated > > > > > > > > > > by the kernel? If yes that's > > > > > > > > > > a really really bad idea. > > > > > > > > > No, it's just for keeping track of the > > > > > > > > > ranges userspace has allocated. > > > > > > > > Ok, that makes more sense. > > > > > > > > > > > > > > > > So basically you have an IOCTL which asks kernel > > > > > > > > for a free range? Or > > > > > > > > what exactly is the drm_mm used for here? > > > > > > > Not even that, userspace provides both the base > > > > > > > address and the range, > > > > > > > the kernel really just keeps track of things. > > > > > > > Though, writing a UAPI on > > > > > > > top of the GPUVA manager asking for a free range instead would be > > > > > > > possible by just adding the corresponding wrapper functions to get a > > > > > > > free hole. > > > > > > > > > > > > > > Currently, and that's what I think I read out of > > > > > > > your question, the main > > > > > > > benefit of using drm_mm over simply stuffing the > > > > > > > entries into a list or > > > > > > > something boils down to easier collision detection and iterating > > > > > > > sub-ranges of the whole VA space. > > > > > > Why not just do this in userspace? We have a range manager in > > > > > > libdrm_amdgpu that you could lift out into libdrm or some other > > > > > > helper. > > > > > The kernel still needs to keep track of the mappings within the various > > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed by > > > > > BOs that get evicted and remap them once they're validated (or swapped > > > > > back in). 
> > > > Ok, you are just using this for maintaining the GPU VM space in > > > > the kernel. > > > > > > > Yes the idea behind having common code wrapping drm_mm for this is to > > > allow us to make the rules consistent across drivers. > > > > > > Userspace (generally Vulkan, some compute) has interfaces that pretty > > > much dictate a lot of how VMA tracking works, esp around lifetimes, > > > sparse mappings and splitting/merging underlying page tables, I'd > > > really like this to be more consistent across drivers, because already > > > I think we've seen with freedreno some divergence from amdgpu and we > > > also have i915/xe to deal with. I'd like to at least have one place > > > that we can say this is how it should work, since this is something > > > that *should* be consistent across drivers mostly, as it is more about > > > how the uapi is exposed. > > > > That's a really good idea, but the implementation with drm_mm won't work > > like that. > > > > We have Vulkan applications which use the sparse feature to create > > literally millions of mappings. That's why I have fine tuned the mapping Is this not an application issue? Millions of mappings seems a bit absurd to me. > > structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle > > possible in the handling of that. We might need to do a bit of work here in Xe as our xe_vma structure is quite big as we currently use it as a dumping ground for various features. > > That's a valuable information. Can you recommend such an application for > testing / benchmarking? > Also interested. > Your optimization effort sounds great. May it be worth thinking about > generalizing your approach by itself and stacking the drm_gpuva_manager on > top of it? > FWIW, the Xe team is on board with the drm_gpuva_manager effort; we basically open-code all of this right now. I'd like to port over to drm_gpuva_manager ASAP so we can contribute and help find a viable solution for all of us. Matt > > > > A drm_mm_node is more in the range of ~200 bytes and certainly not > > suitable for this kind of job. > > > > I strongly suggest to rather use a good bunch of the amdgpu VM code as > > blueprint for the common infrastructure. > > I will definitely have look. > > > > > Regards, > > Christian. > > > > > > > > Dave. > > >
Am 19.01.23 um 06:23 schrieb Matthew Brost: > [SNIP] >>>> Userspace (generally Vulkan, some compute) has interfaces that pretty >>>> much dictate a lot of how VMA tracking works, esp around lifetimes, >>>> sparse mappings and splitting/merging underlying page tables, I'd >>>> really like this to be more consistent across drivers, because already >>>> I think we've seen with freedreno some divergence from amdgpu and we >>>> also have i915/xe to deal with. I'd like to at least have one place >>>> that we can say this is how it should work, since this is something >>>> that *should* be consistent across drivers mostly, as it is more about >>>> how the uapi is exposed. >>> That's a really good idea, but the implementation with drm_mm won't work >>> like that. >>> >>> We have Vulkan applications which use the sparse feature to create >>> literally millions of mappings. That's why I have fine tuned the mapping > Is this not an application issue? Millions of mappings seems a bit > absurd to me. That's unfortunately how some games are designed these days. >>> structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle >>> possible in the handling of that. > We might need to bit of work here in Xe as our xe_vma structure is quite > big as we currently use it as dumping ground for various features. We have done that as well and it turned out to be a bad idea. At one point we added some power management information into the mapping structure, but quickly reverted that. >> That's a valuable information. Can you recommend such an application for >> testing / benchmarking? >> > Also interested. One of the most demanding ones is Forza Horizon 5. The general approach of that game seems to be to allocate 64GiB of address space (which equals 16 million 4KiB pages) and then mmap() whatever data it needs into that self-managed space, assuming that every 4KiB page is individually mappable to a different location. >> Your optimization effort sounds great. May it be worth thinking about >> generalizing your approach by itself and stacking the drm_gpuva_manager on >> top of it? >> > FWIW the Xe is on board with the drm_gpuva_manager effort, we basically > open code all of this right now. I'd like to port over to > drm_gpuva_manager ASAP so we can contribute and help find a viable > solution for all of us. Sounds good. I haven't looked into the drm_gpuva_manager code yet, but a few design notes I've learned from amdgpu: Separate address space management (drm_mm) from page table management. In other words, when an application asks for 64GiB of free address space you don't look into the page table structures, but rather into a separate drm_mm instance. In amdgpu we even moved the latter into userspace, but the general takeaway is that you have only a handful of address space requests while you have tons of mapping/unmapping requests. Separate the tracking structure into two, one for each BO+VM combination (we call that amdgpu_bo_va) and one for each mapping (called amdgpu_bo_va_mapping). We unfortunately use that for our hw-dependent state machine as well, so it isn't easily generalizable. I've gone back and forth on merging VMAs and then not merging them again. Not merging them can save quite a bit of overhead, but results in many more mappings for some use cases. Regards, Christian. > > Matt > >>> A drm_mm_node is more in the range of ~200 bytes and certainly not >>> suitable for this kind of job. >>> >>> I strongly suggest to rather use a good bunch of the amdgpu VM code as >>> blueprint for the common infrastructure.
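A simplified sketch of the two-level split described above, with one structure per BO+VM combination and one per mapping, might look as follows; the names are invented and intentionally do not mirror the real amdgpu_bo_va / amdgpu_bo_va_mapping definitions.

#include <linux/interval_tree.h>
#include <linux/list.h>
#include <linux/types.h>

struct drm_gem_object;
struct sketch_vm;	/* per-process VA space, owns the interval tree */

/* One instance per BO+VM combination. */
struct sketch_bo_va {
	struct drm_gem_object *obj;	/* the backing buffer object */
	struct sketch_vm *vm;		/* the VA space the BO is mapped in */
	struct list_head bo_entry;	/* link in the BO's list of VMs using it */
	struct list_head mappings;	/* all mappings of this BO in this VM */
	struct list_head vm_status;	/* e.g. moved to an "invalidated" list
					 * when the BO gets evicted */
};

/* One instance per mapped VA range. */
struct sketch_bo_va_mapping {
	struct interval_tree_node it;	/* start/last, keyed into the VM's tree */
	struct list_head bo_va_entry;	/* link in sketch_bo_va.mappings */
	u64 offset;			/* offset into the backing BO */
	u64 flags;			/* readable/writable/sparse, etc. */
};

The point of the split is that per-BO state (eviction status, validation) lives once per BO+VM, while the potentially millions of per-range entries stay small; address space allocation itself can live in a separate drm_mm instance, or in userspace, without ever touching page tables.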
>> I will definitely have look. >> >>> Regards, >>> Christian. >>> >>>> Dave.
On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <matthew.brost@intel.com> wrote: > > On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote: > > On 1/18/23 20:48, Christian König wrote: > > > Am 18.01.23 um 20:17 schrieb Dave Airlie: > > > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@gmail.com> wrote: > > > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich > > > > > <dakr@redhat.com> wrote: > > > > > > > > > > > > > > > > > > On 1/18/23 17:30, Alex Deucher wrote: > > > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich > > > > > > > <dakr@redhat.com> wrote: > > > > > > > > On 1/18/23 16:37, Christian König wrote: > > > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich: > > > > > > > > > > Hi Christian, > > > > > > > > > > > > > > > > > > > > On 1/18/23 09:53, Christian König wrote: > > > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich: > > > > > > > > > > > > This patch series provides a new UAPI for the Nouveau driver in > > > > > > > > > > > > order to > > > > > > > > > > > > support Vulkan features, such as > > > > > > > > > > > > sparse bindings and sparse > > > > > > > > > > > > residency. > > > > > > > > > > > > > > > > > > > > > > > > Furthermore, with the DRM GPUVA > > > > > > > > > > > > manager it provides a new DRM core > > > > > > > > > > > > feature to > > > > > > > > > > > > keep track of GPU virtual address > > > > > > > > > > > > (VA) mappings in a more generic way. > > > > > > > > > > > > > > > > > > > > > > > > The DRM GPUVA manager is indented to help drivers implement > > > > > > > > > > > > userspace-manageable > > > > > > > > > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve > > > > > > > > > > > > this goal it > > > > > > > > > > > > serves the following purposes in this context. > > > > > > > > > > > > > > > > > > > > > > > > 1) Provide a dedicated range allocator to track GPU VA > > > > > > > > > > > > allocations and > > > > > > > > > > > > mappings, making use of the drm_mm range allocator. > > > > > > > > > > > This means that the ranges are allocated > > > > > > > > > > > by the kernel? If yes that's > > > > > > > > > > > a really really bad idea. > > > > > > > > > > No, it's just for keeping track of the > > > > > > > > > > ranges userspace has allocated. > > > > > > > > > Ok, that makes more sense. > > > > > > > > > > > > > > > > > > So basically you have an IOCTL which asks kernel > > > > > > > > > for a free range? Or > > > > > > > > > what exactly is the drm_mm used for here? > > > > > > > > Not even that, userspace provides both the base > > > > > > > > address and the range, > > > > > > > > the kernel really just keeps track of things. > > > > > > > > Though, writing a UAPI on > > > > > > > > top of the GPUVA manager asking for a free range instead would be > > > > > > > > possible by just adding the corresponding wrapper functions to get a > > > > > > > > free hole. > > > > > > > > > > > > > > > > Currently, and that's what I think I read out of > > > > > > > > your question, the main > > > > > > > > benefit of using drm_mm over simply stuffing the > > > > > > > > entries into a list or > > > > > > > > something boils down to easier collision detection and iterating > > > > > > > > sub-ranges of the whole VA space. > > > > > > > Why not just do this in userspace? We have a range manager in > > > > > > > libdrm_amdgpu that you could lift out into libdrm or some other > > > > > > > helper. 
> > > > > > The kernel still needs to keep track of the mappings within the various > > > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed by > > > > > > BOs that get evicted and remap them once they're validated (or swapped > > > > > > back in). > > > > > Ok, you are just using this for maintaining the GPU VM space in > > > > > the kernel. > > > > > > > > > Yes the idea behind having common code wrapping drm_mm for this is to > > > > allow us to make the rules consistent across drivers. > > > > > > > > Userspace (generally Vulkan, some compute) has interfaces that pretty > > > > much dictate a lot of how VMA tracking works, esp around lifetimes, > > > > sparse mappings and splitting/merging underlying page tables, I'd > > > > really like this to be more consistent across drivers, because already > > > > I think we've seen with freedreno some divergence from amdgpu and we > > > > also have i915/xe to deal with. I'd like to at least have one place > > > > that we can say this is how it should work, since this is something > > > > that *should* be consistent across drivers mostly, as it is more about > > > > how the uapi is exposed. > > > > > > That's a really good idea, but the implementation with drm_mm won't work > > > like that. > > > > > > We have Vulkan applications which use the sparse feature to create > > > literally millions of mappings. That's why I have fine tuned the mapping > > Is this not an application issue? Millions of mappings seems a bit > absurd to me. If I look at the most extreme case for AI, assuming 256GB of HBM memory and page mapping of 2MB, we get to 128K of mappings. But that's really the extreme case imo. I assume most mappings will be much larger. In fact, in the most realistic scenario of large-scale training, a single user will probably map the entire HBM memory using 1GB pages. I have also a question, could this GPUVA code manage VA ranges mappings for userptr mappings, assuming we work without svm/uva/usm (pointer-is-a-pointer) ? Because then we are talking about possible 4KB mappings of 1 - 1.5 TB host server RAM (Implied in my question is the assumption this can be used also for non-VK use-cases. Please tell me if I'm totally wrong here). Thanks, Oded
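For scale, the back-of-the-envelope arithmetic behind the counts discussed in this subthread, assuming every page is mapped individually (illustrative only):

/* Back-of-the-envelope mapping counts for the scenarios above. */
#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

static const unsigned long long hbm_2m_mappings  = 256 * GiB / (2ULL << 20); /* 131072     (~128K) */
static const unsigned long long hbm_1g_mappings  = 256 * GiB / (1ULL << 30); /* 256                */
static const unsigned long long userptr_4k_pages =   1 * TiB / (4ULL << 10); /* 268435456  (~256M) */
static const unsigned long long sparse_64g_4k    =  64 * GiB / (4ULL << 10); /* 16777216   (~16M)  */

So the 128K figure is indeed the mild case; a 4KiB userptr mirror of host RAM or the 64GiB sparse window mentioned earlier in the thread is two to three orders of magnitude larger, which is the argument for keeping the per-mapping node small.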
Hi Oded, sorry for the late response, somehow this mail slipped through. On 2/6/23 15:48, Oded Gabbay wrote: > On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <matthew.brost@intel.com> wrote: >> Is this not an application issue? Millions of mappings seems a bit >> absurd to me. > If I look at the most extreme case for AI, assuming 256GB of HBM > memory and page mapping of 2MB, we get to 128K of mappings. But that's > really the extreme case imo. I assume most mappings will be much > larger. In fact, in the most realistic scenario of large-scale > training, a single user will probably map the entire HBM memory using > 1GB pages. > > I have also a question, could this GPUVA code manage VA ranges > mappings for userptr mappings, assuming we work without svm/uva/usm > (pointer-is-a-pointer) ? Because then we are talking about possible > 4KB mappings of 1 - 1.5 TB host server RAM (Implied in my question is > the assumption this can be used also for non-VK use-cases. Please tell > me if I'm totally wrong here). In V2 I switched from drm_mm to a maple tree, which should improve the handling of lots of entries. I also dropped the requirement for GPUVA entries to be backed by a valid GEM object. I think it can be used for non-VK use-cases. It basically just keeps track of mappings (not allocating them in the sense of finding a hole and providing a base address for a given size). There are basic functions to insert and remove entries. For those basic functions it is ensured that colliding entries can't be inserted and that only a specific given entry can be removed, rather than e.g. an arbitrary range. There are also more advanced functions where users of the GPUVA manager can request to "force map" a new mapping and to unmap a given range. The GPUVA manager will figure out the (sub-)operations to make this happen (e.g. remove mappings in the way, split up mappings, etc.) and either provide these operations (or steps) through callbacks or through a list of operations for the caller to process. Are there any other use-cases or features you can think of that would be beneficial for accelerators? - Danilo > > Thanks, > Oded >
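To illustrate the flow described above, here is a hypothetical sketch of how a driver might consume the (sub-)operations the manager computes for a "force map" or unmap request. None of the names below are the actual drm_gpuva_manager interfaces; they only mirror the idea of map, remap and unmap steps delivered via callbacks or as a list.

#include <stdint.h>

/* Hypothetical operation descriptor; not the real drm_gpuva_manager API. */
enum sketch_va_op_type {
	SKETCH_VA_OP_MAP,	/* create the newly requested mapping          */
	SKETCH_VA_OP_REMAP,	/* split an overlapped mapping, keep head/tail */
	SKETCH_VA_OP_UNMAP,	/* remove an existing mapping entirely         */
};

struct sketch_va_op {
	enum sketch_va_op_type type;
	uint64_t addr;		/* GPU VA start of the affected range */
	uint64_t range;		/* size of the affected range */
	uint64_t bo_offset;	/* offset into the backing object (MAP only) */
};

/* Driver callback (or per-entry handler when the manager hands back a list
 * of operations): apply one step to the page tables. */
static int sketch_apply_va_op(const struct sketch_va_op *op, void *driver_priv)
{
	(void)driver_priv;

	switch (op->type) {
	case SKETCH_VA_OP_MAP:
		/* allocate and program PTEs for [addr, addr + range) */
		return 0;
	case SKETCH_VA_OP_REMAP:
		/* shrink the colliding mapping to its surviving part(s) */
		return 0;
	case SKETCH_VA_OP_UNMAP:
		/* invalidate the PTEs of the whole pre-existing mapping */
		return 0;
	}
	return -1;
}

Whether these steps arrive through callbacks while the manager walks the affected range, or as a list the caller replays afterwards, is exactly the choice described in the mail above.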