Message ID | 20210407084238.20443-1-apopple@nvidia.com (mailing list archive) |
---|---|
Series | Add support for SVM atomics in Nouveau |

Hi Andrew,

There is currently no outstanding feedback for this series, so I am
hoping it may be considered for inclusion (or at least the mm portions;
I still need Reviews/Acks for the Nouveau bits). The main change for v8
was the removal of entries on fork rather than copying them, in response
to feedback from Jason, so any follow-up comments on patch 5 would also
be welcome.

The series contains a number of general clean-ups suggested by Christoph,
along with a feature to temporarily make selected user page mappings
write-protected. This is needed to support OpenCL atomic operations in
Nouveau on shared virtual memory (SVM) regions allocated with the
CL_MEM_SVM_ATOMICS clSVMAlloc flag. A more complete description of the
OpenCL SVM feature is available at
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_API.html#_shared_virtual_memory .

I have been testing this with Mesa 21.1.0 and a simple OpenCL program
which checks that GPU atomic accesses to system memory are atomic.
Without this series the test fails, as there is no way of
write-protecting the userspace page mapping, which results in the device
clobbering CPU writes. For reference, the test is available at
https://ozlabs.org/~apopple/opencl_svm_atomics/ .

 - Alistair

On Wednesday, 7 April 2021 6:42:30 PM AEST Alistair Popple wrote:
> This is the eighth version of a series to add support to Nouveau for atomic
> memory operations on OpenCL shared virtual memory (SVM) regions.
>
> The main change for this version is a simplification of device exclusive
> entry handling. Entries for copy-on-write mappings are now removed during
> fork instead of being copied. This is safer because there could be
> unique corner cases when copying, particularly for pinned pages which
> should follow the same logic as copy_present_page(). Removing entries
> avoids this possibility by treating them as normal ptes.
>
> Exclusive device access is implemented by adding a new swap entry type
> (SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main
> difference is that on fault the original entry is immediately restored by
> the fault handler instead of waiting.
>
> Restoring the entry triggers calls to MMU notifiers, which allow a device
> driver to revoke the atomic access permission from the GPU prior to the CPU
> finalising the entry.
>
> Patches 1 & 2 refactor existing migration and device private entry
> functions.
>
> Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated
> functionality into separate functions - try_to_migrate_one() and
> try_to_munlock_one(). These should not change any functionality, but any
> help testing would be much appreciated as I have not been able to test
> every usage of try_to_unmap_one().
>
> Patch 5 contains the bulk of the implementation for device exclusive
> memory.
>
> Patch 6 contains some additions to the HMM selftests to ensure everything
> works as expected.
>
> Patch 7 is a cleanup for the Nouveau SVM implementation.
>
> Patch 8 contains the implementation of atomic access for the Nouveau
> driver.
>
> This has been tested using the latest upstream Mesa userspace with a simple
> OpenCL test program which checks the results of atomic GPU operations on an
> SVM buffer whilst also writing to the same buffer from the CPU.
>
> Alistair Popple (8):
>   mm: Remove special swap entry functions
>   mm/swapops: Rework swap entry manipulation code
>   mm/rmap: Split try_to_munlock from try_to_unmap
>   mm/rmap: Split migration into its own function
>   mm: Device exclusive memory access
>   mm: Selftests for exclusive device memory
>   nouveau/svm: Refactor nouveau_range_fault
>   nouveau/svm: Implement atomic SVM access
>
>  Documentation/vm/hmm.rst                      |  19 +-
>  Documentation/vm/unevictable-lru.rst          |  33 +-
>  arch/s390/mm/pgtable.c                        |   2 +-
>  drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
>  drivers/gpu/drm/nouveau/nouveau_svm.c         | 156 ++++-
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
>  .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |   6 +
>  fs/proc/task_mmu.c                            |  23 +-
>  include/linux/mmu_notifier.h                  |  26 +-
>  include/linux/rmap.h                          |  11 +-
>  include/linux/swap.h                          |   8 +-
>  include/linux/swapops.h                       | 123 ++--
>  lib/test_hmm.c                                | 126 +++-
>  lib/test_hmm_uapi.h                           |   2 +
>  mm/debug_vm_pgtable.c                         |  12 +-
>  mm/hmm.c                                      |  12 +-
>  mm/huge_memory.c                              |  45 +-
>  mm/hugetlb.c                                  |  10 +-
>  mm/memcontrol.c                               |   2 +-
>  mm/memory.c                                   | 196 +++++-
>  mm/migrate.c                                  |  51 +-
>  mm/mlock.c                                    |  10 +-
>  mm/mprotect.c                                 |  18 +-
>  mm/page_vma_mapped.c                          |  15 +-
>  mm/rmap.c                                     | 612 +++++++++++++++---
>  tools/testing/selftests/vm/hmm-tests.c        | 158 +++++
>  26 files changed, 1366 insertions(+), 312 deletions(-)
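
For readers skimming the cover letter, the fault-time behaviour of a
device exclusive entry can be sketched as a small userspace state
machine. This is only a toy model of the description above, not code
from the series: every name prefixed with toy_ is hypothetical, and the
real implementation works on page table entries and MMU notifier ranges
rather than a single struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_pte_state {
	TOY_PTE_PRESENT,          /* normal CPU mapping */
	TOY_PTE_DEVICE_EXCLUSIVE, /* mapping replaced by a special entry */
};

struct toy_page {
	enum toy_pte_state pte;
	bool device_atomic_ok; /* may the device do atomics on this page? */
	int notifier_calls;    /* how often the driver was notified */
};

/* Stand-in for a driver's notifier callback: revoke atomic access. */
void toy_notifier_invalidate(struct toy_page *pg)
{
	pg->device_atomic_ok = false;
	pg->notifier_calls++;
}

/* Replace a present CPU mapping with an exclusive entry for the device. */
bool toy_make_device_exclusive(struct toy_page *pg)
{
	if (pg->pte != TOY_PTE_PRESENT)
		return false;
	pg->pte = TOY_PTE_DEVICE_EXCLUSIVE;
	pg->device_atomic_ok = true;
	return true;
}

/*
 * CPU touches the page. Unlike a migration entry there is nothing to
 * wait for: the driver is notified first so it can revoke the device's
 * permission, then the original mapping is restored immediately.
 */
void toy_cpu_access(struct toy_page *pg)
{
	if (pg->pte == TOY_PTE_DEVICE_EXCLUSIVE) {
		toy_notifier_invalidate(pg);
		pg->pte = TOY_PTE_PRESENT;
	}
	assert(pg->pte == TOY_PTE_PRESENT);
}
```

Starting from a present mapping, calling toy_make_device_exclusive() and
then toy_cpu_access() leaves the page mapped for the CPU again with the
device's atomic permission revoked and exactly one notifier call
recorded. A second CPU access takes no fault path and notifies nobody.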