[v14,0/8] mm/gup: Introduce memfd_pin_folios() for pinning memfd folios

Message ID: 20240411070157.3318425-1-vivek.kasireddy@intel.com

Kasireddy, Vivek April 11, 2024, 6:59 a.m. UTC
Currently, some drivers (e.g., Udmabuf) that want to longterm-pin
the pages/folios associated with a memfd do so by simply taking a
reference on them. This is not desirable because the pages/folios
may reside in the Movable zone or in a CMA block.

Therefore, having drivers use the memfd_pin_folios() API ensures that
the folios are appropriately pinned via FOLL_PIN for longterm DMA.

This patchset also introduces a few helpers and converts the Udmabuf
driver to use folios and memfd_pin_folios() API to longterm-pin
the folios for DMA. Two new Udmabuf selftests are also included to
test the driver and the new API.
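
For readers unfamiliar with the userspace side, here is a minimal sketch
of how a memfd is typically prepared before it is handed to udmabuf. It
is not taken from the series: make_sealed_memfd() is a hypothetical
helper, and the sketch assumes the driver's usual sealing rule
(F_SEAL_SHRINK required, F_SEAL_WRITE rejected). The pinning itself
happens in the kernel once the fd is passed to the UDMABUF_CREATE ioctl
on /dev/udmabuf, which is deliberately left out here.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the series): create a memfd of
 * `size` bytes and seal it against shrinking, so that the range a
 * driver later pins cannot be truncated away underneath it.
 * Returns the fd on success, -1 on failure.
 */
static int make_sealed_memfd(const char *name, off_t size)
{
	/* MFD_ALLOW_SEALING is required, otherwise F_ADD_SEALS fails. */
	int fd = memfd_create(name, MFD_ALLOW_SEALING);

	if (fd < 0)
		return -1;

	/* Give the file its final size before sealing it. */
	if (ftruncate(fd, size) < 0)
		goto err;

	/* Forbid shrinking from now on. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK) < 0)
		goto err;

	return fd;
err:
	close(fd);
	return -1;
}
```

After sealing, an attempt to shrink the file with ftruncate() fails,
which is the property a long-term pin on the backing folios relies on.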

---

Patchset overview:

Patch 1-2:    GUP helpers to migrate and unpin one or more folios
Patch 3:      Introduce memfd_pin_folios() API
Patch 4-5:    Udmabuf driver bug fixes for QEMU + hugetlb=on, blob=true case
Patch 6-8:    Convert Udmabuf to use memfd_pin_folios() and add selftests

This series is tested using the following methods:
- Run the subtests added in Patch 8
- Run QEMU (master) with the following options and a few additional
  patches to Spice:
  qemu-system-x86_64 -m 4096m....
  -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
  -spice port=3001,gl=on,disable-ticketing=on,preferred-codec=gstreamer:h264
  -object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M
  -machine memory-backend=mem1
- Run source ./run_vmtests.sh -t gup_test -a to check for GUP regressions

Changelog:

v13 -> v14:
- Drop the redundant comments before check_and_migrate_movable_pages()
  and refer to check_and_migrate_movable_folios() comments (David)
- Use appropriate ksft_* functions for printing and KSFT_* codes for
  exit() in udmabuf selftest (Shuah)
- Add Mike Kravetz's suggested-by tag in udmabuf selftest patch (Shuah)
- Collect Ack and Rb tags from David

v12 -> v13: (suggestions from David)
- Drop the sanity checks in unpin_folio()/unpin_folios() due to
  unavailability of per folio anon-exclusive flag
- Export unpin_folio()/unpin_folios() using EXPORT_SYMBOL_GPL
  instead of EXPORT_SYMBOL
- Have check_and_migrate_movable_pages() just call
  check_and_migrate_movable_folios() instead of calling other helpers
- Slightly improve the comments and commit messages

v11 -> v12:
- Rebased and tested on mm-unstable

v10 -> v11:
- Remove the version string from the patch subject (Andrew)
- Move the changelog from the patches into the cover letter
- Rearrange the patchset to have GUP patches at the beginning

v9 -> v10:
- Introduce and use unpin_folio(), unpin_folios() and
  check_and_migrate_movable_folios() helpers
- Use a list to track the folios that need to be unpinned in udmabuf

v8 -> v9: (suggestions from Matthew)
- Drop the extern while declaring memfd_alloc_folio()
- Fix memfd_alloc_folio() declaration to have it return struct folio *
  instead of struct page * when CONFIG_MEMFD_CREATE is not defined
- Use folio_pfn() on the folio instead of page_to_pfn() on head page
  in udmabuf
- Don't split the arguments to shmem_read_folio() on multiple lines
  in udmabuf

v7 -> v8: (suggestions from David)
- Have caller pass [start, end], max_folios instead of start, nr_pages
- Replace offsets array with just offset into the first page
- Add comments explaining the need for next_idx
- Pin (and return) the folio (via FOLL_PIN) only once
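
As a rough illustration of the v8 calling convention above (an inclusive
byte range plus a single offset into the first folio), the arithmetic
can be modelled in plain C. This is not kernel code: struct pin_range
and model_pin_range() are hypothetical names, and a uniform folio size
is assumed, which a real memfd does not guarantee.

```c
/* Hypothetical model of the [start, end] convention; not kernel code. */
struct pin_range {
	unsigned long first_idx;  /* index of the first folio touched */
	unsigned long first_off;  /* byte offset of start in that folio */
	unsigned long nr_folios;  /* folios needed to cover [start, end] */
};

/*
 * start and end are byte offsets into the file and end is inclusive.
 * folio_size is assumed to be the same for every folio in the range.
 */
static struct pin_range model_pin_range(unsigned long long start,
					unsigned long long end,
					unsigned long folio_size)
{
	struct pin_range r;
	unsigned long last_idx = (unsigned long)(end / folio_size);

	r.first_idx = (unsigned long)(start / folio_size);
	r.first_off = (unsigned long)(start % folio_size);
	r.nr_folios = last_idx - r.first_idx + 1;
	return r;
}
```

Only the first folio needs an offset because every later folio in a
contiguous range is entered at its beginning, which is why a single
offset can stand in for a per-folio offsets array.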

v6 -> v7:
- Rename this API to memfd_pin_folios() and make it return folios
  and offsets instead of pages (David)
- Don't continue processing the folios in the batch returned by
  filemap_get_folios_contig() if they do not have correct next_idx
- Add the R-b tag from Christoph

v5 -> v6: (suggestions from Christoph)
- Rename this API to memfd_pin_user_pages() to make it clear that it
  is intended for memfds
- Move the memfd page allocation helper from gup.c to memfd.c
- Fix indentation errors in memfd_pin_user_pages()
- For contiguous ranges of folios, use a helper such as
  filemap_get_folios_contig() to lookup the page cache in batches
- Split the processing of hugetlb or shmem pages into helpers to
  simplify the code in udmabuf_create()

v4 -> v5: (suggestions from David)
- For hugetlb case, ensure that we only obtain head pages from the
  mapping by using __filemap_get_folio() instead of find_get_page_flags()
- Handle -EEXIST when two or more potential users try to simultaneously
  add a huge page to the mapping by forcing them to retry on failure

v3 -> v4:
- Remove the local variable "page" and instead use 3 return statements
  in alloc_file_page() (David)
- Add the R-b tag from David

v2 -> v3: (suggestions from David)
- Enclose the huge page allocation code with #ifdef CONFIG_HUGETLB_PAGE
  (Build error reported by kernel test robot <lkp@intel.com>)
- Don't forget memalloc_pin_restore() on non-migration related errors
- Improve the readability of the cleanup code associated with
  non-migration related errors
- Augment the comments by describing FOLL_LONGTERM like behavior
- Include the R-b tag from Jason

v1 -> v2:
- Drop gup_flags and improve comments and commit message (David)
- Allocate a page if we cannot find in page cache for the hugetlbfs
  case as well (David)
- Don't unpin pages if there is a migration related failure (David)
- Drop the unnecessary nr_pages <= 0 check (Jason)
- Have the caller of the API pass in file * instead of fd (Jason)

Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>

Vivek Kasireddy (8):
  mm/gup: Introduce unpin_folio/unpin_folios helpers
  mm/gup: Introduce check_and_migrate_movable_folios()
  mm/gup: Introduce memfd_pin_folios() for pinning memfd folios
  udmabuf: Use vmf_insert_pfn and VM_PFNMAP for handling mmap
  udmabuf: Add back support for mapping hugetlb pages
  udmabuf: Convert udmabuf driver to use folios
  udmabuf: Pin the pages using memfd_pin_folios() API
  selftests/udmabuf: Add tests to verify data after page migration

 drivers/dma-buf/udmabuf.c                     | 231 +++++++++----
 include/linux/memfd.h                         |   5 +
 include/linux/mm.h                            |   5 +
 mm/gup.c                                      | 307 +++++++++++++++---
 mm/memfd.c                                    |  35 ++
 .../selftests/drivers/dma-buf/udmabuf.c       | 214 ++++++++++--
 6 files changed, 659 insertions(+), 138 deletions(-)

Dave Airlie May 23, 2024, 3:13 a.m. UTC | #1
Hey

Gerd, do you have any time to look at this series again, I think at
v14 we should probably consider landing it.

I'm happy to give
Acked-by: Dave Airlie <airlied@redhat.com> for landing this via mm if
it makes the most sense.

One comment in passing: I wonder if it makes sense for things like
vm_map_ram to have a folio variant in the future, to avoid that
temporary pages allocation.

Dave.

Gerd Hoffmann May 23, 2024, 8:28 a.m. UTC | #2
On Thu, May 23, 2024 at 01:13:11PM GMT, Dave Airlie wrote:
> Hey
> 
> Gerd, do you have any time to look at this series again, I think at
> v14 we should probably consider landing it.

Phew.  Didn't follow recent MM changes closely, don't know much about
folios beyond LWN coverage.  The changes look sane to my untrained eye,
I wouldn't rate that a 'review' though.

The patch series structure looks a bit odd, with patch #5 adding hugetlb
support, with the functions added being removed again in patch #7 after
switching to folios.  But maybe regression testing the series is easier
that way ...

Acked-by: Gerd Hoffmann <kraxel@redhat.com>

take care,
  Gerd

Kasireddy, Vivek May 23, 2024, 11:04 p.m. UTC | #3
Hi Gerd, Dave,

> 
> On Thu, May 23, 2024 at 01:13:11PM GMT, Dave Airlie wrote:
> > Hey
> >
> > Gerd, do you have any time to look at this series again, I think at
> > v14 we should probably consider landing it.
> 
> Phew.  Didn't follow recent MM changes closely, don't know much about
> folios beyond LWN coverage.  The changes look sane to my untrained eye,
> I wouldn't rate that a 'review' though.
> 
> The patch series structure looks a bit odd, with patch #5 adding hugetlb
> support, with the functions added being removed again in patch #7 after
> switching to folios.  But maybe regression testing the series is easier
> that way ...
Yes, regression testing is one reason. The other reason is to make it possible for
patches #4 and #5 to be backported to older stable kernels in order to add back
support for mapping hugetlbfs files without depending on folio related changes/patches.

> 
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Thank you. Andrew has merged this series to his mm tree.

Thanks,
Vivek

> 
> take care,
>   Gerd