Message ID | 20220715021109.2225886-1-apopple@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm/gup: migrate device coherent pages when pinning instead of failing |
On 7/14/2022 9:11 PM, Alistair Popple wrote:
> Currently any attempt to pin a device coherent page will fail. This is
> because device coherent pages need to be managed by a device driver, and
> pinning them would prevent a driver from migrating them off the device.
>
> However, this is no reason to fail pinning of these pages. They are
> coherent and accessible from the CPU, so they can be migrated just like
> ZONE_MOVABLE pages. So instead of failing all attempts to pin them,
> first try migrating them out of ZONE_DEVICE.
>
> [hch: rebased to the split device memory checks,
>  moved migrate_device_page to migrate_device.c]
>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>
> This patch hopefully addresses all of David's comments. It replaces both my
> "mm: remove the vma check in migrate_vma_setup()" and "mm/gup: migrate device
> coherent pages when pinning instead of failing" patches. I'm not sure what the
> best way of including this is; perhaps Alex can respin the series with this
> patch instead?

For sure, Alistair. I'll include this in the next version of my patch series.

Thanks,
Alex Sierra

>
>  - Alistair
>
>  mm/gup.c            | 50 +++++++++++++++++++++++++++++++++++++------
>  mm/internal.h       |  1 +
>  mm/migrate_device.c | 52 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 96 insertions(+), 7 deletions(-)
>
> [quoted patch body snipped; the full diff is shown below]
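To make the caller-visible change concrete, here is a minimal, hypothetical sketch (not part of the patch) of a kernel-side user taking a FOLL_LONGTERM pin on one page of a userspace range backed by device coherent memory. The function name, `addr`, and the error handling are illustrative assumptions; only `pin_user_pages_fast()`, `unpin_user_page()`, and the FOLL_* flags are the real GUP API. Before this patch such a pin simply failed; with it, the page is first migrated to ordinary system memory and the pin then succeeds.

```c
#include <linux/mm.h>

/*
 * Hypothetical caller: long-term pin a single page at 'addr'. With this
 * patch applied, a device coherent page backing 'addr' is migrated to
 * system memory by check_and_migrate_movable_pages() and the lookup is
 * retried, instead of the whole call failing.
 */
static int example_longterm_pin(unsigned long addr, struct page **page)
{
	int ret;

	ret = pin_user_pages_fast(addr, 1, FOLL_WRITE | FOLL_LONGTERM, page);
	if (ret != 1)
		return ret < 0 ? ret : -EFAULT;

	/* ... DMA to/from the pinned page ... */

	unpin_user_page(*page);
	return 0;
}
```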
diff --git a/mm/gup.c b/mm/gup.c
index b65fe8bf5af4..22b97ab61cd9 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1881,7 +1881,7 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 	unsigned long isolation_error_count = 0, i;
 	struct folio *prev_folio = NULL;
 	LIST_HEAD(movable_page_list);
-	bool drain_allow = true;
+	bool drain_allow = true, coherent_pages = false;
 	int ret = 0;
 
 	for (i = 0; i < nr_pages; i++) {
@@ -1891,9 +1891,38 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 			continue;
 		prev_folio = folio;
 
-		if (folio_is_longterm_pinnable(folio))
+		/*
+		 * Device coherent pages are managed by a driver and should not
+		 * be pinned indefinitely as it prevents the driver moving the
+		 * page. So when trying to pin with FOLL_LONGTERM instead try
+		 * to migrate the page out of device memory.
+		 */
+		if (folio_is_device_coherent(folio)) {
+			/*
+			 * We always want a new GUP lookup with device coherent
+			 * pages.
+			 */
+			pages[i] = 0;
+			coherent_pages = true;
+
+			/*
+			 * Migration will fail if the page is pinned, so convert
+			 * the pin on the source page to a normal reference.
+			 */
+			if (gup_flags & FOLL_PIN) {
+				get_page(&folio->page);
+				unpin_user_page(&folio->page);
+			}
+
+			ret = migrate_device_coherent_page(&folio->page);
+			if (ret)
+				goto unpin_pages;
+
 			continue;
+		}
 
+		if (folio_is_longterm_pinnable(folio))
+			continue;
 		/*
 		 * Try to move out any movable page before pinning the range.
 		 */
@@ -1919,7 +1948,8 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 						folio_nr_pages(folio));
 	}
 
-	if (!list_empty(&movable_page_list) || isolation_error_count)
+	if (!list_empty(&movable_page_list) || isolation_error_count
+	    || coherent_pages)
 		goto unpin_pages;
 
 	/*
@@ -1929,10 +1959,16 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 	return nr_pages;
 
 unpin_pages:
-	if (gup_flags & FOLL_PIN) {
-		unpin_user_pages(pages, nr_pages);
-	} else {
-		for (i = 0; i < nr_pages; i++)
+	/*
+	 * pages[i] might be NULL if any device coherent pages were found.
+	 */
+	for (i = 0; i < nr_pages; i++) {
+		if (!pages[i])
+			continue;
+
+		if (gup_flags & FOLL_PIN)
+			unpin_user_page(pages[i]);
+		else
 			put_page(pages[i]);
 	}
 
diff --git a/mm/internal.h b/mm/internal.h
index c0f8fbe0445b..899dab512c5a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -853,6 +853,7 @@ int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
 		      unsigned long addr, int page_nid, int *flags);
 
 void free_zone_device_page(struct page *page);
+int migrate_device_coherent_page(struct page *page);
 
 /*
  * mm/gup.c
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 18bc6483f63a..7feeb447e3b9 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -686,6 +686,12 @@ void migrate_vma_pages(struct migrate_vma *migrate)
 		}
 
 		if (!page) {
+			/*
+			 * The only time there is no vma is when called from
+			 * migrate_device_coherent_page(). However this isn't
+			 * called if the page could not be unmapped.
+			 */
+			VM_BUG_ON(!migrate->vma);
 			if (!(migrate->src[i] & MIGRATE_PFN_MIGRATE))
 				continue;
 			if (!notified) {
@@ -794,3 +800,49 @@ void migrate_vma_finalize(struct migrate_vma *migrate)
 	}
 }
 EXPORT_SYMBOL(migrate_vma_finalize);
+
+/*
+ * Migrate a device coherent page back to normal memory. The caller should have
+ * a reference on page which will be copied to the new page if migration is
+ * successful or dropped on failure.
+ */
+int migrate_device_coherent_page(struct page *page)
+{
+	unsigned long src_pfn, dst_pfn = 0;
+	struct migrate_vma args;
+	struct page *dpage;
+
+	WARN_ON_ONCE(PageCompound(page));
+
+	lock_page(page);
+	src_pfn = migrate_pfn(page_to_pfn(page)) | MIGRATE_PFN_MIGRATE;
+	args.src = &src_pfn;
+	args.dst = &dst_pfn;
+	args.cpages = 1;
+	args.npages = 1;
+	args.vma = NULL;
+
+	/*
+	 * We don't have a VMA and don't need to walk the page tables to find
+	 * the source page. So call migrate_vma_unmap() directly to unmap the
+	 * page as migrate_vma_setup() will fail if args.vma == NULL.
+	 */
+	migrate_vma_unmap(&args);
+	if (!(src_pfn & MIGRATE_PFN_MIGRATE))
+		return -EBUSY;
+
+	dpage = alloc_page(GFP_USER | __GFP_NOWARN);
+	if (dpage) {
+		lock_page(dpage);
+		dst_pfn = migrate_pfn(page_to_pfn(dpage));
+	}
+
+	migrate_vma_pages(&args);
+	if (src_pfn & MIGRATE_PFN_MIGRATE)
+		copy_highpage(dpage, page);
+	migrate_vma_finalize(&args);
+
+	if (src_pfn & MIGRATE_PFN_MIGRATE)
+		return 0;
+	return -EBUSY;
+}
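For context on the `pages[i] = 0` step in the gup.c hunk: clearing the slot and branching to `unpin_pages` makes check_and_migrate_movable_pages() return 0, and the FOLL_LONGTERM path in mm/gup.c then repeats the whole lookup, now finding the freshly migrated system-memory pages. Below is a simplified sketch (not part of the patch) of that retry loop as it existed in __gup_longterm_locked() around this series; surrounding setup and variable declarations are trimmed.

```c
/* Simplified from __gup_longterm_locked() in mm/gup.c of this era. */
do {
	rc = __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
				     NULL, gup_flags);
	if (rc <= 0)
		break;
	/* Returns 0 if anything was migrated, prompting a fresh lookup. */
	rc = check_and_migrate_movable_pages(rc, pages, gup_flags);
} while (!rc);
```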