Message ID | 166015037385.760108.16881097713975517242.stgit@omen (mailing list archive) |
---|---|
State | New |
Series | [v2] mm: re-allow pinning of zero pfns (again) |
> Subject: [PATCH v2] mm: re-allow pinning of zero pfns (again)
>
> The below referenced commit makes the same error as 1c563432588d ("mm:
> fix is_pinnable_page against a cma page"), re-interpreting the logic to
> exclude pinning of the zero page, which breaks device assignment with vfio.
>
> To avoid further subtle mistakes, split the logic into discrete tests.
>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
> Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---

Tested-by: Slawomir Laba <slawomirx.laba@intel.com>
On 8/10/22 09:53, Alex Williamson wrote:
> The below referenced commit makes the same error as 1c563432588d ("mm: fix
> is_pinnable_page against a cma page"), re-interpreting the logic to exclude
> pinning of the zero page, which breaks device assignment with vfio.
>
> To avoid further subtle mistakes, split the logic into discrete tests.
>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
> Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  include/linux/mm.h | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)

Hi Alex,

Looks good. I'm suggesting a simpler comment, below, because
even though the VFIO folks are thinking about VFIO, here we
are deep in the mm layer and there are lots of non-VFIO callers
that may pin the zero page.

>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 18e01474cf6b..835106a9718f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1544,9 +1544,20 @@ static inline bool is_longterm_pinnable_page(struct page *page)
>  	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
>  		return false;
>  #endif
> -	return !(is_device_coherent_page(page) ||
> -		 is_zone_movable_page(page) ||
> -		 is_zero_pfn(page_to_pfn(page)));
> +	/*
> +	 * The zero page might reside in a movable zone, however it may not
> +	 * be migrated and can therefore be pinned.  The vfio subsystem pins
> +	 * user mappings including the zero page for IOMMU translation.
> +	 */

Those notes are all about (some of) the callers. But it's a simple
answer, really, so how about just this:

	/* The zero page is always allowed to be pinned. */

?

> +	if (is_zero_pfn(page_to_pfn(page)))
> +		return true;
> +
> +	/* Coherent device memory must always allow eviction. */
> +	if (is_device_coherent_page(page))
> +		return false;
> +
> +	/* Otherwise, non-movable zone pages can be pinned. */
> +	return !is_zone_movable_page(page);
>  }
>  #else
>  static inline bool is_longterm_pinnable_page(struct page *page)
>
>

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
On Sat, 27 Aug 2022 17:59:32 -0700 John Hubbard <jhubbard@nvidia.com> wrote:

>
> 	/* The zero page is always allowed to be pinned. */

Wow, that's really verbose :)

--- a/include/linux/mm.h~mm-re-allow-pinning-of-zero-pfns-again-fix
+++ a/include/linux/mm.h
@@ -1544,11 +1544,7 @@ static inline bool is_longterm_pinnable_
 	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
 		return false;
 #endif
-	/*
-	 * The zero page might reside in a movable zone, however it may not
-	 * be migrated and can therefore be pinned.  The vfio subsystem pins
-	 * user mappings including the zero page for IOMMU translation.
-	 */
+	/* The zero page may always be pinned */
 	if (is_zero_pfn(page_to_pfn(page)))
 		return true;
On Sat, 27 Aug 2022 17:59:32 -0700 John Hubbard <jhubbard@nvidia.com> wrote:

> On 8/10/22 09:53, Alex Williamson wrote:
> > The below referenced commit makes the same error as 1c563432588d ("mm: fix
> > is_pinnable_page against a cma page"), re-interpreting the logic to exclude
> > pinning of the zero page, which breaks device assignment with vfio.
> >
> > To avoid further subtle mistakes, split the logic into discrete tests.
> >
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> > Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
> > Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  include/linux/mm.h | 17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
>
> Hi Alex,
>
> Looks good. I'm suggesting a simpler comment, below, because
> even though the VFIO folks are thinking about VFIO, here we
> are deep in the mm layer and there are lots of non-VFIO callers
> that may pin the zero page.
>
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 18e01474cf6b..835106a9718f 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1544,9 +1544,20 @@ static inline bool is_longterm_pinnable_page(struct page *page)
> >  	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
> >  		return false;
> >  #endif
> > -	return !(is_device_coherent_page(page) ||
> > -		 is_zone_movable_page(page) ||
> > -		 is_zero_pfn(page_to_pfn(page)));
> > +	/*
> > +	 * The zero page might reside in a movable zone, however it may not
> > +	 * be migrated and can therefore be pinned.  The vfio subsystem pins
> > +	 * user mappings including the zero page for IOMMU translation.
> > +	 */
>
> Those notes are all about (some of) the callers. But it's a simple
> answer, really, so how about just this:
>
> 	/* The zero page is always allowed to be pinned. */

Sure.  Are we looking for a re-spin with this?  I see Andrew already
added this incremental change to his hotfix-unstable branch separately.
I'd hate for a comment re-spin to delay getting a fix for this problem,
that blocks any VM use cases of VFIO, into mainline any longer.  Thanks,

Alex
On 8/28/22 05:37, Alex Williamson wrote:
>> Those notes are all about (some of) the callers. But it's a simple
>> answer, really, so how about just this:
>>
>> 	/* The zero page is always allowed to be pinned. */
>
> Sure.  Are we looking for a re-spin with this?  I see Andrew already
> added this incremental change to his hotfix-unstable branch separately.
> I'd hate for a comment re-spin to delay getting a fix for this problem,
> that blocks any VM use cases of VFIO, into mainline any longer.  Thanks,
>

Definitely not. Andrew's fixup should suffice.

thanks,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 18e01474cf6b..835106a9718f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1544,9 +1544,20 @@ static inline bool is_longterm_pinnable_page(struct page *page)
 	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
 		return false;
 #endif
-	return !(is_device_coherent_page(page) ||
-		 is_zone_movable_page(page) ||
-		 is_zero_pfn(page_to_pfn(page)));
+	/*
+	 * The zero page might reside in a movable zone, however it may not
+	 * be migrated and can therefore be pinned.  The vfio subsystem pins
+	 * user mappings including the zero page for IOMMU translation.
+	 */
+	if (is_zero_pfn(page_to_pfn(page)))
+		return true;
+
+	/* Coherent device memory must always allow eviction. */
+	if (is_device_coherent_page(page))
+		return false;
+
+	/* Otherwise, non-movable zone pages can be pinned. */
+	return !is_zone_movable_page(page);
 }
 #else
 static inline bool is_longterm_pinnable_page(struct page *page)
The below referenced commit makes the same error as 1c563432588d ("mm: fix
is_pinnable_page against a cma page"), re-interpreting the logic to exclude
pinning of the zero page, which breaks device assignment with vfio.

To avoid further subtle mistakes, split the logic into discrete tests.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 include/linux/mm.h | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
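
For anyone who wants to see the behavioural difference outside the kernel, here is a rough userspace sketch, not kernel code. It assumes the three predicates in the patch (is_zero_pfn(), is_device_coherent_page(), is_zone_movable_page()) can be modelled as plain booleans. struct model_page, pinnable_combined(), pinnable_split() and count_disagreements() are hypothetical names made up for this illustration, and the CONFIG_CMA migratetype check at the top of the real function is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not a kernel type. */
struct model_page {
	bool is_zero;            /* models is_zero_pfn(page_to_pfn(page)) */
	bool is_device_coherent; /* models is_device_coherent_page(page) */
	bool in_movable_zone;    /* models is_zone_movable_page(page) */
};

/* The combined expression removed by the patch: it rejects the zero page. */
static bool pinnable_combined(struct model_page p)
{
	return !(p.is_device_coherent || p.in_movable_zone || p.is_zero);
}

/* The discrete tests added by the patch, in the same order. */
static bool pinnable_split(struct model_page p)
{
	if (p.is_zero)
		return true;
	if (p.is_device_coherent)
		return false;
	return !p.in_movable_zone;
}

/*
 * Walk all 8 flag combinations of the model (some of which may never occur
 * on a real system) and count how often the two versions disagree.
 */
int count_disagreements(void)
{
	int n = 0;

	for (int bits = 0; bits < 8; bits++) {
		struct model_page p = {
			.is_zero = bits & 1,
			.is_device_coherent = bits & 2,
			.in_movable_zone = bits & 4,
		};

		if (pinnable_combined(p) != pinnable_split(p)) {
			/* The two versions only ever differ for the zero page. */
			assert(p.is_zero);
			n++;
		}
	}
	return n;
}
```

In this model the two versions agree on every non-zero page and differ only when the zero flag is set, which matches the commit message: the combined expression excluded the zero page, and the split tests allow it again without changing the result for other pages.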