[v2] mm: re-allow pinning of zero pfns (again)

Message ID 166015037385.760108.16881097713975517242.stgit@omen (mailing list archive)
State New

Commit Message

Alex Williamson Aug. 10, 2022, 4:53 p.m. UTC
The below referenced commit makes the same error as 1c563432588d ("mm: fix
is_pinnable_page against a cma page"), re-interpreting the logic to exclude
pinning of the zero page, which breaks device assignment with vfio.

To avoid further subtle mistakes, split the logic into discrete tests.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 include/linux/mm.h |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

Comments

Laba, SlawomirX Aug. 26, 2022, 7:01 p.m. UTC | #1
> Subject: [PATCH v2] mm: re-allow pinning of zero pfns (again)
> 
> The below referenced commit makes the same error as 1c563432588d ("mm:
> fix is_pinnable_page against a cma page"), re-interpreting the logic to
> exclude pinning of the zero page, which breaks device assignment with vfio.
> 
> To avoid further subtle mistakes, split the logic into discrete tests.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
> Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---

Tested-by: Slawomir Laba <slawomirx.laba@intel.com>
John Hubbard Aug. 28, 2022, 12:59 a.m. UTC | #2
On 8/10/22 09:53, Alex Williamson wrote:
> The below referenced commit makes the same error as 1c563432588d ("mm: fix
> is_pinnable_page against a cma page"), re-interpreting the logic to exclude
> pinning of the zero page, which breaks device assignment with vfio.
> 
> To avoid further subtle mistakes, split the logic into discrete tests.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
> Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  include/linux/mm.h |   17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
Hi Alex,

Looks good. I'm suggesting a simpler comment, below, because
even though the VFIO folks are thinking about VFIO, here we
are deep in the mm layer and there are lots of non-VFIO callers
that may pin the zero page.

> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 18e01474cf6b..835106a9718f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1544,9 +1544,20 @@ static inline bool is_longterm_pinnable_page(struct page *page)
>  	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
>  		return false;
>  #endif
> -	return !(is_device_coherent_page(page) ||
> -		 is_zone_movable_page(page) ||
> -		 is_zero_pfn(page_to_pfn(page)));
> +	/*
> +	 * The zero page might reside in a movable zone, however it may not
> +	 * be migrated and can therefore be pinned.  The vfio subsystem pins
> +	 * user mappings including the zero page for IOMMU translation.
> +	 */

Those notes are all about (some of) the callers. But it's a simple
answer, really, so how about just this:

	/* The zero page is always allowed to be pinned. */

?

> +	if (is_zero_pfn(page_to_pfn(page)))
> +		return true;
> +
> +	/* Coherent device memory must always allow eviction. */
> +	if (is_device_coherent_page(page))
> +		return false;
> +
> +	/* Otherwise, non-movable zone pages can be pinned. */
> +	return !is_zone_movable_page(page);
>  }
>  #else
>  static inline bool is_longterm_pinnable_page(struct page *page)
> 
> 
> 

Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
Andrew Morton Aug. 28, 2022, 1:40 a.m. UTC | #3
On Sat, 27 Aug 2022 17:59:32 -0700 John Hubbard <jhubbard@nvidia.com> wrote:

> 
> 	/* The zero page is always allowed to be pinned. */

Wow, that's really verbose :)

--- a/include/linux/mm.h~mm-re-allow-pinning-of-zero-pfns-again-fix
+++ a/include/linux/mm.h
@@ -1544,11 +1544,7 @@ static inline bool is_longterm_pinnable_
 	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
 		return false;
 #endif
-	/*
-	 * The zero page might reside in a movable zone, however it may not
-	 * be migrated and can therefore be pinned.  The vfio subsystem pins
-	 * user mappings including the zero page for IOMMU translation.
-	 */
+	/* The zero page may always be pinned */
 	if (is_zero_pfn(page_to_pfn(page)))
 		return true;
Alex Williamson Aug. 28, 2022, 12:37 p.m. UTC | #4
On Sat, 27 Aug 2022 17:59:32 -0700
John Hubbard <jhubbard@nvidia.com> wrote:

> On 8/10/22 09:53, Alex Williamson wrote:
> > The below referenced commit makes the same error as 1c563432588d ("mm: fix
> > is_pinnable_page against a cma page"), re-interpreting the logic to exclude
> > pinning of the zero page, which breaks device assignment with vfio.
> > 
> > To avoid further subtle mistakes, split the logic into discrete tests.
> > 
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
> > Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
> > Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  include/linux/mm.h |   17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)  
> Hi Alex,
> 
> Looks good. I'm suggesting a simpler comment, below, because
> even though the VFIO folks are thinking about VFIO, here we
> are deep in the mm layer and there are lots of non-VFIO callers
> that may pin the zero page.
> 
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 18e01474cf6b..835106a9718f 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1544,9 +1544,20 @@ static inline bool is_longterm_pinnable_page(struct page *page)
> >  	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
> >  		return false;
> >  #endif
> > -	return !(is_device_coherent_page(page) ||
> > -		 is_zone_movable_page(page) ||
> > -		 is_zero_pfn(page_to_pfn(page)));
> > +	/*
> > +	 * The zero page might reside in a movable zone, however it may not
> > +	 * be migrated and can therefore be pinned.  The vfio subsystem pins
> > +	 * user mappings including the zero page for IOMMU translation.
> > +	 */  
> 
> Those notes are all about (some of) the callers. But it's a simple
> answer, really, so how about just this:
> 
> 	/* The zero page is always allowed to be pinned. */

Sure.  Are we looking for a re-spin with this?  I see Andrew already
added this incremental change to his hotfix-unstable branch separately.
I'd hate for a comment re-spin to delay getting a fix for this problem,
that blocks any VM use cases of VFIO, into mainline any longer.  Thanks,

Alex
John Hubbard Aug. 28, 2022, 5:16 p.m. UTC | #5
On 8/28/22 05:37, Alex Williamson wrote:
>> Those notes are all about (some of) the callers. But it's a simple
>> answer, really, so how about just this:
>>
>> 	/* The zero page is always allowed to be pinned. */
> 
> Sure.  Are we looking for a re-spin with this?  I see Andrew already
> added this incremental change to his hotfix-unstable branch separately.
> I'd hate for a comment re-spin to delay getting a fix for this problem,
> that blocks any VM use cases of VFIO, into mainline any longer.  Thanks,
> 

Definitely not. Andrew's fixup should suffice.


thanks,

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 18e01474cf6b..835106a9718f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1544,9 +1544,20 @@  static inline bool is_longterm_pinnable_page(struct page *page)
 	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
 		return false;
 #endif
-	return !(is_device_coherent_page(page) ||
-		 is_zone_movable_page(page) ||
-		 is_zero_pfn(page_to_pfn(page)));
+	/*
+	 * The zero page might reside in a movable zone, however it may not
+	 * be migrated and can therefore be pinned.  The vfio subsystem pins
+	 * user mappings including the zero page for IOMMU translation.
+	 */
+	if (is_zero_pfn(page_to_pfn(page)))
+		return true;
+
+	/* Coherent device memory must always allow eviction. */
+	if (is_device_coherent_page(page))
+		return false;
+
+	/* Otherwise, non-movable zone pages can be pinned. */
+	return !is_zone_movable_page(page);
 }
 #else
 static inline bool is_longterm_pinnable_page(struct page *page)
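
The behavioural difference between the removed expression and the discrete tests can be checked outside the kernel with a small userspace model. The sketch below is only an illustration under stated assumptions: `struct page_model`, `pinnable_before()`, `pinnable_after()` and `check_zero_page_regression()` are hypothetical names that do not exist in the kernel, the three booleans stand in for the results of is_zero_pfn(page_to_pfn(page)), is_device_coherent_page(page) and is_zone_movable_page(page), and the earlier MIGRATE_CMA / MIGRATE_ISOLATE check is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one flag per kernel predicate used in the patch. */
struct page_model {
	bool zero_pfn;        /* models is_zero_pfn(page_to_pfn(page)) */
	bool device_coherent; /* models is_device_coherent_page(page) */
	bool zone_movable;    /* models is_zone_movable_page(page) */
};

/* Models the single negated expression that the patch removes. */
static bool pinnable_before(struct page_model p)
{
	return !(p.device_coherent || p.zone_movable || p.zero_pfn);
}

/* Models the discrete tests that the patch adds, in the same order. */
static bool pinnable_after(struct page_model p)
{
	/* The zero page may always be pinned. */
	if (p.zero_pfn)
		return true;

	/* Coherent device memory must always allow eviction. */
	if (p.device_coherent)
		return false;

	/* Otherwise, non-movable zone pages can be pinned. */
	return !p.zone_movable;
}

/* Compares both versions on the cases discussed in the thread. */
static void check_zero_page_regression(void)
{
	struct page_model zero = { .zero_pfn = true };
	struct page_model zero_movable = { .zero_pfn = true, .zone_movable = true };
	struct page_model normal = { 0 };
	struct page_model movable = { .zone_movable = true };
	struct page_model coherent = { .device_coherent = true };

	/* The regression: the old expression refuses the zero page. */
	assert(!pinnable_before(zero));
	assert(pinnable_after(zero));

	/* The zero page stays pinnable even if it sits in a movable zone. */
	assert(pinnable_after(zero_movable));

	/* All other cases are unchanged by the patch. */
	assert(pinnable_before(normal) && pinnable_after(normal));
	assert(!pinnable_before(movable) && !pinnable_after(movable));
	assert(!pinnable_before(coherent) && !pinnable_after(coherent));
}
```

Calling check_zero_page_regression() from a test main() exercises the cases. In this model the two versions disagree only when zero_pfn is set, which matches the commit message: the old expression excluded the zero page, and testing it first restores pinning regardless of the zone it resides in.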