diff mbox series

[RFC,01/13] mm/hmm: Process pud swap entry without pud_huge()

Message ID 20240306104147.193052-2-peterx@redhat.com (mailing list archive)
State New
Series mm/treewide: Remove pXd_huge() API

Commit Message

Peter Xu March 6, 2024, 10:41 a.m. UTC
From: Peter Xu <peterx@redhat.com>

pud_huge() does not return true for swap pud entries on every arch.
x86 and sparc (so far) allow it, but all the rest do not report a swap
entry as pud_huge().  So it is not safe to check for swap entries inside
the pud_huge() branch.  Check for swap entries before pud_huge() instead,
which is safe on all archs.

This is the only place in the kernel that (IMHO, wrongly) relies on
pud_huge() to return true on pud swap entries.  The plan is to clean up
pXd_huge() so that it only reports non-swap mappings on all archs.

Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/hmm.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)
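
To illustrate the ordering problem described in the commit message, here is
a minimal toy model in C.  It is not kernel code: every toy_* name is
hypothetical, the arch_reports_swap flag is an assumed stand-in for the
per-arch difference in pud_huge(), and the walker is reduced to three
outcomes.  It only sketches why checking for non-present entries first
gives the same answer regardless of how an arch treats swap entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: the states a pud entry can be in. */
enum toy_pud { TOY_PUD_NONE, TOY_PUD_SWAP, TOY_PUD_TABLE, TOY_PUD_HUGE };

/* Simplified outcomes of the walker for one entry. */
enum toy_action { TOY_HOLE, TOY_HUGE_LEAF, TOY_DESCEND };

static bool toy_pud_none(enum toy_pud pud)
{
	return pud == TOY_PUD_NONE;
}

static bool toy_pud_present(enum toy_pud pud)
{
	return pud == TOY_PUD_TABLE || pud == TOY_PUD_HUGE;
}

/*
 * Hypothetical stand-in for pud_huge().  arch_reports_swap models the
 * assumption that some archs report a swap entry as huge and others
 * do not.
 */
static bool toy_pud_huge(enum toy_pud pud, bool arch_reports_swap)
{
	if (pud == TOY_PUD_SWAP)
		return arch_reports_swap;
	return pud == TOY_PUD_HUGE;
}

/* Old order: the non-present check is nested inside the huge branch. */
static enum toy_action toy_walk_old(enum toy_pud pud, bool arch_reports_swap)
{
	if (toy_pud_none(pud))
		return TOY_HOLE;
	if (toy_pud_huge(pud, arch_reports_swap)) {
		if (!toy_pud_present(pud))
			return TOY_HOLE;
		return TOY_HUGE_LEAF;
	}
	return TOY_DESCEND;
}

/* New order, as posted: non-present entries are handled up front. */
static enum toy_action toy_walk_new(enum toy_pud pud, bool arch_reports_swap)
{
	if (toy_pud_none(pud) || !toy_pud_present(pud))
		return TOY_HOLE;
	if (toy_pud_huge(pud, arch_reports_swap))
		return TOY_HUGE_LEAF;
	return TOY_DESCEND;
}

/* Usage: compare both orders on a swap entry under both arch models. */
static void toy_selfcheck(void)
{
	/* Old order: the result for a swap entry depends on the arch. */
	assert(toy_walk_old(TOY_PUD_SWAP, true) == TOY_HOLE);
	assert(toy_walk_old(TOY_PUD_SWAP, false) == TOY_DESCEND);
	/* New order: a swap entry is a hole either way. */
	assert(toy_walk_new(TOY_PUD_SWAP, true) == TOY_HOLE);
	assert(toy_walk_new(TOY_PUD_SWAP, false) == TOY_HOLE);
}
```

In this toy model the old order lets a swap entry fall through to the
descend path when the arch does not report it as huge; what the real walker
would do in that case is not modeled here.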

Comments

Jason Gunthorpe March 7, 2024, 6:12 p.m. UTC | #1
On Wed, Mar 06, 2024 at 06:41:35PM +0800, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
> 
> Swap pud entries do not always return true for pud_huge() for all archs.
> x86 and sparc (so far) allow it, but all the rest do not accept a swap
> entry to be reported as pud_huge().  So it's not safe to check swap entries
> within pud_huge().  Check swap entries before pud_huge(), so it should be
> always safe.
> 
> This is the only place in the kernel that (IMHO, wrongly) relies on
> pud_huge() to return true on pud swap entries.  The plan is to cleanup
> pXd_huge() to only report non-swap mappings for all archs.
> 
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/hmm.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

> @@ -424,7 +424,7 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
>  	walk->action = ACTION_CONTINUE;
>  
>  	pud = READ_ONCE(*pudp);
> -	if (pud_none(pud)) {
> +	if (pud_none(pud) || !pud_present(pud)) {

Isn't this a tautology? pud_none always implies !present() ?

Jason

Peter Xu March 8, 2024, 6:50 a.m. UTC | #2
On Thu, Mar 07, 2024 at 02:12:33PM -0400, Jason Gunthorpe wrote:
> On Wed, Mar 06, 2024 at 06:41:35PM +0800, peterx@redhat.com wrote:
> > From: Peter Xu <peterx@redhat.com>
> > 
> > Swap pud entries do not always return true for pud_huge() for all archs.
> > x86 and sparc (so far) allow it, but all the rest do not accept a swap
> > entry to be reported as pud_huge().  So it's not safe to check swap entries
> > within pud_huge().  Check swap entries before pud_huge(), so it should be
> > always safe.
> > 
> > This is the only place in the kernel that (IMHO, wrongly) relies on
> > pud_huge() to return true on pud swap entries.  The plan is to cleanup
> > pXd_huge() to only report non-swap mappings for all archs.
> > 
> > Cc: Alistair Popple <apopple@nvidia.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  mm/hmm.c | 7 +------
> >  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> > @@ -424,7 +424,7 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
> >  	walk->action = ACTION_CONTINUE;
> >  
> >  	pud = READ_ONCE(*pudp);
> > -	if (pud_none(pud)) {
> > +	if (pud_none(pud) || !pud_present(pud)) {
> 
> Isn't this a tautology? pud_none always implies !present() ?

Hmm yes, I think so.  AFAICT it should be "all = none + swap + present".
I still remember missing that once before; it's not always obvious when
preparing such patches. :( I'll simplify this here and in patch 3 as well.

Thanks,
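
The "all = none + swap + present" partition discussed in this exchange can
be checked exhaustively in a small standalone C sketch.  This is a toy
model under that stated assumption, not kernel code; the toy_entry_* and
cond_* names are hypothetical.  It shows that the condition as posted and
the shorter form agree for every state, which is why the pud_none() term
is redundant.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: every entry is exactly one of none, swap, or present. */
enum toy_entry { TOY_ENTRY_NONE, TOY_ENTRY_SWAP, TOY_ENTRY_PRESENT };

static bool toy_entry_none(enum toy_entry e)
{
	return e == TOY_ENTRY_NONE;
}

static bool toy_entry_present(enum toy_entry e)
{
	return e == TOY_ENTRY_PRESENT;
}

/* Shape of the condition as posted in the patch. */
static bool cond_as_posted(enum toy_entry e)
{
	return toy_entry_none(e) || !toy_entry_present(e);
}

/* One possible simplification: none already implies !present. */
static bool cond_simplified(enum toy_entry e)
{
	return !toy_entry_present(e);
}

/* Walk all three states and confirm both conditions agree. */
static void toy_check_equivalence(void)
{
	for (int e = TOY_ENTRY_NONE; e <= TOY_ENTRY_PRESENT; e++)
		assert(cond_as_posted((enum toy_entry)e) ==
		       cond_simplified((enum toy_entry)e));
}
```

The equivalence only holds as long as a none entry is never reported as
present, which is the assumption both reviewers state above.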

Patch

diff --git a/mm/hmm.c b/mm/hmm.c
index 277ddcab4947..c44391a0246e 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -424,7 +424,7 @@  static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 	walk->action = ACTION_CONTINUE;
 
 	pud = READ_ONCE(*pudp);
-	if (pud_none(pud)) {
+	if (pud_none(pud) || !pud_present(pud)) {
 		spin_unlock(ptl);
 		return hmm_vma_walk_hole(start, end, -1, walk);
 	}
@@ -435,11 +435,6 @@  static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		unsigned long *hmm_pfns;
 		unsigned long cpu_flags;
 
-		if (!pud_present(pud)) {
-			spin_unlock(ptl);
-			return hmm_vma_walk_hole(start, end, -1, walk);
-		}
-
 		i = (addr - range->start) >> PAGE_SHIFT;
 		npages = (end - addr) >> PAGE_SHIFT;
 		hmm_pfns = &range->hmm_pfns[i];