diff mbox series

[mm-unstable] Revert "mm/khugepaged: remove redundant transhuge_vma_suitable() check"

Message ID 20220720111318.1831708-1-zokeefe@google.com (mailing list archive)
State New
Series [mm-unstable] Revert "mm/khugepaged: remove redundant transhuge_vma_suitable() check"

Commit Message

Zach O'Keefe July 20, 2022, 11:13 a.m. UTC
A pmd should not cross a VMA boundary, which is normally enforced by
vma_adjust_trans_huge(), and assumed by e.g. __split_huge_pmd_locked().

In this regard, the transhuge_vma_suitable() check in
hugepage_vma_check() is not redundant with the transhuge_vma_suitable()
check previously in hugepage_vma_revalidate().

The former validates the VMA itself, and checks that *some* memory
in the VMA is suitable to collapse while the latter validates if
collapsing at a specific address is suitable.  By removing the check on
the faulting address, we've inadvertently allowed collapse of a pmd that
can cross vma->vm_end. Revert this change.

Fixes: 143776e7512e ("mm/khugepaged: remove redundant transhuge_vma_suitable() check")
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
---
Apologies, Andrew. I think you've put the series description into this
first patch (thank you).  Do you mind moving it into the next patch in the
series,
"mm: khugepaged: don't carry huge page to the next loop for !CONFIG_NUMA"?
Note that the "mm: userspace hugepage collapse, v7" series doesn't actually
depend on this patch, it was just a cleanup (and thus perhaps I shouldn't have
included it in the series in the first place).
---
 mm/khugepaged.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Yang Shi July 20, 2022, 5:22 p.m. UTC | #1
On Wed, Jul 20, 2022 at 4:13 AM Zach O'Keefe <zokeefe@google.com> wrote:
>
> A pmd should not cross a VMA boundary, which is normally enforced by
> vma_adjust_trans_huge(), and assumed by e.g. __split_huge_pmd_locked().
>
> In this regard, the transhuge_vma_suitable() check in
> hugepage_vma_check() is not redundant with the transhuge_vma_suitable()
> check previously in hugepage_vma_revalidate().
>
> The former validates the VMA itself, and checks that *some* memory
> in the VMA is suitable to collapse while the latter validates if
> collapsing at a specific address is suitable.  By removing the check on
> the faulting address, we've inadvertently allowed collapse of a pmd that
> can cross vma->vm_end. Revert this change.

Aha, yeah, nice catch.

Reviewed-by: Yang Shi <shy828301@gmail.com>

Zach O'Keefe July 20, 2022, 6:42 p.m. UTC | #2
On Wed, Jul 20, 2022 at 10:22 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Wed, Jul 20, 2022 at 4:13 AM Zach O'Keefe <zokeefe@google.com> wrote:
> >
> > A pmd should not cross a VMA boundary, which is normally enforced by
> > vma_adjust_trans_huge(), and assumed by e.g. __split_huge_pmd_locked().
> >
> > In this regard, the transhuge_vma_suitable() check in
> > hugepage_vma_check() is not redundant with the transhuge_vma_suitable()
> > check previously in hugepage_vma_revalidate().
> >
> > The former validates the VMA itself, and checks that *some* memory
> > in the VMA is suitable to collapse while the latter validates if
> > collapsing at a specific address is suitable.  By removing the check on
> > the faulting address, we've inadvertently allowed collapse of a pmd that
> > can cross vma->vm_end. Revert this change.
>
> Aha, yeah, nice catch.
>
> Reviewed-by: Yang Shi <shy828301@gmail.com>
>

Thanks Yang. Also, hughd found it :) In hindsight, I think it's
actually customary to add a "Reported-by: Hugh Dickins
<hughd@google.com>" - but since the previous patch will just be
dropped and never see the light of day, I guess the value there is
diminished. Anyways - credit goes to Hugh :)

Thanks,
Zach

Hugh Dickins July 20, 2022, 8:28 p.m. UTC | #3
On Wed, 20 Jul 2022, Zach O'Keefe wrote:
> On Wed, Jul 20, 2022 at 10:22 AM Yang Shi <shy828301@gmail.com> wrote:
> > On Wed, Jul 20, 2022 at 4:13 AM Zach O'Keefe <zokeefe@google.com> wrote:
> > >
> > > A pmd should not cross a VMA boundary, which is normally enforced by
> > > vma_adjust_trans_huge(), and assumed by e.g. __split_huge_pmd_locked().
> > >
> > > In this regard, the transhuge_vma_suitable() check in
> > > hugepage_vma_check() is not redundant with the transhuge_vma_suitable()
> > > check previously in hugepage_vma_revalidate().
> > >
> > > The former validates the VMA itself, and checks that *some* memory
> > > in the VMA is suitable to collapse while the latter validates if
> > > collapsing at a specific address is suitable.  By removing the check on
> > > the faulting address, we've inadvertently allowed collapse of a pmd that
> > > can cross vma->vm_end. Revert this change.
> >
> > Aha, yeah, nice catch.
> >
> > Reviewed-by: Yang Shi <shy828301@gmail.com>
> >
> 
> Thanks Yang. Also, hughd found it :) In hindsight, I think it's
> actually customary to add a "Reported-by: Hugh Dickins
> <hughd@google.com>" - but since the previous patch will just be
> dropped and never see the light of day, I guess the value there is
> diminished. Anyways - credit goes to Hugh :)

Thanks Zach, no probs, and as you say it would have vanished anyway.

It was something I hit in testing maple tree, and at first thought a
consequence of maple tree's (previous) brk handling:

https://lore.kernel.org/linux-mm/a6736ccf-fb45-5777-ca28-575297f1879f@google.com/
(the "coincident" paragraph).

But a similar crash occurred when I took that out of the picture:
maple tree not to blame at all - apology to Liam.

Hugh

Patch

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2db6d0dd2981..69990dacde14 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -855,6 +855,8 @@  static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	if (!vma)
 		return SCAN_VMA_NULL;
 
+	if (!transhuge_vma_suitable(vma, address))
+		return SCAN_ADDRESS_RANGE;
 	if (!hugepage_vma_check(vma, vma->vm_flags, false, false,
 				cc->is_khugepaged))
 		return SCAN_VMA_CHECK;
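
The distinction drawn in the commit message, between checking that *some* memory in the VMA is suitable and checking that a *specific* address is suitable, can be illustrated with a small user-space model. This is only a sketch, not the kernel code: struct toy_vma, range_fits_at() and some_range_fits() are hypothetical names, the 2 MiB PMD size is an assumption (it is architecture- and configuration-dependent), and only the address-range test is modelled. Other conditions the real helpers may apply, such as file offset alignment or VMA flags, are deliberately left out.

```c
#include <stdbool.h>

/* Assumption: a PMD maps 2 MiB (e.g. x86-64 with 4 KiB base pages). */
#define HPAGE_PMD_SIZE (2UL << 20)
#define HPAGE_PMD_MASK (~(HPAGE_PMD_SIZE - 1))

/* Hypothetical stand-in for the two vm_area_struct fields that matter here. */
struct toy_vma {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* first byte after the mapping */
};

/*
 * Address-specific check: the PMD-aligned range containing addr must lie
 * entirely inside the VMA. This models the kind of test the patch restores
 * in hugepage_vma_revalidate().
 */
static bool range_fits_at(const struct toy_vma *vma, unsigned long addr)
{
	unsigned long haddr = addr & HPAGE_PMD_MASK;

	return haddr >= vma->vm_start && haddr + HPAGE_PMD_SIZE <= vma->vm_end;
}

/*
 * VMA-level check: at least one PMD-aligned range fits somewhere in the
 * VMA. Passing this says nothing about any particular address.
 */
static bool some_range_fits(const struct toy_vma *vma)
{
	/* Round vm_start up to the next PMD boundary. */
	unsigned long first = (vma->vm_start + HPAGE_PMD_SIZE - 1) & HPAGE_PMD_MASK;

	return first >= vma->vm_start && first + HPAGE_PMD_SIZE <= vma->vm_end;
}
```

For a toy VMA spanning 0x200000 to 0x500000, some_range_fits() returns true because the aligned range starting at 0x200000 fits. range_fits_at() returns false for address 0x480000, though: its aligned range starts at 0x400000 and would end at 0x600000, past vm_end. With only the VMA-level check in place, a collapse at that address would produce a pmd crossing the VMA boundary, which is the situation the revert prevents.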