Message ID | 20230925200110.1979606-1-zokeefe@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v4] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()" | expand |
On Mon, Sep 25, 2023 at 1:01 PM Zach O'Keefe <zokeefe@google.com> wrote: > > The 6.0 commits: > > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > merged "can we have THPs in this VMA?" logic that was previously done > separately by fault-path, khugepaged, and smaps "THPeligible" checks. > > During the process, the semantics of the fault path check changed in two > ways: > > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault > handler that could satisfy the fault. Previously, this check had been > done in create_huge_pud() and create_huge_pmd() routines, but after > the changes, we never reach those routines. > > During the review of the above commits, it was determined that in-tree > users weren't affected by the change; most notably, since the only relevant > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is > explicitly approved early in approval logic. However, this was a bad > assumption to make as it assumes the only reason to support ->huge_fault > was for DAX (which is not true in general). > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give > any ->huge_fault handler a chance to handle the fault. Note that we > don't validate the file mode or mapping alignment, which is consistent > with the behavior before the aforementioned commits. Looks good to me. Reviewed-by: Yang Shi <shy828301@gmail.com> > > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > Reported-by: Saurabh Singh Sengar <ssengar@microsoft.com> > Signed-off-by: Zach O'Keefe <zokeefe@google.com> > Cc: Yang Shi <shy828301@gmail.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: David Hildenbrand <david@redhat.com> > --- > I've updated the changelog to reflect discussions in [1] -- leaving > ack to David / Matthew on whether to take the patch. > > Changed from v3[1]: > - [akpm / David H. / M. Wilcox] Updated log to capture email discussion > Changed from v2[2]: > - Fixed false negative in smaps check when !dax && ->huge_fault > Changed from v1[3]: > - [Saurabhi] Allow ->huge_fault handler to handle fault, if it exists > > [1] https://lore.kernel.org/linux-mm/20230821234844.699818-1-zokeefe@google.com/ > [2] https://lore.kernel.org/linux-mm/20230818211533.2523697-1-zokeefe@google.com/ > [3] https://lore.kernel.org/linux-mm/CAAa6QmQw+F=o6htOn=6ADD6mwvMO=Ow_67f3ifBv3GpXx9Xg_g@mail.gmail.com/ > > --- > mm/huge_memory.c | 20 +++++++++++++------- > 1 file changed, 13 insertions(+), 7 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 0f93a73115f73..797fe617e51ab 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, > return in_pf; > > /* > - * Special VMA and hugetlb VMA. > + * khugepaged special VMA and hugetlb VMA. > * Must be checked after dax since some dax mappings may have > * VM_MIXEDMAP set. > */ > - if (vm_flags & VM_NO_KHUGEPAGED) > + if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED)) > return false; > > /* > @@ -132,12 +132,18 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, > !hugepage_flags_always()))) > return false; > > - /* Only regular file is valid */ > - if (!in_pf && file_thp_enabled(vma)) > - return true; > - > - if (!vma_is_anonymous(vma)) > + if (!vma_is_anonymous(vma)) { > + /* > + * Trust that ->huge_fault() handlers know what they are doing > + * in fault path. > + */ > + if (((in_pf || smaps)) && vma->vm_ops->huge_fault) > + return true; > + /* Only regular file is valid in collapse path */ > + if (((!in_pf || smaps)) && file_thp_enabled(vma)) > + return true; > return false; > + } > > if (vma_is_temporary_stack(vma)) > return false; > -- > 2.42.0.515.g380fc7ccd1-goog >
On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <zokeefe@google.com> wrote: > The 6.0 commits: > > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > merged "can we have THPs in this VMA?" logic that was previously done > separately by fault-path, khugepaged, and smaps "THPeligible" checks. > > During the process, the semantics of the fault path check changed in two > ways: > > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault > handler that could satisfy the fault. Previously, this check had been > done in create_huge_pud() and create_huge_pmd() routines, but after > the changes, we never reach those routines. > > During the review of the above commits, it was determined that in-tree > users weren't affected by the change; most notably, since the only relevant > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is > explicitly approved early in approval logic. However, this was a bad > assumption to make as it assumes the only reason to support ->huge_fault > was for DAX (which is not true in general). > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give > any ->huge_fault handler a chance to handle the fault. Note that we > don't validate the file mode or mapping alignment, which is consistent > with the behavior before the aforementioned commits. It's unclear what are the userspace visible impacts of this change. Which makes it hard for others to determine whether -stable kernels should be patched. > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > Reported-by: Saurabh Singh Sengar <ssengar@microsoft.com> It's nice to include a Closes: link after a Reported-by:. Then readers are better able to answer the above question. > Signed-off-by: Zach O'Keefe <zokeefe@google.com> > Cc: Yang Shi <shy828301@gmail.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: David Hildenbrand <david@redhat.com>
On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <zokeefe@google.com> wrote: > The 6.0 commits: > > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > merged "can we have THPs in this VMA?" logic that was previously done > separately by fault-path, khugepaged, and smaps "THPeligible" checks. > > During the process, the semantics of the fault path check changed in two > ways: > > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault > handler that could satisfy the fault. Previously, this check had been > done in create_huge_pud() and create_huge_pmd() routines, but after > the changes, we never reach those routines. > > During the review of the above commits, it was determined that in-tree > users weren't affected by the change; most notably, since the only relevant > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is > explicitly approved early in approval logic. However, this was a bad > assumption to make as it assumes the only reason to support ->huge_fault > was for DAX (which is not true in general). > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give > any ->huge_fault handler a chance to handle the fault. Note that we > don't validate the file mode or mapping alignment, which is consistent > with the behavior before the aforementioned commits. > > ... > > @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, > return in_pf; > Ryan's "mm: thp: introduce anon_orders and anon_always_mask sysfs files" changes hugepage_vma_check() to return an unsigned int, so this patch will need some rework to fit in after that. However Ryan's overall series "variable-order, large folios for anonymous memory" is in early days and might not make it. And as I don't know what is the urgency of this patch ("mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which patch needs to come first (thus requiring rework of the other patch). Please discuss!
On 06.10.23 19:58, Andrew Morton wrote: > On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <zokeefe@google.com> wrote: > >> The 6.0 commits: >> >> commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") >> commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") >> >> merged "can we have THPs in this VMA?" logic that was previously done >> separately by fault-path, khugepaged, and smaps "THPeligible" checks. >> >> During the process, the semantics of the fault path check changed in two >> ways: >> >> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). >> 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault >> handler that could satisfy the fault. Previously, this check had been >> done in create_huge_pud() and create_huge_pmd() routines, but after >> the changes, we never reach those routines. >> >> During the review of the above commits, it was determined that in-tree >> users weren't affected by the change; most notably, since the only relevant >> user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is >> explicitly approved early in approval logic. However, this was a bad >> assumption to make as it assumes the only reason to support ->huge_fault >> was for DAX (which is not true in general). >> >> Remove the VM_NO_KHUGEPAGED check when not in collapse path and give >> any ->huge_fault handler a chance to handle the fault. Note that we >> don't validate the file mode or mapping alignment, which is consistent >> with the behavior before the aforementioned commits. >> >> ... >> >> @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, >> return in_pf; >> > > Ryan's "mm: thp: introduce anon_orders and anon_always_mask sysfs > files" changes hugepage_vma_check() to return an unsigned int, so this > patch will need some rework to fit in after that. > > However Ryan's overall series "variable-order, large folios for > anonymous memory" is in early days and might not make it. > > And as I don't know what is the urgency of this patch ("mm/thp: fix > "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which > patch needs to come first (thus requiring rework of the other patch). > > Please discuss! IMHO clearly this one.
On 25.09.23 22:01, Zach O'Keefe wrote: > The 6.0 commits: > > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > merged "can we have THPs in this VMA?" logic that was previously done > separately by fault-path, khugepaged, and smaps "THPeligible" checks. > > During the process, the semantics of the fault path check changed in two > ways: > > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault > handler that could satisfy the fault. Previously, this check had been > done in create_huge_pud() and create_huge_pmd() routines, but after > the changes, we never reach those routines. > > During the review of the above commits, it was determined that in-tree > users weren't affected by the change; most notably, since the only relevant > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is > explicitly approved early in approval logic. However, this was a bad > assumption to make as it assumes the only reason to support ->huge_fault > was for DAX (which is not true in general). > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give > any ->huge_fault handler a chance to handle the fault. Note that we > don't validate the file mode or mapping alignment, which is consistent > with the behavior before the aforementioned commits. > > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > Reported-by: Saurabh Singh Sengar <ssengar@microsoft.com> > Signed-off-by: Zach O'Keefe <zokeefe@google.com> > Cc: Yang Shi <shy828301@gmail.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: David Hildenbrand <david@redhat.com> > --- > I've updated the changelog to reflect discussions in [1] -- leaving > ack to David / Matthew on whether to take the patch. Works for me. Acked-by: David Hildenbrand <david@redhat.com>
On Fri, 6 Oct 2023 20:52:30 +0200 David Hildenbrand <david@redhat.com> wrote: > > And as I don't know what is the urgency of this patch ("mm/thp: fix > > "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which > > patch needs to come first (thus requiring rework of the other patch). > > > > Please discuss! > > IMHO clearly this one. OK. I'll drop the "variable-order, large folios for anonymous memory" v6 series for now.
On 06/10/2023 20:11, Andrew Morton wrote: > On Fri, 6 Oct 2023 20:52:30 +0200 David Hildenbrand <david@redhat.com> wrote: > >>> And as I don't know what is the urgency of this patch ("mm/thp: fix >>> "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which >>> patch needs to come first (thus requiring rework of the other patch). >>> >>> Please discuss! >> >> IMHO clearly this one. > > OK. I'll drop the "variable-order, large folios for anonymous memory" v6 > series for now. Yep, agreed!
On Fri, Oct 6, 2023 at 10:50 AM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Mon, 25 Sep 2023 13:01:10 -0700 "Zach O'Keefe" <zokeefe@google.com> wrote: > > > The 6.0 commits: > > > > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") > > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > > > merged "can we have THPs in this VMA?" logic that was previously done > > separately by fault-path, khugepaged, and smaps "THPeligible" checks. > > > > During the process, the semantics of the fault path check changed in two > > ways: > > > > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). > > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault > > handler that could satisfy the fault. Previously, this check had been > > done in create_huge_pud() and create_huge_pmd() routines, but after > > the changes, we never reach those routines. > > > > During the review of the above commits, it was determined that in-tree > > users weren't affected by the change; most notably, since the only relevant > > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is > > explicitly approved early in approval logic. However, this was a bad > > assumption to make as it assumes the only reason to support ->huge_fault > > was for DAX (which is not true in general). > > > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give > > any ->huge_fault handler a chance to handle the fault. Note that we > > don't validate the file mode or mapping alignment, which is consistent > > with the behavior before the aforementioned commits. > > It's unclear what are the userspace visible impacts of this change. > Which makes it hard for others to determine whether -stable kernels > should be patched. IMO, I don't think this change is suitable for -stable; the only users that would have been affected are those that maintain out-of-tree drivers / code that hooked into ->huge_fault() or used VM_MIXEDMAP + THP. No users of the in-tree kernel would have been affected. It's still a good "fix" to make going forward (and certainly happy to be able to help Saurabh out). + greg k-h for vis / to confirm. > > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > Reported-by: Saurabh Singh Sengar <ssengar@microsoft.com> > > It's nice to include a Closes: link after a Reported-by:. Then readers > are better able to answer the above question. Ah, apologies, Andrew; I didn't know such a tag existed -- I'll be sure to include it in the future. > > Signed-off-by: Zach O'Keefe <zokeefe@google.com> > > Cc: Yang Shi <shy828301@gmail.com> > > Cc: Matthew Wilcox <willy@infradead.org> > > Cc: David Hildenbrand <david@redhat.com> >
On Fri, Oct 6, 2023 at 2:28 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 06/10/2023 20:11, Andrew Morton wrote: > > On Fri, 6 Oct 2023 20:52:30 +0200 David Hildenbrand <david@redhat.com> wrote: > > > >>> And as I don't know what is the urgency of this patch ("mm/thp: fix > >>> "mm: thp: kill __transhuge_page_enabled()"), I'm unable to decide which > >>> patch needs to come first (thus requiring rework of the other patch). > >>> > >>> Please discuss! > >> > >> IMHO clearly this one. > > > > OK. I'll drop the "variable-order, large folios for anonymous memory" v6 > > series for now. > > Yep, agreed! Thank all and sorry for the late response ; have been buried as of late. Best, Zach
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 0f93a73115f73..797fe617e51ab 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -100,11 +100,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, return in_pf; /* - * Special VMA and hugetlb VMA. + * khugepaged special VMA and hugetlb VMA. * Must be checked after dax since some dax mappings may have * VM_MIXEDMAP set. */ - if (vm_flags & VM_NO_KHUGEPAGED) + if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED)) return false; /* @@ -132,12 +132,18 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, !hugepage_flags_always()))) return false; - /* Only regular file is valid */ - if (!in_pf && file_thp_enabled(vma)) - return true; - - if (!vma_is_anonymous(vma)) + if (!vma_is_anonymous(vma)) { + /* + * Trust that ->huge_fault() handlers know what they are doing + * in fault path. + */ + if (((in_pf || smaps)) && vma->vm_ops->huge_fault) + return true; + /* Only regular file is valid in collapse path */ + if (((!in_pf || smaps)) && file_thp_enabled(vma)) + return true; return false; + } if (vma_is_temporary_stack(vma)) return false;
The 6.0 commits: commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") merged "can we have THPs in this VMA?" logic that was previously done separately by fault-path, khugepaged, and smaps "THPeligible" checks. During the process, the semantics of the fault path check changed in two ways: 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault handler that could satisfy the fault. Previously, this check had been done in create_huge_pud() and create_huge_pmd() routines, but after the changes, we never reach those routines. During the review of the above commits, it was determined that in-tree users weren't affected by the change; most notably, since the only relevant user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is explicitly approved early in approval logic. However, this was a bad assumption to make as it assumes the only reason to support ->huge_fault was for DAX (which is not true in general). Remove the VM_NO_KHUGEPAGED check when not in collapse path and give any ->huge_fault handler a chance to handle the fault. Note that we don't validate the file mode or mapping alignment, which is consistent with the behavior before the aforementioned commits. Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") Reported-by: Saurabh Singh Sengar <ssengar@microsoft.com> Signed-off-by: Zach O'Keefe <zokeefe@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Hildenbrand <david@redhat.com> --- I've updated the changelog to reflect discussions in [1] -- leaving ack to David / Matthew on whether to take the patch. Changed from v3[1]: - [akpm / David H. / M. Wilcox] Updated log to capture email discussion Changed from v2[2]: - Fixed false negative in smaps check when !dax && ->huge_fault Changed from v1[3]: - [Saurabhi] Allow ->huge_fault handler to handle fault, if it exists [1] https://lore.kernel.org/linux-mm/20230821234844.699818-1-zokeefe@google.com/ [2] https://lore.kernel.org/linux-mm/20230818211533.2523697-1-zokeefe@google.com/ [3] https://lore.kernel.org/linux-mm/CAAa6QmQw+F=o6htOn=6ADD6mwvMO=Ow_67f3ifBv3GpXx9Xg_g@mail.gmail.com/ --- mm/huge_memory.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)