Message ID | 20240306104147.193052-5-peterx@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/treewide: Remove pXd_huge() API | expand |
On Wed, Mar 06, 2024 at 06:41:38PM +0800, peterx@redhat.com wrote: > From: Peter Xu <peterx@redhat.com> > > This patch partly reverts below commits: > > 3a194f3f8ad0 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry") > cbef8478bee5 ("mm/hugetlb: pmd_huge() returns true for non-present hugepage") > > Right now, pXd_huge() definition across kernel is unclear. We have two > groups that think differently on swap entries: > > - x86/sparc: Allow pXd_huge() to accept swap entries > - all the rest: Doesn't allow pXd_huge() to accept swap entries > > This is so confusing. Since the sparc helpers seem to be added in 2016, > which is after x86's (2015), so sparc could have followed a trend. x86 > proposed such swap handling in 2015 to resolve hugetlb swap entries hit in > GUP, but now GUP guards swap entries with !pXd_present() in all layers so > we should be safe. > > We should define this API properly, one way or another, rather than keep > them defined differently across archs. > > Gut feeling tells me that pXd_huge() shouldn't include swap entries, and it > turns out that I am not the only one thinking so, the question was raised > when the current pmd_huge() for x86 was proposed by Ville Syrjälä: > > https://lore.kernel.org/all/Y2WQ7I4LXh8iUIRd@intel.com/ > > I might also be missing something obvious, but why is it even necessary > to treat PRESENT==0+PSE==0 as a huge entry? > > It is also questioned when Jason Gunthorpe reviewed the other patchset on > swap entry handlings: > > https://lore.kernel.org/all/20240221125753.GQ13330@nvidia.com/ > > Revert its meaning back to original. It shouldn't have any functional > change as we should be ready with guards on !pXd_present() explicitly > everywhere. > > Note that I also dropped the "#if CONFIG_PGTABLE_LEVELS > 2", it was there > probably because it was breaking things when 3a194f3f8ad0 was proposed, > according to the report here: > > https://lore.kernel.org/all/Y2LYXItKQyaJTv8j@intel.com/ > > Now we shouldn't need that. > > Instead of reverting to _PAGE_PSE raw check, leverage pXd_leaf(). > > Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Borislav Petkov <bp@alien8.de> > Cc: Dave Hansen <dave.hansen@linux.intel.com> > Cc: x86@kernel.org > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > arch/x86/mm/hugetlbpage.c | 18 ++++-------------- > 1 file changed, 4 insertions(+), 14 deletions(-) I think this is the right thing to do, callers should be more directly sensitive to swap entries not back into it indirectly from a helper like this. Jason
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c index 5804bbae4f01..8362953a24ce 100644 --- a/arch/x86/mm/hugetlbpage.c +++ b/arch/x86/mm/hugetlbpage.c @@ -20,29 +20,19 @@ #include <asm/elf.h> /* - * pmd_huge() returns 1 if @pmd is hugetlb related entry, that is normal - * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry. - * Otherwise, returns 0. + * pmd_huge() returns 1 if @pmd is hugetlb related entry. */ int pmd_huge(pmd_t pmd) { - return !pmd_none(pmd) && - (pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT; + return pmd_leaf(pmd); } /* - * pud_huge() returns 1 if @pud is hugetlb related entry, that is normal - * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry. - * Otherwise, returns 0. + * pud_huge() returns 1 if @pud is hugetlb related entry. */ int pud_huge(pud_t pud) { -#if CONFIG_PGTABLE_LEVELS > 2 - return !pud_none(pud) && - (pud_val(pud) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT; -#else - return 0; -#endif + return pud_leaf(pud); } #ifdef CONFIG_HUGETLB_PAGE