Message ID | 1587646154-26276-1-git-send-email-lixinhai.lxh@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/hugetlb: avoid unnecessary check on pud and pmd entry in huge_pte_offset | expand |
On 2020-04-23 at 20:49 Li Xinhai wrote: >When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE >or PMD_SIZE. >If sz is PUD_SIZE and code can reach pud, then *pud must be none, or >normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb >entry, and we can directly return pud. >When sz is PMD_SIZE, pud must be none or present, and if code can reach >pmd, we can directly return pmd. > >So, after this patch, the code is simplified by first check on the >parameter sz, and avoid unnecessary checks in current code. > >Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> >Cc: Mike Kravetz <mike.kravetz@oracle.com> >Cc: Andrew Morton <akpm@linux-foundation.org> Since this huge_pte_offset() is under CONFIG_ARCH_WANT_GENERAL_HUGETLB, I only have chance to test with x86_64, but the logical should hold for all other cases which using this general implementation. The exsiting code path was introduced by commit 9b19df292c6 (mm/hugetlb.c: make huge_pte_offset() consistent and document behaviour), and the sematics is maintained after current patch applied.
Cc a few people who have looked at huge_pte_offset() recently. On 4/23/20 5:49 AM, Li Xinhai wrote: > When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE > or PMD_SIZE. > If sz is PUD_SIZE and code can reach pud, then *pud must be none, or > normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb > entry, and we can directly return pud. > When sz is PMD_SIZE, pud must be none or present, and if code can reach > pmd, we can directly return pmd. > > So, after this patch, the code is simplified by first check on the > parameter sz, and avoid unnecessary checks in current code. > > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> > Cc: Mike Kravetz <mike.kravetz@oracle.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > --- > mm/hugetlb.c | 24 +++++++++--------------- > 1 file changed, 9 insertions(+), 15 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index bcabbe0..e1424f5 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, > { > pgd_t *pgd; > p4d_t *p4d; > - pud_t *pud, pud_entry; > - pmd_t *pmd, pmd_entry; > + pud_t *pud; > + pmd_t *pmd; > > pgd = pgd_offset(mm, addr); > if (!pgd_present(*pgd)) > @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm, > return NULL; > > pud = pud_offset(p4d, addr); > - pud_entry = READ_ONCE(*pud); > - if (sz != PUD_SIZE && pud_none(pud_entry)) > - return NULL; > - /* hugepage or swap? */ > - if (pud_huge(pud_entry) || !pud_present(pud_entry)) > + if (sz == PUD_SIZE) > + /* must be pud_huge or pud_none */ > return (pte_t *)pud; > - > - pmd = pmd_offset(pud, addr); > - pmd_entry = READ_ONCE(*pmd); > - if (sz != PMD_SIZE && pmd_none(pmd_entry)) > + if (!pud_present(*pud)) > return NULL; > - /* hugepage or swap? */ > - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry)) > - return (pte_t *)pmd; > + /* must have a valid entry and size to go further */ > > - return NULL; > + pmd = pmd_offset(pud, addr); Can we get here with sz = PMD_SIZE and pud_none(*pud)? Would that be an issue for the pmd_offset() call?
On Thu, Apr 23, 2020 at 11:14:28AM -0700, Mike Kravetz wrote: > Cc a few people who have looked at huge_pte_offset() recently. > > On 4/23/20 5:49 AM, Li Xinhai wrote: > > When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE > > or PMD_SIZE. > > If sz is PUD_SIZE and code can reach pud, then *pud must be none, or > > normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb > > entry, and we can directly return pud. > > When sz is PMD_SIZE, pud must be none or present, and if code can reach > > pmd, we can directly return pmd. > > > > So, after this patch, the code is simplified by first check on the > > parameter sz, and avoid unnecessary checks in current code. > > > > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> > > Cc: Mike Kravetz <mike.kravetz@oracle.com> > > Cc: Andrew Morton <akpm@linux-foundation.org> > > mm/hugetlb.c | 24 +++++++++--------------- > > 1 file changed, 9 insertions(+), 15 deletions(-) > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index bcabbe0..e1424f5 100644 > > +++ b/mm/hugetlb.c > > @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, > > { > > pgd_t *pgd; > > p4d_t *p4d; > > - pud_t *pud, pud_entry; > > - pmd_t *pmd, pmd_entry; > > + pud_t *pud; > > + pmd_t *pmd; > > > > pgd = pgd_offset(mm, addr); > > if (!pgd_present(*pgd)) > > @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm, > > return NULL; > > > > pud = pud_offset(p4d, addr); > > - pud_entry = READ_ONCE(*pud); > > - if (sz != PUD_SIZE && pud_none(pud_entry)) > > - return NULL; > > - /* hugepage or swap? */ > > - if (pud_huge(pud_entry) || !pud_present(pud_entry)) > > + if (sz == PUD_SIZE) > > + /* must be pud_huge or pud_none */ > > return (pte_t *)pud; > > - > > - pmd = pmd_offset(pud, addr); > > - pmd_entry = READ_ONCE(*pmd); > > - if (sz != PMD_SIZE && pmd_none(pmd_entry)) > > + if (!pud_present(*pud)) > > return NULL; > > - /* hugepage or swap? */ > > - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry)) > > - return (pte_t *)pmd; > > + /* must have a valid entry and size to go further */ > > > > - return NULL; > > + pmd = pmd_offset(pud, addr); > > Can we get here with sz = PMD_SIZE and pud_none(*pud)? Would that be > an issue for the pmd_offset() call? Certainly pmd_offset() must only be called if the PUD entry is pointing at a pmd level. AFAIK this means it should not be called on pud_none(), pud_huge() or !pud_present() cases. Jason
On 2020-04-24 at 02:38 Jason Gunthorpe wrote: >On Thu, Apr 23, 2020 at 11:14:28AM -0700, Mike Kravetz wrote: >> Cc a few people who have looked at huge_pte_offset() recently. >> >> On 4/23/20 5:49 AM, Li Xinhai wrote: >> > When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE >> > or PMD_SIZE. >> > If sz is PUD_SIZE and code can reach pud, then *pud must be none, or >> > normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb >> > entry, and we can directly return pud. >> > When sz is PMD_SIZE, pud must be none or present, and if code can reach >> > pmd, we can directly return pmd. >> > >> > So, after this patch, the code is simplified by first check on the >> > parameter sz, and avoid unnecessary checks in current code. >> > >> > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> >> > Cc: Mike Kravetz <mike.kravetz@oracle.com> >> > Cc: Andrew Morton <akpm@linux-foundation.org> >> > mm/hugetlb.c | 24 +++++++++--------------- >> > 1 file changed, 9 insertions(+), 15 deletions(-) >> > >> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> > index bcabbe0..e1424f5 100644 >> > +++ b/mm/hugetlb.c >> > @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, >> > { >> > pgd_t *pgd; >> > p4d_t *p4d; >> > - pud_t *pud, pud_entry; >> > - pmd_t *pmd, pmd_entry; >> > + pud_t *pud; >> > + pmd_t *pmd; >> > >> > pgd = pgd_offset(mm, addr); >> > if (!pgd_present(*pgd)) >> > @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm, >> > return NULL; >> > >> > pud = pud_offset(p4d, addr); >> > - pud_entry = READ_ONCE(*pud); >> > - if (sz != PUD_SIZE && pud_none(pud_entry)) >> > - return NULL; >> > - /* hugepage or swap? */ >> > - if (pud_huge(pud_entry) || !pud_present(pud_entry)) >> > + if (sz == PUD_SIZE) >> > + /* must be pud_huge or pud_none */ >> > return (pte_t *)pud; >> > - >> > - pmd = pmd_offset(pud, addr); >> > - pmd_entry = READ_ONCE(*pmd); >> > - if (sz != PMD_SIZE && pmd_none(pmd_entry)) >> > + if (!pud_present(*pud)) >> > return NULL; >> > - /* hugepage or swap? */ >> > - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry)) >> > - return (pte_t *)pmd; >> > + /* must have a valid entry and size to go further */ >> > >> > - return NULL; >> > + pmd = pmd_offset(pud, addr); >> >> Can we get here with sz = PMD_SIZE and pud_none(*pud)? Would that be >> an issue for the pmd_offset() call? > >Certainly pmd_offset() must only be called if the PUD entry is >pointing at a pmd level. > >AFAIK this means it should not be called on pud_none(), pud_huge() or >!pud_present() cases. The test of !pud_present(*pud) also block pud_none(*pud), so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD entry which point to PMD page table. > >Jason
On Fri, Apr 24, 2020 at 12:07:50PM +0800, Li Xinhai wrote: > On 2020-04-24 at 02:38 Jason Gunthorpe wrote: > >On Thu, Apr 23, 2020 at 11:14:28AM -0700, Mike Kravetz wrote: > >> Cc a few people who have looked at huge_pte_offset() recently. > >> > >> On 4/23/20 5:49 AM, Li Xinhai wrote: > >> > When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE > >> > or PMD_SIZE. > >> > If sz is PUD_SIZE and code can reach pud, then *pud must be none, or > >> > normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb > >> > entry, and we can directly return pud. > >> > When sz is PMD_SIZE, pud must be none or present, and if code can reach > >> > pmd, we can directly return pmd. > >> > > >> > So, after this patch, the code is simplified by first check on the > >> > parameter sz, and avoid unnecessary checks in current code. > >> > > >> > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> > >> > Cc: Mike Kravetz <mike.kravetz@oracle.com> > >> > Cc: Andrew Morton <akpm@linux-foundation.org> > >> > mm/hugetlb.c | 24 +++++++++--------------- > >> > 1 file changed, 9 insertions(+), 15 deletions(-) > >> > > >> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > >> > index bcabbe0..e1424f5 100644 > >> > +++ b/mm/hugetlb.c > >> > @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, > >> > { > >> > pgd_t *pgd; > >> > p4d_t *p4d; > >> > - pud_t *pud, pud_entry; > >> > - pmd_t *pmd, pmd_entry; > >> > + pud_t *pud; > >> > + pmd_t *pmd; > >> > > >> > pgd = pgd_offset(mm, addr); > >> > if (!pgd_present(*pgd)) > >> > @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm, > >> > return NULL; > >> > > >> > pud = pud_offset(p4d, addr); > >> > - pud_entry = READ_ONCE(*pud); > >> > - if (sz != PUD_SIZE && pud_none(pud_entry)) > >> > - return NULL; > >> > - /* hugepage or swap? */ > >> > - if (pud_huge(pud_entry) || !pud_present(pud_entry)) > >> > + if (sz == PUD_SIZE) > >> > + /* must be pud_huge or pud_none */ > >> > return (pte_t *)pud; > >> > - > >> > - pmd = pmd_offset(pud, addr); > >> > - pmd_entry = READ_ONCE(*pmd); > >> > - if (sz != PMD_SIZE && pmd_none(pmd_entry)) > >> > + if (!pud_present(*pud)) > >> > return NULL; > >> > - /* hugepage or swap? */ > >> > - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry)) > >> > - return (pte_t *)pmd; > >> > + /* must have a valid entry and size to go further */ > >> > > >> > - return NULL; > >> > + pmd = pmd_offset(pud, addr); > >> > >> Can we get here with sz = PMD_SIZE and pud_none(*pud)? Would that be > >> an issue for the pmd_offset() call? > > > >Certainly pmd_offset() must only be called if the PUD entry is > >pointing at a pmd level. > > > >AFAIK this means it should not be called on pud_none(), pud_huge() or > >!pud_present() cases. > > The test of !pud_present(*pud) also block pud_none(*pud) Sure > , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD > entry which point to PMD page table. But what prevents pud_huge? This API seems kind of strange to be honest.. Should it be two functions instead of a sz parameter? huge_pud_offset() and huge_pmd_offset() ? Jason
On 2020-04-24 at 20:57 Jason Gunthorpe wrote: >On Fri, Apr 24, 2020 at 12:07:50PM +0800, Li Xinhai wrote: >> On 2020-04-24 at 02:38 Jason Gunthorpe wrote: >> >On Thu, Apr 23, 2020 at 11:14:28AM -0700, Mike Kravetz wrote: >> >> Cc a few people who have looked at huge_pte_offset() recently. >> >> >> >> On 4/23/20 5:49 AM, Li Xinhai wrote: >> >> > When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE >> >> > or PMD_SIZE. >> >> > If sz is PUD_SIZE and code can reach pud, then *pud must be none, or >> >> > normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb >> >> > entry, and we can directly return pud. >> >> > When sz is PMD_SIZE, pud must be none or present, and if code can reach >> >> > pmd, we can directly return pmd. >> >> > >> >> > So, after this patch, the code is simplified by first check on the >> >> > parameter sz, and avoid unnecessary checks in current code. >> >> > >> >> > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> >> >> > Cc: Mike Kravetz <mike.kravetz@oracle.com> >> >> > Cc: Andrew Morton <akpm@linux-foundation.org> >> >> > mm/hugetlb.c | 24 +++++++++--------------- >> >> > 1 file changed, 9 insertions(+), 15 deletions(-) >> >> > >> >> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> >> > index bcabbe0..e1424f5 100644 >> >> > +++ b/mm/hugetlb.c >> >> > @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, >> >> > { >> >> > pgd_t *pgd; >> >> > p4d_t *p4d; >> >> > - pud_t *pud, pud_entry; >> >> > - pmd_t *pmd, pmd_entry; >> >> > + pud_t *pud; >> >> > + pmd_t *pmd; >> >> > >> >> > pgd = pgd_offset(mm, addr); >> >> > if (!pgd_present(*pgd)) >> >> > @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm, >> >> > return NULL; >> >> > >> >> > pud = pud_offset(p4d, addr); >> >> > - pud_entry = READ_ONCE(*pud); >> >> > - if (sz != PUD_SIZE && pud_none(pud_entry)) >> >> > - return NULL; >> >> > - /* hugepage or swap? */ >> >> > - if (pud_huge(pud_entry) || !pud_present(pud_entry)) >> >> > + if (sz == PUD_SIZE) >> >> > + /* must be pud_huge or pud_none */ >> >> > return (pte_t *)pud; >> >> > - >> >> > - pmd = pmd_offset(pud, addr); >> >> > - pmd_entry = READ_ONCE(*pmd); >> >> > - if (sz != PMD_SIZE && pmd_none(pmd_entry)) >> >> > + if (!pud_present(*pud)) >> >> > return NULL; >> >> > - /* hugepage or swap? */ >> >> > - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry)) >> >> > - return (pte_t *)pmd; >> >> > + /* must have a valid entry and size to go further */ >> >> > >> >> > - return NULL; >> >> > + pmd = pmd_offset(pud, addr); >> >> >> >> Can we get here with sz = PMD_SIZE and pud_none(*pud)? Would that be >> >> an issue for the pmd_offset() call? >> > >> >Certainly pmd_offset() must only be called if the PUD entry is >> >pointing at a pmd level. >> > >> >AFAIK this means it should not be called on pud_none(), pud_huge() or >> >!pud_present() cases. >> >> The test of !pud_present(*pud) also block pud_none(*pud) > >Sure > >> , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD >> entry which point to PMD page table. > >But what prevents pud_huge? > if sz == PUD_SIZE, the 'return (pte_t*)pud' alrady end the function, which cover pud_huge() and pud_none(), because we the mapping is for PUD_SIZE huge page. So, there is no possibility for pmd_offset() been called with invalid pud entry. Below is the code I used for test which has BUG_ON, that should give more clear idea about the semantics of code path: ... pud = pud_offset(p4d, addr); if (sz == PUD_SIZE) { /* must be pud_huge or pud_none */ BUG_ON(!pud_huge(*pud) && !pud_none(*pud)); return (pte_t *)pud; // note that return valid pointer for pud_none() case, // instead of NULL, that is same semantics as existing code. } if (!pud_present(*pud)) return NULL; // note that only return NULL in case pud not present, // same sematics as existing code. /* must have a valid entry and size to go further */ BUG_ON(sz != PMD_SIZE); pmd = pmd_offset(pud, addr); /* must be pmd_huge or pmd_none */ BUG_ON(!pmd_huge(*pmd) && !pmd_none(*pmd)); return (pte_t *)pmd; // note that return valid pointer for pmd_none() case, // instead of NULL, that is same semantics as existing code. ... >This API seems kind of strange to be honest.. Should it be two >functions instead of a sz parameter? > >huge_pud_offset() and huge_pmd_offset() ? I think checking huge size then call to one of these two functions at caller site will involve many redundant code do branch work in one function is better. > >Jason
On Fri, Apr 24, 2020 at 09:33:40PM +0800, Li Xinhai wrote: > > > >> , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD > >> entry which point to PMD page table. > > > >But what prevents pud_huge? > > > if sz == PUD_SIZE, the 'return (pte_t*)pud' alrady end the function, which cover > pud_huge() and pud_none(), because we the mapping is for PUD_SIZE huge page. > > So, there is no possibility for pmd_offset() been called with invalid pud entry. > Below is the code I used for test which has BUG_ON, that should give more > clear idea about the semantics of code path: > > ... > pud = pud_offset(p4d, addr); > if (sz == PUD_SIZE) { > /* must be pud_huge or pud_none */ > BUG_ON(!pud_huge(*pud) && !pud_none(*pud)); > return (pte_t *)pud; // note that return valid pointer for pud_none() case, > // instead of NULL, that is same semantics as existing code. > } > if (!pud_present(*pud)) > return NULL; // note that only return NULL in case pud not present, > // same sematics as existing code. > /* must have a valid entry and size to go further */ > BUG_ON(sz != PMD_SIZE); > > pmd = pmd_offset(pud, addr); > /* must be pmd_huge or pmd_none */ > BUG_ON(!pmd_huge(*pmd) && !pmd_none(*pmd)); But why is !pmd_huge() ? The prior code returned null here, is that dead code? Your commit message should explain all of this.. Jason
On 2020-04-24 at 21:42 Jason Gunthorpe wrote: >On Fri, Apr 24, 2020 at 09:33:40PM +0800, Li Xinhai wrote: >> > >> >> , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD >> >> entry which point to PMD page table. >> > >> >But what prevents pud_huge? >> > >> if sz == PUD_SIZE, the 'return (pte_t*)pud' alrady end the function, which cover >> pud_huge() and pud_none(), because we the mapping is for PUD_SIZE huge page. >> >> So, there is no possibility for pmd_offset() been called with invalid pud entry. >> Below is the code I used for test which has BUG_ON, that should give more >> clear idea about the semantics of code path: >> >> ... >> pud = pud_offset(p4d, addr); >> if (sz == PUD_SIZE) { >> /* must be pud_huge or pud_none */ >> BUG_ON(!pud_huge(*pud) && !pud_none(*pud)); >> return (pte_t *)pud; // note that return valid pointer for pud_none() case, >> // instead of NULL, that is same semantics as existing code. >> } >> if (!pud_present(*pud)) >> return NULL; // note that only return NULL in case pud not present, >> // same sematics as existing code. >> /* must have a valid entry and size to go further */ >> BUG_ON(sz != PMD_SIZE); >> >> pmd = pmd_offset(pud, addr); >> /* must be pmd_huge or pmd_none */ >> BUG_ON(!pmd_huge(*pmd) && !pmd_none(*pmd)); > >But why is !pmd_huge() ? The prior code returned null here, is that >dead code? Your commit message should explain all of this.. > let's see exising code for pmd part, the reason are in comments: ... pmd = pmd_offset(pud, addr); if (sz != PMD_SIZE && pmd_none(*pmd)) return NULL; // dead code, must sz == PMD_SIZE /* hugepage or swap? */ if (pmd_huge(*pmd) || !pmd_present(*pmd)) // !pmd_present() also cover pmd_none(), return (pte_t *)pmd; // so, all possible and valid value in pmd entry will reach here. return NULL; // dead code; can we have (!pmd_huge() && pmd_present()) and reach here? // no, because this is a hugetlb mapping. otherwise, there is invalid value in pmd entry. ... >Jason
On Fri, Apr 24, 2020 at 10:07:19PM +0800, Li Xinhai wrote: > On 2020-04-24 at 21:42 Jason Gunthorpe wrote: > >On Fri, Apr 24, 2020 at 09:33:40PM +0800, Li Xinhai wrote: > >> > > >> >> , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD > >> >> entry which point to PMD page table. > >> > > >> >But what prevents pud_huge? > >> > > >> if sz == PUD_SIZE, the 'return (pte_t*)pud' alrady end the function, which cover > >> pud_huge() and pud_none(), because we the mapping is for PUD_SIZE huge page. > >> > >> So, there is no possibility for pmd_offset() been called with invalid pud entry. > >> Below is the code I used for test which has BUG_ON, that should give more > >> clear idea about the semantics of code path: > >> > >> ... > >> pud = pud_offset(p4d, addr); > >> if (sz == PUD_SIZE) { > >> /* must be pud_huge or pud_none */ > >> BUG_ON(!pud_huge(*pud) && !pud_none(*pud)); > >> return (pte_t *)pud; // note that return valid pointer for pud_none() case, > >> // instead of NULL, that is same semantics as existing code. > >> } > >> if (!pud_present(*pud)) > >> return NULL; // note that only return NULL in case pud not present, > >> // same sematics as existing code. > >> /* must have a valid entry and size to go further */ > >> BUG_ON(sz != PMD_SIZE); > >> > >> pmd = pmd_offset(pud, addr); > >> /* must be pmd_huge or pmd_none */ > >> BUG_ON(!pmd_huge(*pmd) && !pmd_none(*pmd)); > > > >But why is !pmd_huge() ? The prior code returned null here, is that > >dead code? Your commit message should explain all of this.. > > > let's see exising code for pmd part, the reason are in comments: > ... > pmd = pmd_offset(pud, addr); > if (sz != PMD_SIZE && pmd_none(*pmd)) > return NULL; // dead code, must sz == PMD_SIZE > /* hugepage or swap? */ > if (pmd_huge(*pmd) || !pmd_present(*pmd)) // !pmd_present() also cover pmd_none(), > return (pte_t *)pmd; // so, all possible and valid value in pmd entry will reach here. > > return NULL; // dead code; can we have (!pmd_huge() && pmd_present()) and reach here? > // no, because this is a hugetlb mapping. otherwise, there is invalid value in pmd entry. > ... well if you are relying on the caller to not call this in wrong cases it would make sense to have a if (WARN_ON(!pmd_huge(*pmd))) return NULL To document the assertion Jason
On 2020-04-24 at 22:10 Jason Gunthorpe wrote: >On Fri, Apr 24, 2020 at 10:07:19PM +0800, Li Xinhai wrote: >> On 2020-04-24 at 21:42 Jason Gunthorpe wrote: >> >On Fri, Apr 24, 2020 at 09:33:40PM +0800, Li Xinhai wrote: >> >> > >> >> >> , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD >> >> >> entry which point to PMD page table. >> >> > >> >> >But what prevents pud_huge? >> >> > >> >> if sz == PUD_SIZE, the 'return (pte_t*)pud' alrady end the function, which cover >> >> pud_huge() and pud_none(), because we the mapping is for PUD_SIZE huge page. >> >> >> >> So, there is no possibility for pmd_offset() been called with invalid pud entry. >> >> Below is the code I used for test which has BUG_ON, that should give more >> >> clear idea about the semantics of code path: >> >> >> >> ... >> >> pud = pud_offset(p4d, addr); >> >> if (sz == PUD_SIZE) { >> >> /* must be pud_huge or pud_none */ >> >> BUG_ON(!pud_huge(*pud) && !pud_none(*pud)); >> >> return (pte_t *)pud; // note that return valid pointer for pud_none() case, >> >> // instead of NULL, that is same semantics as existing code. >> >> } >> >> if (!pud_present(*pud)) >> >> return NULL; // note that only return NULL in case pud not present, >> >> // same sematics as existing code. >> >> /* must have a valid entry and size to go further */ >> >> BUG_ON(sz != PMD_SIZE); >> >> >> >> pmd = pmd_offset(pud, addr); >> >> /* must be pmd_huge or pmd_none */ >> >> BUG_ON(!pmd_huge(*pmd) && !pmd_none(*pmd)); >> > >> >But why is !pmd_huge() ? The prior code returned null here, is that >> >dead code? Your commit message should explain all of this.. >> > >> let's see exising code for pmd part, the reason are in comments: >> ... >> pmd = pmd_offset(pud, addr); >> if (sz != PMD_SIZE && pmd_none(*pmd)) >> return NULL; // dead code, must sz == PMD_SIZE >> /* hugepage or swap? */ >> if (pmd_huge(*pmd) || !pmd_present(*pmd)) // !pmd_present() also cover pmd_none(), >> return (pte_t *)pmd; // so, all possible and valid value in pmd entry will reach here. >> >> return NULL; // dead code; can we have (!pmd_huge() && pmd_present()) and reach here? >> // no, because this is a hugetlb mapping. otherwise, there is invalid value in pmd entry. >> ... > >well if you are relying on the caller to not call this in wrong cases >it would make sense to have a > >if (WARN_ON(!pmd_huge(*pmd))) > return NULL > >To document the assertion > right, if this WARN_ON occurs, which means huge_pte_offset() been called for a normal 4K mapping, that is a bug of caller. After inspected current code of callers, no one tries to call it on 4K mapping. >Jason
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index bcabbe0..e1424f5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, { pgd_t *pgd; p4d_t *p4d; - pud_t *pud, pud_entry; - pmd_t *pmd, pmd_entry; + pud_t *pud; + pmd_t *pmd; pgd = pgd_offset(mm, addr); if (!pgd_present(*pgd)) @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm, return NULL; pud = pud_offset(p4d, addr); - pud_entry = READ_ONCE(*pud); - if (sz != PUD_SIZE && pud_none(pud_entry)) - return NULL; - /* hugepage or swap? */ - if (pud_huge(pud_entry) || !pud_present(pud_entry)) + if (sz == PUD_SIZE) + /* must be pud_huge or pud_none */ return (pte_t *)pud; - - pmd = pmd_offset(pud, addr); - pmd_entry = READ_ONCE(*pmd); - if (sz != PMD_SIZE && pmd_none(pmd_entry)) + if (!pud_present(*pud)) return NULL; - /* hugepage or swap? */ - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry)) - return (pte_t *)pmd; + /* must have a valid entry and size to go further */ - return NULL; + pmd = pmd_offset(pud, addr); + /* must be pmd_huge or pmd_none */ + return (pte_t *)pmd; } #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE or PMD_SIZE. If sz is PUD_SIZE and code can reach pud, then *pud must be none, or normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb entry, and we can directly return pud. When sz is PMD_SIZE, pud must be none or present, and if code can reach pmd, we can directly return pmd. So, after this patch, the code is simplified by first check on the parameter sz, and avoid unnecessary checks in current code. Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> --- mm/hugetlb.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-)