Message ID | 2838b6737bc259cf575ff11fd1c4b7fdb340fa73.1660717122.git.baolin.wang@linux.alibaba.com |
---|---|
State | New |
Series | mm/damon: Validate if the pmd entry is present before accessing |
On Wed, 17 Aug 2022 14:21:12 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
> page, also including the case of non-present (migration or hwpoisoned)
> pmd entry on arm64 or x86 architectures. Thus we should validate if it
> is present before making the pmd entry old or getting young state,
> otherwise we can not get the correct corresponding page.
>

What are the user-visible runtime effects of this change?
Hi Baolin,

Thank you always for your great patch!

On Wed, 17 Aug 2022 14:21:12 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
> page, also including the case of non-present (migration or hwpoisoned)
> pmd entry on arm64 or x86 architectures. Thus we should validate if it
> is present before making the pmd entry old or getting young state,
> otherwise we can not get the correct corresponding page.

Maybe I'm missing something, but... I'm unsure if the page is present or not
really matters from the perspective of access checking. In that case, DAMON
could simply report the page as accessed once for the first check after the
page being non-present if it was really accessed before, and then report the
page as not accessed, which is true.

Please let me know if I'm missing something.

Thanks,
SJ

>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/damon/vaddr.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> index 3c7b9d6..1d16c6c 100644
> --- a/mm/damon/vaddr.c
> +++ b/mm/damon/vaddr.c
> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>
>  	if (pmd_huge(*pmd)) {
>  		ptl = pmd_lock(walk->mm, pmd);
> +		if (!pmd_present(*pmd)) {
> +			spin_unlock(ptl);
> +			return 0;
> +		}
> +
>  		if (pmd_huge(*pmd)) {
>  			damon_pmdp_mkold(pmd, walk->mm, addr);
>  			spin_unlock(ptl);
> @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	if (pmd_huge(*pmd)) {
>  		ptl = pmd_lock(walk->mm, pmd);
> +		if (!pmd_present(*pmd)) {
> +			spin_unlock(ptl);
> +			return 0;
> +		}
> +
>  		if (!pmd_huge(*pmd)) {
>  			spin_unlock(ptl);
>  			goto regular_page;
> --
> 1.8.3.1
On 8/18/2022 12:09 AM, SeongJae Park wrote:
> Hi Baolin,
>
>
> Thank you always for your great patch!
>
> On Wed, 17 Aug 2022 14:21:12 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>> page, also including the case of non-present (migration or hwpoisoned)
>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>> is present before making the pmd entry old or getting young state,
>> otherwise we can not get the correct corresponding page.
>
> Maybe I'm missing something, but... I'm unsure if the page is present or not
> really matters from the perspective of access checking. In that case, DAMON
> could simply report the page as accessed once for the first check after the
> page being non-present if it was really accessed before, and then report the
> page as not accessed, which is true.

Yes, that's the patch's goal to make the accesses correct. However if
the PMD entry is not present, we can not get the correct page object by
pmd_pfn(*pmd), since the non-present pmd entry will contain swap type
and swap offset with below format on ARM64, that means the pfn number is
saved in bits 8-57 in a migration or poisoned entry, but pmd_pfn() still
treats bits 12-47 as the pfn number on ARM64, which may get an incorrect
page struct (also maybe is NULL by pfn_to_online_page()) to make the
access statistics incorrect.

/*
 * Encode and decode a swap entry:
 *	bits 0-1:	present (must be zero)
 *	bits 2:		remember PG_anon_exclusive
 *	bits 3-7:	swap type
 *	bits 8-57:	swap offset
 *	bit  58:	PTE_PROT_NONE (must be zero)
 */

Moreover I don't think we should still waste time to get the page of
the non-present entry, just treat it as not-accessed and skip it, that
keeps consistent with non-present pte level entry.

Does that make sense to you? Thanks.
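To make the bit-layout mismatch described above concrete, here is a minimal user-space sketch, not kernel code. It assumes only the field positions from the quoted comment and the statement in this thread that pmd_pfn() reads bits 12-47 on ARM64. Every sketch_* name is hypothetical, and the present test (bits 0-1 non-zero) is a simplification of the real pmd_present().

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the swap-entry comment quoted in the thread. */
#define SKETCH_SWP_TYPE_SHIFT	3
#define SKETCH_SWP_TYPE_BITS	5	/* bits 3-7 */
#define SKETCH_SWP_OFFSET_SHIFT	8
#define SKETCH_SWP_OFFSET_BITS	50	/* bits 8-57 */

/* Per the thread, pmd_pfn() treats bits 12-47 as the pfn on ARM64. */
#define SKETCH_PFN_SHIFT	12
#define SKETCH_PFN_BITS		36

/* Mask with the low 'bits' bits set (bits must be below 64). */
static uint64_t sketch_mask(unsigned int bits)
{
	return (UINT64_C(1) << bits) - 1;
}

/* Pack a swap type and offset into a non-present entry (hypothetical helper). */
static uint64_t sketch_mk_swap_entry(uint64_t type, uint64_t offset)
{
	assert(type <= sketch_mask(SKETCH_SWP_TYPE_BITS));
	assert(offset <= sketch_mask(SKETCH_SWP_OFFSET_BITS));
	return (type << SKETCH_SWP_TYPE_SHIFT) |
	       (offset << SKETCH_SWP_OFFSET_SHIFT);
}

/* Simplified present test: bits 0-1 must be zero for a swap entry. */
static int sketch_present(uint64_t entry)
{
	return (entry & 0x3) != 0;
}

/* Correct decode of the swap offset field. */
static uint64_t sketch_swap_offset(uint64_t entry)
{
	return (entry >> SKETCH_SWP_OFFSET_SHIFT) &
	       sketch_mask(SKETCH_SWP_OFFSET_BITS);
}

/* What a pfn-style decode sees when applied to the same raw bits. */
static uint64_t sketch_naive_pfn(uint64_t entry)
{
	return (entry >> SKETCH_PFN_SHIFT) & sketch_mask(SKETCH_PFN_BITS);
}
```

For example, packing swap type 1 and offset 0x12345 gives an entry that is not present, whose swap offset decodes back to 0x12345, but whose bits 12-47 read as 0x1234. A caller that skips the present check would therefore look up the wrong page.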
Hi Baolin,

On Thu, 18 Aug 2022 09:05:58 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

>
>
> On 8/18/2022 12:09 AM, SeongJae Park wrote:
> > Hi Baolin,
> >
> >
> > Thank you always for your great patch!
> >
> > On Wed, 17 Aug 2022 14:21:12 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> >
> >> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
> >> page, also including the case of non-present (migration or hwpoisoned)
> >> pmd entry on arm64 or x86 architectures. Thus we should validate if it
> >> is present before making the pmd entry old or getting young state,
> >> otherwise we can not get the correct corresponding page.
> >
> > Maybe I'm missing something, but... I'm unsure if the page is present or not
> > really matters from the perspective of access checking. In that case, DAMON
> > could simply report the page as accessed once for the first check after the
> > page being non-present if it was really accessed before, and then report the
> > page as not accessed, which is true.
>
> Yes, that's the patch's goal to make the accesses correct. However if
> the PMD entry is not present, we can not get the correct page object by
> pmd_pfn(*pmd), since the non-present pmd entry will contain swap type
> and swap offset with below format on ARM64, that means the pfn number is
> saved in bits 8-57 in a migration or poisoned entry, but pmd_pfn() still
> treats bits 12-47 as the pfn number on ARM64, which may get an incorrect
> page struct (also maybe is NULL by pfn_to_online_page()) to make the
> access statistics incorrect.
>
> /*
>  * Encode and decode a swap entry:
>  *	bits 0-1:	present (must be zero)
>  *	bits 2:		remember PG_anon_exclusive
>  *	bits 3-7:	swap type
>  *	bits 8-57:	swap offset
>  *	bit  58:	PTE_PROT_NONE (must be zero)
>  */
>
>
> Moreover I don't think we should still waste time to get the page of
> the non-present entry, just treat it as not-accessed and skip it, that
> keeps consistent with non-present pte level entry.
>
> Does that make sense to you? Thanks.

Yes, that totally makes sense. Thank you very much for the kind answer. I
think it would be great if we could put the detailed explanation in the commit
message. Could you please update the commit message and post v2 of the patch?

Anyway,

Reviewed-by: SeongJae Park <sj@kernel.org>

Thanks,
SJ
On 8/18/2022 10:29 AM, SeongJae Park wrote:
> Hi Baolin,
>
> On Thu, 18 Aug 2022 09:05:58 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>>
>>
>> On 8/18/2022 12:09 AM, SeongJae Park wrote:
>>> Hi Baolin,
>>>
>>>
>>> Thank you always for your great patch!
>>>
>>> On Wed, 17 Aug 2022 14:21:12 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>>>> page, also including the case of non-present (migration or hwpoisoned)
>>>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>>>> is present before making the pmd entry old or getting young state,
>>>> otherwise we can not get the correct corresponding page.
>>>
>>> Maybe I'm missing something, but... I'm unsure if the page is present or not
>>> really matters from the perspective of access checking. In that case, DAMON
>>> could simply report the page as accessed once for the first check after the
>>> page being non-present if it was really accessed before, and then report the
>>> page as not accessed, which is true.
>>
>> Yes, that's the patch's goal to make the accesses correct. However if
>> the PMD entry is not present, we can not get the correct page object by
>> pmd_pfn(*pmd), since the non-present pmd entry will contain swap type
>> and swap offset with below format on ARM64, that means the pfn number is
>> saved in bits 8-57 in a migration or poisoned entry, but pmd_pfn() still
>> treats bits 12-47 as the pfn number on ARM64, which may get an incorrect
>> page struct (also maybe is NULL by pfn_to_online_page()) to make the
>> access statistics incorrect.
>>
>> /*
>>  * Encode and decode a swap entry:
>>  *	bits 0-1:	present (must be zero)
>>  *	bits 2:		remember PG_anon_exclusive
>>  *	bits 3-7:	swap type
>>  *	bits 8-57:	swap offset
>>  *	bit  58:	PTE_PROT_NONE (must be zero)
>>  */
>>
>>
>> Moreover I don't think we should still waste time to get the page of
>> the non-present entry, just treat it as not-accessed and skip it, that
>> keeps consistent with non-present pte level entry.
>>
>> Does that make sense to you? Thanks.
>
> Yes, that totally makes sense. Thank you very much for the kind answer. I
> think it would be great if we could put the detailed explanation in the commit
> message. Could you please update the commit message and post v2 of the patch?

Sure, will update the commit message to make it more clear and I think
that can also answer Andrew's concern.

>
> Reviewed-by: SeongJae Park <sj@kernel.org>

Thanks.
> On Aug 17, 2022, at 14:21, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
> page, also including the case of non-present (migration or hwpoisoned)
> pmd entry on arm64 or x86 architectures. Thus we should validate if it
> is present before making the pmd entry old or getting young state,
> otherwise we can not get the correct corresponding page.
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/damon/vaddr.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> index 3c7b9d6..1d16c6c 100644
> --- a/mm/damon/vaddr.c
> +++ b/mm/damon/vaddr.c
> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>
>  	if (pmd_huge(*pmd)) {
>  		ptl = pmd_lock(walk->mm, pmd);
> +		if (!pmd_present(*pmd)) {

Unluckily, we should use pte_present here. See commit c9d398fa23788. We can use
huge_ptep_get() to get a hugetlb pte, so it’s better to put the check after
pmd_huge.

Cc Mike to make sure I am not missing something.

Muchun,
Thanks.

> +			spin_unlock(ptl);
> +			return 0;
> +		}
> +
>  		if (pmd_huge(*pmd)) {
>  			damon_pmdp_mkold(pmd, walk->mm, addr);
>  			spin_unlock(ptl);
> @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	if (pmd_huge(*pmd)) {
>  		ptl = pmd_lock(walk->mm, pmd);
> +		if (!pmd_present(*pmd)) {
> +			spin_unlock(ptl);
> +			return 0;
> +		}
> +
>  		if (!pmd_huge(*pmd)) {
>  			spin_unlock(ptl);
>  			goto regular_page;
> --
> 1.8.3.1
>
>
On 8/18/2022 10:41 AM, Muchun Song wrote:
>
>
>> On Aug 17, 2022, at 14:21, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>> page, also including the case of non-present (migration or hwpoisoned)
>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>> is present before making the pmd entry old or getting young state,
>> otherwise we can not get the correct corresponding page.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>>  mm/damon/vaddr.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
>> index 3c7b9d6..1d16c6c 100644
>> --- a/mm/damon/vaddr.c
>> +++ b/mm/damon/vaddr.c
>> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>>
>>  	if (pmd_huge(*pmd)) {
>>  		ptl = pmd_lock(walk->mm, pmd);
>> +		if (!pmd_present(*pmd)) {
>
> Unluckily, we should use pte_present here. See commit c9d398fa23788. We can use
> huge_ptep_get() to get a hugetlb pte, so it’s better to put the check after
> pmd_huge.

IMO this is not the case for hugetlb, and the hugetlb case will be
handled by damon_mkold_hugetlb_entry(), which already used pte_present()
for hugetlb case.

>
> Cc Mike to make sure I am not missing something.
>
> Muchun,
> Thanks.
>
>> +			spin_unlock(ptl);
>> +			return 0;
>> +		}
>> +
>>  		if (pmd_huge(*pmd)) {
>>  			damon_pmdp_mkold(pmd, walk->mm, addr);
>>  			spin_unlock(ptl);
>> @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>  	if (pmd_huge(*pmd)) {
>>  		ptl = pmd_lock(walk->mm, pmd);
>> +		if (!pmd_present(*pmd)) {
>> +			spin_unlock(ptl);
>> +			return 0;
>> +		}
>> +
>>  		if (!pmd_huge(*pmd)) {
>>  			spin_unlock(ptl);
>>  			goto regular_page;
>> --
>> 1.8.3.1
>>
>>
>
> On Aug 18, 2022, at 10:57, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>
>
> On 8/18/2022 10:41 AM, Muchun Song wrote:
>>> On Aug 17, 2022, at 14:21, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>>> page, also including the case of non-present (migration or hwpoisoned)
>>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>>> is present before making the pmd entry old or getting young state,
>>> otherwise we can not get the correct corresponding page.
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> ---
>>>  mm/damon/vaddr.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
>>> index 3c7b9d6..1d16c6c 100644
>>> --- a/mm/damon/vaddr.c
>>> +++ b/mm/damon/vaddr.c
>>> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>
>>>  	if (pmd_huge(*pmd)) {
>>>  		ptl = pmd_lock(walk->mm, pmd);
>>> +		if (!pmd_present(*pmd)) {
>> Unluckily, we should use pte_present here. See commit c9d398fa23788. We can use
>> huge_ptep_get() to get a hugetlb pte, so it’s better to put the check after
>> pmd_huge.
>
> IMO this is not the case for hugetlb, and the hugetlb case will be handled by damon_mkold_hugetlb_entry(), which already used pte_present() for hugetlb case.

Well, I thought it is hugetlb related since I saw the usage of pmd_huge. If it is THP case, why
not use pmd_trans_huge?

Thanks.

>
>> Cc Mike to make sure I am not missing something.
>> Muchun,
>> Thanks.
>>> +			spin_unlock(ptl);
>>> +			return 0;
>>> +		}
>>> +
>>>  		if (pmd_huge(*pmd)) {
>>>  			damon_pmdp_mkold(pmd, walk->mm, addr);
>>>  			spin_unlock(ptl);
>>> @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>  	if (pmd_huge(*pmd)) {
>>>  		ptl = pmd_lock(walk->mm, pmd);
>>> +		if (!pmd_present(*pmd)) {
>>> +			spin_unlock(ptl);
>>> +			return 0;
>>> +		}
>>> +
>>>  		if (!pmd_huge(*pmd)) {
>>>  			spin_unlock(ptl);
>>>  			goto regular_page;
>>> --
>>> 1.8.3.1
On 8/18/2022 11:39 AM, Muchun Song wrote:
>
>
>> On Aug 18, 2022, at 10:57, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 8/18/2022 10:41 AM, Muchun Song wrote:
>>>> On Aug 17, 2022, at 14:21, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>>
>>>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>>>> page, also including the case of non-present (migration or hwpoisoned)
>>>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>>>> is present before making the pmd entry old or getting young state,
>>>> otherwise we can not get the correct corresponding page.
>>>>
>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> ---
>>>>  mm/damon/vaddr.c | 10 ++++++++++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
>>>> index 3c7b9d6..1d16c6c 100644
>>>> --- a/mm/damon/vaddr.c
>>>> +++ b/mm/damon/vaddr.c
>>>> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>>
>>>>  	if (pmd_huge(*pmd)) {
>>>>  		ptl = pmd_lock(walk->mm, pmd);
>>>> +		if (!pmd_present(*pmd)) {
>>> Unluckily, we should use pte_present here. See commit c9d398fa23788. We can use
>>> huge_ptep_get() to get a hugetlb pte, so it’s better to put the check after
>>> pmd_huge.
>>
>> IMO this is not the case for hugetlb, and the hugetlb case will be handled by damon_mkold_hugetlb_entry(), which already used pte_present() for hugetlb case.
>
> Well, I thought it is hugetlb related since I saw the usage of pmd_huge. If it is THP case, why
> not use pmd_trans_huge?

IIUC, it can not guarantee the pmd is present if pmd_trans_huge()
returns true on all architectures, at least on X86, we still need
pmd_present() validation. So changing to pmd_trans_huge() does not make
code simpler from my side, and I prefer to keep this patch.

Maybe we can send another cleanup patch to replace pmd_huge() with
pmd_trans_huge() for THP case to make code more readable? What do you
think? Thanks.

>>
>>> Cc Mike to make sure I am not missing something.
>>> Muchun,
>>> Thanks.
>>>> +			spin_unlock(ptl);
>>>> +			return 0;
>>>> +		}
>>>> +
>>>>  		if (pmd_huge(*pmd)) {
>>>>  			damon_pmdp_mkold(pmd, walk->mm, addr);
>>>>  			spin_unlock(ptl);
>>>> @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>>  	if (pmd_huge(*pmd)) {
>>>>  		ptl = pmd_lock(walk->mm, pmd);
>>>> +		if (!pmd_present(*pmd)) {
>>>> +			spin_unlock(ptl);
>>>> +			return 0;
>>>> +		}
>>>> +
>>>>  		if (!pmd_huge(*pmd)) {
>>>>  			spin_unlock(ptl);
>>>>  			goto regular_page;
>>>> --
>>>> 1.8.3.1
> On Aug 18, 2022, at 13:07, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>
>
> On 8/18/2022 11:39 AM, Muchun Song wrote:
>>> On Aug 18, 2022, at 10:57, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>>
>>>
>>> On 8/18/2022 10:41 AM, Muchun Song wrote:
>>>>> On Aug 17, 2022, at 14:21, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>>>
>>>>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>>>>> page, also including the case of non-present (migration or hwpoisoned)
>>>>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>>>>> is present before making the pmd entry old or getting young state,
>>>>> otherwise we can not get the correct corresponding page.
>>>>>
>>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>>> ---
>>>>>  mm/damon/vaddr.c | 10 ++++++++++
>>>>>  1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
>>>>> index 3c7b9d6..1d16c6c 100644
>>>>> --- a/mm/damon/vaddr.c
>>>>> +++ b/mm/damon/vaddr.c
>>>>> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>>>
>>>>>  	if (pmd_huge(*pmd)) {
>>>>>  		ptl = pmd_lock(walk->mm, pmd);
>>>>> +		if (!pmd_present(*pmd)) {
>>>> Unluckily, we should use pte_present here. See commit c9d398fa23788. We can use
>>>> huge_ptep_get() to get a hugetlb pte, so it’s better to put the check after
>>>> pmd_huge.
>>>
>>> IMO this is not the case for hugetlb, and the hugetlb case will be handled by damon_mkold_hugetlb_entry(), which already used pte_present() for hugetlb case.
>> Well, I thought it is hugetlb related since I saw the usage of pmd_huge. If it is THP case, why
>> not use pmd_trans_huge?
>
> IIUC, it can not guarantee the pmd is present if pmd_trans_huge() returns true on all architectures, at least on X86, we still need pmd_present() validation. So changing to pmd_trans_huge() does not make code simpler from my side, and I prefer to keep this patch.

I am not suggesting you change it to pmd_trans_huge() in this patch, I am just expressing
my curiosity. At least, it is a little confusing to me.

>
> Maybe we can send another cleanup patch to replace pmd_huge() with pmd_trans_huge() for THP case to make code more readable? What do you think? Thanks.

Yep, makes sense to me.

Thanks.

>
>>>
>>>> Cc Mike to make sure I am not missing something.
>>>> Muchun,
>>>> Thanks.
>>>>> +			spin_unlock(ptl);
>>>>> +			return 0;
>>>>> +		}
>>>>> +
>>>>>  		if (pmd_huge(*pmd)) {
>>>>>  			damon_pmdp_mkold(pmd, walk->mm, addr);
>>>>>  			spin_unlock(ptl);
>>>>> @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>>>  	if (pmd_huge(*pmd)) {
>>>>>  		ptl = pmd_lock(walk->mm, pmd);
>>>>> +		if (!pmd_present(*pmd)) {
>>>>> +			spin_unlock(ptl);
>>>>> +			return 0;
>>>>> +		}
>>>>> +
>>>>>  		if (!pmd_huge(*pmd)) {
>>>>>  			spin_unlock(ptl);
>>>>>  			goto regular_page;
>>>>> --
>>>>> 1.8.3.1
On 8/18/2022 1:12 PM, Muchun Song wrote:
>
>
>> On Aug 18, 2022, at 13:07, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 8/18/2022 11:39 AM, Muchun Song wrote:
>>>> On Aug 18, 2022, at 10:57, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>>
>>>>
>>>>
>>>> On 8/18/2022 10:41 AM, Muchun Song wrote:
>>>>>> On Aug 17, 2022, at 14:21, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>>>>
>>>>>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>>>>>> page, also including the case of non-present (migration or hwpoisoned)
>>>>>> pmd entry on arm64 or x86 architectures. Thus we should validate if it
>>>>>> is present before making the pmd entry old or getting young state,
>>>>>> otherwise we can not get the correct corresponding page.
>>>>>>
>>>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>>>> ---
>>>>>>  mm/damon/vaddr.c | 10 ++++++++++
>>>>>>  1 file changed, 10 insertions(+)
>>>>>>
>>>>>> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
>>>>>> index 3c7b9d6..1d16c6c 100644
>>>>>> --- a/mm/damon/vaddr.c
>>>>>> +++ b/mm/damon/vaddr.c
>>>>>> @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>>>>
>>>>>>  	if (pmd_huge(*pmd)) {
>>>>>>  		ptl = pmd_lock(walk->mm, pmd);
>>>>>> +		if (!pmd_present(*pmd)) {
>>>>> Unluckily, we should use pte_present here. See commit c9d398fa23788. We can use
>>>>> huge_ptep_get() to get a hugetlb pte, so it’s better to put the check after
>>>>> pmd_huge.
>>>>
>>>> IMO this is not the case for hugetlb, and the hugetlb case will be handled by damon_mkold_hugetlb_entry(), which already used pte_present() for hugetlb case.
>>> Well, I thought it is hugetlb related since I saw the usage of pmd_huge. If it is THP case, why
>>> not use pmd_trans_huge?
>>
>> IIUC, it can not guarantee the pmd is present if pmd_trans_huge() returns true on all architectures, at least on X86, we still need pmd_present() validation. So changing to pmd_trans_huge() does not make code simpler from my side, and I prefer to keep this patch.
>
> I am not suggesting you change it to pmd_trans_huge() in this patch, I am just expressing
> my curiosity. At least, it is a little confusing to me.

OK.

>>
>> Maybe we can send another cleanup patch to replace pmd_huge() with pmd_trans_huge() for THP case to make code more readable? What do you think? Thanks.
>
> Yep, makes sense to me.

OK. I can add a cleanup patch in next version. Thanks for your input.
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 3c7b9d6..1d16c6c 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,

 	if (pmd_huge(*pmd)) {
 		ptl = pmd_lock(walk->mm, pmd);
+		if (!pmd_present(*pmd)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
 		if (pmd_huge(*pmd)) {
 			damon_pmdp_mkold(pmd, walk->mm, addr);
 			spin_unlock(ptl);
@@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	if (pmd_huge(*pmd)) {
 		ptl = pmd_lock(walk->mm, pmd);
+		if (!pmd_present(*pmd)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
 		if (!pmd_huge(*pmd)) {
 			spin_unlock(ptl);
 			goto regular_page;
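The control flow that the first hunk adds can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not the kernel implementation: all toy_* types and helpers are hypothetical stand-ins, a depth counter stands in for the spinlock returned by pmd_lock(), and clearing the young flag stands in for damon_pmdp_mkold().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state of one PMD entry. */
struct toy_pmd {
	bool huge;	/* what pmd_huge() would report */
	bool present;	/* what pmd_present() would report */
	bool young;	/* accessed bit */
};

/* Hypothetical stand-in for the page table lock; depth must end at 0. */
struct toy_ptl {
	int depth;
};

enum toy_result {
	TOY_SKIPPED,	/* non-present huge entry: treated as not accessed */
	TOY_AGED,	/* present huge entry: young bit cleared */
	TOY_WALK_PTES,	/* not huge: caller continues at PTE level */
};

static void toy_ptl_lock(struct toy_ptl *ptl)
{
	ptl->depth++;
}

static void toy_ptl_unlock(struct toy_ptl *ptl)
{
	assert(ptl->depth > 0);
	ptl->depth--;
}

/* Mirrors the shape of the patched huge-page branch in the mkold walker. */
static enum toy_result toy_mkold_pmd_entry(struct toy_pmd *pmd,
					   struct toy_ptl *ptl)
{
	if (pmd->huge) {
		toy_ptl_lock(ptl);
		/* Check added by the patch: bail out before touching the entry. */
		if (!pmd->present) {
			toy_ptl_unlock(ptl);
			return TOY_SKIPPED;
		}
		/* Re-check under the lock, as the existing code does. */
		if (pmd->huge) {
			pmd->young = false;
			toy_ptl_unlock(ptl);
			return TOY_AGED;
		}
		toy_ptl_unlock(ptl);
	}
	return TOY_WALK_PTES;
}
```

The point of the model is that every early return drops the lock first, and that a non-present huge entry is left untouched instead of being decoded as if it pointed at a page.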
The pmd_huge() is used to validate if the pmd entry is mapped by a huge
page, also including the case of non-present (migration or hwpoisoned)
pmd entry on arm64 or x86 architectures. Thus we should validate if it
is present before making the pmd entry old or getting young state,
otherwise we can not get the correct corresponding page.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/damon/vaddr.c | 10 ++++++++++
 1 file changed, 10 insertions(+)