Message ID | 58b1d1f5fbda7db49ca886d9ef6783e3dcbbbc98.1660805030.git.baolin.wang@linux.alibaba.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2,1/2] mm/damon: validate if the pmd entry is present before accessing | expand |
> On Aug 18, 2022, at 15:37, Baolin Wang <baolin.wang@linux.alibaba.com> wrote: > > The pmd_huge() is usually used to indicate if a pmd level hugetlb, > however a pmd mapped huge page can only be THP in damon_mkold_pmd_entry() > or damon_young_pmd_entry(), so replacing pmd_huge() with pmd_trans_huge() > in this case to make code more readable according to the discussion [1]. > > [1] https://lore.kernel.org/all/098c1480-416d-bca9-cedb-ca495df69b64@linux.alibaba.com/ > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Thanks.
On Thu, 18 Aug 2022 15:37:43 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote: > The pmd_huge() is used to validate if the pmd entry is mapped by a huge > page, also including the case of non-present (migration or hwpoisoned) > pmd entry on arm64 or x86 architectures. That means the pmd_pfn() can > not get the correct pfn number for the non-present pmd entry, which > will cause damon_get_page() to get an incorrect page struct (also > may be NULL by pfn_to_online_page()) to make the access statistics > incorrect. > > Moreover it does not make sense that we still waste time to get the > page of the non-present entry, just treat it as not-accessed and skip it, > that keeps consistent with non-present pte level entry. > > Thus adding a pmd entry present validation to fix above issues. > Do we have a Fixes: for this? What are the user-visible runtime effects of the bug? "make the access statistics incorrect" is rather vague. Do we feel that a cc:stable is warranted?
On 8/21/2022 5:17 AM, Andrew Morton wrote: > On Thu, 18 Aug 2022 15:37:43 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote: > >> The pmd_huge() is used to validate if the pmd entry is mapped by a huge >> page, also including the case of non-present (migration or hwpoisoned) >> pmd entry on arm64 or x86 architectures. That means the pmd_pfn() can >> not get the correct pfn number for the non-present pmd entry, which >> will cause damon_get_page() to get an incorrect page struct (also >> may be NULL by pfn_to_online_page()) to make the access statistics >> incorrect. >> >> Moreover it does not make sense that we still waste time to get the >> page of the non-present entry, just treat it as not-accessed and skip it, >> that keeps consistent with non-present pte level entry. >> >> Thus adding a pmd entry present validation to fix above issues. >> > > Do we have a Fixes: for this? OK, should be: Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces") > What are the user-visible runtime effects of the bug? "make the access > statistics incorrect" is rather vague. "access statistics incorrect" means that the DAMON may make incorrect decision according to the incorrect statistics, for example, DAMON may can not reclaim cold page in time due to this cold page was regarded as accessed mistakenly if DAMOS_PAGEOUT operation is specified. > Do we feel that a cc:stable is warranted? Though this is not a regular case, I think this patch is suitable to be backported to cover this unusual case. So please help to add a stable tag when you apply this patch, or please let me know if you want a new version with adding Fixes and stable tags. Thanks.
On Sun, 21 Aug 2022 13:22:42 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote: > > > On 8/21/2022 5:17 AM, Andrew Morton wrote: > > On Thu, 18 Aug 2022 15:37:43 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote: > > > >> The pmd_huge() is used to validate if the pmd entry is mapped by a huge > >> page, also including the case of non-present (migration or hwpoisoned) > >> pmd entry on arm64 or x86 architectures. That means the pmd_pfn() can > >> not get the correct pfn number for the non-present pmd entry, which > >> will cause damon_get_page() to get an incorrect page struct (also > >> may be NULL by pfn_to_online_page()) to make the access statistics > >> incorrect. > >> > >> Moreover it does not make sense that we still waste time to get the > >> page of the non-present entry, just treat it as not-accessed and skip it, > >> that keeps consistent with non-present pte level entry. > >> > >> Thus adding a pmd entry present validation to fix above issues. > >> > > > > Do we have a Fixes: for this? > > OK, should be: > Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual > memory address spaces") > > > What are the user-visible runtime effects of the bug? "make the access > > statistics incorrect" is rather vague. > > "access statistics incorrect" means that the DAMON may make incorrect > decision according to the incorrect statistics, for example, DAMON may > can not reclaim cold page in time due to this cold page was regarded as > accessed mistakenly if DAMOS_PAGEOUT operation is specified. > > > Do we feel that a cc:stable is warranted? > > Though this is not a regular case, I think this patch is suitable to be > backported to cover this unusual case. So please help to add a stable > tag when you apply this patch, or please let me know if you want a new > version with adding Fixes and stable tags. Thanks. Thanks, I took care of all that.
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c index 3c7b9d6..1d16c6c 100644 --- a/mm/damon/vaddr.c +++ b/mm/damon/vaddr.c @@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr, if (pmd_huge(*pmd)) { ptl = pmd_lock(walk->mm, pmd); + if (!pmd_present(*pmd)) { + spin_unlock(ptl); + return 0; + } + if (pmd_huge(*pmd)) { damon_pmdp_mkold(pmd, walk->mm, addr); spin_unlock(ptl); @@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr, #ifdef CONFIG_TRANSPARENT_HUGEPAGE if (pmd_huge(*pmd)) { ptl = pmd_lock(walk->mm, pmd); + if (!pmd_present(*pmd)) { + spin_unlock(ptl); + return 0; + } + if (!pmd_huge(*pmd)) { spin_unlock(ptl); goto regular_page;