diff mbox series

[v2,1/2] mm/damon: validate if the pmd entry is present before accessing

Message ID 58b1d1f5fbda7db49ca886d9ef6783e3dcbbbc98.1660805030.git.baolin.wang@linux.alibaba.com (mailing list archive)
State New
Series [v2,1/2] mm/damon: validate if the pmd entry is present before accessing

Commit Message

Baolin Wang Aug. 18, 2022, 7:37 a.m. UTC
pmd_huge() checks whether a pmd entry maps a huge page, but on arm64 and
x86 it also returns true for a non-present (migration or hwpoisoned) pmd
entry. pmd_pfn() cannot return the correct pfn for such a non-present
entry, so damon_get_page() gets an incorrect page struct (or NULL from
pfn_to_online_page()), which makes the access statistics incorrect.

Moreover, it makes no sense to waste time getting the page of a
non-present entry. Just treat it as not accessed and skip it, which is
consistent with how non-present pte level entries are handled.

Thus add a pmd entry present check to fix the above issues.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
Changes from v1:
 - Update the commit message to make it more clear.
 - Add reviewed tag from SeongJae.
---
 mm/damon/vaddr.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
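
The problem described in the commit message can be sketched in userspace C.
This is only a model under stated assumptions: the bit layout, the
model_pmd_t type and every model_* helper are hypothetical and are not the
kernel's pmd_huge(), pmd_present() or pmd_pfn(). It shows that an entry
which looks huge but is not present yields a meaningless "pfn" unless the
walker checks for presence first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration only; no real architecture uses it. */
#define MODEL_PRESENT   (1ULL << 0)
#define MODEL_HUGE      (1ULL << 1)
#define MODEL_PFN_SHIFT 12

typedef struct { uint64_t val; } model_pmd_t;

/* A present huge mapping: the upper bits hold a real pfn. */
static model_pmd_t model_make_huge(uint64_t pfn)
{
	assert(pfn < (1ULL << 52));	/* must fit above MODEL_PFN_SHIFT */
	return (model_pmd_t){ (pfn << MODEL_PFN_SHIFT) | MODEL_PRESENT | MODEL_HUGE };
}

/* A non-present (migration-like) entry: the upper bits hold a payload, not a pfn. */
static model_pmd_t model_make_migration(uint64_t payload)
{
	assert(payload < (1ULL << 52));
	return (model_pmd_t){ (payload << MODEL_PFN_SHIFT) | MODEL_HUGE };
}

static bool model_pmd_present(model_pmd_t pmd) { return (pmd.val & MODEL_PRESENT) != 0; }
static bool model_pmd_huge(model_pmd_t pmd) { return (pmd.val & MODEL_HUGE) != 0; }
static uint64_t model_pmd_pfn(model_pmd_t pmd) { return pmd.val >> MODEL_PFN_SHIFT; }

/*
 * Decode a pfn from a huge entry. With check_present set, a non-present
 * entry is skipped (returns false), mirroring the check the patch adds.
 */
static bool model_huge_pfn(model_pmd_t pmd, bool check_present, uint64_t *pfn)
{
	if (!model_pmd_huge(pmd))
		return false;
	if (check_present && !model_pmd_present(pmd))
		return false;
	*pfn = model_pmd_pfn(pmd);
	return true;
}
```

Calling model_huge_pfn() on a migration-like entry with check_present
cleared hands back the payload as if it were a pfn. With check_present set
the entry is skipped, which corresponds to treating it as not accessed.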

Comments

Muchun Song Aug. 18, 2022, 9:07 a.m. UTC | #1
> On Aug 18, 2022, at 15:37, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
> The pmd_huge() is usually used to indicate if a pmd level hugetlb,
> however a pmd mapped huge page can only be THP in damon_mkold_pmd_entry()
> or damon_young_pmd_entry(), so replacing pmd_huge() with pmd_trans_huge()
> in this case to make code more readable according to the discussion [1].
> 
> [1] https://lore.kernel.org/all/098c1480-416d-bca9-cedb-ca495df69b64@linux.alibaba.com/
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.
Andrew Morton Aug. 20, 2022, 9:17 p.m. UTC | #2
On Thu, 18 Aug 2022 15:37:43 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
> page, also including the case of non-present (migration or hwpoisoned)
> pmd entry on arm64 or x86 architectures. That means the pmd_pfn() can
> not get the correct pfn number for the non-present pmd entry, which
> will cause damon_get_page() to get an incorrect page struct (also
> may be NULL by pfn_to_online_page()) to make the access statistics
> incorrect.
> 
> Moreover it does not make sense that we still waste time to get the
> page of the non-present entry, just treat it as not-accessed and skip it,
> that keeps consistent with non-present pte level entry.
> 
> Thus adding a pmd entry present validation to fix above issues.
> 

Do we have a Fixes: for this?

What are the user-visible runtime effects of the bug?  "make the access
statistics incorrect" is rather vague.

Do we feel that a cc:stable is warranted?
Baolin Wang Aug. 21, 2022, 5:22 a.m. UTC | #3
On 8/21/2022 5:17 AM, Andrew Morton wrote:
> On Thu, 18 Aug 2022 15:37:43 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
>> page, also including the case of non-present (migration or hwpoisoned)
>> pmd entry on arm64 or x86 architectures. That means the pmd_pfn() can
>> not get the correct pfn number for the non-present pmd entry, which
>> will cause damon_get_page() to get an incorrect page struct (also
>> may be NULL by pfn_to_online_page()) to make the access statistics
>> incorrect.
>>
>> Moreover it does not make sense that we still waste time to get the
>> page of the non-present entry, just treat it as not-accessed and skip it,
>> that keeps consistent with non-present pte level entry.
>>
>> Thus adding a pmd entry present validation to fix above issues.
>>
> 
> Do we have a Fixes: for this?

OK, should be:
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")

> What are the user-visible runtime effects of the bug?  "make the access
> statistics incorrect" is rather vague.

"access statistics incorrect" means that the DAMON may make incorrect 
decisions based on the incorrect statistics. For example, if the
DAMOS_PAGEOUT operation is specified, DAMON may fail to reclaim a cold
page in time because that page was mistakenly regarded as accessed.

> Do we feel that a cc:stable is warranted?

Though this is not a common case, I think this patch is suitable for
backporting. Please add a stable tag when you apply this patch, or let
me know if you want a new version with the Fixes and stable tags added.
Thanks.
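
The user-visible effect described in this reply can be illustrated with a
small userspace model. It is a sketch under assumptions, not DAMON code:
struct model_region and the model_* functions are hypothetical, and the
"no observed access" rule stands in for a pageout-style scheme that only
targets regions seen as cold. A region whose samples are wrongly reported
as accessed never qualifies, so it is not reclaimed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a monitored region. */
struct model_region {
	unsigned int nr_accesses;	/* samples in which the region looked accessed */
};

/* One sampling check: count the region as accessed if the entry looked young. */
static void model_sample(struct model_region *r, bool looked_young)
{
	if (looked_young)
		r->nr_accesses++;
}

/* A pageout-style rule that only targets regions with no observed access. */
static bool model_pageout_applies(const struct model_region *r)
{
	return r->nr_accesses == 0;
}

/* Run nr_samples checks with the same verdict and report whether the region counts as cold. */
static bool model_region_is_reclaimable(unsigned int nr_samples, bool looked_young)
{
	struct model_region r = { .nr_accesses = 0 };

	assert(nr_samples > 0);	/* a region is judged only after at least one sample */
	for (unsigned int i = 0; i < nr_samples; i++)
		model_sample(&r, looked_young);
	return model_pageout_applies(&r);
}
```

In this model a cold region sampled correctly is reclaimable, while the
same region sampled with a false "young" verdict is not.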
Andrew Morton Aug. 21, 2022, 5:46 a.m. UTC | #4
On Sun, 21 Aug 2022 13:22:42 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> 
> 
> On 8/21/2022 5:17 AM, Andrew Morton wrote:
> > On Thu, 18 Aug 2022 15:37:43 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> > 
> >> The pmd_huge() is used to validate if the pmd entry is mapped by a huge
> >> page, also including the case of non-present (migration or hwpoisoned)
> >> pmd entry on arm64 or x86 architectures. That means the pmd_pfn() can
> >> not get the correct pfn number for the non-present pmd entry, which
> >> will cause damon_get_page() to get an incorrect page struct (also
> >> may be NULL by pfn_to_online_page()) to make the access statistics
> >> incorrect.
> >>
> >> Moreover it does not make sense that we still waste time to get the
> >> page of the non-present entry, just treat it as not-accessed and skip it,
> >> that keeps consistent with non-present pte level entry.
> >>
> >> Thus adding a pmd entry present validation to fix above issues.
> >>
> > 
> > Do we have a Fixes: for this?
> 
> OK, should be:
> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual 
> memory address spaces")
> 
> > What are the user-visible runtime effects of the bug?  "make the access
> > statistics incorrect" is rather vague.
> 
> "access statistics incorrect" means that the DAMON may make incorrect 
> decision according to the incorrect statistics, for example, DAMON may 
> can not reclaim cold page in time due to this cold page was regarded as 
> accessed mistakenly if DAMOS_PAGEOUT operation is specified.
> 
> > Do we feel that a cc:stable is warranted?
> 
> Though this is not a regular case, I think this patch is suitable to be 
> backported to cover this unusual case. So please help to add a stable 
> tag when you apply this patch, or please let me know if you want a new 
> version with adding Fixes and stable tags. Thanks.

Thanks, I took care of all that.

Patch

diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 3c7b9d6..1d16c6c 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -304,6 +304,11 @@  static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
 
 	if (pmd_huge(*pmd)) {
 		ptl = pmd_lock(walk->mm, pmd);
+		if (!pmd_present(*pmd)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
 		if (pmd_huge(*pmd)) {
 			damon_pmdp_mkold(pmd, walk->mm, addr);
 			spin_unlock(ptl);
@@ -431,6 +436,11 @@  static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	if (pmd_huge(*pmd)) {
 		ptl = pmd_lock(walk->mm, pmd);
+		if (!pmd_present(*pmd)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
 		if (!pmd_huge(*pmd)) {
 			spin_unlock(ptl);
 			goto regular_page;