diff mbox series

[PATCHv2,2/8] khugepaged: Do not stop collapse if less than half PTEs are referenced

Message ID 20200403112928.19742-3-kirill.shutemov@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series thp/khugepaged improvements and CoW semantics | expand

Commit Message

Kirill A. Shutemov April 3, 2020, 11:29 a.m. UTC
__collapse_huge_page_swapin() check number of referenced PTE to decide
if the memory range is hot enough to justify swapin.

The problem is that it stops collapse altogether if there's not enough
referenced pages, not only swappingin.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
Reviewed-by: Zi Yan <ziy@nvidia.com>
---
 mm/khugepaged.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Yang Shi April 6, 2020, 6:13 p.m. UTC | #1
On 4/3/20 4:29 AM, Kirill A. Shutemov wrote:
> __collapse_huge_page_swapin() check number of referenced PTE to decide
> if the memory range is hot enough to justify swapin.
>
> The problem is that it stops collapse altogether if there's not enough
> referenced pages, not only swappingin.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> ---
>   mm/khugepaged.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)

Acked-by: Yang Shi <yang.shi@linux.alibaba.com>

>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 99bab7e4d05b..14d7afc90786 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>   	/* we only decide to swapin, if there is enough young ptes */
>   	if (referenced < HPAGE_PMD_NR/2) {
>   		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> -		return false;
> +		/* Do not block collapse, only skip swapping in */
> +		return true;
>   	}
>   	vmf.pte = pte_offset_map(pmd, address);
>   	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
Ralph Campbell April 6, 2020, 7:53 p.m. UTC | #2
On 4/3/20 4:29 AM, Kirill A. Shutemov wrote:
> __collapse_huge_page_swapin() check number of referenced PTE to decide
> if the memory range is hot enough to justify swapin.
> 
> The problem is that it stops collapse altogether if there's not enough
> referenced pages, not only swappingin.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
> Reviewed-by: Zi Yan <ziy@nvidia.com> ---
>   mm/khugepaged.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 99bab7e4d05b..14d7afc90786 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>   	/* we only decide to swapin, if there is enough young ptes */
>   	if (referenced < HPAGE_PMD_NR/2) {
>   		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);

The trace point is recording the return value. Shouldn't you s/0/1 to match?

> -		return false;
> +		/* Do not block collapse, only skip swapping in */
> +		return true;
>   	}
>   	vmf.pte = pte_offset_map(pmd, address);
>   	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
> 

So "referenced < HPAGE_PMD_NR/2" means swapped out pages aren't faulted back in
but there could still be no swapped out pages, just "old" pages so collapse could
succeed. Seems like this check could have been made in khugepaged_scan_pmd() when
"referenced" is being computed and "unmapped" is available. Just skip setting
"ret = 1" if unmapped > 0 && referenced < HPAGE_PMD_NR/2.
Kirill A. Shutemov April 9, 2020, 1:34 p.m. UTC | #3
On Mon, Apr 06, 2020 at 12:53:00PM -0700, Ralph Campbell wrote:
> 
> On 4/3/20 4:29 AM, Kirill A. Shutemov wrote:
> > __collapse_huge_page_swapin() check number of referenced PTE to decide
> > if the memory range is hot enough to justify swapin.
> > 
> > The problem is that it stops collapse altogether if there's not enough
> > referenced pages, not only swappingin.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
> > Reviewed-by: Zi Yan <ziy@nvidia.com> ---
> >   mm/khugepaged.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 99bab7e4d05b..14d7afc90786 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
> >   	/* we only decide to swapin, if there is enough young ptes */
> >   	if (referenced < HPAGE_PMD_NR/2) {
> >   		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> 
> The trace point is recording the return value. Shouldn't you s/0/1 to match?

Good catch.

> 
> > -		return false;
> > +		/* Do not block collapse, only skip swapping in */
> > +		return true;
> >   	}
> >   	vmf.pte = pte_offset_map(pmd, address);
> >   	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
> > 
> 
> So "referenced < HPAGE_PMD_NR/2" means swapped out pages aren't faulted back in
> but there could still be no swapped out pages, just "old" pages so collapse could
> succeed. Seems like this check could have been made in khugepaged_scan_pmd() when
> "referenced" is being computed and "unmapped" is available. Just skip setting
> "ret = 1" if unmapped > 0 && referenced < HPAGE_PMD_NR/2.

Good point. I'll change the code to address it.
diff mbox series

Patch

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 99bab7e4d05b..14d7afc90786 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -905,7 +905,8 @@  static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	/* we only decide to swapin, if there is enough young ptes */
 	if (referenced < HPAGE_PMD_NR/2) {
 		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
-		return false;
+		/* Do not block collapse, only skip swapping in */
+		return true;
 	}
 	vmf.pte = pte_offset_map(pmd, address);
 	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;