Message ID | 20200403112928.19742-3-kirill.shutemov@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | thp/khugepaged improvements and CoW semantics
On 4/3/20 4:29 AM, Kirill A. Shutemov wrote:
> __collapse_huge_page_swapin() check number of referenced PTE to decide
> if the memory range is hot enough to justify swapin.
>
> The problem is that it stops collapse altogether if there's not enough
> referenced pages, not only swappingin.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/khugepaged.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Acked-by: Yang Shi <yang.shi@linux.alibaba.com>

>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 99bab7e4d05b..14d7afc90786 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>  	/* we only decide to swapin, if there is enough young ptes */
>  	if (referenced < HPAGE_PMD_NR/2) {
>  		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> -		return false;
> +		/* Do not block collapse, only skip swapping in */
> +		return true;
>  	}
>  	vmf.pte = pte_offset_map(pmd, address);
>  	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
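The "stops collapse altogether" part refers to the call site in collapse_huge_page(), which abandons the whole collapse attempt when __collapse_huge_page_swapin() returns false, roughly along these lines (a condensed paraphrase of the v5.6-era caller, not the verbatim kernel code):

	if (!__collapse_huge_page_swapin(mm, vma, address, pmd, referenced)) {
		/* Swapin refused or failed: drop mmap_sem and give up on this range. */
		up_read(&mm->mmap_sem);
		goto out_nolock;
	}

With the patch, a "not hot enough" range no longer takes this bail-out path; only the swapin step is skipped.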
On 4/3/20 4:29 AM, Kirill A. Shutemov wrote:
> __collapse_huge_page_swapin() check number of referenced PTE to decide
> if the memory range is hot enough to justify swapin.
>
> The problem is that it stops collapse altogether if there's not enough
> referenced pages, not only swappingin.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/khugepaged.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 99bab7e4d05b..14d7afc90786 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>  	/* we only decide to swapin, if there is enough young ptes */
>  	if (referenced < HPAGE_PMD_NR/2) {
>  		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);

The trace point is recording the return value. Shouldn't you s/0/1 to match?

> -		return false;
> +		/* Do not block collapse, only skip swapping in */
> +		return true;
>  	}
>  	vmf.pte = pte_offset_map(pmd, address);
>  	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
>

So "referenced < HPAGE_PMD_NR/2" means swapped out pages aren't faulted back in
but there could still be no swapped out pages, just "old" pages so collapse could
succeed. Seems like this check could have been made in khugepaged_scan_pmd() when
"referenced" is being computed and "unmapped" is available. Just skip setting
"ret = 1" if unmapped > 0 && referenced < HPAGE_PMD_NR/2.
On Mon, Apr 06, 2020 at 12:53:00PM -0700, Ralph Campbell wrote:
>
> On 4/3/20 4:29 AM, Kirill A. Shutemov wrote:
> > __collapse_huge_page_swapin() check number of referenced PTE to decide
> > if the memory range is hot enough to justify swapin.
> >
> > The problem is that it stops collapse altogether if there's not enough
> > referenced pages, not only swappingin.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: 0db501f7a34c ("mm, thp: convert from optimistic swapin collapsing to conservative")
> > Reviewed-by: Zi Yan <ziy@nvidia.com>
> > ---
> >  mm/khugepaged.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 99bab7e4d05b..14d7afc90786 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
> >  	/* we only decide to swapin, if there is enough young ptes */
> >  	if (referenced < HPAGE_PMD_NR/2) {
> >  		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>
> The trace point is recording the return value. Shouldn't you s/0/1 to match?

Good catch.

> > -		return false;
> > +		/* Do not block collapse, only skip swapping in */
> > +		return true;
> >  	}
> >  	vmf.pte = pte_offset_map(pmd, address);
> >  	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
> >
>
> So "referenced < HPAGE_PMD_NR/2" means swapped out pages aren't faulted back in
> but there could still be no swapped out pages, just "old" pages so collapse could
> succeed. Seems like this check could have been made in khugepaged_scan_pmd() when
> "referenced" is being computed and "unmapped" is available. Just skip setting
> "ret = 1" if unmapped > 0 && referenced < HPAGE_PMD_NR/2.

Good point. I'll change the code to address it.
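For illustration, folding Ralph's suggestion into khugepaged_scan_pmd()'s final decision could look roughly like the sketch below, reusing the writable, referenced and unmapped counters that function already maintains. This is only a sketch of the idea, not the follow-up patch Kirill refers to:

	if (!writable) {
		result = SCAN_PAGE_RO;
	} else if (!referenced ||
		   (unmapped && referenced < HPAGE_PMD_NR / 2)) {
		/*
		 * Too few young PTEs. If swapped-out PTEs are present,
		 * swapin would be skipped and the collapse could not
		 * succeed, so don't set ret and don't attempt it at all.
		 */
		result = SCAN_LACK_REFERENCED_PAGE;
	} else {
		result = SCAN_SUCCEED;
		ret = 1;
	}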
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 99bab7e4d05b..14d7afc90786 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -905,7 +905,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	/* we only decide to swapin, if there is enough young ptes */
 	if (referenced < HPAGE_PMD_NR/2) {
 		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
-		return false;
+		/* Do not block collapse, only skip swapping in */
+		return true;
 	}
 	vmf.pte = pte_offset_map(pmd, address);
 	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
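Taking the tracepoint comment from the review into account, the early-exit branch would presumably report the value it now returns; a sketch of how the hunk might read after addressing both points (not the posted patch):

	/* we only decide to swapin, if there is enough young ptes */
	if (referenced < HPAGE_PMD_NR/2) {
		/* Pass 1 so the tracepoint matches the new return value. */
		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 1);
		/* Do not block collapse, only skip swapping in */
		return true;
	}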