Message ID | 20200327170601.18563-4-kirill.shutemov@linux.intel.com |
---|---|
State | New, archived |
Series | thp/khugepaged improvements and CoW semantics |
On 27 Mar 2020, at 13:05, Kirill A. Shutemov wrote:

> __collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
> pagevec. It's pretty common for the swapin case: we swap in pages just to
> fail due to the extra pin.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/khugepaged.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 14d7afc90786..39e0994abeb8 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		 * The page must only be referenced by the scanned process
>  		 * and page swap cache.
>  		 */
> +		if (page_count(page) != 1 + PageSwapCache(page)) {
> +			/*
> +			 * Drain pagevec and retry just in case we can get rid
> +			 * of the extra pin, like in swapin case.
> +			 */
> +			lru_add_drain();
> +		}
>  		if (page_count(page) != 1 + PageSwapCache(page)) {
>  			unlock_page(page);
>  			result = SCAN_PAGE_COUNT;
>  			goto out;
>  		}
> +
>  		if (pte_write(pteval)) {
>  			writable = true;
>  		} else {
> --
> 2.26.0

Looks good to me. Is the added empty line intentional?

Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan Zi
On Fri, Mar 27, 2020 at 10:06 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> __collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
> pagevec. It's pretty common for the swapin case: we swap in pages just to
> fail due to the extra pin.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/khugepaged.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 14d7afc90786..39e0994abeb8 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		 * The page must only be referenced by the scanned process
>  		 * and page swap cache.
>  		 */
> +		if (page_count(page) != 1 + PageSwapCache(page)) {
> +			/*
> +			 * Drain pagevec and retry just in case we can get rid
> +			 * of the extra pin, like in swapin case.
> +			 */
> +			lru_add_drain();

This is definitely correct.

I'm wondering if we need one more lru_add_drain() before the PageLRU check
in khugepaged_scan_pmd() or not? The page might be in the lru cache and then
get skipped. This would improve the success rate.

Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>

> +		}
>  		if (page_count(page) != 1 + PageSwapCache(page)) {
>  			unlock_page(page);
>  			result = SCAN_PAGE_COUNT;
>  			goto out;
>  		}
> +
>  		if (pte_write(pteval)) {
>  			writable = true;
>  		} else {
> --
> 2.26.0
On Fri, Mar 27, 2020 at 01:34:20PM -0400, Zi Yan wrote:
> On 27 Mar 2020, at 13:05, Kirill A. Shutemov wrote:
>
> > __collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
> > pagevec. It's pretty common for the swapin case: we swap in pages just to
> > fail due to the extra pin.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  mm/khugepaged.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 14d7afc90786..39e0994abeb8 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >  		 * The page must only be referenced by the scanned process
> >  		 * and page swap cache.
> >  		 */
> > +		if (page_count(page) != 1 + PageSwapCache(page)) {
> > +			/*
> > +			 * Drain pagevec and retry just in case we can get rid
> > +			 * of the extra pin, like in swapin case.
> > +			 */
> > +			lru_add_drain();
> > +		}
> >  		if (page_count(page) != 1 + PageSwapCache(page)) {
> >  			unlock_page(page);
> >  			result = SCAN_PAGE_COUNT;
> >  			goto out;
> >  		}
> > +
> >  		if (pte_write(pteval)) {
> >  			writable = true;
> >  		} else {
> > --
> > 2.26.0
>
> Looks good to me. Is the added empty line intentional?

Yes. It groups the try and the retry together.
On Fri, Mar 27, 2020 at 11:10:40AM -0700, Yang Shi wrote:
> On Fri, Mar 27, 2020 at 10:06 AM Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> >
> > __collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
> > pagevec. It's pretty common for the swapin case: we swap in pages just to
> > fail due to the extra pin.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  mm/khugepaged.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 14d7afc90786..39e0994abeb8 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> >  		 * The page must only be referenced by the scanned process
> >  		 * and page swap cache.
> >  		 */
> > +		if (page_count(page) != 1 + PageSwapCache(page)) {
> > +			/*
> > +			 * Drain pagevec and retry just in case we can get rid
> > +			 * of the extra pin, like in swapin case.
> > +			 */
> > +			lru_add_drain();
>
> This is definitely correct.
>
> I'm wondering if we need one more lru_add_drain() before the PageLRU check
> in khugepaged_scan_pmd() or not? The page might be in the lru cache and then
> get skipped. This would improve the success rate.

Could you elaborate on the scenario? I don't follow.
On Sat, Mar 28, 2020 at 5:18 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Fri, Mar 27, 2020 at 11:10:40AM -0700, Yang Shi wrote:
> > On Fri, Mar 27, 2020 at 10:06 AM Kirill A. Shutemov
> > <kirill@shutemov.name> wrote:
> > >
> > > __collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
> > > pagevec. It's pretty common for the swapin case: we swap in pages just to
> > > fail due to the extra pin.
> > >
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > ---
> > >  mm/khugepaged.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 14d7afc90786..39e0994abeb8 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> > >  		 * The page must only be referenced by the scanned process
> > >  		 * and page swap cache.
> > >  		 */
> > > +		if (page_count(page) != 1 + PageSwapCache(page)) {
> > > +			/*
> > > +			 * Drain pagevec and retry just in case we can get rid
> > > +			 * of the extra pin, like in swapin case.
> > > +			 */
> > > +			lru_add_drain();
> >
> > This is definitely correct.
> >
> > I'm wondering if we need one more lru_add_drain() before the PageLRU check
> > in khugepaged_scan_pmd() or not? The page might be in the lru cache and then
> > get skipped. This would improve the success rate.
>
> Could you elaborate on the scenario? I don't follow.

I mean the below change:

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1195,6 +1195,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			goto out_unmap;
 		}
 		khugepaged_node_load[node]++;
+		if (!PageLRU(page))
+			/* Flush the page out of lru cache */
+			lru_add_drain();
 		if (!PageLRU(page)) {
 			result = SCAN_PAGE_LRU;
 			goto out_unmap;

If the page is not on LRU we can't even reach collapse_huge_page(), right?

>
> --
>  Kirill A. Shutemov
On Mon, Mar 30, 2020 at 11:30:14AM -0700, Yang Shi wrote:
> On Sat, Mar 28, 2020 at 5:18 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >
> > On Fri, Mar 27, 2020 at 11:10:40AM -0700, Yang Shi wrote:
> > > On Fri, Mar 27, 2020 at 10:06 AM Kirill A. Shutemov
> > > <kirill@shutemov.name> wrote:
> > > >
> > > > __collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
> > > > pagevec. It's pretty common for the swapin case: we swap in pages just to
> > > > fail due to the extra pin.
> > > >
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > ---
> > > >  mm/khugepaged.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > > index 14d7afc90786..39e0994abeb8 100644
> > > > --- a/mm/khugepaged.c
> > > > +++ b/mm/khugepaged.c
> > > > @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> > > >  		 * The page must only be referenced by the scanned process
> > > >  		 * and page swap cache.
> > > >  		 */
> > > > +		if (page_count(page) != 1 + PageSwapCache(page)) {
> > > > +			/*
> > > > +			 * Drain pagevec and retry just in case we can get rid
> > > > +			 * of the extra pin, like in swapin case.
> > > > +			 */
> > > > +			lru_add_drain();
> > >
> > > This is definitely correct.
> > >
> > > I'm wondering if we need one more lru_add_drain() before the PageLRU check
> > > in khugepaged_scan_pmd() or not? The page might be in the lru cache and then
> > > get skipped. This would improve the success rate.
> >
> > Could you elaborate on the scenario? I don't follow.
>
> I mean the below change:
>
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1195,6 +1195,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
>  			goto out_unmap;
>  		}
>  		khugepaged_node_load[node]++;
> +		if (!PageLRU(page))
> +			/* Flush the page out of lru cache */
> +			lru_add_drain();
>  		if (!PageLRU(page)) {
>  			result = SCAN_PAGE_LRU;
>  			goto out_unmap;
>
> If the page is not on LRU we can't even reach collapse_huge_page(), right?

Yeah. I've achieved the same by doing lru_add_drain_all() once per
khugepaged_do_scan(). It is more effective than lru_add_drain() inside
khugepaged_scan_pmd() and shouldn't have too much overhead. The
lru_add_drain() from this patch has moved into the swapin routine and is
called only on success.
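[Editorial illustration, not part of the thread: a rough sketch of where the two calls Kirill describes could land, based only on the description above and the 5.6-era mm/khugepaged.c. The hunk placement, the context lines, and the "if (swapped_in)" guard are assumptions, not his actual follow-up patch.]

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ ... @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	vmf.pte--;
 	pte_unmap(vmf.pte);
+
+	/*
+	 * Sketch: drain the per-CPU LRU add pagevec once swapin succeeded,
+	 * so the swapped-in pages do not carry the extra pagevec reference
+	 * into the page_count() check in __collapse_huge_page_isolate().
+	 */
+	if (swapped_in)
+		lru_add_drain();
+
 	trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 1);
 	return true;
 }
@@ ... @@ static void khugepaged_do_scan(void)
+	/* Sketch: flush all per-CPU LRU pagevecs once per scan pass. */
+	lru_add_drain_all();
+
 	while (progress < pages) {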
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 14d7afc90786..39e0994abeb8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		 * The page must only be referenced by the scanned process
 		 * and page swap cache.
 		 */
+		if (page_count(page) != 1 + PageSwapCache(page)) {
+			/*
+			 * Drain pagevec and retry just in case we can get rid
+			 * of the extra pin, like in swapin case.
+			 */
+			lru_add_drain();
+		}
 		if (page_count(page) != 1 + PageSwapCache(page)) {
 			unlock_page(page);
 			result = SCAN_PAGE_COUNT;
 			goto out;
 		}
+
 		if (pte_write(pteval)) {
 			writable = true;
 		} else {
__collapse_huge_page_isolate() may fail due to an extra pin in the LRU add
pagevec. It's pretty common for the swapin case: we swap in pages just to
fail due to the extra pin.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/khugepaged.c | 8 ++++++++
 1 file changed, 8 insertions(+)
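[Editorial annotation, not part of the patch: the arithmetic that trips the isolate check during swapin, assuming a page freshly swapped in by do_swap_page() and still sitting in the per-CPU LRU add pagevec.]

/*
 * References held on such a page before the LRU add pagevec is drained:
 *
 *   1 - the PTE of the scanned process (the mapping being collapsed)
 *   1 - the swap cache (PageSwapCache(page) == 1)
 *   1 - the LRU add pagevec, held until lru_add_drain()
 *
 * So page_count(page) == 3 while the check expects
 * 1 + PageSwapCache(page) == 2, and collapse bails out with
 * SCAN_PAGE_COUNT. Draining the pagevec drops the third reference and
 * lets the retry in the hunk above succeed.
 */
if (page_count(page) != 1 + PageSwapCache(page))
	result = SCAN_PAGE_COUNT;	/* what happens without the drain */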