Message ID | 20230613215346.1022773-2-peterx@redhat.com |
---|---|
State | New |
Series | mm/gup: Unify hugetlb, speed up thp |
On 06/13/23 17:53, Peter Xu wrote:
> Firstly, no_page_table() is meaningless for hugetlb, where it is a no-op,
> because a hugetlb page always satisfies:
>
>   - vma_is_anonymous() == false
>   - vma->vm_ops->fault != NULL
>
> So we can already safely remove it in hugetlb_follow_page_mask(), alongside
> the page* variable.
>
> Meanwhile, what we do in follow_hugetlb_page() actually makes sense for a
> dump: we try to fault in the page only if the page cache is already
> allocated. Let's do the same here for follow_page_mask() on hugetlb.
>
> So far this should have zero effect on real dumps, because those still go
> through follow_hugetlb_page(). But it may start to influence follow_page()
> users who mimic a "dump page" scenario, hopefully in a good way. This also
> paves the way for unifying the hugetlb gup-slow path.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/gup.c     | 9 ++-------
>  mm/hugetlb.c | 9 +++++++++
>  2 files changed, 11 insertions(+), 7 deletions(-)

Thanks Peter!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
On 13.06.23 23:53, Peter Xu wrote:
> Firstly, no_page_table() is meaningless for hugetlb, where it is a no-op,
> because a hugetlb page always satisfies:
>
>   - vma_is_anonymous() == false
>   - vma->vm_ops->fault != NULL
>
> So we can already safely remove it in hugetlb_follow_page_mask(), alongside
> the page* variable.
>
> Meanwhile, what we do in follow_hugetlb_page() actually makes sense for a
> dump: we try to fault in the page only if the page cache is already
> allocated. Let's do the same here for follow_page_mask() on hugetlb.
>
> So far this should have zero effect on real dumps, because those still go
> through follow_hugetlb_page(). But it may start to influence follow_page()
> users who mimic a "dump page" scenario, hopefully in a good way. This also
> paves the way for unifying the hugetlb gup-slow path.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/gup.c     | 9 ++-------
>  mm/hugetlb.c | 9 +++++++++
>  2 files changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index dbe96d266670..aa0668505d61 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -781,7 +781,6 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
>  			      struct follow_page_context *ctx)
>  {
>  	pgd_t *pgd;
> -	struct page *page;
>  	struct mm_struct *mm = vma->vm_mm;
>
>  	ctx->page_mask = 0;
> @@ -794,12 +793,8 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
>  	 * hugetlb_follow_page_mask is only for follow_page() handling here.
>  	 * Ordinary GUP uses follow_hugetlb_page for hugetlb processing.
>  	 */
> -	if (is_vm_hugetlb_page(vma)) {
> -		page = hugetlb_follow_page_mask(vma, address, flags);
> -		if (!page)
> -			page = no_page_table(vma, flags);
> -		return page;
> -	}
> +	if (is_vm_hugetlb_page(vma))
> +		return hugetlb_follow_page_mask(vma, address, flags);
>
>  	pgd = pgd_offset(mm, address);
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 270ec0ecd5a1..82dfdd96db4c 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6501,6 +6501,15 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
>  		spin_unlock(ptl);
>  out_unlock:
>  	hugetlb_vma_unlock_read(vma);
> +
> +	/*
> +	 * Fixup retval for dump requests: if pagecache doesn't exist,
> +	 * don't try to allocate a new page but just skip it.
> +	 */
> +	if (!page && (flags & FOLL_DUMP) &&
> +	    !hugetlbfs_pagecache_present(h, vma, address))
> +		page = ERR_PTR(-EFAULT);
> +
>  	return page;
>  }

Makes sense to me:

Reviewed-by: David Hildenbrand <david@redhat.com>
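The fixup being acked above hinges on hugetlbfs_pagecache_present(): the miss is only converted into -EFAULT when the hugetlbfs page cache has no page at that offset. As a rough illustration of what "page cache present" means here, a simplified sketch follows; the real helper is a static function in mm/hugetlb.c and its exact body varies across kernel versions.

```c
/*
 * Simplified sketch only -- the real hugetlbfs_pagecache_present() in
 * mm/hugetlb.c may differ in detail across kernel versions.  The idea:
 * look up the hugetlbfs page cache at the VMA offset for this address
 * and report whether a page is already there, without allocating one.
 */
static bool hugetlb_pagecache_present_sketch(struct hstate *h,
					     struct vm_area_struct *vma,
					     unsigned long address)
{
	struct address_space *mapping = vma->vm_file->f_mapping;
	pgoff_t idx = vma_hugecache_offset(h, vma, address);
	struct page *page = find_get_page(mapping, idx);

	if (page)
		put_page(page);
	return page != NULL;
}
```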
diff --git a/mm/gup.c b/mm/gup.c
index dbe96d266670..aa0668505d61 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -781,7 +781,6 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
 			      struct follow_page_context *ctx)
 {
 	pgd_t *pgd;
-	struct page *page;
 	struct mm_struct *mm = vma->vm_mm;

 	ctx->page_mask = 0;
@@ -794,12 +793,8 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
 	 * hugetlb_follow_page_mask is only for follow_page() handling here.
 	 * Ordinary GUP uses follow_hugetlb_page for hugetlb processing.
 	 */
-	if (is_vm_hugetlb_page(vma)) {
-		page = hugetlb_follow_page_mask(vma, address, flags);
-		if (!page)
-			page = no_page_table(vma, flags);
-		return page;
-	}
+	if (is_vm_hugetlb_page(vma))
+		return hugetlb_follow_page_mask(vma, address, flags);

 	pgd = pgd_offset(mm, address);

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 270ec0ecd5a1..82dfdd96db4c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6501,6 +6501,15 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 		spin_unlock(ptl);
 out_unlock:
 	hugetlb_vma_unlock_read(vma);
+
+	/*
+	 * Fixup retval for dump requests: if pagecache doesn't exist,
+	 * don't try to allocate a new page but just skip it.
+	 */
+	if (!page && (flags & FOLL_DUMP) &&
+	    !hugetlbfs_pagecache_present(h, vma, address))
+		page = ERR_PTR(-EFAULT);
+
 	return page;
 }
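To make the behavioural change concrete, here is a hypothetical sketch of a follow_page() caller that mimics the "dump page" scenario the commit message mentions; nothing below is from the patch itself, and dump_one_page() is an invented name.

```c
#include <linux/mm.h>
#include <linux/err.h>

/*
 * Hypothetical illustration only (dump_one_page is an invented name).
 * Before the patch, a hugetlb hole returned NULL here; after it, a hole
 * with FOLL_DUMP set returns ERR_PTR(-EFAULT), matching what
 * no_page_table() already does for other non-faultable mappings, so the
 * caller can skip the address without faulting a huge page in.
 */
static int dump_one_page(struct vm_area_struct *vma, unsigned long addr)
{
	struct page *page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);

	if (IS_ERR_OR_NULL(page))
		return 0;		/* leave a zero-filled hole */

	/* ... copy page contents into the dump here ... */
	put_page(page);
	return 0;
}
```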
Firstly, no_page_table() is meaningless for hugetlb, where it is a no-op, because a hugetlb page always satisfies:

  - vma_is_anonymous() == false
  - vma->vm_ops->fault != NULL

So we can already safely remove it in hugetlb_follow_page_mask(), alongside the page* variable.

Meanwhile, what we do in follow_hugetlb_page() actually makes sense for a dump: we try to fault in the page only if the page cache is already allocated. Let's do the same here for follow_page_mask() on hugetlb.

So far this should have zero effect on real dumps, because those still go through follow_hugetlb_page(). But it may start to influence follow_page() users who mimic a "dump page" scenario, hopefully in a good way. This also paves the way for unifying the hugetlb gup-slow path.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/gup.c     | 9 ++-------
 mm/hugetlb.c | 9 +++++++++
 2 files changed, 11 insertions(+), 7 deletions(-)
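For reference, the no-op claim can be checked against no_page_table() itself, paraphrased below from mm/gup.c around the time of this patch (comments added for this note). Since a hugetlb VMA is never anonymous and hugetlbfs always supplies vm_ops->fault, the condition can never be true, so the helper can only ever return NULL for hugetlb.

```c
/* Paraphrased from mm/gup.c around this patch; comments added here. */
static struct page *no_page_table(struct vm_area_struct *vma,
				  unsigned int flags)
{
	/*
	 * Only a dump request on a VMA that could never fault the page
	 * in is turned into -EFAULT, so get_dump_page() leaves a hole.
	 * For hugetlb both sub-conditions are false, hence the no-op.
	 */
	if ((flags & FOLL_DUMP) &&
	    (vma_is_anonymous(vma) || !vma->vm_ops->fault))
		return ERR_PTR(-EFAULT);
	return NULL;
}
```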