Message ID | 20211116235733.3774702-1-almasrymina@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v1] hugetlb, userfaultfd: Fix reservation restore on userfaultfd error | expand |
Hi Mike, On Tue, Nov 16, 2021 at 3:57 PM Mina Almasry <almasrymina@google.com> wrote: > > Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we > bail out using "goto out_release_unlock;" in the cases where idx >= > size, or !huge_pte_none(), the code will detect that new_pagecache_page > == false, and so call restore_reserve_on_error(). > In this case I see restore_reserve_on_error() delete the reservation, > and the following call to remove_inode_hugepages() will increment > h->resv_hugepages causing a 100% reproducible leak. > Attached is the .c file with the 100% repro. > We should treat the is_continue case similar to adding a page into the > pagecache and set new_pagecache_page to true, to indicate that there is > no reservation to restore on the error path, and we need not call > restore_reserve_on_error(). > > Cc: Wei Xu <weixugc@google.com> > > Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") > Signed-off-by: Mina Almasry <almasrymina@google.com> > Reported-by: James Houghton <jthoughton@google.com> Not sure if this is a Cc: stable issue. If it is, I can add in v2. > --- > mm/hugetlb.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e09159c957e3..25a7a3d84607 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -5741,6 +5741,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > page = find_lock_page(mapping, idx); > if (!page) > goto out; > + /* > + * Set new_pagecache_page to true, as we've added a page to the > + * pagecache, but userfaultfd hasn't set up a mapping for this > + * page yet. If we bail out before setting up the mapping, we > + * want to indicate to restore_reserve_on_error() that we've > + * added the page to the page cache. > + */ > + new_pagecache_page = true; > } else if (!*pagep) { > /* If a page already exists, then it's UFFDIO_COPY for > * a non-missing case. Return -EEXIST. > -- > 2.34.0.rc1.387.gb447b232ab-goog
On Tue, 16 Nov 2021 15:57:32 -0800 Mina Almasry <almasrymina@google.com> wrote: > Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we > bail out using "goto out_release_unlock;" in the cases where idx >= > size, or !huge_pte_none(), the code will detect that new_pagecache_page > == false, and so call restore_reserve_on_error(). > In this case I see restore_reserve_on_error() delete the reservation, > and the following call to remove_inode_hugepages() will increment > h->resv_hugepages causing a 100% reproducible leak. > > We should treat the is_continue case similar to adding a page into the > pagecache and set new_pagecache_page to true, to indicate that there is > no reservation to restore on the error path, and we need not call > restore_reserve_on_error(). > > Cc: Wei Xu <weixugc@google.com> > > Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") I added cc:stable to this.
Subject: Re: [PATCH v1] hugetlb, userfaultfd: Fix reservation restore on userfaultfd error To: Mina Almasry <almasrymina@google.com>, Andrew Morton <akpm@linux-foundation.org> Cc: Wei Xu <weixugc@google.com>, James Houghton <jthoughton@google.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org Bcc: -=-=-=-=-=-=-=-=-=# Don't remove this line #=-=-=-=-=-=-=-=-=- On 11/16/21 3:57 PM, Mina Almasry wrote: > Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we > bail out using "goto out_release_unlock;" in the cases where idx >= > size, or !huge_pte_none(), the code will detect that new_pagecache_page > == false, and so call restore_reserve_on_error(). > In this case I see restore_reserve_on_error() delete the reservation, > and the following call to remove_inode_hugepages() will increment > h->resv_hugepages causing a 100% reproducible leak. > > We should treat the is_continue case similar to adding a page into the > pagecache and set new_pagecache_page to true, to indicate that there is > no reservation to restore on the error path, and we need not call > restore_reserve_on_error(). > > Cc: Wei Xu <weixugc@google.com> > > Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") > Signed-off-by: Mina Almasry <almasrymina@google.com> > Reported-by: James Houghton <jthoughton@google.com> Thanks Mina and James! Technically, the issue was introduced by commit 846be08578ed. See the 'Note on Fixes tag' in c7b1850dfb41. It is true that commit c7b1850dfb41 should have taken the 'is_continue' case into account when deciding whether or not to call restore_reserve_on_error. However, this issue first showed up with 846be08578ed. But, this patch depends on c7b1850dfb41 so I think c7b1850dfb41 it best for the Fixes tag. > --- > mm/hugetlb.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e09159c957e3..25a7a3d84607 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -5741,6 +5741,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > page = find_lock_page(mapping, idx); > if (!page) > goto out; > + /* > + * Set new_pagecache_page to true, as we've added a page to the > + * pagecache, but userfaultfd hasn't set up a mapping for this We did not add the the page to the pagecache. Rather, this is the case where the page already exists in the cache. Right? > + * page yet. If we bail out before setting up the mapping, we > + * want to indicate to restore_reserve_on_error() that we've > + * added the page to the page cache. > + */ > + new_pagecache_page = true; How about changing the variable name new_pagecache_page to page_in_pagecache? Then it makes sense both here and below when actually adding to the cache. I think we could then drop the above comment.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e09159c957e3..25a7a3d84607 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5741,6 +5741,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, page = find_lock_page(mapping, idx); if (!page) goto out; + /* + * Set new_pagecache_page to true, as we've added a page to the + * pagecache, but userfaultfd hasn't set up a mapping for this + * page yet. If we bail out before setting up the mapping, we + * want to indicate to restore_reserve_on_error() that we've + * added the page to the page cache. + */ + new_pagecache_page = true; } else if (!*pagep) { /* If a page already exists, then it's UFFDIO_COPY for * a non-missing case. Return -EEXIST.
Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we bail out using "goto out_release_unlock;" in the cases where idx >= size, or !huge_pte_none(), the code will detect that new_pagecache_page == false, and so call restore_reserve_on_error(). In this case I see restore_reserve_on_error() delete the reservation, and the following call to remove_inode_hugepages() will increment h->resv_hugepages causing a 100% reproducible leak. We should treat the is_continue case similar to adding a page into the pagecache and set new_pagecache_page to true, to indicate that there is no reservation to restore on the error path, and we need not call restore_reserve_on_error(). Cc: Wei Xu <weixugc@google.com> Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") Signed-off-by: Mina Almasry <almasrymina@google.com> Reported-by: James Houghton <jthoughton@google.com> --- mm/hugetlb.c | 8 ++++++++ 1 file changed, 8 insertions(+) -- 2.34.0.rc1.387.gb447b232ab-goog