Message ID | 20210710002441.167759-4-mike.kravetz@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | hugetlb: fix potential ref counting races | expand |
On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz <mike.kravetz@oracle.com> wrote: > > When removing a hugetlb page from the pool the ref count is set to > one (as the free page has no ref count) and compound page destructor > is set to NULL_COMPOUND_DTOR. Since a subsequent call to free the > hugetlb page will call __free_pages for non-gigantic pages and > free_gigantic_page for gigantic pages the destructor is not used. > > However, consider the following race with code taking a speculative > reference on the page: > > Thread 0 Thread 1 > -------- -------- > remove_hugetlb_page > set_page_refcounted(page); > set_compound_page_dtor(page, > NULL_COMPOUND_DTOR); > get_page_unless_zero(page) > __update_and_free_page > __free_pages(page, > huge_page_order(h)); > > /* Note that __free_pages() will simply drop > the reference to the page. */ > > put_page(page) > __put_compound_page() > destroy_compound_page > NULL_COMPOUND_DTOR > BUG: kernel NULL pointer > dereference, address: > 0000000000000000 > > To address this race, set the dtor to the normal compound page dtor > for non-gigantic pages. The dtor for gigantic pages does not matter > as gigantic pages are changed from a compound page to 'just a group of > pages' before freeing. Hence, the destructor is not used. > > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> > --- > mm/hugetlb.c | 22 +++++++++++++++++++++- > 1 file changed, 21 insertions(+), 1 deletion(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 3132c7395743..fa8ec2072949 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1370,8 +1370,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page, > h->surplus_huge_pages_node[nid]--; > } > > + /* > + * Very subtle > + * > + * For non-gigantic pages set the destructor to the normal compound > + * page dtor. This is needed in case someone takes an additional > + * temporary ref to the page, and freeing is delayed until they drop > + * their reference. > + * > + * For gigantic pages set the destructor to the null dtor. This > + * destructor will never be called. Before freeing the gigantic > + * page destroy_compound_gigantic_page will turn the compound page > + * into a simple group of pages. After this the destructor does not > + * apply. > + * > + * This handles the case where more than one ref is held when and > + * after update_and_free_page is called. > + */ > set_page_refcounted(page); > - set_compound_page_dtor(page, NULL_COMPOUND_DTOR); > + if (hstate_is_gigantic(h)) > + set_compound_page_dtor(page, NULL_COMPOUND_DTOR); > + else > + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR); Hi Mike, The race is really subtle. But we also should remove the WARN from free_contig_range, right? Because the refcount of the head page of the gigantic page can be greater than one, but free_contig_range has the following warning. WARN(count != 0, "%lu pages are still in use!\n", count); Thanks. > > h->nr_huge_pages--; > h->nr_huge_pages_node[nid]--; > -- > 2.31.1 >
On 7/14/21 3:57 AM, Muchun Song wrote: > On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz <mike.kravetz@oracle.com> wrote: >> + /* >> + * Very subtle >> + * >> + * For non-gigantic pages set the destructor to the normal compound >> + * page dtor. This is needed in case someone takes an additional >> + * temporary ref to the page, and freeing is delayed until they drop >> + * their reference. >> + * >> + * For gigantic pages set the destructor to the null dtor. This >> + * destructor will never be called. Before freeing the gigantic >> + * page destroy_compound_gigantic_page will turn the compound page >> + * into a simple group of pages. After this the destructor does not >> + * apply. >> + * >> + * This handles the case where more than one ref is held when and >> + * after update_and_free_page is called. >> + */ >> set_page_refcounted(page); >> - set_compound_page_dtor(page, NULL_COMPOUND_DTOR); >> + if (hstate_is_gigantic(h)) >> + set_compound_page_dtor(page, NULL_COMPOUND_DTOR); >> + else >> + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR); > > Hi Mike, > > The race is really subtle. But we also should remove the WARN from > free_contig_range, right? Because the refcount of the head page of > the gigantic page can be greater than one, but free_contig_range has > the following warning. > > WARN(count != 0, "%lu pages are still in use!\n", count); > I did hit that warning in my testing and thought about removing it. However, I decided to keep it because non-hugetlb code also makes use of alloc_contig_range/free_contig_range and it might be useful in those cases. My 'guess' is that the warning was added not because of temporary ref count increases but rather to point out any code that forgot to drop a reference. BTW - It is not just the 'head' page which could trigger this warning, but any 'tail' page as well. That is because we do not call free_contig_range with a compound page, but rather a group of pages all with ref count of at least one. I'm happy to remove the warning if people do not think it is generally useful.
On Thu, Jul 15, 2021 at 1:39 AM Mike Kravetz <mike.kravetz@oracle.com> wrote: > > On 7/14/21 3:57 AM, Muchun Song wrote: > > On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz <mike.kravetz@oracle.com> wrote: > >> + /* > >> + * Very subtle > >> + * > >> + * For non-gigantic pages set the destructor to the normal compound > >> + * page dtor. This is needed in case someone takes an additional > >> + * temporary ref to the page, and freeing is delayed until they drop > >> + * their reference. > >> + * > >> + * For gigantic pages set the destructor to the null dtor. This > >> + * destructor will never be called. Before freeing the gigantic > >> + * page destroy_compound_gigantic_page will turn the compound page > >> + * into a simple group of pages. After this the destructor does not > >> + * apply. > >> + * > >> + * This handles the case where more than one ref is held when and > >> + * after update_and_free_page is called. > >> + */ > >> set_page_refcounted(page); > >> - set_compound_page_dtor(page, NULL_COMPOUND_DTOR); > >> + if (hstate_is_gigantic(h)) > >> + set_compound_page_dtor(page, NULL_COMPOUND_DTOR); > >> + else > >> + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR); > > > > Hi Mike, > > > > The race is really subtle. But we also should remove the WARN from > > free_contig_range, right? Because the refcount of the head page of > > the gigantic page can be greater than one, but free_contig_range has > > the following warning. > > > > WARN(count != 0, "%lu pages are still in use!\n", count); > > > > I did hit that warning in my testing and thought about removing it. > However, I decided to keep it because non-hugetlb code also makes use of > alloc_contig_range/free_contig_range and it might be useful in those > cases. > > My 'guess' is that the warning was added not because of temporary ref > count increases but rather to point out any code that forgot to drop a > reference. Got it. At least this patch looks good to me. So Reviewed-by: Muchun Song <songmuchun@bytedance.com> > > BTW - It is not just the 'head' page which could trigger this warning, but > any 'tail' page as well. That is because we do not call free_contig_range > with a compound page, but rather a group of pages all with ref count of > at least one. Right. > > I'm happy to remove the warning if people do not think it is generally > useful. For me, I suggest removing it. If someone has any ideas, please let us know. > -- > Mike Kravetz
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 3132c7395743..fa8ec2072949 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1370,8 +1370,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page, h->surplus_huge_pages_node[nid]--; } + /* + * Very subtle + * + * For non-gigantic pages set the destructor to the normal compound + * page dtor. This is needed in case someone takes an additional + * temporary ref to the page, and freeing is delayed until they drop + * their reference. + * + * For gigantic pages set the destructor to the null dtor. This + * destructor will never be called. Before freeing the gigantic + * page destroy_compound_gigantic_page will turn the compound page + * into a simple group of pages. After this the destructor does not + * apply. + * + * This handles the case where more than one ref is held when and + * after update_and_free_page is called. + */ set_page_refcounted(page); - set_compound_page_dtor(page, NULL_COMPOUND_DTOR); + if (hstate_is_gigantic(h)) + set_compound_page_dtor(page, NULL_COMPOUND_DTOR); + else + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR); h->nr_huge_pages--; h->nr_huge_pages_node[nid]--;
When removing a hugetlb page from the pool the ref count is set to one (as the free page has no ref count) and compound page destructor is set to NULL_COMPOUND_DTOR. Since a subsequent call to free the hugetlb page will call __free_pages for non-gigantic pages and free_gigantic_page for gigantic pages the destructor is not used. However, consider the following race with code taking a speculative reference on the page: Thread 0 Thread 1 -------- -------- remove_hugetlb_page set_page_refcounted(page); set_compound_page_dtor(page, NULL_COMPOUND_DTOR); get_page_unless_zero(page) __update_and_free_page __free_pages(page, huge_page_order(h)); /* Note that __free_pages() will simply drop the reference to the page. */ put_page(page) __put_compound_page() destroy_compound_page NULL_COMPOUND_DTOR BUG: kernel NULL pointer dereference, address: 0000000000000000 To address this race, set the dtor to the normal compound page dtor for non-gigantic pages. The dtor for gigantic pages does not matter as gigantic pages are changed from a compound page to 'just a group of pages' before freeing. Hence, the destructor is not used. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> --- mm/hugetlb.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-)