Message ID | 78e109cdbec7b11b1832822143d483509abb059e.1712266967.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Series | btrfs: do not wait for short bulk allocation |
On 4/4/24 17:43, Qu Wenruo wrote:
> [BUG]
> There is a recent report that when memory pressure is high (including
> cached pages), btrfs can spend most of its time on memory allocation in
> btrfs_alloc_page_array() for compressed read/write.
>
> [CAUSE]
> For btrfs_alloc_page_array() we always go through
> alloc_pages_bulk_array(), and even if the bulk allocation failed (fell
> back to single page allocation) we still retry, but with an extra
> memalloc_retry_wait().
>
> If the bulk alloc only returned one page at a time, we would spend a lot
> of time on the retry wait.
>
> The behavior was introduced in commit 395cb57e8560 ("btrfs: wait between
> incomplete batch memory allocations").
>
> [FIX]
> Although the commit mentioned that other filesystems do the wait, it's
> not the case, at least nowadays.
>
> All the mainlined filesystems only call memalloc_retry_wait() if they
> failed to allocate any page (not only for bulk allocation).
> If there is any progress, they won't call memalloc_retry_wait() at all.
>
> For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
> if there is no allocation progress at all, and the call is not for
> metadata readahead.
>
> So I don't believe we should call memalloc_retry_wait() unconditionally
> for short allocation.
>
> This patch would only call memalloc_retry_wait() if it failed to allocate
> any page for tree block allocation (which goes with __GFP_NOFAIL and may
> not need the special handling anyway), and reduce the latency for
> btrfs_alloc_page_array().
>
> Reported-by: Julian Taylor <julian.taylor@1und1.de>
> Tested-by: Julian Taylor <julian.taylor@1und1.de>
> Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
> Fixes: 395cb57e8560 ("btrfs: wait between incomplete batch memory allocations")
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v3:
> - Remove wait part completely
>   For NOFAIL metadata allocation, the allocation itself should not fail.
>   For regular allocation, we can afford the failure anyway.
>
> v2:
> - Still use bulk allocation function
>   Since alloc_pages_bulk_array() would fall back to single page
>   allocation by itself, there is no need to go alloc_page() manually.
>
> - Update the commit message to indicate other fses do not call
>   memalloc_retry_wait() unconditionally
>   In fact, they only call it when they need to retry hard and cannot
>   really fail.
> ---
>  fs/btrfs/extent_io.c | 18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index bbdcb7475cea..48476f8fcf79 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -712,31 +712,21 @@ int btrfs_alloc_folio_array(unsigned int nr_folios, struct folio **folio_array,
>  int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
>  			   gfp_t extra_gfp)
>  {
> +	const gfp_t gfp = GFP_NOFS | extra_gfp;
>  	unsigned int allocated;
>
>  	for (allocated = 0; allocated < nr_pages;) {
>  		unsigned int last = allocated;
>
> -		allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
> -						   nr_pages, page_array);
> -
> -		if (allocated == nr_pages)
> -			return 0;
> -
> -		/*
> -		 * During this iteration, no page could be allocated, even
> -		 * though alloc_pages_bulk_array() falls back to alloc_page()
> -		 * if it could not bulk-allocate. So we must be out of memory.
> -		 */
> -		if (allocated == last) {
> +		allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
> +		if (unlikely(allocated == last)) {
> +			/* Fail and do cleanup. */
>  			for (int i = 0; i < allocated; i++) {
>  				__free_page(page_array[i]);
>  				page_array[i] = NULL;
>  			}
>  			return -ENOMEM;
>  		}
> -
> -		memalloc_retry_wait(GFP_NOFS);
>  	}
>  	return 0;
>  }
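To make the latency argument from the commit message concrete, here is a minimal user-space sketch in C. It is not kernel code: struct mock_allocator, mock_bulk_alloc() and alloc_page_array_sketch() are hypothetical names invented for illustration. The mock only imitates the behaviour of alloc_pages_bulk_array() that matters here, namely filling empty array slots and returning the number of populated entries. The wait_on_short flag models the pre-patch memalloc_retry_wait() call by counting it instead of sleeping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the page allocator under memory pressure. */
struct mock_allocator {
	unsigned int per_call;   /* max pages handed out per bulk call */
	unsigned int remaining;  /* pages left before we are "out of memory" */
	unsigned int calls;      /* number of bulk calls made */
	unsigned int waits;      /* number of simulated retry waits */
};

/*
 * Assumption: like alloc_pages_bulk_array(), fill only the empty slots and
 * return the total number of populated entries. Slots are filled in index
 * order, so the populated entries always form a prefix of the array.
 */
static unsigned int mock_bulk_alloc(struct mock_allocator *ma,
				    unsigned int nr_pages, void **page_array)
{
	unsigned int given = 0, populated = 0;

	ma->calls++;
	for (unsigned int i = 0; i < nr_pages; i++) {
		if (!page_array[i]) {
			if (given == ma->per_call || ma->remaining == 0)
				break;
			page_array[i] = malloc(64);
			if (!page_array[i])
				break;
			given++;
			ma->remaining--;
		}
		populated++;
	}
	return populated;
}

/*
 * Retry loop in the shape of btrfs_alloc_page_array(). With wait_on_short
 * set, every short (but non-zero) allocation is followed by a counted wait,
 * as before the patch; without it, the loop retries immediately.
 */
static int alloc_page_array_sketch(struct mock_allocator *ma,
				   unsigned int nr_pages, void **page_array,
				   bool wait_on_short)
{
	unsigned int allocated;

	for (allocated = 0; allocated < nr_pages;) {
		unsigned int last = allocated;

		allocated = mock_bulk_alloc(ma, nr_pages, page_array);
		if (allocated == nr_pages)
			return 0;
		if (allocated == last) {
			/* No progress at all: fail and clean up. */
			for (unsigned int i = 0; i < allocated; i++) {
				free(page_array[i]);
				page_array[i] = NULL;
			}
			return -ENOMEM;
		}
		if (wait_on_short)
			ma->waits++;	/* stands in for memalloc_retry_wait() */
	}
	return 0;
}
```

With an allocator that yields one page per call, filling an 8-entry array takes 8 bulk calls either way, but the pre-patch shape adds 7 waits in between, while the patched shape adds none. Failure is still reported only when a call makes no progress.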