From patchwork Mon Jan 29 14:35:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13535777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEEABC47422 for ; Mon, 29 Jan 2024 14:36:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3E50F6B00B8; Mon, 29 Jan 2024 09:36:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 36EE56B00B9; Mon, 29 Jan 2024 09:36:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1E9646B00BA; Mon, 29 Jan 2024 09:36:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 017596B00B8 for ; Mon, 29 Jan 2024 09:36:18 -0500 (EST) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id A96E11C1381 for ; Mon, 29 Jan 2024 14:36:18 +0000 (UTC) X-FDA: 81732598836.08.549009C Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) by imf16.hostedemail.com (Postfix) with ESMTP id B394C180007 for ; Mon, 29 Jan 2024 14:36:16 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=infradead.org header.s=bombadil.20210309 header.b=mQXQwbS8; spf=none (imf16.hostedemail.com: domain of BATV+95c35c30fd22f84c25d9+7463+infradead.org+hch@bombadil.srs.infradead.org has no SPF policy when checking 198.137.202.133) smtp.mailfrom=BATV+95c35c30fd22f84c25d9+7463+infradead.org+hch@bombadil.srs.infradead.org; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1706538976; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Qt5yyeGgZH/arpMlGGWn1M3p+M4NNVvZAq3nYJjy+lU=; b=ov7bRt5GM2qXH0eCKyg7e+Xf/HiJH8SNbNvINSpGie1SfziWi6Vjjgu3tVlkJM/gcbixMX 5hkeJo9WSKWzfTngDTNbar5k8W5vmoOeBt/TqVMdjwamWboEngAwCMHo7qENySs7esCJTP cogRQlsq9Cu/YS+TDHHFx51J7uEITgk= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1706538976; a=rsa-sha256; cv=none; b=FlPqOQjfLoVUt230UNjbq1OhdiriiSbUBfdLu1uq+fd07vUyBKfjMdoOGwttrP0Zha6xuw Bht+ASlycP2noDCB3lIlIX0y+NOn3JRz2B4utRgTVsIcxLbef3u515Ugxnc5eMmQqLTlH4 2PWltpM3R0a/qYxOq1fM5bGa/IgYoSc= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=infradead.org header.s=bombadil.20210309 header.b=mQXQwbS8; spf=none (imf16.hostedemail.com: domain of BATV+95c35c30fd22f84c25d9+7463+infradead.org+hch@bombadil.srs.infradead.org has no SPF policy when checking 198.137.202.133) smtp.mailfrom=BATV+95c35c30fd22f84c25d9+7463+infradead.org+hch@bombadil.srs.infradead.org; dmarc=none DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=Qt5yyeGgZH/arpMlGGWn1M3p+M4NNVvZAq3nYJjy+lU=; b=mQXQwbS80ouB+FyyQsETaKNMyu dHVz+4B/1m5+ULvorUJcL/+PRs8Q/Y4tvelPKQvaGTNN0v42RjuYY8CSKVYv4fS8gyKnUN0SzOezL tLniUagDwIG8yfyv6mVGVchxiy98FkZPOiaZijQq+OGZYBlhdzy4YVN5gMw0XCwWtN0qxa8tb7lsR 77v7vNs3L74LQGw1dR/PiKTkTDdvZnGYXEOPhjGMZySoo483FEH3iRlGipZ1XKKWUXxDwAC+Efi9y Ab8lWx8nqVQtWXMj19dOhdw9u9fYWhQDtvgySEsB1KQnOGgX5FAt0TQuaZpYTGydO4J135RAyOBXE dywc+Ffw==; Received: from [2001:4bb8:182:6550:c70:4a89:bc61:3] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1rUSk9-0000000D6Uf-0oRX; Mon, 29 Jan 2024 14:36:14 +0000 From: Christoph Hellwig To: Chandan Babu R , "Darrick J. Wong" , Hugh Dickins , Andrew Morton Cc: linux-xfs@vger.kernel.org, linux-mm@kvack.org, Kent Overstreet Subject: [PATCH 19/20] xfs: convert xfarray_pagesort to deal with large folios Date: Mon, 29 Jan 2024 15:35:01 +0100 Message-Id: <20240129143502.189370-20-hch@lst.de> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240129143502.189370-1-hch@lst.de> References: <20240129143502.189370-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html X-Rspamd-Queue-Id: B394C180007 X-Rspam-User: X-Stat-Signature: 6yg7jdeqqk4jw5tr7xiat9ictkmkoh91 X-Rspamd-Server: rspam03 X-HE-Tag: 1706538976-392355 X-HE-Meta: U2FsdGVkX1/n013PNhlkEFYas6FHXzB3FjoN0iddyqO0kROin0wUAB17Hh03A1nYqxeDxb4+k9MVVe1DWItLiKCuCYLQ9DI0aXDzZnzWlBQJ6jCciexetPYuVHm3taVH0BAWzVcxIieNzqKTfcrwtp3RQ9URqQBzmCy9mYqENckk4joFFZ9GY6rD0sli4PmqyG/W36g7n3W5g4Foad3K2iSWLscixVbV7fHp1SdiYVfcs8JUEJjBbUEzXD4JlFF/a2Kv7r6NSKeLOQQUIJQzNYqT2qaLbz0RGhjJJzQg3xo6Lbi3VfmdKFJtk5F8QhHKDTO4IbvrBr0DTHzsic4nqVPSXUpThdmzuW1FMftbcsk3mBSrmR3BcqEl9q9EtWo7AF0OilO5pQOWYpiXYi+RWzMzeg8UWpuzmFBiMj4BfM7ICHzPdx5fhp5M6f3G9qzvTgeT/v6v6A6cRw+VK6PT76hTdCyUAtxEdEDvTwBME4u6G47YgpxsAiQ5qzdFCC13t+BbNCXzf2K2MI1FgWrS5Ck05BMB6oTl40RQjrMTn35FPKkjl32YvA4xpDRzPcF8wJKlamtDd4wSj2c0roRqsJuHhLNRk21m6BrM03fd1SfwYlQOiohVR1vcnlEEwZ3s861D3KsNW1wdImgH/mtR85Ean+ydPH7a3ELMILkFnAniK8J+Z3dMdw5/jANuRKP4SuUvZ4tyJYy8Jy50p6SQuVePJ/UUgWA6rUGl+RcYg5DZgJNk2MUW4tc27buRIjnfGYsl7Sa6MyUP75IYiIHZuTv4LqbPLV3rQL3c62XXPlpf8JcuOIbhYP9EYo7Ww27lHAsnbNkBtVjDdYPSNqCc1W8V4eN10GCwaRcMFfLe8go84/RJqKUYUmnPwG099YFNx0u7jd20vrxwlFBxs/XIl2mW2KpDIurAZchroy+lTR4sSq6hPY7c5j3hhXuoH+O+YZGd/FGt+nUdM/0w89C Vj8RwaGy qTDIVL3jZ4mXhXo5WnTNbAqQ3m6uWwkGK/b3uexzPtyCYpHra6edvSahjRg9gI69LrS0w+6ZrMhCWZmXQN6E6T4HYSQGNBZzrFZEn17vXY8kCszWSIETTt+PNoXDQbRY2jAV6F7LpnNJlvkNououyv7yb9Kh6e8TDHmLFZu6kKLZ2dme5c4wtuScj6bOQx0MqM7iLYc9QxvWP7X9TPGx4iPff0gfNY1pf6zIknSrP1L0nMiixQXJMZ3JC58L9P7JjfpgsyKBGgKThIdIXDNVeIMcbdpDEI94AZh8LiZS8Pf8BzXIRdubnZ/LUEQqMv4n5JsPFrQ661hlaEeHpEhpqNi+2ivz0KCSirr1H2wvtWDEDpr1qDycYMt+E77MFdM+SZcaA+z+K2LSrxLvBx+pf3LbGTRimtuwwazUM3HQeZv4cqKRS4u1XmEWDWHFjg2f3NNPNWl6MqwtZXxrletWIHdkVolikGgouyKE0JsUxx+3oFdpuXlct9HXZ3AO7fjanx/P51DXG4Vnj6ho= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: "Darrick J. Wong" Convert xfarray_pagesort to handle large folios by introducing a new xfile_get_folio routine that can return a folio of arbitrary size, and using heapsort on the full folio. This also corrects an off-by-one bug in the calculation of len in xfarray_pagesort that was papered over by xfarray_want_pagesort. Signed-off-by: Darrick J. Wong Signed-off-by: Christoph Hellwig Reviewed-by: Kent Overstreet --- fs/xfs/scrub/trace.h | 43 ++++++++- fs/xfs/scrub/xfarray.c | 201 +++++++++++++++++++---------------------- fs/xfs/scrub/xfarray.h | 10 +- 3 files changed, 143 insertions(+), 111 deletions(-) diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index c61fa7a95ef522..3a1a827828dcb9 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -956,7 +956,7 @@ TRACE_EVENT(xfarray_isort, __entry->hi - __entry->lo) ); -TRACE_EVENT(xfarray_pagesort, +TRACE_EVENT(xfarray_foliosort, TP_PROTO(struct xfarray_sortinfo *si, uint64_t lo, uint64_t hi), TP_ARGS(si, lo, hi), TP_STRUCT__entry( @@ -1027,6 +1027,47 @@ TRACE_EVENT(xfarray_sort, __entry->bytes) ); +TRACE_EVENT(xfarray_sort_scan, + TP_PROTO(struct xfarray_sortinfo *si, unsigned long long idx), + TP_ARGS(si, idx), + TP_STRUCT__entry( + __field(unsigned long, ino) + __field(unsigned long long, nr) + __field(size_t, obj_size) + __field(unsigned long long, idx) + __field(unsigned long long, folio_pos) + __field(unsigned long, folio_bytes) + __field(unsigned long long, first_idx) + __field(unsigned long long, last_idx) + ), + TP_fast_assign( + __entry->nr = si->array->nr; + __entry->obj_size = si->array->obj_size; + __entry->ino = file_inode(si->array->xfile->file)->i_ino; + __entry->idx = idx; + if (si->folio) { + __entry->folio_pos = folio_pos(si->folio); + __entry->folio_bytes = folio_size(si->folio); + __entry->first_idx = si->first_folio_idx; + __entry->last_idx = si->last_folio_idx; + } else { + __entry->folio_pos = 0; + __entry->folio_bytes = 0; + __entry->first_idx = 0; + __entry->last_idx = 0; + } + ), + TP_printk("xfino 0x%lx nr %llu objsz %zu idx %llu folio_pos 0x%llx folio_bytes 0x%lx first_idx %llu last_idx %llu", + __entry->ino, + __entry->nr, + __entry->obj_size, + __entry->idx, + __entry->folio_pos, + __entry->folio_bytes, + __entry->first_idx, + __entry->last_idx) +); + TRACE_EVENT(xfarray_sort_stats, TP_PROTO(struct xfarray_sortinfo *si, int error), TP_ARGS(si, error), diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c index 379e1db22269c7..17c982a4821d47 100644 --- a/fs/xfs/scrub/xfarray.c +++ b/fs/xfs/scrub/xfarray.c @@ -563,70 +563,42 @@ xfarray_isort( return xfile_store(si->array->xfile, scratch, len, lo_pos); } -/* Grab a page for sorting records. */ -static inline int -xfarray_sort_get_page( - struct xfarray_sortinfo *si, - loff_t pos, - uint64_t len) -{ - return xfile_get_page(si->array->xfile, pos, len, &si->xfpage); -} - -/* Release a page we grabbed for sorting records. */ -static inline int -xfarray_sort_put_page( - struct xfarray_sortinfo *si) -{ - if (!xfile_page_cached(&si->xfpage)) - return 0; - return xfile_put_page(si->array->xfile, &si->xfpage); -} - -/* Decide if these records are eligible for in-page sorting. */ -static inline bool -xfarray_want_pagesort( - struct xfarray_sortinfo *si, - xfarray_idx_t lo, - xfarray_idx_t hi) -{ - pgoff_t lo_page; - pgoff_t hi_page; - loff_t end_pos; - - /* We can only map one page at a time. */ - lo_page = xfarray_pos(si->array, lo) >> PAGE_SHIFT; - end_pos = xfarray_pos(si->array, hi) + si->array->obj_size - 1; - hi_page = end_pos >> PAGE_SHIFT; - - return lo_page == hi_page; -} - -/* Sort a bunch of records that all live in the same memory page. */ +/* + * Sort the records from lo to hi (inclusive) if they are all backed by the + * same memory folio. Returns 1 if it sorted, 0 if it did not, or a negative + * errno. + */ STATIC int -xfarray_pagesort( +xfarray_foliosort( struct xfarray_sortinfo *si, xfarray_idx_t lo, xfarray_idx_t hi) { + struct folio *folio; void *startp; loff_t lo_pos = xfarray_pos(si->array, lo); - uint64_t len = xfarray_pos(si->array, hi - lo); - int error = 0; + uint64_t len = xfarray_pos(si->array, hi - lo + 1); - trace_xfarray_pagesort(si, lo, hi); + /* No single folio could back this many records. */ + if (len > XFILE_MAX_FOLIO_SIZE) + return 0; xfarray_sort_bump_loads(si); - error = xfarray_sort_get_page(si, lo_pos, len); - if (error) - return error; + folio = xfile_get_folio(si->array->xfile, lo_pos, len, XFILE_ALLOC); + if (IS_ERR(folio)) + return PTR_ERR(folio); + if (!folio) + return 0; + + trace_xfarray_foliosort(si, lo, hi); xfarray_sort_bump_heapsorts(si); - startp = page_address(si->xfpage.page) + offset_in_page(lo_pos); + startp = folio_address(folio) + offset_in_folio(folio, lo_pos); sort(startp, hi - lo + 1, si->array->obj_size, si->cmp_fn, NULL); xfarray_sort_bump_stores(si); - return xfarray_sort_put_page(si); + xfile_put_folio(si->array->xfile, folio); + return 1; } /* Return a pointer to the xfarray pivot record within the sortinfo struct. */ @@ -814,63 +786,78 @@ xfarray_qsort_push( return 0; } +static inline void +xfarray_sort_scan_done( + struct xfarray_sortinfo *si) +{ + if (si->folio) + xfile_put_folio(si->array->xfile, si->folio); + si->folio = NULL; +} + /* - * Load an element from the array into the first scratchpad and cache the page, - * if possible. + * Cache the folio backing the start of the given array element. If the array + * element is contained entirely within the folio, return a pointer to the + * cached folio. Otherwise, load the element into the scratchpad and return a + * pointer to the scratchpad. */ static inline int -xfarray_sort_load_cached( +xfarray_sort_scan( struct xfarray_sortinfo *si, xfarray_idx_t idx, - void *ptr) + void **ptrp) { loff_t idx_pos = xfarray_pos(si->array, idx); - pgoff_t startpage; - pgoff_t endpage; int error = 0; - /* - * If this load would split a page, release the cached page, if any, - * and perform a traditional read. - */ - startpage = idx_pos >> PAGE_SHIFT; - endpage = (idx_pos + si->array->obj_size - 1) >> PAGE_SHIFT; - if (startpage != endpage) { - error = xfarray_sort_put_page(si); - if (error) - return error; + if (xfarray_sort_terminated(si, &error)) + return error; - if (xfarray_sort_terminated(si, &error)) - return error; + trace_xfarray_sort_scan(si, idx); - return xfile_load(si->array->xfile, ptr, - si->array->obj_size, idx_pos); - } + /* If the cached folio doesn't cover this index, release it. */ + if (si->folio && + (idx < si->first_folio_idx || idx > si->last_folio_idx)) + xfarray_sort_scan_done(si); - /* If the cached page is not the one we want, release it. */ - if (xfile_page_cached(&si->xfpage) && - xfile_page_index(&si->xfpage) != startpage) { - error = xfarray_sort_put_page(si); - if (error) - return error; + /* Grab the first folio that backs this array element. */ + if (!si->folio) { + loff_t next_pos; + + si->folio = xfile_get_folio(si->array->xfile, idx_pos, + si->array->obj_size, XFILE_ALLOC); + if (IS_ERR(si->folio)) + return PTR_ERR(si->folio); + + si->first_folio_idx = xfarray_idx(si->array, + folio_pos(si->folio) + si->array->obj_size - 1); + + next_pos = folio_pos(si->folio) + folio_size(si->folio); + si->last_folio_idx = xfarray_idx(si->array, next_pos - 1); + if (xfarray_pos(si->array, si->last_folio_idx + 1) > next_pos) + si->last_folio_idx--; + + trace_xfarray_sort_scan(si, idx); } /* - * If we don't have a cached page (and we know the load is contained - * in a single page) then grab it. + * If this folio still doesn't cover the desired element, it must cross + * a folio boundary. Read into the scratchpad and we're done. */ - if (!xfile_page_cached(&si->xfpage)) { - if (xfarray_sort_terminated(si, &error)) - return error; + if (idx < si->first_folio_idx || idx > si->last_folio_idx) { + void *temp = xfarray_scratch(si->array); - error = xfarray_sort_get_page(si, startpage << PAGE_SHIFT, - PAGE_SIZE); + error = xfile_load(si->array->xfile, temp, si->array->obj_size, + idx_pos); if (error) return error; + + *ptrp = temp; + return 0; } - memcpy(ptr, page_address(si->xfpage.page) + offset_in_page(idx_pos), - si->array->obj_size); + /* Otherwise return a pointer to the array element in the folio. */ + *ptrp = folio_address(si->folio) + offset_in_folio(si->folio, idx_pos); return 0; } @@ -937,6 +924,8 @@ xfarray_sort( pivot = xfarray_sortinfo_pivot(si); while (si->stack_depth >= 0) { + int ret; + lo = si_lo[si->stack_depth]; hi = si_hi[si->stack_depth]; @@ -949,13 +938,13 @@ xfarray_sort( } /* - * If directly mapping the page and sorting can solve our + * If directly mapping the folio and sorting can solve our * problems, we're done. */ - if (xfarray_want_pagesort(si, lo, hi)) { - error = xfarray_pagesort(si, lo, hi); - if (error) - goto out_free; + ret = xfarray_foliosort(si, lo, hi); + if (ret < 0) + goto out_free; + if (ret == 1) { si->stack_depth--; continue; } @@ -980,25 +969,24 @@ xfarray_sort( * than the pivot is on the right side of the range. */ while (lo < hi) { + void *p; + /* * Decrement hi until it finds an a[hi] less than the * pivot value. */ - error = xfarray_sort_load_cached(si, hi, scratch); + error = xfarray_sort_scan(si, hi, &p); if (error) goto out_free; - while (xfarray_sort_cmp(si, scratch, pivot) >= 0 && - lo < hi) { + while (xfarray_sort_cmp(si, p, pivot) >= 0 && lo < hi) { hi--; - error = xfarray_sort_load_cached(si, hi, - scratch); + error = xfarray_sort_scan(si, hi, &p); if (error) goto out_free; } - error = xfarray_sort_put_page(si); - if (error) - goto out_free; - + if (p != scratch) + memcpy(scratch, p, si->array->obj_size); + xfarray_sort_scan_done(si); if (xfarray_sort_terminated(si, &error)) goto out_free; @@ -1013,21 +1001,18 @@ xfarray_sort( * Increment lo until it finds an a[lo] greater than * the pivot value. */ - error = xfarray_sort_load_cached(si, lo, scratch); + error = xfarray_sort_scan(si, lo, &p); if (error) goto out_free; - while (xfarray_sort_cmp(si, scratch, pivot) <= 0 && - lo < hi) { + while (xfarray_sort_cmp(si, p, pivot) <= 0 && lo < hi) { lo++; - error = xfarray_sort_load_cached(si, lo, - scratch); + error = xfarray_sort_scan(si, lo, &p); if (error) goto out_free; } - error = xfarray_sort_put_page(si); - if (error) - goto out_free; - + if (p != scratch) + memcpy(scratch, p, si->array->obj_size); + xfarray_sort_scan_done(si); if (xfarray_sort_terminated(si, &error)) goto out_free; diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h index 6f2862054e194d..ec643cc9fc1432 100644 --- a/fs/xfs/scrub/xfarray.h +++ b/fs/xfs/scrub/xfarray.h @@ -105,8 +105,14 @@ struct xfarray_sortinfo { /* XFARRAY_SORT_* flags; see below. */ unsigned int flags; - /* Cache a page here for faster access. */ - struct xfile_page xfpage; + /* Cache a folio here for faster scanning for pivots */ + struct folio *folio; + + /* First array index in folio that is completely readable */ + xfarray_idx_t first_folio_idx; + + /* Last array index in folio that is completely readable */ + xfarray_idx_t last_folio_idx; #ifdef DEBUG /* Performance statistics. */