From patchwork Wed Dec 18 02:26:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luis Chamberlain
X-Patchwork-Id: 13912927
From: Luis Chamberlain
To: hare@suse.de,
	willy@infradead.org, dave@stgolabs.net, david@fromorbit.com,
	djwong@kernel.org, kbusch@kernel.org
Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com,
	linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com,
	p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com,
	mcgrof@kernel.org
Subject: [PATCH 4/5] fs/buffer: add iteration support for block_read_full_folio()
Date: Tue, 17 Dec 2024 18:26:25 -0800
Message-ID: <20241218022626.3668119-5-mcgrof@kernel.org>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20241218022626.3668119-1-mcgrof@kernel.org>
References: <20241218022626.3668119-1-mcgrof@kernel.org>

Provide a helper to iterate over the buffer heads on a folio, as a
preliminary step to make the subsequent changes easier to read.

Right now we use an on-stack array of size MAX_BUF_PER_PAGE to loop over
all buffer heads in a folio. On architectures with a much larger base
page size, such as Hexagon with 256 KiB page size support, this can make
the kernel spew stack growth warnings. To be able to break this down
into smaller chunks, add support for processing a bounded array of
buffer heads at a time. The array size used is not changed yet; that is
done in a subsequent patch. This patch only adds the iterator support
and logic.
While at it, clarify that the following booleans used by
bh_read_batch_async() are only meaningful once we have processed all
buffer heads of a folio, that is, once we are on the last buffer head
of the folio:

  * bh_folio_reads
  * unmapped

Reviewed-by: Hannes Reinecke
Signed-off-by: Luis Chamberlain
---
 fs/buffer.c | 134 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 97 insertions(+), 37 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 1aeef7dd2281..b8ba72f2f211 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2402,66 +2402,75 @@ static void bh_read_batch_async(struct folio *folio,
 #define bh_next(__bh, __head) \
 	(bh_is_last(__bh, __head) ? NULL : (__bh)->b_this_page)
 
+/* Starts from a pivot which you initialize */
+#define for_each_bh_pivot(__pivot, __last, __head) \
+	for ((__pivot) = __last = (__pivot); \
+	     (__pivot); \
+	     (__pivot) = bh_next(__pivot, __head), \
+	     (__last) = (__pivot) ? (__pivot) : (__last))
+
 /* Starts from the provided head */
 #define for_each_bh(__tmp, __head) \
 	for ((__tmp) = (__head); \
 	     (__tmp); \
 	     (__tmp) = bh_next(__tmp, __head))
 
+struct bh_iter {
+	sector_t iblock;
+	get_block_t *get_block;
+	bool any_get_block_error;
+	int unmapped;
+	int bh_folio_reads;
+};
+
 /*
- * Generic "read_folio" function for block devices that have the normal
- * get_block functionality. This is most of the block device filesystems.
- * Reads the folio asynchronously --- the unlock_buffer() and
- * set/clear_buffer_uptodate() functions propagate buffer state into the
- * folio once IO has completed.
+ * Reads up to MAX_BUF_PER_PAGE buffer heads at a time on a folio on the given
+ * block range iblock to lblock and helps update the number of buffer-heads
+ * which were not uptodate or unmapped for which we issued an async read for
+ * on iter->bh_folio_reads for the full folio. Returns the last buffer-head we
+ * worked on.
  */
-int block_read_full_folio(struct folio *folio, get_block_t *get_block)
-{
-	struct inode *inode = folio->mapping->host;
-	sector_t iblock, lblock;
-	struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
-	size_t blocksize;
-	int nr;
-	int fully_mapped = 1;
-	bool page_error = false;
-	loff_t limit = i_size_read(inode);
-
-	/* This is needed for ext4. */
-	if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
-		limit = inode->i_sb->s_maxbytes;
+static struct buffer_head *bh_read_iter(struct folio *folio,
+					struct buffer_head *pivot,
+					struct buffer_head *head,
+					struct inode *inode,
+					struct bh_iter *iter, sector_t lblock)
+{
+	struct buffer_head *arr[MAX_BUF_PER_PAGE];
+	struct buffer_head *bh = pivot, *last;
+	int nr = 0, i = 0;
+	size_t blocksize = head->b_size;
+	bool no_reads = false;
+	bool fully_mapped = false;
 
-	VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
+	/* Stage one - collect buffer heads we need issue a read for */
 
-	head = folio_create_buffers(folio, inode, 0);
-	blocksize = head->b_size;
+	/* collect buffers not uptodate and not mapped yet */
+	for_each_bh_pivot(bh, last, head) {
+		BUG_ON(nr >= MAX_BUF_PER_PAGE);
 
-	iblock = div_u64(folio_pos(folio), blocksize);
-	lblock = div_u64(limit + blocksize - 1, blocksize);
-	nr = 0;
-
-	/* Stage one - collect buffer heads we need issue a read for */
-	for_each_bh(bh, head) {
 		if (buffer_uptodate(bh)) {
-			iblock++;
+			iter->iblock++;
 			continue;
 		}
 
 		if (!buffer_mapped(bh)) {
 			int err = 0;
 
-			fully_mapped = 0;
-			if (iblock < lblock) {
+			iter->unmapped++;
+			if (iter->iblock < lblock) {
 				WARN_ON(bh->b_size != blocksize);
-				err = get_block(inode, iblock, bh, 0);
+				err = iter->get_block(inode, iter->iblock,
+						      bh, 0);
 				if (err)
-					page_error = true;
+					iter->any_get_block_error = true;
 			}
 			if (!buffer_mapped(bh)) {
 				folio_zero_range(folio, bh_offset(bh),
 						blocksize);
 				if (!err)
 					set_buffer_uptodate(bh);
-				iblock++;
+				iter->iblock++;
 				continue;
 			}
 			/*
@@ -2469,15 +2478,66 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 			 * synchronously
 			 */
 			if (buffer_uptodate(bh)) {
-				iblock++;
+				iter->iblock++;
 				continue;
 			}
 		}
 		arr[nr++] = bh;
-		iblock++;
+		iter->iblock++;
+	}
+
+	iter->bh_folio_reads += nr;
+
+	WARN_ON_ONCE(!bh_is_last(last, head));
+
+	if (bh_is_last(last, head)) {
+		if (!iter->bh_folio_reads)
+			no_reads = true;
+		if (!iter->unmapped)
+			fully_mapped = true;
 	}
 
-	bh_read_batch_async(folio, nr, arr, fully_mapped, nr == 0, page_error);
+	bh_read_batch_async(folio, nr, arr, fully_mapped, no_reads,
+			    iter->any_get_block_error);
+
+	return last;
+}
+
+/*
+ * Generic "read_folio" function for block devices that have the normal
+ * get_block functionality. This is most of the block device filesystems.
+ * Reads the folio asynchronously --- the unlock_buffer() and
+ * set/clear_buffer_uptodate() functions propagate buffer state into the
+ * folio once IO has completed.
+ */
+int block_read_full_folio(struct folio *folio, get_block_t *get_block)
+{
+	struct inode *inode = folio->mapping->host;
+	sector_t lblock;
+	size_t blocksize;
+	struct buffer_head *bh, *head;
+	struct bh_iter iter = {
+		.get_block = get_block,
+		.unmapped = 0,
+		.any_get_block_error = false,
+		.bh_folio_reads = 0,
+	};
+	loff_t limit = i_size_read(inode);
+
+	/* This is needed for ext4. */
+	if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
+		limit = inode->i_sb->s_maxbytes;
+
+	VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
+
+	head = folio_create_buffers(folio, inode, 0);
+	blocksize = head->b_size;
+
+	iter.iblock = div_u64(folio_pos(folio), blocksize);
+	lblock = div_u64(limit + blocksize - 1, blocksize);
+
+	for_each_bh(bh, head)
+		bh = bh_read_iter(folio, bh, head, inode, &iter, lblock);
 
 	return 0;
 }