From patchwork Thu Apr 10 01:49:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045744 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B0E391ADC90; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249794; cv=none; b=MqpYhuC1RS6BNMv4pBhT3RjsKUsaVS0KedraqkM0vWc6/NR4/b5sAXC4NYzx+duhEKGF3YhRGU7/0TfCH4UI6tHjvT/zekLU1R4/cHbrqAoRooDBUBfw2znY+xhHHDjsK9haHxtlWHvE2Hz6xJTzAUx64zfYC9e7bjtK8ZY6VcU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249794; c=relaxed/simple; bh=ThRGS12HG1C9teSRCGWbQ8jkrJD1nLkUQAc86uUErSk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=m/iV/RpXr5VlO/bIeLkzRYvV7QxuHkjPctfqzpGtm7B52Dvc/1PBfRefYbLK+EbFkaJLpMSByNnquCC8hUSnWjvzLV6Ix38CmexdjhpbctGCqJnVDinq6myeaczhZO+HG8Oxeu6Q6E1mWgUBlIsS0n1hOA8yYNRFsmuIF3CvYVw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=a/4PN9sb; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="a/4PN9sb" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc: To:From:Reply-To:Content-ID:Content-Description; bh=CL+6HEhZASTZTm0n+Z7z4M5wQOhYNXjmk5hIzXt/heo=; b=a/4PN9sb92D7cFqc3K7oeSik7t Q+3oQ2p8GkMhEL/NDp8lQAFxC6bkdTJSMpF0WP0WvzM/+GP/1hKhavztS3KblNtKFvUEjpenwScBe Hb1qNVm1TJLT2GecaI6PPeM3SapBsIbUuujrA5qaDs3R2iDv3LFpejYBEfCufwqRvCef1Swk0S2Es 9c6zatte5dgYw84XvL/jwERxw2Ed9nxrnGkYKolyC2QiGz2KtP2MaTvuitMv2Zhcenq9L8mRvVW3r Dbndqjo6JIIUg+j8oo6JD/6d323P3UgBAGgDtmY9129Mgvr9vMfEMQHJt7s+73+P1QI3VfZVZDpCt sK63AXDA==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h35-00000008yv5-3Gk9; Thu, 10 Apr 2025 01:49:47 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org, syzbot+f3c6fda1297c748a7076@syzkaller.appspotmail.com Subject: [PATCH v2 1/8] migrate: fix skipping metadata buffer heads on migration Date: Wed, 9 Apr 2025 18:49:38 -0700 Message-ID: <20250410014945.2140781-2-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain Filesystems which use buffer-heads where it cannot guarantees that there are no other references to the folio, for example with a folio lock, must use buffer_migrate_folio_norefs() for the address space mapping migrate_folio() callback. There are only 3 filesystems which use this callback: 1) the block device cache 2) ext4 for its ext4_journalled_aops, ie, jbd2 3) nilfs2 jbd2's use of this however callback however is very race prone, consider folio migration while reviewing jbd2_journal_write_metadata_buffer() and the fact that jbd2: - does not hold the folio lock - does not have have page writeback bit set - does not lock the buffer And so, it can race with folio_set_bh() on folio migration. The commit ebdf4de5642fb6 ("mm: migrate: fix reference check race between __find_get_block() and migration") added a spin lock to prevent races with page migration which ext4 users were reporting through the SUSE bugzilla (bnc#1137609 [0]). Although we don't have exact traces of the original filesystem corruption we can can reproduce fs corruption on ext4 by just removing the spinlock and stress testing the filesystem with generic/750, we eventually end up after 3 hours of testing with kdevops using libvirt on the ext4 profiles ext4-4k and ext4-2k. It turns out that the spin lock doesn't in the end protect against corruption, it *helps* reduce the possibility, but ext4 filesystem corruption can still happen even with the spin lock held. A test was done using vanilla Linux and adding a udelay(2000) right before we spin_lock(&bd_mapping->i_private_lock) on __find_get_block_slow() and we can reproduce the same exact filesystem corruption issues as observed without the spinlock with generic/750 [1]. We now have a slew of traces collected for the ext4 corruptions possible without the current spin lock held [2] [3] [4] but the general pattern is as follows, as summarized by ChatGPT from all traces: do_writepages() # write back --> ext4_map_block() # performs logical to physical block mapping --> ext4_ext_insert_extent() # updates extent tree --> jbd2_journal_dirty_metadata() # marks metadata as dirty for # journaling. This can lead # to any of the following hints # as to what happened from # ext4 / jbd2 - Directory and extent metadata corruption splats or - Filure to handle out-of-space conditions gracefully, with cascading metadata errors and eventual filesystem shutdown to prevent further damage. - Failure to journal new extent metadata during extent tree growth, triggered under memory pressure or heavy writeback. Commonly results in ENOSPC, journal abort, and read-only fallback. ** - Journal metadata failure during extent tree growth causes read-only fallback. Seen repeatedly on small-block (2k) filesystems under stress (e.g. fsstress). Triggers errors in bitmap and inode updates, and persists in journal replay logs. "Error count since last fsck" shows long-term corruption footprint. ** Reproduced on vanilla Linux with udelay(2000) ** Call trace (ENOSPC journal failure): do_writepages() → ext4_do_writepages() → ext4_map_blocks() → ext4_ext_map_blocks() → ext4_ext_insert_extent() → __ext4_handle_dirty_metadata() → jbd2_journal_dirty_metadata() → ERROR -28 (ENOSPC) And so jbd2 still needs more work to avoid races with folio migration. So replace the current spin lock solution by just skipping jbd buffers on folio migration. We identify jbd buffers as its the only user of set_buffer_meta() on __ext4_handle_dirty_metadata(). By checking for buffer_meta() and bailing on migration we fix the existing racy ext4 corruption while also removing the spin lock to be held while sleeping complaints originally reported by 0-day [5], and paves the way for buffer-heads for more users of large folios other than the block device cache. Link: https://bugzilla.suse.com/show_bug.cgi?id=1137609 # [0] Link: https://web.git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=20250408-ext4-jbd-force-corruption # [1] Link: https://lkml.kernel.org/r/Z-ZwToVfJbdTVRtG@bombadil.infradead.org # [2] Link: https://lore.kernel.org/all/Z-rzyrS0Jr7t984Y@bombadil.infradead.org/ # [3] Link: https://lore.kernel.org/all/Z_AA9SHZdRGq6tb4@bombadil.infradead.org/ # [4] Reported-by: kernel test robot Reported-by: syzbot+f3c6fda1297c748a7076@syzkaller.appspotmail.com Closes: https://lore.kernel.org/oe-lkp/202503101536.27099c77-lkp@intel.com # [5] Fixes: ebdf4de5642fb6 ("mm: migrate: fix reference check race between __find_get_block() and migration") Signed-off-by: Luis Chamberlain --- mm/migrate.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index f3ee6d8d5e2e..32fa72ba10b4 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -841,6 +841,9 @@ static int __buffer_migrate_folio(struct address_space *mapping, if (folio_ref_count(src) != expected_count) return -EAGAIN; + if (buffer_meta(head)) + return -EAGAIN; + if (!buffer_migrate_lock_buffers(head, mode)) return -EAGAIN; @@ -859,12 +862,12 @@ static int __buffer_migrate_folio(struct address_space *mapping, } bh = bh->b_this_page; } while (bh != head); + spin_unlock(&mapping->i_private_lock); if (busy) { if (invalidated) { rc = -EAGAIN; goto unlock_buffers; } - spin_unlock(&mapping->i_private_lock); invalidate_bh_lrus(); invalidated = true; goto recheck_buffers; @@ -880,10 +883,7 @@ static int __buffer_migrate_folio(struct address_space *mapping, folio_set_bh(bh, dst, bh_offset(bh)); bh = bh->b_this_page; } while (bh != head); - unlock_buffers: - if (check_refs) - spin_unlock(&mapping->i_private_lock); bh = head; do { unlock_buffer(bh); From patchwork Thu Apr 10 01:49:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045741 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 933E0198E8C; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; cv=none; b=TG2hWxFIvyZbROjHXNTG6OUchurksOghjo9P1eLsU71P2VBmjR0akRLyPLz4eD3sA15FdZC7lxVQ0ONRX5oOYfIe6hB/aAGrPdcTmP8sIxqF38272eKm+oZ2kyuWzlz4mHqE7bv9O1Yo7IVqFcybDQbmIioLL5gEYgiEu+7+KD4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; c=relaxed/simple; bh=mu/+znPJJj83Q4VgB1jn93BxAWKUp778KoeI0b0ewhU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uD4XV+Lmu8d33PX89WU+e61SKioK0zLU5yFrEsvEYy0CAQSNO1WtACh/jxLyDzLzlB4dDP0E7Mhrbb3ZglJ4uNTCmLHI/Euwsv0tVjU5ljPo/eN2AZUb1sp9i7mP7JFgHyTPBPmnO4d+gr8Novqh4XUFJ92NX3xZK+j5zRDMmnU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=b/XLETIB; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="b/XLETIB" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=1ZkQCLEfvSJJMcZUEvIWp91J3xZQH1qlm2s7wYby9N4=; b=b/XLETIB5KCHfGv4ObhFsAtZeg 9PIOhDozbrjIiwkz8FCrVudBNh2xFJuJPCMRAZeWfEuT0B65+YA0mPvNkelPKXGYW70ztF80jFtrk +zk3ijWBp87lt6bqq3DJhsxWX9gTuF74Jnd9NmvK1ryYmCOivexotCIUtVE5uDZGyi3XDwy0R42J6 hmwni0rd8Rx/vXlbvB/t2/Lj/REqtA8EpzSkQsol1GK/rc9TKvQvQBSVIAHKqYN5G9bju7WsHJS9q Qc3PJM/BXxCWo0oGP+Feqe48qlCXA0oi9dt8kdTsUCrc9P/4T1Apl74rogmqpTvs/ISX1C7WGRDtl KWtwGWeQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h35-00000008yvH-3Pl7; Thu, 10 Apr 2025 01:49:47 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 2/8] fs/buffer: try to use folio lock for pagecache lookups Date: Wed, 9 Apr 2025 18:49:39 -0700 Message-ID: <20250410014945.2140781-3-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Davidlohr Bueso Callers of __find_get_block() may or may not allow for blocking semantics, and is currently assumed that it will not. Layout two paths based on this. Ultimately the i_private_lock scheme will be used as a fallback in non-blocking contexts. Otherwise always take the folio lock instead. The suggested trylock idea is implemented, thereby potentially reducing i_private_lock contention in addition to later enabling future migration support around with large folios and noref migration. No change in semantics. All lookup users are non-blocking. Signed-off-by: Davidlohr Bueso Signed-off-by: Luis Chamberlain --- fs/buffer.c | 45 +++++++++++++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 16 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index c7abb4a029dc..5a1a37a6840a 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -176,18 +176,8 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate) } EXPORT_SYMBOL(end_buffer_write_sync); -/* - * Various filesystems appear to want __find_get_block to be non-blocking. - * But it's the page lock which protects the buffers. To get around this, - * we get exclusion from try_to_free_buffers with the blockdev mapping's - * i_private_lock. - * - * Hack idea: for the blockdev mapping, i_private_lock contention - * may be quite high. This code could TryLock the page, and if that - * succeeds, there is no need to take i_private_lock. - */ static struct buffer_head * -__find_get_block_slow(struct block_device *bdev, sector_t block) +__find_get_block_slow(struct block_device *bdev, sector_t block, bool atomic) { struct address_space *bd_mapping = bdev->bd_mapping; const int blkbits = bd_mapping->host->i_blkbits; @@ -197,6 +187,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block) struct buffer_head *head; struct folio *folio; int all_mapped = 1; + bool folio_locked = true; static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1); index = ((loff_t)block << blkbits) / PAGE_SIZE; @@ -204,7 +195,19 @@ __find_get_block_slow(struct block_device *bdev, sector_t block) if (IS_ERR(folio)) goto out; - spin_lock(&bd_mapping->i_private_lock); + /* + * Folio lock protects the buffers. Callers that cannot block + * will fallback to serializing vs try_to_free_buffers() via + * the i_private_lock. + */ + if (!folio_trylock(folio)) { + if (atomic) { + spin_lock(&bd_mapping->i_private_lock); + folio_locked = false; + } else + folio_lock(folio); + } + head = folio_buffers(folio); if (!head) goto out_unlock; @@ -236,7 +239,10 @@ __find_get_block_slow(struct block_device *bdev, sector_t block) 1 << blkbits); } out_unlock: - spin_unlock(&bd_mapping->i_private_lock); + if (folio_locked) + folio_unlock(folio); + else + spin_unlock(&bd_mapping->i_private_lock); folio_put(folio); out: return ret; @@ -1388,14 +1394,15 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size) * it in the LRU and mark it as accessed. If it is not present then return * NULL */ -struct buffer_head * -__find_get_block(struct block_device *bdev, sector_t block, unsigned size) +static struct buffer_head * +find_get_block_common(struct block_device *bdev, sector_t block, + unsigned size, bool atomic) { struct buffer_head *bh = lookup_bh_lru(bdev, block, size); if (bh == NULL) { /* __find_get_block_slow will mark the page accessed */ - bh = __find_get_block_slow(bdev, block); + bh = __find_get_block_slow(bdev, block, atomic); if (bh) bh_lru_install(bh); } else @@ -1403,6 +1410,12 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size) return bh; } + +struct buffer_head * +__find_get_block(struct block_device *bdev, sector_t block, unsigned size) +{ + return find_get_block_common(bdev, block, size, true); +} EXPORT_SYMBOL(__find_get_block); /** From patchwork Thu Apr 10 01:49:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045742 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9DB1019ADB0; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; cv=none; b=QYThojf1XU42SotNa/PZAZLJ+E+QZEyRtCXMFgEzM9GRr58IJ4tvRgKdJBi9U9G/ANWqks8PLOha5xBkc4UcRgj+Wcrd3wgXZ6tjMWE/GOReRMmHOQnJzxjYEAZqPVSJlW/DwJ99gz3KtBfIxgevEdDxRqJj9w6g1mYdPb75Uvc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; c=relaxed/simple; bh=6AcGXxfbBxjNN+NNnC8VOmrsKGHXtnZn9WapwkH55B0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=m+Xub3Wn+U8xs2zvfAHA0hp9OJTbTLGOKcZaKIRv848MLMHHmfqHTn4ZK2IPxIFbNzQwZQFML8rOaWJJNkZMi73mWcgYOZ5nLf6+v1NUst4qh+ZXAbn4OxlBTkCM1VMAuY5PeF+qiindL64+MOlrHDYMWe/PKLKVnLZzS2W5uKU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=ipBwvmyv; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="ipBwvmyv" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=IJXPCgsWY2AYZ+ERHJ5qQZaJVfrH0dcPo+w1wteI5Ag=; b=ipBwvmyv4g4pzBPiSz5QUDcBzq AOtFV3f6RHrv9qNMsPiWLlXHk0KRNOGrx4o0iqcNXtL+XXYJYoXwJGwkUKVjnxVFLEuXVjPSVotHj SGg2Q4NS6KTe8Zzo/MoOyLZKMbbIKuQ9KiDWJd56PjYD/DqdQmhbq7k3Cy/XazUH0PeMwkOCMJKmn J7vCSCXbe8pjh2LRYUGn3TUSWwGQH/GENSZPo1vx6r18czqaysbyjQa8LN8tn0udRXK6VV4Rt7SSJ +kvcKp+BHk0YcENBoAuFFeIKkaWGMU9Bg5YoEqG8ALCnuEWNez6FH2LXV8JPxBzxruzLFCfrxwFuE 2ZgTSy8w==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h35-00000008yvJ-3gUy; Thu, 10 Apr 2025 01:49:47 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 3/8] fs/buffer: introduce __find_get_block_nonatomic() Date: Wed, 9 Apr 2025 18:49:40 -0700 Message-ID: <20250410014945.2140781-4-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Davidlohr Bueso Callers that do not require atomic context for pagecache lookups can use this flavor to avoid the unnencesary contention on the blockdev mapping i_private_lock as well as waiting on noref migration. Convert local caller write_boundary_block() which already takes the buffer lock as well as bdev_getblk() depending on the respective gpf flags. Either way there are no currently changes in semantics. Signed-off-by: Davidlohr Bueso Signed-off-by: Luis Chamberlain --- fs/buffer.c | 19 +++++++++++++++++-- include/linux/buffer_head.h | 2 ++ 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 5a1a37a6840a..07ec57ec100e 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -662,7 +662,8 @@ EXPORT_SYMBOL(generic_buffers_fsync); void write_boundary_block(struct block_device *bdev, sector_t bblock, unsigned blocksize) { - struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize); + struct buffer_head *bh = __find_get_block_nonatomic(bdev, bblock + 1, + blocksize); if (bh) { if (buffer_dirty(bh)) write_dirty_buffer(bh, 0); @@ -1418,6 +1419,15 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size) } EXPORT_SYMBOL(__find_get_block); +/* same as __find_get_block() but allows sleeping contexts */ +struct buffer_head * +__find_get_block_nonatomic(struct block_device *bdev, sector_t block, + unsigned size) +{ + return find_get_block_common(bdev, block, size, false); +} +EXPORT_SYMBOL(__find_get_block_nonatomic); + /** * bdev_getblk - Get a buffer_head in a block device's buffer cache. * @bdev: The block device. @@ -1435,7 +1445,12 @@ EXPORT_SYMBOL(__find_get_block); struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block, unsigned size, gfp_t gfp) { - struct buffer_head *bh = __find_get_block(bdev, block, size); + struct buffer_head *bh; + + if (gfpflags_allow_blocking(gfp)) + bh = __find_get_block_nonatomic(bdev, block, size); + else + bh = __find_get_block(bdev, block, size); might_alloc(gfp); if (bh) diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h index f0a4ad7839b6..2b5458517def 100644 --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -222,6 +222,8 @@ void __wait_on_buffer(struct buffer_head *); wait_queue_head_t *bh_waitq_head(struct buffer_head *bh); struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block, unsigned size); +struct buffer_head *__find_get_block_nonatomic(struct block_device *bdev, + sector_t block, unsigned size); struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block, unsigned size, gfp_t gfp); void __brelse(struct buffer_head *); From patchwork Thu Apr 10 01:49:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045736 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19A353596F; Thu, 10 Apr 2025 01:49:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249792; cv=none; b=KfF32NzB1cgAcsRtdUtX0/prBYMBmaEiupnQ63emhiea4fzmbQp60nmiV/H9lElNu9wWQA2S0Xi1ym7+4DdfiwPVPY9p+IuIRf1YKlLpOYVuoiKCLskgQoDxe8004aNYforXahLA5UwvGiEr9QIadvTmgWlFYFgYH6fPnOk7BxU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249792; c=relaxed/simple; bh=n4Z3bfNYR4ZdMyATwh0SIQXXYPOl4F3OtrnP8lUJ4ms=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Yp+R2OPO4E4KAWxRxBd//siwbi9546ILBKu1XjXws7CVPr+K6kXzY/eAdjCdO1xG5lwoEaWi9AuAhk+0phVjJZr8o12MzyDARkH0ydOXz6YSybSU2bPlGpWke560j1TCSJjHGbhi3B8ox9keCuhGwd75YKvz/c3FSbvJ+ftS3JY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=R26qRIYk; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="R26qRIYk" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=+ZrhLZCm8tm/ew0QWyWZbWu1XvI3GATXCgCuKoXd9Bk=; b=R26qRIYkXu0Z3b41jJAv/FgCEj 5Fz5zYumTTbKmUt+n0M/Y2qgsSkKGqiYMVYHQ29n1yoaKcyHxy1Ewjv/RG5lzQG8KQcgLDrrCvriD 9SAkHIK+HG+MCxJyh9NrYIyTPtgf72uG/3nOjjBXM8nijBwDcAcGZxztjJyNqN9u+GeTe4iSEqBbh /MohfrhorRHSbEqcCF3FSpEQOPzGBRol+BXc1dLRMa7hLfJNZ7+7YHy8AAJOxd0T7eZFR2G1fAcsv hs6YeBJqb6Z/nxrgkNXDpvLGd5on0WMn2dNqIKRY88dYoauT2Q+Nbbxj/FJYQ+uYvr5BJ0dJbGmN0 +jP/BxDw==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h35-00000008yvM-3pIY; Thu, 10 Apr 2025 01:49:47 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 4/8] fs/ocfs2: use sleeping version of __find_get_block() Date: Wed, 9 Apr 2025 18:49:41 -0700 Message-ID: <20250410014945.2140781-5-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Davidlohr Bueso This is a path that allows for blocking as it does IO. Convert to the new nonatomic flavor to benefit from potential performance benefits and adapt in the future vs migration such that semantics are kept. Suggested-by: Jan Kara Signed-off-by: Davidlohr Bueso Signed-off-by: Luis Chamberlain --- fs/ocfs2/journal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index f1b4b3e611cb..c7a9729dc9d0 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -1249,7 +1249,7 @@ static int ocfs2_force_read_journal(struct inode *inode) } for (i = 0; i < p_blocks; i++, p_blkno++) { - bh = __find_get_block(osb->sb->s_bdev, p_blkno, + bh = __find_get_block_nonatomic(osb->sb->s_bdev, p_blkno, osb->sb->s_blocksize); /* block not cached. */ if (!bh) From patchwork Thu Apr 10 01:49:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045738 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 522F114884C; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249792; cv=none; b=aq3T9ybR8HkzXIV2E7l30ak8nqmruK+QDz8W+mqb0Bfv5c7ywvBQ6g0IgI4ckAeH/8wBIHu1FoXCHpk7wLHrIMGqVbsbRd/2i5IZF0LU4DihVO75yN6nJjASHSY5DWzy8zb4TtxBGLs/XQn54v7/W+Ckksmv8aVXYMBvAcUf3yw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249792; c=relaxed/simple; bh=PntIbLJLcwut+u4jzwWlmCVqW/K9+alpriCiB45QVFk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=F7g3wVt/IsO9WD419jJ46gVh1BUeQlpoB+gyRYFQOwMUqkRo5wJ4B5PzoB1aX6BfRuOi7NKlSb3tQCsq6BbG9uthiKblnRmZT4Ii89EnctuDwxsOSsLVVgRqN1+3p/6OMT1D9nfDVt0GUARsEEHbhwegmVn0D0PpSiJtWEkqxkE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=Yw//8RAk; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="Yw//8RAk" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=vTKYqeK6owf1W0PFCfkmcr/WH0zrnKyGAnGwrj+ptGw=; b=Yw//8RAkVfn5FzwjQqq2jaRoNX VzPuh+AjIaEx6KIP7Yx5LcTOpptHkTAeDWIPwy+3tzS2/smqJLeuJ+oBb0yAZFkX3RQwVre04CAOy a1WKN5Xl1vTRpgwK25wuPi7rvQ8FCimEw1FU4mZqd2fWe/JTz0rzjT5sCLHgi2bG4VQOIC8Z0y8lZ JKIjgKSt1QMD4NwFRfaEyrgRsIPjAgaT6u40ehpHsnitQqNhMD+2mDVAnGN7nuAuV7fzKRhjJw21X UVL9ZoHR0ZGuINF0W2kc4CH+LJIblsNeZlApOVZpSxxzfcjInWmE96/RUpvF4viwNPZTJiLAFSKyO f7WZbJkg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h35-00000008yvO-47o1; Thu, 10 Apr 2025 01:49:47 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 5/8] fs/jbd2: use sleeping version of __find_get_block() Date: Wed, 9 Apr 2025 18:49:42 -0700 Message-ID: <20250410014945.2140781-6-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Davidlohr Bueso Convert to the new nonatomic flavor to benefit from potential performance benefits and adapt in the future vs migration such that semantics are kept. - jbd2_journal_revoke(): can sleep (has might_sleep() in the beginning) - jbd2_journal_cancel_revoke(): only used from do_get_write_access() and do_get_create_access() which do sleep. So can sleep. - jbd2_clear_buffer_revoked_flags() - only called from journal commit code which sleeps. So can sleep. Suggested-by: Jan Kara Signed-off-by: Davidlohr Bueso Signed-off-by: Luis Chamberlain --- fs/jbd2/revoke.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c index 0cf0fddbee81..1467f6790747 100644 --- a/fs/jbd2/revoke.c +++ b/fs/jbd2/revoke.c @@ -345,7 +345,8 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr, bh = bh_in; if (!bh) { - bh = __find_get_block(bdev, blocknr, journal->j_blocksize); + bh = __find_get_block_nonatomic(bdev, blocknr, + journal->j_blocksize); if (bh) BUFFER_TRACE(bh, "found on hash"); } @@ -355,7 +356,8 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr, /* If there is a different buffer_head lying around in * memory anywhere... */ - bh2 = __find_get_block(bdev, blocknr, journal->j_blocksize); + bh2 = __find_get_block_nonatomic(bdev, blocknr, + journal->j_blocksize); if (bh2) { /* ... and it has RevokeValid status... */ if (bh2 != bh && buffer_revokevalid(bh2)) @@ -464,7 +466,8 @@ void jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh) * state machine will get very upset later on. */ if (need_cancel) { struct buffer_head *bh2; - bh2 = __find_get_block(bh->b_bdev, bh->b_blocknr, bh->b_size); + bh2 = __find_get_block_nonatomic(bh->b_bdev, bh->b_blocknr, + bh->b_size); if (bh2) { if (bh2 != bh) clear_buffer_revoked(bh2); @@ -492,9 +495,9 @@ void jbd2_clear_buffer_revoked_flags(journal_t *journal) struct jbd2_revoke_record_s *record; struct buffer_head *bh; record = (struct jbd2_revoke_record_s *)list_entry; - bh = __find_get_block(journal->j_fs_dev, - record->blocknr, - journal->j_blocksize); + bh = __find_get_block_nonatomic(journal->j_fs_dev, + record->blocknr, + journal->j_blocksize); if (bh) { clear_buffer_revoked(bh); __brelse(bh); From patchwork Thu Apr 10 01:49:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045740 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 548D6193077; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; cv=none; b=YtQhVOLCSJPNjY2C4qldeEyq3YQ0r7T1pKPzEjH5Vrz4WGtUgkplJEBPYalbDSouhG+OIwXEnHL4ya0ZszWeCR9W7+8UrtovXfInwEgyf75fmq7IURHWZkWKtEP+YOSdYYBoed9GmkGQ9/1ayDDcnAIcMFDLhvCtk6lCen3tsVw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; c=relaxed/simple; bh=j18XkRZT8Iv+bow22V0UsyXoR45P8eB4imMXzaz4PiA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gHO5cBue/swdj6HjLKrKpxsQ5g9ASjf0Qh1uH0s9RozBfr2Szx3tiXX955DGQfm2VSX+/zK9oM6lPM5yaf4xDSwSxLv+D/P4+keoPr3W7RR8VOPxuafMuV1lWkwd39J4SaHIiElxjsR8oU33COEdB7kR8D6R4RoWw5nIKwTC2wE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=rJPriUve; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="rJPriUve" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=zXx4xaSMWMb3LOBQgWf2JR22pqZmwI+2jZSbXpELefQ=; b=rJPriUvenypkpmUpeJ0u1yQNW6 9n/TNmnCNyxAF36Iu35qPLksZxeTJr7tDTlQwIFNI9iaqRDrB9jQDJrt6aCobuskV4GVxqAvKIyj6 rMFTAOA1pRqR1+VXmw4FKbHcWBf+LIhHQDeNTEfMh7UuXlvDEbSeeJ/4nTwum7x4jFMPCYPyQ4Iks ozYZuavJJYbt+dWMjbQD5S1TERtIeXXcyMMeCw4UzZ0Ni2ls+jzz94syh2MkBvyTwoEnj2dLqRUDE zn8L6hnMpDY++pw8sgot2giYHriNA8aCLi69OBqzcBgj33tG0d0DRoVZCSPNjc5BzJawmDd3aZ/xP VxFWWpRA==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h36-00000008yvQ-04S3; Thu, 10 Apr 2025 01:49:48 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 6/8] fs/ext4: use sleeping version of __find_get_block() Date: Wed, 9 Apr 2025 18:49:43 -0700 Message-ID: <20250410014945.2140781-7-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Davidlohr Bueso Trivially introduce the wrapper and enable ext4_free_blocks() to use it, which has a cond_resched to begin with. Convert to the new nonatomic flavor to benefit from potential performance benefits and adapt in the future vs migration such that semantics are kept. Suggested-by: Jan Kara Signed-off-by: Davidlohr Bueso Signed-off-by: Luis Chamberlain --- fs/ext4/inode.c | 2 ++ fs/ext4/mballoc.c | 3 ++- include/linux/buffer_head.h | 6 ++++++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 1dc09ed5d403..b7acb5d3adcb 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -860,6 +860,8 @@ struct buffer_head *ext4_getblk(handle_t *handle, struct inode *inode, return sb_find_get_block(inode->i_sb, map.m_pblk); /* + * Potential TODO: use sb_find_get_block_nonatomic() instead. + * * Since bh could introduce extra ref count such as referred by * journal_head etc. Try to avoid using __GFP_MOVABLE here * as it may fail the migration when journal_head remains. diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 0d523e9fb3d5..6f4265b21e19 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -6644,7 +6644,8 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode, for (i = 0; i < count; i++) { cond_resched(); if (is_metadata) - bh = sb_find_get_block(inode->i_sb, block + i); + bh = sb_find_get_block_nonatomic(inode->i_sb, + block + i); ext4_forget(handle, is_metadata, inode, bh, block + i); } } diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h index 2b5458517def..8db10ca288fc 100644 --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -399,6 +399,12 @@ sb_find_get_block(struct super_block *sb, sector_t block) return __find_get_block(sb->s_bdev, block, sb->s_blocksize); } +static inline struct buffer_head * +sb_find_get_block_nonatomic(struct super_block *sb, sector_t block) +{ + return __find_get_block_nonatomic(sb->s_bdev, block, sb->s_blocksize); +} + static inline void map_bh(struct buffer_head *bh, struct super_block *sb, sector_t block) { From patchwork Thu Apr 10 01:49:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045739 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 522AC78F34; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249792; cv=none; b=iwmqhEWDxAm/Pw8tgq3NWdCi5mcESvgeKd0dSV6impt1cqvCR5PiMmgdqCH44vef6+++t56rStXme/UE/B1ArH+2hn9vLH47HQWKN5AsYfBwhC4Si+KQiba2Wry3dkGgh3CM9pTeDG8YEASXIYPhR0viqx+PmwDDckUVjMnkUoY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249792; c=relaxed/simple; bh=g87H8JA2OOq9l6nYFs8Rw8WPVhcwUIpwnUsZZOyLVYE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bKEcp8ENB1hWf9hdRu8P71cy06pqZZBB9nKYwKUit5k4kTeuO5pR3uGoCIWVBxSGoEVXMX4r9ewVP62J9PPfkySOZ9tBbdDvqmKSrH1gulP/x5Mu2p4YBA4rQbQES8LFv+YLKfi+2WTc44HChVep7cMrKp78nCkalEL/4eandR8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=phLch9jc; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="phLch9jc" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=xzMVQnlSZLVrjFGORLU1U+D4R5k/13V//cM2BeZORKg=; b=phLch9jcit+F/148sP3LkjXCiZ /KwpKJXZqxT7edP+btyTcTvrMUZl1GQP0pguCo6GmdzIW16Yke7+tI3/Wv2kifqdyRUdpjwxHY8Kf Ag9kTDteSqT75mrwU89HyrAHr2oWdnrq6YGjXQqjR6tzKiaa+EX7F+L3uusKkSxmIW1G+d8jQ3CvE kZ9BVRX8/Yk/up+pVzR+ZGHbejP+a2cJTT/5JSWQ92WS4LMV88u1PCvKXGJu9GX5xl8lAp9vQeQ0d kmZIn8BP+CjXxaDUH4zaTbywssOP/ELkdlO3d4/pE4du3xn+AQ9Fmx98q8lH2ZKnk7/5TPZjsMIB5 VXkVN5wg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h36-00000008yvS-0MvB; Thu, 10 Apr 2025 01:49:48 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 7/8] mm/migrate: enable noref migration for jbd2 Date: Wed, 9 Apr 2025 18:49:44 -0700 Message-ID: <20250410014945.2140781-8-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Davidlohr Bueso Add semantics to enable future optimizations for buffer head noref jbd2 migration. This adds a new BH_Migrate flag which ensures we can bail on the lookup path. This should enable jbd2 to get semantics of when a buffer head is under folio migration, and should yield to it and to eventually remove the buffer_meta() check skipping current jbd2 folio migration. Suggested-by: Jan Kara Co-developed-by: Luis Chamberlain Signed-off-by: Davidlohr Bueso Signed-off-by: Luis Chamberlain --- fs/buffer.c | 12 +++++++++++- fs/ext4/ialloc.c | 3 ++- include/linux/buffer_head.h | 1 + mm/migrate.c | 4 ++++ 4 files changed, 18 insertions(+), 2 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 07ec57ec100e..753fc45667da 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -211,6 +211,15 @@ __find_get_block_slow(struct block_device *bdev, sector_t block, bool atomic) head = folio_buffers(folio); if (!head) goto out_unlock; + /* + * Upon a noref migration, the folio lock serializes here; + * otherwise bail. + */ + if (test_bit_acquire(BH_Migrate, &head->b_state)) { + WARN_ON(folio_locked); + goto out_unlock; + } + bh = head; do { if (!buffer_mapped(bh)) @@ -1393,7 +1402,8 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size) /* * Perform a pagecache lookup for the matching buffer. If it's there, refresh * it in the LRU and mark it as accessed. If it is not present then return - * NULL + * NULL. Atomic context callers may also return NULL if the buffer is being + * migrated; similarly the page is not marked accessed either. */ static struct buffer_head * find_get_block_common(struct block_device *bdev, sector_t block, diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 38bc8d74f4cc..e7ecc7c8a729 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -691,7 +691,8 @@ static int recently_deleted(struct super_block *sb, ext4_group_t group, int ino) if (!bh || !buffer_uptodate(bh)) /* * If the block is not in the buffer cache, then it - * must have been written out. + * must have been written out, or, most unlikely, is + * being migrated - false failure should be OK here. */ goto out; diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h index 8db10ca288fc..5559b906b1de 100644 --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -34,6 +34,7 @@ enum bh_state_bits { BH_Meta, /* Buffer contains metadata */ BH_Prio, /* Buffer should be submitted with REQ_PRIO */ BH_Defer_Completion, /* Defer AIO completion to workqueue */ + BH_Migrate, /* Buffer is being migrated (norefs) */ BH_PrivateStart,/* not a state bit, but the first bit available * for private allocation by other entities diff --git a/mm/migrate.c b/mm/migrate.c index 32fa72ba10b4..8fed2655f2e8 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -851,6 +851,8 @@ static int __buffer_migrate_folio(struct address_space *mapping, bool busy; bool invalidated = false; + VM_WARN_ON_ONCE(test_and_set_bit_lock(BH_Migrate, + &head->b_state)); recheck_buffers: busy = false; spin_lock(&mapping->i_private_lock); @@ -884,6 +886,8 @@ static int __buffer_migrate_folio(struct address_space *mapping, bh = bh->b_this_page; } while (bh != head); unlock_buffers: + if (check_refs) + clear_bit_unlock(BH_Migrate, &head->b_state); bh = head; do { unlock_buffer(bh); From patchwork Thu Apr 10 01:49:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 14045743 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9C4C61991DB; Thu, 10 Apr 2025 01:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; cv=none; b=SlBwwovPcSs585OCLqXRHIOluBsAKgp3AE9TvrP8Gtq7Y1w+1zj5f4S2zTDDqXn/iR2k6XWuIe5zPOXuK0lA3aMTa7oYvm5zdXJYKw8haY7BcKBAndJF4sE1YADUkxm4YDEE8G5oNMc9h4q4RujzQ63Pzeja4Pp+abnxjTMcafc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744249793; c=relaxed/simple; bh=WxZ1ENIkDUorMda0aZ6BTZiU7BtJJXWHNlFF1GnZUbw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ori8X4Ci6EJqp+7T3CjURE4cBFhrCLsX//X1ERBfsnAblsw0iKuCXYjO06hjYckMtb0a/3DqW1zAq6RfB/Rd82mEVgI8UzBWLWunc0hWh3l1s6vubWn2Eczwci3AE8VoKuFEZSIxHPWhsxvlCzkjciZ8mU+OE4dA5b4UMpTbkE0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=ewoFzbvy; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="ewoFzbvy" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=Lo+mNNCNF0Kcn2O97aAamvTUH5kTxQV3YmJyIlBZZYM=; b=ewoFzbvyobG0uDxAjVoFDwjHWk WuZ/3S9yAjfKR2FmAUSEeXkkrkoEv+JUrk7lOQjdFVWExzB7lNCFT/WwVAeEYTqA/sy2xN3HQia6i /ghOVN2nNtzpBl+oOAfTpEl6qx2iF9VVGH+99o+w8iKGr0xEC5ojaCJ2SSQGhESiqnCkB87He/3r8 G+O96d8oiH4BoJHjpRwltFGCAvF/VC5EJEDVr5tvE1SIwBCillDTlLlrPdDeGASJydAFVQnvBxvOZ YZQBbE6LPTYlsCm4iSU/5Xr+xRWoVGrXbzvRonVbCbxo2U5PhfpawzAmeQ726cHZZIDKTeO97pgca Rfa3HOOg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1u2h36-00000008yvU-0WmT; Thu, 10 Apr 2025 01:49:48 +0000 From: Luis Chamberlain To: brauner@kernel.org, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com Cc: dave@stgolabs.net, willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com, djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, mcgrof@kernel.org Subject: [PATCH v2 8/8] mm: add migration buffer-head debugfs interface Date: Wed, 9 Apr 2025 18:49:45 -0700 Message-ID: <20250410014945.2140781-9-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250410014945.2140781-1-mcgrof@kernel.org> References: <20250410014945.2140781-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain If you are working on enhancing folio migration it is easy to not be certain on improvements. This debugfs interface enables you to evaluate gains on improvements on buffer-head folio migration. This can easily tell you *why* folio migration might fail, for example, here is the output of a generic/750 run for 18 hours: root@e3-ext4-2k ~ # cat /sys/kernel/debug/mm/migrate/bh/stats [buffer_migrate_folio] calls 50160811 success 50047572 fails 113239 [buffer_migrate_folio_norefs] calls 23577082468 success 2939858 fails 23574142610 jbd-meta 23425956714 no-head-success 102 no-head-fails 0 invalid 147919982 valid 2939881 valid-success 2939756 valid-fails 125 Success ratios: buffer_migrate_folio: 99% success (50047572/50160811) buffer_migrate_folio_norefs: 0% success (2939858/23577082468) Signed-off-by: Luis Chamberlain --- mm/migrate.c | 184 +++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 178 insertions(+), 6 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 8fed2655f2e8..c478e8218cb0 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -44,6 +44,7 @@ #include #include #include +#include #include @@ -791,6 +792,126 @@ int migrate_folio(struct address_space *mapping, struct folio *dst, EXPORT_SYMBOL(migrate_folio); #ifdef CONFIG_BUFFER_HEAD + +static const char * const bh_routine_names[] = { + "buffer_migrate_folio", + "buffer_migrate_folio_norefs", +}; + +#define BH_STATS(X) \ + X(bh_migrate_folio, 0, "calls") \ + X(bh_migrate_folio_success, 0, "success") \ + X(bh_migrate_folio_fails, 0, "fails") \ + X(bh_migrate_folio_norefs, 1, "calls") \ + X(bh_migrate_folio_norefs_success, 1, "success") \ + X(bh_migrate_folio_norefs_fails, 1, "fails") \ + X(bh_migrate_folio_norefs_meta, 1, "jbd-meta") \ + X(bh_migrate_folio_norefs_nohead_success, 1, "no-head-success") \ + X(bh_migrate_folio_norefs_nohead_fails, 1, "no-head-fails") \ + X(bh_migrate_folio_norefs_invalid, 1, "invalid") \ + X(bh_migrate_folio_norefs_valid, 1, "valid") \ + X(bh_migrate_folio_norefs_valid_success, 1, "valid-success") \ + X(bh_migrate_folio_norefs_valid_fails, 1, "valid-fails") + + +#define DECLARE_STAT(name, routine_idx, meaning) static atomic_long_t name; +BH_STATS(DECLARE_STAT) + +#define BH_STAT_PTR(name, routine_idx, meaning) &name, +static atomic_long_t * const bh_stat_array[] = { + BH_STATS(BH_STAT_PTR) +}; + +#define BH_STAT_ROUTINE_IDX(name, routine_idx, meaning) routine_idx, +static const int bh_stat_routine_index[] = { + BH_STATS(BH_STAT_ROUTINE_IDX) +}; + +#define BH_STAT_MEANING(name, routine_idx, meaning) meaning, +static const char * const bh_stat_meanings[] = { + BH_STATS(BH_STAT_MEANING) +}; + +#define NUM_BH_STATS ARRAY_SIZE(bh_stat_array) + +static ssize_t read_file_bh_migrate_stats(struct file *file, + char __user *user_buf, + size_t count, loff_t *ppos) +{ + char *buf; + unsigned int i, len = 0, size = NUM_BH_STATS * 128; + int ret, last_routine = -1; + unsigned long total, success, rate; + + BUILD_BUG_ON(ARRAY_SIZE(bh_stat_array) != ARRAY_SIZE(bh_stat_meanings)); + + buf = kzalloc(size, GFP_KERNEL); + if (buf == NULL) + return -ENOMEM; + + for (i = 0; i < NUM_BH_STATS; i++) { + int routine_idx = bh_stat_routine_index[i]; + + if (routine_idx != last_routine) { + len += scnprintf(buf + len, size - len, "\n[%s]\n", + bh_routine_names[routine_idx]); + last_routine = routine_idx; + } + + len += scnprintf(buf + len, size - len, "%25s\t%lu\n", + bh_stat_meanings[i], + atomic_long_read(bh_stat_array[i])); + + } + + len += scnprintf(buf + len, size - len, "\nSuccess ratios:\n"); + + total = atomic_long_read(&bh_migrate_folio); + success = atomic_long_read(&bh_migrate_folio_success); + rate = total ? (success * 100) / total : 0; + len += scnprintf(buf + len, size - len, + "%s: %lu%% success (%lu/%lu)\n", + "buffer_migrate_folio", rate, success, total); + + total = atomic_long_read(&bh_migrate_folio_norefs); + success = atomic_long_read(&bh_migrate_folio_norefs_success); + rate = total ? (success * 100) / total : 0; + len += scnprintf(buf + len, size - len, + "%s: %lu%% success (%lu/%lu)\n", + "buffer_migrate_folio_norefs", rate, success, total); + + ret = simple_read_from_buffer(user_buf, count, ppos, buf, len); + kfree(buf); + return ret; +} + +static const struct file_operations fops_bh_migrate_stats = { + .read = read_file_bh_migrate_stats, + .open = simple_open, + .owner = THIS_MODULE, + .llseek = default_llseek, +}; + +static void mm_migrate_bh_init(struct dentry *migrate_debug_root) +{ + struct dentry *parent_dirs[ARRAY_SIZE(bh_routine_names)] = { NULL }; + struct dentry *root = debugfs_create_dir("bh", migrate_debug_root); + int i; + + for (i = 0; i < ARRAY_SIZE(bh_routine_names); i++) + parent_dirs[i] = debugfs_create_dir(bh_routine_names[i], root); + + for (i = 0; i < NUM_BH_STATS; i++) { + int routine = bh_stat_routine_index[i]; + debugfs_create_ulong(bh_stat_meanings[i], 0400, + parent_dirs[routine], + (unsigned long *) + &bh_stat_array[i]->counter); + } + + debugfs_create_file("stats", 0400, root, root, &fops_bh_migrate_stats); +} + /* Returns true if all buffers are successfully locked */ static bool buffer_migrate_lock_buffers(struct buffer_head *head, enum migrate_mode mode) @@ -833,16 +954,26 @@ static int __buffer_migrate_folio(struct address_space *mapping, int expected_count; head = folio_buffers(src); - if (!head) - return migrate_folio(mapping, dst, src, mode); + if (!head) { + rc = migrate_folio(mapping, dst, src, mode); + if (check_refs) { + if (rc == 0) + atomic_long_inc(&bh_migrate_folio_norefs_nohead_success); + else + atomic_long_inc(&bh_migrate_folio_norefs_nohead_fails); + } + return rc; + } /* Check whether page does not have extra refs before we do more work */ expected_count = folio_expected_refs(mapping, src); if (folio_ref_count(src) != expected_count) return -EAGAIN; - if (buffer_meta(head)) + if (buffer_meta(head)) { + atomic_long_inc(&bh_migrate_folio_norefs_meta); return -EAGAIN; + } if (!buffer_migrate_lock_buffers(head, mode)) return -EAGAIN; @@ -868,17 +999,23 @@ static int __buffer_migrate_folio(struct address_space *mapping, if (busy) { if (invalidated) { rc = -EAGAIN; + atomic_long_inc(&bh_migrate_folio_norefs_invalid); goto unlock_buffers; } invalidate_bh_lrus(); invalidated = true; goto recheck_buffers; } + atomic_long_inc(&bh_migrate_folio_norefs_valid); } rc = filemap_migrate_folio(mapping, dst, src, mode); - if (rc != MIGRATEPAGE_SUCCESS) + if (rc != MIGRATEPAGE_SUCCESS) { + if (check_refs) + atomic_long_inc(&bh_migrate_folio_norefs_valid_fails); goto unlock_buffers; + } else if (check_refs) + atomic_long_inc(&bh_migrate_folio_norefs_valid_success); bh = head; do { @@ -915,7 +1052,16 @@ static int __buffer_migrate_folio(struct address_space *mapping, int buffer_migrate_folio(struct address_space *mapping, struct folio *dst, struct folio *src, enum migrate_mode mode) { - return __buffer_migrate_folio(mapping, dst, src, mode, false); + int ret; + atomic_long_inc(&bh_migrate_folio); + + ret = __buffer_migrate_folio(mapping, dst, src, mode, false); + if (ret == 0) + atomic_long_inc(&bh_migrate_folio_success); + else + atomic_long_inc(&bh_migrate_folio_fails); + + return ret; } EXPORT_SYMBOL(buffer_migrate_folio); @@ -936,9 +1082,21 @@ EXPORT_SYMBOL(buffer_migrate_folio); int buffer_migrate_folio_norefs(struct address_space *mapping, struct folio *dst, struct folio *src, enum migrate_mode mode) { - return __buffer_migrate_folio(mapping, dst, src, mode, true); + int ret; + + atomic_long_inc(&bh_migrate_folio_norefs); + + ret = __buffer_migrate_folio(mapping, dst, src, mode, true); + if (ret == 0) + atomic_long_inc(&bh_migrate_folio_norefs_success); + else + atomic_long_inc(&bh_migrate_folio_norefs_fails); + + return ret; } EXPORT_SYMBOL_GPL(buffer_migrate_folio_norefs); +#else +static inline void mm_migrate_bh_init(struct dentry *migrate_debug_root) { } #endif /* CONFIG_BUFFER_HEAD */ int filemap_migrate_folio(struct address_space *mapping, @@ -2737,3 +2895,17 @@ int migrate_misplaced_folio(struct folio *folio, int node) } #endif /* CONFIG_NUMA_BALANCING */ #endif /* CONFIG_NUMA */ + +static __init int mm_migrate_debugfs_init(void) +{ + struct dentry *mm_debug_root; + struct dentry *migrate_debug_root; + + mm_debug_root = debugfs_create_dir("mm", NULL); + migrate_debug_root = debugfs_create_dir("migrate", mm_debug_root); + + mm_migrate_bh_init(migrate_debug_root); + + return 0; +} +fs_initcall(mm_migrate_debugfs_init);