From patchwork Mon Sep 18 23:12:05 2023
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13390567
Subject: [PATCH 1/2] iomap: don't skip reading in !uptodate folios when
 unsharing a range
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: ritesh.list@gmail.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Date: Mon, 18 Sep 2023 16:12:05 -0700
Message-ID: <169507872536.772278.18183365318216726644.stgit@frogsfrogsfrogs>
In-Reply-To: <169507871947.772278.5767091361086740046.stgit@frogsfrogsfrogs>
References: <169507871947.772278.5767091361086740046.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19

From: Darrick J. Wong

Prior to commit a01b8f225248e, we would always read in the contents of a
!uptodate folio prior to writing userspace data into the folio, allocate a
folio state object, etc.  Ritesh introduced an optimization that skips all
of that if the write would cover the entire folio.

Unfortunately, the optimization misses the unshare case, where we always
have to read in the folio contents since there isn't a data buffer supplied
by userspace.  This can result in stale kernel memory exposure if userspace
issues a FALLOC_FL_UNSHARE_RANGE call on part of a shared file that isn't
already cached.

This was caught by observing fstests regressions in the "unshare around"
mechanism that is used for unaligned writes to a reflinked realtime volume
when the realtime extent size is larger than 1FSB, though I think it
applies to any shared file.

Cc: ritesh.list@gmail.com, willy@infradead.org
Fixes: a01b8f225248e ("iomap: Allocate ifs in ->write_begin() early")
Signed-off-by: Darrick J. Wong
Reviewed-by: Ritesh Harjani (IBM)
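For context: the unshare path described above is reached from userspace
through fallocate(2) with FALLOC_FL_UNSHARE_RANGE.  Below is a minimal
sketch of such a call, separate from the patch itself; the file path and
length are illustrative, and the file is assumed to share blocks with a
reflink copy while its pagecache is not populated.

/* Illustrative only: exercise FALLOC_FL_UNSHARE_RANGE on a shared file. */
#define _GNU_SOURCE
#include <fcntl.h>              /* open(), fallocate() */
#include <linux/falloc.h>       /* FALLOC_FL_UNSHARE_RANGE */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
        /* Hypothetical path to a file that shares blocks with a clone. */
        const char *path = "/mnt/test/file2";
        off_t len = 4096;       /* unshare at least one pagecache page */

        int fd = open(path, O_RDWR);
        if (fd < 0) {
                perror("open");
                return EXIT_FAILURE;
        }

        /* Break block sharing without changing the file contents. */
        if (fallocate(fd, FALLOC_FL_UNSHARE_RANGE, 0, len) < 0)
                perror("fallocate(FALLOC_FL_UNSHARE_RANGE)");

        close(fd);
        return 0;
}

This is roughly what the regression test at the end of this series drives
through xfs_io's "funshare" command.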
---
 fs/iomap/buffered-io.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ae8673ce08b1..0350830fc989 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -640,11 +640,13 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
         size_t poff, plen;
 
         /*
-         * If the write completely overlaps the current folio, then
+         * If the write or zeroing completely overlaps the current folio, then
          * entire folio will be dirtied so there is no need for
          * per-block state tracking structures to be attached to this folio.
+         * For the unshare case, we must read in the ondisk contents because we
+         * are not changing pagecache contents.
          */
-        if (pos <= folio_pos(folio) &&
+        if (!(iter->flags & IOMAP_UNSHARE) && pos <= folio_pos(folio) &&
             pos + len >= folio_pos(folio) + folio_size(folio))
                 return 0;
 

From patchwork Mon Sep 18 23:12:11 2023
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13390568
Subject: [PATCH 2/2] iomap: convert iomap_unshare_iter to use large folios
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: ritesh.list@gmail.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Date: Mon, 18 Sep 2023 16:12:11 -0700
Message-ID: <169507873100.772278.2320683121600245730.stgit@frogsfrogsfrogs>
In-Reply-To: <169507871947.772278.5767091361086740046.stgit@frogsfrogsfrogs>
References: <169507871947.772278.5767091361086740046.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19

From: Darrick J. Wong

Convert iomap_unshare_iter to create large folios if possible, since the
write and zeroing paths already do that.  I think this got missed in the
conversion of the write paths that landed in 6.6-rc1.

Cc: ritesh.list@gmail.com, willy@infradead.org
Signed-off-by: Darrick J. Wong
Reviewed-by: Ritesh Harjani (IBM)
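To illustrate what the conversion below does per iteration, here is a small
standalone sketch (not kernel code; the helper name and example values are
invented): each pass is now clamped to however much of the folio returned by
iomap_write_begin() lies past pos, rather than being capped at PAGE_SIZE.

/* Standalone sketch of the per-iteration byte clamping; not kernel code. */
#include <stdint.h>
#include <stdio.h>

/*
 * How many bytes one pass of the loop would process, given the folio that
 * iomap_write_begin() handed back.  Mirrors the clamping added by the patch.
 */
static size_t unshare_chunk(uint64_t pos, uint64_t remaining,
                            uint64_t folio_pos, size_t folio_size)
{
        size_t bytes = remaining > SIZE_MAX ? SIZE_MAX : (size_t)remaining;
        size_t offset = (size_t)(pos - folio_pos);      /* offset_in_folio() */

        if (bytes > folio_size - offset)
                bytes = folio_size - offset;
        return bytes;
}

int main(void)
{
        /*
         * Example: a 64KiB folio starting at offset 0, unsharing 1MiB from
         * pos 4096.  The old PAGE_SIZE-based loop would handle 4KiB here;
         * the folio-based loop handles the remaining 60KiB of the folio.
         */
        printf("%zu bytes this pass\n",
               unshare_chunk(4096, 1048576, 0, 65536));
        return 0;
}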
---
 fs/iomap/buffered-io.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 0350830fc989..db889bdfd327 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1263,7 +1263,6 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
         const struct iomap *srcmap = iomap_iter_srcmap(iter);
         loff_t pos = iter->pos;
         loff_t length = iomap_length(iter);
-        long status = 0;
         loff_t written = 0;
 
         /* don't bother with blocks that are not shared to start with */
@@ -1274,9 +1273,10 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
                 return length;
 
         do {
-                unsigned long offset = offset_in_page(pos);
-                unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length);
                 struct folio *folio;
+                int status;
+                size_t offset;
+                size_t bytes = min_t(u64, SIZE_MAX, length);
 
                 status = iomap_write_begin(iter, pos, bytes, &folio);
                 if (unlikely(status))
@@ -1284,18 +1284,22 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
                 if (iter->iomap.flags & IOMAP_F_STALE)
                         break;
 
-                status = iomap_write_end(iter, pos, bytes, bytes, folio);
-                if (WARN_ON_ONCE(status == 0))
+                offset = offset_in_folio(folio, pos);
+                if (bytes > folio_size(folio) - offset)
+                        bytes = folio_size(folio) - offset;
+
+                bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+                if (WARN_ON_ONCE(bytes == 0))
                         return -EIO;
 
                 cond_resched();
 
-                pos += status;
-                written += status;
-                length -= status;
+                pos += bytes;
+                written += bytes;
+                length -= bytes;
 
                 balance_dirty_pages_ratelimited(iter->inode->i_mapping);
-        } while (length);
+        } while (length > 0);
 
         return written;
 }

From patchwork Mon Sep 18 23:19:45 2023
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13390572
Date: Mon, 18 Sep 2023 16:19:45 -0700
From: "Darrick J. Wong"
Wong" To: ritesh.list@gmail.com, willy@infradead.org, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH 3/2] fstests: test FALLOC_FL_UNSHARE when pagecache is not loaded Message-ID: <20230918231945.GC348018@frogsfrogsfrogs> References: <169507871947.772278.5767091361086740046.stgit@frogsfrogsfrogs> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <169507871947.772278.5767091361086740046.stgit@frogsfrogsfrogs> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a regression test for funsharing uncached files to ensure that we actually manage the pagecache state correctly. Signed-off-by: Darrick J. Wong Reviewed-by: Ritesh Harjani (IBM) --- tests/xfs/1936 | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/xfs/1936.out | 4 ++ 2 files changed, 92 insertions(+) create mode 100755 tests/xfs/1936 create mode 100644 tests/xfs/1936.out diff --git a/tests/xfs/1936 b/tests/xfs/1936 new file mode 100755 index 0000000000..bcf9b6b478 --- /dev/null +++ b/tests/xfs/1936 @@ -0,0 +1,88 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2023 Oracle. All Rights Reserved. +# +# FS QA Test 1936 +# +# This is a regression test for the kernel commit noted below. The stale +# memory exposure can be exploited by creating a file with shared blocks, +# evicting the page cache for that file, and then funshareing at least one +# memory page's worth of data. iomap will mark the page uptodate and dirty +# without ever reading the ondisk contents. +# +. ./common/preamble +_begin_fstest auto quick unshare clone + +_cleanup() +{ + cd / + rm -r -f $tmp.* $testdir +} + +# real QA test starts here + +# Import common functions. +. ./common/filter +. ./common/attr +. ./common/reflink + +_fixed_by_git_commit kernel XXXXXXXXXXXXX \ + "iomap: don't skip reading in !uptodate folios when unsharing a range" + +# real QA test starts here +_require_test_reflink +_require_cp_reflink +_require_xfs_io_command "funshare" + +testdir=$TEST_DIR/test-$seq +rm -rf $testdir +mkdir $testdir + +# Create a file that is at least four pages in size and aligned to the +# file allocation unit size so that we don't trigger any unnecessary zeroing. +pagesz=$(_get_page_size) +alloc_unit=$(_get_file_block_size $TEST_DIR) +filesz=$(( ( (4 * pagesz) + alloc_unit - 1) / alloc_unit * alloc_unit)) + +echo "Create the original file and a clone" +_pwrite_byte 0x61 0 $filesz $testdir/file2.chk >> $seqres.full +_pwrite_byte 0x61 0 $filesz $testdir/file1 >> $seqres.full +_cp_reflink $testdir/file1 $testdir/file2 +_cp_reflink $testdir/file1 $testdir/file3 + +_test_cycle_mount + +cat $testdir/file3 > /dev/null + +echo "Funshare at least one pagecache page" +$XFS_IO_PROG -c "funshare 0 $filesz" $testdir/file2 +$XFS_IO_PROG -c "funshare 0 $filesz" $testdir/file3 +_pwrite_byte 0x61 0 $filesz $testdir/file2.chk >> $seqres.full + +echo "Check contents" + +# file2 wasn't cached when it was unshared, but it should match +if ! cmp -s $testdir/file2.chk $testdir/file2; then + echo "file2.chk does not match file2" + + echo "file2.chk contents" >> $seqres.full + od -tx1 -Ad -c $testdir/file2.chk >> $seqres.full + echo "file2 contents" >> $seqres.full + od -tx1 -Ad -c $testdir/file2 >> $seqres.full + echo "end bad contents" >> $seqres.full +fi + +# file3 was cached when it was unshared, and it should match +if ! 
+if ! cmp -s $testdir/file2.chk $testdir/file3; then
+        echo "file2.chk does not match file3"
+
+        echo "file2.chk contents" >> $seqres.full
+        od -tx1 -Ad -c $testdir/file2.chk >> $seqres.full
+        echo "file3 contents" >> $seqres.full
+        od -tx1 -Ad -c $testdir/file3 >> $seqres.full
+        echo "end bad contents" >> $seqres.full
+fi
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1936.out b/tests/xfs/1936.out
new file mode 100644
index 0000000000..c7c820ced5
--- /dev/null
+++ b/tests/xfs/1936.out
@@ -0,0 +1,4 @@
+QA output created by 1936
+Create the original file and a clone
+Funshare at least one pagecache page
+Check contents
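A closing note on the size calculation used by the test above: filesz rounds
four pages up to the file allocation unit so the funshare range covers whole
allocation units and no extra zeroing is triggered.  A standalone sketch of
that arithmetic follows; the page size and allocation unit are example values
only.

/* Standalone sketch of the test's filesz rounding; values are examples. */
#include <stdio.h>

int main(void)
{
        unsigned long pagesz = 4096;            /* _get_page_size */
        unsigned long alloc_unit = 65536;       /* e.g. a large rt extent */

        /* Round 4 pages up to a multiple of the allocation unit. */
        unsigned long filesz =
                ((4 * pagesz) + alloc_unit - 1) / alloc_unit * alloc_unit;

        printf("filesz = %lu\n", filesz);       /* 65536 for these inputs */
        return 0;
}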