[v2,10/11] fs/buffer.c: support fsverity in block_read_full_folio()

Message ID 20221223203638.41293-11-ebiggers@kernel.org (mailing list archive)
State Accepted
Series fsverity: support for non-4K pages

Commit Message

Eric Biggers Dec. 23, 2022, 8:36 p.m. UTC
From: Eric Biggers <ebiggers@google.com>

After each filesystem block (as represented by a buffer_head) has been
read from disk by block_read_full_folio(), verify it if needed.  The
verification is done on the fsverity_read_workqueue.  Also allow reads
of verity metadata past i_size, as required by ext4.

This is needed to support fsverity on ext4 filesystems where the
filesystem block size is less than the page size.

The new code is compiled away when CONFIG_FS_VERITY=n.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/buffer.c | 67 +++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 57 insertions(+), 10 deletions(-)
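
The ordering described in the commit message (decrypt first if needed, then verify if needed, then finish the read) can be modelled in user space roughly as follows. This is only a sketch: the enum and function names are hypothetical and do not exist in fs/buffer.c, and the allocation-failure path of the real completion handler is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; they do not exist in fs/buffer.c. */
enum bh_step {
	BH_STEP_DONE,    /* finish the read now (end_buffer_async_read) */
	BH_STEP_DECRYPT, /* queue decrypt_bh on the fscrypt workqueue */
	BH_STEP_VERIFY,  /* queue verify_bh on the fsverity workqueue */
};

/* Decision made in the I/O completion handler. */
enum bh_step step_after_io(bool uptodate, bool decrypt, bool verify)
{
	if (!uptodate)
		return BH_STEP_DONE;	/* I/O error: nothing to post-process */
	if (decrypt)
		return BH_STEP_DECRYPT;	/* decryption always runs first */
	if (verify)
		return BH_STEP_VERIFY;
	return BH_STEP_DONE;
}

/* Decision made after decryption; err == 0 means success. */
enum bh_step step_after_decrypt(int err, bool verify)
{
	if (err == 0 && verify)
		return BH_STEP_VERIFY;
	return BH_STEP_DONE;
}

/* Walk the possible paths for a block that needs post-processing. */
void bh_step_example(void)
{
	/* Encrypted + verity: decrypt first, then hand off to verification. */
	assert(step_after_io(true, true, true) == BH_STEP_DECRYPT);
	assert(step_after_decrypt(0, true) == BH_STEP_VERIFY);
	/* Verity only: go straight to verification. */
	assert(step_after_io(true, false, true) == BH_STEP_VERIFY);
}
```

In this model a failed decryption never reaches the verify step, which matches the `err == 0 && need_fsverity(bh)` check in the patch.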

Comments

Andrew Morton Jan. 10, 2023, 2:37 a.m. UTC | #1
On Fri, 23 Dec 2022 12:36:37 -0800 Eric Biggers <ebiggers@kernel.org> wrote:

> After each filesystem block (as represented by a buffer_head) has been
> read from disk by block_read_full_folio(), verify it if needed.  The
> verification is done on the fsverity_read_workqueue.  Also allow reads
> of verity metadata past i_size, as required by ext4.

Sigh.  Do we reeeeealy need to mess with buffer.c in this fashion?  Did
any other subsystems feel a need to do this?

> This is needed to support fsverity on ext4 filesystems where the
> filesystem block size is less than the page size.

Does any real person actually do this?

Eric Biggers Jan. 10, 2023, 3:05 a.m. UTC | #2
On Mon, Jan 09, 2023 at 06:37:59PM -0800, Andrew Morton wrote:
> On Fri, 23 Dec 2022 12:36:37 -0800 Eric Biggers <ebiggers@kernel.org> wrote:
> 
> > After each filesystem block (as represented by a buffer_head) has been
> > read from disk by block_read_full_folio(), verify it if needed.  The
> > verification is done on the fsverity_read_workqueue.  Also allow reads
> > of verity metadata past i_size, as required by ext4.
> 
> Sigh.  Do we reeeeealy need to mess with buffer.c in this fashion?  Did
> any other subsystems feel a need to do this?

ext4 is currently the only filesystem that uses block_read_full_folio() and that
supports fsverity.  However, since fsverity has a common infrastructure across
filesystems, in fs/verity/, it makes sense to support it in the other filesystem
infrastructure so that things aren't mutually exclusive for no reason.

Note that this applies to fscrypt too, which block_read_full_folio() (previously
block_read_full_page()) already supports since v5.5.

If you'd prefer that block_read_full_folio() be copied into ext4, then modified
to support fscrypt and fsverity, and then the fscrypt support removed from the
original copy, we could do that.  That seems more like a workaround to avoid
modifying certain files than an actually better solution, but it could be done.

> 
> > This is needed to support fsverity on ext4 filesystems where the
> > filesystem block size is less than the page size.
> 
> Does any real person actually do this?

Yes, on systems with the page size larger than 4K, the ext4 filesystem block
size is often smaller than the page size.  ext4 encryption (fscrypt) originally
had the same limitation, and Chandan Rajendra from IBM did significant work to
solve it a few years ago, with the changes landing in v5.5.

- Eric
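
To make the "block size smaller than page size" case concrete, here is a small user-space sketch of the arithmetic. The helper names are hypothetical, and the 64K page size used in the note below is an assumed example configuration, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Number of filesystem blocks (buffer_heads) that back one page. */
unsigned int blocks_per_page(unsigned int page_shift, unsigned int block_bits)
{
	assert(block_bits <= page_shift);
	return 1u << (page_shift - block_bits);
}

/*
 * First filesystem block covered by the page at page_index, mirroring
 * "iblock = (sector_t)folio->index << (PAGE_SHIFT - bbits)" in
 * block_read_full_folio().
 */
uint64_t first_block_of_page(uint64_t page_index, unsigned int page_shift,
			     unsigned int block_bits)
{
	assert(block_bits <= page_shift);
	return page_index << (page_shift - block_bits);
}
```

With an assumed page shift of 16 (64K pages) and a block size of 4K (12 bits), each page is backed by 16 blocks, so each of them has to be decrypted and/or verified individually rather than treating the page as one unit.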

Eric Biggers Jan. 20, 2023, 7:56 p.m. UTC | #3
On Mon, Jan 09, 2023 at 07:05:07PM -0800, Eric Biggers wrote:
> On Mon, Jan 09, 2023 at 06:37:59PM -0800, Andrew Morton wrote:
> > On Fri, 23 Dec 2022 12:36:37 -0800 Eric Biggers <ebiggers@kernel.org> wrote:
> > 
> > > After each filesystem block (as represented by a buffer_head) has been
> > > read from disk by block_read_full_folio(), verify it if needed.  The
> > > verification is done on the fsverity_read_workqueue.  Also allow reads
> > > of verity metadata past i_size, as required by ext4.
> > 
> > Sigh.  Do we reeeeealy need to mess with buffer.c in this fashion?  Did
> > any other subsystems feel a need to do this?
> 
> ext4 is currently the only filesystem that uses block_read_full_folio() and that
> supports fsverity.  However, since fsverity has a common infrastructure across
> filesystems, in fs/verity/, it makes sense to support it in the other filesystem
> infrastructure so that things aren't mutually exclusive for no reason.
> 
> Note that this applies to fscrypt too, which block_read_full_folio() (previously
> block_read_full_page()) already supports since v5.5.
> 
> If you'd prefer that block_read_full_folio() be copied into ext4, then modified
> to support fscrypt and fsverity, and then the fscrypt support removed from the
> original copy, we could do that.  That seems more like a workaround to avoid
> modifying certain files than an actually better solution, but it could be done.
> 
> > 
> > > This is needed to support fsverity on ext4 filesystems where the
> > > filesystem block size is less than the page size.
> > 
> > Does any real person actually do this?
> 
> Yes, on systems with the page size larger than 4K, the ext4 filesystem block
> size is often smaller than the page size.  ext4 encryption (fscrypt) originally
> had the same limitation, and Chandan Rajendra from IBM did significant work to
> solve it a few years ago, with the changes landing in v5.5.
> 
> - Eric

Any more thoughts on this from Andrew, the ext4 maintainers, or anyone else?

- Eric

Christoph Hellwig Jan. 21, 2023, 6:39 a.m. UTC | #4
On Fri, Jan 20, 2023 at 11:56:45AM -0800, Eric Biggers wrote:
> Any more thoughts on this from Andrew, the ext4 maintainers, or anyone else?

As someone else:  I really much prefer to support common functionality
(fsverity) in common helpers rather than copying and pasting them into
various file systems.  The "copy a common helper and slightly modify it"
pattern is a cancer infecting various file systems that makes it really
hard to maintain the kernel.
Patch

diff --git a/fs/buffer.c b/fs/buffer.c
index d9c6d1fbb6dde..2e65ba2b3919b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -48,6 +48,7 @@ 
 #include <linux/sched/mm.h>
 #include <trace/events/block.h>
 #include <linux/fscrypt.h>
+#include <linux/fsverity.h>
 
 #include "internal.h"
 
@@ -295,20 +296,52 @@  static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
 	return;
 }
 
-struct decrypt_bh_ctx {
+struct postprocess_bh_ctx {
 	struct work_struct work;
 	struct buffer_head *bh;
 };
 
+static void verify_bh(struct work_struct *work)
+{
+	struct postprocess_bh_ctx *ctx =
+		container_of(work, struct postprocess_bh_ctx, work);
+	struct buffer_head *bh = ctx->bh;
+	bool valid;
+
+	valid = fsverity_verify_blocks(bh->b_page, bh->b_size, bh_offset(bh));
+	end_buffer_async_read(bh, valid);
+	kfree(ctx);
+}
+
+static bool need_fsverity(struct buffer_head *bh)
+{
+	struct page *page = bh->b_page;
+	struct inode *inode = page->mapping->host;
+
+	return fsverity_active(inode) &&
+		/* needed by ext4 */
+		page->index < DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
+}
+
 static void decrypt_bh(struct work_struct *work)
 {
-	struct decrypt_bh_ctx *ctx =
-		container_of(work, struct decrypt_bh_ctx, work);
+	struct postprocess_bh_ctx *ctx =
+		container_of(work, struct postprocess_bh_ctx, work);
 	struct buffer_head *bh = ctx->bh;
 	int err;
 
 	err = fscrypt_decrypt_pagecache_blocks(bh->b_page, bh->b_size,
 					       bh_offset(bh));
+	if (err == 0 && need_fsverity(bh)) {
+		/*
+		 * We use different work queues for decryption and for verity
+		 * because verity may require reading metadata pages that need
+		 * decryption, and we shouldn't recurse to the same workqueue.
+		 */
+		INIT_WORK(&ctx->work, verify_bh);
+		fsverity_enqueue_verify_work(&ctx->work);
+		return;
+	}
 	end_buffer_async_read(bh, err == 0);
 	kfree(ctx);
 }
@@ -319,15 +352,24 @@  static void decrypt_bh(struct work_struct *work)
  */
 static void end_buffer_async_read_io(struct buffer_head *bh, int uptodate)
 {
-	/* Decrypt if needed */
-	if (uptodate &&
-	    fscrypt_inode_uses_fs_layer_crypto(bh->b_page->mapping->host)) {
-		struct decrypt_bh_ctx *ctx = kmalloc(sizeof(*ctx), GFP_ATOMIC);
+	struct inode *inode = bh->b_page->mapping->host;
+	bool decrypt = fscrypt_inode_uses_fs_layer_crypto(inode);
+	bool verify = need_fsverity(bh);
+
+	/* Decrypt (with fscrypt) and/or verify (with fsverity) if needed. */
+	if (uptodate && (decrypt || verify)) {
+		struct postprocess_bh_ctx *ctx =
+			kmalloc(sizeof(*ctx), GFP_ATOMIC);
 
 		if (ctx) {
-			INIT_WORK(&ctx->work, decrypt_bh);
 			ctx->bh = bh;
-			fscrypt_enqueue_decrypt_work(&ctx->work);
+			if (decrypt) {
+				INIT_WORK(&ctx->work, decrypt_bh);
+				fscrypt_enqueue_decrypt_work(&ctx->work);
+			} else {
+				INIT_WORK(&ctx->work, verify_bh);
+				fsverity_enqueue_verify_work(&ctx->work);
+			}
 			return;
 		}
 		uptodate = 0;
@@ -2245,6 +2287,11 @@  int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 	int nr, i;
 	int fully_mapped = 1;
 	bool page_error = false;
+	loff_t limit = i_size_read(inode);
+
+	/* This is needed for ext4. */
+	if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
+		limit = inode->i_sb->s_maxbytes;
 
 	VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
 
@@ -2253,7 +2300,7 @@  int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 	bbits = block_size_bits(blocksize);
 
 	iblock = (sector_t)folio->index << (PAGE_SHIFT - bbits);
-	lblock = (i_size_read(inode)+blocksize-1) >> bbits;
+	lblock = (limit+blocksize-1) >> bbits;
 	bh = head;
 	nr = 0;
 	i = 0;
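
The effect of the new read limit and of the page-index check in need_fsverity() can be illustrated with a user-space model. This is a sketch under stated assumptions: the function names are hypothetical, unsigned 64-bit integers stand in for the kernel's loff_t and sector_t, and values are assumed small enough that the round-up additions do not overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round-up division, like the kernel's DIV_ROUND_UP for these values. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Upper bound for reads: i_size normally, s_maxbytes for verity files. */
uint64_t read_limit(uint64_t i_size, uint64_t s_maxbytes, bool is_verity)
{
	return is_verity ? s_maxbytes : i_size;
}

/* First block index at or beyond the limit ("lblock" in the patch). */
uint64_t last_block(uint64_t limit, unsigned int block_bits)
{
	uint64_t blocksize = (uint64_t)1 << block_bits;

	return (limit + blocksize - 1) >> block_bits;
}

/* Pages holding file data are verified; pages wholly past i_size are not. */
bool page_needs_verity(bool verity_active, uint64_t page_index,
		       uint64_t i_size, uint64_t page_size)
{
	return verity_active && page_index < div_round_up(i_size, page_size);
}
```

For a 5000-byte file with 4K blocks and 4K pages, last_block() gives 2 without verity, so blocks past the end of the data are not eligible for reading. With the limit raised to s_maxbytes, later blocks (where ext4 keeps the verity metadata) become readable, while page_needs_verity() still returns false for page index 2 and beyond, so the metadata pages themselves are not verified as file data.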