diff mbox series

Btrfs: send, fix incorrect file layout after hole punching beyond eof

Message ID 20180730113958.5178-1-fdmanana@kernel.org (mailing list archive)
State New, archived
Headers show
Series Btrfs: send, fix incorrect file layout after hole punching beyond eof | expand

Commit Message

Filipe Manana July 30, 2018, 11:39 a.m. UTC
From: Filipe Manana <fdmanana@suse.com>

When doing an incremental send, if we have a file in the parent snapshot
that has prealloc extents beyond EOF and in the send snapshot it got a
hole punch that partially covers the prealloc extents, the send stream,
when replayed by a receiver, can result in a file that has a size bigger
than it should and filled with zeroes past the correct EOF.

For example:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ xfs_io -f -c "falloc -k 0 4M" /mnt/foobar
  $ xfs_io -c "pwrite -S 0xea 0 1M" /mnt/foobar

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1
  $ btrfs send -f /tmp/1.send /mnt/snap1

  $ xfs_io -c "fpunch 1M 2M" /mnt/foobar

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2
  $ btrfs send -f /tmp/2.send -p /mnt/snap1 /mnt/snap2

  $ stat --format %s /mnt/snap2/foobar
  1048576
  $ md5sum /mnt/snap2/foobar
  d31659e82e87798acd4669a1e0a19d4f  /mnt/snap2/foobar

  $ umount /mnt
  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt

  $ btrfs receive -f /mnt/1.snap /mnt
  $ btrfs receive -f /mnt/2.snap /mnt

  $ stat --format %s /mnt/snap2/foobar
  3145728
  # --> should be 1Mb and not 3Mb (which was the end offset of hole
  #     punch operation)
  $ md5sum /mnt/snap2/foobar
  117baf295297c2a995f92da725b0b651  /mnt/snap2/foobar
  # --> should be d31659e82e87798acd4669a1e0a19d4f as in the original fs

This issue actually happens only since commit ffa7c4296e93 ("Btrfs: send,
do not issue unnecessary truncate operations"), but before that commit we
were issuing a write operation full of zeroes (to "punch" a hole) which
was extending the file size beyond the correct value and then immediately
issue a truncate operation to the correct size and undoing the previous
write operation. Since the send protocol does not support fallocate, for
extent preallocation and hole punching, fix this by not even attempting
to send a "hole" (regular write full of zeroes) if it starts at an offset
greater then or equals to the file's size. This approach, besides being
much more simple then making send issue the truncate operation, adds the
benefit of avoiding the useless pair of write of zeroes and truncate
operations, saving time and IO at the receiver and reducing the size of
the send stream.

A test case for fstests follows soon.

CC: stable@vger.kernel.org # 4.17+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/send.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

David Sterba Aug. 2, 2018, 3:20 p.m. UTC | #1
On Mon, Jul 30, 2018 at 12:39:58PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> When doing an incremental send, if we have a file in the parent snapshot
> that has prealloc extents beyond EOF and in the send snapshot it got a
> hole punch that partially covers the prealloc extents, the send stream,
> when replayed by a receiver, can result in a file that has a size bigger
> than it should and filled with zeroes past the correct EOF.
> 
> For example:
> 
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
> 
>   $ xfs_io -f -c "falloc -k 0 4M" /mnt/foobar
>   $ xfs_io -c "pwrite -S 0xea 0 1M" /mnt/foobar
> 
>   $ btrfs subvolume snapshot -r /mnt /mnt/snap1
>   $ btrfs send -f /tmp/1.send /mnt/snap1
> 
>   $ xfs_io -c "fpunch 1M 2M" /mnt/foobar
> 
>   $ btrfs subvolume snapshot -r /mnt /mnt/snap2
>   $ btrfs send -f /tmp/2.send -p /mnt/snap1 /mnt/snap2
> 
>   $ stat --format %s /mnt/snap2/foobar
>   1048576
>   $ md5sum /mnt/snap2/foobar
>   d31659e82e87798acd4669a1e0a19d4f  /mnt/snap2/foobar
> 
>   $ umount /mnt
>   $ mkfs.btrfs -f /dev/sdc
>   $ mount /dev/sdc /mnt
> 
>   $ btrfs receive -f /mnt/1.snap /mnt
>   $ btrfs receive -f /mnt/2.snap /mnt
> 
>   $ stat --format %s /mnt/snap2/foobar
>   3145728
>   # --> should be 1Mb and not 3Mb (which was the end offset of hole
>   #     punch operation)
>   $ md5sum /mnt/snap2/foobar
>   117baf295297c2a995f92da725b0b651  /mnt/snap2/foobar
>   # --> should be d31659e82e87798acd4669a1e0a19d4f as in the original fs
> 
> This issue actually happens only since commit ffa7c4296e93 ("Btrfs: send,
> do not issue unnecessary truncate operations"), but before that commit we
> were issuing a write operation full of zeroes (to "punch" a hole) which
> was extending the file size beyond the correct value and then immediately
> issue a truncate operation to the correct size and undoing the previous
> write operation. Since the send protocol does not support fallocate, for
> extent preallocation and hole punching, fix this by not even attempting
> to send a "hole" (regular write full of zeroes) if it starts at an offset
> greater then or equals to the file's size. This approach, besides being
> much more simple then making send issue the truncate operation, adds the
> benefit of avoiding the useless pair of write of zeroes and truncate
> operations, saving time and IO at the receiver and reducing the size of
> the send stream.
> 
> A test case for fstests follows soon.
> 
> CC: stable@vger.kernel.org # 4.17+
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Added to misc-next, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox series

Patch

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 0f31760f875f..87e89f7ff335 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5007,6 +5007,15 @@  static int send_hole(struct send_ctx *sctx, u64 end)
 	u64 len;
 	int ret = 0;
 
+	/*
+	 * A hole that starts at EOF or beyond it. Since we do not yet support
+	 * fallocate (for extent preallocation and hole punching), sending a
+	 * write of zeroes starting at EOF or beyond would later require issuing
+	 * a truncate operation which would undo the write and achieve nothing.
+	 */
+	if (offset >= sctx->cur_inode_size)
+		return 0;
+
 	if (sctx->flags & BTRFS_SEND_FLAG_NO_FILE_DATA)
 		return send_update_extent(sctx, offset, end - offset);