From patchwork Tue Feb 14 20:35:04 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Filipe Manana <fdmanana@kernel.org>
X-Patchwork-Id: 9572837
Return-Path: <linux-btrfs-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	C478160573 for <patchwork-linux-btrfs@patchwork.kernel.org>;
	Tue, 14 Feb 2017 20:35:47 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AD48427C2D
	for <patchwork-linux-btrfs@patchwork.kernel.org>;
	Tue, 14 Feb 2017 20:35:47 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id A22C92840E; Tue, 14 Feb 2017 20:35:47 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI
	autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2218727C2D
	for <patchwork-linux-btrfs@patchwork.kernel.org>;
	Tue, 14 Feb 2017 20:35:45 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755450AbdBNUfh (ORCPT
	<rfc822;patchwork-linux-btrfs@patchwork.kernel.org>);
	Tue, 14 Feb 2017 15:35:37 -0500
Received: from mail.kernel.org ([198.145.29.136]:37008 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755527AbdBNUf1 (ORCPT <rfc822;linux-btrfs@vger.kernel.org>);
	Tue, 14 Feb 2017 15:35:27 -0500
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 54253202F0
	for <linux-btrfs@vger.kernel.org>;
	Tue, 14 Feb 2017 20:35:15 +0000 (UTC)
Received: from debian3.lan (bl12-226-64.dsl.telepac.pt [85.245.226.64])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPSA id C8F4C202EB
	for <linux-btrfs@vger.kernel.org>;
	Tue, 14 Feb 2017 20:35:13 +0000 (UTC)
From: fdmanana@kernel.org
To: linux-btrfs@vger.kernel.org
Subject: [PATCH] Btrfs: incremental send,
	fix unnecessary hole writes for sparse files
Date: Tue, 14 Feb 2017 20:35:04 +0000
Message-Id: <1487104504-13334-1-git-send-email-fdmanana@kernel.org>
X-Mailer: git-send-email 2.7.0.rc3
X-Virus-Scanned: ClamAV using ClamSMTP
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-btrfs.vger.kernel.org>
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Filipe Manana <fdmanana@suse.com>

When using the NO_HOLES feature, during an incremental send we often issue
write operations for holes when we should not, because that range is already
a hole in the destination snapshot. While that does not change the contents
of the file at the receiver, it avoids preservation of file holes, leading
to wasted disk space and extra IO during send/receive.

A couple examples where the holes are not preserved follows.

 $ mkfs.btrfs -O no-holes -f /dev/sdb
 $ mount /dev/sdb /mnt
 $ xfs_io -f -c "pwrite -S 0xaa 0 4K" /mnt/foo
 $ xfs_io -f -c "pwrite -S 0xaa 0 4K" -c "pwrite -S 0xbb 1028K 4K" /mnt/bar
 $ btrfs subvolume snapshot -r /mnt /mnt/snap1

 # Now add one new extent to our first test file, increasing its size and
 # leaving a 1Mb hole between the first extent and this new extent.
 $ xfs_io -c "pwrite -S 0xbb 1028K 4K" /mnt/foo

 # Now overwrite the last extent of our second test file.
 $ xfs_io -c "pwrite -S 0xcc 1028K 4K" /mnt/bar

 $ btrfs subvolume snapshot -r /mnt /mnt/snap2

 $ xfs_io -r -c "fiemap -v" /mnt/snap2/foo
 /mnt/snap2/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          25088..25095         8 0x2000
   1: [8..2055]:       hole              2048
   2: [2056..2063]:    24576..24583         8 0x2001

 $ xfs_io -r -c "fiemap -v" /mnt/snap2/bar
 /mnt/snap2/bar:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          25096..25103         8 0x2000
   1: [8..2055]:       hole              2048
   2: [2056..2063]:    24584..24591         8 0x2001

  $ btrfs send /mnt/snap1 -f /tmp/1.snap
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.snap

  $ umount /mnt
  # It's not relevant to enable no-holes in the new filesystem.
  $ mkfs.btrfs -O no-holes -f /dev/sdc
  $ mount /dev/sdc /mnt
  $ btrfs receive /mnt -f /tmp/1.snap
  $ btrfs receive /mnt -f /tmp/2.snap

  $ xfs_io -r -c "fiemap -v" /mnt/snap2/foo
  /mnt/snap2/foo:
  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
    0: [0..7]:          24576..24583         8 0x2000
    1: [8..2063]:       25624..27679      2056   0x1

  $ xfs_io -r -c "fiemap -v" /mnt/snap2/bar
  /mnt/snap2/bar:
  EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
    0: [0..7]:          24584..24591         8 0x2000
    1: [8..2063]:       27680..29735      2056   0x1

The holes do not exist in the second filesystem and they were replaced
with extents filled with the byte 0x00, making each file take 1032Kb of
space instead of 8Kb.

So fix this by not issuing the write operations consisting of buffers
filled with the byte 0x00 when the destination snapshot already has a
hole for the respective range.

A test case for fstests will follow soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/send.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 712922e..456c890 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5306,6 +5306,81 @@ static int get_last_extent(struct send_ctx *sctx, u64 offset)
 	return ret;
 }
 
+static int range_is_hole_in_parent(struct send_ctx *sctx,
+				   const u64 start,
+				   const u64 end)
+{
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	struct btrfs_root *root = sctx->parent_root;
+	u64 search_start = start;
+	int ret;
+
+	path = alloc_path_for_send();
+	if (!path)
+		return -ENOMEM;
+
+	key.objectid = sctx->cur_ino;
+	key.type = BTRFS_EXTENT_DATA_KEY;
+	key.offset = search_start;
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+	if (ret > 0 && path->slots[0] > 0)
+		path->slots[0]--;
+
+	while (search_start < end) {
+		struct extent_buffer *leaf = path->nodes[0];
+		int slot = path->slots[0];
+		struct btrfs_file_extent_item *fi;
+		u64 extent_end;
+
+		if (slot >= btrfs_header_nritems(leaf)) {
+			ret = btrfs_next_leaf(root, path);
+			if (ret < 0)
+				goto out;
+			else if (ret > 0)
+				break;
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(leaf, &key, slot);
+		if (key.objectid < sctx->cur_ino ||
+		    key.type < BTRFS_EXTENT_DATA_KEY)
+			goto next;
+		if (key.objectid > sctx->cur_ino ||
+		    key.type > BTRFS_EXTENT_DATA_KEY ||
+		    key.offset >= end)
+			break;
+
+		fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
+		if (btrfs_file_extent_type(leaf, fi) ==
+		    BTRFS_FILE_EXTENT_INLINE) {
+			u64 size = btrfs_file_extent_inline_len(leaf, slot, fi);
+
+			extent_end = ALIGN(key.offset + size,
+					   root->fs_info->sectorsize);
+		} else {
+			extent_end = key.offset +
+				btrfs_file_extent_num_bytes(leaf, fi);
+		}
+		if (extent_end <= start)
+			goto next;
+		if (btrfs_file_extent_disk_bytenr(leaf, fi) == 0) {
+			search_start = extent_end;
+			goto next;
+		}
+		ret = 0;
+		goto out;
+next:
+		path->slots[0]++;
+	}
+	ret = 1;
+out:
+	btrfs_free_path(path);
+	return ret;
+}
+
 static int maybe_send_hole(struct send_ctx *sctx, struct btrfs_path *path,
 			   struct btrfs_key *key)
 {
@@ -5350,8 +5425,17 @@ static int maybe_send_hole(struct send_ctx *sctx, struct btrfs_path *path,
 			return ret;
 	}
 
-	if (sctx->cur_inode_last_extent < key->offset)
-		ret = send_hole(sctx, key->offset);
+	if (sctx->cur_inode_last_extent < key->offset) {
+		ret = range_is_hole_in_parent(sctx,
+					      sctx->cur_inode_last_extent,
+					      key->offset);
+		if (ret < 0)
+			return ret;
+		else if (ret == 0)
+			ret = send_hole(sctx, key->offset);
+		else
+			ret = 0;
+	}
 	sctx->cur_inode_last_extent = extent_end;
 	return ret;
 }