From patchwork Mon Feb 22 22:07:06 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Snow <jsnow@redhat.com>
X-Patchwork-Id: 8384231
Return-Path: 
 <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
X-Original-To: patchwork-qemu-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 62F9C9F372
	for <patchwork-qemu-devel@patchwork.kernel.org>;
	Mon, 22 Feb 2016 22:08:26 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id B9CC320392
	for <patchwork-qemu-devel@patchwork.kernel.org>;
	Mon, 22 Feb 2016 22:08:25 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
	(using TLSv1 with cipher AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 06B6120389
	for <patchwork-qemu-devel@patchwork.kernel.org>;
	Mon, 22 Feb 2016 22:08:25 +0000 (UTC)
Received: from localhost ([::1]:52317 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from
	<qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>)
	id 1aXyeW-0002Yj-C5 for patchwork-qemu-devel@patchwork.kernel.org;
	Mon, 22 Feb 2016 17:08:24 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50643)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jsnow@redhat.com>) id 1aXydS-0000iS-4q
	for qemu-devel@nongnu.org; Mon, 22 Feb 2016 17:07:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jsnow@redhat.com>) id 1aXydR-0004Rh-7Y
	for qemu-devel@nongnu.org; Mon, 22 Feb 2016 17:07:18 -0500
Received: from mx1.redhat.com ([209.132.183.28]:58932)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jsnow@redhat.com>)
	id 1aXydM-0004OO-Ph; Mon, 22 Feb 2016 17:07:12 -0500
Received: from int-mx11.intmail.prod.int.phx2.redhat.com
	(int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24])
	by mx1.redhat.com (Postfix) with ESMTPS id 7672A8E3CC;
	Mon, 22 Feb 2016 22:07:12 +0000 (UTC)
Received: from scv.usersys.redhat.com (dhcp-17-171.bos.redhat.com
	[10.18.17.171])
	by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with
	ESMTP id u1MM796n001470; Mon, 22 Feb 2016 17:07:11 -0500
From: John Snow <jsnow@redhat.com>
To: qemu-block@nongnu.org
Date: Mon, 22 Feb 2016 17:07:06 -0500
Message-Id: <1456178827-6419-3-git-send-email-jsnow@redhat.com>
In-Reply-To: <1456178827-6419-1-git-send-email-jsnow@redhat.com>
References: <1456178827-6419-1-git-send-email-jsnow@redhat.com>
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 209.132.183.28
Cc: kwolf@redhat.com, famz@redhat.com, jcody@redhat.com,
	qemu-devel@nongnu.org, stefanha@redhat.com, John Snow <jsnow@redhat.com>
Subject: [Qemu-devel] [PATCH v2 2/3] block/backup: avoid copying less than
	full target clusters
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: 
 qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: 
 qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

During incremental backups, if the target's cluster size is larger than
the backup cluster size and the target cannot (for whatever reason) pull
clusters up from a backing image, we may inadvertently create unusable
incremental backup images.

For example:

If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
of data at a time but the target uses a 128KB cluster size, it is
possible that only half of a target cluster will be recognized as dirty
by the backup block job. When the cluster is allocated on the target
image but only half populated with data, we lose the ability to
distinguish between zero padding and uninitialized data.

This does not happen if the target image has a backing file that points
to the last known good backup.

Even if we have a backing file, though, it's likely to be faster to
buffer the redundant data ourselves from the live image than to fetch
it from the backing file, so let's always round the backup cluster size
up to the target cluster size.

The same logic applies to backup modes top, none, and full. Copying
fractional clusters without the guarantee of COW is dangerous, but even
if we can rely on COW, it's likely better to just re-copy the data.

Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
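Illustration only, not part of the patch: a minimal standalone sketch of
the arithmetic behind the 64KB/128KB example in the commit message. The
helper names sketch_backup_cluster_size() and sketch_cluster_start() are
hypothetical, and the 64KB default is assumed from that example rather
than taken from the QEMU headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default backup granularity (64KB, as in the example above). */
#define SKETCH_BACKUP_CLUSTER_SIZE_DEFAULT (64 * 1024)

/* Hypothetical helper: never copy in units smaller than a target cluster. */
static int64_t sketch_backup_cluster_size(int64_t target_cluster_size)
{
    assert(target_cluster_size >= 0);
    return target_cluster_size > SKETCH_BACKUP_CLUSTER_SIZE_DEFAULT
               ? target_cluster_size
               : SKETCH_BACKUP_CLUSTER_SIZE_DEFAULT;
}

/* Hypothetical helper: start offset of the copy unit containing 'offset'. */
static int64_t sketch_cluster_start(int64_t offset, int64_t cluster_size)
{
    assert(offset >= 0 && cluster_size > 0);
    return offset / cluster_size * cluster_size;
}
```

With a 128KB target cluster, a dirty region at offset 64KB is copied
starting at offset 64KB when the copy unit is 64KB (leaving the first
half of the target cluster unpopulated), but starting at offset 0 when
the copy unit is rounded up to 128KB, so the whole target cluster is
written.
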
 block/backup.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index 76addef..a9a4d5c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -501,6 +501,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockJobTxn *txn, Error **errp)
 {
     int64_t len;
+    BlockDriverInfo bdi = { 0 };
 
     assert(bs);
     assert(target);
@@ -578,7 +579,14 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
     job->sync_mode = sync_mode;
     job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
                        sync_bitmap : NULL;
-    job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
+
+    /* If there is no backing file on the target, we cannot rely on COW if our
+     * backup cluster size is smaller than the target cluster size. Rather than
+     * checking for a backing file, always copy whole target clusters in the
+     * backup loop; we assume this costs about the same as COW would. */
+    bdrv_get_info(job->target, &bdi);
+    job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
+
     job->common.len = len;
     job->common.co = qemu_coroutine_create(backup_run);
     block_job_txn_add_job(txn, &job->common);