From patchwork Tue Jul 3 02:37:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fam Zheng
X-Patchwork-Id: 10502847
From: Fam Zheng
To: qemu-devel@nongnu.org
Date: Tue, 3 Jul 2018 10:37:58 +0800
Message-Id: <20180703023758.14422-4-famz@redhat.com>
In-Reply-To: <20180703023758.14422-1-famz@redhat.com>
References: <20180703023758.14422-1-famz@redhat.com>
Subject: [Qemu-devel] [PATCH v4 3/3] backup: Use copy offloading
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Jeff Cody, Max Reitz, Stefan Hajnoczi

The implementation is similar to that of 'qemu-img convert'. At the
beginning of the job, an offloaded copy is attempted. If it fails,
further I/O goes through the existing bounce buffer code path.

As Kevin pointed out, both this and 'qemu-img convert' could benefit
from a local check: one request may fail because, for example, its
offset is beyond EOF, while another may well be accepted by the
protocol layer. This will be implemented separately.

Reviewed-by: Stefan Hajnoczi
Signed-off-by: Fam Zheng
---
 block/backup.c     | 150 ++++++++++++++++++++++++++++++++-------------
 block/trace-events |   1 +
 2 files changed, 110 insertions(+), 41 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index d18be40caf..81895ddbe2 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -45,6 +45,8 @@ typedef struct BackupBlockJob {
     QLIST_HEAD(, CowRequest) inflight_reqs;

     HBitmap *copy_bitmap;
+    bool use_copy_range;
+    int64_t copy_range_size;
 } BackupBlockJob;

 static const BlockJobDriver backup_job_driver;
@@ -86,19 +88,101 @@ static void cow_request_end(CowRequest *req)
     qemu_co_queue_restart_all(&req->wait_queue);
 }

+/* Copy range to target with a bounce buffer and return the bytes copied. If
+ * error occurred, return a negative error number */
+static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
+                                                      int64_t start,
+                                                      int64_t end,
+                                                      bool is_write_notifier,
+                                                      bool *error_is_read,
+                                                      void **bounce_buffer)
+{
+    int ret;
+    struct iovec iov;
+    QEMUIOVector qiov;
+    BlockBackend *blk = job->common.blk;
+    int nbytes;
+
+    hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
+    nbytes = MIN(job->cluster_size, job->len - start);
+    if (!*bounce_buffer) {
+        *bounce_buffer = blk_blockalign(blk, job->cluster_size);
+    }
+    iov.iov_base = *bounce_buffer;
+    iov.iov_len = nbytes;
+    qemu_iovec_init_external(&qiov, &iov, 1);
+
+    ret = blk_co_preadv(blk, start, qiov.size, &qiov,
+                        is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
+    if (ret < 0) {
+        trace_backup_do_cow_read_fail(job, start, ret);
+        if (error_is_read) {
+            *error_is_read = true;
+        }
+        goto fail;
+    }
+
+    if (qemu_iovec_is_zero(&qiov)) {
+        ret = blk_co_pwrite_zeroes(job->target, start,
+                                   qiov.size, BDRV_REQ_MAY_UNMAP);
+    } else {
+        ret = blk_co_pwritev(job->target, start,
+                             qiov.size, &qiov,
+                             job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
+    }
+    if (ret < 0) {
+        trace_backup_do_cow_write_fail(job, start, ret);
+        if (error_is_read) {
+            *error_is_read = false;
+        }
+        goto fail;
+    }
+
+    return nbytes;
+fail:
+    hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
+    return ret;
+
+}
+
+/* Copy range to target and return the bytes copied. If error occurred, return a
+ * negative error number. */
+static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
+                                                int64_t start,
+                                                int64_t end,
+                                                bool is_write_notifier)
+{
+    int ret;
+    int nr_clusters;
+    BlockBackend *blk = job->common.blk;
+    int nbytes;
+
+    assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
+    nbytes = MIN(job->copy_range_size, end - start);
+    nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
+    hbitmap_reset(job->copy_bitmap, start / job->cluster_size,
+                  nr_clusters);
+    ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
+                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
+    if (ret < 0) {
+        trace_backup_do_cow_copy_range_fail(job, start, ret);
+        hbitmap_set(job->copy_bitmap, start / job->cluster_size,
+                    nr_clusters);
+        return ret;
+    }
+
+    return nbytes;
+}
+
 static int coroutine_fn backup_do_cow(BackupBlockJob *job,
                                       int64_t offset, uint64_t bytes,
                                       bool *error_is_read,
                                       bool is_write_notifier)
 {
-    BlockBackend *blk = job->common.blk;
     CowRequest cow_request;
-    struct iovec iov;
-    QEMUIOVector bounce_qiov;
-    void *bounce_buffer = NULL;
     int ret = 0;
     int64_t start, end; /* bytes */
-    int n; /* bytes */
+    void *bounce_buffer = NULL;

     qemu_co_rwlock_rdlock(&job->flush_rwlock);

@@ -110,60 +194,38 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     wait_for_overlapping_requests(job, start, end);
     cow_request_begin(&cow_request, job, start, end);

-    for (; start < end; start += job->cluster_size) {
+    while (start < end) {
         if (!hbitmap_get(job->copy_bitmap, start / job->cluster_size)) {
             trace_backup_do_cow_skip(job, start);
+            start += job->cluster_size;
             continue; /* already copied */
         }
-        hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);

         trace_backup_do_cow_process(job, start);

-        n = MIN(job->cluster_size, job->len - start);
-
-        if (!bounce_buffer) {
-            bounce_buffer = blk_blockalign(blk, job->cluster_size);
-        }
-        iov.iov_base = bounce_buffer;
-        iov.iov_len = n;
-        qemu_iovec_init_external(&bounce_qiov, &iov, 1);
-
-        ret = blk_co_preadv(blk, start, bounce_qiov.size, &bounce_qiov,
-                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
-        if (ret < 0) {
-            trace_backup_do_cow_read_fail(job, start, ret);
-            if (error_is_read) {
-                *error_is_read = true;
+        if (job->use_copy_range) {
+            ret = backup_cow_with_offload(job, start, end, is_write_notifier);
+            if (ret < 0) {
+                job->use_copy_range = false;
             }
-            hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
-            goto out;
         }
-
-        if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
-            ret = blk_co_pwrite_zeroes(job->target, start,
-                                       bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
-        } else {
-            ret = blk_co_pwritev(job->target, start,
-                                 bounce_qiov.size, &bounce_qiov,
-                                 job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
+        if (!job->use_copy_range) {
+            ret = backup_cow_with_bounce_buffer(job, start, end, is_write_notifier,
+                                                error_is_read, &bounce_buffer);
         }
         if (ret < 0) {
-            trace_backup_do_cow_write_fail(job, start, ret);
-            if (error_is_read) {
-                *error_is_read = false;
-            }
-            hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
-            goto out;
+            break;
         }

         /* Publish progress, guest I/O counts as progress too. Note that the
          * offset field is an opaque progress value, it is not a disk offset.
          */
-        job->bytes_read += n;
-        job_progress_update(&job->common.job, n);
+        start += ret;
+        job->bytes_read += ret;
+        job_progress_update(&job->common.job, ret);
+        ret = 0;
     }

-out:
     if (bounce_buffer) {
         qemu_vfree(bounce_buffer);
     }
@@ -665,6 +727,12 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     } else {
         job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
     }
+    job->use_copy_range = true;
+    job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk),
+                                        blk_get_max_transfer(job->target));
+    job->copy_range_size = MAX(job->cluster_size,
+                               QEMU_ALIGN_UP(job->copy_range_size,
+                                             job->cluster_size));

     /* Required permissions are already taken with target's blk_new() */
     block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
diff --git a/block/trace-events b/block/trace-events
index 2d59b53fd3..c35287b48a 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -42,6 +42,7 @@ backup_do_cow_skip(void *job, int64_t start) "job %p start %"PRId64
 backup_do_cow_process(void *job, int64_t start) "job %p start %"PRId64
 backup_do_cow_read_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
 backup_do_cow_write_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
+backup_do_cow_copy_range_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"

 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
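
Reader's note: the sizing in backup_job_create() and the fallback in
backup_do_cow() can be modelled outside QEMU. The following is only a
minimal standalone sketch under stated assumptions, not code from this
series. Every name in it (struct job, min_non_zero, copy_range_size,
try_offload, bounce_copy, backup_range, the offload_supported flag) is
hypothetical, the bitmap bookkeeping is left out, and the bounce path
clamps to end - start rather than job->len - start for simplicity.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QEMU macros used by the patch. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define ALIGN_UP(n, m) (((n) + (m) - 1) / (m) * (m))

/* Smaller of two limits, where 0 means "no limit" (models MIN_NON_ZERO). */
int64_t min_non_zero(int64_t a, int64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return MIN(a, b);
}

/* Models the sizing step: take the tighter transfer limit, round it up to
 * a whole number of clusters, and never go below one cluster. */
int64_t copy_range_size(int64_t src_max, int64_t dst_max, int64_t cluster)
{
    assert(cluster > 0);
    return MAX(cluster, ALIGN_UP(min_non_zero(src_max, dst_max), cluster));
}

struct job {
    bool use_copy_range;      /* cleared for good after the first failure */
    int64_t copy_range_size;
    int64_t cluster_size;
    bool offload_supported;   /* assumption: simulates the protocol layer */
    int offload_calls;
    int bounce_calls;
};

/* Simulated offloaded copy: bytes copied, or a negative errno value. */
int64_t try_offload(struct job *j, int64_t start, int64_t end)
{
    j->offload_calls++;
    if (!j->offload_supported) {
        return -ENOTSUP;
    }
    return MIN(j->copy_range_size, end - start);
}

/* Simulated bounce-buffer copy: at most one cluster per call. */
int64_t bounce_copy(struct job *j, int64_t start, int64_t end)
{
    j->bounce_calls++;
    return MIN(j->cluster_size, end - start);
}

/* Models the loop: try offload first; on failure disable it for the rest
 * of the job and redo the same chunk through the bounce path. */
int backup_range(struct job *j, int64_t start, int64_t end)
{
    while (start < end) {
        int64_t ret = -EINVAL;

        if (j->use_copy_range) {
            ret = try_offload(j, start, end);
            if (ret < 0) {
                j->use_copy_range = false;
            }
        }
        if (!j->use_copy_range) {
            ret = bounce_copy(j, start, end);
        }
        if (ret < 0) {
            return (int)ret;
        }
        start += ret;   /* advance by however much was actually copied */
    }
    return 0;
}
```

In this model, copy_range_size(1048577, 0, 65536) yields 1114112, that is
17 clusters of 64 KiB. With offload_supported set to false, copying four
clusters makes exactly one offload attempt followed by four bounce
copies, which mirrors the "fail once, then stay on the bounce buffer"
behaviour described in the commit message.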