From patchwork Wed Jan 20 16:25:00 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin Wolf
X-Patchwork-Id: 8072941
From: Kevin Wolf
To: qemu-block@nongnu.org
Date: Wed, 20 Jan 2016 17:25:00 +0100
Message-Id: <1453307106-28330-12-git-send-email-kwolf@redhat.com>
In-Reply-To: <1453307106-28330-1-git-send-email-kwolf@redhat.com>
References: <1453307106-28330-1-git-send-email-kwolf@redhat.com>
Cc: kwolf@redhat.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] [PULL 11/17] block: Inactivate BDS when migration completes

So far, live migration with shared storage meant that the image is in a
not-really-ready don't-touch-me state on the destination while the
source is still actively using it, but after completing the migration,
the image was fully opened on both sides. This is bad.

This patch adds a block driver callback to inactivate images on the
source before completing the migration. Inactivation means that it goes
to a state as if it was just live migrated to the qemu instance on the
source (i.e. BDRV_O_INACTIVE is set). You're then supposed to continue
either on the source or on the destination, which takes ownership of the
image.

A typical migration looks like this now with respect to disk images:

1. Destination qemu is started, the image is opened with
   BDRV_O_INACTIVE. The image is fully opened on the source.

2. Migration is about to complete. The source flushes the image and
   inactivates it. Now both sides have the image opened with
   BDRV_O_INACTIVE and are expecting the other side to still modify it.

3. One side (the destination on success) continues and calls
   bdrv_invalidate_cache_all() in order to take ownership of the image
   again. This removes BDRV_O_INACTIVE on the resuming side; the flag
   remains set on the other side.

This ensures that the same image isn't written to by both instances
(unless both are resumed, but then you get what you deserve). This is
important because .bdrv_close for non-BDRV_O_INACTIVE images could write
to the image file, which is definitely forbidden while another host is
using the image.

Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: John Snow
---
 block.c                   | 34 ++++++++++++++++++++++++++++++++++
 include/block/block.h     |  1 +
 include/block/block_int.h |  1 +
 migration/migration.c     |  7 +++++++
 qmp.c                     | 12 ++++++++++++
 5 files changed, 55 insertions(+)

diff --git a/block.c b/block.c
index 95b2967..5709d3d 100644
--- a/block.c
+++ b/block.c
@@ -3303,6 +3303,40 @@ void bdrv_invalidate_cache_all(Error **errp)
     }
 }
 
+static int bdrv_inactivate(BlockDriverState *bs)
+{
+    int ret;
+
+    if (bs->drv->bdrv_inactivate) {
+        ret = bs->drv->bdrv_inactivate(bs);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    bs->open_flags |= BDRV_O_INACTIVE;
+    return 0;
+}
+
+int bdrv_inactivate_all(void)
+{
+    BlockDriverState *bs;
+    int ret;
+
+    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
+        AioContext *aio_context = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(aio_context);
+        ret = bdrv_inactivate(bs);
+        aio_context_release(aio_context);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
 /**************************************************************/
 /* removable device support */
 
diff --git a/include/block/block.h b/include/block/block.h
index 2b7d33c..25f36dc 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -369,6 +369,7 @@ BlockAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
 /* Invalidate any cached metadata used by image formats */
 void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
+int bdrv_inactivate_all(void);
 
 /* Ensure contents are flushed to disk. */
 int bdrv_flush(BlockDriverState *bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 256609d..428fa33 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -172,6 +172,7 @@ struct BlockDriver {
      * Invalidate any cached meta-data.
      */
     void (*bdrv_invalidate_cache)(BlockDriverState *bs, Error **errp);
+    int (*bdrv_inactivate)(BlockDriverState *bs);
 
     /*
      * Flushes all data that was already written to the OS all the way down to
diff --git a/migration/migration.c b/migration/migration.c
index bc611e4..aaca451 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1422,7 +1422,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
     *old_vm_running = runstate_is_running();
     global_state_store();
     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+    if (ret < 0) {
+        goto fail;
+    }
 
+    ret = bdrv_inactivate_all();
     if (ret < 0) {
         goto fail;
     }
@@ -1542,6 +1546,9 @@ static void migration_completion(MigrationState *s, int current_active_state,
         if (!ret) {
             ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
             if (ret >= 0) {
+                ret = bdrv_inactivate_all();
+            }
+            if (ret >= 0) {
                 qemu_file_set_rate_limit(s->file, INT64_MAX);
                 qemu_savevm_state_complete_precopy(s->file, false);
             }
diff --git a/qmp.c b/qmp.c
index 3ff6db7..53affe2 100644
--- a/qmp.c
+++ b/qmp.c
@@ -192,6 +192,18 @@ void qmp_cont(Error **errp)
         }
     }
 
+    /* Continuing after completed migration. Images have been inactivated to
+     * allow the destination to take control. Need to get control back now. */
+    if (runstate_check(RUN_STATE_FINISH_MIGRATE) ||
+        runstate_check(RUN_STATE_POSTMIGRATE))
+    {
+        bdrv_invalidate_cache_all(&local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+
     if (runstate_check(RUN_STATE_INMIGRATE)) {
         autostart = 1;
     } else {
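
Illustration only, not part of the patch: the three-step ownership handoff described in the commit message can be modelled with a toy flag, roughly as below. Every name here (ToyImage, TOY_O_INACTIVE, the toy_* helpers) is hypothetical and does not exist in QEMU. The flag value is arbitrary, and the real bdrv_inactivate() additionally calls the driver callback, which may fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for BDRV_O_INACTIVE; the value is arbitrary. */
#define TOY_O_INACTIVE 0x0800

/* Toy model of one qemu instance's view of the shared image. */
typedef struct ToyImage {
    int open_flags;
} ToyImage;

/* An instance may write only while it owns the image (flag clear). */
static bool toy_may_write(const ToyImage *img)
{
    return !(img->open_flags & TOY_O_INACTIVE);
}

/* Plays the role of bdrv_inactivate(): give up ownership. */
static int toy_inactivate(ToyImage *img)
{
    /* A real driver callback would run here and could return < 0. */
    img->open_flags |= TOY_O_INACTIVE;
    return 0;
}

/* Plays the role of bdrv_invalidate_cache_all(): take ownership back. */
static void toy_activate(ToyImage *img)
{
    img->open_flags &= ~TOY_O_INACTIVE;
}

/* Number of sides that are currently allowed to write. */
static int toy_writers(const ToyImage *a, const ToyImage *b)
{
    return toy_may_write(a) + toy_may_write(b);
}

/*
 * Walk through steps 1-3 of the commit message. Returns the number of
 * writers at the end, or a negative value if inactivation failed.
 */
static int toy_migrate(ToyImage *src, ToyImage *dst, bool success)
{
    /* Step 1: source fully open, destination opened inactive. */
    src->open_flags = 0;
    dst->open_flags = TOY_O_INACTIVE;
    assert(toy_writers(src, dst) == 1);

    /* Step 2: source inactivates; now neither side may write. */
    if (toy_inactivate(src) < 0) {
        return -1;
    }
    assert(toy_writers(src, dst) == 0);

    /* Step 3: exactly one side resumes and takes ownership. */
    toy_activate(success ? dst : src);
    return toy_writers(src, dst);
}
```

Calling toy_migrate() with success set to true leaves only the destination writable; with false, only the source. At no point in the sequence do both sides hold write ownership, which is the property the patch is after.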