From patchwork Mon Jul 2 19:02:37 2012
X-Patchwork-Submitter: Christian Brunner
X-Patchwork-Id: 1147691
Date: Mon, 2 Jul 2012 21:02:37 +0200
From: Christian Brunner
To: Gregory Farnum
Cc: Josh Durgin, Vladimir Bashkirtsev, ceph-devel@vger.kernel.org
Subject: Re: Ceph and KVM live migration
Message-ID: <20120702190237.GA4732@sir.fritz.box>
References: <4FEF9CF1.6030008@bashkirtsev.com> <4FEFA53F.4020503@inktank.com>
 <4FEFB2D6.6010603@bashkirtsev.com> <4FEFB5EC.5090202@inktank.com>
 <4FEFC225.5080204@bashkirtsev.com>
List-ID: ceph-devel@vger.kernel.org

On Mon, Jul 02, 2012 at 11:21:40AM -0700, Gregory Farnum wrote:
> On Sat, Jun 30, 2012 at 8:21 PM, Vladimir Bashkirtsev wrote:
> > On 01/07/12 11:59, Josh Durgin wrote:
> >> On 06/30/2012 07:15 PM, Vladimir Bashkirtsev wrote:
> >>> On 01/07/12 10:47, Josh Durgin wrote:
> >>>> On 06/30/2012 05:42 PM, Vladimir Bashkirtsev wrote:
> >>>>> Dear all,
> >>>>>
> >>>>> I am currently testing KVM guests running on Ceph, in particular
> >>>>> the recent cache feature. Performance is of course vastly
> >>>>> improved, but I still see occasional KVM hold-ups - I am not sure
> >>>>> whether Ceph or KVM is to blame, and I will deal with that later.
> >>>>> Right now I have a question I could not answer myself: if I live
> >>>>> migrate a KVM guest while there is uncommitted data in the Ceph
> >>>>> cache, will that cache be committed before the cut-over to the
> >>>>> other host? Reading through the list I got the impression that it
> >>>>> may be left uncommitted and could thus cause data corruption. I
> >>>>> would just like a simple confirmation that code committing the
> >>>>> cache on cut-over to the new host exists, and that no data
> >>>>> corruption due to RBD cache + live migration should happen.
> >>>>>
> >>>>> Regards,
> >>>>> Vladimir
> >>>>
> >>>> QEMU does a flush on all the disks when it stops the guest on the
> >>>> original host, so there will be no uncommitted data in the cache.
> >>>>
> >>>> Josh
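
For reference, the flush Josh describes happens when qemu stops the
guest to complete the migration. A rough sketch of that path,
paraphrased from the qemu source of this era (do_vm_stop() is the real
function; the body here is condensed, not copied verbatim):

    /* When migration converges, the source host stops the guest.
     * Stopping the guest ends in do_vm_stop(), which flushes every
     * open block device before the destination resumes. */
    static void do_vm_stop_sketch(void)
    {
        pause_all_vcpus();  /* vcpus halt; no further guest I/O */
        bdrv_flush_all();   /* every block driver commits cached
                             * writes; for rbd this drains the librbd
                             * writeback cache into the cluster */
    }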
> >>> Thank you for the quick and precise answer. Now that I have
> >>> actually attempted to live migrate a Ceph-based VM, I get:
> >>>
> >>> Unable to migrate guest: Invalid relative path
> >>> 'rbd/mail.logics.net.au:rbd_cache=true': Invalid argument
> >>>
> >>> I guess KVM does not like having :rbd_cache=true (migration works
> >>> without it). I know that it is most likely a KVM problem, but I
> >>> decided to ask here anyway in case you know about it. Any ideas
> >>> how to fix it?
> >>>
> >>> Regards,
> >>> Vladimir
> >>
> >> Is the destination librbd older and not supporting the cache option?
> >>
> >> Migrating with rbd_cache=true and other options specified like that
> >> worked in my testing.
> >>
> >> Josh
> >
> > Both installations are the same:
> > qemu 1.0.17
> > ceph 0.47.3
> > libvirt 0.9.12
> >
> > I googled around and found that the migration goes through if I call
> > it with the --unsafe option. And indeed, it works. Apparently this
> > check was introduced in libvirt 0.9.12. I did a quick downgrade to
> > libvirt 0.9.11 and had no problems migrating.
>
> Have we checked whether the live migration actually does the cache
> flushes when you use the unsafe flag? That worries me a little!
>
> In either case, I created a bug so we can try and make QEMU play nice:
> http://tracker.newdream.net/issues/2685

I took a quick look at the libvirt code and I think this is an issue in
libvirt only. The unsafe flag is not handed over to qemu.

You could try the attached patch (untested).

Christian

From 36314693f8b9be1f3c77621543adf01d7c51cb88 Mon Sep 17 00:00:00 2001
From: Christian Brunner
Date: Tue, 19 Jun 2012 12:23:38 +0200
Subject: [PATCH] libvirt: allow migration for network protocols

Live migration should be possible with most (all?) network protocols,
as qemu does a flush right before the migration.

Signed-off-by: Christian Brunner
---
 src/qemu/qemu_migration.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index aee613e..6392b98 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -848,6 +848,12 @@ qemuMigrationIsSafe(virDomainDefPtr def)
             return false;
         }
 
+        if (disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD ||
+            disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_NBD ||
+            disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG) {
+            continue;
+        }
+
         qemuReportError(VIR_ERR_MIGRATE_UNSAFE, "%s",
                         _("Migration may lead to data corruption if disks"
                           " use cache != none"));
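
For review, here is roughly where the added check sits within the
function. This is a condensed sketch, not the literal libvirt source:
the shared- and cluster-filesystem checks the real code performs for
file-backed disks are omitted.

    static bool
    qemuMigrationIsSafe(virDomainDefPtr def)
    {
        int i;

        for (i = 0; i < def->ndisks; i++) {
            virDomainDiskDefPtr disk = def->disks[i];

            /* Sourceless, shared, read-only or cache=none disks are
             * already considered safe. */
            if (!disk->src || disk->shared || disk->readonly ||
                disk->cachemode == VIR_DOMAIN_DISK_CACHE_DISABLE)
                continue;

            /* Added by this patch: network protocols are flushed by
             * qemu right before the cut-over, so they are safe
             * regardless of the cache mode. */
            if (disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD ||
                disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_NBD ||
                disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG)
                continue;

            qemuReportError(VIR_ERR_MIGRATE_UNSAFE, "%s",
                            _("Migration may lead to data corruption if disks"
                              " use cache != none"));
            return false;
        }

        return true;
    }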