From patchwork Mon Nov 14 12:36:30 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Tosatti X-Patchwork-Id: 9427459 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D5B886047D for ; Mon, 14 Nov 2016 12:49:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D669B286E8 for ; Mon, 14 Nov 2016 12:49:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CB23D2892B; Mon, 14 Nov 2016 12:49:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 56FE1286E8 for ; Mon, 14 Nov 2016 12:49:48 +0000 (UTC) Received: from localhost ([::1]:39608 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c6Ghn-0002qT-GE for patchwork-qemu-devel@patchwork.kernel.org; Mon, 14 Nov 2016 07:49:47 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46205) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c6GWP-0001ec-Jy for qemu-devel@nongnu.org; Mon, 14 Nov 2016 07:38:07 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c6GWK-00050E-Q3 for qemu-devel@nongnu.org; Mon, 14 Nov 2016 07:38:01 -0500 Received: from mx1.redhat.com ([209.132.183.28]:53264) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1c6GWK-0004yq-Ix for qemu-devel@nongnu.org; Mon, 14 Nov 2016 07:37:56 -0500 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3B4654E028; Mon, 14 Nov 2016 12:37:55 +0000 (UTC) Received: from amt.cnet (vpn1-4-30.gru2.redhat.com [10.97.4.30]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uAECbsQY014622; Mon, 14 Nov 2016 07:37:54 -0500 Received: from amt.cnet (localhost [127.0.0.1]) by amt.cnet (Postfix) with ESMTP id E0A5E1008DC; Mon, 14 Nov 2016 10:37:18 -0200 (BRST) Received: (from marcelo@localhost) by amt.cnet (8.14.7/8.14.7/Submit) id uAECbI0q016507; Mon, 14 Nov 2016 10:37:18 -0200 Message-Id: <20161114123700.158592605@redhat.com> User-Agent: quilt/0.60-1 Date: Mon, 14 Nov 2016 10:36:30 -0200 From: Marcelo Tosatti To: kvm@vger.kernel.org References: <20161114123628.703911091@redhat.com> Content-Disposition: inline; filename=kvmclock-advance-clock.patch X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 14 Nov 2016 12:37:55 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [qemu patch 2/2] kvmclock: reduce kvmclock difference on migration X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Marcelo Tosatti , Eduardo Habkost , Juan Quintela , Radim Krcmar , qemu-devel@nongnu.org, "Dr. David Alan Gilbert" , Paolo Bonzini Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which indicates that KVM_GET_CLOCK returns a value as seen by the guest at that moment. For new machine types, use this value rather than reading from guest memory. This reduces kvmclock difference on migration from 5s to 0.1s (when max_downtime == 5s). Signed-off-by: Marcelo Tosatti Acked-by: Juan Quintela --- hw/i386/kvm/clock.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++-- include/hw/i386/pc.h | 5 ++ target-i386/kvm.c | 7 +++ target-i386/kvm_i386.h | 1 4 files changed, 98 insertions(+), 3 deletions(-) Index: qemu-mig-advance-clock/hw/i386/kvm/clock.c =================================================================== --- qemu-mig-advance-clock.orig/hw/i386/kvm/clock.c 2016-11-14 09:07:55.519856329 -0200 +++ qemu-mig-advance-clock/hw/i386/kvm/clock.c 2016-11-14 10:19:45.723254737 -0200 @@ -36,6 +36,12 @@ uint64_t clock; bool clock_valid; + + /* whether machine supports reliable KVM_GET_CLOCK */ + bool mach_use_reliable_get_clock; + + /* whether source host supported reliable KVM_GET_CLOCK */ + bool src_use_reliable_get_clock; } KVMClockState; struct pvclock_vcpu_time_info { @@ -91,15 +97,37 @@ if (running) { struct kvm_clock_data data = {}; - uint64_t time_at_migration = kvmclock_current_nsec(s); + uint64_t time_at_migration = 0; - s->clock_valid = false; + /* local (running VM) restore */ + if (s->clock_valid) { + /* + * if host does not support reliable KVM_GET_CLOCK, + * read kvmclock value from memory + */ + if (!kvm_has_adjust_clock_stable()) { + time_at_migration = kvmclock_current_nsec(s); + } + /* migration/savevm/init restore */ + } else { + /* + * use s->clock in case machine uses reliable + * get clock and host where vm was executing + * supported reliable get clock + */ + if (!s->mach_use_reliable_get_clock || + !s->src_use_reliable_get_clock) { + time_at_migration = kvmclock_current_nsec(s); + } + } - /* We can't rely on the migrated clock value, just discard it */ + /* We can't rely on the saved clock value, just discard it */ if (time_at_migration) { s->clock = time_at_migration; } + s->clock_valid = false; + data.clock = s->clock; ret = kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &data); if (ret < 0) { @@ -152,22 +180,76 @@ qemu_add_vm_change_state_handler(kvmclock_vm_state_change, s); } +static bool kvmclock_src_use_reliable_get_clock(void *opaque) +{ + KVMClockState *s = opaque; + + /* + * On machine types that support reliable KVM_GET_CLOCK, + * if host kernel does provide reliable KVM_GET_CLOCK, + * set src_use_reliable_get_clock=true so that destination + * avoids reading kvmclock from memory. + */ + if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) { + s->src_use_reliable_get_clock = true; + } + + return s->src_use_reliable_get_clock; +} + +static const VMStateDescription kvmclock_reliable_get_clock = { + .name = "kvmclock/src_use_reliable_get_clock", + .version_id = 1, + .minimum_version_id = 1, + .needed = kvmclock_src_use_reliable_get_clock, + .fields = (VMStateField[]) { + VMSTATE_BOOL(src_use_reliable_get_clock, KVMClockState), + VMSTATE_END_OF_LIST() + } +}; + +static void kvmclock_pre_save(void *opaque) +{ + KVMClockState *s = opaque; + struct kvm_clock_data data; + int ret; + + ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); + if (ret < 0) { + fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret)); + abort(); + } + s->clock = data.clock; +} + static const VMStateDescription kvmclock_vmsd = { .name = "kvmclock", .version_id = 1, .minimum_version_id = 1, + .pre_save = kvmclock_pre_save, .fields = (VMStateField[]) { VMSTATE_UINT64(clock, KVMClockState), VMSTATE_END_OF_LIST() + }, + .subsections = (const VMStateDescription * []) { + &kvmclock_reliable_get_clock, + NULL } }; +static Property kvmclock_properties[] = { + DEFINE_PROP_BOOL("mach_use_reliable_get_clock", KVMClockState, + mach_use_reliable_get_clock, true), + DEFINE_PROP_END_OF_LIST(), +}; + static void kvmclock_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); dc->realize = kvmclock_realize; dc->vmsd = &kvmclock_vmsd; + dc->props = kvmclock_properties; } static const TypeInfo kvmclock_info = { Index: qemu-mig-advance-clock/include/hw/i386/pc.h =================================================================== --- qemu-mig-advance-clock.orig/include/hw/i386/pc.h 2016-11-14 09:07:55.519856329 -0200 +++ qemu-mig-advance-clock/include/hw/i386/pc.h 2016-11-14 09:11:47.112200123 -0200 @@ -370,6 +370,11 @@ #define PC_COMPAT_2_7 \ HW_COMPAT_2_7 \ {\ + .driver = "kvmclock",\ + .property = "mach_use_reliable_get_clock",\ + .value = "off",\ + },\ + {\ .driver = TYPE_X86_CPU,\ .property = "l3-cache",\ .value = "off",\ Index: qemu-mig-advance-clock/target-i386/kvm.c =================================================================== --- qemu-mig-advance-clock.orig/target-i386/kvm.c 2016-11-14 09:07:55.520856330 -0200 +++ qemu-mig-advance-clock/target-i386/kvm.c 2016-11-14 09:11:47.125200142 -0200 @@ -117,6 +117,13 @@ return kvm_check_extension(kvm_state, KVM_CAP_X86_SMM); } +bool kvm_has_adjust_clock_stable(void) +{ + int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK); + + return (ret == KVM_CLOCK_TSC_STABLE); +} + bool kvm_allows_irq0_override(void) { return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing(); Index: qemu-mig-advance-clock/target-i386/kvm_i386.h =================================================================== --- qemu-mig-advance-clock.orig/target-i386/kvm_i386.h 2016-11-14 09:07:55.520856330 -0200 +++ qemu-mig-advance-clock/target-i386/kvm_i386.h 2016-11-14 09:11:47.125200142 -0200 @@ -17,6 +17,7 @@ bool kvm_allows_irq0_override(void); bool kvm_has_smm(void); +bool kvm_has_adjust_clock_stable(void); void kvm_synchronize_all_tsc(void); void kvm_arch_reset_vcpu(X86CPU *cs); void kvm_arch_do_init_vcpu(X86CPU *cs);