From patchwork Wed Mar 9 16:41:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Woodhouse X-Patchwork-Id: 12775286 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33928C43219 for ; Wed, 9 Mar 2022 16:50:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235969AbiCIQv2 (ORCPT ); Wed, 9 Mar 2022 11:51:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236048AbiCIQqY (ORCPT ); Wed, 9 Mar 2022 11:46:24 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7CABB1017D1 for ; Wed, 9 Mar 2022 08:41:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=MIME-Version:Content-Type:References: In-Reply-To:Date:Cc:To:From:Subject:Message-ID:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=yAU7nj+t/e+cuIq6A3PyePlQneRj/+qOxTa586BTfPc=; b=Q2biUWdr6mSeyaqs0XkFb8x0pc GTu5tQWrvuV/VoTBs2hCEv9cDO973AzAz6SSDLJLFm0sQn4fSozy/4Px4hRX6pzBO9rlDtzmjJmsW zm6w8jr4Sn2ue07h398V4sw8dPmyG+1q2sHMdwiZLpCK62sL6jj15v9WSHZ0CS04FtX24IsnAJKAW yNdgblwGSeOUvxGl6Uok1J7tMplUnDGHcINCkTCXJ/VyM1ic+pd59l9yrYzKsdq70HgTTsBkiT4Wf nOycUIulPpL450eZDqWmFfpRXK4Vjq/n7K9JnCzl1Jr10agSziekBii576cYx3fjrcxBDyryicG7K g4GgO+7Q==; Received: from [54.239.6.184] (helo=u3832b3a9db3152.drs11.amazon.com) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1nRzNa-00HIyA-EH; Wed, 09 Mar 2022 16:41:38 +0000 Message-ID: <0709ac62f664c0f3123fcdeabed3b79038cef3b6.camel@infradead.org> Subject: [PATCH v2 1/2] KVM: x86/xen: PV oneshot timer fixes From: David Woodhouse To: Paolo Bonzini , kvm@vger.kernel.org Cc: Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Joao Martins , Boris Ostrovsky , Metin Kaya , Paul Durrant Date: Wed, 09 Mar 2022 17:41:37 +0100 In-Reply-To: <846caa99-2e42-4443-1070-84e49d2f11d2@redhat.com> References: <20220309143835.253911-1-dwmw2@infradead.org> <20220309143835.253911-2-dwmw2@infradead.org> <846caa99-2e42-4443-1070-84e49d2f11d2@redhat.com> User-Agent: Evolution 3.36.5-0ubuntu1 MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by casper.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: David Woodhouse Fix the case where a restored timer is supposed to have triggered not just in the past, but before this kernel even booted. The resulting integer wrap caused the timer to be set a very long time into the future, and thus effectively never trigger. Trigger timers immediately when delta_ns <= 0 to avoid that situation. Also switch to using HRTIMER_MODE_ABS_HARD, following the changes in the local APIC timer in commits 2c0d278f3293f ("KVM: LAPIC: Mark hrtimer to expire in hard interrupt context") and 4d151bf3b89e7 ("KVM: LAPIC: Make lapic timer unpinned"). Since we only support the Xen oneshot timer and not the periodic timer, we also don't need to bother with migrating it from one physical CPU to another when the vCPU moves; it'll get started again soon enough anyway. When the timer fires, set the recorded expiry time to zero so that when userspace queries the state it correctly sees zero in the expires_ns to indicate that the timer isn't active, and avoid duplicate events after live migration / live update. Finally, fix the 'delta' in kvm_xen_hcall_set_timer_op() to explicitly use 'int64_t' instead of 'long' to make the sanity check shift by 50 bits work correctly in the 32-bit build. That last one was Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: David Woodhouse --- On Wed, 2022-03-09 at 16:39 +0100, Paolo Bonzini wrote: > This is still racy. Thanks for the review! Better? arch/x86/kvm/irq.c | 1 - arch/x86/kvm/xen.c | 44 +++++++++++++------------------------------- arch/x86/kvm/xen.h | 1 - 3 files changed, 13 insertions(+), 33 deletions(-) diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index af2d26fc5458..f371f1292ca3 100644 --- a/arch/x86/kvm/irq.c +++ b/arch/x86/kvm/irq.c @@ -156,7 +156,6 @@ void __kvm_migrate_timers(struct kvm_vcpu *vcpu) { __kvm_migrate_apic_timer(vcpu); __kvm_migrate_pit_timer(vcpu); - __kvm_migrate_xen_timer(vcpu); static_call_cond(kvm_x86_migrate_timers)(vcpu); } diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 8c85a71aa8ca..7e7c8a5bff52 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -122,6 +122,8 @@ void kvm_xen_inject_timer_irqs(struct kvm_vcpu *vcpu) e.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL; kvm_xen_set_evtchn(&e, vcpu->kvm); + + vcpu->arch.xen.timer_expires = 0; atomic_set(&vcpu->arch.xen.timer_pending, 0); } } @@ -130,19 +132,9 @@ static enum hrtimer_restart xen_timer_callback(struct hrtimer *timer) { struct kvm_vcpu *vcpu = container_of(timer, struct kvm_vcpu, arch.xen.timer); - struct kvm_xen_evtchn e; - if (atomic_read(&vcpu->arch.xen.timer_pending)) return HRTIMER_NORESTART; - e.vcpu_id = vcpu->vcpu_id; - e.vcpu_idx = vcpu->vcpu_idx; - e.port = vcpu->arch.xen.timer_virq; - e.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL; - - if (kvm_xen_set_evtchn_fast(&e, vcpu->kvm) != -EWOULDBLOCK) - return HRTIMER_NORESTART; - atomic_inc(&vcpu->arch.xen.timer_pending); kvm_make_request(KVM_REQ_UNBLOCK, vcpu); kvm_vcpu_kick(vcpu); @@ -150,29 +142,19 @@ static enum hrtimer_restart xen_timer_callback(struct hrtimer *timer) return HRTIMER_NORESTART; } -void __kvm_migrate_xen_timer(struct kvm_vcpu *vcpu) -{ - struct hrtimer *timer; - - if (!kvm_xen_timer_enabled(vcpu)) - return; - - timer = &vcpu->arch.xen.timer; - if (hrtimer_cancel(timer)) - hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED); -} - -static void kvm_xen_start_timer(struct kvm_vcpu *vcpu, u64 guest_abs, u64 delta_ns) +static void kvm_xen_start_timer(struct kvm_vcpu *vcpu, u64 guest_abs, s64 delta_ns) { - ktime_t ktime_now; - atomic_set(&vcpu->arch.xen.timer_pending, 0); vcpu->arch.xen.timer_expires = guest_abs; - ktime_now = ktime_get(); - hrtimer_start(&vcpu->arch.xen.timer, - ktime_add_ns(ktime_now, delta_ns), - HRTIMER_MODE_ABS_PINNED); + if (delta_ns <= 0) { + xen_timer_callback(&vcpu->arch.xen.timer); + } else { + ktime_t ktime_now = ktime_get(); + hrtimer_start(&vcpu->arch.xen.timer, + ktime_add_ns(ktime_now, delta_ns), + HRTIMER_MODE_ABS_HARD); + } } static void kvm_xen_stop_timer(struct kvm_vcpu *vcpu) @@ -185,7 +167,7 @@ static void kvm_xen_stop_timer(struct kvm_vcpu *vcpu) static void kvm_xen_init_timer(struct kvm_vcpu *vcpu) { hrtimer_init(&vcpu->arch.xen.timer, CLOCK_MONOTONIC, - HRTIMER_MODE_ABS_PINNED); + HRTIMER_MODE_ABS_HARD); vcpu->arch.xen.timer.function = xen_timer_callback; } @@ -1204,7 +1186,7 @@ static bool kvm_xen_hcall_set_timer_op(struct kvm_vcpu *vcpu, uint64_t timeout, if (timeout) { uint64_t guest_now = get_kvmclock_ns(vcpu->kvm); - long delta = timeout - guest_now; + int64_t delta = timeout - guest_now; /* Xen has a 'Linux workaround' in do_set_timer_op() which * checks for negative absolute timeout values (caused by diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h index ad0876a7c301..2bbbc1a3953e 100644 --- a/arch/x86/kvm/xen.h +++ b/arch/x86/kvm/xen.h @@ -75,7 +75,6 @@ static inline int kvm_xen_has_pending_timer(struct kvm_vcpu *vcpu) return 0; } -void __kvm_migrate_xen_timer(struct kvm_vcpu *vcpu); void kvm_xen_inject_timer_irqs(struct kvm_vcpu *vcpu); #else static inline int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data)