From patchwork Fri Nov 30 16:47:40 2012
Subject: Re: [PATCH v4 08/14] KVM: ARM: World-switch implementation
From: Christoffer Dall
To: Will Deacon
Cc: Rusty Russell, kvm@vger.kernel.org, Marc Zyngier, Marcelo Tosatti,
    Antonios Motakis, kvmarm@lists.cs.columbia.edu,
    linux-arm-kernel@lists.infradead.org
Date: Fri, 30 Nov 2012 11:47:40 -0500
In-Reply-To: <20121130151500.GC26289@mudshark.cambridge.arm.com>

On Fri, Nov 30, 2012 at 10:15 AM, Will Deacon wrote:
> On Fri, Nov 30, 2012 at 06:37:04AM +0000, Christoffer Dall wrote:
>> On Mon, Nov 19, 2012 at 9:57 AM, Will Deacon wrote:
>> >
>> > I must be missing something here: how do you ensure that a guest running
>> > on multiple CPUs continues to have the same VMID across them after a
>> > rollover?
>> >
>>
>> when a roll-over occurs, there's no problem until someone comes along
>> that doesn't have a valid vmid (need_new_vmid_gen will return true).
>
> Well, a roll-over is triggered by somebody not having a valid VMID and us
> failing to allocate a new one from the current generation.
>
>> In this case, to assign a vmid, we need to start a new generation of
>> ids, and must ensure that all old vmids are no longer used. So how do
>> we ensure that?
>>
>> Well, we increment the kvm_vmid_gen, causing all other cpus that try to
>> run a VM to hit the spin_lock when they exit their VMs. We reserve
>> vmid 1 for the new cpu, and we call on_each_cpu, which sends an IPI
>> to all other physical cpus and waits until they have actually
>> completed reset_vm_context.
>>
>> At this point, once on_each_cpu(reset_vm_context) returns, all other
>> physical CPUs have cleared their data structures of occurrences of old
>> vmids, and the kvm_vmid_gen has been incremented, so no other vcpus
>> can come and claim new vmids until we unlock the spinlock, and
>> everything starts over.
>>
>> Makes sense?
>
> Yes, but I still don't understand how you ensure VMID consistency across
> different vcpus of the same VM. Imagine the following scenario:
>
> We have two VMs:
>
>   VM0: VCPU0 on physical CPU0, VCPU1 on physical CPU1
>   VM1: VCPU0 on physical CPU2
>
> Also assume that VM0 is happily running and we want to schedule VM1 for
> the first time. Finally, also assume that kvm_next_vmid is zero (that is,
> the next allocation will trigger a roll-over).
>
> Now, we want to schedule VM1:
>
>   kvm_arch_init_vm
>     kvm->arch.vmid_gen = 0;
>   kvm_arch_vcpu_ioctl_run
>     update_vttbr
>       need_new_vmid_gen == true
>       lock(kvm_vmid_lock)
>       kvm_vmid_gen++;
>       kvm_next_vmid = 1;
>       on_each_cpu(reset_vm_context);
>
> At this point, the other two (physical) CPUs exit the guest:
>
>   kvm_guest_exit                         // Received IRQ from cross-call
>   local_irq_enable
>   kvm_call_hyp(__kvm_flush_vm_context);  // Invalidate TLB (this is overkill as should be bcast)
>   cond_resched;
>   update_vttbr
>     need_new_vmid_gen == true
>     /* spin on kvm_vmid_lock */
>
> I think the __kvm_flush_vm_context is overkill -- you should check
> tlb_ops_need_broadcast (which is currently only true for 11MPCore).
> However, continuing with this, the other CPU gets its vmid and releases
> the lock:
>
>   /* receives vmid 1, kvm_next_vmid = 2 */
>   unlock(kvm_vmid_lock)
>   /* Back to the guest */
>
> Now, let's say that CPU0 takes an interrupt (which it goes off to handle)
> and CPU1 grabs the lock:
>
>   lock(kvm_vmid_lock)
>   /* CPU1 receives vmid 2, bumps vmid counter */
>   unlock(kvm_vmid_lock)
>   /* Back to the guest */
>
> At this point, VM1 is running and VM0:VCPU1 is running. VM0:VCPU0 is not
> running because physical CPU0 is handling an interrupt. The problem is that
> when VCPU0 *is* resumed, it will update the VMID of VM0 and could be
> scheduled in parallel with VCPU1 but with a different VMID.
>
> How do you avoid this in the current code?
>

I don't. Nice catch.
Please apply your interesting brain to the following fix :)

From 5ba0481b78b5b5e9321cb2bb05d611b74149461c Mon Sep 17 00:00:00 2001
From: Christoffer Dall
Date: Fri, 30 Nov 2012 11:43:24 -0500
Subject: [PATCH] KVM: ARM: Fix race condition in update_vttbr

Will Deacon pointed out another funny race in the update_vttbr code,
where two VCPUs belonging to the same VM could end up using different
VMIDs during a roll-over of the VMID space.

To preserve the lockless vcpu_run loop in the common case, we re-check
the vmid generation after grabbing the spin lock.

Reported-by: Will Deacon
Signed-off-by: Christoffer Dall
---
 arch/arm/kvm/arm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index c4f631e..dbdee30 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -445,6 +445,16 @@ static void update_vttbr(struct kvm *kvm)
 
 	spin_lock(&kvm_vmid_lock);
 
+	/*
+	 * We need to re-check the vmid_gen here to ensure that if another vcpu
+	 * already allocated a valid vmid for this vm, then this vcpu should
+	 * use the same vmid.
+	 */
+	if (!need_new_vmid_gen(kvm)) {
+		spin_unlock(&kvm_vmid_lock);
+		return;
+	}
+
 	/* First user of a new VMID generation? */
 	if (unlikely(kvm_next_vmid == 0)) {
 		atomic64_inc(&kvm_vmid_gen);
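
For completeness, the patched allocation path amounts to double-checked
locking: the lockless fast path stays in vcpu_run, and the generation is
re-tested once the lock is held. The following is a rough, self-contained
sketch of update_vttbr with the fix applied -- the names
(need_new_vmid_gen, kvm_vmid_lock, kvm_next_vmid, reset_vm_context)
follow the discussion above, but the body is simplified and the VTTBR
register update itself is elided, so treat it as an illustration rather
than the exact contents of arch/arm/kvm/arm.c:

static DEFINE_SPINLOCK(kvm_vmid_lock);
static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
static u8 kvm_next_vmid;	/* VMIDs are 8 bits on ARMv7, wraps to 0 */

/* A vmid is stale iff it was handed out in an older generation. */
static bool need_new_vmid_gen(struct kvm *kvm)
{
	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
}

static void update_vttbr(struct kvm *kvm)
{
	/* Common case: current vmid is still valid -- stay lockless. */
	if (!need_new_vmid_gen(kvm))
		return;

	spin_lock(&kvm_vmid_lock);

	/*
	 * Re-check under the lock: another vcpu of this VM may have
	 * allocated a vmid for the new generation while we were waiting,
	 * in which case we must reuse it instead of taking a fresh one.
	 */
	if (!need_new_vmid_gen(kvm)) {
		spin_unlock(&kvm_vmid_lock);
		return;
	}

	/* First user of a new VMID generation? */
	if (unlikely(kvm_next_vmid == 0)) {
		atomic64_inc(&kvm_vmid_gen);
		kvm_next_vmid = 1;
		/*
		 * Flush stale translations tagged with old vmids on every
		 * physical cpu; on_each_cpu() waits for reset_vm_context
		 * to finish everywhere before new ids are handed out.
		 */
		on_each_cpu(reset_vm_context, NULL, 1);
	}

	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
	kvm->arch.vmid = kvm_next_vmid++;

	/* ... recompute VTTBR from the stage-2 pgd and the new vmid ... */

	spin_unlock(&kvm_vmid_lock);
}

With this shape, the scenario above resolves correctly: when VM0:VCPU0 is
resumed and finally takes the lock, the second need_new_vmid_gen() test
sees that VCPU1 already refreshed VM0's generation and returns early, so
both vcpus run with the same vmid.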