From patchwork Sun May 8 08:19:18 2011
X-Patchwork-Submitter: Nadav Har'El
X-Patchwork-Id: 765142
Date: Sun, 8 May 2011 11:19:18 +0300
Message-Id: <201105080819.p488JIgW017952@rice.haifa.ibm.com>
From: "Nadav Har'El"
To: kvm@vger.kernel.org
Cc: gleb@redhat.com, avi@redhat.com
References: <1304842511-nyh@il.ibm.com>
Subject: [PATCH 08/30] nVMX: Fix local_vcpus_link handling
X-Mailing-List: kvm@vger.kernel.org

In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it,
because (at least in theory) the processor might not have written all of their
content back to memory. Since a patch from June 26, 2008, this is done using a
per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.

The problem is that with nested VMX, we no longer have the concept of a vcpu
being loaded on a cpu: a vcpu has multiple VMCSs (one for L1, one for each L2),
and each of those may have been last loaded on a different cpu.

This trivial patch changes the code to keep only L1 VMCSs on vcpus_on_cpu.
This fixes crashes on L1 shutdown caused by incorrectly maintaining the linked
lists. It is not a complete solution, though: it does not flush the inactive
L1 or L2 VMCSs loaded on a CPU which is being shut down. Doing this correctly
will probably require replacing the vcpu linked list with a linked list of
"saved_vmcs" objects (VMCS, cpu and launched), and it is left as a TODO.
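For illustration only, the "saved_vmcs" objects suggested in the TODO above
might look roughly like the sketch below. This is not part of the patch; the
structure layout and the per-cpu list name "vmcss_on_cpu" are assumptions
based only on the fields mentioned (VMCS, cpu and launched):

	/* Hypothetical sketch: one list entry per loaded VMCS (L1 or L2),
	 * instead of one entry per vcpu as in the current code. */
	struct saved_vmcs {
		struct vmcs *vmcs;	/* hardware VMCS region */
		int cpu;		/* CPU this VMCS was last loaded on, or -1 */
		int launched;		/* whether VMLAUNCH was done on this VMCS */
		struct list_head local_link; /* entry in this CPU's list */
	};

	/* Hypothetical per-cpu list of all VMCSs loaded on a CPU; a CPU being
	 * shut down would VMCLEAR every entry here, L1 and L2 alike. */
	static DEFINE_PER_CPU(struct list_head, vmcss_on_cpu);

With such a list, vmclear_local_vcpus() could walk saved_vmcs entries directly
and would not need the is_guest_mode() checks added below.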
Signed-off-by: Nadav Har'El
---
 arch/x86/kvm/vmx.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

--- .before/arch/x86/kvm/vmx.c	2011-05-08 10:43:18.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-05-08 10:43:18.000000000 +0300
@@ -638,7 +638,9 @@ static void __vcpu_clear(void *arg)
 	vmcs_clear(vmx->vmcs);
 	if (per_cpu(current_vmcs, cpu) == vmx->vmcs)
 		per_cpu(current_vmcs, cpu) = NULL;
-	list_del(&vmx->local_vcpus_link);
+	/* TODO: currently, local_vcpus_link is just for L1 VMCSs */
+	if (!is_guest_mode(&vmx->vcpu))
+		list_del(&vmx->local_vcpus_link);
 	vmx->vcpu.cpu = -1;
 	vmx->launched = 0;
 }
@@ -1100,8 +1102,10 @@ static void vmx_vcpu_load(struct kvm_vcp
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);

 		local_irq_disable();
-		list_add(&vmx->local_vcpus_link,
-			 &per_cpu(vcpus_on_cpu, cpu));
+		/* TODO: currently, local_vcpus_link is just for L1 VMCSs */
+		if (!is_guest_mode(&vmx->vcpu))
+			list_add(&vmx->local_vcpus_link,
+				 &per_cpu(vcpus_on_cpu, cpu));
 		local_irq_enable();

 		/*
@@ -1806,7 +1810,9 @@ static void vmclear_local_vcpus(void)
 	int cpu = raw_smp_processor_id();
 	struct vcpu_vmx *vmx, *n;

 	list_for_each_entry_safe(vmx, n, &per_cpu(vcpus_on_cpu, cpu),
 				 local_vcpus_link)
-		__vcpu_clear(vmx);
+		/* TODO: currently, local_vcpus_link is just for L1 VMCSs */
+		if (!is_guest_mode(&vmx->vcpu))
+			__vcpu_clear(vmx);
 }
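For context (not part of the patch): the vcpus_on_cpu list that these hunks
guard is consumed when VMX is being turned off on a CPU. A simplified sketch,
based on the vmx.c of that era (exact details may differ), looks like this:

	/* Simplified sketch: when a CPU is brought down, every VMCS still
	 * linked on its vcpus_on_cpu list is VMCLEARed before VMXOFF. An
	 * incorrectly maintained entry here is what caused the L1 shutdown
	 * crashes described in the commit message. */
	static void hardware_disable(void *garbage)
	{
		vmclear_local_vcpus();	/* VMCLEAR everything on this CPU's list */
		kvm_cpu_vmxoff();	/* then leave VMX operation */
	}

Until the "saved_vmcs" rework mentioned in the TODO is done, L2 VMCSs are
simply not on this list, so they are not flushed here.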