From patchwork Mon Feb 17 10:36:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11386069 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4D5D924 for ; Mon, 17 Feb 2020 10:36:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B5E4420836 for ; Mon, 17 Feb 2020 10:36:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="NJoGVxJg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726797AbgBQKgy (ORCPT ); Mon, 17 Feb 2020 05:36:54 -0500 Received: from mail-ot1-f67.google.com ([209.85.210.67]:35589 "EHLO mail-ot1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725972AbgBQKgx (ORCPT ); Mon, 17 Feb 2020 05:36:53 -0500 Received: by mail-ot1-f67.google.com with SMTP id r16so15637143otd.2; Mon, 17 Feb 2020 02:36:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to:cc; bh=ur/ztzg097hNdiOqNl8MURR6+AytvFPiL8IkE10Yl1Q=; b=NJoGVxJgA8Ke2CHUVj9hmcpI5GSEOxBa9au4sw2v0wW7hnClOJbAG2aCuVfJV7cJep CxKMl62BY6MY+WpZPIcbnvj5We8vw5q1pZNSenF4Xx7QpXnItnQyE5nJY0qC7hmdvvj3 ITLmRwT3U41EBZds1mbQRQ63N7eOFzhAiI/gMJtDUQIH8DZPPe4o/L51/kLfYQrFyJo+ JNoCN4TLAFPc4FT38cG9moYmFtzUvDNSU4bNfZcmz8zS/QPNSg6vge66CPPaL40DIQw9 EDsNxTzzIkBrHMyyrxv+cak+suv4+jRWRvPNG4sVoteghhIN1Kj+pie6zjV4bF0nj59c cl1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to:cc; bh=ur/ztzg097hNdiOqNl8MURR6+AytvFPiL8IkE10Yl1Q=; b=NOYC6RuPWxB5ZpyYR0+pF7pAebqG5MZ3xKKGCX7XmwnoXBlACKL66KjmOW1f0/D95+ P20VZGgpKEY90FleyJI7Niso2ikMa2+OxPM2DvWaAkwkzgZ2/NnFLLbhlTiTAzmBQvqm sDB94XwcNL4uhhsAP0FzxXwVctRp7NqlGqWAUZq6x+tryu8xyHXH2/JT8lBbL+5UZaRE Yki8DA8RjbGrJiLww8znlT04ED8PsdRdkAYkbAF5ngT5Yv00N49FLrVzDFcmQ2xTthBZ vSBl4piy54v4oFPBavor/0e2iYVf+vsIeXvfpz2i4tCRASHBHIEh4Cbyw3kL9rKdRyZn i/bQ== X-Gm-Message-State: APjAAAUpo3MxvczLtDJB3s4yT1Vfm6vgBgsS10lgPA63Ov/ubO0UIPrA agKtpFMeHA1wmhpeRFC7mdNu7XcRakWiAbm3MTxrtujQmS6l5Q== X-Google-Smtp-Source: APXvYqx+pB+sFO42DRkAL0RCvyTGIbk1nCpM3Rl6C4VZ+Q++st91JjnhSvk4mXvlkrf0/U07A7VQF4bh5F4pNz5qn7I= X-Received: by 2002:a9d:63d6:: with SMTP id e22mr11830539otl.185.1581935812830; Mon, 17 Feb 2020 02:36:52 -0800 (PST) MIME-Version: 1.0 From: Wanpeng Li Date: Mon, 17 Feb 2020 18:36:41 +0800 Message-ID: Subject: [PATCH v3 1/2] KVM: X86: Less kvmclock sync induced vmexits after VM boots To: LKML , kvm Cc: Paolo Bonzini , Sean Christopherson , Wanpeng Li , Vitaly Kuznetsov , Jim Mattson , Joerg Roedel Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wanpeng Li In the progress of vCPUs creation, it queues a kvmclock sync worker to the global workqueue before each vCPU creation completes. Each worker will be scheduled after 300 * HZ delay and request a kvmclock update for all vCPUs and kick them out. This is especially worse when scaling to large VMs due to a lot of vmexits. Just one worker as a leader to trigger the kvmclock sync request for all vCPUs is enough. Signed-off-by: Wanpeng Li --- arch/x86/kvm/x86.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) -- 2.7.4 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index fb5d64e..d0ba2d4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9390,8 +9390,9 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) if (!kvmclock_periodic_sync) return; - schedule_delayed_work(&kvm->arch.kvmclock_sync_work, - KVMCLOCK_SYNC_PERIOD); + if (kvm->created_vcpus == 1) + schedule_delayed_work(&kvm->arch.kvmclock_sync_work, + KVMCLOCK_SYNC_PERIOD); } void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) From patchwork Mon Feb 17 10:37:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11386071 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9444F924 for ; Mon, 17 Feb 2020 10:37:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 75AFE20725 for ; Mon, 17 Feb 2020 10:37:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="XltBufQf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726963AbgBQKhz (ORCPT ); Mon, 17 Feb 2020 05:37:55 -0500 Received: from mail-oi1-f194.google.com ([209.85.167.194]:43766 "EHLO mail-oi1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726403AbgBQKhz (ORCPT ); Mon, 17 Feb 2020 05:37:55 -0500 Received: by mail-oi1-f194.google.com with SMTP id p125so16209610oif.10; Mon, 17 Feb 2020 02:37:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to:cc; bh=k8BSp0O0KfBeVprk778t84qb07TP/hfwso06rDN0yGc=; b=XltBufQfP7wpwCaNj3e2sGKjeJdqeCFagWjLEArZeghTwlgQn7jnZ/rCVwGXc+spcI GPND8ggd8u9g3YV/+UTlF28LLPC7SJXtT6k3e54xDfkWHJMcqJkzxpIAY0PeoIVvOjsD 3F42Be7kyt/JCWZ5oF/8cJofMIJwXPgLAfv4oNCixaLgbCjwq3t2HoGyFmtoSq3Ua4vJ wN1BGhcyrqivi+J9AdRuHJNe3fEvUMBqLyK9u1PrgbOPH1z6ozTSZ7VJVkKNby57b8EM 5KrPb5MaPkfT5H336GfRnPEZx+r0nXv8Bd8Jr+kbJ6ZL7uAsTIIDpKpYXAp8xHRzzTsM 1O7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to:cc; bh=k8BSp0O0KfBeVprk778t84qb07TP/hfwso06rDN0yGc=; b=SMOzQqIfub+NHS5fSuiwWrv2J9pb6/DVYDBzt/tdR6BwmyIXMT9IzGueP8/sXujdfM VdLDjCA9qeW6/ZEarb3SDudL+RPEdPAJF56mGfvRssNctkf03RjvoDu7xaGskc6fKx5j BxGsUPZMjqjScG3gPwURqtxI5YUtTVPDW235B+SvsFBn08J5fRqagTlYyoRpOhPPX/oo +1DxpUtCycBn58kEL8VV4VcOsXu2/Fbc7kuWLqGNBdHLri+D50VCJaedwC+MG4mQv3Bs eT+CHn+9yO4JnLRliCvyYtP6rTpI8Eao9u1XODyccNc1PCo3B+YC5R8c7tOz5apiEngg 1oPQ== X-Gm-Message-State: APjAAAXRiisXytfZ6JhSuakkTnBqFjc+/wJlLUqXpxEzz012Xog0K09c 3BT6Cvab1Zsmx8wH4LH3JxYsGYvs912JpC5y0RRGwFj47xEbIQ== X-Google-Smtp-Source: APXvYqxDjBYcTjQcr5nocEiOj7Htq4tDWpJtbNMIsIoV6egKz/W1ZjwiadRqjn1/jSJ5/DdLuvCwUa09BsVcra6xkbo= X-Received: by 2002:aca:8d5:: with SMTP id 204mr9150053oii.141.1581935873942; Mon, 17 Feb 2020 02:37:53 -0800 (PST) MIME-Version: 1.0 From: Wanpeng Li Date: Mon, 17 Feb 2020 18:37:43 +0800 Message-ID: Subject: [PATCH v3 2/2] KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow To: LKML , kvm Cc: Paolo Bonzini , Sean Christopherson , Wanpeng Li , Vitaly Kuznetsov , Jim Mattson , Joerg Roedel Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: wanpeng li For the duration of mapping eVMCS, it derefences ->memslots without holding ->srcu or ->slots_lock when accessing hv assist page. This patch fixes it by moving nested_sync_vmcs12_to_shadow to prepare_guest_switch, where the SRCU is already taken. It can be reproduced by running kvm's evmcs_test selftest. ============================= warning: suspicious rcu usage 5.6.0-rc1+ #53 tainted: g w ioe ----------------------------- ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by evmcs_test/8507: #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680 [kvm] stack backtrace: cpu: 6 pid: 8507 comm: evmcs_test tainted: g w ioe 5.6.0-rc1+ #53 hardware name: dell inc. optiplex 7040/0jctf8, bios 1.4.9 09/12/2016 call trace: dump_stack+0x68/0x9b kvm_read_guest_cached+0x11d/0x150 [kvm] kvm_hv_get_assist_page+0x33/0x40 [kvm] nested_enlightened_vmentry+0x2c/0x60 [kvm_intel] nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel] nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel] vmx_vcpu_run+0x67a/0xe60 [kvm_intel] vcpu_enter_guest+0x35e/0x1bc0 [kvm] kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm] kvm_vcpu_ioctl+0x370/0x680 [kvm] ksys_ioctl+0x235/0x850 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x77/0x780 entry_syscall_64_after_hwframe+0x49/0xbe Signed-off-by: Wanpeng Li --- v1 -> v2: * update Subject * move the check above * add the WARN_ON_ONCE arch/x86/kvm/vmx/vmx.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) -- 2.7.4 diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 3be25ec..9a6797f 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1175,6 +1175,10 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) vmx->guest_msrs[i].mask); } + + if (vmx->nested.need_vmcs12_to_shadow_sync) + nested_sync_vmcs12_to_shadow(vcpu); + if (vmx->guest_state_loaded) return; @@ -6482,8 +6486,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) vmcs_write32(PLE_WINDOW, vmx->ple_window); } - if (vmx->nested.need_vmcs12_to_shadow_sync) - nested_sync_vmcs12_to_shadow(vcpu); + WARN_ON_ONCE(vmx->nested.need_vmcs12_to_shadow_sync); if (kvm_register_is_dirty(vcpu, VCPU_REGS_RSP)) vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);