From patchwork Tue Feb 18 01:17:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11387729 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9079E924 for ; Tue, 18 Feb 2020 01:19:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 70A3E20836 for ; Tue, 18 Feb 2020 01:19:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ayxtwJaJ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726293AbgBRBTc (ORCPT ); Mon, 17 Feb 2020 20:19:32 -0500 Received: from mail-pg1-f195.google.com ([209.85.215.195]:44771 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726097AbgBRBTc (ORCPT ); Mon, 17 Feb 2020 20:19:32 -0500 Received: by mail-pg1-f195.google.com with SMTP id g3so9970574pgs.11; Mon, 17 Feb 2020 17:19:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=wdl63vLkZeTCO6cEhD7wb/Lt5+ras5GaQCOoaNINnOw=; b=ayxtwJaJbgP+QbO48qMs3x/NxRg1vDVDmL+gjWbxxaU3tI1Qh0trNJeYSdUeIkwXV4 o4kD+UFWcT1uRG+LyO5jEAYfnr7gicrHdPZKy5hAORerM3yZ3dDeEGG4tkPmKL9YNKCk SjzmUR+5goMb7yJGKwPp3qarq9+Po7GmxbG000fR3pdqpp77U3BiGO422VbQB2zaCRPE vK7YImQV8c47x9QWXmB6GiS8+mQbMwv0qn+9RBfh3i0LZhGUedl/CgzinjFL6jeimtwj OhqHvDPE/+ofCJO6OceQIJeYQj8MgTOgY9vwIJy00Yg7FmzRnIBB6dGAmW9+4IsH3Ml7 4SLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=wdl63vLkZeTCO6cEhD7wb/Lt5+ras5GaQCOoaNINnOw=; b=OoGDXkMYUuF7S+0h8u5MYYB7Ce2vN4hUK1cDkbrpK1eQklWkQ6FQGZLB2dqq1GU6ep 0WpxU7L1gLLQbty50Bnvq5hojJ9KT5p+i1vxlrRyAZq818vVOcu++bquy6AoCpWF90Sl s0wJd1FdLdyJfRiR16qLLDWurmT4QFwOrlBbZy0DjI8Dw/nACpd9VCmMM24TuoZKj/Eh G+FIuyZPiA7M5lQWAu4MzLiuITURs5U19MveksBL4yS0IWn0zr5jYb91wkq8CV4l+JVG H+tYpXcILhibxErLGG0U0hq0rJAzKGQzfqvUql7/gFPa7G/QAOpqmDyPM/YHrSgVwmiS H4HQ== X-Gm-Message-State: APjAAAVoYC+nTW6hOB0kwM/SB5P/T9Z4QWzh29HrygmB3YbexBdDiS6G 95SaI+7JaYtRaXUxHtuOk6Ldhx9YlnUEZQ== X-Google-Smtp-Source: APXvYqx5VtdeS9U5CVkgXTZIeohJKxNEj2BmCVl/+Xi8BV5uynYvo/WGPRsqNJSsSroxYkw0Q5MYrQ== X-Received: by 2002:a63:38b:: with SMTP id 133mr20234538pgd.153.1581988771240; Mon, 17 Feb 2020 17:19:31 -0800 (PST) Received: from kernel.DHCP ([120.244.140.205]) by smtp.googlemail.com with ESMTPSA id g13sm1519511pfo.169.2020.02.17.17.19.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 17 Feb 2020 17:19:30 -0800 (PST) From: Wanpeng Li X-Google-Original-From: Wanpeng Li To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel Subject: [PATCH v4 1/2] KVM: X86: Less kvmclock sync induced vmexits after VM boots Date: Tue, 18 Feb 2020 09:17:09 +0800 Message-Id: <1581988630-19182-1-git-send-email-wanpengli@tencent.com> X-Mailer: git-send-email 2.7.4 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wanpeng Li In the progress of vCPUs creation, it queues a kvmclock sync worker to the global workqueue before each vCPU creation completes. Each worker will be scheduled after 300 * HZ delay and request a kvmclock update for all vCPUs and kick them out. This is especially worse when scaling to large VMs due to a lot of vmexits. Just one worker as a leader to trigger the kvmclock sync request for all vCPUs is enough. Signed-off-by: Wanpeng Li --- v3 -> v4: * check vcpu->vcpu_idx arch/x86/kvm/x86.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index fb5d64e..d0ba2d4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9390,8 +9390,9 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) if (!kvmclock_periodic_sync) return; - schedule_delayed_work(&kvm->arch.kvmclock_sync_work, - KVMCLOCK_SYNC_PERIOD); + if (vcpu->vcpu_idx == 0) + schedule_delayed_work(&kvm->arch.kvmclock_sync_work, + KVMCLOCK_SYNC_PERIOD); } void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) From patchwork Tue Feb 18 01:17:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11387731 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B74BF17EF for ; Tue, 18 Feb 2020 01:19:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 986AE20836 for ; Tue, 18 Feb 2020 01:19:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kxHMSvgY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726339AbgBRBTi (ORCPT ); Mon, 17 Feb 2020 20:19:38 -0500 Received: from mail-pf1-f195.google.com ([209.85.210.195]:35678 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726070AbgBRBTi (ORCPT ); Mon, 17 Feb 2020 20:19:38 -0500 Received: by mail-pf1-f195.google.com with SMTP id y73so9791415pfg.2; Mon, 17 Feb 2020 17:19:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=uAFuTDTSBh5DgAHzrivMQNHJeAZwyWRUw0tzZPnJB70=; b=kxHMSvgYBck1aHGVwUoliWOhkqU+F3fuoDRzPCkuO6pYEx1nTJuOnFuo3PA6okf3ex VwEYP9tTVXJS4aa5sF4XJYMuZGgNAzoOQIaUPKPHLLZiVBVrd92peGzywkEZfA4r/6k3 s4/2tJJLPUb4D4CuEzavW/e2l++Ayc5cQxxU8lSJjGcsLkv7GXlRP4BJppmeRKrTeFwv xgGXb/G00EERo43Evw/CftFLtVmmNmOGlW20rNSUUsF/On3SXJFzOb1dwgBQJGZgYMNA G2tnRnH3FCjaTCMcS15zEV8s0JEvXj5x313Yitovx03QW4Lpx0HZO4zJVLngdDMjVVXQ uuGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=uAFuTDTSBh5DgAHzrivMQNHJeAZwyWRUw0tzZPnJB70=; b=bW2KYEYLk14JsaJBFa6Kbm2qW5nf01fSEvwP1sqJykIxk2VKAr8bsYiSxLqCyn7HWF CRQ+8Uh6MMV7WMlqSHAitloD/hJsWTlgyI3GUGA7LV8GonJFQqqe55lMAK6bRkjHZ8vD g/At/Mt5ww/aOZklD0sR46bo0VcNO8gLFbbfRl257gtexdOrjI/Bp3SmLBX63rd3zXWW apRkwKjbrW91E1x6jXaimRRKpUks+iV+Zjusbi4jns+7D55LeTrFNNexqmCpIVNyld4u Z4vv3IlRYsUF2m5e3PrZ72TiiZ57DKiRLUca8jG6/7d4oPrs1tunp+UWqDOP34xcZ1yf EOGA== X-Gm-Message-State: APjAAAWT0znlVOaJKXI1Vk+CBvZ6EiH6C5S7vuo6V/zoNLizHaOGRyeE Sxr7qz44saG3j3st82JVxlP7S+55H/LFGw== X-Google-Smtp-Source: APXvYqww4cNatnKwY4zaD9Vx/A6THgqmaEck9Z0vbx3g47ErbE9U+rQiWgGNZ3iwQsf5Q5X4Q+aK7Q== X-Received: by 2002:a63:7c16:: with SMTP id x22mr20473660pgc.335.1581988777811; Mon, 17 Feb 2020 17:19:37 -0800 (PST) Received: from kernel.DHCP ([120.244.140.205]) by smtp.googlemail.com with ESMTPSA id g13sm1519511pfo.169.2020.02.17.17.19.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 17 Feb 2020 17:19:37 -0800 (PST) From: Wanpeng Li X-Google-Original-From: Wanpeng Li To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel Subject: [PATCH RESEND v4 2/2] KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow Date: Tue, 18 Feb 2020 09:17:10 +0800 Message-Id: <1581988630-19182-2-git-send-email-wanpengli@tencent.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1581988630-19182-1-git-send-email-wanpengli@tencent.com> References: <1581988630-19182-1-git-send-email-wanpengli@tencent.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: wanpeng li For the duration of mapping eVMCS, it derefences ->memslots without holding ->srcu or ->slots_lock when accessing hv assist page. This patch fixes it by moving nested_sync_vmcs12_to_shadow to prepare_guest_switch, where the SRCU is already taken. It can be reproduced by running kvm's evmcs_test selftest. ============================= warning: suspicious rcu usage 5.6.0-rc1+ #53 tainted: g w ioe ----------------------------- ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by evmcs_test/8507: #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680 [kvm] stack backtrace: cpu: 6 pid: 8507 comm: evmcs_test tainted: g w ioe 5.6.0-rc1+ #53 hardware name: dell inc. optiplex 7040/0jctf8, bios 1.4.9 09/12/2016 call trace: dump_stack+0x68/0x9b kvm_read_guest_cached+0x11d/0x150 [kvm] kvm_hv_get_assist_page+0x33/0x40 [kvm] nested_enlightened_vmentry+0x2c/0x60 [kvm_intel] nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel] nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel] vmx_vcpu_run+0x67a/0xe60 [kvm_intel] vcpu_enter_guest+0x35e/0x1bc0 [kvm] kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm] kvm_vcpu_ioctl+0x370/0x680 [kvm] ksys_ioctl+0x235/0x850 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x77/0x780 entry_syscall_64_after_hwframe+0x49/0xbe Signed-off-by: Wanpeng Li --- v2 -> v3: * update Subject * move the check above * add the WARN_ON_ONCE arch/x86/kvm/vmx/vmx.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 3be25ec..9a6797f 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1175,6 +1175,10 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) vmx->guest_msrs[i].mask); } + + if (vmx->nested.need_vmcs12_to_shadow_sync) + nested_sync_vmcs12_to_shadow(vcpu); + if (vmx->guest_state_loaded) return; @@ -6482,8 +6486,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) vmcs_write32(PLE_WINDOW, vmx->ple_window); } - if (vmx->nested.need_vmcs12_to_shadow_sync) - nested_sync_vmcs12_to_shadow(vcpu); + WARN_ON_ONCE(vmx->nested.need_vmcs12_to_shadow_sync); if (kvm_register_is_dirty(vcpu, VCPU_REGS_RSP)) vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);