From patchwork Thu May 23 06:35:07 2019
From: Paul Mackerras
To: kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org, Cédric Le Goater
Subject: [PATCH 1/4] KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions
Date: Thu, 23 May 2019 16:35:07 +1000
Message-ID: <20190523063507.GC19655@blackberry>
In-Reply-To: <20190523063424.GB19655@blackberry>

Currently, kvmppc_xive_release() and kvmppc_xive_native_release() clear
kvm->arch.mmu_ready and call kick_all_cpus_sync() as a way of ensuring
that no vcpus are executing in the guest.  However, future patches will
change the mutex associated with kvm->arch.mmu_ready to a new mutex that
nests inside the vcpu mutexes, making it difficult to continue to use
this method.

In fact, taking the vcpu mutex for a vcpu excludes execution of that
vcpu, and we already take the vcpu mutex around the call to
kvmppc_xive_[native_]cleanup_vcpu().
Once the cleanup function is done and we release the vcpu mutex, the
vcpu can execute once again, but because we have cleared
vcpu->arch.xive_vcpu, vcpu->arch.irq_type, vcpu->arch.xive_esc_vaddr
and vcpu->arch.xive_esc_raddr, that vcpu will not be going into XIVE
code any more.  Thus, once we have cleaned up all of the vcpus, we are
safe to clean up the rest of the XIVE state, and we don't need to use
kvm->arch.mmu_ready to hold off vcpu execution.

Signed-off-by: Paul Mackerras
---
 arch/powerpc/kvm/book3s_xive.c        | 28 ++++++++++++----------------
 arch/powerpc/kvm/book3s_xive_native.c | 28 ++++++++++++----------------
 2 files changed, 24 insertions(+), 32 deletions(-)
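The synchronization argument above can be condensed into a short sketch
(illustrative only, not the patch itself; xive_release_sketch() is a
made-up name that abstracts the two release functions and elides the
interrupt teardown):

static void xive_release_sketch(struct kvm *kvm)
{
        struct kvm_vcpu *vcpu;
        unsigned int i;

        kvm_for_each_vcpu(i, vcpu, kvm) {
                mutex_lock(&vcpu->mutex);       /* vcpu cannot be in KVM_RUN here */
                kvmppc_xive_cleanup_vcpu(vcpu); /* clears arch.xive_vcpu etc. */
                mutex_unlock(&vcpu->mutex);
                /* vcpu may run again, but can no longer enter XIVE code */
        }
        /* every vcpu is cleaned up: safe to tear down the global state */
        kvm->arch.xive = NULL;
}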
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 4953957..f623451 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -1859,21 +1859,10 @@ static void kvmppc_xive_release(struct kvm_device *dev)
 	struct kvm *kvm = xive->kvm;
 	struct kvm_vcpu *vcpu;
 	int i;
-	int was_ready;
 
 	pr_devel("Releasing xive device\n");
 
-	debugfs_remove(xive->dentry);
-
 	/*
-	 * Clearing mmu_ready temporarily while holding kvm->lock
-	 * is a way of ensuring that no vcpus can enter the guest
-	 * until we drop kvm->lock. Doing kick_all_cpus_sync()
-	 * ensures that any vcpu executing inside the guest has
-	 * exited the guest. Once kick_all_cpus_sync() has finished,
-	 * we know that no vcpu can be executing the XIVE push or
-	 * pull code, or executing a XICS hcall.
-	 *
 	 * Since this is the device release function, we know that
 	 * userspace does not have any open fd referring to the
 	 * device. Therefore there can not be any of the device
@@ -1881,9 +1870,8 @@ static void kvmppc_xive_release(struct kvm_device *dev)
 	 * and similarly, the connect_vcpu and set/clr_mapped
 	 * functions also cannot be being executed.
 	 */
-	was_ready = kvm->arch.mmu_ready;
-	kvm->arch.mmu_ready = 0;
-	kick_all_cpus_sync();
+
+	debugfs_remove(xive->dentry);
 
 	/*
 	 * We should clean up the vCPU interrupt presenters first.
@@ -1892,12 +1880,22 @@ static void kvmppc_xive_release(struct kvm_device *dev)
 		/*
 		 * Take vcpu->mutex to ensure that no one_reg get/set ioctl
 		 * (i.e. kvmppc_xive_[gs]et_icp) can be done concurrently.
+		 * Holding the vcpu->mutex also means that the vcpu cannot
+		 * be executing the KVM_RUN ioctl, and therefore it cannot
+		 * be executing the XIVE push or pull code or accessing
+		 * the XIVE MMIO regions.
 		 */
 		mutex_lock(&vcpu->mutex);
 		kvmppc_xive_cleanup_vcpu(vcpu);
 		mutex_unlock(&vcpu->mutex);
 	}
 
+	/*
+	 * Now that we have cleared vcpu->arch.xive_vcpu, vcpu->arch.irq_type
+	 * and vcpu->arch.xive_esc_[vr]addr on each vcpu, we are safe
+	 * against xive code getting called during vcpu execution or
+	 * set/get one_reg operations.
+	 */
 	kvm->arch.xive = NULL;
 
 	/* Mask and free interrupts */
@@ -1911,8 +1909,6 @@ static void kvmppc_xive_release(struct kvm_device *dev)
 	if (xive->vp_base != XIVE_INVALID_VP)
 		xive_native_free_vp_block(xive->vp_base);
 
-	kvm->arch.mmu_ready = was_ready;
-
 	/*
 	 * A reference of the kvmppc_xive pointer is now kept under
 	 * the xive_devices struct of the machine for reuse. It is
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 6a8e698..da31dd0 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -973,21 +973,10 @@ static void kvmppc_xive_native_release(struct kvm_device *dev)
 	struct kvm *kvm = xive->kvm;
 	struct kvm_vcpu *vcpu;
 	int i;
-	int was_ready;
-
-	debugfs_remove(xive->dentry);
 
 	pr_devel("Releasing xive native device\n");
 
 	/*
-	 * Clearing mmu_ready temporarily while holding kvm->lock
-	 * is a way of ensuring that no vcpus can enter the guest
-	 * until we drop kvm->lock. Doing kick_all_cpus_sync()
-	 * ensures that any vcpu executing inside the guest has
-	 * exited the guest. Once kick_all_cpus_sync() has finished,
-	 * we know that no vcpu can be executing the XIVE push or
-	 * pull code or accessing the XIVE MMIO regions.
-	 *
 	 * Since this is the device release function, we know that
 	 * userspace does not have any open fd or mmap referring to
 	 * the device. Therefore there can not be any of the
@@ -996,9 +985,8 @@ static void kvmppc_xive_native_release(struct kvm_device *dev)
 	 * connect_vcpu and set/clr_mapped functions also cannot
 	 * be being executed.
 	 */
-	was_ready = kvm->arch.mmu_ready;
-	kvm->arch.mmu_ready = 0;
-	kick_all_cpus_sync();
+
+	debugfs_remove(xive->dentry);
 
 	/*
	 * We should clean up the vCPU interrupt presenters first.
@@ -1007,12 +995,22 @@ static void kvmppc_xive_native_release(struct kvm_device *dev)
 		/*
 		 * Take vcpu->mutex to ensure that no one_reg get/set ioctl
 		 * (i.e. kvmppc_xive_native_[gs]et_vp) can be being done.
+		 * Holding the vcpu->mutex also means that the vcpu cannot
+		 * be executing the KVM_RUN ioctl, and therefore it cannot
+		 * be executing the XIVE push or pull code or accessing
+		 * the XIVE MMIO regions.
 		 */
 		mutex_lock(&vcpu->mutex);
 		kvmppc_xive_native_cleanup_vcpu(vcpu);
 		mutex_unlock(&vcpu->mutex);
 	}
 
+	/*
+	 * Now that we have cleared vcpu->arch.xive_vcpu, vcpu->arch.irq_type
+	 * and vcpu->arch.xive_esc_[vr]addr on each vcpu, we are safe
+	 * against xive code getting called during vcpu execution or
+	 * set/get one_reg operations.
+	 */
 	kvm->arch.xive = NULL;
 
 	for (i = 0; i <= xive->max_sbid; i++) {
@@ -1025,8 +1023,6 @@ static void kvmppc_xive_native_release(struct kvm_device *dev)
 	if (xive->vp_base != XIVE_INVALID_VP)
 		xive_native_free_vp_block(xive->vp_base);
 
-	kvm->arch.mmu_ready = was_ready;
-
 	/*
 	 * A reference of the kvmppc_xive pointer is now kept under
 	 * the xive_devices struct of the machine for reuse. It is
From patchwork Thu May 23 06:35:34 2019
From: Paul Mackerras
To: kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org, Cédric Le Goater
Subject: [PATCH 2/4] KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup
Date: Thu, 23 May 2019 16:35:34 +1000
Message-ID: <20190523063534.GD19655@blackberry>
In-Reply-To: <20190523063424.GB19655@blackberry>

Currently the HV KVM code uses kvm->lock in conjunction with a flag,
kvm->arch.mmu_ready, to synchronize MMU setup and hold off vcpu
execution until the MMU-related data structures are ready.  However,
this means that kvm->lock is being taken inside vcpu->mutex, which is
contrary to Documentation/virtual/kvm/locking.txt and results in
lockdep warnings.

To fix this, we add a new mutex, kvm->arch.mmu_setup_lock, which nests
inside the vcpu mutexes, and is taken in the places where kvm->lock
was taken that are related to MMU setup.
Additionally we take the new mutex in the vcpu creation code at the
point where we are creating a new vcore, in order to provide mutual
exclusion with kvmppc_update_lpcr() and ensure that an update to
kvm->arch.lpcr doesn't get missed, which could otherwise lead to a
stale vcore->lpcr value.

Signed-off-by: Paul Mackerras
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 36 ++++++++++++++++++------------------
 arch/powerpc/kvm/book3s_hv.c        | 31 +++++++++++++++++++++++--------
 3 files changed, 42 insertions(+), 26 deletions(-)
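The resulting lock ordering can be summarized in a sketch (simplified
and not part of the patch; setup_mmu_sketch() is a hypothetical name
standing in for kvmhv_setup_mmu() in the diff below):

/*
 * Lock ordering after this patch, outermost first:
 *
 *     kvm->lock  ->  vcpu->mutex  ->  kvm->arch.mmu_setup_lock
 *
 * A vcpu-context path may therefore take mmu_setup_lock while its
 * vcpu->mutex is held without violating the documented ordering.
 */
static int setup_mmu_sketch(struct kvm_vcpu *vcpu)
{
        struct kvm *kvm = vcpu->kvm;
        int r = 0;

        mutex_lock(&kvm->arch.mmu_setup_lock);
        if (!kvm->arch.mmu_ready) {
                /* ... build the HPT or radix structures here ... */
                kvm->arch.mmu_ready = 1;
        }
        mutex_unlock(&kvm->arch.mmu_setup_lock);
        return r;
}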
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 013c76a..26b3ce4 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -325,6 +325,7 @@ struct kvm_arch {
 #endif
 	struct kvmppc_ops *kvm_ops;
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	struct mutex mmu_setup_lock;	/* nests inside vcpu mutexes */
 	u64 l1_ptcr;
 	int max_nested_lpid;
 	struct kvm_nested_guest *nested_guests[KVM_MAX_NESTED_GUESTS];
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index ab3d484..5197131 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -63,7 +63,7 @@ struct kvm_resize_hpt {
 	struct work_struct work;
 	u32 order;
 
-	/* These fields protected by kvm->lock */
+	/* These fields protected by kvm->arch.mmu_setup_lock */
 
 	/* Possible values and their usage:
 	 *  <0  an error occurred during allocation,
@@ -73,7 +73,7 @@ struct kvm_resize_hpt {
 	int error;
 
 	/* Private to the work thread, until error != -EBUSY,
-	 * then protected by kvm->lock.
+	 * then protected by kvm->arch.mmu_setup_lock.
 	 */
 	struct kvm_hpt_info hpt;
 };
@@ -139,7 +139,7 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order)
 	long err = -EBUSY;
 	struct kvm_hpt_info info;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 	if (kvm->arch.mmu_ready) {
 		kvm->arch.mmu_ready = 0;
 		/* order mmu_ready vs. vcpus_running */
@@ -183,7 +183,7 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order)
 		/* Ensure that each vcpu will flush its TLB on next entry. */
 		cpumask_setall(&kvm->arch.need_tlb_flush);
 
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 	return err;
 }
 
@@ -1447,7 +1447,7 @@ static void resize_hpt_pivot(struct kvm_resize_hpt *resize)
 
 static void resize_hpt_release(struct kvm *kvm, struct kvm_resize_hpt *resize)
 {
-	if (WARN_ON(!mutex_is_locked(&kvm->lock)))
+	if (WARN_ON(!mutex_is_locked(&kvm->arch.mmu_setup_lock)))
 		return;
 
 	if (!resize)
@@ -1474,14 +1474,14 @@ static void resize_hpt_prepare_work(struct work_struct *work)
 	if (WARN_ON(resize->error != -EBUSY))
 		return;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 
 	/* Request is still current? */
 	if (kvm->arch.resize_hpt == resize) {
 		/* We may request large allocations here:
-		 * do not sleep with kvm->lock held for a while.
+		 * do not sleep with kvm->arch.mmu_setup_lock held for a while.
 		 */
-		mutex_unlock(&kvm->lock);
+		mutex_unlock(&kvm->arch.mmu_setup_lock);
 
 		resize_hpt_debug(resize, "resize_hpt_prepare_work(): order = %d\n",
 				 resize->order);
@@ -1494,9 +1494,9 @@ static void resize_hpt_prepare_work(struct work_struct *work)
 		if (WARN_ON(err == -EBUSY))
 			err = -EINPROGRESS;
 
-		mutex_lock(&kvm->lock);
+		mutex_lock(&kvm->arch.mmu_setup_lock);
 		/* It is possible that kvm->arch.resize_hpt != resize
-		 * after we grab kvm->lock again.
+		 * after we grab kvm->arch.mmu_setup_lock again.
 		 */
 	}
 
@@ -1505,7 +1505,7 @@ static void resize_hpt_prepare_work(struct work_struct *work)
 	if (kvm->arch.resize_hpt != resize)
 		resize_hpt_release(kvm, resize);
 
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 }
 
 long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm,
@@ -1522,7 +1522,7 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm,
 	if (shift && ((shift < 18) || (shift > 46)))
 		return -EINVAL;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 
 	resize = kvm->arch.resize_hpt;
 
@@ -1565,7 +1565,7 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm,
 	ret = 100; /* estimated time in ms */
 
 out:
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 	return ret;
 }
 
@@ -1588,7 +1588,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
 	if (shift && ((shift < 18) || (shift > 46)))
 		return -EINVAL;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 
 	resize = kvm->arch.resize_hpt;
 
@@ -1625,7 +1625,7 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
 	smp_mb();
 out_no_hpt:
 	resize_hpt_release(kvm, resize);
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 	return ret;
 }
 
@@ -1868,7 +1868,7 @@ static ssize_t kvm_htab_write(struct file *file, const char __user *buf,
 		return -EINVAL;
 
 	/* lock out vcpus from running while we're doing this */
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 	mmu_ready = kvm->arch.mmu_ready;
 	if (mmu_ready) {
 		kvm->arch.mmu_ready = 0;	/* temporarily */
@@ -1876,7 +1876,7 @@ static ssize_t kvm_htab_write(struct file *file, const char __user *buf,
 		smp_mb();
 		if (atomic_read(&kvm->arch.vcpus_running)) {
 			kvm->arch.mmu_ready = 1;
-			mutex_unlock(&kvm->lock);
+			mutex_unlock(&kvm->arch.mmu_setup_lock);
 			return -EBUSY;
 		}
 	}
@@ -1963,7 +1963,7 @@ static ssize_t kvm_htab_write(struct file *file, const char __user *buf,
 	/* Order HPTE updates vs. mmu_ready */
 	smp_wmb();
 	kvm->arch.mmu_ready = mmu_ready;
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 
 	if (err)
 		return err;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d5fc624..b1c0a9b 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2338,11 +2338,17 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
 			pr_devel("KVM: collision on id %u", id);
 			vcore = NULL;
 		} else if (!vcore) {
+			/*
+			 * Take mmu_setup_lock for mutual exclusion
+			 * with kvmppc_update_lpcr().
+			 */
 			err = -ENOMEM;
 			vcore = kvmppc_vcore_create(kvm,
 					id & ~(kvm->arch.smt_mode - 1));
+			mutex_lock(&kvm->arch.mmu_setup_lock);
 			kvm->arch.vcores[core] = vcore;
 			kvm->arch.online_vcores++;
+			mutex_unlock(&kvm->arch.mmu_setup_lock);
 		}
 	}
 	mutex_unlock(&kvm->lock);
@@ -3859,7 +3865,7 @@ static int kvmhv_setup_mmu(struct kvm_vcpu *vcpu)
 	int r = 0;
 	struct kvm *kvm = vcpu->kvm;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 	if (!kvm->arch.mmu_ready) {
 		if (!kvm_is_radix(kvm))
 			r = kvmppc_hv_setup_htab_rma(vcpu);
@@ -3869,7 +3875,7 @@ static int kvmhv_setup_mmu(struct kvm_vcpu *vcpu)
 			kvm->arch.mmu_ready = 1;
 		}
 	}
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 	return r;
 }
 
@@ -4478,7 +4484,8 @@ static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
 
 /*
  * Update LPCR values in kvm->arch and in vcores.
- * Caller must hold kvm->lock.
+ * Caller must hold kvm->arch.mmu_setup_lock (for mutual exclusion
+ * of kvm->arch.lpcr update).
  */
 void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr, unsigned long mask)
 {
@@ -4530,7 +4537,7 @@ void kvmppc_setup_partition_table(struct kvm *kvm)
 
 /*
  * Set up HPT (hashed page table) and RMA (real-mode area).
- * Must be called with kvm->lock held.
+ * Must be called with kvm->arch.mmu_setup_lock held.
  */
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
 {
@@ -4618,7 +4625,10 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
 	goto out_srcu;
 }
 
-/* Must be called with kvm->lock held and mmu_ready = 0 and no vcpus running */
+/*
+ * Must be called with kvm->arch.mmu_setup_lock held and
+ * mmu_ready = 0 and no vcpus running.
+ */
 int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
 {
 	if (nesting_enabled(kvm))
@@ -4635,7 +4645,10 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
 	return 0;
 }
 
-/* Must be called with kvm->lock held and mmu_ready = 0 and no vcpus running */
+/*
+ * Must be called with kvm->arch.mmu_setup_lock held and
+ * mmu_ready = 0 and no vcpus running.
+ */
 int kvmppc_switch_mmu_to_radix(struct kvm *kvm)
 {
 	int err;
@@ -4740,6 +4753,8 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
 	char buf[32];
 	int ret;
 
+	mutex_init(&kvm->arch.mmu_setup_lock);
+
 	/* Allocate the guest's logical partition ID */
 
 	lpid = kvmppc_alloc_lpid();
@@ -5265,7 +5280,7 @@ static int kvmhv_configure_mmu(struct kvm *kvm, struct kvm_ppc_mmuv3_cfg *cfg)
 	if (kvmhv_on_pseries() && !radix)
 		return -EINVAL;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.mmu_setup_lock);
 	if (radix != kvm_is_radix(kvm)) {
 		if (kvm->arch.mmu_ready) {
 			kvm->arch.mmu_ready = 0;
@@ -5293,7 +5308,7 @@ static int kvmhv_configure_mmu(struct kvm *kvm, struct kvm_ppc_mmuv3_cfg *cfg)
 	err = 0;
 
 out_unlock:
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.mmu_setup_lock);
 	return err;
 }

From patchwork Thu May 23 06:36:01 2019
From: Paul Mackerras
To: kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org, Cédric Le Goater
Subject: [PATCH 3/4] KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
Date: Thu, 23 May 2019 16:36:01 +1000
Message-ID: <20190523063601.GE19655@blackberry>
In-Reply-To: <20190523063424.GB19655@blackberry>
Currently the Book 3S KVM code uses kvm->lock to synchronize access
to the kvm->arch.rtas_tokens list.  Because this list is scanned
inside kvmppc_rtas_hcall(), which is called with the vcpu mutex held,
taking kvm->lock causes a lock inversion problem, which could lead to
a deadlock.

To fix this, we add a new mutex, kvm->arch.rtas_token_lock, which
nests inside the vcpu mutexes, and use that instead of kvm->lock when
accessing the rtas token list.

Signed-off-by: Paul Mackerras
Reviewed-by: Cédric Le Goater
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/book3s.c           |  1 +
 arch/powerpc/kvm/book3s_rtas.c      | 14 +++++++-------
 3 files changed, 9 insertions(+), 7 deletions(-)
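The inversion being fixed is easiest to see side by side (schematic
only; rtas_hcall_sketch() is a made-up name standing in for
kvmppc_rtas_hcall() in the diff below):

/*
 * Before the patch:
 *   vcpu ioctl path:                 VM ioctl path:
 *     mutex_lock(&vcpu->mutex);        mutex_lock(&kvm->lock);
 *     kvmppc_rtas_hcall()              ...
 *       mutex_lock(&kvm->lock);        mutex_lock(&vcpu->mutex);  /* A-B/B-A */
 *
 * After the patch the hcall path takes only a leaf mutex:
 */
static int rtas_hcall_sketch(struct kvm_vcpu *vcpu)
{
        struct kvm *kvm = vcpu->kvm;
        int rc = -ENOENT;

        mutex_lock(&kvm->arch.rtas_token_lock); /* nests inside vcpu->mutex */
        /* ... scan kvm->arch.rtas_tokens for a matching token ... */
        mutex_unlock(&kvm->arch.rtas_token_lock);
        return rc;
}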
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 26b3ce4..d10df67 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -309,6 +309,7 @@ struct kvm_arch {
 #ifdef CONFIG_PPC_BOOK3S_64
 	struct list_head spapr_tce_tables;
 	struct list_head rtas_tokens;
+	struct mutex rtas_token_lock;
 	DECLARE_BITMAP(enabled_hcalls, MAX_HCALL_OPCODE/4 + 1);
 #endif
 #ifdef CONFIG_KVM_MPIC
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 61a212d..ac56648 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -902,6 +902,7 @@ int kvmppc_core_init_vm(struct kvm *kvm)
 #ifdef CONFIG_PPC64
 	INIT_LIST_HEAD_RCU(&kvm->arch.spapr_tce_tables);
 	INIT_LIST_HEAD(&kvm->arch.rtas_tokens);
+	mutex_init(&kvm->arch.rtas_token_lock);
 #endif
 
 	return kvm->arch.kvm_ops->init_vm(kvm);
diff --git a/arch/powerpc/kvm/book3s_rtas.c b/arch/powerpc/kvm/book3s_rtas.c
index 4e178c4..47279a5 100644
--- a/arch/powerpc/kvm/book3s_rtas.c
+++ b/arch/powerpc/kvm/book3s_rtas.c
@@ -146,7 +146,7 @@ static int rtas_token_undefine(struct kvm *kvm, char *name)
 {
 	struct rtas_token_definition *d, *tmp;
 
-	lockdep_assert_held(&kvm->lock);
+	lockdep_assert_held(&kvm->arch.rtas_token_lock);
 
 	list_for_each_entry_safe(d, tmp, &kvm->arch.rtas_tokens, list) {
 		if (rtas_name_matches(d->handler->name, name)) {
@@ -167,7 +167,7 @@ static int rtas_token_define(struct kvm *kvm, char *name, u64 token)
 	bool found;
 	int i;
 
-	lockdep_assert_held(&kvm->lock);
+	lockdep_assert_held(&kvm->arch.rtas_token_lock);
 
 	list_for_each_entry(d, &kvm->arch.rtas_tokens, list) {
 		if (d->token == token)
@@ -206,14 +206,14 @@ int kvm_vm_ioctl_rtas_define_token(struct kvm *kvm, void __user *argp)
 	if (copy_from_user(&args, argp, sizeof(args)))
 		return -EFAULT;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.rtas_token_lock);
 
 	if (args.token)
 		rc = rtas_token_define(kvm, args.name, args.token);
 	else
 		rc = rtas_token_undefine(kvm, args.name);
 
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.rtas_token_lock);
 
 	return rc;
 }
@@ -245,7 +245,7 @@ int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu)
 	orig_rets = args.rets;
 	args.rets = &args.args[be32_to_cpu(args.nargs)];
 
-	mutex_lock(&vcpu->kvm->lock);
+	mutex_lock(&vcpu->kvm->arch.rtas_token_lock);
 
 	rc = -ENOENT;
 	list_for_each_entry(d, &vcpu->kvm->arch.rtas_tokens, list) {
@@ -256,7 +256,7 @@ int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	mutex_unlock(&vcpu->kvm->lock);
+	mutex_unlock(&vcpu->kvm->arch.rtas_token_lock);
 
 	if (rc == 0) {
 		args.rets = orig_rets;
@@ -282,7 +282,7 @@ void kvmppc_rtas_tokens_free(struct kvm *kvm)
 {
 	struct rtas_token_definition *d, *tmp;
 
-	lockdep_assert_held(&kvm->lock);
+	lockdep_assert_held(&kvm->arch.rtas_token_lock);
 
 	list_for_each_entry_safe(d, tmp, &kvm->arch.rtas_tokens, list) {
 		list_del(&d->list);

From patchwork Thu May 23 06:36:32 2019
From: Paul Mackerras
To: kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org, Cédric Le Goater
Subject: [PATCH 4/4] KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu
Date: Thu, 23 May 2019 16:36:32 +1000
Message-ID: <20190523063632.GF19655@blackberry>
In-Reply-To: <20190523063424.GB19655@blackberry>
Currently the HV KVM code takes the kvm->lock around calls to
kvm_for_each_vcpu() and kvm_get_vcpu_by_id() (which can call
kvm_for_each_vcpu() internally).  However, that leads to a lock order
inversion problem, because these are called in contexts where the vcpu
mutex is held, but the vcpu mutexes nest within kvm->lock according to
Documentation/virtual/kvm/locking.txt.  Hence there is a possibility
of deadlock.

To fix this, we simply don't take the kvm->lock mutex around these
calls.  This is safe because the implementations of kvm_for_each_vcpu()
and kvm_get_vcpu_by_id() have been designed to be safe to call
locklessly.

Signed-off-by: Paul Mackerras
Reviewed-by: Cédric Le Goater
---
 arch/powerpc/kvm/book3s_hv.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b1c0a9b..27054d3 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -446,12 +446,7 @@ static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
 
 static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
 {
-	struct kvm_vcpu *ret;
-
-	mutex_lock(&kvm->lock);
-	ret = kvm_get_vcpu_by_id(kvm, id);
-	mutex_unlock(&kvm->lock);
-	return ret;
+	return kvm_get_vcpu_by_id(kvm, id);
 }
 
 static void init_vpa(struct kvm_vcpu *vcpu, struct lppaca *vpa)
@@ -1583,7 +1578,6 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr,
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	u64 mask;
 
-	mutex_lock(&kvm->lock);
 	spin_lock(&vc->lock);
 	/*
 	 * If ILE (interrupt little-endian) has changed, update the
@@ -1623,7 +1617,6 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr,
 	mask &= 0xFFFFFFFF;
 	vc->lpcr = (vc->lpcr & ~mask) | (new_lpcr & mask);
 	spin_unlock(&vc->lock);
-	mutex_unlock(&kvm->lock);
 }
 
 static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
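For reference, the lockless lookup that kvmppc_find_vcpu() now calls
directly behaves roughly like the following sketch (simplified and not
from the patch; the real kvm_get_vcpu_by_id() also tries the id as an
array index first before falling back to the scan):

static struct kvm_vcpu *find_vcpu_sketch(struct kvm *kvm, int id)
{
        struct kvm_vcpu *vcpu;
        unsigned int i;

        /*
         * Safe without kvm->lock: vcpus are published to the vcpu
         * array with the proper barriers and are never removed until
         * the VM itself is destroyed.
         */
        kvm_for_each_vcpu(i, vcpu, kvm)
                if (vcpu->vcpu_id == id)
                        return vcpu;
        return NULL;
}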