Message ID | 54531E7B.1040006@intel.com (mailing list archive) |
---|---|
State | New, archived |

Hi Tiejun,

On Fri, Oct 31, 2014 at 01:30:35PM +0800, Chen, Tiejun wrote:
>On 2014/10/31 12:33, Wanpeng Li wrote:
>>The srcu read lock must be held while accessing memslots (e.g.
>>when using gfn_to_* functions), however, commit c24ae0dcd3e8
>>("kvm: x86: Unpin and remove kvm_arch->apic_access_page") call
>>gfn_to_page() in kvm_vcpu_reload_apic_access_page() w/o hold it
>>which leads to suspicious rcu_dereference_check() usage warning.
>>This patch fix it by holding srcu read lock when call gfn_to_page()
>>in kvm_vcpu_reload_apic_access_page() function.
>>
>>
>>[ INFO: suspicious RCU usage. ]
>>3.18.0-rc2-test2+ #70 Not tainted
>>-------------------------------
>>include/linux/kvm_host.h:474 suspicious rcu_dereference_check() usage!
>>
>>other info that might help us debug this:
>>
>>rcu_scheduler_active = 1, debug_locks = 0
>>1 lock held by qemu-system-x86/2371:
>> #0: (&vcpu->mutex){+.+...}, at: [<ffffffffa037d800>] vcpu_load+0x20/0xd0 [kvm]
>>
>>stack backtrace:
>>CPU: 4 PID: 2371 Comm: qemu-system-x86 Not tainted 3.18.0-rc2-test2+ #70
>>Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>> 0000000000000001 ffff880209983ca8 ffffffff816f514f 0000000000000000
>> ffff8802099b8990 ffff880209983cd8 ffffffff810bd687 00000000000fee00
>> ffff880208a2c000 ffff880208a10000 ffff88020ef50040 ffff880209983d08
>>Call Trace:
>> [<ffffffff816f514f>] dump_stack+0x4e/0x71
>> [<ffffffff810bd687>] lockdep_rcu_suspicious+0xe7/0x120
>> [<ffffffffa037d055>] gfn_to_memslot+0xd5/0xe0 [kvm]
>> [<ffffffffa03807d3>] __gfn_to_pfn+0x33/0x60 [kvm]
>> [<ffffffffa0380885>] gfn_to_page+0x25/0x90 [kvm]
>> [<ffffffffa038aeec>] kvm_vcpu_reload_apic_access_page+0x3c/0x80 [kvm]
>> [<ffffffffa08f0a9c>] vmx_vcpu_reset+0x20c/0x460 [kvm_intel]
>> [<ffffffffa039ab8e>] kvm_vcpu_reset+0x15e/0x1b0 [kvm]
>> [<ffffffffa039ac0c>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm]
>> [<ffffffffa037f7e0>] kvm_vm_ioctl+0x1d0/0x780 [kvm]
>> [<ffffffff810bc664>] ? __lock_is_held+0x54/0x80
>> [<ffffffff812231f0>] do_vfs_ioctl+0x300/0x520
>> [<ffffffff8122ee45>] ? __fget+0x5/0x250
>> [<ffffffff8122f0fa>] ? __fget_light+0x2a/0xe0
>> [<ffffffff81223491>] SyS_ioctl+0x81/0xa0
>> [<ffffffff816fed6d>] system_call_fastpath+0x16/0x1b
>>
>>Reported-by: Takashi Iwai <tiwai@suse.de>
>>Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>>Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>>---
>> arch/x86/kvm/x86.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>index 0033df3..2d97329 100644
>>--- a/arch/x86/kvm/x86.c
>>+++ b/arch/x86/kvm/x86.c
>>@@ -6059,6 +6059,7 @@ static void kvm_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>> void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
>> {
>> 	struct page *page = NULL;
>>+	int idx;
>>
>> 	if (!irqchip_in_kernel(vcpu->kvm))
>> 		return;
>>@@ -6066,7 +6067,9 @@ void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
>> 	if (!kvm_x86_ops->set_apic_access_page_addr)
>> 		return;
>>
>>+	idx = srcu_read_lock(&vcpu->kvm->srcu);
>
>There's another scenario that we already hold srcu before call
>kvm_vcpu_reload_apic_access_page(),
>
>__vcpu_run()
>	|
>	+ vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>	+ r = vcpu_enter_guest(vcpu);
>		|
>		+ kvm_vcpu_reload_apic_access_page(vcpu);
>

You are right. Great thanks for your pointing out. After recheck all
the callsites of kvm_vcpu_reload_apic_access_page(), just vmx_vcpu_reset()
path need to be fixed.

Regards,
Wanpeng Li

>So according to backtrace I think we should fix as follows:
>
>kvm: x86: vmx: hold kvm->srcu while reload apic access page
>
>kvm_vcpu_reload_apic_access_page() needs to access memslots via
>gfn_to_page(), so its necessary to hold kvm->srcu.
>
>Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>---
> arch/x86/kvm/vmx.c | 3 +++
> 1 file changed, 3 insertions(+)
>
>diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>index b25a588..9fa1f46 100644
>--- a/arch/x86/kvm/vmx.c
>+++ b/arch/x86/kvm/vmx.c
>@@ -4442,6 +4442,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
> {
> 	struct vcpu_vmx *vmx = to_vmx(vcpu);
> 	struct msr_data apic_base_msr;
>+	int idx;
>
> 	vmx->rmode.vm86_active = 0;
>
>@@ -4509,7 +4510,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
> 		vmcs_write32(TPR_THRESHOLD, 0);
> 	}
>
>+	idx = srcu_read_lock(&vcpu->kvm->srcu);
> 	kvm_vcpu_reload_apic_access_page(vcpu);
>+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>
> 	if (vmx_vm_has_apicv(vcpu->kvm))
> 		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
>--
>1.9.1
>
>Thanks
>Tiejun
>> 	page = gfn_to_page(vcpu->kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
>>+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>> 	kvm_x86_ops->set_apic_access_page_addr(vcpu, page_to_phys(page));
>>
>> 	/*
>>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 31/10/2014 06:30, Chen, Tiejun wrote:
>
> @@ -4442,6 +4442,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	struct msr_data apic_base_msr;
> +	int idx;
>
>  	vmx->rmode.vm86_active = 0;
>
> @@ -4509,7 +4510,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>  		vmcs_write32(TPR_THRESHOLD, 0);
>  	}
>
> +	idx = srcu_read_lock(&vcpu->kvm->srcu);
>  	kvm_vcpu_reload_apic_access_page(vcpu);
> +	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>
>  	if (vmx_vm_has_apicv(vcpu->kvm))
>  		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));

Not enough; you can call vcpu_enter_guest -> kvm_apic_accept_events ->
kvm_vcpu_reset -> vmx_vcpu_reset while under the SRCU lock. The right
place to add the lock is kvm_arch_vcpu_setup.

Thanks,

Paolo

Hi Paolo,

On 14/10/31 6:36 PM, Paolo Bonzini wrote:
>
> On 31/10/2014 06:30, Chen, Tiejun wrote:
>> @@ -4442,6 +4442,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>  	struct msr_data apic_base_msr;
>> +	int idx;
>>
>>  	vmx->rmode.vm86_active = 0;
>>
>> @@ -4509,7 +4510,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>>  		vmcs_write32(TPR_THRESHOLD, 0);
>>  	}
>>
>> +	idx = srcu_read_lock(&vcpu->kvm->srcu);
>>  	kvm_vcpu_reload_apic_access_page(vcpu);
>> +	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>>
>>  	if (vmx_vm_has_apicv(vcpu->kvm))
>>  		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
> Not enough; you can call vcpu_enter_guest -> kvm_apic_accept_events ->
> kvm_vcpu_reset -> vmx_vcpu_reset while under the SRCU lock. The right
> place to add the lock is kvm_arch_vcpu_setup.

Ah, ok, I will send a newer version tomorrow. ;-)

Regards,
Wanpeng Li

>
> Thanks,
>
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

Hi Paolo,

On 14/10/31 6:36 PM, Paolo Bonzini wrote:
>
> On 31/10/2014 06:30, Chen, Tiejun wrote:
>> @@ -4442,6 +4442,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>  	struct msr_data apic_base_msr;
>> +	int idx;
>>
>>  	vmx->rmode.vm86_active = 0;
>>
>> @@ -4509,7 +4510,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>>  		vmcs_write32(TPR_THRESHOLD, 0);
>>  	}
>>
>> +	idx = srcu_read_lock(&vcpu->kvm->srcu);
>>  	kvm_vcpu_reload_apic_access_page(vcpu);
>> +	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>>
>>  	if (vmx_vm_has_apicv(vcpu->kvm))
>>  		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
> Not enough; you can call vcpu_enter_guest -> kvm_apic_accept_events ->
> kvm_vcpu_reset -> vmx_vcpu_reset while under the SRCU lock. The right
> place to add the lock is kvm_arch_vcpu_setup.

This is also not enough. I see the warning in the below path during the test:

kvm_arch_vcpu_ioctl_run -> kvm_apic_accept_events -> kvm_vcpu_reset

I just send out the version 3 and hope it can take care all the situations. ;-)

Regards,
Wanpeng Li

>
> Thanks,
>
> Paolo

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b25a588..9fa1f46 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4442,6 +4442,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct msr_data apic_base_msr;
+	int idx;
 
 	vmx->rmode.vm86_active = 0;
 
@@ -4509,7 +4510,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 		vmcs_write32(TPR_THRESHOLD, 0);
 	}
 
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
 	kvm_vcpu_reload_apic_access_page(vcpu);
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
 
 	if (vmx_vm_has_apicv(vcpu->kvm))
 		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
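
The locking question discussed in this thread can be illustrated with a small userspace model. This is only a sketch and not kernel code: every toy_* name is hypothetical, and the reader counter merely stands in for the check that fires when gfn_to_page() looks up the memslots without kvm->srcu held. Under those assumptions it shows why the same callee is quiet on the __vcpu_run() path, where the caller already holds the lock, warns on a path that reaches it with no read-side section open, and stops warning once the outermost caller of that path takes the lock (the role suggested above for kvm_arch_vcpu_setup).

```c
#include <assert.h>
#include <stdio.h>

/* Toy userspace model. All toy_* names are hypothetical, not kernel APIs. */
struct toy_srcu {
    int readers;   /* read-side sections currently open */
    int warnings;  /* lookups made with no read-side section open */
};

/* Open a read-side section; the returned value is handed back to unlock. */
static int toy_srcu_read_lock(struct toy_srcu *s)
{
    return s->readers++;
}

/* Close a read-side section previously opened with toy_srcu_read_lock(). */
static void toy_srcu_read_unlock(struct toy_srcu *s, int idx)
{
    (void)idx;               /* the model does not need the index */
    assert(s->readers > 0);  /* unlock must pair with an earlier lock */
    s->readers--;
}

/* Stand-in for a memslot lookup such as gfn_to_page(). */
static void toy_memslot_lookup(struct toy_srcu *s)
{
    if (s->readers == 0) {
        s->warnings++;
        fprintf(stderr, "suspicious usage: lookup without read lock\n");
    }
}

/* Stand-in for kvm_vcpu_reload_apic_access_page(): takes no lock itself. */
static void toy_reload_apic_page(struct toy_srcu *s)
{
    toy_memslot_lookup(s);
}

/* Run path: the caller holds the lock before reaching the reload. */
static void toy_run_path(struct toy_srcu *s)
{
    int idx = toy_srcu_read_lock(s);

    toy_reload_apic_page(s);
    toy_srcu_read_unlock(s, idx);
}

/* Setup path as in the reported backtrace: nobody holds the lock. */
static void toy_setup_path_unlocked(struct toy_srcu *s)
{
    toy_reload_apic_page(s);
}

/* Setup path with the lock taken by the outermost caller. */
static void toy_setup_path_locked(struct toy_srcu *s)
{
    int idx = toy_srcu_read_lock(s);

    toy_reload_apic_page(s);
    toy_srcu_read_unlock(s, idx);
}
```

In this model toy_run_path() and toy_setup_path_locked() leave the warning count unchanged, while toy_setup_path_unlocked() increments it. Keeping the lock in the outermost caller leaves the shared callee free of locking, so it behaves the same on every path that reaches it.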