Message ID | 20170801210040.10295-1-dmatlack@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 01/08/2017 23:00, David Matlack wrote: > According to the Intel SDM, software cannot rely on the current VMCS to be > coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12 > flushes. > > 24.11.1 Software Use of Virtual-Machine Control Structures > ... > If a logical processor leaves VMX operation, any VMCSs active on > that logical processor may be corrupted (see below). To prevent > such corruption of a VMCS that may be used either after a return > to VMX operation or on another logical processor, software should > execute VMCLEAR for that VMCS before executing the VMXOFF instruction > or removing power from the processor (e.g., as part of a transition > to the S3 and S4 power states). > ... > > This fixes a "suspicious rcu_dereference_check() usage!" warning during > kvm_vm_release() because nested_release_vmcs12() calls > kvm_vcpu_write_guest_page() without holding kvm->srcu. > > Signed-off-by: David Matlack <dmatlack@google.com> > --- > This patch applies on top of Paolo's "[PATCH] KVM: nVMX: do not pin the VMCS12". > (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1455166.html) Thanks, I think Radim should first apply the RCU-on-teardown patch (which I'll resend formally today), then "do not pin the VMCS12", then these two. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> > > arch/x86/kvm/vmx.c | 16 +++++++++++----- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 5c03340f7827..07d2198db225 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -419,7 +419,7 @@ struct nested_vmx { > /* > * Cache of the guest's VMCS, existing outside of guest memory. > * Loaded from guest memory during VMPTRLD. Flushed to guest > - * memory during VMXOFF, VMCLEAR, VMPTRLD. > + * memory during VMCLEAR and VMPTRLD. > */ > struct vmcs12 *cached_vmcs12; > /* > @@ -7131,6 +7131,12 @@ static int nested_vmx_check_permission(struct kvm_vcpu *vcpu) > return 1; > } > > +static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx) > +{ > + vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, SECONDARY_EXEC_SHADOW_VMCS); > + vmcs_write64(VMCS_LINK_POINTER, -1ull); > +} > + > static inline void nested_release_vmcs12(struct vcpu_vmx *vmx) > { > if (vmx->nested.current_vmptr == -1ull) > @@ -7141,9 +7147,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx) > they were modified */ > copy_shadow_to_vmcs12(vmx); > vmx->nested.sync_shadow_vmcs = false; > - vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, > - SECONDARY_EXEC_SHADOW_VMCS); > - vmcs_write64(VMCS_LINK_POINTER, -1ull); > + vmx_disable_shadow_vmcs(vmx); > } > vmx->nested.posted_intr_nv = -1; > > @@ -7166,12 +7170,14 @@ static void free_nested(struct vcpu_vmx *vmx) > > vmx->nested.vmxon = false; > free_vpid(vmx->nested.vpid02); > - nested_release_vmcs12(vmx); > + vmx->nested.posted_intr_nv = -1; > + vmx->nested.current_vmptr = -1ull; > if (vmx->nested.msr_bitmap) { > free_page((unsigned long)vmx->nested.msr_bitmap); > vmx->nested.msr_bitmap = NULL; > } > if (enable_shadow_vmcs) { > + vmx_disable_shadow_vmcs(vmx); > vmcs_clear(vmx->vmcs01.shadow_vmcs); > free_vmcs(vmx->vmcs01.shadow_vmcs); > vmx->vmcs01.shadow_vmcs = NULL; >
2017-08-02 10:18+0200, Paolo Bonzini: > On 01/08/2017 23:00, David Matlack wrote: > > According to the Intel SDM, software cannot rely on the current VMCS to be > > coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12 > > flushes. > > > > 24.11.1 Software Use of Virtual-Machine Control Structures > > ... > > If a logical processor leaves VMX operation, any VMCSs active on > > that logical processor may be corrupted (see below). To prevent > > such corruption of a VMCS that may be used either after a return > > to VMX operation or on another logical processor, software should > > execute VMCLEAR for that VMCS before executing the VMXOFF instruction > > or removing power from the processor (e.g., as part of a transition > > to the S3 and S4 power states). > > ... > > > > This fixes a "suspicious rcu_dereference_check() usage!" warning during > > kvm_vm_release() because nested_release_vmcs12() calls > > kvm_vcpu_write_guest_page() without holding kvm->srcu. > > > > Signed-off-by: David Matlack <dmatlack@google.com> > > --- > > This patch applies on top of Paolo's "[PATCH] KVM: nVMX: do not pin the VMCS12". > > (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1455166.html) > > Thanks, I think Radim should first apply the RCU-on-teardown patch > (which I'll resend formally today), then "do not pin the VMCS12", then > these two. > > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Applied in that order, thanks.
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 5c03340f7827..07d2198db225 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -419,7 +419,7 @@ struct nested_vmx { /* * Cache of the guest's VMCS, existing outside of guest memory. * Loaded from guest memory during VMPTRLD. Flushed to guest - * memory during VMXOFF, VMCLEAR, VMPTRLD. + * memory during VMCLEAR and VMPTRLD. */ struct vmcs12 *cached_vmcs12; /* @@ -7131,6 +7131,12 @@ static int nested_vmx_check_permission(struct kvm_vcpu *vcpu) return 1; } +static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx) +{ + vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, SECONDARY_EXEC_SHADOW_VMCS); + vmcs_write64(VMCS_LINK_POINTER, -1ull); +} + static inline void nested_release_vmcs12(struct vcpu_vmx *vmx) { if (vmx->nested.current_vmptr == -1ull) @@ -7141,9 +7147,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx) they were modified */ copy_shadow_to_vmcs12(vmx); vmx->nested.sync_shadow_vmcs = false; - vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, - SECONDARY_EXEC_SHADOW_VMCS); - vmcs_write64(VMCS_LINK_POINTER, -1ull); + vmx_disable_shadow_vmcs(vmx); } vmx->nested.posted_intr_nv = -1; @@ -7166,12 +7170,14 @@ static void free_nested(struct vcpu_vmx *vmx) vmx->nested.vmxon = false; free_vpid(vmx->nested.vpid02); - nested_release_vmcs12(vmx); + vmx->nested.posted_intr_nv = -1; + vmx->nested.current_vmptr = -1ull; if (vmx->nested.msr_bitmap) { free_page((unsigned long)vmx->nested.msr_bitmap); vmx->nested.msr_bitmap = NULL; } if (enable_shadow_vmcs) { + vmx_disable_shadow_vmcs(vmx); vmcs_clear(vmx->vmcs01.shadow_vmcs); free_vmcs(vmx->vmcs01.shadow_vmcs); vmx->vmcs01.shadow_vmcs = NULL;
According to the Intel SDM, software cannot rely on the current VMCS to be coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12 flushes. 24.11.1 Software Use of Virtual-Machine Control Structures ... If a logical processor leaves VMX operation, any VMCSs active on that logical processor may be corrupted (see below). To prevent such corruption of a VMCS that may be used either after a return to VMX operation or on another logical processor, software should execute VMCLEAR for that VMCS before executing the VMXOFF instruction or removing power from the processor (e.g., as part of a transition to the S3 and S4 power states). ... This fixes a "suspicious rcu_dereference_check() usage!" warning during kvm_vm_release() because nested_release_vmcs12() calls kvm_vcpu_write_guest_page() without holding kvm->srcu. Signed-off-by: David Matlack <dmatlack@google.com> --- This patch applies on top of Paolo's "[PATCH] KVM: nVMX: do not pin the VMCS12". (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1455166.html) arch/x86/kvm/vmx.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)