| Message ID | 54355344.5050301@siemens.com (mailing list archive) |
|---|---|
| State | New, archived |
On 08/10/2014 17:07, Jan Kiszka wrote:
> As usual, this was a nasty race that involved some concurrent VCPUs and
> proper host load, so it was hard to write unit tests...
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 04fa1b8..d6bcaca 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6417,6 +6417,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
>  	const unsigned long *fields = shadow_read_write_fields;
>  	const int num_fields = max_shadow_read_write_fields;
>  
> +	preempt_disable();
> +
>  	vmcs_load(shadow_vmcs);
>  
>  	for (i = 0; i < num_fields; i++) {
> @@ -6440,6 +6442,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
>  
>  	vmcs_clear(shadow_vmcs);
>  	vmcs_load(vmx->loaded_vmcs->vmcs);
> +
> +	preempt_enable();
>  }
>  
>  static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
> @@ -6457,6 +6461,8 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
>  	u64 field_value = 0;
>  	struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs;
>  
> +	preempt_disable();
> +
>  	vmcs_load(shadow_vmcs);
>  
>  	for (q = 0; q < ARRAY_SIZE(fields); q++) {
> @@ -6483,6 +6489,8 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
>  
>  	vmcs_clear(shadow_vmcs);
>  	vmcs_load(vmx->loaded_vmcs->vmcs);
> +
> +	preempt_enable();
>  }
>
> No proper patch yet because there might be a smarter approach without
> using the preempt_disable() hammer.

copy_vmcs12_to_shadow already runs with preemption disabled; for stable@
it's not that bad to do the same in copy_shadow_to_vmcs12.

For 3.18 it would of course be nice to use loaded_vmcs properly, but that
would also incur some overhead.

Paolo

> But the point is that we temporarily
> load a vmcs without updating loaded_vmcs->vmcs. Now, if some other VCPU
> is scheduled in right in the middle of this, the wrong vmcs will be
> flushed and then reloaded - e.g. a non-shadow vmcs with that interrupt
> window flag set...
>
> The patch is currently under heavy load testing here, but it looks very
> good, as the bug was quickly reproducible before I applied it.
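To make the interleaving easier to follow, here is the race as a rough timeline, paraphrasing Jan's description above (the right-hand column summarizes KVM's sched-in handling; it is not quoted code):

```c
/*
 * vCPU A's task (preemptible)        another vCPU's task, same physical CPU
 * ---------------------------        --------------------------------------
 * copy_shadow_to_vmcs12()
 *   vmcs_load(shadow_vmcs)
 *     the hardware-current VMCS is
 *     now the shadow VMCS, but
 *     loaded_vmcs->vmcs still points
 *     at the old one
 *   ... preempted here ...
 *                                    sched-in trusts the stale loaded_vmcs
 *                                    bookkeeping, so the wrong VMCS gets
 *                                    flushed and then reloaded
 *   ... resumes ...
 *   vmcs_readl(...)
 *     reads now hit whatever VMCS
 *     ended up current, e.g. a
 *     non-shadow VMCS with the
 *     interrupt-window flag set, and
 *     those values are copied into
 *     vmcs12
 *   vmcs_clear(shadow_vmcs)
 *   vmcs_load(vmx->loaded_vmcs->vmcs)
 */
```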
On 2014-10-08 17:44, Paolo Bonzini wrote:
> On 08/10/2014 17:07, Jan Kiszka wrote:
>> As usual, this was a nasty race that involved some concurrent VCPUs and
>> proper host load, so it was hard to write unit tests...
>>
>> [patch snipped; see the diff at the bottom of this page]
>>
>> No proper patch yet because there might be a smarter approach without
>> using the preempt_disable() hammer.
>
> copy_vmcs12_to_shadow already runs with preemption disabled; for stable@
> it's not that bad to do the same in copy_shadow_to_vmcs12.
>
> For 3.18 it would of course be nice to use loaded_vmcs properly, but that
> would also incur some overhead.

If the other direction already runs under preempt_disable, I'm not sure
there is much to gain for this one. Anyway, fix sent.

Jan
On Wed, Oct 08, 2014 at 05:07:48PM +0200, Jan Kiszka wrote:
> On 2014-10-08 12:34, Paolo Bonzini wrote:
>> On 08/10/2014 12:29, Jan Kiszka wrote:
>>>> But it would write to the vmcs02, not to the shadow VMCS; the shadow
>>>> VMCS is active during copy_shadow_to_vmcs12/copy_vmcs12_to_shadow, and
>>>> at no other time. It is not clear to me how the VIRTUAL_INTR_PENDING
>>>> bit made its way from the vmcs02 (where it is perfectly fine) into the
>>>> vmcs12.
>>> Well, somehow that bit does end up in vmcs12, that's a fact. Also, the
>>> problem disappears when shadowing is disabled. I need to think about
>>> the path again. Maybe there is just a bug, not a conceptual issue.
>>
>> Yeah, and at this point we cannot actually exclude a processor bug. Can
>> you check that the bit is not in the shadow VMCS just before vmrun, or
>> just after enable_irq_window?
>>
>> Having a kvm-unit-tests testcase could also be of some help.
>
> As usual, this was a nasty race that involved some concurrent VCPUs and
> proper host load, so it was hard to write unit tests...
>
> [patch snipped; see the diff at the bottom of this page]
>
> No proper patch yet because there might be a smarter approach without
> using the preempt_disable() hammer. But the point is that we temporarily
> load a vmcs without updating loaded_vmcs->vmcs. Now, if some other VCPU
> is scheduled in right in the middle of this, the wrong vmcs will be
> flushed and then reloaded - e.g. a non-shadow vmcs with that interrupt
> window flag set...

Can a non-shadow vmcs and a shadow vmcs be present in one system
simultaneously?

Regards,
Wanpeng Li

> The patch is currently under heavy load testing here, but it looks very
> good, as the bug was quickly reproducible before I applied it.
>
> Jan
On Thu, Oct 09, 2014 at 07:34:47AM +0800, Wanpeng Li wrote:
> On Wed, Oct 08, 2014 at 05:07:48PM +0200, Jan Kiszka wrote:
>> [earlier discussion and patch snipped]
>>
>> No proper patch yet because there might be a smarter approach without
>> using the preempt_disable() hammer. But the point is that we temporarily
>> load a vmcs without updating loaded_vmcs->vmcs. Now, if some other VCPU
>> is scheduled in right in the middle of this, the wrong vmcs will be
>> flushed and then reloaded - e.g. a non-shadow vmcs with that interrupt
>> window flag set...
>
> Can a non-shadow vmcs and a shadow vmcs be present in one system
> simultaneously?

Ah, got it, you mean a non-current-shadow vmcs.

Regards,
Wanpeng Li
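The confusion the thread resolves here is easy to model outside the kernel. Below is a tiny user-space analogue (all names hypothetical, none of this is KVM code): `hw_current` stands in for the CPU's current-VMCS pointer, `tracked` for loaded_vmcs->vmcs, and the second thread plays the scheduler, which acts on the bookkeeping rather than on the real hardware state. The interleaving is forced with a join to keep the demonstration deterministic rather than racy.

```c
#include <pthread.h>
#include <stdio.h>

static int vmcs01 = 1, shadow_vmcs = 2;
static int *hw_current = &vmcs01; /* what the hardware really has loaded   */
static int *tracked = &vmcs01;    /* what loaded_vmcs-style tracking says  */

/* Models a vCPU switch in the middle of the copy: the "scheduler" acts on
 * the tracked pointer, so it flushes a VMCS that is not actually the
 * hardware-current one. */
static void *scheduler_thread(void *arg)
{
	(void)arg;
	if (hw_current != tracked)
		printf("BUG: flushing vmcs %d, but vmcs %d is current\n",
		       *tracked, *hw_current);
	return NULL;
}

int main(void)
{
	pthread_t t;

	hw_current = &shadow_vmcs; /* vmcs_load(shadow_vmcs); tracked is stale */
	pthread_create(&t, NULL, scheduler_thread, NULL); /* "preemption" hits */
	pthread_join(t, NULL);
	hw_current = tracked;      /* vmcs_load(vmx->loaded_vmcs->vmcs)        */
	return 0;
}
```

With the preempt_disable()/preempt_enable() bracket from the patch, the "scheduler" simply cannot run between the two assignments to `hw_current`.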
```diff
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 04fa1b8..d6bcaca 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6417,6 +6417,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 	const unsigned long *fields = shadow_read_write_fields;
 	const int num_fields = max_shadow_read_write_fields;
 
+	preempt_disable();
+
 	vmcs_load(shadow_vmcs);
 
 	for (i = 0; i < num_fields; i++) {
@@ -6440,6 +6442,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 
 	vmcs_clear(shadow_vmcs);
 	vmcs_load(vmx->loaded_vmcs->vmcs);
+
+	preempt_enable();
 }
 
 static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
@@ -6457,6 +6461,8 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
 	u64 field_value = 0;
 	struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs;
 
+	preempt_disable();
+
 	vmcs_load(shadow_vmcs);
 
 	for (q = 0; q < ARRAY_SIZE(fields); q++) {
@@ -6483,6 +6489,8 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
 
 	vmcs_clear(shadow_vmcs);
 	vmcs_load(vmx->loaded_vmcs->vmcs);
+
+	preempt_enable();
 }
 
 /*
```
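Condensed to its essence, the fix brackets the transient vmcs_load() that bypasses loaded_vmcs with a no-preemption region. A minimal sketch of the resulting pattern, reduced from the diff above (it leans on vmx.c internals, so it is not compilable on its own, and the helper name is invented for illustration):

```c
static void sync_shadow_vmcs_fields(struct vcpu_vmx *vmx,
				    struct vmcs *shadow_vmcs)
{
	/* No vCPU switch may observe the mismatch between the
	 * hardware-current VMCS and loaded_vmcs->vmcs ... */
	preempt_disable();

	vmcs_load(shadow_vmcs);	/* hardware-current != loaded_vmcs->vmcs */

	/* ... vmcs_read/vmcs_write traffic on the shadow VMCS ... */

	vmcs_clear(shadow_vmcs);
	vmcs_load(vmx->loaded_vmcs->vmcs); /* bookkeeping consistent again */

	preempt_enable();
}
```

The window is only a bounded loop of VMREADs/VMWRITEs, so holding off preemption here is cheaper than teaching the loaded_vmcs machinery about the shadow VMCS, which is the trade-off Paolo and Jan settle on above.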