diff mbox

[v2,3/4] VMX: Assign the right value to 'NDST' field in a concern case

Message ID 1464269954-8056-4-git-send-email-feng.wu@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wu, Feng May 26, 2016, 1:39 p.m. UTC
Normally, in vmx_cpu_block() 'NDST' filed should have the same
value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
on whether x2apic is enabled. However, in the following scenario,
'NDST' has different value:

'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
other three PI hooks have not been assigned or not been excuted yet.
And during this interval, we are running in vmx_vcpu_block(), then
'NDST' may have different value.

This patch fix this concern case.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Comments

Jan Beulich May 27, 2016, 2 p.m. UTC | #1
>>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote:
> Normally, in vmx_cpu_block() 'NDST' filed should have the same
> value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
> on whether x2apic is enabled. However, in the following scenario,
> 'NDST' has different value:
> 
> 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
> other three PI hooks have not been assigned or not been excuted yet.
> And during this interval, we are running in vmx_vcpu_block(), then
> 'NDST' may have different value.

How about simply assigning vcpu_block last? Or alternatively
holding pi_blocking_list_lock around all four assignments? Or
(maybe in addition to one of these) writing something sensible in
vmx_pi_hooks_assign() before doing the hook assignments? Of
course much depends on whether the ASSERT() you replace was
meant just as a sanity check, or instead logic elsewhere relies on
NDST having the right value already before getting into
vmx_vcpu_block().

Also, rather than saying twice that NDST has a different value in
that case, how about stating what exact value it has then and
why that's not what we want/need? (Same goes for the code
comment then.)

> This patch fix this concern case.

Here as well as in earlier patches - do you perhaps mean "corner
case"?

Jan
Wu, Feng May 31, 2016, 10:27 a.m. UTC | #2
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, May 27, 2016 10:00 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org
> Subject: Re: [PATCH v2 3/4] VMX: Assign the right value to 'NDST' field in a
> concern case
> 
> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote:
> > Normally, in vmx_cpu_block() 'NDST' filed should have the same
> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
> > on whether x2apic is enabled. However, in the following scenario,
> > 'NDST' has different value:
> >
> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
> > other three PI hooks have not been assigned or not been excuted yet.
> > And during this interval, we are running in vmx_vcpu_block(), then
> > 'NDST' may have different value.
> 
> How about simply assigning vcpu_block last? Or alternatively
> holding pi_blocking_list_lock around all four assignments? Or
> (maybe in addition to one of these) writing something sensible in
> vmx_pi_hooks_assign() before doing the hook assignments? 

The problem is vmx_vcpu_block() can still get called first, since
the other ones might not get called yet.

> Of course much depends on whether the ASSERT() you replace was
> meant just as a sanity check, or instead logic elsewhere relies on
> NDST having the right value already before getting into
> vmx_vcpu_block().

The point is we need to make sure NDST have the right value
after leaving vmx_vcpu_block().

> 
> Also, rather than saying twice that NDST has a different value in
> that case, how about stating what exact value it has then and
> why that's not what we want/need? (Same goes for the code
> comment then.)

Sure, will add this in the next version.

> 
> > This patch fix this concern case.
> 
> Here as well as in earlier patches - do you perhaps mean "corner
> case"?

Sorry for the typo.

Thanks,
Feng

> 
> Jan
Jan Beulich May 31, 2016, 11:58 a.m. UTC | #3
>>> On 31.05.16 at 12:27, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, May 27, 2016 10:00 PM
>> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote:
>> > Normally, in vmx_cpu_block() 'NDST' filed should have the same
>> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
>> > on whether x2apic is enabled. However, in the following scenario,
>> > 'NDST' has different value:
>> >
>> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
>> > other three PI hooks have not been assigned or not been excuted yet.
>> > And during this interval, we are running in vmx_vcpu_block(), then
>> > 'NDST' may have different value.
>> 
>> How about simply assigning vcpu_block last? Or alternatively
>> holding pi_blocking_list_lock around all four assignments? Or
>> (maybe in addition to one of these) writing something sensible in
>> vmx_pi_hooks_assign() before doing the hook assignments? 
> 
> The problem is vmx_vcpu_block() can still get called first, since
> the other ones might not get called yet.

That refers to the first two questions, yet the third (unanswered
one) was added by me because I anticipated that response. So
what's wrong with that last option?

Jan
Wu, Feng June 3, 2016, 5:23 a.m. UTC | #4
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, May 31, 2016 7:58 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org
> Subject: RE: [PATCH v2 3/4] VMX: Assign the right value to 'NDST' field in a
> concern case
> 
> >>> On 31.05.16 at 12:27, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Friday, May 27, 2016 10:00 PM
> >> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote:
> >> > Normally, in vmx_cpu_block() 'NDST' filed should have the same
> >> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
> >> > on whether x2apic is enabled. However, in the following scenario,
> >> > 'NDST' has different value:
> >> >
> >> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
> >> > other three PI hooks have not been assigned or not been excuted yet.
> >> > And during this interval, we are running in vmx_vcpu_block(), then
> >> > 'NDST' may have different value.
> >>
> >> How about simply assigning vcpu_block last? Or alternatively
> >> holding pi_blocking_list_lock around all four assignments? Or
> >> (maybe in addition to one of these) writing something sensible in
> >> vmx_pi_hooks_assign() before doing the hook assignments?
> >
> > The problem is vmx_vcpu_block() can still get called first, since
> > the other ones might not get called yet.
> 
> That refers to the first two questions, yet the third (unanswered
> one) was added by me because I anticipated that response. So
> what's wrong with that last option?

Do you mean " Or (maybe in addition to one of these) writing
something sensible in vmx_pi_hooks_assign() before doing the
hook assignments?"?

The problem is you cannot know whether vmx_vcpu_block() is
called before or after the other hooks. And I am wondering
why do you think the current solution is not good.

Thanks,
Feng

> 
> Jan
Jan Beulich June 3, 2016, 9:57 a.m. UTC | #5
>>> On 03.06.16 at 07:23, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, May 31, 2016 7:58 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org 
>> Subject: RE: [PATCH v2 3/4] VMX: Assign the right value to 'NDST' field in a
>> concern case
>> 
>> >>> On 31.05.16 at 12:27, <feng.wu@intel.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Friday, May 27, 2016 10:00 PM
>> >> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote:
>> >> > Normally, in vmx_cpu_block() 'NDST' filed should have the same
>> >> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
>> >> > on whether x2apic is enabled. However, in the following scenario,
>> >> > 'NDST' has different value:
>> >> >
>> >> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
>> >> > other three PI hooks have not been assigned or not been excuted yet.
>> >> > And during this interval, we are running in vmx_vcpu_block(), then
>> >> > 'NDST' may have different value.
>> >>
>> >> How about simply assigning vcpu_block last? Or alternatively
>> >> holding pi_blocking_list_lock around all four assignments? Or
>> >> (maybe in addition to one of these) writing something sensible in
>> >> vmx_pi_hooks_assign() before doing the hook assignments?
>> >
>> > The problem is vmx_vcpu_block() can still get called first, since
>> > the other ones might not get called yet.
>> 
>> That refers to the first two questions, yet the third (unanswered
>> one) was added by me because I anticipated that response. So
>> what's wrong with that last option?
> 
> Do you mean " Or (maybe in addition to one of these) writing
> something sensible in vmx_pi_hooks_assign() before doing the
> hook assignments?"?

Yes.

> The problem is you cannot know whether vmx_vcpu_block() is
> called before or after the other hooks.

Well - it for sure can't get called before getting installed as a hook.
I.e. if you set up proper state before installing it, you should be
fine.

> And I am wondering
> why do you think the current solution is not good.

See the other thread (on patch 1). When dealing with corner
cases, as a first option (leading to less complication, or even
simplification) it should be understood why they are corner
cases in the first place, i.e. why the normal code flow doesn't
take care of them. It's only the second best option to add
extra logic / conditionals in order to deal with such.

For the case at hand that first question translates to "Why
does the ASSERT() not hold, despite us having thought it
always would?" And yes, the change here isn't complicated,
so may well be fine to go in, but together with the other
patches I can't help getting the feeling that the overall
situation wasn't really fully understood (and thus explained in
the patch overview and descriptions). In particular, this part
of the source comment

+     * 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
+     * other three PI hooks have not been assigned or not been excuted yet.
+     * And during this interval, we get here in this function, then 'NDST'
+     * may have different value.

suggests to me that my consideration above (about setting up
state correctly _before_ assigning the hooks) provides a viable
alternative solution.

Jan
George Dunlap June 22, 2016, 6 p.m. UTC | #6
On Thu, May 26, 2016 at 2:39 PM, Feng Wu <feng.wu@intel.com> wrote:
> Normally, in vmx_cpu_block() 'NDST' filed should have the same
> value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending
> on whether x2apic is enabled. However, in the following scenario,
> 'NDST' has different value:
>
> 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
> other three PI hooks have not been assigned or not been excuted yet.
> And during this interval, we are running in vmx_vcpu_block(), then
> 'NDST' may have different value.
>
> This patch fix this concern case.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>

I agree with Jan that a cleaner solution here would be making sure
that all the appropriate state is actually set up for all vcpus before
leaving vmx_pi_hooks_assign().  With the patch you propose, the
following sequence of events is possible:

* vcpu 0 starts running on a pcpu
* a device is assigned, causing the hooks to be set
* an interrupt from the device is routed to vcpu 0, but it is not
actually delivered properly, since ndst is not pointing to the right
processor.

One option would be to pause all vcpus before setting the hooks and
then un-pause them; this would force all the vcpus to go through
vmx_pi_switch_to() before vmx_vcpu_block().  Another would be to grab
the scheduler lock for each pcpu and write the vcpu's ndst with the
appropriate value before setting the hooks.

 -George
Wu, Feng June 24, 2016, 9:08 a.m. UTC | #7
> -----Original Message-----

> From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George

> Dunlap

> Sent: Thursday, June 23, 2016 2:01 AM

> To: Wu, Feng <feng.wu@intel.com>

> Cc: xen-devel@lists.xen.org; Tian, Kevin <kevin.tian@intel.com>; Keir Fraser

> <keir@xen.org>; Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli

> <dario.faggioli@citrix.com>; Jan Beulich <jbeulich@suse.com>

> Subject: Re: [Xen-devel] [PATCH v2 3/4] VMX: Assign the right value to 'NDST'

> field in a concern case

> 

> On Thu, May 26, 2016 at 2:39 PM, Feng Wu <feng.wu@intel.com> wrote:

> > Normally, in vmx_cpu_block() 'NDST' filed should have the same

> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending

> > on whether x2apic is enabled. However, in the following scenario,

> > 'NDST' has different value:

> >

> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all

> > other three PI hooks have not been assigned or not been excuted yet.

> > And during this interval, we are running in vmx_vcpu_block(), then

> > 'NDST' may have different value.

> >

> > This patch fix this concern case.

> >

> > Signed-off-by: Feng Wu <feng.wu@intel.com>

> 

> I agree with Jan that a cleaner solution here would be making sure

> that all the appropriate state is actually set up for all vcpus before

> leaving vmx_pi_hooks_assign().  With the patch you propose, the

> following sequence of events is possible:

> 

> * vcpu 0 starts running on a pcpu

> * a device is assigned, causing the hooks to be set

> * an interrupt from the device is routed to vcpu 0, but it is not

> actually delivered properly, since ndst is not pointing to the right

> processor.

> 

> One option would be to pause all vcpus before setting the hooks and

> then un-pause them; this would force all the vcpus to go through

> vmx_pi_switch_to() before vmx_vcpu_block().  Another would be to grab

> the scheduler lock for each pcpu and write the vcpu's ndst with the

> appropriate value before setting the hooks.


That sounds a great idea. Besides that, maybe we can also pause/unpause
the domain before/after unsetting the hooks, then we don't need to
care about the race condition when vmx_pi_hooks_deassign() and
vmx_vcpu_block() get called at the same time. After unpause the domain,
we can safely remove the vCPUs from the per-cpu blocking list if needed.

Thanks,
Feng

> 

>  -George
diff mbox

Patch

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b01128a..662b38d 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -146,8 +146,19 @@  static void vmx_vcpu_block(struct vcpu *v)
 
     dest = cpu_physical_id(v->processor);
 
-    ASSERT(pi_desc->ndst ==
-           (x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK)));
+    /*
+     * Normally, 'NDST' filed should have the same value with dest or
+     * MASK_INSR(dest, PI_xAPIC_NDST_MASK) depending on whether x2apic is
+     * enabled. However, in the following scenario, 'NDST' has different
+     * value:
+     *
+     * 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all
+     * other three PI hooks have not been assigned or not been excuted yet.
+     * And during this interval, we get here in this function, then 'NDST'
+     * may have different value.
+     */
+    write_atomic(&pi_desc->ndst,
+                 x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
 
     write_atomic(&pi_desc->nv, pi_wakeup_vector);
 }