Message ID | 1464269954-8056-4-git-send-email-feng.wu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote: > Normally, in vmx_cpu_block() 'NDST' filed should have the same > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending > on whether x2apic is enabled. However, in the following scenario, > 'NDST' has different value: > > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all > other three PI hooks have not been assigned or not been excuted yet. > And during this interval, we are running in vmx_vcpu_block(), then > 'NDST' may have different value. How about simply assigning vcpu_block last? Or alternatively holding pi_blocking_list_lock around all four assignments? Or (maybe in addition to one of these) writing something sensible in vmx_pi_hooks_assign() before doing the hook assignments? Of course much depends on whether the ASSERT() you replace was meant just as a sanity check, or instead logic elsewhere relies on NDST having the right value already before getting into vmx_vcpu_block(). Also, rather than saying twice that NDST has a different value in that case, how about stating what exact value it has then and why that's not what we want/need? (Same goes for the code comment then.) > This patch fix this concern case. Here as well as in earlier patches - do you perhaps mean "corner case"? Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Friday, May 27, 2016 10:00 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > Subject: Re: [PATCH v2 3/4] VMX: Assign the right value to 'NDST' field in a > concern case > > >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote: > > Normally, in vmx_cpu_block() 'NDST' filed should have the same > > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending > > on whether x2apic is enabled. However, in the following scenario, > > 'NDST' has different value: > > > > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all > > other three PI hooks have not been assigned or not been excuted yet. > > And during this interval, we are running in vmx_vcpu_block(), then > > 'NDST' may have different value. > > How about simply assigning vcpu_block last? Or alternatively > holding pi_blocking_list_lock around all four assignments? Or > (maybe in addition to one of these) writing something sensible in > vmx_pi_hooks_assign() before doing the hook assignments? The problem is vmx_vcpu_block() can still get called first, since the other ones might not get called yet. > Of course much depends on whether the ASSERT() you replace was > meant just as a sanity check, or instead logic elsewhere relies on > NDST having the right value already before getting into > vmx_vcpu_block(). The point is we need to make sure NDST have the right value after leaving vmx_vcpu_block(). > > Also, rather than saying twice that NDST has a different value in > that case, how about stating what exact value it has then and > why that's not what we want/need? (Same goes for the code > comment then.) Sure, will add this in the next version. > > > This patch fix this concern case. > > Here as well as in earlier patches - do you perhaps mean "corner > case"? Sorry for the typo. Thanks, Feng > > Jan
>>> On 31.05.16 at 12:27, <feng.wu@intel.com> wrote: >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: Friday, May 27, 2016 10:00 PM >> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote: >> > Normally, in vmx_cpu_block() 'NDST' filed should have the same >> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending >> > on whether x2apic is enabled. However, in the following scenario, >> > 'NDST' has different value: >> > >> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all >> > other three PI hooks have not been assigned or not been excuted yet. >> > And during this interval, we are running in vmx_vcpu_block(), then >> > 'NDST' may have different value. >> >> How about simply assigning vcpu_block last? Or alternatively >> holding pi_blocking_list_lock around all four assignments? Or >> (maybe in addition to one of these) writing something sensible in >> vmx_pi_hooks_assign() before doing the hook assignments? > > The problem is vmx_vcpu_block() can still get called first, since > the other ones might not get called yet. That refers to the first two questions, yet the third (unanswered one) was added by me because I anticipated that response. So what's wrong with that last option? Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Tuesday, May 31, 2016 7:58 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > Subject: RE: [PATCH v2 3/4] VMX: Assign the right value to 'NDST' field in a > concern case > > >>> On 31.05.16 at 12:27, <feng.wu@intel.com> wrote: > >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> Sent: Friday, May 27, 2016 10:00 PM > >> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote: > >> > Normally, in vmx_cpu_block() 'NDST' filed should have the same > >> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending > >> > on whether x2apic is enabled. However, in the following scenario, > >> > 'NDST' has different value: > >> > > >> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all > >> > other three PI hooks have not been assigned or not been excuted yet. > >> > And during this interval, we are running in vmx_vcpu_block(), then > >> > 'NDST' may have different value. > >> > >> How about simply assigning vcpu_block last? Or alternatively > >> holding pi_blocking_list_lock around all four assignments? Or > >> (maybe in addition to one of these) writing something sensible in > >> vmx_pi_hooks_assign() before doing the hook assignments? > > > > The problem is vmx_vcpu_block() can still get called first, since > > the other ones might not get called yet. > > That refers to the first two questions, yet the third (unanswered > one) was added by me because I anticipated that response. So > what's wrong with that last option? Do you mean " Or (maybe in addition to one of these) writing something sensible in vmx_pi_hooks_assign() before doing the hook assignments?"? The problem is you cannot know whether vmx_vcpu_block() is called before or after the other hooks. And I am wondering why do you think the current solution is not good. Thanks, Feng > > Jan
>>> On 03.06.16 at 07:23, <feng.wu@intel.com> wrote: > >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: Tuesday, May 31, 2016 7:58 PM >> To: Wu, Feng <feng.wu@intel.com> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- >> devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org >> Subject: RE: [PATCH v2 3/4] VMX: Assign the right value to 'NDST' field in a >> concern case >> >> >>> On 31.05.16 at 12:27, <feng.wu@intel.com> wrote: >> >> From: Jan Beulich [mailto:JBeulich@suse.com] >> >> Sent: Friday, May 27, 2016 10:00 PM >> >> >>> On 26.05.16 at 15:39, <feng.wu@intel.com> wrote: >> >> > Normally, in vmx_cpu_block() 'NDST' filed should have the same >> >> > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending >> >> > on whether x2apic is enabled. However, in the following scenario, >> >> > 'NDST' has different value: >> >> > >> >> > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all >> >> > other three PI hooks have not been assigned or not been excuted yet. >> >> > And during this interval, we are running in vmx_vcpu_block(), then >> >> > 'NDST' may have different value. >> >> >> >> How about simply assigning vcpu_block last? Or alternatively >> >> holding pi_blocking_list_lock around all four assignments? Or >> >> (maybe in addition to one of these) writing something sensible in >> >> vmx_pi_hooks_assign() before doing the hook assignments? >> > >> > The problem is vmx_vcpu_block() can still get called first, since >> > the other ones might not get called yet. >> >> That refers to the first two questions, yet the third (unanswered >> one) was added by me because I anticipated that response. So >> what's wrong with that last option? > > Do you mean " Or (maybe in addition to one of these) writing > something sensible in vmx_pi_hooks_assign() before doing the > hook assignments?"? Yes. > The problem is you cannot know whether vmx_vcpu_block() is > called before or after the other hooks. Well - it for sure can't get called before getting installed as a hook. I.e. if you set up proper state before installing it, you should be fine. > And I am wondering > why do you think the current solution is not good. See the other thread (on patch 1). When dealing with corner cases, as a first option (leading to less complication, or even simplification) it should be understood why they are corner cases in the first place, i.e. why the normal code flow doesn't take care of them. It's only the second best option to add extra logic / conditionals in order to deal with such. For the case at hand that first question translates to "Why does the ASSERT() not hold, despite us having thought it always would?" And yes, the change here isn't complicated, so may well be fine to go in, but together with the other patches I can't help getting the feeling that the overall situation wasn't really fully understood (and thus explained in the patch overview and descriptions). In particular, this part of the source comment + * 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all + * other three PI hooks have not been assigned or not been excuted yet. + * And during this interval, we get here in this function, then 'NDST' + * may have different value. suggests to me that my consideration above (about setting up state correctly _before_ assigning the hooks) provides a viable alternative solution. Jan
On Thu, May 26, 2016 at 2:39 PM, Feng Wu <feng.wu@intel.com> wrote: > Normally, in vmx_cpu_block() 'NDST' filed should have the same > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending > on whether x2apic is enabled. However, in the following scenario, > 'NDST' has different value: > > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all > other three PI hooks have not been assigned or not been excuted yet. > And during this interval, we are running in vmx_vcpu_block(), then > 'NDST' may have different value. > > This patch fix this concern case. > > Signed-off-by: Feng Wu <feng.wu@intel.com> I agree with Jan that a cleaner solution here would be making sure that all the appropriate state is actually set up for all vcpus before leaving vmx_pi_hooks_assign(). With the patch you propose, the following sequence of events is possible: * vcpu 0 starts running on a pcpu * a device is assigned, causing the hooks to be set * an interrupt from the device is routed to vcpu 0, but it is not actually delivered properly, since ndst is not pointing to the right processor. One option would be to pause all vcpus before setting the hooks and then un-pause them; this would force all the vcpus to go through vmx_pi_switch_to() before vmx_vcpu_block(). Another would be to grab the scheduler lock for each pcpu and write the vcpu's ndst with the appropriate value before setting the hooks. -George
> -----Original Message----- > From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George > Dunlap > Sent: Thursday, June 23, 2016 2:01 AM > To: Wu, Feng <feng.wu@intel.com> > Cc: xen-devel@lists.xen.org; Tian, Kevin <kevin.tian@intel.com>; Keir Fraser > <keir@xen.org>; Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli > <dario.faggioli@citrix.com>; Jan Beulich <jbeulich@suse.com> > Subject: Re: [Xen-devel] [PATCH v2 3/4] VMX: Assign the right value to 'NDST' > field in a concern case > > On Thu, May 26, 2016 at 2:39 PM, Feng Wu <feng.wu@intel.com> wrote: > > Normally, in vmx_cpu_block() 'NDST' filed should have the same > > value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending > > on whether x2apic is enabled. However, in the following scenario, > > 'NDST' has different value: > > > > 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all > > other three PI hooks have not been assigned or not been excuted yet. > > And during this interval, we are running in vmx_vcpu_block(), then > > 'NDST' may have different value. > > > > This patch fix this concern case. > > > > Signed-off-by: Feng Wu <feng.wu@intel.com> > > I agree with Jan that a cleaner solution here would be making sure > that all the appropriate state is actually set up for all vcpus before > leaving vmx_pi_hooks_assign(). With the patch you propose, the > following sequence of events is possible: > > * vcpu 0 starts running on a pcpu > * a device is assigned, causing the hooks to be set > * an interrupt from the device is routed to vcpu 0, but it is not > actually delivered properly, since ndst is not pointing to the right > processor. > > One option would be to pause all vcpus before setting the hooks and > then un-pause them; this would force all the vcpus to go through > vmx_pi_switch_to() before vmx_vcpu_block(). Another would be to grab > the scheduler lock for each pcpu and write the vcpu's ndst with the > appropriate value before setting the hooks. That sounds a great idea. Besides that, maybe we can also pause/unpause the domain before/after unsetting the hooks, then we don't need to care about the race condition when vmx_pi_hooks_deassign() and vmx_vcpu_block() get called at the same time. After unpause the domain, we can safely remove the vCPUs from the per-cpu blocking list if needed. Thanks, Feng > > -George
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index b01128a..662b38d 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -146,8 +146,19 @@ static void vmx_vcpu_block(struct vcpu *v) dest = cpu_physical_id(v->processor); - ASSERT(pi_desc->ndst == - (x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK))); + /* + * Normally, 'NDST' filed should have the same value with dest or + * MASK_INSR(dest, PI_xAPIC_NDST_MASK) depending on whether x2apic is + * enabled. However, in the following scenario, 'NDST' has different + * value: + * + * 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all + * other three PI hooks have not been assigned or not been excuted yet. + * And during this interval, we get here in this function, then 'NDST' + * may have different value. + */ + write_atomic(&pi_desc->ndst, + x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK)); write_atomic(&pi_desc->nv, pi_wakeup_vector); }
Normally, in vmx_cpu_block() 'NDST' filed should have the same value with 'dest' or 'MASK_INSR(dest, PI_xAPIC_NDST_MASK)' depending on whether x2apic is enabled. However, in the following scenario, 'NDST' has different value: 'vcpu_block' hook gets assigned in vmx_pi_hooks_assign(), but all other three PI hooks have not been assigned or not been excuted yet. And during this interval, we are running in vmx_vcpu_block(), then 'NDST' may have different value. This patch fix this concern case. Signed-off-by: Feng Wu <feng.wu@intel.com> --- xen/arch/x86/hvm/vmx/vmx.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)