| Message ID | 1463734431-22353-4-git-send-email-feng.wu@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
> From: Wu, Feng
> Sent: Friday, May 20, 2016 4:54 PM
>
> We need to make sure the bocking vcpu is not in any per-cpu blocking list
> when the associated domain is going to be destroyed.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 4862b13..e74b3e7 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d)
>      d->arch.hvm_domain.vmx.pi_switch_to = NULL;
>  }
>
> +static void vmx_pi_blocking_list_cleanup(struct domain *d)

Is it more natural to move such cleanup under vcpu destroy?

> +{
> +    unsigned int cpu;
> +
> +    for_each_online_cpu ( cpu )
> +    {
> +        struct vcpu *v;
> +        unsigned long flags;
> +        struct arch_vmx_struct *vmx, *tmp;
> +        spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> +        struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
> +
> +        spin_lock_irqsave(lock, flags);
> +
> +        list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
> +        {
> +            v = container_of(vmx, struct vcpu, arch.hvm_vmx);
> +
> +            if (v->domain == d)
> +            {
> +                list_del(&vmx->pi_blocking.list);
> +                ASSERT(vmx->pi_blocking.lock == lock);
> +                vmx->pi_blocking.lock = NULL;
> +            }
> +        }
> +
> +        spin_unlock_irqrestore(lock, flags);
> +    }
> +}
> +
>  static int vmx_domain_initialise(struct domain *d)
>  {
>      int rc;
> @@ -265,6 +295,8 @@ static int vmx_domain_initialise(struct domain *d)
>
>  static void vmx_domain_destroy(struct domain *d)
>  {
> +    vmx_pi_blocking_list_cleanup(d);
> +
>      if ( !has_vlapic(d) )
>          return;
>
> --
> 2.1.0
> -----Original Message----- > From: Tian, Kevin > Sent: Monday, May 23, 2016 1:19 PM > To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org > Cc: keir@xen.org; jbeulich@suse.com; andrew.cooper3@citrix.com; > george.dunlap@eu.citrix.com; dario.faggioli@citrix.com; > konrad.wilk@oracle.com > Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > after domain termination > > > From: Wu, Feng > > Sent: Friday, May 20, 2016 4:54 PM > > > > We need to make sure the bocking vcpu is not in any per-cpu blocking list > > when the associated domain is going to be destroyed. > > > > Signed-off-by: Feng Wu <feng.wu@intel.com> > > --- > > xen/arch/x86/hvm/vmx/vmx.c | 32 > > ++++++++++++++++++++++++++++++++ > > 1 file changed, 32 insertions(+) > > > > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > > index 4862b13..e74b3e7 100644 > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > > } > > > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > > Is it more natural to move such cleanup under vcpu destroy? Yes, indeed it is more natural to add this function when vcpu is destroyed, however, the reason I add it here is I still have some concerns about the timing. When the VM is destroyed, here is the calling path: - vmx_pi_hooks_deassign() --> ...... - vmx_vcpu_destroy() --> ...... - vmx_domain_destroy() ...... As I replied in the previous mail, when we remove the vcpus from the blocking list, there might be some _in-flight_ call to the hooks, so I put the cleanup code in the vmx_domain_destroy(), which is a bit more far from vmx_pi_hooks_deassign, and hence safer. If you have any other good ideas, I am all ears!:) Thanks, Feng
> From: Wu, Feng > Sent: Monday, May 23, 2016 1:48 PM > > -----Original Message----- > > From: Tian, Kevin > > Sent: Monday, May 23, 2016 1:19 PM > > To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org > > Cc: keir@xen.org; jbeulich@suse.com; andrew.cooper3@citrix.com; > > george.dunlap@eu.citrix.com; dario.faggioli@citrix.com; > > konrad.wilk@oracle.com > > Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > > after domain termination > > > > > From: Wu, Feng > > > Sent: Friday, May 20, 2016 4:54 PM > > > > > > We need to make sure the bocking vcpu is not in any per-cpu blocking list > > > when the associated domain is going to be destroyed. > > > > > > Signed-off-by: Feng Wu <feng.wu@intel.com> > > > --- > > > xen/arch/x86/hvm/vmx/vmx.c | 32 > > > ++++++++++++++++++++++++++++++++ > > > 1 file changed, 32 insertions(+) > > > > > > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > > > index 4862b13..e74b3e7 100644 > > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > > > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > > > } > > > > > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > > > > Is it more natural to move such cleanup under vcpu destroy? > > Yes, indeed it is more natural to add this function when vcpu is destroyed, > however, the reason I add it here is I still have some concerns about the timing. > When the VM is destroyed, here is the calling path: > > - vmx_pi_hooks_deassign() --> > ...... > - vmx_vcpu_destroy() --> > ...... > - vmx_domain_destroy() > ...... > > As I replied in the previous mail, when we remove the vcpus from the blocking > list, there might be some _in-flight_ call to the hooks, so I put the cleanup > code in the vmx_domain_destroy(), which is a bit more far from vmx_pi_hooks_deassign, > and hence safer. If you have any other good ideas, I am all ears!:) > If we don't need reset callbacks at deassign path, as commented for patch 1/3, would it make above puzzle away? :-) Thanks Kevin
>>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote: >> From: Tian, Kevin >> Sent: Monday, May 23, 2016 1:19 PM >> > From: Wu, Feng >> > Sent: Friday, May 20, 2016 4:54 PM >> > --- a/xen/arch/x86/hvm/vmx/vmx.c >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c >> > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) >> > d->arch.hvm_domain.vmx.pi_switch_to = NULL; >> > } >> > >> > +static void vmx_pi_blocking_list_cleanup(struct domain *d) >> >> Is it more natural to move such cleanup under vcpu destroy? > > Yes, indeed it is more natural to add this function when vcpu is destroyed, > however, the reason I add it here is I still have some concerns about the > timing. > When the VM is destroyed, here is the calling path: > > - vmx_pi_hooks_deassign() --> > ...... > - vmx_vcpu_destroy() --> > ...... > - vmx_domain_destroy() > ...... > > As I replied in the previous mail, when we remove the vcpus from the > blocking > list, there might be some _in-flight_ call to the hooks, so I put the cleanup > code in the vmx_domain_destroy(), which is a bit more far from > vmx_pi_hooks_deassign, > and hence safer. If you have any other good ideas, I am all ears!:) Well, either there is a possible race (then moving the addition later just reduces the chances, but doesn't eliminate it), or there isn't (in which case Kevin's suggestion should probably be followed). Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Monday, May 23, 2016 5:08 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > after domain termination > > >>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote: > >> From: Tian, Kevin > >> Sent: Monday, May 23, 2016 1:19 PM > >> > From: Wu, Feng > >> > Sent: Friday, May 20, 2016 4:54 PM > >> > --- a/xen/arch/x86/hvm/vmx/vmx.c > >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c > >> > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > >> > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > >> > } > >> > > >> > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > >> > >> Is it more natural to move such cleanup under vcpu destroy? > > > > Yes, indeed it is more natural to add this function when vcpu is destroyed, > > however, the reason I add it here is I still have some concerns about the > > timing. > > When the VM is destroyed, here is the calling path: > > > > - vmx_pi_hooks_deassign() --> > > ...... > > - vmx_vcpu_destroy() --> > > ...... > > - vmx_domain_destroy() > > ...... > > > > As I replied in the previous mail, when we remove the vcpus from the > > blocking > > list, there might be some _in-flight_ call to the hooks, so I put the cleanup > > code in the vmx_domain_destroy(), which is a bit more far from > > vmx_pi_hooks_deassign, > > and hence safer. If you have any other good ideas, I am all ears!:) > > Well, either there is a possible race (then moving the addition > later just reduces the chances, but doesn't eliminate it), or there > isn't (in which case Kevin's suggestion should probably be followed). Yes, I agree, adding the cleanup code in domain destroy other than vcpu destroy point just reduces the risk, but not eliminate. So far I don't get a perfect solution to solve this possible race condition. Thanks, Feng > > Jan
> -----Original Message-----
> From: Wu, Feng
> Sent: Monday, May 23, 2016 5:18 PM
> To: Jan Beulich <JBeulich@suse.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>     george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
>     xen-devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org;
>     Wu, Feng <feng.wu@intel.com>
> Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list
>     after domain termination
>
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Monday, May 23, 2016 5:08 PM
> > To: Wu, Feng <feng.wu@intel.com>
> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >     george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
> >     xen-devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org
> > Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list
> >     after domain termination
> >
> > >>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote:
> > >> From: Tian, Kevin
> > >> Sent: Monday, May 23, 2016 1:19 PM
> > >> > From: Wu, Feng
> > >> > Sent: Friday, May 20, 2016 4:54 PM
> > >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > >> > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d)
> > >> >      d->arch.hvm_domain.vmx.pi_switch_to = NULL;
> > >> >  }
> > >> >
> > >> > +static void vmx_pi_blocking_list_cleanup(struct domain *d)
> > >>
> > >> Is it more natural to move such cleanup under vcpu destroy?
> > >
> > > Yes, indeed it is more natural to add this function when vcpu is destroyed,
> > > however, the reason I add it here is I still have some concerns about the
> > > timing. When the VM is destroyed, here is the calling path:
> > >
> > > - vmx_pi_hooks_deassign() -->
> > > ......
> > > - vmx_vcpu_destroy() -->
> > > ......
> > > - vmx_domain_destroy()
> > > ......
> > >
> > > As I replied in the previous mail, when we remove the vcpus from the
> > > blocking list, there might be some _in-flight_ call to the hooks, so I put
> > > the cleanup code in the vmx_domain_destroy(), which is a bit more far from
> > > vmx_pi_hooks_deassign, and hence safer. If you have any other good ideas,
> > > I am all ears!:)
> >
> > Well, either there is a possible race (then moving the addition
> > later just reduces the chances, but doesn't eliminate it), or there
> > isn't (in which case Kevin's suggestion should probably be followed).
>
> Yes, I agree, adding the cleanup code in domain destroy other than
> vcpu destroy point just reduces the risk, but not eliminate. So far I don't
> get a perfect solution to solve this possible race condition.

After more thinking about this, I think this race condition can be resolve
in the following way:

1. Define a per-vCPU flag, say, 'v->arch.hvm_vmx.pi_back_from_hotplug'
2. In vmx_pi_blocking_list_cleanup(), when we find the vCPU from an
   blocking list, after removing it, set the flag to 1
3. In vmx_vcpu_block(), add the following check:

     spin_lock_irqsave(pi_blocking_list_lock, flags);
+    if ( unlikely(v->arch.hvm_vmx.pi_blocking_cleaned_up == 1) )
+    {
+        /*
+         * The vcpu is to be destroyed and it has already been removed
+         * from the per-CPU list if it is blocking, we shouldn't add
+         * new vCPUs to the list.
+         */
+        spin_unlock_irqrestore(pi_blocking_list_lock, flags);
+        return;
+    }
+
     old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL,
                        pi_blocking_list_lock);

Then we can following Kevin's suggestion to move the addition
to vmx_vcpu_destory(). Any ideas?

Thanks,
Feng

>
> Thanks,
> Feng
>
> >
> > Jan
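For reference, a sketch of the cleanup side of this proposal (step 2): the flag would be set inside the loop body of vmx_pi_blocking_list_cleanup() from the posted patch, while the per-CPU blocking list lock is still held, so vmx_vcpu_block() either observes the flag under that same lock or has already queued the vCPU and gets removed here. The field name follows the check in step 3; the field itself is hypothetical and would have to be added to struct arch_vmx_struct, it is not part of the posted patch.

        list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
        {
            v = container_of(vmx, struct vcpu, arch.hvm_vmx);

            if ( v->domain == d )
            {
                list_del(&vmx->pi_blocking.list);
                ASSERT(vmx->pi_blocking.lock == lock);
                vmx->pi_blocking.lock = NULL;
                /* Proposed (hypothetical) per-vCPU flag, set under 'lock'. */
                vmx->pi_blocking_cleaned_up = 1;
            }
        }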
>>> On 23.05.16 at 12:35, <feng.wu@intel.com> wrote: >> From: Wu, Feng >> Sent: Monday, May 23, 2016 5:18 PM >> > From: Jan Beulich [mailto:JBeulich@suse.com] >> > Sent: Monday, May 23, 2016 5:08 PM >> > To: Wu, Feng <feng.wu@intel.com> >> > >>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote: >> > > Yes, indeed it is more natural to add this function when vcpu is destroyed, >> > > however, the reason I add it here is I still have some concerns about the >> > > timing. >> > > When the VM is destroyed, here is the calling path: >> > > >> > > - vmx_pi_hooks_deassign() --> >> > > ...... >> > > - vmx_vcpu_destroy() --> >> > > ...... >> > > - vmx_domain_destroy() >> > > ...... >> > > >> > > As I replied in the previous mail, when we remove the vcpus from the >> > > blocking >> > > list, there might be some _in-flight_ call to the hooks, so I put the cleanup >> > > code in the vmx_domain_destroy(), which is a bit more far from >> > > vmx_pi_hooks_deassign, >> > > and hence safer. If you have any other good ideas, I am all ears!:) >> > >> > Well, either there is a possible race (then moving the addition >> > later just reduces the chances, but doesn't eliminate it), or there >> > isn't (in which case Kevin's suggestion should probably be followed). >> >> Yes, I agree, adding the cleanup code in domain destroy other than >> vcpu destroy point just reduces the risk, but not eliminate. So far I don't >> get a perfect solution to solve this possible race condition. > > After more thinking about this, I think this race condition can be resolve > in the following way: > 1. Define a per-vCPU flag, say, 'v->arch.hvm_vmx.pi_back_from_hotplug' > 2. In vmx_pi_blocking_list_cleanup(), when we find the vCPU from an > blocking list, after removing it, set the flag to 1 > 3. In vmx_vcpu_block(), add the following check: > > spin_lock_irqsave(pi_blocking_list_lock, flags); > + if ( unlikely(v->arch.hvm_vmx.pi_blocking_cleaned_up == 1) ) > + { > + /* > + * The vcpu is to be destroyed and it has already been removed > + * from the per-CPU list if it is blocking, we shouldn't add > + * new vCPUs to the list. > + */ > + spin_unlock_irqrestore(pi_blocking_list_lock, flags); > + return; > + } > + > old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL, > pi_blocking_list_lock); > > Then we can following Kevin's suggestion to move the addition > to vmx_vcpu_destory(). Before adding yet another PI-related field, I'd really like to see other alternatives explored. In particular - can determination be based on some other state (considering the subject, e.g. per-domain one)? Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Monday, May 23, 2016 7:11 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > after domain termination > > >>> On 23.05.16 at 12:35, <feng.wu@intel.com> wrote: > >> From: Wu, Feng > >> Sent: Monday, May 23, 2016 5:18 PM > >> > From: Jan Beulich [mailto:JBeulich@suse.com] > >> > Sent: Monday, May 23, 2016 5:08 PM > >> > To: Wu, Feng <feng.wu@intel.com> > >> > >>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote: > >> > > Yes, indeed it is more natural to add this function when vcpu is destroyed, > >> > > however, the reason I add it here is I still have some concerns about the > >> > > timing. > >> > > When the VM is destroyed, here is the calling path: > >> > > > >> > > - vmx_pi_hooks_deassign() --> > >> > > ...... > >> > > - vmx_vcpu_destroy() --> > >> > > ...... > >> > > - vmx_domain_destroy() > >> > > ...... > >> > > > >> > > As I replied in the previous mail, when we remove the vcpus from the > >> > > blocking > >> > > list, there might be some _in-flight_ call to the hooks, so I put the cleanup > >> > > code in the vmx_domain_destroy(), which is a bit more far from > >> > > vmx_pi_hooks_deassign, > >> > > and hence safer. If you have any other good ideas, I am all ears!:) > >> > > >> > Well, either there is a possible race (then moving the addition > >> > later just reduces the chances, but doesn't eliminate it), or there > >> > isn't (in which case Kevin's suggestion should probably be followed). > >> > >> Yes, I agree, adding the cleanup code in domain destroy other than > >> vcpu destroy point just reduces the risk, but not eliminate. So far I don't > >> get a perfect solution to solve this possible race condition. > > > > After more thinking about this, I think this race condition can be resolve > > in the following way: > > 1. Define a per-vCPU flag, say, 'v->arch.hvm_vmx.pi_back_from_hotplug' > > 2. In vmx_pi_blocking_list_cleanup(), when we find the vCPU from an > > blocking list, after removing it, set the flag to 1 > > 3. In vmx_vcpu_block(), add the following check: > > > > spin_lock_irqsave(pi_blocking_list_lock, flags); > > + if ( unlikely(v->arch.hvm_vmx.pi_blocking_cleaned_up == 1) ) > > + { > > + /* > > + * The vcpu is to be destroyed and it has already been removed > > + * from the per-CPU list if it is blocking, we shouldn't add > > + * new vCPUs to the list. > > + */ > > + spin_unlock_irqrestore(pi_blocking_list_lock, flags); > > + return; > > + } > > + > > old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL, > > pi_blocking_list_lock); > > > > Then we can following Kevin's suggestion to move the addition > > to vmx_vcpu_destory(). > > Before adding yet another PI-related field, I'd really like to see other > alternatives explored. In particular - can determination be based on > some other state (considering the subject, e.g. per-domain one)? I think the point is we need to set some flag inside the spin_lock_irqsave()/spin_unlock_irqrestore() section in vmx_pi_blocking_list_cleanup() and check it after acquiring the lock in vmx_vcpu_block(), so the case condition can be eliminated, right? If that is the case, I am not sure how we can use other state. Thanks, Feng > > Jan
On Fri, 2016-05-20 at 16:53 +0800, Feng Wu wrote: > We need to make sure the bocking vcpu is not in any per-cpu blocking > list > when the associated domain is going to be destroyed. > > Signed-off-by: Feng Wu <feng.wu@intel.com> > --- > > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > } > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > +{ > + unsigned int cpu; > + > + for_each_online_cpu ( cpu ) > + { > + struct vcpu *v; > + unsigned long flags; > + struct arch_vmx_struct *vmx, *tmp; > + spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock; > + struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, > cpu).list; > + > + spin_lock_irqsave(lock, flags); > + > + list_for_each_entry_safe(vmx, tmp, blocked_vcpus, > pi_blocking.list) > + { > + v = container_of(vmx, struct vcpu, arch.hvm_vmx); > + > + if (v->domain == d) > + { > + list_del(&vmx->pi_blocking.list); > + ASSERT(vmx->pi_blocking.lock == lock); > + vmx->pi_blocking.lock = NULL; > + } > + } > + > + spin_unlock_irqrestore(lock, flags); > + } > So, I'm probably missing something very ver basic, but I don't see what's the reason why we need this loop... can't we arrange for checking list_empty(&v->arch.hvm_vmx.pi_blocking.list) ? :-O Regards, Dario
>>> On 20.05.16 at 10:53, <feng.wu@intel.com> wrote: > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > } > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > +{ > + unsigned int cpu; > + > + for_each_online_cpu ( cpu ) > + { > + struct vcpu *v; > + unsigned long flags; > + struct arch_vmx_struct *vmx, *tmp; > + spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock; > + struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list; > + > + spin_lock_irqsave(lock, flags); > + > + list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list) > + { Did you consider how long these two nested loops may take on a large system? Jan
>>> On 23.05.16 at 14:24, <feng.wu@intel.com> wrote: > >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: Monday, May 23, 2016 7:11 PM >> To: Wu, Feng <feng.wu@intel.com> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- >> devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org >> Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list >> after domain termination >> >> >>> On 23.05.16 at 12:35, <feng.wu@intel.com> wrote: >> >> From: Wu, Feng >> >> Sent: Monday, May 23, 2016 5:18 PM >> >> > From: Jan Beulich [mailto:JBeulich@suse.com] >> >> > Sent: Monday, May 23, 2016 5:08 PM >> >> > To: Wu, Feng <feng.wu@intel.com> >> >> > >>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote: >> >> > > Yes, indeed it is more natural to add this function when vcpu is > destroyed, >> >> > > however, the reason I add it here is I still have some concerns about the >> >> > > timing. >> >> > > When the VM is destroyed, here is the calling path: >> >> > > >> >> > > - vmx_pi_hooks_deassign() --> >> >> > > ...... >> >> > > - vmx_vcpu_destroy() --> >> >> > > ...... >> >> > > - vmx_domain_destroy() >> >> > > ...... >> >> > > >> >> > > As I replied in the previous mail, when we remove the vcpus from the >> >> > > blocking >> >> > > list, there might be some _in-flight_ call to the hooks, so I put the > cleanup >> >> > > code in the vmx_domain_destroy(), which is a bit more far from >> >> > > vmx_pi_hooks_deassign, >> >> > > and hence safer. If you have any other good ideas, I am all ears!:) >> >> > >> >> > Well, either there is a possible race (then moving the addition >> >> > later just reduces the chances, but doesn't eliminate it), or there >> >> > isn't (in which case Kevin's suggestion should probably be followed). >> >> >> >> Yes, I agree, adding the cleanup code in domain destroy other than >> >> vcpu destroy point just reduces the risk, but not eliminate. So far I don't >> >> get a perfect solution to solve this possible race condition. >> > >> > After more thinking about this, I think this race condition can be resolve >> > in the following way: >> > 1. Define a per-vCPU flag, say, 'v->arch.hvm_vmx.pi_back_from_hotplug' >> > 2. In vmx_pi_blocking_list_cleanup(), when we find the vCPU from an >> > blocking list, after removing it, set the flag to 1 >> > 3. In vmx_vcpu_block(), add the following check: >> > >> > spin_lock_irqsave(pi_blocking_list_lock, flags); >> > + if ( unlikely(v->arch.hvm_vmx.pi_blocking_cleaned_up == 1) ) >> > + { >> > + /* >> > + * The vcpu is to be destroyed and it has already been removed >> > + * from the per-CPU list if it is blocking, we shouldn't add >> > + * new vCPUs to the list. >> > + */ >> > + spin_unlock_irqrestore(pi_blocking_list_lock, flags); >> > + return; >> > + } >> > + >> > old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL, >> > pi_blocking_list_lock); >> > >> > Then we can following Kevin's suggestion to move the addition >> > to vmx_vcpu_destory(). >> >> Before adding yet another PI-related field, I'd really like to see other >> alternatives explored. In particular - can determination be based on >> some other state (considering the subject, e.g. per-domain one)? 
> > I think the point is we need to set some flag inside the > spin_lock_irqsave()/spin_unlock_irqrestore() section in > vmx_pi_blocking_list_cleanup() and check it after acquiring the lock > in vmx_vcpu_block(), so the case condition can be eliminated, right? > If that is the case, I am not sure how we can use other state. Since you only need this during domain shutdown, I'm not sure. For example, can't you simply use d->is_dying or d->is_shutting_down? Jan
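To illustrate Jan's suggestion, the check Feng proposed for vmx_vcpu_block() could test existing per-domain state instead of a new per-vCPU field, along the lines below. Whether d->is_dying (or d->is_shutting_down) is guaranteed to already be set by the time the blocking lists are cleaned up is an assumption that would still need to be confirmed against the destruction path; this is a sketch, not code from the thread.

    spin_lock_irqsave(pi_blocking_list_lock, flags);

    if ( unlikely(v->domain->is_dying) )
    {
        /*
         * The domain is being torn down; its vCPUs are being (or have
         * already been) removed from the per-CPU blocking lists, so
         * don't add this one back.
         */
        spin_unlock_irqrestore(pi_blocking_list_lock, flags);
        return;
    }

    old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL,
                       pi_blocking_list_lock);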
> -----Original Message----- > From: Dario Faggioli [mailto:dario.faggioli@citrix.com] > Sent: Monday, May 23, 2016 8:31 PM > To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org > Cc: keir@xen.org; Tian, Kevin <kevin.tian@intel.com>; jbeulich@suse.com; > andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; > konrad.wilk@oracle.com > Subject: Re: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > after domain termination > > On Fri, 2016-05-20 at 16:53 +0800, Feng Wu wrote: > > We need to make sure the bocking vcpu is not in any per-cpu blocking > > list > > when the associated domain is going to be destroyed. > > > > Signed-off-by: Feng Wu <feng.wu@intel.com> > > --- > > > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > > } > > > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > > +{ > > + unsigned int cpu; > > + > > + for_each_online_cpu ( cpu ) > > + { > > + struct vcpu *v; > > + unsigned long flags; > > + struct arch_vmx_struct *vmx, *tmp; > > + spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock; > > + struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, > > cpu).list; > > + > > + spin_lock_irqsave(lock, flags); > > + > > + list_for_each_entry_safe(vmx, tmp, blocked_vcpus, > > pi_blocking.list) > > + { > > + v = container_of(vmx, struct vcpu, arch.hvm_vmx); > > + > > + if (v->domain == d) > > + { > > + list_del(&vmx->pi_blocking.list); > > + ASSERT(vmx->pi_blocking.lock == lock); > > + vmx->pi_blocking.lock = NULL; > > + } > > + } > > + > > + spin_unlock_irqrestore(lock, flags); > > + } > > > So, I'm probably missing something very ver basic, but I don't see > what's the reason why we need this loop... can't we arrange for > checking > > list_empty(&v->arch.hvm_vmx.pi_blocking.list) Yes, I also cannot find the reason why can't we use this good suggestion, Except we need use list_del_init() instead of list_del() in the current code. Or we can just check whether ' vmx->pi_blocking.lock ' is NULL? I total don't know why I missed it! :) Thanks, Feng > > ? > > :-O > > Regards, > Dario > -- > <<This happens because I choose it to happen!>> (Raistlin Majere) > ----------------------------------------------------------------- > Dario Faggioli, Ph.D, http://about.me/dario.faggioli > Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Monday, May 23, 2016 8:36 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > Subject: Re: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > after domain termination > > >>> On 20.05.16 at 10:53, <feng.wu@intel.com> wrote: > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > > } > > > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > > +{ > > + unsigned int cpu; > > + > > + for_each_online_cpu ( cpu ) > > + { > > + struct vcpu *v; > > + unsigned long flags; > > + struct arch_vmx_struct *vmx, *tmp; > > + spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock; > > + struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list; > > + > > + spin_lock_irqsave(lock, flags); > > + > > + list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list) > > + { > > Did you consider how long these two nested loops may take on a > large system? > As Dario just mentioned, we may not need this loop at all. Thanks, Feng > Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Monday, May 23, 2016 8:47 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking list > after domain termination > > >>> On 23.05.16 at 14:24, <feng.wu@intel.com> wrote: > > > > >> -----Original Message----- > >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> Sent: Monday, May 23, 2016 7:11 PM > >> To: Wu, Feng <feng.wu@intel.com> > >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com; > >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen- > >> devel@lists.xen.org; konrad.wilk@oracle.com; keir@xen.org > >> Subject: RE: [PATCH 3/3] VMX: Remove the vcpu from the per-cpu blocking > list > >> after domain termination > >> > >> >>> On 23.05.16 at 12:35, <feng.wu@intel.com> wrote: > >> >> From: Wu, Feng > >> >> Sent: Monday, May 23, 2016 5:18 PM > >> >> > From: Jan Beulich [mailto:JBeulich@suse.com] > >> >> > Sent: Monday, May 23, 2016 5:08 PM > >> >> > To: Wu, Feng <feng.wu@intel.com> > >> >> > >>> On 23.05.16 at 07:48, <feng.wu@intel.com> wrote: > >> >> > > Yes, indeed it is more natural to add this function when vcpu is > > destroyed, > >> >> > > however, the reason I add it here is I still have some concerns about > the > >> >> > > timing. > >> >> > > When the VM is destroyed, here is the calling path: > >> >> > > > >> >> > > - vmx_pi_hooks_deassign() --> > >> >> > > ...... > >> >> > > - vmx_vcpu_destroy() --> > >> >> > > ...... > >> >> > > - vmx_domain_destroy() > >> >> > > ...... > >> >> > > > >> >> > > As I replied in the previous mail, when we remove the vcpus from the > >> >> > > blocking > >> >> > > list, there might be some _in-flight_ call to the hooks, so I put the > > cleanup > >> >> > > code in the vmx_domain_destroy(), which is a bit more far from > >> >> > > vmx_pi_hooks_deassign, > >> >> > > and hence safer. If you have any other good ideas, I am all ears!:) > >> >> > > >> >> > Well, either there is a possible race (then moving the addition > >> >> > later just reduces the chances, but doesn't eliminate it), or there > >> >> > isn't (in which case Kevin's suggestion should probably be followed). > >> >> > >> >> Yes, I agree, adding the cleanup code in domain destroy other than > >> >> vcpu destroy point just reduces the risk, but not eliminate. So far I don't > >> >> get a perfect solution to solve this possible race condition. > >> > > >> > After more thinking about this, I think this race condition can be resolve > >> > in the following way: > >> > 1. Define a per-vCPU flag, say, 'v->arch.hvm_vmx.pi_back_from_hotplug' > >> > 2. In vmx_pi_blocking_list_cleanup(), when we find the vCPU from an > >> > blocking list, after removing it, set the flag to 1 > >> > 3. In vmx_vcpu_block(), add the following check: > >> > > >> > spin_lock_irqsave(pi_blocking_list_lock, flags); > >> > + if ( unlikely(v->arch.hvm_vmx.pi_blocking_cleaned_up == 1) ) > >> > + { > >> > + /* > >> > + * The vcpu is to be destroyed and it has already been removed > >> > + * from the per-CPU list if it is blocking, we shouldn't add > >> > + * new vCPUs to the list. 
> >> > + */ > >> > + spin_unlock_irqrestore(pi_blocking_list_lock, flags); > >> > + return; > >> > + } > >> > + > >> > old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL, > >> > pi_blocking_list_lock); > >> > > >> > Then we can following Kevin's suggestion to move the addition > >> > to vmx_vcpu_destory(). > >> > >> Before adding yet another PI-related field, I'd really like to see other > >> alternatives explored. In particular - can determination be based on > >> some other state (considering the subject, e.g. per-domain one)? > > > > I think the point is we need to set some flag inside the > > spin_lock_irqsave()/spin_unlock_irqrestore() section in > > vmx_pi_blocking_list_cleanup() and check it after acquiring the lock > > in vmx_vcpu_block(), so the case condition can be eliminated, right? > > If that is the case, I am not sure how we can use other state. > > Since you only need this during domain shutdown, I'm not sure. For > example, can't you simply use d->is_dying or d->is_shutting_down? I am not sure those flags can be used for this case, since I think we should set the flags inside the spin lock area as mentioned above Anyway, I will think it more about this. Thanks, Feng > > Jan
On Mon, 2016-05-23 at 13:32 +0000, Wu, Feng wrote: > > > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > > @@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d) > > > d->arch.hvm_domain.vmx.pi_switch_to = NULL; > > > } > > > > > > +static void vmx_pi_blocking_list_cleanup(struct domain *d) > > > +{ > > > + unsigned int cpu; > > > + > > > + for_each_online_cpu ( cpu ) > > > + { > > > + struct vcpu *v; > > > + unsigned long flags; > > > + struct arch_vmx_struct *vmx, *tmp; > > > + spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock; > > > + struct list_head *blocked_vcpus = > > > &per_cpu(vmx_pi_blocking, > > > cpu).list; > > > + > > > + spin_lock_irqsave(lock, flags); > > > + > > > + list_for_each_entry_safe(vmx, tmp, blocked_vcpus, > > > pi_blocking.list) > > > + { > > > + v = container_of(vmx, struct vcpu, arch.hvm_vmx); > > > + > > > + if (v->domain == d) > > > + { > > > + list_del(&vmx->pi_blocking.list); > > > + ASSERT(vmx->pi_blocking.lock == lock); > > > + vmx->pi_blocking.lock = NULL; > > > + } > > > + } > > > + > > > + spin_unlock_irqrestore(lock, flags); > > > + } > > > > > So, I'm probably missing something very ver basic, but I don't see > > what's the reason why we need this loop... can't we arrange for > > checking > > > > list_empty(&v->arch.hvm_vmx.pi_blocking.list) > Yes, I also cannot find the reason why can't we use this good > suggestion, Except we need use list_del_init() instead of > list_del() in the current code. > Yes, I saw that, and it's well worth doing that, to get rid of the loop. :-) > Or we can just check whether > ' vmx->pi_blocking.lock ' is NULL? > I guess that will work as well. Still, if it were me doing this, I'd go for the list_del_init()/list_empty() approach. > I total don't know why I > missed it! :) > :-) Regards, Dario
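Putting the two suggestions together, one possible shape for the cleanup is a per-vCPU helper called from vmx_vcpu_destroy() instead of the per-domain loop over all online CPUs. The function name, the call site, and the switch to list_del_init() elsewhere in the PI code (so that list_empty() becomes a reliable "not queued" test) are assumptions for illustration, not code from the posted patch.

static void vmx_pi_blocking_cleanup(struct vcpu *v)
{
    unsigned long flags;
    spinlock_t *lock = v->arch.hvm_vmx.pi_blocking.lock;

    if ( lock == NULL )
        return; /* The vCPU is not on any per-CPU blocking list. */

    spin_lock_irqsave(lock, flags);

    /* Re-check under the lock: a concurrent wakeup may have dequeued us. */
    if ( v->arch.hvm_vmx.pi_blocking.lock == lock )
    {
        list_del_init(&v->arch.hvm_vmx.pi_blocking.list);
        v->arch.hvm_vmx.pi_blocking.lock = NULL;
    }

    spin_unlock_irqrestore(lock, flags);
}

This sidesteps the for_each_online_cpu() walk Jan asked about, but it still relies on nothing being able to re-queue the vCPU after the PI hooks have been torn down, which is the race discussed earlier in the thread.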
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4862b13..e74b3e7 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -248,6 +248,36 @@ void vmx_pi_hooks_deassign(struct domain *d)
     d->arch.hvm_domain.vmx.pi_switch_to = NULL;
 }
 
+static void vmx_pi_blocking_list_cleanup(struct domain *d)
+{
+    unsigned int cpu;
+
+    for_each_online_cpu ( cpu )
+    {
+        struct vcpu *v;
+        unsigned long flags;
+        struct arch_vmx_struct *vmx, *tmp;
+        spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+        struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
+
+        spin_lock_irqsave(lock, flags);
+
+        list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
+        {
+            v = container_of(vmx, struct vcpu, arch.hvm_vmx);
+
+            if (v->domain == d)
+            {
+                list_del(&vmx->pi_blocking.list);
+                ASSERT(vmx->pi_blocking.lock == lock);
+                vmx->pi_blocking.lock = NULL;
+            }
+        }
+
+        spin_unlock_irqrestore(lock, flags);
+    }
+}
+
 static int vmx_domain_initialise(struct domain *d)
 {
     int rc;
@@ -265,6 +295,8 @@ static int vmx_domain_initialise(struct domain *d)
 
 static void vmx_domain_destroy(struct domain *d)
 {
+    vmx_pi_blocking_list_cleanup(d);
+
     if ( !has_vlapic(d) )
         return;
We need to make sure the blocking vcpu is not in any per-cpu blocking list
when the associated domain is going to be destroyed.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)