Message ID | 1502860478-84512-2-git-send-email-chao.gao@intel.com (mailing list archive) |
---|---|
State | New, archived |
>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: > @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) > spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); > } > > +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, > + struct vmx_pi_blocking_vcpu *vpbv) > +{ > + ASSERT(spin_is_locked(&vpbv->lock)); You realize this is only a very weak check for a non-recursive lock? > + add_sized(&vpbv->counter, 1); > + ASSERT(read_atomic(&vpbv->counter)); Why add_sized() and read_atomic() when you hold the lock? Jan
On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: >> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >> spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); >> } >> >> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >> + struct vmx_pi_blocking_vcpu *vpbv) >> +{ >> + ASSERT(spin_is_locked(&vpbv->lock)); > >You realize this is only a very weak check for a non-recursive lock? I just thought the lock should be held when adding one entry to the blocking list. Do you think we should remove this check or make it stricter? > >> + add_sized(&vpbv->counter, 1); >> + ASSERT(read_atomic(&vpbv->counter)); > >Why add_sized() and read_atomic() when you hold the lock? > In patch 3, the counter is read frequently to find a suitable vcpu, and using add_sized() and read_atomic() lets those readers avoid acquiring the lock. In short, the lock doesn't protect the counter. Thanks Chao
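For illustration, a minimal sketch of the split Chao describes: the counter is updated inside the locked region, while patch 3's readers sample it without the lock. The helper names and the selection loop are hypothetical, not the actual patch 3 code; only per_cpu(), read_atomic(), add_sized() and the list helpers are taken from Xen as-is.

```c
/* Writer side: runs with the per-CPU blocking-list lock held. */
static void pi_list_add(struct pi_blocking_vcpu *pbv,
                        struct vmx_pi_blocking_vcpu *vpbv)
{
    ASSERT(spin_is_locked(&vpbv->lock));
    add_sized(&vpbv->counter, 1);           /* store is single-copy atomic */
    list_add_tail(&pbv->list, &vpbv->list);
}

/* Lock-less reader (hypothetical, patch-3 style): sample every CPU's
 * counter without taking its lock, to pick the least loaded list. */
static unsigned int pi_pick_cpu(const cpumask_t *online)
{
    unsigned int cpu, best = smp_processor_id(), min = ~0U;

    for_each_cpu ( cpu, online )
    {
        unsigned int cnt = read_atomic(&per_cpu(vmx_pi_blocking, cpu).counter);

        if ( cnt < min )
        {
            min = cnt;
            best = cpu;
        }
    }

    return best;
}
```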
On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: >>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote: >> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: >>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >>>> spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); >>>> } >>>> >>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >>>> + struct vmx_pi_blocking_vcpu *vpbv) >>>> +{ >>>> + ASSERT(spin_is_locked(&vpbv->lock)); >>> >>>You realize this is only a very weak check for a non-recursive lock? >> >> I just thought the lock should be held when adding one entry to the >> blocking list. Do you think we should remove this check or make it >> stricter? > >Well, the primary purpose of my comment was to make you aware >of the fact. If the weak check is good enough for you, then fine. To be honest, I don't know the difference between weak check and tight check. >Removing the check would be a bad idea imo (but see also below); >tightening might be worthwhile, but might also go too far (depending >mainly on how clearly provable it is that all callers actually hold the >lock). IMO, the lock was introduced (not by me) to protect the blocking list. list_add() and list_del() should be performed with the lock held. So I think it is clear that all callers should hold the lock. > >>>> + add_sized(&vpbv->counter, 1); >>>> + ASSERT(read_atomic(&vpbv->counter)); >>> >>>Why add_sized() and read_atomic() when you hold the lock? >>> >> >> In patch 3, frequent reading the counter is used to find a suitable >> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the >> lock. In one word, the lock doesn't protect the counter. > >In that case it would be more natural to switch to the atomic >accesses there. Plus you still wouldn't need read_atomic() >here, with the lock held. Furthermore I would then wonder >whether it wasn't better to use atomic_t for the counter at Is there some basic guide on when it is better to use read_atomic() and add_sized() and when it is better to define a atomic variable directly? >that point. Also with a lock-less readers the requirement to >hold a lock here (rather than using suitable LOCKed accesses) >becomes questionable too. As I said above, I think the lock is used to protect the list. I think this patch has two parts: 1. Move all list operations to two inline functions. (with this, adding a counter is easier and don't need add code in several places.) 2. Add a counter. Thanks Chao
>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote: > On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: >>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >>> spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); >>> } >>> >>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >>> + struct vmx_pi_blocking_vcpu *vpbv) >>> +{ >>> + ASSERT(spin_is_locked(&vpbv->lock)); >> >>You realize this is only a very weak check for a non-recursive lock? > > I just thought the lock should be held when adding one entry to the > blocking list. Do you think we should remove this check or make it > stricter? Well, the primary purpose of my comment was to make you aware of the fact. If the weak check is good enough for you, then fine. Removing the check would be a bad idea imo (but see also below); tightening might be worthwhile, but might also go too far (depending mainly on how clearly provable it is that all callers actually hold the lock). >>> + add_sized(&vpbv->counter, 1); >>> + ASSERT(read_atomic(&vpbv->counter)); >> >>Why add_sized() and read_atomic() when you hold the lock? >> > > In patch 3, frequent reading the counter is used to find a suitable > vcpu and we can use add_sized() and read_atomic() to avoid acquiring the > lock. In one word, the lock doesn't protect the counter. In that case it would be more natural to switch to the atomic accesses there. Plus you still wouldn't need read_atomic() here, with the lock held. Furthermore I would then wonder whether it wasn't better to use atomic_t for the counter at that point. Also with a lock-less readers the requirement to hold a lock here (rather than using suitable LOCKed accesses) becomes questionable too. Jan
On Thu, Aug 31, 2017 at 02:33:57AM -0600, Jan Beulich wrote: >>>> On 31.08.17 at 09:15, <chao.gao@intel.com> wrote: >> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: >>>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote: >>>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: >>>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >>>>>> spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); >>>>>> } >>>>>> >>>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >>>>>> + struct vmx_pi_blocking_vcpu *vpbv) >>>>>> +{ >>>>>> + ASSERT(spin_is_locked(&vpbv->lock)); >>>>> >>>>>You realize this is only a very weak check for a non-recursive lock? >>>> >>>> I just thought the lock should be held when adding one entry to the >>>> blocking list. Do you think we should remove this check or make it >>>> stricter? >>> >>>Well, the primary purpose of my comment was to make you aware >>>of the fact. If the weak check is good enough for you, then fine. >> >> To be honest, I don't know the difference between weak check and tight >> check. > >For non-recursive locks spin_is_locked() only tells you if _any_ >CPU in the system currently holds the lock. For recursive ones it >checks whether it's the local CPU that owns the lock. This remake is impressive to me. > >>>Removing the check would be a bad idea imo (but see also below); >>>tightening might be worthwhile, but might also go too far (depending >>>mainly on how clearly provable it is that all callers actually hold the >>>lock). >> >> IMO, the lock was introduced (not by me) to protect the blocking list. >> list_add() and list_del() should be performed with the lock held. So I >> think it is clear that all callers should hold the lock. > >Good. > >>>>>> + add_sized(&vpbv->counter, 1); >>>>>> + ASSERT(read_atomic(&vpbv->counter)); >>>>> >>>>>Why add_sized() and read_atomic() when you hold the lock? >>>>> >>>> >>>> In patch 3, frequent reading the counter is used to find a suitable >>>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the >>>> lock. In one word, the lock doesn't protect the counter. >>> >>>In that case it would be more natural to switch to the atomic >>>accesses there. Plus you still wouldn't need read_atomic() >>>here, with the lock held. Furthermore I would then wonder >>>whether it wasn't better to use atomic_t for the counter at >> >> Is there some basic guide on when it is better to use read_atomic() >> and add_sized() and when it is better to define a atomic variable >> directly? > >If an atomic_t variable fits your needs, I think it should always >be preferred. add_sized() was introduced for a case where an >atomic_t variable would not have been usable. Please also >consult older commits for understanding the background. Ok. I will. Thanks for your guide. > >>>that point. Also with a lock-less readers the requirement to >>>hold a lock here (rather than using suitable LOCKed accesses) >>>becomes questionable too. >> >> As I said above, I think the lock is used to protect the list. >> >> I think this patch has two parts: >> 1. Move all list operations to two inline functions. (with this, adding >> a counter is easier and don't need add code in several places.) >> >> 2. Add a counter. > >With it being left unclear whether the counter is meant to >also be protected by the lock: In the patch here you claim it >is, yet by later introducing lock-less readers you weaken >that model. 
>Hence the request to bring things into a >consistent state right away, and ideally also into the final >state. Sure. I will clarify this and make things consistent. Thanks Chao
>>> On 31.08.17 at 09:15, <chao.gao@intel.com> wrote: > On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: >>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote: >>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: >>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >>>>> spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); >>>>> } >>>>> >>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >>>>> + struct vmx_pi_blocking_vcpu *vpbv) >>>>> +{ >>>>> + ASSERT(spin_is_locked(&vpbv->lock)); >>>> >>>>You realize this is only a very weak check for a non-recursive lock? >>> >>> I just thought the lock should be held when adding one entry to the >>> blocking list. Do you think we should remove this check or make it >>> stricter? >> >>Well, the primary purpose of my comment was to make you aware >>of the fact. If the weak check is good enough for you, then fine. > > To be honest, I don't know the difference between weak check and tight > check. For non-recursive locks spin_is_locked() only tells you if _any_ CPU in the system currently holds the lock. For recursive ones it checks whether it's the local CPU that owns the lock. >>Removing the check would be a bad idea imo (but see also below); >>tightening might be worthwhile, but might also go too far (depending >>mainly on how clearly provable it is that all callers actually hold the >>lock). > > IMO, the lock was introduced (not by me) to protect the blocking list. > list_add() and list_del() should be performed with the lock held. So I > think it is clear that all callers should hold the lock. Good. >>>>> + add_sized(&vpbv->counter, 1); >>>>> + ASSERT(read_atomic(&vpbv->counter)); >>>> >>>>Why add_sized() and read_atomic() when you hold the lock? >>>> >>> >>> In patch 3, frequent reading the counter is used to find a suitable >>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the >>> lock. In one word, the lock doesn't protect the counter. >> >>In that case it would be more natural to switch to the atomic >>accesses there. Plus you still wouldn't need read_atomic() >>here, with the lock held. Furthermore I would then wonder >>whether it wasn't better to use atomic_t for the counter at > > Is there some basic guide on when it is better to use read_atomic() > and add_sized() and when it is better to define a atomic variable > directly? If an atomic_t variable fits your needs, I think it should always be preferred. add_sized() was introduced for a case where an atomic_t variable would not have been usable. Please also consult older commits for understanding the background. >>that point. Also with a lock-less readers the requirement to >>hold a lock here (rather than using suitable LOCKed accesses) >>becomes questionable too. > > As I said above, I think the lock is used to protect the list. > > I think this patch has two parts: > 1. Move all list operations to two inline functions. (with this, adding > a counter is easier and don't need add code in several places.) > > 2. Add a counter. With it being left unclear whether the counter is meant to also be protected by the lock: In the patch here you claim it is, yet by later introducing lock-less readers you weaken that model. Hence the request to bring things into a consistent state right away, and ideally also into the final state. Jan
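As a sketch only, the atomic_t alternative Jan mentions might look as follows, assuming Xen's atomic_inc()/atomic_dec()/atomic_read(); whether this is preferable to a plain variable updated under the lock is what the rest of the thread works out.

```c
struct vmx_pi_blocking_vcpu {
    struct list_head list;
    spinlock_t       lock;
    atomic_t         counter;   /* readable without holding the lock */
};

static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
                            struct vmx_pi_blocking_vcpu *vpbv)
{
    ASSERT(spin_is_locked(&vpbv->lock));
    atomic_inc(&vpbv->counter);
    list_add_tail(&pbv->list, &vpbv->list);
}

static void vmx_pi_del_vcpu(struct pi_blocking_vcpu *pbv,
                            struct vmx_pi_blocking_vcpu *vpbv)
{
    ASSERT(spin_is_locked(&vpbv->lock));
    ASSERT(atomic_read(&vpbv->counter));
    list_del(&pbv->list);
    atomic_dec(&vpbv->counter);
}
```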
On Thu, Aug 31, 2017 at 02:33:57AM -0600, Jan Beulich wrote: >>>> On 31.08.17 at 09:15, <chao.gao@intel.com> wrote: >> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote: >>>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote: >>>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote: >>>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote: >>>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu) >>>>>> spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock); >>>>>> } >>>>>> >>>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv, >>>>>> + struct vmx_pi_blocking_vcpu *vpbv) >>>>>> +{ >>>>>> + ASSERT(spin_is_locked(&vpbv->lock)); >>>>> >>>>>You realize this is only a very weak check for a non-recursive lock? >>>> >>>> I just thought the lock should be held when adding one entry to the >>>> blocking list. Do you think we should remove this check or make it >>>> stricter? >>> >>>Well, the primary purpose of my comment was to make you aware >>>of the fact. If the weak check is good enough for you, then fine. >> >> To be honest, I don't know the difference between weak check and tight >> check. > >For non-recursive locks spin_is_locked() only tells you if _any_ >CPU in the system currently holds the lock. For recursive ones it >checks whether it's the local CPU that owns the lock. > >>>Removing the check would be a bad idea imo (but see also below); >>>tightening might be worthwhile, but might also go too far (depending >>>mainly on how clearly provable it is that all callers actually hold the >>>lock). >> >> IMO, the lock was introduced (not by me) to protect the blocking list. >> list_add() and list_del() should be performed with the lock held. So I >> think it is clear that all callers should hold the lock. > >Good. > >>>>>> + add_sized(&vpbv->counter, 1); >>>>>> + ASSERT(read_atomic(&vpbv->counter)); >>>>> >>>>>Why add_sized() and read_atomic() when you hold the lock? >>>>> >>>> >>>> In patch 3, frequent reading the counter is used to find a suitable >>>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the >>>> lock. In one word, the lock doesn't protect the counter. >>> >>>In that case it would be more natural to switch to the atomic >>>accesses there. Plus you still wouldn't need read_atomic() >>>here, with the lock held. Furthermore I would then wonder >>>whether it wasn't better to use atomic_t for the counter at >> >> Is there some basic guide on when it is better to use read_atomic() >> and add_sized() and when it is better to define a atomic variable >> directly? > >If an atomic_t variable fits your needs, I think it should always >be preferred. add_sized() was introduced for a case where an >atomic_t variable would not have been usable. Please also >consult older commits for understanding the background. > >>>that point. Also with a lock-less readers the requirement to >>>hold a lock here (rather than using suitable LOCKed accesses) >>>becomes questionable too. >> >> As I said above, I think the lock is used to protect the list. >> >> I think this patch has two parts: >> 1. Move all list operations to two inline functions. (with this, adding >> a counter is easier and don't need add code in several places.) >> >> 2. Add a counter. > >With it being left unclear whether the counter is meant to >also be protected by the lock: In the patch here you claim it >is, yet by later introducing lock-less readers you weaken >that model. 
>Hence the request to bring things into a >consistent state right away, and ideally also into the final >state. > Hi, Jan. After thinking it again, I want to define the counter as an unsigned int variable for the following reasons: 1. The counter is clearly tied to list_add() and list_del(). If the list is protected by the lock, it is straightforward that the counter is also protected by the lock. 2. In patch 3, although there are some lock-less readers, we will re-check, with the lock held, that the counter still meets our requirement. Thus, I don't think there is a race. Thanks Chao
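A hypothetical sketch of the model described above: the counter is written only under the lock, a lock-less read serves as a hint, and the decision is re-validated with the lock held. The function name, the limit parameter and the plain spin_lock() are illustrative; the real blocking-list lock is taken IRQ-safe, and the pi_blocking.lock bookkeeping is omitted.

```c
static bool pi_try_add(struct vcpu *v, unsigned int cpu, unsigned int limit)
{
    struct vmx_pi_blocking_vcpu *vpbv = &per_cpu(vmx_pi_blocking, cpu);

    /* Optimistic, lock-less check: may race, used only as a hint. */
    if ( read_atomic(&vpbv->counter) >= limit )
        return false;

    spin_lock(&vpbv->lock);

    /* Re-check under the lock; the counter cannot change while we hold it. */
    if ( vpbv->counter >= limit )
    {
        spin_unlock(&vpbv->lock);
        return false;
    }

    vmx_pi_add_vcpu(&v->arch.hvm_vmx.pi_blocking, vpbv);
    spin_unlock(&vpbv->lock);

    return true;
}
```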
On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote: >>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote: >> After thinking it again, I want to define the counter as >> a unsigned int variable for the following reasion: >> 1. It is definite that the counter is closely related with >> list_add() and list_del(). If the list is protected by the >> lock, it is straightforward that the counter is also protected >> by the lock. >> 2. In patch 3, althought there are some lock-less readers, we >> will check the counter still meets our requirement with the lock >> held. Thus, I don't think there is a racing issue. > >I think that's fine, but then you still don't need LOCKed accesses >to the counter for updating it; write_atomic() will suffice afaict. A stupid question. Is it contradictory that you think the counter can be protected by the lock while suggesting write_atomic() instead of LOCKed accesses? Updating the counter is always accompanied by updating the list, and updating the list should be done in the locked region. I meant something like: spin_lock() list_add() counter++ spin_unlock() However, I am afraid that using write_atomic() rather than LOCKed accesses means something like the following (and separating the counter update from the list update doesn't seem good to me): spin_lock() list_add() spin_unlock() write_atomic() And I think this version of the patch does: spin_lock() list_add() add_sized() spin_unlock() Thanks Chao
>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote: > After thinking it again, I want to define the counter as > a unsigned int variable for the following reasion: > 1. It is definite that the counter is closely related with > list_add() and list_del(). If the list is protected by the > lock, it is straightforward that the counter is also protected > by the lock. > 2. In patch 3, althought there are some lock-less readers, we > will check the counter still meets our requirement with the lock > held. Thus, I don't think there is a racing issue. I think that's fine, but then you still don't need LOCKed accesses to the counter for updating it; write_atomic() will suffice afaict. Jan
On Fri, Sep 01, 2017 at 03:13:17AM -0600, Jan Beulich wrote: >>>> On 01.09.17 at 09:55, <chao.gao@intel.com> wrote: >> On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote: >>>>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote: >>>> After thinking it again, I want to define the counter as >>>> a unsigned int variable for the following reasion: >>>> 1. It is definite that the counter is closely related with >>>> list_add() and list_del(). If the list is protected by the >>>> lock, it is straightforward that the counter is also protected >>>> by the lock. >>>> 2. In patch 3, althought there are some lock-less readers, we >>>> will check the counter still meets our requirement with the lock >>>> held. Thus, I don't think there is a racing issue. >>> >>>I think that's fine, but then you still don't need LOCKed accesses >>>to the counter for updating it; write_atomic() will suffice afaict. >> >> A stupid question. >> Is it contradictory that you think the counter can be protected by >> the lock while suggesting using write_atomic() instead of LOCKed >> accesses? >> >> updating the counter is always accompanied by updating list and updating >> list should in locked region. I meaned things like: >> >> spin_lock() >> list_add() >> counter++ >> spin_unlock() >> >> However, I am afraid that not using LOCKed accesses but using >> write_atomic() means something like (separating updating the counter >> from updating the list I think is not good): >> >> spin_lock() >> list_add() >> spin_unlock() >> write_atomic() > >No, I mean > > spin_lock() > list_add() > write_atomic() > spin_unlock() > >whereas ... > >> And I think this version is: >> >> spin_lock() >> list_add() >> add_sized() >> spin_unlock() > >... this produces a needless LOCKed instruction redundant with being >inside the locked region). it seems add_sized() won't be a LOCKed instruction. #define build_add_sized(name, size, type, reg) \ static inline void name(volatile type *addr, type val) \ { \ asm volatile("add" size " %1,%0" \ : "=m" (*addr) \ : reg (val)); \ } Thanks Chao
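For reference, the difference Chao points out, spelled out as two hypothetical helpers (not Xen code): add_sized() expands to a plain add, while Xen's x86 atomic_*() helpers use a lock prefix.

```c
/* A plain add: one instruction, but the read-modify-write is not atomic
 * with respect to other CPUs updating the same location. */
static inline void plain_add(volatile unsigned int *addr, unsigned int val)
{
    asm volatile ( "addl %1, %0" : "+m" (*addr) : "ri" (val) );
}

/* A LOCKed add: the lock prefix makes the read-modify-write atomic across
 * CPUs, at the cost of a locked cache/bus transaction. */
static inline void locked_add(volatile unsigned int *addr, unsigned int val)
{
    asm volatile ( "lock; addl %1, %0" : "+m" (*addr) : "ri" (val) );
}
```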
>>> On 01.09.17 at 09:55, <chao.gao@intel.com> wrote: > On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote: >>>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote: >>> After thinking it again, I want to define the counter as >>> a unsigned int variable for the following reasion: >>> 1. It is definite that the counter is closely related with >>> list_add() and list_del(). If the list is protected by the >>> lock, it is straightforward that the counter is also protected >>> by the lock. >>> 2. In patch 3, althought there are some lock-less readers, we >>> will check the counter still meets our requirement with the lock >>> held. Thus, I don't think there is a racing issue. >> >>I think that's fine, but then you still don't need LOCKed accesses >>to the counter for updating it; write_atomic() will suffice afaict. > > A stupid question. > Is it contradictory that you think the counter can be protected by > the lock while suggesting using write_atomic() instead of LOCKed > accesses? > > updating the counter is always accompanied by updating list and updating > list should in locked region. I meaned things like: > > spin_lock() > list_add() > counter++ > spin_unlock() > > However, I am afraid that not using LOCKed accesses but using > write_atomic() means something like (separating updating the counter > from updating the list I think is not good): > > spin_lock() > list_add() > spin_unlock() > write_atomic() No, I mean spin_lock() list_add() write_atomic() spin_unlock() whereas ... > And I think this version is: > > spin_lock() > list_add() > add_sized() > spin_unlock() ... this produces a needless LOCKed instruction redundant with being inside the locked region). Jan
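Applied to the helper in this patch, Jan's suggestion would amount to something like the following sketch (assuming Xen's write_atomic(); not a final version):

```c
static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
                            struct vmx_pi_blocking_vcpu *vpbv)
{
    ASSERT(spin_is_locked(&vpbv->lock));
    /*
     * Writers are serialised by the lock, so a plain read of the counter
     * is fine here; the single-copy-atomic store is what keeps the
     * lock-less readers of patch 3 safe.
     */
    write_atomic(&vpbv->counter, vpbv->counter + 1);
    list_add_tail(&pbv->list, &vpbv->list);
}
```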
>>> On 01.09.17 at 10:37, <chao.gao@intel.com> wrote: > it seems add_sized() won't be a LOCKed instruction. > #define build_add_sized(name, size, type, reg) \ > static inline void name(volatile type *addr, type val) \ > { \ > asm volatile("add" size " %1,%0" \ > : "=m" (*addr) \ > : reg (val)); \ > } Oh, you're right. But then I'd still like you to not add a new user, as I don't see why it was introduced in the first place: Independent of architecture it is equivalent to write_atomic(p, read_atomic(p) + c) and hence I'd like to get rid of it as misleading/redundant. Jan
>>> On 01.09.17 at 11:55, <JBeulich@suse.com> wrote: >>>> On 01.09.17 at 10:37, <chao.gao@intel.com> wrote: >> it seems add_sized() won't be a LOCKed instruction. >> #define build_add_sized(name, size, type, reg) \ >> static inline void name(volatile type *addr, type val) \ >> { \ >> asm volatile("add" size " %1,%0" \ >> : "=m" (*addr) \ >> : reg (val)); \ >> } > > Oh, you're right. But then I'd still like you to not add a new > user, as I don't see why it was introduced in the first place: > Independent of architecture it is equivalent to > > write_atomic(p, read_atomic(p) + c) > > and hence I'd like to get rid of it as misleading/redundant. Actually, on x86 it still is a bit better than the generic replacement, i.e. it would only be worthwhile dropping the custom ARM variant in favor of a generic one. Keep using it here then. Jan
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 67fc85b..bf17988 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -83,6 +83,7 @@ static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 struct vmx_pi_blocking_vcpu {
     struct list_head list;
     spinlock_t       lock;
+    unsigned int     counter;
 };
 
 /*
@@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
     spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
 }
 
+static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
+                            struct vmx_pi_blocking_vcpu *vpbv)
+{
+    ASSERT(spin_is_locked(&vpbv->lock));
+    add_sized(&vpbv->counter, 1);
+    ASSERT(read_atomic(&vpbv->counter));
+    list_add_tail(&pbv->list, &vpbv->list);
+}
+
+static void vmx_pi_del_vcpu(struct pi_blocking_vcpu *pbv,
+                            struct vmx_pi_blocking_vcpu *vpbv)
+{
+    ASSERT(spin_is_locked(&vpbv->lock));
+    ASSERT(read_atomic(&vpbv->counter));
+    list_del(&pbv->list);
+    add_sized(&vpbv->counter, -1);
+}
+
 static void vmx_vcpu_block(struct vcpu *v)
 {
     unsigned long flags;
@@ -120,8 +139,8 @@ static void vmx_vcpu_block(struct vcpu *v)
      */
     ASSERT(old_lock == NULL);
 
-    list_add_tail(&v->arch.hvm_vmx.pi_blocking.list,
-                  &per_cpu(vmx_pi_blocking, v->processor).list);
+    vmx_pi_add_vcpu(&v->arch.hvm_vmx.pi_blocking,
+                    &per_cpu(vmx_pi_blocking, v->processor));
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 
     ASSERT(!pi_test_sn(pi_desc));
@@ -186,7 +205,9 @@ static void vmx_pi_unblock_vcpu(struct vcpu *v)
     if ( v->arch.hvm_vmx.pi_blocking.lock != NULL )
     {
         ASSERT(v->arch.hvm_vmx.pi_blocking.lock == pi_blocking_list_lock);
-        list_del(&v->arch.hvm_vmx.pi_blocking.list);
+        vmx_pi_del_vcpu(&v->arch.hvm_vmx.pi_blocking,
+                        container_of(pi_blocking_list_lock,
+                                     struct vmx_pi_blocking_vcpu, lock));
         v->arch.hvm_vmx.pi_blocking.lock = NULL;
     }
 
@@ -234,7 +255,7 @@ void vmx_pi_desc_fixup(unsigned int cpu)
          */
        if ( pi_test_on(&vmx->pi_desc) )
        {
-            list_del(&vmx->pi_blocking.list);
+            vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
            vmx->pi_blocking.lock = NULL;
            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
        }
@@ -257,8 +278,9 @@ void vmx_pi_desc_fixup(unsigned int cpu)
             write_atomic(&vmx->pi_desc.ndst,
                          x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
 
-            list_move(&vmx->pi_blocking.list,
-                      &per_cpu(vmx_pi_blocking, new_cpu).list);
+            vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
+            vmx_pi_add_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking,
+                                                        new_cpu));
             vmx->pi_blocking.lock = new_lock;
 
             spin_unlock(new_lock);
@@ -2351,9 +2373,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
 static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
 {
     struct arch_vmx_struct *vmx, *tmp;
-    spinlock_t *lock = &per_cpu(vmx_pi_blocking, smp_processor_id()).lock;
-    struct list_head *blocked_vcpus =
-        &per_cpu(vmx_pi_blocking, smp_processor_id()).list;
+    unsigned int cpu = smp_processor_id();
+    spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
 
     ack_APIC_irq();
     this_cpu(irq_count)++;
@@ -2369,7 +2391,7 @@ static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
     {
         if ( pi_test_on(&vmx->pi_desc) )
         {
-            list_del(&vmx->pi_blocking.list);
+            vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
             ASSERT(vmx->pi_blocking.lock == lock);
             vmx->pi_blocking.lock = NULL;
             vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
This patch adds a field, counter, to struct vmx_pi_blocking_vcpu to track how many entries are on the PI blocking list.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v5:
 - introduce two functions for adding vcpus to and removing vcpus from the PI blocking list.
 - check the sanity of the vcpu count on the PI blocking list
v4:
 - non-trace part of Patch 1 in v3
---
 xen/arch/x86/hvm/vmx/vmx.c | 42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)