Message ID | 1576877960-12767-1-git-send-email-igor.druzhinin@citrix.com
---|---
State | New, archived
Series | x86/vpt: update last_guest_time with cmpxchg and drop pl_time_lock
On 20.12.2019 22:39, Igor Druzhinin wrote:
> Similarly to PV vTSC emulation, optimize HVM side for consistency
> and scalability by dropping a spinlock protecting a single variable.
>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Seeing that you didn't reply to my comment sent on Dec 23rd, I'm
going to drop this patch now from my to-be-dealt-with folder. You
can always re-submit.

Jan
On 18/02/2020 17:00, Jan Beulich wrote:
> On 20.12.2019 22:39, Igor Druzhinin wrote:
>> Similarly to PV vTSC emulation, optimize HVM side for consistency
>> and scalability by dropping a spinlock protecting a single variable.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>
> Seeing that you didn't reply to my comment sent on Dec 23rd,
> I'm going to drop this patch now from my to-be-dealt-with
> folder. You can always re-submit.

I didn't receive anything. This is literally the first reply on the
thread. This patch wasn't terribly important, so I didn't chase.
Could you resend your comment?

Igor
(Resend; no idea where the original, sent on Dec 23rd, ended up - I
can't find it in the list archives in any event)

On 20.12.2019 22:39, Igor Druzhinin wrote:
> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>  {
>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
> -    u64 now;
> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>
>      /* Called from device models shared with PV guests. Be careful. */
>      ASSERT(is_hvm_vcpu(v));
>
> -    spin_lock(&pl->pl_time_lock);
> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
> -
>      if ( !at_tsc )
>      {
> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
> -            pl->last_guest_time = now;
> -        else
> -            now = ++pl->last_guest_time;
> +        do {
> +            old = pl->last_guest_time;
> +            new = now > pl->last_guest_time ? now : old + 1;
> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );

I wonder whether you wouldn't better re-invoke get_s_time() in
case you need to retry here. See how the function previously
was called only after the lock was already acquired.

Jan
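To make the suggestion concrete, here is a minimal sketch of the loop
with the timestamp recomputed on each retry, reusing the names from the
patch above (this is an illustration, not code posted in the thread):

```c
if ( !at_tsc )
{
    do {
        /*
         * Refresh the reading on every pass, so that a retry publishes
         * a current timestamp rather than a stale one bumped by 1.
         * (With at_tsc == 0, get_s_time_fixed(0) == get_s_time().)
         */
        now = get_s_time() + pl->stime_offset;
        old = pl->last_guest_time;
        new = now > old ? now : old + 1;
    } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
}
```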
On 19/02/2020 07:48, Jan Beulich wrote:
> On 20.12.2019 22:39, Igor Druzhinin wrote:
>> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>>  {
>>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
>> -    u64 now;
>> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>
>>      /* Called from device models shared with PV guests. Be careful. */
>>      ASSERT(is_hvm_vcpu(v));
>>
>> -    spin_lock(&pl->pl_time_lock);
>> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>> -
>>      if ( !at_tsc )
>>      {
>> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
>> -            pl->last_guest_time = now;
>> -        else
>> -            now = ++pl->last_guest_time;
>> +        do {
>> +            old = pl->last_guest_time;
>> +            new = now > pl->last_guest_time ? now : old + 1;
>> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>
> I wonder whether you wouldn't better re-invoke get_s_time() in
> case you need to retry here. See how the function previously
> was called only after the lock was already acquired.

If there is a concurrent writer, wouldn't it just update
pl->last_guest_time with the new get_s_time() and then we subsequently
would just use the new time on retry? We use the same logic in
pv_soft_rdtsc() and so far it proved to be safe.

Igor
On 19.02.2020 19:52, Igor Druzhinin wrote:
> On 19/02/2020 07:48, Jan Beulich wrote:
>> On 20.12.2019 22:39, Igor Druzhinin wrote:
>>> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>>>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>>>  {
>>>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
>>> -    u64 now;
>>> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>
>>>      /* Called from device models shared with PV guests. Be careful. */
>>>      ASSERT(is_hvm_vcpu(v));
>>>
>>> -    spin_lock(&pl->pl_time_lock);
>>> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>> -
>>>      if ( !at_tsc )
>>>      {
>>> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
>>> -            pl->last_guest_time = now;
>>> -        else
>>> -            now = ++pl->last_guest_time;
>>> +        do {
>>> +            old = pl->last_guest_time;
>>> +            new = now > pl->last_guest_time ? now : old + 1;
>>> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>
>> I wonder whether you wouldn't better re-invoke get_s_time() in
>> case you need to retry here. See how the function previously
>> was called only after the lock was already acquired.
>
> If there is a concurrent writer, wouldn't it just update
> pl->last_guest_time with the new get_s_time() and then we
> subsequently would just use the new time on retry? We use the same
> logic in pv_soft_rdtsc() and so far it proved to be safe.

Yes, it would, but the latency until the retry actually occurs is
unknown (in particular if Xen itself runs virtualized). I.e. in the
at_tsc == 0 case I think the value would better be re-calculated on
every iteration.

Another thing I notice only now is the multiple reads of
pl->last_guest_time. Wouldn't you better do

    do {
        old = ACCESS_ONCE(pl->last_guest_time);
        new = now > old ? now : old + 1;
    } while ( cmpxchg(&pl->last_guest_time, old, new) != old );

?

Jan
On 20/02/2020 08:27, Jan Beulich wrote:
> On 19.02.2020 19:52, Igor Druzhinin wrote:
>> On 19/02/2020 07:48, Jan Beulich wrote:
>>> On 20.12.2019 22:39, Igor Druzhinin wrote:
>>>> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>>>>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>>>>  {
>>>>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
>>>> -    u64 now;
>>>> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>
>>>>      /* Called from device models shared with PV guests. Be careful. */
>>>>      ASSERT(is_hvm_vcpu(v));
>>>>
>>>> -    spin_lock(&pl->pl_time_lock);
>>>> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>> -
>>>>      if ( !at_tsc )
>>>>      {
>>>> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
>>>> -            pl->last_guest_time = now;
>>>> -        else
>>>> -            now = ++pl->last_guest_time;
>>>> +        do {
>>>> +            old = pl->last_guest_time;
>>>> +            new = now > pl->last_guest_time ? now : old + 1;
>>>> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>>
>>> I wonder whether you wouldn't better re-invoke get_s_time() in
>>> case you need to retry here. See how the function previously
>>> was called only after the lock was already acquired.
>>
>> If there is a concurrent writer, wouldn't it just update
>> pl->last_guest_time with the new get_s_time() and then we
>> subsequently would just use the new time on retry?
>
> Yes, it would, but the latency until the retry actually occurs
> is unknown (in particular if Xen itself runs virtualized). I.e.
> in the at_tsc == 0 case I think the value would better be
> re-calculated on every iteration.

Why does it need to be recalculated if a concurrent writer did this
for us already anyway, and the (get_s_time_fixed(at_tsc) +
pl->stime_offset) value is common to all vCPUs? Yes, it might reduce
jitter slightly, but overall latency could come from any point
(especially in case of running virtualized) and it's important just
to preserve the invariant that the value is monotonic across vCPUs.

> Another thing I notice only now is the multiple reads of
> pl->last_guest_time. Wouldn't you better do
>
>     do {
>         old = ACCESS_ONCE(pl->last_guest_time);
>         new = now > old ? now : old + 1;
>     } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>
> ?

Fair enough, although even reading it multiple times wouldn't cause
any harm, as any inconsistency would be resolved by the cmpxchg op.
I'd prefer to make it in a separate commit, to unify it with
pv_soft_rdtsc().

Igor
On 20.02.2020 16:37, Igor Druzhinin wrote:
> On 20/02/2020 08:27, Jan Beulich wrote:
>> On 19.02.2020 19:52, Igor Druzhinin wrote:
>>> On 19/02/2020 07:48, Jan Beulich wrote:
>>>> On 20.12.2019 22:39, Igor Druzhinin wrote:
>>>>> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>>>>>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>>>>>  {
>>>>>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
>>>>> -    u64 now;
>>>>> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>>
>>>>>      /* Called from device models shared with PV guests. Be careful. */
>>>>>      ASSERT(is_hvm_vcpu(v));
>>>>>
>>>>> -    spin_lock(&pl->pl_time_lock);
>>>>> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>> -
>>>>>      if ( !at_tsc )
>>>>>      {
>>>>> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
>>>>> -            pl->last_guest_time = now;
>>>>> -        else
>>>>> -            now = ++pl->last_guest_time;
>>>>> +        do {
>>>>> +            old = pl->last_guest_time;
>>>>> +            new = now > pl->last_guest_time ? now : old + 1;
>>>>> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>>>
>>>> I wonder whether you wouldn't better re-invoke get_s_time() in
>>>> case you need to retry here. See how the function previously
>>>> was called only after the lock was already acquired.
>>>
>>> If there is a concurrent writer, wouldn't it just update
>>> pl->last_guest_time with the new get_s_time() and then we
>>> subsequently would just use the new time on retry?
>>
>> Yes, it would, but the latency until the retry actually occurs
>> is unknown (in particular if Xen itself runs virtualized). I.e.
>> in the at_tsc == 0 case I think the value would better be
>> re-calculated on every iteration.
>
> Why does it need to be recalculated if a concurrent writer did this
> for us already anyway, and the (get_s_time_fixed(at_tsc) +
> pl->stime_offset) value is common to all vCPUs? Yes, it might reduce
> jitter slightly, but overall latency could come from any point
> (especially in case of running virtualized) and it's important just
> to preserve the invariant that the value is monotonic across vCPUs.

I'm afraid I don't follow: If we rely on remote CPUs updating
pl->last_guest_time, then what we'd return is whatever was put there
plus one, whereas the correct value might be dozens of clocks further
ahead.

>> Another thing I notice only now is the multiple reads of
>> pl->last_guest_time. Wouldn't you better do
>>
>>     do {
>>         old = ACCESS_ONCE(pl->last_guest_time);
>>         new = now > old ? now : old + 1;
>>     } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>
>> ?
>
> Fair enough, although even reading it multiple times wouldn't cause
> any harm, as any inconsistency would be resolved by the cmpxchg op.

Afaics "new", if calculated from a value latched _earlier_ than "old",
could cause time to actually move backwards. Reads can be re-ordered,
after all.

> I'd prefer to make it in a separate commit, to unify it with
> pv_soft_rdtsc().

I'd be fine if you changed pv_soft_rdtsc() first, and then made the
code here match. But I don't think the code should be introduced in
other than its (for the time being) final shape.

Jan
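To spell out the reordering hazard Jan describes: suppose now is 105
and the read feeding the `now > pl->last_guest_time` comparison is
performed first and sees 100, so new becomes 105; a remote CPU then
stores 110, and the later read latches old = 110. The cmpxchg from 110
to 105 succeeds, and the published time moves backwards. A sketch of
the loop with both of Jan's suggestions folded in, using the same names
as the patch (an illustration, not code posted in the thread):

```c
do {
    old = ACCESS_ONCE(pl->last_guest_time);  /* single read feeds both uses */
    now = get_s_time() + pl->stime_offset;   /* recomputed on every attempt */
    new = now > old ? now : old + 1;         /* never publish a smaller value */
} while ( cmpxchg(&pl->last_guest_time, old, new) != old );
```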
On 20/02/2020 15:47, Jan Beulich wrote:
> On 20.02.2020 16:37, Igor Druzhinin wrote:
>> On 20/02/2020 08:27, Jan Beulich wrote:
>>> On 19.02.2020 19:52, Igor Druzhinin wrote:
>>>> On 19/02/2020 07:48, Jan Beulich wrote:
>>>>> On 20.12.2019 22:39, Igor Druzhinin wrote:
>>>>>> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>>>>>>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>>>>>>  {
>>>>>>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
>>>>>> -    u64 now;
>>>>>> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>>>
>>>>>>      /* Called from device models shared with PV guests. Be careful. */
>>>>>>      ASSERT(is_hvm_vcpu(v));
>>>>>>
>>>>>> -    spin_lock(&pl->pl_time_lock);
>>>>>> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>>> -
>>>>>>      if ( !at_tsc )
>>>>>>      {
>>>>>> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
>>>>>> -            pl->last_guest_time = now;
>>>>>> -        else
>>>>>> -            now = ++pl->last_guest_time;
>>>>>> +        do {
>>>>>> +            old = pl->last_guest_time;
>>>>>> +            new = now > pl->last_guest_time ? now : old + 1;
>>>>>> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>>>>
>>>>> I wonder whether you wouldn't better re-invoke get_s_time() in
>>>>> case you need to retry here. See how the function previously
>>>>> was called only after the lock was already acquired.
>>>>
>>>> If there is a concurrent writer, wouldn't it just update
>>>> pl->last_guest_time with the new get_s_time() and then we
>>>> subsequently would just use the new time on retry?
>>>
>>> Yes, it would, but the latency until the retry actually occurs
>>> is unknown (in particular if Xen itself runs virtualized). I.e.
>>> in the at_tsc == 0 case I think the value would better be
>>> re-calculated on every iteration.
>>
>> Why does it need to be recalculated if a concurrent writer did this
>> for us already anyway, and the (get_s_time_fixed(at_tsc) +
>> pl->stime_offset) value is common to all vCPUs? Yes, it might reduce
>> jitter slightly, but overall latency could come from any point
>> (especially in case of running virtualized) and it's important just
>> to preserve the invariant that the value is monotonic across vCPUs.
>
> I'm afraid I don't follow: If we rely on remote CPUs updating
> pl->last_guest_time, then what we'd return is whatever was put there
> plus one, whereas the correct value might be dozens of clocks further
> ahead.

I'm merely stating that there might be other places contributing to
jitter, and that getting rid of one of them wouldn't solve the issue
completely (if there is one). But again, I'd like the code to be
unified with pv_soft_rdtsc(), so I will have to introduce the
re-calculation there as well.

>>> Another thing I notice only now is the multiple reads of
>>> pl->last_guest_time. Wouldn't you better do
>>>
>>>     do {
>>>         old = ACCESS_ONCE(pl->last_guest_time);
>>>         new = now > old ? now : old + 1;
>>>     } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>>
>>> ?
>>
>> Fair enough, although even reading it multiple times wouldn't cause
>> any harm, as any inconsistency would be resolved by the cmpxchg op.
>
> Afaics "new", if calculated from a value latched _earlier_ than "old",
> could cause time to actually move backwards. Reads can be re-ordered,
> after all.

I don't think it's possible due to the x86 memory model and the fact
that pl->last_guest_time only goes forward. But I will change it to
make that explicit and improve readability.

>> I'd prefer to make it in a separate commit, to unify it with
>> pv_soft_rdtsc().
>
> I'd be fine if you changed pv_soft_rdtsc() first, and then made the
> code here match. But I don't think the code should be introduced in
> other than its (for the time being) final shape.

Ok, I'll put the pv_soft_rdtsc() commit first.

Igor
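For reference, a sketch of how the matching PV-side loop might then
look, assuming pv_soft_rdtsc() tracks the last returned value in
d->arch.vtsc_last and converts system time to guest TSC via
gtime_to_gtsc() (statistics accounting omitted; an illustration under
those assumptions, not the committed patch):

```c
uint64_t pv_soft_rdtsc(const struct vcpu *v, const struct cpu_user_regs *regs)
{
    struct domain *d = v->domain;
    s_time_t old, new;

    do {
        old = ACCESS_ONCE(d->arch.vtsc_last); /* single ordered read */
        new = get_s_time();                   /* refreshed on each retry */
        if ( new <= old )
            new = old + 1;                    /* enforce monotonicity */
    } while ( cmpxchg(&d->arch.vtsc_last, old, new) != old );

    return gtime_to_gtsc(d, new);
}
```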
```diff
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index ecd25d7..bf4c432 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -30,7 +30,6 @@ void hvm_init_guest_time(struct domain *d)
 {
     struct pl_time *pl = d->arch.hvm.pl_time;
 
-    spin_lock_init(&pl->pl_time_lock);
     pl->stime_offset = -(u64)get_s_time();
     pl->last_guest_time = 0;
 }
@@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
 uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
 {
     struct pl_time *pl = v->domain->arch.hvm.pl_time;
-    u64 now;
+    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
 
     /* Called from device models shared with PV guests. Be careful. */
     ASSERT(is_hvm_vcpu(v));
 
-    spin_lock(&pl->pl_time_lock);
-    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
-
     if ( !at_tsc )
     {
-        if ( (int64_t)(now - pl->last_guest_time) > 0 )
-            pl->last_guest_time = now;
-        else
-            now = ++pl->last_guest_time;
+        do {
+            old = pl->last_guest_time;
+            new = now > pl->last_guest_time ? now : old + 1;
+        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
     }
-    spin_unlock(&pl->pl_time_lock);
+    else
+        new = now;
 
-    return now + v->arch.hvm.stime_offset;
+    return new + v->arch.hvm.stime_offset;
 }
 
 void hvm_set_guest_time(struct vcpu *v, u64 guest_time)
diff --git a/xen/include/asm-x86/hvm/vpt.h b/xen/include/asm-x86/hvm/vpt.h
index 99169dd..f5ccb49 100644
--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -135,10 +135,9 @@ struct pl_time {    /* platform time */
     struct HPETState vhpet;
     struct PMTState  vpmt;
     /* guest_time = Xen sys time + stime_offset */
-    int64_t stime_offset;
+    s_time_t stime_offset;
     /* Ensures monotonicity in appropriate timer modes. */
-    uint64_t last_guest_time;
-    spinlock_t pl_time_lock;
+    s_time_t last_guest_time;
     struct domain *domain;
 };
```
Similarly to PV vTSC emulation, optimize HVM side for consistency
and scalability by dropping a spinlock protecting a single variable.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
 xen/arch/x86/hvm/vpt.c        | 19 ++++++++-----------
 xen/include/asm-x86/hvm/vpt.h |  5 ++---
 2 files changed, 10 insertions(+), 14 deletions(-)