Message ID | 1472042632-12883-4-git-send-email-joao.m.martins@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 24.08.16 at 14:43, <joao.m.martins@oracle.com> wrote: > And use to initialize platform time solely for clocksource=tsc, > as opposed to initializing platform overflow timer, which would > only fire in ~180 years (on 2.2 Ghz Broadwell processor). Do we really want to risk this time period going down by two orders of magnitude? Is there anything that's really expensive in setting the overflow timer in the far distant future? > Changes since v2: > - Remove pointless intializer and replace it with the > platform_time init return. Does this really apply to this patch? > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -526,17 +526,31 @@ static s_time_t __read_platform_stime(u64 > platform_time) > return (stime_platform_stamp + scale_delta(diff, &plt_scale)); > } > > +static void __plt_update(void) A single leading underscore only, please. > @@ -630,10 +644,21 @@ static s64 __init try_platform_timer(struct platform_timesource *pts) > > set_time_scale(&plt_scale, pts->frequency); > > - plt_overflow_period = scale_delta( > - 1ull << (pts->counter_bits - 1), &plt_scale); > plt_src = *pts; > > + if ( pts == &plt_tsc ) > + { > + plt_update(); > + } Unnecessary braces. Jan
On 08/25/2016 11:13 AM, Jan Beulich wrote: >>>> On 24.08.16 at 14:43, <joao.m.martins@oracle.com> wrote: >> And use to initialize platform time solely for clocksource=tsc, >> as opposed to initializing platform overflow timer, which would >> only fire in ~180 years (on 2.2 Ghz Broadwell processor). > > Do we really want to risk this time period going down by two orders > of magnitude? Is there anything that's really expensive in setting the > overflow timer in the far distant future? It wasn't about cost but rather setting the timer in a so distant future. I could decrease to an year time, month or day. But do you think we really need that overflow handling for TSC? >> Changes since v2: >> - Remove pointless intializer and replace it with the >> platform_time init return. > > Does this really apply to this patch? Oh no, The comment should have been something like: "Remove clocksource_is_tsc in favor of comparing pts against plt_tsc" > >> --- a/xen/arch/x86/time.c >> +++ b/xen/arch/x86/time.c >> @@ -526,17 +526,31 @@ static s_time_t __read_platform_stime(u64 >> platform_time) >> return (stime_platform_stamp + scale_delta(diff, &plt_scale)); >> } >> >> +static void __plt_update(void) > > A single leading underscore only, please. Fixed. > >> @@ -630,10 +644,21 @@ static s64 __init try_platform_timer(struct platform_timesource *pts) >> >> set_time_scale(&plt_scale, pts->frequency); >> >> - plt_overflow_period = scale_delta( >> - 1ull << (pts->counter_bits - 1), &plt_scale); >> plt_src = *pts; >> >> + if ( pts == &plt_tsc ) >> + { >> + plt_update(); >> + } > > Unnecessary braces. Fixed.
>>> On 26.08.16 at 17:12, <joao.m.martins@oracle.com> wrote: > On 08/25/2016 11:13 AM, Jan Beulich wrote: >>>>> On 24.08.16 at 14:43, <joao.m.martins@oracle.com> wrote: >>> And use to initialize platform time solely for clocksource=tsc, >>> as opposed to initializing platform overflow timer, which would >>> only fire in ~180 years (on 2.2 Ghz Broadwell processor). >> >> Do we really want to risk this time period going down by two orders >> of magnitude? Is there anything that's really expensive in setting the >> overflow timer in the far distant future? > It wasn't about cost but rather setting the timer in a so distant future. I could > decrease to an year time, month or day. But do you think we really need that overflow > handling for TSC? I think we shouldn't introduce latent issues, no matter how unlikely we may deem it for them to actually become active ones. Just consider how unlikely it would have seemed to someone in the i586 days (when the TSC got introduced) that clock speeds might ever cross the 4GHz boundary. As a consequence long ago Linux did use a 32-bit variable for it. It got noticed early enough before processors got really close to that boundary, but it demonstrates the fundamental issue. And yes, processor speeds have stopped to always grow (at least in the x86 space), but that's not a rule set in stone. Jan
On 08/29/2016 10:41 AM, Jan Beulich wrote: >>>> On 26.08.16 at 17:12, <joao.m.martins@oracle.com> wrote: >> On 08/25/2016 11:13 AM, Jan Beulich wrote: >>>>>> On 24.08.16 at 14:43, <joao.m.martins@oracle.com> wrote: >>>> And use to initialize platform time solely for clocksource=tsc, >>>> as opposed to initializing platform overflow timer, which would >>>> only fire in ~180 years (on 2.2 Ghz Broadwell processor). >>> >>> Do we really want to risk this time period going down by two orders >>> of magnitude? Is there anything that's really expensive in setting the >>> overflow timer in the far distant future? >> It wasn't about cost but rather setting the timer in a so distant future. I could >> decrease to an year time, month or day. But do you think we really need that overflow >> handling for TSC? > > I think we shouldn't introduce latent issues, no matter how unlikely > we may deem it for them to actually become active ones. Just > consider how unlikely it would have seemed to someone in the > i586 days (when the TSC got introduced) that clock speeds might > ever cross the 4GHz boundary. As a consequence long ago Linux > did use a 32-bit variable for it. It got noticed early enough before > processors got really close to that boundary, but it demonstrates > the fundamental issue. And yes, processor speeds have stopped > to always grow (at least in the x86 space), but that's not a rule > set in stone. Thanks for the insight. What would be a preferred period - probably capping it to a day/hour/minutes? Or keeping the current calculated value? Maximum right now in current platform timers are few minutes depending on the HPET frequency. Joao
>>> On 30.08.16 at 14:10, <joao.m.martins@oracle.com> wrote: > What would be a preferred period - probably capping it to a day/hour/minutes? Or > keeping the current calculated value? Maximum right now in current platform timers > are few minutes depending on the HPET frequency. Ideally I would see you just use the calculated value, unless that causes problems elsewhere. Depending on such possible problems would be what cap to alternatively use. Jan
On 08/30/2016 01:31 PM, Jan Beulich wrote: >>>> On 30.08.16 at 14:10, <joao.m.martins@oracle.com> wrote: >> What would be a preferred period - probably capping it to a day/hour/minutes? Or >> keeping the current calculated value? Maximum right now in current platform timers >> are few minutes depending on the HPET frequency. > > Ideally I would see you just use the calculated value, unless that > causes problems elsewhere. Depending on such possible problems > would be what cap to alternatively use. While sending v4 out, I spotted some issues with platform timer overflow handling when we get close to u64 boundary. Specific to clocksource=tsc, and setting up the plt_overflow, one would see at boot: "Platform timer appears to have unexpectedly wrapped 10 or more times" The counter doesn't wrap or stop at all. But current overflowing checking goes like: count = plt_src.read_counter(); plt_stamp64 += (count - plt_stamp) & plt_mask; now = NOW(); plt_wrap = __read_platform_stime(plt_stamp64); plt_stamp = count; for (i=0;...) { plt_now = plt_wrap; plt_wrap = __read_platform_stime(plt_stamp64 + plt_mask + 1); @@ if ( ABS(plt_wrap - now) > ABS(plt_now - now) ) break; // didn't wrap around // did wrap plt_stamp64 += plt_mask + 1; } if (i!=0) "Platform timer appears to have unexpectedly wrapped " Effectively the calculation in @@ doesn't handle the fact that for clocksource=tsc, "ABS(plt_wrap - now)" would be == to "ABS(plt_now - now)". Not that is right to be so, but TSC is a 64-bit counter and "plt_mask + 1" overflows the u64, turning the above condition into a comparison of same value. For <= 32-bit platform timers the current math works fine, but reaching the 64-bit boundary not so much. And then handling the wraparound further below with "plt_stamp64 += plt_mask + 1" is effectively adding zeroes, which is assuming that plt_stamp64/plt_stamp is alone enough to not represent any discontinuity. I am not sure we shouldn't handle this way at least for clocksource=tsc: we only see issues when the delta of the two stamps overflows a u64 (computed with: plt_stamp64 + (read_counter() - plt_stamp)). Keeping those two stamps updated more often and we won't see issues. When the wraparound happens (in lots lots of years away) we could not only update plt_stamp64 and plt_stamp, but also increment stime_platform_stamp and platform_timer_stamp. This lets us compensate when the stamps wraparound since for stime_platform_stamp (in ns) that would only happen after STIME_MAX. Or as a simpler alternative, keeping this patch and update plt_stamp64 (zeroing this one out) plus plt_stamp on platform_time_calibration() as additional change. Would that sound reasonable - am I overlooking something? To some extent this might also applicable to the general case, although platform timer is now only used for initial seeding so probably a non-visible issue. Joao
>>> On 09.09.16 at 18:32, <joao.m.martins@oracle.com> wrote: > Would that sound reasonable - am I overlooking something? To some extent this > might also applicable to the general case, although platform timer is now only > used for initial seeding so probably a non-visible issue. Wouldn't it already help to simply make TSC a 62- or 63-bit counter, masking off the high bit(s) during reads? Jan
On 09/12/2016 08:26 AM, Jan Beulich wrote: >>>> On 09.09.16 at 18:32, <joao.m.martins@oracle.com> wrote: >> Would that sound reasonable - am I overlooking something? To some extent this >> might also applicable to the general case, although platform timer is now only >> used for initial seeding so probably a non-visible issue. > > Wouldn't it already help to simply make TSC a 62- or 63-bit counter, > masking off the high bit(s) during reads? Yeap - That indeed would be the simplest solution, as we currently mask out the difference between stamps. Tested it and changed counter_bits to be 63 (amended in patch 2). Joao
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index b2a11a8..a03127e 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -526,17 +526,31 @@ static s_time_t __read_platform_stime(u64 platform_time) return (stime_platform_stamp + scale_delta(diff, &plt_scale)); } +static void __plt_update(void) +{ + u64 count; + + ASSERT(spin_is_locked(&platform_timer_lock)); + count = plt_src.read_counter(); + plt_stamp64 += (count - plt_stamp) & plt_mask; + plt_stamp = count; +} + +static void plt_update(void) +{ + spin_lock_irq(&platform_timer_lock); + __plt_update(); + spin_unlock_irq(&platform_timer_lock); +} + static void plt_overflow(void *unused) { int i; - u64 count; s_time_t now, plt_now, plt_wrap; spin_lock_irq(&platform_timer_lock); - count = plt_src.read_counter(); - plt_stamp64 += (count - plt_stamp) & plt_mask; - plt_stamp = count; + __plt_update(); now = NOW(); plt_wrap = __read_platform_stime(plt_stamp64); @@ -630,10 +644,21 @@ static s64 __init try_platform_timer(struct platform_timesource *pts) set_time_scale(&plt_scale, pts->frequency); - plt_overflow_period = scale_delta( - 1ull << (pts->counter_bits - 1), &plt_scale); plt_src = *pts; + if ( pts == &plt_tsc ) + { + plt_update(); + } + else + { + plt_overflow_period = scale_delta( + 1ull << (pts->counter_bits - 1), &plt_scale); + + printk(XENLOG_INFO "Platform timer overflow period is %lu msecs\n", + plt_overflow_period / MILLISECS(1)); + } + return rc; }
And use to initialize platform time solely for clocksource=tsc, as opposed to initializing platform overflow timer, which would only fire in ~180 years (on 2.2 Ghz Broadwell processor). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Changes since v2: - Remove pointless intializer and replace it with the platform_time init return. - s/plt_init/plt_update/g --- xen/arch/x86/time.c | 37 +++++++++++++++++++++++++++++++------ 1 file changed, 31 insertions(+), 6 deletions(-)