Message ID | 20171027222543.57279-1-salyzyn@android.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Oct 27, 2017 at 03:25:40PM -0700, Mark Salyzyn wrote: > In variable timer reading loops, pick up just the values until all > are synchronized, then outside of loop pick up cntvct and perform > calculations to determine final offset, shifted and multiplied > output value. So this is all about moving the computation out of the loop, which sounds sensible... > +#ifdef ARCH_CLOCK_FIXED_MASK > + static const u64 mask = ARCH_CLOCK_FIXED_MASK; > +#else > + u64 mask; > +#endif ... but this wasn't mentioned, and ARCH_CLOCK_FIXED_MASK doesn't seem to be defined as of this patch. It looks like an orthogonal change that should be introduced in a separate patch. > + > + typeof(((struct vdso_data *)vd)->xtime_clock_sec) sec; Why do we need to do this typeof() magic? Can't we settle on a consistent type across arches, or have a typedef in a header? Thanks, Mark.
Thanks for the review, am taking all the points into consideration. On 10/30/2017 07:15 AM, Mark Rutland wrote: >> + >> + typeof(((struct vdso_data *)vd)->xtime_clock_sec) sec; > Why do we need to do this typeof() magic? > > Can't we settle on a consistent type across arches, or have a typedef in > a header? Would you accept 'because I do not want to standardize the sizes yet'? [TL;DR] We could, if there was one, but there isn't currently, and I do not want to invent one within the context of this series. It is also an architectural decision to decide on the individual size of xtime_clock_sec, rate_time_sec and wtm_clock_nsec (or equivalents, not yet common names throughout), because pedantically they must _all_ be u64, but realistically they only need to be u32 on the smaller platforms. This gets even more complicated for compat (vdso32 etc) implementations as to what is optimal, realistic, desired or pedantic; and we have not even dealt with that. I'd prefer to be agnostic in that debate for now and typeof() (which there is precedence to use in the linux code tree outside of a macro) handily deals with that controversy. As for tv_sec and tv_nsec, there is precedence to override them in private product builds or architectures (#define _STRUCT_TIMESPEC) so I could not count on them being __kernel_time_t or long respectively. typeof() was used to also allow that flexibility. I am not sure this happens, only that the levers are there to allow it. typeof() allows me to respect that facility. Yes, the code gets much more optimal with the help of typeof() for the arm architecture if these are all u32 in size. I am wondering out loud that we may wish to only use u32 in vdso32, despite the size(s) of all of these structure members. But that is another patch series (on hold until these are settled). I am thinking of a nebulous future. The decision for these are being deferred because my focus is on arm and arm64 because they are testable with my current resources. On purpose am not unifying all the vdso_data and vdso.c implementation details as that phase may follow (by me or others). In that phase xtime_clock_sec, rate_time_sec and wtm_clock_nsec could very well be standardized and these typeof()'s may melt away. mips and tile (because they are written in C) could be the next existing arches that could serve merged into this, but I do not have the platforms to test the changes on. -- Mark
On Mon, Oct 30, 2017 at 01:27:22PM -0700, Mark Salyzyn wrote: > Thanks for the review, am taking all the points into consideration. > > On 10/30/2017 07:15 AM, Mark Rutland wrote: > > > + > > > + typeof(((struct vdso_data *)vd)->xtime_clock_sec) sec; > > Why do we need to do this typeof() magic? > > > > Can't we settle on a consistent type across arches, or have a typedef in > > a header? > > Would you accept 'because I do not want to standardize the sizes yet'? Well, a typedef would solve that issue. e.g. each arch can have: typedef u64 vdso_xtime_sec_t; ... or typedef u32 vdso_xtime_sec_t; ... in headers for native/compat, and the above can be: vdso_xtime_sec_t sec ... with the same type in the relevant struct. That way the type can differ per arch. If we want to do something more complicated later, we can always rework the code. Thanks, Mark.
diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c index 71003a1997c4..7fcc8cfcb7df 100644 --- a/arch/arm/vdso/vgettimeofday.c +++ b/arch/arm/vdso/vgettimeofday.c @@ -99,28 +99,38 @@ static notrace int do_monotonic_coarse(const struct vdso_data *vd, #ifdef CONFIG_ARM_ARCH_TIMER -static __always_inline notrace u64 get_ns(const struct vdso_data *vd) +/* + * Returns the clock delta, in nanoseconds left-shifted by the clock + * shift. + */ +static __always_inline notrace u64 get_clock_shifted_nsec(const u64 cycle_last, + const u32 mult, + const u64 mask) { - u64 cycle_delta; - u64 cycle_now; - u64 nsec; - - cycle_now = __arch_counter_get_cntvct(); + u64 res; - cycle_delta = (cycle_now - vd->cs_cycle_last) & vd->cs_mask; + /* Read the virtual counter. */ + res = __arch_counter_get_cntvct(); - nsec = (cycle_delta * vd->cs_mono_mult) + vd->xtime_clock_snsec; - nsec >>= vd->cs_shift; + res = res - cycle_last; - return nsec; + res &= mask; + return res * mult; } /* Code size doesn't matter (vdso is 4k/16k/64k anyway) and this is faster. */ static __always_inline notrace int do_realtime(const struct vdso_data *vd, struct timespec *ts) { - u64 nsecs; - u32 seq; + u32 seq, mult, shift; + u64 nsec, cycle_last; +#ifdef ARCH_CLOCK_FIXED_MASK + static const u64 mask = ARCH_CLOCK_FIXED_MASK; +#else + u64 mask; +#endif + + typeof(((struct vdso_data *)vd)->xtime_clock_sec) sec; do { seq = vdso_read_begin(vd); @@ -128,13 +138,24 @@ static __always_inline notrace int do_realtime(const struct vdso_data *vd, if (vd->use_syscall) return -1; - ts->tv_sec = vd->xtime_clock_sec; - nsecs = get_ns(vd); + cycle_last = vd->cs_cycle_last; - } while (vdso_read_retry(vd, seq)); + mult = vd->cs_mono_mult; + shift = vd->cs_shift; +#ifndef ARCH_CLOCK_FIXED_MASK + mask = vd->cs_mask; +#endif + + sec = vd->xtime_clock_sec; + nsec = vd->xtime_clock_snsec; - ts->tv_nsec = 0; - timespec_add_ns(ts, nsecs); + } while (unlikely(vdso_read_retry(vd, seq))); + + nsec += get_clock_shifted_nsec(cycle_last, mult, mask); + nsec >>= shift; + /* open coding timespec_add_ns to save a ts->tv_nsec = 0 */ + ts->tv_sec = sec + __iter_div_u64_rem(nsec, NSEC_PER_SEC, &nsec); + ts->tv_nsec = nsec; return 0; } @@ -142,9 +163,16 @@ static __always_inline notrace int do_realtime(const struct vdso_data *vd, static __always_inline notrace int do_monotonic(const struct vdso_data *vd, struct timespec *ts) { - struct timespec tomono; - u64 nsecs; - u32 seq; + u32 seq, mult, shift; + u64 nsec, cycle_last; +#ifdef ARCH_CLOCK_FIXED_MASK + static const u64 mask = ARCH_CLOCK_FIXED_MASK; +#else + u64 mask; +#endif + + typeof(((struct vdso_data *)vd)->wtm_clock_nsec) wtm_nsec; + typeof(ts->tv_sec) sec; do { seq = vdso_read_begin(vd); @@ -152,17 +180,28 @@ static __always_inline notrace int do_monotonic(const struct vdso_data *vd, if (vd->use_syscall) return -1; - ts->tv_sec = vd->xtime_clock_sec; - nsecs = get_ns(vd); + cycle_last = vd->cs_cycle_last; - tomono.tv_sec = vd->wtm_clock_sec; - tomono.tv_nsec = vd->wtm_clock_nsec; + mult = vd->cs_mono_mult; + shift = vd->cs_shift; +#ifndef ARCH_CLOCK_FIXED_MASK + mask = vd->cs_mask; +#endif - } while (vdso_read_retry(vd, seq)); + sec = vd->xtime_clock_sec; + nsec = vd->xtime_clock_snsec; - ts->tv_sec += tomono.tv_sec; - ts->tv_nsec = 0; - timespec_add_ns(ts, nsecs + tomono.tv_nsec); + sec += vd->wtm_clock_sec; + wtm_nsec = vd->wtm_clock_nsec; + + } while (unlikely(vdso_read_retry(vd, seq))); + + nsec += get_clock_shifted_nsec(cycle_last, mult, mask); + nsec >>= shift; + nsec += wtm_nsec; + /* open coding timespec_add_ns to save a ts->tv_nsec = 0 */ + ts->tv_sec = sec + __iter_div_u64_rem(nsec, NSEC_PER_SEC, &nsec); + ts->tv_nsec = nsec; return 0; }
In variable timer reading loops, pick up just the values until all are synchronized, then outside of loop pick up cntvct and perform calculations to determine final offset, shifted and multiplied output value. This replaces get_ns with get_clock_shifted_nsec as cntvct reader. Signed-off-by: Mark Salyzyn <salyzyn@android.com> Cc: James Morse <james.morse@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Gross <andy.gross@linaro.org> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Andrew Pinski <apinski@cavium.com> Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org v2: - split first CL into 5 of 7 pieces v3: - rebase (unchanged) --- arch/arm/vdso/vgettimeofday.c | 95 ++++++++++++++++++++++++++++++------------- 1 file changed, 67 insertions(+), 28 deletions(-)