Message ID | 1448466169-5230-1-git-send-email-jszhang@marvell.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 11/25/2015 04:42 PM, Jisheng Zhang wrote: > Let's assume the counter value is 0xf000000, the pistachio clocksource > read cycles function would return 0xffffffff0fffffff, but it should > return 0xfffffff. > > We fix this issue by calculating bitwise-not counter, then cast to > cycle_t. > > Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Hi Jisheng, I tried to reproduce this behavior on x86_64 but without success. On which architecture did you produce this result ? Do you have a simple test program to check with ? Thanks -- Daniel
Dear Daniel, On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: > On 11/25/2015 04:42 PM, Jisheng Zhang wrote: > > Let's assume the counter value is 0xf000000, the pistachio clocksource > > read cycles function would return 0xffffffff0fffffff, but it should > > return 0xfffffff. > > > > We fix this issue by calculating bitwise-not counter, then cast to > > cycle_t. > > > > Signed-off-by: Jisheng Zhang <jszhang@marvell.com> > > Hi Jisheng, > > I tried to reproduce this behavior on x86_64 but without success. > > On which architecture did you produce this result ? Do you have a simple > test program to check with ? I have no HW platforms with pistachio, just read the code and run the following test code in x86_64 and x86_32: #include <stdio.h> unsigned long long pistachio_clocksource_read_cycles() { unsigned int counter = 0xf000000; return ~(unsigned long long)counter; } int main() { printf("%llx\n", pistachio_clocksource_read_cycles()); return 0; } Thanks, Jisheng
On Wed, 16 Dec 2015 15:11:25 +0800 Jisheng Zhang wrote: > Dear Daniel, > > On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: > > > On 11/25/2015 04:42 PM, Jisheng Zhang wrote: > > > Let's assume the counter value is 0xf000000, the pistachio clocksource oops, sorry, should be 0xf0000000. Do I need to send a v2 patch? > > > read cycles function would return 0xffffffff0fffffff, but it should > > > return 0xfffffff. > > > > > > We fix this issue by calculating bitwise-not counter, then cast to > > > cycle_t. > > > > > > Signed-off-by: Jisheng Zhang <jszhang@marvell.com> > > > > Hi Jisheng, > > > > I tried to reproduce this behavior on x86_64 but without success. > > > > On which architecture did you produce this result ? Do you have a simple > > test program to check with ? > > I have no HW platforms with pistachio, just read the code and run the > following test code in x86_64 and x86_32: > > #include <stdio.h> > unsigned long long pistachio_clocksource_read_cycles() > { > unsigned int counter = 0xf000000; should be unsigned int counter = 0xf0000000; > return ~(unsigned long long)counter; > } > int main() > { > printf("%llx\n", pistachio_clocksource_read_cycles()); > return 0; > } > > Thanks, > Jisheng > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Wed, 16 Dec 2015 15:28:07 +0800 wrote: > On Wed, 16 Dec 2015 15:11:25 +0800 > Jisheng Zhang wrote: > > > Dear Daniel, > > > > On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: > > > > > On 11/25/2015 04:42 PM, Jisheng Zhang wrote: > > > > Let's assume the counter value is 0xf000000, the pistachio clocksource > > oops, sorry, should be 0xf0000000. Do I need to send a v2 patch? And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks with c->mask before return, the c->mask is less than 32 bit (because the clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) the higher 32 bits are masked off, so we never saw such issue. But we'd better to fix that, what's your opinion? Thank you very much, Jisheng > > > > > read cycles function would return 0xffffffff0fffffff, but it should > > > > return 0xfffffff. > > > > > > > > We fix this issue by calculating bitwise-not counter, then cast to > > > > cycle_t. > > > > > > > > Signed-off-by: Jisheng Zhang <jszhang@marvell.com> > > > > > > Hi Jisheng, > > > > > > I tried to reproduce this behavior on x86_64 but without success. > > > > > > On which architecture did you produce this result ? Do you have a simple > > > test program to check with ? > > > > I have no HW platforms with pistachio, just read the code and run the > > following test code in x86_64 and x86_32: > > > > #include <stdio.h> > > unsigned long long pistachio_clocksource_read_cycles() > > { > > unsigned int counter = 0xf000000; > > should be unsigned int counter = 0xf0000000; > > > return ~(unsigned long long)counter; > > } > > int main() > > { > > printf("%llx\n", pistachio_clocksource_read_cycles()); > > return 0; > > } > > > > Thanks, > > Jisheng > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
On Wed, 16 Dec 2015 15:36:09 +0800 Jisheng Zhang wrote: > On Wed, 16 Dec 2015 15:28:07 +0800 wrote: > > > On Wed, 16 Dec 2015 15:11:25 +0800 > > Jisheng Zhang wrote: > > > > > Dear Daniel, > > > > > > On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: > > > > > > > On 11/25/2015 04:42 PM, Jisheng Zhang wrote: > > > > > Let's assume the counter value is 0xf000000, the pistachio clocksource > > > > oops, sorry, should be 0xf0000000. Do I need to send a v2 patch? > > And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks oops, sorry, I really need a cup of coffee. I mean clocksource_mmio_readl_down() > with c->mask before return, the c->mask is less than 32 bit (because the > clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) > the higher 32 bits are masked off, so we never saw such issue. But we'd better > to fix that, what's your opinion? > > Thank you very much, > Jisheng > > > > > > > > read cycles function would return 0xffffffff0fffffff, but it should > > > > > return 0xfffffff. > > > > > > > > > > We fix this issue by calculating bitwise-not counter, then cast to > > > > > cycle_t. > > > > > > > > > > Signed-off-by: Jisheng Zhang <jszhang@marvell.com> > > > > > > > > Hi Jisheng, > > > > > > > > I tried to reproduce this behavior on x86_64 but without success. > > > > > > > > On which architecture did you produce this result ? Do you have a simple > > > > test program to check with ? > > > > > > I have no HW platforms with pistachio, just read the code and run the > > > following test code in x86_64 and x86_32: > > > > > > #include <stdio.h> > > > unsigned long long pistachio_clocksource_read_cycles() > > > { > > > unsigned int counter = 0xf000000; > > > > should be unsigned int counter = 0xf0000000; > > > > > return ~(unsigned long long)counter; > > > } > > > int main() > > > { > > > printf("%llx\n", pistachio_clocksource_read_cycles()); > > > return 0; > > > } > > > > > > Thanks, > > > Jisheng > > > > > > _______________________________________________ > > > linux-arm-kernel mailing list > > > linux-arm-kernel@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On 12/16/2015 08:28 AM, Jisheng Zhang wrote: > On Wed, 16 Dec 2015 15:11:25 +0800 > Jisheng Zhang wrote: > >> Dear Daniel, >> >> On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: >> >>> On 11/25/2015 04:42 PM, Jisheng Zhang wrote: >>>> Let's assume the counter value is 0xf000000, the pistachio clocksource > > oops, sorry, should be 0xf0000000. Do I need to send a v2 patch? Nope, I fixed it.
On 12/16/2015 08:11 AM, Jisheng Zhang wrote: > Dear Daniel, > > On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: > >> On 11/25/2015 04:42 PM, Jisheng Zhang wrote: >>> Let's assume the counter value is 0xf000000, the pistachio clocksource >>> read cycles function would return 0xffffffff0fffffff, but it should >>> return 0xfffffff. >>> >>> We fix this issue by calculating bitwise-not counter, then cast to >>> cycle_t. >>> >>> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> >> >> Hi Jisheng, >> >> I tried to reproduce this behavior on x86_64 but without success. >> >> On which architecture did you produce this result ? Do you have a simple >> test program to check with ? > > I have no HW platforms with pistachio, just read the code and run the > following test code in x86_64 and x86_32: > > #include <stdio.h> > unsigned long long pistachio_clocksource_read_cycles() > { > unsigned int counter = 0xf000000; > return ~(unsigned long long)counter; > } > int main() > { > printf("%llx\n", pistachio_clocksource_read_cycles()); > return 0; > } Ok, I reproduced it. I had an issue with the signed/unsigned type. That's a good catch !
On 12/16/2015 08:36 AM, Jisheng Zhang wrote: > On Wed, 16 Dec 2015 15:28:07 +0800 wrote: > >> On Wed, 16 Dec 2015 15:11:25 +0800 >> Jisheng Zhang wrote: >> >>> Dear Daniel, >>> >>> On Tue, 15 Dec 2015 21:59:30 +0100 Daniel Lezcano wrote: >>> >>>> On 11/25/2015 04:42 PM, Jisheng Zhang wrote: >>>>> Let's assume the counter value is 0xf000000, the pistachio clocksource >> >> oops, sorry, should be 0xf0000000. Do I need to send a v2 patch? > > And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks > with c->mask before return, the c->mask is less than 32 bit (because the > clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) > the higher 32 bits are masked off, so we never saw such issue. But we'd better > to fix that, what's your opinion? I think we should have a look to this portion closely.
On Wed, Dec 16, 2015 at 10:21:55AM +0100, Daniel Lezcano wrote: > On 12/16/2015 08:36 AM, Jisheng Zhang wrote: > >And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks > >with c->mask before return, the c->mask is less than 32 bit (because the > >clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) > >the higher 32 bits are masked off, so we never saw such issue. But we'd better > >to fix that, what's your opinion? > > I think we should have a look to this portion closely. There is no need to return more bits than are specified. If you have a N-bit counter, then the high (64-N)-bits can be any value, because: static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask) { return (now - last) & mask; } where 'now' is the current value returned from the clock source read function, 'last' is a previously returned value, and 'mask' is the bit mask. This has the effect of ignoring the high order bits.
On 12/16/2015 10:33 AM, Russell King - ARM Linux wrote: > On Wed, Dec 16, 2015 at 10:21:55AM +0100, Daniel Lezcano wrote: >> On 12/16/2015 08:36 AM, Jisheng Zhang wrote: >>> And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks >>> with c->mask before return, the c->mask is less than 32 bit (because the >>> clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) >>> the higher 32 bits are masked off, so we never saw such issue. But we'd better >>> to fix that, what's your opinion? >> >> I think we should have a look to this portion closely. > > There is no need to return more bits than are specified. If you have > a N-bit counter, then the high (64-N)-bits can be any value, because: > > static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask) > { > return (now - last) & mask; > } > > where 'now' is the current value returned from the clock source read > function, 'last' is a previously returned value, and 'mask' is the > bit mask. This has the effect of ignoring the high order bits. I think this approach is perfectly sane. When I said we should look at this portion closely, I meant we should double check the bitwise-nor order regarding the explicit cast. The clocksource's mask makes sense and must stay untouched.
On Wed, Dec 16, 2015 at 11:32:17AM +0100, Daniel Lezcano wrote: > On 12/16/2015 10:33 AM, Russell King - ARM Linux wrote: > >On Wed, Dec 16, 2015 at 10:21:55AM +0100, Daniel Lezcano wrote: > >>On 12/16/2015 08:36 AM, Jisheng Zhang wrote: > >>>And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks > >>>with c->mask before return, the c->mask is less than 32 bit (because the > >>>clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) > >>>the higher 32 bits are masked off, so we never saw such issue. But we'd better > >>>to fix that, what's your opinion? > >> > >>I think we should have a look to this portion closely. > > > >There is no need to return more bits than are specified. If you have > >a N-bit counter, then the high (64-N)-bits can be any value, because: > > > >static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask) > >{ > > return (now - last) & mask; > >} > > > >where 'now' is the current value returned from the clock source read > >function, 'last' is a previously returned value, and 'mask' is the > >bit mask. This has the effect of ignoring the high order bits. > > I think this approach is perfectly sane. When I said we should look at this > portion closely, I meant we should double check the bitwise-nor order > regarding the explicit cast. The clocksource's mask makes sense and must > stay untouched. That's not my point. Whether you do: ~(cycle_t)readl(...) or (cycle_t)~readl(...) is irrelevant - the result is the same as far as the core code is concerned as it doesn't care about the higher order bits. The only thing about which should be done is really which is faster in the general case, since this is a fast path in the time keeping code.
On 12/16/2015 11:38 AM, Russell King - ARM Linux wrote: > On Wed, Dec 16, 2015 at 11:32:17AM +0100, Daniel Lezcano wrote: >> On 12/16/2015 10:33 AM, Russell King - ARM Linux wrote: >>> On Wed, Dec 16, 2015 at 10:21:55AM +0100, Daniel Lezcano wrote: >>>> On 12/16/2015 08:36 AM, Jisheng Zhang wrote: >>>>> And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks >>>>> with c->mask before return, the c->mask is less than 32 bit (because the >>>>> clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) >>>>> the higher 32 bits are masked off, so we never saw such issue. But we'd better >>>>> to fix that, what's your opinion? >>>> >>>> I think we should have a look to this portion closely. >>> >>> There is no need to return more bits than are specified. If you have >>> a N-bit counter, then the high (64-N)-bits can be any value, because: >>> >>> static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask) >>> { >>> return (now - last) & mask; >>> } >>> >>> where 'now' is the current value returned from the clock source read >>> function, 'last' is a previously returned value, and 'mask' is the >>> bit mask. This has the effect of ignoring the high order bits. >> >> I think this approach is perfectly sane. When I said we should look at this >> portion closely, I meant we should double check the bitwise-nor order >> regarding the explicit cast. The clocksource's mask makes sense and must >> stay untouched. > > That's not my point. Whether you do: > > ~(cycle_t)readl(...) > > or > > (cycle_t)~readl(...) > > is irrelevant - the result is the same as far as the core code is > concerned as it doesn't care about the higher order bits. > > The only thing about which should be done is really which is faster > in the general case, since this is a fast path in the time keeping > code. Ah, ok. Yes, I agree.
On Wed, 16 Dec 2015 10:38:03 +0000 Russell King - ARM Linux wrote: > On Wed, Dec 16, 2015 at 11:32:17AM +0100, Daniel Lezcano wrote: > > On 12/16/2015 10:33 AM, Russell King - ARM Linux wrote: > > >On Wed, Dec 16, 2015 at 10:21:55AM +0100, Daniel Lezcano wrote: > > >>On 12/16/2015 08:36 AM, Jisheng Zhang wrote: > > >>>And in fact, clocksource_mmio_readw_down() also has similar issue, but it masks > > >>>with c->mask before return, the c->mask is less than 32 bit (because the > > >>>clocksource_mmio_init think number of valid bits > 32 or < 16 is invalid.) > > >>>the higher 32 bits are masked off, so we never saw such issue. But we'd better > > >>>to fix that, what's your opinion? > > >> > > >>I think we should have a look to this portion closely. > > > > > >There is no need to return more bits than are specified. If you have > > >a N-bit counter, then the high (64-N)-bits can be any value, because: > > > > > >static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask) > > >{ > > > return (now - last) & mask; > > >} So the "& c->mask" in "~(cycle_t)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;" isn't needed, I'm not sure I understand this correctly. > > > > > >where 'now' is the current value returned from the clock source read > > >function, 'last' is a previously returned value, and 'mask' is the > > >bit mask. This has the effect of ignoring the high order bits. > > > > I think this approach is perfectly sane. When I said we should look at this > > portion closely, I meant we should double check the bitwise-nor order > > regarding the explicit cast. The clocksource's mask makes sense and must > > stay untouched. > > That's not my point. Whether you do: > > ~(cycle_t)readl(...) > > or > > (cycle_t)~readl(...) > > is irrelevant - the result is the same as far as the core code is > concerned as it doesn't care about the higher order bits. > > The only thing about which should be done is really which is faster > in the general case, since this is a fast path in the time keeping > code. > Got it. If there's no "& c->mask", just as the pistachio does, return (cycle_t)~readl_relaxed(to_mmio_clksrc(c)->reg) 1c: e1a0c00d mov ip, sp 20: e92dd800 push {fp, ip, lr, pc} 24: e24cb004 sub fp, ip, #4 28: e5103040 ldr r3, [r0, #-64] ; 0x40 2c: e5930000 ldr r0, [r3] 30: e3a01000 mov r1, #0 34: e1e00000 mvn r0, r0 38: e89da800 ldm sp, {fp, sp, pc} is better than return ~(cycle_t)readl_relaxed(to_mmio_clksrc(c)->reg); 1c: e1a0c00d mov ip, sp 20: e92dd800 push {fp, ip, lr, pc} 24: e24cb004 sub fp, ip, #4 28: e5103040 ldr r3, [r0, #-64] ; 0x40 2c: e5932000 ldr r2, [r3] 30: e3a01000 mov r1, #0 34: e1e00002 mvn r0, r2 38: e1e01001 mvn r1, r1 3c: e89da800 ldm sp, {fp, sp, pc} Thanks, Jisheng
diff --git a/drivers/clocksource/time-pistachio.c b/drivers/clocksource/time-pistachio.c index bba6799..3269d9e 100644 --- a/drivers/clocksource/time-pistachio.c +++ b/drivers/clocksource/time-pistachio.c @@ -84,7 +84,7 @@ pistachio_clocksource_read_cycles(struct clocksource *cs) counter = gpt_readl(pcs->base, TIMER_CURRENT_VALUE, 0); raw_spin_unlock_irqrestore(&pcs->lock, flags); - return ~(cycle_t)counter; + return (cycle_t)~counter; } static u64 notrace pistachio_read_sched_clock(void)
Let's assume the counter value is 0xf000000, the pistachio clocksource read cycles function would return 0xffffffff0fffffff, but it should return 0xfffffff. We fix this issue by calculating bitwise-not counter, then cast to cycle_t. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> --- drivers/clocksource/time-pistachio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)