Message ID | 20170921220510.GA11395@obsidianresearch.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Sep 21, 2017 at 04:05:10PM -0600, Jason Gunthorpe wrote: > On Thu, Sep 21, 2017 at 10:32:19AM +0100, Russell King - ARM Linux > wrote: > > I've combined several of your messages together into this reply... Sorry about that, I'm currently ill and sleep deprived, so it's been a struggle to pick up on everything together. > So with v2 of this patch, the driver would specify positive 1s, and > then the set will be called when tv_nsec == 0, but the tv_sec will > be +1. > > Similarly the existing case of 0.5s offset will work more like > before, where it will be called with tv_nsec == .5s and the tv_sec > will be +1 - before I think this happened reliably 'by accident' due > to the rounding. It seems to mean (from reading the code) that I'd need to specify -470ms for the PCF8523, if I'm understand what you're saying and the code correctly. I should point out, however, that I'm running a modified version of your previous patch with 470ms specified - I added a debug printk() for the failure cases, and I seem to be getting quite a number: [ 416.966909] rtc_tv_nsec_ok: fail: ts=1506001533.510194975 [ 427.974142] rtc_tv_nsec_ok: fail: ts=1506001544.518073728 [ 768.946856] rtc_tv_nsec_ok: fail: ts=1506001885.510091123 [ 779.954207] rtc_tv_nsec_ok: fail: ts=1506001896.518068045 ... [31391.230806] rtc_tv_nsec_ok: fail: ts=1506032509.510048412 [31402.238221] rtc_tv_nsec_ok: fail: ts=1506032520.518080991 176 so far in that window, which equates to an average failure rate of one failure every 3m5s. This is on a kernel with: CONFIG_NO_HZ_IDLE=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_HZ=250 From what I can see, we have failures that are 11s apart and 340s apart. If we assume that we had a successful write at 1506001555s, then that means we only waited 329s before we next tried to set the RTC, not the 11 minutes that we should have. Not sure yet what's going on there, but it seems to suggest that we're struggling to set the RTC, taking multiple attempts to achieve it. Also... don't we want to move the call to rtc_set_ntp_time() from out of the "if (abs(now.tv_nsec - (NSEC_PER_SEC / 2)) <= tick_nsec * 5) {" check, otherwise how could it be called except at about .5s? I'm actually not sure if we are ending up setting the time - my test program reports: RTC Time: 21-09-2017 22:40:45 System Time was: 22:40:44.997 so the RTC has drifted off by 3ms wrt system time, and that suggests it hasn't been set. It's difficult to know. Given that tick_nsec is likely to be 4ms (I've not checked) it would suggest that we're never successfully setting the RTC - since either the test in ntp.c fails or the test in systohc.c fails. Both can't succeed given these constraints.
On Thu, Sep 21, 2017 at 11:45:41PM +0100, Russell King - ARM Linux wrote: > > I've combined several of your messages together into this reply... > > Sorry about that, I'm currently ill and sleep deprived, so it's been > a struggle to pick up on everything together. No worries, your replys were easy to follow, thanks > > So with v2 of this patch, the driver would specify positive 1s, and > > then the set will be called when tv_nsec == 0, but the tv_sec will > > be +1. > > > > Similarly the existing case of 0.5s offset will work more like > > before, where it will be called with tv_nsec == .5s and the tv_sec > > will be +1 - before I think this happened reliably 'by accident' due > > to the rounding. > > It seems to mean (from reading the code) that I'd need to specify > -470ms for the PCF8523, if I'm understand what you're saying and > the code correctly. Okay.. but I guessed a negative number is probably not what most RTCs would want, eg a positive number would compensate for delays executing an I2C command, assuming the RTC zeros its divider when it sets the time. > I should point out, however, that I'm running a modified version > of your previous patch with 470ms specified - I added a debug > printk() for the failure cases, and I seem to be getting quite a > number: > > [ 416.966909] rtc_tv_nsec_ok: fail: ts=1506001533.510194975 > then that means we only waited 329s before we next tried to set the > RTC, not the 11 minutes that we should have. Not sure yet what's > going on there, but it seems to suggest that we're struggling to > set the RTC, taking multiple attempts to achieve it. Ah.. > Also... don't we want to move the call to rtc_set_ntp_time() from out > of the "if (abs(now.tv_nsec - (NSEC_PER_SEC / 2)) <= tick_nsec * 5) {" > check, otherwise how could it be called except at about .5s? Yes, definitely. If we do not call through to rtc_set_ntp_time then it does not update target_nsec, and we go back to the 0.5 offset not the offset you want, which will certainly cause random rtc_tv_nsec_ok failures. My bad to not notice that.. But I'm not sure what the revision should be.. I think this is using tick_nsec similarly to TIME_SET_NSEC_FUZZ, but I'm not sure where tick_nsec comes from or if we should push it down in to TIME_SET_NSEC_FUZZ, or something else. > I'm actually not sure if we are ending up setting the time - my test > program reports: Probably not, given the above! Jason
On Thu, Sep 21, 2017 at 05:20:40PM -0600, Jason Gunthorpe wrote: > On Thu, Sep 21, 2017 at 11:45:41PM +0100, Russell King - ARM Linux wrote: > > > I've combined several of your messages together into this reply... > > > > Sorry about that, I'm currently ill and sleep deprived, so it's been > > a struggle to pick up on everything together. > > No worries, your replys were easy to follow, thanks Feeling better today, thankfully. > > > So with v2 of this patch, the driver would specify positive 1s, and > > > then the set will be called when tv_nsec == 0, but the tv_sec will > > > be +1. > > > > > > Similarly the existing case of 0.5s offset will work more like > > > before, where it will be called with tv_nsec == .5s and the tv_sec > > > will be +1 - before I think this happened reliably 'by accident' due > > > to the rounding. > > > > It seems to mean (from reading the code) that I'd need to specify > > -470ms for the PCF8523, if I'm understand what you're saying and > > the code correctly. > > Okay.. but I guessed a negative number is probably not what most RTCs would > want, eg a positive number would compensate for delays executing an > I2C command, assuming the RTC zeros its divider when it sets the time. Okay, I thought I'd investigate a few other systems: With the mainline code, which sets the RTC to tv_sec + 1 when tv_nsec >= 500ms or tv_sec when tv_nsec < 500ms, at about tv_nsec = 500ms: Dove - drifting about 4-5ms per minute: RTC Time: 22-09-2017 09:06:38 System Time was: 09:06:37.464 RTC Time: 22-09-2017 09:07:38 System Time was: 09:07:37.508 RTC Time: 22-09-2017 09:08:39 System Time was: 09:08:38.504 ... RTC Time: 22-09-2017 09:17:48 System Time was: 09:17:47.464 RTC Time: 22-09-2017 09:18:48 System Time was: 09:18:47.500 I suspect this wants the time to be set at or around tv_nsec = 0. Armada 388 - drifting at about 2-3ms per minute: RTC Time: 22-09-2017 08:22:25 System Time was: 08:22:25.498 ... RTC Time: 22-09-2017 08:30:42 System Time was: 08:30:42.516 RTC Time: 22-09-2017 08:31:43 System Time was: 08:31:43.518 RTC Time: 22-09-2017 08:32:45 System Time was: 08:32:44.520 RTC Time: 22-09-2017 08:33:46 System Time was: 08:33:45.522 ... RTC Time: 22-09-2017 08:41:54 System Time was: 08:41:53.540 RTC Time: 22-09-2017 08:42:55 System Time was: 08:42:54.542 RTC Time: 22-09-2017 08:43:56 System Time was: 08:43:55.520 RTC Time: 22-09-2017 08:44:57 System Time was: 08:44:56.522 This looks the same as the Dove case - although there's some flucuation in the second offset, probably caused by the 500ms threshold in rtc_set_ntp_time(). > > Also... don't we want to move the call to rtc_set_ntp_time() from out > > of the "if (abs(now.tv_nsec - (NSEC_PER_SEC / 2)) <= tick_nsec * 5) {" > > check, otherwise how could it be called except at about .5s? > > Yes, definitely. If we do not call through to rtc_set_ntp_time then it > does not update target_nsec, and we go back to the 0.5 offset not the > offset you want, which will certainly cause random rtc_tv_nsec_ok > failures. > > My bad to not notice that.. But I'm not sure what the revision should > be.. I think this is using tick_nsec similarly to TIME_SET_NSEC_FUZZ, > but I'm not sure where tick_nsec comes from or if we should push it > down in to TIME_SET_NSEC_FUZZ, or something else. I think tick_nsec is the length of one tick in nanoseconds, adjusted by NTP (see ntp_update_frequency() where it's calculated.) It's probably easiest to think about it in terms of the old timekeeping model, where we had a periodic tick whose period was fixed and defined the point at which time incremented. tick_nsec would be the amount of time for one period. I think that's still relevant in a modern kernel, especially when it's using a clock event in periodic mode, as that's when we get the regular interrupts which guarantee a scheduling point (other scheduling points are available!) and hence when the workqueue can be run. The only places it's used is by the NTP code, and also by a few architectures to that still have gettimeoffset() functionality (pre- clocksource) as it can be used to derive the scaling factor to convert timer count ticks to wall time. I also don't see an obvious rearrangement of the code to fix this - the problem is we need to call update_persistent_clock64() to know whether we should call rtc_set_ntp_time(), but we can only call update_persistent_clock64() on tv_nsec=500ms. It looks like the users of update_persistent_clock*() are: arch/mips/kernel/time.c arch/powerpc/kernel/time.c arch/sh/kernel/time.c arch/x86/kernel/rtc.c I thought SH was introduced after rtclib was born? All of these have some kind of architecture specific multiplexing of the update_persistent_clock() call - switch statements depending on rtc type, function pointers set according to the rtc type, etc. They all look like legacy code that no one wants to update to rtclib. Maybe the solution here is to split two forms of ntp RTC synchronisation so: - If you enable CONFIG_GENERIC_CMOS_UPDATE, then you get the update_persistent_clock*() methods called. - If you enable CONFIG_RTC_SYSTOHC, rtc_set_ntp_time() gets called. - If you enable both, then both get called at their appropriate times. This would certainly simplify sync_cmos_clock().
On Fri, Sep 22, 2017 at 10:57:13AM +0100, Russell King - ARM Linux wrote: > Maybe the solution here is to split two forms of ntp RTC synchronisation > so: > - If you enable CONFIG_GENERIC_CMOS_UPDATE, then you get the > update_persistent_clock*() methods called. > - If you enable CONFIG_RTC_SYSTOHC, rtc_set_ntp_time() gets called. > - If you enable both, then both get called at their appropriate times. > > This would certainly simplify sync_cmos_clock(). I've just tried that out, and it doesn't work very well (not wrapping the kernel messages): [ 18.827144] sync_rtc_clock: adjust=1506081346.710933001 target=470000000 next=1506081357.470000000 now=1506081346.710953334 delta=10.759046666 jiffies=2690 [ 29.662077] sync_rtc_clock: adjust=1506081357.546474534 target=470000000 next=1506081368.470000000 now=1506081357.546487201 delta=10.923512799 jiffies=2731 [ 40.671669] sync_rtc_clock: adjust=1506081368.556684059 target=470000000 next=1506081379.470000000 now=1506081368.556696727 delta=10.913303273 jiffies=2729 [ 51.675016] sync_rtc_clock: adjust=1506081379.560649235 target=470000000 next=1506081390.470000000 now=1506081379.560662236 delta=10.909337764 jiffies=2728 [ 62.682385] sync_rtc_clock: adjust=1506081390.568637970 target=470000000 next=1506081401.470000000 now=1506081390.568648637 delta=10.901351363 jiffies=2726 [ 73.689831] sync_rtc_clock: adjust=1506081401.576696708 target=470000000 next=1506081412.470000000 now=1506081401.576708376 delta=10.893291624 jiffies=2724 [ 84.697233] sync_rtc_clock: adjust=1506081412.584695110 target=470000000 next=1506081423.470000000 now=1506081412.584723778 delta=10.885276222 jiffies=2722 [ 95.704624] sync_rtc_clock: adjust=1506081423.592724847 target=470000000 next=1506081434.470000000 now=1506081423.592736847 delta=10.877263153 jiffies=2720 [ 106.711941] sync_rtc_clock: adjust=1506081434.600651911 target=470000000 next=1506081445.470000000 now=1506081434.600665579 delta=10.869334421 jiffies=2718 [ 117.719334] sync_rtc_clock: adjust=1506081445.608659647 target=470000000 next=1506081456.470000000 now=1506081445.608673981 delta=10.861326019 jiffies=2716 [ 128.726723] sync_rtc_clock: adjust=1506081456.616661715 target=470000000 next=1506081467.470000000 now=1506081456.616676049 delta=10.853323951 jiffies=2714 [ 139.734103] sync_rtc_clock: adjust=1506081467.624659450 target=470000000 next=1506081478.470000000 now=1506081467.624672451 delta=10.845327549 jiffies=2712 [ 150.741489] sync_rtc_clock: adjust=1506081478.632661185 target=470000000 next=1506081489.470000000 now=1506081478.632673519 delta=10.837326481 jiffies=2710 [ 161.748870] sync_rtc_clock: adjust=1506081489.640655587 target=470000000 next=1506081500.470000000 now=1506081489.640668254 delta=10.829331746 jiffies=2708 [ 172.756352] sync_rtc_clock: adjust=1506081500.648758661 target=470000000 next=1506081511.470000000 now=1506081500.648772995 delta=10.821227005 jiffies=2706 [ 183.763642] sync_rtc_clock: adjust=1506081511.656658057 target=470000000 next=1506081522.470000000 now=1506081511.656671391 delta=10.813328609 jiffies=2704 [ 194.771043] sync_rtc_clock: adjust=1506081522.664678127 target=470000000 next=1506081533.470000000 now=1506081522.664689794 delta=10.805310206 jiffies=2702 [ 205.778402] sync_rtc_clock: adjust=1506081533.672649194 target=470000000 next=1506081544.470000000 now=1506081533.672662528 delta=10.797337472 jiffies=2700 [ 216.785780] sync_rtc_clock: adjust=1506081544.680643928 target=470000000 next=1506081555.470000000 now=1506081544.680656929 delta=10.789343071 jiffies=2698 [ 227.792447] sync_rtc_clock: adjust=1506081555.688650369 target=470000000 next=1506081566.470000000 now=1506081555.688664037 delta=10.781335963 jiffies=2696 [ 238.798968] sync_rtc_clock: adjust=1506081566.696674887 target=470000000 next=1506081577.470000000 now=1506081566.696687889 delta=10.773312111 jiffies=2694 [ 249.805483] sync_rtc_clock: adjust=1506081577.704652923 target=470000000 next=1506081588.470000000 now=1506081577.704666259 delta=10.765333741 jiffies=2692 [ 260.812078] sync_rtc_clock: adjust=1506081588.712679979 target=470000000 next=1506081599.470000000 now=1506081588.712693647 delta=10.757306353 jiffies=2690 [ 271.818656] sync_rtc_clock: adjust=1506081599.720654429 target=470000000 next=1506081610.470000000 now=1506081599.720668431 delta=10.749331569 jiffies=2688 [ 282.825311] sync_rtc_clock: adjust=1506081610.728677710 target=470000000 next=1506081621.470000000 now=1506081610.728690045 delta=10.741309955 jiffies=2686 [ 293.575979] sync_rtc_clock: adjust=1506081621.480653995 target=470000000 next=1506081632.470000000 now=1506081621.480666996 delta=10.989333004 jiffies=2748 [ 304.582667] sync_rtc_clock: adjust=1506081632.488651088 target=470000000 next=1506081643.470000000 now=1506081632.488663756 delta=10.981336244 jiffies=2746 [ 315.589372] sync_rtc_clock: adjust=1506081643.496641761 target=470000000 next=1506081654.470000000 now=1506081643.496653429 delta=10.973346571 jiffies=2744 As previous mentioned, this has HZ=250, so one jiffy is about 4ms. What this shows is that we're failing every single time to hit the desired window - the closest we got in the 5 minutes of uptime to 470ms was 480653995ns. The comments in linux/workqueue.h refer to a gmane link - http://thread.gmane.org/gmane.linux.kernel/1480396 but this seems to be dead - "Archived At Nothing found - bye". Looks like using gmane URLs in kernel code is not a good idea, they don't seem to be persistent! It's interesting to watch what's happening there - despite reducing the number of jiffies each time, we still seem to be woken later and later each time, with an approximate increase of 8ms or two jiffies each time. Eg: [ 293.575979] sync_rtc_clock: adjust=1506081621.480653995 target=470000000 next=1506081632.470000000 now=1506081621.480666996 delta=10.989333004 jiffies=2748 [ 304.582667] sync_rtc_clock: adjust=1506081632.488651088 target=470000000 next=1506081643.470000000 now=1506081632.488663756 delta=10.981336244 jiffies=2746 [ 315.589372] sync_rtc_clock: adjust=1506081643.496641761 target=470000000 next=1506081654.470000000 now=1506081643.496653429 delta=10.973346571 jiffies=2744 If we do the math, then: 2748 jiffies is 10.992s, which is 10.989333004 rounded up to a jiffy. 10.992 + 1506081621.480666996 gives 1506081632.472666996, but we're next called at 1506081632.488651088, which is almost 4 jiffies later than requested. 2746 jiffies is 10.984s, which is 10.981336244 rounded up to a jiffy. 10.984 + 1506081632.488663756 gives 1506081643.472663756, but we're next called at 1506081643.496641761, which is almost 6 jiffies later than requested. I'm not sure where those extra jiffies are coming from... but maybe using the power efficient wq is not the best for accuracy? Also notice that the target time seems to always be about 2.6ms late. based on the requested jiffies, which in itself would mean we miss the 1ms window in the systohc.c code.
On Fri, Sep 22, 2017 at 01:24:20PM +0100, Russell King - ARM Linux wrote: > On Fri, Sep 22, 2017 at 10:57:13AM +0100, Russell King - ARM Linux wrote: > > Maybe the solution here is to split two forms of ntp RTC synchronisation > > so: > > - If you enable CONFIG_GENERIC_CMOS_UPDATE, then you get the > > update_persistent_clock*() methods called. > > - If you enable CONFIG_RTC_SYSTOHC, rtc_set_ntp_time() gets called. > > - If you enable both, then both get called at their appropriate times. > > > > This would certainly simplify sync_cmos_clock(). > > I've just tried that out, and it doesn't work very well (not wrapping > the kernel messages): > > [ 18.827144] sync_rtc_clock: adjust=1506081346.710933001 target=470000000 next=1506081357.470000000 now=1506081346.710953334 delta=10.759046666 jiffies=2690 > [ 29.662077] sync_rtc_clock: adjust=1506081357.546474534 target=470000000 next=1506081368.470000000 now=1506081357.546487201 delta=10.923512799 jiffies=2731 > [ 40.671669] sync_rtc_clock: adjust=1506081368.556684059 target=470000000 next=1506081379.470000000 now=1506081368.556696727 delta=10.913303273 jiffies=2729 > [ 51.675016] sync_rtc_clock: adjust=1506081379.560649235 target=470000000 next=1506081390.470000000 now=1506081379.560662236 delta=10.909337764 jiffies=2728 > [ 62.682385] sync_rtc_clock: adjust=1506081390.568637970 target=470000000 next=1506081401.470000000 now=1506081390.568648637 delta=10.901351363 jiffies=2726 > [ 73.689831] sync_rtc_clock: adjust=1506081401.576696708 target=470000000 next=1506081412.470000000 now=1506081401.576708376 delta=10.893291624 jiffies=2724 > [ 84.697233] sync_rtc_clock: adjust=1506081412.584695110 target=470000000 next=1506081423.470000000 now=1506081412.584723778 delta=10.885276222 jiffies=2722 > [ 95.704624] sync_rtc_clock: adjust=1506081423.592724847 target=470000000 next=1506081434.470000000 now=1506081423.592736847 delta=10.877263153 jiffies=2720 > [ 106.711941] sync_rtc_clock: adjust=1506081434.600651911 target=470000000 next=1506081445.470000000 now=1506081434.600665579 delta=10.869334421 jiffies=2718 > [ 117.719334] sync_rtc_clock: adjust=1506081445.608659647 target=470000000 next=1506081456.470000000 now=1506081445.608673981 delta=10.861326019 jiffies=2716 > [ 128.726723] sync_rtc_clock: adjust=1506081456.616661715 target=470000000 next=1506081467.470000000 now=1506081456.616676049 delta=10.853323951 jiffies=2714 > [ 139.734103] sync_rtc_clock: adjust=1506081467.624659450 target=470000000 next=1506081478.470000000 now=1506081467.624672451 delta=10.845327549 jiffies=2712 > [ 150.741489] sync_rtc_clock: adjust=1506081478.632661185 target=470000000 next=1506081489.470000000 now=1506081478.632673519 delta=10.837326481 jiffies=2710 > [ 161.748870] sync_rtc_clock: adjust=1506081489.640655587 target=470000000 next=1506081500.470000000 now=1506081489.640668254 delta=10.829331746 jiffies=2708 > [ 172.756352] sync_rtc_clock: adjust=1506081500.648758661 target=470000000 next=1506081511.470000000 now=1506081500.648772995 delta=10.821227005 jiffies=2706 > [ 183.763642] sync_rtc_clock: adjust=1506081511.656658057 target=470000000 next=1506081522.470000000 now=1506081511.656671391 delta=10.813328609 jiffies=2704 > [ 194.771043] sync_rtc_clock: adjust=1506081522.664678127 target=470000000 next=1506081533.470000000 now=1506081522.664689794 delta=10.805310206 jiffies=2702 > [ 205.778402] sync_rtc_clock: adjust=1506081533.672649194 target=470000000 next=1506081544.470000000 now=1506081533.672662528 delta=10.797337472 jiffies=2700 > [ 216.785780] sync_rtc_clock: adjust=1506081544.680643928 target=470000000 next=1506081555.470000000 now=1506081544.680656929 delta=10.789343071 jiffies=2698 > [ 227.792447] sync_rtc_clock: adjust=1506081555.688650369 target=470000000 next=1506081566.470000000 now=1506081555.688664037 delta=10.781335963 jiffies=2696 > [ 238.798968] sync_rtc_clock: adjust=1506081566.696674887 target=470000000 next=1506081577.470000000 now=1506081566.696687889 delta=10.773312111 jiffies=2694 > [ 249.805483] sync_rtc_clock: adjust=1506081577.704652923 target=470000000 next=1506081588.470000000 now=1506081577.704666259 delta=10.765333741 jiffies=2692 > [ 260.812078] sync_rtc_clock: adjust=1506081588.712679979 target=470000000 next=1506081599.470000000 now=1506081588.712693647 delta=10.757306353 jiffies=2690 > [ 271.818656] sync_rtc_clock: adjust=1506081599.720654429 target=470000000 next=1506081610.470000000 now=1506081599.720668431 delta=10.749331569 jiffies=2688 > [ 282.825311] sync_rtc_clock: adjust=1506081610.728677710 target=470000000 next=1506081621.470000000 now=1506081610.728690045 delta=10.741309955 jiffies=2686 > [ 293.575979] sync_rtc_clock: adjust=1506081621.480653995 target=470000000 next=1506081632.470000000 now=1506081621.480666996 delta=10.989333004 jiffies=2748 > [ 304.582667] sync_rtc_clock: adjust=1506081632.488651088 target=470000000 next=1506081643.470000000 now=1506081632.488663756 delta=10.981336244 jiffies=2746 > [ 315.589372] sync_rtc_clock: adjust=1506081643.496641761 target=470000000 next=1506081654.470000000 now=1506081643.496653429 delta=10.973346571 jiffies=2744 > > As previous mentioned, this has HZ=250, so one jiffy is about 4ms. What > this shows is that we're failing every single time to hit the desired > window - the closest we got in the 5 minutes of uptime to 470ms was > 480653995ns. > > The comments in linux/workqueue.h refer to a gmane link - > http://thread.gmane.org/gmane.linux.kernel/1480396 but this seems to be > dead - "Archived At Nothing found - bye". Looks like using gmane URLs > in kernel code is not a good idea, they don't seem to be persistent! > > It's interesting to watch what's happening there - despite reducing the > number of jiffies each time, we still seem to be woken later and later > each time, with an approximate increase of 8ms or two jiffies each time. > Eg: > > [ 293.575979] sync_rtc_clock: adjust=1506081621.480653995 target=470000000 > next=1506081632.470000000 now=1506081621.480666996 delta=10.989333004 > jiffies=2748 > [ 304.582667] sync_rtc_clock: adjust=1506081632.488651088 target=470000000 > next=1506081643.470000000 now=1506081632.488663756 delta=10.981336244 > jiffies=2746 > [ 315.589372] sync_rtc_clock: adjust=1506081643.496641761 target=470000000 > next=1506081654.470000000 now=1506081643.496653429 delta=10.973346571 > jiffies=2744 > > If we do the math, then: > > 2748 jiffies is 10.992s, which is 10.989333004 rounded up to a jiffy. > 10.992 + 1506081621.480666996 gives 1506081632.472666996, but we're > next called at 1506081632.488651088, which is almost 4 jiffies later > than requested. > > 2746 jiffies is 10.984s, which is 10.981336244 rounded up to a jiffy. > 10.984 + 1506081632.488663756 gives 1506081643.472663756, but we're > next called at 1506081643.496641761, which is almost 6 jiffies later > than requested. > > I'm not sure where those extra jiffies are coming from... but maybe > using the power efficient wq is not the best for accuracy? > > Also notice that the target time seems to always be about 2.6ms late. > based on the requested jiffies, which in itself would mean we miss > the 1ms window in the systohc.c code. Okay, the problem seems to be the new timer wheel code - see https://lwn.net/Articles/646950/ This means that timers set way in the future (like more than a few milliseconds) can be delayed - they no longer expire "on the specified jiffies". Something to bear in mind for the future, as this affects everything that uses normal timers. I think the reason this doesn't show up for the original code is because of your change from a 0-1s retry to a 10s retry - the further the expiry into the future, the more delayed the timers can get. I guess we could go back to the 0-1s retry and keep using the delayed work queues, but I wonder if normal timers will get even less accurate in the future (if tglx does more work there.) The suggestion from Arjan van de Ven is to use hrtimers for this - but we still need the workqueue (as we may sleep) so we need a hrtimer- delayed workqueue. On the face of it, that seems a contradiction in terms to use a hrtimer with a workqueue. I do have an implementation for the hrtimer suggestion, which I'll try to sort out during the remainder of the day, once I've ironed out a few remaining quirks.
diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c index 2ed970d61da140..eef4123a573504 100644 --- a/drivers/rtc/class.c +++ b/drivers/rtc/class.c @@ -161,6 +161,12 @@ static struct rtc_device *rtc_allocate_device(void) device_initialize(&rtc->dev); + /* Drivers can revise this default after allocating the device. It + * should be the value of wallclock tv_nsec that the driver needs in + * order to synchronize the second tick over during set. + */ + rtc->set_offset_nsec = NSEC_PER_SEC / 2; + rtc->irq_freq = 1; rtc->max_user_freq = 64; rtc->dev.class = rtc_class; diff --git a/drivers/rtc/systohc.c b/drivers/rtc/systohc.c index b4a68ffcd06bb8..a7d0bbe2577110 100644 --- a/drivers/rtc/systohc.c +++ b/drivers/rtc/systohc.c @@ -7,9 +7,42 @@ #include <linux/rtc.h> #include <linux/time.h> +/* Determine if we can call to driver to set the time. Drivers can only be + * called to set a second aligned time value, and the field set_offset_nsec + * specifies how far away from the second aligned time to call the driver. + * + * This also compute 'to_set' which is the time we are trying to set, and has + * a zero in tv_nsecs, such that: + * to_set - set_delay_nsec == now +/- FUZZ + * + */ +#define TIME_SET_NSEC_FUZZ (1000 * 1000) +static inline bool rtc_tv_nsec_ok(struct rtc_device *rtc, + struct timespec64 *to_set, + const struct timespec64 *now) +{ + struct timespec64 delay = {.tv_sec = 0, + .tv_nsec = rtc->set_offset_nsec}; + + *to_set = timespec64_add(*now, delay); + + if (to_set->tv_nsec < TIME_SET_NSEC_FUZZ) { + to_set->tv_nsec = 0; + return true; + } + + if (to_set->tv_nsec > NSEC_PER_SEC - TIME_SET_NSEC_FUZZ) { + to_set->tv_sec++; + to_set->tv_nsec = 0; + return true; + } + return false; +} + /** * rtc_set_ntp_time - Save NTP synchronized time to the RTC * @now: Current time of day + * @target_nsec: pointer for desired now->tv_nsec value * * Replacement for the NTP platform function update_persistent_clock64 * that stores time for later retrieval by rtc_hctosys. @@ -18,30 +51,52 @@ * possible at all, and various other -errno for specific temporary failure * cases. * + * -EPROTO is returned if now.tv_nsec is not close enough to *target_nsec. + ( * If temporary failure is indicated the caller should try again 'soon' */ -int rtc_set_ntp_time(struct timespec64 now) +int rtc_set_ntp_time(struct timespec64 now, long *target_nsec) { struct rtc_device *rtc; struct rtc_time tm; + struct timespec64 to_set; int err = -ENODEV; - - if (now.tv_nsec < (NSEC_PER_SEC >> 1)) - rtc_time64_to_tm(now.tv_sec, &tm); - else - rtc_time64_to_tm(now.tv_sec + 1, &tm); + bool ok; rtc = rtc_class_open(CONFIG_RTC_SYSTOHC_DEVICE); - if (rtc) { - /* rtc_hctosys exclusively uses UTC, so we call set_time here, - * not set_mmss. */ - if (rtc->ops && - (rtc->ops->set_time || - rtc->ops->set_mmss64 || - rtc->ops->set_mmss)) - err = rtc_set_time(rtc, &tm); - rtc_class_close(rtc); + if (!rtc) + goto out_err; + + if (!rtc->ops || (!rtc->ops->set_time && !rtc->ops->set_mmss64 && + !rtc->ops->set_mmss)) + goto out_close; + + /* Compute the value of tv_nsec we require the caller to supply in + * now.tv_nsec. This is the value such that (now + + * set_offset_nsec).tv_nsec == 0. + */ + set_normalized_timespec64(&to_set, 0, -rtc->set_offset_nsec); + *target_nsec = to_set.tv_nsec; + + /* The ntp code must call this with the correct value in tv_nsec, if + * it does not we update target_nsec and return EPROTO to make the ntp + * code try again later. + */ + ok = rtc_tv_nsec_ok(rtc, &to_set, &now); + if (!ok) { + err = -EPROTO; + goto out_close; } + rtc_time64_to_tm(to_set.tv_sec, &tm); + + /* rtc_hctosys exclusively uses UTC, so we call set_time here, not + * set_mmss. + */ + err = rtc_set_time(rtc, &tm); + +out_close: + rtc_class_close(rtc); +out_err: return err; } diff --git a/include/linux/rtc.h b/include/linux/rtc.h index 0a0f0d14a5fba5..7d526550a700f8 100644 --- a/include/linux/rtc.h +++ b/include/linux/rtc.h @@ -137,6 +137,14 @@ struct rtc_device { /* Some hardware can't support UIE mode */ int uie_unsupported; + /* Number of nsec it takes to set the RTC clock. This influences when + * the set ops are called. An offset: + * - of 0.5 s will call RTC set for wall clock time 10.0 s at 9.5 s + * - of 1.5 s will call RTC set for wall clock time 10.0 s at 8.5 s + * - of -0.5 s will call RTC set for wall clock time 10.0 s at 10.5 s + */ + long set_offset_nsec; + bool registered; struct nvmem_config *nvmem_config; @@ -174,7 +182,7 @@ extern void devm_rtc_device_unregister(struct device *dev, extern int rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm); extern int rtc_set_time(struct rtc_device *rtc, struct rtc_time *tm); -extern int rtc_set_ntp_time(struct timespec64 now); +extern int rtc_set_ntp_time(struct timespec64 now, long *target_nsec); int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm); extern int rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alrm); diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c index edf19cc5314043..6d2f521dd97748 100644 --- a/kernel/time/ntp.c +++ b/kernel/time/ntp.c @@ -514,17 +514,19 @@ static DECLARE_DELAYED_WORK(sync_cmos_work, sync_cmos_clock); static void sync_cmos_clock(struct work_struct *work) { - struct timespec64 now; - struct timespec64 next; + struct timespec64 now, next, delta; int fail = 1; + long target_nsec = NSEC_PER_SEC / 2; /* - * If we have an externally synchronized Linux clock, then update - * CMOS clock accordingly every ~11 minutes. Set_rtc_mmss() has to be - * called as close as possible to 500 ms before the new second starts. - * This code is run on a timer. If the clock is set, that timer - * may not expire at the correct time. Thus, we adjust... - * We want the clock to be within a couple of ticks from the target. + * If we have an externally synchronized Linux clock, then update CMOS + * clock accordingly every ~11 minutes. Histiocally Set_rtc_mmss() + * has to be called as close as possible to 500 ms (target_nsec) + * before the new second starts, but new RTC drivers can select a + * different value. This code is run on a timer. If the clock is + * set, that timer may not expire at the correct time. Thus, we + * adjust... We want the clock to be within a couple of ticks from + * the target. */ if (!ntp_synced()) { /* @@ -547,25 +549,41 @@ static void sync_cmos_clock(struct work_struct *work) #ifdef CONFIG_RTC_SYSTOHC if (fail == -ENODEV) - fail = rtc_set_ntp_time(adjust); + fail = rtc_set_ntp_time(adjust, &target_nsec); #endif } - next.tv_nsec = (NSEC_PER_SEC / 2) - now.tv_nsec - (TICK_NSEC / 2); - if (next.tv_nsec <= 0) - next.tv_nsec += NSEC_PER_SEC; + do { + /* + * Compute the next wall clock time to try and set the + * clock + */ + next = now; + if (!fail || fail == -ENODEV) + timespec64_add_ns(&next, 659ULL * NSEC_PER_SEC); + else + /* Update failed, try again in about 10 seconds */ + timespec64_add_ns(&next, 10ULL * NSEC_PER_SEC); - if (!fail || fail == -ENODEV) - next.tv_sec = 659; - else - next.tv_sec = 0; + /* + * The next call to sync_cmos_clock needs to have have a wall + * clock tv_nsec value equal to target_nsec. + */ + if (next.tv_nsec > target_nsec) + next.tv_sec++; + next.tv_nsec = target_nsec; - if (next.tv_nsec >= NSEC_PER_SEC) { - next.tv_sec++; - next.tv_nsec -= NSEC_PER_SEC; - } - queue_delayed_work(system_power_efficient_wq, - &sync_cmos_work, timespec64_to_jiffies(&next)); + /* + * Convert to a relative delay. If time set took a really long + * time, or the wall clock was changed, this might become + * negative, so try again. + */ + getnstimeofday64(&now); + delta = timespec64_sub(next, now); + } while (delta.tv_sec <= 0); + + queue_delayed_work(system_power_efficient_wq, &sync_cmos_work, + timespec64_to_jiffies(&delta)); } void ntp_notify_cmos_timer(void)