Message ID | 1476995664-15668-1-git-send-email-dianders@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, 20 Oct 2016, Douglas Anderson wrote: > - An effort was made to look for users relying on the old behavior by > looking for usleep_range() in the same file as wake_up_process(). > No problems was found by this search, though it is conceivable that > someone could have put the sleep and wakeup in two different files. > - An effort was made to ask several upstream maintainers if they were > aware of people relying on wake_up_process() to wake up > usleep_range(). No maintainers were aware of that but they were aware > of many people relying on usleep_range() never returning before the > minimum. Thanks for going the extra mile ! > static void __sched do_usleep_range(unsigned long min, unsigned long max) > { > + ktime_t now, end; > ktime_t kmin; > u64 delta; > + int ret; > > - kmin = ktime_set(0, min * NSEC_PER_USEC); > + now = ktime_get(); > + end = ktime_add_us(now, min); So you calculate the absolute expiry time here. > delta = (u64)(max - min) * NSEC_PER_USEC; > - schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL); > + do { > + kmin = ktime_sub(end, now); > + ret = schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL); And then you schedule the thing relative. That does not make sense. > + > + /* > + * If schedule_hrtimeout_range() returns 0 then we actually > + * hit the timeout. If not then we need to re-calculate the > + * new timeout ourselves. > + */ > + if (ret == 0) > + break; > + > + now = ktime_get(); And this is broken because schedule_hrtimeout_range() returns with task state TASK_RUNNING so the next schedule_hrtimeout_range() will return -EINTR again because nothing sets the task state back to UNINTERRUPTIBLE. So instead of sleeping you busy loop. What you really want to do is something like this: void __sched usleep_range(unsigned long min, unsigned long max) { ktime_t expires = ktime_add_us(ktime_get(), min * NSEC_PER_USEC); ktime_t delta = ktime_set(0, (u64)(max - min) * NSEC_PER_USEC); for (;;) { __set_current_state(TASK_UNINTERRUPTIBLE); /* Do not return before the requested sleep time has elapsed */ if (!schedule_hrtimeout_range(&expires, delta, HRTIMER_MODE_ABS)) break; } } Thanks, tglx
Hi, On Thu, Oct 20, 2016 at 2:25 PM, Thomas Gleixner <tglx@linutronix.de> wrote: > On Thu, 20 Oct 2016, Douglas Anderson wrote: >> - An effort was made to look for users relying on the old behavior by >> looking for usleep_range() in the same file as wake_up_process(). >> No problems was found by this search, though it is conceivable that >> someone could have put the sleep and wakeup in two different files. >> - An effort was made to ask several upstream maintainers if they were >> aware of people relying on wake_up_process() to wake up >> usleep_range(). No maintainers were aware of that but they were aware >> of many people relying on usleep_range() never returning before the >> minimum. > > Thanks for going the extra mile ! > >> static void __sched do_usleep_range(unsigned long min, unsigned long max) >> { >> + ktime_t now, end; >> ktime_t kmin; >> u64 delta; >> + int ret; >> >> - kmin = ktime_set(0, min * NSEC_PER_USEC); >> + now = ktime_get(); >> + end = ktime_add_us(now, min); > > So you calculate the absolute expiry time here. > >> delta = (u64)(max - min) * NSEC_PER_USEC; >> - schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL); >> + do { >> + kmin = ktime_sub(end, now); >> + ret = schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL); > > And then you schedule the thing relative. That does not make sense. > >> + >> + /* >> + * If schedule_hrtimeout_range() returns 0 then we actually >> + * hit the timeout. If not then we need to re-calculate the >> + * new timeout ourselves. >> + */ >> + if (ret == 0) >> + break; >> + >> + now = ktime_get(); > > And this is broken because schedule_hrtimeout_range() returns with task > state TASK_RUNNING so the next schedule_hrtimeout_range() will return > -EINTR again because nothing sets the task state back to UNINTERRUPTIBLE. > So instead of sleeping you busy loop. That explains the mystery of why my delays were always so precise in the test. I was a bit baffled by the fact that I was ending up with a delay of almost exactly 50001 or 50002 in my test case. Thank you for catching! > What you really want to do is something like this: > > void __sched usleep_range(unsigned long min, unsigned long max) > { > ktime_t expires = ktime_add_us(ktime_get(), min * NSEC_PER_USEC); > ktime_t delta = ktime_set(0, (u64)(max - min) * NSEC_PER_USEC); > > for (;;) { > __set_current_state(TASK_UNINTERRUPTIBLE); > /* Do not return before the requested sleep time has elapsed */ > if (!schedule_hrtimeout_range(&expires, delta, HRTIMER_MODE_ABS)) > break; > } > } The above mostly works other than some small fixups. Thanks! ...other than small fixups, the one substantive change I'll make is to actually check the timeout in the loop above. If I use your code with my test, I find that, even though I'm waking up every millisecond I still end up not exiting the loop until the upper bound of the delay. Presumably this happens because: a_time_in_the_past = ktime_sub_us(ktime_get(), 10); schedule_hrtimeout_range(&a_time_in_the_past, delta, HRTIMER_MODE_ABS) ...delays "delta" nano seconds even though "a_time_in_the_past" is in the past. I presume that behavior is OK (let me know if it's not). In the case of usleep_range() if we're waking up anyway, it seems sensible to spend a few cycles to see if the minimum bound has already past. I'll plan to send a new version tomorrow unless I hear that you don't like the above. -Doug
On Thu, 20 Oct 2016, Doug Anderson wrote: > > And this is broken because schedule_hrtimeout_range() returns with task > > state TASK_RUNNING so the next schedule_hrtimeout_range() will return > > -EINTR again because nothing sets the task state back to UNINTERRUPTIBLE. > > So instead of sleeping you busy loop. > > That explains the mystery of why my delays were always so precise in > the test. I was a bit baffled by the fact that I was ending up with a > delay of almost exactly 50001 or 50002 in my test case. Well, if you see something as a mystery or you are baffled, then you certainly should figure out why and not just declare: Works for me, but I don't know why. That's stuff which comes back to hunt you sooner than later. > > What you really want to do is something like this: > > > > void __sched usleep_range(unsigned long min, unsigned long max) > > { > > ktime_t expires = ktime_add_us(ktime_get(), min * NSEC_PER_USEC); > > ktime_t delta = ktime_set(0, (u64)(max - min) * NSEC_PER_USEC); > > > > for (;;) { > > __set_current_state(TASK_UNINTERRUPTIBLE); > > /* Do not return before the requested sleep time has elapsed */ > > if (!schedule_hrtimeout_range(&expires, delta, HRTIMER_MODE_ABS)) > > break; > > } > > } > > The above mostly works other than some small fixups. Thanks! Yeah, delta is u64. Was too lazy to look it up. > ...other than small fixups, the one substantive change I'll make is to > actually check the timeout in the loop above. If I use your code with > my test, I find that, even though I'm waking up every millisecond I > still end up not exiting the loop until the upper bound of the delay. > > Presumably this happens because: > > a_time_in_the_past = ktime_sub_us(ktime_get(), 10); > schedule_hrtimeout_range(&a_time_in_the_past, delta, HRTIMER_MODE_ABS) > > ...delays "delta" nano seconds even though "a_time_in_the_past" is in > the past. I presume that behavior is OK (let me know if it's not). It does not delay delta nano seconds. It delays to (now - 10us) + deltans The core is free to use the full range for batching or extending idle sleep times and with a delta > 10us it's expected and correct behaviour. > In the case of usleep_range() if we're waking up anyway, it seems > sensible to spend a few cycles to see if the minimum bound has already > past. We really can do without that. You are over engineering for something which is pointless in 99.9% of the use cases. That stuff is what causes bloat. A few lines of code here and there, which sums up in the end. Thanks, tglx
diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 32bf6f75a8fe..219439efd56a 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1898,12 +1898,28 @@ EXPORT_SYMBOL(msleep_interruptible); static void __sched do_usleep_range(unsigned long min, unsigned long max) { + ktime_t now, end; ktime_t kmin; u64 delta; + int ret; - kmin = ktime_set(0, min * NSEC_PER_USEC); + now = ktime_get(); + end = ktime_add_us(now, min); delta = (u64)(max - min) * NSEC_PER_USEC; - schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL); + do { + kmin = ktime_sub(end, now); + ret = schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL); + + /* + * If schedule_hrtimeout_range() returns 0 then we actually + * hit the timeout. If not then we need to re-calculate the + * new timeout ourselves. + */ + if (ret == 0) + break; + + now = ktime_get(); + } while (ktime_before(now, end)); } /**