Message ID | 1350906877-19410-1-git-send-email-balbi@ti.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On Mon, Oct 22, 2012 at 02:54:37PM +0300, Felipe Balbi wrote: > The scheduler imposes a requirement to sched_clock() > which is to stop the clock during suspend, if we don't > do that IRQ threads will be rescheduled in the future > which might cause transfers to timeout depending on > how driver is written. > > This became an issue on OMAP when we converted omap-i2c.c > to use threaded IRQs, it turned out that depending on how > much time we spent on suspend, the I2C IRQ thread would > end up being rescheduled so far in the future that I2C > transfers would timeout and, because omap_hsmmc depends > on an I2C-connected device to detect if an MMC card is > inserted in the slot, our rootfs would just vanish. > > arch/arm/kernel/sched_clock.c already had an optional > implementation (sched_clock_needs_suspend()) which would > handle scheduler's requirement properly, what this patch > does is simply to make that implementation non-optional. > > This has been tested with beagleboard XM (OMAP3630) and > pandaboard rev A3 (OMAP4430). Suspend to RAM is now working > after this patch. > > Signed-off-by: Felipe Balbi <balbi@ti.com> just adding more guys to Cc. Please add more if I have missed someone. > --- > arch/arm/include/asm/sched_clock.h | 2 -- > arch/arm/kernel/sched_clock.c | 18 ++++-------------- > 2 files changed, 4 insertions(+), 16 deletions(-) > > diff --git a/arch/arm/include/asm/sched_clock.h b/arch/arm/include/asm/sched_clock.h > index 05b8e82..e3f7572 100644 > --- a/arch/arm/include/asm/sched_clock.h > +++ b/arch/arm/include/asm/sched_clock.h > @@ -10,7 +10,5 @@ > > extern void sched_clock_postinit(void); > extern void setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate); > -extern void setup_sched_clock_needs_suspend(u32 (*read)(void), int bits, > - unsigned long rate); > > #endif > diff --git a/arch/arm/kernel/sched_clock.c b/arch/arm/kernel/sched_clock.c > index e21bac2..fc6692e 100644 > --- a/arch/arm/kernel/sched_clock.c > +++ b/arch/arm/kernel/sched_clock.c > @@ -107,13 +107,6 @@ static void sched_clock_poll(unsigned long wrap_ticks) > update_sched_clock(); > } > > -void __init setup_sched_clock_needs_suspend(u32 (*read)(void), int bits, > - unsigned long rate) > -{ > - setup_sched_clock(read, bits, rate); > - cd.needs_suspend = true; > -} > - > void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate) > { > unsigned long r, w; > @@ -189,18 +182,15 @@ void __init sched_clock_postinit(void) > static int sched_clock_suspend(void) > { > sched_clock_poll(sched_clock_timer.data); > - if (cd.needs_suspend) > - cd.suspended = true; > + cd.suspended = true; > return 0; > } > > static void sched_clock_resume(void) > { > - if (cd.needs_suspend) { > - cd.epoch_cyc = read_sched_clock(); > - cd.epoch_cyc_copy = cd.epoch_cyc; > - cd.suspended = false; > - } > + cd.epoch_cyc = read_sched_clock(); > + cd.epoch_cyc_copy = cd.epoch_cyc; > + cd.suspended = false; > } > > static struct syscore_ops sched_clock_ops = { > -- > 1.8.0.rc0 >
+Colin Cross, Barry Song also Felipe Balbi <balbi@ti.com> writes: > The scheduler imposes a requirement to sched_clock() > which is to stop the clock during suspend, if we don't > do that IRQ threads will be rescheduled in the future > which might cause transfers to timeout depending on > how driver is written. It's not just about IRQ threads, it's about RT throttling. IOW, not just IRQ threads will be postponed, but all RT tasks will be throttled temporarily as well. The changelog should also mention that this has an inconvenient side effect of stopping the printk times during suspend. Based on the original thread where this feature was discussed and introduced, some platforms wanted to opt out of this behavior[1], so the optional API was added. However, in light of RT throttling, this a correctness issue for process accounting, so I agree that this should be done for all platforms instead of providing an optional 'needs suspend' version of the API, even though it means printk times no longer reflect time spent suspended. After a discussion with peterz on this topic, it seems that x86 already ensures that sched_clock stops during suspend for similar reasons[2]. The question then is whether this is a fix that belongs in v3.7. Technically, it is not a regression, so I think this should probably be v3.8 material. If that's the decision, then the threaded IRQ support for the OMAP I2C driver needs to be reverted for v3.7 until this fix is merged. Kevin [1] http://marc.info/?l=linux-arm-kernel&m=134307004508708&w=2 [2] http://marc.info/?l=linux-arm-kernel&m=135065529907297&w=2
On Mon, Oct 22, 2012 at 7:05 PM, Kevin Hilman <khilman@deeprootsystems.com> wrote: > However, in light of RT throttling, this a correctness issue for process > accounting, so I agree that this should be done for all platforms > instead of providing an optional 'needs suspend' version of the API, > even though it means printk times no longer reflect time spent > suspended. Maybe we should get printk() to use the best clocksource instead. The reason AFAICT that printk() is using sched_clock() is that it's supposed to be fast. But now it seems that it's not going to return what printk() needs anymore. (Dragging Stultz & Gleixner into this...) Yours, Linus Walleij
On Tue, Oct 23, 2012 at 12:28:32AM +0200, Linus Walleij wrote: > On Mon, Oct 22, 2012 at 7:05 PM, Kevin Hilman > <khilman@deeprootsystems.com> wrote: > > > However, in light of RT throttling, this a correctness issue for process > > accounting, so I agree that this should be done for all platforms > > instead of providing an optional 'needs suspend' version of the API, > > even though it means printk times no longer reflect time spent > > suspended. > > Maybe we should get printk() to use the best clocksource > instead. > > The reason AFAICT that printk() is using sched_clock() is that > it's supposed to be fast. But now it seems that it's not going > to return what printk() needs anymore. No, printk() does not need this. You think it does, but it doesn't. What we have is a difference between ARM and x86, and this difference is breaking the scheduler. The fact that the printk timestamp increments while suspended is a bug. It doesn't on x86. There's three other functions in the kernel which do return updated time. I'm sure one of those can be used to printk() the time passed in suspend at resume time, which would give the information required.
On Mon, Oct 22, 2012 at 1:54 PM, Felipe Balbi <balbi@ti.com> wrote: > The scheduler imposes a requirement to sched_clock() > which is to stop the clock during suspend, if we don't > do that IRQ threads will be rescheduled in the future > which might cause transfers to timeout depending on > how driver is written. > > This became an issue on OMAP when we converted omap-i2c.c > to use threaded IRQs, it turned out that depending on how > much time we spent on suspend, the I2C IRQ thread would > end up being rescheduled so far in the future that I2C > transfers would timeout and, because omap_hsmmc depends > on an I2C-connected device to detect if an MMC card is > inserted in the slot, our rootfs would just vanish. > > arch/arm/kernel/sched_clock.c already had an optional > implementation (sched_clock_needs_suspend()) which would > handle scheduler's requirement properly, what this patch > does is simply to make that implementation non-optional. > > This has been tested with beagleboard XM (OMAP3630) and > pandaboard rev A3 (OMAP4430). Suspend to RAM is now working > after this patch. > > Signed-off-by: Felipe Balbi <balbi@ti.com> After Russell explains so I get it: Acked-by: Linus Walleij <linus.walleij@linaro.org> Yours, Linus Walleij
Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > On Tue, Oct 23, 2012 at 12:28:32AM +0200, Linus Walleij wrote: >> On Mon, Oct 22, 2012 at 7:05 PM, Kevin Hilman >> <khilman@deeprootsystems.com> wrote: >> >> > However, in light of RT throttling, this a correctness issue for process >> > accounting, so I agree that this should be done for all platforms >> > instead of providing an optional 'needs suspend' version of the API, >> > even though it means printk times no longer reflect time spent >> > suspended. >> >> Maybe we should get printk() to use the best clocksource >> instead. >> >> The reason AFAICT that printk() is using sched_clock() is that >> it's supposed to be fast. But now it seems that it's not going >> to return what printk() needs anymore. > > No, printk() does not need this. You think it does, but it doesn't. What > we have is a difference between ARM and x86, and this difference is breaking > the scheduler. > > The fact that the printk timestamp increments while suspended is a bug. It > doesn't on x86. Russell, I agree that it's a bug, but does it qualify as a something you're willing to take for v3.7-rc? For OMAP, we need to know if this will go for v3.7 or not because there's a regression in the OMAP I2C driver, and if this doesn't go in, we'll need to revert something in the I2C driver until it does. Thanks, Kevin
On Tue, Oct 23, 2012 at 07:17:33AM -0700, Kevin Hilman wrote: > Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > > No, printk() does not need this. You think it does, but it doesn't. What > > we have is a difference between ARM and x86, and this difference is breaking > > the scheduler. > > > > The fact that the printk timestamp increments while suspended is a bug. It > > doesn't on x86. > > Russell, I agree that it's a bug, but does it qualify as a something > you're willing to take for v3.7-rc? Definitely. Our current behaviour across suspend for the scheduler is wrong. This is one of the questions I had when I created the sched_clock stuff - but no one at the time could answer. So, now that we have our answer, let's get it fixed to conform.
Hi, On Tue, Oct 23, 2012 at 05:03:31PM +0100, Russell King - ARM Linux wrote: > On Tue, Oct 23, 2012 at 07:17:33AM -0700, Kevin Hilman wrote: > > Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > > > No, printk() does not need this. You think it does, but it doesn't. What > > > we have is a difference between ARM and x86, and this difference is breaking > > > the scheduler. > > > > > > The fact that the printk timestamp increments while suspended is a bug. It > > > doesn't on x86. > > > > Russell, I agree that it's a bug, but does it qualify as a something > > you're willing to take for v3.7-rc? > > Definitely. Our current behaviour across suspend for the scheduler is > wrong. This is one of the questions I had when I created the sched_clock > stuff - but no one at the time could answer. So, now that we have our > answer, let's get it fixed to conform. I have just pushed the patch the your patch system, hope I did it all right.
diff --git a/arch/arm/include/asm/sched_clock.h b/arch/arm/include/asm/sched_clock.h index 05b8e82..e3f7572 100644 --- a/arch/arm/include/asm/sched_clock.h +++ b/arch/arm/include/asm/sched_clock.h @@ -10,7 +10,5 @@ extern void sched_clock_postinit(void); extern void setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate); -extern void setup_sched_clock_needs_suspend(u32 (*read)(void), int bits, - unsigned long rate); #endif diff --git a/arch/arm/kernel/sched_clock.c b/arch/arm/kernel/sched_clock.c index e21bac2..fc6692e 100644 --- a/arch/arm/kernel/sched_clock.c +++ b/arch/arm/kernel/sched_clock.c @@ -107,13 +107,6 @@ static void sched_clock_poll(unsigned long wrap_ticks) update_sched_clock(); } -void __init setup_sched_clock_needs_suspend(u32 (*read)(void), int bits, - unsigned long rate) -{ - setup_sched_clock(read, bits, rate); - cd.needs_suspend = true; -} - void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate) { unsigned long r, w; @@ -189,18 +182,15 @@ void __init sched_clock_postinit(void) static int sched_clock_suspend(void) { sched_clock_poll(sched_clock_timer.data); - if (cd.needs_suspend) - cd.suspended = true; + cd.suspended = true; return 0; } static void sched_clock_resume(void) { - if (cd.needs_suspend) { - cd.epoch_cyc = read_sched_clock(); - cd.epoch_cyc_copy = cd.epoch_cyc; - cd.suspended = false; - } + cd.epoch_cyc = read_sched_clock(); + cd.epoch_cyc_copy = cd.epoch_cyc; + cd.suspended = false; } static struct syscore_ops sched_clock_ops = {
The scheduler imposes a requirement to sched_clock() which is to stop the clock during suspend, if we don't do that IRQ threads will be rescheduled in the future which might cause transfers to timeout depending on how driver is written. This became an issue on OMAP when we converted omap-i2c.c to use threaded IRQs, it turned out that depending on how much time we spent on suspend, the I2C IRQ thread would end up being rescheduled so far in the future that I2C transfers would timeout and, because omap_hsmmc depends on an I2C-connected device to detect if an MMC card is inserted in the slot, our rootfs would just vanish. arch/arm/kernel/sched_clock.c already had an optional implementation (sched_clock_needs_suspend()) which would handle scheduler's requirement properly, what this patch does is simply to make that implementation non-optional. This has been tested with beagleboard XM (OMAP3630) and pandaboard rev A3 (OMAP4430). Suspend to RAM is now working after this patch. Signed-off-by: Felipe Balbi <balbi@ti.com> --- arch/arm/include/asm/sched_clock.h | 2 -- arch/arm/kernel/sched_clock.c | 18 ++++-------------- 2 files changed, 4 insertions(+), 16 deletions(-)