Message ID | 1441310314-8857-8-git-send-email-lina.iyer@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 09/03, Lina Iyer wrote: > @@ -137,7 +138,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle) > pr_err("CPU%u: failed to boot: %d\n", cpu, ret); > } > > - Please remove noise. > memset(&secondary_data, 0, sizeof(secondary_data)); > return ret; > } > @@ -271,6 +271,16 @@ void __cpu_die(unsigned int cpu) > void __ref cpu_die(void) > { > unsigned int cpu = smp_processor_id(); > + struct device *cpu_dev; > + > + /* > + * We dont need the CPU device anymore. s/dont/don't/ > + * Lets do this before IRQs are disabled to allow s/Lets/Let's/ > + * runtime PM to suspend the domain as well. > + */
On Thu, Sep 3, 2015 at 9:58 PM, Lina Iyer <lina.iyer@linaro.org> wrote: > Enable runtime PM for CPU devices. Do a runtime get of the CPU device > when the CPU is hotplugged in and a runtime put of the CPU device when s/in/on/ > the CPU is hotplugged off. When all the CPUs in a domain are hotplugged > off, the domain may also be powered off and cluster_pm_enter/exit() > notifications are be sent out. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Thu, Sep 03, 2015 at 01:58:34PM -0600, Lina Iyer wrote: > @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) > local_irq_enable(); > local_fiq_enable(); > > + /* We are running, enable runtime PM for the CPU. */ > + cpu_dev = get_cpu_device(cpu); > + if (cpu_dev) > + pm_runtime_get_sync(cpu_dev); > + Please no. This is fragile. pm_runtime_get_sync() can sleep, and this path is used when the system is fully up and running. Locks may be held, which cause this to sleep. However, we're calling it from a context where we can't sleep - which is the general rule for any part of system initialisation where we have not entered the idle thread yet (the idle thread is what we run when sleeping.) NAK on this.
On Fri, Sep 04, 2015 at 10:17:35AM +0100, Russell King - ARM Linux wrote: > On Thu, Sep 03, 2015 at 01:58:34PM -0600, Lina Iyer wrote: > > @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) > > local_irq_enable(); > > local_fiq_enable(); > > > > + /* We are running, enable runtime PM for the CPU. */ > > + cpu_dev = get_cpu_device(cpu); > > + if (cpu_dev) > > + pm_runtime_get_sync(cpu_dev); > > + > > Please no. This is fragile. > > pm_runtime_get_sync() can sleep, and this path is used when the system > is fully up and running. Locks may be held, which cause this to sleep. > However, we're calling it from a context where we can't sleep - which > is the general rule for any part of system initialisation where we > have not entered the idle thread yet (the idle thread is what we run > when sleeping.) > > NAK on this. There's another issue here. This is obviously wrong. If the CPU is inside the PM domain, then don't you need the PM domain powered up for the CPU to start executing instructions? If the CPU can run instructions with the PM domain that it's allegedly inside powered down, then the CPU is _not_ part of the PM domain, and this whole thing is total crap. Sorry, I'm really not impressed by the "design" here, it seems to be totally screwed.
On Fri, Sep 04 2015 at 03:27 -0600, Russell King - ARM Linux wrote: >On Fri, Sep 04, 2015 at 10:17:35AM +0100, Russell King - ARM Linux wrote: >> On Thu, Sep 03, 2015 at 01:58:34PM -0600, Lina Iyer wrote: >> > @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) >> > local_irq_enable(); >> > local_fiq_enable(); >> > >> > + /* We are running, enable runtime PM for the CPU. */ >> > + cpu_dev = get_cpu_device(cpu); >> > + if (cpu_dev) >> > + pm_runtime_get_sync(cpu_dev); >> > + >> >> Please no. This is fragile. >> >> pm_runtime_get_sync() can sleep, and this path is used when the system >> is fully up and running. Locks may be held, which cause this to sleep. >> However, we're calling it from a context where we can't sleep - which >> is the general rule for any part of system initialisation where we >> have not entered the idle thread yet (the idle thread is what we run >> when sleeping.) >> More explanation below. Another patch (3/7) in this series defines CPU devices as IRQ safe and therefore the dev->power.lock would be spinlock'd before calling the rest of the PM runtime code and the domain. The CPU PM domain would also be an IRQ safe domain (patch 2/7 makes the genpd framework usable in IRQ safe context). >> NAK on this. > >There's another issue here. This is obviously wrong. If the CPU is >inside the PM domain, then don't you need the PM domain powered up for >the CPU to start executing instructions? > This would not be the case for CPU PM domains. The CPU PM domains AFAIK, are triggered by the same wake up interrupt that brings the CPU out of its power downs state. The domain hardware has to be awake before the CPU executes its first instruction. Thefore the hw power up is a logical OR of the wakeup signals to any of the cores. Similarly, the domain hardware would be the last to power down after all the CPUs. Generally tied to the execution of WFI command by the ARM CPU core. The domain power down signal is a logical AND of all the WFI signals of all the cores in the domain. >If the CPU can run instructions with the PM domain that it's allegedly >inside powered down, then the CPU is _not_ part of the PM domain, and >this whole thing is total crap. > CPU PM domains have to be synchronized in hardware and all these can only happen when the SoC provides a domain that can power on before any of the CPU powers up and power down only after the all CPUs have powered down. What s/w generally can do in the CPU domain power on/off callbacks from genpd is to configure which state the domain should be in, when all the CPUs have powered down. Many SoCs, have PM domains that could be in retention in anticipation of an early wakeup or could be completely cache flushed and powered down, when suspending or sleeping for a longer duration. The power off callbacks for the PM domain does the configuration and it doesnt take effect until all the CPUs power down. While powering on the domain in the genpd power on callback, all we do is to ensure that the domain is reconfigured to remain active (ignore all WFI instruction from the CPU, which are not tied to CPU power down. Dont want the domain powering off when all CPUs in the domain are just clock gated.) until the domain is configured to its idle state by the last CPU in Linux. >Sorry, I'm really not impressed by the "design" here, it seems to be >totally screwed. > Thanks, Lina
On Thu, Sep 03 2015 at 21:59 -0600, Stephen Boyd wrote: >On 09/03, Lina Iyer wrote: >> @@ -137,7 +138,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle) >> pr_err("CPU%u: failed to boot: %d\n", cpu, ret); >> } >> >> - > >Please remove noise. > Sorry, I forgot to address it earlier. Thought I did. Apologize. >> memset(&secondary_data, 0, sizeof(secondary_data)); >> return ret; >> } >> @@ -271,6 +271,16 @@ void __cpu_die(unsigned int cpu) >> void __ref cpu_die(void) >> { >> unsigned int cpu = smp_processor_id(); >> + struct device *cpu_dev; >> + >> + /* >> + * We dont need the CPU device anymore. > >s/dont/don't/ > >> + * Lets do this before IRQs are disabled to allow > >s/Lets/Let's/ > >> + * runtime PM to suspend the domain as well. >> + */ > >-- >Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, >a Linux Foundation Collaborative Project
On Fri, Sep 04, 2015 at 09:12:02AM -0600, Lina Iyer wrote: > On Fri, Sep 04 2015 at 03:27 -0600, Russell King - ARM Linux wrote: > >On Fri, Sep 04, 2015 at 10:17:35AM +0100, Russell King - ARM Linux wrote: > >>On Thu, Sep 03, 2015 at 01:58:34PM -0600, Lina Iyer wrote: > >>> @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) > >>> local_irq_enable(); > >>> local_fiq_enable(); > >>> > >>> + /* We are running, enable runtime PM for the CPU. */ > >>> + cpu_dev = get_cpu_device(cpu); > >>> + if (cpu_dev) > >>> + pm_runtime_get_sync(cpu_dev); > >>> + > >> > >>Please no. This is fragile. > >> > >>pm_runtime_get_sync() can sleep, and this path is used when the system > >>is fully up and running. Locks may be held, which cause this to sleep. > >>However, we're calling it from a context where we can't sleep - which > >>is the general rule for any part of system initialisation where we > >>have not entered the idle thread yet (the idle thread is what we run > >>when sleeping.) > >> > More explanation below. > > Another patch (3/7) in this series defines CPU devices as IRQ safe and > therefore the dev->power.lock would be spinlock'd before calling the > rest of the PM runtime code and the domain. The CPU PM domain would also > be an IRQ safe domain (patch 2/7 makes the genpd framework usable in IRQ > safe context). > > >>NAK on this. > > > >There's another issue here. This is obviously wrong. If the CPU is > >inside the PM domain, then don't you need the PM domain powered up for > >the CPU to start executing instructions? > > > This would not be the case for CPU PM domains. The CPU PM domains AFAIK, > are triggered by the same wake up interrupt that brings the CPU out of > its power downs state. The domain hardware has to be awake before the > CPU executes its first instruction. Thefore the hw power up is a logical > OR of the wakeup signals to any of the cores. You're taking the behaviour of the hardware you have in front of you and claiming that it's true everywhere, and shoving that into generic code. I know, for example on OMAP, you have to power up the CPU first before you can "wake" it. I wouldn't be surprised if other SoCs are like that: where they require the CPU core to be powered and held in reset, before releasing the reset to then allow them to start executing the secondary core bringup. Relying on hardware to do this sounds really fragile and bad design to me. If you want to persue your current design, don't make it generic code, because it's got no right to be generic with assumptions like that.
On Fri, Sep 04 2015 at 10:23 -0600, Russell King - ARM Linux wrote: >On Fri, Sep 04, 2015 at 09:12:02AM -0600, Lina Iyer wrote: >> On Fri, Sep 04 2015 at 03:27 -0600, Russell King - ARM Linux wrote: >> >On Fri, Sep 04, 2015 at 10:17:35AM +0100, Russell King - ARM Linux wrote: >> >>On Thu, Sep 03, 2015 at 01:58:34PM -0600, Lina Iyer wrote: >> >>> @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) >> >>> local_irq_enable(); >> >>> local_fiq_enable(); >> >>> >> >>> + /* We are running, enable runtime PM for the CPU. */ >> >>> + cpu_dev = get_cpu_device(cpu); >> >>> + if (cpu_dev) >> >>> + pm_runtime_get_sync(cpu_dev); >> >>> + >> >> >> >>Please no. This is fragile. >> >> >> >>pm_runtime_get_sync() can sleep, and this path is used when the system >> >>is fully up and running. Locks may be held, which cause this to sleep. >> >>However, we're calling it from a context where we can't sleep - which >> >>is the general rule for any part of system initialisation where we >> >>have not entered the idle thread yet (the idle thread is what we run >> >>when sleeping.) >> >> >> More explanation below. >> >> Another patch (3/7) in this series defines CPU devices as IRQ safe and >> therefore the dev->power.lock would be spinlock'd before calling the >> rest of the PM runtime code and the domain. The CPU PM domain would also >> be an IRQ safe domain (patch 2/7 makes the genpd framework usable in IRQ >> safe context). >> >> >>NAK on this. >> > >> >There's another issue here. This is obviously wrong. If the CPU is >> >inside the PM domain, then don't you need the PM domain powered up for >> >the CPU to start executing instructions? >> > >> This would not be the case for CPU PM domains. The CPU PM domains AFAIK, >> are triggered by the same wake up interrupt that brings the CPU out of >> its power downs state. The domain hardware has to be awake before the >> CPU executes its first instruction. Thefore the hw power up is a logical >> OR of the wakeup signals to any of the cores. > >You're taking the behaviour of the hardware you have in front of you >and claiming that it's true everywhere, and shoving that into generic >code. > >I know, for example on OMAP, you have to power up the CPU first before >you can "wake" it. > >I wouldn't be surprised if other SoCs are like that: where they require >the CPU core to be powered and held in reset, before releasing the reset >to then allow them to start executing the secondary core bringup. > It would still require that the CPU's domain be powered on in the hw, before the CPU can run Linux. >Relying on hardware to do this sounds really fragile and bad design to >me. > >If you want to persue your current design, don't make it generic code, >because it's got no right to be generic with assumptions like that. > Hmm, okay. I can look at alternatives like hotplug notifiers. -- Lina
On Fri, Sep 04, 2015 at 11:02:11AM -0600, Lina Iyer wrote: > On Fri, Sep 04 2015 at 10:23 -0600, Russell King - ARM Linux wrote: > >You're taking the behaviour of the hardware you have in front of you > >and claiming that it's true everywhere, and shoving that into generic > >code. > > > >I know, for example on OMAP, you have to power up the CPU first before > >you can "wake" it. > > > >I wouldn't be surprised if other SoCs are like that: where they require > >the CPU core to be powered and held in reset, before releasing the reset > >to then allow them to start executing the secondary core bringup. > > > It would still require that the CPU's domain be powered on in the hw, > before the CPU can run Linux. > > >Relying on hardware to do this sounds really fragile and bad design to > >me. > > > >If you want to persue your current design, don't make it generic code, > >because it's got no right to be generic with assumptions like that. > > > Hmm, okay. I can look at alternatives like hotplug notifiers. How about putting the pm resume in path of the _requesting_ CPU? IOW, in __cpu_up().
Hi Lina, On 09/04/2015 06:12 PM, Lina Iyer wrote: > On Fri, Sep 04 2015 at 03:27 -0600, Russell King - ARM Linux wrote: >> On Fri, Sep 04, 2015 at 10:17:35AM +0100, Russell King - ARM Linux wrote: >>> On Thu, Sep 03, 2015 at 01:58:34PM -0600, Lina Iyer wrote: >>> > @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) >>> > local_irq_enable(); >>> > local_fiq_enable(); >>> > >>> > + /* We are running, enable runtime PM for the CPU. */ >>> > + cpu_dev = get_cpu_device(cpu); >>> > + if (cpu_dev) >>> > + pm_runtime_get_sync(cpu_dev); >>> > + >>> >>> Please no. This is fragile. >>> >>> pm_runtime_get_sync() can sleep, and this path is used when the system >>> is fully up and running. Locks may be held, which cause this to sleep. >>> However, we're calling it from a context where we can't sleep - which >>> is the general rule for any part of system initialisation where we >>> have not entered the idle thread yet (the idle thread is what we run >>> when sleeping.) >>> > More explanation below. > > Another patch (3/7) in this series defines CPU devices as IRQ safe and > therefore the dev->power.lock would be spinlock'd before calling the > rest of the PM runtime code and the domain. The CPU PM domain would also > be an IRQ safe domain (patch 2/7 makes the genpd framework usable in IRQ > safe context). There is one "small" problem with such approach :( - It's incompatible with -RT kernel, because PM runtime can't be used in atomic context on -RT. As result, CPU's hotplug will be broken (CPUIdle will be broken also, but CPU hotplug is more important at least for me:). Also, as for me, PM runtime + genpd is a little bit heavy-weight things to be used for CPUIdle and It could affect on states with small wake-up latencies. [...]
On Fri, 4 Sep 2015, Grygorii Strashko wrote: > There is one "small" problem with such approach :( > > - It's incompatible with -RT kernel, because PM runtime can't be used > in atomic context on -RT. Can you explain this more fully? Why can't runtime PM be used in atomic context in the -rt kernels? Alan Stern
On 09/04/2015 09:45 PM, Alan Stern wrote: > On Fri, 4 Sep 2015, Grygorii Strashko wrote: > >> There is one "small" problem with such approach :( >> >> - It's incompatible with -RT kernel, because PM runtime can't be used >> in atomic context on -RT. > > Can you explain this more fully? Why can't runtime PM be used in > atomic context in the -rt kernels? > See: http://lwn.net/Articles/146861/ https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F spinlock_t Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) do -not- disable hardware interrupts. Priority inheritance is used to prevent priority inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. As result, have to do things like: https://lkml.org/lkml/2015/8/18/161 https://lkml.org/lkml/2015/8/18/162 Sorry for brief reply - Friday/Sat night :)
On Sat, 5 Sep 2015, Grygorii Strashko wrote: > On 09/04/2015 09:45 PM, Alan Stern wrote: > > On Fri, 4 Sep 2015, Grygorii Strashko wrote: > > > >> There is one "small" problem with such approach :( > >> > >> - It's incompatible with -RT kernel, because PM runtime can't be used > >> in atomic context on -RT. > > > > Can you explain this more fully? Why can't runtime PM be used in > > atomic context in the -rt kernels? > > > > See: > http://lwn.net/Articles/146861/ > https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F > > spinlock_t > Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) > do -not- disable hardware interrupts. Priority inheritance is used to prevent priority > inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. > > As result, have to do things like: > https://lkml.org/lkml/2015/8/18/161 > https://lkml.org/lkml/2015/8/18/162 > > Sorry for brief reply - Friday/Sat night :) I see. Although we normally think of interrupt contexts as being atomic, in an -rt kernel this isn't true any more because things like spin_lock_irq don't actually disable interrupts. Therefore it would be correct to say that in -rt kernels, runtime PM can be used in interrupt context (if the device is marked as irq-safe), but not in atomic context. Right? Alan Stern
On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: > On Sat, 5 Sep 2015, Grygorii Strashko wrote: > > > On 09/04/2015 09:45 PM, Alan Stern wrote: > > > On Fri, 4 Sep 2015, Grygorii Strashko wrote: > > > > > >> There is one "small" problem with such approach :( > > >> > > >> - It's incompatible with -RT kernel, because PM runtime can't be used > > >> in atomic context on -RT. > > > > > > Can you explain this more fully? Why can't runtime PM be used in > > > atomic context in the -rt kernels? > > > > > > > See: > > http://lwn.net/Articles/146861/ > > https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F > > > > spinlock_t > > Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) > > do -not- disable hardware interrupts. Priority inheritance is used to prevent priority > > inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. > > > > As result, have to do things like: > > https://lkml.org/lkml/2015/8/18/161 > > https://lkml.org/lkml/2015/8/18/162 > > > > Sorry for brief reply - Friday/Sat night :) > > I see. Although we normally think of interrupt contexts as being > atomic, in an -rt kernel this isn't true any more because things like > spin_lock_irq don't actually disable interrupts. > > Therefore it would be correct to say that in -rt kernels, runtime PM > can be used in interrupt context (if the device is marked as irq-safe), > but not in atomic context. Right? Right. Whatever is suitable for interrupt context in the mainline, will be suitable for that in -rt kernels too. However, what is suitable for the idle loop in the mainline, may not be suitable for that in -rt kernels. That's why raw_spin_lock/unlock() need to be used within the idle loop. Thanks, Rafael
On 09/07/2015 04:04 PM, Rafael J. Wysocki wrote: > On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: >> On Sat, 5 Sep 2015, Grygorii Strashko wrote: >> >>> On 09/04/2015 09:45 PM, Alan Stern wrote: >>>> On Fri, 4 Sep 2015, Grygorii Strashko wrote: >>>> >>>>> There is one "small" problem with such approach :( >>>>> >>>>> - It's incompatible with -RT kernel, because PM runtime can't be used >>>>> in atomic context on -RT. >>>> >>>> Can you explain this more fully? Why can't runtime PM be used in >>>> atomic context in the -rt kernels? >>>> >>> >>> See: >>> http://lwn.net/Articles/146861/ >>> https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F >>> >>> spinlock_t >>> Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) >>> do -not- disable hardware interrupts. Priority inheritance is used to prevent priority >>> inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. >>> >>> As result, have to do things like: >>> https://lkml.org/lkml/2015/8/18/161 >>> https://lkml.org/lkml/2015/8/18/162 >>> >>> Sorry for brief reply - Friday/Sat night :) >> >> I see. Although we normally think of interrupt contexts as being >> atomic, in an -rt kernel this isn't true any more because things like >> spin_lock_irq don't actually disable interrupts. >> >> Therefore it would be correct to say that in -rt kernels, runtime PM >> can be used in interrupt context (if the device is marked as irq-safe), >> but not in atomic context. Right? > > Right. > > Whatever is suitable for interrupt context in the mainline, will be suitable > for that in -rt kernels too. Not exactly true :(, since spinlock is converted to [rt_] mutex. Usually, this difference can't be seen because on -RT kernel all or mostly all HW IRQ handlers will be forced to be threaded. For the cases, where such automatic conversion is not working, (like chained irq handlers or HW-handler+Threaded handler) the code has to be carefully patched to work properly as for non-RT as for -RT. Also, this triggers some -RT incompatibility issues, like with PM runtime or CLK framework (http://www.spinics.net/lists/linux-rt-users/msg13653.html). > However, what is suitable for the idle loop > in the mainline, may not be suitable for that in -rt kernels. > > That's why raw_spin_lock/unlock() need to be used within the idle loop. Indeed. CPU hotplug/CPUIdle is guarded by local_irq_disable()/local_irq_enable(). (example of CPU hotplug RT-issue http://www.spinics.net/lists/arm-kernel/msg438963.html). I don't want to be the final authority here as my experience with -RT is short. But, I want to point out on potential issues based on what I've already discovered and tried to fix. Thanks & regards, -grygorii
On Monday, September 07, 2015 04:37:44 PM Grygorii Strashko wrote: > On 09/07/2015 04:04 PM, Rafael J. Wysocki wrote: > > On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: > >> On Sat, 5 Sep 2015, Grygorii Strashko wrote: > >> > >>> On 09/04/2015 09:45 PM, Alan Stern wrote: > >>>> On Fri, 4 Sep 2015, Grygorii Strashko wrote: > >>>> > >>>>> There is one "small" problem with such approach :( > >>>>> > >>>>> - It's incompatible with -RT kernel, because PM runtime can't be used > >>>>> in atomic context on -RT. > >>>> > >>>> Can you explain this more fully? Why can't runtime PM be used in > >>>> atomic context in the -rt kernels? > >>>> > >>> > >>> See: > >>> http://lwn.net/Articles/146861/ > >>> https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F > >>> > >>> spinlock_t > >>> Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) > >>> do -not- disable hardware interrupts. Priority inheritance is used to prevent priority > >>> inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. > >>> > >>> As result, have to do things like: > >>> https://lkml.org/lkml/2015/8/18/161 > >>> https://lkml.org/lkml/2015/8/18/162 > >>> > >>> Sorry for brief reply - Friday/Sat night :) > >> > >> I see. Although we normally think of interrupt contexts as being > >> atomic, in an -rt kernel this isn't true any more because things like > >> spin_lock_irq don't actually disable interrupts. > >> > >> Therefore it would be correct to say that in -rt kernels, runtime PM > >> can be used in interrupt context (if the device is marked as irq-safe), > >> but not in atomic context. Right? > > > > Right. > > > > Whatever is suitable for interrupt context in the mainline, will be suitable > > for that in -rt kernels too. > > Not exactly true :(, since spinlock is converted to [rt_] mutex. > Usually, this difference can't be seen because on -RT kernel all or > mostly all HW IRQ handlers will be forced to be threaded. Exactly. And that's what I'm talking about. > For the cases, where such automatic conversion is not working, > (like chained irq handlers or HW-handler+Threaded handler) the code > has to be carefully patched to work properly as for non-RT as for -RT. Right. > Also, this triggers some -RT incompatibility issues, like with PM runtime or That I'm not sure about. Why would runtime PM cause problems with -RT (apart from attempts to use it from the idle loop, but that's not happening in the mainline anyway)? > CLK framework (http://www.spinics.net/lists/linux-rt-users/msg13653.html). > > > However, what is suitable for the idle loop > > in the mainline, may not be suitable for that in -rt kernels. > > > > That's why raw_spin_lock/unlock() need to be used within the idle loop. > > Indeed. CPU hotplug/CPUIdle is guarded by local_irq_disable()/local_irq_enable(). > (example of CPU hotplug RT-issue http://www.spinics.net/lists/arm-kernel/msg438963.html). > > I don't want to be the final authority here as my experience with -RT is short. > But, I want to point out on potential issues based on what I've already discovered > and tried to fix. OK Thanks, Rafael
On 09/07/2015 11:42 PM, Rafael J. Wysocki wrote: > On Monday, September 07, 2015 04:37:44 PM Grygorii Strashko wrote: >> On 09/07/2015 04:04 PM, Rafael J. Wysocki wrote: >>> On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: >>>> On Sat, 5 Sep 2015, Grygorii Strashko wrote: >>>> >>>>> On 09/04/2015 09:45 PM, Alan Stern wrote: >>>>>> On Fri, 4 Sep 2015, Grygorii Strashko wrote: >>>>>> >>>>>>> There is one "small" problem with such approach :( >>>>>>> >>>>>>> - It's incompatible with -RT kernel, because PM runtime can't be used >>>>>>> in atomic context on -RT. >>>>>> >>>>>> Can you explain this more fully? Why can't runtime PM be used in >>>>>> atomic context in the -rt kernels? >>>>>> >>>>> >>>>> See: >>>>> http://lwn.net/Articles/146861/ >>>>> https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F >>>>> >>>>> spinlock_t >>>>> Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) >>>>> do -not- disable hardware interrupts. Priority inheritance is used to prevent priority >>>>> inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. >>>>> >>>>> As result, have to do things like: >>>>> https://lkml.org/lkml/2015/8/18/161 >>>>> https://lkml.org/lkml/2015/8/18/162 >>>>> >>>>> Sorry for brief reply - Friday/Sat night :) >>>> >>>> I see. Although we normally think of interrupt contexts as being >>>> atomic, in an -rt kernel this isn't true any more because things like >>>> spin_lock_irq don't actually disable interrupts. >>>> >>>> Therefore it would be correct to say that in -rt kernels, runtime PM >>>> can be used in interrupt context (if the device is marked as irq-safe), >>>> but not in atomic context. Right? >>> >>> Right. >>> >>> Whatever is suitable for interrupt context in the mainline, will be suitable >>> for that in -rt kernels too. >> >> Not exactly true :(, since spinlock is converted to [rt_] mutex. >> Usually, this difference can't be seen because on -RT kernel all or >> mostly all HW IRQ handlers will be forced to be threaded. > > Exactly. And that's what I'm talking about. > >> For the cases, where such automatic conversion is not working, >> (like chained irq handlers or HW-handler+Threaded handler) the code >> has to be carefully patched to work properly as for non-RT as for -RT. > > Right. > >> Also, this triggers some -RT incompatibility issues, like with PM runtime or > > That I'm not sure about. Why would runtime PM cause problems with -RT (apart > from attempts to use it from the idle loop, but that's not happening in the > mainline anyway)? I have to be more specific - sorry. "irq_safe" mode of PM runtime is incompatible with -RT. Here is an example: - HW IRQ handler in TI OMAP GPIO driver is implemented as chained IRQ handler and contains pm_runtime_get_sync()/pm_runtime_put(). This works properly with vanilla kernel because OMAP GPIO devices marked as irq_safe. Chained IRQ handlers can't be forced threaded and PM runtime APIs trigger "sleeping function called from invalid context" issues there, so corresponding code has to be reworked. ...
Grygorii Strashko <grygorii.strashko@ti.com> writes: > On 09/07/2015 11:42 PM, Rafael J. Wysocki wrote: >> On Monday, September 07, 2015 04:37:44 PM Grygorii Strashko wrote: >>> On 09/07/2015 04:04 PM, Rafael J. Wysocki wrote: >>>> On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: >>>>> On Sat, 5 Sep 2015, Grygorii Strashko wrote: >>>>> >>>>>> On 09/04/2015 09:45 PM, Alan Stern wrote: >>>>>>> On Fri, 4 Sep 2015, Grygorii Strashko wrote: >>>>>>> >>>>>>>> There is one "small" problem with such approach :( >>>>>>>> >>>>>>>> - It's incompatible with -RT kernel, because PM runtime can't be used >>>>>>>> in atomic context on -RT. >>>>>>> >>>>>>> Can you explain this more fully? Why can't runtime PM be used in >>>>>>> atomic context in the -rt kernels? >>>>>>> >>>>>> >>>>>> See: >>>>>> http://lwn.net/Articles/146861/ >>>>>> https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F >>>>>> >>>>>> spinlock_t >>>>>> Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) >>>>>> do -not- disable hardware interrupts. Priority inheritance is used to prevent priority >>>>>> inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. >>>>>> >>>>>> As result, have to do things like: >>>>>> https://lkml.org/lkml/2015/8/18/161 >>>>>> https://lkml.org/lkml/2015/8/18/162 >>>>>> >>>>>> Sorry for brief reply - Friday/Sat night :) >>>>> >>>>> I see. Although we normally think of interrupt contexts as being >>>>> atomic, in an -rt kernel this isn't true any more because things like >>>>> spin_lock_irq don't actually disable interrupts. >>>>> >>>>> Therefore it would be correct to say that in -rt kernels, runtime PM >>>>> can be used in interrupt context (if the device is marked as irq-safe), >>>>> but not in atomic context. Right? >>>> >>>> Right. >>>> >>>> Whatever is suitable for interrupt context in the mainline, will be suitable >>>> for that in -rt kernels too. >>> >>> Not exactly true :(, since spinlock is converted to [rt_] mutex. >>> Usually, this difference can't be seen because on -RT kernel all or >>> mostly all HW IRQ handlers will be forced to be threaded. >> >> Exactly. And that's what I'm talking about. >> >>> For the cases, where such automatic conversion is not working, >>> (like chained irq handlers or HW-handler+Threaded handler) the code >>> has to be carefully patched to work properly as for non-RT as for -RT. >> >> Right. >> >>> Also, this triggers some -RT incompatibility issues, like with PM runtime or >> >> That I'm not sure about. Why would runtime PM cause problems with -RT (apart >> from attempts to use it from the idle loop, but that's not happening in the >> mainline anyway)? > > > I have to be more specific - sorry. "irq_safe" mode of PM runtime is incompatible with -RT. > > Here is an example: > - HW IRQ handler in TI OMAP GPIO driver is implemented as chained IRQ handler and > contains pm_runtime_get_sync()/pm_runtime_put(). This works properly with vanilla kernel > because OMAP GPIO devices marked as irq_safe. > Chained IRQ handlers can't be forced threaded and PM runtime APIs trigger > "sleeping function called from invalid context" issues there, so corresponding code has to be reworked. Isn't that why those are being converted to raw_*[1] ? Kevin [1] http://marc.info/?l=linux-omap&m=143749603401221&w=2
On 09/09/2015 01:03 AM, Kevin Hilman wrote: > Grygorii Strashko <grygorii.strashko@ti.com> writes: > >> On 09/07/2015 11:42 PM, Rafael J. Wysocki wrote: >>> On Monday, September 07, 2015 04:37:44 PM Grygorii Strashko wrote: >>>> On 09/07/2015 04:04 PM, Rafael J. Wysocki wrote: >>>>> On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: >>>>>> On Sat, 5 Sep 2015, Grygorii Strashko wrote: >>>>>> >>>>>>> On 09/04/2015 09:45 PM, Alan Stern wrote: >>>>>>>> On Fri, 4 Sep 2015, Grygorii Strashko wrote: >>>>>>>> >>>>>>>>> There is one "small" problem with such approach :( >>>>>>>>> >>>>>>>>> - It's incompatible with -RT kernel, because PM runtime can't be used >>>>>>>>> in atomic context on -RT. >>>>>>>> >>>>>>>> Can you explain this more fully? Why can't runtime PM be used in >>>>>>>> atomic context in the -rt kernels? >>>>>>>> >>>>>>> >>>>>>> See: >>>>>>> http://lwn.net/Articles/146861/ >>>>>>> https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F >>>>>>> >>>>>>> spinlock_t >>>>>>> Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) >>>>>>> do -not- disable hardware interrupts. Priority inheritance is used to prevent priority >>>>>>> inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. >>>>>>> >>>>>>> As result, have to do things like: >>>>>>> https://lkml.org/lkml/2015/8/18/161 >>>>>>> https://lkml.org/lkml/2015/8/18/162 >>>>>>> >>>>>>> Sorry for brief reply - Friday/Sat night :) >>>>>> >>>>>> I see. Although we normally think of interrupt contexts as being >>>>>> atomic, in an -rt kernel this isn't true any more because things like >>>>>> spin_lock_irq don't actually disable interrupts. >>>>>> >>>>>> Therefore it would be correct to say that in -rt kernels, runtime PM >>>>>> can be used in interrupt context (if the device is marked as irq-safe), >>>>>> but not in atomic context. Right? >>>>> >>>>> Right. >>>>> >>>>> Whatever is suitable for interrupt context in the mainline, will be suitable >>>>> for that in -rt kernels too. >>>> >>>> Not exactly true :(, since spinlock is converted to [rt_] mutex. >>>> Usually, this difference can't be seen because on -RT kernel all or >>>> mostly all HW IRQ handlers will be forced to be threaded. >>> >>> Exactly. And that's what I'm talking about. >>> >>>> For the cases, where such automatic conversion is not working, >>>> (like chained irq handlers or HW-handler+Threaded handler) the code >>>> has to be carefully patched to work properly as for non-RT as for -RT. >>> >>> Right. >>> >>>> Also, this triggers some -RT incompatibility issues, like with PM runtime or >>> >>> That I'm not sure about. Why would runtime PM cause problems with -RT (apart >>> from attempts to use it from the idle loop, but that's not happening in the >>> mainline anyway)? >> >> >> I have to be more specific - sorry. "irq_safe" mode of PM runtime is incompatible with -RT. >> >> Here is an example: >> - HW IRQ handler in TI OMAP GPIO driver is implemented as chained IRQ handler and >> contains pm_runtime_get_sync()/pm_runtime_put(). This works properly with vanilla kernel >> because OMAP GPIO devices marked as irq_safe. >> Chained IRQ handlers can't be forced threaded and PM runtime APIs trigger >> "sleeping function called from invalid context" issues there, so corresponding code has to be reworked. > > Isn't that why those are being converted to raw_*[1] ? > > Kevin > > [1] http://marc.info/?l=linux-omap&m=143749603401221&w=2 > That's way I've tried to convert those to generic IRQ handler [2] :), so on -RT it will be forced threaded. raw_* is different kind of problem in gpio-omap - IRQ controllers have to use raw_* inside irq_chip callbacks, because IRQ core guards those callbacks using raw_* locks. .irq_bus_lock()/irq_bus_sync_unlock() callbacks can be used [3] for any kind of operations which require non-atomic context. [2] https://lkml.org/lkml/2015/8/18/162 gpio: omap: convert to use generic irq handler [3] https://lkml.org/lkml/2015/8/18/161 gpio: omap: move pm runtime in irq_chip.irq_bus_lock/sync_unlock
On Thu, Sep 10 2015 at 04:01 -0700, Grygorii Strashko wrote: >On 09/09/2015 01:03 AM, Kevin Hilman wrote: >> Grygorii Strashko <grygorii.strashko@ti.com> writes: >> >>> On 09/07/2015 11:42 PM, Rafael J. Wysocki wrote: >>>> On Monday, September 07, 2015 04:37:44 PM Grygorii Strashko wrote: >>>>> On 09/07/2015 04:04 PM, Rafael J. Wysocki wrote: >>>>>> On Saturday, September 05, 2015 11:39:20 AM Alan Stern wrote: >>>>>>> On Sat, 5 Sep 2015, Grygorii Strashko wrote: >>>>>>> >>>>>>>> On 09/04/2015 09:45 PM, Alan Stern wrote: >>>>>>>>> On Fri, 4 Sep 2015, Grygorii Strashko wrote: >>>>>>>>> >>>>>>>>>> There is one "small" problem with such approach :( >>>>>>>>>> >>>>>>>>>> - It's incompatible with -RT kernel, because PM runtime can't be used >>>>>>>>>> in atomic context on -RT. >>>>>>>>> >>>>>>>>> Can you explain this more fully? Why can't runtime PM be used in >>>>>>>>> atomic context in the -rt kernels? >>>>>>>>> >>>>>>>> >>>>>>>> See: >>>>>>>> http://lwn.net/Articles/146861/ >>>>>>>> https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#How_does_the_CONFIG_PREEMPT_RT_patch_work.3F >>>>>>>> >>>>>>>> spinlock_t >>>>>>>> Critical sections are preemptible. The _irq operations (e.g., spin_lock_irqsave()) >>>>>>>> do -not- disable hardware interrupts. Priority inheritance is used to prevent priority >>>>>>>> inversion. An underlying rt_mutex is used to implement spinlock_t in PREEMPT_RT. >>>>>>>> >>>>>>>> As result, have to do things like: >>>>>>>> https://lkml.org/lkml/2015/8/18/161 >>>>>>>> https://lkml.org/lkml/2015/8/18/162 >>>>>>>> >>>>>>>> Sorry for brief reply - Friday/Sat night :) >>>>>>> >>>>>>> I see. Although we normally think of interrupt contexts as being >>>>>>> atomic, in an -rt kernel this isn't true any more because things like >>>>>>> spin_lock_irq don't actually disable interrupts. >>>>>>> >>>>>>> Therefore it would be correct to say that in -rt kernels, runtime PM >>>>>>> can be used in interrupt context (if the device is marked as irq-safe), >>>>>>> but not in atomic context. Right? >>>>>> >>>>>> Right. >>>>>> >>>>>> Whatever is suitable for interrupt context in the mainline, will be suitable >>>>>> for that in -rt kernels too. >>>>> >>>>> Not exactly true :(, since spinlock is converted to [rt_] mutex. >>>>> Usually, this difference can't be seen because on -RT kernel all or >>>>> mostly all HW IRQ handlers will be forced to be threaded. >>>> >>>> Exactly. And that's what I'm talking about. >>>> >>>>> For the cases, where such automatic conversion is not working, >>>>> (like chained irq handlers or HW-handler+Threaded handler) the code >>>>> has to be carefully patched to work properly as for non-RT as for -RT. >>>> >>>> Right. >>>> >>>>> Also, this triggers some -RT incompatibility issues, like with PM runtime or >>>> >>>> That I'm not sure about. Why would runtime PM cause problems with -RT (apart >>>> from attempts to use it from the idle loop, but that's not happening in the >>>> mainline anyway)? >>> >>> >>> I have to be more specific - sorry. "irq_safe" mode of PM runtime is incompatible with -RT. >>> >>> Here is an example: >>> - HW IRQ handler in TI OMAP GPIO driver is implemented as chained IRQ handler and >>> contains pm_runtime_get_sync()/pm_runtime_put(). This works properly with vanilla kernel >>> because OMAP GPIO devices marked as irq_safe. >>> Chained IRQ handlers can't be forced threaded and PM runtime APIs trigger >>> "sleeping function called from invalid context" issues there, so corresponding code has to be reworked. >> >> Isn't that why those are being converted to raw_*[1] ? >> >> Kevin >> >> [1] http://marc.info/?l=linux-omap&m=143749603401221&w=2 >> > >That's way I've tried to convert those to generic IRQ handler [2] :), >so on -RT it will be forced threaded. > >raw_* is different kind of problem in gpio-omap - IRQ controllers >have to use raw_* inside irq_chip callbacks, because IRQ core guards those >callbacks using raw_* locks. >.irq_bus_lock()/irq_bus_sync_unlock() callbacks can be used [3] >for any kind of operations which require non-atomic context. > Talking to John Stultz at Linaro Connect: Is cpuidle relevant in -RT kernel? I dont know much about -RT. I thought this might be point that we should consider. As it stands today, on a 800 Mhz ARM quad core, I am seeing a latency of 50-70 usec for the additional runtime PM in the cpuidle. Ofcourse, there is a definite need to optimize and there are opportunities to do that. Since each CPU does its own runtime PM, we could probably avoid any kind of locks in the runtime PM. But locks in genpd may be unavoidable. Will look more into that. Thanks, Lina >[2] https://lkml.org/lkml/2015/8/18/162 >gpio: omap: convert to use generic irq handler >[3] https://lkml.org/lkml/2015/8/18/161 >gpio: omap: move pm runtime in irq_chip.irq_bus_lock/sync_unlock > > > >-- >regards, >-grygorii
On Tue, 22 Sep 2015, Lina Iyer wrote: > Talking to John Stultz at Linaro Connect: Is cpuidle relevant in > -RT kernel? I dont know much about -RT. I thought this might be point > that we should consider. There are definitely RT systems out there where power consumption is a concern. > As it stands today, on a 800 Mhz ARM quad core, I am seeing a latency of > 50-70 usec for the additional runtime PM in the cpuidle. Ofcourse, there > is a definite need to optimize and there are opportunities to do that. RT is not about being fast. RT is about being deterministic. So if a power sensitive RT system can afford the extra latencies induced by PM it will tolerate them. Of course, optimizing that is not a bad thing at all :) Thanks, tglx
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 0496b48..077c55f 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -27,6 +27,7 @@ #include <linux/completion.h> #include <linux/cpufreq.h> #include <linux/irq_work.h> +#include <linux/pm_runtime.h> #include <linux/atomic.h> #include <asm/smp.h> @@ -137,7 +138,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle) pr_err("CPU%u: failed to boot: %d\n", cpu, ret); } - memset(&secondary_data, 0, sizeof(secondary_data)); return ret; } @@ -271,6 +271,16 @@ void __cpu_die(unsigned int cpu) void __ref cpu_die(void) { unsigned int cpu = smp_processor_id(); + struct device *cpu_dev; + + /* + * We dont need the CPU device anymore. + * Lets do this before IRQs are disabled to allow + * runtime PM to suspend the domain as well. + */ + cpu_dev = get_cpu_device(cpu); + if (cpu_dev) + pm_runtime_put_sync(cpu_dev); idle_task_exit(); @@ -352,6 +362,7 @@ asmlinkage void secondary_start_kernel(void) { struct mm_struct *mm = &init_mm; unsigned int cpu; + struct device *cpu_dev; /* * The identity mapping is uncached (strongly ordered), so @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void) local_irq_enable(); local_fiq_enable(); + /* We are running, enable runtime PM for the CPU. */ + cpu_dev = get_cpu_device(cpu); + if (cpu_dev) + pm_runtime_get_sync(cpu_dev); + /* * OK, it's off to the idle thread for us */
Enable runtime PM for CPU devices. Do a runtime get of the CPU device when the CPU is hotplugged in and a runtime put of the CPU device when the CPU is hotplugged off. When all the CPUs in a domain are hotplugged off, the domain may also be powered off and cluster_pm_enter/exit() notifications are be sent out. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> --- arch/arm/kernel/smp.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)