Message ID | 1433363565-17725-1-git-send-email-balbi@ti.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
* Felipe Balbi <balbi@ti.com> [150603 13:36]: > > AM43xx, even though it's a single processor A9, it still has TWD and global > timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just > by switching from gptimer to twd/global. > > The only problem is that currently, is_smp() check prevents me from using twd > with AM43xx (that's why it's commented below, for testing purposes). > > In the hopes that we can start a, hopefully, small thread around the subject, > I'm sending this HACK which I used to get TWD and global timer enabled so I > could measure latencies with cyclictest. > > Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A > processors ? > --- a/arch/arm/kernel/smp_twd.c > +++ b/arch/arm/kernel/smp_twd.c > @@ -388,8 +389,10 @@ static void __init twd_local_timer_of_register(struct device_node *np) > { > int err; > > +#if 0 > if (!is_smp() || !setup_max_cpus) > return; > +#endif > > twd_ppi = irq_of_parse_and_map(np, 0); > if (!twd_ppi) { Why don't you just check for the c-a9 revision here? If it's the UP processor then allow twd? Similar to what's done in global_timer_of_register? Regards, Tony
On Wednesday 03 June 2015 15:32:45 Felipe Balbi wrote: > > Hi Tony and Russell, > > AM43xx, even though it's a single processor A9, it still has TWD and global > timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just > by switching from gptimer to twd/global. > > The only problem is that currently, is_smp() check prevents me from using twd > with AM43xx (that's why it's commented below, for testing purposes). > > In the hopes that we can start a, hopefully, small thread around the subject, > I'm sending this HACK which I used to get TWD and global timer enabled so I > could measure latencies with cyclictest. > > Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A > processors ? > > I wondered about this recently when looking at something unrelated and noticed that the check had been introduced as part of 904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() no-op for nosmp"). I suspect this was just the wrong fix at the time, and that the real culprit is either alloc_percpu() or request_percpu_irq() getting called too early on a machine without SMP support. Possibly the problem is already resolved independently, if you didn't run into it. Arnd
On Wed, Jun 03, 2015 at 10:55:27PM +0200, Arnd Bergmann wrote: > On Wednesday 03 June 2015 15:32:45 Felipe Balbi wrote: > > > > Hi Tony and Russell, > > > > AM43xx, even though it's a single processor A9, it still has TWD and global > > timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just > > by switching from gptimer to twd/global. > > > > The only problem is that currently, is_smp() check prevents me from using twd > > with AM43xx (that's why it's commented below, for testing purposes). > > > > In the hopes that we can start a, hopefully, small thread around the subject, > > I'm sending this HACK which I used to get TWD and global timer enabled so I > > could measure latencies with cyclictest. > > > > Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A > > processors ? > > > > > > I wondered about this recently when looking at something unrelated > and noticed that the check had been introduced as part of > 904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() > no-op for nosmp"). > > I suspect this was just the wrong fix at the time, and that the > real culprit is either alloc_percpu() or request_percpu_irq() > getting called too early on a machine without SMP support. > > Possibly the problem is already resolved independently, if you > didn't run into it. no, no splats, nothing at all. See [1] [1] http://hastebin.com/helekubutu
On Wed, Jun 03, 2015 at 04:04:55PM -0500, Felipe Balbi wrote: > On Wed, Jun 03, 2015 at 10:55:27PM +0200, Arnd Bergmann wrote: > > On Wednesday 03 June 2015 15:32:45 Felipe Balbi wrote: > > > > > > Hi Tony and Russell, > > > > > > AM43xx, even though it's a single processor A9, it still has TWD and global > > > timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just > > > by switching from gptimer to twd/global. > > > > > > The only problem is that currently, is_smp() check prevents me from using twd > > > with AM43xx (that's why it's commented below, for testing purposes). > > > > > > In the hopes that we can start a, hopefully, small thread around the subject, > > > I'm sending this HACK which I used to get TWD and global timer enabled so I > > > could measure latencies with cyclictest. > > > > > > Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A > > > processors ? > > > > > > > > > > I wondered about this recently when looking at something unrelated > > and noticed that the check had been introduced as part of > > 904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() > > no-op for nosmp"). > > > > I suspect this was just the wrong fix at the time, and that the > > real culprit is either alloc_percpu() or request_percpu_irq() > > getting called too early on a machine without SMP support. > > > > Possibly the problem is already resolved independently, if you > > didn't run into it. > > no, no splats, nothing at all. See [1] > > [1] http://hastebin.com/helekubutu Adding Shawn
On 06/03/2015 02:28 PM, Felipe Balbi wrote: > On Wed, Jun 03, 2015 at 04:04:55PM -0500, Felipe Balbi wrote: >> On Wed, Jun 03, 2015 at 10:55:27PM +0200, Arnd Bergmann wrote: >>> On Wednesday 03 June 2015 15:32:45 Felipe Balbi wrote: >>>> Hi Tony and Russell, >>>> >>>> AM43xx, even though it's a single processor A9, it still has TWD and global >>>> timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just >>>> by switching from gptimer to twd/global. >>>> >>>> The only problem is that currently, is_smp() check prevents me from using twd >>>> with AM43xx (that's why it's commented below, for testing purposes). >>>> >>>> In the hopes that we can start a, hopefully, small thread around the subject, >>>> I'm sending this HACK which I used to get TWD and global timer enabled so I >>>> could measure latencies with cyclictest. >>>> >>>> Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A >>>> processors ? >>>> >>>> >>> I wondered about this recently when looking at something unrelated >>> and noticed that the check had been introduced as part of >>> 904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() >>> no-op for nosmp"). >>> >>> I suspect this was just the wrong fix at the time, and that the >>> real culprit is either alloc_percpu() or request_percpu_irq() >>> getting called too early on a machine without SMP support. >>> >>> Possibly the problem is already resolved independently, if you >>> didn't run into it. >> no, no splats, nothing at all. See [1] >> >> [1] http://hastebin.com/helekubutu > Adding Shawn > Mason was also interested in doing this. See [2]. From what I could tell back then, commit 904464b91eca8 was working around the local timer APIs that no longer exist. [2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/389931/focus=392348
Hi, On Wed, Jun 03, 2015 at 02:41:39PM -0700, Stephen Boyd wrote: > >>>>AM43xx, even though it's a single processor A9, it still has TWD and global > >>>>timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just > >>>>by switching from gptimer to twd/global. > >>>> > >>>>The only problem is that currently, is_smp() check prevents me from using twd > >>>>with AM43xx (that's why it's commented below, for testing purposes). > >>>> > >>>>In the hopes that we can start a, hopefully, small thread around the subject, > >>>>I'm sending this HACK which I used to get TWD and global timer enabled so I > >>>>could measure latencies with cyclictest. > >>>> > >>>>Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A > >>>>processors ? > >>>> > >>>> > >>>I wondered about this recently when looking at something unrelated > >>>and noticed that the check had been introduced as part of > >>>904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() > >>>no-op for nosmp"). > >>> > >>>I suspect this was just the wrong fix at the time, and that the > >>>real culprit is either alloc_percpu() or request_percpu_irq() > >>>getting called too early on a machine without SMP support. > >>> > >>>Possibly the problem is already resolved independently, if you > >>>didn't run into it. > >>no, no splats, nothing at all. See [1] > >> > >>[1] http://hastebin.com/helekubutu > >Adding Shawn > > > > Mason was also interested in doing this. See [2]. From what I could tell > back then, commit 904464b91eca8 was working around the local timer APIs that > no longer exist. > > [2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/389931/focus=392348 A lot of good information on that thread, thanks. Seems like getting twd/global timer working would also have some effect on context switching, perhaps ? cheers
On 03/06/2015 23:54, Felipe Balbi wrote: > On Wed, Jun 03, 2015 at 02:41:39PM -0700, Stephen Boyd wrote: >>>>>> AM43xx, even though it's a single processor A9, it still has TWD and global >>>>>> timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just >>>>>> by switching from gptimer to twd/global. >>>>>> >>>>>> The only problem is that currently, is_smp() check prevents me from using twd >>>>>> with AM43xx (that's why it's commented below, for testing purposes). >>>>>> >>>>>> In the hopes that we can start a, hopefully, small thread around the subject, >>>>>> I'm sending this HACK which I used to get TWD and global timer enabled so I >>>>>> could measure latencies with cyclictest. >>>>>> >>>>>> Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A >>>>>> processors ? >>>>>> >>>>>> >>>>> I wondered about this recently when looking at something unrelated >>>>> and noticed that the check had been introduced as part of >>>>> 904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() >>>>> no-op for nosmp"). >>>>> >>>>> I suspect this was just the wrong fix at the time, and that the >>>>> real culprit is either alloc_percpu() or request_percpu_irq() >>>>> getting called too early on a machine without SMP support. >>>>> >>>>> Possibly the problem is already resolved independently, if you >>>>> didn't run into it. >>>> no, no splats, nothing at all. See [1] >>>> >>>> [1] http://hastebin.com/helekubutu >>> Adding Shawn >>> >> >> Mason was also interested in doing this. See [2]. From what I could tell >> back then, commit 904464b91eca8 was working around the local timer APIs that >> no longer exist. >> >> [2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/389931/focus=392348 > > A lot of good information on that thread, thanks. Seems like getting > twd/global timer working would also have some effect on context > switching, perhaps ? Hello, In my case, I need to support two platforms: single core Cortex A9 MPCore dual core Cortex A9 MPCore However, as the MPCore moniker implies, even the single core platform is "SMP capable". (I think this only means an SCU is available?) Thus, I worked around the issue by using the same SMP kernel for both platforms; which is why I didn't push any patch. Also, check /proc/timer_list for a "Broadcast device". If you don't define one, the TWD timers are set to periodic mode, with hrtimers disabled. Regards.
On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: > On 03/06/2015 23:54, Felipe Balbi wrote: > > > On Wed, Jun 03, 2015 at 02:41:39PM -0700, Stephen Boyd wrote: > >>>>>> AM43xx, even though it's a single processor A9, it still has TWD and global > >>>>>> timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just > >>>>>> by switching from gptimer to twd/global. > >>>>>> > >>>>>> The only problem is that currently, is_smp() check prevents me from using twd > >>>>>> with AM43xx (that's why it's commented below, for testing purposes). > >>>>>> > >>>>>> In the hopes that we can start a, hopefully, small thread around the subject, > >>>>>> I'm sending this HACK which I used to get TWD and global timer enabled so I > >>>>>> could measure latencies with cyclictest. > >>>>>> > >>>>>> Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A > >>>>>> processors ? > >>>>>> > >>>>>> > >>>>> I wondered about this recently when looking at something unrelated > >>>>> and noticed that the check had been introduced as part of > >>>>> 904464b91eca8 ("ARM: 7655/1: smp_twd: make twd_local_timer_of_register() > >>>>> no-op for nosmp"). > >>>>> > >>>>> I suspect this was just the wrong fix at the time, and that the > >>>>> real culprit is either alloc_percpu() or request_percpu_irq() > >>>>> getting called too early on a machine without SMP support. > >>>>> > >>>>> Possibly the problem is already resolved independently, if you > >>>>> didn't run into it. > >>>> no, no splats, nothing at all. See [1] > >>>> > >>>> [1] http://hastebin.com/helekubutu > >>> Adding Shawn > >>> > >> > >> Mason was also interested in doing this. See [2]. From what I could tell > >> back then, commit 904464b91eca8 was working around the local timer APIs that > >> no longer exist. > >> > >> [2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/389931/focus=392348 > > > > A lot of good information on that thread, thanks. Seems like getting > > twd/global timer working would also have some effect on context > > switching, perhaps ? > > Hello, > > In my case, I need to support two platforms: > > single core Cortex A9 MPCore > dual core Cortex A9 MPCore > > However, as the MPCore moniker implies, even the single core platform > is "SMP capable". (I think this only means an SCU is available?) > > Thus, I worked around the issue by using the same SMP kernel for both > platforms; which is why I didn't push any patch. > > Also, check /proc/timer_list for a "Broadcast device". If you don't > define one, the TWD timers are set to periodic mode, with hrtimers > disabled. Yeah, I have a broadcast device however Linux is picking a gp timer instead of A9's global timer. Now I think I managed to get Linux to choose global, but my device won't boot anymore :-p Debugging that. I'm speculating global timer IRQs aren't firing or aren't wired up properly on this particular SoC, still to confirm.
Hi, On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: > Also, check /proc/timer_list for a "Broadcast device". If you don't > define one, the TWD timers are set to periodic mode, with hrtimers > disabled. Did you manage to turn global timer into Broadcast device ?
On Thu, Jun 04, 2015 at 03:08:50PM -0500, Felipe Balbi wrote: > Hi, > > On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: > > Also, check /proc/timer_list for a "Broadcast device". If you don't > > define one, the TWD timers are set to periodic mode, with hrtimers > > disabled. > > Did you manage to turn global timer into Broadcast device ? arm_global_timer is marked PERCPU, so it will never be chosen as broadcast.
On 04/06/2015 22:08, Felipe Balbi wrote: > On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: >> Also, check /proc/timer_list for a "Broadcast device". If you don't >> define one, the TWD timers are set to periodic mode, with hrtimers >> disabled. > > Did you manage to turn global timer into Broadcast device ? Disclaimer: I am a kernel noob, take everything I say with a rock of salt. As far as I can see, the global timer code doesn't handle frequency changes, whereas the TWD code does. Yet all these timers are clocked by PERIPHCLK, which is defined as CPUCLK / N. So if a platform enables cpufreq, CPUCLK will vary, meaning PERIPHCLK will vary. I don't see how the global timer can work in that situation. cf. also this recent thread: http://thread.gmane.org/gmane.linux.kernel.cpufreq/15770 Regards.
On Thu, Jun 04, 2015 at 10:32:40PM +0200, Mason wrote: > On 04/06/2015 22:08, Felipe Balbi wrote: > > > On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: > >> Also, check /proc/timer_list for a "Broadcast device". If you don't > >> define one, the TWD timers are set to periodic mode, with hrtimers > >> disabled. > > > > Did you manage to turn global timer into Broadcast device ? > > Disclaimer: I am a kernel noob, take everything I say with > a rock of salt. > > As far as I can see, the global timer code doesn't handle > frequency changes, whereas the TWD code does. All right, but TWD has C3STOP and that prevents it from being used as broadcast device. Tony Lindgren had the idea of implementing a timer switch during idle and that could help us strip the kernel off of C3STOP feature flag. That means we might be able to use TWD as broadcast until CPU decides to idle, at which point we need to switch to a different timer. > Yet all these timers are clocked by PERIPHCLK, which is defined > as CPUCLK / N. So if a platform enables cpufreq, CPUCLK will > vary, meaning PERIPHCLK will vary. I don't see how the global > timer can work in that situation. > > cf. also this recent thread: > http://thread.gmane.org/gmane.linux.kernel.cpufreq/15770 I'll have a read, cheers.
On 04/06/2015 22:37, Felipe Balbi wrote: > On Thu, Jun 04, 2015 at 10:32:40PM +0200, Mason wrote: >> On 04/06/2015 22:08, Felipe Balbi wrote: >> >>> On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: >>>> Also, check /proc/timer_list for a "Broadcast device". If you don't >>>> define one, the TWD timers are set to periodic mode, with hrtimers >>>> disabled. >>> >>> Did you manage to turn global timer into Broadcast device ? >> >> Disclaimer: I am a kernel noob, take everything I say with >> a rock of salt. >> >> As far as I can see, the global timer code doesn't handle >> frequency changes, whereas the TWD code does. > > All right, but TWD has C3STOP and that prevents it from being used as > broadcast device. AFAICT, my platform doesn't stop the local timers in low-power mode, so I just dropped the CLOCK_EVT_FEAT_C3STOP flag. There's even a patch approved by Arnd somewhere in the thread, although he did recommend I should investigate to understand the problem better. Regards.
* Felipe Balbi <balbi@ti.com> [150604 13:41]: > On Thu, Jun 04, 2015 at 10:32:40PM +0200, Mason wrote: > > On 04/06/2015 22:08, Felipe Balbi wrote: > > > > > On Thu, Jun 04, 2015 at 11:46:59AM +0200, Mason wrote: > > >> Also, check /proc/timer_list for a "Broadcast device". If you don't > > >> define one, the TWD timers are set to periodic mode, with hrtimers > > >> disabled. > > > > > > Did you manage to turn global timer into Broadcast device ? > > > > Disclaimer: I am a kernel noob, take everything I say with > > a rock of salt. > > > > As far as I can see, the global timer code doesn't handle > > frequency changes, whereas the TWD code does. > > All right, but TWD has C3STOP and that prevents it from being used as > broadcast device. > > Tony Lindgren had the idea of implementing a timer switch during idle > and that could help us strip the kernel off of C3STOP feature flag. That > means we might be able to use TWD as broadcast until CPU decides to > idle, at which point we need to switch to a different timer. Yeah I'm looking at adding clocksource_pm_enter/exit() to allow also changing the clocksource to a different one for idle.. Will post some patches after investigating it a bit further. Changing the clockevent for idle already works just fine based on tick_broadcast_enable() + tick_broadcast_enter/exit(). Regards, Tony
diff --git a/arch/arm/boot/dts/am4372.dtsi b/arch/arm/boot/dts/am4372.dtsi index 1943fc333e7c..e782927d1c44 100644 --- a/arch/arm/boot/dts/am4372.dtsi +++ b/arch/arm/boot/dts/am4372.dtsi @@ -50,6 +50,25 @@ <0x48240100 0x0100>; }; + scu@48240000 { + compatible = "arm,cortex-a9-scu"; + reg = <0x48240000 0x100>; + }; + + global-timer@48240200 { + compatible = "arm,cortex-a9-global-timer"; + reg = <0x48240200 0x100>; + interrupts = <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&dpll_mpu_m2_ck>; + }; + + local-timer@48240200 { + compatible = "arm,cortex-a9-twd-timer"; + reg = <0x48240600 0x100>; + interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&dpll_mpu_ck>; + }; + l2-cache-controller@48242000 { compatible = "arm,pl310-cache"; reg = <0x48242000 0x1000>; diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c index 172c6a05d27f..43c053d836e7 100644 --- a/arch/arm/kernel/smp_twd.c +++ b/arch/arm/kernel/smp_twd.c @@ -388,8 +389,10 @@ static void __init twd_local_timer_of_register(struct device_node *np) { int err; +#if 0 if (!is_smp() || !setup_max_cpus) return; +#endif twd_ppi = irq_of_parse_and_map(np, 0); if (!twd_ppi) { diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig index 2b8e47788062..80984f87b020 100644 --- a/arch/arm/mach-omap2/Kconfig +++ b/arch/arm/mach-omap2/Kconfig @@ -60,6 +60,10 @@ config SOC_AM43XX select ARM_GIC select MACH_OMAP_GENERIC select MIGHT_HAVE_CACHE_L2X0 + select HAVE_ARM_SCU + select HAVE_ARM_TWD + select ARM_GLOBAL_TIMER + select CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK config SOC_DRA7XX bool "TI DRA7XX" diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c index 34ff14b7beab..8acb3b4c84e3 100644 --- a/arch/arm/mach-omap2/board-generic.c +++ b/arch/arm/mach-omap2/board-generic.c @@ -279,7 +279,7 @@ DT_MACHINE_START(AM43_DT, "Generic AM43 (Flattened Device Tree)") .init_late = am43xx_init_late, .init_irq = omap_gic_of_init, .init_machine = omap_generic_init, - .init_time = omap3_gptimer_timer_init, + .init_time = omap4_local_timer_init, .dt_compat = am43_boards_compat, .restart = omap44xx_restart, MACHINE_END diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c index cef67af9e9b8..0031f56f50b1 100644 --- a/arch/arm/mach-omap2/timer.c +++ b/arch/arm/mach-omap2/timer.c @@ -655,20 +655,18 @@ static OMAP_SYS_32K_TIMER_INIT(4, 1, "timer_32k_ck", "ti,timer-alwon", static DEFINE_TWD_LOCAL_TIMER(twd_local_timer, OMAP44XX_LOCAL_TWD_BASE, 29); void __init omap4_local_timer_init(void) { + int err; + omap4_sync32k_timer_init(); /* Local timers are not supprted on OMAP4430 ES1.0 */ - if (omap_rev() != OMAP4430_REV_ES1_0) { - int err; - - if (of_have_populated_dt()) { - clocksource_of_init(); - return; - } - - err = twd_local_timer_register(&twd_local_timer); - if (err) - pr_err("twd_local_timer_register failed %d\n", err); + if (of_have_populated_dt()) { + clocksource_of_init(); + return; } + + err = twd_local_timer_register(&twd_local_timer); + if (err) + pr_err("twd_local_timer_register failed %d\n", err); } #else void __init omap4_local_timer_init(void)
Signed-off-by: Felipe Balbi <balbi@ti.com> --- Hi Tony and Russell, AM43xx, even though it's a single processor A9, it still has TWD and global timer. I was doing some profiling with RT v4.0 and latency is 3.5x lower just by switching from gptimer to twd/global. The only problem is that currently, is_smp() check prevents me from using twd with AM43xx (that's why it's commented below, for testing purposes). In the hopes that we can start a, hopefully, small thread around the subject, I'm sending this HACK which I used to get TWD and global timer enabled so I could measure latencies with cyclictest. Is it so that TWD shouldn't be available on UP integrations of ARM's Cortex-A processors ? Regards arch/arm/boot/dts/am4372.dtsi | 19 +++++++++++++++++++ arch/arm/kernel/smp_twd.c | 3 +++ arch/arm/mach-omap2/Kconfig | 4 ++++ arch/arm/mach-omap2/board-generic.c | 2 +- arch/arm/mach-omap2/timer.c | 20 +++++++++----------- 5 files changed, 36 insertions(+), 12 deletions(-)