bus: ti-sysc: Fix gpt12 system timer issue with reserved status

Message ID 20210611060224.36769-1-tony@atomide.com (mailing list archive)
State New, archived

Commit Message

Tony Lindgren June 11, 2021, 6:02 a.m. UTC
Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
revision c2 stopped booting. Jarkko bisected the issue down to
commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
and resume for am3 and am4").

Let's fix the issue by tagging system timers as reserved rather than
ignoring them. And let's not probe any interconnect target module child
devices for reserved modules.

This allows PM runtime to keep track of clocks and clockdomains for
the interconnect target module, and prevent the system timer from idling
as we already have SYSC_QUIRK_NO_IDLE and SYSC_QUIRK_NO_IDLE_ON_INIT
flags set for system timers.

Fixes: 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4")
Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 drivers/bus/ti-sysc.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

Comments

Jarkko Nikula June 11, 2021, 1:45 p.m. UTC | #1
On 11.6.2021 9.02, Tony Lindgren wrote:
> Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
> revision c2 stopped booting. Jarkko bisected the issue down to
> commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
> and resume for am3 and am4").
> 
> Let's fix the issue by tagging system timers as reserved rather than
> ignoring them. And let's not probe any interconnect target module child
> devices for reserved modules.
> 
> This allows PM runtime to keep track of clocks and clockdomains for
> the interconnect target module, and prevent the system timer from idling
> as we already have SYSC_QUIRK_NO_IDLE and SYSC_QUIRK_NO_IDLE_ON_INIT
> flags set for system timers.
> 
> Fixes: 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4")
> Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
> Signed-off-by: Tony Lindgren <tony@atomide.com>
> ---
>  drivers/bus/ti-sysc.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
I tested this on top of 06af8679449d ("coredump: Limit what can
interrupt coredumps"). I also tested Tony's earlier test diff, which does
the same as this actual patch, on top of 6cfcd5563b4f.

Tested-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Pavel Machek Aug. 10, 2021, 12:40 p.m. UTC | #2
Hi!

I noticed the issue while reviewing stable kernels, as this is being
backported.

> Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
> revision c2 stopped booting. Jarkko bisected the issue down to
> commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
> and resume for am3 and am4").
> 
> Let's fix the issue by tagging system timers as reserved rather than
> ignoring them. And let's not probe any interconnect target module child
> devices for reserved modules.

+++ b/drivers/bus/ti-sysc.c
> @@ -3093,8 +3095,8 @@ static int sysc_probe(struct platform_device *pdev)
>  		return error;
>  
>  	error = sysc_check_active_timer(ddata);
> -	if (error)
> -		return error;
> +	if (error == -EBUSY)
> +		ddata->reserved = true;
>  
>  	error = sysc_get_clocks(ddata);
>  	if (error)

What is going on here? First, we silently ignore errors other than
EBUSY. Second, sysc_check_active_timer() can't return -EBUSY: it
returns either 0 or -ENXIO. (I checked 5.10-stable, mainline and
-next-20210806).

Best regards,
								Pavel
Tony Lindgren Aug. 10, 2021, 12:52 p.m. UTC | #3
* Pavel Machek <pavel@denx.de> [210810 12:40]:
> Hi!
> 
> I noticed the issue while reviewing stable kernels, as this is being
> backported.
> 
> > Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
> > revision c2 stopped booting. Jarkko bisected the issue down to
> > commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
> > and resume for am3 and am4").
> > 
> > Let's fix the issue by tagging system timers as reserved rather than
> > ignoring them. And let's not probe any interconnect target module child
> > devices for reserved modules.
> 
> +++ b/drivers/bus/ti-sysc.c
> > @@ -3093,8 +3095,8 @@ static int sysc_probe(struct platform_device *pdev)
> >  		return error;
> >  
> >  	error = sysc_check_active_timer(ddata);
> > -	if (error)
> > -		return error;
> > +	if (error == -EBUSY)
> > +		ddata->reserved = true;
> >  
> >  	error = sysc_get_clocks(ddata);
> >  	if (error)
> 
> What is going on here? First, we silently ignore errors other than
> EBUSY. Second, sysc_check_active_timer() can't return -EBUSY: it
> returns either 0 or -ENXIO. (I checked 5.10-stable, mainline and
> -next-20210806).

Thanks for spotting it, looks like there's now a conflict with commit
65fb73676112 ("bus: ti-sysc: suppress err msg for timers used as
clockevent/source"). It seems we should check for -ENXIO here
too. And yeah, it makes sense to return on other errors for sure.

Regards,

Tony
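
To illustrate the handling discussed in the two messages above, here is a
minimal userspace sketch, not code from the posted patches. It assumes, as
Pavel notes, that sysc_check_active_timer() signals an in-use system timer
with -ENXIO. The names module_state and handle_timer_check() are
hypothetical stand-ins for the driver's state and the check in sysc_probe().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-module driver data (struct sysc). */
struct module_state {
	bool reserved;
};

/*
 * Models the check after the timer test in probe:
 *   -ENXIO      -> tag the module as reserved and keep probing
 *   other error -> abort probe with that error
 *   0           -> keep probing normally
 * Returns 0 to continue probing, or a negative errno to stop.
 */
static int handle_timer_check(struct module_state *m, int error)
{
	assert(m != NULL);

	if (error == -ENXIO) {
		m->reserved = true;
		return 0;
	}

	return error;
}
```

With this shape, a result of -ENXIO leaves the module tagged as reserved,
while something like -EINVAL is passed back to the caller rather than being
silently ignored.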
Tony Lindgren Aug. 11, 2021, 6:12 a.m. UTC | #4
* Tony Lindgren <tony@atomide.com> [210810 12:52]:
> * Pavel Machek <pavel@denx.de> [210810 12:40]:
> > What is going on here? First, we silently ignore errors other than
> > EBUSY. Second, sysc_check_active_timer() can't return -EBUSY: it
> > returns either 0 or -ENXIO. (I checked 5.10-stable, mainline and
> > -next-20210806).
> 
> Thanks for spotting it, looks like there's now a conflict with commit
> 65fb73676112 ("bus: ti-sysc: suppress err msg for timers used as
> clockevent/source"). It seems we should check for -ENXIO here
> too. And yeah, it makes sense to return on other errors for sure.

FYI, fix posted at [0] below.

Regards,

Tony

[0] https://lore.kernel.org/linux-omap/20210811061053.32081-1-tony@atomide.com/T/#u
Kevin Hilman March 4, 2022, 5:41 p.m. UTC | #5
Hi Tony,

Tony Lindgren <tony@atomide.com> writes:

> Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
> revision c2 stopped booting. Jarkko bisected the issue down to
> commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
> and resume for am3 and am4").
>
> Let's fix the issue by tagging system timers as reserved rather than
> ignoring them. And let's not probe any interconnect target module child
> devices for reserved modules.
>
> This allows PM runtime to keep track of clocks and clockdomains for
> the interconnect target module, and prevent the system timer from idling
> as we already have SYSC_QUIRK_NO_IDLE and SYSC_QUIRK_NO_IDLE_ON_INIT
> flags set for system timers.
>
> Fixes: 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4")
> Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
> Signed-off-by: Tony Lindgren <tony@atomide.com>

I'm debugging why suspend/resume on AM3x and AM4x are mostly working,
but getting the warning that not all powerdomains are transitioning:

   pm33xx pm33xx: PM: Could not transition all powerdomains to target state

I bisected it down to $SUBJECT patch, and verified that reverting it
makes both am335x-boneblack and am437x-gp-evm fully suspend, and I'm
now seeing:

   pm33xx pm33xx: PM: Successfully put all powerdomains to target state

Note that it doesn't revert cleanly due to some other changes, but this
one-liner[1] effectively reverts the behavior of $SUBJECT patch, and
also makes things work again.

I verified the revert (and hack[1]) on both v5.10 stable and mainline
v5.16 but TBH, I'm still not 100% sure what's going on so looking for
some guidance from you Tony on what the "real" fix should be.

Kevin

[1]
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
index 54c0ee6dda30..82379ff9dce5 100644
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -3304,7 +3304,7 @@ static int sysc_probe(struct platform_device *pdev)

        error = sysc_check_active_timer(ddata);
        if (error == -ENXIO)
-               ddata->reserved = true;
+               return error;
        else if (error)
                return error;
Tony Lindgren March 7, 2022, 12:51 p.m. UTC | #6
Hi,

* Kevin Hilman <khilman@baylibre.com> [220304 17:39]:
> Hi Tony,
> 
> Tony Lindgren <tony@atomide.com> writes:
> 
> > Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
> > revision c2 stopped booting. Jarkko bisected the issue down to
> > commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
> > and resume for am3 and am4").
> >
> > Let's fix the issue by tagging system timers as reserved rather than
> > ignoring them. And let's not probe any interconnect target module child
> > devices for reserved modules.
> >
> > This allows PM runtime to keep track of clocks and clockdomains for
> > the interconnect target module, and prevent the system timer from idling
> > as we already have SYSC_QUIRK_NO_IDLE and SYSC_QUIRK_NO_IDLE_ON_INIT
> > flags set for system timers.
> >
> > Fixes: 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4")
> > Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
> > Signed-off-by: Tony Lindgren <tony@atomide.com>
> 
> I'm debugging why suspend/resume on AM3x and AM4x are mostly working,
> but getting the warning that not all powerdomains are transitioning:
> 
>    pm33xx pm33xx: PM: Could not transition all powerdomains to target state
> 
> I bisected it down to $SUBJECT patch, and verified that reverting it
> makes both am335x-boneblack and am437x-gp-evm fully suspend, and I'm
> now seeing:
> 
>    pm33xx pm33xx: PM: Successfully put all powerdomains to target state
> 
> Note that it doesn't revert cleanly due to some other changes, but this
> one-liner[1] effectively reverts the behavior of $SUBJECT patch, and
> also makes things work again.
> 
> I verified the revert (and hack[1]) on both v5.10 stable and mainline
> v5.16 but TBH, I'm still not 100% sure what's going on so looking for
> some guidance from you Tony on what the "real" fix should be.

Thanks for debugging the issue Kevin. It seems the issue is caused by the
extra runtime PM usage count done for modules tagged no-idle. This
causes issues for am335x timers, as the PM coprocessor needs all
the domains idled for system suspend even though the system timers are
tagged with no-idle.

We could patch ti-sysc.c for more timer workarounds, but I don't know if
that really makes sense. It would add further dependencies between the
system timer code and the interconnect code, and I'd rather go back to
no dependencies between the system timers and the interconnect code :)

So I suggest we make the omap3 gpt12 quirk checks SoC specific as below
for now, they are not needed for the other SoCs.

Then at some point we can plan on dropping support for the old beagleboard
revisions A to B4, and then reverting commit 3ff340e24c9d ("bus: ti-sysc:
Fix gpt12 system timer issue with reserved status").

Note that we now have commit 23885389dbbb ("ARM: dts: Fix timer regression
for beagleboard revision c"), so there's no need to (wrongly) enable the
old timer quirks for working omap3 revision C and later boards.

Regards,

Tony

8< ----------------------
From tony Mon Sep 17 00:00:00 2001
From: Tony Lindgren <tony@atomide.com>
Date: Mon, 7 Mar 2022 14:28:44 +0200
Subject: [PATCH] bus: ti-sysc: Make omap3 gpt12 quirk handling SoC
 specific

On beagleboard revisions A to B4 we need to use gpt12 as the system timer.
However, the quirk handling added for gpt12 caused a regression for system
suspend for am335x as the PM coprocessor needs the timers idled for
suspend.

Let's make the gpt12 quirk specific to omap34xx, other SoCs don't need
it. Beagleboard revisions C and later no longer need to use the gpt12
related quirk. Then at some point, if we decide to drop support for the old
beagleboard revisions A to B4, we can also drop the gpt12 related quirks
completely.

Fixes: 3ff340e24c9d ("bus: ti-sysc: Fix gpt12 system timer issue with reserved status")
Reported-by: Kevin Hilman <khilman@baylibre.com>
Suggested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 drivers/bus/ti-sysc.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -3232,13 +3232,27 @@ static int sysc_check_disabled_devices(struct sysc *ddata)
  */
 static int sysc_check_active_timer(struct sysc *ddata)
 {
+	int error;
+
 	if (ddata->cap->type != TI_SYSC_OMAP2_TIMER &&
 	    ddata->cap->type != TI_SYSC_OMAP4_TIMER)
 		return 0;
 
+	/*
+	 * Quirk for omap3 beagleboard revision A to B4 to use gpt12.
+	 * Revision C and later are fixed with commit 23885389dbbb ("ARM:
+	 * dts: Fix timer regression for beagleboard revision c"). This all
+	 * can be dropped if we stop supporting old beagleboard revisions
+	 * A to B4 at some point.
+	 */
+	if (sysc_soc->soc == SOC_3430)
+		error = -ENXIO;
+	else
+		error = -EBUSY;
+
 	if ((ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT) &&
 	    (ddata->cfg.quirks & SYSC_QUIRK_NO_IDLE))
-		return -ENXIO;
+		return error;
 
 	return 0;
 }
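
The effect of the patch above can be modeled in a small userspace sketch.
This is an illustration under stated assumptions, not driver code: the
model_* names, the two-value SoC enum and the quirk bit values are all
hypothetical, and the mapping of -EBUSY to "probe returns early" is inferred
from the "else if (error) return error;" context in Kevin's diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver's SoC id and quirks. */
enum model_soc { MODEL_SOC_OTHER, MODEL_SOC_3430 };

#define MODEL_QUIRK_NO_RESET_ON_INIT	(1u << 0)
#define MODEL_QUIRK_NO_IDLE		(1u << 1)

/* What probe would do with the timer check result in this model. */
enum model_outcome {
	MODEL_PROBE_NORMAL,	/* not a system timer, probe as usual */
	MODEL_PROBE_RESERVED,	/* tagged reserved, children not probed */
	MODEL_PROBE_SKIPPED,	/* probe returns the error early */
};

/*
 * Models the SoC specific return value: a timer with both quirks set is
 * treated as an active system timer. Only the omap34xx stand-in reports
 * -ENXIO (the gpt12 quirk path), every other SoC reports -EBUSY.
 */
static int model_check_active_timer(enum model_soc soc, bool is_timer,
				    unsigned int quirks)
{
	if (!is_timer)
		return 0;

	if ((quirks & MODEL_QUIRK_NO_RESET_ON_INIT) &&
	    (quirks & MODEL_QUIRK_NO_IDLE))
		return soc == MODEL_SOC_3430 ? -ENXIO : -EBUSY;

	return 0;
}

/* Maps the check result to the probe behavior assumed in the lead-in. */
static enum model_outcome model_probe_outcome(int error)
{
	assert(error <= 0);	/* kernel style: 0 or a negative errno */

	if (error == 0)
		return MODEL_PROBE_NORMAL;
	if (error == -ENXIO)
		return MODEL_PROBE_RESERVED;

	return MODEL_PROBE_SKIPPED;
}
```

In this model a system timer on the omap34xx stand-in ends up reserved,
while the same timer on any other SoC makes probe return early, which
appears to be the behavior am335x had before the reserved tagging was
introduced.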
Kevin Hilman March 9, 2022, 10:36 a.m. UTC | #7
Tony Lindgren <tony@atomide.com> writes:

> Hi,
>
> * Kevin Hilman <khilman@baylibre.com> [220304 17:39]:
>> Hi Tony,
>> 
>> Tony Lindgren <tony@atomide.com> writes:
>> 
>> > Jarkko Nikula <jarkko.nikula@bitmer.com> reported that Beagleboard
>> > revision c2 stopped booting. Jarkko bisected the issue down to
>> > commit 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend
>> > and resume for am3 and am4").
>> >
>> > Let's fix the issue by tagging system timers as reserved rather than
>> > ignoring them. And let's not probe any interconnect target module child
>> > devices for reserved modules.
>> >
>> > This allows PM runtime to keep track of clocks and clockdomains for
>> > the interconnect target module, and prevent the system timer from idling
>> > as we already have SYSC_QUIRK_NO_IDLE and SYSC_QUIRK_NO_IDLE_ON_INIT
>> > flags set for system timers.
>> >
>> > Fixes: 6cfcd5563b4f ("clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4")
>> > Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
>> > Signed-off-by: Tony Lindgren <tony@atomide.com>
>> 
>> I'm debugging why suspend/resume on AM3x and AM4x are mostly working,
>> but getting the warning that not all powerdomains are transitioning:
>> 
>>    pm33xx pm33xx: PM: Could not transition all powerdomains to target state
>> 
>> I bisected it down to $SUBJECT patch, and verified that reverting it
>> makes both am335x-boneblack and am437x-gp-evm fully suspend, and I'm
>> now seeing:
>> 
>>    pm33xx pm33xx: PM: Successfully put all powerdomains to target state
>> 
>> Note that it doesn't revert cleanly due to some other changes, but this
>> one-liner[1] effectively reverts the behavior of $SUBJECT patch, and
>> also makes things work again.
>> 
>> I verified the revert (and hack[1]) on both v5.10 stable and mainline
>> v5.16 but TBH, I'm still not 100% sure what's going on so looking for
>> some guidance from you Tony on what the "real" fix should be.
>
> Thanks for debugging the issue Kevin. It seems the issue is caused by the
> extra runtime PM usage count done for modules tagged no-idle. This
> causes issues for am335x timers, as the PM coprocessor needs all
> the domains idled for system suspend even though the system timers are
> tagged with no-idle.
>
> We could patch ti-sysc.c for more timer workarounds, but I don't know if
> that really makes sense. It would add further dependencies between the
> system timer code and the interconnect code, and I'd rather go back to
> no dependencies between the system timers and the interconnect code :)
>
> So I suggest we make the omap3 gpt12 quirk checks SoC specific as below
> for now, they are not needed for the other SoCs.
>
> Then at some point we can plan on dropping support for the old beagleboard
> revisions A to B4, and then reverting commit 3ff340e24c9d ("bus: ti-sysc:
> Fix gpt12 system timer issue with reserved status").
>
> Note that we now have commit 23885389dbbb ("ARM: dts: Fix timer regression
> for beagleboard revision c"), so there's no need to (wrongly) enable the
> old timer quirks for working omap3 revision C and later boards.

Thanks for the explanation.

> 8< ----------------------
> From tony Mon Sep 17 00:00:00 2001
> From: Tony Lindgren <tony@atomide.com>
> Date: Mon, 7 Mar 2022 14:28:44 +0200
> Subject: [PATCH] bus: ti-sysc: Make omap3 gpt12 quirk handling SoC
>  specific
>
> On beagleboard revisions A to B4 we need to use gpt12 as the system timer.
> However, the quirk handling added for gpt12 caused a regression for system
> suspend for am335x as the PM coprocessor needs the timers idled for
> suspend.
>
> Let's make the gpt12 quirk specific to omap34xx, other SoCs don't need
> it. Beagleboard revisions C and later no longer need to use the gpt12
> related quirk. Then at some point, if we decide to drop support for the old
> beagleboard revisions A to B4, we can also drop the gpt12 related quirks
> completely.
>
> Fixes: 3ff340e24c9d ("bus: ti-sysc: Fix gpt12 system timer issue with reserved status")
> Reported-by: Kevin Hilman <khilman@baylibre.com>
> Suggested-by: Kevin Hilman <khilman@baylibre.com>
> Signed-off-by: Tony Lindgren <tony@atomide.com>

Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>

Tested on am335x-boneblack and am437x-gp-evm, and I am seeing

    pm33xx pm33xx: PM: Successfully put all powerdomains to target state

on both boards.

Kevin

Patch

diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -100,6 +100,7 @@  static const char * const clock_names[SYSC_MAX_CLOCKS] = {
  * @cookie: data used by legacy platform callbacks
  * @name: name if available
  * @revision: interconnect target module revision
+ * @reserved: target module is reserved and already in use
  * @enabled: sysc runtime enabled status
  * @needs_resume: runtime resume needed on resume from suspend
  * @child_needs_resume: runtime resume needed for child on resume from suspend
@@ -130,6 +131,7 @@  struct sysc {
 	struct ti_sysc_cookie cookie;
 	const char *name;
 	u32 revision;
+	unsigned int reserved:1;
 	unsigned int enabled:1;
 	unsigned int needs_resume:1;
 	unsigned int child_needs_resume:1;
@@ -3093,8 +3095,8 @@  static int sysc_probe(struct platform_device *pdev)
 		return error;
 
 	error = sysc_check_active_timer(ddata);
-	if (error)
-		return error;
+	if (error == -EBUSY)
+		ddata->reserved = true;
 
 	error = sysc_get_clocks(ddata);
 	if (error)
@@ -3130,11 +3132,15 @@  static int sysc_probe(struct platform_device *pdev)
 	sysc_show_registers(ddata);
 
 	ddata->dev->type = &sysc_device_type;
-	error = of_platform_populate(ddata->dev->of_node, sysc_match_table,
-				     pdata ? pdata->auxdata : NULL,
-				     ddata->dev);
-	if (error)
-		goto err;
+
+	if (!ddata->reserved) {
+		error = of_platform_populate(ddata->dev->of_node,
+					     sysc_match_table,
+					     pdata ? pdata->auxdata : NULL,
+					     ddata->dev);
+		if (error)
+			goto err;
+	}
 
 	INIT_DELAYED_WORK(&ddata->idle_work, ti_sysc_idle);