Message ID | 20240410134044.2138310-10-claudiu.beznea.uj@bp.renesas.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Geert Uytterhoeven |
Headers | show |
Series | watchdog: rzg2l_wdt: Add support for RZ/G3S | expand |
On Wed, Apr 10, 2024 at 04:40:43PM +0300, Claudiu wrote: > From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > The rzg2l_wdt_restart() is called from atomic context. Calling > pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume > APIs is not an option as it may lead to issues as described in commit > e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") > that removed the pm_runtime_get_sync() and used directly the > clk_prepare_enable() APIs. > > Starting with RZ/G3S the watchdog could be part of its own software > controlled power domain (see the initial implementation in Link section). > In case the watchdog is not used the power domain is off and accessing > watchdog registers leads to aborts. > > To solve this the patch powers on the power domain using > dev_pm_genpd_resume() API before enabling its clock. This is not > sleeping or taking any other locks as the power domain will not be > registered with GENPD_FLAG_IRQ_SAFE flags. > > Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> > --- > > Changes in v8: > - none, this patch is new > > drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c > index c8c20cfb97a3..98e5e9914a5d 100644 > --- a/drivers/watchdog/rzg2l_wdt.c > +++ b/drivers/watchdog/rzg2l_wdt.c > @@ -12,6 +12,7 @@ > #include <linux/module.h> > #include <linux/of.h> > #include <linux/platform_device.h> > +#include <linux/pm_domain.h> > #include <linux/pm_runtime.h> > #include <linux/reset.h> > #include <linux/units.h> > @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, > struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); > int ret; > > + /* > + * The device may be part of a power domain that is currently > + * powered off. We need to power it up before accessing registers. > + * We don't undo the dev_pm_genpd_resume() as the device need to > + * be up for the reboot to happen. Also, as we are in atomic context > + * here there is no need to increment PM runtime usage counter > + * (to make sure pm_runtime_active() doesn't return wrong code). > + */ > + if (!pm_runtime_active(wdev->parent)) > + dev_pm_genpd_resume(wdev->parent); > + > clk_prepare_enable(priv->pclk); > clk_prepare_enable(priv->osc_clk); > > -- > 2.39.2 >
Hi Claudiu, CC pmdomain On Thu, Apr 11, 2024 at 1:11 PM Claudiu <claudiu.beznea@tuxon.dev> wrote: > From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > The rzg2l_wdt_restart() is called from atomic context. Calling > pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume > APIs is not an option as it may lead to issues as described in commit > e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") > that removed the pm_runtime_get_sync() and used directly the > clk_prepare_enable() APIs. > > Starting with RZ/G3S the watchdog could be part of its own software > controlled power domain (see the initial implementation in Link section). > In case the watchdog is not used the power domain is off and accessing > watchdog registers leads to aborts. > > To solve this the patch powers on the power domain using > dev_pm_genpd_resume() API before enabling its clock. This is not > sleeping or taking any other locks as the power domain will not be > registered with GENPD_FLAG_IRQ_SAFE flags. > > Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Thanks for your patch! Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> > --- a/drivers/watchdog/rzg2l_wdt.c > +++ b/drivers/watchdog/rzg2l_wdt.c > @@ -12,6 +12,7 @@ > #include <linux/module.h> > #include <linux/of.h> > #include <linux/platform_device.h> > +#include <linux/pm_domain.h> > #include <linux/pm_runtime.h> > #include <linux/reset.h> > #include <linux/units.h> > @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, > struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); > int ret; > > + /* > + * The device may be part of a power domain that is currently > + * powered off. We need to power it up before accessing registers. > + * We don't undo the dev_pm_genpd_resume() as the device need to > + * be up for the reboot to happen. Also, as we are in atomic context > + * here there is no need to increment PM runtime usage counter > + * (to make sure pm_runtime_active() doesn't return wrong code). > + */ > + if (!pm_runtime_active(wdev->parent)) > + dev_pm_genpd_resume(wdev->parent); > + > clk_prepare_enable(priv->pclk); > clk_prepare_enable(priv->osc_clk); > Gr{oetje,eeting}s, Geert
On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@tuxon.dev> wrote: > > From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > The rzg2l_wdt_restart() is called from atomic context. Calling > pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume > APIs is not an option as it may lead to issues as described in commit > e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") > that removed the pm_runtime_get_sync() and used directly the > clk_prepare_enable() APIs. > > Starting with RZ/G3S the watchdog could be part of its own software > controlled power domain (see the initial implementation in Link section). > In case the watchdog is not used the power domain is off and accessing > watchdog registers leads to aborts. > > To solve this the patch powers on the power domain using > dev_pm_genpd_resume() API before enabling its clock. This is not > sleeping or taking any other locks as the power domain will not be > registered with GENPD_FLAG_IRQ_SAFE flags. > > Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > --- > > Changes in v8: > - none, this patch is new > > drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c > index c8c20cfb97a3..98e5e9914a5d 100644 > --- a/drivers/watchdog/rzg2l_wdt.c > +++ b/drivers/watchdog/rzg2l_wdt.c > @@ -12,6 +12,7 @@ > #include <linux/module.h> > #include <linux/of.h> > #include <linux/platform_device.h> > +#include <linux/pm_domain.h> > #include <linux/pm_runtime.h> > #include <linux/reset.h> > #include <linux/units.h> > @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, > struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); > int ret; > > + /* > + * The device may be part of a power domain that is currently > + * powered off. We need to power it up before accessing registers. > + * We don't undo the dev_pm_genpd_resume() as the device need to > + * be up for the reboot to happen. Also, as we are in atomic context > + * here there is no need to increment PM runtime usage counter > + * (to make sure pm_runtime_active() doesn't return wrong code). > + */ > + if (!pm_runtime_active(wdev->parent)) > + dev_pm_genpd_resume(wdev->parent); > + I doubt this is the correct solution, but I may be wrong. Unless this is invoked at the syscore stage? > clk_prepare_enable(priv->pclk); > clk_prepare_enable(priv->osc_clk); > > -- > 2.39.2 > > Can you redirectly me to the complete series, so I can have a better overview of the problem? Kind regards Uffe
On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@tuxon.dev> wrote: > > From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > The rzg2l_wdt_restart() is called from atomic context. Calling > pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume > APIs is not an option as it may lead to issues as described in commit > e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") > that removed the pm_runtime_get_sync() and used directly the > clk_prepare_enable() APIs. Calling clk_prepare_enable() doesn't work from an atomic context either as it may sleep on the clk prepare mutex. As I said in the other reply too, it looks like we need a different solution. I am not sure what, but I am happy to help discuss it. > > Starting with RZ/G3S the watchdog could be part of its own software > controlled power domain (see the initial implementation in Link section). > In case the watchdog is not used the power domain is off and accessing > watchdog registers leads to aborts. > > To solve this the patch powers on the power domain using > dev_pm_genpd_resume() API before enabling its clock. This is not > sleeping or taking any other locks as the power domain will not be > registered with GENPD_FLAG_IRQ_SAFE flags. > > Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Kind regards Uffe > --- > > Changes in v8: > - none, this patch is new > > drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c > index c8c20cfb97a3..98e5e9914a5d 100644 > --- a/drivers/watchdog/rzg2l_wdt.c > +++ b/drivers/watchdog/rzg2l_wdt.c > @@ -12,6 +12,7 @@ > #include <linux/module.h> > #include <linux/of.h> > #include <linux/platform_device.h> > +#include <linux/pm_domain.h> > #include <linux/pm_runtime.h> > #include <linux/reset.h> > #include <linux/units.h> > @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, > struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); > int ret; > > + /* > + * The device may be part of a power domain that is currently > + * powered off. We need to power it up before accessing registers. > + * We don't undo the dev_pm_genpd_resume() as the device need to > + * be up for the reboot to happen. Also, as we are in atomic context > + * here there is no need to increment PM runtime usage counter > + * (to make sure pm_runtime_active() doesn't return wrong code). > + */ > + if (!pm_runtime_active(wdev->parent)) > + dev_pm_genpd_resume(wdev->parent); > + > clk_prepare_enable(priv->pclk); > clk_prepare_enable(priv->osc_clk); > > -- > 2.39.2 > >
Hi, Ulf, On 12.04.2024 14:14, Ulf Hansson wrote: > On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@tuxon.dev> wrote: >> >> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> >> >> The rzg2l_wdt_restart() is called from atomic context. Calling >> pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume >> APIs is not an option as it may lead to issues as described in commit >> e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") >> that removed the pm_runtime_get_sync() and used directly the >> clk_prepare_enable() APIs. >> >> Starting with RZ/G3S the watchdog could be part of its own software >> controlled power domain (see the initial implementation in Link section). >> In case the watchdog is not used the power domain is off and accessing >> watchdog registers leads to aborts. >> >> To solve this the patch powers on the power domain using >> dev_pm_genpd_resume() API before enabling its clock. This is not >> sleeping or taking any other locks as the power domain will not be >> registered with GENPD_FLAG_IRQ_SAFE flags. >> >> Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com >> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> >> --- >> >> Changes in v8: >> - none, this patch is new >> >> drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c >> index c8c20cfb97a3..98e5e9914a5d 100644 >> --- a/drivers/watchdog/rzg2l_wdt.c >> +++ b/drivers/watchdog/rzg2l_wdt.c >> @@ -12,6 +12,7 @@ >> #include <linux/module.h> >> #include <linux/of.h> >> #include <linux/platform_device.h> >> +#include <linux/pm_domain.h> >> #include <linux/pm_runtime.h> >> #include <linux/reset.h> >> #include <linux/units.h> >> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, >> struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); >> int ret; >> >> + /* >> + * The device may be part of a power domain that is currently >> + * powered off. We need to power it up before accessing registers. >> + * We don't undo the dev_pm_genpd_resume() as the device need to >> + * be up for the reboot to happen. Also, as we are in atomic context >> + * here there is no need to increment PM runtime usage counter >> + * (to make sure pm_runtime_active() doesn't return wrong code). >> + */ >> + if (!pm_runtime_active(wdev->parent)) >> + dev_pm_genpd_resume(wdev->parent); >> + > > I doubt this is the correct solution, but I may be wrong. Unless this > is invoked at the syscore stage? On my case I see it invoked from kernel_restart(). As of my code reading, at that point only one CPU is active with IRQs disabled (done in machine_restart()). Below is the stack trace decoded on next-20240410 with this series (https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/) on top and the one from here (adding power domain support): https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT) Call trace: dump_backtrace (arch/arm64/kernel/stacktrace.c:319) show_stack (arch/arm64/kernel/stacktrace.c:326) dump_stack_lvl (lib/dump_stack.c:117) dump_stack (lib/dump_stack.c:124) rzg2l_wdt_restart (drivers/watchdog/rzg2l_wdt.c:180) watchdog_restart_notifier (drivers/watchdog/watchdog_core.c:188) atomic_notifier_call_chain (kernel/notifier.c:98 kernel/notifier.c:231) do_kernel_restart (kernel/reboot.c:236) machine_restart (arch/arm64/kernel/process.c:145) kernel_restart (kernel/reboot.c:287) __do_sys_reboot (kernel/reboot.c:755) __arm64_sys_reboot (kernel/reboot.c:715) invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:53) el0_svc_common.constprop.0 (include/linux/thread_info.h:127 arch/arm64/kernel/syscall.c:141) do_el0_svc (arch/arm64/kernel/syscall.c:153) el0_svc (arch/arm64/include/asm/irqflags.h:56 arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731) el0t_64_sync (arch/arm64/kernel/entry.S:598) The watchdog restart handler is added in restart_handler_list and this list is invoked though do_kernel_restart(). As of my code investigation the restart_handler_list is invoked only though do_kernel_restart() and only though the stack trace above. Thank you, Claudiu Beznea > >> clk_prepare_enable(priv->pclk); >> clk_prepare_enable(priv->osc_clk); >> >> -- >> 2.39.2 >> >> > > Can you redirectly me to the complete series, so I can have a better > overview of the problem? This is the series that adds power domain support for RZ/G3S SoC: https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ This is the series that adds watchdog support for RZ/G3S SoC: https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/ Thank you for your review, Claudiu Beznea > > Kind regards > Uffe
Hi, Ulf, On 12.04.2024 17:02, claudiu beznea wrote: > Hi, Ulf, > > On 12.04.2024 14:14, Ulf Hansson wrote: >> On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@tuxon.dev> wrote: >>> >>> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> >>> >>> The rzg2l_wdt_restart() is called from atomic context. Calling >>> pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume >>> APIs is not an option as it may lead to issues as described in commit >>> e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") >>> that removed the pm_runtime_get_sync() and used directly the >>> clk_prepare_enable() APIs. >>> >>> Starting with RZ/G3S the watchdog could be part of its own software >>> controlled power domain (see the initial implementation in Link section). >>> In case the watchdog is not used the power domain is off and accessing >>> watchdog registers leads to aborts. >>> >>> To solve this the patch powers on the power domain using >>> dev_pm_genpd_resume() API before enabling its clock. This is not >>> sleeping or taking any other locks as the power domain will not be >>> registered with GENPD_FLAG_IRQ_SAFE flags. >>> >>> Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com >>> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> >>> --- >>> >>> Changes in v8: >>> - none, this patch is new >>> >>> drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ >>> 1 file changed, 12 insertions(+) >>> >>> diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c >>> index c8c20cfb97a3..98e5e9914a5d 100644 >>> --- a/drivers/watchdog/rzg2l_wdt.c >>> +++ b/drivers/watchdog/rzg2l_wdt.c >>> @@ -12,6 +12,7 @@ >>> #include <linux/module.h> >>> #include <linux/of.h> >>> #include <linux/platform_device.h> >>> +#include <linux/pm_domain.h> >>> #include <linux/pm_runtime.h> >>> #include <linux/reset.h> >>> #include <linux/units.h> >>> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, >>> struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); >>> int ret; >>> >>> + /* >>> + * The device may be part of a power domain that is currently >>> + * powered off. We need to power it up before accessing registers. >>> + * We don't undo the dev_pm_genpd_resume() as the device need to >>> + * be up for the reboot to happen. Also, as we are in atomic context >>> + * here there is no need to increment PM runtime usage counter >>> + * (to make sure pm_runtime_active() doesn't return wrong code). >>> + */ >>> + if (!pm_runtime_active(wdev->parent)) >>> + dev_pm_genpd_resume(wdev->parent); >>> + >> >> I doubt this is the correct solution, but I may be wrong. Unless this >> is invoked at the syscore stage? > > On my case I see it invoked from kernel_restart(). As of my code reading, With the above explanations, do you consider calling dev_pm_genpd_resume() here is still wrong? Do you have any suggestions I could try? Thank you, Claudiu Beznea > at that point only one CPU is active with IRQs disabled (done in > machine_restart()). Below is the stack trace decoded on next-20240410 with > this series > (https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/) > on top and the one from here (adding power domain support): > https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ > > Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT) > Call trace: > dump_backtrace (arch/arm64/kernel/stacktrace.c:319) > show_stack (arch/arm64/kernel/stacktrace.c:326) > dump_stack_lvl (lib/dump_stack.c:117) > dump_stack (lib/dump_stack.c:124) > rzg2l_wdt_restart (drivers/watchdog/rzg2l_wdt.c:180) > watchdog_restart_notifier (drivers/watchdog/watchdog_core.c:188) > atomic_notifier_call_chain (kernel/notifier.c:98 kernel/notifier.c:231) > do_kernel_restart (kernel/reboot.c:236) > machine_restart (arch/arm64/kernel/process.c:145) > kernel_restart (kernel/reboot.c:287) > __do_sys_reboot (kernel/reboot.c:755) > __arm64_sys_reboot (kernel/reboot.c:715) > invoke_syscall (arch/arm64/include/asm/current.h:19 > arch/arm64/kernel/syscall.c:53) > el0_svc_common.constprop.0 (include/linux/thread_info.h:127 > arch/arm64/kernel/syscall.c:141) > do_el0_svc (arch/arm64/kernel/syscall.c:153) > el0_svc (arch/arm64/include/asm/irqflags.h:56 > arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165 > arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) > el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731) > el0t_64_sync (arch/arm64/kernel/entry.S:598) > > The watchdog restart handler is added in restart_handler_list and this list > is invoked though do_kernel_restart(). As of my code investigation the > restart_handler_list is invoked only though do_kernel_restart() and only > though the stack trace above. > > Thank you, > Claudiu Beznea > >> >>> clk_prepare_enable(priv->pclk); >>> clk_prepare_enable(priv->osc_clk); >>> >>> -- >>> 2.39.2 >>> >>> >> >> Can you redirectly me to the complete series, so I can have a better >> overview of the problem? > > This is the series that adds power domain support for RZ/G3S SoC: > https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ > > This is the series that adds watchdog support for RZ/G3S SoC: > https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/ > > Thank you for your review, > Claudiu Beznea > >> >> Kind regards >> Uffe
On Wed, 24 Apr 2024 at 13:14, claudiu beznea <claudiu.beznea@tuxon.dev> wrote: > > Hi, Ulf, > > On 12.04.2024 17:02, claudiu beznea wrote: > > Hi, Ulf, > > > > On 12.04.2024 14:14, Ulf Hansson wrote: > >> On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@tuxon.dev> wrote: > >>> > >>> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > >>> > >>> The rzg2l_wdt_restart() is called from atomic context. Calling > >>> pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume > >>> APIs is not an option as it may lead to issues as described in commit > >>> e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") > >>> that removed the pm_runtime_get_sync() and used directly the > >>> clk_prepare_enable() APIs. > >>> > >>> Starting with RZ/G3S the watchdog could be part of its own software > >>> controlled power domain (see the initial implementation in Link section). > >>> In case the watchdog is not used the power domain is off and accessing > >>> watchdog registers leads to aborts. > >>> > >>> To solve this the patch powers on the power domain using > >>> dev_pm_genpd_resume() API before enabling its clock. This is not > >>> sleeping or taking any other locks as the power domain will not be > >>> registered with GENPD_FLAG_IRQ_SAFE flags. > >>> > >>> Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com > >>> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > >>> --- > >>> > >>> Changes in v8: > >>> - none, this patch is new > >>> > >>> drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ > >>> 1 file changed, 12 insertions(+) > >>> > >>> diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c > >>> index c8c20cfb97a3..98e5e9914a5d 100644 > >>> --- a/drivers/watchdog/rzg2l_wdt.c > >>> +++ b/drivers/watchdog/rzg2l_wdt.c > >>> @@ -12,6 +12,7 @@ > >>> #include <linux/module.h> > >>> #include <linux/of.h> > >>> #include <linux/platform_device.h> > >>> +#include <linux/pm_domain.h> > >>> #include <linux/pm_runtime.h> > >>> #include <linux/reset.h> > >>> #include <linux/units.h> > >>> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, > >>> struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); > >>> int ret; > >>> > >>> + /* > >>> + * The device may be part of a power domain that is currently > >>> + * powered off. We need to power it up before accessing registers. > >>> + * We don't undo the dev_pm_genpd_resume() as the device need to > >>> + * be up for the reboot to happen. Also, as we are in atomic context > >>> + * here there is no need to increment PM runtime usage counter > >>> + * (to make sure pm_runtime_active() doesn't return wrong code). > >>> + */ > >>> + if (!pm_runtime_active(wdev->parent)) > >>> + dev_pm_genpd_resume(wdev->parent); > >>> + > >> > >> I doubt this is the correct solution, but I may be wrong. Unless this > >> is invoked at the syscore stage? > > > > On my case I see it invoked from kernel_restart(). As of my code reading, > > With the above explanations, do you consider calling dev_pm_genpd_resume() > here is still wrong? Yes. At least, those genpd functions were not added to cope for cases like this. Moreover, you still need to find another solution as clk_prepare_enable() can't be called in this path. > > Do you have any suggestions I could try? Not at the moment, but I will try to circle back to this topic more thinking next week, when I have some more time. > > Thank you, > Claudiu Beznea Kind regards Uffe > > > at that point only one CPU is active with IRQs disabled (done in > > machine_restart()). Below is the stack trace decoded on next-20240410 with > > this series > > (https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/) > > on top and the one from here (adding power domain support): > > https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ > > > > Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT) > > Call trace: > > dump_backtrace (arch/arm64/kernel/stacktrace.c:319) > > show_stack (arch/arm64/kernel/stacktrace.c:326) > > dump_stack_lvl (lib/dump_stack.c:117) > > dump_stack (lib/dump_stack.c:124) > > rzg2l_wdt_restart (drivers/watchdog/rzg2l_wdt.c:180) > > watchdog_restart_notifier (drivers/watchdog/watchdog_core.c:188) > > atomic_notifier_call_chain (kernel/notifier.c:98 kernel/notifier.c:231) > > do_kernel_restart (kernel/reboot.c:236) > > machine_restart (arch/arm64/kernel/process.c:145) > > kernel_restart (kernel/reboot.c:287) > > __do_sys_reboot (kernel/reboot.c:755) > > __arm64_sys_reboot (kernel/reboot.c:715) > > invoke_syscall (arch/arm64/include/asm/current.h:19 > > arch/arm64/kernel/syscall.c:53) > > el0_svc_common.constprop.0 (include/linux/thread_info.h:127 > > arch/arm64/kernel/syscall.c:141) > > do_el0_svc (arch/arm64/kernel/syscall.c:153) > > el0_svc (arch/arm64/include/asm/irqflags.h:56 > > arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165 > > arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) > > el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731) > > el0t_64_sync (arch/arm64/kernel/entry.S:598) > > > > The watchdog restart handler is added in restart_handler_list and this list > > is invoked though do_kernel_restart(). As of my code investigation the > > restart_handler_list is invoked only though do_kernel_restart() and only > > though the stack trace above. > > > > Thank you, > > Claudiu Beznea > > > >> > >>> clk_prepare_enable(priv->pclk); > >>> clk_prepare_enable(priv->osc_clk); > >>> > >>> -- > >>> 2.39.2 > >>> > >>> > >> > >> Can you redirectly me to the complete series, so I can have a better > >> overview of the problem? > > > > This is the series that adds power domain support for RZ/G3S SoC: > > https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ > > > > This is the series that adds watchdog support for RZ/G3S SoC: > > https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/ > > > > Thank you for your review, > > Claudiu Beznea > > > >> > >> Kind regards > >> Uffe
Hi, Ulf, On 29.04.2024 17:19, Ulf Hansson wrote: > On Wed, 24 Apr 2024 at 13:14, claudiu beznea <claudiu.beznea@tuxon.dev> wrote: >> >> Hi, Ulf, >> >> On 12.04.2024 17:02, claudiu beznea wrote: >>> Hi, Ulf, >>> >>> On 12.04.2024 14:14, Ulf Hansson wrote: >>>> On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@tuxon.dev> wrote: >>>>> >>>>> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> >>>>> >>>>> The rzg2l_wdt_restart() is called from atomic context. Calling >>>>> pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume >>>>> APIs is not an option as it may lead to issues as described in commit >>>>> e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'") >>>>> that removed the pm_runtime_get_sync() and used directly the >>>>> clk_prepare_enable() APIs. >>>>> >>>>> Starting with RZ/G3S the watchdog could be part of its own software >>>>> controlled power domain (see the initial implementation in Link section). >>>>> In case the watchdog is not used the power domain is off and accessing >>>>> watchdog registers leads to aborts. >>>>> >>>>> To solve this the patch powers on the power domain using >>>>> dev_pm_genpd_resume() API before enabling its clock. This is not >>>>> sleeping or taking any other locks as the power domain will not be >>>>> registered with GENPD_FLAG_IRQ_SAFE flags. >>>>> >>>>> Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@bp.renesas.com >>>>> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> >>>>> --- >>>>> >>>>> Changes in v8: >>>>> - none, this patch is new >>>>> >>>>> drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++ >>>>> 1 file changed, 12 insertions(+) >>>>> >>>>> diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c >>>>> index c8c20cfb97a3..98e5e9914a5d 100644 >>>>> --- a/drivers/watchdog/rzg2l_wdt.c >>>>> +++ b/drivers/watchdog/rzg2l_wdt.c >>>>> @@ -12,6 +12,7 @@ >>>>> #include <linux/module.h> >>>>> #include <linux/of.h> >>>>> #include <linux/platform_device.h> >>>>> +#include <linux/pm_domain.h> >>>>> #include <linux/pm_runtime.h> >>>>> #include <linux/reset.h> >>>>> #include <linux/units.h> >>>>> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, >>>>> struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); >>>>> int ret; >>>>> >>>>> + /* >>>>> + * The device may be part of a power domain that is currently >>>>> + * powered off. We need to power it up before accessing registers. >>>>> + * We don't undo the dev_pm_genpd_resume() as the device need to >>>>> + * be up for the reboot to happen. Also, as we are in atomic context >>>>> + * here there is no need to increment PM runtime usage counter >>>>> + * (to make sure pm_runtime_active() doesn't return wrong code). >>>>> + */ >>>>> + if (!pm_runtime_active(wdev->parent)) >>>>> + dev_pm_genpd_resume(wdev->parent); >>>>> + >>>> >>>> I doubt this is the correct solution, but I may be wrong. Unless this >>>> is invoked at the syscore stage? >>> >>> On my case I see it invoked from kernel_restart(). As of my code reading, >> >> With the above explanations, do you consider calling dev_pm_genpd_resume() >> here is still wrong? > > Yes. At least, those genpd functions were not added to cope for cases like this. Sorry to bother you, do you have some suggestions on this topic? On my side I did some investigation to see how else it could be implemented but I don't have much clue how to go forward. Would you prefer to have a separate API to deal with domain power on in this scenario? Maybe one that should run only in the reboot context? Would you consider only updating the description of dev_pm_genpd_resume() and genpd_sync_power_on() to specify that it could run in a reboot context? Would you consider updating the genpd_switch_state() to take into system reboot state and do locking based on that, too? > > Moreover, you still need to find another solution as > clk_prepare_enable() can't be called in this path. The clock driver doesn't implement clk_ops::prepare in all micro-architectures that this watchdog driver is used. This may be the reason the clk_prepare_enable() was used on this path from the beginning. Even though, a simple solution I have in mind for this is to keep the clk prepared all the time. Thank you, Claudiu Beznea > >> >> Do you have any suggestions I could try? > > Not at the moment, but I will try to circle back to this topic more > thinking next week, when I have some more time. > >> >> Thank you, >> Claudiu Beznea > > Kind regards > Uffe > >> >>> at that point only one CPU is active with IRQs disabled (done in >>> machine_restart()). Below is the stack trace decoded on next-20240410 with >>> this series >>> (https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/) >>> on top and the one from here (adding power domain support): >>> https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ >>> >>> Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT) >>> Call trace: >>> dump_backtrace (arch/arm64/kernel/stacktrace.c:319) >>> show_stack (arch/arm64/kernel/stacktrace.c:326) >>> dump_stack_lvl (lib/dump_stack.c:117) >>> dump_stack (lib/dump_stack.c:124) >>> rzg2l_wdt_restart (drivers/watchdog/rzg2l_wdt.c:180) >>> watchdog_restart_notifier (drivers/watchdog/watchdog_core.c:188) >>> atomic_notifier_call_chain (kernel/notifier.c:98 kernel/notifier.c:231) >>> do_kernel_restart (kernel/reboot.c:236) >>> machine_restart (arch/arm64/kernel/process.c:145) >>> kernel_restart (kernel/reboot.c:287) >>> __do_sys_reboot (kernel/reboot.c:755) >>> __arm64_sys_reboot (kernel/reboot.c:715) >>> invoke_syscall (arch/arm64/include/asm/current.h:19 >>> arch/arm64/kernel/syscall.c:53) >>> el0_svc_common.constprop.0 (include/linux/thread_info.h:127 >>> arch/arm64/kernel/syscall.c:141) >>> do_el0_svc (arch/arm64/kernel/syscall.c:153) >>> el0_svc (arch/arm64/include/asm/irqflags.h:56 >>> arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165 >>> arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) >>> el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731) >>> el0t_64_sync (arch/arm64/kernel/entry.S:598) >>> >>> The watchdog restart handler is added in restart_handler_list and this list >>> is invoked though do_kernel_restart(). As of my code investigation the >>> restart_handler_list is invoked only though do_kernel_restart() and only >>> though the stack trace above. >>> >>> Thank you, >>> Claudiu Beznea >>> >>>> >>>>> clk_prepare_enable(priv->pclk); >>>>> clk_prepare_enable(priv->osc_clk); >>>>> >>>>> -- >>>>> 2.39.2 >>>>> >>>>> >>>> >>>> Can you redirectly me to the complete series, so I can have a better >>>> overview of the problem? >>> >>> This is the series that adds power domain support for RZ/G3S SoC: >>> https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@bp.renesas.com/ >>> >>> This is the series that adds watchdog support for RZ/G3S SoC: >>> https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@bp.renesas.com/ >>> >>> Thank you for your review, >>> Claudiu Beznea >>> >>>> >>>> Kind regards >>>> Uffe
diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c index c8c20cfb97a3..98e5e9914a5d 100644 --- a/drivers/watchdog/rzg2l_wdt.c +++ b/drivers/watchdog/rzg2l_wdt.c @@ -12,6 +12,7 @@ #include <linux/module.h> #include <linux/of.h> #include <linux/platform_device.h> +#include <linux/pm_domain.h> #include <linux/pm_runtime.h> #include <linux/reset.h> #include <linux/units.h> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev, struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev); int ret; + /* + * The device may be part of a power domain that is currently + * powered off. We need to power it up before accessing registers. + * We don't undo the dev_pm_genpd_resume() as the device need to + * be up for the reboot to happen. Also, as we are in atomic context + * here there is no need to increment PM runtime usage counter + * (to make sure pm_runtime_active() doesn't return wrong code). + */ + if (!pm_runtime_active(wdev->parent)) + dev_pm_genpd_resume(wdev->parent); + clk_prepare_enable(priv->pclk); clk_prepare_enable(priv->osc_clk);