
[2/2] thermal: rcar_thermal: use pm_runtime_put_sync()

Message ID 87d1vili9c.wl%kuninori.morimoto.gx@renesas.com (mailing list archive)
State Rejected
Delegated to: Geert Uytterhoeven

Commit Message

Kuninori Morimoto Nov. 10, 2015, 2:12 a.m. UTC
From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>

The driver calls pm_runtime_get_sync() in probe(). Use
pm_runtime_put_sync() instead of pm_runtime_put() in remove(); otherwise
the thermal sensor doesn't work after unbind/re-bind.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
---
 drivers/thermal/rcar_thermal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Geert Uytterhoeven Nov. 10, 2015, 8:18 a.m. UTC | #1
Hi Morimoto-san, Ulf,

On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto
<kuninori.morimoto.gx@renesas.com> wrote:
> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>
> It is using pm_runtime_get_sync() on probe(). Let's use
> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal
> sensor doesn't work after unbind/re-bind
>
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> ---
>  drivers/thermal/rcar_thermal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
> index 13d01ed..f7cf2d7 100644
> --- a/drivers/thermal/rcar_thermal.c
> +++ b/drivers/thermal/rcar_thermal.c
> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev)
>                 thermal_zone_device_unregister(priv->zone);
>         }
>
> -       pm_runtime_put(dev);
> +       pm_runtime_put_sync(dev);
>         pm_runtime_disable(dev);
>
>         return 0;

While I can confirm this fixes the issue, I think this is a bug in the PM
core, and thus your patch is merely a workaround.

Morimoto-san: I assume this is a recent regression. Have you tried to bisect?

With a bit more debugging info, this is the difference between the failing
and the "fixed" cases:

 unbind:

+rcar_thermal e61f0000.thermal: pm_clk_suspend()
+renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
 rcar_thermal e61f0000.thermal: removing from PM domain clock-controller
 pm_genpd_remove_device: Remove e61f0000.thermal from clock-controller
-renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF

 bind:

 rcar_thermal e61f0000.thermal: adding to PM domain clock-controller
 __pm_genpd_add_device: Add e61f0000.thermal to clock-controller
 rcar_thermal e61f0000.thermal: Clock thermal con_id (null) managed by runtime PM.
-rcar_thermal e61f0000.thermal: thermal sensor was broken
+rcar_thermal e61f0000.thermal: pm_clk_resume()
+renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal ON
 rcar_thermal e61f0000.thermal: 1 sensor probed

In the failing case, pm_clk_suspend() is not called, and turning off the
module clock is thus delayed until removal of the device from the clock
domain.
But as pm_clk_suspend() wasn't called, the device isn't correctly resumed on
rebind, and the module clock is never re-enabled, leading to a failure.
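
The unbind half of this sequence can be illustrated with a small userspace
model. This is only a sketch: struct toy_dev and the toy_*() helpers are
hypothetical stand-ins, not kernel APIs, and the model assumes that disabling
runtime PM cancels a still-queued idle request instead of completing it,
which is what the log above suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of runtime PM bookkeeping; not the kernel's data structures. */
enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

struct toy_dev {
	enum rpm_status status; /* state recorded by the PM core */
	int usage_count;        /* references taken via "get" */
	bool enabled;           /* is runtime PM enabled? */
	bool idle_pending;      /* asynchronous idle request queued? */
	int suspend_calls;      /* how often the suspend callback ran */
};

/* Stands in for the suspend callback, i.e. pm_clk_suspend() in the log. */
void toy_suspend(struct toy_dev *d)
{
	d->status = RPM_SUSPENDED;
	d->suspend_calls++;
}

/* Asynchronous put: drop the reference and only queue an idle request. */
void toy_put(struct toy_dev *d)
{
	assert(d->usage_count > 0);
	if (--d->usage_count == 0 && d->enabled)
		d->idle_pending = true;
}

/* Synchronous put: drop the reference and suspend before returning. */
void toy_put_sync(struct toy_dev *d)
{
	assert(d->usage_count > 0);
	if (--d->usage_count == 0 && d->enabled)
		toy_suspend(d);
}

/* Disable runtime PM; ASSUMPTION: a queued request is dropped, not run. */
void toy_disable(struct toy_dev *d)
{
	d->idle_pending = false;
	d->enabled = false;
}

/* The remove() sequence; returns the status the PM core is left with. */
enum rpm_status toy_remove(struct toy_dev *d, bool use_sync)
{
	if (use_sync)
		toy_put_sync(d);
	else
		toy_put(d);
	toy_disable(d);
	return d->status;
}
```

In this model, toy_remove(&dev, false) leaves the recorded status at
RPM_ACTIVE without ever running the suspend callback, while
toy_remove(&dev, true) runs it once and ends in RPM_SUSPENDED.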

Ulf, what do you think?

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Nov. 10, 2015, 9:57 a.m. UTC | #2
On 10 November 2015 at 09:18, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> Hi Morimoto-san, Ulf,
>
> On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto
> <kuninori.morimoto.gx@renesas.com> wrote:
>> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>
>> It is using pm_runtime_get_sync() on probe(). Let's use
>> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal
>> sensor doesn't work after unbind/re-bind
>>
>> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>> ---
>>  drivers/thermal/rcar_thermal.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
>> index 13d01ed..f7cf2d7 100644
>> --- a/drivers/thermal/rcar_thermal.c
>> +++ b/drivers/thermal/rcar_thermal.c
>> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev)
>>                 thermal_zone_device_unregister(priv->zone);
>>         }
>>
>> -       pm_runtime_put(dev);
>> +       pm_runtime_put_sync(dev);
>>         pm_runtime_disable(dev);

For the reasons explained by Geert, this is to me also a "workaround".

I would replace pm_runtime_put() and pm_runtime_disable() with a call
to pm_runtime_force_suspend().

In that way, you will make sure your device gets runtime suspended
(clock domain will gate the clock). Additionally, the runtime PM
status will properly reflect the status of the device.
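
As a rough userspace illustration of what the pm_runtime_force_suspend()
suggestion is expected to achieve: disable runtime PM, invoke the suspend
callback directly unless the device is already suspended, and record the new
status. struct toy_dev and the toy_*() functions are hypothetical names for
this sketch, which is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct toy_dev {
	enum rpm_status status; /* state recorded by the PM core */
	bool enabled;           /* is runtime PM enabled? */
	bool clk_on;            /* module clock, gated by the suspend callback */
};

/* Stands in for the ->runtime_suspend() callback of the clock domain. */
int toy_runtime_suspend(struct toy_dev *d)
{
	d->clk_on = false;
	return 0;
}

/*
 * Model of a forced suspend: disable runtime PM first, then call the
 * suspend callback directly and update the recorded status.
 */
int toy_force_suspend(struct toy_dev *d)
{
	int ret;

	d->enabled = false;
	if (d->status == RPM_SUSPENDED)
		return 0;            /* nothing left to do */

	ret = toy_runtime_suspend(d);
	if (ret) {
		d->enabled = true;   /* roll back on failure */
		return ret;
	}
	assert(!d->clk_on);          /* the callback must have gated the clock */
	d->status = RPM_SUSPENDED;
	return 0;
}
```

Starting from an active device with its clock running, the model ends with
the clock gated, runtime PM disabled, and the recorded status set to
suspended, so bookkeeping and hardware agree.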

>>
>>         return 0;
>
> While I can confirm this fixes the issue, I think this is a bug in the PM
> core, and thus your patch is merely a workaround.
>
> Morimoto-san: I assume this is a recent regression. Have you tried to bisect?
>
> With a bit more debugging info, this is the difference between the failing
> and the "fixed" cases:
>
>  unbind:
>
> +rcar_thermal e61f0000.thermal: pm_clk_suspend()
> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
>  rcar_thermal e61f0000.thermal: removing from PM domain clock-controller
>  pm_genpd_remove_device: Remove e61f0000.thermal from clock-controller
> -renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
>
>  bind:
>
>  rcar_thermal e61f0000.thermal: adding to PM domain clock-controller
>  __pm_genpd_add_device: Add e61f0000.thermal to clock-controller
>  rcar_thermal e61f0000.thermal: Clock thermal con_id (null) managed by
> runtime PM.
> -rcar_thermal e61f0000.thermal: thermal sensor was broken
> +rcar_thermal e61f0000.thermal: pm_clk_resume()
> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal ON
>  rcar_thermal e61f0000.thermal: 1 sensor probed
>
> In the failing case, pm_clk_suspend() is not called, and turning off the
> module clock is thus delayed until removal of the device from the clock
> domain.
> But as pm_clk_suspend() wasn't called, the device isn't correctly resumed on
> rebind, and the module clock is never re-enabled, leading to a failure.
>
> Ulf, what do you think?

I totally agree with your analysis.

The problem is that the runtime PM status of the device isn't
correctly updated at ->remove(). The effect is that the
pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
->runtime_resume() callbacks to be invoked, as the runtime PM core
believes the device is already runtime resumed.
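
The re-bind side can be sketched in userspace as follows. The names
(struct toy_dev, toy_get_sync(), toy_runtime_resume()) are made up for
illustration; the only behaviour borrowed from the kernel is that a
synchronous get takes a reference and skips the resume callback when the
recorded status is already active.

```c
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct toy_dev {
	enum rpm_status status; /* what the PM core believes */
	int usage_count;        /* references taken via "get" */
	bool clk_on;            /* what the hardware is actually doing */
	int resume_calls;       /* how often the resume callback ran */
};

/* Stands in for ->runtime_resume(), e.g. ungating the module clock. */
void toy_runtime_resume(struct toy_dev *d)
{
	d->clk_on = true;
	d->resume_calls++;
}

/*
 * Take a reference and resume only if the recorded status says the device
 * is not active. Returns 1 if nothing had to be done, 0 after a resume.
 */
int toy_get_sync(struct toy_dev *d)
{
	assert(d->usage_count >= 0);
	d->usage_count++;
	if (d->status == RPM_ACTIVE)
		return 1;            /* callback skipped */
	toy_runtime_resume(d);
	d->status = RPM_ACTIVE;
	return 0;
}
```

With a stale status (RPM_ACTIVE recorded, clock actually off) the callback
is skipped and the clock stays off; with RPM_SUSPENDED recorded, the same
call ungates the clock.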

Kind regards
Uffe
Geert Uytterhoeven Nov. 10, 2015, 10:10 a.m. UTC | #3
Hi Ulf,

On Tue, Nov 10, 2015 at 10:57 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 10 November 2015 at 09:18, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto
>> <kuninori.morimoto.gx@renesas.com> wrote:
>>> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>>
>>> It is using pm_runtime_get_sync() on probe(). Let's use
>>> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal
>>> sensor doesn't work after unbind/re-bind
>>>
>>> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>> ---
>>>  drivers/thermal/rcar_thermal.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
>>> index 13d01ed..f7cf2d7 100644
>>> --- a/drivers/thermal/rcar_thermal.c
>>> +++ b/drivers/thermal/rcar_thermal.c
>>> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev)
>>>                 thermal_zone_device_unregister(priv->zone);
>>>         }
>>>
>>> -       pm_runtime_put(dev);
>>> +       pm_runtime_put_sync(dev);
>>>         pm_runtime_disable(dev);
>
> For the reasons explained by Geert, this is to me also a "workaround".
>
> I would replace pm_runtime_put() and pm_runtime_disable() with a call
> to pm_runtime_force_suspend().
>
> In that way, you will make sure your device gets runtime suspended
> (clock domain will gate the clock). Additionally, the runtime PM
> status will properly reflect the status of the device.

That still sounds like a workaround to me, which we have to apply to all
drivers relying on Runtime PM?

>> With a bit more debugging info, this is the difference between the failing
>> and the "fixed" cases:
>>
>>  unbind:
>>
>> +rcar_thermal e61f0000.thermal: pm_clk_suspend()
>> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
>>  rcar_thermal e61f0000.thermal: removing from PM domain clock-controller
>>  pm_genpd_remove_device: Remove e61f0000.thermal from clock-controller
>> -renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
>>
>>  bind:
>>
>>  rcar_thermal e61f0000.thermal: adding to PM domain clock-controller
>>  __pm_genpd_add_device: Add e61f0000.thermal to clock-controller
>>  rcar_thermal e61f0000.thermal: Clock thermal con_id (null) managed by
>> runtime PM.
>> -rcar_thermal e61f0000.thermal: thermal sensor was broken
>> +rcar_thermal e61f0000.thermal: pm_clk_resume()
>> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal ON
>>  rcar_thermal e61f0000.thermal: 1 sensor probed
>>
>> In the failing case, pm_clk_suspend() is not called, and turning off the
>> module clock is thus delayed until removal of the device from the clock
>> domain.
>> But as pm_clk_suspend() wasn't called, the device isn't correctly resumed on
>> rebind, and the module clock is never re-enabled, leading to a failure.
>>
>> Ulf, what do you think?
>
> I totally agree with your analysis.
>
> The problem is that the runtime PM status of the device isn't
> correctly updated at ->remove(). The effect is that the
> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
> ->runtime_resume() callbacks to be invoked, as the runtime PM core
> believes the device is already runtime resumed.

So that's where it should be fixed?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Ulf Hansson Nov. 10, 2015, 1 p.m. UTC | #4
+Rafael, Alan

On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> Hi Ulf,
>
> On Tue, Nov 10, 2015 at 10:57 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 10 November 2015 at 09:18, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>>> On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto
>>> <kuninori.morimoto.gx@renesas.com> wrote:
>>>> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>>>
>>>> It is using pm_runtime_get_sync() on probe(). Let's use
>>>> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal
>>>> sensor doesn't work after unbind/re-bind
>>>>
>>>> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>>> ---
>>>>  drivers/thermal/rcar_thermal.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
>>>> index 13d01ed..f7cf2d7 100644
>>>> --- a/drivers/thermal/rcar_thermal.c
>>>> +++ b/drivers/thermal/rcar_thermal.c
>>>> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev)
>>>>                 thermal_zone_device_unregister(priv->zone);
>>>>         }
>>>>
>>>> -       pm_runtime_put(dev);
>>>> +       pm_runtime_put_sync(dev);
>>>>         pm_runtime_disable(dev);
>>
>> For the reasons explained by Geert, this is to me also a "workaround".
>>
>> I would replace pm_runtime_put() and pm_runtime_disable() with a call
>> to pm_runtime_force_suspend().
>>
>> In that way, you will make sure your device gets runtime suspended
>> (clock domain will gate the clock). Additionally, the runtime PM
>> status will properly reflect the status of the device.
>
> That still sounds like a workaround to me, which we have to apply to all
> drivers relying on Runtime PM?

Definitely not all drivers, but those that run pm_runtime_get_sync()
during ->probe() and expect the ->runtime_resume() callback to always
be invoked because of that. I guess we need to check which drivers
may suffer from this.

I wouldn't be surprised if at least a subset of the cases we find
are poorly designed from a PM point of view and won't even probe
successfully unless CONFIG_PM is set. Whatever that means...

>
>>> With a bit more debugging info, this is the difference between the failing
>>> and the "fixed" cases:
>>>
>>>  unbind:
>>>
>>> +rcar_thermal e61f0000.thermal: pm_clk_suspend()
>>> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
>>>  rcar_thermal e61f0000.thermal: removing from PM domain clock-controller
>>>  pm_genpd_remove_device: Remove e61f0000.thermal from clock-controller
>>> -renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF
>>>
>>>  bind:
>>>
>>>  rcar_thermal e61f0000.thermal: adding to PM domain clock-controller
>>>  __pm_genpd_add_device: Add e61f0000.thermal to clock-controller
>>>  rcar_thermal e61f0000.thermal: Clock thermal con_id (null) managed by
>>> runtime PM.
>>> -rcar_thermal e61f0000.thermal: thermal sensor was broken
>>> +rcar_thermal e61f0000.thermal: pm_clk_resume()
>>> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal ON
>>>  rcar_thermal e61f0000.thermal: 1 sensor probed
>>>
>>> In the failing case, pm_clk_suspend() is not called, and turning off the
>>> module clock is thus delayed until removal of the device from the clock
>>> domain.
>>> But as pm_clk_suspend() wasn't called, the device isn't correctly resumed on
>>> rebind, and the module clock is never re-enabled, leading to a failure.
>>>
>>> Ulf, what do you think?
>>
>> I totally agree with your analysis.
>>
>> The problem is that the runtime PM status of the device isn't
>> correctly updated at ->remove(). The effect is that the
>> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
>> ->runtime_resume() callbacks to be invoked, as the runtime PM core
>> believes the device is already runtime resumed.
>
> So that's where it should be fixed?

That would be a more generic approach, although I am not sure how the
driver/PM core should be able to take the correct decision in this
phase. Devices may be runtime PM managed also without a driver bound.

Perhaps when __device_release_driver() finds a bound driver for the
device, it could, after all actions to unbind the driver have been
performed, check whether runtime PM is enabled. If it isn't, it could
set the runtime PM status to suspended!?

I have no idea whether that would introduce other issues, as it would
kind of force the runtime PM status of the device to suspended without
actually knowing whether that's the correct thing to do.
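
In toy form, the idea might look like the sketch below. It is a userspace
model with invented names (struct toy_dev, toy_release_driver()), not a
proposed kernel patch: once the driver's remove step has run, a device left
with runtime PM disabled is simply marked as suspended.

```c
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct toy_dev {
	enum rpm_status status; /* state recorded by the PM core */
	bool enabled;           /* is runtime PM enabled? */
	bool bound;             /* is a driver bound? */
};

/* Hypothetical hook run after the driver's ->remove() has finished. */
void toy_release_driver(struct toy_dev *d)
{
	assert(d->bound);
	d->bound = false;

	/* Runtime PM was left disabled: assume the device is suspended. */
	if (!d->enabled)
		d->status = RPM_SUSPENDED;
}
```

A device whose bus or driver keeps runtime PM enabled after unbind is left
untouched by this check, so PM management without a bound driver is not
affected in the model.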

Kind regards
Uffe
Eduardo Valentin Nov. 10, 2015, 6:30 p.m. UTC | #5
Hi,

On Tue, Nov 10, 2015 at 02:00:38PM +0100, Ulf Hansson wrote:
> +Rafael, Alan
> 
> On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > Hi Ulf,
> >
> > On Tue, Nov 10, 2015 at 10:57 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >> On 10 November 2015 at 09:18, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >>> On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto
> >>> <kuninori.morimoto.gx@renesas.com> wrote:
> >>>> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> >>>>
> >>>> It is using pm_runtime_get_sync() on probe(). Let's use
> >>>> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal
> >>>> sensor doesn't work after unbind/re-bind
> >>>>
> >>>> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> >>>> ---
> >>>>  drivers/thermal/rcar_thermal.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
> >>>> index 13d01ed..f7cf2d7 100644
> >>>> --- a/drivers/thermal/rcar_thermal.c
> >>>> +++ b/drivers/thermal/rcar_thermal.c
> >>>> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev)
> >>>>                 thermal_zone_device_unregister(priv->zone);
> >>>>         }
> >>>>
> >>>> -       pm_runtime_put(dev);
> >>>> +       pm_runtime_put_sync(dev);
> >>>>         pm_runtime_disable(dev);
> >>
> >> For the reasons explained by Geert, this is to me also a "workaround".
> >>
> >> I would replace pm_runtime_put() and pm_runtime_disable() with a call
> >> to pm_runtime_force_suspend().
> >>
> >> In that way, you will make sure your device gets runtime suspended
> >> (clock domain will gate the clock). Additionally, the runtime PM
> >> status will properly reflect the status of the device.
> >
> > That still sounds like a workaround to me, which we have to apply to all
> > drivers relying on Runtime PM?
> 
> Definitely not all drivers, but those that runs pm_runtime_get_sync()
> during ->probe() and expects the ->runtime_resume() callback to always
> be invoked because of that. I guess we need to check upon which
> drivers that may suffer from this.
> 
> I wouldn't be surprised if at least a subset of those cases we find,
> are poorly designed from PM point of view and won't even probe
> successfully unless CONFIG_PM is set. Whatever that means...


Yeah, if this is indeed a bug in the runtime PM core, I would prefer it
to be properly fixed there, so that not only this driver benefits from it.

Rafael? Any thoughts?

BR,

Eduardo Valentin
Rafael J. Wysocki Nov. 10, 2015, 11:57 p.m. UTC | #6
On Tuesday, November 10, 2015 02:00:38 PM Ulf Hansson wrote:
> +Rafael, Alan
> 
> On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > Hi Ulf,
> >

[cut]

> >>
> >> The problem is that the runtime PM status of the device isn't
> >> correctly updated at ->remove(). The effect is that the
> >> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
> >> ->runtime_resume() callbacks to be invoked, as the runtime PM core
> >> believes the device is already runtime resumed.
> >
> > So that's where it should be fixed?
> 
> That would be a more generic approach, although I am not sure how the
> driver/PM core should be able to take the correct decision in this
> phase. Devices may be runtime PM managed also without a driver bound.
> 
> Perhaps when __device_release_driver() finds a bounded driver for the
> device, it could after all actions been performed to unbind the
> driver, check if runtime PM is enabled. If it isn't, it could set the
> runtime PM status to suspended!?
> 
> I have no idea if that would introduce other issues as it would kind
> of force the runtime PM status of the device to suspend, without
> actually knowing if it's the correct thing to do.

IMO, that needs to depend on the bus type.  If the bus type has a way
to manage PM for devices without drivers, it should be allowed to do so.

Of course, the platform bus type is somewhat special in that respect,
but it looks like we simply need some sort of a convention in there too
(the expectations should be the same for everybody).

Thanks,
Rafael

Rafael J. Wysocki Nov. 11, 2015, 12:11 a.m. UTC | #7
On Tuesday, November 10, 2015 10:30:51 AM Eduardo Valentin wrote:
> Hi,
> 
> On Tue, Nov 10, 2015 at 02:00:38PM +0100, Ulf Hansson wrote:
> > +Rafael, Alan
> > 
> > On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > Hi Ulf,
> > >
> > > On Tue, Nov 10, 2015 at 10:57 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > >> On 10 November 2015 at 09:18, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >>> On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto
> > >>> <kuninori.morimoto.gx@renesas.com> wrote:
> > >>>> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> > >>>>
> > >>>> It is using pm_runtime_get_sync() on probe(). Let's use
> > >>>> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal
> > >>>> sensor doesn't work after unbind/re-bind
> > >>>>
> > >>>> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> > >>>> ---
> > >>>>  drivers/thermal/rcar_thermal.c | 2 +-
> > >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>>
> > >>>> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
> > >>>> index 13d01ed..f7cf2d7 100644
> > >>>> --- a/drivers/thermal/rcar_thermal.c
> > >>>> +++ b/drivers/thermal/rcar_thermal.c
> > >>>> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev)
> > >>>>                 thermal_zone_device_unregister(priv->zone);
> > >>>>         }
> > >>>>
> > >>>> -       pm_runtime_put(dev);
> > >>>> +       pm_runtime_put_sync(dev);
> > >>>>         pm_runtime_disable(dev);
> > >>
> > >> For the reasons explained by Geert, this is to me also a "workaround".
> > >>
> > >> I would replace pm_runtime_put() and pm_runtime_disable() with a call
> > >> to pm_runtime_force_suspend().
> > >>
> > >> In that way, you will make sure your device gets runtime suspended
> > >> (clock domain will gate the clock). Additionally, the runtime PM
> > >> status will properly reflect the status of the device.
> > >
> > > That still sounds like a workaround to me, which we have to apply to all
> > > drivers relying on Runtime PM?
> > 
> > Definitely not all drivers, but those that runs pm_runtime_get_sync()
> > during ->probe() and expects the ->runtime_resume() callback to always
> > be invoked because of that. I guess we need to check upon which
> > drivers that may suffer from this.

Generally, calling pm_runtime_get_sync() in ->probe() and expecting the
driver's ->runtime_resume() to always be invoked is a mistake.  I know
nothing about any guarantees that this will always happen.

If you want your ->runtime_resume() to be invoked no matter what, you really
need to figure out what the current state of things is, change it to your
expectations with runtime PM disabled and enable runtime PM after that. 

Still, that also needs to be done with care as the bus type/PM domain may be
affected by it.
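
One possible reading of this advice, as a userspace sketch: establish a
known state while runtime PM is still disabled, enable it afterwards, and
only then resume. All names here (struct toy_dev, toy_probe(),
toy_runtime_resume()) are hypothetical, and gating the clock directly is a
simplification that ignores the bus type and PM domain.

```c
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct toy_dev {
	enum rpm_status status; /* state recorded by the PM core */
	bool enabled;           /* is runtime PM enabled? */
	bool clk_on;            /* actual hardware state */
	int resume_calls;       /* how often the resume callback ran */
};

/* Stands in for the driver's ->runtime_resume() callback. */
void toy_runtime_resume(struct toy_dev *d)
{
	d->clk_on = true;
	d->resume_calls++;
}

/* Probe-time sequence; returns how many times the callback has run. */
int toy_probe(struct toy_dev *d)
{
	/* 1. The recorded status may only be changed while runtime PM is off. */
	assert(!d->enabled);

	/* 2. Bring hardware and bookkeeping to the expected "suspended" state. */
	d->clk_on = false;
	d->status = RPM_SUSPENDED;

	/* 3. Enable runtime PM only after the state is known. */
	d->enabled = true;

	/* 4. Resuming from "suspended" now goes through the callback. */
	toy_runtime_resume(d);
	d->status = RPM_ACTIVE;
	return d->resume_calls;
}
```

Even if the device enters toy_probe() with a stale RPM_ACTIVE status, the
callback runs exactly once, because the status is reset before runtime PM
is enabled.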

> > 
> > I wouldn't be surprised if at least a subset of those cases we find,
> > are poorly designed from PM point of view and won't even probe
> > successfully unless CONFIG_PM is set. Whatever that means...
> 
> 
> Yeah, if it is the case this is a bug in runtime pm core, I would prefer
> this to be properly fixed, and not only this driver benefits of it.
> 
> Rafael? Any thoughts?

First off, it's not a bug in the runtime PM core, as that code is agnostic
to what should or should not happen to devices during ->probe, ->remove etc.

Second, as I said above (and elsewhere), the driver is just a piece of the
puzzle in many cases.

Thanks,
Rafael

Kuninori Morimoto Nov. 11, 2015, 2:41 a.m. UTC | #8
Hi Geert

> > -       pm_runtime_put(dev);
> > +       pm_runtime_put_sync(dev);
> >         pm_runtime_disable(dev);
> >
> >         return 0;
> 
> While I can confirm this fixes the issue, I think this is a bug in the PM
> core, and thus your patch is merely a workaround.
> 
> Morimoto-san: I assume this is a recent regression. Have you tried to bisect?

I thought this was a driver-side issue, but I noticed that it was working before.
I bisected, and found that this commit breaks bind/unbind:

	cbc41d0a761bffb3166a413a3c77100a737c0cd7
	("drivers: sh: Disable PM runtime for multi-platform ARM with genpd")

Best regards
---
Kuninori Morimoto
Ulf Hansson Nov. 11, 2015, 11:03 a.m. UTC | #9
On 11 November 2015 at 00:57, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, November 10, 2015 02:00:38 PM Ulf Hansson wrote:
>> +Rafael, Alan
>>
>> On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> > Hi Ulf,
>> >
>
> [cut]
>
>> >>
>> >> The problem is that the runtime PM status of the device isn't
>> >> correctly updated at ->remove(). The effect is that the
>> >> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
>> >> ->runtime_resume() callbacks to be invoked, as the runtime PM core
>> >> believes the device is already runtime resumed.
>> >
>> > So that's where it should be fixed?
>>
>> That would be a more generic approach, although I am not sure how the
>> driver/PM core should be able to take the correct decision in this
>> phase. Devices may be runtime PM managed also without a driver bound.
>>
>> Perhaps when __device_release_driver() finds a bounded driver for the
>> device, it could after all actions been performed to unbind the
>> driver, check if runtime PM is enabled. If it isn't, it could set the
>> runtime PM status to suspended!?
>>
>> I have no idea if that would introduce other issues as it would kind
>> of force the runtime PM status of the device to suspend, without
>> actually knowing if it's the correct thing to do.
>
> IMO, that needs to depend on the bus type.  If the bus type has a way
> to manage PM for devices without drivers, it should be allowed to do so.

By following my suggestion above, we would allow the bus/driver's
->remove() to manage whether runtime PM should be enabled/disabled for
the device, before __device_release_driver() checks that.
Don't you think that the driver core could rely on that information?

I realize that it would be a kind of policy decision for runtime PM,
but it's quite similar to what we do when registering/unregistering
devices, where we set the runtime PM status to suspended.

If you don't think this is a good idea, I guess we need to deal with
this from subsystem level code somehow instead.

>
> Of course, the platform bus type is somewhat special in that respect,
> but it looks like we simply need some sort of a convention in there too
> (the expectations should be the same for everybody).
>
> Thanks,
> Rafael
>

Kind regards
Uffe
Rafael J. Wysocki Nov. 12, 2015, 1:06 a.m. UTC | #10
On Wednesday, November 11, 2015 12:03:52 PM Ulf Hansson wrote:
> On 11 November 2015 at 00:57, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, November 10, 2015 02:00:38 PM Ulf Hansson wrote:
> >> +Rafael, Alan
> >>
> >> On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> > Hi Ulf,
> >> >
> >
> > [cut]
> >
> >> >>
> >> >> The problem is that the runtime PM status of the device isn't
> >> >> correctly updated at ->remove(). The effect is that the
> >> >> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
> >> >> ->runtime_resume() callbacks to be invoked, as the runtime PM core
> >> >> believes the device is already runtime resumed.
> >> >
> >> > So that's where it should be fixed?
> >>
> >> That would be a more generic approach, although I am not sure how the
> >> driver/PM core should be able to take the correct decision in this
> >> phase. Devices may be runtime PM managed also without a driver bound.
> >>
> >> Perhaps when __device_release_driver() finds a bounded driver for the
> >> device, it could after all actions been performed to unbind the
> >> driver, check if runtime PM is enabled. If it isn't, it could set the
> >> runtime PM status to suspended!?
> >>
> >> I have no idea if that would introduce other issues as it would kind
> >> of force the runtime PM status of the device to suspend, without
> >> actually knowing if it's the correct thing to do.
> >
> > IMO, that needs to depend on the bus type.  If the bus type has a way
> > to manage PM for devices without drivers, it should be allowed to do so.
> 
> By following my suggestion above, we would allow the bus/driver's
> ->remove() to manage whether runtime PM should be enabled/disabled for
> the device, before __device_release_driver() checks that.
> Don't you think that the driver core could rely on that information?
> 
> I realize that it would be a kind of policy decision for runtime PM,
> but it's quite similar as when register/unregister devices when we set
> the runtime PM status to suspended.

OK

If we did that, all devices that had just been unbound from their drivers
and had runtime PM disabled after that would be set to "suspended" by the
core, right?

If that helps, I don't really have objections.

Thanks,
Rafael

Ulf Hansson Nov. 12, 2015, 8:04 a.m. UTC | #11
On 12 November 2015 at 02:06, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, November 11, 2015 12:03:52 PM Ulf Hansson wrote:
>> On 11 November 2015 at 00:57, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > On Tuesday, November 10, 2015 02:00:38 PM Ulf Hansson wrote:
>> >> +Rafael, Alan
>> >>
>> >> On 10 November 2015 at 11:10, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> >> > Hi Ulf,
>> >> >
>> >
>> > [cut]
>> >
>> >> >>
>> >> >> The problem is that the runtime PM status of the device isn't
>> >> >> correctly updated at ->remove(). The effect is that the
>> >> >> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the
>> >> >> ->runtime_resume() callbacks to be invoked, as the runtime PM core
>> >> >> believes the device is already runtime resumed.
>> >> >
>> >> > So that's where it should be fixed?
>> >>
>> >> That would be a more generic approach, although I am not sure how the
>> >> driver/PM core should be able to take the correct decision in this
>> >> phase. Devices may be runtime PM managed also without a driver bound.
>> >>
>> >> Perhaps when __device_release_driver() finds a bounded driver for the
>> >> device, it could after all actions been performed to unbind the
>> >> driver, check if runtime PM is enabled. If it isn't, it could set the
>> >> runtime PM status to suspended!?
>> >>
>> >> I have no idea if that would introduce other issues as it would kind
>> >> of force the runtime PM status of the device to suspend, without
>> >> actually knowing if it's the correct thing to do.
>> >
>> > IMO, that needs to depend on the bus type.  If the bus type has a way
>> > to manage PM for devices without drivers, it should be allowed to do so.
>>
>> By following my suggestion above, we would allow the bus/driver's
>> ->remove() to manage whether runtime PM should be enabled/disabled for
>> the device, before __device_release_driver() checks that.
>> Don't you think that the driver core could rely on that information?
>>
>> I realize that it would be a kind of policy decision for runtime PM,
>> but it's quite similar as when register/unregister devices when we set
>> the runtime PM status to suspended.
>
> OK
>
> If we did that, all devices that had just been unbound from their drivers
> and had runtime PM disabled after that would be set to "suspended" by the
> core, right?

Yes, that's the idea. I will send a patch we can test.

>
> If that helps, I don't really have objections.
>

Thanks!

Kind regards
Uffe
Eduardo Valentin Nov. 12, 2015, 6:43 p.m. UTC | #12
Hello,

On Thu, Nov 12, 2015 at 09:04:09AM +0100, Ulf Hansson wrote:
> >
> > OK
> >
> > If we did that, all devices that had just been unbound from their drivers
> > and had runtime PM disabled after that would be set to "suspended" by the
> > core, right?
> 
> Yes, that's the idea. I will send a patch we can test.
> 
> >
> > If that helps, I don't really have objections.

Given this discussion, is this series of two patches on this thermal
driver still applicable?


BR,

Eduardo

> >
> 
> Thanks!
> 
> Kind regards
> Uffe
Ulf Hansson Nov. 13, 2015, 3:06 p.m. UTC | #13
On 12 November 2015 at 19:43, Eduardo Valentin <edubezval@gmail.com> wrote:
> Hello,
>
> On Thu, Nov 12, 2015 at 09:04:09AM +0100, Ulf Hansson wrote:
>> >
>> > OK
>> >
>> > If we did that, all devices that had just been unbound from their drivers
>> > and had runtime PM disabled after that would be set to "suspended" by the
>> > core, right?
>>
>> Yes, that's the idea. I will send a patch we can test.
>>
>> >
>> > If that helps, I don't really have objections.
>
> Given this discussion,
>
> Is this series of two patches on this thermal driver still applicable?

I think patch 1 is different; it's a cleanup patch (I just replied to
it separately).

As for the subject patch, I think we agreed that it's a workaround,
but I don't have a strong opinion if you want to pick it up anyway.

On the other hand, the change won't be needed *if* we solve the problem
via the driver core. I intend to send a patch for this on Monday and
will keep you on cc.

Kind regards
Uffe

Patch

diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
index 13d01ed..f7cf2d7 100644
--- a/drivers/thermal/rcar_thermal.c
+++ b/drivers/thermal/rcar_thermal.c
@@ -373,7 +373,7 @@  static int rcar_thermal_remove(struct platform_device *pdev)
 		thermal_zone_device_unregister(priv->zone);
 	}
 
-	pm_runtime_put(dev);
+	pm_runtime_put_sync(dev);
 	pm_runtime_disable(dev);
 
 	return 0;