[3/8] pm: domains: sync runtime PM status with PM domains

Message ID	E1YMedA-00054Z-TK@rmk-PC.arm.linux.org.uk (mailing list archive)
State	RFC, archived
Headers	show Return-Path: <linux-pm-owner@kernel.org> In-Reply-To: <20150214152659.GI8656@n2100.arm.linux.org.uk> References: <20150214152659.GI8656@n2100.arm.linux.org.uk> From: Russell King <rmk+kernel@arm.linux.org.uk> To: Andrew Lunn <andrew@lunn.ch>, Jason Cooper <jason@lakedaemon.net>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Cc: Len Brown <len.brown@intel.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, linux-pm@vger.kernel.org Subject: [PATCH 3/8] pm: domains: sync runtime PM status with PM domains MIME-Version: 1.0 Content-Disposition: inline Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="utf-8" Message-Id: <E1YMedA-00054Z-TK@rmk-PC.arm.linux.org.uk> Date: Sat, 14 Feb 2015 15:27:40 +0000 Sender: linux-pm-owner@vger.kernel.org Precedence: bulk

Russell King Feb. 14, 2015, 3:27 p.m. UTC

Synchronise the PM domain status with runtime PM's status.  This
provides a better solution to the problem than commit 2ed127697eb1
("PM / Domains: Power on the PM domain right after attach completes").

The above commit added a call to power up the PM domain when a device
attaches to the domain.  The assumption is that the device driver
will cause a runtime PM transition, which will synchronise the PM
domain state with the runtime PM state.

However, runtime PM will, by default, assume that the device is
initially suspended.  So we have two subsystems which have differing
initial state expectations.  This is silly.

Runtime PM requires that pm_runtime_set_active() is called before
pm_runtime_enable() if the device is already powered up.  However, PM
domains are not informed of this, and this is where the problem really
lies.  We need to keep PM domains properly and fully informed of their
associated device status.

We fix this by adding a new callback - runtime_set_status() which is
triggered whenever a successful call to __pm_runtime_set_status().
PM domain code hooks into this, and updates the PM domain appropriately.

This means a driver with the following sequence:

	pm_runtime_set_active(dev);
	pm_runtime_enable(dev);

will trigger the PM domain to be powered up at this point, which keeps
runtime PM and PM domains properly in sync with each other.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/base/power/domain.c  | 16 +++++++++++++++-
 drivers/base/power/runtime.c |  5 +++++
 include/linux/pm.h           |  1 +
 3 files changed, 21 insertions(+), 1 deletion(-)

Ulf Hansson Feb. 15, 2015, 4:24 a.m. UTC | #1

Hi Russell,

On 14 February 2015 at 16:27, Russell King <rmk+kernel@arm.linux.org.uk> wrote:
> Synchronise the PM domain status with runtime PM's status.  This
> provides a better solution to the problem than commit 2ed127697eb1
> ("PM / Domains: Power on the PM domain right after attach completes").
>
> The above commit added a call to power up the PM domain when a device
> attaches to the domain.  The assumption is that the device driver
> will cause a runtime PM transition, which will synchronise the PM
> domain state with the runtime PM state.
>
> However, runtime PM will, by default, assume that the device is
> initially suspended.  So we have two subsystems which have differing
> initial state expectations.  This is silly.
>
> Runtime PM requires that pm_runtime_set_active() is called before
> pm_runtime_enable() if the device is already powered up.  However, PM
> domains are not informed of this, and this is where the problem really
> lies.  We need to keep PM domains properly and fully informed of their
> associated device status.
>
> We fix this by adding a new callback - runtime_set_status() which is
> triggered whenever a successful call to __pm_runtime_set_status().
> PM domain code hooks into this, and updates the PM domain appropriately.
>
> This means a driver with the following sequence:
>
>         pm_runtime_set_active(dev);
>         pm_runtime_enable(dev);
>
> will trigger the PM domain to be powered up at this point, which keeps
> runtime PM and PM domains properly in sync with each other.

The issue you are trying to solve here was raised by me during the
discussion around the below merged commit.
"PM / Domains: Power on the PM domain right after attach completes"

You may find the complete thread here:
http://marc.info/?l=linux-pm&m=141623756506128&w=2

Especially I mention a "scenario 5", which is the issue you observed
and what we decided to keep as a limitation. Sorry about that!

Moreover, the below commit is also related to the similar issues and
this fixed a regression.
67732cd34382 ( PM / Domains: Fix initial default state of the
need_restore flag).

I just wanted to make sure you have all the background to why we ended
up with current solution...

Regarding $subject patch, I think it's an interesting approach but I
need some more time to think about it. Unfortunate I am OOF the next
two weeks so will have to promise you to review this as soon as I get
back.

BTW, I would also appreciate if you could keep me on cc for any
further changes related to genpd.

Kind regards
Uffe

>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  drivers/base/power/domain.c  | 16 +++++++++++++++-
>  drivers/base/power/runtime.c |  5 +++++
>  include/linux/pm.h           |  1 +
>  3 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 11a1023fa64a..2a700cea41fc 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -741,6 +741,20 @@ static int pm_genpd_runtime_resume(struct device *dev)
>         return 0;
>  }
>
> +static int pm_genpd_runtime_set_status(struct device *dev)
> +{
> +       int ret;
> +
> +       dev_dbg(dev, "%s()\n", __func__);
> +
> +       if (pm_runtime_suspended(dev))
> +               ret = pm_genpd_runtime_suspend(dev);
> +       else
> +               ret = pm_genpd_runtime_resume(dev);
> +
> +       return ret;
> +}
> +
>  static bool pd_ignore_unused;
>  static int __init pd_ignore_unused_setup(char *__unused)
>  {
> @@ -1907,6 +1921,7 @@ void pm_genpd_init(struct generic_pm_domain *genpd,
>         genpd->max_off_time_changed = true;
>         genpd->domain.ops.runtime_suspend = pm_genpd_runtime_suspend;
>         genpd->domain.ops.runtime_resume = pm_genpd_runtime_resume;
> +       genpd->domain.ops.runtime_set_status = pm_genpd_runtime_set_status;
>         genpd->domain.ops.prepare = pm_genpd_prepare;
>         genpd->domain.ops.suspend = pm_genpd_suspend;
>         genpd->domain.ops.suspend_late = pm_genpd_suspend_late;
> @@ -2223,7 +2238,6 @@ int genpd_dev_pm_attach(struct device *dev)
>         }
>
>         dev->pm_domain->detach = genpd_dev_pm_detach;
> -       pm_genpd_poweron(pd);
>
>         return 0;
>  }
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 5070c4fe8542..a958cae67a37 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -981,6 +981,7 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
>         struct device *parent = dev->parent;
>         unsigned long flags;
>         bool notify_parent = false;
> +       pm_callback_t callback;
>         int error = 0;
>
>         if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> @@ -1029,6 +1030,10 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
>   out_set:
>         __update_runtime_status(dev, status);
>         dev->power.runtime_error = 0;
> +       if (dev->power.no_callbacks)
> +               goto out;
> +       callback = RPM_GET_CALLBACK(dev, runtime_set_status);
> +       rpm_callback(callback, dev);
>   out:
>         spin_unlock_irqrestore(&dev->power.lock, flags);
>
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index 8b5976364619..ee098585d577 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -316,6 +316,7 @@ struct dev_pm_ops {
>         int (*runtime_suspend)(struct device *dev);
>         int (*runtime_resume)(struct device *dev);
>         int (*runtime_idle)(struct device *dev);
> +       int (*runtime_set_status)(struct device *dev);
>  };
>
>  #ifdef CONFIG_PM_SLEEP
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael J. Wysocki Feb. 17, 2015, 6:53 p.m. UTC | #2

First off, sorry for being slow to respond lately, I'm traveling now.

Also adding Alan Stern to the CC (please always do that for discussions
regarding the runtime PM core).

On Saturday, February 14, 2015 03:27:40 PM Russell King wrote:
> Synchronise the PM domain status with runtime PM's status.  This
> provides a better solution to the problem than commit 2ed127697eb1
> ("PM / Domains: Power on the PM domain right after attach completes").
> 
> The above commit added a call to power up the PM domain when a device
> attaches to the domain.  The assumption is that the device driver
> will cause a runtime PM transition, which will synchronise the PM
> domain state with the runtime PM state.
> 
> However, runtime PM will, by default, assume that the device is
> initially suspended.  So we have two subsystems which have differing
> initial state expectations.  This is silly.
> 
> Runtime PM requires that pm_runtime_set_active() is called before
> pm_runtime_enable() if the device is already powered up.

Well, this is an oversimplification of sorts.

What really is expected is that the RPM status of the device will agree
with its actual state at the moment when pm_runtime_enable() is called.

If the actual state of the device is "not powered", then it is invalid to
call pm_runtime_set_active() before pm_runtime_enable() even.

> However, PM domains are not informed of this, and this is where the problem
> really lies.  We need to keep PM domains properly and fully informed of their
> associated device status.

This goes backwards.  Rather, PM domains should tell everyone about what they
have done to the device IMO.  That is, since PM domains power up the device,
it would be logical to change its RPM status at that point to me.

That may lead to one subtle problem, though.

Suppose that, in addition to either having or not having a power domain under
it, the device has a clock that is managed directly by the driver.  The driver
may then want to do the clock management in its runtime PM callbacks.
However, if genpd changes the device's RPM status to "active", the runtime
PM framework will not execute the resume callback for the device, so things
will break if the clock is not actually enabled to start with.

IOW, we need a way for the driver and genpd to agree on what the RPM status
of the device should be set to before pm_runtime_enable() is executed.

> We fix this by adding a new callback - runtime_set_status() which is

I'm not sure if that's the way to address that, though.

Our intention for the pm_runtime_set_active() etc. calls was that they didn't
trigger any callbacks, since were are supposed to be executed with runtime
PM disabled.

Moreover ->

> triggered whenever a successful call to __pm_runtime_set_status().
> PM domain code hooks into this, and updates the PM domain appropriately.
> 
> This means a driver with the following sequence:
> 
> 	pm_runtime_set_active(dev);
> 	pm_runtime_enable(dev);
> 
> will trigger the PM domain to be powered up at this point, which keeps
> runtime PM and PM domains properly in sync with each other.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  drivers/base/power/domain.c  | 16 +++++++++++++++-
>  drivers/base/power/runtime.c |  5 +++++
>  include/linux/pm.h           |  1 +
>  3 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 11a1023fa64a..2a700cea41fc 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -741,6 +741,20 @@ static int pm_genpd_runtime_resume(struct device *dev)
>  	return 0;
>  }
>  
> +static int pm_genpd_runtime_set_status(struct device *dev)
> +{
> +	int ret;
> +
> +	dev_dbg(dev, "%s()\n", __func__);
> +
> +	if (pm_runtime_suspended(dev))
> +		ret = pm_genpd_runtime_suspend(dev);
> +	else
> +		ret = pm_genpd_runtime_resume(dev);
> +
> +	return ret;
> +}
> +
>  static bool pd_ignore_unused;
>  static int __init pd_ignore_unused_setup(char *__unused)
>  {
> @@ -1907,6 +1921,7 @@ void pm_genpd_init(struct generic_pm_domain *genpd,
>  	genpd->max_off_time_changed = true;
>  	genpd->domain.ops.runtime_suspend = pm_genpd_runtime_suspend;
>  	genpd->domain.ops.runtime_resume = pm_genpd_runtime_resume;
> +	genpd->domain.ops.runtime_set_status = pm_genpd_runtime_set_status;
>  	genpd->domain.ops.prepare = pm_genpd_prepare;
>  	genpd->domain.ops.suspend = pm_genpd_suspend;
>  	genpd->domain.ops.suspend_late = pm_genpd_suspend_late;
> @@ -2223,7 +2238,6 @@ int genpd_dev_pm_attach(struct device *dev)
>  	}
>  
>  	dev->pm_domain->detach = genpd_dev_pm_detach;
> -	pm_genpd_poweron(pd);
>  
>  	return 0;
>  }
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 5070c4fe8542..a958cae67a37 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -981,6 +981,7 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
>  	struct device *parent = dev->parent;
>  	unsigned long flags;
>  	bool notify_parent = false;
> +	pm_callback_t callback;
>  	int error = 0;
>  
>  	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> @@ -1029,6 +1030,10 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
>   out_set:
>  	__update_runtime_status(dev, status);
>  	dev->power.runtime_error = 0;
> +	if (dev->power.no_callbacks)
> +		goto out;
> +	callback = RPM_GET_CALLBACK(dev, runtime_set_status);
> +	rpm_callback(callback, dev);

-> That is specific to PM domains and arguably not needed otherwise, so I would
define it in struct dev_pm_domain outside of dev_pm_ops.

>   out:
>  	spin_unlock_irqrestore(&dev->power.lock, flags);
>  
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index 8b5976364619..ee098585d577 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -316,6 +316,7 @@ struct dev_pm_ops {
>  	int (*runtime_suspend)(struct device *dev);
>  	int (*runtime_resume)(struct device *dev);
>  	int (*runtime_idle)(struct device *dev);
> +	int (*runtime_set_status)(struct device *dev);
>  };
>  
>  #ifdef CONFIG_PM_SLEEP
>

Alan Stern Feb. 17, 2015, 7:42 p.m. UTC | #3

On Tue, 17 Feb 2015, Rafael J. Wysocki wrote:

> On Saturday, February 14, 2015 03:27:40 PM Russell King wrote:
> > Synchronise the PM domain status with runtime PM's status.  This
> > provides a better solution to the problem than commit 2ed127697eb1
> > ("PM / Domains: Power on the PM domain right after attach completes").
> > 
> > The above commit added a call to power up the PM domain when a device
> > attaches to the domain.  The assumption is that the device driver
> > will cause a runtime PM transition, which will synchronise the PM
> > domain state with the runtime PM state.
> > 
> > However, runtime PM will, by default, assume that the device is
> > initially suspended.  So we have two subsystems which have differing
> > initial state expectations.  This is silly.

As Rafael says, that's putting it a little strongly.  The runtime PM 
status is initialized to RPM_SUSPENDED, simply because it has to be 
initialized to _something_.  But the core doesn't expect that value to 
be meaningful initially.

> > Runtime PM requires that pm_runtime_set_active() is called before
> > pm_runtime_enable() if the device is already powered up.
> 
> Well, this is an oversimplification of sorts.
> 
> What really is expected is that the RPM status of the device will agree
> with its actual state at the moment when pm_runtime_enable() is called.
> 
> If the actual state of the device is "not powered", then it is invalid to
> call pm_runtime_set_active() before pm_runtime_enable() even.

In practice it wouldn't make any difference, so long as
pm_runtime_set_suspended() is called later on but before
pm_runtime_enable().

> > However, PM domains are not informed of this, and this is where the problem
> > really lies.  We need to keep PM domains properly and fully informed of their
> > associated device status.
> 
> This goes backwards.  Rather, PM domains should tell everyone about what they
> have done to the device IMO.  That is, since PM domains power up the device,
> it would be logical to change its RPM status at that point to me.
> 
> That may lead to one subtle problem, though.
> 
> Suppose that, in addition to either having or not having a power domain under
> it, the device has a clock that is managed directly by the driver.  The driver
> may then want to do the clock management in its runtime PM callbacks.
> However, if genpd changes the device's RPM status to "active", the runtime
> PM framework will not execute the resume callback for the device, so things
> will break if the clock is not actually enabled to start with.
> 
> IOW, we need a way for the driver and genpd to agree on what the RPM status
> of the device should be set to before pm_runtime_enable() is executed.

Agreed.  For example, the PM domain may simply have a policy that all
devices must start out in the active state, and require member drivers
to enforce that policy.  Or the PM domain might directly invoke a
driver's runtime-PM callback routine if the device appears to be in the
"wrong" state initially.

> > We fix this by adding a new callback - runtime_set_status() which is
> 
> I'm not sure if that's the way to address that, though.
> 
> Our intention for the pm_runtime_set_active() etc. calls was that they didn't
> trigger any callbacks, since were are supposed to be executed with runtime
> PM disabled.
> 
> Moreover ->
...
> -> That is specific to PM domains and arguably not needed otherwise, so I would
> define it in struct dev_pm_domain outside of dev_pm_ops.

My thought exactly.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Russell King - ARM Linux Feb. 17, 2015, 7:42 p.m. UTC | #4

On Tue, Feb 17, 2015 at 07:53:47PM +0100, Rafael J. Wysocki wrote:
> First off, sorry for being slow to respond lately, I'm traveling now.
> 
> Also adding Alan Stern to the CC (please always do that for discussions
> regarding the runtime PM core).

If that is the case, how about adding them into MAINTAINERS?  Without
this information in there, Alan will probably be forgotten for future
patches.

> On Saturday, February 14, 2015 03:27:40 PM Russell King wrote:
> > Runtime PM requires that pm_runtime_set_active() is called before
> > pm_runtime_enable() if the device is already powered up.
> 
> Well, this is an oversimplification of sorts.
> 
> What really is expected is that the RPM status of the device will agree
> with its actual state at the moment when pm_runtime_enable() is called.
> 
> If the actual state of the device is "not powered", then it is invalid to
> call pm_runtime_set_active() before pm_runtime_enable() even.

We're both saying exactly the same thing, so we're in full agreement,
which is good. :)

> > However, PM domains are not informed of this, and this is where the problem
> > really lies.  We need to keep PM domains properly and fully informed of their
> > associated device status.
> 
> This goes backwards.  Rather, PM domains should tell everyone about
> what they have done to the device IMO.  That is, since PM domains
> power up the device, it would be logical to change its RPM status at
> that point to me.

Right, so it may make sense for runtime PM to query the PM domain state,
and synchronise itself with that.

> That may lead to one subtle problem, though.
> 
> Suppose that, in addition to either having or not having a power domain
> under it, the device has a clock that is managed directly by the driver.
> The driver may then want to do the clock management in its runtime PM
> callbacks.  However, if genpd changes the device's RPM status to "active",
> the runtime PM framework will not execute the resume callback for the
> device, so things will break if the clock is not actually enabled to
> start with.
> 
> IOW, we need a way for the driver and genpd to agree on what the RPM status
> of the device should be set to before pm_runtime_enable() is executed.

It's worse than that.  If, in the probe, we decide at a point to query
the PM domain status, and transfer it into the RPM status, how does the
driver know whether it needs to do a "put" to cause a transition from
active to suspended?

In the case where the PM domain was suspended, the RPM status would also
be suspended at that point, and it requires a RPM get to resume it.

If the PM domain was active, then it would require a pm_runtime_put()
operation to allow it to suspend.

This is why I decided that the methodology in the runtime PM code should
apply to PM domains: runtime PM requires the actual state to match the
runtime PM state prior to calling pm_runtime_enable().  We should also
require that the PM domain state must also match the runtime PM state
prior to runtime PM being enabled too.

> > We fix this by adding a new callback - runtime_set_status() which is
> 
> I'm not sure if that's the way to address that, though.

I've come to the conclusion that this isn't a good way to handle it,
because those drivers which may make use of PM domains without using
runtime PM will be probed with the PM domain powered down.

I think we've got a rather sticky problem here, and my proposed
solution will cause problems in that scenario.

Another possibility may be to trigger PM domains to try to power down
the domain when it sees the driver call pm_runtime_enable().  If the
driver never calls pm_runtime_enable(), the PM domain will be left
active.  If it does call this function, the effect of this will be to
synchronise the PM domain with the runtime PM state.

> Our intention for the pm_runtime_set_active() etc. calls was that they didn't
> trigger any callbacks, since were are supposed to be executed with runtime
> PM disabled.
> 
> Moreover ->
> 
> > triggered whenever a successful call to __pm_runtime_set_status().
> > PM domain code hooks into this, and updates the PM domain appropriately.
> > 
> > This means a driver with the following sequence:
> > 
> > 	pm_runtime_set_active(dev);
> > 	pm_runtime_enable(dev);
> > 
> > will trigger the PM domain to be powered up at this point, which keeps
> > runtime PM and PM domains properly in sync with each other.
> > 
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> > ---
> >  drivers/base/power/domain.c  | 16 +++++++++++++++-
> >  drivers/base/power/runtime.c |  5 +++++
> >  include/linux/pm.h           |  1 +
> >  3 files changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > index 11a1023fa64a..2a700cea41fc 100644
> > --- a/drivers/base/power/domain.c
> > +++ b/drivers/base/power/domain.c
> > @@ -741,6 +741,20 @@ static int pm_genpd_runtime_resume(struct device *dev)
> >  	return 0;
> >  }
> >  
> > +static int pm_genpd_runtime_set_status(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	dev_dbg(dev, "%s()\n", __func__);
> > +
> > +	if (pm_runtime_suspended(dev))
> > +		ret = pm_genpd_runtime_suspend(dev);
> > +	else
> > +		ret = pm_genpd_runtime_resume(dev);
> > +
> > +	return ret;
> > +}
> > +
> >  static bool pd_ignore_unused;
> >  static int __init pd_ignore_unused_setup(char *__unused)
> >  {
> > @@ -1907,6 +1921,7 @@ void pm_genpd_init(struct generic_pm_domain *genpd,
> >  	genpd->max_off_time_changed = true;
> >  	genpd->domain.ops.runtime_suspend = pm_genpd_runtime_suspend;
> >  	genpd->domain.ops.runtime_resume = pm_genpd_runtime_resume;
> > +	genpd->domain.ops.runtime_set_status = pm_genpd_runtime_set_status;
> >  	genpd->domain.ops.prepare = pm_genpd_prepare;
> >  	genpd->domain.ops.suspend = pm_genpd_suspend;
> >  	genpd->domain.ops.suspend_late = pm_genpd_suspend_late;
> > @@ -2223,7 +2238,6 @@ int genpd_dev_pm_attach(struct device *dev)
> >  	}
> >  
> >  	dev->pm_domain->detach = genpd_dev_pm_detach;
> > -	pm_genpd_poweron(pd);
> >  
> >  	return 0;
> >  }
> > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > index 5070c4fe8542..a958cae67a37 100644
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -981,6 +981,7 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
> >  	struct device *parent = dev->parent;
> >  	unsigned long flags;
> >  	bool notify_parent = false;
> > +	pm_callback_t callback;
> >  	int error = 0;
> >  
> >  	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > @@ -1029,6 +1030,10 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
> >   out_set:
> >  	__update_runtime_status(dev, status);
> >  	dev->power.runtime_error = 0;
> > +	if (dev->power.no_callbacks)
> > +		goto out;
> > +	callback = RPM_GET_CALLBACK(dev, runtime_set_status);
> > +	rpm_callback(callback, dev);
> 
> -> That is specific to PM domains and arguably not needed otherwise,
> so I would define it in struct dev_pm_domain outside of dev_pm_ops.

How does runtime PM know that we're using PM domains though?  How
does runtime PM know that it can cast the dev_pm_ops pointer safely?

Rafael J. Wysocki Feb. 18, 2015, 7:02 a.m. UTC | #5

On Tuesday, February 17, 2015 07:42:43 PM Russell King - ARM Linux wrote:
> On Tue, Feb 17, 2015 at 07:53:47PM +0100, Rafael J. Wysocki wrote:

[cut]

> > 
> > What really is expected is that the RPM status of the device will agree
> > with its actual state at the moment when pm_runtime_enable() is called.
> > 
> > If the actual state of the device is "not powered", then it is invalid to
> > call pm_runtime_set_active() before pm_runtime_enable() even.
> 
> We're both saying exactly the same thing, so we're in full agreement,
> which is good. :)
> 
> > > However, PM domains are not informed of this, and this is where the problem
> > > really lies.  We need to keep PM domains properly and fully informed of their
> > > associated device status.
> > 
> > This goes backwards.  Rather, PM domains should tell everyone about
> > what they have done to the device IMO.  That is, since PM domains
> > power up the device, it would be logical to change its RPM status at
> > that point to me.
> 
> Right, so it may make sense for runtime PM to query the PM domain state,
> and synchronise itself with that.
> 
> > That may lead to one subtle problem, though.
> > 
> > Suppose that, in addition to either having or not having a power domain
> > under it, the device has a clock that is managed directly by the driver.
> > The driver may then want to do the clock management in its runtime PM
> > callbacks.  However, if genpd changes the device's RPM status to "active",
> > the runtime PM framework will not execute the resume callback for the
> > device, so things will break if the clock is not actually enabled to
> > start with.
> > 
> > IOW, we need a way for the driver and genpd to agree on what the RPM status
> > of the device should be set to before pm_runtime_enable() is executed.
> 
> It's worse than that.  If, in the probe, we decide at a point to query
> the PM domain status, and transfer it into the RPM status, how does the
> driver know whether it needs to do a "put" to cause a transition from
> active to suspended?

When ->probe() runs, the cases are:
(1) There are no power domains, so the device is "active" when the clock is
    enabled and "suspended" when it is disabled.
(2) There is a power domain which is initially off.  The RPM status of the
    device has to be "suspended" or the domain needs to be powered up.
(3) There is a power domain which is initially on.  The RPM status of the
    device depends on the clock state like in (1).  Also it may not be
    possible to power down the domain.

Essentially, (1) and (3) are equivalent from the ->probe() perspective.
Moreover, if the driver does not intend to enable the clock, the device
is "suspended" regardless of the domain state.  This means that the only
really interesting case is (2) and only when ->probe() will attempt to
enable the clock.

In that case it needs to do something like
(a) Bump up the device's runtime PM usage counter.
(b) Execute "power up the device for me" (missing).
(c) Enable the clock, do whatever it wants to the device, set the RPM status
    to "active" and call pm_runtime_enable().
(d) Drop down the device's runtime PM usage counter (the core will trigger
    suspend).

> In the case where the PM domain was suspended, the RPM status would also
> be suspended at that point, and it requires a RPM get to resume it.

Yes, it does, if the driver wants to access the device.

> If the PM domain was active, then it would require a pm_runtime_put()
> operation to allow it to suspend.
>
> This is why I decided that the methodology in the runtime PM code should
> apply to PM domains: runtime PM requires the actual state to match the
> runtime PM state prior to calling pm_runtime_enable().  We should also
> require that the PM domain state must also match the runtime PM state
> prior to runtime PM being enabled too.

Agreed.

> > > We fix this by adding a new callback - runtime_set_status() which is
> > 
> > I'm not sure if that's the way to address that, though.
> 
> I've come to the conclusion that this isn't a good way to handle it,
> because those drivers which may make use of PM domains without using
> runtime PM will be probed with the PM domain powered down.

What would they use PM domains for, then?

PM_RUNTIME is gone now and PM is implied by PM_SLEEP, so you can't have system
suspend support without runtime PM (which presumably might be the case in
question).

> I think we've got a rather sticky problem here, and my proposed
> solution will cause problems in that scenario.
> 
> Another possibility may be to trigger PM domains to try to power down
> the domain when it sees the driver call pm_runtime_enable().  If the
> driver never calls pm_runtime_enable(), the PM domain will be left
> active.  If it does call this function, the effect of this will be to
> synchronise the PM domain with the runtime PM state.

That's more along the lines of what I'm thinking about.  I don't have a clean
idea how to implement that, though.

> > Our intention for the pm_runtime_set_active() etc. calls was that they didn't
> > trigger any callbacks, since were are supposed to be executed with runtime
> > PM disabled.
> > 
> > Moreover ->

[cut]

> > > @@ -1029,6 +1030,10 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > >   out_set:
> > >  	__update_runtime_status(dev, status);
> > >  	dev->power.runtime_error = 0;
> > > +	if (dev->power.no_callbacks)
> > > +		goto out;
> > > +	callback = RPM_GET_CALLBACK(dev, runtime_set_status);
> > > +	rpm_callback(callback, dev);
> > 
> > -> That is specific to PM domains and arguably not needed otherwise,
> > so I would define it in struct dev_pm_domain outside of dev_pm_ops.
> 
> How does runtime PM know that we're using PM domains though?  How
> does runtime PM know that it can cast the dev_pm_ops pointer safely?

dev->pm_domain is set then, so you basically do

	if (dev->pm_domain && dev->pm_domain->runtime_set_status)
		dev->pm_domain->runtime_set_status(dev);

(or whatever the new callback may be).

And if power domains are in use, dev->pm_domain has to be set before doing
anything with runtime PM to the device or things may break in more interesting
ways.

Russell King - ARM Linux Feb. 18, 2015, 10:03 a.m. UTC | #6

On Wed, Feb 18, 2015 at 08:02:30AM +0100, Rafael J. Wysocki wrote:
> On Tuesday, February 17, 2015 07:42:43 PM Russell King - ARM Linux wrote:
> > It's worse than that.  If, in the probe, we decide at a point to query
> > the PM domain status, and transfer it into the RPM status, how does the
> > driver know whether it needs to do a "put" to cause a transition from
> > active to suspended?
> 
> When ->probe() runs, the cases are:
> (1) There are no power domains, so the device is "active" when the clock is
>     enabled and "suspended" when it is disabled.
> (2) There is a power domain which is initially off.  The RPM status of the
>     device has to be "suspended" or the domain needs to be powered up.

(quick reply)

That's not entirely correct.

As a result of 2ed127697eb13 (PM / Domains: Power on the PM domain right
after attach completes) the power domain will _always_ be powered on prior
to ->probe() being called, if the device was attached to the PM domain
just before ->probe() is called, inspite of the domain being powered off
before the device was attached to the domain.

If the PM domain is attached earlier, and pm_genpd_poweroff_unused() is
called (before they're attached, but in the kernel boot sequence) the
work for powering down the PM domains will be queued until a point where
it can be scheduled - which will be after devices are attached.  That can
allow the PM domain(s) to be powered down (as no driver is attached) which
then results in the driver's ->probe function being called with the PM
domain OFF.

So, right now, there's no way for a driver to know with certainty whether
the device it is about to probe is powered up or powered down.

Alan Stern Feb. 18, 2015, 3:12 p.m. UTC | #7

On Wed, 18 Feb 2015, Russell King - ARM Linux wrote:

> On Wed, Feb 18, 2015 at 08:02:30AM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, February 17, 2015 07:42:43 PM Russell King - ARM Linux wrote:
> > > It's worse than that.  If, in the probe, we decide at a point to query
> > > the PM domain status, and transfer it into the RPM status, how does the
> > > driver know whether it needs to do a "put" to cause a transition from
> > > active to suspended?
> > 
> > When ->probe() runs, the cases are:
> > (1) There are no power domains, so the device is "active" when the clock is
> >     enabled and "suspended" when it is disabled.
> > (2) There is a power domain which is initially off.  The RPM status of the
> >     device has to be "suspended" or the domain needs to be powered up.
> 
> (quick reply)
> 
> That's not entirely correct.
> 
> As a result of 2ed127697eb13 (PM / Domains: Power on the PM domain right
> after attach completes) the power domain will _always_ be powered on prior
> to ->probe() being called, if the device was attached to the PM domain
> just before ->probe() is called, inspite of the domain being powered off
> before the device was attached to the domain.
> 
> If the PM domain is attached earlier, and pm_genpd_poweroff_unused() is
> called (before they're attached, but in the kernel boot sequence) the
> work for powering down the PM domains will be queued until a point where
> it can be scheduled - which will be after devices are attached.  That can
> allow the PM domain(s) to be powered down (as no driver is attached) which
> then results in the driver's ->probe function being called with the PM
> domain OFF.
> 
> So, right now, there's no way for a driver to know with certainty whether
> the device it is about to probe is powered up or powered down.

Russell:

I'm not totally familiar with how PM domains are intended to work.  The 
impression I get from looking through the code is that they are highly 
over-engineered -- but that could just be a result of my ignorance.

At any rate, I don't have a clear picture of the problem you are trying
to solve.  What's the general outline?  Are you talking about a device
that _belongs_ to a PM domain or one that _depends_ on a PM domain?  
If the device belongs to the domain, how does it get added (i.e., by
its driver during probe or by some other mechanism)?

Are you asking about how the driver can tell what the PM domain's 
runtime status is?  Or would you like to add a way for the driver to 
set the PM domain's runtime status?

What else am I missing?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Russell King - ARM Linux Feb. 18, 2015, 3:42 p.m. UTC | #8

On Wed, Feb 18, 2015 at 10:12:17AM -0500, Alan Stern wrote:
> Russell:
> 
> I'm not totally familiar with how PM domains are intended to work.  The 
> impression I get from looking through the code is that they are highly 
> over-engineered -- but that could just be a result of my ignorance.
> 
> At any rate, I don't have a clear picture of the problem you are trying
> to solve.  What's the general outline?  Are you talking about a device
> that _belongs_ to a PM domain or one that _depends_ on a PM domain?  
> If the device belongs to the domain, how does it get added (i.e., by
> its driver during probe or by some other mechanism)?
> 
> Are you asking about how the driver can tell what the PM domain's 
> runtime status is?  Or would you like to add a way for the driver to 
> set the PM domain's runtime status?
> 
> What else am I missing?

The problem is that the PM domain stuff doesn't work very well, and I
believe it's problems are just being hacked around rather than being
viewed from a higher level and fixed properly. :)

The issue that I'm seeing with 3.19 is that I have a device which
does not need to be powered up to probe.  It's default state when the
kernel initially starts to boot is that the power domain associated
with it is powered down.

Hence, the driver is written such that its probe function does this:

        pm_runtime_use_autosuspend(vi->dev);
        pm_runtime_set_autosuspend_delay(vi->dev, 100);
        pm_runtime_enable(vi->dev);

The driver lives as a module on the filesystem, and so it's loaded
after the kernel has mounted its rootfs and udev has started.

Now, I boot this platform in two ways: I have a DT mode (which I have
found is not that stable - in that it causes problems with HDMI), and
a legacy mode.

When I boot in legacy mode, the PM domain is created early in the
kernel boot (at arch_initcall time).  As part of the PM domain
initialisation, I read the current state of the hardware and tell
PM domains whether it was enabled or not.  In this case, it starts
off disabled.

Right after they're created, I call pm_genpd_poweroff_unused() and
then register a notifier for platform devices.  The call to
pm_genpd_poweroff_unused() causes a work to be queued, which will
only get executed when we reach a scheduling or preemption point.

The platform devices then get created, and the notifier attaches the
appropriate PM domains to the devices.

In 3.18 and previous kernels, this resulted in the PM domains being
left in their initial state.  This means that the presumption by the
driver that the device is suspended is correct and valid.

In 3.19, this causes the PM domain to be powered up immediately.
However, because we get to a preemption or scheduling point before
the driver module is inserted, this allows the PM domain to power
back down.

The module is then loaded, and everything works correctly.  The PM
domain debugfs file indicates that the PM domain is off.

Now, if I boot in DT mode, the game changes.  In this case, the
PM domains are created as above, but without the platform device
notifier - we rely on the generic code to attach the PM domain to
the device a driver probe time.

What this means is that when the pm_genpd_poweroff_unused() work
executes, it finds that the domain is already powered down.

The module is then loaded as in the legacy case, and at this point,
the PM domain is attached to the device just before the driver is
probed.  This causes the device to be powered on at this point.
However, the runtime PM state remains as its initialisation default
- suspended.

Time passes, and the PM domain debug file continues to indicate that
the runtime PM state of the device is suspended, but the PM domain
state is "on".

So, if I boot in legacy mode, things work fine, but only because of
the fluke that the pm_genpd_poweroff_unused() happens to be executed
before the device is probed, undoing the powering up of the domain
at attachment time.  If I boot in DT mode, the order changes, and I'm
left with the PM domain and RPM in an inconsistent state.

Right now, there is _no_ way for a device driver to know whether the
PM domain attached to its device is powered up or not.  As runtime PM
needs to know (from the device driver) the initial state of the device,
the device driver can not determine whether it is powered up or not.

Rafael J. Wysocki Feb. 18, 2015, 4:46 p.m. UTC | #9

On Wednesday, February 18, 2015 10:03:22 AM Russell King - ARM Linux wrote:
> On Wed, Feb 18, 2015 at 08:02:30AM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, February 17, 2015 07:42:43 PM Russell King - ARM Linux wrote:
> > > It's worse than that.  If, in the probe, we decide at a point to query
> > > the PM domain status, and transfer it into the RPM status, how does the
> > > driver know whether it needs to do a "put" to cause a transition from
> > > active to suspended?
> > 
> > When ->probe() runs, the cases are:
> > (1) There are no power domains, so the device is "active" when the clock is
> >     enabled and "suspended" when it is disabled.
> > (2) There is a power domain which is initially off.  The RPM status of the
> >     device has to be "suspended" or the domain needs to be powered up.
> 
> (quick reply)
> 
> That's not entirely correct.
> 
> As a result of 2ed127697eb13 (PM / Domains: Power on the PM domain right
> after attach completes) the power domain will _always_ be powered on prior
> to ->probe() being called, if the device was attached to the PM domain
> just before ->probe() is called, inspite of the domain being powered off
> before the device was attached to the domain.
> 
> If the PM domain is attached earlier, and pm_genpd_poweroff_unused() is
> called (before they're attached, but in the kernel boot sequence) the
> work for powering down the PM domains will be queued until a point where
> it can be scheduled - which will be after devices are attached.  That can
> allow the PM domain(s) to be powered down (as no driver is attached) which
> then results in the driver's ->probe function being called with the PM
> domain OFF.
> 
> So, right now, there's no way for a driver to know with certainty whether
> the device it is about to probe is powered up or powered down.

And it looks that the driver is in a module which is only loaded after we've
called the pm_genpd_poweroff_unused().  So clearly commit 2ed127697eb13 doesn't
work for that driver.

We might just revert that commit, but that wouldn't make the problem go away.
Namely, the domain may be powered up or down by *another* driver whose device
belongs to it in parallel with your ->probe() anyway.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael J. Wysocki Feb. 18, 2015, 4:52 p.m. UTC | #10

On Wednesday, February 18, 2015 10:12:17 AM Alan Stern wrote:
> On Wed, 18 Feb 2015, Russell King - ARM Linux wrote:
> 
> > On Wed, Feb 18, 2015 at 08:02:30AM +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, February 17, 2015 07:42:43 PM Russell King - ARM Linux wrote:
> > > > It's worse than that.  If, in the probe, we decide at a point to query
> > > > the PM domain status, and transfer it into the RPM status, how does the
> > > > driver know whether it needs to do a "put" to cause a transition from
> > > > active to suspended?
> > > 
> > > When ->probe() runs, the cases are:
> > > (1) There are no power domains, so the device is "active" when the clock is
> > >     enabled and "suspended" when it is disabled.
> > > (2) There is a power domain which is initially off.  The RPM status of the
> > >     device has to be "suspended" or the domain needs to be powered up.
> > 
> > (quick reply)
> > 
> > That's not entirely correct.
> > 
> > As a result of 2ed127697eb13 (PM / Domains: Power on the PM domain right
> > after attach completes) the power domain will _always_ be powered on prior
> > to ->probe() being called, if the device was attached to the PM domain
> > just before ->probe() is called, inspite of the domain being powered off
> > before the device was attached to the domain.
> > 
> > If the PM domain is attached earlier, and pm_genpd_poweroff_unused() is
> > called (before they're attached, but in the kernel boot sequence) the
> > work for powering down the PM domains will be queued until a point where
> > it can be scheduled - which will be after devices are attached.  That can
> > allow the PM domain(s) to be powered down (as no driver is attached) which
> > then results in the driver's ->probe function being called with the PM
> > domain OFF.
> > 
> > So, right now, there's no way for a driver to know with certainty whether
> > the device it is about to probe is powered up or powered down.
> 
> Russell:
> 
> I'm not totally familiar with how PM domains are intended to work.  The 
> impression I get from looking through the code is that they are highly 
> over-engineered

Well, that very likely is my fault. :-)

> -- but that could just be a result of my ignorance.

You may be right anyway, though ...

> At any rate, I don't have a clear picture of the problem you are trying
> to solve.  What's the general outline?  Are you talking about a device
> that _belongs_ to a PM domain or one that _depends_ on a PM domain?  
> If the device belongs to the domain, how does it get added (i.e., by
> its driver during probe or by some other mechanism)?
> 
> Are you asking about how the driver can tell what the PM domain's 
> runtime status is?  Or would you like to add a way for the driver to 
> set the PM domain's runtime status?

The problem, as I see it, is that drivers have certain expectations about the
initial power states of devices that may or may not be met if power domains
are in use.

Those drivers might have been developed on systems without direct control of
power domains where the conditions for the driver at the ->probe() time were
always the same.  With power domains, though, they may change depending on
what's going on with the other devices in the domain, for example.

So the question is how we can address that in the cleanest way possible.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Russell King - ARM Linux Feb. 18, 2015, 5:26 p.m. UTC | #11

On Wed, Feb 18, 2015 at 05:52:25PM +0100, Rafael J. Wysocki wrote:
> Those drivers might have been developed on systems without direct control of
> power domains where the conditions for the driver at the ->probe() time were
> always the same.  With power domains, though, they may change depending on
> what's going on with the other devices in the domain, for example.
> 
> So the question is how we can address that in the cleanest way possible.

Okay, going back to what was said earlier, we have three possibilities
for drivers:

1) A driver which makes no use of runtime PM and no use of PM domains.
2) A driver which makes no use of runtime PM but may have a PM domain
   attached.
3) A driver which uses runtime PM, and assumes that the device was
   initially suspended.  It calls pm_enable_device() optionally with
   a preceding call to pm_runtime_set_suspended().
4) A driver which uses runtime PM, and calls pm_runtime_set_active()
   followed by pm_enable_device().

What we want to end up with in the ideal situation is that drivers which
fall into class:

1 - may see their devices in any state; it is up to them to deal with
    whatever state the device is initially in.
2 - should see their devices in a powered up state as they have no way
    to inform that the device is active.
3 - should see their devices in a suspended state.
4 - should see their devices in a powered up state.

The problem here is that we have no way to know this prior to the drivers
probe function being called.  Whatever we do at this point, it is not
going to satisfy the requirements of everyone.

So, let's take what we're currently doing on DT, and make it the same
across the board.  In platform_drv_probe(), let's do this:

	/* attach the domain */
        ret = dev_pm_domain_attach(_dev, true);
	if (ret == -EPROBE_DEFER)
		goto defer;

	/* power up the domain - and hold it powered up */
	ret = dev_pm_domain_pre_probe(_dev);
	if (ret != -ENOSYS)
		goto error;

	ret = drv->probe(dev);

	/*
	 * remove the "hold" on the domain by this device, and start
	 * tracking its runtime PM status.
	 */
	dev_pm_domain_post_probe(_dev);

	if (ret)
		dev_pm_domain_detach(_dev, true);

What this means is that while the probe function is running, we guarantee
that the PM domain will always be powered up.  We also guarantee that
after the probe function has returned, we will then start tracking the
runtime PM state, and if the device driver leaves runtime PM in the
"suspended" state, PM domains will get to hear about it at that point,
and can transition the PM domain back to "off" mode.

Both these transitions only cause the PM domain to be affected; no
runtime PM callbacks are issued for either of these two changes.  (We
already have that for the initial state right now where attaching a
device to the PM domain causes the PM domain to be powered up.)

This is merely an extension of the current scheme.  Think of
dev_pm_domain_post_probe() as a method of synchronising the state of
PM domains with the state of runtime PM across all attached devices.

Aside:
I need to do some further checks; while considering this, I think I've
come across a worrying problem with "PM / Domains: Power on the PM domain
right after attach completes".  I think this will result in the driver's
runtime_resume callback being invoked _before_ the probe function.  I
can't test this right now as I'm in the middle of upgrading my dev box
and it doesn't have a functional install for building ARM kernels yet.

Rafael J. Wysocki Feb. 18, 2015, 6:05 p.m. UTC | #12

On Wednesday, February 18, 2015 05:26:51 PM Russell King - ARM Linux wrote:
> On Wed, Feb 18, 2015 at 05:52:25PM +0100, Rafael J. Wysocki wrote:
> > Those drivers might have been developed on systems without direct control of
> > power domains where the conditions for the driver at the ->probe() time were
> > always the same.  With power domains, though, they may change depending on
> > what's going on with the other devices in the domain, for example.
> > 
> > So the question is how we can address that in the cleanest way possible.
> 
> Okay, going back to what was said earlier, we have three possibilities
> for drivers:
> 
> 1) A driver which makes no use of runtime PM and no use of PM domains.
> 2) A driver which makes no use of runtime PM but may have a PM domain
>    attached.
> 3) A driver which uses runtime PM, and assumes that the device was
>    initially suspended.  It calls pm_enable_device() optionally with
>    a preceding call to pm_runtime_set_suspended().
> 4) A driver which uses runtime PM, and calls pm_runtime_set_active()
>    followed by pm_enable_device().
> 
> What we want to end up with in the ideal situation is that drivers which
> fall into class:
> 
> 1 - may see their devices in any state; it is up to them to deal with
>     whatever state the device is initially in.
> 2 - should see their devices in a powered up state as they have no way
>     to inform that the device is active.
> 3 - should see their devices in a suspended state.
> 4 - should see their devices in a powered up state.
> 
> The problem here is that we have no way to know this prior to the drivers
> probe function being called.  Whatever we do at this point, it is not
> going to satisfy the requirements of everyone.
> 
> So, let's take what we're currently doing on DT, and make it the same
> across the board.  In platform_drv_probe(), let's do this:
> 
> 	/* attach the domain */
>         ret = dev_pm_domain_attach(_dev, true);
> 	if (ret == -EPROBE_DEFER)
> 		goto defer;
> 
> 	/* power up the domain - and hold it powered up */
> 	ret = dev_pm_domain_pre_probe(_dev);
> 	if (ret != -ENOSYS)
> 		goto error;
> 
> 	ret = drv->probe(dev);
> 
> 	/*
> 	 * remove the "hold" on the domain by this device, and start
> 	 * tracking its runtime PM status.
> 	 */
> 	dev_pm_domain_post_probe(_dev);
> 
> 	if (ret)
> 		dev_pm_domain_detach(_dev, true);
> 
> What this means is that while the probe function is running, we guarantee
> that the PM domain will always be powered up.  We also guarantee that
> after the probe function has returned, we will then start tracking the
> runtime PM state, and if the device driver leaves runtime PM in the
> "suspended" state, PM domains will get to hear about it at that point,
> and can transition the PM domain back to "off" mode.
> 
> Both these transitions only cause the PM domain to be affected; no
> runtime PM callbacks are issued for either of these two changes.  (We
> already have that for the initial state right now where attaching a
> device to the PM domain causes the PM domain to be powered up.)
> 
> This is merely an extension of the current scheme.  Think of
> dev_pm_domain_post_probe() as a method of synchronising the state of
> PM domains with the state of runtime PM across all attached devices.

I like this.

And I would add two new "pre_probe" and "post_probe" (what about "init"
and "remove"?) to struct dev_pm_domain for that.

The "pre_probe" ("init") thing would be useful for solving one more problem
related to PM domains I have elsewhere.  In that case I need to defer probing
one of the device in the domain until one of the other devices is registered.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Feb. 18, 2015, 6:42 p.m. UTC | #13

On Wed, 18 Feb 2015, Russell King - ARM Linux wrote:

> So, let's take what we're currently doing on DT, and make it the same
> across the board.  In platform_drv_probe(), let's do this:
> 
> 	/* attach the domain */
>         ret = dev_pm_domain_attach(_dev, true);
> 	if (ret == -EPROBE_DEFER)
> 		goto defer;
> 
> 	/* power up the domain - and hold it powered up */
> 	ret = dev_pm_domain_pre_probe(_dev);
> 	if (ret != -ENOSYS)
> 		goto error;
> 
> 	ret = drv->probe(dev);
> 
> 	/*
> 	 * remove the "hold" on the domain by this device, and start
> 	 * tracking its runtime PM status.
> 	 */
> 	dev_pm_domain_post_probe(_dev);
> 
> 	if (ret)
> 		dev_pm_domain_detach(_dev, true);
> 
> What this means is that while the probe function is running, we guarantee
> that the PM domain will always be powered up.  We also guarantee that
> after the probe function has returned, we will then start tracking the
> runtime PM state, and if the device driver leaves runtime PM in the
> "suspended" state, PM domains will get to hear about it at that point,
> and can transition the PM domain back to "off" mode.
> 
> Both these transitions only cause the PM domain to be affected; no
> runtime PM callbacks are issued for either of these two changes.  (We
> already have that for the initial state right now where attaching a
> device to the PM domain causes the PM domain to be powered up.)
> 
> This is merely an extension of the current scheme.  Think of
> dev_pm_domain_post_probe() as a method of synchronising the state of
> PM domains with the state of runtime PM across all attached devices.

This seems like the right sort of approach.

The Runtime PM core has a general philosophy that it can never force
anything to be in the suspended state.  Therefore if you need something
to be in a well-defined power state for any length of time (or if you
need to know the power state), you have to force it into the active
state.

> Aside:
> I need to do some further checks; while considering this, I think I've
> come across a worrying problem with "PM / Domains: Power on the PM domain
> right after attach completes".  I think this will result in the driver's
> runtime_resume callback being invoked _before_ the probe function.  I
> can't test this right now as I'm in the middle of upgrading my dev box
> and it doesn't have a functional install for building ARM kernels yet.

Subsystems always face this problem, because the device core sets
dev->driver before calling the probe routine.  (There's a similar
problem on the other side, where the device core doesn't clear
dev->driver until after the remove routine has returned.)  To cope with
this, the subsystem-level runtime-PM callbacks have to be careful not
to invoke the driver-level callbacks before probing is finished or
after removal has started.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ulf Hansson March 4, 2015, 7:57 p.m. UTC | #14

On 18 February 2015 at 19:05, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, February 18, 2015 05:26:51 PM Russell King - ARM Linux wrote:
>> On Wed, Feb 18, 2015 at 05:52:25PM +0100, Rafael J. Wysocki wrote:
>> > Those drivers might have been developed on systems without direct control of
>> > power domains where the conditions for the driver at the ->probe() time were
>> > always the same.  With power domains, though, they may change depending on
>> > what's going on with the other devices in the domain, for example.
>> >
>> > So the question is how we can address that in the cleanest way possible.
>>
>> Okay, going back to what was said earlier, we have three possibilities
>> for drivers:
>>
>> 1) A driver which makes no use of runtime PM and no use of PM domains.
>> 2) A driver which makes no use of runtime PM but may have a PM domain
>>    attached.
>> 3) A driver which uses runtime PM, and assumes that the device was
>>    initially suspended.  It calls pm_enable_device() optionally with
>>    a preceding call to pm_runtime_set_suspended().
>> 4) A driver which uses runtime PM, and calls pm_runtime_set_active()
>>    followed by pm_enable_device().
>>
>> What we want to end up with in the ideal situation is that drivers which
>> fall into class:
>>
>> 1 - may see their devices in any state; it is up to them to deal with
>>     whatever state the device is initially in.
>> 2 - should see their devices in a powered up state as they have no way
>>     to inform that the device is active.
>> 3 - should see their devices in a suspended state.
>> 4 - should see their devices in a powered up state.
>>
>> The problem here is that we have no way to know this prior to the drivers
>> probe function being called.  Whatever we do at this point, it is not
>> going to satisfy the requirements of everyone.
>>
>> So, let's take what we're currently doing on DT, and make it the same
>> across the board.  In platform_drv_probe(), let's do this:
>>
>>       /* attach the domain */
>>         ret = dev_pm_domain_attach(_dev, true);
>>       if (ret == -EPROBE_DEFER)
>>               goto defer;
>>
>>       /* power up the domain - and hold it powered up */
>>       ret = dev_pm_domain_pre_probe(_dev);
>>       if (ret != -ENOSYS)
>>               goto error;
>>
>>       ret = drv->probe(dev);
>>
>>       /*
>>        * remove the "hold" on the domain by this device, and start
>>        * tracking its runtime PM status.
>>        */
>>       dev_pm_domain_post_probe(_dev);
>>
>>       if (ret)
>>               dev_pm_domain_detach(_dev, true);
>>
>> What this means is that while the probe function is running, we guarantee
>> that the PM domain will always be powered up.  We also guarantee that
>> after the probe function has returned, we will then start tracking the
>> runtime PM state, and if the device driver leaves runtime PM in the
>> "suspended" state, PM domains will get to hear about it at that point,
>> and can transition the PM domain back to "off" mode.
>>
>> Both these transitions only cause the PM domain to be affected; no
>> runtime PM callbacks are issued for either of these two changes.  (We
>> already have that for the initial state right now where attaching a
>> device to the PM domain causes the PM domain to be powered up.)
>>
>> This is merely an extension of the current scheme.  Think of
>> dev_pm_domain_post_probe() as a method of synchronising the state of
>> PM domains with the state of runtime PM across all attached devices.
>
> I like this.

Me too!

>
> And I would add two new "pre_probe" and "post_probe" (what about "init"
> and "remove"?) to struct dev_pm_domain for that.

Actually what Russell propose, is almost exactly what I proposed in
the following below patchset way back. At that point we decided to
reject that approach.
The only difference is the naming of dev_pm_domain_pre|post_probe(). I
decided to name them dev_pm_domain_get|put(). :-)

http://marc.info/?l=linux-pm&m=141320895122707&w=2

>
> The "pre_probe" ("init") thing would be useful for solving one more problem
> related to PM domains I have elsewhere.  In that case I need to defer probing
> one of the device in the domain until one of the other devices is registered.
>
> Rafael
>

So I wonder if I should maybe refresh that patchset? Or maybe Russell
would like to pick them up?

No matter what, I happy to help!

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Russell King - ARM Linux March 4, 2015, 8:11 p.m. UTC | #15

On Wed, Mar 04, 2015 at 08:57:40PM +0100, Ulf Hansson wrote:
> So I wonder if I should maybe refresh that patchset? Or maybe Russell
> would like to pick them up?

I'm in no state to do anything at the moment - see my mail last night
"rmk's taking it easy for a while" (which everyone seems to have
ignored).

[3/8] pm: domains: sync runtime PM status with PM domains

Commit Message

Comments

Patch