diff mbox

[3/4] PM / core: Direct DPM_FLAG_SMART_SUSPEND optimization

Message ID 252167445.OQUXi2Ca9Y@aspire.rjw.lan (mailing list archive)
State Mainlined
Delegated to: Rafael Wysocki
Headers show

Commit Message

Rafael J. Wysocki Dec. 10, 2017, midnight UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make the PM core avoid invoking the "late" and "noirq" system-wide
suspend (or analogous) callbacks provided by device drivers directly
for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
suspend during the "late" and "noirq" phases of system-wide suspend
(or analogous) transitions.  That is only done for devices without
any middle-layer "late" and "noirq" suspend callbacks (to avoid
confusing the middle layer if there is one).

The underlying observation is that runtime PM is disabled for devices
during the "late" and "noirq" system-wide suspend phases, so if they
remain in runtime suspend from the "late" phase forward, it doesn't
make sense to invoke the "late" and "noirq" callbacks provided by
the drivers for them (arguably, the device is already suspended and
in the right state).  Thus, if the remaining driver suspend callbacks
are to be invoked directly by the core, they can be skipped.

This change really makes it possible for, say, platform device
drivers to re-use runtime PM suspend and resume callbacks by
pointing ->suspend_late and ->resume_early, respectively (and
possibly the analogous hibernation-related callback pointers too),
to them without adding any extra "is the device already suspended?"
type of checks to the callback routines, as long as they will be
invoked directly by the core.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/driver-api/pm/devices.rst |   18 +++---
 drivers/base/power/main.c               |   85 +++++++++++++++++++++++++++++---
 2 files changed, 88 insertions(+), 15 deletions(-)

Comments

Ulf Hansson Dec. 19, 2017, 7:38 a.m. UTC | #1
On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Make the PM core avoid invoking the "late" and "noirq" system-wide
> suspend (or analogous) callbacks provided by device drivers directly
> for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
> suspend during the "late" and "noirq" phases of system-wide suspend
> (or analogous) transitions.  That is only done for devices without
> any middle-layer "late" and "noirq" suspend callbacks (to avoid
> confusing the middle layer if there is one).
>
> The underlying observation is that runtime PM is disabled for devices
> during the "late" and "noirq" system-wide suspend phases, so if they
> remain in runtime suspend from the "late" phase forward, it doesn't
> make sense to invoke the "late" and "noirq" callbacks provided by
> the drivers for them (arguably, the device is already suspended and
> in the right state).  Thus, if the remaining driver suspend callbacks
> are to be invoked directly by the core, they can be skipped.
>

As I have stated earlier, this isn't going to solve the general case,
as the above change log seems to state. I think we really need to do
that, before adding yet another system suspend/resume optimization
path in the PM core.

The main reason is that lots of drivers have additional operations to
perform, besides making sure that its device is put into a "runtime
suspended state" during system suspend. In other words, skipping
system suspend callbacks (and likewise for system resume) is to me the
wrong solution to mitigate these problems.

> This change really makes it possible for, say, platform device
> drivers to re-use runtime PM suspend and resume callbacks by
> pointing ->suspend_late and ->resume_early, respectively (and
> possibly the analogous hibernation-related callback pointers too),
> to them without adding any extra "is the device already suspended?"
> type of checks to the callback routines, as long as they will be
> invoked directly by the core.

Certainly there are drivers that could start to deploying this
solution, because at the moment they don't have any additional
operations to perform at system suspend. But what about the rest,
don't we care about them?

Moreover, what happens when/if a driver that has deployed this
solution, starts being used on a new SoC and then additional
operations during system suspend becomes required (for example
pinctrls that needs to be put in a system sleep state)? Then
everything falls apart, doesn't it?

Kind regards
Uffe

>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  Documentation/driver-api/pm/devices.rst |   18 +++---
>  drivers/base/power/main.c               |   85 +++++++++++++++++++++++++++++---
>  2 files changed, 88 insertions(+), 15 deletions(-)
>
> Index: linux-pm/Documentation/driver-api/pm/devices.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> +++ linux-pm/Documentation/driver-api/pm/devices.rst
> @@ -777,14 +777,16 @@ The driver can indicate that by setting
>  runtime suspend at the beginning of the ``suspend_late`` phase of system-wide
>  suspend (or in the ``poweroff_late`` phase of hibernation), when runtime PM
>  has been disabled for it, under the assumption that its state should not change
> -after that point until the system-wide transition is over.  If that happens, the
> -driver's system-wide resume callbacks, if present, may still be invoked during
> -the subsequent system-wide resume transition and the device's runtime power
> -management status may be set to "active" before enabling runtime PM for it,
> -so the driver must be prepared to cope with the invocation of its system-wide
> -resume callbacks back-to-back with its ``->runtime_suspend`` one (without the
> -intervening ``->runtime_resume`` and so on) and the final state of the device
> -must reflect the "active" status for runtime PM in that case.
> +after that point until the system-wide transition is over (the PM core itself
> +does that for devices whose "noirq", "late" and "early" system-wide PM callbacks
> +are executed directly by it).  If that happens, the driver's system-wide resume
> +callbacks, if present, may still be invoked during the subsequent system-wide
> +resume transition and the device's runtime power management status may be set
> +to "active" before enabling runtime PM for it, so the driver must be prepared to
> +cope with the invocation of its system-wide resume callbacks back-to-back with
> +its ``->runtime_suspend`` one (without the intervening ``->runtime_resume`` and
> +so on) and the final state of the device must reflect the "active" runtime PM
> +status in that case.
>
>  During system-wide resume from a sleep state it's easiest to put devices into
>  the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -541,6 +541,24 @@ void dev_pm_skip_next_resume_phases(stru
>  }
>
>  /**
> + * suspend_event - Return a "suspend" message for given "resume" one.
> + * @resume_msg: PM message representing a system-wide resume transition.
> + */
> +static pm_message_t suspend_event(pm_message_t resume_msg)
> +{
> +       switch (resume_msg.event) {
> +       case PM_EVENT_RESUME:
> +               return PMSG_SUSPEND;
> +       case PM_EVENT_THAW:
> +       case PM_EVENT_RESTORE:
> +               return PMSG_FREEZE;
> +       case PM_EVENT_RECOVER:
> +               return PMSG_HIBERNATE;
> +       }
> +       return PMSG_ON;
> +}
> +
> +/**
>   * dev_pm_may_skip_resume - System-wide device resume optimization check.
>   * @dev: Target device.
>   *
> @@ -581,6 +599,14 @@ static pm_callback_t dpm_subsys_resume_n
>         return callback;
>  }
>
> +static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev,
> +                                                pm_message_t state,
> +                                                const char **info_p);
> +
> +static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev,
> +                                               pm_message_t state,
> +                                               const char **info_p);
> +
>  /**
>   * device_resume_noirq - Execute a "noirq resume" callback for given device.
>   * @dev: Device to handle.
> @@ -608,13 +634,40 @@ static int device_resume_noirq(struct de
>         dpm_wait_for_superior(dev, async);
>
>         callback = dpm_subsys_resume_noirq_cb(dev, state, &info);
> +       if (callback)
> +               goto Run;
>
> -       if (!callback && dev->driver && dev->driver->pm) {
> +       if (dev_pm_smart_suspend_and_suspended(dev)) {
> +               pm_message_t suspend_msg = suspend_event(state);
> +
> +               /*
> +                * If "freeze" callbacks have been skipped during a transition
> +                * related to hibernation, the subsequent "thaw" callbacks must
> +                * be skipped too or bad things may happen.  Otherwise, resume
> +                * callbacks are going to be run for the device, so its runtime
> +                * PM status must be changed to reflect the new state after the
> +                * transition under way.
> +                */
> +               if (!dpm_subsys_suspend_late_cb(dev, suspend_msg, NULL) &&
> +                   !dpm_subsys_suspend_noirq_cb(dev, suspend_msg, NULL)) {
> +                       if (state.event == PM_EVENT_THAW) {
> +                               dev_pm_skip_next_resume_phases(dev);
> +                               goto Skip;
> +                       } else {
> +                               pm_runtime_set_active(dev);
> +                       }
> +               }
> +       }
> +
> +       if (dev->driver && dev->driver->pm) {
>                 info = "noirq driver ";
>                 callback = pm_noirq_op(dev->driver->pm, state);
>         }
>
> +Run:
>         error = dpm_run_callback(callback, dev, state, info);
> +
> +Skip:
>         dev->power.is_noirq_suspended = false;
>
>         if (dev_pm_may_skip_resume(dev)) {
> @@ -629,7 +682,7 @@ static int device_resume_noirq(struct de
>                 dev_pm_skip_next_resume_phases(dev);
>         }
>
> - Out:
> +Out:
>         complete_all(&dev->power.completion);
>         TRACE_RESUME(error);
>         return error;
> @@ -1224,18 +1277,26 @@ static int __device_suspend_noirq(struct
>                 goto Complete;
>
>         callback = dpm_subsys_suspend_noirq_cb(dev, state, &info);
> +       if (callback)
> +               goto Run;
>
> -       if (!callback && dev->driver && dev->driver->pm) {
> +       if (dev_pm_smart_suspend_and_suspended(dev) &&
> +           !dpm_subsys_suspend_late_cb(dev, state, NULL))
> +               goto Skip;
> +
> +       if (dev->driver && dev->driver->pm) {
>                 info = "noirq driver ";
>                 callback = pm_noirq_op(dev->driver->pm, state);
>         }
>
> +Run:
>         error = dpm_run_callback(callback, dev, state, info);
>         if (error) {
>                 async_error = error;
>                 goto Complete;
>         }
>
> +Skip:
>         dev->power.is_noirq_suspended = true;
>
>         if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED)) {
> @@ -1420,17 +1481,27 @@ static int __device_suspend_late(struct
>                 goto Complete;
>
>         callback = dpm_subsys_suspend_late_cb(dev, state, &info);
> +       if (callback)
> +               goto Run;
>
> -       if (!callback && dev->driver && dev->driver->pm) {
> +       if (dev_pm_smart_suspend_and_suspended(dev) &&
> +           !dpm_subsys_suspend_noirq_cb(dev, state, NULL))
> +               goto Skip;
> +
> +       if (dev->driver && dev->driver->pm) {
>                 info = "late driver ";
>                 callback = pm_late_early_op(dev->driver->pm, state);
>         }
>
> +Run:
>         error = dpm_run_callback(callback, dev, state, info);
> -       if (!error)
> -               dev->power.is_late_suspended = true;
> -       else
> +       if (error) {
>                 async_error = error;
> +               goto Complete;
> +       }
> +
> +Skip:
> +       dev->power.is_late_suspended = true;
>
>  Complete:
>         TRACE_SUSPEND(error);
>
Rafael J. Wysocki Dec. 19, 2017, 11:13 a.m. UTC | #2
On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> Make the PM core avoid invoking the "late" and "noirq" system-wide
>> suspend (or analogous) callbacks provided by device drivers directly
>> for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
>> suspend during the "late" and "noirq" phases of system-wide suspend
>> (or analogous) transitions.  That is only done for devices without
>> any middle-layer "late" and "noirq" suspend callbacks (to avoid
>> confusing the middle layer if there is one).
>>
>> The underlying observation is that runtime PM is disabled for devices
>> during the "late" and "noirq" system-wide suspend phases, so if they
>> remain in runtime suspend from the "late" phase forward, it doesn't
>> make sense to invoke the "late" and "noirq" callbacks provided by
>> the drivers for them (arguably, the device is already suspended and
>> in the right state).  Thus, if the remaining driver suspend callbacks
>> are to be invoked directly by the core, they can be skipped.
>>

It looks like I'm consistently failing to explain my point clearly enough. :-)

> As I have stated earlier, this isn't going to solve the general case,
> as the above change log seems to state.

Well, it doesn't say that or anything similar, at least to my eyes.

The observation is that if you have set DPM_FLAG_SMART_SUSPEND, then
you need to be prepared for your ->suspend_late and ->suspend_noirq to
be skipped (because the ACPI PM domain does that and you may happen to
work with it, for example) if the device is already suspended at the
beginning of the "late suspend" phase.  That's already documented.

Given the above, and the fact that there is not much to be done for a
suspended device in ->suspend_late and ->suspend_noirq, why can't the
core skip these callbacks too if there's no middle layer?

But the reason why I really need this is because
i2c-designware-platdrv can work both with the ACPI PM domain and
standalone and I need it to be handled consistently in both cases.

> I think we really need to do that, before adding yet another system suspend/resume optimization
> path in the PM core.

So what exactly is the technical argument in the above?

> The main reason is that lots of drivers have additional operations to
> perform, besides making sure that its device is put into a "runtime
> suspended state" during system suspend. In other words, skipping
> system suspend callbacks (and likewise for system resume) is to me the
> wrong solution to mitigate these problems.
>
>> This change really makes it possible for, say, platform device
>> drivers to re-use runtime PM suspend and resume callbacks by
>> pointing ->suspend_late and ->resume_early, respectively (and
>> possibly the analogous hibernation-related callback pointers too),
>> to them without adding any extra "is the device already suspended?"
>> type of checks to the callback routines, as long as they will be
>> invoked directly by the core.
>
> Certainly there are drivers that could start to deploying this
> solution, because at the moment they don't have any additional
> operations to perform at system suspend. But what about the rest,
> don't we care about them?

We do, somewhat. :-)

> Moreover, what happens when/if a driver that has deployed this
> solution, starts being used on a new SoC and then additional
> operations during system suspend becomes required (for example
> pinctrls that needs to be put in a system sleep state)? Then
> everything falls apart, doesn't it?

Then you runtime-resume the device in ->suspend() and the remaining
callbacks will run for you just fine.

And IMO running "late" and "noirq" system suspend callbacks on a
suspended device is super-fragile anyway as they generally need to
distinguish between the "suspended" and "not suspended" cases
consistently and what they do may affect the children or the parent of
the device in ways that are difficult to predict in general.  So, I'd
rather not do that in any case.

Thanks,
Rafael
Rafael J. Wysocki Dec. 19, 2017, 11:19 a.m. UTC | #3
On Tue, Dec 19, 2017 at 12:13 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>

[cut]

>
>> Moreover, what happens when/if a driver that has deployed this
>> solution, starts being used on a new SoC and then additional
>> operations during system suspend becomes required (for example
>> pinctrls that needs to be put in a system sleep state)? Then
>> everything falls apart, doesn't it?
>
> Then you runtime-resume the device in ->suspend() and the remaining
> callbacks will run for you just fine.

BTW, I guess you'll need a middle layer in that case anyway so that
the driver doesn't have to distinguish between platforms it has to
work with which makes the argument somewhat moot.

Thanks,
Rafael
Ulf Hansson Dec. 19, 2017, 1:10 p.m. UTC | #4
On 19 December 2017 at 12:13, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Make the PM core avoid invoking the "late" and "noirq" system-wide
>>> suspend (or analogous) callbacks provided by device drivers directly
>>> for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
>>> suspend during the "late" and "noirq" phases of system-wide suspend
>>> (or analogous) transitions.  That is only done for devices without
>>> any middle-layer "late" and "noirq" suspend callbacks (to avoid
>>> confusing the middle layer if there is one).
>>>
>>> The underlying observation is that runtime PM is disabled for devices
>>> during the "late" and "noirq" system-wide suspend phases, so if they
>>> remain in runtime suspend from the "late" phase forward, it doesn't
>>> make sense to invoke the "late" and "noirq" callbacks provided by
>>> the drivers for them (arguably, the device is already suspended and
>>> in the right state).  Thus, if the remaining driver suspend callbacks
>>> are to be invoked directly by the core, they can be skipped.
>>>
>
> It looks like I'm consistently failing to explain my point clearly enough. :-)
>
>> As I have stated earlier, this isn't going to solve the general case,
>> as the above change log seems to state.
>
> Well, it doesn't say that or anything similar, at least to my eyes.
>
> The observation is that if you have set DPM_FLAG_SMART_SUSPEND, then
> you need to be prepared for your ->suspend_late and ->suspend_noirq to
> be skipped (because the ACPI PM domain does that and you may happen to
> work with it, for example) if the device is already suspended at the
> beginning of the "late suspend" phase.  That's already documented.
>
> Given the above, and the fact that there is not much to be done for a
> suspended device in ->suspend_late and ->suspend_noirq, why can't the
> core skip these callbacks too if there's no middle layer?
>
> But the reason why I really need this is because
> i2c-designware-platdrv can work both with the ACPI PM domain and
> standalone and I need it to be handled consistently in both cases.

Yeah, I understand that.

>
>> I think we really need to do that, before adding yet another system suspend/resume optimization
>> path in the PM core.
>
> So what exactly is the technical argument in the above?

As stated, when the driver needs additional operations to be done, it
all falls a part.

>
>> The main reason is that lots of drivers have additional operations to
>> perform, besides making sure that its device is put into a "runtime
>> suspended state" during system suspend. In other words, skipping
>> system suspend callbacks (and likewise for system resume) is to me the
>> wrong solution to mitigate these problems.
>>
>>> This change really makes it possible for, say, platform device
>>> drivers to re-use runtime PM suspend and resume callbacks by
>>> pointing ->suspend_late and ->resume_early, respectively (and
>>> possibly the analogous hibernation-related callback pointers too),
>>> to them without adding any extra "is the device already suspended?"
>>> type of checks to the callback routines, as long as they will be
>>> invoked directly by the core.
>>
>> Certainly there are drivers that could start to deploying this
>> solution, because at the moment they don't have any additional
>> operations to perform at system suspend. But what about the rest,
>> don't we care about them?
>
> We do, somewhat. :-)
>
>> Moreover, what happens when/if a driver that has deployed this
>> solution, starts being used on a new SoC and then additional
>> operations during system suspend becomes required (for example
>> pinctrls that needs to be put in a system sleep state)? Then
>> everything falls apart, doesn't it?
>
> Then you runtime-resume the device in ->suspend() and the remaining
> callbacks will run for you just fine.

Well, this proves my concern.

Even if the driver has additional operations to perform, why should it
have to runtime resume its device to have the callbacks to be invoked?
That may be a waste in both time and power.

No, this isn't good enough, sorry.

>
> And IMO running "late" and "noirq" system suspend callbacks on a
> suspended device is super-fragile anyway as they generally need to
> distinguish between the "suspended" and "not suspended" cases
> consistently and what they do may affect the children or the parent of
> the device in ways that are difficult to predict in general.  So, I'd
> rather not do that in any case.

That may be the case for the ACPI PM domain, but I don't see an issue
when devices are being attached to a more trivial middle layer/PM
domain.

Kind regards
Uffe
Ulf Hansson Dec. 19, 2017, 1:15 p.m. UTC | #5
On 19 December 2017 at 12:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Dec 19, 2017 at 12:13 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>
> [cut]
>
>>
>>> Moreover, what happens when/if a driver that has deployed this
>>> solution, starts being used on a new SoC and then additional
>>> operations during system suspend becomes required (for example
>>> pinctrls that needs to be put in a system sleep state)? Then
>>> everything falls apart, doesn't it?
>>
>> Then you runtime-resume the device in ->suspend() and the remaining
>> callbacks will run for you just fine.
>
> BTW, I guess you'll need a middle layer in that case anyway so that
> the driver doesn't have to distinguish between platforms it has to
> work with which makes the argument somewhat moot.

No, this isn't the case. Let's take the pinctrl as an example.

The pinctrl system sleep state could be optionally defined in the DT,
depending on the platform.

All the driver shall do, is to try to set the state, if it has been defined.

Kind regards
Uffe
Rafael J. Wysocki Dec. 19, 2017, 4:29 p.m. UTC | #6
On Tue, Dec 19, 2017 at 2:10 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 19 December 2017 at 12:13, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> Make the PM core avoid invoking the "late" and "noirq" system-wide
>>>> suspend (or analogous) callbacks provided by device drivers directly
>>>> for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
>>>> suspend during the "late" and "noirq" phases of system-wide suspend
>>>> (or analogous) transitions.  That is only done for devices without
>>>> any middle-layer "late" and "noirq" suspend callbacks (to avoid
>>>> confusing the middle layer if there is one).
>>>>
>>>> The underlying observation is that runtime PM is disabled for devices
>>>> during the "late" and "noirq" system-wide suspend phases, so if they
>>>> remain in runtime suspend from the "late" phase forward, it doesn't
>>>> make sense to invoke the "late" and "noirq" callbacks provided by
>>>> the drivers for them (arguably, the device is already suspended and
>>>> in the right state).  Thus, if the remaining driver suspend callbacks
>>>> are to be invoked directly by the core, they can be skipped.
>>>>
>>
>> It looks like I'm consistently failing to explain my point clearly enough. :-)
>>
>>> As I have stated earlier, this isn't going to solve the general case,
>>> as the above change log seems to state.

No, it doesn't, as long as drivers follow the documentation.

So your concern seems to be "What if I don't follow the
documentation?" which, honestly, is not something I can address. :-)

>> Well, it doesn't say that or anything similar, at least to my eyes.
>>
>> The observation is that if you have set DPM_FLAG_SMART_SUSPEND, then
>> you need to be prepared for your ->suspend_late and ->suspend_noirq to
>> be skipped (because the ACPI PM domain does that and you may happen to
>> work with it, for example) if the device is already suspended at the
>> beginning of the "late suspend" phase.  That's already documented.
>>
>> Given the above, and the fact that there is not much to be done for a
>> suspended device in ->suspend_late and ->suspend_noirq, why can't the
>> core skip these callbacks too if there's no middle layer?
>>
>> But the reason why I really need this is because
>> i2c-designware-platdrv can work both with the ACPI PM domain and
>> standalone and I need it to be handled consistently in both cases.
>
> Yeah, I understand that.
>
>>
>>> I think we really need to do that, before adding yet another system suspend/resume optimization
>>> path in the PM core.
>>
>> So what exactly is the technical argument in the above?
>
> As stated, when the driver needs additional operations to be done, it
> all falls a part.

If the driver *needs* such operations to be done, then it *should*
*not* set DPM_FLAG_SMART_SUSPEND as per the existing documentation.

>
>>
>>> The main reason is that lots of drivers have additional operations to
>>> perform, besides making sure that its device is put into a "runtime
>>> suspended state" during system suspend. In other words, skipping
>>> system suspend callbacks (and likewise for system resume) is to me the
>>> wrong solution to mitigate these problems.
>>>
>>>> This change really makes it possible for, say, platform device
>>>> drivers to re-use runtime PM suspend and resume callbacks by
>>>> pointing ->suspend_late and ->resume_early, respectively (and
>>>> possibly the analogous hibernation-related callback pointers too),
>>>> to them without adding any extra "is the device already suspended?"
>>>> type of checks to the callback routines, as long as they will be
>>>> invoked directly by the core.
>>>
>>> Certainly there are drivers that could start to deploying this
>>> solution, because at the moment they don't have any additional
>>> operations to perform at system suspend. But what about the rest,
>>> don't we care about them?
>>
>> We do, somewhat. :-)
>>
>>> Moreover, what happens when/if a driver that has deployed this
>>> solution, starts being used on a new SoC and then additional
>>> operations during system suspend becomes required (for example
>>> pinctrls that needs to be put in a system sleep state)? Then
>>> everything falls apart, doesn't it?
>>
>> Then you runtime-resume the device in ->suspend() and the remaining
>> callbacks will run for you just fine.
>
> Well, this proves my concern.
>
> Even if the driver has additional operations to perform, why should it
> have to runtime resume its device to have the callbacks to be invoked?
> That may be a waste in both time and power.
>
> No, this isn't good enough, sorry.

What isn't good enough for what?

>>
>> And IMO running "late" and "noirq" system suspend callbacks on a
>> suspended device is super-fragile anyway as they generally need to
>> distinguish between the "suspended" and "not suspended" cases
>> consistently and what they do may affect the children or the parent of
>> the device in ways that are difficult to predict in general.  So, I'd
>> rather not do that in any case.
>
> That may be the case for the ACPI PM domain, but I don't see an issue
> when devices are being attached to a more trivial middle layer/PM
> domain.

There always is a problem.

Yes, you can ignore it, but it doesn't make it automatically go away.

Also, what forces you to set DPM_FLAG_SMART_SUSPEND anyway?

Thanks,
Rafael
Rafael J. Wysocki Dec. 19, 2017, 4:35 p.m. UTC | #7
On Tue, Dec 19, 2017 at 2:15 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 19 December 2017 at 12:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Tue, Dec 19, 2017 at 12:13 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>>> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>
>> [cut]
>>
>>>
>>>> Moreover, what happens when/if a driver that has deployed this
>>>> solution, starts being used on a new SoC and then additional
>>>> operations during system suspend becomes required (for example
>>>> pinctrls that needs to be put in a system sleep state)? Then
>>>> everything falls apart, doesn't it?
>>>
>>> Then you runtime-resume the device in ->suspend() and the remaining
>>> callbacks will run for you just fine.
>>
>> BTW, I guess you'll need a middle layer in that case anyway so that
>> the driver doesn't have to distinguish between platforms it has to
>> work with which makes the argument somewhat moot.
>
> No, this isn't the case. Let's take the pinctrl as an example.
>
> The pinctrl system sleep state could be optionally defined in the DT,
> depending on the platform.
>
> All the driver shall do, is to try to set the state, if it has been defined.

And how exactly does the driver know what state should be set during
system suspend, say?

Anyway, though, as I said, nothing forces you to set
DPM_FLAG_SMART_SUSPEND.  Don't set it if it is not suitable for you
and no one else is going to be affected.

Thanks,
Rafael
Rafael J. Wysocki Dec. 19, 2017, 4:54 p.m. UTC | #8
On Tue, Dec 19, 2017 at 5:29 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Dec 19, 2017 at 2:10 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 19 December 2017 at 12:13, Rafael J. Wysocki <rafael@kernel.org> wrote:
>>> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>>>> Make the PM core avoid invoking the "late" and "noirq" system-wide
>>>>> suspend (or analogous) callbacks provided by device drivers directly
>>>>> for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
>>>>> suspend during the "late" and "noirq" phases of system-wide suspend
>>>>> (or analogous) transitions.  That is only done for devices without
>>>>> any middle-layer "late" and "noirq" suspend callbacks (to avoid
>>>>> confusing the middle layer if there is one).
>>>>>
>>>>> The underlying observation is that runtime PM is disabled for devices
>>>>> during the "late" and "noirq" system-wide suspend phases, so if they
>>>>> remain in runtime suspend from the "late" phase forward, it doesn't
>>>>> make sense to invoke the "late" and "noirq" callbacks provided by
>>>>> the drivers for them (arguably, the device is already suspended and
>>>>> in the right state).  Thus, if the remaining driver suspend callbacks
>>>>> are to be invoked directly by the core, they can be skipped.
>>>>>
>>>
>>> It looks like I'm consistently failing to explain my point clearly enough. :-)
>>>
>>>> As I have stated earlier, this isn't going to solve the general case,
>>>> as the above change log seems to state.
>
> No, it doesn't, as long as drivers follow the documentation.
>
> So your concern seems to be "What if I don't follow the
> documentation?" which, honestly, is not something I can address. :-)

I'm not sure why this ended up in this particular place ...

I must have messed up something.

>>> Well, it doesn't say that or anything similar, at least to my eyes.
>>>
>>> The observation is that if you have set DPM_FLAG_SMART_SUSPEND, then
>>> you need to be prepared for your ->suspend_late and ->suspend_noirq to
>>> be skipped (because the ACPI PM domain does that and you may happen to
>>> work with it, for example) if the device is already suspended at the
>>> beginning of the "late suspend" phase.  That's already documented.
>>>
>>> Given the above, and the fact that there is not much to be done for a
>>> suspended device in ->suspend_late and ->suspend_noirq, why can't the
>>> core skip these callbacks too if there's no middle layer?
>>>
>>> But the reason why I really need this is because
>>> i2c-designware-platdrv can work both with the ACPI PM domain and
>>> standalone and I need it to be handled consistently in both cases.
>>
>> Yeah, I understand that.
>>
>>>
>>>> I think we really need to do that, before adding yet another system suspend/resume optimization
>>>> path in the PM core.
>>>
>>> So what exactly is the technical argument in the above?
>>
>> As stated, when the driver needs additional operations to be done, it
>> all falls a part.

So I wanted to say the above thing here:

+ No, it doesn't, as long as drivers follow the documentation.
+
+ So your concern seems to be "What if I don't follow the
+ documentation?" which, honestly, is not something I can address. :-)

> If the driver *needs* such operations to be done, then it *should*
> *not* set DPM_FLAG_SMART_SUSPEND as per the existing documentation.

Thanks,
Rafael
diff mbox

Patch

Index: linux-pm/Documentation/driver-api/pm/devices.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/pm/devices.rst
+++ linux-pm/Documentation/driver-api/pm/devices.rst
@@ -777,14 +777,16 @@  The driver can indicate that by setting
 runtime suspend at the beginning of the ``suspend_late`` phase of system-wide
 suspend (or in the ``poweroff_late`` phase of hibernation), when runtime PM
 has been disabled for it, under the assumption that its state should not change
-after that point until the system-wide transition is over.  If that happens, the
-driver's system-wide resume callbacks, if present, may still be invoked during
-the subsequent system-wide resume transition and the device's runtime power
-management status may be set to "active" before enabling runtime PM for it,
-so the driver must be prepared to cope with the invocation of its system-wide
-resume callbacks back-to-back with its ``->runtime_suspend`` one (without the
-intervening ``->runtime_resume`` and so on) and the final state of the device
-must reflect the "active" status for runtime PM in that case.
+after that point until the system-wide transition is over (the PM core itself
+does that for devices whose "noirq", "late" and "early" system-wide PM callbacks
+are executed directly by it).  If that happens, the driver's system-wide resume
+callbacks, if present, may still be invoked during the subsequent system-wide
+resume transition and the device's runtime power management status may be set
+to "active" before enabling runtime PM for it, so the driver must be prepared to
+cope with the invocation of its system-wide resume callbacks back-to-back with
+its ``->runtime_suspend`` one (without the intervening ``->runtime_resume`` and
+so on) and the final state of the device must reflect the "active" runtime PM
+status in that case.
 
 During system-wide resume from a sleep state it's easiest to put devices into
 the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -541,6 +541,24 @@  void dev_pm_skip_next_resume_phases(stru
 }
 
 /**
+ * suspend_event - Return a "suspend" message for given "resume" one.
+ * @resume_msg: PM message representing a system-wide resume transition.
+ */
+static pm_message_t suspend_event(pm_message_t resume_msg)
+{
+	switch (resume_msg.event) {
+	case PM_EVENT_RESUME:
+		return PMSG_SUSPEND;
+	case PM_EVENT_THAW:
+	case PM_EVENT_RESTORE:
+		return PMSG_FREEZE;
+	case PM_EVENT_RECOVER:
+		return PMSG_HIBERNATE;
+	}
+	return PMSG_ON;
+}
+
+/**
  * dev_pm_may_skip_resume - System-wide device resume optimization check.
  * @dev: Target device.
  *
@@ -581,6 +599,14 @@  static pm_callback_t dpm_subsys_resume_n
 	return callback;
 }
 
+static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev,
+						 pm_message_t state,
+						 const char **info_p);
+
+static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev,
+						pm_message_t state,
+						const char **info_p);
+
 /**
  * device_resume_noirq - Execute a "noirq resume" callback for given device.
  * @dev: Device to handle.
@@ -608,13 +634,40 @@  static int device_resume_noirq(struct de
 	dpm_wait_for_superior(dev, async);
 
 	callback = dpm_subsys_resume_noirq_cb(dev, state, &info);
+	if (callback)
+		goto Run;
 
-	if (!callback && dev->driver && dev->driver->pm) {
+	if (dev_pm_smart_suspend_and_suspended(dev)) {
+		pm_message_t suspend_msg = suspend_event(state);
+
+		/*
+		 * If "freeze" callbacks have been skipped during a transition
+		 * related to hibernation, the subsequent "thaw" callbacks must
+		 * be skipped too or bad things may happen.  Otherwise, resume
+		 * callbacks are going to be run for the device, so its runtime
+		 * PM status must be changed to reflect the new state after the
+		 * transition under way.
+		 */
+		if (!dpm_subsys_suspend_late_cb(dev, suspend_msg, NULL) &&
+		    !dpm_subsys_suspend_noirq_cb(dev, suspend_msg, NULL)) {
+			if (state.event == PM_EVENT_THAW) {
+				dev_pm_skip_next_resume_phases(dev);
+				goto Skip;
+			} else {
+				pm_runtime_set_active(dev);
+			}
+		}
+	}
+
+	if (dev->driver && dev->driver->pm) {
 		info = "noirq driver ";
 		callback = pm_noirq_op(dev->driver->pm, state);
 	}
 
+Run:
 	error = dpm_run_callback(callback, dev, state, info);
+
+Skip:
 	dev->power.is_noirq_suspended = false;
 
 	if (dev_pm_may_skip_resume(dev)) {
@@ -629,7 +682,7 @@  static int device_resume_noirq(struct de
 		dev_pm_skip_next_resume_phases(dev);
 	}
 
- Out:
+Out:
 	complete_all(&dev->power.completion);
 	TRACE_RESUME(error);
 	return error;
@@ -1224,18 +1277,26 @@  static int __device_suspend_noirq(struct
 		goto Complete;
 
 	callback = dpm_subsys_suspend_noirq_cb(dev, state, &info);
+	if (callback)
+		goto Run;
 
-	if (!callback && dev->driver && dev->driver->pm) {
+	if (dev_pm_smart_suspend_and_suspended(dev) &&
+	    !dpm_subsys_suspend_late_cb(dev, state, NULL))
+		goto Skip;
+
+	if (dev->driver && dev->driver->pm) {
 		info = "noirq driver ";
 		callback = pm_noirq_op(dev->driver->pm, state);
 	}
 
+Run:
 	error = dpm_run_callback(callback, dev, state, info);
 	if (error) {
 		async_error = error;
 		goto Complete;
 	}
 
+Skip:
 	dev->power.is_noirq_suspended = true;
 
 	if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED)) {
@@ -1420,17 +1481,27 @@  static int __device_suspend_late(struct
 		goto Complete;
 
 	callback = dpm_subsys_suspend_late_cb(dev, state, &info);
+	if (callback)
+		goto Run;
 
-	if (!callback && dev->driver && dev->driver->pm) {
+	if (dev_pm_smart_suspend_and_suspended(dev) &&
+	    !dpm_subsys_suspend_noirq_cb(dev, state, NULL))
+		goto Skip;
+
+	if (dev->driver && dev->driver->pm) {
 		info = "late driver ";
 		callback = pm_late_early_op(dev->driver->pm, state);
 	}
 
+Run:
 	error = dpm_run_callback(callback, dev, state, info);
-	if (!error)
-		dev->power.is_late_suspended = true;
-	else
+	if (error) {
 		async_error = error;
+		goto Complete;
+	}
+
+Skip:
+	dev->power.is_late_suspended = true;
 
 Complete:
 	TRACE_SUSPEND(error);