mbox series

[v1,0/5] PM: sleep: Improvements of async suspend and resume of devices

Message ID 13709135.uLZWGnKmhe@rjwysocki.net (mailing list archive)
Headers show
Series PM: sleep: Improvements of async suspend and resume of devices | expand

Message

Rafael J. Wysocki Feb. 25, 2025, 4:38 p.m. UTC
Hi Everyone,

Initially, this was an attempt to address the problems described by
Saravana related to spawning async work for any async device upfront
in the resume path:

https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/

but then I realized that it could be extended to the suspend path and
used for speeding it up, which it really does.

Overall, the idea is that instead of starting an async work item for every
async device upfront, which is not very efficient because the majority of
those devices will not be able to make progress due to dependencies anyway,
the async handling is only started upfront for the devices that are likely
to be able to make progress.  That is, devices without parents in the resume
path and leaf devices (ie. devices without children or consumers) in the
suspend path (the underlying observation here is that devices without parents
are likely to have no suppliers too whereas devices without children that
have consumers are not unheard of).  This allows to reduce the amount of
processing that needs to be done to start with.

Then, after processing every device ("async" or "sync"), "async" processing
is started for some devices that have been "unblocked" by it, which are its
children in the resume path or its parent and its suppliers in the suspend
path.  This allows asynchronous handling to start as soon as it makes sense
without delaying the "async" devices unnecessarily.

Fortunately, the additional plumbing needed to implement this is not
particularly complicated.

The first two patches in the series are preparatory.

Patch [3/5] deals with the resume path for all device resume phases.

Patch [4/5] optimizes the "suspend" phase which has the most visible effect (on
the systems in my office the speedup is in the 100 ms range which is around 20%
of the total device resume time).

Patch [5/5] extend this to the "suspend late" and "suspend noirq" phases.

Thanks!

Comments

Saravana Kannan Feb. 27, 2025, 3:44 p.m. UTC | #1
On Tue, Feb 25, 2025 at 8:46 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> Hi Everyone,
>
> Initially, this was an attempt to address the problems described by
> Saravana related to spawning async work for any async device upfront
> in the resume path:
>
> https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/
>
> but then I realized that it could be extended to the suspend path and
> used for speeding it up, which it really does.

Btw, maybe I didn't  word it correctly, but my patch series was meant
to speed up the non-async case too.

I was going to get around sending a v2 of my series, but was caught up
with some other work. But I'm okay if you want to finish up my effort
-- less work for me and I can focus on the other aspects of suspend :)

Maybe add a Suggested-by: to the patches?

I definitely want to review the series, but very busy this week with
some other work. I'll get to this next week for sure.

Thanks,
Saravana

>
> Overall, the idea is that instead of starting an async work item for every
> async device upfront, which is not very efficient because the majority of
> those devices will not be able to make progress due to dependencies anyway,
> the async handling is only started upfront for the devices that are likely
> to be able to make progress.  That is, devices without parents in the resume
> path and leaf devices (ie. devices without children or consumers) in the
> suspend path (the underlying observation here is that devices without parents
> are likely to have no suppliers too whereas devices without children that
> have consumers are not unheard of).  This allows to reduce the amount of
> processing that needs to be done to start with.
>
> Then, after processing every device ("async" or "sync"), "async" processing
> is started for some devices that have been "unblocked" by it, which are its
> children in the resume path or its parent and its suppliers in the suspend
> path.  This allows asynchronous handling to start as soon as it makes sense
> without delaying the "async" devices unnecessarily.
>
> Fortunately, the additional plumbing needed to implement this is not
> particularly complicated.
>
> The first two patches in the series are preparatory.
>
> Patch [3/5] deals with the resume path for all device resume phases.
>
> Patch [4/5] optimizes the "suspend" phase which has the most visible effect (on
> the systems in my office the speedup is in the 100 ms range which is around 20%
> of the total device resume time).
>
> Patch [5/5] extend this to the "suspend late" and "suspend noirq" phases.
>
> Thanks!
>
>
>
Rafael J. Wysocki Feb. 27, 2025, 4:23 p.m. UTC | #2
On Thu, Feb 27, 2025 at 4:45 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Tue, Feb 25, 2025 at 8:46 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > Hi Everyone,
> >
> > Initially, this was an attempt to address the problems described by
> > Saravana related to spawning async work for any async device upfront
> > in the resume path:
> >
> > https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/
> >
> > but then I realized that it could be extended to the suspend path and
> > used for speeding it up, which it really does.
>
> Btw, maybe I didn't  word it correctly, but my patch series was meant
> to speed up the non-async case too.

If "the non-async case" means the case with "async" suspend/resume
disabled entirely, I don't think that the ordering in which devices
are processed can be changed just because there are no known
dependencies.

> I was going to get around sending a v2 of my series, but was caught up
> with some other work. But I'm okay if you want to finish up my effort
> -- less work for me and I can focus on the other aspects of suspend :)
>
> Maybe add a Suggested-by: to the patches?

Yeah, I can do that.

> I definitely want to review the series, but very busy this week with
> some other work. I'll get to this next week for sure.

That should be fine.

> > Overall, the idea is that instead of starting an async work item for every
> > async device upfront, which is not very efficient because the majority of
> > those devices will not be able to make progress due to dependencies anyway,
> > the async handling is only started upfront for the devices that are likely
> > to be able to make progress.  That is, devices without parents in the resume
> > path and leaf devices (ie. devices without children or consumers) in the
> > suspend path (the underlying observation here is that devices without parents
> > are likely to have no suppliers too whereas devices without children that
> > have consumers are not unheard of).  This allows to reduce the amount of
> > processing that needs to be done to start with.
> >
> > Then, after processing every device ("async" or "sync"), "async" processing
> > is started for some devices that have been "unblocked" by it, which are its
> > children in the resume path or its parent and its suppliers in the suspend
> > path.  This allows asynchronous handling to start as soon as it makes sense
> > without delaying the "async" devices unnecessarily.
> >
> > Fortunately, the additional plumbing needed to implement this is not
> > particularly complicated.
> >
> > The first two patches in the series are preparatory.
> >
> > Patch [3/5] deals with the resume path for all device resume phases.
> >
> > Patch [4/5] optimizes the "suspend" phase which has the most visible effect (on
> > the systems in my office the speedup is in the 100 ms range which is around 20%
> > of the total device resume time).
> >
> > Patch [5/5] extend this to the "suspend late" and "suspend noirq" phases.
> >
> > Thanks!
Ulf Hansson March 3, 2025, 12:06 p.m. UTC | #3
On Tue, 25 Feb 2025 at 17:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> Hi Everyone,
>
> Initially, this was an attempt to address the problems described by
> Saravana related to spawning async work for any async device upfront
> in the resume path:
>
> https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/
>
> but then I realized that it could be extended to the suspend path and
> used for speeding it up, which it really does.
>
> Overall, the idea is that instead of starting an async work item for every
> async device upfront, which is not very efficient because the majority of
> those devices will not be able to make progress due to dependencies anyway,
> the async handling is only started upfront for the devices that are likely
> to be able to make progress.  That is, devices without parents in the resume
> path and leaf devices (ie. devices without children or consumers) in the
> suspend path (the underlying observation here is that devices without parents
> are likely to have no suppliers too whereas devices without children that
> have consumers are not unheard of).  This allows to reduce the amount of
> processing that needs to be done to start with.
>
> Then, after processing every device ("async" or "sync"), "async" processing
> is started for some devices that have been "unblocked" by it, which are its
> children in the resume path or its parent and its suppliers in the suspend
> path.  This allows asynchronous handling to start as soon as it makes sense
> without delaying the "async" devices unnecessarily.
>
> Fortunately, the additional plumbing needed to implement this is not
> particularly complicated.

Thanks for the detailed description! Overall, the approach makes
perfect sense to me too!

I am certainly interested to hear Saravana's thoughts around this too.

>
> The first two patches in the series are preparatory.

For these two, feel free to add:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

>
> Patch [3/5] deals with the resume path for all device resume phases.
>
> Patch [4/5] optimizes the "suspend" phase which has the most visible effect (on
> the systems in my office the speedup is in the 100 ms range which is around 20%
> of the total device resume time).
>
> Patch [5/5] extend this to the "suspend late" and "suspend noirq" phases.

I will try to have a closer look at patch 3->5 later in the week.

Kind regards
Uffe
Rafael J. Wysocki March 3, 2025, 12:08 p.m. UTC | #4
On Mon, Mar 3, 2025 at 1:07 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Tue, 25 Feb 2025 at 17:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > Hi Everyone,
> >
> > Initially, this was an attempt to address the problems described by
> > Saravana related to spawning async work for any async device upfront
> > in the resume path:
> >
> > https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/
> >
> > but then I realized that it could be extended to the suspend path and
> > used for speeding it up, which it really does.
> >
> > Overall, the idea is that instead of starting an async work item for every
> > async device upfront, which is not very efficient because the majority of
> > those devices will not be able to make progress due to dependencies anyway,
> > the async handling is only started upfront for the devices that are likely
> > to be able to make progress.  That is, devices without parents in the resume
> > path and leaf devices (ie. devices without children or consumers) in the
> > suspend path (the underlying observation here is that devices without parents
> > are likely to have no suppliers too whereas devices without children that
> > have consumers are not unheard of).  This allows to reduce the amount of
> > processing that needs to be done to start with.
> >
> > Then, after processing every device ("async" or "sync"), "async" processing
> > is started for some devices that have been "unblocked" by it, which are its
> > children in the resume path or its parent and its suppliers in the suspend
> > path.  This allows asynchronous handling to start as soon as it makes sense
> > without delaying the "async" devices unnecessarily.
> >
> > Fortunately, the additional plumbing needed to implement this is not
> > particularly complicated.
>
> Thanks for the detailed description! Overall, the approach makes
> perfect sense to me too!
>
> I am certainly interested to hear Saravana's thoughts around this too.
>
> >
> > The first two patches in the series are preparatory.
>
> For these two, feel free to add:
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
>
> >
> > Patch [3/5] deals with the resume path for all device resume phases.
> >
> > Patch [4/5] optimizes the "suspend" phase which has the most visible effect (on
> > the systems in my office the speedup is in the 100 ms range which is around 20%
> > of the total device resume time).
> >
> > Patch [5/5] extend this to the "suspend late" and "suspend noirq" phases.
>
> I will try to have a closer look at patch 3->5 later in the week.

Thank you and thanks for all of the other reviews!