diff mbox

[00/04,RFC] PM: Runtime platform device PM

Message ID 200906020121.38946.rjw@sisk.pl (mailing list archive)
State RFC
Headers show

Commit Message

Rafael Wysocki June 1, 2009, 11:21 p.m. UTC
On Tuesday 02 June 2009, Alan Stern wrote:
> On Mon, 1 Jun 2009, Rafael J. Wysocki wrote:
> 
> > > With USB at least, this isn't true.  Some work items shouldn't be
> > > dequeued, namely, various sorts of resume requests generated while the
> > > driver is in interrupt context.  If they were dequeued during a system
> > > suspend then the device wouldn't ever be woken up, because usbcore
> > > doesn't pass system-resume messages to an autosuspended device.  The
> > > idea is if a device was autosuspended before the system sleep then it
> > > should remain autosuspended after the system wakes up.
> > 
> > Hmm, well, I'm not sure, really.  During a system-wide resume, does it really
> > matter if devices were autosuspended before the preceding suspend or they have
> > been suspended by the PM core?
> 
> It's partly a question of avoiding unnecessary activity.  Is there any
> reason to resume a device if it's just going to be autosuspended again
> a few seconds later?

Well, I'm not sure if we know in advance what's going to happen to the device
right after a system-wide resume.  It may be a totally different situation from
the device activity point of view.

> It's also a matter of principle.  A system sleep should be as
> transparent as possible (except for the passage of time, of course).  
> When the sleep is over, devices should be in the same state as before
> it began.

As a matter of fact, this need not be true at all.  For example, consider a
wireless network adapter that was assosicated with specific access point
before a system-wide suspend, but this access point is not present any more
after the subsequent system-wide resume.

I really don't think we can predict what's going to happen to a device after
a system-wide resume, so IMO we should assume that the device will be necessary
immediately after the resume and we should resume it.  It still can autosuspend
normally afterwards if it's not needed after all.

> >  Also, what happens if the platform firmware
> > resumes the autosuspended devices before passing control to the kernel during
> > system-wide resume?  How do we handle that?
> 
> In the case of USB, it's not possible to resume a suspended device
> without first resuming and using a host controller.  The host
> controller driver will realize that the firmware has taken over the
> controller and will reset the controller.
> 
> For devices that were not autosuspended, this action converts an
> ordinary resume into a reset-resume.  A different driver method is
> called, and drivers that don't support reset-resume are automatically
> unbound from their devices (and then reprobed afterward).
> 
> For devices that were autosuspended, the controller reset will put them 
> back into a suspended state.  They will remain that way until they are 
> autoresumed -- and of course it will be an "auto-reset-resume".

I'm not sure if this is doable so easily for PCI devices, for one example.
Most probably, they will come up in D0 after the system-wide suspend and
the PCI resume code will have to suspend them directly so that they remain
"autosuspended" after the resume.

> What happens if the firmware is sufficiently sneaky that it manages to
> fool the host controller driver?  In that case we end up with a device
> that really is at full power even though the kernel thinks it is
> autosuspended.  When we finally try to autoresume it, we will realize
> that it wasn't suspended after all.
> 
> > Moreover, what if the autosuspended device is no longer present in the system
> > after a system-wide resume.  Are we still going to attempt to autoresume it?
> 
> When the device or perhaps its parent hub is resumed, we will realize 
> the device is gone.  An attempted autoresume will simply fail.  Until 
> that happens, the kernel may believe the device is present when in fact 
> it isn't.
> 
> > > Furthermore, even if the existing work items were dequeued, you'd still 
> > > have to worry about new work items added on by drivers before they get 
> > > suspended.  If the workqueue were allowed to run freely, we might find 
> > > devices being autoresumed in the middle of the sleep transition!
> > 
> > Unless there's some other kind of synchronization between the workqueue and the
> > core suspend-resume code.
> 
> So the work routine would somehow be aware when a system sleep is in
> progress, at which point it would return immediately without doing
> anything?  That seems racey.  Isn't it easier just to freeze the
> workqueue?

Well, yes and no.  Namely, if we freeze the workqueue, all of the bus types
using it will have to check the vaildity of operations carried out by it,
because of the possibility that system-wide power transition took place in the
meantime and changed the state of the device.  Also, please take the PCI
example into consideration.

> > IMO, it would be convenient to treat every system-wide suspend (or hibernation)
> > as a cancellation point for all of the pending autosuspend and autoresume
> > requests and to treat devices autosuspended before that point as just
> > suspended.  Of course, the bus type and device driver suspend callbacks would
> > have to be prepared to handle this situation cleanly.
> 
> For pending autosuspend, okay.  The device was powered up before the
> system sleep, it remains powered up afterward, and another autosuspend
> will occur in due course.
> 
> For pending autoresume requests coming from user I/O calls, again okay.  
> There can't be any of these because userspace is frozen, and besides,
> they wouldn't go through the workqueue anyway.
> 
> For autoresume requests coming from an interrupt handler, I'm not so 
> sure.  I don't know of any good examples -- in fact there don't seem to 
> be any examples at the moment.  (The USB core supports autoresume 
> requests in interrupt context, AFAICS no driver uses them.)  So maybe 
> it's premature to worry about this case.

Hmm.  Can you please describe a theoretical situation in which this kind
of autoresume request will appear?  I'm not really sure what you mean here. :-)

> Lastly, we have resume requests coming from a device -- i.e., remote
> wakeup requests.  This is a little tricky.  Suppose you have remote
> wakeup enabled, and you press a key on your USB keyboard just as the
> computer is going to sleep.  Should this cause the sleep to be aborted?  
> Should it cause the computer to wake back up again right away?  Should
> the keystroke be ignored?

This is a difficult problem in general.  I _think_ we should ignore such
autoresume requests for devices that are not marked as wake-up ones
and we should abort the suspend if there's a request from a wake-up device.
 
> More to the point, suppose you plug in a new USB device as the computer
> is going to sleep.  When the computer wakes up, shouldn't it realize
> that a new device has been connected?  This is the same problem that
> other people have been discussing -- what should happen to wakeup IRQs
> during the time the system is suspending?  Will they get delivered 
> later on (and when drivers are ready to receive them)?

I think they should abort the suspend unless it's too late (eg. the suspend
sequence has reached sysdev_suspend()).

Anyway, below is a prototype implementation of a generic runtime PM workqueue,
just for discussion.

pm_schedule_suspend() and pm_schedule_resume() are to be called by bus types
that are requesting the autosuspend or autoresume, respectively, of a device.
The supposed control flow is like this:
* bus type calls pm_schedule_suspend() to schedule autosuspend of a device
* at the right time the PM workqueue executes the bus type's ->autosuspend()
  callback which is responsible for calling the device driver to suspend the
  device (the way in which this is done is totally bus type-specific)
* if there's a resume event for the device, it's delivered to the bus type
  driver (in a bus type-specific way) and the bus type schedules autoresume of
  the device by calling pm_schedule_resume()
* the PM workqueue executes the bus type's ->autoresume() as soon as it can
  and this callback is responsible for calling the device driver to resume the
  device (the way in which this is done is totally bus type-specific).

Thanks,
Rafael

---
 drivers/base/power/main.c |   11 ++++--
 include/linux/pm.h        |   16 +++++++++
 kernel/power/main.c       |   75 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+), 3 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alan Stern June 2, 2009, 2:51 p.m. UTC | #1
On Tue, 2 Jun 2009, Rafael J. Wysocki wrote:

> > It's partly a question of avoiding unnecessary activity.  Is there any
> > reason to resume a device if it's just going to be autosuspended again
> > a few seconds later?
> 
> Well, I'm not sure if we know in advance what's going to happen to the device
> right after a system-wide resume.  It may be a totally different situation from
> the device activity point of view.

We _never_ know in advance what's going to happen.  An autosuspended
device may have to wake up at any time in order to carry out some work
-- that's true whether or not a system sleep has taken place.  IMO it
doesn't make sense to wake up an autosuspended device for no reason
other than that it _may_ need to do something; we should let it stay in
a low-power mode until we _know_ it has to wake up.

> > It's also a matter of principle.  A system sleep should be as
> > transparent as possible (except for the passage of time, of course).  
> > When the sleep is over, devices should be in the same state as before
> > it began.
> 
> As a matter of fact, this need not be true at all.  For example, consider a
> wireless network adapter that was assosicated with specific access point
> before a system-wide suspend, but this access point is not present any more
> after the subsequent system-wide resume.

This example doesn't prove your point.  Sure, conditions may have
changed while the system was asleep.  They might also change while the
system is awake.  In general, whether or not the system was asleep
shouldn't make any difference -- to as great an extent as possible we
should strive to pretend that the entire sleep took no time at all (or
simply didn't happen).

So: Suppose the system had been awake when the access point
disappeared.  What would the wireless adapter driver do then?  It
should try to do exactly the same thing if the access point disappears
while the system is asleep.

You see?  The sleep should be transparent.

> I really don't think we can predict what's going to happen to a device after
> a system-wide resume, so IMO we should assume that the device will be necessary
> immediately after the resume and we should resume it.  It still can autosuspend
> normally afterwards if it's not needed after all.

This is highly debatable.  I'd like to hear some thoughts from other 
people.


> > >  Also, what happens if the platform firmware
> > > resumes the autosuspended devices before passing control to the kernel during
> > > system-wide resume?  How do we handle that?
> > 
> > In the case of USB, it's not possible to resume a suspended device
> > without first resuming and using a host controller.  The host
> > controller driver will realize that the firmware has taken over the
> > controller and will reset the controller.
...

> I'm not sure if this is doable so easily for PCI devices, for one example.
> Most probably, they will come up in D0 after the system-wide suspend and
> the PCI resume code will have to suspend them directly so that they remain
> "autosuspended" after the resume.

Clearly the behavior will need to vary among buses.  What's appropriate
for USB won't be appropriate for PCI.  Furthermore, the overhead of
waking a PCI device unnecessarily might be a lot lower than the
overhead of waking a USB device, so it might make sense always to
resume autosuspended PCI devices during a system resume.


> > So the work routine would somehow be aware when a system sleep is in
> > progress, at which point it would return immediately without doing
> > anything?  That seems racey.  Isn't it easier just to freeze the
> > workqueue?
> 
> Well, yes and no.  Namely, if we freeze the workqueue, all of the bus types
> using it will have to check the vaildity of operations carried out by it,
> because of the possibility that system-wide power transition took place in the
> meantime and changed the state of the device.  Also, please take the PCI
> example into consideration.

Freezability doesn't matter; you can prevent items from being added to
the workqueue during a power transition just as easily regardless of
whether it is freezable or not.  In fact, it's easier if the workqueue
is freezable.  You don't have to prevent anything from being added; you
just have to cancel everything on the workqueue before unfreezing it.

One solution would be to have the bus types cancel their own pending
work items explicitly.  Another is to have two freezable PM workqueues,
one with automatic cancellation and the other without.


> > For autoresume requests coming from an interrupt handler, I'm not so 
> > sure.  I don't know of any good examples -- in fact there don't seem to 
> > be any examples at the moment.  (The USB core supports autoresume 
> > requests in interrupt context, AFAICS no driver uses them.)  So maybe 
> > it's premature to worry about this case.
> 
> Hmm.  Can you please describe a theoretical situation in which this kind
> of autoresume request will appear?  I'm not really sure what you mean here. :-)

Well, here's something people discussed but which hasn't been
implemented (and may never be implemented).  A suspended USB keyboard
doesn't have enough power to turn on any LEDs.  So suppose a system has
two keyboards, and kbd-A is autosuspended when the user engages
CapsLock on kbd-B.  The input layer wants to turn on the CapsLock LED
on all the keyboards, which means it has to autoresume kbd-A.  But all
this action takes place in interrupt context, so an autoresume request
would have to be queued.


> > Lastly, we have resume requests coming from a device -- i.e., remote
> > wakeup requests.  This is a little tricky.  Suppose you have remote
> > wakeup enabled, and you press a key on your USB keyboard just as the
> > computer is going to sleep.  Should this cause the sleep to be aborted?  
> > Should it cause the computer to wake back up again right away?  Should
> > the keystroke be ignored?
> 
> This is a difficult problem in general.  I _think_ we should ignore such
> autoresume requests for devices that are not marked as wake-up ones
> and we should abort the suspend if there's a request from a wake-up device.

It _is_ difficult.  Right now USB ignores such requests, because the
threads that handle them are all freezable.  But the information isn't
lost; it is stored in the hardware and when the system wakes up the
requests will be seen.

Except for wakeup requests from root hubs.  If they arrive at just the 
right time then they are not stored in the hardware -- instead they are 
stored as work items on the ksuspend_usbd workqueue!  In essence, the 
workqueue serves as a software mechanism for saving IRQs which were 
delivered but couldn't be acted on.

That's why I don't want to erase the work items; it would be logically 
equivalent to dropping interrupts.


> Anyway, below is a prototype implementation of a generic runtime PM workqueue,
> just for discussion.
> 
> pm_schedule_suspend() and pm_schedule_resume() are to be called by bus types
> that are requesting the autosuspend or autoresume, respectively, of a device.
> The supposed control flow is like this:
> * bus type calls pm_schedule_suspend() to schedule autosuspend of a device
> * at the right time the PM workqueue executes the bus type's ->autosuspend()
>   callback which is responsible for calling the device driver to suspend the
>   device (the way in which this is done is totally bus type-specific)
> * if there's a resume event for the device, it's delivered to the bus type
>   driver (in a bus type-specific way) and the bus type schedules autoresume of
>   the device by calling pm_schedule_resume()
> * the PM workqueue executes the bus type's ->autoresume() as soon as it can
>   and this callback is responsible for calling the device driver to resume the
>   device (the way in which this is done is totally bus type-specific).

It looks like the only benefits provided by this patch are the
workqueue itself and the embedded delayed_work structure.  The ability
to cancel the work items (or prevent them from being added) during a
system sleep is questionable, as discussed above.  There isn't any
value added by the new functions or methods.

In short, it would be just as useful simply to add the delayed_work 
structure and export two PM workqueues, both freezable, with automatic
cancellation for one of them.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki June 4, 2009, 4:30 p.m. UTC | #2
On Tuesday 02 June 2009, Alan Stern wrote:
> On Tue, 2 Jun 2009, Rafael J. Wysocki wrote:
> 
> > > It's partly a question of avoiding unnecessary activity.  Is there any
> > > reason to resume a device if it's just going to be autosuspended again
> > > a few seconds later?
> > 
> > Well, I'm not sure if we know in advance what's going to happen to the device
> > right after a system-wide resume.  It may be a totally different situation from
> > the device activity point of view.
> 
> We _never_ know in advance what's going to happen.  An autosuspended
> device may have to wake up at any time in order to carry out some work
> -- that's true whether or not a system sleep has taken place.  IMO it
> doesn't make sense to wake up an autosuspended device for no reason
> other than that it _may_ need to do something; we should let it stay in
> a low-power mode until we _know_ it has to wake up.

OK, that's a valid point.

I also think that if the BIOS has already woke up the device during system-wide
resume, we shouldn't try to put it to a low power state again unless we know
that this is desirable.

> > > It's also a matter of principle.  A system sleep should be as
> > > transparent as possible (except for the passage of time, of course).  
> > > When the sleep is over, devices should be in the same state as before
> > > it began.
> > 
> > As a matter of fact, this need not be true at all.  For example, consider a
> > wireless network adapter that was assosicated with specific access point
> > before a system-wide suspend, but this access point is not present any more
> > after the subsequent system-wide resume.
> 
> This example doesn't prove your point.

No, it doesn't.  Still, it illustrates the problem I had in mind, that it may
not be desirable to restore the pre-suspend settings in all cases.

> Sure, conditions may have changed while the system was asleep.  They might
> also change while the system is awake.  In general, whether or not the system
> was asleep shouldn't make any difference -- to as great an extent as possible
> we should strive to pretend that the entire sleep took no time at all (or
> simply didn't happen).
> 
> So: Suppose the system had been awake when the access point
> disappeared.  What would the wireless adapter driver do then?  It
> should try to do exactly the same thing if the access point disappears
> while the system is asleep.

Not really.  Namely, when the access point _vanishes_ at run time, this is an
error condition from which we have to recover, while if it's _not_ _present_
after a resume, it's a normal situation and we shouldn't be recovering from
that (the user might have changed the physical location while suspended).

On my system the wireless always remembers the network it was associated with
before suspend and it tries to reassociate with it after the subsequent resume,
even if the network is not there any more, so it fails and I have to choose
another one manually.  This isn't a desirable behavior and I'd even say it's
not correct, because after a fresh boot it always scans for networks and
associates with one that's actually present.  I'd like it to do the same after
the resume.

> You see?  The sleep should be transparent.

In some cases it won't be, because the hardware state changes while sleeping.
For example, a laptop battery may be depleted while suspended, so that after
resume it's in the 'low' condition, while it was in the 'good' condition before
the preceding suspend.  Arguably, you can't drain a battery from 90% to 10%,
for example, momentarily.  Also, a CPU fan might be 100% on before suspend,
because the CPU was hot at that time, but that doesn't mean the fan should be
100% on after the subsequent resume, because it's likely that the CPU will be
cold (on some systems trying to make the fan spin too fast may lead to
general problems with thermal management afterwards).

> > I really don't think we can predict what's going to happen to a device after
> > a system-wide resume, so IMO we should assume that the device will be necessary
> > immediately after the resume and we should resume it.  It still can autosuspend
> > normally afterwards if it's not needed after all.
> 
> This is highly debatable.  I'd like to hear some thoughts from other 
> people.

Agreed, but no one has spoken so far. :-)

> > > >  Also, what happens if the platform firmware
> > > > resumes the autosuspended devices before passing control to the kernel during
> > > > system-wide resume?  How do we handle that?
> > > 
> > > In the case of USB, it's not possible to resume a suspended device
> > > without first resuming and using a host controller.  The host
> > > controller driver will realize that the firmware has taken over the
> > > controller and will reset the controller.
> ...
> 
> > I'm not sure if this is doable so easily for PCI devices, for one example.
> > Most probably, they will come up in D0 after the system-wide suspend and
> > the PCI resume code will have to suspend them directly so that they remain
> > "autosuspended" after the resume.
> 
> Clearly the behavior will need to vary among buses.

Agreed.

> What's appropriate for USB won't be appropriate for PCI.  Furthermore, the
> overhead of waking a PCI device unnecessarily might be a lot lower than the
> overhead of waking a USB device, so it might make sense always to
> resume autosuspended PCI devices during a system resume.

Sure.  Also, as I said earlier, the BIOS usually wakes up PCI devices (in fact
a bus reset causes them to wake up and the buses are ususally reset during
resume by the BIOS) and it doesn't really make sense to put them into low
power states just because they were in low power states before the suspend.

> > > So the work routine would somehow be aware when a system sleep is in
> > > progress, at which point it would return immediately without doing
> > > anything?  That seems racey.  Isn't it easier just to freeze the
> > > workqueue?
> > 
> > Well, yes and no.  Namely, if we freeze the workqueue, all of the bus types
> > using it will have to check the vaildity of operations carried out by it,
> > because of the possibility that system-wide power transition took place in the
> > meantime and changed the state of the device.  Also, please take the PCI
> > example into consideration.
> 
> Freezability doesn't matter; you can prevent items from being added to
> the workqueue during a power transition just as easily regardless of
> whether it is freezable or not.  In fact, it's easier if the workqueue
> is freezable.  You don't have to prevent anything from being added; you
> just have to cancel everything on the workqueue before unfreezing it.

Good idea.

> One solution would be to have the bus types cancel their own pending
> work items explicitly.  Another is to have two freezable PM workqueues,
> one with automatic cancellation and the other without.

What about having a flag in struct dev_pm_info (along with the delayed
work structure) indicating whether or not the work should be automatically
cancelled before unfreezing the queue?  [It seems, in that case we'd need
another flag to indicate what is to cancel, the delayed work or just the work.]

> > > For autoresume requests coming from an interrupt handler, I'm not so 
> > > sure.  I don't know of any good examples -- in fact there don't seem to 
> > > be any examples at the moment.  (The USB core supports autoresume 
> > > requests in interrupt context, AFAICS no driver uses them.)  So maybe 
> > > it's premature to worry about this case.
> > 
> > Hmm.  Can you please describe a theoretical situation in which this kind
> > of autoresume request will appear?  I'm not really sure what you mean here. :-)
> 
> Well, here's something people discussed but which hasn't been
> implemented (and may never be implemented).  A suspended USB keyboard
> doesn't have enough power to turn on any LEDs.  So suppose a system has
> two keyboards, and kbd-A is autosuspended when the user engages
> CapsLock on kbd-B.  The input layer wants to turn on the CapsLock LED
> on all the keyboards, which means it has to autoresume kbd-A.  But all
> this action takes place in interrupt context, so an autoresume request
> would have to be queued.

I see, but this is similar to the remote wake-up in that we have to queue
the autoresume request.

> > > Lastly, we have resume requests coming from a device -- i.e., remote
> > > wakeup requests.  This is a little tricky.  Suppose you have remote
> > > wakeup enabled, and you press a key on your USB keyboard just as the
> > > computer is going to sleep.  Should this cause the sleep to be aborted?  
> > > Should it cause the computer to wake back up again right away?  Should
> > > the keystroke be ignored?
> > 
> > This is a difficult problem in general.  I _think_ we should ignore such
> > autoresume requests for devices that are not marked as wake-up ones
> > and we should abort the suspend if there's a request from a wake-up device.
> 
> It _is_ difficult.  Right now USB ignores such requests, because the
> threads that handle them are all freezable.  But the information isn't
> lost; it is stored in the hardware and when the system wakes up the
> requests will be seen.
> 
> Except for wakeup requests from root hubs.  If they arrive at just the 
> right time then they are not stored in the hardware -- instead they are 
> stored as work items on the ksuspend_usbd workqueue!  In essence, the 
> workqueue serves as a software mechanism for saving IRQs which were 
> delivered but couldn't be acted on.
> 
> That's why I don't want to erase the work items; it would be logically 
> equivalent to dropping interrupts.

I see.

Still, we'll need to think about a mechanism allowing us to abort suspend
if there's a wake-up request from a wake-up device.

> > Anyway, below is a prototype implementation of a generic runtime PM workqueue,
> > just for discussion.
> > 
> > pm_schedule_suspend() and pm_schedule_resume() are to be called by bus types
> > that are requesting the autosuspend or autoresume, respectively, of a device.
> > The supposed control flow is like this:
> > * bus type calls pm_schedule_suspend() to schedule autosuspend of a device
> > * at the right time the PM workqueue executes the bus type's ->autosuspend()
> >   callback which is responsible for calling the device driver to suspend the
> >   device (the way in which this is done is totally bus type-specific)
> > * if there's a resume event for the device, it's delivered to the bus type
> >   driver (in a bus type-specific way) and the bus type schedules autoresume of
> >   the device by calling pm_schedule_resume()
> > * the PM workqueue executes the bus type's ->autoresume() as soon as it can
> >   and this callback is responsible for calling the device driver to resume the
> >   device (the way in which this is done is totally bus type-specific).
> 
> It looks like the only benefits provided by this patch are the
> workqueue itself and the embedded delayed_work structure.  The ability
> to cancel the work items (or prevent them from being added) during a
> system sleep is questionable, as discussed above.  There isn't any
> value added by the new functions or methods.

The value added is that since the delayed work member is to be used for
queueing work, each bus type will have to do the same casting from
'struct work_struct' to 'struct delayed_work' and next to 'struct device'
to get to the device object, which means quite a bit of code duplication. 

Also, IMO, if there's a set of functions for doing that, the probability
that someone gets things wrong is smaller (slightly).  Moreover, we can
add variants for the works that will be automatically cancelled during
system-wide resume, so that bus types don't have to worry about the details
related to that.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek July 18, 2009, 11:49 a.m. UTC | #3
Hi!

> > Sure, conditions may have changed while the system was asleep.  They might
> > also change while the system is awake.  In general, whether or not the system
> > was asleep shouldn't make any difference -- to as great an extent as possible
> > we should strive to pretend that the entire sleep took no time at all (or
> > simply didn't happen).
> > 
> > So: Suppose the system had been awake when the access point
> > disappeared.  What would the wireless adapter driver do then?  It
> > should try to do exactly the same thing if the access point disappears
> > while the system is asleep.
> 
> Not really.  Namely, when the access point _vanishes_ at run time, this is an
> error condition from which we have to recover, while if it's _not_ _present_
> after a resume, it's a normal situation and we shouldn't be recovering from
> that (the user might have changed the physical location while
> suspended).

I disagree here. User can just walk away with powered notebook.


> > You see?  The sleep should be transparent.
> 
> In some cases it won't be, because the hardware state changes while sleeping.
> For example, a laptop battery may be depleted while suspended, so that after
> resume it's in the 'low' condition, while it was in the 'good' condition before
> the preceding suspend.  Arguably, you can't drain a battery from 90% to 10%,
> for example, momentarily.  Also, a CPU fan might be 100% on before
> suspend,

Actually... going 90%->10% immediately is common behaviour of old
li-ion batteries.

> because the CPU was hot at that time, but that doesn't mean the fan should be
> 100% on after the subsequent resume, because it's likely that the CPU will be
> cold (on some systems trying to make the fan spin too fast may lead to
> general problems with thermal management afterwards).

100% is indeed sane default for fan. Yes you should slow the fan down
when you read the real temperature. And yes some systems play it safe
and do 100% fan during boot.

(Otherwise you may have heat problems if you suspend and immediately
resume).
									Pavel
diff mbox

Patch

Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,7 @@ 
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -182,6 +183,8 @@  struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*autosuspend)(struct device *dev);
+	int (*autoresume)(struct device *dev);
 };
 
 /**
@@ -323,8 +326,21 @@  struct dev_pm_info {
 #ifdef	CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+	struct delayed_work	dwork;
 };
 
+extern void pm_schedule_suspend(struct device *dev, unsigned long delay);
+extern void pm_schedule_resume(struct device *dev);
+
+extern bool transition_started;
+
+static inline bool pm_transition_in_progress(void)
+{
+	return transition_started;
+}
+
+extern void pm_flush_workqueue(void);
+
 /*
  * The PM_EVENT_ messages are also used by drivers implementing the legacy
  * suspend framework, based on the ->suspend() and ->resume() callbacks common
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -22,6 +22,8 @@ 
 #include <linux/freezer.h>
 #include <linux/vmstat.h>
 #include <linux/syscalls.h>
+#include <linux/workqueue.h>
+#include <linux/device.h>
 
 #include "power.h"
 
@@ -578,12 +580,20 @@  static struct attribute_group attr_group
 	.attrs = g,
 };
 
+static int __init pm_start_workqueue(void);
 
 static int __init pm_init(void)
 {
+	int error;
+
+	error = pm_start_workqueue();
+	if (error)
+		return error;
+
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
+
 	return sysfs_create_group(power_kobj, &attr_group);
 }
 
@@ -729,3 +739,68 @@  done:
 late_initcall(test_suspend);
 
 #endif /* CONFIG_PM_TEST_SUSPEND */
+
+
+static struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_singlethread_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+
+void pm_flush_workqueue(void)
+{
+	flush_workqueue(pm_wq);
+}
+
+static inline struct device *pm_dwork_to_device(struct delayed_work *dwork)
+{
+	struct dev_pm_info *d = container_of(dwork, struct dev_pm_info, dwork);
+	return container_of(d, struct device, power);
+}
+
+static void pm_autosuspend(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct device *dev = pm_dwork_to_device(dwork);
+
+	device_pm_lock();
+	if (dev && dev->bus && dev->bus->pm && dev->bus->pm->autosuspend
+	    && !pm_transition_in_progress())
+		dev->bus->pm->autosuspend(dev);
+	device_pm_unlock();
+}
+
+void pm_schedule_suspend(struct device *dev, unsigned long delay)
+{
+	device_pm_lock();
+	if (!pm_transition_in_progress()) {
+		INIT_DELAYED_WORK(&dev->power.dwork, pm_autosuspend);
+		queue_delayed_work(pm_wq, &dev->power.dwork, delay);
+	}
+	device_pm_unlock();
+}
+
+static void pm_autoresume(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct device *dev = pm_dwork_to_device(dwork);
+
+	device_pm_lock();
+	if (dev && dev->bus && dev->bus->pm && dev->bus->pm->autoresume
+	    && !pm_transition_in_progress())
+		dev->bus->pm->autoresume(dev);
+	device_pm_unlock();
+}
+
+void pm_schedule_resume(struct device *dev)
+{
+	device_pm_lock();
+	if (!pm_transition_in_progress()) {
+		INIT_WORK(&dev->power.dwork.work, pm_autoresume);
+		queue_work(pm_wq, &dev->power.dwork.work);
+	}
+	device_pm_unlock();
+}
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -46,7 +46,7 @@  static DEFINE_MUTEX(dpm_list_mtx);
  * Set once the preparation of devices for a PM transition has started, reset
  * before starting to resume devices.  Protected by dpm_list_mtx.
  */
-static bool transition_started;
+bool transition_started;
 
 /**
  *	device_pm_lock - lock the list of active devices used by the PM core
@@ -436,7 +436,6 @@  static void dpm_resume(pm_message_t stat
 
 	INIT_LIST_HEAD(&list);
 	mutex_lock(&dpm_list_mtx);
-	transition_started = false;
 	while (!list_empty(&dpm_list)) {
 		struct device *dev = to_device(dpm_list.next);
 
@@ -461,6 +460,7 @@  static void dpm_resume(pm_message_t stat
 		put_device(dev);
 	}
 	list_splice(&list, &dpm_list);
+	transition_started = false;
 	mutex_unlock(&dpm_list_mtx);
 }
 
@@ -752,9 +752,14 @@  static int dpm_prepare(pm_message_t stat
 	struct list_head list;
 	int error = 0;
 
-	INIT_LIST_HEAD(&list);
 	mutex_lock(&dpm_list_mtx);
 	transition_started = true;
+	mutex_unlock(&dpm_list_mtx);
+
+	pm_flush_workqueue();
+
+	INIT_LIST_HEAD(&list);
+	mutex_lock(&dpm_list_mtx);
 	while (!list_empty(&dpm_list)) {
 		struct device *dev = to_device(dpm_list.next);