Message ID | 4935437e0bfbab9079db2c8b1fbac66cef7dc569.1472554278.git.lukas@wunner.de (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Wednesday, August 31, 2016 08:15:18 AM Lukas Wunner wrote: > Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been > reset by firmware") added a runtime resume for devices that were runtime > suspended when the system entered sleep. > > The motivation was that devices might be in a reset-power-on state after > waking from system sleep, so their power state as perceived by Linux > (stored in pci_dev->current_state) would no longer reflect reality. > By resuming such devices, we allow them to return to a low-power state > via autosuspend and also bring their current_state in sync with reality. Not only that. It allows those devices to respond more timely to user space requests occuring right after resume. Those user space requests often occur sequentially (read from device A, write to device B, read from device C, write to device D, for example) and without that pre-emptive resume each of them needs to wait for another device to (runtime-)resume which results in quite annoying latencies observable by users. > However for devices that are *not* in a reset-power-on state, doing an > unconditional resume wastes energy. It may or may not waste it and this really is a tradeoff. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, September 14, 2016 02:29:12 AM Rafael J. Wysocki wrote: > On Wednesday, August 31, 2016 08:15:18 AM Lukas Wunner wrote: > > Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been > > reset by firmware") added a runtime resume for devices that were runtime > > suspended when the system entered sleep. > > > > The motivation was that devices might be in a reset-power-on state after > > waking from system sleep, so their power state as perceived by Linux > > (stored in pci_dev->current_state) would no longer reflect reality. > > By resuming such devices, we allow them to return to a low-power state > > via autosuspend and also bring their current_state in sync with reality. > > Not only that. > > It allows those devices to respond more timely to user space requests > occuring right after resume. > > Those user space requests often occur sequentially (read from device A, > write to device B, read from device C, write to device D, for example) > and without that pre-emptive resume each of them needs to wait for another > device to (runtime-)resume which results in quite annoying latencies > observable by users. > > > However for devices that are *not* in a reset-power-on state, doing an > > unconditional resume wastes energy. > > It may or may not waste it and this really is a tradeoff. That said it is a bit inconsistent with the suspend-to-idle case in which we don't resume those devices, so never mind. :-) Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, August 31, 2016 08:15:18 AM Lukas Wunner wrote: > Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been > reset by firmware") added a runtime resume for devices that were runtime > suspended when the system entered sleep. > > The motivation was that devices might be in a reset-power-on state after > waking from system sleep, so their power state as perceived by Linux > (stored in pci_dev->current_state) would no longer reflect reality. > By resuming such devices, we allow them to return to a low-power state > via autosuspend and also bring their current_state in sync with reality. > > However for devices that are *not* in a reset-power-on state, doing an > unconditional resume wastes energy. A more refined approach is called > for which issues a runtime resume only if the power state after > direct-complete is shallower than it was before. To achieve this, update > the device's current_state by querying the platform firmware and reading > its PMCSR. > > Cc: Rafael J. Wysocki <rjw@rjwysocki.net> > Signed-off-by: Lukas Wunner <lukas@wunner.de> > --- > drivers/pci/pci-driver.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > index e39a67c..fd4b9c4 100644 > --- a/drivers/pci/pci-driver.c > +++ b/drivers/pci/pci-driver.c > @@ -684,8 +684,19 @@ static int pci_pm_prepare(struct device *dev) > > static void pci_pm_complete(struct device *dev) > { > - pci_dev_complete_resume(to_pci_dev(dev)); > - pm_complete_with_resume_check(dev); > + struct pci_dev *pci_dev = to_pci_dev(dev); > + > + pci_dev_complete_resume(pci_dev); > + pm_generic_complete(dev); > + > + /* Resume device if platform firmware has put it in reset-power-on */ > + if (dev->power.direct_complete && pm_resume_via_firmware()) { > + pci_power_t pre_sleep_state = pci_dev->current_state; > + > + pci_update_current_state(pci_dev, pci_dev->current_state); I'm not sure if this can be made reliable enough for all of the systems out there (including the pre-ACPI-4.0 ones and so on). I'm a bit concerned that we'll uncover firmware issues and such by doing this. How much of a problem the existing code is in practice? > + if (pci_dev->current_state < pre_sleep_state) > + pm_request_resume(dev); > + } > } > > #else /* !CONFIG_PM_SLEEP */ > Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Sep 14, 2016 at 02:50:13AM +0200, Rafael J. Wysocki wrote: > On Wednesday, August 31, 2016 08:15:18 AM Lukas Wunner wrote: > > Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been > > reset by firmware") added a runtime resume for devices that were runtime > > suspended when the system entered sleep. > > > > The motivation was that devices might be in a reset-power-on state after > > waking from system sleep, so their power state as perceived by Linux > > (stored in pci_dev->current_state) would no longer reflect reality. > > By resuming such devices, we allow them to return to a low-power state > > via autosuspend and also bring their current_state in sync with reality. > > > > However for devices that are *not* in a reset-power-on state, doing an > > unconditional resume wastes energy. A more refined approach is called > > for which issues a runtime resume only if the power state after > > direct-complete is shallower than it was before. To achieve this, update > > the device's current_state by querying the platform firmware and reading > > its PMCSR. > > > > Cc: Rafael J. Wysocki <rjw@rjwysocki.net> > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > --- > > drivers/pci/pci-driver.c | 15 +++++++++++++-- > > 1 file changed, 13 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > > index e39a67c..fd4b9c4 100644 > > --- a/drivers/pci/pci-driver.c > > +++ b/drivers/pci/pci-driver.c > > @@ -684,8 +684,19 @@ static int pci_pm_prepare(struct device *dev) > > > > static void pci_pm_complete(struct device *dev) > > { > > - pci_dev_complete_resume(to_pci_dev(dev)); > > - pm_complete_with_resume_check(dev); > > + struct pci_dev *pci_dev = to_pci_dev(dev); > > + > > + pci_dev_complete_resume(pci_dev); > > + pm_generic_complete(dev); > > + > > + /* Resume device if platform firmware has put it in reset-power-on */ > > + if (dev->power.direct_complete && pm_resume_via_firmware()) { > > + pci_power_t pre_sleep_state = pci_dev->current_state; > > + > > + pci_update_current_state(pci_dev, pci_dev->current_state); > > I'm not sure if this can be made reliable enough for all of the systems out > there (including the pre-ACPI-4.0 ones and so on). I'm a bit concerned that > we'll uncover firmware issues and such by doing this. > > How much of a problem the existing code is in practice? Where's your ambition? Do you want your power management to be as good as Apple's or are you going to waste energy for fear of uncovering firmware bugs? I first tried to solve this only for Thunderbolt by opting out of the mandatory resume after direct_complete. It worked, but it was ugly. (I have to clear the direct_complete flag in the ->complete hook, but the device tree is walked bottom-up during ->complete, and since the Thunderbolt controller exposes multiple devices, I have to clear the flag for all devices from the bottom-most device, which is the NHI. And all the rest of the s/r code lives on the top-most device, which is the upstream bridge.) You suggested in your e-mail of July 18: "maybe it's better to change the PCI bus type to do something different from calling the generic function?" So there, I did what you suggested and tried to fix it not just for Thunderbolt but for everyone. And you're still not happy. *sigh* If on pre ACPI 4.0 systems, the device state is incorrectly reported as D3hot although the device is still in D3cold, the device will be runtime resumed to D0, just as it is now. In other words, the benefit of avoiding an unnecessary runtime resume after direct_complete is only granted if the platform firmware reports the device state correctly. Best regards, Lukas > > > + if (pci_dev->current_state < pre_sleep_state) > > + pm_request_resume(dev); > > + } > > } > > > > #else /* !CONFIG_PM_SLEEP */ > > > > Thanks, > Rafael > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, September 14, 2016 11:56:40 AM Lukas Wunner wrote: > On Wed, Sep 14, 2016 at 02:50:13AM +0200, Rafael J. Wysocki wrote: > > On Wednesday, August 31, 2016 08:15:18 AM Lukas Wunner wrote: > > > Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been > > > reset by firmware") added a runtime resume for devices that were runtime > > > suspended when the system entered sleep. > > > > > > The motivation was that devices might be in a reset-power-on state after > > > waking from system sleep, so their power state as perceived by Linux > > > (stored in pci_dev->current_state) would no longer reflect reality. > > > By resuming such devices, we allow them to return to a low-power state > > > via autosuspend and also bring their current_state in sync with reality. > > > > > > However for devices that are *not* in a reset-power-on state, doing an > > > unconditional resume wastes energy. A more refined approach is called > > > for which issues a runtime resume only if the power state after > > > direct-complete is shallower than it was before. To achieve this, update > > > the device's current_state by querying the platform firmware and reading > > > its PMCSR. > > > > > > Cc: Rafael J. Wysocki <rjw@rjwysocki.net> > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > > --- > > > drivers/pci/pci-driver.c | 15 +++++++++++++-- > > > 1 file changed, 13 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > > > index e39a67c..fd4b9c4 100644 > > > --- a/drivers/pci/pci-driver.c > > > +++ b/drivers/pci/pci-driver.c > > > @@ -684,8 +684,19 @@ static int pci_pm_prepare(struct device *dev) > > > > > > static void pci_pm_complete(struct device *dev) > > > { > > > - pci_dev_complete_resume(to_pci_dev(dev)); > > > - pm_complete_with_resume_check(dev); > > > + struct pci_dev *pci_dev = to_pci_dev(dev); > > > + > > > + pci_dev_complete_resume(pci_dev); > > > + pm_generic_complete(dev); > > > + > > > + /* Resume device if platform firmware has put it in reset-power-on */ > > > + if (dev->power.direct_complete && pm_resume_via_firmware()) { > > > + pci_power_t pre_sleep_state = pci_dev->current_state; > > > + > > > + pci_update_current_state(pci_dev, pci_dev->current_state); > > > > I'm not sure if this can be made reliable enough for all of the systems out > > there (including the pre-ACPI-4.0 ones and so on). I'm a bit concerned that > > we'll uncover firmware issues and such by doing this. > > > > How much of a problem the existing code is in practice? > > Where's your ambition? Do you want your power management to be as good > as Apple's or are you going to waste energy for fear of uncovering > firmware bugs? That's not about my ambition or lack thereof, but about avoiding introducing regressions on systems that are not Apple because of the ambition to be as good as Apple. And I'm not even talking about this particular piece of code, but about the other code that calls pci_update_current_state(). > I first tried to solve this only for Thunderbolt by opting out of the > mandatory resume after direct_complete. It worked, but it was ugly. > > (I have to clear the direct_complete flag in the ->complete hook, but > the device tree is walked bottom-up during ->complete, and since the > Thunderbolt controller exposes multiple devices, I have to clear the > flag for all devices from the bottom-most device, which is the NHI. > And all the rest of the s/r code lives on the top-most device, which > is the upstream bridge.) > > You suggested in your e-mail of July 18: "maybe it's better to change > the PCI bus type to do something different from calling the generic > function?" > > So there, I did what you suggested and tried to fix it not just for > Thunderbolt but for everyone. > > And you're still not happy. > > *sigh* Sorry for disappointing you. I have concerns, so I talk about them. Is that wrong? If so, what's wrong about it? My time goes into that as well, mind you. > If on pre ACPI 4.0 systems, the device state is incorrectly reported > as D3hot although the device is still in D3cold, the device will be > runtime resumed to D0, just as it is now. In other words, the benefit > of avoiding an unnecessary runtime resume after direct_complete is > only granted if the platform firmware reports the device state > correctly. OK Which as I said in the other message I've just sent, if you use acpi_device_get_power() for that, it will only report D3cold if (a) the device has a _PR3 and (b) the power resources configuration in there indicates D3cold. Does that cover the Thunderbolt case even? Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Sep 14, 2016 at 03:14:37PM +0200, Rafael J. Wysocki wrote: > On Wednesday, September 14, 2016 11:56:40 AM Lukas Wunner wrote: > > I first tried to solve this only for Thunderbolt by opting out of the > > mandatory resume after direct_complete. It worked, but it was ugly. > > > > (I have to clear the direct_complete flag in the ->complete hook, but > > the device tree is walked bottom-up during ->complete, and since the > > Thunderbolt controller exposes multiple devices, I have to clear the > > flag for all devices from the bottom-most device, which is the NHI. > > And all the rest of the s/r code lives on the top-most device, which > > is the upstream bridge.) > > > > You suggested in your e-mail of July 18: "maybe it's better to change > > the PCI bus type to do something different from calling the generic > > function?" > > > > So there, I did what you suggested and tried to fix it not just for > > Thunderbolt but for everyone. > > > > And you're still not happy. > > > > *sigh* > > Sorry for disappointing you. > > I have concerns, so I talk about them. Is that wrong? If so, what's wrong > about it? > > My time goes into that as well, mind you. Sorry, never mind, just a bout of desperation that I failed to control. Kind regards, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index e39a67c..fd4b9c4 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -684,8 +684,19 @@ static int pci_pm_prepare(struct device *dev) static void pci_pm_complete(struct device *dev) { - pci_dev_complete_resume(to_pci_dev(dev)); - pm_complete_with_resume_check(dev); + struct pci_dev *pci_dev = to_pci_dev(dev); + + pci_dev_complete_resume(pci_dev); + pm_generic_complete(dev); + + /* Resume device if platform firmware has put it in reset-power-on */ + if (dev->power.direct_complete && pm_resume_via_firmware()) { + pci_power_t pre_sleep_state = pci_dev->current_state; + + pci_update_current_state(pci_dev, pci_dev->current_state); + if (pci_dev->current_state < pre_sleep_state) + pm_request_resume(dev); + } } #else /* !CONFIG_PM_SLEEP */
Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been reset by firmware") added a runtime resume for devices that were runtime suspended when the system entered sleep. The motivation was that devices might be in a reset-power-on state after waking from system sleep, so their power state as perceived by Linux (stored in pci_dev->current_state) would no longer reflect reality. By resuming such devices, we allow them to return to a low-power state via autosuspend and also bring their current_state in sync with reality. However for devices that are *not* in a reset-power-on state, doing an unconditional resume wastes energy. A more refined approach is called for which issues a runtime resume only if the power state after direct-complete is shallower than it was before. To achieve this, update the device's current_state by querying the platform firmware and reading its PMCSR. Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Lukas Wunner <lukas@wunner.de> --- drivers/pci/pci-driver.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)