Message ID | 1713438.irjm9MTSvo@aspire.rjw.lan (mailing list archive) |
---|---|
State | Mainlined |
Delegated to: | Rafael Wysocki |
Headers | show |
On 12 November 2017 at 01:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > The check for "active" children in __pm_runtime_set_status(), when > trying to set the parent device status to "suspended", doesn't > really make sense, because in fact it is not invalid to set the > status of a device with runtime PM disabled to "suspended" in any > case. It is invalid to enable runtime PM for a device with its > status set to "suspended" while its child_count reference counter > is nonzero, but the check in __pm_runtime_set_status() doesn't > really cover that situation. The reason to why I changed this in commit a8636c89648a ("PM / Runtime: Don't allow to suspend a device with an active child") was because to get a consistent behavior. Because, setting the device's status to active (RPM_ACTIVE) via __pm_runtime_set_status(), requires its parent to also be active (in case the parent has runtime PM enabled). I would prefer to try maintain this consistency, but I also I realize that commit a8636c89648a, should also have been checking if the parent has runtime PM enabled (again for consistency), which it doesn't. What about fixing that instead? Kind regards Uffe > > For this reason, drop the children check from __pm_runtime_set_status() > and add a check against child_count reference counters of "suspended" > devices to pm_runtime_enable(). > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > drivers/base/power/runtime.c | 30 ++++++++++-------------------- > 1 file changed, 10 insertions(+), 20 deletions(-) > > Index: linux-pm/drivers/base/power/runtime.c > =================================================================== > --- linux-pm.orig/drivers/base/power/runtime.c > +++ linux-pm/drivers/base/power/runtime.c > @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic > goto out; > } > > - if (dev->power.runtime_status == status) > + if (dev->power.runtime_status == status || !parent) > goto out_set; > > if (status == RPM_SUSPENDED) { > - /* > - * It is invalid to suspend a device with an active child, > - * unless it has been set to ignore its children. > - */ > - if (!dev->power.ignore_children && > - atomic_read(&dev->power.child_count)) { > - dev_err(dev, "runtime PM trying to suspend device but active child\n"); > - error = -EBUSY; > - goto out; > - } > - > - if (parent) { > - atomic_add_unless(&parent->power.child_count, -1, 0); > - notify_parent = !parent->power.ignore_children; > - } > - goto out_set; > - } > - > - if (parent) { > + atomic_add_unless(&parent->power.child_count, -1, 0); > + notify_parent = !parent->power.ignore_children; > + } else { > spin_lock_nested(&parent->power.lock, SINGLE_DEPTH_NESTING); > > /* > @@ -1307,6 +1291,12 @@ void pm_runtime_enable(struct device *de > else > dev_warn(dev, "Unbalanced %s!\n", __func__); > > + WARN(dev->power.runtime_status == RPM_SUSPENDED && > + !dev->power.ignore_children && > + atomic_read(&dev->power.child_count) > 0, > + "Enabling runtime PM for inactive device (%s) with active children\n", > + dev_name(dev)); > + > spin_unlock_irqrestore(&dev->power.lock, flags); > } > EXPORT_SYMBOL_GPL(pm_runtime_enable); >
On Monday, November 13, 2017 2:26:28 PM CET Ulf Hansson wrote: > On 12 November 2017 at 01:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > The check for "active" children in __pm_runtime_set_status(), when > > trying to set the parent device status to "suspended", doesn't > > really make sense, because in fact it is not invalid to set the > > status of a device with runtime PM disabled to "suspended" in any > > case. It is invalid to enable runtime PM for a device with its > > status set to "suspended" while its child_count reference counter > > is nonzero, but the check in __pm_runtime_set_status() doesn't > > really cover that situation. > > The reason to why I changed this in commit a8636c89648a ("PM / > Runtime: Don't allow to suspend a device with an active child") was > because to get a consistent behavior. Well, it causes the function to return an error in a non-error situation, which IMnsHO is a bug. > Because, setting the device's status to active (RPM_ACTIVE) via > __pm_runtime_set_status(), requires its parent to also be active (in > case the parent has runtime PM enabled). No!!! The check is in there, because the parent's usage_count is affected by that code and incrementing it in case the parent has runtime PM enabled and is RPM_SUSPENDED leads to an inconsistent runtime PM state of the parent. IOW, it would confuse the framework. There's no such issue if the runtime PM status of a child is set to RPM_SUSPENDED. It is all consistent without the check I'm removing and is made inconsistent by that very check. > I would prefer to try maintain this consistency, but I also I realize > that commit a8636c89648a, should also have been checking if the parent > has runtime PM enabled (again for consistency), which it doesn't. > > What about fixing that instead? Runtime PM is *disabled* for the parent at this point, guaranteed, so there's nothing to check, I'm afraid ... OTOH, the runtime PM status of the children doesn't matter here, because their reference counters are not updated. Thanks, Rafael
On Mon, Nov 13, 2017 at 10:50 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Monday, November 13, 2017 2:26:28 PM CET Ulf Hansson wrote: >> On 12 November 2017 at 01:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> > >> > The check for "active" children in __pm_runtime_set_status(), when >> > trying to set the parent device status to "suspended", doesn't >> > really make sense, because in fact it is not invalid to set the >> > status of a device with runtime PM disabled to "suspended" in any >> > case. It is invalid to enable runtime PM for a device with its >> > status set to "suspended" while its child_count reference counter >> > is nonzero, but the check in __pm_runtime_set_status() doesn't >> > really cover that situation. >> >> The reason to why I changed this in commit a8636c89648a ("PM / >> Runtime: Don't allow to suspend a device with an active child") was >> because to get a consistent behavior. > > Well, it causes the function to return an error in a non-error situation, > which IMnsHO is a bug. > >> Because, setting the device's status to active (RPM_ACTIVE) via >> __pm_runtime_set_status(), requires its parent to also be active (in >> case the parent has runtime PM enabled). > > No!!! > > The check is in there, because the parent's usage_count is affected by that Actually, the parent's child_count is affected, but that doesn't matter here. > code and incrementing it in case the parent has runtime PM enabled and is > RPM_SUSPENDED leads to an inconsistent runtime PM state of the parent. IOW, > it would confuse the framework. > > There's no such issue if the runtime PM status of a child is set to RPM_SUSPENDED. > > It is all consistent without the check I'm removing and is made inconsistent > by that very check. > >> I would prefer to try maintain this consistency, but I also I realize >> that commit a8636c89648a, should also have been checking if the parent >> has runtime PM enabled (again for consistency), which it doesn't. >> >> What about fixing that instead? > > Runtime PM is *disabled* for the parent at this point, guaranteed, so there's > nothing to check, I'm afraid ... > > OTOH, the runtime PM status of the children doesn't matter here, because their > reference counters are not updated. Thanks, Rafael
On 13 November 2017 at 22:50, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Monday, November 13, 2017 2:26:28 PM CET Ulf Hansson wrote: >> On 12 November 2017 at 01:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> > >> > The check for "active" children in __pm_runtime_set_status(), when >> > trying to set the parent device status to "suspended", doesn't >> > really make sense, because in fact it is not invalid to set the >> > status of a device with runtime PM disabled to "suspended" in any >> > case. It is invalid to enable runtime PM for a device with its >> > status set to "suspended" while its child_count reference counter >> > is nonzero, but the check in __pm_runtime_set_status() doesn't >> > really cover that situation. >> >> The reason to why I changed this in commit a8636c89648a ("PM / >> Runtime: Don't allow to suspend a device with an active child") was >> because to get a consistent behavior. > > Well, it causes the function to return an error in a non-error situation, > which IMnsHO is a bug. > >> Because, setting the device's status to active (RPM_ACTIVE) via >> __pm_runtime_set_status(), requires its parent to also be active (in >> case the parent has runtime PM enabled). > > No!!! > > The check is in there, because the parent's usage_count is affected by that > code and incrementing it in case the parent has runtime PM enabled and is > RPM_SUSPENDED leads to an inconsistent runtime PM state of the parent. IOW, > it would confuse the framework. Right, I do understand the reasons *why* it is like this. > > There's no such issue if the runtime PM status of a child is set to RPM_SUSPENDED. > > It is all consistent without the check I'm removing and is made inconsistent > by that very check. > >> I would prefer to try maintain this consistency, but I also I realize >> that commit a8636c89648a, should also have been checking if the parent >> has runtime PM enabled (again for consistency), which it doesn't. >> >> What about fixing that instead? > > Runtime PM is *disabled* for the parent at this point, guaranteed, so there's > nothing to check, I'm afraid ... Where and how is that guarantee made? [...] Kind regards Uffe
On 14 November 2017 at 10:13, Ulf Hansson <ulf.hansson@linaro.org> wrote: > On 13 November 2017 at 22:50, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> On Monday, November 13, 2017 2:26:28 PM CET Ulf Hansson wrote: >>> On 12 November 2017 at 01:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>> > >>> > The check for "active" children in __pm_runtime_set_status(), when >>> > trying to set the parent device status to "suspended", doesn't >>> > really make sense, because in fact it is not invalid to set the >>> > status of a device with runtime PM disabled to "suspended" in any >>> > case. It is invalid to enable runtime PM for a device with its >>> > status set to "suspended" while its child_count reference counter >>> > is nonzero, but the check in __pm_runtime_set_status() doesn't >>> > really cover that situation. >>> >>> The reason to why I changed this in commit a8636c89648a ("PM / >>> Runtime: Don't allow to suspend a device with an active child") was >>> because to get a consistent behavior. >> >> Well, it causes the function to return an error in a non-error situation, >> which IMnsHO is a bug. >> >>> Because, setting the device's status to active (RPM_ACTIVE) via >>> __pm_runtime_set_status(), requires its parent to also be active (in >>> case the parent has runtime PM enabled). >> >> No!!! >> >> The check is in there, because the parent's usage_count is affected by that >> code and incrementing it in case the parent has runtime PM enabled and is >> RPM_SUSPENDED leads to an inconsistent runtime PM state of the parent. IOW, >> it would confuse the framework. > > Right, I do understand the reasons *why* it is like this. > >> >> There's no such issue if the runtime PM status of a child is set to RPM_SUSPENDED. >> >> It is all consistent without the check I'm removing and is made inconsistent >> by that very check. >> >>> I would prefer to try maintain this consistency, but I also I realize >>> that commit a8636c89648a, should also have been checking if the parent >>> has runtime PM enabled (again for consistency), which it doesn't. >>> >>> What about fixing that instead? >> >> Runtime PM is *disabled* for the parent at this point, guaranteed, so there's >> nothing to check, I'm afraid ... > > Where and how is that guarantee made? Oh, just realize that it should be "child" instead of "parent", in the above reasoning. Apologize for giving the wrong argument. So, let's me take this once again, to make it clear. When pm_runtime_set_suspended(dev) is called, dev's child device may still be runtime PM enabled and active. I was suggesting to add a check for this scenario, to see if dev's child device is runtime PM is enabled, as and additional constraint before deciding to return an error code. The idea was to get a consistent behavior, from the pm_runtime_set_active|suspended() APIs point of view, and not from the runtime PM core point of view. Of course, because dev->power.child_count, is maintained properly even when runtime PM is disabled for dev's child device, it works as you have suggested in $subject patch as well. [...] Kind regards Uffe
On Tuesday, November 14, 2017 10:56:39 AM CET Ulf Hansson wrote: > On 14 November 2017 at 10:13, Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On 13 November 2017 at 22:50, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > >> On Monday, November 13, 2017 2:26:28 PM CET Ulf Hansson wrote: > >>> On 12 November 2017 at 01:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > >>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>> > > >>> > The check for "active" children in __pm_runtime_set_status(), when > >>> > trying to set the parent device status to "suspended", doesn't > >>> > really make sense, because in fact it is not invalid to set the > >>> > status of a device with runtime PM disabled to "suspended" in any > >>> > case. It is invalid to enable runtime PM for a device with its > >>> > status set to "suspended" while its child_count reference counter > >>> > is nonzero, but the check in __pm_runtime_set_status() doesn't > >>> > really cover that situation. > >>> > >>> The reason to why I changed this in commit a8636c89648a ("PM / > >>> Runtime: Don't allow to suspend a device with an active child") was > >>> because to get a consistent behavior. > >> > >> Well, it causes the function to return an error in a non-error situation, > >> which IMnsHO is a bug. > >> > >>> Because, setting the device's status to active (RPM_ACTIVE) via > >>> __pm_runtime_set_status(), requires its parent to also be active (in > >>> case the parent has runtime PM enabled). > >> > >> No!!! > >> > >> The check is in there, because the parent's usage_count is affected by that > >> code and incrementing it in case the parent has runtime PM enabled and is > >> RPM_SUSPENDED leads to an inconsistent runtime PM state of the parent. IOW, > >> it would confuse the framework. > > > > Right, I do understand the reasons *why* it is like this. > > > >> > >> There's no such issue if the runtime PM status of a child is set to RPM_SUSPENDED. > >> > >> It is all consistent without the check I'm removing and is made inconsistent > >> by that very check. > >> > >>> I would prefer to try maintain this consistency, but I also I realize > >>> that commit a8636c89648a, should also have been checking if the parent > >>> has runtime PM enabled (again for consistency), which it doesn't. > >>> > >>> What about fixing that instead? > >> > >> Runtime PM is *disabled* for the parent at this point, guaranteed, so there's > >> nothing to check, I'm afraid ... > > > > Where and how is that guarantee made? > > Oh, just realize that it should be "child" instead of "parent", in the > above reasoning. Apologize for giving the wrong argument. > > So, let's me take this once again, to make it clear. > > When pm_runtime_set_suspended(dev) is called, dev's child device may > still be runtime PM enabled and active. > I was suggesting to add a check for this scenario, to see if dev's > child device is runtime PM is enabled, as and additional constraint > before deciding to return an error code. Well, that's sort of difficult to do, however, because the code would need to walk all of the children of the device and the child power lock cannot be acquired under the one of the parent, so it would be fragile and ugly. > The idea was to get a consistent behavior, from the > pm_runtime_set_active|suspended() APIs point of view, and not from the > runtime PM core point of view. Yes, but the cost is high and the benefit is shallow. The enable-time WARN() should cover the really broken cases without that much complexity. Thanks, Rafael
[...] >> >> When pm_runtime_set_suspended(dev) is called, dev's child device may >> still be runtime PM enabled and active. >> I was suggesting to add a check for this scenario, to see if dev's >> child device is runtime PM is enabled, as and additional constraint >> before deciding to return an error code. > > Well, that's sort of difficult to do, however, because the code would need to > walk all of the children of the device and the child power lock cannot be > acquired under the one of the parent, so it would be fragile and ugly. Yeah, you have a point. > >> The idea was to get a consistent behavior, from the >> pm_runtime_set_active|suspended() APIs point of view, and not from the >> runtime PM core point of view. > > Yes, but the cost is high and the benefit is shallow. > > The enable-time WARN() should cover the really broken cases without that > much complexity. Fair enough! Feel free to add: Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Kind regards Uffe
On Sun, Nov 12, 2017 at 01:27:30AM +0100, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > The check for "active" children in __pm_runtime_set_status(), when > trying to set the parent device status to "suspended", doesn't > really make sense, because in fact it is not invalid to set the > status of a device with runtime PM disabled to "suspended" in any > case. It is invalid to enable runtime PM for a device with its > status set to "suspended" while its child_count reference counter > is nonzero, but the check in __pm_runtime_set_status() doesn't > really cover that situation. > > For this reason, drop the children check from __pm_runtime_set_status() > and add a check against child_count reference counters of "suspended" > devices to pm_runtime_enable(). > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Looks good to me, but you should also fix Documentation/power/runtime_pm.txt which was updated to reflect the constraint that is now being reverted. Reviewed-by: Johan Hovold <johan@kernel.org> Thanks, Johan
On Thursday, November 16, 2017 10:22:41 AM CET Johan Hovold wrote: > On Sun, Nov 12, 2017 at 01:27:30AM +0100, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > The check for "active" children in __pm_runtime_set_status(), when > > trying to set the parent device status to "suspended", doesn't > > really make sense, because in fact it is not invalid to set the > > status of a device with runtime PM disabled to "suspended" in any > > case. It is invalid to enable runtime PM for a device with its > > status set to "suspended" while its child_count reference counter > > is nonzero, but the check in __pm_runtime_set_status() doesn't > > really cover that situation. > > > > For this reason, drop the children check from __pm_runtime_set_status() > > and add a check against child_count reference counters of "suspended" > > devices to pm_runtime_enable(). > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Looks good to me, but you should also fix > Documentation/power/runtime_pm.txt which was updated to reflect the > constraint that is now being reverted. Thanks for pointing that out. > Reviewed-by: Johan Hovold <johan@kernel.org> Thanks! Rafael
Hi Rafael, Shimoda-san, On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > The check for "active" children in __pm_runtime_set_status(), when > trying to set the parent device status to "suspended", doesn't > really make sense, because in fact it is not invalid to set the > status of a device with runtime PM disabled to "suspended" in any > case. It is invalid to enable runtime PM for a device with its > status set to "suspended" while its child_count reference counter > is nonzero, but the check in __pm_runtime_set_status() doesn't > really cover that situation. > > For this reason, drop the children check from __pm_runtime_set_status() > and add a check against child_count reference counters of "suspended" > devices to pm_runtime_enable(). > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > drivers/base/power/runtime.c | 30 ++++++++++-------------------- > 1 file changed, 10 insertions(+), 20 deletions(-) > > Index: linux-pm/drivers/base/power/runtime.c > =================================================================== > --- linux-pm.orig/drivers/base/power/runtime.c > +++ linux-pm/drivers/base/power/runtime.c > @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic > goto out; > } > > - if (dev->power.runtime_status == status) > + if (dev->power.runtime_status == status || !parent) > goto out_set; > > if (status == RPM_SUSPENDED) { > - /* > - * It is invalid to suspend a device with an active child, > - * unless it has been set to ignore its children. > - */ > - if (!dev->power.ignore_children && > - atomic_read(&dev->power.child_count)) { > - dev_err(dev, "runtime PM trying to suspend device but active child\n"); JFTR, this triggered before during system resume on e.g. Salvator-XS with R-Car H3: ohci-platform ee080000.usb: runtime PM trying to suspend device but active child phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend device but active child ohci-platform ee0c0000.usb: runtime PM trying to suspend device but active child ohci-platform ee0a0000.usb: runtime PM trying to suspend device but active child phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend device but active child phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend device but active child so this was an existing issue with USB before. > - error = -EBUSY; > - goto out; > - } > - > - if (parent) { > - atomic_add_unless(&parent->power.child_count, -1, 0); > - notify_parent = !parent->power.ignore_children; > - } > - goto out_set; > - } > - > - if (parent) { > + atomic_add_unless(&parent->power.child_count, -1, 0); > + notify_parent = !parent->power.ignore_children; > + } else { > spin_lock_nested(&parent->power.lock, SINGLE_DEPTH_NESTING); > > /* > @@ -1307,6 +1291,12 @@ void pm_runtime_enable(struct device *de > else > dev_warn(dev, "Unbalanced %s!\n", __func__); > > + WARN(dev->power.runtime_status == RPM_SUSPENDED && > + !dev->power.ignore_children && > + atomic_read(&dev->power.child_count) > 0, > + "Enabling runtime PM for inactive device (%s) with active children\n", > + dev_name(dev)); And now it became a bit more noisy: Enabling runtime PM for inactive device (ee0a0200.usb-phy) with active children WARNING: CPU: 0 PID: 1697 at drivers/base/power/runtime.c:1299 pm_runtime_enable+0x94/0xd8 CPU: 0 PID: 1697 Comm: s2idle Not tainted 4.15.0-rc1-arm64-renesas-00381-g40d152b966c941dd #41 Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT) task: ffff8006f81bb100 task.stack: ffff00000aa80000 pstate: 60000085 (nZCv daIf -PAN -UAO) pc : pm_runtime_enable+0x94/0xd8 lr : pm_runtime_enable+0x94/0xd8 sp : ffff00000aa83b50 x29: ffff00000aa83b50 x28: ffff8006f81bb100 x27: ffff000008841000 x26: ffff000008b4b640 x25: ffff000008b7f6e0 x24: ffff0000097a2f90 x23: ffff000008b7f000 x22: 0000000000000010 x21: 0000000000000000 x20: ffff8006fa9ad158 x19: ffff8006fa9ad010 x18: 00000000013c6577 x17: ffff00000947ea80 x16: 0000000000006370 x15: 000000000000636e x14: 646c696863206576 x13: 0000000000000001 x12: ffff8006f81bbaa8 x11: ffff8006f9479230 x10: ffff8006f97a63e0 x9 : 0000000000000000 x8 : ffff8006f97a6408 x7 : ffff8006f97a63c0 x6 : ffff00000975de80 x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffffffffffffff x2 : ffff000008b4bbf0 x1 : ffff8006f81bb100 x0 : 000000000000004f Call trace: pm_runtime_enable+0x94/0xd8 device_resume_early+0x50/0xec dpm_resume_early+0x118/0x204 suspend_devices_and_enter+0x2a8/0x4b0 pm_suspend+0x22c/0x27c state_store+0x84/0x108 kobj_attr_store+0x14/0x24 sysfs_kf_write+0x54/0x64 kernfs_fop_write+0xd8/0x1ec __vfs_write+0x28/0x124 vfs_write+0xa0/0x198 SyS_write+0x44/0xa0 el0_svc_naked+0x20/0x24 ---[ end trace 965c08c229b62a65 ]--- ------------[ cut here ]------------ Enabling runtime PM for inactive device (ee0c0200.usb-phy) with active children WARNING: CPU: 0 PID: 1697 at drivers/base/power/runtime.c:1299 pm_runtime_enable+0x94/0xd8 CPU: 0 PID: 1697 Comm: s2idle Tainted: G W 4.15.0-rc1-arm64-renesas-00381-g40d152b966c941dd #41 Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT) task: ffff8006f81bb100 task.stack: ffff00000aa80000 pstate: 60000085 (nZCv daIf -PAN -UAO) pc : pm_runtime_enable+0x94/0xd8 lr : pm_runtime_enable+0x94/0xd8 sp : ffff00000aa83b50 x29: ffff00000aa83b50 x28: ffff8006f81bb100 x27: ffff000008841000 x26: ffff000008b4b640 x25: ffff000008b7f6e0 x24: ffff0000097a2f90 x23: ffff000008b7f000 x22: 0000000000000010 x21: 0000000000000000 x20: ffff8006fa9ad958 x19: ffff8006fa9ad810 x18: 00000000013c6577 x17: ffff00000947ea80 x16: 0000000000006370 x15: 000000000000636e x14: 0720072007200720 x13: 0000000000000001 x12: ffff8006f81bbaa8 x11: ffff8006f9478940 x10: ffff8006f97a7ad8 x9 : 0000000000000000 x8 : ffff8006f97a7b00 x7 : ffff8006f97a7ab8 x6 : ffff00000975de80 x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffffffffffffff x2 : ffff000008b4bbf0 x1 : ffff8006f81bb100 x0 : 000000000000004f Call trace: pm_runtime_enable+0x94/0xd8 device_resume_early+0x50/0xec dpm_resume_early+0x118/0x204 suspend_devices_and_enter+0x2a8/0x4b0 pm_suspend+0x22c/0x27c state_store+0x84/0x108 kobj_attr_store+0x14/0x24 sysfs_kf_write+0x54/0x64 kernfs_fop_write+0xd8/0x1ec __vfs_write+0x28/0x124 vfs_write+0xa0/0x198 SyS_write+0x44/0xa0 el0_svc_naked+0x20/0x24 ---[ end trace 965c08c229b62a66 ]--- ------------[ cut here ]------------ ... > + > spin_unlock_irqrestore(&dev->power.lock, flags); > } > EXPORT_SYMBOL_GPL(pm_runtime_enable); Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Geert-san, > From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM > > Hi Rafael, Shimoda-san, > > On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > The check for "active" children in __pm_runtime_set_status(), when > > trying to set the parent device status to "suspended", doesn't > > really make sense, because in fact it is not invalid to set the > > status of a device with runtime PM disabled to "suspended" in any > > case. It is invalid to enable runtime PM for a device with its > > status set to "suspended" while its child_count reference counter > > is nonzero, but the check in __pm_runtime_set_status() doesn't > > really cover that situation. > > > > For this reason, drop the children check from __pm_runtime_set_status() > > and add a check against child_count reference counters of "suspended" > > devices to pm_runtime_enable(). > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > --- > > drivers/base/power/runtime.c | 30 ++++++++++-------------------- > > 1 file changed, 10 insertions(+), 20 deletions(-) > > > > Index: linux-pm/drivers/base/power/runtime.c > > =================================================================== > > --- linux-pm.orig/drivers/base/power/runtime.c > > +++ linux-pm/drivers/base/power/runtime.c > > @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic > > goto out; > > } > > > > - if (dev->power.runtime_status == status) > > + if (dev->power.runtime_status == status || !parent) > > goto out_set; > > > > if (status == RPM_SUSPENDED) { > > - /* > > - * It is invalid to suspend a device with an active child, > > - * unless it has been set to ignore its children. > > - */ > > - if (!dev->power.ignore_children && > > - atomic_read(&dev->power.child_count)) { > > - dev_err(dev, "runtime PM trying to suspend device but active child\n"); > > JFTR, this triggered before during system resume on e.g. Salvator-XS with > R-Car H3: > > ohci-platform ee080000.usb: runtime PM trying to suspend device > but active child > phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend > device but active child > ohci-platform ee0c0000.usb: runtime PM trying to suspend device > but active child > ohci-platform ee0a0000.usb: runtime PM trying to suspend device > but active child > phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend > device but active child > phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend > device but active child > > so this was an existing issue with USB before. Thank you for the report! I know that, but since this didn't cause any trouble until now, I postponed to investigate the issue... But, I investigate it today. I don't find the root cause yet. However, it seems related to usb host and/or usb core. --> USB host related devices' child_count will be 1 in suspend timing. --> I guess remote wakeup feature is enabled? But, I don't find the point yet. The renesas_usbhs also uses the phy_rcar_gen3_usb2 driver. --> If I only used the renesas_usbhs driver (in other words, I don't install [eo]hci-{hcd,platform} drivers), the issue disappeared. --> So, I think the phy_rcar_gen3_usb2 driver doesn't cause this issue. (But, it is possible to be related though.) I'll continue to investigate this issue tomorrow. Best regards, Yoshihiro Shimoda
On Tue, Nov 28, 2017 at 11:58 AM, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > Hi Rafael, Shimoda-san, > > On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> The check for "active" children in __pm_runtime_set_status(), when >> trying to set the parent device status to "suspended", doesn't >> really make sense, because in fact it is not invalid to set the >> status of a device with runtime PM disabled to "suspended" in any >> case. It is invalid to enable runtime PM for a device with its >> status set to "suspended" while its child_count reference counter >> is nonzero, but the check in __pm_runtime_set_status() doesn't >> really cover that situation. >> >> For this reason, drop the children check from __pm_runtime_set_status() >> and add a check against child_count reference counters of "suspended" >> devices to pm_runtime_enable(). >> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> --- >> drivers/base/power/runtime.c | 30 ++++++++++-------------------- >> 1 file changed, 10 insertions(+), 20 deletions(-) >> >> Index: linux-pm/drivers/base/power/runtime.c >> =================================================================== >> --- linux-pm.orig/drivers/base/power/runtime.c >> +++ linux-pm/drivers/base/power/runtime.c >> @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic >> goto out; >> } >> >> - if (dev->power.runtime_status == status) >> + if (dev->power.runtime_status == status || !parent) >> goto out_set; >> >> if (status == RPM_SUSPENDED) { >> - /* >> - * It is invalid to suspend a device with an active child, >> - * unless it has been set to ignore its children. >> - */ >> - if (!dev->power.ignore_children && >> - atomic_read(&dev->power.child_count)) { >> - dev_err(dev, "runtime PM trying to suspend device but active child\n"); > > JFTR, this triggered before during system resume on e.g. Salvator-XS with > R-Car H3: > > ohci-platform ee080000.usb: runtime PM trying to suspend device > but active child > phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend > device but active child > ohci-platform ee0c0000.usb: runtime PM trying to suspend device > but active child > ohci-platform ee0a0000.usb: runtime PM trying to suspend device > but active child > phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend > device but active child > phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend > device but active child > > so this was an existing issue with USB before. > >> - error = -EBUSY; >> - goto out; >> - } >> - >> - if (parent) { >> - atomic_add_unless(&parent->power.child_count, -1, 0); >> - notify_parent = !parent->power.ignore_children; >> - } >> - goto out_set; >> - } >> - >> - if (parent) { >> + atomic_add_unless(&parent->power.child_count, -1, 0); >> + notify_parent = !parent->power.ignore_children; >> + } else { >> spin_lock_nested(&parent->power.lock, SINGLE_DEPTH_NESTING); >> >> /* >> @@ -1307,6 +1291,12 @@ void pm_runtime_enable(struct device *de >> else >> dev_warn(dev, "Unbalanced %s!\n", __func__); >> >> + WARN(dev->power.runtime_status == RPM_SUSPENDED && >> + !dev->power.ignore_children && >> + atomic_read(&dev->power.child_count) > 0, >> + "Enabling runtime PM for inactive device (%s) with active children\n", >> + dev_name(dev)); > > And now it became a bit more noisy: Well, it's all existing issues, although the WARN() doesn't provide additional information in this particular case. I'm considering changing it to print a message without a stack trace. Thanks, Rafael
On Tue, 28 Nov 2017, Yoshihiro Shimoda wrote: > Hi Geert-san, > > > From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM > > > > Hi Rafael, Shimoda-san, > > > > On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > The check for "active" children in __pm_runtime_set_status(), when > > > trying to set the parent device status to "suspended", doesn't > > > really make sense, because in fact it is not invalid to set the > > > status of a device with runtime PM disabled to "suspended" in any > > > case. It is invalid to enable runtime PM for a device with its > > > status set to "suspended" while its child_count reference counter > > > is nonzero, but the check in __pm_runtime_set_status() doesn't > > > really cover that situation. > > > > > > For this reason, drop the children check from __pm_runtime_set_status() > > > and add a check against child_count reference counters of "suspended" > > > devices to pm_runtime_enable(). > > > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > --- > > > drivers/base/power/runtime.c | 30 ++++++++++-------------------- > > > 1 file changed, 10 insertions(+), 20 deletions(-) > > > > > > Index: linux-pm/drivers/base/power/runtime.c > > > =================================================================== > > > --- linux-pm.orig/drivers/base/power/runtime.c > > > +++ linux-pm/drivers/base/power/runtime.c > > > @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic > > > goto out; > > > } > > > > > > - if (dev->power.runtime_status == status) > > > + if (dev->power.runtime_status == status || !parent) > > > goto out_set; > > > > > > if (status == RPM_SUSPENDED) { > > > - /* > > > - * It is invalid to suspend a device with an active child, > > > - * unless it has been set to ignore its children. > > > - */ > > > - if (!dev->power.ignore_children && > > > - atomic_read(&dev->power.child_count)) { > > > - dev_err(dev, "runtime PM trying to suspend device but active child\n"); > > > > JFTR, this triggered before during system resume on e.g. Salvator-XS with > > R-Car H3: > > > > ohci-platform ee080000.usb: runtime PM trying to suspend device > > but active child > > phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend > > device but active child > > ohci-platform ee0c0000.usb: runtime PM trying to suspend device > > but active child > > ohci-platform ee0a0000.usb: runtime PM trying to suspend device > > but active child > > phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend > > device but active child > > phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend > > device but active child > > > > so this was an existing issue with USB before. > > Thank you for the report! > I know that, but since this didn't cause any trouble until now, > I postponed to investigate the issue... But, I investigate it today. > I don't find the root cause yet. However, it seems related to usb host and/or usb core. > --> USB host related devices' child_count will be 1 in suspend timing. > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. > > The renesas_usbhs also uses the phy_rcar_gen3_usb2 driver. > --> If I only used the renesas_usbhs driver (in other words, I don't install > [eo]hci-{hcd,platform} drivers), the issue disappeared. > --> So, I think the phy_rcar_gen3_usb2 driver doesn't cause this issue. > (But, it is possible to be related though.) > > I'll continue to investigate this issue tomorrow. Does the phy_rcar_gen3_usb2 driver use runtime PM? It looks like the phy device somehow gets enabled for runtime PM when it shouldn't be. (And by the way, what device is the child of ee0a0200.usb-phy?) Alan Stern
On 28 November 2017 at 13:48, Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> wrote: > Hi Geert-san, > >> From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM >> >> Hi Rafael, Shimoda-san, >> >> On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> > >> > The check for "active" children in __pm_runtime_set_status(), when >> > trying to set the parent device status to "suspended", doesn't >> > really make sense, because in fact it is not invalid to set the >> > status of a device with runtime PM disabled to "suspended" in any >> > case. It is invalid to enable runtime PM for a device with its >> > status set to "suspended" while its child_count reference counter >> > is nonzero, but the check in __pm_runtime_set_status() doesn't >> > really cover that situation. >> > >> > For this reason, drop the children check from __pm_runtime_set_status() >> > and add a check against child_count reference counters of "suspended" >> > devices to pm_runtime_enable(). >> > >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> > --- >> > drivers/base/power/runtime.c | 30 ++++++++++-------------------- >> > 1 file changed, 10 insertions(+), 20 deletions(-) >> > >> > Index: linux-pm/drivers/base/power/runtime.c >> > =================================================================== >> > --- linux-pm.orig/drivers/base/power/runtime.c >> > +++ linux-pm/drivers/base/power/runtime.c >> > @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic >> > goto out; >> > } >> > >> > - if (dev->power.runtime_status == status) >> > + if (dev->power.runtime_status == status || !parent) >> > goto out_set; >> > >> > if (status == RPM_SUSPENDED) { >> > - /* >> > - * It is invalid to suspend a device with an active child, >> > - * unless it has been set to ignore its children. >> > - */ >> > - if (!dev->power.ignore_children && >> > - atomic_read(&dev->power.child_count)) { >> > - dev_err(dev, "runtime PM trying to suspend device but active child\n"); >> >> JFTR, this triggered before during system resume on e.g. Salvator-XS with >> R-Car H3: >> >> ohci-platform ee080000.usb: runtime PM trying to suspend device >> but active child >> phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend >> device but active child >> ohci-platform ee0c0000.usb: runtime PM trying to suspend device >> but active child >> ohci-platform ee0a0000.usb: runtime PM trying to suspend device >> but active child >> phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend >> device but active child >> phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend >> device but active child >> >> so this was an existing issue with USB before. > > Thank you for the report! > I know that, but since this didn't cause any trouble until now, > I postponed to investigate the issue... But, I investigate it today. > I don't find the root cause yet. However, it seems related to usb host and/or usb core. > --> USB host related devices' child_count will be 1 in suspend timing. > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. I am guessing the issue is triggered by genpd in the suspend noirq phase (genpd_suspend_noirq()). In there, there is a call to pm_runtime_force_suspend() (which calls pm_runtime_set_suspended() and which triggered the earlier error messages being printed). The reason why genpd calls pm_runtime_force_suspend(), is because when validating wakeup configurations for the device "if (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))", it's thinks wakeup isn't configured while it probably should be. An additional note, only when genpd has the GENPD_FLAG_PM_CLK set, which makes the genpd->dev_ops.stop|start() being assigned, genpd calls pm_runtime_force_suspend() - else it doesn't. Perhaps try out the series I recently posted improving the code dealing with wakeups in genpd and the PM core: https://www.spinics.net/lists/linux-renesas-soc/msg20122.html To that, you need to set the new flag (invented in the above series) DPM_FLAG_IN_BAND_WAKEUP in the driver that configures wakeup of its device. Hope this helps! > > The renesas_usbhs also uses the phy_rcar_gen3_usb2 driver. > --> If I only used the renesas_usbhs driver (in other words, I don't install > [eo]hci-{hcd,platform} drivers), the issue disappeared. > --> So, I think the phy_rcar_gen3_usb2 driver doesn't cause this issue. > (But, it is possible to be related though.) > > I'll continue to investigate this issue tomorrow. Please keep me posted, I am interested about the why the problem exists. :-) Kind regards Uffe
Hi, > From: Alan Stern, Sent: Wednesday, November 29, 2017 12:07 AM > > On Tue, 28 Nov 2017, Yoshihiro Shimoda wrote: > > > Hi Geert-san, > > > > > From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM > > > > > > Hi Rafael, Shimoda-san, > > > > > > On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> <snip> > > > JFTR, this triggered before during system resume on e.g. Salvator-XS with > > > R-Car H3: > > > > > > ohci-platform ee080000.usb: runtime PM trying to suspend device > > > but active child > > > phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend > > > device but active child > > > ohci-platform ee0c0000.usb: runtime PM trying to suspend device > > > but active child > > > ohci-platform ee0a0000.usb: runtime PM trying to suspend device > > > but active child > > > phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend > > > device but active child > > > phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend > > > device but active child > > > > > > so this was an existing issue with USB before. > > > > Thank you for the report! > > I know that, but since this didn't cause any trouble until now, > > I postponed to investigate the issue... But, I investigate it today. > > I don't find the root cause yet. However, it seems related to usb host and/or usb core. > > --> USB host related devices' child_count will be 1 in suspend timing. > > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. > > > > The renesas_usbhs also uses the phy_rcar_gen3_usb2 driver. I'm so sorry, but this is mistake. The renesas_usbhs doesn't use the phy_rcar_gen3_usb2 driver. So, > > --> If I only used the renesas_usbhs driver (in other words, I don't install > > [eo]hci-{hcd,platform} drivers), the issue disappeared. > > --> So, I think the phy_rcar_gen3_usb2 driver doesn't cause this issue. > > (But, it is possible to be related though.) They are also mistake. > > I'll continue to investigate this issue tomorrow. > > Does the phy_rcar_gen3_usb2 driver use runtime PM? Yes, the phy_rcar_gen3_usb2 uses runtime PM. > It looks like the > phy device somehow gets enabled for runtime PM when it shouldn't be. I also think that now. I don't find why for now, but the usage_count of a phy device was not 1 just before suspend. (This "a phy device" means the child of ee0a0200.usb-phy device.) > (And by the way, what device is the child of ee0a0200.usb-phy?) It's a phy device: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/phy/phy-core.c?h=v4.15-rc1#n773 Best regards, Yoshihiro Shimoda > Alan Stern
Hi, > From: Ulf Hansson, Sent: Wednesday, November 29, 2017 2:23 AM > > On 28 November 2017 at 13:48, Yoshihiro Shimoda > <yoshihiro.shimoda.uh@renesas.com> wrote: > > Hi Geert-san, > > > >> From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM > >> > >> Hi Rafael, Shimoda-san, > >> > >> On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> <snip> > >> JFTR, this triggered before during system resume on e.g. Salvator-XS with > >> R-Car H3: > >> > >> ohci-platform ee080000.usb: runtime PM trying to suspend device > >> but active child > >> phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend > >> device but active child > >> ohci-platform ee0c0000.usb: runtime PM trying to suspend device > >> but active child > >> ohci-platform ee0a0000.usb: runtime PM trying to suspend device > >> but active child > >> phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend > >> device but active child > >> phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend > >> device but active child > >> > >> so this was an existing issue with USB before. > > > > Thank you for the report! > > I know that, but since this didn't cause any trouble until now, > > I postponed to investigate the issue... But, I investigate it today. > > I don't find the root cause yet. However, it seems related to usb host and/or usb core. > > --> USB host related devices' child_count will be 1 in suspend timing. > > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. > > I am guessing the issue is triggered by genpd in the suspend noirq > phase (genpd_suspend_noirq()). In there, there is a call to > pm_runtime_force_suspend() (which calls pm_runtime_set_suspended() and > which triggered the earlier error messages being printed). > > The reason why genpd calls pm_runtime_force_suspend(), is because when > validating wakeup configurations for the device "if > (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))", it's > thinks wakeup isn't configured while it probably should be. > > An additional note, only when genpd has the GENPD_FLAG_PM_CLK set, > which makes the genpd->dev_ops.stop|start() being assigned, genpd > calls pm_runtime_force_suspend() - else it doesn't. > > Perhaps try out the series I recently posted improving the code > dealing with wakeups in genpd and the PM core: > https://www.spinics.net/lists/linux-renesas-soc/msg20122.html > To that, you need to set the new flag (invented in the above series) > DPM_FLAG_IN_BAND_WAKEUP in the driver that configures wakeup of its > device. > > Hope this helps! Thank you for the comments! I tried DPM_FLAG_IN_BAND_WAKEUP, but the issue still exists. I added the flag in the [eo]hci-platform driver and usb/core/driver.c. I also added the flag in the phy_rcar_gen3_usb2 driver except usb host drivers. > > The renesas_usbhs also uses the phy_rcar_gen3_usb2 driver. > > --> If I only used the renesas_usbhs driver (in other words, I don't install > > [eo]hci-{hcd,platform} drivers), the issue disappeared. > > --> So, I think the phy_rcar_gen3_usb2 driver doesn't cause this issue. > > (But, it is possible to be related though.) > > > > I'll continue to investigate this issue tomorrow. > > Please keep me posted, I am interested about the why the problem exists. :-) Sure! :) Best regards, Yoshihiro Shimoda > Kind regards > Uffe
On 29 November 2017 at 09:21, Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> wrote: > Hi, > >> From: Ulf Hansson, Sent: Wednesday, November 29, 2017 2:23 AM >> >> On 28 November 2017 at 13:48, Yoshihiro Shimoda >> <yoshihiro.shimoda.uh@renesas.com> wrote: >> > Hi Geert-san, >> > >> >> From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM >> >> >> >> Hi Rafael, Shimoda-san, >> >> >> >> On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > <snip> >> >> JFTR, this triggered before during system resume on e.g. Salvator-XS with >> >> R-Car H3: >> >> >> >> ohci-platform ee080000.usb: runtime PM trying to suspend device >> >> but active child >> >> phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend >> >> device but active child >> >> ohci-platform ee0c0000.usb: runtime PM trying to suspend device >> >> but active child >> >> ohci-platform ee0a0000.usb: runtime PM trying to suspend device >> >> but active child >> >> phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend >> >> device but active child >> >> phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend >> >> device but active child >> >> >> >> so this was an existing issue with USB before. >> > >> > Thank you for the report! >> > I know that, but since this didn't cause any trouble until now, >> > I postponed to investigate the issue... But, I investigate it today. >> > I don't find the root cause yet. However, it seems related to usb host and/or usb core. >> > --> USB host related devices' child_count will be 1 in suspend timing. >> > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. >> >> I am guessing the issue is triggered by genpd in the suspend noirq >> phase (genpd_suspend_noirq()). In there, there is a call to >> pm_runtime_force_suspend() (which calls pm_runtime_set_suspended() and >> which triggered the earlier error messages being printed). >> >> The reason why genpd calls pm_runtime_force_suspend(), is because when >> validating wakeup configurations for the device "if >> (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))", it's >> thinks wakeup isn't configured while it probably should be. >> >> An additional note, only when genpd has the GENPD_FLAG_PM_CLK set, >> which makes the genpd->dev_ops.stop|start() being assigned, genpd >> calls pm_runtime_force_suspend() - else it doesn't. >> >> Perhaps try out the series I recently posted improving the code >> dealing with wakeups in genpd and the PM core: >> https://www.spinics.net/lists/linux-renesas-soc/msg20122.html >> To that, you need to set the new flag (invented in the above series) >> DPM_FLAG_IN_BAND_WAKEUP in the driver that configures wakeup of its >> device. >> >> Hope this helps! > > Thank you for the comments! > I tried DPM_FLAG_IN_BAND_WAKEUP, but the issue still exists. > I added the flag in the [eo]hci-platform driver and usb/core/driver.c. > I also added the flag in the phy_rcar_gen3_usb2 driver except usb host drivers. First, did you confirm that genpd was used? Then for what device? Second, did you check the call to pm_runtime_force_suspend() called by genpd, is the reason to the error messages? Third, it should be sufficient to add DPM_FLAG_IN_BAND_WAKEUP for the driver that is actually dealing with the wakeup. Although, does this driver's system ->suspend() callback check device_may_wakeup(), before it decides to enable wakeup? If not, the PM core and genpd don't notice that wakeup is enabled for the device. > >> > The renesas_usbhs also uses the phy_rcar_gen3_usb2 driver. >> > --> If I only used the renesas_usbhs driver (in other words, I don't install >> > [eo]hci-{hcd,platform} drivers), the issue disappeared. >> > --> So, I think the phy_rcar_gen3_usb2 driver doesn't cause this issue. >> > (But, it is possible to be related though.) >> > >> > I'll continue to investigate this issue tomorrow. >> >> Please keep me posted, I am interested about the why the problem exists. :-) > > Sure! :) Great, thanks. Kind regards Uffe
Hi Ulf, On Wed, Nov 29, 2017 at 10:24 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > On 29 November 2017 at 09:21, Yoshihiro Shimoda > <yoshihiro.shimoda.uh@renesas.com> wrote: >>> From: Ulf Hansson, Sent: Wednesday, November 29, 2017 2:23 AM >>> On 28 November 2017 at 13:48, Yoshihiro Shimoda >>> <yoshihiro.shimoda.uh@renesas.com> wrote: >>> >> From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM >>> >> On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> <snip> >>> >> JFTR, this triggered before during system resume on e.g. Salvator-XS with >>> >> R-Car H3: >>> >> >>> >> ohci-platform ee080000.usb: runtime PM trying to suspend device >>> >> but active child >>> >> phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend >>> >> device but active child >>> >> ohci-platform ee0c0000.usb: runtime PM trying to suspend device >>> >> but active child >>> >> ohci-platform ee0a0000.usb: runtime PM trying to suspend device >>> >> but active child >>> >> phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend >>> >> device but active child >>> >> phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend >>> >> device but active child >>> >> >>> >> so this was an existing issue with USB before. >>> > >>> > Thank you for the report! >>> > I know that, but since this didn't cause any trouble until now, >>> > I postponed to investigate the issue... But, I investigate it today. >>> > I don't find the root cause yet. However, it seems related to usb host and/or usb core. >>> > --> USB host related devices' child_count will be 1 in suspend timing. >>> > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. >>> >>> I am guessing the issue is triggered by genpd in the suspend noirq >>> phase (genpd_suspend_noirq()). In there, there is a call to >>> pm_runtime_force_suspend() (which calls pm_runtime_set_suspended() and >>> which triggered the earlier error messages being printed). >>> >>> The reason why genpd calls pm_runtime_force_suspend(), is because when >>> validating wakeup configurations for the device "if >>> (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))", it's >>> thinks wakeup isn't configured while it probably should be. >>> >>> An additional note, only when genpd has the GENPD_FLAG_PM_CLK set, >>> which makes the genpd->dev_ops.stop|start() being assigned, genpd >>> calls pm_runtime_force_suspend() - else it doesn't. >>> >>> Perhaps try out the series I recently posted improving the code >>> dealing with wakeups in genpd and the PM core: >>> https://www.spinics.net/lists/linux-renesas-soc/msg20122.html >>> To that, you need to set the new flag (invented in the above series) >>> DPM_FLAG_IN_BAND_WAKEUP in the driver that configures wakeup of its >>> device. >>> >>> Hope this helps! >> >> Thank you for the comments! >> I tried DPM_FLAG_IN_BAND_WAKEUP, but the issue still exists. >> I added the flag in the [eo]hci-platform driver and usb/core/driver.c. >> I also added the flag in the phy_rcar_gen3_usb2 driver except usb host drivers. > > First, did you confirm that genpd was used? Then for what device? All 6 devices are part of the SYSC PM Domain. > Second, did you check the call to pm_runtime_force_suspend() called by > genpd, is the reason to the error messages? > > Third, it should be sufficient to add DPM_FLAG_IN_BAND_WAKEUP for the > driver that is actually dealing with the wakeup. Although, does this > driver's system ->suspend() callback check device_may_wakeup(), before > it decides to enable wakeup? > If not, the PM core and genpd don't notice that wakeup is enabled for > the device. Actually I saw this with my patches setting GENPD_FLAG_ACTIVE_WAKEUP for the SYSC PM Domain, which should trigger the same behavior. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On 29 November 2017 at 10:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > Hi Ulf, > > On Wed, Nov 29, 2017 at 10:24 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >> On 29 November 2017 at 09:21, Yoshihiro Shimoda >> <yoshihiro.shimoda.uh@renesas.com> wrote: >>>> From: Ulf Hansson, Sent: Wednesday, November 29, 2017 2:23 AM >>>> On 28 November 2017 at 13:48, Yoshihiro Shimoda >>>> <yoshihiro.shimoda.uh@renesas.com> wrote: >>>> >> From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM >>>> >> On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>>> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>> <snip> >>>> >> JFTR, this triggered before during system resume on e.g. Salvator-XS with >>>> >> R-Car H3: >>>> >> >>>> >> ohci-platform ee080000.usb: runtime PM trying to suspend device >>>> >> but active child >>>> >> phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend >>>> >> device but active child >>>> >> ohci-platform ee0c0000.usb: runtime PM trying to suspend device >>>> >> but active child >>>> >> ohci-platform ee0a0000.usb: runtime PM trying to suspend device >>>> >> but active child >>>> >> phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend >>>> >> device but active child >>>> >> phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend >>>> >> device but active child >>>> >> >>>> >> so this was an existing issue with USB before. >>>> > >>>> > Thank you for the report! >>>> > I know that, but since this didn't cause any trouble until now, >>>> > I postponed to investigate the issue... But, I investigate it today. >>>> > I don't find the root cause yet. However, it seems related to usb host and/or usb core. >>>> > --> USB host related devices' child_count will be 1 in suspend timing. >>>> > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. >>>> >>>> I am guessing the issue is triggered by genpd in the suspend noirq >>>> phase (genpd_suspend_noirq()). In there, there is a call to >>>> pm_runtime_force_suspend() (which calls pm_runtime_set_suspended() and >>>> which triggered the earlier error messages being printed). >>>> >>>> The reason why genpd calls pm_runtime_force_suspend(), is because when >>>> validating wakeup configurations for the device "if >>>> (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))", it's >>>> thinks wakeup isn't configured while it probably should be. >>>> >>>> An additional note, only when genpd has the GENPD_FLAG_PM_CLK set, >>>> which makes the genpd->dev_ops.stop|start() being assigned, genpd >>>> calls pm_runtime_force_suspend() - else it doesn't. >>>> >>>> Perhaps try out the series I recently posted improving the code >>>> dealing with wakeups in genpd and the PM core: >>>> https://www.spinics.net/lists/linux-renesas-soc/msg20122.html >>>> To that, you need to set the new flag (invented in the above series) >>>> DPM_FLAG_IN_BAND_WAKEUP in the driver that configures wakeup of its >>>> device. >>>> >>>> Hope this helps! >>> >>> Thank you for the comments! >>> I tried DPM_FLAG_IN_BAND_WAKEUP, but the issue still exists. >>> I added the flag in the [eo]hci-platform driver and usb/core/driver.c. >>> I also added the flag in the phy_rcar_gen3_usb2 driver except usb host drivers. >> >> First, did you confirm that genpd was used? Then for what device? > > All 6 devices are part of the SYSC PM Domain. Okay! Can you perhaps clarify which 6 devices/drivers that are involved, and perhaps also point out if their child devices? > >> Second, did you check the call to pm_runtime_force_suspend() called by >> genpd, is the reason to the error messages? >> >> Third, it should be sufficient to add DPM_FLAG_IN_BAND_WAKEUP for the >> driver that is actually dealing with the wakeup. Although, does this >> driver's system ->suspend() callback check device_may_wakeup(), before >> it decides to enable wakeup? >> If not, the PM core and genpd don't notice that wakeup is enabled for >> the device. > > Actually I saw this with my patches setting GENPD_FLAG_ACTIVE_WAKEUP > for the SYSC PM Domain, which should trigger the same behavior. Okay, so the problem remains no matter which solution for wakeup you pick in genpd. Then this seems to point to that the driver may be misbehaving in some way. I can help to check what is going on. Kind regards Uffe
Hi Ulf, On Wed, Nov 29, 2017 at 10:59 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > On 29 November 2017 at 10:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: >> On Wed, Nov 29, 2017 at 10:24 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >>> On 29 November 2017 at 09:21, Yoshihiro Shimoda >>> <yoshihiro.shimoda.uh@renesas.com> wrote: >>>>> From: Ulf Hansson, Sent: Wednesday, November 29, 2017 2:23 AM >>>>> On 28 November 2017 at 13:48, Yoshihiro Shimoda >>>>> <yoshihiro.shimoda.uh@renesas.com> wrote: >>>>> >> From: Geert Uytterhoeven, Sent: Tuesday, November 28, 2017 7:58 PM >>>>> >> On Sun, Nov 12, 2017 at 1:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>>>> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>>> <snip> >>>>> >> JFTR, this triggered before during system resume on e.g. Salvator-XS with >>>>> >> R-Car H3: >>>>> >> >>>>> >> ohci-platform ee080000.usb: runtime PM trying to suspend device >>>>> >> but active child >>>>> >> phy_rcar_gen3_usb2 ee080200.usb-phy: runtime PM trying to suspend >>>>> >> device but active child >>>>> >> ohci-platform ee0c0000.usb: runtime PM trying to suspend device >>>>> >> but active child >>>>> >> ohci-platform ee0a0000.usb: runtime PM trying to suspend device >>>>> >> but active child >>>>> >> phy_rcar_gen3_usb2 ee0c0200.usb-phy: runtime PM trying to suspend >>>>> >> device but active child >>>>> >> phy_rcar_gen3_usb2 ee0a0200.usb-phy: runtime PM trying to suspend >>>>> >> device but active child >>>>> >> >>>>> >> so this was an existing issue with USB before. >>>>> > >>>>> > Thank you for the report! >>>>> > I know that, but since this didn't cause any trouble until now, >>>>> > I postponed to investigate the issue... But, I investigate it today. >>>>> > I don't find the root cause yet. However, it seems related to usb host and/or usb core. >>>>> > --> USB host related devices' child_count will be 1 in suspend timing. >>>>> > --> I guess remote wakeup feature is enabled? But, I don't find the point yet. >>>>> >>>>> I am guessing the issue is triggered by genpd in the suspend noirq >>>>> phase (genpd_suspend_noirq()). In there, there is a call to >>>>> pm_runtime_force_suspend() (which calls pm_runtime_set_suspended() and >>>>> which triggered the earlier error messages being printed). >>>>> >>>>> The reason why genpd calls pm_runtime_force_suspend(), is because when >>>>> validating wakeup configurations for the device "if >>>>> (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))", it's >>>>> thinks wakeup isn't configured while it probably should be. >>>>> >>>>> An additional note, only when genpd has the GENPD_FLAG_PM_CLK set, >>>>> which makes the genpd->dev_ops.stop|start() being assigned, genpd >>>>> calls pm_runtime_force_suspend() - else it doesn't. >>>>> >>>>> Perhaps try out the series I recently posted improving the code >>>>> dealing with wakeups in genpd and the PM core: >>>>> https://www.spinics.net/lists/linux-renesas-soc/msg20122.html >>>>> To that, you need to set the new flag (invented in the above series) >>>>> DPM_FLAG_IN_BAND_WAKEUP in the driver that configures wakeup of its >>>>> device. >>>>> >>>>> Hope this helps! >>>> >>>> Thank you for the comments! >>>> I tried DPM_FLAG_IN_BAND_WAKEUP, but the issue still exists. >>>> I added the flag in the [eo]hci-platform driver and usb/core/driver.c. >>>> I also added the flag in the phy_rcar_gen3_usb2 driver except usb host drivers. >>> >>> First, did you confirm that genpd was used? Then for what device? >> >> All 6 devices are part of the SYSC PM Domain. > > Okay! > > Can you perhaps clarify which 6 devices/drivers that are involved, and > perhaps also point out if their child devices? /sys/devices/platform/soc/ee080000.usb /sys/devices/platform/soc/ee0c0000.usb /sys/devices/platform/soc/ee0a0000.usb Driver: ohci-platform The children are usb6/6-0:1.0, usb3/3-0:1.0, resp. usb4/4-0:1.0, all using the usb "hub" driver /sys/devices/platform/soc/ee080200.usb-phy /sys/devices/platform/soc/ee0a0200.usb-phy /sys/devices/platform/soc/ee0c0200.usb-phy Driver: phy_rcar_gen3_usb2 The children are: phy/phy-ee080200.usb-phy.2 phy/phy-ee0a0200.usb-phy.0 phy/phy-ee0c0200.usb-phy.1 all without a driver, according to sysfs. Note that at first I had missed them, as printing the children using device_for_each_child() does not print them, unlike the hub devices that are children of the usb hosts. With some debug code added, logging inc/dec of child_count: USB driver init: ehci-pci: EHCI PCI platform driver ehci-platform: EHCI generic platform driver +phy_rcar_gen3_usb2 ee0a0200.usb-phy: rpm_resume:830: inc child_count of parent soc +phy phy-ee0a0200.usb-phy.0: rpm_resume:830: inc child_count of parent ee0a0200.usb-phy +phy phy-ee0a0200.usb-phy.0: rpm_suspend:606: dec child_count of parent ee0a0200.usb-phy +phy phy-ee0a0200.usb-phy.0: rpm_resume:759: inc child_count of parent ee0a0200.usb-phy +phy_rcar_gen3_usb2 ee0c0200.usb-phy: rpm_resume:830: inc child_count of parent soc +phy phy-ee0c0200.usb-phy.1: rpm_resume:830: inc child_count of parent ee0c0200.usb-phy +phy phy-ee0c0200.usb-phy.1: rpm_suspend:606: dec child_count of parent ee0c0200.usb-phy +phy phy-ee0c0200.usb-phy.1: rpm_resume:759: inc child_count of parent ee0c0200.usb-phy +phy phy-ee080200.usb-phy.2: rpm_resume:830: inc child_count of parent ee080200.usb-phy +phy phy-ee080200.usb-phy.2: rpm_suspend:606: dec child_count of parent ee080200.usb-phy +phy phy-ee080200.usb-phy.2: rpm_resume:759: inc child_count of parent ee080200.usb-phy Somehow the phy class phy-ee0*0200.usb-phy.* devices have the platform ee0*0200.usb-phy devices as parent, but they're not part of the list of children? Looks like a bug in the USB PHY driver or subsystem. USB hub instantiation: +usb usb3: __pm_runtime_set_status:1131: inc child_count of parent ee0a0000.usb +usb usb3: rpm_suspend:606: dec child_count of parent ee0a0000.usb +usb usb4: __pm_runtime_set_status:1131: inc child_count of parent ee0c0000.usb +usb usb4: rpm_suspend:606: dec child_count of parent ee0c0000.usb +usb usb6: __pm_runtime_set_status:1131: inc child_count of parent ee080000.usb +usb usb6: rpm_suspend:606: dec child_count of parent ee080000.usb System suspend: +usb usb6: rpm_resume:830: inc child_count of parent ee080000.usb +usb usb3: rpm_resume:830: inc child_count of parent ee0a0000.usb +usb usb4: rpm_resume:830: inc child_count of parent ee0c0000.usb System resume: Enabling runtime PM for inactive device (ee0c0200.usb-phy) with active children Enabling runtime PM for inactive device (ee0c0200.usb-phy) with active children Enabling runtime PM for inactive device (ee0a0000.usb) with active children Enabling runtime PM for inactive device (ee0c0000.usb) with active children Enabling runtime PM for inactive device (ee080200.usb-phy) with active children Enabling runtime PM for inactive device (ee080000.usb) with active children +usb usb4: rpm_suspend:606: dec child_count of parent ee0c0000.usb +usb usb6: rpm_suspend:606: dec child_count of parent ee080000.usb +usb usb3: rpm_suspend:606: dec child_count of parent ee0a0000.usb Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi, > From: Ulf Hansson, Sent: Wednesday, November 29, 2017 6:59 PM > > On 29 November 2017 at 10:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > Hi Ulf, <snip> > Okay, so the problem remains no matter which solution for wakeup you > pick in genpd. Yes. Today I could reproduce this issue without usb host driver. - The renesas_usb3 usb peripheral driver has generic phy handling. (The peripheral driver uses different generic phy driver (phy-rcar-gen3-usb3.c) though.) --> If I used the current renesas_usb3 (this means doesn't call phy_power_{on,off}(), the issue didn't happen. --> If I added phy_power_{on,off}() calling, the issue happened. --> So, I'm thinking the APIs are related to the issue. - The generic phy APIs are in drivers/phy/phy-core.c. --> The phy-rcar-gen3-usb[23] drivers call only pm_runtime_enable() before devm_phy_create(). --> The phy-core will call pm_runtime_{get_sync,put}() in phy_{init,exit,power_{on,off}}. --> So, IIUC, both devices of phy-<dev_name>.<id> and <dev_name> will be handled by runtime PM APIs. --> The runtime PM implementation of phy-core seems good to me. But...? > Then this seems to point to that the driver may be misbehaving in some > way. I can help to check what is going on. I guess so. But, I don't find yet... Best regards, Yoshihiro Shimoda > Kind regards > Uffe
Index: linux-pm/drivers/base/power/runtime.c =================================================================== --- linux-pm.orig/drivers/base/power/runtime.c +++ linux-pm/drivers/base/power/runtime.c @@ -1101,29 +1101,13 @@ int __pm_runtime_set_status(struct devic goto out; } - if (dev->power.runtime_status == status) + if (dev->power.runtime_status == status || !parent) goto out_set; if (status == RPM_SUSPENDED) { - /* - * It is invalid to suspend a device with an active child, - * unless it has been set to ignore its children. - */ - if (!dev->power.ignore_children && - atomic_read(&dev->power.child_count)) { - dev_err(dev, "runtime PM trying to suspend device but active child\n"); - error = -EBUSY; - goto out; - } - - if (parent) { - atomic_add_unless(&parent->power.child_count, -1, 0); - notify_parent = !parent->power.ignore_children; - } - goto out_set; - } - - if (parent) { + atomic_add_unless(&parent->power.child_count, -1, 0); + notify_parent = !parent->power.ignore_children; + } else { spin_lock_nested(&parent->power.lock, SINGLE_DEPTH_NESTING); /* @@ -1307,6 +1291,12 @@ void pm_runtime_enable(struct device *de else dev_warn(dev, "Unbalanced %s!\n", __func__); + WARN(dev->power.runtime_status == RPM_SUSPENDED && + !dev->power.ignore_children && + atomic_read(&dev->power.child_count) > 0, + "Enabling runtime PM for inactive device (%s) with active children\n", + dev_name(dev)); + spin_unlock_irqrestore(&dev->power.lock, flags); } EXPORT_SYMBOL_GPL(pm_runtime_enable);