Message ID | 6120485.xubBpvge6h@aspire.rjw.lan (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Thu, Dec 28, 2017 at 6:29 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote: >> * Rafael J. Wysocki <rafael@kernel.org> [171228 12:21]: >> > On Thu, Dec 28, 2017 at 5:22 AM, Tony Lindgren <tony@atomide.com> wrote: >> > > * Rafael J. Wysocki <rafael@kernel.org> [171228 00:51]: >> > >> On Wed, Dec 27, 2017 at 4:08 PM, Tony Lindgren <tony@atomide.com> wrote: >> > >> > * Rafael J. Wysocki <rjw@rjwysocki.net> [171227 01:00]: >> > >> >> On Tuesday, December 26, 2017 2:06:47 AM CET JeffyChen wrote: >> > >> >> > Hi Rafael, >> > >> >> > >> > >> >> > Thanks for your reply :) >> > >> >> > >> > >> >> > On 12/26/2017 08:11 AM, Rafael J. Wysocki wrote: >> > >> >> > >> >+ >> > >> >> > >> >+ dn = pci_device_to_OF_node(ppdev); >> > >> >> > >> >+ if (!dn) >> > >> >> > >> >+ return 0; >> > >> >> > >> >+ >> > >> >> > >> >+ irq = of_irq_get_byname(dn, "wakeup"); >> > >> >> > > Why is this a property of the bridge and not of the device itself? >> > >> >> > >> > >> >> > That is suggested by Brian, because in that way, the wakeup pin would >> > >> >> > not "tied to what exact device is installed (or no device, if it's a slot)." >> > >> >> >> > >> >> But I don't think it works when there are two devices using different WAKE# >> > >> >> interrupt lines under the same bridge. Or how does it work then? >> > >> > >> > >> > It won't work currently for multiple devices but adding more than >> > >> > one wakeriq per device is doable. And I think we will have other >> > >> > cases where multiple wakeirqs are connected to a single device, so >> > >> > that issue should be sorted out sooner or later. >> > >> > >> > >> > And if requesting wakeirq for the PCI WAKE# lines at the PCI >> > >> > controller does the job, then maybe that's all we need to start with. >> > >> >> > >> These are expected to be out-of-band, so not having anything to do >> > >> with the Root Complex. >> > >> >> > >> In-band PME Messages go through the PCIe hierarchy, but that is a >> > >> standard mechanism and it is supported already. >> > >> >> > >> WAKE# are platform-specific, pretty much by definition and I guess >> > >> that on most ARM boards they are just going to be some kind of GPIO >> > >> pins. >> > > >> > > OK. So probably supporting the following two configurations >> > > should be enough then: >> > > >> > > 1. One or more WAKE# lines configured as a wakeirq for the PCI >> > > controller >> > > >> > > When the wakeirq calls pm_wakeup_event() for the PCI controller >> > > device driver, the PCI controller wakes up and can deal with >> > > it's child devices >> > >> > But this shouldn't be necessary at all. Or if it is, I wonder why >> > that's the case. >> >> Well Brian had a concern where we would have to implement PM runtime >> for all device drivers for PCI devices. > > Why would we? > >> > I'm assuming that we're talking about PCI Express here, which has two >> > wakeup mechanisms defined, one of which is based on using PME Messages >> > (Beacon) and the second one is WAKE#: >> > >> > "The WAKE# mechanism uses sideband signaling to implement wakeup >> > functionality. WAKE# is >> > an “open drain” signal asserted by components requesting wakeup and >> > observed by the associated >> > power controller." >> > >> > (from PCIe Base Spec 3.0). [And there's a diagram showing the routing >> > of WAKE# in two cases in Figure 5-4: Conceptual Diagrams Showing Two >> > Example Cases of WAKE# Routing.] >> >> Thanks for the pointer, I had not seen that :) So the use cases >> I was trying to describe above are similar to the wiring in the >> PCIe Base Spec 3.0 "Figure 5-4" , but numbered the other way around. >> >> > Note that WAKE# is defined to be "observed by the associated power >> > controller", so I'm not sure what the PCI controller's role in the >> > handing of it is at all. >> >> To me it seems the "switch" part stays at least partially powered >> and then in-band PME messages are used after host is woken up to >> figure out which WAKE# triggered? > > Yes, that's my understanding too. To be precise, it is not quite possible to figure out which WAKE# triggered, if they are sharing the line, without looking into the config spaces of the devices below the switch. The switch is not expected to do that AFAICS. It only generates a PME message meaning "wakeup is being signaled somewhere below" and the PME driver that handles the Root Port receiving it should look at the PME Status bits of the devices below the switch (the pme.c driver does that IIRC or at least it should do that ;-)). Still, the handling of WAKE# doesn't need to cover this case AFAICS. Thanks, Rafael
* Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]: > On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote: > > > > Well Brian had a concern where we would have to implement PM runtime > > for all device drivers for PCI devices. > > Why would we? Seems at least I was a bit confused. In the PCIe case the WAKE# is owned by the PCIe slot, not the child PCIe device. So you're right, there should be no need for the child PCIe device drivers to implement runtime PM. I was thinking the wakeirq case with WLAN on SDIO bus. Some WLAN devices can have a hardwired OOB wakeirq wired to a GPIO controller. In that case the wakeirq is owned by the child device driver (WLAN controller) and not by the SDIO slot. I was earlier thinking this is the same as the "Figure 5-4" case 1, but it's not. So in the PCIe WAKE# case for device tree, we must have the wakeirq property for the PCIe slot for the struct device managing that slot, and not for the child device driver. I think it's already this way in the most recent set of patches, I need to look again. > > So isn't my option 1 above similar to the PCIe spec "Figure 5-4" > > case 2? > > No, it isn't, because in that case there is no practical difference > between WAKE# and an in-band PME message sent by the device (Beacon) > from the software perspective. > > WAKE# causes the switch to send a PME message upstream and that is > handled by the Root Complex through the standard mechanism already > supported by our existing PME driver (drivers/pci/pcie/pme.c). OK. So if "Figure 5-4" case 2 is already handled then and we need to just deal with "Figure 5-4" case 1 :) > > Yeah. FYI, for the dedicated wakeirq cases I have, we need to keep > > them masked during runtime to avoid tons of interrupts as they > > are often wired to the RX pins. > > OK > > BTW, enable_irq_wake() should take care of the sharing, shouldn't it? That can be used to tell us which device has wakeirq enabled for wake-up events, but only for resume not runtiem PM. We still have the shared IRQ problem to deal with. And the PCIe subsystem still needs to go through the child devices. > But the WAKE# thing is not just for waking up the system from sleep states, > it is for runtime PM's wakeup signaling too. Yes my test cases have it working for runtime PM and for waking up system from suspend. > > > > Currently nothing happens with wakeirqs if there's no struct > > > > wakeup_source. On device_wakeup_enable() we call device_wakeup_attach() > > > > that just copies dev->power.wakeirq to ws->wakeirq. And when struct > > > > wake_source is freed the device should be active and wakeirq > > > > disabled. Or are you seeing other issues here? > > > > > > I'm suspicious about one thing, but I need to look deeper into the code. :-) > > So we are fine except for the race and we need the wakeirq field in wakeup > sources to automatically arm the wakeup IRQs during suspend. OK. > If I'm not mistaken, we only need something like the patch below (untested). Seems like it should fix the race, I'll do some testing next week. Regards, Tony > --- > drivers/base/power/wakeirq.c | 9 ++++----- > drivers/base/power/wakeup.c | 2 +- > 2 files changed, 5 insertions(+), 6 deletions(-) > > Index: linux-pm/drivers/base/power/wakeirq.c > =================================================================== > --- linux-pm.orig/drivers/base/power/wakeirq.c > +++ linux-pm/drivers/base/power/wakeirq.c > @@ -33,7 +33,6 @@ static int dev_pm_attach_wake_irq(struct > struct wake_irq *wirq) > { > unsigned long flags; > - int err; > > if (!dev || !wirq) > return -EINVAL; > @@ -45,12 +44,12 @@ static int dev_pm_attach_wake_irq(struct > return -EEXIST; > } > > - err = device_wakeup_attach_irq(dev, wirq); > - if (!err) > - dev->power.wakeirq = wirq; > + dev->power.wakeirq = wirq; > + if (dev->power.wakeup) > + device_wakeup_attach_irq(dev, wirq); > > spin_unlock_irqrestore(&dev->power.lock, flags); > - return err; > + return 0; > } > > /** > Index: linux-pm/drivers/base/power/wakeup.c > =================================================================== > --- linux-pm.orig/drivers/base/power/wakeup.c > +++ linux-pm/drivers/base/power/wakeup.c > @@ -303,7 +303,7 @@ int device_wakeup_attach_irq(struct devi > } > > if (ws->wakeirq) > - return -EEXIST; > + dev_err(dev, "Leftover wakeup IRQ found, overriding\n"); > > ws->wakeirq = wakeirq; > return 0; >
* Rafael J. Wysocki <rafael@kernel.org> [171228 17:46]: > > To be precise, it is not quite possible to figure out which WAKE# > triggered, if they are sharing the line, without looking into the > config spaces of the devices below the switch. The switch is not > expected to do that AFAICS. It only generates a PME message meaning > "wakeup is being signaled somewhere below" and the PME driver that > handles the Root Port receiving it should look at the PME Status bits > of the devices below the switch (the pme.c driver does that IIRC or at > least it should do that ;-)). > > Still, the handling of WAKE# doesn't need to cover this case AFAICS. OK makes sense now. Regards, Tony
On Fri, Dec 29, 2017 at 6:15 PM, Tony Lindgren <tony@atomide.com> wrote: > * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]: >> On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote: >> > >> > Well Brian had a concern where we would have to implement PM runtime >> > for all device drivers for PCI devices. >> >> Why would we? > > Seems at least I was a bit confused. In the PCIe case the WAKE# is > owned by the PCIe slot, not the child PCIe device. Well, it depends on what you mean by "slot" and "child device", but if my understanding of it is correct, WAKE# actually is "owned" by the latter. First, let me clarify the terminology. PCI slots are not really represented in the device hierarchy in Linux. They are represented by kernel objects for hotplug purposes, but these objects are not based on struct device. Generally, there are two kinds of PCI entities represented by struct pci_dev, bridges and endpoints (functions). Bridges may represent physical devices, like PCI-to-PCI bridges, or parts of physical devices, like PCIe ports. In PCIe, every port is logically represented by a bridge (and a PCI config space region with a Type 1 header). However, ports do not take actions like generating interrupts; the pieces of hardware containing them (switches, Root Complex) do that. Endpoints (functions) are children of bridges (e.g. PCIe ports) and bridges may be children of other bridges (like in a switch that is represented by a bus segment with one upstream bridge - the upstream port - and possibly multiple downstream bridges - downstream ports). So in PCI a parent always is a bridge (either a PCI bridge - a bridge between to PCI bus segments - or a host bridge) and if that is a PCIe port, it cannot "own" anything like WAKE#, because it is not affected by it in any way and doesn't take part in the handling of it. In the context of "Figure 5-4" in the spec, Case 1, what matters is that every "Slot" in the figure represents a bunch (up to 8) of endpoints (functions), but the "Slot" is not the parent of them. The port of the switch the "Slot" is connected to is the parent. WAKE# basically comes from one of the endpoints belonging to the "Slot" and you need to look into the config space regions for all of these endpoints to check which one has PME Status set and clear it (to acknowledge the PME and make the hardware stop asserting the WAKE# signal). So, from the software perspective, the endpoint (child) is the source of WAKE# and that should be reflected by DT properties IMO. > So you're right, there should be no need for the child PCIe device drivers to > implement runtime PM. There should be no need for that regardless. You only need an interrupt handler that will look for the endpoint with PME Status set, acknowledge it and possibly invoke runtime PM for the endpoint in question (if supported). All of that is standard and can happen at the bus type level and the interrupt handler I'm talking about may be based on pci_pme_wakeup() or pci_acpi_wake_dev(). > I was thinking the wakeirq case with WLAN on SDIO bus. Some WLAN > devices can have a hardwired OOB wakeirq wired to a GPIO controller. > In that case the wakeirq is owned by the child device driver > (WLAN controller) and not by the SDIO slot. I was earlier > thinking this is the same as the "Figure 5-4" case 1, but it's > not. Well, it is not in the sense that the endpoint driver is not expected to handle the wakeup interrupt by itself. The PCI bus type is responsible for that, but technically WAKE# comes from the endpoint (child). > So in the PCIe WAKE# case for device tree, we must have the > wakeirq property for the PCIe slot for the struct device managing > that slot, Which doesn't exist. > and not for the child device driver. I think it's > already this way in the most recent set of patches, I need to > look again. No, you need a wakeirq properly for the child *device* and that property will be consumed by the PCI layer. >> > So isn't my option 1 above similar to the PCIe spec "Figure 5-4" >> > case 2? >> >> No, it isn't, because in that case there is no practical difference >> between WAKE# and an in-band PME message sent by the device (Beacon) >> from the software perspective. >> >> WAKE# causes the switch to send a PME message upstream and that is >> handled by the Root Complex through the standard mechanism already >> supported by our existing PME driver (drivers/pci/pcie/pme.c). > > OK. So if "Figure 5-4" case 2 is already handled then and we need > to just deal with "Figure 5-4" case 1 :) Right. >> > Yeah. FYI, for the dedicated wakeirq cases I have, we need to keep >> > them masked during runtime to avoid tons of interrupts as they >> > are often wired to the RX pins. >> >> OK >> >> BTW, enable_irq_wake() should take care of the sharing, shouldn't it? > > That can be used to tell us which device has wakeirq enabled for > wake-up events, but only for resume not runtiem PM. We still have the > shared IRQ problem to deal with. And the PCIe subsystem still needs > to go through the child devices. Right. It actually has to walk the bus below each of them too in case it is a bridge, like pci_acpi_wake_dev(). >> But the WAKE# thing is not just for waking up the system from sleep states, >> it is for runtime PM's wakeup signaling too. > > Yes my test cases have it working for runtime PM and for waking > up system from suspend. OK >> > > > Currently nothing happens with wakeirqs if there's no struct >> > > > wakeup_source. On device_wakeup_enable() we call device_wakeup_attach() >> > > > that just copies dev->power.wakeirq to ws->wakeirq. And when struct >> > > > wake_source is freed the device should be active and wakeirq >> > > > disabled. Or are you seeing other issues here? >> > > >> > > I'm suspicious about one thing, but I need to look deeper into the code. :-) >> >> So we are fine except for the race and we need the wakeirq field in wakeup >> sources to automatically arm the wakeup IRQs during suspend. > > OK. > >> If I'm not mistaken, we only need something like the patch below (untested). > > Seems like it should fix the race, I'll do some testing next week. OK, thanks! Rafael
On Sat, Dec 30, 2017 at 12:39 AM, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Fri, Dec 29, 2017 at 6:15 PM, Tony Lindgren <tony@atomide.com> wrote: >> * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]: >>> On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote: >>> > >>> > Well Brian had a concern where we would have to implement PM runtime >>> > for all device drivers for PCI devices. >>> >>> Why would we? >> >> Seems at least I was a bit confused. In the PCIe case the WAKE# is >> owned by the PCIe slot, not the child PCIe device. > > Well, it depends on what you mean by "slot" and "child device", but if > my understanding of it is correct, WAKE# actually is "owned" by the > latter. > > First, let me clarify the terminology. PCI slots are not really > represented in the device hierarchy in Linux. They are represented by > kernel objects for hotplug purposes, but these objects are not based > on struct device. > > Generally, there are two kinds of PCI entities represented by struct > pci_dev, bridges and endpoints (functions). Bridges may represent > physical devices, like PCI-to-PCI bridges, or parts of physical > devices, like PCIe ports. In PCIe, every port is logically > represented by a bridge (and a PCI config space region with a Type 1 > header). However, ports do not take actions like generating > interrupts; the pieces of hardware containing them (switches, Root > Complex) do that. > > Endpoints (functions) are children of bridges (e.g. PCIe ports) and > bridges may be children of other bridges (like in a switch that is > represented by a bus segment with one upstream bridge - the upstream > port - and possibly multiple downstream bridges - downstream ports). > So in PCI a parent always is a bridge (either a PCI bridge - a bridge > between to PCI bus segments - or a host bridge) and if that is a PCIe > port, it cannot "own" anything like WAKE#, because it is not affected > by it in any way and doesn't take part in the handling of it. > > In the context of "Figure 5-4" in the spec, Case 1, what matters is > that every "Slot" in the figure represents a bunch (up to 8) of > endpoints (functions), but the "Slot" is not the parent of them. The > port of the switch the "Slot" is connected to is the parent. WAKE# > basically comes from one of the endpoints belonging to the "Slot" and > you need to look into the config space regions for all of these > endpoints to check which one has PME Status set and clear it (to > acknowledge the PME and make the hardware stop asserting the WAKE# > signal). So, from the software perspective, the endpoint (child) is > the source of WAKE# and that should be reflected by DT properties IMO. > >> So you're right, there should be no need for the child PCIe device drivers to >> implement runtime PM. > > There should be no need for that regardless. You only need an > interrupt handler that will look for the endpoint with PME Status set, > acknowledge it and possibly invoke runtime PM for the endpoint in > question (if supported). All of that is standard and can happen at > the bus type level and the interrupt handler I'm talking about may be > based on pci_pme_wakeup() or pci_acpi_wake_dev(). > >> I was thinking the wakeirq case with WLAN on SDIO bus. Some WLAN >> devices can have a hardwired OOB wakeirq wired to a GPIO controller. >> In that case the wakeirq is owned by the child device driver >> (WLAN controller) and not by the SDIO slot. I was earlier >> thinking this is the same as the "Figure 5-4" case 1, but it's >> not. > > Well, it is not in the sense that the endpoint driver is not expected > to handle the wakeup interrupt by itself. The PCI bus type is > responsible for that, but technically WAKE# comes from the endpoint > (child). > >> So in the PCIe WAKE# case for device tree, we must have the >> wakeirq property for the PCIe slot for the struct device managing >> that slot, > > Which doesn't exist. > >> and not for the child device driver. I think it's >> already this way in the most recent set of patches, I need to >> look again. > > No, you need a wakeirq properly for the child *device* and that > property will be consumed by the PCI layer. Or, if you use the convention mentioned in another message in this thread, you can make the wakeirq be a property of a bridge (port) with the clarification of the assumption that WAKE# is shared by all functions below the bridge. So the presence of the "wakeirq" property for a bridge (in addition to providing the wakeup IRQ) will mean that it applies to all devices below the bridge. In the case of parallel PCI (not PCIe), there may be multiple "slots" (or "PCI devices" consisting each of multiple functions) under one bridge and in theory each of them may use a different IRQ for WAKE# signaling, so the above convention will not work then. Thanks, Rafael
* Rafael J. Wysocki <rafael@kernel.org> [171230 00:24]: > On Sat, Dec 30, 2017 at 12:39 AM, Rafael J. Wysocki <rafael@kernel.org> wrote: > > No, you need a wakeirq properly for the child *device* and that > > property will be consumed by the PCI layer. > > Or, if you use the convention mentioned in another message in this > thread, you can make the wakeirq be a property of a bridge (port) with > the clarification of the assumption that WAKE# is shared by all > functions below the bridge. So the presence of the "wakeirq" property > for a bridge (in addition to providing the wakeup IRQ) will mean that > it applies to all devices below the bridge. Yes that makes sense from device tree point of view. > In the case of parallel PCI (not PCIe), there may be multiple "slots" > (or "PCI devices" consisting each of multiple functions) under one > bridge and in theory each of them may use a different IRQ for WAKE# > signaling, so the above convention will not work then. OK. In that case the wakeirq would have to be mapped to each PCI device if there is no other way to describe where the slot WAKE# line is wired to. I guess the PCI controller could go through the child devices and map the wakeirqs that way if needed. Regards, Tony
Hi, * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]: > So we are fine except for the race and we need the wakeirq field in wakeup > sources to automatically arm the wakeup IRQs during suspend. > > If I'm not mistaken, we only need something like the patch below (untested). Yeah for your patch below works just fine for my test cases so: Tested-by: Tony Lindgren <tony@atomide.com> > --- > drivers/base/power/wakeirq.c | 9 ++++----- > drivers/base/power/wakeup.c | 2 +- > 2 files changed, 5 insertions(+), 6 deletions(-) > > Index: linux-pm/drivers/base/power/wakeirq.c > =================================================================== > --- linux-pm.orig/drivers/base/power/wakeirq.c > +++ linux-pm/drivers/base/power/wakeirq.c > @@ -33,7 +33,6 @@ static int dev_pm_attach_wake_irq(struct > struct wake_irq *wirq) > { > unsigned long flags; > - int err; > > if (!dev || !wirq) > return -EINVAL; > @@ -45,12 +44,12 @@ static int dev_pm_attach_wake_irq(struct > return -EEXIST; > } > > - err = device_wakeup_attach_irq(dev, wirq); > - if (!err) > - dev->power.wakeirq = wirq; > + dev->power.wakeirq = wirq; > + if (dev->power.wakeup) > + device_wakeup_attach_irq(dev, wirq); > > spin_unlock_irqrestore(&dev->power.lock, flags); > - return err; > + return 0; > } > > /** > Index: linux-pm/drivers/base/power/wakeup.c > =================================================================== > --- linux-pm.orig/drivers/base/power/wakeup.c > +++ linux-pm/drivers/base/power/wakeup.c > @@ -303,7 +303,7 @@ int device_wakeup_attach_irq(struct devi > } > > if (ws->wakeirq) > - return -EEXIST; > + dev_err(dev, "Leftover wakeup IRQ found, overriding\n"); > > ws->wakeirq = wakeirq; > return 0; >
On Wednesday, January 3, 2018 9:08:37 PM CET Tony Lindgren wrote: > Hi, > > * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]: > > So we are fine except for the race and we need the wakeirq field in wakeup > > sources to automatically arm the wakeup IRQs during suspend. > > > > If I'm not mistaken, we only need something like the patch below (untested). > > Yeah for your patch below works just fine for my > test cases so: > > Tested-by: Tony Lindgren <tony@atomide.com> Cool, thanks! Let me respin it with a subject/changelog. Thanks, Rafael
Index: linux-pm/drivers/base/power/wakeirq.c =================================================================== --- linux-pm.orig/drivers/base/power/wakeirq.c +++ linux-pm/drivers/base/power/wakeirq.c @@ -33,7 +33,6 @@ static int dev_pm_attach_wake_irq(struct struct wake_irq *wirq) { unsigned long flags; - int err; if (!dev || !wirq) return -EINVAL; @@ -45,12 +44,12 @@ static int dev_pm_attach_wake_irq(struct return -EEXIST; } - err = device_wakeup_attach_irq(dev, wirq); - if (!err) - dev->power.wakeirq = wirq; + dev->power.wakeirq = wirq; + if (dev->power.wakeup) + device_wakeup_attach_irq(dev, wirq); spin_unlock_irqrestore(&dev->power.lock, flags); - return err; + return 0; } /** Index: linux-pm/drivers/base/power/wakeup.c =================================================================== --- linux-pm.orig/drivers/base/power/wakeup.c +++ linux-pm/drivers/base/power/wakeup.c @@ -303,7 +303,7 @@ int device_wakeup_attach_irq(struct devi } if (ws->wakeirq) - return -EEXIST; + dev_err(dev, "Leftover wakeup IRQ found, overriding\n"); ws->wakeirq = wakeirq; return 0;