diff mbox

[RFC,v11,4/5] PCI / PM: Add support for the PCIe WAKE# signal for OF

Message ID 6120485.xubBpvge6h@aspire.rjw.lan (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Rafael J. Wysocki Dec. 28, 2017, 5:29 p.m. UTC
On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [171228 12:21]:
> > On Thu, Dec 28, 2017 at 5:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > > * Rafael J. Wysocki <rafael@kernel.org> [171228 00:51]:
> > >> On Wed, Dec 27, 2017 at 4:08 PM, Tony Lindgren <tony@atomide.com> wrote:
> > >> > * Rafael J. Wysocki <rjw@rjwysocki.net> [171227 01:00]:
> > >> >> On Tuesday, December 26, 2017 2:06:47 AM CET JeffyChen wrote:
> > >> >> > Hi Rafael,
> > >> >> >
> > >> >> > Thanks for your reply :)
> > >> >> >
> > >> >> > On 12/26/2017 08:11 AM, Rafael J. Wysocki wrote:
> > >> >> > >> >+
> > >> >> > >> >+       dn = pci_device_to_OF_node(ppdev);
> > >> >> > >> >+       if (!dn)
> > >> >> > >> >+               return 0;
> > >> >> > >> >+
> > >> >> > >> >+       irq = of_irq_get_byname(dn, "wakeup");
> > >> >> > > Why is this a property of the bridge and not of the device itself?
> > >> >> >
> > >> >> > That is suggested by Brian, because in that way, the wakeup pin would
> > >> >> > not "tied to what exact device is installed (or no device, if it's a slot)."
> > >> >>
> > >> >> But I don't think it works when there are two devices using different WAKE#
> > >> >> interrupt lines under the same bridge.  Or how does it work then?
> > >> >
> > >> > It won't work currently for multiple devices but adding more than
> > >> > one wakeriq per device is doable. And I think we will have other
> > >> > cases where multiple wakeirqs are connected to a single device, so
> > >> > that issue should be sorted out sooner or later.
> > >> >
> > >> > And if requesting wakeirq for the PCI WAKE# lines at the PCI
> > >> > controller does the job, then maybe that's all we need to start with.
> > >>
> > >> These are expected to be out-of-band, so not having anything to do
> > >> with the Root Complex.
> > >>
> > >> In-band PME Messages go through the PCIe hierarchy, but that is a
> > >> standard mechanism and it is supported already.
> > >>
> > >> WAKE# are platform-specific, pretty much by definition and I guess
> > >> that on most ARM boards they are just going to be some kind of GPIO
> > >> pins.
> > >
> > > OK. So probably supporting the following two configurations
> > > should be enough then:
> > >
> > > 1. One or more WAKE# lines configured as a wakeirq for the PCI
> > >    controller
> > >
> > >    When the wakeirq calls pm_wakeup_event() for the PCI controller
> > >    device driver, the PCI controller wakes up and can deal with
> > >    it's child devices
> > 
> > But this shouldn't be necessary at all.  Or if it is, I wonder why
> > that's the case.
> 
> Well Brian had a concern where we would have to implement PM runtime
> for all device drivers for PCI devices.

Why would we?

> > I'm assuming that we're talking about PCI Express here, which has two
> > wakeup mechanisms defined, one of which is based on using PME Messages
> > (Beacon) and the second one is WAKE#:
> > 
> > "The WAKE# mechanism uses sideband signaling to implement wakeup
> > functionality. WAKE# is
> > an “open drain” signal asserted by components requesting wakeup and
> > observed by the associated
> > power controller."
> > 
> > (from PCIe Base Spec 3.0).  [And there's a diagram showing the routing
> > of WAKE# in two cases in Figure 5-4: Conceptual Diagrams Showing Two
> > Example Cases of WAKE# Routing.]
> 
> Thanks for the pointer, I had not seen that :) So the use cases
> I was trying to describe above are similar to the wiring in the
> PCIe Base Spec 3.0 "Figure 5-4" , but numbered the other way around.
> 
> > Note that WAKE# is defined to be "observed by the associated power
> > controller", so I'm not sure what the PCI controller's role in the
> > handing of it is at all.
> 
> To me it seems the "switch" part stays at least partially powered
> and then in-band PME messages are used after host is woken up to
> figure out which WAKE# triggered?

Yes, that's my understanding too.

> > > 2. Optionally a WAKE# line from a PCI device configured as wakeirq
> > >    for the PCI device driver
> > >
> > >    In this case calling the PM runtime resume in the child
> > >    PCI device will also wake up the parent PCI controller,
> > >    and then the PCI controller can deal with it's children
> > >
> > > Seems like this series is pretty close to 1 above except
> > > we need to have a list of wakeirqs per device instead of
> > > just one. And option 2 should already work as long as the
> > > PCI device driver parses and configures the wakeirq.
> > 
> > Well, this is confusing, because as I said above, option 1 doesn't
> > look relevant even.
> 
> So isn't my option 1 above similar to the PCIe spec "Figure 5-4"
> case 2?

No, it isn't, because in that case there is no practical difference
between WAKE# and an in-band PME message sent by the device (Beacon)
from the software perspective.

WAKE# causes the switch to send a PME message upstream and that is
handled by the Root Complex through the standard mechanism already
supported by our existing PME driver (drivers/pci/pcie/pme.c).

> Anyways, let's standardize on the "Figure 5-4" naming
> from now to avoid confusion :)

OK

> > >> > Then in addition to that, we could do the following to allow
> > >> > PCI devices to request the wakeirq from the PCI controller:
> > >> >
> > >> > 1. PCI controller or framework implements a chained irq for
> > >> >    the WAKE# lines assuming it can mask/unmask the WAKE# lines
> > >> >
> > >> > 2. PCI devices then can just request the wakeirq from the PCI
> > >> >    controller
> > >> >
> > >> > And that's about it. Optionally we could leave out the dependency
> > >> > to having PCI devices implement PM runtime and just resume the
> > >> > parent (PCI controller) if PCI devices has not implemented
> > >> > PM runtime.
> > >>
> > >> So if my understanding is correct, DT should give you the WAKE# IRQ
> > >> for the given endpoint PCI device and you only are expected to request
> > >> it.   The rest should just follow from the other pieces of information
> > >> in the DT.
> > >
> > > Yeah and it seems that we should allow configuring both cases
> > > 1 and 2 above.
> > >
> > >> With the quite obvious caveat that the same IRQ may be used as WAKE#
> > >> for multiple endpoint devices (which BTW need not be under the same
> > >> bridge even).
> > >
> > > And with the shared interrupts we can't do the masking/unmasking
> > > automatically..
> > 
> > Or we need to use reference counting (so actually the wakeup IRQs are
> > not dedicated).
> 
> Yeah. FYI, for the dedicated wakeirq cases I have, we need to keep
> them masked during runtime to avoid tons of interrupts as they
> are often wired to the RX pins.

OK

BTW, enable_irq_wake() should take care of the sharing, shouldn't it?

But the WAKE# thing is not just for waking up the system from sleep states,
it is for runtime PM's wakeup signaling too.

> > >> >> > >> >+       if (irq == -EPROBE_DEFER)
> > >> >> > > Braces here, please.
> > >> >> > ok, will fix in the next version.
> > >> >> >
> > >> >> > >
> > >> >> > >> >+               return irq;
> > >> >> > >> >+       /* Ignore other errors, since a missing wakeup is non-fatal. */
> > >> >> > >> >+       else if (irq < 0) {
> > >> >> > >> >+               dev_info(&pdev->dev, "cannot get wakeup interrupt: %d\n", irq);
> > >> >> > >> >+               return 0;
> > >> >> > >> >+       }
> > >> >> > >> >+
> > >> >> > >> >+       device_init_wakeup(&pdev->dev, true);
> > >> >> > > Why do you call this before dev_pm_set_dedicated_wake_irq()?
> > >> >> >
> > >> >> > hmmm, i thought so too, but it turns out the dedicated wake irq
> > >> >> > framework requires device_init_wakeup(dev, true) before attach the wake irq:
> > >> >> >
> > >> >> > int device_wakeup_attach_irq(struct device *dev,
> > >> >> >                               struct wake_irq *wakeirq)
> > >> >> > {
> > >> >> >          struct wakeup_source *ws;
> > >> >> >
> > >> >> >          ws = dev->power.wakeup;
> > >> >> >          if (!ws) {
> > >> >> >                  dev_err(dev, "forgot to call device_init_wakeup?\n");
> > >> >> >                  return -EINVAL;
> > >> >> >
> > >> >>
> > >> >> Well, that's a framework issue, fair enough.
> > >> >>
> > >> >> That said, what if user space removes the wakeup source from under you
> > >> >> concurrently via sysfs?  Tony?
> > >> >
> > >> > Hmm sounds racy, need to take a look.
> > >>
> > >> Not only racy, as I don't see anything to prevent user space from
> > >> making the dev->power.wakeup wakeup source go away via sysfs at any
> > >> time *after* the IRQ has been requested.
> > >
> > > Currently nothing happens with wakeirqs if there's no struct
> > > wakeup_source. On device_wakeup_enable() we call device_wakeup_attach()
> > > that just copies dev->power.wakeirq to ws->wakeirq. And when struct
> > > wake_source is freed the device should be active and wakeirq
> > > disabled. Or are you seeing other issues here?
> > 
> > I'm suspicious about one thing, but I need to look deeper into the code. :-)

So we are fine except for the race and we need the wakeirq field in wakeup
sources to automatically arm the wakeup IRQs during suspend.

If I'm not mistaken, we only need something like the patch below (untested).

> OK. My response time will be laggy this week in case you find
> something that needs urgent fixing :)

No worries. :-)

---
 drivers/base/power/wakeirq.c |    9 ++++-----
 drivers/base/power/wakeup.c  |    2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

Comments

Rafael J. Wysocki Dec. 28, 2017, 5:43 p.m. UTC | #1
On Thu, Dec 28, 2017 at 6:29 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote:
>> * Rafael J. Wysocki <rafael@kernel.org> [171228 12:21]:
>> > On Thu, Dec 28, 2017 at 5:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > > * Rafael J. Wysocki <rafael@kernel.org> [171228 00:51]:
>> > >> On Wed, Dec 27, 2017 at 4:08 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > >> > * Rafael J. Wysocki <rjw@rjwysocki.net> [171227 01:00]:
>> > >> >> On Tuesday, December 26, 2017 2:06:47 AM CET JeffyChen wrote:
>> > >> >> > Hi Rafael,
>> > >> >> >
>> > >> >> > Thanks for your reply :)
>> > >> >> >
>> > >> >> > On 12/26/2017 08:11 AM, Rafael J. Wysocki wrote:
>> > >> >> > >> >+
>> > >> >> > >> >+       dn = pci_device_to_OF_node(ppdev);
>> > >> >> > >> >+       if (!dn)
>> > >> >> > >> >+               return 0;
>> > >> >> > >> >+
>> > >> >> > >> >+       irq = of_irq_get_byname(dn, "wakeup");
>> > >> >> > > Why is this a property of the bridge and not of the device itself?
>> > >> >> >
>> > >> >> > That is suggested by Brian, because in that way, the wakeup pin would
>> > >> >> > not "tied to what exact device is installed (or no device, if it's a slot)."
>> > >> >>
>> > >> >> But I don't think it works when there are two devices using different WAKE#
>> > >> >> interrupt lines under the same bridge.  Or how does it work then?
>> > >> >
>> > >> > It won't work currently for multiple devices but adding more than
>> > >> > one wakeriq per device is doable. And I think we will have other
>> > >> > cases where multiple wakeirqs are connected to a single device, so
>> > >> > that issue should be sorted out sooner or later.
>> > >> >
>> > >> > And if requesting wakeirq for the PCI WAKE# lines at the PCI
>> > >> > controller does the job, then maybe that's all we need to start with.
>> > >>
>> > >> These are expected to be out-of-band, so not having anything to do
>> > >> with the Root Complex.
>> > >>
>> > >> In-band PME Messages go through the PCIe hierarchy, but that is a
>> > >> standard mechanism and it is supported already.
>> > >>
>> > >> WAKE# are platform-specific, pretty much by definition and I guess
>> > >> that on most ARM boards they are just going to be some kind of GPIO
>> > >> pins.
>> > >
>> > > OK. So probably supporting the following two configurations
>> > > should be enough then:
>> > >
>> > > 1. One or more WAKE# lines configured as a wakeirq for the PCI
>> > >    controller
>> > >
>> > >    When the wakeirq calls pm_wakeup_event() for the PCI controller
>> > >    device driver, the PCI controller wakes up and can deal with
>> > >    it's child devices
>> >
>> > But this shouldn't be necessary at all.  Or if it is, I wonder why
>> > that's the case.
>>
>> Well Brian had a concern where we would have to implement PM runtime
>> for all device drivers for PCI devices.
>
> Why would we?
>
>> > I'm assuming that we're talking about PCI Express here, which has two
>> > wakeup mechanisms defined, one of which is based on using PME Messages
>> > (Beacon) and the second one is WAKE#:
>> >
>> > "The WAKE# mechanism uses sideband signaling to implement wakeup
>> > functionality. WAKE# is
>> > an “open drain” signal asserted by components requesting wakeup and
>> > observed by the associated
>> > power controller."
>> >
>> > (from PCIe Base Spec 3.0).  [And there's a diagram showing the routing
>> > of WAKE# in two cases in Figure 5-4: Conceptual Diagrams Showing Two
>> > Example Cases of WAKE# Routing.]
>>
>> Thanks for the pointer, I had not seen that :) So the use cases
>> I was trying to describe above are similar to the wiring in the
>> PCIe Base Spec 3.0 "Figure 5-4" , but numbered the other way around.
>>
>> > Note that WAKE# is defined to be "observed by the associated power
>> > controller", so I'm not sure what the PCI controller's role in the
>> > handing of it is at all.
>>
>> To me it seems the "switch" part stays at least partially powered
>> and then in-band PME messages are used after host is woken up to
>> figure out which WAKE# triggered?
>
> Yes, that's my understanding too.

To be precise, it is not quite possible to figure out which WAKE#
triggered, if they are sharing the line, without looking into the
config spaces of the devices below the switch.  The switch is not
expected to do that AFAICS.  It only generates a PME message meaning
"wakeup is being signaled somewhere below" and the PME driver that
handles the Root Port receiving it should look at the PME Status bits
of the devices below the switch (the pme.c driver does that IIRC or at
least it should do that ;-)).

Still, the handling of WAKE# doesn't need to cover this case AFAICS.

Thanks,
Rafael
Tony Lindgren Dec. 29, 2017, 5:15 p.m. UTC | #2
* Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]:
> On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote:
> > 
> > Well Brian had a concern where we would have to implement PM runtime
> > for all device drivers for PCI devices.
> 
> Why would we?

Seems at least I was a bit confused. In the PCIe case the WAKE# is
owned by the PCIe slot, not the child PCIe device. So you're right,
there should be no need for the child PCIe device drivers to
implement runtime PM.

I was thinking the wakeirq case with WLAN on SDIO bus. Some WLAN
devices can have a hardwired OOB wakeirq wired to a GPIO controller.
In that case the wakeirq is owned by the child device driver
(WLAN controller) and not by the SDIO slot. I was earlier
thinking this is the same as the "Figure 5-4" case 1, but it's
not.

So in the PCIe WAKE# case for device tree, we must have the
wakeirq property for the PCIe slot for the struct device managing
that slot, and not for the child device driver. I think it's
already this way in the most recent set of patches, I need to
look again.

> > So isn't my option 1 above similar to the PCIe spec "Figure 5-4"
> > case 2?
> 
> No, it isn't, because in that case there is no practical difference
> between WAKE# and an in-band PME message sent by the device (Beacon)
> from the software perspective.
> 
> WAKE# causes the switch to send a PME message upstream and that is
> handled by the Root Complex through the standard mechanism already
> supported by our existing PME driver (drivers/pci/pcie/pme.c).

OK. So if "Figure 5-4" case 2 is already handled then and we need
to just deal with "Figure 5-4" case 1 :)

> > Yeah. FYI, for the dedicated wakeirq cases I have, we need to keep
> > them masked during runtime to avoid tons of interrupts as they
> > are often wired to the RX pins.
> 
> OK
> 
> BTW, enable_irq_wake() should take care of the sharing, shouldn't it?

That can be used to tell us which device has wakeirq enabled for
wake-up events, but only for resume not runtiem PM. We still have the
shared IRQ problem to deal with. And the PCIe subsystem still needs
to go through the child devices.

> But the WAKE# thing is not just for waking up the system from sleep states,
> it is for runtime PM's wakeup signaling too.

Yes my test cases have it working for runtime PM and for waking
up system from suspend.

> > > > Currently nothing happens with wakeirqs if there's no struct
> > > > wakeup_source. On device_wakeup_enable() we call device_wakeup_attach()
> > > > that just copies dev->power.wakeirq to ws->wakeirq. And when struct
> > > > wake_source is freed the device should be active and wakeirq
> > > > disabled. Or are you seeing other issues here?
> > > 
> > > I'm suspicious about one thing, but I need to look deeper into the code. :-)
> 
> So we are fine except for the race and we need the wakeirq field in wakeup
> sources to automatically arm the wakeup IRQs during suspend.

OK.

> If I'm not mistaken, we only need something like the patch below (untested).

Seems like it should fix the race, I'll do some testing next week.

Regards,

Tony

> ---
>  drivers/base/power/wakeirq.c |    9 ++++-----
>  drivers/base/power/wakeup.c  |    2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> Index: linux-pm/drivers/base/power/wakeirq.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/wakeirq.c
> +++ linux-pm/drivers/base/power/wakeirq.c
> @@ -33,7 +33,6 @@ static int dev_pm_attach_wake_irq(struct
>  				  struct wake_irq *wirq)
>  {
>  	unsigned long flags;
> -	int err;
>  
>  	if (!dev || !wirq)
>  		return -EINVAL;
> @@ -45,12 +44,12 @@ static int dev_pm_attach_wake_irq(struct
>  		return -EEXIST;
>  	}
>  
> -	err = device_wakeup_attach_irq(dev, wirq);
> -	if (!err)
> -		dev->power.wakeirq = wirq;
> +	dev->power.wakeirq = wirq;
> +	if (dev->power.wakeup)
> +		device_wakeup_attach_irq(dev, wirq);
>  
>  	spin_unlock_irqrestore(&dev->power.lock, flags);
> -	return err;
> +	return 0;
>  }
>  
>  /**
> Index: linux-pm/drivers/base/power/wakeup.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/wakeup.c
> +++ linux-pm/drivers/base/power/wakeup.c
> @@ -303,7 +303,7 @@ int device_wakeup_attach_irq(struct devi
>  	}
>  
>  	if (ws->wakeirq)
> -		return -EEXIST;
> +		dev_err(dev, "Leftover wakeup IRQ found, overriding\n");
>  
>  	ws->wakeirq = wakeirq;
>  	return 0;
>
Tony Lindgren Dec. 29, 2017, 5:16 p.m. UTC | #3
* Rafael J. Wysocki <rafael@kernel.org> [171228 17:46]:
> 
> To be precise, it is not quite possible to figure out which WAKE#
> triggered, if they are sharing the line, without looking into the
> config spaces of the devices below the switch.  The switch is not
> expected to do that AFAICS.  It only generates a PME message meaning
> "wakeup is being signaled somewhere below" and the PME driver that
> handles the Root Port receiving it should look at the PME Status bits
> of the devices below the switch (the pme.c driver does that IIRC or at
> least it should do that ;-)).
> 
> Still, the handling of WAKE# doesn't need to cover this case AFAICS.

OK makes sense now.

Regards,

Tony
Rafael J. Wysocki Dec. 29, 2017, 11:39 p.m. UTC | #4
On Fri, Dec 29, 2017 at 6:15 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]:
>> On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote:
>> >
>> > Well Brian had a concern where we would have to implement PM runtime
>> > for all device drivers for PCI devices.
>>
>> Why would we?
>
> Seems at least I was a bit confused. In the PCIe case the WAKE# is
> owned by the PCIe slot, not the child PCIe device.

Well, it depends on what you mean by "slot" and "child device", but if
my understanding of it is correct, WAKE# actually is "owned" by the
latter.

First, let me clarify the terminology.  PCI slots are not really
represented in the device hierarchy in Linux.  They are represented by
kernel objects for hotplug purposes, but these objects are not based
on struct device.

Generally, there are two kinds of PCI entities represented by struct
pci_dev, bridges and endpoints (functions).  Bridges may represent
physical devices, like PCI-to-PCI bridges, or parts of physical
devices, like PCIe ports.  In PCIe, every port is logically
represented by a bridge (and a PCI config space region with a Type 1
header).  However, ports do not take actions like generating
interrupts; the pieces of hardware containing them (switches, Root
Complex) do that.

Endpoints (functions) are children of bridges (e.g. PCIe ports) and
bridges may be children of other bridges (like in a switch that is
represented by a bus segment with one upstream bridge - the upstream
port - and possibly multiple downstream bridges - downstream ports).
So in PCI a parent always is a bridge (either a PCI bridge - a bridge
between to PCI bus segments - or a host bridge) and if that is a PCIe
port, it cannot "own" anything like WAKE#, because it is not affected
by it in any way and doesn't take part in the handling of it.

In the context of "Figure 5-4" in the spec, Case 1, what matters is
that every "Slot" in the figure represents a bunch (up to 8) of
endpoints (functions), but the "Slot" is not the parent of them.  The
port of the switch the "Slot" is connected to is the parent.  WAKE#
basically comes from one of the endpoints belonging to the "Slot" and
you need to look into the config space regions for all of these
endpoints to check which one has PME Status set and clear it (to
acknowledge the PME and make the hardware stop asserting the WAKE#
signal).  So, from the software perspective, the endpoint (child) is
the source of WAKE# and that should be reflected by DT properties IMO.

> So you're right, there should be no need for the child PCIe device drivers to
> implement runtime PM.

There should be no need for that regardless.  You only need an
interrupt handler that will look for the endpoint with PME Status set,
acknowledge it and possibly invoke runtime PM for the endpoint in
question (if supported).  All of that is standard and can happen at
the bus type level and the interrupt handler I'm talking about may be
based on pci_pme_wakeup() or pci_acpi_wake_dev().

> I was thinking the wakeirq case with WLAN on SDIO bus. Some WLAN
> devices can have a hardwired OOB wakeirq wired to a GPIO controller.
> In that case the wakeirq is owned by the child device driver
> (WLAN controller) and not by the SDIO slot. I was earlier
> thinking this is the same as the "Figure 5-4" case 1, but it's
> not.

Well, it is not in the sense that the endpoint driver is not expected
to handle the wakeup interrupt by itself.  The PCI bus type is
responsible for that, but technically WAKE# comes from the endpoint
(child).

> So in the PCIe WAKE# case for device tree, we must have the
> wakeirq property for the PCIe slot for the struct device managing
> that slot,

Which doesn't exist.

> and not for the child device driver. I think it's
> already this way in the most recent set of patches, I need to
> look again.

No, you need a wakeirq properly for the child *device* and that
property will be consumed by the PCI layer.

>> > So isn't my option 1 above similar to the PCIe spec "Figure 5-4"
>> > case 2?
>>
>> No, it isn't, because in that case there is no practical difference
>> between WAKE# and an in-band PME message sent by the device (Beacon)
>> from the software perspective.
>>
>> WAKE# causes the switch to send a PME message upstream and that is
>> handled by the Root Complex through the standard mechanism already
>> supported by our existing PME driver (drivers/pci/pcie/pme.c).
>
> OK. So if "Figure 5-4" case 2 is already handled then and we need
> to just deal with "Figure 5-4" case 1 :)

Right.

>> > Yeah. FYI, for the dedicated wakeirq cases I have, we need to keep
>> > them masked during runtime to avoid tons of interrupts as they
>> > are often wired to the RX pins.
>>
>> OK
>>
>> BTW, enable_irq_wake() should take care of the sharing, shouldn't it?
>
> That can be used to tell us which device has wakeirq enabled for
> wake-up events, but only for resume not runtiem PM. We still have the
> shared IRQ problem to deal with. And the PCIe subsystem still needs
> to go through the child devices.

Right.

It actually has to walk the bus below each of them too in case it is a
bridge, like pci_acpi_wake_dev().

>> But the WAKE# thing is not just for waking up the system from sleep states,
>> it is for runtime PM's wakeup signaling too.
>
> Yes my test cases have it working for runtime PM and for waking
> up system from suspend.

OK

>> > > > Currently nothing happens with wakeirqs if there's no struct
>> > > > wakeup_source. On device_wakeup_enable() we call device_wakeup_attach()
>> > > > that just copies dev->power.wakeirq to ws->wakeirq. And when struct
>> > > > wake_source is freed the device should be active and wakeirq
>> > > > disabled. Or are you seeing other issues here?
>> > >
>> > > I'm suspicious about one thing, but I need to look deeper into the code. :-)
>>
>> So we are fine except for the race and we need the wakeirq field in wakeup
>> sources to automatically arm the wakeup IRQs during suspend.
>
> OK.
>
>> If I'm not mistaken, we only need something like the patch below (untested).
>
> Seems like it should fix the race, I'll do some testing next week.

OK, thanks!

Rafael
Rafael J. Wysocki Dec. 30, 2017, 12:21 a.m. UTC | #5
On Sat, Dec 30, 2017 at 12:39 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Fri, Dec 29, 2017 at 6:15 PM, Tony Lindgren <tony@atomide.com> wrote:
>> * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]:
>>> On Thursday, December 28, 2017 5:51:34 PM CET Tony Lindgren wrote:
>>> >
>>> > Well Brian had a concern where we would have to implement PM runtime
>>> > for all device drivers for PCI devices.
>>>
>>> Why would we?
>>
>> Seems at least I was a bit confused. In the PCIe case the WAKE# is
>> owned by the PCIe slot, not the child PCIe device.
>
> Well, it depends on what you mean by "slot" and "child device", but if
> my understanding of it is correct, WAKE# actually is "owned" by the
> latter.
>
> First, let me clarify the terminology.  PCI slots are not really
> represented in the device hierarchy in Linux.  They are represented by
> kernel objects for hotplug purposes, but these objects are not based
> on struct device.
>
> Generally, there are two kinds of PCI entities represented by struct
> pci_dev, bridges and endpoints (functions).  Bridges may represent
> physical devices, like PCI-to-PCI bridges, or parts of physical
> devices, like PCIe ports.  In PCIe, every port is logically
> represented by a bridge (and a PCI config space region with a Type 1
> header).  However, ports do not take actions like generating
> interrupts; the pieces of hardware containing them (switches, Root
> Complex) do that.
>
> Endpoints (functions) are children of bridges (e.g. PCIe ports) and
> bridges may be children of other bridges (like in a switch that is
> represented by a bus segment with one upstream bridge - the upstream
> port - and possibly multiple downstream bridges - downstream ports).
> So in PCI a parent always is a bridge (either a PCI bridge - a bridge
> between to PCI bus segments - or a host bridge) and if that is a PCIe
> port, it cannot "own" anything like WAKE#, because it is not affected
> by it in any way and doesn't take part in the handling of it.
>
> In the context of "Figure 5-4" in the spec, Case 1, what matters is
> that every "Slot" in the figure represents a bunch (up to 8) of
> endpoints (functions), but the "Slot" is not the parent of them.  The
> port of the switch the "Slot" is connected to is the parent.  WAKE#
> basically comes from one of the endpoints belonging to the "Slot" and
> you need to look into the config space regions for all of these
> endpoints to check which one has PME Status set and clear it (to
> acknowledge the PME and make the hardware stop asserting the WAKE#
> signal).  So, from the software perspective, the endpoint (child) is
> the source of WAKE# and that should be reflected by DT properties IMO.
>
>> So you're right, there should be no need for the child PCIe device drivers to
>> implement runtime PM.
>
> There should be no need for that regardless.  You only need an
> interrupt handler that will look for the endpoint with PME Status set,
> acknowledge it and possibly invoke runtime PM for the endpoint in
> question (if supported).  All of that is standard and can happen at
> the bus type level and the interrupt handler I'm talking about may be
> based on pci_pme_wakeup() or pci_acpi_wake_dev().
>
>> I was thinking the wakeirq case with WLAN on SDIO bus. Some WLAN
>> devices can have a hardwired OOB wakeirq wired to a GPIO controller.
>> In that case the wakeirq is owned by the child device driver
>> (WLAN controller) and not by the SDIO slot. I was earlier
>> thinking this is the same as the "Figure 5-4" case 1, but it's
>> not.
>
> Well, it is not in the sense that the endpoint driver is not expected
> to handle the wakeup interrupt by itself.  The PCI bus type is
> responsible for that, but technically WAKE# comes from the endpoint
> (child).
>
>> So in the PCIe WAKE# case for device tree, we must have the
>> wakeirq property for the PCIe slot for the struct device managing
>> that slot,
>
> Which doesn't exist.
>
>> and not for the child device driver. I think it's
>> already this way in the most recent set of patches, I need to
>> look again.
>
> No, you need a wakeirq properly for the child *device* and that
> property will be consumed by the PCI layer.

Or, if you use the convention mentioned in another message in this
thread, you can make the wakeirq be a property of a bridge (port) with
the clarification of the assumption that WAKE# is shared by all
functions below the bridge.  So the presence of the "wakeirq" property
for a bridge (in addition to providing the wakeup IRQ) will mean that
it applies to all devices below the bridge.

In the case of parallel PCI (not PCIe), there may be multiple "slots"
(or "PCI devices" consisting each of multiple functions) under one
bridge and in theory each of them may use a different IRQ for WAKE#
signaling, so the above convention will not work then.

Thanks,
Rafael
Tony Lindgren Jan. 3, 2018, 8 p.m. UTC | #6
* Rafael J. Wysocki <rafael@kernel.org> [171230 00:24]:
> On Sat, Dec 30, 2017 at 12:39 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > No, you need a wakeirq properly for the child *device* and that
> > property will be consumed by the PCI layer.
> 
> Or, if you use the convention mentioned in another message in this
> thread, you can make the wakeirq be a property of a bridge (port) with
> the clarification of the assumption that WAKE# is shared by all
> functions below the bridge.  So the presence of the "wakeirq" property
> for a bridge (in addition to providing the wakeup IRQ) will mean that
> it applies to all devices below the bridge.

Yes that makes sense from device tree point of view.

> In the case of parallel PCI (not PCIe), there may be multiple "slots"
> (or "PCI devices" consisting each of multiple functions) under one
> bridge and in theory each of them may use a different IRQ for WAKE#
> signaling, so the above convention will not work then.

OK. In that case the wakeirq would have to be mapped to each
PCI device if there is no other way to describe where the slot
WAKE# line is wired to. I guess the PCI controller could go
through the child devices and map the wakeirqs that way if
needed.

Regards,

Tony
Tony Lindgren Jan. 3, 2018, 8:08 p.m. UTC | #7
Hi,

* Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]:
> So we are fine except for the race and we need the wakeirq field in wakeup
> sources to automatically arm the wakeup IRQs during suspend.
> 
> If I'm not mistaken, we only need something like the patch below (untested).

Yeah for your patch below works just fine for my
test cases so:

Tested-by: Tony Lindgren <tony@atomide.com>

> ---
>  drivers/base/power/wakeirq.c |    9 ++++-----
>  drivers/base/power/wakeup.c  |    2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> Index: linux-pm/drivers/base/power/wakeirq.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/wakeirq.c
> +++ linux-pm/drivers/base/power/wakeirq.c
> @@ -33,7 +33,6 @@ static int dev_pm_attach_wake_irq(struct
>  				  struct wake_irq *wirq)
>  {
>  	unsigned long flags;
> -	int err;
>  
>  	if (!dev || !wirq)
>  		return -EINVAL;
> @@ -45,12 +44,12 @@ static int dev_pm_attach_wake_irq(struct
>  		return -EEXIST;
>  	}
>  
> -	err = device_wakeup_attach_irq(dev, wirq);
> -	if (!err)
> -		dev->power.wakeirq = wirq;
> +	dev->power.wakeirq = wirq;
> +	if (dev->power.wakeup)
> +		device_wakeup_attach_irq(dev, wirq);
>  
>  	spin_unlock_irqrestore(&dev->power.lock, flags);
> -	return err;
> +	return 0;
>  }
>  
>  /**
> Index: linux-pm/drivers/base/power/wakeup.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/wakeup.c
> +++ linux-pm/drivers/base/power/wakeup.c
> @@ -303,7 +303,7 @@ int device_wakeup_attach_irq(struct devi
>  	}
>  
>  	if (ws->wakeirq)
> -		return -EEXIST;
> +		dev_err(dev, "Leftover wakeup IRQ found, overriding\n");
>  
>  	ws->wakeirq = wakeirq;
>  	return 0;
>
Rafael J. Wysocki Jan. 5, 2018, 12:11 a.m. UTC | #8
On Wednesday, January 3, 2018 9:08:37 PM CET Tony Lindgren wrote:
> Hi,
> 
> * Rafael J. Wysocki <rjw@rjwysocki.net> [171228 17:33]:
> > So we are fine except for the race and we need the wakeirq field in wakeup
> > sources to automatically arm the wakeup IRQs during suspend.
> > 
> > If I'm not mistaken, we only need something like the patch below (untested).
> 
> Yeah for your patch below works just fine for my
> test cases so:
> 
> Tested-by: Tony Lindgren <tony@atomide.com>

Cool, thanks!

Let me respin it with a subject/changelog.

Thanks,
Rafael
diff mbox

Patch

Index: linux-pm/drivers/base/power/wakeirq.c
===================================================================
--- linux-pm.orig/drivers/base/power/wakeirq.c
+++ linux-pm/drivers/base/power/wakeirq.c
@@ -33,7 +33,6 @@  static int dev_pm_attach_wake_irq(struct
 				  struct wake_irq *wirq)
 {
 	unsigned long flags;
-	int err;
 
 	if (!dev || !wirq)
 		return -EINVAL;
@@ -45,12 +44,12 @@  static int dev_pm_attach_wake_irq(struct
 		return -EEXIST;
 	}
 
-	err = device_wakeup_attach_irq(dev, wirq);
-	if (!err)
-		dev->power.wakeirq = wirq;
+	dev->power.wakeirq = wirq;
+	if (dev->power.wakeup)
+		device_wakeup_attach_irq(dev, wirq);
 
 	spin_unlock_irqrestore(&dev->power.lock, flags);
-	return err;
+	return 0;
 }
 
 /**
Index: linux-pm/drivers/base/power/wakeup.c
===================================================================
--- linux-pm.orig/drivers/base/power/wakeup.c
+++ linux-pm/drivers/base/power/wakeup.c
@@ -303,7 +303,7 @@  int device_wakeup_attach_irq(struct devi
 	}
 
 	if (ws->wakeirq)
-		return -EEXIST;
+		dev_err(dev, "Leftover wakeup IRQ found, overriding\n");
 
 	ws->wakeirq = wakeirq;
 	return 0;