Message ID | 4560ddb26a2bb15eeaf8eab565ac41c35e36d092.1458126755.git.lukas@wunner.de (mailing list archive) |
---|---|
State | RFC, archived |
On Wed, 16 Mar 2016, Lukas Wunner wrote:

> Document and implement Apple's ACPI-based (but nonstandard) mechanism
> to power the controller up and down as needed.
>
> This fixes (at least partially) a power regression introduced in
> Linux 3.17 by 7bc5a2bad0b8 ("ACPI: Support _OSI("Darwin") correctly").
>
> A Thunderbolt controller consists of an NHI (Native Host Interface) and
> a set of bridges. Power is cut to the entire chip. The Linux pm model
> assumes that runtime pm is governed by the parent device, i.e. the
> upstream bridge driver, pcieport. In violation of this model we let a
> child govern it, i.e. the NHI driver thunderbolt.ko. The traditional

The NHI driver is bound to bridge 0?  Your diagram indicates this but
you don't say so explicitly.

> hierarchical pm model is defeated by setting ignore_children on the
> upstream bridge and downstream bridge 0, and by having the NHI update
> all the bridges' runtime pm state in unison with itself. It is also the
> NHI driver's job to save and restore PCI state of the bridges.
>
>   PCIe Port --- Upstream Bridge --+
>                                   |
>                                   +-- Downstream Bridge 0 --+
>                                   |                         |
>                                   |                         +-- NHI
>                                   |
>                                   +-- Downstream Bridge 1 ...
>                                   |
>                                   +-- Downstream Bridge 2 ...  hotplugged
>                                   |                             devices
>                                   +-- Downstream Bridge 3 ...
>                                   |
>                                   +-- Downstream Bridge 4 ...

This may be a naive question: The diagram indicates a single upstream
bridge attached to a bunch of downstream bridges with nothing in
between.  Is that really how the kernel treats Thunderbolt controllers?

In all other controllers that I'm familiar with, there's a device to
represent the controller, another device representing its upward link,
and a bunch of devices representing the downward links.  The analogous
approach here would make bridges 1 ... n children of bridge 0 (which
sounds strange but might make more sense in the end).

The way you're doing it, how does the NHI driver know when to go into
suspend?  The runtime PM core won't notify it when all the hotplugged
devices attached to the other bridges have been suspended, since it's
not their parent.

> The PCI subsystem pm_ops do not work properly for devices which can be
> put into D3cold by some other means than the standard _PSx ACPI platform
> methods: We do not want to wake up the chip before system sleep, yet
> pci_pm_prepare() does not return 1 as it should since pci_target_state()
> returns D3hot. We solve this by overriding pci_pm_prepare() using power
> domains. They are assigned to the bridges using a PCI quirk. We also do
> not want to wake the chip after system resume as pci_pm_complete() does,
> so we override that as well. Note that we can never remove and free the
> dev_pm_domain assigned to the bridges as there is no PCI remove fixup
> section. We also cannot bail out of the ->probe callback if allocation
> of the struct dev_pm_domain fails since the PCI enable fixup does not
> allow return values to be passed back.
>
> It might be possible to implement a less kludgy solution which adheres
> to the hierarchical pm model and does not need a PCI enable quirk for
> the bridges if pcieport had runtime pm support both for itself and
> any service drivers registering with it. The runtime pm code could
> then be moved from the NHI to a new Thunderbolt service driver that
> gets used on the upstream bridge.

Or you could interpose another device structure between the upstream
bridge and all the downstream bridges.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Alan, On Wed, Mar 16, 2016 at 11:26:54AM -0400, Alan Stern wrote: > On Wed, 16 Mar 2016, Lukas Wunner wrote: > > > Document and implement Apple's ACPI-based (but nonstandard) mechanism > > to power the controller up and down as needed. > > > > This fixes (at least partially) a power regression introduced in > > Linux 3.17 by 7bc5a2bad0b8 ("ACPI: Support _OSI("Darwin") correctly"). > > > > A Thunderbolt controller consists of an NHI (Native Host Interface) and > > a set of bridges. Power is cut to the entire chip. The Linux pm model > > assumes that runtime pm is governed by the parent device, i.e. the > > upstream bridge driver, pcieport. In violation of this model we let a > > child govern it, i.e. the NHI driver thunderbolt.ko. The traditional > > The NHI driver is bound to bridge 0? Your diagram indicates this but > you don't say so explicitly. No, the NHI driver is bound to the NHI, that's a PCI device sitting on a bus behind Downstream Bridge 0. E.g. on a MacBookPro11,3 with a Falcon Ridge 4C it looks like this: Upstream Bridge: 06:00.0 PCI bridge [0604]: Intel Corporation Device [8086:156d] Bus: primary=06, secondary=07, subordinate=6c, sec-latency=0 Downstream Bridges: 07:00.0 PCI bridge [0604]: Intel Corporation Device [8086:156d] Bus: primary=07, secondary=08, subordinate=08, sec-latency=0 07:03.0 PCI bridge [0604]: Intel Corporation Device [8086:156d] Bus: primary=07, secondary=09, subordinate=39, sec-latency=0 07:04.0 PCI bridge [0604]: Intel Corporation Device [8086:156d] Bus: primary=07, secondary=3a, subordinate=3a, sec-latency=0 07:05.0 PCI bridge [0604]: Intel Corporation Device [8086:156d] Bus: primary=07, secondary=3b, subordinate=6b, sec-latency=0 07:06.0 PCI bridge [0604]: Intel Corporation Device [8086:156d] Bus: primary=07, secondary=6c, subordinate=6c, sec-latency=0 NHI: 08:00.0 System peripheral [0880]: Intel Corporation Device [8086:156c] Subsystem: Device [2222:1111] > > hierarchical pm model is defeated by setting ignore_children on 
the > > upstream bridge and downstream bridge 0, and by having the NHI update > > all the bridges' runtime pm state in unison with itself. It is also the > > NHI driver's job to save and restore PCI state of the bridges. > > > > PCIe Port --- Upstream Bridge --+ > > | > > +-- Downstream Bridge 0 --+ > > | | > > | +-- NHI > > | > > +-- Downstream Bridge 1 ... > > | > > +-- Downstream Bridge 2 ... hotplugged > > | devices > > +-- Downstream Bridge 3 ... > > | > > +-- Downstream Bridge 4 ... > > This may be a naive question: The diagram indicates a single upstream > bridge attached to a bunch of downstream bridges with nothing in > between. Is that really how the kernel treats Thunderbolt controllers? There's a bus in-between, bus 07 in the example above. Buses are signified by a vertical line in this ascii drawing. > In all other controllers that I'm familiar with, there's a device to > represent the controller, another device representing its upward link, > and a bunch of devices representing the downward links. The analogous > approach here would make bridges 1 ... n children of bridge 0 (which > sounds strange but might make more sense in the end). > > The way you're doing it, how does the NHI driver know when to go into > suspend? The runtime PM core won't notify it when all the hotplugged > devices attached to the other bridges have been suspended, since it's > not their parent. The NHI knows when something is plugged in, it talks to the switches in devices that are hotplugged to the controller. As I've explained in the lengthy comment in the middle of patch [4/4], we acquire a runtime pm ref for each switch that is plugged in and release one whenever a switch is unplugged. 
> > The PCI subsystem pm_ops do not work properly for devices which can be
> > put into D3cold by some other means than the standard _PSx ACPI platform
> > methods: We do not want to wake up the chip before system sleep, yet
> > pci_pm_prepare() does not return 1 as it should since pci_target_state()
> > returns D3hot. We solve this by overriding pci_pm_prepare() using power
> > domains. They are assigned to the bridges using a PCI quirk. We also do
> > not want to wake the chip after system resume as pci_pm_complete() does,
> > so we override that as well. Note that we can never remove and free the
> > dev_pm_domain assigned to the bridges as there is no PCI remove fixup
> > section. We also cannot bail out of the ->probe callback if allocation
> > of the struct dev_pm_domain fails since the PCI enable fixup does not
> > allow return values to be passed back.
> >
> > It might be possible to implement a less kludgy solution which adheres
> > to the hierarchical pm model and does not need a PCI enable quirk for
> > the bridges if pcieport had runtime pm support both for itself and
> > any service drivers registering with it. The runtime pm code could
> > then be moved from the NHI to a new Thunderbolt service driver that
> > gets used on the upstream bridge.
>
> Or you could interpose another device structure between the upstream
> bridge and all the downstream bridges.

How? The structure is predetermined by the way the PCI devices and
bridges are connected to each other. That was Intel's idea.

Best regards,

Lukas
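The reference counting Lukas describes in this reply (take one runtime pm
reference per plugged-in switch, drop it on unplug) can be illustrated with
a small userspace model. This is only a sketch and not the driver code: the
struct and function names are hypothetical, and the real patch uses
pm_runtime_get()/pm_runtime_put() plus an autosuspend delay rather than
powering down immediately.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the controller's runtime pm state. */
struct controller {
	int usage_count;	/* stands in for the runtime pm usage counter */
	bool powered;		/* true while the chip has power */
};

/* Take a reference; the controller is powered up if it was down. */
void ctrl_get(struct controller *c)
{
	c->usage_count++;
	c->powered = true;
}

/*
 * Drop a reference; with no references left the controller may power
 * down. (The real driver would wait for the autosuspend delay first.)
 */
void ctrl_put(struct controller *c)
{
	assert(c->usage_count > 0);
	if (--c->usage_count == 0)
		c->powered = false;
}

/* One reference per non-root switch, as described in the thread. */
void switch_plugged(struct controller *c)   { ctrl_get(c); }
void switch_unplugged(struct controller *c) { ctrl_put(c); }
```

In this model the controller stays powered for as long as at least one
switch is attached, which is the behaviour Alan asks about in his
follow-up.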
On Wed, 16 Mar 2016, Lukas Wunner wrote:

> > In all other controllers that I'm familiar with, there's a device to
> > represent the controller, another device representing its upward link,
> > and a bunch of devices representing the downward links. The analogous
> > approach here would make bridges 1 ... n children of bridge 0 (which
> > sounds strange but might make more sense in the end).
> >
> > The way you're doing it, how does the NHI driver know when to go into
> > suspend? The runtime PM core won't notify it when all the hotplugged
> > devices attached to the other bridges have been suspended, since it's
> > not their parent.
>
> The NHI knows when something is plugged in, it talks to the switches
> in devices that are hotplugged to the controller. As I've explained
> in the lengthy comment in the middle of patch [4/4], we acquire a
> runtime pm ref for each switch that is plugged in and release one
> whenever a switch is unplugged.

If I understand correctly, that means you allow the Thunderbolt
controller to go into runtime suspend only when nothing is plugged into
any of the ports.  Is that right?  It's quite inefficient.

> > > The PCI subsystem pm_ops do not work properly for devices which can be
> > > put into D3cold by some other means than the standard _PSx ACPI platform
> > > methods: We do not want to wake up the chip before system sleep, yet
> > > pci_pm_prepare() does not return 1 as it should since pci_target_state()
> > > returns D3hot. We solve this by overriding pci_pm_prepare() using power
> > > domains. They are assigned to the bridges using a PCI quirk. We also do
> > > not want to wake the chip after system resume as pci_pm_complete() does,
> > > so we override that as well. Note that we can never remove and free the
> > > dev_pm_domain assigned to the bridges as there is no PCI remove fixup
> > > section. We also cannot bail out of the ->probe callback if allocation
> > > of the struct dev_pm_domain fails since the PCI enable fixup does not
> > > allow return values to be passed back.
> > >
> > > It might be possible to implement a less kludgy solution which adheres
> > > to the hierarchical pm model and does not need a PCI enable quirk for
> > > the bridges if pcieport had runtime pm support both for itself and
> > > any service drivers registering with it. The runtime pm code could
> > > then be moved from the NHI to a new Thunderbolt service driver that
> > > gets used on the upstream bridge.
> >
> > Or you could interpose another device structure between the upstream
> > bridge and all the downstream bridges.
>
> How? The structure is predetermined by the way the PCI devices and
> bridges are connected to each other. That was Intel's idea.

What I'm getting at is that we should have proper runtime-PM support
for bridges, i.e., I agree with what you wrote above.  A bridge can
safely go into runtime suspend when there are no unsuspended devices
attached to any of its downstream ports.  (That's how the USB hub
driver works, for instance.)  Doing things that way would make
everything simpler in the long run.

So my suggestion is that you change over to the "less kludgy solution"
and work on that instead.

Alan Stern
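The hierarchical rule Alan describes (a bridge may suspend only when nothing
below it is still active) can be sketched as a toy device tree in plain C.
This is an illustrative model under simplifying assumptions, not the runtime
PM core: all names are hypothetical, there is no autosuspend delay, and a
parent is suspended as soon as its last active child goes idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device in the PM hierarchy. */
struct node {
	struct node *parent;
	int active_children;	/* children that are not suspended */
	bool suspended;
};

/* Register a new, active node below an optional parent. */
void node_init(struct node *n, struct node *parent)
{
	n->parent = parent;
	n->active_children = 0;
	n->suspended = false;
	if (parent)
		parent->active_children++;
}

/* Suspend n if no child is active, then give the parent a chance too. */
bool node_try_suspend(struct node *n)
{
	assert(n->active_children >= 0);
	if (n->suspended)
		return true;
	if (n->active_children > 0)
		return false;	/* something downstream is still in use */
	n->suspended = true;
	if (n->parent) {
		n->parent->active_children--;
		node_try_suspend(n->parent);
	}
	return true;
}

/* Resume n; every ancestor has to be resumed first. */
void node_resume(struct node *n)
{
	if (!n->suspended)
		return;
	if (n->parent) {
		node_resume(n->parent);
		n->parent->active_children++;
	}
	n->suspended = false;
}
```

With an upstream bridge, one downstream bridge and one hotplugged device,
suspending the device lets both bridges follow, while an attempt to suspend
either bridge directly is refused as long as the device is active.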
On Wed, Mar 16, 2016 at 3:50 PM, Lukas Wunner <lukas@wunner.de> wrote: > Document and implement Apple's ACPI-based (but nonstandard) mechanism > to power the controller up and down as needed. > > This fixes (at least partially) a power regression introduced in > Linux 3.17 by 7bc5a2bad0b8 ("ACPI: Support _OSI("Darwin") correctly"). > > A Thunderbolt controller consists of an NHI (Native Host Interface) and > a set of bridges. Power is cut to the entire chip. The Linux pm model > assumes that runtime pm is governed by the parent device, i.e. the > upstream bridge driver, pcieport. In violation of this model we let a > child govern it, i.e. the NHI driver thunderbolt.ko. The traditional > hierarchical pm model is defeated by setting ignore_children on the > upstream bridge and downstream bridge 0, and by having the NHI update > all the bridges' runtime pm state in unison with itself. It is also the > NHI driver's job to save and restore PCI state of the bridges. > > PCIe Port --- Upstream Bridge --+ > | > +-- Downstream Bridge 0 --+ > | | > | +-- NHI > | > +-- Downstream Bridge 1 ... > | > +-- Downstream Bridge 2 ... hotplugged > | devices > +-- Downstream Bridge 3 ... > | > +-- Downstream Bridge 4 ... > > The PCI subsystem pm_ops do not work properly for devices which can be > put into D3cold by some other means than the standard _PSx ACPI platform > methods: We do not want to wake up the chip before system sleep, yet > pci_pm_prepare() does not return 1 as it should since pci_target_state() > returns D3hot. We solve this by overriding pci_pm_prepare() using power > domains. They are assigned to the bridges using a PCI quirk. We also do > not want to wake the chip after system resume as pci_pm_complete() does, > so we override that as well. Note that we can never remove and free the > dev_pm_domain assigned to the bridges as there is no PCI remove fixup > section. 
We also cannot bail out of the ->probe callback if allocation
> of the struct dev_pm_domain fails since the PCI enable fixup does not
> allow return values to be passed back.
>
> It might be possible to implement a less kludgy solution which adheres
> to the hierarchical pm model and does not need a PCI enable quirk for
> the bridges if pcieport had runtime pm support both for itself and
> any service drivers registering with it. The runtime pm code could
> then be moved from the NHI to a new Thunderbolt service driver that
> gets used on the upstream bridge.

Hi Lukas,

thanks for implementing this. I have tested it on my MacBook Pro with
Cactus Ridge and got it to work with a few modifications. Saves about
4 watts of power for me!

 - My firmware does not provide the TRPE ACPI method, only XRPE. So
either TRPE is only post Cactus Ridge or it is only present in newer
MBPs. In any case the OS X driver looks for TRPE first and uses XRPE
only if TRPE does not exist. I suggest we do the same (but see below
for TRPE).
 - The XRIN GPE fired immediately after the power was cut. The problem
seems to be that the controller takes a bit to shut down. The solution
is to poll until XRIL returns 1 before activating the GPE. On "Type 2"
devices the OS X driver polls up to 300 times with a 1 ms sleep in
between (for me 1 or 2 iterations were always enough). Afaik no
polling is done on "Type 1" devices. (Fun fact: Compiling with the
kernel address sanitizer makes the kernel go slow enough such that
this is not necessary :)). Also the OS X interrupt handler checks XRIL
and only wakes up the device if it returns 0. This was not necessary
to do on my model - but maybe spurious interrupts can happen with
newer controllers?

Concerning TRPE style hardware: It seems that pm is more complicated
here. I see a bunch of references to SX* ACPI methods (SXFP, SXLV,
SXIO) and have not yet figured out what they do. Maybe we should not
enable PM if XRPE is not present until we find someone to test it.

I don't have any experience with the runtime pm core. But the
thunderbolt side looks good.

As you have noted the "correct" place to put this logic would be at
the upstream bridge. Ideally the downstream bridges should go into
D3hot by themselves if no devices are attached. The NHI as well
(did you by chance check whether the NHI can be put into D3hot without
killing the thunderbolt tunnels?). And then the upstream bridge would
go to D3cold (and thus power down the whole subtree).

If I recall correctly there were two problems:
 1. PCI bridges do currently not suspend themselves at all
 2. How to teach the upstream bridge about D3cold.

(1) should be possible to fix? For (2): D3cold always requires a
platform specific mechanism and the pci subsystem only supports ACPI.
Would it be possible to add an API to tell the pci subsystem that we
know how to put a specific device(tree) into D3cold from a platform
driver [+CC Bjorn]? Then this whole thing would become a normal pci
suspend operation.

Regards, Andreas > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=92111 > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > Cc: Andreas Noever <andreas.noever@gmail.com> > Signed-off-by: Lukas Wunner <lukas@wunner.de> > --- > drivers/pci/quirks.c | 35 ++++++ > drivers/thunderbolt/Kconfig | 2 +- > drivers/thunderbolt/nhi.c | 4 + > drivers/thunderbolt/nhi.h | 3 + > drivers/thunderbolt/power.c | 247 +++++++++++++++++++++++++++++++++++++++++++ > drivers/thunderbolt/power.h | 3 + > drivers/thunderbolt/switch.c | 9 ++ > drivers/thunderbolt/tb.c | 6 ++ > 8 files changed, 308 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index d1e3956..a007485 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -25,6 +25,7 @@ > #include <linux/sched.h> > #include <linux/ktime.h> > #include <linux/mm.h> > +#include <linux/pm_domain.h> > #include <asm/dma.h> /* isa_dma_bridge_buggy */ > #include "pci.h" > > @@ -3255,6 +3256,40 @@ DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, > DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, > PCI_DEVICE_ID_INTEL_FALCON_RIDGE_4C_BRIDGE, > quirk_apple_wait_for_thunderbolt); > + > +static int bridge_prepare(struct device *dev) > +{ > + return 1; /* stay asleep if already runtime suspended */ > +} > + > +static void quirk_apple_thunderbolt_runpm(struct pci_dev *dev) > +{ > + struct dev_pm_domain *bridge_pm_domain; > + > + if (!dmi_match(DMI_BOARD_VENDOR, "Apple Inc.")) > + return; > + if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) > + return; > + if (dev->dev.pm_domain) > + return; Bridges in Hotplugged TB devices might have the same PCI ids as the "root" bridges (if they use the same TB chip). You probably should check that dev is a bridge of the builtin controller (for example by checking for the presence of ACPI methods, see the comment in the other tb quirks). 
> + > + bridge_pm_domain = kzalloc(sizeof(*bridge_pm_domain), GFP_KERNEL); > + if (!bridge_pm_domain) { > + dev_err(&dev->dev, "cannot allocate pm_domain\n"); > + return; > + } > + > + bridge_pm_domain->ops = *pci_bus_type.pm; > + bridge_pm_domain->ops.prepare = bridge_prepare; > + bridge_pm_domain->ops.complete = NULL; > + dev_pm_domain_set(&dev->dev, bridge_pm_domain); > +} > +DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_INTEL, > + PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, > + quirk_apple_thunderbolt_runpm); > +DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_INTEL, > + PCI_DEVICE_ID_INTEL_FALCON_RIDGE_4C_BRIDGE, > + quirk_apple_thunderbolt_runpm); > #endif > > static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f, > diff --git a/drivers/thunderbolt/Kconfig b/drivers/thunderbolt/Kconfig > index c121acc..40335f7 100644 > --- a/drivers/thunderbolt/Kconfig > +++ b/drivers/thunderbolt/Kconfig > @@ -1,6 +1,6 @@ > menuconfig THUNDERBOLT > tristate "Thunderbolt support for Apple devices" > - depends on PCI > + depends on PCI && ACPI > select CRC32 > help > Cactus Ridge Thunderbolt Controller driver > diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c > index fa89160..964b006 100644 > --- a/drivers/thunderbolt/nhi.c > +++ b/drivers/thunderbolt/nhi.c > @@ -588,6 +588,8 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id) > } > pci_set_drvdata(pdev, tb); > > + nhi_runtime_pm_init(nhi); > + > return 0; > } > > @@ -595,6 +597,8 @@ static void nhi_remove(struct pci_dev *pdev) > { > struct tb *tb = pci_get_drvdata(pdev); > struct tb_nhi *nhi = tb->nhi; > + > + nhi_runtime_pm_fini(nhi); > thunderbolt_shutdown_and_free(tb); > nhi_shutdown(nhi); > } > diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h > index 3172429..dd725f7 100644 > --- a/drivers/thunderbolt/nhi.h > +++ b/drivers/thunderbolt/nhi.h > @@ -7,6 +7,7 @@ > #ifndef DSL3510_H_ > #define DSL3510_H_ > > +#include <linux/acpi.h> > #include <linux/mutex.h> > #include 
<linux/workqueue.h> > > @@ -25,6 +26,8 @@ struct tb_nhi { > struct tb_ring **rx_rings; > struct work_struct interrupt_work; > u32 hop_count; /* Number of rings (end point hops) supported by NHI. */ > + unsigned long long wake_gpe; /* Hotplug interrupt during powerdown. */ > + acpi_handle set_power; /* Method to power controller up/down. */ > }; > > /** > diff --git a/drivers/thunderbolt/power.c b/drivers/thunderbolt/power.c > index 1095ad0..cc83940 100644 > --- a/drivers/thunderbolt/power.c > +++ b/drivers/thunderbolt/power.c > @@ -2,11 +2,15 @@ > * Thunderbolt Cactus Ridge driver - power management > * > * Copyright (c) 2014 Andreas Noever <andreas.noever@gmail.com> > + * Copyright (c) 2016 Lukas Wunner <lukas@wunner.de> > */ > > +#include <linux/delay.h> > #include <linux/pci.h> > +#include <linux/pm_domain.h> > #include <linux/pm_runtime.h> > > +#include "nhi.h" > #include "tb.h" > > static int nhi_suspend_noirq(struct device *dev) > @@ -39,3 +43,246 @@ const struct dev_pm_ops nhi_pm_ops = { > */ > .restore_noirq = nhi_resume_noirq, > }; > + > +/* > + * Runtime Power Management > + * > + * Apple provides the following means for runtime pm in ACPI: > + * > + * * XRPE method (TRPE on Cactus Ridge and newer), takes argument 1 or 0, > + * toggles a GPIO pin to switch the controller on or off. > + * * XRIN named object (alternatively _GPE), contains number of a GPE which > + * fires as long as something is plugged in (regardless of power state). > + * * XRIL method returns 0 as long as something is plugged in, 1 otherwise. > + * * XRIP + XRIO methods, unused by OS X driver. (Flip interrupt polarity?) > + * > + * If there are multiple Thunderbolt controllers (e.g. MacPro6,1), each NHI > + * device has a separate XRIN GPE and separate instances of these methods. > + * > + * We acquire a runtime pm ref for each newly allocated switch (except for > + * the root switch) and drop one when a switch is freed. 
The controller is > + * thus powered up as long as something is plugged in. This behaviour is > + * identical to the OS X driver. > + * > + * Powering the controller down is almost instantaneous, but powering up takes > + * about 2 sec. To handle situations gracefully where a device is unplugged > + * and immediately replaced by another one, we afford a grace period of 10 sec > + * before powering down. This autosuspend_delay_ms may be reduced to 0 via > + * sysfs and to handle that properly we need to wait during runtime_resume > + * since it takes about 0.7 sec after resuming until a hotplug event appears. > + * > + * When the system wakes from suspend-to-RAM, the controller's power state is > + * as it was before. However if it was powered down, calling XRPE once to power > + * it up is not sufficient: An additional call to XRPE is necessary to reset > + * the power switch first. > + */ > + > +static int nhi_prepare(struct device *dev) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + struct tb *tb = pci_get_drvdata(pdev); > + acpi_status res; > + > + if (pm_runtime_active(dev)) > + return 0; > + > + res = acpi_disable_gpe(NULL, tb->nhi->wake_gpe); > + if (ACPI_FAILURE(res)) { > + dev_err(dev, "cannot disable wake GPE, resuming\n"); > + return 0; > + } else > + return 1; /* stay asleep if already runtime suspended */ > +} > + > +static void nhi_complete(struct device *dev) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + struct tb *tb = pci_get_drvdata(pdev); > + acpi_status res; > + > + if (pm_runtime_active(dev)) > + return; > + > + tb_info(tb, "resetting power switch\n"); > + res = acpi_execute_simple_method(tb->nhi->set_power, NULL, 0); > + if (ACPI_FAILURE(res)) { > + dev_err(dev, "cannot call set_power method\n"); > + dev->power.runtime_error = -ENODEV; > + } > + > + res = acpi_enable_gpe(NULL, tb->nhi->wake_gpe); > + if (ACPI_FAILURE(res)) { > + dev_err(dev, "cannot enable wake GPE, resuming\n"); > + pm_request_resume(dev); > + } > +} > + > +static 
int pci_save_state_cb(struct pci_dev *pdev, void *ptr) > +{ > + pci_save_state(pdev); > + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) { > + pm_runtime_disable(&pdev->dev); > + pm_runtime_set_suspended(&pdev->dev); > + pm_runtime_enable(&pdev->dev); > + } > + pdev->current_state = PCI_D3cold; > + return 0; > +} > + > +static int pci_restore_state_cb(struct pci_dev *pdev, void *ptr) > +{ > + pdev->current_state = PCI_D0; > + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) { > + pm_runtime_disable(&pdev->dev); > + pm_runtime_set_active(&pdev->dev); > + pm_runtime_enable(&pdev->dev); > + } > + pci_restore_state(pdev); > + return 0; > +} > + > +static int nhi_runtime_suspend(struct device *dev) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + struct pci_bus *upstream_bridge = pdev->bus->parent->parent; > + struct tb *tb = pci_get_drvdata(pdev); > + acpi_status res; > + > + if (!pdev->d3cold_allowed) > + return -EAGAIN; > + > + thunderbolt_suspend(tb); > + pci_walk_bus(upstream_bridge, pci_save_state_cb, NULL); > + > + tb_info(tb, "powering down\n"); > + res = acpi_execute_simple_method(tb->nhi->set_power, NULL, 0); > + if (ACPI_FAILURE(res)) { > + dev_err(dev, "cannot call set_power method, resuming\n"); > + goto err; > + } > + > + res = acpi_enable_gpe(NULL, tb->nhi->wake_gpe); > + if (ACPI_FAILURE(res)) { > + dev_err(dev, "cannot enable wake GPE, resuming\n"); > + goto err; > + } > + > + return 0; > + > +err: > + acpi_execute_simple_method(tb->nhi->set_power, NULL, 1); > + pci_walk_bus(upstream_bridge, pci_restore_state_cb, NULL); > + thunderbolt_resume(tb); > + return -EAGAIN; > +} > + > +static int nhi_runtime_resume(struct device *dev) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + struct pci_bus *upstream_bridge = pdev->bus->parent->parent; > + struct tb *tb = pci_get_drvdata(pdev); > + acpi_status res; > + > + if (system_state >= SYSTEM_HALT) > + return -ESHUTDOWN; > + > + res = acpi_disable_gpe(NULL, tb->nhi->wake_gpe); > + if (ACPI_FAILURE(res)) { 
> + dev_err(dev, "cannot disable wake GPE, disabling runtime pm\n"); > + pm_runtime_disable(dev); > + } > + > + tb_info(tb, "powering up\n"); > + res = acpi_execute_simple_method(tb->nhi->set_power, NULL, 1); > + if (ACPI_FAILURE(res)) { > + dev_err(dev, "cannot call set_power method\n"); > + return -ENODEV; > + } > + > + pci_walk_bus(upstream_bridge, pci_restore_state_cb, NULL); > + thunderbolt_resume(tb); > + msleep(1500); /* allow 1.5 sec for hotplug event to arrive */ > + pm_runtime_mark_last_busy(dev); > + > + return 0; > +} > + > +static u32 nhi_runtime_wake(acpi_handle gpe_device, u32 gpe_number, void *ctx) > +{ > + struct device *dev = ctx; > + WARN_ON(pm_request_resume(dev) < 0); > + return ACPI_INTERRUPT_HANDLED; > +} > + > +static struct dev_pm_domain nhi_pm_domain; > + > +void nhi_runtime_pm_init(struct tb_nhi *nhi) > +{ > + struct device *dev = &nhi->pdev->dev; > + struct acpi_handle *nhi_handle = ACPI_HANDLE(dev); > + acpi_status res; > + > + /* gen 1 controllers use XRPE, gen 2+ controllers use TRPE */ > + if (nhi->pdev->device <= PCI_DEVICE_ID_INTEL_EAGLE_RIDGE) > + res = acpi_get_handle(nhi_handle, "XRPE", &nhi->set_power); > + else > + res = acpi_get_handle(nhi_handle, "TRPE", &nhi->set_power); > + if (ACPI_FAILURE(res)) { > + dev_warn(dev, "cannot find set_power method, disabling runtime pm\n"); > + goto err; > + } > + > + res = acpi_evaluate_integer(nhi_handle, "XRIN", NULL, &nhi->wake_gpe); > + if (ACPI_FAILURE(res)) { > + dev_warn(dev, "cannot find wake GPE, disabling runtime pm\n"); > + goto err; > + } > + > + res = acpi_install_gpe_handler(NULL, nhi->wake_gpe, > + ACPI_GPE_LEVEL_TRIGGERED, > + nhi_runtime_wake, dev); > + if (ACPI_FAILURE(res)) { > + dev_warn(dev, "cannot install GPE handler, disabling runtime pm\n"); > + goto err; > + } > + > + nhi_pm_domain.ops = *pci_bus_type.pm; > + nhi_pm_domain.ops.prepare = nhi_prepare; > + nhi_pm_domain.ops.complete = nhi_complete; > + nhi_pm_domain.ops.runtime_suspend = nhi_runtime_suspend; > + 
nhi_pm_domain.ops.runtime_resume = nhi_runtime_resume; > + dev_pm_domain_set(dev, &nhi_pm_domain); > + > + /* apply to upstream bridge and downstream bridge 0 */ > + pm_suspend_ignore_children(dev->parent->parent, true); > + pm_suspend_ignore_children(dev->parent, true); > + > + pm_runtime_allow(dev); > + pm_runtime_set_autosuspend_delay(dev, 10000); > + pm_runtime_use_autosuspend(dev); > + pm_runtime_mark_last_busy(dev); > + pm_runtime_put(dev); > + return; > + > +err: > + nhi->wake_gpe = -1; > + if (pm_runtime_enabled(dev)) > + pm_runtime_disable(dev); > +} > + > +void nhi_runtime_pm_fini(struct tb_nhi *nhi) > +{ > + struct device *dev = &nhi->pdev->dev; > + acpi_status res; > + > + if (nhi->wake_gpe == -1) > + return; > + > + res = acpi_remove_gpe_handler(NULL, nhi->wake_gpe, nhi_runtime_wake); > + if (ACPI_FAILURE(res)) > + dev_warn(dev, "cannot remove GPE handler\n"); > + > + pm_runtime_get(dev); > + pm_runtime_forbid(dev); > + dev_pm_domain_set(dev, NULL); > +} > diff --git a/drivers/thunderbolt/power.h b/drivers/thunderbolt/power.h > index 99cb900..4fc836d 100644 > --- a/drivers/thunderbolt/power.h > +++ b/drivers/thunderbolt/power.h > @@ -11,4 +11,7 @@ > > extern const struct dev_pm_ops nhi_pm_ops; > > +void nhi_runtime_pm_fini(struct tb_nhi *nhi); > +void nhi_runtime_pm_init(struct tb_nhi *nhi); > + > #endif > diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c > index c6270f0..e9be3d5 100644 > --- a/drivers/thunderbolt/switch.c > +++ b/drivers/thunderbolt/switch.c > @@ -5,6 +5,7 @@ > */ > > #include <linux/delay.h> > +#include <linux/pm_runtime.h> > #include <linux/slab.h> > > #include "tb.h" > @@ -326,6 +327,11 @@ void tb_switch_free(struct tb_switch *sw) > if (!sw->is_unplugged) > tb_plug_events_active(sw, false); > > + if (sw != sw->tb->root_switch) { > + pm_runtime_mark_last_busy(&sw->tb->nhi->pdev->dev); > + pm_runtime_put(&sw->tb->nhi->pdev->dev); > + } > + > kfree(sw->ports); > kfree(sw->drom); > kfree(sw); > @@ -417,6 +423,9 
@@ struct tb_switch *tb_switch_alloc(struct tb *tb, u64 route) > if (tb_plug_events_active(sw, true)) > goto err; > > + if (tb->root_switch) > + pm_runtime_get(&tb->nhi->pdev->dev); > + > return sw; > err: > kfree(sw->ports); > diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c > index 24b6d30..c33d3f1 100644 > --- a/drivers/thunderbolt/tb.c > +++ b/drivers/thunderbolt/tb.c > @@ -7,6 +7,7 @@ > #include <linux/slab.h> > #include <linux/errno.h> > #include <linux/delay.h> > +#include <linux/pm_runtime.h> > > #include "tb.h" > #include "tb_regs.h" > @@ -217,8 +218,11 @@ static void tb_handle_hotplug(struct work_struct *work) > { > struct tb_hotplug_event *ev = container_of(work, typeof(*ev), work); > struct tb *tb = ev->tb; > + struct device *dev = &tb->nhi->pdev->dev; > struct tb_switch *sw; > struct tb_port *port; > + > + pm_runtime_get(dev); > mutex_lock(&tb->lock); > if (!tb->hotplug_active) > goto out; /* during init, suspend or shutdown */ > @@ -274,6 +278,8 @@ static void tb_handle_hotplug(struct work_struct *work) > out: > mutex_unlock(&tb->lock); > kfree(ev); > + pm_runtime_mark_last_busy(dev); > + pm_runtime_put(dev); > } > > /** > -- > 2.7.0 > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
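Andreas's suggestion in the message above (poll XRIL until it returns 1
before enabling the wake GPE, at most 300 times with a 1 ms sleep in
between) can be sketched in plain userspace C. This is not the driver
implementation: the callback types, wait_for_xril_idle(), fake_xril() and
no_sleep() are hypothetical stand-ins, and a kernel version would evaluate
the ACPI method and sleep with msleep() instead.

```c
#include <assert.h>
#include <stddef.h>

#define XRIL_MAX_POLLS 300	/* upper bound quoted in the thread */

/* Hypothetical callbacks: read XRIL (1 = nothing plugged in), sleep. */
typedef int (*xril_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int ms);

/*
 * Poll XRIL until it reports 1, sleeping 1 ms between attempts.
 * Returns the number of polls needed, or -1 if XRIL never became 1.
 */
int wait_for_xril_idle(xril_fn read_xril, sleep_fn sleep_ms, void *ctx)
{
	assert(read_xril != NULL && sleep_ms != NULL);
	for (int i = 1; i <= XRIL_MAX_POLLS; i++) {
		if (read_xril(ctx) == 1)
			return i;
		sleep_ms(1);
	}
	return -1;
}

/* Stub: reports 0 for the first *ctx calls, then 1. */
int fake_xril(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return 0;
	}
	return 1;
}

/* Stub sleep that returns immediately. */
void no_sleep(unsigned int ms)
{
	(void)ms;
}
```

Only after such a wait succeeds would the caller enable the wake GPE; on a
timeout it seems safer to power the controller back up than to arm a GPE
that may fire immediately.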
Hi Andreas, thank you for your valuable feedback. On Sun, Mar 20, 2016 at 02:53:10PM +0100, Andreas Noever wrote: > - My firmware does not provide the TRPE ACPI method, only XRPE. So > either TRPE is only post CactusRidge or it is only present in newer > MBPs. In any case the OS X driver looks for TRPE first and uses XRPE > only if TRPE does not exist. I suggest we do the same (but see below > for TRPE). I only had the acpidump of an MBA6 (2013) available when I implemented this and it uses TRPE. I have since been able to obtain the acpidump of an MBP10 (2012) and you're right, it uses XRPE. Both have the same controller, Cactus Ridge 4C. It looks like they changed this on machines introduced in 2013. It's just a rename of the method, there are no machines which have both methods. > - The XRIN GPE fired immediately after the power was cut. The problem > seems to be that the controller takes a bit to shut down. The solution > is to poll until XRIL returns 1 before activating the GPE. On "Type 2" > devices the OS X driver polls up to 300 times with a 1ms sleep in > between (for me 1 or 2 iterations were always enough). Afaik no > polling is done on "Type 1" devices. Hm, this means that the semantics of XRIN and XRIL changed on Cactus Ridge. I have changed the behaviour to be exactly as you've specified above, this works fine on Light Ridge and should hopefully also work on Cactus Ridge, no distinction between Type 1 and Type 2 necessary. > Also the OS X interrupt handler checks XRIL > and only wakes up the device if it returns 0. This was not necessary > to do on my model - but maybe spurious interrupts can happen with > newer controllers? They're doing lots of stuff which seems superfluous or needlessly complicated, e.g. they also reset the controller upon driver load using the XRST method (which exists only on some models). I don't think we have to do everything exactly as they do as long as it works. FWIW I haven't seen any spurious XRIN interrupts on Light Ridge. 
> Concerning TRPE style hardware: It seems that pm is more complicated > here. I see a bunch of references to SX* ACPI methods (SXFP, SXLV, > SXIO) and have not yet figured out what they do. Maybe we should not > enable PM if XRPE is not present until we find someone to test it. But you do have the SX* methods on your machine even though it uses XRPE, right? I've mostly figured out now what these methods are there for and have documented them extensively in upstream.c. However I cannot verify if my documentation is accurate as they are not present on my machine, but perhaps you can if your machine has them. SXLV, SXIO and SXIL exist only on Cactus Ridge machines and utilize the Go2Sx and Ok2Go2Sx pins. Judging by the PCI quirk you've added, it seems that a Go2Sx dance is necessary on this controller before power is cut (either by going to S3 / S4 / S5 or by using the Force Power pin, which is what XRPE / TRPE / SXFP do). > As you have noted the "correct" place to put this logic would be at > the upstream bridge. Ideally the downstream bridges should go into > D3hot by themselves if no devices are attached. The NHI as well In v2 it works exactly like this now: https://github.com/l1k/linux/commits/thunderbolt_runpm_v2 The trick is to allocate a Thunderbolt port service for the upstream bridge which we can bind to. In fact I'm allocating such a port service for *any* PCIe port on Thunderbolt devices, this could be useful for all sorts of other stuff. Binding to the upstream bridge also allows us to replace the PCI quirk which delays resume_noirq on the downstream bridges, as demonstrated by this experimental commit (works fine on Light Ridge but YMMV): https://github.com/l1k/linux/commit/79e0b8b8fb5da50b63836939f75212f824d8cba7 > (did you by chance check whether the NHI can be put into D3hot without > killing the thunderbolt tunnels?). Amazingly this works. However the NHI does not act on hotplug events after thunderbolt_suspend() has been called. 
Even without calling thunderbolt_suspend(), it seems that the control channel is down when the NHI is in D3hot, I'm getting RX timeouts. Also, I cannot see any reduction in power consumption when putting the NHI in D3, same for the downstream bridges. You can test this for yourself by commenting out the two calls to pm_runtime_get() and pm_runtime_put_autosuspend() in switch.c. Plug in a Thunderbolt device, wait 10 sec for the NHI to autosuspend, try accessing the Thunderbolt device. Works for me. If the NHI suspended before you had a chance to plug in the device, invoke "echo on > /sys/bus/pci/devices/0000:06:00.0/power/control". Plug in the device and use "echo auto" to let the NHI autosuspend. > (1) should be possible to fix? For (2): D3Cold always requires a > platform specific mechanism and the pci subsystem only supports ACPI. > Would it be possible to add an API to tell the pci subsystem that we > know how to put a specific device(tree) into D3Cold from a platform > driver [+CC Bjorn]? Then this whole thing would become a normal pci > suspend operation. I simply go to D3cold in the driver's ->runtime_suspend callback. There's just one small fix necessary in pci_raw_set_power_state() for this to work. Plus some changes in portdrv to call down to the port service drivers on each pm transition. (It already does this for ->suspend and ->resume, we just need the same functionality for additional pm callbacks). > Bridges in Hotplugged TB devices might have the same PCI ids as the > "root" bridges (if they use the same TB chip). You probably should > check that dev is a bridge of the builtin controller (for example by > checking for the presence of ACPI methods, see the comment in the > other tb quirks). For the upstream bridge I'm checking if its parent is a root port now to determine if it's a host controller built into the machine. 
I think the only chance for a false positive is if two machines are connected with Thunderbolt and one of them has multiple Thunderbolt controllers built in. Might look like this: RP - UPSB - DSB - UPSB - DSB - RP - RP - UPSB - DSB ^^^^^^^^^^^^^^^ ^^^^ ^^^^^^^^^^ local machine remote machine secondary controller on remote If the topology indeed looks like this (which I'm not sure of, I lack the hardware to test it), a thunderbolt_upstream driver will try to attach to UPSB on the secondary controller of the remote machine but should bail out because it can't find an ACPI handle for its NHI. So we should even have this corner case covered. Best regards, Lukas
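The built-in host controller check described above (the upstream bridge's parent must be a root port, with the missing ACPI handle for the NHI as a second line of defence) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct toy_port, enum toy_port_type and upsb_is_builtin_host() are hypothetical names, not kernel APIs, and the ACPI lookup is reduced to a boolean flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified port types (not the kernel's PCIe port types). */
enum toy_port_type { TOY_ROOT_PORT, TOY_UPSTREAM, TOY_DOWNSTREAM };

/* Hypothetical stand-in for one PCIe port in the hierarchy. */
struct toy_port {
	enum toy_port_type type;
	const struct toy_port *parent;	/* NULL at the top of the tree */
	bool nhi_has_acpi_handle;	/* assumed: firmware describes the NHI below */
};

/*
 * Return true only for an upstream bridge that hangs directly off a
 * root port and whose NHI is known to the firmware.
 */
static bool upsb_is_builtin_host(const struct toy_port *port)
{
	if (port == NULL || port->type != TOY_UPSTREAM)
		return false;
	if (port->parent == NULL || port->parent->type != TOY_ROOT_PORT)
		return false;
	/* Corner case from the thread: bail out without an ACPI handle. */
	return port->nhi_has_acpi_handle;
}
```

In this model an upstream bridge inside a hotplugged device fails the first test because its parent is a downstream bridge, and the hypothetical "secondary controller on remote" fails the second because no ACPI handle exists for its NHI.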
On Sun, Apr 24, 2016 at 5:23 PM, Lukas Wunner <lukas@wunner.de> wrote: > Hi Andreas, > > thank you for your valuable feedback. > > On Sun, Mar 20, 2016 at 02:53:10PM +0100, Andreas Noever wrote: >> - My firmware does not provide the TRPE ACPI method, only XRPE. So >> either TRPE is only post CactusRidge or it is only present in newer >> MBPs. In any case the OS X driver looks for TRPE first and uses XRPE >> only if TRPE does not exist. I suggest we do the same (but see below >> for TRPE). > > I only had the acpidump of an MBA6 (2013) available when I implemented > this and it uses TRPE. I have since been able to obtain the acpidump of > an MBP10 (2012) and you're right, it uses XRPE. Both have the same > controller, Cactus Ridge 4C. It looks like they changed this on machines > introduced in 2013. It's just a rename of the method, there are no machines > which have both methods. > > >> - The XRIN GPE fired immediately after the power was cut. The problem >> seems to be that the controller takes a bit to shut down. The solution >> is to poll until XRIL returns 1 before activating the GPE. On "Type 2" >> devices the OS X driver polls up to 300 times with a 1ms sleep in >> between (for me 1 or 2 iterations were always enough). Afaik no >> polling is done on "Type 1" devices. > > Hm, this means that the semantics of XRIN and XRIL changed on Cactus > Ridge. I have changed the behaviour to be exactly as you've specified > above, this works fine on Light Ridge and should hopefully also work > on Cactus Ridge, no distinction between Type 1 and Type 2 necessary. ok > >> Also the OS X interrupt handler checks XRIL >> and only wakes up the device if it returns 0. This was not necessary >> to do on my model - but maybe spurious interrupts can happen with >> newer controllers? > > They're doing lots of stuff which seems superfluous or needlessly > complicated, e.g. they also reset the controller upon driver load using > the XRST method (which exists only on some models). 
I don't think we > have to do everything exactly as they do as long as it works. > FWIW I haven't seen any spurious XRIN interrupts on Light Ridge. > > >> Concerning TRPE style hardware: It seems that pm is more complicated >> here. I see a bunch of references to SX* ACPI methods (SXFP, SXLV, >> SXIO) and have not yet figured out what they do. Maybe we should not >> enable PM if XRPE is not present until we find someone to test it. > > But you do have the SX* methods on your machine even though it uses > XRPE, right? I've mostly figured out now what these methods are there > for and have documented them extensively in upstream.c. However I cannot > verify if my documentation is accurate as they are not present on my > machine, but perhaps you can if your machine has them. Yes I have these methods but I have no idea what they do. Just that they have to be called before suspend: http://lxr.free-electrons.com/source/drivers/pci/quirks.c#L3175 > SXLV, SXIO and SXIL exist only on Cactus Ridge machines and utilize > the Go2Sx and Ok2Go2Sx pins. Judging by the PCI quirk you've added, > it seems that a Go2Sx dance is necessary on this controller before > power is cut (either by going to S3 / S4 / S5 or by using the Force > Power pin, which is what XRPE / TRPE / SXFP do). > > >> As you have noted the "correct" place to put this logic would be at >> the upstream bridge. Ideally the downstream bridges should go into >> D3hot by themselves if no devices are attached. The NHI as well > > In v2 it works exactly like this now: > https://github.com/l1k/linux/commits/thunderbolt_runpm_v2 > > The trick is to allocate a Thunderbolt port service for the upstream > bridge which we can bind to. In fact I'm allocating such a port service > for *any* PCIe port on Thunderbolt devices, this could be useful for all > sorts of other stuff. Just tested your branch - works nicely (runtime pm, suspend and hibernate)! 
> Binding to the upstream bridge also allows us to replace the PCI quirk > which delays resume_noirq on the downstream bridges, as demonstrated by > this experimental commit (works fine on Light Ridge but YMMV): > https://github.com/l1k/linux/commit/79e0b8b8fb5da50b63836939f75212f824d8cba7 > > >> (did you by chance check whether the NHI can be put into D3hot without >> killing the thunderbolt tunnels?). > > Amazingly this works. However the NHI does not act on hotplug events > after thunderbolt_suspend() has been called. Even without calling > thunderbolt_suspend(), it seems that the control channel is down > when the NHI is in D3hot, I'm getting RX timeouts. Also, I cannot > see any reduction in power consumption when putting the NHI in D3, > same for the downstream bridges. Interesting. Looks like the NHI is really just a device on the tb switch. But then it is understandable that turning it off does not decrease power consumption. > You can test this for yourself by commenting out the two calls to > pm_runtime_get() and pm_runtime_put_autosuspend() in switch.c. > Plug in a Thunderbolt device, wait 10 sec for the NHI to autosuspend, > try accessing the Thunderbolt device. Works for me. > > If the NHI suspended before you had a chance to plug in the device, > invoke "echo on > /sys/bus/pci/devices/0000:06:00.0/power/control". > Plug in the device and use "echo auto" to let the NHI autosuspend. > > >> (1) should be possible to fix? For (2): D3Cold always requires a >> platform specific mechanism and the pci subsystem only supports ACPI. >> Would it be possible to add an API to tell the pci subsystem that we >> know how to put a specific device(tree) into D3Cold from a platform >> driver [+CC Bjorn]? Then this whole thing would become a normal pci >> suspend operation. > > I simply go to D3cold in the driver's ->runtime_suspend callback. > There's just one small fix necessary in pci_raw_set_power_state() > for this to work. 
Plus some changes in portdrv to call down to the > port service drivers on each pm transition. (It already does this > for ->suspend and ->resume, we just need the same functionality for > additional pm callbacks). > > >> Bridges in Hotplugged TB devices might have the same PCI ids as the >> "root" bridges (if they use the same TB chip). You probably should >> check that dev is a bridge of the builtin controller (for example by >> checking for the presence of ACPI methods, see the comment in the >> other tb quirks). > > For the upstream bridge I'm checking if its parent is a root port now > to determine if it's a host controller built into the machine. > I think the only chance for a false positive is if two machines are > connected with Thunderbolt and one of them has multiple Thunderbolt > controllers built in. Might look like this: > > RP - UPSB - DSB - UPSB - DSB - RP - RP - UPSB - DSB > ^^^^^^^^^^^^^^^ ^^^^ ^^^^^^^^^^ > local machine remote machine secondary controller on remote I don't think that it is possible to tunnel into a different machine like that :) The root port check should be sufficient. Best Regards Andreas > If the topology indeed looks like this (which I'm not sure of, I lack > the hardware to test it), a thunderbolt_upstream driver will try to > attach to UPSB on the secondary controller of the remote machine but > should bail out because it can't find an ACPI handle for its NHI. > So we should even have this corner case covered. > Best regards, > > Lukas
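The polling agreed on in this exchange (wait until XRIL returns 1 before arming the wake GPE, giving up after 300 polls spaced 1 ms apart) might be sketched as below. This is a userspace model, not driver code: struct xril_io, wait_for_xril_high() and the fake backend are hypothetical names, and the ACPI evaluation and the sleep are passed in as callbacks so the loop can be exercised without hardware.

```c
#include <stddef.h>

#define XRIL_MAX_POLLS		300	/* bound reported for the OS X driver */
#define XRIL_POLL_INTERVAL_MS	1

/* Hypothetical hooks standing in for "evaluate XRIL" and "sleep". */
struct xril_io {
	int (*read_xril)(void *ctx);
	void (*sleep_ms)(void *ctx, unsigned int ms);
	void *ctx;
};

/*
 * Poll until XRIL reads 1, i.e. the controller has finished shutting
 * down. Returns the number of reads that were needed, or -1 if XRIL
 * was still 0 after XRIL_MAX_POLLS reads.
 */
static int wait_for_xril_high(const struct xril_io *io)
{
	for (int i = 1; i <= XRIL_MAX_POLLS; i++) {
		if (io->read_xril(io->ctx) == 1)
			return i;
		io->sleep_ms(io->ctx, XRIL_POLL_INTERVAL_MS);
	}
	return -1;
}

/* Fake controller for demonstration: XRIL reads 0 a fixed number of times. */
struct fake_ctrl {
	int reads_until_ready;
	unsigned int slept_ms;
};

static int fake_read_xril(void *ctx)
{
	struct fake_ctrl *c = ctx;

	if (c->reads_until_ready > 0) {
		c->reads_until_ready--;
		return 0;
	}
	return 1;
}

static void fake_sleep_ms(void *ctx, unsigned int ms)
{
	struct fake_ctrl *c = ctx;

	c->slept_ms += ms;	/* record the delay instead of sleeping */
}
```

A caller would enable the GPE only when the function returns a positive count and treat -1 as a failure to power down.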
Hi Alan, On Thu, Mar 17, 2016 at 10:54:55AM -0400, Alan Stern wrote: > On Wed, 16 Mar 2016, Lukas Wunner wrote: > > > The way you're doing it, how does the NHI driver know when to go into > > > suspend? The runtime PM core won't notify it when all the hotplugged > > > devices attached to the other bridges have been suspended, since it's > > > not their parent. > > > > The NHI knows when something is plugged in, it talks to the switches > > in devices that are hotplugged to the controller. As I've explained > > in the lengthy comment in the middle of patch [4/4], we acquire a > > runtime pm ref for each switch that is plugged in and release one > > whenever a switch is unplugged. > > If I understand correctly, that means you allow the Thunderbolt > controller to go into runtime suspend only when nothing is plugged into > any of the ports. Is that right? It's quite inefficient. In the case of Thunderbolt on the Mac, runtime suspend means that the controller is powered down. A plug event is side-band signaled using a GPE so that we're able to power the controller up once something is plugged in. It's not possible to power the controller down while devices are attached because downstream devices have no way to side-band signal an interrupt when they need to send data to the controller. > What I'm getting at is that we should have proper runtime-PM support > for bridges, i.e., I agree with what you wrote above. A bridge can > safely go into runtime suspend when there are no unsuspended devices > attached to any of its downstream ports. (That's how the USB hub > driver works, for instance.) Doing things that way would make > everything simpler in the long run. > > So my suggestion is that you change over to the "less kludgy solution" > and work on that instead. Alright, posted as v2 today. 
:-) Thanks, Lukas
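The reference counting described in this reply (one runtime pm reference per plugged-in switch, none for the root switch, controller powered only while the count is non-zero) can be modelled as follows. This is a deliberately simplified sketch: struct toy_nhi and the toy_* helpers are hypothetical, and the 10 sec autosuspend delay is left out, so the model powers down as soon as the last reference is dropped.

```c
#include <stdbool.h>

/* Hypothetical model of the controller's runtime pm state. */
struct toy_nhi {
	int usage_count;	/* one reference per non-root switch */
	bool powered;
};

/* Take a reference; the first one powers the controller up. */
static void toy_pm_get(struct toy_nhi *nhi)
{
	if (nhi->usage_count++ == 0)
		nhi->powered = true;
}

/* Drop a reference; the last one powers the controller down. */
static void toy_pm_put(struct toy_nhi *nhi)
{
	if (--nhi->usage_count == 0)
		nhi->powered = false;
}

/* The root switch is part of the controller, so it holds no reference. */
static void toy_switch_alloc(struct toy_nhi *nhi, bool is_root)
{
	if (!is_root)
		toy_pm_get(nhi);
}

static void toy_switch_free(struct toy_nhi *nhi, bool is_root)
{
	if (!is_root)
		toy_pm_put(nhi);
}
```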
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d1e3956..a007485 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -25,6 +25,7 @@
 #include <linux/sched.h>
 #include <linux/ktime.h>
 #include <linux/mm.h>
+#include <linux/pm_domain.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"
 
@@ -3255,6 +3256,40 @@ DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL,
 DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL,
 			       PCI_DEVICE_ID_INTEL_FALCON_RIDGE_4C_BRIDGE,
 			       quirk_apple_wait_for_thunderbolt);
+
+static int bridge_prepare(struct device *dev)
+{
+	return 1; /* stay asleep if already runtime suspended */
+}
+
+static void quirk_apple_thunderbolt_runpm(struct pci_dev *dev)
+{
+	struct dev_pm_domain *bridge_pm_domain;
+
+	if (!dmi_match(DMI_BOARD_VENDOR, "Apple Inc."))
+		return;
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
+		return;
+	if (dev->dev.pm_domain)
+		return;
+
+	bridge_pm_domain = kzalloc(sizeof(*bridge_pm_domain), GFP_KERNEL);
+	if (!bridge_pm_domain) {
+		dev_err(&dev->dev, "cannot allocate pm_domain\n");
+		return;
+	}
+
+	bridge_pm_domain->ops = *pci_bus_type.pm;
+	bridge_pm_domain->ops.prepare = bridge_prepare;
+	bridge_pm_domain->ops.complete = NULL;
+	dev_pm_domain_set(&dev->dev, bridge_pm_domain);
+}
+DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_INTEL,
+			 PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C,
+			 quirk_apple_thunderbolt_runpm);
+DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_INTEL,
+			 PCI_DEVICE_ID_INTEL_FALCON_RIDGE_4C_BRIDGE,
+			 quirk_apple_thunderbolt_runpm);
 #endif
 
 static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
diff --git a/drivers/thunderbolt/Kconfig b/drivers/thunderbolt/Kconfig
index c121acc..40335f7 100644
--- a/drivers/thunderbolt/Kconfig
+++ b/drivers/thunderbolt/Kconfig
@@ -1,6 +1,6 @@
 menuconfig THUNDERBOLT
 	tristate "Thunderbolt support for Apple devices"
-	depends on PCI
+	depends on PCI && ACPI
 	select CRC32
 	help
 	  Cactus Ridge Thunderbolt Controller driver
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index fa89160..964b006 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -588,6 +588,8 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	}
 	pci_set_drvdata(pdev, tb);
 
+	nhi_runtime_pm_init(nhi);
+
 	return 0;
 }
 
@@ -595,6 +597,8 @@ static void nhi_remove(struct pci_dev *pdev)
 {
 	struct tb *tb = pci_get_drvdata(pdev);
 	struct tb_nhi *nhi = tb->nhi;
+
+	nhi_runtime_pm_fini(nhi);
 	thunderbolt_shutdown_and_free(tb);
 	nhi_shutdown(nhi);
 }
diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h
index 3172429..dd725f7 100644
--- a/drivers/thunderbolt/nhi.h
+++ b/drivers/thunderbolt/nhi.h
@@ -7,6 +7,7 @@
 #ifndef DSL3510_H_
 #define DSL3510_H_
 
+#include <linux/acpi.h>
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
 
@@ -25,6 +26,8 @@ struct tb_nhi {
 	struct tb_ring **rx_rings;
 	struct work_struct interrupt_work;
 	u32 hop_count; /* Number of rings (end point hops) supported by NHI. */
+	unsigned long long wake_gpe; /* Hotplug interrupt during powerdown. */
+	acpi_handle set_power; /* Method to power controller up/down. */
 };
 
 /**
diff --git a/drivers/thunderbolt/power.c b/drivers/thunderbolt/power.c
index 1095ad0..cc83940 100644
--- a/drivers/thunderbolt/power.c
+++ b/drivers/thunderbolt/power.c
@@ -2,11 +2,15 @@
  * Thunderbolt Cactus Ridge driver - power management
  *
  * Copyright (c) 2014 Andreas Noever <andreas.noever@gmail.com>
+ * Copyright (c) 2016 Lukas Wunner <lukas@wunner.de>
  */
 
+#include <linux/delay.h>
 #include <linux/pci.h>
+#include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
 
+#include "nhi.h"
 #include "tb.h"
 
 static int nhi_suspend_noirq(struct device *dev)
@@ -39,3 +43,246 @@ const struct dev_pm_ops nhi_pm_ops = {
 	 */
 	.restore_noirq = nhi_resume_noirq,
 };
+
+/*
+ * Runtime Power Management
+ *
+ * Apple provides the following means for runtime pm in ACPI:
+ *
+ * * XRPE method (TRPE on Cactus Ridge and newer), takes argument 1 or 0,
+ *   toggles a GPIO pin to switch the controller on or off.
+ * * XRIN named object (alternatively _GPE), contains number of a GPE which
+ *   fires as long as something is plugged in (regardless of power state).
+ * * XRIL method returns 0 as long as something is plugged in, 1 otherwise.
+ * * XRIP + XRIO methods, unused by OS X driver. (Flip interrupt polarity?)
+ *
+ * If there are multiple Thunderbolt controllers (e.g. MacPro6,1), each NHI
+ * device has a separate XRIN GPE and separate instances of these methods.
+ *
+ * We acquire a runtime pm ref for each newly allocated switch (except for
+ * the root switch) and drop one when a switch is freed. The controller is
+ * thus powered up as long as something is plugged in. This behaviour is
+ * identical to the OS X driver.
+ *
+ * Powering the controller down is almost instantaneous, but powering up takes
+ * about 2 sec. To handle situations gracefully where a device is unplugged
+ * and immediately replaced by another one, we afford a grace period of 10 sec
+ * before powering down. This autosuspend_delay_ms may be reduced to 0 via
+ * sysfs and to handle that properly we need to wait during runtime_resume
+ * since it takes about 0.7 sec after resuming until a hotplug event appears.
+ *
+ * When the system wakes from suspend-to-RAM, the controller's power state is
+ * as it was before. However if it was powered down, calling XRPE once to power
+ * it up is not sufficient: An additional call to XRPE is necessary to reset
+ * the power switch first.
+ */
+
+static int nhi_prepare(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct tb *tb = pci_get_drvdata(pdev);
+	acpi_status res;
+
+	if (pm_runtime_active(dev))
+		return 0;
+
+	res = acpi_disable_gpe(NULL, tb->nhi->wake_gpe);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot disable wake GPE, resuming\n");
+		return 0;
+	} else
+		return 1; /* stay asleep if already runtime suspended */
+}
+
+static void nhi_complete(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct tb *tb = pci_get_drvdata(pdev);
+	acpi_status res;
+
+	if (pm_runtime_active(dev))
+		return;
+
+	tb_info(tb, "resetting power switch\n");
+	res = acpi_execute_simple_method(tb->nhi->set_power, NULL, 0);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot call set_power method\n");
+		dev->power.runtime_error = -ENODEV;
+	}
+
+	res = acpi_enable_gpe(NULL, tb->nhi->wake_gpe);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot enable wake GPE, resuming\n");
+		pm_request_resume(dev);
+	}
+}
+
+static int pci_save_state_cb(struct pci_dev *pdev, void *ptr)
+{
+	pci_save_state(pdev);
+	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
+		pm_runtime_disable(&pdev->dev);
+		pm_runtime_set_suspended(&pdev->dev);
+		pm_runtime_enable(&pdev->dev);
+	}
+	pdev->current_state = PCI_D3cold;
+	return 0;
+}
+
+static int pci_restore_state_cb(struct pci_dev *pdev, void *ptr)
+{
+	pdev->current_state = PCI_D0;
+	if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_PCI) {
+		pm_runtime_disable(&pdev->dev);
+		pm_runtime_set_active(&pdev->dev);
+		pm_runtime_enable(&pdev->dev);
+	}
+	pci_restore_state(pdev);
+	return 0;
+}
+
+static int nhi_runtime_suspend(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_bus *upstream_bridge = pdev->bus->parent->parent;
+	struct tb *tb = pci_get_drvdata(pdev);
+	acpi_status res;
+
+	if (!pdev->d3cold_allowed)
+		return -EAGAIN;
+
+	thunderbolt_suspend(tb);
+	pci_walk_bus(upstream_bridge, pci_save_state_cb, NULL);
+
+	tb_info(tb, "powering down\n");
+	res = acpi_execute_simple_method(tb->nhi->set_power, NULL, 0);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot call set_power method, resuming\n");
+		goto err;
+	}
+
+	res = acpi_enable_gpe(NULL, tb->nhi->wake_gpe);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot enable wake GPE, resuming\n");
+		goto err;
+	}
+
+	return 0;
+
+err:
+	acpi_execute_simple_method(tb->nhi->set_power, NULL, 1);
+	pci_walk_bus(upstream_bridge, pci_restore_state_cb, NULL);
+	thunderbolt_resume(tb);
+	return -EAGAIN;
+}
+
+static int nhi_runtime_resume(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_bus *upstream_bridge = pdev->bus->parent->parent;
+	struct tb *tb = pci_get_drvdata(pdev);
+	acpi_status res;
+
+	if (system_state >= SYSTEM_HALT)
+		return -ESHUTDOWN;
+
+	res = acpi_disable_gpe(NULL, tb->nhi->wake_gpe);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot disable wake GPE, disabling runtime pm\n");
+		pm_runtime_disable(dev);
+	}
+
+	tb_info(tb, "powering up\n");
+	res = acpi_execute_simple_method(tb->nhi->set_power, NULL, 1);
+	if (ACPI_FAILURE(res)) {
+		dev_err(dev, "cannot call set_power method\n");
+		return -ENODEV;
+	}
+
+	pci_walk_bus(upstream_bridge, pci_restore_state_cb, NULL);
+	thunderbolt_resume(tb);
+	msleep(1500); /* allow 1.5 sec for hotplug event to arrive */
+	pm_runtime_mark_last_busy(dev);
+
+	return 0;
+}
+
+static u32 nhi_runtime_wake(acpi_handle gpe_device, u32 gpe_number, void *ctx)
+{
+	struct device *dev = ctx;
+	WARN_ON(pm_request_resume(dev) < 0);
+	return ACPI_INTERRUPT_HANDLED;
+}
+
+static struct dev_pm_domain nhi_pm_domain;
+
+void nhi_runtime_pm_init(struct tb_nhi *nhi)
+{
+	struct device *dev = &nhi->pdev->dev;
+	struct acpi_handle *nhi_handle = ACPI_HANDLE(dev);
+	acpi_status res;
+
+	/* gen 1 controllers use XRPE, gen 2+ controllers use TRPE */
+	if (nhi->pdev->device <= PCI_DEVICE_ID_INTEL_EAGLE_RIDGE)
+		res = acpi_get_handle(nhi_handle, "XRPE", &nhi->set_power);
+	else
+		res = acpi_get_handle(nhi_handle, "TRPE", &nhi->set_power);
+	if (ACPI_FAILURE(res)) {
+		dev_warn(dev, "cannot find set_power method, disabling runtime pm\n");
+		goto err;
+	}
+
+	res = acpi_evaluate_integer(nhi_handle, "XRIN", NULL, &nhi->wake_gpe);
+	if (ACPI_FAILURE(res)) {
+		dev_warn(dev, "cannot find wake GPE, disabling runtime pm\n");
+		goto err;
+	}
+
+	res = acpi_install_gpe_handler(NULL, nhi->wake_gpe,
+				       ACPI_GPE_LEVEL_TRIGGERED,
+				       nhi_runtime_wake, dev);
+	if (ACPI_FAILURE(res)) {
+		dev_warn(dev, "cannot install GPE handler, disabling runtime pm\n");
+		goto err;
+	}
+
+	nhi_pm_domain.ops = *pci_bus_type.pm;
+	nhi_pm_domain.ops.prepare = nhi_prepare;
+	nhi_pm_domain.ops.complete = nhi_complete;
+	nhi_pm_domain.ops.runtime_suspend = nhi_runtime_suspend;
+	nhi_pm_domain.ops.runtime_resume = nhi_runtime_resume;
+	dev_pm_domain_set(dev, &nhi_pm_domain);
+
+	/* apply to upstream bridge and downstream bridge 0 */
+	pm_suspend_ignore_children(dev->parent->parent, true);
+	pm_suspend_ignore_children(dev->parent, true);
+
+	pm_runtime_allow(dev);
+	pm_runtime_set_autosuspend_delay(dev, 10000);
+	pm_runtime_use_autosuspend(dev);
+	pm_runtime_mark_last_busy(dev);
+	pm_runtime_put(dev);
+	return;
+
+err:
+	nhi->wake_gpe = -1;
+	if (pm_runtime_enabled(dev))
+		pm_runtime_disable(dev);
+}
+
+void nhi_runtime_pm_fini(struct tb_nhi *nhi)
+{
+	struct device *dev = &nhi->pdev->dev;
+	acpi_status res;
+
+	if (nhi->wake_gpe == -1)
+		return;
+
+	res = acpi_remove_gpe_handler(NULL, nhi->wake_gpe, nhi_runtime_wake);
+	if (ACPI_FAILURE(res))
+		dev_warn(dev, "cannot remove GPE handler\n");
+
+	pm_runtime_get(dev);
+	pm_runtime_forbid(dev);
+	dev_pm_domain_set(dev, NULL);
+}
diff --git a/drivers/thunderbolt/power.h b/drivers/thunderbolt/power.h
index 99cb900..4fc836d 100644
--- a/drivers/thunderbolt/power.h
+++ b/drivers/thunderbolt/power.h
@@ -11,4 +11,7 @@
 
 extern const struct dev_pm_ops nhi_pm_ops;
 
+void nhi_runtime_pm_fini(struct tb_nhi *nhi);
+void nhi_runtime_pm_init(struct tb_nhi *nhi);
+
 #endif
diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c
index c6270f0..e9be3d5 100644
--- a/drivers/thunderbolt/switch.c
+++ b/drivers/thunderbolt/switch.c
@@ -5,6 +5,7 @@
  */
 
 #include <linux/delay.h>
+#include <linux/pm_runtime.h>
 #include <linux/slab.h>
 
 #include "tb.h"
@@ -326,6 +327,11 @@ void tb_switch_free(struct tb_switch *sw)
 	if (!sw->is_unplugged)
 		tb_plug_events_active(sw, false);
 
+	if (sw != sw->tb->root_switch) {
+		pm_runtime_mark_last_busy(&sw->tb->nhi->pdev->dev);
+		pm_runtime_put(&sw->tb->nhi->pdev->dev);
+	}
+
 	kfree(sw->ports);
 	kfree(sw->drom);
 	kfree(sw);
@@ -417,6 +423,9 @@ struct tb_switch *tb_switch_alloc(struct tb *tb, u64 route)
 	if (tb_plug_events_active(sw, true))
 		goto err;
 
+	if (tb->root_switch)
+		pm_runtime_get(&tb->nhi->pdev->dev);
+
 	return sw;
 err:
 	kfree(sw->ports);
diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
index 24b6d30..c33d3f1 100644
--- a/drivers/thunderbolt/tb.c
+++ b/drivers/thunderbolt/tb.c
@@ -7,6 +7,7 @@
 #include <linux/slab.h>
 #include <linux/errno.h>
 #include <linux/delay.h>
+#include <linux/pm_runtime.h>
 
 #include "tb.h"
 #include "tb_regs.h"
@@ -217,8 +218,11 @@ static void tb_handle_hotplug(struct work_struct *work)
 {
 	struct tb_hotplug_event *ev = container_of(work, typeof(*ev), work);
 	struct tb *tb = ev->tb;
+	struct device *dev = &tb->nhi->pdev->dev;
 	struct tb_switch *sw;
 	struct tb_port *port;
+
+	pm_runtime_get(dev);
 	mutex_lock(&tb->lock);
 	if (!tb->hotplug_active)
 		goto out; /* during init, suspend or shutdown */
@@ -274,6 +278,8 @@ static void tb_handle_hotplug(struct work_struct *work)
 out:
 	mutex_unlock(&tb->lock);
 	kfree(ev);
+	pm_runtime_mark_last_busy(dev);
+	pm_runtime_put(dev);
 }
 
 /**
Document and implement Apple's ACPI-based (but nonstandard) mechanism
to power the controller up and down as needed.

This fixes (at least partially) a power regression introduced in
Linux 3.17 by 7bc5a2bad0b8 ("ACPI: Support _OSI("Darwin") correctly").

A Thunderbolt controller consists of an NHI (Native Host Interface) and
a set of bridges. Power is cut to the entire chip. The Linux pm model
assumes that runtime pm is governed by the parent device, i.e. the
upstream bridge driver, pcieport. In violation of this model we let a
child govern it, i.e. the NHI driver thunderbolt.ko. The traditional
hierarchical pm model is defeated by setting ignore_children on the
upstream bridge and downstream bridge 0, and by having the NHI update
all the bridges' runtime pm state in unison with itself. It is also the
NHI driver's job to save and restore PCI state of the bridges.

  PCIe Port --- Upstream Bridge --+
                                  |
                                  +-- Downstream Bridge 0 --+
                                  |                         |
                                  |                         +-- NHI
                                  |
                                  +-- Downstream Bridge 1 ...
                                  |
                                  +-- Downstream Bridge 2 ...  hotplugged
                                  |                            devices
                                  +-- Downstream Bridge 3 ...
                                  |
                                  +-- Downstream Bridge 4 ...

The PCI subsystem pm_ops do not work properly for devices which can be
put into D3cold by some other means than the standard _PSx ACPI platform
methods: We do not want to wake up the chip before system sleep, yet
pci_pm_prepare() does not return 1 as it should since pci_target_state()
returns D3hot. We solve this by overriding pci_pm_prepare() using power
domains. They are assigned to the bridges using a PCI quirk. We also do
not want to wake the chip after system resume as pci_pm_complete() does,
so we override that as well.

Note that we can never remove and free the dev_pm_domain assigned to the
bridges as there is no PCI remove fixup section. We also cannot bail out
of the ->probe callback if allocation of the struct dev_pm_domain fails
since the PCI enable fixup does not allow return values to be passed
back.
It might be possible to implement a less kludgy solution which adheres
to the hierarchical pm model and does not need a PCI enable quirk for
the bridges if pcieport had runtime pm support both for itself and any
service drivers registering with it. The runtime pm code could then be
moved from the NHI to a new Thunderbolt service driver that gets used on
the upstream bridge.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=92111
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/quirks.c         |  35 ++++++
 drivers/thunderbolt/Kconfig  |   2 +-
 drivers/thunderbolt/nhi.c    |   4 +
 drivers/thunderbolt/nhi.h    |   3 +
 drivers/thunderbolt/power.c  | 247 +++++++++++++++++++++++++++++++++++++++++++
 drivers/thunderbolt/power.h  |   3 +
 drivers/thunderbolt/switch.c |   9 ++
 drivers/thunderbolt/tb.c     |   6 ++
 8 files changed, 308 insertions(+), 1 deletion(-)
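The override technique described in the commit message (copy the bus's pm callbacks into a per-device power domain, then replace ->prepare and clear ->complete) can be illustrated in plain C. This is a userspace sketch under stated assumptions: struct toy_pm_ops and the bus_* callbacks are hypothetical stand-ins for struct dev_pm_ops and the PCI bus callbacks, and only three callbacks are modelled.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for a table of pm callbacks. */
struct toy_pm_ops {
	int (*prepare)(void *dev);
	void (*complete)(void *dev);
	int (*suspend)(void *dev);
};

/* Stand-ins for the bus-level defaults. */
static int bus_prepare(void *dev) { (void)dev; return 0; }
static void bus_complete(void *dev) { (void)dev; }
static int bus_suspend(void *dev) { (void)dev; return 0; }

static const struct toy_pm_ops toy_bus_ops = {
	.prepare = bus_prepare,
	.complete = bus_complete,
	.suspend = bus_suspend,
};

/* Returning 1 means: stay asleep if already runtime suspended. */
static int bridge_prepare(void *dev) { (void)dev; return 1; }

/*
 * Build the per-device table: start from a copy of the bus defaults so
 * every callback that is not overridden keeps its usual behaviour.
 */
static struct toy_pm_ops make_bridge_ops(void)
{
	struct toy_pm_ops ops = toy_bus_ops;	/* struct copy */

	ops.prepare = bridge_prepare;
	ops.complete = NULL;			/* do not wake after resume */
	return ops;
}
```

Copying the whole table first is what keeps the override narrow: only the two callbacks named in the commit message differ from the bus defaults, and the shared table itself is never modified.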