Message ID: 20230404052714.51315-1-mika.westerberg@linux.intel.com
Series:     PCI/PM: Resume/reset wait time change
On Tue, Apr 04, 2023 at 08:27:12AM +0300, Mika Westerberg wrote:
> Hi all,
>
> This series first increases the time we wait on the resume path to
> accommodate a certain device, as reported in [1], and then "optimizes"
> the timeout for slow links to avoid too long waits if a device is
> disconnected during suspend.
>
> The previous version can be found here:
>
> https://lore.kernel.org/linux-pci/20230321095031.65709-1-mika.westerberg@linux.intel.com/
>
> Changes from the previous version:
>
>   * Split the patch into two: one that increases the resume timeout (on
>     all links; I was not able to figure out a simple way to limit this
>     to only the fast links) and one that decreases the timeout on slow
>     links.
>
>   * Use dev->link_active_reporting instead of speed to figure out slow
>     vs. fast links.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=216728
>
> Mika Westerberg (2):
>   PCI/PM: Increase wait time after resume

I applied the above to pci/reset for v6.4.

>   PCI/PM: Decrease wait time for devices behind slow links

Part of this patch is removing the pci_bridge_wait_for_secondary_bus()
timeout parameter, since all callers now supply the same value
(PCIE_RESET_READY_POLL_MS). I extracted that part out and applied it as
well.

I'm hoping we can restructure the rest of this as mentioned in the
thread. If that's not possible, can you rebase what's left on top of
this?

https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=reset

>  drivers/pci/pci-driver.c |  2 +-
>  drivers/pci/pci.c        | 42 ++++++++++++++++++++++++++--------------
>  drivers/pci/pci.h        | 16 +--------------
>  drivers/pci/pcie/dpc.c   |  3 +--
>  4 files changed, 30 insertions(+), 33 deletions(-)
>
> --
> 2.39.2
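The parameter removal described above amounts to a signature change of
roughly the following shape; this is a minimal sketch inferred from
Bjorn's description that every caller passed PCIE_RESET_READY_POLL_MS,
not the applied commit itself:

/* Before: every caller supplied the same timeout value. */
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev,
				      char *reset_type, int timeout);

/*
 * After: the constant is folded into the function, which waits up to
 * PCIE_RESET_READY_POLL_MS internally.
 */
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev,
				      char *reset_type);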
Hi,

On Tue, Apr 11, 2023 at 05:40:14PM -0500, Bjorn Helgaas wrote:
> On Tue, Apr 04, 2023 at 08:27:12AM +0300, Mika Westerberg wrote:
> > ...
> > Mika Westerberg (2):
> >   PCI/PM: Increase wait time after resume
>
> I applied the above to pci/reset for v6.4.

Thanks!

> >   PCI/PM: Decrease wait time for devices behind slow links
>
> Part of this patch is removing the pci_bridge_wait_for_secondary_bus()
> timeout parameter, since all callers now supply the same value
> (PCIE_RESET_READY_POLL_MS). I extracted that part out and applied it
> as well.
>
> I'm hoping we can restructure the rest of this as mentioned in the
> thread. If that's not possible, can you rebase what's left on top of
> this?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=reset

Sure. The end result is below. I did not add the wait_for_link_active()
because we already have pcie_wait_for_link_delay(), and the msleep(100)
really needs to be msleep(delay) because we extract that 'delay' from
the device's d3cold_delay, which can be more than 100 ms.

Let me know what you think. I will send a proper patch tomorrow if
there are no objections.

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0b4f3b08f780..61bf8a4b2099 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5037,6 +5037,22 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 		}
 	}
 
+	/*
+	 * Everything above is handling the delays mandated by the PCIe r6.0
+	 * sec 6.6.1.
+	 *
+	 * If the port supports active link reporting, we now check one more
+	 * time whether the link is active and, if not, bail out early with
+	 * the assumption that the device is not present anymore.
+	 */
+	if (dev->link_active_reporting) {
+		u16 status;
+
+		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
+		if (!(status & PCI_EXP_LNKSTA_DLLLA))
+			return -ENOTTY;
+	}
+
 	return pci_dev_wait(child, reset_type,
 			    PCIE_RESET_READY_POLL_MS - delay);
 }
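The heart of the added hunk is a single Link Status read. As a minimal
sketch, it could be factored into a helper along these lines; the helper
name is hypothetical, while pcie_capability_read_word() and
PCI_EXP_LNKSTA_DLLLA are the existing kernel APIs the diff itself uses:

#include <linux/pci.h>

/*
 * Sketch of the check added in the diff above, pulled out as a helper
 * for illustration only; the function name is made up.
 */
static bool sketch_link_is_active(struct pci_dev *bridge)
{
	u16 lnksta;

	/*
	 * The DLLLA bit is only guaranteed to be valid when the port
	 * implements Data Link Layer Link Active Reporting.
	 */
	if (!bridge->link_active_reporting)
		return true;	/* cannot observe the link; assume present */

	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA, &lnksta);
	return lnksta & PCI_EXP_LNKSTA_DLLLA;
}

Returning -ENOTTY on a down link means callers see the same error they
would get from pci_dev_wait() timing out on an absent device.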
On Wed, Apr 12, 2023 at 11:00:19AM +0300, Mika Westerberg wrote:
> On Tue, Apr 11, 2023 at 05:40:14PM -0500, Bjorn Helgaas wrote:
> ...
> > I'm hoping we can restructure the rest of this as mentioned in the
> > thread. If that's not possible, can you rebase what's left on top of
> > this?
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=reset
>
> Sure. The end result is below. I did not add the wait_for_link_active()
> because we already have pcie_wait_for_link_delay(), and the msleep(100)
> really needs to be msleep(delay) because we extract that 'delay' from
> the device's d3cold_delay, which can be more than 100 ms.

Agreed about delay vs 100ms. I used 100ms in my pidgin patch just for
simplicity.

Yeah, I think this is OK. I think there's room for improvement around
pcie_wait_for_link_delay() because it does several things that get a
little muddled when mixed together, but your patch is fine as-is:

  - When !pdev->link_active_reporting, it ignores the "active" parameter
    and always returns true ("link is up"). It happens that we never
    wait for the link to go *down* when link_active_reporting isn't
    supported, so the behavior is fine, but this is a weird restriction
    of the interface.

  - IIUC, the 20ms delay comes from a requirement that *hardware* must
    enter LTSSM within 20ms, not a software delay requirement. Also,
    20ms only applies to devices at <= 5.0 GT/s; faster devices have up
    to 100ms. From the spec point of view, I don't think there's a
    reason to delay config reads to the Downstream Port (we only need to
    delay config accesses to devices below), so I would argue for
    removing all this.

  - It conflates the 1000ms timeout waiting for the link to come up with
    the 100ms wait times in the spec. AFAIK, we made up the 1000ms
    number out of thin air, and I think it would be clearer to handle it
    in a separate place (as opposed to "msleep(timeout + delay)").

  - There's only one path (dpc_reset_link()) that waits for the link to
    go *down*. In that case, pcie_wait_for_link() passes a delay of
    100ms, but since active==false and the spec requires that Ports with
    DPC implement link_active_reporting, that 100ms is ignored. Works
    like we want, but ... it feels like it relies a little too much on
    internal knowledge.

> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 0b4f3b08f780..61bf8a4b2099 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> ...
> +	if (dev->link_active_reporting) {
> +		u16 status;
> +
> +		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
> +		if (!(status & PCI_EXP_LNKSTA_DLLLA))
> +			return -ENOTTY;
> +	}
> +
>  	return pci_dev_wait(child, reset_type,
>  			    PCIE_RESET_READY_POLL_MS - delay);

I have to smile about this: we make sure we don't wait that extra 0.1s
when the overall timeout is a completely arbitrary 60s :)

Bjorn
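To make the "muddled" point concrete: the restructuring hinted at above
would keep the spec-mandated delay and the arbitrary 1000 ms link-up
budget as two explicit steps. A rough sketch under that reading follows;
the function name and the 10 ms poll interval are assumptions here, not
actual kernel code:

#include <linux/delay.h>
#include <linux/pci.h>

/*
 * Sketch only: wait for the link to come up within the (made-up)
 * 1000 ms budget, kept separate from any spec-mandated delay instead
 * of folding both into one msleep(timeout + delay).
 */
static bool sketch_wait_for_link_up(struct pci_dev *bridge)
{
	int timeout = 1000;	/* the arbitrary 1000 ms discussed above */
	u16 lnksta;

	/* Mirrors the current "always up" answer flagged as weird above. */
	if (!bridge->link_active_reporting)
		return true;

	while (timeout > 0) {
		pcie_capability_read_word(bridge, PCI_EXP_LNKSTA, &lnksta);
		if (lnksta & PCI_EXP_LNKSTA_DLLLA)
			return true;
		msleep(10);
		timeout -= 10;
	}
	return false;
}

A caller would then msleep(delay) for the d3cold_delay first and treat a
false return as a disconnected device, which keeps each requirement
visible at its own call site.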