diff mbox series

[2/3] PCI: Unify delay handling for reset and resume

Message ID bd6ac49d60c1ca6fe5c27c2fa54b78d70a8ba07b.1672511017.git.lukas@wunner.de (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series PCI reset delay fixes | expand

Commit Message

Lukas Wunner Dec. 31, 2022, 6:33 p.m. UTC
Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait
for devices on the secondary bus to become accessible after reset:

Although it does call pci_dev_wait(), it erroneously passes the bridge's
pci_dev rather than that of a child.  The bridge of course is always
accessible while its secondary bus is reset, so pci_dev_wait() returns
immediately.

Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait()
function which is called from pci_bridge_secondary_bus_reset():

https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/

However we already have pci_bridge_wait_for_secondary_bus() which does
almost exactly what we need.  So far it's only called on resume from
D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8).
Re-using it for Secondary Bus Resets is a leaner and more rational
approach than introducing a new function.

That only requires a few minor tweaks:

- Amend pci_bridge_wait_for_secondary_bus() to await accessibility of
  the first device on the secondary bus by calling pci_dev_wait() after
  performing the prescribed delays.  pci_dev_wait() needs two parameters,
  a reset reason and a timeout, which callers must now pass to
  pci_bridge_wait_for_secondary_bus().  The timeout is 1 sec for resume
  (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI:
  Wait up to 60 seconds for device to become ready after FLR")).

- Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or
  -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().

- Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which
  is now performed by pci_bridge_wait_for_secondary_bus().  A static
  delay this long is only necessary for Conventional PCI, so modern
  PCIe systems benefit from shorter reset times as a side effect.

Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
Reported-by: Sheng Bi <windy.bi.enflame@gmail.com>
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.17+
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/pci/pci-driver.c |  2 +-
 drivers/pci/pci.c        | 50 ++++++++++++++++++----------------------
 drivers/pci/pci.h        |  3 ++-
 3 files changed, 25 insertions(+), 30 deletions(-)

Comments

Kuppuswamy Sathyanarayanan Jan. 3, 2023, 7:50 p.m. UTC | #1
On 12/31/22 10:33 AM, Lukas Wunner wrote:
> Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait
> for devices on the secondary bus to become accessible after reset:
> 
> Although it does call pci_dev_wait(), it erroneously passes the bridge's
> pci_dev rather than that of a child.  The bridge of course is always
> accessible while its secondary bus is reset, so pci_dev_wait() returns
> immediately.
> 
> Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait()
> function which is called from pci_bridge_secondary_bus_reset():
> 
> https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/
> 
> However we already have pci_bridge_wait_for_secondary_bus() which does
> almost exactly what we need.  So far it's only called on resume from
> D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8).
> Re-using it for Secondary Bus Resets is a leaner and more rational
> approach than introducing a new function.
> 
> That only requires a few minor tweaks:
> 
> - Amend pci_bridge_wait_for_secondary_bus() to await accessibility of
>   the first device on the secondary bus by calling pci_dev_wait() after
>   performing the prescribed delays.  pci_dev_wait() needs two parameters,
>   a reset reason and a timeout, which callers must now pass to
>   pci_bridge_wait_for_secondary_bus().  The timeout is 1 sec for resume
>   (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI:
>   Wait up to 60 seconds for device to become ready after FLR")).
> 
> - Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or
>   -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().
> 
> - Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which
>   is now performed by pci_bridge_wait_for_secondary_bus().  A static
>   delay this long is only necessary for Conventional PCI, so modern
>   PCIe systems benefit from shorter reset times as a side effect.
> 
> Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
> Reported-by: Sheng Bi <windy.bi.enflame@gmail.com>
> Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v4.17+
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> ---

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

>  drivers/pci/pci-driver.c |  2 +-
>  drivers/pci/pci.c        | 50 ++++++++++++++++++----------------------
>  drivers/pci/pci.h        |  3 ++-
>  3 files changed, 25 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index a2ceeacc33eb..02e84c87f41a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -572,7 +572,7 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
>  
>  static void pci_pm_bridge_power_up_actions(struct pci_dev *pci_dev)
>  {
> -	pci_bridge_wait_for_secondary_bus(pci_dev);
> +	pci_bridge_wait_for_secondary_bus(pci_dev, "resume", 1000);
>  	/*
>  	 * When powering on a bridge from D3cold, the whole hierarchy may be
>  	 * powered on into D0uninitialized state, resume them to give them a
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index f43f3e84f634..b0b49243a908 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4948,24 +4948,31 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
>  /**
>   * pci_bridge_wait_for_secondary_bus - Wait for secondary bus to be accessible
>   * @dev: PCI bridge
> + * @reset_type: reset type in human-readable form
> + * @timeout: maximum time to wait for devices on secondary bus
>   *
>   * Handle necessary delays before access to the devices on the secondary
> - * side of the bridge are permitted after D3cold to D0 transition.
> + * side of the bridge are permitted after D3cold to D0 transition
> + * or Conventional Reset.
>   *
>   * For PCIe this means the delays in PCIe 5.0 section 6.6.1. For
>   * conventional PCI it means Tpvrh + Trhfa specified in PCI 3.0 section
>   * 4.3.2.
> + *
> + * Return 0 on success or -ENOTTY if the first device on the secondary bus
> + * failed to become accessible.
>   */
> -void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
> +int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
> +				      int timeout)
>  {
>  	struct pci_dev *child;
>  	int delay;
>  
>  	if (pci_dev_is_disconnected(dev))
> -		return;
> +		return 0;
>  
>  	if (!pci_is_bridge(dev))
> -		return;
> +		return 0;
>  
>  	down_read(&pci_bus_sem);
>  
> @@ -4977,14 +4984,14 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>  	 */
>  	if (!dev->subordinate || list_empty(&dev->subordinate->devices)) {
>  		up_read(&pci_bus_sem);
> -		return;
> +		return 0;
>  	}
>  
>  	/* Take d3cold_delay requirements into account */
>  	delay = pci_bus_max_d3cold_delay(dev->subordinate);
>  	if (!delay) {
>  		up_read(&pci_bus_sem);
> -		return;
> +		return 0;
>  	}
>  
>  	child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
> @@ -4993,14 +5000,12 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>  
>  	/*
>  	 * Conventional PCI and PCI-X we need to wait Tpvrh + Trhfa before
> -	 * accessing the device after reset (that is 1000 ms + 100 ms). In
> -	 * practice this should not be needed because we don't do power
> -	 * management for them (see pci_bridge_d3_possible()).
> +	 * accessing the device after reset (that is 1000 ms + 100 ms).
>  	 */
>  	if (!pci_is_pcie(dev)) {
>  		pci_dbg(dev, "waiting %d ms for secondary bus\n", 1000 + delay);
>  		msleep(1000 + delay);
> -		return;
> +		return 0;
>  	}
>  
>  	/*
> @@ -5017,11 +5022,11 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>  	 * configuration requests if we only wait for 100 ms (see
>  	 * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
>  	 *
> -	 * Therefore we wait for 100 ms and check for the device presence.
> -	 * If it is still not present give it an additional 100 ms.
> +	 * Therefore we wait for 100 ms and check for the device presence
> +	 * until the timeout expires.
>  	 */
>  	if (!pcie_downstream_port(dev))
> -		return;
> +		return 0;
>  
>  	if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
>  		pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
> @@ -5032,14 +5037,11 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>  		if (!pcie_wait_for_link_delay(dev, true, delay)) {
>  			/* Did not train, no need to wait any further */
>  			pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
> -			return;
> +			return -ENOTTY;
>  		}
>  	}
>  
> -	if (!pci_device_is_present(child)) {
> -		pci_dbg(child, "waiting additional %d ms to become accessible\n", delay);
> -		msleep(delay);
> -	}
> +	return pci_dev_wait(child, reset_type, timeout - delay);
>  }
>  
>  void pci_reset_secondary_bus(struct pci_dev *dev)
> @@ -5058,15 +5060,6 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
>  
>  	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>  	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
> -
> -	/*
> -	 * Trhfa for conventional PCI is 2^25 clock cycles.
> -	 * Assuming a minimum 33MHz clock this results in a 1s
> -	 * delay before we can consider subordinate devices to
> -	 * be re-initialized.  PCIe has some ways to shorten this,
> -	 * but we don't make use of them yet.
> -	 */
> -	ssleep(1);
>  }
>  
>  void __weak pcibios_reset_secondary_bus(struct pci_dev *dev)
> @@ -5085,7 +5078,8 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
>  {
>  	pcibios_reset_secondary_bus(dev);
>  
> -	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
> +	return pci_bridge_wait_for_secondary_bus(dev, "bus reset",
> +						 PCIE_RESET_READY_POLL_MS);
>  }
>  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
>  
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 9ed3b5550043..40758248dd80 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -86,8 +86,9 @@ void pci_msi_init(struct pci_dev *dev);
>  void pci_msix_init(struct pci_dev *dev);
>  bool pci_bridge_d3_possible(struct pci_dev *dev);
>  void pci_bridge_d3_update(struct pci_dev *dev);
> -void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
>  void pci_bridge_reconfigure_ltr(struct pci_dev *dev);
> +int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
> +				      int timeout);
>  
>  static inline void pci_wakeup_event(struct pci_dev *dev)
>  {
Bjorn Helgaas Jan. 12, 2023, 10:31 p.m. UTC | #2
On Sat, Dec 31, 2022 at 07:33:38PM +0100, Lukas Wunner wrote:
> Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait
> for devices on the secondary bus to become accessible after reset:
> 
> Although it does call pci_dev_wait(), it erroneously passes the bridge's
> pci_dev rather than that of a child.  The bridge of course is always
> accessible while its secondary bus is reset, so pci_dev_wait() returns
> immediately.
> 
> Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait()
> function which is called from pci_bridge_secondary_bus_reset():
> 
> https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/
> 
> However we already have pci_bridge_wait_for_secondary_bus() which does
> almost exactly what we need.  So far it's only called on resume from
> D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8).
> Re-using it for Secondary Bus Resets is a leaner and more rational
> approach than introducing a new function.
> 
> That only requires a few minor tweaks:
> 
> - Amend pci_bridge_wait_for_secondary_bus() to await accessibility of
>   the first device on the secondary bus by calling pci_dev_wait() after
>   performing the prescribed delays.  pci_dev_wait() needs two parameters,
>   a reset reason and a timeout, which callers must now pass to
>   pci_bridge_wait_for_secondary_bus().  The timeout is 1 sec for resume
>   (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI:
>   Wait up to 60 seconds for device to become ready after FLR")).
> 
> - Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or
>   -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().
> 
> - Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which
>   is now performed by pci_bridge_wait_for_secondary_bus().  A static
>   delay this long is only necessary for Conventional PCI, so modern
>   PCIe systems benefit from shorter reset times as a side effect.
> 
> Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
> Reported-by: Sheng Bi <windy.bi.enflame@gmail.com>
> Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v4.17+
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> ---
>  drivers/pci/pci-driver.c |  2 +-
>  drivers/pci/pci.c        | 50 ++++++++++++++++++----------------------
>  drivers/pci/pci.h        |  3 ++-
>  3 files changed, 25 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index a2ceeacc33eb..02e84c87f41a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -572,7 +572,7 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
>  
>  static void pci_pm_bridge_power_up_actions(struct pci_dev *pci_dev)
>  {
> -	pci_bridge_wait_for_secondary_bus(pci_dev);
> +	pci_bridge_wait_for_secondary_bus(pci_dev, "resume", 1000);

It sounds like this 1000 ms value is prescribed by sec 6.6.1, so we
should have a #define for it.  I know we didn't use one even before,
but this seems like a a good opportunity to add it.

>  /**
>   * pci_bridge_wait_for_secondary_bus - Wait for secondary bus to be accessible
>   * @dev: PCI bridge
> + * @reset_type: reset type in human-readable form
> + * @timeout: maximum time to wait for devices on secondary bus

I think we should mention here that the timeout is in milliseconds.

Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index a2ceeacc33eb..02e84c87f41a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -572,7 +572,7 @@  static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
 
 static void pci_pm_bridge_power_up_actions(struct pci_dev *pci_dev)
 {
-	pci_bridge_wait_for_secondary_bus(pci_dev);
+	pci_bridge_wait_for_secondary_bus(pci_dev, "resume", 1000);
 	/*
 	 * When powering on a bridge from D3cold, the whole hierarchy may be
 	 * powered on into D0uninitialized state, resume them to give them a
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f43f3e84f634..b0b49243a908 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4948,24 +4948,31 @@  static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
 /**
  * pci_bridge_wait_for_secondary_bus - Wait for secondary bus to be accessible
  * @dev: PCI bridge
+ * @reset_type: reset type in human-readable form
+ * @timeout: maximum time to wait for devices on secondary bus
  *
  * Handle necessary delays before access to the devices on the secondary
- * side of the bridge are permitted after D3cold to D0 transition.
+ * side of the bridge are permitted after D3cold to D0 transition
+ * or Conventional Reset.
  *
  * For PCIe this means the delays in PCIe 5.0 section 6.6.1. For
  * conventional PCI it means Tpvrh + Trhfa specified in PCI 3.0 section
  * 4.3.2.
+ *
+ * Return 0 on success or -ENOTTY if the first device on the secondary bus
+ * failed to become accessible.
  */
-void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
+int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
+				      int timeout)
 {
 	struct pci_dev *child;
 	int delay;
 
 	if (pci_dev_is_disconnected(dev))
-		return;
+		return 0;
 
 	if (!pci_is_bridge(dev))
-		return;
+		return 0;
 
 	down_read(&pci_bus_sem);
 
@@ -4977,14 +4984,14 @@  void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 	 */
 	if (!dev->subordinate || list_empty(&dev->subordinate->devices)) {
 		up_read(&pci_bus_sem);
-		return;
+		return 0;
 	}
 
 	/* Take d3cold_delay requirements into account */
 	delay = pci_bus_max_d3cold_delay(dev->subordinate);
 	if (!delay) {
 		up_read(&pci_bus_sem);
-		return;
+		return 0;
 	}
 
 	child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
@@ -4993,14 +5000,12 @@  void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 
 	/*
 	 * Conventional PCI and PCI-X we need to wait Tpvrh + Trhfa before
-	 * accessing the device after reset (that is 1000 ms + 100 ms). In
-	 * practice this should not be needed because we don't do power
-	 * management for them (see pci_bridge_d3_possible()).
+	 * accessing the device after reset (that is 1000 ms + 100 ms).
 	 */
 	if (!pci_is_pcie(dev)) {
 		pci_dbg(dev, "waiting %d ms for secondary bus\n", 1000 + delay);
 		msleep(1000 + delay);
-		return;
+		return 0;
 	}
 
 	/*
@@ -5017,11 +5022,11 @@  void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 	 * configuration requests if we only wait for 100 ms (see
 	 * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
 	 *
-	 * Therefore we wait for 100 ms and check for the device presence.
-	 * If it is still not present give it an additional 100 ms.
+	 * Therefore we wait for 100 ms and check for the device presence
+	 * until the timeout expires.
 	 */
 	if (!pcie_downstream_port(dev))
-		return;
+		return 0;
 
 	if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
 		pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
@@ -5032,14 +5037,11 @@  void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 		if (!pcie_wait_for_link_delay(dev, true, delay)) {
 			/* Did not train, no need to wait any further */
 			pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
-			return;
+			return -ENOTTY;
 		}
 	}
 
-	if (!pci_device_is_present(child)) {
-		pci_dbg(child, "waiting additional %d ms to become accessible\n", delay);
-		msleep(delay);
-	}
+	return pci_dev_wait(child, reset_type, timeout - delay);
 }
 
 void pci_reset_secondary_bus(struct pci_dev *dev)
@@ -5058,15 +5060,6 @@  void pci_reset_secondary_bus(struct pci_dev *dev)
 
 	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
-
-	/*
-	 * Trhfa for conventional PCI is 2^25 clock cycles.
-	 * Assuming a minimum 33MHz clock this results in a 1s
-	 * delay before we can consider subordinate devices to
-	 * be re-initialized.  PCIe has some ways to shorten this,
-	 * but we don't make use of them yet.
-	 */
-	ssleep(1);
 }
 
 void __weak pcibios_reset_secondary_bus(struct pci_dev *dev)
@@ -5085,7 +5078,8 @@  int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
 {
 	pcibios_reset_secondary_bus(dev);
 
-	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
+	return pci_bridge_wait_for_secondary_bus(dev, "bus reset",
+						 PCIE_RESET_READY_POLL_MS);
 }
 EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 9ed3b5550043..40758248dd80 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -86,8 +86,9 @@  void pci_msi_init(struct pci_dev *dev);
 void pci_msix_init(struct pci_dev *dev);
 bool pci_bridge_d3_possible(struct pci_dev *dev);
 void pci_bridge_d3_update(struct pci_dev *dev);
-void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
 void pci_bridge_reconfigure_ltr(struct pci_dev *dev);
+int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
+				      int timeout);
 
 static inline void pci_wakeup_event(struct pci_dev *dev)
 {