diff mbox series

[v2,1/2] PCI/portdrv: Add necessary wait for disabling hotplug events

Message ID 20250218034859.40397-1-feng.tang@linux.alibaba.com (mailing list archive)
State New
Delegated to: Bjorn Helgaas
Headers show
Series [v2,1/2] PCI/portdrv: Add necessary wait for disabling hotplug events | expand

Commit Message

Feng Tang Feb. 18, 2025, 3:48 a.m. UTC
There was problem reported by firmware developers that they received
2 pcie link control commands in very short intervals on an ARM server,
which doesn't comply with pcie spec, and broke their state machine and
work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs
to wait at least 1 second for the command-complete event, before
resending the cmd or sending a new cmd.

And the first link control command firmware received is from
get_port_device_capability(), which sends cmd to disable pcie hotplug
interrupts without waiting for its completion.

Fix it by adding the necessary wait to comply with PCIe spec, referring
pcie_poll_cmd().

Also make the interrupt disabling not dependent on whether pciehp
service driver will be loaded as suggested by Lukas.

Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
Originally-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
---
Changlog:

  since v1:
    * Add the Originally-by for Liguang. The issue was found on a 5.10 kernel,
      then 6.6. I was initially given a 5.10 kernel tar bar without git info to
      debug the issue, and made the patch. Thanks to Guanghui who recently pointed
      me to tree https://gitee.com/anolis/cloud-kernel which show the wait logic
      in 5.10 was originally from Liguang, and never hit mainline.
    * Make the irq disabling not dependent on wthether pciehp service driver
      will be loaded (Lukas Wunner) 
    * Use read_poll_timeout() API to simply the waiting logic (Sathyanarayanan
      Kuppuswamy)
    * Add logic to skip irq disabling if it is already disabled.

 drivers/pci/pci.h          |  2 ++
 drivers/pci/pcie/portdrv.c | 44 +++++++++++++++++++++++++++++++++-----
 2 files changed, 41 insertions(+), 5 deletions(-)

Comments

Markus Elfring Feb. 18, 2025, 9 a.m. UTC | #1
> There was problem reported by firmware developers that they received
> 2 pcie link control commands in very short intervals on an ARM server,
> which doesn't comply with pcie spec, and broke their state machine and
> work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs

Would you like to use key words in consistent ways also in such a change description?


> to wait at least 1 second for the command-complete event, before
> resending the cmd or …

                command?

…
> ---
> Changlog:
>
>   since v1:
…

Are cover letters generally desirable for patch series?

Regards,
Markus
Sathyanarayanan Kuppuswamy Feb. 18, 2025, 6:58 p.m. UTC | #2
On 2/17/25 7:48 PM, Feng Tang wrote:
> There was problem reported by firmware developers that they received
> 2 pcie link control commands in very short intervals on an ARM server,
> which doesn't comply with pcie spec, and broke their state machine and
> work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs
> to wait at least 1 second for the command-complete event, before
> resending the cmd or sending a new cmd.
>
> And the first link control command firmware received is from
> get_port_device_capability(), which sends cmd to disable pcie hotplug
> interrupts without waiting for its completion.

Were you able to narrow down the source of the second command? The
reason I am asking is, the commit you are trying to fix seems to have
existed for 10+ years and no one had faced any issues with it. So
I am wondering whether this needs to fixed at this place or before
executing the second command.

>
> Fix it by adding the necessary wait to comply with PCIe spec, referring
> pcie_poll_cmd().
>
> Also make the interrupt disabling not dependent on whether pciehp
> service driver will be loaded as suggested by Lukas.

May be this needs a new patch?

>
> Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
> Originally-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
> Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
> Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
> ---

Code wise it looks fine to me.

> Changlog:
>
>    since v1:
>      * Add the Originally-by for Liguang. The issue was found on a 5.10 kernel,
>        then 6.6. I was initially given a 5.10 kernel tar bar without git info to
>        debug the issue, and made the patch. Thanks to Guanghui who recently pointed
>        me to tree https://gitee.com/anolis/cloud-kernel which show the wait logic
>        in 5.10 was originally from Liguang, and never hit mainline.
>      * Make the irq disabling not dependent on wthether pciehp service driver
>        will be loaded (Lukas Wunner)
>      * Use read_poll_timeout() API to simply the waiting logic (Sathyanarayanan
>        Kuppuswamy)
>      * Add logic to skip irq disabling if it is already disabled.
>
>   drivers/pci/pci.h          |  2 ++
>   drivers/pci/pcie/portdrv.c | 44 +++++++++++++++++++++++++++++++++-----
>   2 files changed, 41 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 01e51db8d285..c1e234d1b81d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
>   #ifdef CONFIG_PCIEPORTBUS
>   void pcie_reset_lbms_count(struct pci_dev *port);
>   int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
>   #else
>   static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
>   static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
>   {
>   	return -EOPNOTSUPP;
>   }
> +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
>   #endif
>   
>   struct pci_dev_reset_methods {
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 02e73099bad0..2470333bba2f 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -18,6 +18,7 @@
>   #include <linux/string.h>
>   #include <linux/slab.h>
>   #include <linux/aer.h>
> +#include <linux/iopoll.h>
>   
>   #include "../pci.h"
>   #include "portdrv.h"
> @@ -205,6 +206,40 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
>   	return 0;
>   }
>   
> +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *dev)
> +{
> +	u16 slot_status = 0;
> +	int ret, ret1, timeout_us;
> +
> +	/* 1 second, according to PCIe spec 6.1, section 6.7.3.2 */
> +	timeout_us = 1000000;
> +	ret = read_poll_timeout(pcie_capability_read_word, ret1,
> +				(slot_status & PCI_EXP_SLTSTA_CC), 10000,
> +				timeout_us, true, dev, PCI_EXP_SLTSTA,
> +				&slot_status);
> +	if (!ret)
> +		pcie_capability_write_word(dev, PCI_EXP_SLTSTA,
> +						PCI_EXP_SLTSTA_CC);
> +
> +	return  ret;
> +}
> +
> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
> +{
> +	u16 slot_ctrl = 0;
> +
> +	pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl);
> +	/* Bail out early if it is already disabled */
> +	if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE)))
> +		return;
> +
> +	pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> +		  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> +
> +	if (pcie_wait_sltctl_cmd_raw(dev))
> +		pci_info(dev, "Timeout on disabling PCIE hot-plug interrupt\n");
> +}
> +
>   /**
>    * get_port_device_capability - discover capabilities of a PCI Express port
>    * @dev: PCI Express port to examine
> @@ -222,16 +257,15 @@ static int get_port_device_capability(struct pci_dev *dev)
>   
>   	if (dev->is_hotplug_bridge &&
>   	    (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> -	     pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
> -	    (pcie_ports_native || host->native_pcie_hotplug)) {
> -		services |= PCIE_PORT_SERVICE_HP;
> +	     pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
> +		if (pcie_ports_native || host->native_pcie_hotplug)
> +			services |= PCIE_PORT_SERVICE_HP;
>   
>   		/*
>   		 * Disable hot-plug interrupts in case they have been enabled
>   		 * by the BIOS and the hot-plug service driver is not loaded.
>   		 */
> -		pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> -			  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> +		pcie_disable_hp_interrupts_early(dev);
>   	}
>   
>   #ifdef CONFIG_PCIEAER
Bjorn Helgaas Feb. 18, 2025, 10:33 p.m. UTC | #3
On Tue, Feb 18, 2025 at 11:48:58AM +0800, Feng Tang wrote:
> There was problem reported by firmware developers that they received
> 2 pcie link control commands in very short intervals on an ARM server,
> which doesn't comply with pcie spec, and broke their state machine and
> work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs
> to wait at least 1 second for the command-complete event, before
> resending the cmd or sending a new cmd.

s/link control/hotplug/ (also below)
s/2/two/
s/pcie/PCIe/ (also below)

> And the first link control command firmware received is from
> get_port_device_capability(), which sends cmd to disable pcie hotplug
> interrupts without waiting for its completion.
> 
> Fix it by adding the necessary wait to comply with PCIe spec, referring
> pcie_poll_cmd().
> 
> Also make the interrupt disabling not dependent on whether pciehp
> service driver will be loaded as suggested by Lukas.

This sounds like maybe it should be two separate patches.

> Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
> Originally-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
> Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
> Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
> ---
> Changlog:
> 
>   since v1:
>     * Add the Originally-by for Liguang. The issue was found on a 5.10 kernel,
>       then 6.6. I was initially given a 5.10 kernel tar bar without git info to
>       debug the issue, and made the patch. Thanks to Guanghui who recently pointed
>       me to tree https://gitee.com/anolis/cloud-kernel which show the wait logic
>       in 5.10 was originally from Liguang, and never hit mainline.
>     * Make the irq disabling not dependent on wthether pciehp service driver
>       will be loaded (Lukas Wunner) 
>     * Use read_poll_timeout() API to simply the waiting logic (Sathyanarayanan
>       Kuppuswamy)
>     * Add logic to skip irq disabling if it is already disabled.
> 
>  drivers/pci/pci.h          |  2 ++
>  drivers/pci/pcie/portdrv.c | 44 +++++++++++++++++++++++++++++++++-----
>  2 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 01e51db8d285..c1e234d1b81d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
>  #ifdef CONFIG_PCIEPORTBUS
>  void pcie_reset_lbms_count(struct pci_dev *port);
>  int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
>  #else
>  static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
>  static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
>  {
>  	return -EOPNOTSUPP;
>  }
> +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
>  #endif
>  
>  struct pci_dev_reset_methods {
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 02e73099bad0..2470333bba2f 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -18,6 +18,7 @@
>  #include <linux/string.h>
>  #include <linux/slab.h>
>  #include <linux/aer.h>
> +#include <linux/iopoll.h>
>  
>  #include "../pci.h"
>  #include "portdrv.h"
> @@ -205,6 +206,40 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
>  	return 0;
>  }
>  
> +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *dev)
> +{
> +	u16 slot_status = 0;
> +	int ret, ret1, timeout_us;
> +
> +	/* 1 second, according to PCIe spec 6.1, section 6.7.3.2 */
> +	timeout_us = 1000000;
> +	ret = read_poll_timeout(pcie_capability_read_word, ret1,
> +				(slot_status & PCI_EXP_SLTSTA_CC), 10000,
> +				timeout_us, true, dev, PCI_EXP_SLTSTA,
> +				&slot_status);
> +	if (!ret)
> +		pcie_capability_write_word(dev, PCI_EXP_SLTSTA,
> +						PCI_EXP_SLTSTA_CC);
> +
> +	return  ret;

Ugh.  I really don't like the way this basically duplicates
pcie_poll_cmd().  I don't have a great suggestion to fix it; maybe we
need a way to build part of pciehp unconditionally.  At the very least
we need a comment here pointing to pcie_poll_cmd().

And IIUC this will add a one second delay for ports that don't need
command completed events.  I don't think that's fair to those ports.

> +}
> +
> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
> +{
> +	u16 slot_ctrl = 0;
> +
> +	pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl);
> +	/* Bail out early if it is already disabled */
> +	if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE)))
> +		return;
> +
> +	pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> +		  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> +
> +	if (pcie_wait_sltctl_cmd_raw(dev))
> +		pci_info(dev, "Timeout on disabling PCIE hot-plug interrupt\n");

s/PCIE/PCIe/

> +}
> +
>  /**
>   * get_port_device_capability - discover capabilities of a PCI Express port
>   * @dev: PCI Express port to examine
> @@ -222,16 +257,15 @@ static int get_port_device_capability(struct pci_dev *dev)
>  
>  	if (dev->is_hotplug_bridge &&
>  	    (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> -	     pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
> -	    (pcie_ports_native || host->native_pcie_hotplug)) {
> -		services |= PCIE_PORT_SERVICE_HP;
> +	     pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
> +		if (pcie_ports_native || host->native_pcie_hotplug)
> +			services |= PCIE_PORT_SERVICE_HP;
>  
>  		/*
>  		 * Disable hot-plug interrupts in case they have been enabled
>  		 * by the BIOS and the hot-plug service driver is not loaded.
>  		 */
> -		pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> -			  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> +		pcie_disable_hp_interrupts_early(dev);
>  	}
>  
>  #ifdef CONFIG_PCIEAER
> -- 
> 2.43.5
>
diff mbox series

Patch

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 01e51db8d285..c1e234d1b81d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -759,12 +759,14 @@  static inline void pcie_ecrc_get_policy(char *str) { }
 #ifdef CONFIG_PCIEPORTBUS
 void pcie_reset_lbms_count(struct pci_dev *port);
 int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
 #else
 static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
 static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
 {
 	return -EOPNOTSUPP;
 }
+static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
 #endif
 
 struct pci_dev_reset_methods {
diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 02e73099bad0..2470333bba2f 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -18,6 +18,7 @@ 
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/aer.h>
+#include <linux/iopoll.h>
 
 #include "../pci.h"
 #include "portdrv.h"
@@ -205,6 +206,40 @@  static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 	return 0;
 }
 
+static int pcie_wait_sltctl_cmd_raw(struct pci_dev *dev)
+{
+	u16 slot_status = 0;
+	int ret, ret1, timeout_us;
+
+	/* 1 second, according to PCIe spec 6.1, section 6.7.3.2 */
+	timeout_us = 1000000;
+	ret = read_poll_timeout(pcie_capability_read_word, ret1,
+				(slot_status & PCI_EXP_SLTSTA_CC), 10000,
+				timeout_us, true, dev, PCI_EXP_SLTSTA,
+				&slot_status);
+	if (!ret)
+		pcie_capability_write_word(dev, PCI_EXP_SLTSTA,
+						PCI_EXP_SLTSTA_CC);
+
+	return  ret;
+}
+
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
+{
+	u16 slot_ctrl = 0;
+
+	pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl);
+	/* Bail out early if it is already disabled */
+	if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE)))
+		return;
+
+	pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
+		  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+
+	if (pcie_wait_sltctl_cmd_raw(dev))
+		pci_info(dev, "Timeout on disabling PCIE hot-plug interrupt\n");
+}
+
 /**
  * get_port_device_capability - discover capabilities of a PCI Express port
  * @dev: PCI Express port to examine
@@ -222,16 +257,15 @@  static int get_port_device_capability(struct pci_dev *dev)
 
 	if (dev->is_hotplug_bridge &&
 	    (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
-	     pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
-	    (pcie_ports_native || host->native_pcie_hotplug)) {
-		services |= PCIE_PORT_SERVICE_HP;
+	     pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
+		if (pcie_ports_native || host->native_pcie_hotplug)
+			services |= PCIE_PORT_SERVICE_HP;
 
 		/*
 		 * Disable hot-plug interrupts in case they have been enabled
 		 * by the BIOS and the hot-plug service driver is not loaded.
 		 */
-		pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
-			  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+		pcie_disable_hp_interrupts_early(dev);
 	}
 
 #ifdef CONFIG_PCIEAER