
[2/2] cxl: Set VPD timeout to avoid access failures

Message ID: 1479762606-51632-1-git-send-email-mrochs@linux.vnet.ibm.com
State: New, archived
Delegated to: Bjorn Helgaas

Commit Message

Matthew R. Ochs Nov. 21, 2016, 9:10 p.m. UTC
Some IBM CXL devices can take up to ~120ms to complete a VPD access
transaction when under heavy load. With an existing default VPD timeout
of 50ms, reads/writes can fail despite there not being an issue with the
underlying hardware.

To avoid these false failures, increase the VPD wait timeout for certain
CXL devices to a value that allows appropriate VPD access delay when faced
with a worst-case scenario.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/misc/cxl/pci.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Andrew Donnellan Nov. 22, 2016, 12:32 a.m. UTC | #1
On 22/11/16 08:10, Matthew R. Ochs wrote:
> Some IBM CXL devices can take up to ~120ms to complete a VPD access

Which devices?

> transaction when under heavy load. With an existing default VPD timeout
> of 50ms, reads/writes can fail despite there not being an issue with the
> underlying hardware.
>
> To avoid these false failures, increase the VPD wait timeout for certain
> CXL devices to a value that allows appropriate VPD access delay when faced
> with a worst-case scenario.
>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>

Contingent on Bjorn applying patch 1, and with comment below:

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>  drivers/misc/cxl/pci.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
> index e96be9c..56a67cf 100644
> --- a/drivers/misc/cxl/pci.c
> +++ b/drivers/misc/cxl/pci.c
> @@ -1675,6 +1675,10 @@ bool cxl_slot_is_supported(struct pci_dev *dev, int flags)
>  }
>  EXPORT_SYMBOL_GPL(cxl_slot_is_supported);
>
> +static bool cxl_pci_is_slow_vpd_device(struct pci_dev *dev)
> +{
> +	return dev->device == 0x0601;

Comment with device name.

> +}
>
>  static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  {
> @@ -1692,6 +1696,9 @@ static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  		return -ENODEV;
>  	}
>
> +	if (cxl_pci_is_slow_vpd_device(dev))
> +		pci_set_vpd_timeout(dev, msecs_to_jiffies(125));
> +
>  	if (cxl_verbose)
>  		dump_cxl_config_space(dev);
>
>
Matthew R. Ochs Nov. 22, 2016, 12:40 a.m. UTC | #2
> On Nov 21, 2016, at 6:32 PM, Andrew Donnellan <andrew.donnellan@au1.ibm.com> wrote:
> On 22/11/16 08:10, Matthew R. Ochs wrote:
>> Some IBM CXL devices can take up to ~120ms to complete a VPD access
> 
> Which devices?

Flash GT

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Frederic Barrat Nov. 22, 2016, 5:41 p.m. UTC | #3
On 21/11/2016 at 22:10, Matthew R. Ochs wrote:
> Some IBM CXL devices can take up to ~120ms to complete a VPD access
> transaction when under heavy load. With an existing default VPD timeout
> of 50ms, reads/writes can fail despite there not being an issue with the
> underlying hardware.
>
> To avoid these false failures, increase the VPD wait timeout for certain
> CXL devices to a value that allows appropriate VPD access delay when faced
> with a worst-case scenario.
>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> ---

If the new PCI API gets in, then I'm OK with the cxl change, at least
for now. In the longer term, we should consider opening up the cxl kernel
API to give access to the new PCI API so that it can be tuned by the AFU
driver directly, since it would know better than cxl what the proper VPD
access timeout should be.

Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>

   Fred


Patch

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index e96be9c..56a67cf 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1675,6 +1675,10 @@  bool cxl_slot_is_supported(struct pci_dev *dev, int flags)
 }
 EXPORT_SYMBOL_GPL(cxl_slot_is_supported);
 
+static bool cxl_pci_is_slow_vpd_device(struct pci_dev *dev)
+{
+	return dev->device == 0x0601;
+}
 
 static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
 {
@@ -1692,6 +1696,9 @@  static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
 		return -ENODEV;
 	}
 
+	if (cxl_pci_is_slow_vpd_device(dev))
+		pci_set_vpd_timeout(dev, msecs_to_jiffies(125));
+
 	if (cxl_verbose)
 		dump_cxl_config_space(dev);
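
To illustrate why the commit message treats 50ms as too short and 125ms as sufficient, here is a minimal user-space sketch of a deadline-bounded wait. It is not kernel code and does not reproduce the PCI core's actual VPD wait loop: the function names, the virtual clock, and the polling interval are all assumptions made for illustration. Only the 50ms default, the ~120ms worst case, the 125ms value, and device ID 0x0601 come from the thread above.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the commit message and the patch. */
#define DEFAULT_VPD_TIMEOUT_MS  50      /* existing default, per the commit message */
#define SLOW_VPD_TIMEOUT_MS     125     /* value the patch passes to pci_set_vpd_timeout() */
#define SLOW_VPD_DEVICE_ID      0x0601  /* device ID matched by the patch */

/*
 * Hypothetical helper: pick a timeout the way the patch does,
 * i.e. a longer value for the one known slow device.
 */
unsigned int vpd_timeout_for_device(unsigned short device_id)
{
	return device_id == SLOW_VPD_DEVICE_ID ? SLOW_VPD_TIMEOUT_MS
					       : DEFAULT_VPD_TIMEOUT_MS;
}

/*
 * Hypothetical model of a polled wait on a virtual clock (no real sleeping).
 * The device is assumed to signal completion after device_latency_ms.
 * Returns true if completion is observed no later than the deadline.
 */
bool vpd_wait_completes(unsigned int device_latency_ms,
			unsigned int timeout_ms,
			unsigned int poll_interval_ms)
{
	unsigned int elapsed_ms = 0;

	if (poll_interval_ms == 0)
		poll_interval_ms = 1;	/* avoid a loop that never advances */

	for (;;) {
		if (elapsed_ms >= device_latency_ms)
			return true;	/* completion flag observed */
		if (elapsed_ms >= timeout_ms)
			return false;	/* deadline passed: a false failure if the device is merely slow */
		elapsed_ms += poll_interval_ms;
	}
}
```

With a 1ms polling interval, a 120ms transaction fails under the 50ms default and completes under the 125ms value, which is the behaviour the patch is meant to achieve for the slow device.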