diff mbox series

[V11,4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices

Message ID 20211030135348.61364-5-liudongdong3@huawei.com (mailing list archive)
State Not Applicable
Delegated to: Bjorn Helgaas
Headers show
Series PCI: Enable 10-Bit tag support for PCIe devices | expand

Commit Message

Dongdong Liu Oct. 30, 2021, 1:53 p.m. UTC
PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add a tags sysfs file, write 0 to disable 10-Bit Tag Requester
when the driver does not bind the device. The typical use case is for
p2pdma when the peer device does not support 10-Bit Tag Completer.
Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
Completer capability. The typical use case is for host memory targeted
by DMA Requests. The tags file content indicate current status of Tags
Enable.

PCIe r5.0, sec 2.2.6.2 says:

  Receivers/Completers must handle 8-bit Tag values correctly regardless
  of the setting of their Extended Tag Field Enable bit (see Section
  7.5.3.4).

Add this comment in pci_configure_extended_tags(). As all PCIe completers
are required to support 8-bit tags, so we do not use tags sysfs file
to manage 8-bit tags.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 24 ++++++-
 drivers/pci/pci-sysfs.c                 | 88 +++++++++++++++++++++++++
 drivers/pci/pci.h                       |  2 +
 drivers/pci/probe.c                     | 20 ++++++
 4 files changed, 133 insertions(+), 1 deletion(-)

Comments

Bjorn Helgaas Nov. 1, 2021, 8:54 p.m. UTC | #1
On Sat, Oct 30, 2021 at 09:53:44PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
> 
>   If an Endpoint supports sending Requests to other Endpoints (as
>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>   Requests to another given Endpoint unless an implementation-specific
>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>   capability.
> 
> Add a tags sysfs file, write 0 to disable 10-Bit Tag Requester
> when the driver does not bind the device. The typical use case is for
> p2pdma when the peer device does not support 10-Bit Tag Completer.
> Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
> Completer capability. The typical use case is for host memory targeted
> by DMA Requests. The tags file content indicate current status of Tags
> Enable.
> 
> PCIe r5.0, sec 2.2.6.2 says:
> 
>   Receivers/Completers must handle 8-bit Tag values correctly regardless
>   of the setting of their Extended Tag Field Enable bit (see Section
>   7.5.3.4).
> 
> Add this comment in pci_configure_extended_tags(). As all PCIe completers
> are required to support 8-bit tags, so we do not use tags sysfs file
> to manage 8-bit tags.

> +What:		/sys/bus/pci/devices/.../tags
> +Date:		September 2021
> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> +Description:
> +		The file will be visible when the device supports 10-Bit Tag
> +		Requester. The file is readable, the value indicate current
> +		status of Tags Enable(5-Bit, 8-Bit, 10-Bit).
> +
> +		The file is also writable, The values accepted are:
> +		* > 0 - this number will be reported as tags bit to be
> +			enabled. current only 10 is accepted
> +		* < 0 - not valid
> +		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
> +
> +		write 0 to disable 10-Bit Tag Requester when the driver does
> +		not bind the device. The typical use case is for p2pdma when
> +		the peer device does not support 10-Bit Tag Completer.
> +
> +		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
> +		Tag Completer capability. The typical use case is for host
> +		memory targeted by DMA Requests.

1) I think I would rename this from "tags" to "tag_bits".  A file
   named "tags" that contains 8 suggests that we can use 8 tags, but
   in fact, we can use 256 tags.

2) This controls tag size the requester will use.  The current knobs
   in the hardware allow 5, 8, or 10 bits.

   "0" to disable 10-bit tags without specifying whether we should use
   5- or 8-bit tags doesn't seem right.  All completers are *supposed*
   to support 8-bit, but we've tripped over a few that don't.

   I don't think we currently have a run-time (or even a boot-time)
   way to disable 8-bit tags; all we have is the quirk_no_ext_tags()
   quirk.  But if we ever wanted to *add* that, maybe we would want:

      5 - use 5-bit tags
      8 - use 8-bit tags
     10 - use 10-bit tags

   Maybe we just say "0" is invalid, since there's no obvious way to
   map this?

> +static ssize_t tags_show(struct device *dev,
> +			 struct device_attribute *attr,
> +			 char *buf)
> +{
> + ...

> +	if (ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)
> +		return sysfs_emit(buf, "%s\n", "10-Bit");
> +
> +	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
> +	if (ret)
> +		return -EINVAL;
> +
> +	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
> +		return sysfs_emit(buf, "%s\n", "8-Bit");
> +
> +	return sysfs_emit(buf, "%s\n", "5-Bit");

Since I prefer the "tag_bits" name, my preference would be bare
numbers here: "10", "8", "5".

Both comments apply to the sriov files, too.

> +static umode_t pcie_dev_tags_attrs_is_visible(struct kobject *kobj,
> +					      struct attribute *a, int n)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	if (pdev->is_virtfn)
> +		return 0;
> +
> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
> +		return 0;
> +
> +	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
> +		return 0;

Makes sense for now that the file is only visible if a requester
supports 10-bit tags.  If we ever wanted to extend this to control 5-
vs 8-bit tags, we could make it visible in more cases then.

> +
> +	return a->mode;
> +}

> @@ -2075,6 +2089,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>  		return 0;
>  	}
>  
> +	/*
> +	 * PCIe r5.0, sec 2.2.6.2 says "Receivers/Completers must handle 8-bit
> +	 * Tag values correctly regardless of the setting of their Extended Tag
> +	 * Field Enable bit (see Section 7.5.3.4)", so it is safe to enable
> +	 * Extented Tags.

s/Extented/Extended/

> +	 */
>  	if (!(ctl & PCI_EXP_DEVCTL_EXT_TAG)) {
>  		pci_info(dev, "enabling Extended Tags\n");
>  		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
> -- 
> 2.22.0
>
Dongdong Liu Nov. 2, 2021, 1:03 p.m. UTC | #2
Hi Bjorn

Many thanks for you review.
On 2021/11/2 4:54, Bjorn Helgaas wrote:
> On Sat, Oct 30, 2021 at 09:53:44PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
>>
>>   If an Endpoint supports sending Requests to other Endpoints (as
>>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>>   Requests to another given Endpoint unless an implementation-specific
>>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>>   capability.
>>
>> Add a tags sysfs file, write 0 to disable 10-Bit Tag Requester
>> when the driver does not bind the device. The typical use case is for
>> p2pdma when the peer device does not support 10-Bit Tag Completer.
>> Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
>> Completer capability. The typical use case is for host memory targeted
>> by DMA Requests. The tags file content indicate current status of Tags
>> Enable.
>>
>> PCIe r5.0, sec 2.2.6.2 says:
>>
>>   Receivers/Completers must handle 8-bit Tag values correctly regardless
>>   of the setting of their Extended Tag Field Enable bit (see Section
>>   7.5.3.4).
>>
>> Add this comment in pci_configure_extended_tags(). As all PCIe completers
>> are required to support 8-bit tags, so we do not use tags sysfs file
>> to manage 8-bit tags.
>
>> +What:		/sys/bus/pci/devices/.../tags
>> +Date:		September 2021
>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>> +Description:
>> +		The file will be visible when the device supports 10-Bit Tag
>> +		Requester. The file is readable, the value indicate current
>> +		status of Tags Enable(5-Bit, 8-Bit, 10-Bit).
>> +
>> +		The file is also writable, The values accepted are:
>> +		* > 0 - this number will be reported as tags bit to be
>> +			enabled. current only 10 is accepted
>> +		* < 0 - not valid
>> +		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
>> +
>> +		write 0 to disable 10-Bit Tag Requester when the driver does
>> +		not bind the device. The typical use case is for p2pdma when
>> +		the peer device does not support 10-Bit Tag Completer.
>> +
>> +		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
>> +		Tag Completer capability. The typical use case is for host
>> +		memory targeted by DMA Requests.
>
> 1) I think I would rename this from "tags" to "tag_bits".  A file
>    named "tags" that contains 8 suggests that we can use 8 tags, but
>    in fact, we can use 256 tags.
Looks good, Will do.
>
> 2) This controls tag size the requester will use.  The current knobs
>    in the hardware allow 5, 8, or 10 bits.
>
>    "0" to disable 10-bit tags without specifying whether we should use
>    5- or 8-bit tags doesn't seem right.  All completers are *supposed*
>    to support 8-bit, but we've tripped over a few that don't.
>
>    I don't think we currently have a run-time (or even a boot-time)
>    way to disable 8-bit tags; all we have is the quirk_no_ext_tags()
>    quirk.  But if we ever wanted to *add* that, maybe we would want:
>
>       5 - use 5-bit tags
>       8 - use 8-bit tags
>      10 - use 10-bit tags
will do.
>    Maybe we just say "0" is invalid, since there's no obvious way to
>    map this?
OK, will do.
>
>> +static ssize_t tags_show(struct device *dev,
>> +			 struct device_attribute *attr,
>> +			 char *buf)
>> +{
>> + ...
>
>> +	if (ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)
>> +		return sysfs_emit(buf, "%s\n", "10-Bit");
>> +
>> +	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
>> +	if (ret)
>> +		return -EINVAL;
>> +
>> +	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
>> +		return sysfs_emit(buf, "%s\n", "8-Bit");
>> +
>> +	return sysfs_emit(buf, "%s\n", "5-Bit");
>
> Since I prefer the "tag_bits" name, my preference would be bare
> numbers here: "10", "8", "5".
Will do.
>
> Both comments apply to the sriov files, too.
Will do.
>
>> +static umode_t pcie_dev_tags_attrs_is_visible(struct kobject *kobj,
>> +					      struct attribute *a, int n)
>> +{
>> +	struct device *dev = kobj_to_dev(kobj);
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> +	if (pdev->is_virtfn)
>> +		return 0;
>> +
>> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
>> +		return 0;
>> +
>> +	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
>> +		return 0;
>
> Makes sense for now that the file is only visible if a requester
> supports 10-bit tags.  If we ever wanted to extend this to control 5-
> vs 8-bit tags, we could make it visible in more cases then.
Will do.
>
>> +
>> +	return a->mode;
>> +}
>
>> @@ -2075,6 +2089,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>>  		return 0;
>>  	}
>>
>> +	/*
>> +	 * PCIe r5.0, sec 2.2.6.2 says "Receivers/Completers must handle 8-bit
>> +	 * Tag values correctly regardless of the setting of their Extended Tag
>> +	 * Field Enable bit (see Section 7.5.3.4)", so it is safe to enable
>> +	 * Extented Tags.
>
> s/Extented/Extended/
Will fix.

Thanks,
Dongdong
>
>> +	 */
>>  	if (!(ctl & PCI_EXP_DEVCTL_EXT_TAG)) {
>>  		pci_info(dev, "enabling Extended Tags\n");
>>  		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
>> --
>> 2.22.0
>>
> .
>
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index d4ae03296861..c16bb31486d2 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -156,7 +156,7 @@  Description:
 		binary file containing the Vital Product Data for the
 		device.  It should follow the VPD format defined in
 		PCI Specification 2.1 or 2.2, but users should consider
-		that some devices may have incorrectly formatted data.  
+		that some devices may have incorrectly formatted data.
 		If the underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
 
@@ -424,3 +424,25 @@  Description:
 
 		The file is writable if the PF is bound to a driver that
 		implements ->sriov_set_msix_vec_count().
+
+What:		/sys/bus/pci/devices/.../tags
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		The file will be visible when the device supports 10-Bit Tag
+		Requester. The file is readable, the value indicate current
+		status of Tags Enable(5-Bit, 8-Bit, 10-Bit).
+
+		The file is also writable, The values accepted are:
+		* > 0 - this number will be reported as tags bit to be
+			enabled. current only 10 is accepted
+		* < 0 - not valid
+		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
+
+		write 0 to disable 10-Bit Tag Requester when the driver does
+		not bind the device. The typical use case is for p2pdma when
+		the peer device does not support 10-Bit Tag Completer.
+
+		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
+		Tag Completer capability. The typical use case is for host
+		memory targeted by DMA Requests.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7fb5cd17cc98..04fd9a8b9d4e 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -306,6 +306,65 @@  static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(enable);
 
+static ssize_t tags_store(struct device *dev,
+			  struct device_attribute *attr,
+			  const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	/* 10 - enable 10-Bit Tag, 0 - disable 10-Bit Tag */
+	if (val != 10 && val != 0)
+		return -EINVAL;
+
+	if (pdev->driver)
+		return -EBUSY;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+		return -EPERM;
+
+	if (val == 10)
+		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
+					 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	else
+		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
+					   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+
+	return count;
+}
+
+static ssize_t tags_show(struct device *dev,
+			 struct device_attribute *attr,
+			 char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	if (ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)
+		return sysfs_emit(buf, "%s\n", "10-Bit");
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
+		return sysfs_emit(buf, "%s\n", "8-Bit");
+
+	return sysfs_emit(buf, "%s\n", "5-Bit");
+}
+static DEVICE_ATTR_RW(tags);
+
 #ifdef CONFIG_NUMA
 static ssize_t numa_node_store(struct device *dev,
 			       struct device_attribute *attr, const char *buf,
@@ -635,6 +694,11 @@  static struct attribute *pcie_dev_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_dev_tags_attrs[] = {
+	&dev_attr_tags.attr,
+	NULL,
+};
+
 static struct attribute *pcibus_attrs[] = {
 	&dev_attr_bus_rescan.attr,
 	&dev_attr_cpuaffinity.attr,
@@ -1482,6 +1546,24 @@  static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	return 0;
 }
 
+static umode_t pcie_dev_tags_attrs_is_visible(struct kobject *kobj,
+					      struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pdev->is_virtfn)
+		return 0;
+
+	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
+		return 0;
+
+	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return 0;
+
+	return a->mode;
+}
+
 static const struct attribute_group pci_dev_group = {
 	.attrs = pci_dev_attrs,
 };
@@ -1522,6 +1604,11 @@  static const struct attribute_group pcie_dev_attr_group = {
 	.is_visible = pcie_dev_attrs_are_visible,
 };
 
+static const struct attribute_group pcie_dev_tags_attr_group = {
+	.attrs = pcie_dev_tags_attrs,
+	.is_visible = pcie_dev_tags_attrs_is_visible,
+};
+
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
@@ -1531,6 +1618,7 @@  static const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
+	&pcie_dev_tags_attr_group,
 #ifdef CONFIG_PCIEAER
 	&aer_stats_attr_group,
 #endif
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1cce56c2aea0..f719a41dfc7f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -264,6 +264,8 @@  struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev);
+
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7259ad774ac8..8f5372c7c737 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2042,6 +2042,20 @@  static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
+{
+	struct pci_dev *root;
+
+	root = pcie_find_root_port(dev);
+	if (!root)
+		return false;
+
+	if (!(root->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return false;
+
+	return true;
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
@@ -2075,6 +2089,12 @@  int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		return 0;
 	}
 
+	/*
+	 * PCIe r5.0, sec 2.2.6.2 says "Receivers/Completers must handle 8-bit
+	 * Tag values correctly regardless of the setting of their Extended Tag
+	 * Field Enable bit (see Section 7.5.3.4)", so it is safe to enable
+	 * Extented Tags.
+	 */
 	if (!(ctl & PCI_EXP_DEVCTL_EXT_TAG)) {
 		pci_info(dev, "enabling Extended Tags\n");
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,