diff mbox

[v3] PCI: Workaround wrong flags completions for IDT switch

Message ID 20170609231617.20243-1-yinghai@kernel.org (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Yinghai Lu June 9, 2017, 11:16 p.m. UTC
From: James Puthukattukaran <james.puthukattukaran@oracle.com>

The IDT switch incorrectly flags an ACS source violation on a read config
request to an end point device on the completion (IDT 89H32H8G3-YC,
errata #36) even though the PCI Express spec states that completions are
never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).

The suggested workaround by IDT is to issue a configuration write to the
downstream device before issuing the first config read. This allows the
downstream device to capture its bus number, thus avoiding the ACS
violation on the completion.

The patch does the following -

1. Disable ACS source violation if enabled
2. Wait for config space access to become available by reading vendor id
3. Do a config write to the end point (errata workaround)
4. Enable ACS source validation (if it was enabled to begin with)

-v2: move workaround to pci_bus_read_dev_vendor_id() from pci_bus_check_dev()
     and move enable_acs_sv to drivers/pci/pci.c -- by Yinghai
-v3: add bus->self check for root bus and virtual bus for sriov vfs.

Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

--

 drivers/pci/pci.c   |   33 +++++++++++++++++++++++++++++++++
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   38 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 70 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas June 12, 2017, 9:48 p.m. UTC | #1
On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
> 
> The IDT switch incorrectly flags an ACS source violation on a read config
> request to an end point device on the completion (IDT 89H32H8G3-YC,
> errata #36) even though the PCI Express spec states that completions are
> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).

Can you include a URL where this erratum is published?  If not, can
you include the actual erratum text here?

Have you considered ways to make this patch apply only to the affected
IDT switches?  Currently it applies to *all* devices.

The purpose of the pci_bus_read_dev_vendor_id() path is to support the
Configuration Request Retry Status feature (see PCIe r3.1, sec 2.3.2),
which works by special handling of config reads of the Vendor ID after
a reset.  Normally, that Vendor ID read would be the first access to
a device when we enumerate it.

This patch adds config reads and writes of the ACS capability *before*
the Vendor ID read.  At that point we don't even know whether the
device exists.  If it doesn't exist, pci_find_ext_capability() would
read 0xffffffff data, and it probably fails reasonably gracefully.

But if the device *does* exist, I think this patch breaks the CRS
Software Visibility feature.  Without this patch, we try to read
Vendor ID, and the device may return a CRS Completion Status.  If CRS
visibility is enabled, the root complex may complete the request by
returning 0x0001 for the Vendor ID, in which case we sleep and try
again later.

With this patch, we first try to read the ACS capability.  If the
device returns a CRS Completion Status, the root complex is required
to reissue the config request.  This is the required behavior
regardless of whether CRS Software Visibility is enabled, so I think
this effectively disables that feature.

> The suggested workaround by IDT is to issue a configuration write to the
> downstream device before issuing the first config read. This allows the
> downstream device to capture its bus number, thus avoiding the ACS
> violation on the completion.
> 
> The patch does the following -
> 
> 1. Disable ACS source violation if enabled
> 2. Wait for config space access to become available by reading vendor id
> 3. Do a config write to the end point (errata workaround)
> 4. Enable ACS source validation (if it was enabled to begin with)
> 
> -v2: move workaround to pci_bus_read_dev_vendor_id() from pci_bus_check_dev()
>      and move enable_acs_sv to drivers/pci/pci.c -- by Yinghai
> -v3: add bus->self check for root bus and virtual bus for sriov vfs.
> 
> Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> --
> 
>  drivers/pci/pci.c   |   33 +++++++++++++++++++++++++++++++++
>  drivers/pci/pci.h   |    1 +
>  drivers/pci/probe.c |   38 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 70 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/drivers/pci/probe.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/probe.c
> +++ linux-2.6/drivers/pci/probe.c
> @@ -1763,8 +1763,8 @@ struct pci_dev *pci_alloc_dev(struct pci
>  }
>  EXPORT_SYMBOL(pci_alloc_dev);
>  
> -bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> -				int crs_timeout)
> +static bool __pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn,
> +					 u32 *l, int crs_timeout)
>  {
>  	int delay = 1;
>  
> @@ -1801,6 +1801,40 @@ bool pci_bus_read_dev_vendor_id(struct p
>  
>  	return true;
>  }
> +
> +bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> +				int crs_timeout)
> +{
> +	int found;
> +	int enable = -1;
> +
> +	/*
> +	 * Some IDT switches flag an ACS violation for config reads
> +	 * even though the PCI spec allows for it (PCIe 3.1, 6.1.12.1)
> +	 * It flags it because the bus number is not properly set in the
> +	 * completion. The workaround is to do a dummy write to properly
> +	 * latch number once the device is ready for config operations
> +	 */
> +
> +	if (bus->self)
> +		enable = pci_std_enable_acs_sv(bus->self, false);
> +
> +	found = __pci_bus_read_dev_vendor_id(bus, devfn, l, crs_timeout);
> +
> +	/*
> +	 * The fact that we can read the vendor id indicates that the device
> +	 * is ready for config operations. Do the write as part of the errata
> +	 * workaround.
> +	 */
> +	if (bus->self) {
> +		if (found)
> +			pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
> +		if (enable > 0)
> +			pci_std_enable_acs_sv(bus->self, enable);
> +	}
> +
> +	return found;
> +}
>  EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
>  
>  /*
> Index: linux-2.6/drivers/pci/pci.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pci.c
> +++ linux-2.6/drivers/pci/pci.c
> @@ -2838,6 +2838,39 @@ static bool pci_acs_flags_enabled(struct
>  }
>  
>  /**
> + *  pci_std_enable_acs_sv - enable/disable ACS source validation if supported by the switch
> + *  @dev - pcie switch/RP
> + *  @enable - enable (1) or disable (0) source validation
> + *
> + *  Returns : < 0 on failure
> + *           previous acs_sv state
> + */
> +int pci_std_enable_acs_sv(struct pci_dev *dev, bool enable)
> +{
> +	int pos;
> +	u16 cap;
> +	u16 ctrl;
> +	int retval;
> +
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
> +	if (!pos)
> +		return -ENODEV;
> +
> +	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
> +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
> +
> +	retval = !!(ctrl & cap & PCI_ACS_SV);
> +	if (enable)
> +		ctrl |= (cap & PCI_ACS_SV);
> +	else
> +		ctrl &= ~(cap & PCI_ACS_SV);
> +
> +	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> +
> +	return retval;
> +}
> +
> +/**
>   * pci_acs_enabled - test ACS against required flags for a given device
>   * @pdev: device to test
>   * @acs_flags: required PCI ACS flags
> Index: linux-2.6/drivers/pci/pci.h
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pci.h
> +++ linux-2.6/drivers/pci/pci.h
> @@ -343,6 +343,7 @@ static inline resource_size_t pci_resour
>  }
>  
>  void pci_enable_acs(struct pci_dev *dev);
> +int pci_std_enable_acs_sv(struct pci_dev *dev, bool enable);
>  
>  #ifdef CONFIG_PCIE_PTM
>  void pci_ptm_init(struct pci_dev *dev);
Yinghai Lu June 13, 2017, 5 p.m. UTC | #2
On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
>> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
>>
>> The IDT switch incorrectly flags an ACS source violation on a read config
>> request to an end point device on the completion (IDT 89H32H8G3-YC,
>> errata #36) even though the PCI Express spec states that completions are
>> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
>
> Can you include a URL where this erratum is published?  If not, can
> you include the actual erratum text here?

Ok.

>
> Have you considered ways to make this patch apply only to the affected
> IDT switches?  Currently it applies to *all* devices.

But we need to apply that workaround before we know vendorid/deviceid
under that root port or downstream port.

>
> The purpose of the pci_bus_read_dev_vendor_id() path is to support the
> Configuration Request Retry Status feature (see PCIe r3.1, sec 2.3.2),
> which works by special handling of config reads of the Vendor ID after
> a reset.  Normally, that Vendor ID read would be the first access to
> a device when we enumerate it.
>
> This patch adds config reads and writes of the ACS capability *before*
> the Vendor ID read.  At that point we don't even know whether the
> device exists.  If it doesn't exist, pci_find_ext_capability() would
> read 0xffffffff data, and it probably fails reasonably gracefully.
>
> But if the device *does* exist, I think this patch breaks the CRS
> Software Visibility feature.  Without this patch, we try to read
> Vendor ID, and the device may return a CRS Completion Status.  If CRS
> visibility is enabled, the root complex may complete the request by
> returning 0x0001 for the Vendor ID, in which case we sleep and try
> again later.
>
> With this patch, we first try to read the ACS capability.  If the
> device returns a CRS Completion Status, the root complex is required
> to reissue the config request.  This is the required behavior
> regardless of whether CRS Software Visibility is enabled, so I think
> this effectively disables that feature.

The workaround (acs reading etc) is applied to root port or downstream port.
and pci_bus_read_dev_vendor_id() is for reading vendorid of device
under that root port or downstream port.

Thanks

Yinghai
James Puthukattukaran June 13, 2017, 6:30 p.m. UTC | #3
On 6/13/2017 1:00 PM, Yinghai Lu wrote:
> On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
>>> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
>>>
>>> The IDT switch incorrectly flags an ACS source violation on a read config
>>> request to an end point device on the completion (IDT 89H32H8G3-YC,
>>> errata #36) even though the PCI Express spec states that completions are
>>> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
>> Can you include a URL where this erratum is published?  If not, can
>> you include the actual erratum text here?
> Ok.

Here's the errata text
------------------------------------
Item #36 - Downstream port applies ACS Source Validation to Completions
“Section 6.12.1.1" of the PCI Express Base Specification 3.1 states that 
completions are never affected
by ACS Source Validation. However, completions received by a downstream 
port of the PCIe switch from a device that has not yet captured a PCIe 
bus number are incorrectly dropped by ACS source validation by the 
switch downstream port.

Workaround: Issue a CfgWr1 to the downstream device before issuing the 
first CfgRd1 to the device.
This allows the downstream device to capture its bus number; ACS source 
validation no longer stops
completions from being forwarded by the downstream port. It has been 
observed that Microsoft Windows implements this workaround already; 
however, some versions of Linux and other operating systems may not.
--------------------------------------------

>> Have you considered ways to make this patch apply only to the affected
>> IDT switches?  Currently it applies to *all* devices.
> But we need to apply that workaround before we know vendorid/deviceid
> under that root port or downstream port.
>
>> The purpose of the pci_bus_read_dev_vendor_id() path is to support the
>> Configuration Request Retry Status feature (see PCIe r3.1, sec 2.3.2),
>> which works by special handling of config reads of the Vendor ID after
>> a reset.  Normally, that Vendor ID read would be the first access to
>> a device when we enumerate it.
>>
>> This patch adds config reads and writes of the ACS capability *before*
>> the Vendor ID read.  At that point we don't even know whether the
>> device exists.  If it doesn't exist, pci_find_ext_capability() would
>> read 0xffffffff data, and it probably fails reasonably gracefully.
>>
>> But if the device *does* exist, I think this patch breaks the CRS
>> Software Visibility feature.  Without this patch, we try to read
>> Vendor ID, and the device may return a CRS Completion Status.  If CRS
>> visibility is enabled, the root complex may complete the request by
>> returning 0x0001 for the Vendor ID, in which case we sleep and try
>> again later.
>>
>> With this patch, we first try to read the ACS capability.  If the
>> device returns a CRS Completion Status, the root complex is required
>> to reissue the config request.  This is the required behavior
>> regardless of whether CRS Software Visibility is enabled, so I think
>> this effectively disables that feature.
> The workaround (acs reading etc) is applied to root port or downstream port.
> and pci_bus_read_dev_vendor_id() is for reading vendorid of device
> under that root port or downstream port.
>
> Thanks
>
> Yinghai
Bjorn Helgaas June 13, 2017, 10:10 p.m. UTC | #4
On Tue, Jun 13, 2017 at 10:00:33AM -0700, Yinghai Lu wrote:
> On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
> >> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
> >>
> >> The IDT switch incorrectly flags an ACS source violation on a read config
> >> request to an end point device on the completion (IDT 89H32H8G3-YC,
> >> errata #36) even though the PCI Express spec states that completions are
> >> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
> ...

> > Have you considered ways to make this patch apply only to the affected
> > IDT switches?  Currently it applies to *all* devices.
> 
> But we need to apply that workaround before we know vendorid/deviceid
> under that root port or downstream port.

If I understand correctly, the problem (the ACS source violation)
occurs when we enumerate a device below an IDT switch.  We enumerate
a switch before we enumerate any devices under the switch, so I don't
understand why this can't be made IDT-specific.

> > The purpose of the pci_bus_read_dev_vendor_id() path is to support the
> > Configuration Request Retry Status feature (see PCIe r3.1, sec 2.3.2),
> > which works by special handling of config reads of the Vendor ID after
> > a reset.  Normally, that Vendor ID read would be the first access to
> > a device when we enumerate it.
> >
> > This patch adds config reads and writes of the ACS capability *before*
> > the Vendor ID read.  At that point we don't even know whether the
> > device exists.  If it doesn't exist, pci_find_ext_capability() would
> > read 0xffffffff data, and it probably fails reasonably gracefully.
> >
> > But if the device *does* exist, I think this patch breaks the CRS
> > Software Visibility feature.  Without this patch, we try to read
> > Vendor ID, and the device may return a CRS Completion Status.  If CRS
> > visibility is enabled, the root complex may complete the request by
> > returning 0x0001 for the Vendor ID, in which case we sleep and try
> > again later.
> >
> > With this patch, we first try to read the ACS capability.  If the
> > device returns a CRS Completion Status, the root complex is required
> > to reissue the config request.  This is the required behavior
> > regardless of whether CRS Software Visibility is enabled, so I think
> > this effectively disables that feature.
> 
> The workaround (acs reading etc) is applied to root port or downstream port.
> and pci_bus_read_dev_vendor_id() is for reading vendorid of device
> under that root port or downstream port.

OK, I see what you're saying: pci_std_enable_acs_sv() twiddles the ACS
capability of the upstream bridge when we're enumerating a device
*below* the bridge.  Since the bridge has already been enumerated,
we've already read its Vendor ID, so looking up its ACS capability
should not cause any CRS completions.

Still, I think all this fiddling around with the ACS capability is too
intrusive for non-IDT devices.  Caching the ACS capability location
would help a little bit, but it's still too much in my opinion.

If ACS is broken on the IDT switch, one obvious possibility is a quirk
to disable use of ACS on those switches.

The current patch appears to penalize everybody for the sins of IDT,
which I'd like to avoid.

Bjorn
Bjorn Helgaas June 13, 2017, 10:14 p.m. UTC | #5
On Tue, Jun 13, 2017 at 02:30:55PM -0400, james puthukattukaran wrote:
> On 6/13/2017 1:00 PM, Yinghai Lu wrote:
> >On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
> >>>From: James Puthukattukaran <james.puthukattukaran@oracle.com>
> >>>
> >>>The IDT switch incorrectly flags an ACS source violation on a read config
> >>>request to an end point device on the completion (IDT 89H32H8G3-YC,
> >>>errata #36) even though the PCI Express spec states that completions are
> >>>never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
> >>Can you include a URL where this erratum is published?  If not, can
> >>you include the actual erratum text here?
> >Ok.
> 
> Here's the errata text
> ------------------------------------
> Item #36 - Downstream port applies ACS Source Validation to Completions
> “Section 6.12.1.1" of the PCI Express Base Specification 3.1 states
> that completions are never affected
> by ACS Source Validation. However, completions received by a
> downstream port of the PCIe switch from a device that has not yet
> captured a PCIe bus number are incorrectly dropped by ACS source
> validation by the switch downstream port.
> 
> Workaround: Issue a CfgWr1 to the downstream device before issuing
> the first CfgRd1 to the device.
> This allows the downstream device to capture its bus number; ACS
> source validation no longer stops
> completions from being forwarded by the downstream port. It has been
> observed that Microsoft Windows implements this workaround already;
> however, some versions of Linux and other operating systems may not.

This doesn't mention anything about disabling ACS.  Issuing a config
write to devices downstream of an IDT bridge sounds simpler than what
this patch does.  Why don't you do that?

This patch does write to PCI_VENDOR_ID, which purports to be part of
the workaround, but that happens *after* the first config read (which
happens inside __pci_bus_read_dev_vendor_id()).

Bjorn
James Puthukattukaran June 13, 2017, 11:19 p.m. UTC | #6
> On Jun 13, 2017, at 6:14 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
>> On Tue, Jun 13, 2017 at 02:30:55PM -0400, james puthukattukaran wrote:
>>> On 6/13/2017 1:00 PM, Yinghai Lu wrote:
>>>> On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>> On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
>>>>> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
>>>>> 
>>>>> The IDT switch incorrectly flags an ACS source violation on a read config
>>>>> request to an end point device on the completion (IDT 89H32H8G3-YC,
>>>>> errata #36) even though the PCI Express spec states that completions are
>>>>> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
>>>> Can you include a URL where this erratum is published?  If not, can
>>>> you include the actual erratum text here?
>>> Ok.
>> 
>> Here's the errata text
>> ------------------------------------
>> Item #36 - Downstream port applies ACS Source Validation to Completions
>> “Section 6.12.1.1" of the PCI Express Base Specification 3.1 states
>> that completions are never affected
>> by ACS Source Validation. However, completions received by a
>> downstream port of the PCIe switch from a device that has not yet
>> captured a PCIe bus number are incorrectly dropped by ACS source
>> validation by the switch downstream port.
>> 
>> Workaround: Issue a CfgWr1 to the downstream device before issuing
>> the first CfgRd1 to the device.
>> This allows the downstream device to capture its bus number; ACS
>> source validation no longer stops
>> completions from being forwarded by the downstream port. It has been
>> observed that Microsoft Windows implements this workaround already;
>> however, some versions of Linux and other operating systems may not.
> 
> This doesn't mention anything about disabling ACS.  Issuing a config
> write to devices downstream of an IDT bridge sounds simpler than what
> this patch does.  Why don't you do that?

The issue is how will we know is the config write succeeds if the device is not ready? I thought it was simpler to disable acs for the sake of the read and when we know that the device is ready ( returns vendor id from read), it's ready for subsequent config write. 

> 
> This patch does write to PCI_VENDOR_ID, which purports to be part of
> the workaround, but that happens *after* the first config read (which
> happens inside __pci_bus_read_dev_vendor_id()).
> 
> Bjorn
Bjorn Helgaas June 14, 2017, 11:32 a.m. UTC | #7
On Tue, Jun 13, 2017 at 07:19:57PM -0400, James Puthukattukaran wrote:
> > On Jun 13, 2017, at 6:14 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> On Tue, Jun 13, 2017 at 02:30:55PM -0400, james puthukattukaran wrote:
> >>> On 6/13/2017 1:00 PM, Yinghai Lu wrote:
> >>>> On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>> On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
> >>>>> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
> >>>>> 
> >>>>> The IDT switch incorrectly flags an ACS source violation on a read config
> >>>>> request to an end point device on the completion (IDT 89H32H8G3-YC,
> >>>>> errata #36) even though the PCI Express spec states that completions are
> >>>>> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
> >>>> Can you include a URL where this erratum is published?  If not, can
> >>>> you include the actual erratum text here?
> >> 
> >> Here's the errata text
> >> ------------------------------------
> >> Item #36 - Downstream port applies ACS Source Validation to
> >> Completions “Section 6.12.1.1" of the PCI Express Base
> >> Specification 3.1 states that completions are never affected by
> >> ACS Source Validation. However, completions received by a
> >> downstream port of the PCIe switch from a device that has not yet
> >> captured a PCIe bus number are incorrectly dropped by ACS source
> >> validation by the switch downstream port.
> >> 
> >> Workaround: Issue a CfgWr1 to the downstream device before
> >> issuing the first CfgRd1 to the device.  This allows the
> >> downstream device to capture its bus number; ACS source
> >> validation no longer stops completions from being forwarded by
> >> the downstream port. It has been observed that Microsoft Windows
> >> implements this workaround already; however, some versions of
> >> Linux and other operating systems may not.
> > 
> > This doesn't mention anything about disabling ACS.  Issuing a
> > config write to devices downstream of an IDT bridge sounds simpler
> > than what this patch does.  Why don't you do that?
> 
> The issue is how will we know is the config write succeeds if the
> device is not ready? I thought it was simpler to disable acs for the
> sake of the read and when we know that the device is ready ( returns
> vendor id from read), it's ready for subsequent config write. 

If that's a problem, it sounds like the errata text is wrong or at
least incomplete.  If disabling ACS SV is required, the errata text
should mention it.

But I don't think it is a problem.  Per PCIe r3.1, sec 2.3.2, if a
Root Complex receives a CRS completion for a Configuration Write, it
must re-issue the request.

Or did you actually try that and find that it didn't work?

Bjorn
James Puthukattukaran June 14, 2017, 6:15 p.m. UTC | #8
On 06/14/2017 07:32 AM, Bjorn Helgaas wrote:
> On Tue, Jun 13, 2017 at 07:19:57PM -0400, James Puthukattukaran wrote:
>>> On Jun 13, 2017, at 6:14 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>> On Tue, Jun 13, 2017 at 02:30:55PM -0400, james puthukattukaran wrote:
>>>>> On 6/13/2017 1:00 PM, Yinghai Lu wrote:
>>>>>> On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>> On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
>>>>>>> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
>>>>>>>
>>>>>>> The IDT switch incorrectly flags an ACS source violation on a read config
>>>>>>> request to an end point device on the completion (IDT 89H32H8G3-YC,
>>>>>>> errata #36) even though the PCI Express spec states that completions are
>>>>>>> never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
>>>>>> Can you include a URL where this erratum is published?  If not, can
>>>>>> you include the actual erratum text here?
>>>> Here's the errata text
>>>> ------------------------------------
>>>> Item #36 - Downstream port applies ACS Source Validation to
>>>> Completions “Section 6.12.1.1" of the PCI Express Base
>>>> Specification 3.1 states that completions are never affected by
>>>> ACS Source Validation. However, completions received by a
>>>> downstream port of the PCIe switch from a device that has not yet
>>>> captured a PCIe bus number are incorrectly dropped by ACS source
>>>> validation by the switch downstream port.
>>>>
>>>> Workaround: Issue a CfgWr1 to the downstream device before
>>>> issuing the first CfgRd1 to the device.  This allows the
>>>> downstream device to capture its bus number; ACS source
>>>> validation no longer stops completions from being forwarded by
>>>> the downstream port. It has been observed that Microsoft Windows
>>>> implements this workaround already; however, some versions of
>>>> Linux and other operating systems may not.
>>> This doesn't mention anything about disabling ACS.  Issuing a
>>> config write to devices downstream of an IDT bridge sounds simpler
>>> than what this patch does.  Why don't you do that?
>> The issue is how will we know is the config write succeeds if the
>> device is not ready? I thought it was simpler to disable acs for the
>> sake of the read and when we know that the device is ready ( returns
>> vendor id from read), it's ready for subsequent config write.
> If that's a problem, it sounds like the errata text is wrong or at
> least incomplete.  If disabling ACS SV is required, the errata text
> should mention it.
>
> But I don't think it is a problem.  Per PCIe r3.1, sec 2.3.2, if a
> Root Complex receives a CRS completion for a Configuration Write, it
> must re-issue the request.
>
> Or did you actually try that and find that it didn't work?
I tried the write and it did not work. This could be because the root 
port gave up after getting a few CRS responses?
I know that the IDT switch takes some time to be ready even to respond 
with CRS response (firmware initialization, etc). The spec does not
say how often the root complex would retry the request and  how long. 
So, I put in a 1 second delay before the write and the write seemed to 
latch/work.
The real issue here is there's no programmatic way (that I know of) to 
check if the write failed.

I thought instead of putting delays (which might vary for different 
devices and might not work in certain cases), the sure way of knowing 
the device exists and is responding to config requests.

--James
Bjorn Helgaas June 15, 2017, 7:53 p.m. UTC | #9
On Wed, Jun 14, 2017 at 02:15:24PM -0400, James Puthukattukaran wrote:
> On 06/14/2017 07:32 AM, Bjorn Helgaas wrote:
> >On Tue, Jun 13, 2017 at 07:19:57PM -0400, James Puthukattukaran wrote:
> >>>On Jun 13, 2017, at 6:14 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>On Tue, Jun 13, 2017 at 02:30:55PM -0400, james puthukattukaran wrote:
> >>>>>On 6/13/2017 1:00 PM, Yinghai Lu wrote:
> >>>>>>On Mon, Jun 12, 2017 at 2:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>>>>On Fri, Jun 09, 2017 at 04:16:17PM -0700, Yinghai Lu wrote:
> >>>>>>>From: James Puthukattukaran <james.puthukattukaran@oracle.com>
> >>>>>>>
> >>>>>>>The IDT switch incorrectly flags an ACS source violation on a read config
> >>>>>>>request to an end point device on the completion (IDT 89H32H8G3-YC,
> >>>>>>>errata #36) even though the PCI Express spec states that completions are
> >>>>>>>never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
> >>>>>>Can you include a URL where this erratum is published?  If not, can
> >>>>>>you include the actual erratum text here?
> >>>>Here's the errata text
> >>>>------------------------------------
> >>>>Item #36 - Downstream port applies ACS Source Validation to
> >>>>Completions “Section 6.12.1.1" of the PCI Express Base
> >>>>Specification 3.1 states that completions are never affected by
> >>>>ACS Source Validation. However, completions received by a
> >>>>downstream port of the PCIe switch from a device that has not yet
> >>>>captured a PCIe bus number are incorrectly dropped by ACS source
> >>>>validation by the switch downstream port.
> >>>>
> >>>>Workaround: Issue a CfgWr1 to the downstream device before
> >>>>issuing the first CfgRd1 to the device.  This allows the
> >>>>downstream device to capture its bus number; ACS source
> >>>>validation no longer stops completions from being forwarded by
> >>>>the downstream port. It has been observed that Microsoft Windows
> >>>>implements this workaround already; however, some versions of
> >>>>Linux and other operating systems may not.
> >>>This doesn't mention anything about disabling ACS.  Issuing a
> >>>config write to devices downstream of an IDT bridge sounds simpler
> >>>than what this patch does.  Why don't you do that?
> >>The issue is how will we know is the config write succeeds if the
> >>device is not ready? I thought it was simpler to disable acs for the
> >>sake of the read and when we know that the device is ready ( returns
> >>vendor id from read), it's ready for subsequent config write.
> >If that's a problem, it sounds like the errata text is wrong or at
> >least incomplete.  If disabling ACS SV is required, the errata text
> >should mention it.
> >
> >But I don't think it is a problem.  Per PCIe r3.1, sec 2.3.2, if a
> >Root Complex receives a CRS completion for a Configuration Write, it
> >must re-issue the request.
> >
> >Or did you actually try that and find that it didn't work?
> I tried the write and it did not work. This could be because the
> root port gave up after getting a few CRS responses?
> I know that the IDT switch takes some time to be ready even to
> respond with CRS response (firmware initialization, etc). 

The ACS problem is not with completions for config reads to the switch
itself, so it doesn't matter how long it takes the switch to be ready.
The problem is how the switch handles completions for config reads to
devices *downstream* from the switch.

The workaround recommended by the erratum is to perform a config write
to the device *below* the switch before doing a config read.  E.g., in
pci_scan_device(), you would do something like this:

  struct pci_dev *pci_scan_device(...)
  {
    if (bus->self && bus->self->idt_acs_bug)
      pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);

    if (!pci_bus_read_dev_vendor_id(bus, devfn, &l, 60*1000))
      return NULL;

    ...

> The spec does not say how often the root complex would retry the
> request and  how long. So, I put in a 1 second delay before the
> write and the write seemed to latch/work.

That's interesting and suggests that you did actually do the config
write to the downstream device, not to the switch.  But you don't say
exactly how the write "did not work".  If the RC stopped retrying the
config write, I would expect it to at least report the error, e.g., by
logging a Completion Timeout error.  Then a subsequent read would
probably cause the ACS SV error because the device didn't accept the
write.

PCIe r3.1, sec 6.6.1, requires some delays between a Conventional
Reset and the first config requests.  I don't recall a place where we
explicitly do that delay, so that could conceivably account for the
device not accepting the write.  But I'd be surprised if we could boot
Linux and get to PCI enumeration that fast (on the order of 0.1 - 1
second).

In any case, I'm looking for a patch that checks for the broken IDT
devices, so we don't have to do all the workaround stuff for
everybody.

Bjorn
diff mbox

Patch

Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1763,8 +1763,8 @@  struct pci_dev *pci_alloc_dev(struct pci
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+static bool __pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn,
+					 u32 *l, int crs_timeout)
 {
 	int delay = 1;
 
@@ -1801,6 +1801,40 @@  bool pci_bus_read_dev_vendor_id(struct p
 
 	return true;
 }
+
+bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+				int crs_timeout)
+{
+	int found;
+	int enable = -1;
+
+	/*
+	 * Some IDT switches flag an ACS violation for config reads
+	 * even though the PCI spec allows for it (PCIe 3.1, 6.1.12.1)
+	 * It flags it because the bus number is not properly set in the
+	 * completion. The workaround is to do a dummy write to properly
+	 * latch number once the device is ready for config operations
+	 */
+
+	if (bus->self)
+		enable = pci_std_enable_acs_sv(bus->self, false);
+
+	found = __pci_bus_read_dev_vendor_id(bus, devfn, l, crs_timeout);
+
+	/*
+	 * The fact that we can read the vendor id indicates that the device
+	 * is ready for config operations. Do the write as part of the errata
+	 * workaround.
+	 */
+	if (bus->self) {
+		if (found)
+			pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
+		if (enable > 0)
+			pci_std_enable_acs_sv(bus->self, enable);
+	}
+
+	return found;
+}
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 
 /*
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -2838,6 +2838,39 @@  static bool pci_acs_flags_enabled(struct
 }
 
 /**
+ *  pci_std_enable_acs_sv - enable/disable ACS source validation if supported by the switch
+ *  @dev - pcie switch/RP
+ *  @enable - enable (1) or disable (0) source validation
+ *
+ *  Returns : < 0 on failure
+ *           previous acs_sv state
+ */
+int pci_std_enable_acs_sv(struct pci_dev *dev, bool enable)
+{
+	int pos;
+	u16 cap;
+	u16 ctrl;
+	int retval;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+	if (!pos)
+		return -ENODEV;
+
+	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
+
+	retval = !!(ctrl & cap & PCI_ACS_SV);
+	if (enable)
+		ctrl |= (cap & PCI_ACS_SV);
+	else
+		ctrl &= ~(cap & PCI_ACS_SV);
+
+	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
+
+	return retval;
+}
+
+/**
  * pci_acs_enabled - test ACS against required flags for a given device
  * @pdev: device to test
  * @acs_flags: required PCI ACS flags
Index: linux-2.6/drivers/pci/pci.h
===================================================================
--- linux-2.6.orig/drivers/pci/pci.h
+++ linux-2.6/drivers/pci/pci.h
@@ -343,6 +343,7 @@  static inline resource_size_t pci_resour
 }
 
 void pci_enable_acs(struct pci_dev *dev);
+int pci_std_enable_acs_sv(struct pci_dev *dev, bool enable);
 
 #ifdef CONFIG_PCIE_PTM
 void pci_ptm_init(struct pci_dev *dev);