Message ID | m3h9fghbfh.fsf@t19.piap.pl (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Mon, Mar 21, 2016 at 10:39:52AM +0100, Krzysztof Halasa wrote: > I think this bug needs to be fixed, this way or another. > > The platform in question is Cavium CNS3xxx, ARMv6. > > A recent patch by Arnd Bergmann (498a92d42596 "ARM: cns3xxx: pci: avoid > potential stack overflow") converted an explicit setting of > PCI_EXP_DEVCTL_READRQ = 0 (i.e., max 128 bytes for bus-mastering PCIe DMA > read request) to: > + pcie_bus_config = PCIE_BUS_PEER2PEER; Any thoughts on this, Arnd? I can (and will) look at this, but it's a complicated area and it will take me quite a while to dig into it. > with the following commentary: > "The second part is how the driver sets up the Max_Read_Request_Size > value for the first device/function on bus 1, i.e. the device > plugged directly into the PCIe root port. > For all I can tell, this is in fact incomplete, as it does not > perform the same setting on devices attached to a PCIe switch, > or multi-function devices. > The solution for this part fortunately is even easier: if we > just set the global pcie_bus_config variable to PCIE_BUS_PEER2PEER, > all PCIe devices in the system are limited to 128 byte MPS, which > in turn limits the MRRS to 128 bytes for all devices, and we > no longer even need to touch any devices." > > The problem is the MRRS setting is never written to the hardware. > I propose the following, though I'm not sure if we can do this safely, > especially given the comments in probe.c. OTOH, this change may be > required in other (all?) cases when the user requests > PCIE_BUS_PEER2PEER. > > On this Laguna GW-2388 the following patch fixes BM DMA with: > 0000:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01) > 0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge > 0000:02:0e.0 (PCI devices behind the bridge, these are doing actual BM xfers) > 0001:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01 - this is > the second lane from the CPU) > > pci 0000:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 > pci 0000:01:00.0: Max Payload Size set to 128/ 512 (was 128), Max Read Rq 128 > pci 0001:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 > > Signed-off-by: Krzysztof Ha?asa <khalasa@piap.pl> > Fixes: 498a92d42596 ("ARM: cns3xxx: pci: avoid potential stack overflow") > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 6d7ab9b..91713b6 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1919,7 +1919,8 @@ static void pcie_write_mrrs(struct pci_dev *dev) > /* In the "safe" case, do not configure the MRRS. There appear to be > * issues with setting MRRS to 0 on a number of devices. > */ > - if (pcie_bus_config != PCIE_BUS_PERFORMANCE) > + if (pcie_bus_config != PCIE_BUS_PERFORMANCE && > + pcie_bus_config != PCIE_BUS_PEER2PEER) > return; > > /* For Max performance, the MRRS must be set to the largest supported > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 2771625..6f5088a 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -756,7 +756,7 @@ enum pcie_bus_config_types { > PCIE_BUS_DEFAULT, /* ensure MPS matches upstream bridge */ > PCIE_BUS_SAFE, /* use largest MPS boot-time devices support */ > PCIE_BUS_PERFORMANCE, /* use MPS and MRRS for best performance */ > - PCIE_BUS_PEER2PEER, /* set MPS = 128 for all devices */ > + PCIE_BUS_PEER2PEER, /* set MPS and MRSS to 128 for all devices */ > }; > > extern enum pcie_bus_config_types pcie_bus_config; > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 21, 2016 at 10:42:34AM -0500, Bjorn Helgaas wrote: > On Mon, Mar 21, 2016 at 10:39:52AM +0100, Krzysztof Halasa wrote: > > I think this bug needs to be fixed, this way or another. > > > > The platform in question is Cavium CNS3xxx, ARMv6. > > > > A recent patch by Arnd Bergmann (498a92d42596 "ARM: cns3xxx: pci: avoid > > potential stack overflow") converted an explicit setting of > > PCI_EXP_DEVCTL_READRQ = 0 (i.e., max 128 bytes for bus-mastering PCIe DMA > > read request) to: > > + pcie_bus_config = PCIE_BUS_PEER2PEER; > > Any thoughts on this, Arnd? I can (and will) look at this, but it's > a complicated area and it will take me quite a while to dig into it. What should we do here? Should we revert 498a92d42596? It appeared in v4.4, so this has been broken for a while. It looks like 498a92d42596 merely fixed a warning, at the expense of breaking DMA on Cavium. Reverting it would bring the warning back, but that's better than broken DMA. > > with the following commentary: > > "The second part is how the driver sets up the Max_Read_Request_Size > > value for the first device/function on bus 1, i.e. the device > > plugged directly into the PCIe root port. > > For all I can tell, this is in fact incomplete, as it does not > > perform the same setting on devices attached to a PCIe switch, > > or multi-function devices. > > The solution for this part fortunately is even easier: if we > > just set the global pcie_bus_config variable to PCIE_BUS_PEER2PEER, > > all PCIe devices in the system are limited to 128 byte MPS, which > > in turn limits the MRRS to 128 bytes for all devices, and we > > no longer even need to touch any devices." > > > > The problem is the MRRS setting is never written to the hardware. > > I propose the following, though I'm not sure if we can do this safely, > > especially given the comments in probe.c. OTOH, this change may be > > required in other (all?) cases when the user requests > > PCIE_BUS_PEER2PEER. > > > > On this Laguna GW-2388 the following patch fixes BM DMA with: > > 0000:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01) > > 0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge > > 0000:02:0e.0 (PCI devices behind the bridge, these are doing actual BM xfers) > > 0001:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01 - this is > > the second lane from the CPU) > > > > pci 0000:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 > > pci 0000:01:00.0: Max Payload Size set to 128/ 512 (was 128), Max Read Rq 128 > > pci 0001:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 > > > > Signed-off-by: Krzysztof Ha?asa <khalasa@piap.pl> > > Fixes: 498a92d42596 ("ARM: cns3xxx: pci: avoid potential stack overflow") > > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > > index 6d7ab9b..91713b6 100644 > > --- a/drivers/pci/probe.c > > +++ b/drivers/pci/probe.c > > @@ -1919,7 +1919,8 @@ static void pcie_write_mrrs(struct pci_dev *dev) > > /* In the "safe" case, do not configure the MRRS. There appear to be > > * issues with setting MRRS to 0 on a number of devices. > > */ > > - if (pcie_bus_config != PCIE_BUS_PERFORMANCE) > > + if (pcie_bus_config != PCIE_BUS_PERFORMANCE && > > + pcie_bus_config != PCIE_BUS_PEER2PEER) > > return; > > > > /* For Max performance, the MRRS must be set to the largest supported > > diff --git a/include/linux/pci.h b/include/linux/pci.h > > index 2771625..6f5088a 100644 > > --- a/include/linux/pci.h > > +++ b/include/linux/pci.h > > @@ -756,7 +756,7 @@ enum pcie_bus_config_types { > > PCIE_BUS_DEFAULT, /* ensure MPS matches upstream bridge */ > > PCIE_BUS_SAFE, /* use largest MPS boot-time devices support */ > > PCIE_BUS_PERFORMANCE, /* use MPS and MRRS for best performance */ > > - PCIE_BUS_PEER2PEER, /* set MPS = 128 for all devices */ > > + PCIE_BUS_PEER2PEER, /* set MPS and MRSS to 128 for all devices */ > > }; > > > > extern enum pcie_bus_config_types pcie_bus_config; > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas <helgaas@kernel.org> writes: > It looks like 498a92d42596 merely fixed a warning, at the expense of > breaking DMA on Cavium. Reverting it would bring the warning back, but > that's better than broken DMA. Perhaps we should change PCIE_BUS_PEER2PEER to also write MRRS anyway. I realize the CNS3xxx patch is some sort of clever workaround, and that PCIE_BUS_PEER2PEER (which normally comes from kernel command line parameter "pcie_bus_peer2peer") was not exactly intended for this. But if one asks for "peer2peer" (which means limiting transfers to 128 bytes), how could it all work if the bus mastering read requests are not equally limited? BTW s/MRSS/MRRS/g
On Wed, May 04, 2016 at 03:09:27PM +0200, Krzysztof Ha?asa wrote: > Bjorn Helgaas <helgaas@kernel.org> writes: > > > It looks like 498a92d42596 merely fixed a warning, at the expense of > > breaking DMA on Cavium. Reverting it would bring the warning back, but > > that's better than broken DMA. > > Perhaps we should change PCIE_BUS_PEER2PEER to also write MRRS anyway. > > I realize the CNS3xxx patch is some sort of clever workaround, and that > PCIE_BUS_PEER2PEER (which normally comes from kernel command line > parameter "pcie_bus_peer2peer") was not exactly intended for this. But > if one asks for "peer2peer" (which means limiting transfers to 128 > bytes), how could it all work if the bus mastering read requests are > not equally limited? MPS=128 means a function will never generate a TLP exceeding 128 bytes. MRRS=128 means a function will never generate a Read Request with a size exceeding 128 bytes. The comment in pcie_write_mrrs() claims: ... the MRRS ... cannot be configured larger than the MPS the device or the bus can support. I think this comment is wrong, at least from a hardware point of view. Setting all devices to MRRS=512 and MPS=128 is a legal configuration that means functions may receive Read Requests for up to 512 bytes, and they will have to respond with 4 128-byte TLPs. The spec (PCIe r3.0, 7.8.4 implementation note) says: The Max_Read_Request_Size mechanism allows improved control of bandwidth allocation in systems where quality of service (QoS) is important for the target applications. For example, an arbitration scheme based on counting Requests (and not the sizes of those Requests) provides imprecise bandwidth allocation when some Requesters use much larger sizes than others. The Max_Read_Request_Size mechanism can be used to force more uniform allocation of bandwidth, by restricting the upper size of Read Requests. The Linux usage of tying it to MRRS to MPS is part of a plan to use MRRS to restrict the TLP size in one direction and MPS to restrict TLP size in the other. See b03e7495a862 ("PCI: Set PCI-E Max Payload Size on fabric"). But this is not in line with the spec intention, and I'm not 100% convinced that it works reliably. So I'm a little wary of tweaking our MPS/MRRS configuration without a more extensive analysis and (hopefully) some simplification of the code. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 02, 2016 at 11:53:22AM -0500, Bjorn Helgaas wrote: > On Thu, Apr 21, 2016 at 10:42:34AM -0500, Bjorn Helgaas wrote: > > On Mon, Mar 21, 2016 at 10:39:52AM +0100, Krzysztof Halasa wrote: > > > I think this bug needs to be fixed, this way or another. > > > > > > The platform in question is Cavium CNS3xxx, ARMv6. > > > > > > A recent patch by Arnd Bergmann (498a92d42596 "ARM: cns3xxx: pci: avoid > > > potential stack overflow") converted an explicit setting of > > > PCI_EXP_DEVCTL_READRQ = 0 (i.e., max 128 bytes for bus-mastering PCIe DMA > > > read request) to: > > > + pcie_bus_config = PCIE_BUS_PEER2PEER; > > > > Any thoughts on this, Arnd? I can (and will) look at this, but it's > > a complicated area and it will take me quite a while to dig into it. > > What should we do here? Should we revert 498a92d42596? It appeared in > v4.4, so this has been broken for a while. > > It looks like 498a92d42596 merely fixed a warning, at the expense of > breaking DMA on Cavium. Reverting it would bring the warning back, but > that's better than broken DMA. I propose that we revert 498a92d42596 ("ARM: cns3xxx: pci: avoid potential stack overflow"). I haven't had a chance to look into the problem with it yet. > > > with the following commentary: > > > "The second part is how the driver sets up the Max_Read_Request_Size > > > value for the first device/function on bus 1, i.e. the device > > > plugged directly into the PCIe root port. > > > For all I can tell, this is in fact incomplete, as it does not > > > perform the same setting on devices attached to a PCIe switch, > > > or multi-function devices. > > > The solution for this part fortunately is even easier: if we > > > just set the global pcie_bus_config variable to PCIE_BUS_PEER2PEER, > > > all PCIe devices in the system are limited to 128 byte MPS, which > > > in turn limits the MRRS to 128 bytes for all devices, and we > > > no longer even need to touch any devices." > > > > > > The problem is the MRRS setting is never written to the hardware. > > > I propose the following, though I'm not sure if we can do this safely, > > > especially given the comments in probe.c. OTOH, this change may be > > > required in other (all?) cases when the user requests > > > PCIE_BUS_PEER2PEER. > > > > > > On this Laguna GW-2388 the following patch fixes BM DMA with: > > > 0000:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01) > > > 0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge > > > 0000:02:0e.0 (PCI devices behind the bridge, these are doing actual BM xfers) > > > 0001:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01 - this is > > > the second lane from the CPU) > > > > > > pci 0000:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 > > > pci 0000:01:00.0: Max Payload Size set to 128/ 512 (was 128), Max Read Rq 128 > > > pci 0001:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 > > > > > > Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> > > > Fixes: 498a92d42596 ("ARM: cns3xxx: pci: avoid potential stack overflow") > > > > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > > > index 6d7ab9b..91713b6 100644 > > > --- a/drivers/pci/probe.c > > > +++ b/drivers/pci/probe.c > > > @@ -1919,7 +1919,8 @@ static void pcie_write_mrrs(struct pci_dev *dev) > > > /* In the "safe" case, do not configure the MRRS. There appear to be > > > * issues with setting MRRS to 0 on a number of devices. > > > */ > > > - if (pcie_bus_config != PCIE_BUS_PERFORMANCE) > > > + if (pcie_bus_config != PCIE_BUS_PERFORMANCE && > > > + pcie_bus_config != PCIE_BUS_PEER2PEER) > > > return; > > > > > > /* For Max performance, the MRRS must be set to the largest supported > > > diff --git a/include/linux/pci.h b/include/linux/pci.h > > > index 2771625..6f5088a 100644 > > > --- a/include/linux/pci.h > > > +++ b/include/linux/pci.h > > > @@ -756,7 +756,7 @@ enum pcie_bus_config_types { > > > PCIE_BUS_DEFAULT, /* ensure MPS matches upstream bridge */ > > > PCIE_BUS_SAFE, /* use largest MPS boot-time devices support */ > > > PCIE_BUS_PERFORMANCE, /* use MPS and MRRS for best performance */ > > > - PCIE_BUS_PEER2PEER, /* set MPS = 128 for all devices */ > > > + PCIE_BUS_PEER2PEER, /* set MPS and MRSS to 128 for all devices */ > > > }; > > > > > > extern enum pcie_bus_config_types pcie_bus_config; > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6d7ab9b..91713b6 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1919,7 +1919,8 @@ static void pcie_write_mrrs(struct pci_dev *dev) /* In the "safe" case, do not configure the MRRS. There appear to be * issues with setting MRRS to 0 on a number of devices. */ - if (pcie_bus_config != PCIE_BUS_PERFORMANCE) + if (pcie_bus_config != PCIE_BUS_PERFORMANCE && + pcie_bus_config != PCIE_BUS_PEER2PEER) return; /* For Max performance, the MRRS must be set to the largest supported diff --git a/include/linux/pci.h b/include/linux/pci.h index 2771625..6f5088a 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -756,7 +756,7 @@ enum pcie_bus_config_types { PCIE_BUS_DEFAULT, /* ensure MPS matches upstream bridge */ PCIE_BUS_SAFE, /* use largest MPS boot-time devices support */ PCIE_BUS_PERFORMANCE, /* use MPS and MRRS for best performance */ - PCIE_BUS_PEER2PEER, /* set MPS = 128 for all devices */ + PCIE_BUS_PEER2PEER, /* set MPS and MRSS to 128 for all devices */ }; extern enum pcie_bus_config_types pcie_bus_config;
I think this bug needs to be fixed, this way or another. The platform in question is Cavium CNS3xxx, ARMv6. A recent patch by Arnd Bergmann (498a92d42596 "ARM: cns3xxx: pci: avoid potential stack overflow") converted an explicit setting of PCI_EXP_DEVCTL_READRQ = 0 (i.e., max 128 bytes for bus-mastering PCIe DMA read request) to: + pcie_bus_config = PCIE_BUS_PEER2PEER; with the following commentary: "The second part is how the driver sets up the Max_Read_Request_Size value for the first device/function on bus 1, i.e. the device plugged directly into the PCIe root port. For all I can tell, this is in fact incomplete, as it does not perform the same setting on devices attached to a PCIe switch, or multi-function devices. The solution for this part fortunately is even easier: if we just set the global pcie_bus_config variable to PCIE_BUS_PEER2PEER, all PCIe devices in the system are limited to 128 byte MPS, which in turn limits the MRRS to 128 bytes for all devices, and we no longer even need to touch any devices." The problem is the MRRS setting is never written to the hardware. I propose the following, though I'm not sure if we can do this safely, especially given the comments in probe.c. OTOH, this change may be required in other (all?) cases when the user requests PCIE_BUS_PEER2PEER. On this Laguna GW-2388 the following patch fixes BM DMA with: 0000:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01) 0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge 0000:02:0e.0 (PCI devices behind the bridge, these are doing actual BM xfers) 0001:00:00.0 PCI bridge: Cavium Networks Device 3400 (rev 01 - this is the second lane from the CPU) pci 0000:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 pci 0000:01:00.0: Max Payload Size set to 128/ 512 (was 128), Max Read Rq 128 pci 0001:00:00.0: Max Payload Size set to 128/ 128 (was 128), Max Read Rq 128 Signed-off-by: Krzysztof Ha?asa <khalasa@piap.pl> Fixes: 498a92d42596 ("ARM: cns3xxx: pci: avoid potential stack overflow") -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html