Message ID | m37dgp20cr.fsf@t19.piap.pl (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCIe: limit Max Read Request Size on i.MX to 512 bytes | expand |
Hi Krzysztof,
[...]
> This patch makes the RTL8111 work on i.MX6.
Would it be possible to implement this particular MRRS fix as a quirk
only for the i.MX6 controller? Unless this is something that we need in
the core, a quirk would be preferred over something that changes the PCI
core.
An example of such quirk might be the one currently implemented for the
Loongson controller, as per:
https://elixir.bootlin.com/linux/v5.14-rc5/source/drivers/pci/controller/pci-loongson.c#L63
Krzysztof
Krzysztof, :-) > Would it be possible to implement this particular MRRS fix as a quirk > only for the i.MX6 controller? Unless this is something that we need in > the core, a quirk would be preferred over something that changes the PCI > core. I have briefly considered it, but I think it would be *much* more complicated and error-prone. It also appears that there are more platforms which need it - the old CNS3xxx, which currently subverts the PCIE_BUS_PEER2PEER, the loongson, keystone, maybe all DWC PCIe. Multiplication of the "quirk" code doesn't really look good to me. TBH I don't think of this as of a "quirk" - all systems have MRRS limits, it just happens that these ones have their limit lower than 4096 bytes. This isn't a limitation of a particular PCIe device, this is a common limit of the whole system. Also I'm not exactly sure the loongson fixup is complete. It's only done at pci-enable*() time (e.g. for devices which have bigger MRRS after power-up), while e.g. the r8169 driver changes MRRS well after pci-enable*(). This means it needs to stay in/below pcie_get_readrq(), and while it could mean going to ops->write*, it would be a real mess parsing the devices, PCIE capabilities etc. Now it's basically a few lines in a seldom called routine in pci.c, and the loongson case (and others) can be made simpler (and really fixed) as well.
On Fri, Aug 13, 2021 at 3:52 AM Krzysztof Hałasa <khalasa@piap.pl> wrote: > > DWC PCIe controller imposes limits on the Read Request Size that it can > handle. For i.MX6 family it's fixed at 512 bytes by default. > > If a memory read larger than the limit is requested, the CPU responds > with Completer Abort (CA) (on i.MX6 Unsupported Request (UR) is returned > instead due to a design error). > > The i.MX6 documentation states that the limit can be changed by writing > to the PCIE_PL_MRCCR0 register, however there is a fixed (and > undocumented) maximum (CX_REMOTE_RD_REQ_SIZE constant). Tests indicate > that values larger than 512 bytes don't work, though. > > This patch makes the RTL8111 work on i.MX6. > > Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> > > diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig > index 0c473d75e625..a11ec93a8cd0 100644 > --- a/drivers/pci/Kconfig > +++ b/drivers/pci/Kconfig > @@ -34,6 +34,9 @@ config PCI_DOMAINS_GENERIC > config PCI_SYSCALL > bool > > +config NEED_PCIE_MAX_MRRS We don't need a config option for this. It's not much code and it will effectively always be enabled with multi-platform kernels. Rob
Rob, Rob Herring <robh@kernel.org> writes: >> +++ b/drivers/pci/Kconfig >> @@ -34,6 +34,9 @@ config PCI_DOMAINS_GENERIC >> config PCI_SYSCALL >> bool >> >> +config NEED_PCIE_MAX_MRRS > > We don't need a config option for this. It's not much code and it will > effectively always be enabled with multi-platform kernels. But... non-ARM kernels? Then perhaps #if CONFIG_ARM?
On Fri, Aug 13, 2021 at 02:09:51PM +0200, Krzysztof Hałasa wrote: > Krzysztof, :-) > > > Would it be possible to implement this particular MRRS fix as a quirk > > only for the i.MX6 controller? Unless this is something that we need in > > the core, a quirk would be preferred over something that changes the PCI > > core. > > I have briefly considered it, but I think it would be *much* more > complicated and error-prone. It also appears that there are more > platforms which need it - the old CNS3xxx, which currently subverts the > PCIE_BUS_PEER2PEER, the loongson, keystone, maybe all DWC PCIe. > Multiplication of the "quirk" code doesn't really look good to me. > > TBH I don't think of this as of a "quirk" - all systems have MRRS > limits, it just happens that these ones have their limit lower than 4096 > bytes. This isn't a limitation of a particular PCIe device, this is a > common limit of the whole system. Do you have a reference for this? I don't see anything in the PCIe spec that suggests platforms must limit MRRS, and it seems that only these ARM-related controllers have this issue. If there *is* a platform connection here, we'll need some way to discover it, e.g., an ACPI _DSM method or similar. The only guidance in the spec about setting MRRS is that: - Software must set Max_Read_Request_Size of an isochronous-configured device with a value that does not exceed the Max_Payload_Size set for the device (PCIe r5.0, sec 6.3.4.1) - The Max_Read_Request_Size mechanism allows improved control of bandwidth allocation in systems where Quality of Service (QoS) is important for the target applications. For example, an arbitration scheme based on counting Requests (and not the sizes of those Requests) provides imprecise bandwidth allocation when some Requesters use much larger sizes than others. The Max_Read_Request_Size mechanism can be used to force more uniform allocation of bandwidth, by restricting the upper size of Read Requests (sec 7.5.3.4 implementation note)
Bjorn Helgaas <helgaas@kernel.org> writes: >> TBH I don't think of this as of a "quirk" - all systems have MRRS >> limits, it just happens that these ones have their limit lower than 4096 >> bytes. This isn't a limitation of a particular PCIe device, this is a >> common limit of the whole system. > > Do you have a reference for this? I don't see anything in the PCIe > spec that suggests platforms must limit MRRS, and it seems that only > these ARM-related controllers have this issue. I meant there is always a limit - isn't Max_Read_Request_Size a limit? Device Control Register (Offset 08h) Bit Location 14:12 Max_Read_Request_Size allows for max 4096 bytes, though two values are reserved, so there is room for some easy extension. - non-ARM (non-DWC?) systems are limited to 4096 bytes - DWC-based systems are limited to 128, 256, 512 bytes (are there 4096-byte ones?) That's how I understand it, please correct me if I'm wrong.
Hi Krzysztof: Regarding my understanding, that there should be decomposition operations if the Max_Read_Request_Size is larger than the Max_Payload_size specified by RC port. The bit0 of AMBA Multiple Outbound Decomposed NP Sub-Requests Control Register(Offset:0x700 + 0x24) had been set to be 1b1 in default. Note: The description of this bit. Enable AMBA Multiple Outbound Decomposed NP Sub- Requests. This bit when set to '0' disables the possibility of having multiple outstanding non-posted requests that were derived from decomposition of an outbound AMBA request. See Supported AXI Burst Operations for more details. You should not clear this register unless your application master is requesting an amount of read data greater than Max_Read_Request_Size, and the remote device (or switch) is reordering completions that have different tags > -----Original Message----- > From: khalasa@piap.pl <khalasa@piap.pl> > Sent: Monday, August 16, 2021 1:19 PM > To: Bjorn Helgaas <helgaas@kernel.org> > Cc: Krzysztof Wilczyński <kw@linux.com>; Bjorn Helgaas > <bhelgaas@google.com>; linux-pci@vger.kernel.org; Artem Lapkin > <email2tema@gmail.com>; Neil Armstrong <narmstrong@baylibre.com>; > Huacai Chen <chenhuacai@gmail.com>; Rob Herring <robh@kernel.org>; > Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>; Richard Zhu > <hongxing.zhu@nxp.com>; Lucas Stach <l.stach@pengutronix.de>; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH] PCIe: limit Max Read Request Size on i.MX to 512 bytes > > Bjorn Helgaas <helgaas@kernel.org> writes: > > >> TBH I don't think of this as of a "quirk" - all systems have MRRS > >> limits, it just happens that these ones have their limit lower than > >> 4096 bytes. This isn't a limitation of a particular PCIe device, this > >> is a common limit of the whole system. > > > > Do you have a reference for this? I don't see anything in the PCIe > > spec that suggests platforms must limit MRRS, and it seems that only > > these ARM-related controllers have this issue. > > I meant there is always a limit - isn't Max_Read_Request_Size a limit? > Device Control Register (Offset 08h) Bit Location 14:12 > Max_Read_Request_Size allows for max 4096 bytes, though two values are > reserved, so there is room for some easy extension. > > - non-ARM (non-DWC?) systems are limited to 4096 bytes [Richard] Regarding to the descriptions listed above, I think that there should no limitations of the Max_payload_size of RC port. > - DWC-based systems are limited to 128, 256, 512 bytes (are there > 4096-byte ones?) [Richard] The Max_payload_size can be configured from 128bytes to 4KB when integrate the DWC IP into the SOC. On i.MX6 PCIe, this parameter is fixed set to 512 bytes. BR Richard > > That's how I understand it, please correct me if I'm wrong. > -- > Krzysztof "Chris" Hałasa > > Sieć Badawcza Łukasiewicz > Przemysłowy Instytut Automatyki i Pomiarów PIAP Al. Jerozolimskie 202, > 02-486 Warszawa
Hi Richard, Please correct me if I got something wrong: > Regarding my understanding, that there should be decomposition operations if the > Max_Read_Request_Size is larger than the Max_Payload_size specified > by RC port. I think this means that, for example, a memory read request (a single short physical TLP packet on PCIe, from the peripheral device to the CPU) can request more data than Max_Payload_size (128 bytes on i.MX6). In such case, up to 4 "completion" physical TLP packets are returned by the CPU (up to 512 bytes, with each individual TLP up to 128 bytes as per Max_Payload_size). The device can't request more than 512 bytes, though - the CPU would not service such request. > The bit0 of AMBA Multiple Outbound Decomposed NP Sub-Requests Control Register(Offset:0x700 + 0x24) > had been set to be 1b1 in default. > > Note: The description of this bit. > Enable AMBA Multiple Outbound Decomposed NP Sub- Requests. > This bit when set to '0' disables the possibility of having multiple outstanding non-posted requests that > were derived from decomposition of an outbound AMBA request. See Supported AXI Burst Operations for > more details. I think the above means that - when I disable the bit in question - I can't issue memory read requests longer than 128 bytes (max payload size). This is not exactly clear to me: > You should not clear this register unless your application master is > requesting an amount of read data greater than Max_Read_Request_Size, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Is "completing" such a request at all possible? The device shouldn't request more data than its (not CPU's) Max_Read_Request_Size. I. e. if 512 is written to RTL8111's Max_Read_Request_Size, then the Realtek chip will request max 512 bytes at a time. > and the remote device (or switch) is reordering completions that have > different tags Fortunately, such completions don't seem to be reordered. However, I'm not sure how setting a bit in the CPU registers could help here. I think the only way *IF* the completions were reordered would be setting MRRS = MPS (= 128 bytes on i.MX6) - so there is nothing that could be reordered.
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig index 0c473d75e625..a11ec93a8cd0 100644 --- a/drivers/pci/Kconfig +++ b/drivers/pci/Kconfig @@ -34,6 +34,9 @@ config PCI_DOMAINS_GENERIC config PCI_SYSCALL bool +config NEED_PCIE_MAX_MRRS + bool + source "drivers/pci/pcie/Kconfig" config PCI_MSI diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig index 423d35872ce4..b59a4c91cb3b 100644 --- a/drivers/pci/controller/dwc/Kconfig +++ b/drivers/pci/controller/dwc/Kconfig @@ -98,6 +98,7 @@ config PCI_IMX6 depends on ARCH_MXC || COMPILE_TEST depends on PCI_MSI_IRQ_DOMAIN select PCIE_DW_HOST + select NEED_PCIE_MAX_MRRS config PCIE_SPEAR13XX bool "STMicroelectronics SPEAr PCIe controller" diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c index 80fc98acf097..7a83f677a632 100644 --- a/drivers/pci/controller/dwc/pci-imx6.c +++ b/drivers/pci/controller/dwc/pci-imx6.c @@ -1148,6 +1148,7 @@ static int imx6_pcie_probe(struct platform_device *pdev) imx6_pcie->vph = NULL; } + max_pcie_mrrs = 512; platform_set_drvdata(pdev, imx6_pcie); ret = imx6_pcie_attach_pd(dev); diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index aacf575c15cf..8ed8d2e75f4c 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -112,6 +112,10 @@ enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PEER2PEER; enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT; #endif +#ifdef CONFIG_NEED_PCIE_MAX_MRRS +u16 max_pcie_mrrs = 4096; // no limit +#endif + /* * The default CLS is used if arch didn't set CLS explicitly and not * all pci devices agree on the same value. Arch can override either @@ -5816,6 +5820,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq) rq = mps; } +#ifdef CONFIG_NEED_PCIE_MAX_MRRS + if (rq > max_pcie_mrrs) + rq = max_pcie_mrrs; +#endif + v = (ffs(rq) - 8) << 12; ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL, diff --git a/include/linux/pci.h b/include/linux/pci.h index 540b377ca8f6..7368be024c31 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -994,6 +994,7 @@ enum pcie_bus_config_types { }; extern enum pcie_bus_config_types pcie_bus_config; +extern u16 max_pcie_mrrs; extern struct bus_type pci_bus_type;
DWC PCIe controller imposes limits on the Read Request Size that it can handle. For i.MX6 family it's fixed at 512 bytes by default. If a memory read larger than the limit is requested, the CPU responds with Completer Abort (CA) (on i.MX6 Unsupported Request (UR) is returned instead due to a design error). The i.MX6 documentation states that the limit can be changed by writing to the PCIE_PL_MRCCR0 register, however there is a fixed (and undocumented) maximum (CX_REMOTE_RD_REQ_SIZE constant). Tests indicate that values larger than 512 bytes don't work, though. This patch makes the RTL8111 work on i.MX6. Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>