Message ID | 20240305081105.11912-1-johan+linaro@kernel.org (mailing list archive) |
---|---|
Series | arm64: dts: qcom: sc8280xp: PCIe fixes and GICv3 ITS enable |
On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> This series addresses a few problems with the sc8280xp PCIe
> implementation.
>
> The DWC PCIe controller can either use its internal MSI controller or an
> external one such as the GICv3 ITS. Enabling the latter allows for
> assigning affinity to individual interrupts, but results in a large
> amount of Correctable Errors being logged on both the Lenovo ThinkPad
> X13s and the sc8280xp-crd reference design.
>
> It turns out that these errors are always generated, but for some yet to
> be determined reason, the AER interrupts are never received when using
> the internal MSI controller, which makes the link errors harder to
> notice.
>
> On the X13s, there is a large number of errors generated when bringing
> up the link on boot. This is related to the fact that UEFI firmware has
> already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
> link at Gen3 generates a massive amount of errors until the Wi-Fi
> firmware is restarted. This has now also been shown to cause the Wi-Fi
> to sometimes not start at all on boot for some users.
>
> A recent commit enabling ASPM on certain Qualcomm platforms introduced
> further errors when using the Wi-Fi on the X13s as well as when
> accessing the NVMe on the CRD. The exact reason for this has not yet
> been identified, but disabling ASPM L0s makes the errors go away. This
> could suggest that either the current ASPM implementation is incomplete
> or that L0s is not supported with these devices.
>
> Note that the X13s and CRD use the same Wi-Fi controller, but the errors
> are only generated on the X13s. The NVMe controller on my X13s does not
> support L0s so there are no issues there, unlike on the CRD which uses a
> different controller. The modem on the CRD does not generate any errors,
> but both the NVMe and modem keeps bouncing in and out of L0s/L1 also
> when not used, which could indicate that there are bigger problems with
> the ASPM implementation. I don't have a modem on my X13s so I have not
> been able to test whether L0s causes any trouble there.
>
> Enabling AER error reporting on sc8280xp could similarly also reveal
> existing problems with the related sa8295p and sa8540p platforms as they
> share the base dtsi.
>
> After discussing this with Bjorn Andersson at Qualcomm we have decided
> to go ahead and disable L0s for all controllers on the CRD and the
> X13s.
>

Just received confirmation from Qcom that L0s is not supported for any of the
PCIe instances in sc8280xp (and its derivatives). Please move the property to
SoC dtsi.

- Mani

> Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
> significant impact on the power consumption (and there are indications
> that this applies generally for L0s on these platforms).
>
> ***
>
> As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> binding rework in linux-next so that the whole series can be merged for
> 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> stable backport anyway).
>
> The DT bindings and PCI patch are expected to go through the PCI tree,
> while Bjorn A takes the devicetree updates through the Qualcomm tree.
>
> Johan
>
>
> Changes in v3
>  - drop the two wifi link speed patches which have been picked up for
>    6.8
>  - rebase on binding rework in linux-next and add the properties also to
>    the new qcom,pcie-common.yaml
>     - https://lore.kernel.org/linux-pci/20240126-dt-bindings-pci-qcom-split-v3-0-f23cda4d74c0@linaro.org/
>  - fix an 'L0s' typo in one commit message
>
> Changes in v2
>  - drop RFC from ASPM patches and add stable tags
>  - reorder patches and move ITS patch last
>  - fix s/GB/MB/ typo in Gen2 speed commit messages
>  - fix an incorrect Fixes tag
>  - amend commit message X13 wifi link speed patch after user
>    confirmation that this fixes the wifi startup issue
>  - disable L0s also for modem and wifi on CRD
>  - disable L0s also for nvme and modem on X13s
>
>
> Johan Hovold (10):
>   dt-bindings: PCI: qcom: Allow 'required-opps'
>   dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
>   dt-bindings: PCI: qcom: Allow 'aspm-no-l0s'
>   PCI: qcom: Add support for disabling ASPM L0s in devicetree
>   arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
>   arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe
>   arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for modem and Wi-Fi
>   arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi
>   arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for NVMe and modem
>   arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe
>
>  .../bindings/pci/qcom,pcie-common.yaml        |  6 +++++-
>  .../devicetree/bindings/pci/qcom,pcie.yaml    |  6 +++++-
>  arch/arm64/boot/dts/qcom/sc8280xp-crd.dts     |  5 +++++
>  .../qcom/sc8280xp-lenovo-thinkpad-x13s.dts    |  5 +++++
>  arch/arm64/boot/dts/qcom/sc8280xp.dtsi        | 17 +++++++++++++++-
>  drivers/pci/controller/dwc/pcie-qcom.c        | 20 +++++++++++++++++++
>  6 files changed, 56 insertions(+), 3 deletions(-)
>
> --
> 2.43.0
>
On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > This series addresses a few problems with the sc8280xp PCIe
> > implementation.
> >
> > The DWC PCIe controller can either use its internal MSI controller or an
> > external one such as the GICv3 ITS. Enabling the latter allows for
> > assigning affinity to individual interrupts, but results in a large
> > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > X13s and the sc8280xp-crd reference design.
> >
> > It turns out that these errors are always generated, but for some yet to
> > be determined reason, the AER interrupts are never received when using
> > the internal MSI controller, which makes the link errors harder to
> > notice.

> > Enabling AER error reporting on sc8280xp could similarly also reveal
> > existing problems with the related sa8295p and sa8540p platforms as they
> > share the base dtsi.
> >
> > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > to go ahead and disable L0s for all controllers on the CRD and the
> > X13s.

> Just received confirmation from Qcom that L0s is not supported for any of the
> PCIe instances in sc8280xp (and its derivatives). Please move the property to
> SoC dtsi.

Ok, thanks for confirming. But then the devicetree property is not the
right way to handle this, and we should disable L0s based on the
compatible string instead.

> > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > binding rework in linux-next so that the whole series can be merged for
> > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > stable backport anyway).

I'll respin the series. Looks like we've already missed the chance to
enable ITS in 6.9 anyway.

Johan
On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > > This series addresses a few problems with the sc8280xp PCIe
> > > implementation.
> > >
> > > The DWC PCIe controller can either use its internal MSI controller or an
> > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > assigning affinity to individual interrupts, but results in a large
> > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > X13s and the sc8280xp-crd reference design.
> > >
> > > It turns out that these errors are always generated, but for some yet to
> > > be determined reason, the AER interrupts are never received when using
> > > the internal MSI controller, which makes the link errors harder to
> > > notice.
>
> > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > existing problems with the related sa8295p and sa8540p platforms as they
> > > share the base dtsi.
> > >
> > > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > > to go ahead and disable L0s for all controllers on the CRD and the
> > > X13s.
>
> > Just received confirmation from Qcom that L0s is not supported for any of the
> > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > SoC dtsi.
>
> Ok, thanks for confirming. But then the devicetree property is not the
> right way to handle this, and we should disable L0s based on the
> compatible string instead.
>

Hmm. I checked further and got the info that there is no change in the IP, but
the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
there will be AERs when L0s is enabled on any controller instance. And there
will be no updated PHY sequence in the future also for this chipset.

So yeah, let's disable it in the driver instead.

> > > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > > binding rework in linux-next so that the whole series can be merged for
> > > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > > stable backport anyway).
>
> I'll respin the series. Looks like we've already missed the chance to
> enable ITS in 6.9 anyway.
>

Sounds good, thanks!

- Mani
On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
<manivannan.sadhasivam@linaro.org> wrote:
>
> On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > > > This series addresses a few problems with the sc8280xp PCIe
> > > > implementation.
> > > >
> > > > The DWC PCIe controller can either use its internal MSI controller or an
> > > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > > assigning affinity to individual interrupts, but results in a large
> > > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > > X13s and the sc8280xp-crd reference design.
> > > >
> > > > It turns out that these errors are always generated, but for some yet to
> > > > be determined reason, the AER interrupts are never received when using
> > > > the internal MSI controller, which makes the link errors harder to
> > > > notice.
> >
> > > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > > existing problems with the related sa8295p and sa8540p platforms as they
> > > > share the base dtsi.
> > > >
> > > > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > > > to go ahead and disable L0s for all controllers on the CRD and the
> > > > X13s.
> >
> > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > SoC dtsi.
> >
> > Ok, thanks for confirming. But then the devicetree property is not the
> > right way to handle this, and we should disable L0s based on the
> > compatible string instead.
> >
>
> Hmm. I checked further and got the info that there is no change in the IP, but
> the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> there will be AERs when L0s is enabled on any controller instance. And there
> will be no updated PHY sequence in the future also for this chipset.

Why? If it is a bug in the PHY driver, it should be fixed there
instead of adding workarounds.

>
> So yeah, let's disable it in the driver instead.
>
> > > > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > > > binding rework in linux-next so that the whole series can be merged for
> > > > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > > > stable backport anyway).
> >
> > I'll respin the series. Looks like we've already missed the chance to
> > enable ITS in 6.9 anyway.
> >
>
> Sounds good, thanks!
>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்
>
On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:

> > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > SoC dtsi.

> > > Ok, thanks for confirming. But then the devicetree property is not the
> > > right way to handle this, and we should disable L0s based on the
> > > compatible string instead.

> > Hmm. I checked further and got the info that there is no change in the IP, but
> > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > there will be AERs when L0s is enabled on any controller instance. And there
> > will be no updated PHY sequence in the future also for this chipset.
>
> Why? If it is a bug in the PHY driver, it should be fixed there
> instead of adding workarounds.

ASPM L0s is currently broken on these platforms and, as far as I
understand, both under Windows and Linux. Since Qualcomm hasn't been
able to come up with the necessary PHY init sequences for these
platforms yet, I doubt they will suddenly appear in the near future.

So we need to disable L0s for now. If an updated PHY init sequence later
appears, we can always enable it again.

> > So yeah, let's disable it in the driver instead.

Johan
On Wed, 6 Mar 2024 at 11:12, Johan Hovold <johan@kernel.org> wrote:
>
> On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> > On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> > <manivannan.sadhasivam@linaro.org> wrote:
> > > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
>
> > > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > > SoC dtsi.
>
> > > > Ok, thanks for confirming. But then the devicetree property is not the
> > > > right way to handle this, and we should disable L0s based on the
> > > > compatible string instead.
>
> > > Hmm. I checked further and got the info that there is no change in the IP, but
> > > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > > there will be AERs when L0s is enabled on any controller instance. And there
> > > will be no updated PHY sequence in the future also for this chipset.
> >
> > Why? If it is a bug in the PHY driver, it should be fixed there
> > instead of adding workarounds.
>
> ASPM L0s is currently broken on these platforms and, as far as I
> understand, both under Windows and Linux. Since Qualcomm hasn't been
> able to come up with the necessary PHY init sequences for these
> platforms yet, I doubt they will suddenly appear in the near future.

I see. Ok, I retract my comment.

>
> So we need to disable L0s for now. If an updated PHY init sequence later
> appears, we can always enable it again.
>
> > > So yeah, let's disable it in the driver instead.
>
> Johan
On Wed, Mar 06, 2024 at 10:12:31AM +0100, Johan Hovold wrote:
> On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> > On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> > <manivannan.sadhasivam@linaro.org> wrote:
> > > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
>
> > > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > > SoC dtsi.
>
> > > > Ok, thanks for confirming. But then the devicetree property is not the
> > > > right way to handle this, and we should disable L0s based on the
> > > > compatible string instead.
>
> > > Hmm. I checked further and got the info that there is no change in the IP, but
> > > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > > there will be AERs when L0s is enabled on any controller instance. And there
> > > will be no updated PHY sequence in the future also for this chipset.
> >
> > Why? If it is a bug in the PHY driver, it should be fixed there
> > instead of adding workarounds.
>
> ASPM L0s is currently broken on these platforms and, as far as I
> understand, both under Windows and Linux. Since Qualcomm hasn't been
> able to come up with the necessary PHY init sequences for these
> platforms yet, I doubt they will suddenly appear in the near future.
>
> So we need to disable L0s for now. If an updated PHY init sequence later
> appears, we can always enable it again.
>

It could be the same case for all 'non-mobile' chipsets (automotive, compute,
modem). So instead of using the compatible, please add a flag and set that for
all non-mobile SoCs. Like the ones starting with SAxxx, SCxxx, SDxxx.

- Mani
On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
> > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:
> > > On Wed, Mar 06, 2024 at 12:03:02PM +0530, Manivannan Sadhasivam wrote:
> > > > On Tue, Mar 05, 2024 at 09:10:55AM +0100, Johan Hovold wrote:
> > > > > This series addresses a few problems with the sc8280xp PCIe
> > > > > implementation.
> > > > >
> > > > > The DWC PCIe controller can either use its internal MSI controller or an
> > > > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > > > assigning affinity to individual interrupts, but results in a large
> > > > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > > > X13s and the sc8280xp-crd reference design.
> > > > >
> > > > > It turns out that these errors are always generated, but for some yet to
> > > > > be determined reason, the AER interrupts are never received when using
> > > > > the internal MSI controller, which makes the link errors harder to
> > > > > notice.
> > >
> > > > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > > > existing problems with the related sa8295p and sa8540p platforms as they
> > > > > share the base dtsi.
> > > > >
> > > > > After discussing this with Bjorn Andersson at Qualcomm we have decided
> > > > > to go ahead and disable L0s for all controllers on the CRD and the
> > > > > X13s.
> > >
> > > > Just received confirmation from Qcom that L0s is not supported for any of the
> > > > PCIe instances in sc8280xp (and its derivatives). Please move the property to
> > > > SoC dtsi.
> > >
> > > Ok, thanks for confirming. But then the devicetree property is not the
> > > right way to handle this, and we should disable L0s based on the
> > > compatible string instead.
> > >
> >
> > Hmm. I checked further and got the info that there is no change in the IP, but
> > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > there will be AERs when L0s is enabled on any controller instance. And there
> > will be no updated PHY sequence in the future also for this chipset.
>
> Why? If it is a bug in the PHY driver, it should be fixed there
> instead of adding workarounds.
>

Fixing the L0s support requires the expertise of the PHY team and they will only
do if there is any real demand (like in the case of mobile chipsets). For compute
chipsets, they didn't do because most of the NVMe devices out there in the market
only support L1 and L1ss. So we have to live with this limitation for now.

- Mani

> >
> > So yeah, let's disable it in the driver instead.
> >
> > > > > As we are now at 6.8-rc7, I've rebased this series on the Qualcomm PCIe
> > > > > binding rework in linux-next so that the whole series can be merged for
> > > > > 6.9 (the 'aspm-no-l0s' support and devicetree fixes are all marked for
> > > > > stable backport anyway).
> > >
> > > I'll respin the series. Looks like we've already missed the chance to
> > > enable ITS in 6.9 anyway.
> > >
> >
> > Sounds good, thanks!
> >
> > - Mani
> >
> > --
> > மணிவண்ணன் சதாசிவம்
> >
>
>
> --
> With best wishes
> Dmitry
On Wed, Mar 06, 2024 at 03:08:57PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Mar 06, 2024 at 10:12:31AM +0100, Johan Hovold wrote:
> > On Wed, Mar 06, 2024 at 10:48:30AM +0200, Dmitry Baryshkov wrote:
> > > On Wed, 6 Mar 2024 at 10:39, Manivannan Sadhasivam
> > > <manivannan.sadhasivam@linaro.org> wrote:
> > > > On Wed, Mar 06, 2024 at 08:20:16AM +0100, Johan Hovold wrote:

> > > > > Ok, thanks for confirming. But then the devicetree property is not the
> > > > > right way to handle this, and we should disable L0s based on the
> > > > > compatible string instead.
> >
> > > > Hmm. I checked further and got the info that there is no change in the IP, but
> > > > the PHY sequence is not tuned correctly for L0s (as I suspected earlier). So
> > > > there will be AERs when L0s is enabled on any controller instance. And there
> > > > will be no updated PHY sequence in the future also for this chipset.
> > >
> > > Why? If it is a bug in the PHY driver, it should be fixed there
> > > instead of adding workarounds.
> >
> > ASPM L0s is currently broken on these platforms and, as far as I
> > understand, both under Windows and Linux. Since Qualcomm hasn't been
> > able to come up with the necessary PHY init sequences for these
> > platforms yet, I doubt they will suddenly appear in the near future.
> >
> > So we need to disable L0s for now. If an updated PHY init sequence later
> > appears, we can always enable it again.
>
> It could be the same case for all 'non-mobile' chipsets (automotive, compute,
> modem). So instead of using the compatible, please add a flag and set that for
> all non-mobile SoCs. Like the ones starting with SAxxx, SCxxx, SDxxx.

I've already updated the series and was just about to post it.

Disabling for further platforms would also require matching on the
compatible string and we can easily do that in a follow-up patch once we
have some confirmation that it is needed.

Johan
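
For readers following the thread, the approach agreed on here (a per-SoC flag
selected by compatible string, used to stop advertising L0s) can be sketched
roughly as below. This is plain user-space C for illustration only and is not
the pcie-qcom.c change from the series: the struct and function names are
hypothetical, "vendor,pcie-example" is a made-up compatible, and the Link
Capabilities value is passed in as an argument rather than read from the
controller. The masks correspond to the ASPM Support field (bits 11:10) of the
PCIe Link Capabilities register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASPM Support field of the PCIe Link Capabilities register (bits 11:10). */
#define LNKCAP_ASPM_L0S 0x00000400u
#define LNKCAP_ASPM_L1  0x00000800u

/* Hypothetical per-SoC configuration; only the L0s flag is modelled. */
struct pcie_cfg {
	bool no_l0s;
};

/* Hypothetical match entry: compatible string plus its configuration. */
struct pcie_match {
	const char *compatible;
	struct pcie_cfg cfg;
};

static const struct pcie_match pcie_match_table[] = {
	/* L0s reported as unsupported on sc8280xp in this thread. */
	{ "qcom,pcie-sc8280xp",  { .no_l0s = true  } },
	/* Made-up entry standing in for a platform where L0s works. */
	{ "vendor,pcie-example", { .no_l0s = false } },
};

/* Return the configuration for a compatible string, or NULL if unknown. */
static const struct pcie_cfg *pcie_lookup_cfg(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);

	for (i = 0; i < sizeof(pcie_match_table) / sizeof(pcie_match_table[0]); i++) {
		if (strcmp(pcie_match_table[i].compatible, compatible) == 0)
			return &pcie_match_table[i].cfg;
	}

	return NULL;
}

/*
 * Return the Link Capabilities value that should be advertised: the L0s
 * support bit is cleared when the matched configuration sets no_l0s, and
 * every other bit (including L1 support) is left unchanged.
 */
static uint32_t pcie_fixup_lnkcap(const struct pcie_cfg *cfg, uint32_t lnkcap)
{
	if (cfg != NULL && cfg->no_l0s)
		lnkcap &= ~LNKCAP_ASPM_L0S;

	return lnkcap;
}
```

With this layout, extending the quirk to further SoCs only means adding table
entries with no_l0s set, which matches the follow-up patch idea mentioned
above; restoring L0s later, should a tuned PHY init sequence appear, would
just mean clearing the flag again.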