Message ID | 20231120084014.108274-1-manivannan.sadhasivam@linaro.org (mailing list archive) |
---|---|
Headers | show |
Series | PCI: designware-ep: Fix DBI access before core init | expand |
Hello, > This series is the continuation of previous work by Vidya Sagar [1] to fix the > issues related to accessing DBI register space before completing the core > initialization in some EP platforms like Tegra194/234 and Qcom SM8450. Applied to controller/dwc-ep, thank you! [01/02] PCI: designware-ep: Fix DBI access before core init https://git.kernel.org/pci/pci/c/d3d13b00a2cf [02/02] PCI: designware-ep: Move pci_epc_init_notify() inside dw_pcie_ep_init_complete() https://git.kernel.org/pci/pci/c/a171e1d60dad Krzysztof
Hello, > > This series is the continuation of previous work by Vidya Sagar [1] to fix the > > issues related to accessing DBI register space before completing the core > > initialization in some EP platforms like Tegra194/234 and Qcom SM8450. > > Applied to controller/dwc-ep, thank you! > > [01/02] PCI: designware-ep: Fix DBI access before core init > https://git.kernel.org/pci/pci/c/d3d13b00a2cf > [02/02] PCI: designware-ep: Move pci_epc_init_notify() inside dw_pcie_ep_init_complete() > https://git.kernel.org/pci/pci/c/a171e1d60dad Manivannan, I applied this series to "controller/dwc-ep" rather than "controller/dwc" so that I can have a look at the conflicts we have between this series and the current "controller/dwc" branch. There have been changes from both Shimoda-san and Frank Li that both changed some things and moved some things around within the code base, so this series no longer cleanly applies. However, I wanted the CI system to pick the branch for testing, as the sooner, the better. Hopefully, we can resolve the conflicts before Bjorn sends his Pull Request with 6.8 changes. Krzysztof
On Sun, Jan 07, 2024 at 04:27:07PM +0900, Krzysztof Wilczyński wrote: > Hello, Hello Krzysztof, Manivannan, > > > This series is the continuation of previous work by Vidya Sagar [1] to fix the > > issues related to accessing DBI register space before completing the core > > initialization in some EP platforms like Tegra194/234 and Qcom SM8450. > > Applied to controller/dwc-ep, thank you! > > [01/02] PCI: designware-ep: Fix DBI access before core init > https://git.kernel.org/pci/pci/c/d3d13b00a2cf > [02/02] PCI: designware-ep: Move pci_epc_init_notify() inside dw_pcie_ep_init_complete() > https://git.kernel.org/pci/pci/c/a171e1d60dad > > Krzysztof Considering that we know that this series introduces new problems for drivers with a .core_init_notifier (i.e. tegra and qcom), see: https://lore.kernel.org/linux-pci/ZWYmX8Y%2F7Q9WMxES@x1-carbon/ Do we really want to apply this series as is? Reading the patch, it appears that (at least some) tegra and qcom boards currently causes a whole system hang, which IIUC, renders those boards unusable. So perhaps the new issues introduced by this series is preferable to a whole system hang. As such, I can understand the urgency to merge this series. However, at the very least, I think that it might be worth amending the commit message to mention that this will currently not deregister the dma device in a clean way, leading to e.g. superfluous entries in /sys/class/dma/ and debugfs warnings being printed to dmesg. Kind regards, Niklas
On Tue, Jan 09, 2024 at 05:58:53PM +0000, Niklas Cassel wrote: > On Sun, Jan 07, 2024 at 04:27:07PM +0900, Krzysztof Wilczyński wrote: > > Hello, > > Hello Krzysztof, Manivannan, > > > > > > This series is the continuation of previous work by Vidya Sagar [1] to fix the > > > issues related to accessing DBI register space before completing the core > > > initialization in some EP platforms like Tegra194/234 and Qcom SM8450. > > > > Applied to controller/dwc-ep, thank you! > > > > [01/02] PCI: designware-ep: Fix DBI access before core init > > https://git.kernel.org/pci/pci/c/d3d13b00a2cf > > [02/02] PCI: designware-ep: Move pci_epc_init_notify() inside dw_pcie_ep_init_complete() > > https://git.kernel.org/pci/pci/c/a171e1d60dad > > > > Krzysztof > > Considering that we know that this series introduces new problems > for drivers with a .core_init_notifier (i.e. tegra and qcom), see: > https://lore.kernel.org/linux-pci/ZWYmX8Y%2F7Q9WMxES@x1-carbon/ > > Do we really want to apply this series as is? > > Niklas, I think I explained it in this thread itself. Let me reiterate here again. The fact that you are seeing the dmaengine warnings is due to function drivers not releasing the channels properly. It is not the job of the DWC driver to release the channels. The channels are requested by the function drivers [1] and they _should_ release them when the channels are no longer required. I know that the PCI_EPF_TEST driver is not doing so and so you are seeing the warnings. But I do not have a device to test that function driver. Qcom platforms use a dedicated function driver and that releases the channels when it gets the LINK_DOWN event from EPC [2]. So my conclusion is that the issue is there even without this series. If you still want me to fix the EPF_TEST driver, I can submit a change, but someone has to test it. - Mani [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/endpoint/functions/pci-epf-test.c#n229 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/endpoint/functions/pci-epf-mhi.c#n563 > Reading the patch, it appears that (at least some) tegra and > qcom boards currently causes a whole system hang, which IIUC, > renders those boards unusable. > > So perhaps the new issues introduced by this series is preferable > to a whole system hang. > > As such, I can understand the urgency to merge this series. > However, at the very least, I think that it might be worth amending > the commit message to mention that this will currently not deregister > the dma device in a clean way, leading to e.g. superfluous entries in > /sys/class/dma/ and debugfs warnings being printed to dmesg. > > > Kind regards, > Niklas
Hello Mani, On Wed, Jan 10, 2024 at 08:41:37AM +0530, Manivannan Sadhasivam wrote: > On Tue, Jan 09, 2024 at 05:58:53PM +0000, Niklas Cassel wrote: > > On Sun, Jan 07, 2024 at 04:27:07PM +0900, Krzysztof Wilczyński wrote: > > > > Considering that we know that this series introduces new problems > > for drivers with a .core_init_notifier (i.e. tegra and qcom), see: > > https://lore.kernel.org/linux-pci/ZWYmX8Y%2F7Q9WMxES@x1-carbon/ > > > > Do we really want to apply this series as is? > > > > > > Niklas, I think I explained it in this thread itself. Let me reiterate here > again. > > The fact that you are seeing the dmaengine warnings is due to function drivers > not releasing the channels properly. It is not the job of the DWC driver to > release the channels. The channels are requested by the function drivers [1] > and they _should_ release them when the channels are no longer required. Sure, the function driver should release the channels. > > I know that the PCI_EPF_TEST driver is not doing so and so you are seeing the > warnings. But I do not have a device to test that function driver. Qcom > platforms use a dedicated function driver and that releases the channels when it > gets the LINK_DOWN event from EPC [2]. > > So my conclusion is that the issue is there even without this series. If you > still want me to fix the EPF_TEST driver, I can submit a change, but someone has > to test it. That conclusion is not fully correct. Let's take e.g. these error messages that this series introduces: [ 1000.714355] debugfs: File 'mf' in directory '/' already present! [ 1000.714890] debugfs: File 'wr_ch_cnt' in directory '/' already present! [ 1000.715476] debugfs: File 'rd_ch_cnt' in directory '/' already present! [ 1000.716061] debugfs: Directory 'registers' with parent '/' already present! These come from dw_edma_core_debugfs_on(), which is called by dw_edma_probe(). This is a direct result from your patch: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=controller/dwc-ep&id=9ab5c8bb7a305135b1b6c65cb8db92b4acbef79d Which moves dw_pcie_edma_detect() from dw_pcie_ep_init_complete() to dw_pcie_ep_late_init() (since dw_pcie_edma_detect() calls dw_edma_probe()). So without your patch, those debugfs error messages are not seen. Thus, I do not think that it is sufficient to only modify the pci-epf-test driver to release the dma channels, as I don't see how that will avoid e.g. the debugfs error messages introduced by this patch. Kind regards, Niklas
On Wed, Jan 10, 2024 at 10:01:08AM +0000, Niklas Cassel wrote: > Hello Mani, > > On Wed, Jan 10, 2024 at 08:41:37AM +0530, Manivannan Sadhasivam wrote: > > On Tue, Jan 09, 2024 at 05:58:53PM +0000, Niklas Cassel wrote: > > > On Sun, Jan 07, 2024 at 04:27:07PM +0900, Krzysztof Wilczyński wrote: > > > > > > Considering that we know that this series introduces new problems > > > for drivers with a .core_init_notifier (i.e. tegra and qcom), see: > > > https://lore.kernel.org/linux-pci/ZWYmX8Y%2F7Q9WMxES@x1-carbon/ > > > > > > Do we really want to apply this series as is? > > > > > > > > > > Niklas, I think I explained it in this thread itself. Let me reiterate here > > again. > > > > The fact that you are seeing the dmaengine warnings is due to function drivers > > not releasing the channels properly. It is not the job of the DWC driver to > > release the channels. The channels are requested by the function drivers [1] > > and they _should_ release them when the channels are no longer required. > > Sure, the function driver should release the channels. > > > > > > I know that the PCI_EPF_TEST driver is not doing so and so you are seeing the > > warnings. But I do not have a device to test that function driver. Qcom > > platforms use a dedicated function driver and that releases the channels when it > > gets the LINK_DOWN event from EPC [2]. > > > > So my conclusion is that the issue is there even without this series. If you > > still want me to fix the EPF_TEST driver, I can submit a change, but someone has > > to test it. > > That conclusion is not fully correct. > > Let's take e.g. these error messages that this series introduces: > [ 1000.714355] debugfs: File 'mf' in directory '/' already present! > [ 1000.714890] debugfs: File 'wr_ch_cnt' in directory '/' already present! > [ 1000.715476] debugfs: File 'rd_ch_cnt' in directory '/' already present! > [ 1000.716061] debugfs: Directory 'registers' with parent '/' already present! > > These come from dw_edma_core_debugfs_on(), which is called by dw_edma_probe(). > > This is a direct result from your patch: > https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=controller/dwc-ep&id=9ab5c8bb7a305135b1b6c65cb8db92b4acbef79d > > Which moves dw_pcie_edma_detect() from dw_pcie_ep_init_complete() to > dw_pcie_ep_late_init() (since dw_pcie_edma_detect() calls dw_edma_probe()). > > So without your patch, those debugfs error messages are not seen. > > Thus, I do not think that it is sufficient to only modify the pci-epf-test > driver to release the dma channels, as I don't see how that will avoid e.g. > the debugfs error messages introduced by this patch. > Ah, sorry I overlooked the warnings from the edma core. I think adding dw_pcie_edma_remove() to the perst_assert() function would fix this issue. But I'm traveling this week, so couldn't verify it on the device. Bjorn, Krzysztof feel free to drop this series for 6.8. I will modify this series to address some other issues discussed so far and resubmit it for 6.9. - Mani > > Kind regards, > Niklas