diff mbox series

PCI: qcom: Implement shutdown() callback

Message ID 20250401-shutdown-v1-1-f699859403ae@oss.qualcomm.com (mailing list archive)
State New
Headers show
Series PCI: qcom: Implement shutdown() callback | expand

Commit Message

Krishna Chaitanya Chundru April 1, 2025, 11:21 a.m. UTC
From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

PCIe host controller drivers are supposed to properly remove the
endpoint drivers and release the resources during host shutdown/reboot
to avoid issues like smmu errors, NOC errors, etc.

So, stop and remove the root bus and its associated devices and release
its resources during system shutdown to ensure a clean shutdown/reboot.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
---
 drivers/pci/controller/dwc/pcie-qcom.c | 11 +++++++++++
 1 file changed, 11 insertions(+)


---
base-commit: b3ee1e4609512dfff642a96b34d7e5dfcdc92d05
change-id: 20250321-shutdown-9237fad7374c

Best regards,

Comments

Niklas Cassel April 2, 2025, 12:44 p.m. UTC | #1
Hello Krishna,

On Tue, Apr 01, 2025 at 04:51:37PM +0530, Krishna Chaitanya Chundru wrote:
> From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
>
> PCIe host controller drivers are supposed to properly remove the
> endpoint drivers and release the resources during host shutdown/reboot
> to avoid issues like smmu errors, NOC errors, etc.
>
> So, stop and remove the root bus and its associated devices and release
> its resources during system shutdown to ensure a clean shutdown/reboot.
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> ---
>  drivers/pci/controller/dwc/pcie-qcom.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index e4d3366ead1f9198693e6f9da4ae1dc40a3a0519..926811a0e63eb3663c1f41dc598659993546d832 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -1754,6 +1754,16 @@ static int qcom_pcie_probe(struct platform_device *pdev)
>	return ret;
>  }
>
> +static void qcom_pcie_shutdown(struct platform_device *pdev)
> +{
> +	struct qcom_pcie *pcie = platform_get_drvdata(pdev);
> +
> +	dw_pcie_host_deinit(&pcie->pci->pp);
> +	phy_exit(pcie->phy);
> +	pm_runtime_put(&pdev->dev);
> +	pm_runtime_disable(&pdev->dev);
> +}
> +
>  static int qcom_pcie_suspend_noirq(struct device *dev)
>  {
>	struct qcom_pcie *pcie = dev_get_drvdata(dev);
> @@ -1890,5 +1900,6 @@ static struct platform_driver qcom_pcie_driver = {
>		.pm = &qcom_pcie_pm_ops,
>		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
>	},
> +	.shutdown = qcom_pcie_shutdown,
>  };
>  builtin_platform_driver(qcom_pcie_driver);
>
> ---

Out of curiosity, I tried something similar to on pcie-dw-rockchip.c

Simply having a ->shutdown() callback that only calls dw_pcie_host_deinit()
was enough for me to produce:

[   40.209887] r8169 0004:41:00.0 eth0: Link is Down
[   40.216572] ------------[ cut here ]------------
[   40.216986] called from state HALTED
[   40.217317] WARNING: CPU: 7 PID: 265 at drivers/net/phy/phy.c:1630 phy_stop+0x134/0x1a0
[   40.218024] Modules linked in: rk805_pwrkey hantro_vpu v4l2_jpeg v4l2_vp9 v4l2_h264 v4l2_mem2mem videobuf2_v4l2 videobuf2_dma_contig videobuf2_memops videobuf2_common vidf
[   40.220267] CPU: 7 UID: 0 PID: 265 Comm: init Not tainted 6.14.0+ #134 PREEMPT
[   40.220908] Hardware name: Radxa ROCK 5B (DT)
[   40.221289] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   40.221899] pc : phy_stop+0x134/0x1a0
[   40.222222] lr : phy_stop+0x134/0x1a0
[   40.222546] sp : ffff800082213820
[   40.222836] x29: ffff800082213820 x28: ffff45ec84b30000 x27: 0000000000000000
[   40.223463] x26: 0000000000000000 x25: 0000000000000000 x24: ffffbe8df7fde030
[   40.224088] x23: ffff800082213990 x22: 0000000000000001 x21: ffff45ec80e10000
[   40.224714] x20: ffff45ec82cb40c8 x19: ffff45ec82ccc000 x18: 0000000000000006
[   40.225340] x17: 000000040044ffff x16: 005000f2b5503510 x15: 0720072007200720
[   40.225966] x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720
[   40.226592] x11: 0000000000000058 x10: 0000000000000018 x9 : ffffbe8df556469c
[   40.227217] x8 : 0000000000000268 x7 : ffffbe8df7a48648 x6 : ffffbe8df7a48648
[   40.227842] x5 : 0000000000017fe8 x4 : 0000000000000000 x3 : 0000000000000000
[   40.228468] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff45ec84b30000
[   40.229093] Call trace:
[   40.229308]  phy_stop+0x134/0x1a0 (P)
[   40.229634]  rtl8169_down+0x34/0x280
[   40.229952]  rtl8169_close+0x64/0x100
[   40.230275]  __dev_close_many+0xbc/0x1f0
[   40.230621]  dev_close_many+0x94/0x160
[   40.230951]  unregister_netdevice_many_notify+0x14c/0x9c0
[   40.231426]  unregister_netdevice_queue+0xe4/0x100
[   40.231848]  unregister_netdev+0x2c/0x60
[   40.232193]  rtl_remove_one+0xa0/0xe0
[   40.232517]  pci_device_remove+0x4c/0xf8
[   40.232864]  device_remove+0x54/0x90
[   40.233182]  device_release_driver_internal+0x1d4/0x238
[   40.233643]  device_release_driver+0x20/0x38
[   40.234019]  pci_stop_bus_device+0x84/0xe0
[   40.234381]  pci_stop_bus_device+0x40/0xe0
[   40.234741]  pci_stop_root_bus+0x48/0x80
[   40.235087]  dw_pcie_host_deinit+0x34/0xe0
[   40.235452]  rockchip_pcie_shutdown+0x24/0x48
[   40.235839]  platform_shutdown+0x2c/0x48
[   40.236187]  device_shutdown+0x150/0x278
[   40.236533]  kernel_restart+0x4c/0xb8
[   40.236859]  __do_sys_reboot+0x178/0x280
[   40.237206]  __arm64_sys_reboot+0x2c/0x40
[   40.237561]  invoke_syscall+0x50/0x120
[   40.237891]  el0_svc_common.constprop.0+0x48/0xf0
[   40.238305]  do_el0_svc+0x24/0x38
[   40.238597]  el0_svc+0x30/0xd0
[   40.238868]  el0t_64_sync_handler+0x10c/0x138
[   40.239251]  el0t_64_sync+0x198/0x1a0
[   40.239575] ---[ end trace 0000000000000000 ]---

Did you try your change with a simple network card connected to the PCI slot?
(And not just another qcom board running in EP mode.)

I don't see why you wouldn't see the same thing as me.


Kind regards,
Niklas
Krishna Chaitanya Chundru April 3, 2025, 3:56 a.m. UTC | #2
On 4/2/2025 6:14 PM, Niklas Cassel wrote:
> Hello Krishna,
> 
> On Tue, Apr 01, 2025 at 04:51:37PM +0530, Krishna Chaitanya Chundru wrote:
>> From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
>>
>> PCIe host controller drivers are supposed to properly remove the
>> endpoint drivers and release the resources during host shutdown/reboot
>> to avoid issues like smmu errors, NOC errors, etc.
>>
>> So, stop and remove the root bus and its associated devices and release
>> its resources during system shutdown to ensure a clean shutdown/reboot.
>>
>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
>> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
>> ---
>>   drivers/pci/controller/dwc/pcie-qcom.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
>> index e4d3366ead1f9198693e6f9da4ae1dc40a3a0519..926811a0e63eb3663c1f41dc598659993546d832 100644
>> --- a/drivers/pci/controller/dwc/pcie-qcom.c
>> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
>> @@ -1754,6 +1754,16 @@ static int qcom_pcie_probe(struct platform_device *pdev)
>> 	return ret;
>>   }
>>
>> +static void qcom_pcie_shutdown(struct platform_device *pdev)
>> +{
>> +	struct qcom_pcie *pcie = platform_get_drvdata(pdev);
>> +
>> +	dw_pcie_host_deinit(&pcie->pci->pp);
>> +	phy_exit(pcie->phy);
>> +	pm_runtime_put(&pdev->dev);
>> +	pm_runtime_disable(&pdev->dev);
>> +}
>> +
>>   static int qcom_pcie_suspend_noirq(struct device *dev)
>>   {
>> 	struct qcom_pcie *pcie = dev_get_drvdata(dev);
>> @@ -1890,5 +1900,6 @@ static struct platform_driver qcom_pcie_driver = {
>> 		.pm = &qcom_pcie_pm_ops,
>> 		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
>> 	},
>> +	.shutdown = qcom_pcie_shutdown,
>>   };
>>   builtin_platform_driver(qcom_pcie_driver);
>>
>> ---
> 
> Out of curiosity, I tried something similar to on pcie-dw-rockchip.c
> 
> Simply having a ->shutdown() callback that only calls dw_pcie_host_deinit()
> was enough for me to produce:
> 
> [   40.209887] r8169 0004:41:00.0 eth0: Link is Down
> [   40.216572] ------------[ cut here ]------------
> [   40.216986] called from state HALTED
> [   40.217317] WARNING: CPU: 7 PID: 265 at drivers/net/phy/phy.c:1630 phy_stop+0x134/0x1a0
> [   40.218024] Modules linked in: rk805_pwrkey hantro_vpu v4l2_jpeg v4l2_vp9 v4l2_h264 v4l2_mem2mem videobuf2_v4l2 videobuf2_dma_contig videobuf2_memops videobuf2_common vidf
> [   40.220267] CPU: 7 UID: 0 PID: 265 Comm: init Not tainted 6.14.0+ #134 PREEMPT
> [   40.220908] Hardware name: Radxa ROCK 5B (DT)
> [   40.221289] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   40.221899] pc : phy_stop+0x134/0x1a0
> [   40.222222] lr : phy_stop+0x134/0x1a0
> [   40.222546] sp : ffff800082213820
> [   40.222836] x29: ffff800082213820 x28: ffff45ec84b30000 x27: 0000000000000000
> [   40.223463] x26: 0000000000000000 x25: 0000000000000000 x24: ffffbe8df7fde030
> [   40.224088] x23: ffff800082213990 x22: 0000000000000001 x21: ffff45ec80e10000
> [   40.224714] x20: ffff45ec82cb40c8 x19: ffff45ec82ccc000 x18: 0000000000000006
> [   40.225340] x17: 000000040044ffff x16: 005000f2b5503510 x15: 0720072007200720
> [   40.225966] x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720
> [   40.226592] x11: 0000000000000058 x10: 0000000000000018 x9 : ffffbe8df556469c
> [   40.227217] x8 : 0000000000000268 x7 : ffffbe8df7a48648 x6 : ffffbe8df7a48648
> [   40.227842] x5 : 0000000000017fe8 x4 : 0000000000000000 x3 : 0000000000000000
> [   40.228468] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff45ec84b30000
> [   40.229093] Call trace:
> [   40.229308]  phy_stop+0x134/0x1a0 (P)
> [   40.229634]  rtl8169_down+0x34/0x280
> [   40.229952]  rtl8169_close+0x64/0x100
> [   40.230275]  __dev_close_many+0xbc/0x1f0
> [   40.230621]  dev_close_many+0x94/0x160
> [   40.230951]  unregister_netdevice_many_notify+0x14c/0x9c0
> [   40.231426]  unregister_netdevice_queue+0xe4/0x100
> [   40.231848]  unregister_netdev+0x2c/0x60
> [   40.232193]  rtl_remove_one+0xa0/0xe0
> [   40.232517]  pci_device_remove+0x4c/0xf8
> [   40.232864]  device_remove+0x54/0x90
> [   40.233182]  device_release_driver_internal+0x1d4/0x238
> [   40.233643]  device_release_driver+0x20/0x38
> [   40.234019]  pci_stop_bus_device+0x84/0xe0
> [   40.234381]  pci_stop_bus_device+0x40/0xe0
> [   40.234741]  pci_stop_root_bus+0x48/0x80
> [   40.235087]  dw_pcie_host_deinit+0x34/0xe0
> [   40.235452]  rockchip_pcie_shutdown+0x24/0x48
> [   40.235839]  platform_shutdown+0x2c/0x48
> [   40.236187]  device_shutdown+0x150/0x278
> [   40.236533]  kernel_restart+0x4c/0xb8
> [   40.236859]  __do_sys_reboot+0x178/0x280
> [   40.237206]  __arm64_sys_reboot+0x2c/0x40
> [   40.237561]  invoke_syscall+0x50/0x120
> [   40.237891]  el0_svc_common.constprop.0+0x48/0xf0
> [   40.238305]  do_el0_svc+0x24/0x38
> [   40.238597]  el0_svc+0x30/0xd0
> [   40.238868]  el0t_64_sync_handler+0x10c/0x138
> [   40.239251]  el0t_64_sync+0x198/0x1a0
> [   40.239575] ---[ end trace 0000000000000000 ]---
> 
Hi Niklas,

The issue which you are seeing with specific to the RealTek ethernet
driver and should be fixed by RealTek driver nothing to do with the host
controller.
> Did you try your change with a simple network card connected to the PCI slot?
> (And not just another qcom board running in EP mode.)
> 
we don't have ethernet card's to try we tried with qcom PCIe switch
which usb hub connected as a endpoint.

- Krishna Chaitanya.
> I don't see why you wouldn't see the same thing as me.
> 
> 
> Kind regards,
> Niklas
Niklas Cassel April 3, 2025, 7:51 a.m. UTC | #3
Hello Krishna,

On Thu, Apr 03, 2025 at 09:26:08AM +0530, Krishna Chaitanya Chundru wrote:
> On 4/2/2025 6:14 PM, Niklas Cassel wrote:
> > 
> > Out of curiosity, I tried something similar to on pcie-dw-rockchip.c
> > 
> > Simply having a ->shutdown() callback that only calls dw_pcie_host_deinit()
> > was enough for me to produce:
> > 
> > [   40.209887] r8169 0004:41:00.0 eth0: Link is Down
> > [   40.216572] ------------[ cut here ]------------
> > [   40.216986] called from state HALTED
> > [   40.217317] WARNING: CPU: 7 PID: 265 at drivers/net/phy/phy.c:1630 phy_stop+0x134/0x1a0
> > [   40.218024] Modules linked in: rk805_pwrkey hantro_vpu v4l2_jpeg v4l2_vp9 v4l2_h264 v4l2_mem2mem videobuf2_v4l2 videobuf2_dma_contig videobuf2_memops videobuf2_common vidf
> > [   40.220267] CPU: 7 UID: 0 PID: 265 Comm: init Not tainted 6.14.0+ #134 PREEMPT
> > [   40.220908] Hardware name: Radxa ROCK 5B (DT)
> > [   40.221289] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   40.221899] pc : phy_stop+0x134/0x1a0
> > [   40.222222] lr : phy_stop+0x134/0x1a0
> > [   40.222546] sp : ffff800082213820
> > [   40.222836] x29: ffff800082213820 x28: ffff45ec84b30000 x27: 0000000000000000
> > [   40.223463] x26: 0000000000000000 x25: 0000000000000000 x24: ffffbe8df7fde030
> > [   40.224088] x23: ffff800082213990 x22: 0000000000000001 x21: ffff45ec80e10000
> > [   40.224714] x20: ffff45ec82cb40c8 x19: ffff45ec82ccc000 x18: 0000000000000006
> > [   40.225340] x17: 000000040044ffff x16: 005000f2b5503510 x15: 0720072007200720
> > [   40.225966] x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720
> > [   40.226592] x11: 0000000000000058 x10: 0000000000000018 x9 : ffffbe8df556469c
> > [   40.227217] x8 : 0000000000000268 x7 : ffffbe8df7a48648 x6 : ffffbe8df7a48648
> > [   40.227842] x5 : 0000000000017fe8 x4 : 0000000000000000 x3 : 0000000000000000
> > [   40.228468] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff45ec84b30000
> > [   40.229093] Call trace:
> > [   40.229308]  phy_stop+0x134/0x1a0 (P)
> > [   40.229634]  rtl8169_down+0x34/0x280
> > [   40.229952]  rtl8169_close+0x64/0x100
> > [   40.230275]  __dev_close_many+0xbc/0x1f0
> > [   40.230621]  dev_close_many+0x94/0x160
> > [   40.230951]  unregister_netdevice_many_notify+0x14c/0x9c0
> > [   40.231426]  unregister_netdevice_queue+0xe4/0x100
> > [   40.231848]  unregister_netdev+0x2c/0x60
> > [   40.232193]  rtl_remove_one+0xa0/0xe0
> > [   40.232517]  pci_device_remove+0x4c/0xf8
> > [   40.232864]  device_remove+0x54/0x90
> > [   40.233182]  device_release_driver_internal+0x1d4/0x238
> > [   40.233643]  device_release_driver+0x20/0x38
> > [   40.234019]  pci_stop_bus_device+0x84/0xe0
> > [   40.234381]  pci_stop_bus_device+0x40/0xe0
> > [   40.234741]  pci_stop_root_bus+0x48/0x80
> > [   40.235087]  dw_pcie_host_deinit+0x34/0xe0
> > [   40.235452]  rockchip_pcie_shutdown+0x24/0x48
> > [   40.235839]  platform_shutdown+0x2c/0x48
> > [   40.236187]  device_shutdown+0x150/0x278
> > [   40.236533]  kernel_restart+0x4c/0xb8
> > [   40.236859]  __do_sys_reboot+0x178/0x280
> > [   40.237206]  __arm64_sys_reboot+0x2c/0x40
> > [   40.237561]  invoke_syscall+0x50/0x120
> > [   40.237891]  el0_svc_common.constprop.0+0x48/0xf0
> > [   40.238305]  do_el0_svc+0x24/0x38
> > [   40.238597]  el0_svc+0x30/0xd0
> > [   40.238868]  el0t_64_sync_handler+0x10c/0x138
> > [   40.239251]  el0t_64_sync+0x198/0x1a0
> > [   40.239575] ---[ end trace 0000000000000000 ]---
> > 
> Hi Niklas,
> 
> The issue which you are seeing with specific to the RealTek ethernet
> driver and should be fixed by RealTek driver nothing to do with the host
> controller.

The warning comes from:
drivers/net/phy/phy.c:phy_stop()
so from the networking phylib.

Doing a simple:
$ git grep -p phy_stop

shows that practially all Ethernet drivers call phy_stop() from the
.ndo_stop() callback.

So after your suggested patch, you should see this warning appear with
any NIC, if connected to your PCIe controller.

You might not care, which is fine, I simply wanted to inform you about it,
since it is not only realtek, it is any NIC.


Kind regards,
Niklas
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index e4d3366ead1f9198693e6f9da4ae1dc40a3a0519..926811a0e63eb3663c1f41dc598659993546d832 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -1754,6 +1754,16 @@  static int qcom_pcie_probe(struct platform_device *pdev)
 	return ret;
 }
 
+static void qcom_pcie_shutdown(struct platform_device *pdev)
+{
+	struct qcom_pcie *pcie = platform_get_drvdata(pdev);
+
+	dw_pcie_host_deinit(&pcie->pci->pp);
+	phy_exit(pcie->phy);
+	pm_runtime_put(&pdev->dev);
+	pm_runtime_disable(&pdev->dev);
+}
+
 static int qcom_pcie_suspend_noirq(struct device *dev)
 {
 	struct qcom_pcie *pcie = dev_get_drvdata(dev);
@@ -1890,5 +1900,6 @@  static struct platform_driver qcom_pcie_driver = {
 		.pm = &qcom_pcie_pm_ops,
 		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
 	},
+	.shutdown = qcom_pcie_shutdown,
 };
 builtin_platform_driver(qcom_pcie_driver);