Message ID | 20231016040132.23824-1-kai.heng.feng@canonical.com (mailing list archive) |
---|---|
State | Handled Elsewhere |
Delegated to: | Netdev Maintainers |
Series | PCI: pciehp: Prevent child devices from doing RPM on PCIe Link Down |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Mon, Oct 16, 2023 at 12:01:31PM +0800, Kai-Heng Feng wrote:
> When inserting an SD7.0 card to Realtek card reader, it can trigger PCI
> slot Link down and causes the following error:

Why does *inserting* a card cause a Link Down?

> [ 63.898861] pcieport 0000:00:1c.0: pciehp: Slot(8): Link Down
> [ 63.912118] BUG: unable to handle page fault for address: ffffb24d403e5010
[...]
> [ 63.912198] ? asm_exc_page_fault+0x27/0x30
> [ 63.912203] ? ioread32+0x2e/0x70
> [ 63.912206] ? rtsx_pci_write_register+0x5b/0x90 [rtsx_pci]
> [ 63.912217] rtsx_set_l1off_sub+0x1c/0x30 [rtsx_pci]
> [ 63.912226] rts5261_set_l1off_cfg_sub_d0+0x36/0x40 [rtsx_pci]
> [ 63.912234] rtsx_pci_runtime_idle+0xc7/0x160 [rtsx_pci]
> [ 63.912243] ? __pfx_pci_pm_runtime_idle+0x10/0x10
> [ 63.912246] pci_pm_runtime_idle+0x34/0x70
> [ 63.912248] rpm_idle+0xc4/0x2b0
> [ 63.912251] pm_runtime_work+0x93/0xc0
> [ 63.912254] process_one_work+0x21a/0x430
> [ 63.912258] worker_thread+0x4a/0x3c0

This looks like pcr->remap_addr is accessed after it has been iounmap'ed
in rtsx_pci_remove() or before it has been iomap'ed in rtsx_pci_probe().

Is the card reader itself located below a hotplug port and unplugged here?
Or is this about the card being removed from the card reader?

Having full dmesg output and lspci -vvv output attached to a bugzilla
would help to understand what is going on.

Thanks,

Lukas
On Mon, Oct 16, 2023 at 5:32 PM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Mon, Oct 16, 2023 at 12:01:31PM +0800, Kai-Heng Feng wrote:
> > When inserting an SD7.0 card to Realtek card reader, it can trigger PCI
> > slot Link down and causes the following error:
>
> Why does *inserting* a card cause a Link Down?

Ricky, do you know the reason why Link Down happens?

>
> > [ 63.898861] pcieport 0000:00:1c.0: pciehp: Slot(8): Link Down
> > [ 63.912118] BUG: unable to handle page fault for address: ffffb24d403e5010
> [...]
> > [ 63.912198] ? asm_exc_page_fault+0x27/0x30
> > [ 63.912203] ? ioread32+0x2e/0x70
> > [ 63.912206] ? rtsx_pci_write_register+0x5b/0x90 [rtsx_pci]
> > [ 63.912217] rtsx_set_l1off_sub+0x1c/0x30 [rtsx_pci]
> > [ 63.912226] rts5261_set_l1off_cfg_sub_d0+0x36/0x40 [rtsx_pci]
> > [ 63.912234] rtsx_pci_runtime_idle+0xc7/0x160 [rtsx_pci]
> > [ 63.912243] ? __pfx_pci_pm_runtime_idle+0x10/0x10
> > [ 63.912246] pci_pm_runtime_idle+0x34/0x70
> > [ 63.912248] rpm_idle+0xc4/0x2b0
> > [ 63.912251] pm_runtime_work+0x93/0xc0
> > [ 63.912254] process_one_work+0x21a/0x430
> > [ 63.912258] worker_thread+0x4a/0x3c0
>
> This looks like pcr->remap_addr is accessed after it has been iounmap'ed
> in rtsx_pci_remove() or before it has been iomap'ed in rtsx_pci_probe().
>
> Is the card reader itself located below a hotplug port and unplugged here?
> Or is this about the card being removed from the card reader?
>
> Having full dmesg output and lspci -vvv output attached to a bugzilla
> would help to understand what is going on.

I don't have the hardware so we need Ricky to provide more information here.

Regardless of the cardreader issue, do you have any concern on the patch itself?

Kai-Heng

>
> Thanks,
>
> Lukas
> -----Original Message-----
> From: Kai-Heng Feng <kai.heng.feng@canonical.com>
>
> On Mon, Oct 16, 2023 at 5:32 PM Lukas Wunner <lukas@wunner.de> wrote:
> >
> > On Mon, Oct 16, 2023 at 12:01:31PM +0800, Kai-Heng Feng wrote:
> > > When inserting an SD7.0 card to Realtek card reader, it can trigger
> > > PCI slot Link down and causes the following error:
> >
> > Why does *inserting* a card cause a Link Down?
>
> Ricky, do you know the reason why Link Down happens?
>

Because an SD7.0 card uses the pcie-nvme driver, the reader needs to re-link and then operate over the PCIe channel only.

> >
> > > [ 63.898861] pcieport 0000:00:1c.0: pciehp: Slot(8): Link Down
> > > [ 63.912118] BUG: unable to handle page fault for address: ffffb24d403e5010
> > [...]
> > > [ 63.912198] ? asm_exc_page_fault+0x27/0x30
> > > [ 63.912203] ? ioread32+0x2e/0x70
> > > [ 63.912206] ? rtsx_pci_write_register+0x5b/0x90 [rtsx_pci]
> > > [ 63.912217] rtsx_set_l1off_sub+0x1c/0x30 [rtsx_pci]
> > > [ 63.912226] rts5261_set_l1off_cfg_sub_d0+0x36/0x40 [rtsx_pci]
> > > [ 63.912234] rtsx_pci_runtime_idle+0xc7/0x160 [rtsx_pci]
> > > [ 63.912243] ? __pfx_pci_pm_runtime_idle+0x10/0x10
> > > [ 63.912246] pci_pm_runtime_idle+0x34/0x70
> > > [ 63.912248] rpm_idle+0xc4/0x2b0
> > > [ 63.912251] pm_runtime_work+0x93/0xc0
> > > [ 63.912254] process_one_work+0x21a/0x430
> > > [ 63.912258] worker_thread+0x4a/0x3c0
> >
> > This looks like pcr->remap_addr is accessed after it has been
> > iounmap'ed in rtsx_pci_remove() or before it has been iomap'ed in
> > rtsx_pci_probe().
> >
> > Is the card reader itself located below a hotplug port and unplugged here?
> > Or is this about the card being removed from the card reader?
> >
> > Having full dmesg output and lspci -vvv output attached to a bugzilla
> > would help to understand what is going on.
>
> I don't have the hardware so we need Ricky to provide more information here.
>
> Regardless of the cardreader issue, do you have any concern on the patch
> itself?
>
> Kai-Heng
>
> >
> > Thanks,
> >
> > Lukas
> On Mon, Oct 16, 2023 at 12:01:31PM +0800, Kai-Heng Feng wrote:
> > When inserting an SD7.0 card to Realtek card reader, it can trigger
> > PCI slot Link down and causes the following error:
>
> Why does *inserting* a card cause a Link Down?
>
> > [ 63.898861] pcieport 0000:00:1c.0: pciehp: Slot(8): Link Down
> > [ 63.912118] BUG: unable to handle page fault for address: ffffb24d403e5010
> [...]
> > [ 63.912198] ? asm_exc_page_fault+0x27/0x30
> > [ 63.912203] ? ioread32+0x2e/0x70
> > [ 63.912206] ? rtsx_pci_write_register+0x5b/0x90 [rtsx_pci]
> > [ 63.912217] rtsx_set_l1off_sub+0x1c/0x30 [rtsx_pci]
> > [ 63.912226] rts5261_set_l1off_cfg_sub_d0+0x36/0x40 [rtsx_pci]
> > [ 63.912234] rtsx_pci_runtime_idle+0xc7/0x160 [rtsx_pci]
> > [ 63.912243] ? __pfx_pci_pm_runtime_idle+0x10/0x10
> > [ 63.912246] pci_pm_runtime_idle+0x34/0x70
> > [ 63.912248] rpm_idle+0xc4/0x2b0
> > [ 63.912251] pm_runtime_work+0x93/0xc0
> > [ 63.912254] process_one_work+0x21a/0x430
> > [ 63.912258] worker_thread+0x4a/0x3c0
>
> This looks like pcr->remap_addr is accessed after it has been iounmap'ed in
> rtsx_pci_remove() or before it has been iomap'ed in rtsx_pci_probe().
>
> Is the card reader itself located below a hotplug port and unplugged here?

Yes, the card reader unplugs itself, because an SD7.0 card does not use rtsx_pcr; it uses the nvme driver.

Ricky

> Or is this about the card being removed from the card reader?
>
> Having full dmesg output and lspci -vvv output attached to a bugzilla would
> help to understand what is going on.
>
> Thanks,
>
> Lukas
Hi Kai-Heng,
kernel test robot noticed the following build warnings:
[auto build test WARNING on pci/next]
[also build test WARNING on pci/for-linus linus/master v6.6-rc6 next-20231017]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Kai-Heng-Feng/PCI-pciehp-Prevent-child-devices-from-doing-RPM-on-PCIe-Link-Down/20231017-142208
base: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link: https://lore.kernel.org/r/20231016040132.23824-1-kai.heng.feng%40canonical.com
patch subject: [PATCH] PCI: pciehp: Prevent child devices from doing RPM on PCIe Link Down
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20231017/202310171955.hlish6FZ-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231017/202310171955.hlish6FZ-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <yujie.liu@intel.com>
| Closes: https://lore.kernel.org/r/202310171955.hlish6FZ-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/pci/hotplug/pciehp_pci.c:25:5: warning: no previous prototype for 'pci_dev_disconnect' [-Wmissing-prototypes]
25 | int pci_dev_disconnect(struct pci_dev *pdev, void *unused)
| ^~~~~~~~~~~~~~~~~~
vim +/pci_dev_disconnect +25 drivers/pci/hotplug/pciehp_pci.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 24
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 @25 int pci_dev_disconnect(struct pci_dev *pdev, void *unused)
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 26 {
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 27 pm_runtime_barrier(&pdev->dev);
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 28 pci_dev_set_disconnected(pdev, NULL);
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 29
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 30 return 0;
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 31 }
2bd1cb5c4e6711 Kai-Heng Feng 2023-10-16 32
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index ad12515a4a12..9ae4fa95c8c1 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -18,9 +18,18 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/pci.h>
+#include <linux/pm_runtime.h>
 #include "../pci.h"
 #include "pciehp.h"
 
+int pci_dev_disconnect(struct pci_dev *pdev, void *unused)
+{
+	pm_runtime_barrier(&pdev->dev);
+	pci_dev_set_disconnected(pdev, NULL);
+
+	return 0;
+}
+
 /**
  * pciehp_configure_device() - enumerate PCI devices below a hotplug bridge
  * @ctrl: PCIe hotplug controller
@@ -98,7 +107,7 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 		 __func__, pci_domain_nr(parent), parent->number);
 
 	if (!presence)
-		pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
+		pci_walk_bus(parent, pci_dev_disconnect, NULL);
 
 	pci_lock_rescan_remove();
 
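The `-Wmissing-prototypes` warning reported by the kernel test robot arises because `pci_dev_disconnect()` has external linkage but no declaration in a header. Since the patch only uses it as a `pci_walk_bus()` callback within the same file, one common way to avoid the warning is to mark the function `static`. What follows is a minimal userspace sketch of that pattern, assuming nothing about how the patch was eventually revised. It is not kernel code: `struct fake_dev`, `fake_walk_bus()`, `fake_dev_disconnect()` and `fake_disconnect_all()` are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not a kernel type. */
struct fake_dev {
	bool rpm_pending;   /* models queued runtime PM work */
	bool disconnected;  /* models the "disconnected" device state */
};

/*
 * Hypothetical stand-in for pci_walk_bus(): call cb on every device and
 * stop early if the callback returns non-zero.
 */
static void fake_walk_bus(struct fake_dev *devs, size_t n,
			  int (*cb)(struct fake_dev *, void *), void *data)
{
	assert(cb != NULL);
	for (size_t i = 0; i < n; i++)
		if (cb(&devs[i], data))
			break;
}

/*
 * File-local callback. 'static' gives it internal linkage, so no
 * separate prototype is needed and -Wmissing-prototypes stays quiet.
 */
static int fake_dev_disconnect(struct fake_dev *dev, void *unused)
{
	(void)unused;
	dev->rpm_pending = false;  /* models pm_runtime_barrier() */
	dev->disconnected = true;  /* models pci_dev_set_disconnected() */
	return 0;                  /* 0 keeps the walk going */
}

/* Disconnect every device; returns how many ended up disconnected. */
static size_t fake_disconnect_all(struct fake_dev *devs, size_t n)
{
	size_t count = 0;

	fake_walk_bus(devs, n, fake_dev_disconnect, NULL);
	for (size_t i = 0; i < n; i++)
		if (devs[i].disconnected)
			count++;
	return count;
}
```

Calling `fake_disconnect_all()` on an array of devices mirrors the `if (!presence)` branch in the patch: each device first has its pending runtime PM work flushed and is then marked disconnected, so that later runtime PM callbacks have a state flag they could check before touching hardware.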