Message ID | 20240926125909.2362244-1-acelan.kao@canonical.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI: pciehp: Fix system hang on resume after hot-unplug during suspend | expand |
On Thu, Sep 26, 2024 at 08:59:09PM +0800, Chia-Lin Kao (AceLan) wrote: > Remove unnecessary pci_walk_bus() call in pciehp_resume_noirq(). This > fixes a system hang that occurs when resuming after a Thunderbolt dock > with attached thunderbolt storage is unplugged during system suspend. > > The PCI core already handles setting the disconnected state for devices > under a port during suspend/resume. Please explain in the commit message where the PCI core does that. > The redundant bus walk was > interfering with proper hardware state detection during resume, causing > a system hang when hot-unplugging daisy-chained Thunderbolt devices. Please explain what "proper hardware state detection" means. Did you get a hung task stacktrace? If so, please include it in the commit message. If you're getting a system hang, it means there's a deadlock involving pci_bus_sem. I don't quite see how that could happen, so a more verbose explanation would be appreciated. Thanks, Lukas > Fixes: 9d573d19547b ("PCI: pciehp: Detect device replacement during system sleep") > Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> > --- > drivers/pci/hotplug/pciehp_core.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > index ff458e692fed..c1c3f7e2bc43 100644 > --- a/drivers/pci/hotplug/pciehp_core.c > +++ b/drivers/pci/hotplug/pciehp_core.c > @@ -330,8 +330,6 @@ static int pciehp_resume_noirq(struct pcie_device *dev) > */ > if (pciehp_device_replaced(ctrl)) { > ctrl_dbg(ctrl, "device replaced during system sleep\n"); > - pci_walk_bus(ctrl->pcie->port->subordinate, > - pci_dev_set_disconnected, NULL); > pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); > } > } > -- > 2.43.0
Lukas Wunner <lukas@wunner.de> 於 2024年9月26日 週四 下午9:23寫道: > > On Thu, Sep 26, 2024 at 08:59:09PM +0800, Chia-Lin Kao (AceLan) wrote: > > Remove unnecessary pci_walk_bus() call in pciehp_resume_noirq(). This > > fixes a system hang that occurs when resuming after a Thunderbolt dock > > with attached thunderbolt storage is unplugged during system suspend. > > > > The PCI core already handles setting the disconnected state for devices > > under a port during suspend/resume. > > Please explain in the commit message where the PCI core does that. Hi Lukas, I found my patch doesn't work. Let me explain the new findings below. > > > The redundant bus walk was > > interfering with proper hardware state detection during resume, causing > > a system hang when hot-unplugging daisy-chained Thunderbolt devices. > > Please explain what "proper hardware state detection" means. > > Did you get a hung task stacktrace? If so, please include it in the > commit message. > > If you're getting a system hang, it means there's a deadlock > involving pci_bus_sem. I don't quite see how that could happen, > so a more verbose explanation would be appreciated. I have no good answer for you now. After enabling some debugging options and debugging lock options, I still didn't get any message. It just hangs there while resuming, and nothing could be logged. Here is my setup ubuntu@localhost:~$ lspci -tv -[0000:00]-+-00.0 Intel Corporation Device 6400 +-02.0 Intel Corporation Lunar Lake [Intel Graphics] +-04.0 Intel Corporation Device 641d +-05.0 Intel Corporation Device 645d +-07.0-[01-38]-- +-07.2-[39-70]----00.0-[3a-70]--+-00.0-[3b]-- | +-01.0-[3c-4d]-- | +-02.0-[4e-5f]----00.0-[4f-50]----01.0-[50]----00.0 Phison Electronics Corporation E12 NVMe Controlle r | +-03.0-[60-6f]-- | \-04.0-[70]-- This is Dell WD22TB dock 39:00.0 PCI bridge [0604]: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0b26] (rev 03) Subsystem: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0000] Kernel driver in use: pcieport This is the TBT storage connects to the dock 50:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E12 NVMe Controller [1987:5012] (rev 01) Subsystem: Phison Electronics Corporation E12 NVMe Controller [1987:5012] Kernel driver in use: nvme Kernel modules: nvme While resuming, the dock(8086:0b26) resumes first and I found if dock device run through below 2 functions in pciehp_resume_noirq() if (pciehp_device_replaced(ctrl)) { pci_walk_bus(ctrl->pcie->port->subordinate,pci_dev_set_disconnected, NULL); pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); } The system hangs when storage(1987:5012) also calls the 2 functions. Only one of the 2 devices can enter the if statement, dock or storage, or the system hangs. To test it more, I found that mask out pciehp_request() eases the issue. It may be because it calls irq_wake_thread() and is stuck somewhere. So, I try to get rid of the irq_wake_thread() by replacing pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); with atomic_or(PCI_EXP_SLTSTA_PDC, &ctrl->pending_events); In this case, only dock(8086:0b26) will be called in pciehp_resume_noirq(). And the tbt storage will be released after pci_power_up() fails. Do you think this is a feasible solution? diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index ff458e692fed..56bf23d55c41 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -332,7 +332,7 @@ static int pciehp_resume_noirq(struct pcie_device *dev) ctrl_dbg(ctrl, "device replaced during system sleep\n"); pci_walk_bus(ctrl->pcie->port->subordinate, pci_dev_set_disconnected, NULL); - pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); + atomic_or(PCI_EXP_SLTSTA_PDC, &ctrl->pending_events); } } > > Thanks, > > Lukas > > > > Fixes: 9d573d19547b ("PCI: pciehp: Detect device replacement during system sleep") > > Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> > > --- > > drivers/pci/hotplug/pciehp_core.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > > index ff458e692fed..c1c3f7e2bc43 100644 > > --- a/drivers/pci/hotplug/pciehp_core.c > > +++ b/drivers/pci/hotplug/pciehp_core.c > > @@ -330,8 +330,6 @@ static int pciehp_resume_noirq(struct pcie_device *dev) > > */ > > if (pciehp_device_replaced(ctrl)) { > > ctrl_dbg(ctrl, "device replaced during system sleep\n"); > > - pci_walk_bus(ctrl->pcie->port->subordinate, > > - pci_dev_set_disconnected, NULL); > > pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); > > } > > } > > -- > > 2.43.0
On Fri, Sep 27, 2024 at 03:33:50PM +0800, AceLan Kao wrote: > Lukas Wunner <lukas@wunner.de> 2024-9-26 9:23 > > On Thu, Sep 26, 2024 at 08:59:09PM +0800, Chia-Lin Kao (AceLan) wrote: > > > Remove unnecessary pci_walk_bus() call in pciehp_resume_noirq(). This > > > fixes a system hang that occurs when resuming after a Thunderbolt dock > > > with attached thunderbolt storage is unplugged during system suspend. > > > > > > The PCI core already handles setting the disconnected state for devices > > > under a port during suspend/resume. > > > > > > The redundant bus walk was > > > interfering with proper hardware state detection during resume, causing > > > a system hang when hot-unplugging daisy-chained Thunderbolt devices. > > I have no good answer for you now. > After enabling some debugging options and debugging lock options, I > still didn't get any message. Have you tried "no_console_suspend" on the kernel command line? > ubuntu@localhost:~$ lspci -tv > -[0000:00]-+-00.0 Intel Corporation Device 6400 > +-02.0 Intel Corporation Lunar Lake [Intel Graphics] > +-04.0 Intel Corporation Device 641d > +-05.0 Intel Corporation Device 645d > +-07.0-[01-38]-- > +-07.2-[39-70]----00.0-[3a-70]--+-00.0-[3b]-- > | +-01.0-[3c-4d]-- > | +-02.0-[4e-5f]----00.0-[4f-50]----01.0-[50]----00.0 Phison Electronics Corporation E12 NVMe Controller > | +-03.0-[60-6f]-- > | \-04.0-[70]-- > > This is Dell WD22TB dock > 39:00.0 PCI bridge [0604]: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0b26] (rev 03) > Subsystem: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0000] > > This is the TBT storage connects to the dock > 50:00.0 Non-Volatile memory controller [0108]: Phison Electronics > Corporation E12 NVMe Controller [1987:5012] (rev 01) > Subsystem: Phison Electronics Corporation E12 NVMe Controller [1987:5012] > Kernel driver in use: nvme > Kernel modules: nvme The lspci output shows another PCIe switch in-between the WD22TB dock and the NVMe drive (bus 4e and 4f). Is that another Thunderbolt device? Or is the NVMe drive built into the WD22TB dock and the switch at bus 4e and 4f is a non-Thunderbolt PCIe switch in the dock? I realize now that commit 9d573d19547b ("PCI: pciehp: Detect device replacement during system sleep") is a little overzealous because it not only reacts to *replaced* devices but also to *unplugged* devices: If the device was unplugged, reading the vendor and device ID returns 0xffff, which is different from the cached value, so the device is assumed to have been replaced even though it's actually been unplugged. The device replacement check runs in the ->resume_noirq phase. Later on in the ->resume phase, pciehp_resume() calls pciehp_check_presence() to check for unplugged devices. Commit 9d573d19547b inadvertantly reacts before pciehp_check_presence() gets a chance to react. So that's something that we should probably change. I'm not sure though why that would call a hang. But there is a known issue that a deadlock may occur when hot-removing nested PCIe switches (which is what you've got here). Keith Busch recently re-discovered the issue. You may want to try if the hang goes away if you apply this patch: https://lore.kernel.org/all/20240612181625.3604512-2-kbusch@meta.com/ If it does go away then at least we know what the root cause is. The patch is a bit hackish, but there's an ongoing effort to tackle the problem more thoroughly: https://lore.kernel.org/all/20240722151936.1452299-1-kbusch@meta.com/ https://lore.kernel.org/all/20240827192826.710031-1-kbusch@meta.com/ Thanks, Lukas
On Fri, Sep 27, 2024 at 11:28:54AM +0200, Lukas Wunner wrote: > I realize now that commit 9d573d19547b ("PCI: pciehp: Detect device > replacement during system sleep") is a little overzealous because it > not only reacts to *replaced* devices but also to *unplugged* devices: > If the device was unplugged, reading the vendor and device ID returns > 0xffff, which is different from the cached value, so the device is > assumed to have been replaced even though it's actually been unplugged. > > The device replacement check runs in the ->resume_noirq phase. Later on > in the ->resume phase, pciehp_resume() calls pciehp_check_presence() to > check for unplugged devices. Commit 9d573d19547b inadvertantly reacts > before pciehp_check_presence() gets a chance to react. So that's something > that we should probably change. FWIW, below is a (compile-tested only) patch which modifies pciehp_device_replaced() to return false if the device was *unplugged* during system sleep. It continues to return true if it was *replaced* during system sleep. This might avoid the issue you're seeing, though it would be good if you could also try Keith's deadlock prevention patch (without any other patch) to determine if the deadlock is the actual root cause (as I suspect). Thanks! -- >8 -- diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index ff458e6..174832b 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) static bool pciehp_device_replaced(struct controller *ctrl) { struct pci_dev *pdev __free(pci_dev_put); + u64 dsn; u32 reg; pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); if (!pdev) + return false; + + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && + !PCI_POSSIBLE_ERROR(reg) && + reg != (pdev->vendor | (pdev->device << 16))) return true; - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || - reg != (pdev->vendor | (pdev->device << 16)) || - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && + !PCI_POSSIBLE_ERROR(reg) && reg != (pdev->revision | (pdev->class << 8))) return true; if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && + !PCI_POSSIBLE_ERROR(reg) && + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) return true; - if (pci_get_dsn(pdev) != ctrl->dsn) + dsn = pci_get_dsn(pdev); + if (!PCI_POSSIBLE_ERROR(dsn) && + dsn != ctrl->dsn) return true; return false;
Lukas Wunner <lukas@wunner.de> 於 2024年9月28日 週六 下午8:51寫道: > > On Fri, Sep 27, 2024 at 11:28:54AM +0200, Lukas Wunner wrote: > > I realize now that commit 9d573d19547b ("PCI: pciehp: Detect device > > replacement during system sleep") is a little overzealous because it > > not only reacts to *replaced* devices but also to *unplugged* devices: > > If the device was unplugged, reading the vendor and device ID returns > > 0xffff, which is different from the cached value, so the device is > > assumed to have been replaced even though it's actually been unplugged. > > > > The device replacement check runs in the ->resume_noirq phase. Later on > > in the ->resume phase, pciehp_resume() calls pciehp_check_presence() to > > check for unplugged devices. Commit 9d573d19547b inadvertantly reacts > > before pciehp_check_presence() gets a chance to react. So that's something > > that we should probably change. > > FWIW, below is a (compile-tested only) patch which modifies > pciehp_device_replaced() to return false if the device was > *unplugged* during system sleep. It continues to return > true if it was *replaced* during system sleep. > > This might avoid the issue you're seeing, though it would > be good if you could also try Keith's deadlock prevention > patch (without any other patch) to determine if the deadlock > is the actual root cause (as I suspect). > > Thanks! > > -- >8 -- > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > index ff458e6..174832b 100644 > --- a/drivers/pci/hotplug/pciehp_core.c > +++ b/drivers/pci/hotplug/pciehp_core.c > @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) > static bool pciehp_device_replaced(struct controller *ctrl) > { > struct pci_dev *pdev __free(pci_dev_put); > + u64 dsn; > u32 reg; > > pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); > if (!pdev) > + return false; > + > + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && > + !PCI_POSSIBLE_ERROR(reg) && > + reg != (pdev->vendor | (pdev->device << 16))) > return true; > > - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || > - reg != (pdev->vendor | (pdev->device << 16)) || > - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || > + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && > + !PCI_POSSIBLE_ERROR(reg) && > reg != (pdev->revision | (pdev->class << 8))) > return true; > > if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && > - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || > - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) > + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && > + !PCI_POSSIBLE_ERROR(reg) && > + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) > return true; > > - if (pci_get_dsn(pdev) != ctrl->dsn) > + dsn = pci_get_dsn(pdev); > + if (!PCI_POSSIBLE_ERROR(dsn) && > + dsn != ctrl->dsn) > return true; In my case, the pciehp_device_replaced() returns true from this final check. And these are the values I got dsn = 0x00000000, ctrl->dsn = 0x7800AA00 dsn = 0x00000000, ctrl->dsn = 0x21B7D000 Did some other test TBT HDD -> TBT dock -> laptop suspend TBT HDD -> laptop(replace TBT dock with the TBT HDD) resume Got the same result as above, looks like it didn't detect the TBT dock has been replaced by TBT HDD. In the origin call trace, unplug TBT dock or replace it with TBT HDD, it returns true by the below check if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || reg != (pdev->vendor | (pdev->device << 16)) || pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || reg != (pdev->revision | (pdev->class << 8))) return true; > > return false;
Lukas Wunner <lukas@wunner.de> 於 2024年9月27日 週五 下午5:28寫道: > > On Fri, Sep 27, 2024 at 03:33:50PM +0800, AceLan Kao wrote: > > Lukas Wunner <lukas@wunner.de> 2024-9-26 9:23 > > > On Thu, Sep 26, 2024 at 08:59:09PM +0800, Chia-Lin Kao (AceLan) wrote: > > > > Remove unnecessary pci_walk_bus() call in pciehp_resume_noirq(). This > > > > fixes a system hang that occurs when resuming after a Thunderbolt dock > > > > with attached thunderbolt storage is unplugged during system suspend. > > > > > > > > The PCI core already handles setting the disconnected state for devices > > > > under a port during suspend/resume. > > > > > > > > The redundant bus walk was > > > > interfering with proper hardware state detection during resume, causing > > > > a system hang when hot-unplugging daisy-chained Thunderbolt devices. > > > > I have no good answer for you now. > > After enabling some debugging options and debugging lock options, I > > still didn't get any message. > > Have you tried "no_console_suspend" on the kernel command line? > > > > ubuntu@localhost:~$ lspci -tv > > -[0000:00]-+-00.0 Intel Corporation Device 6400 > > +-02.0 Intel Corporation Lunar Lake [Intel Graphics] > > +-04.0 Intel Corporation Device 641d > > +-05.0 Intel Corporation Device 645d > > +-07.0-[01-38]-- > > +-07.2-[39-70]----00.0-[3a-70]--+-00.0-[3b]-- > > | +-01.0-[3c-4d]-- > > | +-02.0-[4e-5f]----00.0-[4f-50]----01.0-[50]----00.0 Phison Electronics Corporation E12 NVMe Controller > > | +-03.0-[60-6f]-- > > | \-04.0-[70]-- > > > > This is Dell WD22TB dock > > 39:00.0 PCI bridge [0604]: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0b26] (rev 03) > > Subsystem: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0000] > > > > This is the TBT storage connects to the dock > > 50:00.0 Non-Volatile memory controller [0108]: Phison Electronics > > Corporation E12 NVMe Controller [1987:5012] (rev 01) > > Subsystem: Phison Electronics Corporation E12 NVMe Controller [1987:5012] > > Kernel driver in use: nvme > > Kernel modules: nvme > > The lspci output shows another PCIe switch in-between the WD22TB dock and > the NVMe drive (bus 4e and 4f). Is that another Thunderbolt device? > Or is the NVMe drive built into the WD22TB dock and the switch at bus > 4e and 4f is a non-Thunderbolt PCIe switch in the dock? > > I realize now that commit 9d573d19547b ("PCI: pciehp: Detect device > replacement during system sleep") is a little overzealous because it > not only reacts to *replaced* devices but also to *unplugged* devices: > If the device was unplugged, reading the vendor and device ID returns > 0xffff, which is different from the cached value, so the device is > assumed to have been replaced even though it's actually been unplugged. > > The device replacement check runs in the ->resume_noirq phase. Later on > in the ->resume phase, pciehp_resume() calls pciehp_check_presence() to > check for unplugged devices. Commit 9d573d19547b inadvertantly reacts > before pciehp_check_presence() gets a chance to react. So that's something > that we should probably change. > > I'm not sure though why that would call a hang. But there is a known issue > that a deadlock may occur when hot-removing nested PCIe switches (which is > what you've got here). Keith Busch recently re-discovered the issue. > You may want to try if the hang goes away if you apply this patch: > > https://lore.kernel.org/all/20240612181625.3604512-2-kbusch@meta.com/ > > If it does go away then at least we know what the root cause is. Yes, the 2 patches work. > > The patch is a bit hackish, but there's an ongoing effort to tackle the > problem more thoroughly: > > https://lore.kernel.org/all/20240722151936.1452299-1-kbusch@meta.com/ > https://lore.kernel.org/all/20240827192826.710031-1-kbusch@meta.com/ v2 can't be applied clearly, so I made some changes. And this series doesn't work for me. > > Thanks, > > Lukas
On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote: > Lukas Wunner <lukas@wunner.de> 2024 9 28 8:51: > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > + dsn = pci_get_dsn(pdev); > > + if (!PCI_POSSIBLE_ERROR(dsn) && > > + dsn != ctrl->dsn) > > return true; > > In my case, the pciehp_device_replaced() returns true from this final check. > And these are the values I got > dsn = 0x00000000, ctrl->dsn = 0x7800AA00 > dsn = 0x00000000, ctrl->dsn = 0x21B7D000 Ah because pci_get_dsn() returns 0 if the device is gone. Below is a modified patch which returns false in that case. I've only changed: - dsn = pci_get_dsn(pdev); - if (!PCI_POSSIBLE_ERROR(dsn) && + if ((dsn = pci_get_dsn(pdev)) && + !PCI_POSSIBLE_ERROR(dsn) && > Did some other test > TBT HDD -> TBT dock -> laptop > suspend > TBT HDD -> laptop(replace TBT dock with the TBT HDD) > resume > Got the same result as above, looks like it didn't detect the TBT dock > has been replaced by TBT HDD. > > In the origin call trace, unplug TBT dock or replace it with TBT HDD, > it returns true by the below check > if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || > reg != (pdev->vendor | (pdev->device << 16)) || > pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || > reg != (pdev->revision | (pdev->class << 8))) > return true; Hm, that's odd. Why is that? Is reg == 0xffffffff in one of those cases? I guess that could happen if the Thunderbolt tunnels are not yet established at that point (i.e. in the ->resume_noirq phase), but normally they should be. Does this system use ICM-controlled tunnel management or kernel-native (software-controlled) tunnel management? Thanks, Lukas
On Tue, Oct 01, 2024 at 01:02:46PM +0200, Lukas Wunner wrote: > On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote: > > Lukas Wunner <lukas@wunner.de> 2024 9 28 8:51: > > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > > + dsn = pci_get_dsn(pdev); > > > + if (!PCI_POSSIBLE_ERROR(dsn) && > > > + dsn != ctrl->dsn) > > > return true; > > > > In my case, the pciehp_device_replaced() returns true from this final check. > > And these are the values I got > > dsn = 0x00000000, ctrl->dsn = 0x7800AA00 > > dsn = 0x00000000, ctrl->dsn = 0x21B7D000 > > Ah because pci_get_dsn() returns 0 if the device is gone. > Below is a modified patch which returns false in that case. Sorry, forgot to include the patch: -- >8 -- diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index ff458e6..957c320 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) static bool pciehp_device_replaced(struct controller *ctrl) { struct pci_dev *pdev __free(pci_dev_put); + u64 dsn; u32 reg; pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); if (!pdev) + return false; + + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && + !PCI_POSSIBLE_ERROR(reg) && + reg != (pdev->vendor | (pdev->device << 16))) return true; - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || - reg != (pdev->vendor | (pdev->device << 16)) || - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && + !PCI_POSSIBLE_ERROR(reg) && reg != (pdev->revision | (pdev->class << 8))) return true; if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && + !PCI_POSSIBLE_ERROR(reg) && + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) return true; - if (pci_get_dsn(pdev) != ctrl->dsn) + if ((dsn = pci_get_dsn(pdev)) && + !PCI_POSSIBLE_ERROR(dsn) && + dsn != ctrl->dsn) return true; return false;
On Mon, Sep 30, 2024 at 11:27:28AM +0800, AceLan Kao wrote: > Lukas Wunner <lukas@wunner.de> 2024 9 27 5:28 > > there is a known issue > > that a deadlock may occur when hot-removing nested PCIe switches (which is > > what you've got here). Keith Busch recently re-discovered the issue. > > You may want to try if the hang goes away if you apply this patch: > > > > https://lore.kernel.org/all/20240612181625.3604512-2-kbusch@meta.com/ > > > > If it does go away then at least we know what the root cause is. > > Yes, the 2 patches work. Okay so you can't reproduce the issue with those patches? That would mean 9d573d19547b ("PCI: pciehp: Detect device replacement during system sleep") isn't the culprit, whew! > > The patch is a bit hackish, but there's an ongoing effort to tackle the > > problem more thoroughly: > > > > https://lore.kernel.org/all/20240722151936.1452299-1-kbusch@meta.com/ > > https://lore.kernel.org/all/20240827192826.710031-1-kbusch@meta.com/ > > v2 can't be applied clearly, so I made some changes. > And this series doesn't work for me. Okay, I'll look at those patches separately. Thanks, Lukas
Lukas Wunner <lukas@wunner.de> 於 2024年10月1日 週二 下午7:03寫道: > > On Tue, Oct 01, 2024 at 01:02:46PM +0200, Lukas Wunner wrote: > > On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote: > > > Lukas Wunner <lukas@wunner.de> 2024 9 28 8:51: > > > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > > > + dsn = pci_get_dsn(pdev); > > > > + if (!PCI_POSSIBLE_ERROR(dsn) && > > > > + dsn != ctrl->dsn) > > > > return true; > > > > > > In my case, the pciehp_device_replaced() returns true from this final check. > > > And these are the values I got > > > dsn = 0x00000000, ctrl->dsn = 0x7800AA00 > > > dsn = 0x00000000, ctrl->dsn = 0x21B7D000 > > > > Ah because pci_get_dsn() returns 0 if the device is gone. > > Below is a modified patch which returns false in that case. > > Sorry, forgot to include the patch: > > -- >8 -- > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > index ff458e6..957c320 100644 > --- a/drivers/pci/hotplug/pciehp_core.c > +++ b/drivers/pci/hotplug/pciehp_core.c > @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) > static bool pciehp_device_replaced(struct controller *ctrl) > { > struct pci_dev *pdev __free(pci_dev_put); > + u64 dsn; > u32 reg; > > pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); > if (!pdev) > + return false; > + > + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && > + !PCI_POSSIBLE_ERROR(reg) && > + reg != (pdev->vendor | (pdev->device << 16))) > return true; > > - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || > - reg != (pdev->vendor | (pdev->device << 16)) || > - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || > + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && > + !PCI_POSSIBLE_ERROR(reg) && > reg != (pdev->revision | (pdev->class << 8))) > return true; > > if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && > - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || > - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) > + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && > + !PCI_POSSIBLE_ERROR(reg) && > + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) > return true; > > - if (pci_get_dsn(pdev) != ctrl->dsn) > + if ((dsn = pci_get_dsn(pdev)) && > + !PCI_POSSIBLE_ERROR(dsn) && > + dsn != ctrl->dsn) > return true; > > return false; Hi Lukas, Sorry for the late reply, just encountered a strong typhoon in my area last week and can't check this in our lab. The patched pciehp_device_replaced() returns false at the end of the function in my case. Unplug the dock which is connected with a tbt storage won't be considered as a replacement. But when I tried to replace the dock with the tbt storage during suspend, it still returned false at the end of the function like unplugged. BTW, it's a new model, so I think the ICM is used. And the reg is 0xffffffff when unplugged.
AceLan Kao <acelan.kao@canonical.com> 於 2024年10月7日 週一 下午12:34寫道: > > Lukas Wunner <lukas@wunner.de> 於 2024年10月1日 週二 下午7:03寫道: > > > > On Tue, Oct 01, 2024 at 01:02:46PM +0200, Lukas Wunner wrote: > > > On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote: > > > > Lukas Wunner <lukas@wunner.de> 2024 9 28 8:51: > > > > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > > > > + dsn = pci_get_dsn(pdev); > > > > > + if (!PCI_POSSIBLE_ERROR(dsn) && > > > > > + dsn != ctrl->dsn) > > > > > return true; > > > > > > > > In my case, the pciehp_device_replaced() returns true from this final check. > > > > And these are the values I got > > > > dsn = 0x00000000, ctrl->dsn = 0x7800AA00 > > > > dsn = 0x00000000, ctrl->dsn = 0x21B7D000 > > > > > > Ah because pci_get_dsn() returns 0 if the device is gone. > > > Below is a modified patch which returns false in that case. > > > > Sorry, forgot to include the patch: > > > > -- >8 -- > > > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > > index ff458e6..957c320 100644 > > --- a/drivers/pci/hotplug/pciehp_core.c > > +++ b/drivers/pci/hotplug/pciehp_core.c > > @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) > > static bool pciehp_device_replaced(struct controller *ctrl) > > { > > struct pci_dev *pdev __free(pci_dev_put); > > + u64 dsn; > > u32 reg; > > > > pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); > > if (!pdev) > > + return false; > > + > > + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && > > + !PCI_POSSIBLE_ERROR(reg) && > > + reg != (pdev->vendor | (pdev->device << 16))) > > return true; > > > > - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || > > - reg != (pdev->vendor | (pdev->device << 16)) || > > - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || > > + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && > > + !PCI_POSSIBLE_ERROR(reg) && > > reg != (pdev->revision | (pdev->class << 8))) > > return true; > > > > if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && > > - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || > > - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) > > + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && > > + !PCI_POSSIBLE_ERROR(reg) && > > + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) > > return true; > > > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > + if ((dsn = pci_get_dsn(pdev)) && > > + !PCI_POSSIBLE_ERROR(dsn) && > > + dsn != ctrl->dsn) > > return true; > > > > return false; > Hi Lukas, > > Sorry for the late reply, just encountered a strong typhoon in my area > last week and can't check this in our lab. > > The patched pciehp_device_replaced() returns false at the end of the > function in my case. > Unplug the dock which is connected with a tbt storage won't be > considered as a replacement. > > But when I tried to replace the dock with the tbt storage during > suspend, it still returned false at the end of the function like > unplugged. > > BTW, it's a new model, so I think the ICM is used. And the reg is > 0xffffffff when unplugged. Hi Lukas, PCI_POSSIBLE_ERROR() always returns true no matter the device is replaced or unplugged It seems difficult to distinguish between when a device is replaced and when it's unplugged. Do you have any ideas to fix the issue? This issue is severe to me, because the system hangs almost everytime when daisy chain tbt devices are unplugged when suspended. Thanks.
AceLan Kao <acelan.kao@canonical.com> 於 2024年10月17日 週四 上午10:40寫道: > > AceLan Kao <acelan.kao@canonical.com> 於 2024年10月7日 週一 下午12:34寫道: > > > > Lukas Wunner <lukas@wunner.de> 於 2024年10月1日 週二 下午7:03寫道: > > > > > > On Tue, Oct 01, 2024 at 01:02:46PM +0200, Lukas Wunner wrote: > > > > On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote: > > > > > Lukas Wunner <lukas@wunner.de> 2024 9 28 8:51: > > > > > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > > > > > + dsn = pci_get_dsn(pdev); > > > > > > + if (!PCI_POSSIBLE_ERROR(dsn) && > > > > > > + dsn != ctrl->dsn) > > > > > > return true; > > > > > > > > > > In my case, the pciehp_device_replaced() returns true from this final check. > > > > > And these are the values I got > > > > > dsn = 0x00000000, ctrl->dsn = 0x7800AA00 > > > > > dsn = 0x00000000, ctrl->dsn = 0x21B7D000 > > > > > > > > Ah because pci_get_dsn() returns 0 if the device is gone. > > > > Below is a modified patch which returns false in that case. > > > > > > Sorry, forgot to include the patch: > > > > > > -- >8 -- > > > > > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > > > index ff458e6..957c320 100644 > > > --- a/drivers/pci/hotplug/pciehp_core.c > > > +++ b/drivers/pci/hotplug/pciehp_core.c > > > @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) > > > static bool pciehp_device_replaced(struct controller *ctrl) > > > { > > > struct pci_dev *pdev __free(pci_dev_put); > > > + u64 dsn; > > > u32 reg; > > > > > > pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); > > > if (!pdev) > > > + return false; > > > + > > > + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && > > > + !PCI_POSSIBLE_ERROR(reg) && > > > + reg != (pdev->vendor | (pdev->device << 16))) > > > return true; > > > > > > - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || > > > - reg != (pdev->vendor | (pdev->device << 16)) || > > > - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || > > > + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && > > > + !PCI_POSSIBLE_ERROR(reg) && > > > reg != (pdev->revision | (pdev->class << 8))) > > > return true; > > > > > > if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && > > > - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || > > > - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) > > > + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && > > > + !PCI_POSSIBLE_ERROR(reg) && > > > + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) > > > return true; > > > > > > - if (pci_get_dsn(pdev) != ctrl->dsn) > > > + if ((dsn = pci_get_dsn(pdev)) && > > > + !PCI_POSSIBLE_ERROR(dsn) && > > > + dsn != ctrl->dsn) > > > return true; > > > > > > return false; > > Hi Lukas, > > > > Sorry for the late reply, just encountered a strong typhoon in my area > > last week and can't check this in our lab. > > > > The patched pciehp_device_replaced() returns false at the end of the > > function in my case. > > Unplug the dock which is connected with a tbt storage won't be > > considered as a replacement. > > > > But when I tried to replace the dock with the tbt storage during > > suspend, it still returned false at the end of the function like > > unplugged. > > > > BTW, it's a new model, so I think the ICM is used. And the reg is > > 0xffffffff when unplugged. > Hi Lukas, > > PCI_POSSIBLE_ERROR() always returns true no matter the device is > replaced or unplugged > It seems difficult to distinguish between when a device is replaced > and when it's unplugged. > > Do you have any ideas to fix the issue? > This issue is severe to me, because the system hangs almost everytime > when daisy chain tbt devices are unplugged when suspended. > Thanks. Hi Lukas, I just submitted the v2, please help to review, thanks. https://lore.kernel.org/linux-kernel/20241022130243.263737-1-acelan.kao@canonical.com/T/#u
On 10/17/2024 10:40 AM, AceLan Kao wrote: > AceLan Kao <acelan.kao@canonical.com> 於 2024年10月7日 週一 下午12:34寫道: >> Lukas Wunner <lukas@wunner.de> 於 2024年10月1日 週二 下午7:03寫道: >>> On Tue, Oct 01, 2024 at 01:02:46PM +0200, Lukas Wunner wrote: >>>> On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote: >>>>> Lukas Wunner <lukas@wunner.de> 2024 9 28 8:51: >>>>>> - if (pci_get_dsn(pdev) != ctrl->dsn) >>>>>> + dsn = pci_get_dsn(pdev); >>>>>> + if (!PCI_POSSIBLE_ERROR(dsn) && >>>>>> + dsn != ctrl->dsn) >>>>>> return true; >>>>> In my case, the pciehp_device_replaced() returns true from this final check. >>>>> And these are the values I got >>>>> dsn = 0x00000000, ctrl->dsn = 0x7800AA00 >>>>> dsn = 0x00000000, ctrl->dsn = 0x21B7D000 >>>> Ah because pci_get_dsn() returns 0 if the device is gone. >>>> Below is a modified patch which returns false in that case. >>> Sorry, forgot to include the patch: >>> >>> -- >8 -- >>> >>> diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c >>> index ff458e6..957c320 100644 >>> --- a/drivers/pci/hotplug/pciehp_core.c >>> +++ b/drivers/pci/hotplug/pciehp_core.c >>> @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev) >>> static bool pciehp_device_replaced(struct controller *ctrl) >>> { >>> struct pci_dev *pdev __free(pci_dev_put); >>> + u64 dsn; >>> u32 reg; >>> >>> pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0)); >>> if (!pdev) >>> + return false; >>> + >>> + if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) == 0 && >>> + !PCI_POSSIBLE_ERROR(reg) && >>> + reg != (pdev->vendor | (pdev->device << 16))) >>> return true; >>> >>> - if (pci_read_config_dword(pdev, PCI_VENDOR_ID, ®) || >>> - reg != (pdev->vendor | (pdev->device << 16)) || >>> - pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) || >>> + if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, ®) == 0 && >>> + !PCI_POSSIBLE_ERROR(reg) && >>> reg != (pdev->revision | (pdev->class << 8))) >>> return true; >>> >>> if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL && >>> - (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) || >>> - reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))) >>> + pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, ®) == 0 && >>> + !PCI_POSSIBLE_ERROR(reg) && >>> + reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))) >>> return true; >>> >>> - if (pci_get_dsn(pdev) != ctrl->dsn) >>> + if ((dsn = pci_get_dsn(pdev)) && >>> + !PCI_POSSIBLE_ERROR(dsn) && >>> + dsn != ctrl->dsn) >>> return true; >>> >>> return false; >> Hi Lukas, >> >> Sorry for the late reply, just encountered a strong typhoon in my area >> last week and can't check this in our lab. >> >> The patched pciehp_device_replaced() returns false at the end of the >> function in my case. >> Unplug the dock which is connected with a tbt storage won't be >> considered as a replacement. >> >> But when I tried to replace the dock with the tbt storage during >> suspend, it still returned false at the end of the function like >> unplugged. >> >> BTW, it's a new model, so I think the ICM is used. And the reg is >> 0xffffffff when unplugged. > Hi Lukas, > > PCI_POSSIBLE_ERROR() always returns true no matter the device is > replaced or unplugged > It seems difficult to distinguish between when a device is replaced > and when it's unplugged. If DSN (Device Serial Number) is not optional extended capability, then we could use it as a tag to know if the device is replaced or unplugged. So it would be better to have something like DSN but not optional in spec. Thanks, Ethan > > Do you have any ideas to fix the issue? > This issue is severe to me, because the system hangs almost everytime > when daisy chain tbt devices are unplugged when suspended. > Thanks. >
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index ff458e692fed..c1c3f7e2bc43 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -330,8 +330,6 @@ static int pciehp_resume_noirq(struct pcie_device *dev) */ if (pciehp_device_replaced(ctrl)) { ctrl_dbg(ctrl, "device replaced during system sleep\n"); - pci_walk_bus(ctrl->pcie->port->subordinate, - pci_dev_set_disconnected, NULL); pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); } }
Remove unnecessary pci_walk_bus() call in pciehp_resume_noirq(). This fixes a system hang that occurs when resuming after a Thunderbolt dock with attached thunderbolt storage is unplugged during system suspend. The PCI core already handles setting the disconnected state for devices under a port during suspend/resume. The redundant bus walk was interfering with proper hardware state detection during resume, causing a system hang when hot-unplugging daisy-chained Thunderbolt devices. Fixes: 9d573d19547b ("PCI: pciehp: Detect device replacement during system sleep") Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> --- drivers/pci/hotplug/pciehp_core.c | 2 -- 1 file changed, 2 deletions(-)