Message ID | 20231213003614.1648343-1-imammedo@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | PCI: acpiphp: workaround race between hotplug and SCSI_SCAN_ASYNC job | expand |
Hi Igor, I am not able to reproduce the issue any longer with the two patches on top of the mainline linux. Thank you very much! Dongli Zhang On 12/12/23 16:36, Igor Mammedov wrote: > Hacks to mask a race between HBA scan job and bridge re-configuration(s) > during hotplug. > > I don't like it a bit but it something that could be done quickly > and solves problems that were reported. > > Other options to discuss/possibly more invasive: > 1: make sure pci_assign_unassigned_bridge_resources() doesn't reconfigure > bridge if it's not necessary. > 2. make SCSI_SCAN_ASYNC job wait till hotplug is finished for all slots on > the bridge or somehow restart the job if it fails > 3. any other ideas? > > > 1st reported: https://urldefense.com/v3/__https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com__;!!ACWV5N9M2RV99hQ!ORo96Nh22kv1Yj0pazd3c692djoLbWscgouJoyVG1c1CNQnYz-H7nPM7RIp8N-0qQjScZ7BgORR_Lm4oMGMl$ > > CC: Dongli Zhang <dongli.zhang@oracle.com> > CC: linux-acpi@vger.kernel.org > CC: linux-pci@vger.kernel.org > CC: imammedo@redhat.com > CC: mst@redhat.com > CC: rafael@kernel.org > CC: lenb@kernel.org > CC: bhelgaas@google.com > CC: mika.westerberg@linux.intel.com > CC: boris.ostrovsky@oracle.com > CC: joe.jin@oracle.com > CC: stable@vger.kernel.org > CC: linux-kernel@vger.kernel.org > CC: Fiona Ebner <f.ebner@proxmox.com> > CC: Thomas Lamprecht <t.lamprecht@proxmox.com> > > Igor Mammedov (2): > PCI: acpiphp: enable slot only if it hasn't been enabled already > PCI: acpiphp: slowdown hotplug if hotplugging multiple devices at a > time > > drivers/pci/hotplug/acpiphp_glue.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) >
On Wed, Dec 13, 2023 at 01:36:12AM +0100, Igor Mammedov wrote: > Hacks to mask a race between HBA scan job and bridge re-configuration(s) > during hotplug. > > I don't like it a bit but it something that could be done quickly > and solves problems that were reported. I agree, I don't like it either. Adding a 1s delay doesn't address the real problem, and putting in a band-aid like this means the real problem would likely never be addressed. At this point the best option I see is to revert these: cc22522fd55e2 ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") 40613da52b13f ("PCI: acpiphp: Reassign resources on bridge if necessary") I hate the fact that reverting them would mean the root bus hotplug and ACPI bus check notifications would become issues again. But keeping these commits even though they add a new different problem that breaks things for somebody else seems worse to me. Bjorn > Other options to discuss/possibly more invasive: > 1: make sure pci_assign_unassigned_bridge_resources() doesn't reconfigure > bridge if it's not necessary. > 2. make SCSI_SCAN_ASYNC job wait till hotplug is finished for all slots on > the bridge or somehow restart the job if it fails > 3. any other ideas? > > > 1st reported: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com > > CC: Dongli Zhang <dongli.zhang@oracle.com> > CC: linux-acpi@vger.kernel.org > CC: linux-pci@vger.kernel.org > CC: imammedo@redhat.com > CC: mst@redhat.com > CC: rafael@kernel.org > CC: lenb@kernel.org > CC: bhelgaas@google.com > CC: mika.westerberg@linux.intel.com > CC: boris.ostrovsky@oracle.com > CC: joe.jin@oracle.com > CC: stable@vger.kernel.org > CC: linux-kernel@vger.kernel.org > CC: Fiona Ebner <f.ebner@proxmox.com> > CC: Thomas Lamprecht <t.lamprecht@proxmox.com> > > Igor Mammedov (2): > PCI: acpiphp: enable slot only if it hasn't been enabled already > PCI: acpiphp: slowdown hotplug if hotplugging multiple devices at a > time > > drivers/pci/hotplug/acpiphp_glue.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > -- > 2.39.3 >
On Wed, Dec 13, 2023 at 7:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Wed, Dec 13, 2023 at 01:36:12AM +0100, Igor Mammedov wrote: > > Hacks to mask a race between HBA scan job and bridge re-configuration(s) > > during hotplug. > > > > I don't like it a bit but it something that could be done quickly > > and solves problems that were reported. > > I agree, I don't like it either. Adding a 1s delay doesn't address > the real problem, and putting in a band-aid like this means the real > problem would likely never be addressed. > > At this point the best option I see is to revert these: > > cc22522fd55e2 ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") > 40613da52b13f ("PCI: acpiphp: Reassign resources on bridge if necessary") > > I hate the fact that reverting them would mean the root bus hotplug > and ACPI bus check notifications would become issues again. > > But keeping these commits even though they add a new different problem > that breaks things for somebody else seems worse to me. Agreed.