Message ID | 20191024171228.877974-2-s.miroshnichenko@yadro.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI: Allow BAR movement during hotplug | expand |
pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff] pci_bus 0000:00: root bus resource [io 0x18800000-0x188fffff] pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0] pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff] pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size) pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size) pci 0000:00:04.0: BAR 0: assigned [mem 0x50000000-0x5000ffff] pci 0000:00:05.0: BAR 1: assigned [mem 0x50010000-0x50010fff] pci 0000:00:05.0: BAR 3: assigned [mem 0x50011000-0x50011fff] pci 0000:00:0a.0: BAR 1: assigned [mem 0x50012000-0x50012fff] pci 0000:00:0a.0: BAR 3: assigned [mem 0x50013000-0x50013fff] pci 0000:00:02.0: BAR 0: assigned [io 0x18800000-0x188000ff] pci 0000:00:02.0: BAR 1: assigned [mem 0x50014000-0x500140ff] pci 0000:00:03.0: BAR 0: assigned [io 0x18800400-0x188004ff] pci 0000:00:03.0: BAR 1: assigned [mem 0x50014100-0x500141ff] pci 0000:00:05.0: BAR 0: assigned [io 0x18800800-0x1880081f] pci 0000:00:05.0: BAR 2: assigned [io 0x18800820-0x1880083f] pci 0000:00:0a.0: BAR 0: assigned [io 0x18800840-0x1880085f] pci 0000:00:0a.0: BAR 2: assigned [io 0x18800860-0x1880087f] 00:00.0 Non-VGA unclassified device: Integrated Device Technology, Inc. Device 0000 Subsystem: Device 0214:011d Flags: bus master, 66MHz, medium devsel, latency 60, IRQ 140 Memory at <unassigned> (32-bit, prefetchable) I/O ports at <ignored> I/O ports at <ignored> 00:02.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S [Rhine-III] (rev 86) Subsystem: AST Research Inc Device 086c Flags: bus master, stepping, medium devsel, latency 64, IRQ 142 I/O ports at 18800000 [size=256] Memory at 50014000 (32-bit, non-prefetchable) [size=256] Capabilities: [40] Power Management version 2 Kernel driver in use: via-rhine 00:03.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S [Rhine-III] (rev 86) Subsystem: AST Research Inc Device 086c Flags: bus master, stepping, medium devsel, latency 64, IRQ 143 I/O ports at 18800400 [size=256] Memory at 50014100 (32-bit, non-prefetchable) [size=256] Capabilities: [40] Power Management version 2 Kernel driver in use: via-rhine 00:04.0 Network controller: Atheros Communications Inc. Device 0029 (rev 01) Subsystem: Atheros Communications Inc. Device 2091 Flags: bus master, 66MHz, medium devsel, latency 168, IRQ 142 Memory at 50000000 (32-bit, non-prefetchable) [size=64K] Capabilities: [44] Power Management version 2 Kernel driver in use: ath9k 00:05.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [) Subsystem: Oxford Semiconductor Ltd Device 0000 Flags: medium devsel, IRQ 143 I/O ports at 18800800 [size=32] Memory at 50010000 (32-bit, non-prefetchable) [size=4K] I/O ports at 18800820 [size=32] Memory at 50011000 (32-bit, non-prefetchable) [size=4K] Capabilities: [40] Power Management version 2 Kernel driver in use: serial 00:05.1 Non-VGA unclassified device: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01) Subsystem: Oxford Semiconductor Ltd Device 0000 Flags: medium devsel, IRQ 143 I/O ports at <unassigned> [disabled] I/O ports at <unassigned> [disabled] I/O ports at <unassigned> [disabled] Capabilities: [40] Power Management version 2 00:0a.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [) Subsystem: Oxford Semiconductor Ltd Device 0000 Flags: medium devsel, IRQ 140 I/O ports at 18800840 [size=32] Memory at 50012000 (32-bit, non-prefetchable) [size=4K] I/O ports at 18800860 [size=32] Memory at 50013000 (32-bit, non-prefetchable) [size=4K] Capabilities: [40] Power Management version 2 Kernel driver in use: serial 00:0a.1 Non-VGA unclassified device: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01) Subsystem: Oxford Semiconductor Ltd Device 0000 Flags: medium devsel, IRQ 140 I/O ports at <unassigned> [disabled] I/O ports at <unassigned> [disabled] I/O ports at <unassigned> [disabled] Capabilities: [40] Power Management version 2 hi guys I have a couple of miniPCI Oxford Semiconductor Ltd OX16PCI954 cards installed, and the dmesg looks weird espeially these lines pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff] pci_bus 0000:00: root bus resource [io 0x18800000-0x188fffff] pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0] pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff] pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size) pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size) besides, I am experimenting crashes happening in burn-in tests, and I do suspect it's something related to the newly added cards any enlightenment?
On Fri, Oct 25, 2019 at 04:33:13PM +0200, Carlo Pisani wrote: > pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff] > pci_bus 0000:00: root bus resource [io 0x18800000-0x188fffff] > pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0] > pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff] > pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size) > pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size) > pci 0000:00:04.0: BAR 0: assigned [mem 0x50000000-0x5000ffff] > pci 0000:00:05.0: BAR 1: assigned [mem 0x50010000-0x50010fff] > pci 0000:00:05.0: BAR 3: assigned [mem 0x50011000-0x50011fff] > pci 0000:00:0a.0: BAR 1: assigned [mem 0x50012000-0x50012fff] > pci 0000:00:0a.0: BAR 3: assigned [mem 0x50013000-0x50013fff] > pci 0000:00:02.0: BAR 0: assigned [io 0x18800000-0x188000ff] > pci 0000:00:02.0: BAR 1: assigned [mem 0x50014000-0x500140ff] > pci 0000:00:03.0: BAR 0: assigned [io 0x18800400-0x188004ff] > pci 0000:00:03.0: BAR 1: assigned [mem 0x50014100-0x500141ff] > pci 0000:00:05.0: BAR 0: assigned [io 0x18800800-0x1880081f] > pci 0000:00:05.0: BAR 2: assigned [io 0x18800820-0x1880083f] > pci 0000:00:0a.0: BAR 0: assigned [io 0x18800840-0x1880085f] > pci 0000:00:0a.0: BAR 2: assigned [io 0x18800860-0x1880087f] > > > 00:00.0 Non-VGA unclassified device: Integrated Device Technology, > Inc. Device 0000 > Subsystem: Device 0214:011d > Flags: bus master, 66MHz, medium devsel, latency 60, IRQ 140 > Memory at <unassigned> (32-bit, prefetchable) > I/O ports at <ignored> > I/O ports at <ignored> > > 00:02.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S > [Rhine-III] (rev 86) > Subsystem: AST Research Inc Device 086c > Flags: bus master, stepping, medium devsel, latency 64, IRQ 142 > I/O ports at 18800000 [size=256] > Memory at 50014000 (32-bit, non-prefetchable) [size=256] > Capabilities: [40] Power Management version 2 > Kernel driver in use: via-rhine > > 00:03.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S > [Rhine-III] (rev 86) > Subsystem: AST Research Inc Device 086c > Flags: bus master, stepping, medium devsel, latency 64, IRQ 143 > I/O ports at 18800400 [size=256] > Memory at 50014100 (32-bit, non-prefetchable) [size=256] > Capabilities: [40] Power Management version 2 > Kernel driver in use: via-rhine > > 00:04.0 Network controller: Atheros Communications Inc. Device 0029 (rev 01) > Subsystem: Atheros Communications Inc. Device 2091 > Flags: bus master, 66MHz, medium devsel, latency 168, IRQ 142 > Memory at 50000000 (32-bit, non-prefetchable) [size=64K] > Capabilities: [44] Power Management version 2 > Kernel driver in use: ath9k > > 00:05.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad > 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [) > Subsystem: Oxford Semiconductor Ltd Device 0000 > Flags: medium devsel, IRQ 143 > I/O ports at 18800800 [size=32] > Memory at 50010000 (32-bit, non-prefetchable) [size=4K] > I/O ports at 18800820 [size=32] > Memory at 50011000 (32-bit, non-prefetchable) [size=4K] > Capabilities: [40] Power Management version 2 > Kernel driver in use: serial > > 00:05.1 Non-VGA unclassified device: Oxford Semiconductor Ltd > OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01) > Subsystem: Oxford Semiconductor Ltd Device 0000 > Flags: medium devsel, IRQ 143 > I/O ports at <unassigned> [disabled] > I/O ports at <unassigned> [disabled] > I/O ports at <unassigned> [disabled] > Capabilities: [40] Power Management version 2 > > 00:0a.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad > 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [) > Subsystem: Oxford Semiconductor Ltd Device 0000 > Flags: medium devsel, IRQ 140 > I/O ports at 18800840 [size=32] > Memory at 50012000 (32-bit, non-prefetchable) [size=4K] > I/O ports at 18800860 [size=32] > Memory at 50013000 (32-bit, non-prefetchable) [size=4K] > Capabilities: [40] Power Management version 2 > Kernel driver in use: serial > > 00:0a.1 Non-VGA unclassified device: Oxford Semiconductor Ltd > OX16PCI954 (Quad 16950 UART) function 0 (Disabled) (rev 01) > Subsystem: Oxford Semiconductor Ltd Device 0000 > Flags: medium devsel, IRQ 140 > I/O ports at <unassigned> [disabled] > I/O ports at <unassigned> [disabled] > I/O ports at <unassigned> [disabled] > Capabilities: [40] Power Management version 2 > > > hi guys > I have a couple of miniPCI Oxford Semiconductor Ltd OX16PCI954 cards > installed, and the dmesg looks weird > > espeially these lines > pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff] > pci_bus 0000:00: root bus resource [io 0x18800000-0x188fffff] > pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0] > pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff] These resources are supplied to the PCI core, probably from DT. A complete dmesg log would show more. > pci 0000:00:00.0: [Firmware Bug]: reg 0x14: invalid BAR (can't size) > pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size) > besides, I am experimenting crashes happening in burn-in tests, and I > do suspect it's something related to the newly added cards If you take the cards out do the lines you mention above change? What sort of crashes do you see? I assume it doesn't crash without the cards? It *looks* like the miniPCI cards should be these devices: 00:05.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [) 00:0a.0 Serial controller: Oxford Semiconductor Ltd OX16PCI954 (Quad 16950 UART) function 0 (Uart) (rev 01) (prog-if 06 [) which are unrelated to the 00:00.0 device with the broken BAR.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..44d0d12c80cf 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1643,6 +1643,8 @@ static void pci_enable_bridge(struct pci_dev *dev) struct pci_dev *bridge; int retval; + mutex_lock(&dev->enable_mutex); + bridge = pci_upstream_bridge(dev); if (bridge) pci_enable_bridge(bridge); @@ -1650,6 +1652,7 @@ static void pci_enable_bridge(struct pci_dev *dev) if (pci_is_enabled(dev)) { if (!dev->is_busmaster) pci_set_master(dev); + mutex_unlock(&dev->enable_mutex); return; } @@ -1658,11 +1661,14 @@ static void pci_enable_bridge(struct pci_dev *dev) pci_err(dev, "Error enabling bridge (%d), continuing\n", retval); pci_set_master(dev); + mutex_unlock(&dev->enable_mutex); } static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) { struct pci_dev *bridge; + /* Enable-locking of bridges is performed within the pci_enable_bridge() */ + bool need_lock = !dev->subordinate; int err; int i, bars = 0; @@ -1678,8 +1684,13 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK); } - if (atomic_inc_return(&dev->enable_cnt) > 1) + if (need_lock) + mutex_lock(&dev->enable_mutex); + if (pci_is_enabled(dev)) { + if (need_lock) + mutex_unlock(&dev->enable_mutex); return 0; /* already enabled */ + } bridge = pci_upstream_bridge(dev); if (bridge) @@ -1694,8 +1705,10 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) bars |= (1 << i); err = do_pci_enable_device(dev, bars); - if (err < 0) - atomic_dec(&dev->enable_cnt); + if (err >= 0) + atomic_inc(&dev->enable_cnt); + if (need_lock) + mutex_unlock(&dev->enable_mutex); return err; } @@ -1939,15 +1952,20 @@ void pci_disable_device(struct pci_dev *dev) if (dr) dr->enabled = 0; + mutex_lock(&dev->enable_mutex); dev_WARN_ONCE(&dev->dev, atomic_read(&dev->enable_cnt) <= 0, "disabling already-disabled device"); - if (atomic_dec_return(&dev->enable_cnt) != 0) + if (atomic_dec_return(&dev->enable_cnt) != 0) { + mutex_unlock(&dev->enable_mutex); return; + } do_pci_disable_device(dev); dev->is_busmaster = 0; + + mutex_unlock(&dev->enable_mutex); } EXPORT_SYMBOL(pci_disable_device); diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 3d5271a7a849..d4f21e413638 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2158,6 +2158,7 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus) INIT_LIST_HEAD(&dev->bus_list); dev->dev.type = &pci_dev_type; dev->bus = pci_bus_get(bus); + mutex_init(&dev->enable_mutex); return dev; } diff --git a/include/linux/pci.h b/include/linux/pci.h index f9088c89a534..525140e3a460 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -425,6 +425,7 @@ struct pci_dev { unsigned int no_vf_scan:1; /* Don't scan for VFs after IOV enablement */ pci_dev_flags_t dev_flags; atomic_t enable_cnt; /* pci_enable_device has been called */ + struct mutex enable_mutex; u32 saved_config_space[16]; /* Config space saved at suspend time */ struct hlist_head saved_cap_space;
This is a yet another approach to fix an old [1-2] concurrency issue, when: - two or more devices are being hot-added into a bridge which was initially empty; - a bridge with two or more devices is being hot-added; - during boot, if BIOS/bootloader/firmware doesn't pre-enable bridges. The problem is that a bridge is reported as enabled before the MEM/IO bits are actually written to the PCI_COMMAND register, so another driver thread starts memory requests through the not-yet-enabled bridge: CPU0 CPU1 pci_enable_device_mem() pci_enable_device_mem() pci_enable_bridge() pci_enable_bridge() pci_is_enabled() return false; atomic_inc_return(enable_cnt) Start actual enabling the bridge ... pci_is_enabled() ... return true; ... Start memory requests <-- FAIL ... Set the PCI_COMMAND_MEMORY bit <-- Must wait for this Protect the pci_enable/disable_device() and pci_enable_bridge(), which is similar to the previous solution from commit 40f11adc7cd9 ("PCI: Avoid race while enabling upstream bridges"), but adding a per-device mutexes and preventing the dev->enable_cnt from from incrementing early. CC: Srinath Mannam <srinath.mannam@broadcom.com> CC: Marta Rybczynska <mrybczyn@kalray.eu> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com> [1] https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u [RFC PATCH v3] pci: Concurrency issue during pci enable bridge [2] https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u [RFC PATCH] nvme: avoid race-conditions when enabling devices --- drivers/pci/pci.c | 26 ++++++++++++++++++++++---- drivers/pci/probe.c | 1 + include/linux/pci.h | 1 + 3 files changed, 24 insertions(+), 4 deletions(-)