diff mbox series

[PATCHv2,1/5] pci: make pci_stop_dev concurrent safe

Message ID 20240827192826.710031-2-kbusch@meta.com (mailing list archive)
State Superseded
Headers show
Series pci cleanup/prep patches | expand

Commit Message

Keith Busch Aug. 27, 2024, 7:28 p.m. UTC
From: Keith Busch <kbusch@kernel.org>

Use the atomic ADDED bit flag to safely ensure concurrent callers can't
attempt to stop the device multiple times. Callers should currently be
holding the pci_rescan_remove_lock, so there shouldn't be fixing any
existing race. But that global lock can cause lock dependency issues, so
this is preparing to reduce reliance on that lock by using the existing
atomic bit ops.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/pci/bus.c    |  2 +-
 drivers/pci/pci.h    |  9 +++++++--
 drivers/pci/remove.c | 18 ++++++++----------
 3 files changed, 16 insertions(+), 13 deletions(-)

Comments

Davidlohr Bueso Oct. 2, 2024, 11:39 p.m. UTC | #1
On Tue, 27 Aug 2024, Keith Busch wrote:

>diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>index 55c8536860518..e41dfece0d969 100644
>--- a/drivers/pci/bus.c
>+++ b/drivers/pci/bus.c
>@@ -348,7 +348,7 @@ void pci_bus_add_device(struct pci_dev *dev)
>	if (retval < 0 && retval != -EPROBE_DEFER)
>		pci_warn(dev, "device attach failed (%d)\n", retval);
>
>-	pci_dev_assign_added(dev, true);
>+	pci_dev_assign_added(dev);
>
>	if (dev_of_node(&dev->dev) && pci_is_bridge(dev)) {
>		retval = of_platform_populate(dev_of_node(&dev->dev), NULL, NULL,
>diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>index 0e9b1c7b94a5a..802f7eac53115 100644
>--- a/drivers/pci/pci.h
>+++ b/drivers/pci/pci.h
>@@ -444,9 +444,14 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
> #define PCI_DPC_RECOVERED 1
> #define PCI_DPC_RECOVERING 2
>
>-static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
>+static inline void pci_dev_assign_added(struct pci_dev *dev)
> {
>-	assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
>+	set_bit(PCI_DEV_ADDED, &dev->priv_flags);
>+}

So set_bit does not imply any barriers. Does this matter in the future
when breaking up pci_rescan_remove_lock? For example, what prevents
things like:

pci_bus_add_device()			pci_stop_dev()
     pci_dev_assign_added()
	dev->priv_flags [S]
					    pci_dev_test_and_clear_added() // true
						dev->priv_flags [L]
     device_attach(&dev->dev)
					    device_release_driver(&dev->dev)

... I guess that implied barrier from that device_lock() in device_attach().
I am not familiar with this code, but overall I think any locking rework should
explain more about the ordering implications in the changes if the end result
is finer grained locking.

Thanks,
Davidlohr
Keith Busch Oct. 3, 2024, 12:04 a.m. UTC | #2
On Wed, Oct 02, 2024 at 04:39:37PM -0700, Davidlohr Bueso wrote:
> > +	set_bit(PCI_DEV_ADDED, &dev->priv_flags);
> > +}
> 
> So set_bit does not imply any barriers. 

Huh. Hannes told me the same thing just last weak, and I was thinking
"nah, it's an atomic operation." But I'm mistaken thinking that provides
a memory barrier.

> Does this matter in the future when breaking up
> pci_rescan_remove_lock? For example, what prevents things like:

We're still far from being able to remove the big pci rescan/remove
lock, but yes, that's the idea. This should be safe as-is since it is
still using that lock, I can add smp barriers to make the memroy
dependencies explicit.
 
> pci_bus_add_device()			pci_stop_dev()
>     pci_dev_assign_added()
> 	dev->priv_flags [S]
> 					    pci_dev_test_and_clear_added() // true
> 						dev->priv_flags [L]
>     device_attach(&dev->dev)
> 					    device_release_driver(&dev->dev)
> 
> ... I guess that implied barrier from that device_lock() in device_attach().
> I am not familiar with this code, but overall I think any locking rework should
> explain more about the ordering implications in the changes if the end result

Oh, goot point. This sequence shouldn't be possible with either the
existing or proposed bus locking, and I can certainly add more detailed
explanations.
diff mbox series

Patch

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 55c8536860518..e41dfece0d969 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -348,7 +348,7 @@  void pci_bus_add_device(struct pci_dev *dev)
 	if (retval < 0 && retval != -EPROBE_DEFER)
 		pci_warn(dev, "device attach failed (%d)\n", retval);
 
-	pci_dev_assign_added(dev, true);
+	pci_dev_assign_added(dev);
 
 	if (dev_of_node(&dev->dev) && pci_is_bridge(dev)) {
 		retval = of_platform_populate(dev_of_node(&dev->dev), NULL, NULL,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 0e9b1c7b94a5a..802f7eac53115 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -444,9 +444,14 @@  static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 #define PCI_DPC_RECOVERED 1
 #define PCI_DPC_RECOVERING 2
 
-static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
+static inline void pci_dev_assign_added(struct pci_dev *dev)
 {
-	assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
+	set_bit(PCI_DEV_ADDED, &dev->priv_flags);
+}
+
+static inline bool pci_dev_test_and_clear_added(struct pci_dev *dev)
+{
+	return test_and_clear_bit(PCI_DEV_ADDED, &dev->priv_flags);
 }
 
 static inline bool pci_dev_is_added(const struct pci_dev *dev)
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 910387e5bdbf9..ec3064a115bf8 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -16,17 +16,15 @@  static void pci_free_resources(struct pci_dev *dev)
 
 static void pci_stop_dev(struct pci_dev *dev)
 {
-	pci_pme_active(dev, false);
-
-	if (pci_dev_is_added(dev)) {
-		of_platform_depopulate(&dev->dev);
-		device_release_driver(&dev->dev);
-		pci_proc_detach_device(dev);
-		pci_remove_sysfs_dev_files(dev);
-		of_pci_remove_node(dev);
+	if (!pci_dev_test_and_clear_added(dev))
+		return;
 
-		pci_dev_assign_added(dev, false);
-	}
+	pci_pme_active(dev, false);
+	of_platform_depopulate(&dev->dev);
+	device_release_driver(&dev->dev);
+	pci_proc_detach_device(dev);
+	pci_remove_sysfs_dev_files(dev);
+	of_pci_remove_node(dev);
 }
 
 static void pci_destroy_dev(struct pci_dev *dev)