Message ID | 20250328174216.3513079-1-sdf@fomichev.me (mailing list archive) |
---|---|
State | Accepted |
Commit | dd07df9ff3d148aee87fcbab99ff14f0727752f4 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] bnxt_en: bring back rtnl lock in bnxt_shutdown | expand |
On Sat, Mar 29, 2025 at 2:42 AM Stanislav Fomichev <sdf@fomichev.me> wrote: > Hi Stanislav, Thanks a lot for this fix! > Taehee reports missing rtnl from bnxt_shutdown path: > > inetdev_event (./include/linux/inetdevice.h:256 net/ipv4/devinet.c:1585) > notifier_call_chain (kernel/notifier.c:85) > __dev_close_many (net/core/dev.c:1732 (discriminator 3)) > kernel/locking/mutex.c:713 kernel/locking/mutex.c:732) > dev_close_many (net/core/dev.c:1786) > netif_close (./include/linux/list.h:124 ./include/linux/list.h:215 > bnxt_shutdown (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16707) bnxt_en > pci_device_shutdown (drivers/pci/pci-driver.c:511) > device_shutdown (drivers/base/core.c:4820) > kernel_restart (kernel/reboot.c:271 kernel/reboot.c:285) > > Bring back the rtnl lock. Tested-by: Taehee Yoo <ap420073@gmail.com> > > Link: https://lore.kernel.org/netdev/CAMArcTV4P8PFsc6O2tSgzRno050DzafgqkLA2b7t=Fv_SY=brw@mail.gmail.com/ > Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") > Reported-by: Taehee Yoo <ap420073@gmail.com> > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> > --- > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > index 934ba9425857..1a70605fad38 100644 > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > @@ -16698,6 +16698,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) > if (!dev) > return; > > + rtnl_lock(); > netdev_lock(dev); > bp = netdev_priv(dev); > if (!bp) > @@ -16717,6 +16718,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) > > shutdown_exit: > netdev_unlock(dev); > + rtnl_unlock(); > } > > #ifdef CONFIG_PM_SLEEP > -- > 2.48.1 >
Hello Stanislav, On Fri, Mar 28, 2025 at 10:42:16AM -0700, Stanislav Fomichev wrote: > Taehee reports missing rtnl from bnxt_shutdown path: > > inetdev_event (./include/linux/inetdevice.h:256 net/ipv4/devinet.c:1585) > notifier_call_chain (kernel/notifier.c:85) > __dev_close_many (net/core/dev.c:1732 (discriminator 3)) > kernel/locking/mutex.c:713 kernel/locking/mutex.c:732) > dev_close_many (net/core/dev.c:1786) > netif_close (./include/linux/list.h:124 ./include/linux/list.h:215 > bnxt_shutdown (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16707) bnxt_en > pci_device_shutdown (drivers/pci/pci-driver.c:511) > device_shutdown (drivers/base/core.c:4820) > kernel_restart (kernel/reboot.c:271 kernel/reboot.c:285) I've got this issue as well. > > Bring back the rtnl lock. > > Link: https://lore.kernel.org/netdev/CAMArcTV4P8PFsc6O2tSgzRno050DzafgqkLA2b7t=Fv_SY=brw@mail.gmail.com/ > Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") > Reported-by: Taehee Yoo <ap420073@gmail.com> > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Breno Leitao <leitao@debian.org> > --- > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > index 934ba9425857..1a70605fad38 100644 > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > @@ -16698,6 +16698,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) > if (!dev) > return; > > + rtnl_lock(); > netdev_lock(dev); can't we leverage the `struct net_device->lock` for the shutdown. Basically we have the lock the single device we are turning it down. I am wondering if we really need the big RTNL lock. This is my understanding of what is happening: pci_device_shutdown() is called for a single device - netdev_lock(dev) - netif_close(dev); - dev_close_many(&single, true); - __dev_close_many() - ASSERT_RTNL(); Basically we ware only closing one device, and the net_device->lock is already held. Shouldn't it be enough? Can we do something like this (from my naive point of view): static void __dev_close_many(struct list_head *head) { struct net_device *dev; - ASSERT_RTNL(); might_sleep(); list_for_each_entry(dev, head, close_list) { + ASSERT_RTNL_NET(dev); ... } Thanks --breno
On 03/31, Breno Leitao wrote: > Hello Stanislav, > > On Fri, Mar 28, 2025 at 10:42:16AM -0700, Stanislav Fomichev wrote: > > Taehee reports missing rtnl from bnxt_shutdown path: > > > > inetdev_event (./include/linux/inetdevice.h:256 net/ipv4/devinet.c:1585) > > notifier_call_chain (kernel/notifier.c:85) > > __dev_close_many (net/core/dev.c:1732 (discriminator 3)) > > kernel/locking/mutex.c:713 kernel/locking/mutex.c:732) > > dev_close_many (net/core/dev.c:1786) > > netif_close (./include/linux/list.h:124 ./include/linux/list.h:215 > > bnxt_shutdown (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16707) bnxt_en > > pci_device_shutdown (drivers/pci/pci-driver.c:511) > > device_shutdown (drivers/base/core.c:4820) > > kernel_restart (kernel/reboot.c:271 kernel/reboot.c:285) > > I've got this issue as well. > > > > > Bring back the rtnl lock. > > > > Link: https://lore.kernel.org/netdev/CAMArcTV4P8PFsc6O2tSgzRno050DzafgqkLA2b7t=Fv_SY=brw@mail.gmail.com/ > > Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") > > Reported-by: Taehee Yoo <ap420073@gmail.com> > > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> > > Tested-by: Breno Leitao <leitao@debian.org> > > > --- > > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > index 934ba9425857..1a70605fad38 100644 > > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > @@ -16698,6 +16698,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) > > if (!dev) > > return; > > > > + rtnl_lock(); > > netdev_lock(dev); > > can't we leverage the `struct net_device->lock` for the shutdown. > Basically we have the lock the single device we are turning it down. > > I am wondering if we really need the big RTNL lock. This is my > understanding of what is happening: > > pci_device_shutdown() is called for a single device > - netdev_lock(dev) > - netif_close(dev); > - dev_close_many(&single, true); > - __dev_close_many() > - ASSERT_RTNL(); > > Basically we ware only closing one device, and the net_device->lock > is already held. Shouldn't it be enough? [..] > Can we do something like this (from my naive point of view): > > static void __dev_close_many(struct list_head *head) > { > struct net_device *dev; > > - ASSERT_RTNL(); > might_sleep(); > > list_for_each_entry(dev, head, close_list) { > + ASSERT_RTNL_NET(dev); > ... > } - netif_close adds dev->close_list to the list (if it was up) - __dev_close_many walks over that list, so your new assert should trigger as well But also in general, it would be nice to keep existing rtnl+instance_lock scheme for now (except were we want to explicitly opt out, as in queue apis); we can follow up later to un-rtnl the rest.
On Mon, Mar 31, 2025 at 08:29:53AM -0700, Stanislav Fomichev wrote: > On 03/31, Breno Leitao wrote: > > Hello Stanislav, > > > > On Fri, Mar 28, 2025 at 10:42:16AM -0700, Stanislav Fomichev wrote: > > > Taehee reports missing rtnl from bnxt_shutdown path: > > > > > > inetdev_event (./include/linux/inetdevice.h:256 net/ipv4/devinet.c:1585) > > > notifier_call_chain (kernel/notifier.c:85) > > > __dev_close_many (net/core/dev.c:1732 (discriminator 3)) > > > kernel/locking/mutex.c:713 kernel/locking/mutex.c:732) > > > dev_close_many (net/core/dev.c:1786) > > > netif_close (./include/linux/list.h:124 ./include/linux/list.h:215 > > > bnxt_shutdown (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16707) bnxt_en > > > pci_device_shutdown (drivers/pci/pci-driver.c:511) > > > device_shutdown (drivers/base/core.c:4820) > > > kernel_restart (kernel/reboot.c:271 kernel/reboot.c:285) > > > > I've got this issue as well. > > > > > > > > Bring back the rtnl lock. > > > > > > Link: https://lore.kernel.org/netdev/CAMArcTV4P8PFsc6O2tSgzRno050DzafgqkLA2b7t=Fv_SY=brw@mail.gmail.com/ > > > Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") > > > Reported-by: Taehee Yoo <ap420073@gmail.com> > > > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> > > > > Tested-by: Breno Leitao <leitao@debian.org> > > > > > --- > > > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > > index 934ba9425857..1a70605fad38 100644 > > > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > > @@ -16698,6 +16698,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) > > > if (!dev) > > > return; > > > > > > + rtnl_lock(); > > > netdev_lock(dev); > > > > can't we leverage the `struct net_device->lock` for the shutdown. > > Basically we have the lock the single device we are turning it down. > > > > I am wondering if we really need the big RTNL lock. This is my > > understanding of what is happening: > > > > pci_device_shutdown() is called for a single device > > - netdev_lock(dev) > > - netif_close(dev); > > - dev_close_many(&single, true); > > - __dev_close_many() > > - ASSERT_RTNL(); > > > > Basically we ware only closing one device, and the net_device->lock > > is already held. Shouldn't it be enough? > > [..] > > > Can we do something like this (from my naive point of view): > > > > static void __dev_close_many(struct list_head *head) > > { > > struct net_device *dev; > > > > - ASSERT_RTNL(); > > might_sleep(); > > > > list_for_each_entry(dev, head, close_list) { > > + ASSERT_RTNL_NET(dev); > > ... > > } > > - netif_close adds dev->close_list to the list (if it was up) Right, but that list has only one net_device entry, right? netif_close() instanciates a single list and merges it into `dev->close_list` > - __dev_close_many walks over that list, so your new assert should > trigger as well Why? Isn't the list only contain the dev that is already protected by netdev_lock()? > But also in general, it would be nice to keep existing > rtnl+instance_lock scheme for now (except were we want to explicitly opt > out, as in queue apis); we can follow up later to un-rtnl the rest. I am just wondering if the code as-is is already safe from a locking perspecting, and just the warning (ASSERT_RTNL) is wrong.
On Mon, 31 Mar 2025 09:53:16 -0700 Breno Leitao wrote: > > But also in general, it would be nice to keep existing > > rtnl+instance_lock scheme for now (except were we want to explicitly opt > > out, as in queue apis); we can follow up later to un-rtnl the rest. > > I am just wondering if the code as-is is already safe from a locking > perspecting, and just the warning (ASSERT_RTNL) is wrong. I suspect the notifiers for DOWN may expect to be called with rtnl held.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Fri, 28 Mar 2025 10:42:16 -0700 you wrote: > Taehee reports missing rtnl from bnxt_shutdown path: > > inetdev_event (./include/linux/inetdevice.h:256 net/ipv4/devinet.c:1585) > notifier_call_chain (kernel/notifier.c:85) > __dev_close_many (net/core/dev.c:1732 (discriminator 3)) > kernel/locking/mutex.c:713 kernel/locking/mutex.c:732) > dev_close_many (net/core/dev.c:1786) > netif_close (./include/linux/list.h:124 ./include/linux/list.h:215 > bnxt_shutdown (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16707) bnxt_en > pci_device_shutdown (drivers/pci/pci-driver.c:511) > device_shutdown (drivers/base/core.c:4820) > kernel_restart (kernel/reboot.c:271 kernel/reboot.c:285) > > [...] Here is the summary with links: - [net] bnxt_en: bring back rtnl lock in bnxt_shutdown https://git.kernel.org/netdev/net/c/dd07df9ff3d1 You are awesome, thank you!
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 934ba9425857..1a70605fad38 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -16698,6 +16698,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) if (!dev) return; + rtnl_lock(); netdev_lock(dev); bp = netdev_priv(dev); if (!bp) @@ -16717,6 +16718,7 @@ static void bnxt_shutdown(struct pci_dev *pdev) shutdown_exit: netdev_unlock(dev); + rtnl_unlock(); } #ifdef CONFIG_PM_SLEEP
Taehee reports missing rtnl from bnxt_shutdown path: inetdev_event (./include/linux/inetdevice.h:256 net/ipv4/devinet.c:1585) notifier_call_chain (kernel/notifier.c:85) __dev_close_many (net/core/dev.c:1732 (discriminator 3)) kernel/locking/mutex.c:713 kernel/locking/mutex.c:732) dev_close_many (net/core/dev.c:1786) netif_close (./include/linux/list.h:124 ./include/linux/list.h:215 bnxt_shutdown (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16707) bnxt_en pci_device_shutdown (drivers/pci/pci-driver.c:511) device_shutdown (drivers/base/core.c:4820) kernel_restart (kernel/reboot.c:271 kernel/reboot.c:285) Bring back the rtnl lock. Link: https://lore.kernel.org/netdev/CAMArcTV4P8PFsc6O2tSgzRno050DzafgqkLA2b7t=Fv_SY=brw@mail.gmail.com/ Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Reported-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++ 1 file changed, 2 insertions(+)