wifi: ath11k: avoid deadlock during regulatory update in ath11k_regd_update()

Message ID 20221006151747.13757-1-kvalo@kernel.org (mailing list archive)
State Accepted
Commit d99884ad9e3673a12879bc2830f6e5a66cccbd78
Delegated to: Kalle Valo
Series wifi: ath11k: avoid deadlock during regulatory update in ath11k_regd_update()

Commit Message

Kalle Valo Oct. 6, 2022, 3:17 p.m. UTC
From: Wen Gong <quic_wgong@quicinc.com>

Running this test in a loop easily reproduces an rtnl deadlock:

iw reg set FI
ifconfig wlan0 down

What happens is that thread A (workqueue) tries to update the regulatory:

    tries to acquire rtnl_lock() from ar->regd_update_work

    rtnl_lock+0x17/0x20
    ath11k_regd_update+0x15a/0x260 [ath11k]
    ath11k_regd_update_work+0x15/0x20 [ath11k]
    process_one_work+0x228/0x670
    worker_thread+0x4d/0x440
    kthread+0x16d/0x1b0
    ret_from_fork+0x22/0x30

And thread B (ifconfig) tries to stop the interface:

    tries to call cancel_work_sync(&ar->regd_update_work) in ath11k_mac_op_stop().

    ifconfig  3109 [003]  2414.232506: probe:ath11k_mac_op_stop: (ffffffffc14187a0)
    drv_stop+0x30 ([mac80211])
    ieee80211_do_stop+0x5d2 ([mac80211])
    ieee80211_stop+0x3e ([mac80211])
    __dev_close_many+0x9e ([kernel.kallsyms])
    __dev_change_flags+0xbe ([kernel.kallsyms])
    dev_change_flags+0x23 ([kernel.kallsyms])
    devinet_ioctl+0x5e3 ([kernel.kallsyms])
    inet_ioctl+0x197 ([kernel.kallsyms])
    sock_do_ioctl+0x4d ([kernel.kallsyms])
    sock_ioctl+0x264 ([kernel.kallsyms])
    __x64_sys_ioctl+0x92 ([kernel.kallsyms])
    do_syscall_64+0x3a ([kernel.kallsyms])
    entry_SYSCALL_64_after_hwframe+0x63 ([kernel.kallsyms])
    __GI___ioctl+0x7 (/lib/x86_64-linux-gnu/libc-2.23.so)

The deadlock sequence is:

1. Thread B calls rtnl_lock().

2. Thread A starts to run and calls rtnl_lock() from within
   ath11k_regd_update_work(), then blocks because the lock is owned by
   thread B.

3. Thread B continues to run and calls
   cancel_work_sync(&ar->regd_update_work), but thread A is in
   ath11k_regd_update_work() waiting for rtnl_lock(). So cancel_work_sync()
   waits forever for ath11k_regd_update_work() to finish and we have a deadlock.
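
The circular wait in steps 1-3 can be sketched in userspace. This is only an
analogue under stated assumptions, not kernel code: a pthread mutex stands in
for the RTNL lock, pthread_join() stands in for cancel_work_sync(), and the
function names are hypothetical. pthread_mutex_trylock() is used in place of a
blocking lock purely so the sketch terminates and can report the conflict.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's RTNL mutex. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;

/*
 * Thread A: models ath11k_regd_update_work() before the fix.
 * trylock is used only so the demo terminates; the kernel's
 * rtnl_lock() would sleep here until the lock is released.
 */
static void *regd_update_work(void *arg)
{
	bool *would_block = arg;
	int rc = pthread_mutex_trylock(&rtnl);

	if (rc == 0)
		pthread_mutex_unlock(&rtnl);
	*would_block = (rc == EBUSY);
	return NULL;
}

/*
 * Thread B: models the interface stop path. Returns true when the
 * work item needed the lock while B was holding it and waiting for
 * the work to finish, which is the circular wait from steps 1-3.
 */
static bool stop_path_would_deadlock(void)
{
	pthread_t worker;
	bool would_block = false;
	int rc;

	pthread_mutex_lock(&rtnl);		/* step 1 */
	rc = pthread_create(&worker, NULL, regd_update_work, &would_block);
	assert(rc == 0);
	pthread_join(worker, NULL);		/* step 3: cancel_work_sync() analogue */
	pthread_mutex_unlock(&rtnl);

	return would_block;
}
```

Because thread B holds the mutex from before the worker is created until after
it is joined, the worker always finds the lock busy. With a blocking lock in
the worker, neither thread could make progress.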

Fix this by switching from regulatory_set_wiphy_regd_sync() to
regulatory_set_wiphy_regd(). Now cfg80211 will schedule another work item which
handles the locking on its own. So the ath11k work item can simply exit without
taking any locks, avoiding the deadlock.
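
The shape of the fix can be sketched in the same userspace terms. Again this
is an analogue with hypothetical names, not the cfg80211 implementation: the
driver work only records a request and returns without touching the lock, and
a separate deferred work takes the lock itself once it is free. For simplicity
the deferred work is started after the stop path has released the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's RTNL mutex. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool regd_pending;	/* request handed to the deferred work */
static bool regd_applied;	/* set once the deferred work has run */

/* Models the driver work after the fix: record the request, take no lock. */
static void *regd_update_work(void *arg)
{
	(void)arg;
	regd_pending = true;
	return NULL;
}

/* Models the deferred regulatory work, which does its own locking. */
static void *deferred_reg_work(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&rtnl);
	if (regd_pending) {
		regd_pending = false;
		regd_applied = true;
	}
	pthread_mutex_unlock(&rtnl);
	return NULL;
}

/*
 * Models the interface stop path. The join on the driver work returns
 * even though the lock is held, because that work never asks for it.
 * Returns true once the deferred work has applied the update.
 */
static bool stop_path_completes(void)
{
	pthread_t driver_work, reg_work;
	int rc;

	pthread_mutex_lock(&rtnl);
	rc = pthread_create(&driver_work, NULL, regd_update_work, NULL);
	assert(rc == 0);
	pthread_join(driver_work, NULL);	/* cancel_work_sync() analogue */
	pthread_mutex_unlock(&rtnl);

	rc = pthread_create(&reg_work, NULL, deferred_reg_work, NULL);
	assert(rc == 0);
	pthread_join(reg_work, NULL);

	return regd_applied;
}
```

The design point is that the work item being waited on no longer depends on a
lock held by its waiter, so the wait in the stop path always finishes.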

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3

Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
[kvalo: improve commit log]
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
---
 drivers/net/wireless/ath/ath11k/reg.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)


base-commit: 023baf1318ef21442fab3842bf03883bc81223e0

Comments

Jeff Johnson Oct. 7, 2022, 9:08 p.m. UTC | #1
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Kalle Valo Oct. 10, 2022, 9:38 a.m. UTC | #2
Patch applied to ath-next branch of ath.git, thanks.

d99884ad9e36 wifi: ath11k: avoid deadlock during regulatory update in ath11k_regd_update()
Patch

diff --git a/drivers/net/wireless/ath/ath11k/reg.c b/drivers/net/wireless/ath/ath11k/reg.c
index 7ee3ff69dfc8..6fae4e61ede7 100644
--- a/drivers/net/wireless/ath/ath11k/reg.c
+++ b/drivers/net/wireless/ath/ath11k/reg.c
@@ -287,11 +287,7 @@  int ath11k_regd_update(struct ath11k *ar)
 		goto err;
 	}
 
-	rtnl_lock();
-	wiphy_lock(ar->hw->wiphy);
-	ret = regulatory_set_wiphy_regd_sync(ar->hw->wiphy, regd_copy);
-	wiphy_unlock(ar->hw->wiphy);
-	rtnl_unlock();
+	ret = regulatory_set_wiphy_regd(ar->hw->wiphy, regd_copy);
 
 	kfree(regd_copy);