Message ID | 20221006151747.13757-1-kvalo@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | d99884ad9e3673a12879bc2830f6e5a66cccbd78 |
Delegated to: | Kalle Valo |
Series | wifi: ath11k: avoid deadlock during regulatory update in ath11k_regd_update() |
On 10/6/2022 8:17 AM, Kalle Valo wrote:
> From: Wen Gong <quic_wgong@quicinc.com>
>
> Running this test in a loop it is easy to reproduce an rtnl deadlock:
>
> iw reg set FI
> ifconfig wlan0 down
>
> What happens is that thread A (workqueue) tries to update the regulatory:
>
> try to acquire the rtnl_lock of ar->regd_update_work
>
> rtnl_lock+0x17/0x20
> ath11k_regd_update+0x15a/0x260 [ath11k]
> ath11k_regd_update_work+0x15/0x20 [ath11k]
> process_one_work+0x228/0x670
> worker_thread+0x4d/0x440
> kthread+0x16d/0x1b0
> ret_from_fork+0x22/0x30
>
> And thread B (ifconfig) tries to stop the interface:
>
> try to cancel_work_sync(&ar->regd_update_work) in ath11k_mac_op_stop().
> ifconfig 3109 [003] 2414.232506: probe:
>
> ath11k_mac_op_stop: (ffffffffc14187a0)
> drv_stop+0x30 ([mac80211])
> ieee80211_do_stop+0x5d2 ([mac80211])
> ieee80211_stop+0x3e ([mac80211])
> __dev_close_many+0x9e ([kernel.kallsyms])
> __dev_change_flags+0xbe ([kernel.kallsyms])
> dev_change_flags+0x23 ([kernel.kallsyms])
> devinet_ioctl+0x5e3 ([kernel.kallsyms])
> inet_ioctl+0x197 ([kernel.kallsyms])
> sock_do_ioctl+0x4d ([kernel.kallsyms])
> sock_ioctl+0x264 ([kernel.kallsyms])
> __x64_sys_ioctl+0x92 ([kernel.kallsyms])
> do_syscall_64+0x3a ([kernel.kallsyms])
> entry_SYSCALL_64_after_hwframe+0x63 ([kernel.kallsyms])
> __GI___ioctl+0x7 (/lib/x86_64-linux-gnu/libc-2.23.so)
>
> The sequence of deadlock is:
>
> 1. Thread B calls rtnl_lock().
>
> 2. Thread A starts to run and calls rtnl_lock() from within
> ath11k_regd_update_work(), then enters wait state because the lock is owned by
> thread B.
>
> 3. Thread B continues to run and tries to call
> cancel_work_sync(&ar->regd_update_work), but thread A is in
> ath11k_regd_update_work() waiting for rtnl_lock(). So cancel_work_sync()
> forever waits for ath11k_regd_update_work() to finish and we have a deadlock.
>
> Fix this by switching from using regulatory_set_wiphy_regd_sync() to
> regulatory_set_wiphy_regd(). Now cfg80211 will schedule another workqueue which
> handles the locking on it's own. So the ath11k workqueue can simply exit without
> taking any locks, avoiding the deadlock.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3
>
> Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
> [kvalo: improve commit log]
> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>

Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Kalle Valo <kvalo@kernel.org> wrote:

> Running this test in a loop it is easy to reproduce an rtnl deadlock:
>
> [...]
>
> Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
> [kvalo: improve commit log]
> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>

Patch applied to ath-next branch of ath.git, thanks.

d99884ad9e36 wifi: ath11k: avoid deadlock during regulatory update in ath11k_regd_update()
diff --git a/drivers/net/wireless/ath/ath11k/reg.c b/drivers/net/wireless/ath/ath11k/reg.c
index 7ee3ff69dfc8..6fae4e61ede7 100644
--- a/drivers/net/wireless/ath/ath11k/reg.c
+++ b/drivers/net/wireless/ath/ath11k/reg.c
@@ -287,11 +287,7 @@ int ath11k_regd_update(struct ath11k *ar)
 		goto err;
 	}
 
-	rtnl_lock();
-	wiphy_lock(ar->hw->wiphy);
-	ret = regulatory_set_wiphy_regd_sync(ar->hw->wiphy, regd_copy);
-	wiphy_unlock(ar->hw->wiphy);
-	rtnl_unlock();
+	ret = regulatory_set_wiphy_regd(ar->hw->wiphy, regd_copy);
 
 	kfree(regd_copy);
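
The three-step sequence in the commit log can be illustrated with a rough user-space analogy. This is only a sketch and not kernel code: a pthread mutex named fake_rtnl stands in for the rtnl lock, pthread_join() stands in for cancel_work_sync(), and the names fake_regd_update_work() and fake_stop_interface() are hypothetical. To keep the demo from hanging, the worker uses pthread_mutex_timedlock() with a 200 ms timeout where the driver would block indefinitely. It assumes a Linux/glibc-style POSIX threads implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* User-space stand-in for the rtnl mutex; not the kernel API. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

struct fake_work {
	bool takes_rtnl; /* true: pre-patch behaviour, false: patched */
	bool stuck;      /* set if the worker could not get fake_rtnl */
};

/* Plays the role of ath11k_regd_update_work() (thread A). */
static void *fake_regd_update_work(void *arg)
{
	struct fake_work *w = arg;
	struct timespec deadline;

	/* Patched: exit without locking. In the real fix, cfg80211's own
	 * work item takes the locks later; that part is not modelled. */
	if (!w->takes_rtnl)
		return NULL;

	/* Pre-patch: wait for the lock. The 200 ms timeout replaces the
	 * unbounded wait so the demo terminates instead of hanging. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_nsec += 200L * 1000 * 1000;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}
	if (pthread_mutex_timedlock(&fake_rtnl, &deadline) == 0)
		pthread_mutex_unlock(&fake_rtnl);
	else
		w->stuck = true; /* would have blocked forever */
	return NULL;
}

/* Plays the role of the ifconfig path (thread B). Returns true if the
 * worker was stuck behind the lock held by this function. */
static bool fake_stop_interface(bool worker_takes_rtnl)
{
	struct fake_work w = { .takes_rtnl = worker_takes_rtnl, .stuck = false };
	pthread_t worker;
	int rc;

	pthread_mutex_lock(&fake_rtnl);   /* step 1: B takes the lock */
	rc = pthread_create(&worker, NULL, fake_regd_update_work, &w);
	assert(rc == 0);                  /* step 2: A starts running */
	pthread_join(worker, NULL);       /* step 3: like cancel_work_sync() */
	pthread_mutex_unlock(&fake_rtnl);
	return w.stuck;
}
```

Calling fake_stop_interface(true) models the pre-patch worker, which cannot get the lock while the caller holds it and waits for the worker to finish. Calling fake_stop_interface(false) models the patched worker, which returns at once so the join completes. Build with -pthread on toolchains that need it.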