Message ID | 20241113-mlo_dfs_fix-v1-1-e4326736347b@quicinc.com (mailing list archive) |
---|---|
State | Under Review |
Delegated to: | Johannes Berg |
Headers | show |
Series | wifi: cfg80211: fix WARN_ON during CAC cancelling | expand |
> > diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c > index a5eb92d93074e6ce1e08fcc2790b80cf04ff08f8..2a932a036225a3e0587cf5c18a4e80e91552313b 100644 > --- a/net/wireless/mlme.c > +++ b/net/wireless/mlme.c > @@ -1112,10 +1112,6 @@ void cfg80211_cac_event(struct net_device *netdev, > struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); > unsigned long timeout; > > - if (WARN_ON(wdev->valid_links && > - !(wdev->valid_links & BIT(link_id)))) > - return; > - > trace_cfg80211_cac_event(netdev, event, link_id); > > if (WARN_ON(!wdev->links[link_id].cac_started && > This really doesn't seem right. Perhaps the order in teardown should be changed? johannes
On 11/13/24 14:59, Johannes Berg wrote: >> >> diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c >> index a5eb92d93074e6ce1e08fcc2790b80cf04ff08f8..2a932a036225a3e0587cf5c18a4e80e91552313b 100644 >> --- a/net/wireless/mlme.c >> +++ b/net/wireless/mlme.c >> @@ -1112,10 +1112,6 @@ void cfg80211_cac_event(struct net_device *netdev, >> struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); >> unsigned long timeout; >> >> - if (WARN_ON(wdev->valid_links && >> - !(wdev->valid_links & BIT(link_id)))) >> - return; >> - >> trace_cfg80211_cac_event(netdev, event, link_id); >> >> if (WARN_ON(!wdev->links[link_id].cac_started && >> > > This really doesn't seem right. > > Perhaps the order in teardown should be changed? I thought about it but couldn't really come down to a convincing approach. The thing is when CAC in ongoing and hostapd process is killed, there is no specific event apart from link delete which hostapd sends. Will it be okay to add a new NL command to stop radar detection? Something opposite of what start_radar_detection command does?
On Wed, 2024-11-13 at 20:13 +0530, Aditya Kumar Singh wrote: > On 11/13/24 14:59, Johannes Berg wrote: > > > > > > diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c > > > index a5eb92d93074e6ce1e08fcc2790b80cf04ff08f8..2a932a036225a3e0587cf5c18a4e80e91552313b 100644 > > > --- a/net/wireless/mlme.c > > > +++ b/net/wireless/mlme.c > > > @@ -1112,10 +1112,6 @@ void cfg80211_cac_event(struct net_device *netdev, > > > struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); > > > unsigned long timeout; > > > > > > - if (WARN_ON(wdev->valid_links && > > > - !(wdev->valid_links & BIT(link_id)))) > > > - return; > > > - > > > trace_cfg80211_cac_event(netdev, event, link_id); > > > > > > if (WARN_ON(!wdev->links[link_id].cac_started && > > > > > > > This really doesn't seem right. > > > > Perhaps the order in teardown should be changed? > > I thought about it but couldn't really come down to a convincing approach. > > The thing is when CAC in ongoing and hostapd process is killed, there is > no specific event apart from link delete which hostapd sends. > so we do have link removal, why doesn't that work? > Will it be > okay to add a new NL command to stop radar detection? Something opposite > of what start_radar_detection command does? > No, obviously not. johannes
On 11/13/24 21:18, Johannes Berg wrote: > On Wed, 2024-11-13 at 20:13 +0530, Aditya Kumar Singh wrote: >> On 11/13/24 14:59, Johannes Berg wrote: >>>> >>>> diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c >>>> index a5eb92d93074e6ce1e08fcc2790b80cf04ff08f8..2a932a036225a3e0587cf5c18a4e80e91552313b 100644 >>>> --- a/net/wireless/mlme.c >>>> +++ b/net/wireless/mlme.c >>>> @@ -1112,10 +1112,6 @@ void cfg80211_cac_event(struct net_device *netdev, >>>> struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); >>>> unsigned long timeout; >>>> >>>> - if (WARN_ON(wdev->valid_links && >>>> - !(wdev->valid_links & BIT(link_id)))) >>>> - return; >>>> - >>>> trace_cfg80211_cac_event(netdev, event, link_id); >>>> >>>> if (WARN_ON(!wdev->links[link_id].cac_started && >>>> >>> >>> This really doesn't seem right. >>> >>> Perhaps the order in teardown should be changed? >> >> I thought about it but couldn't really come down to a convincing approach. >> >> The thing is when CAC in ongoing and hostapd process is killed, there is >> no specific event apart from link delete which hostapd sends. >> > > so we do have link removal, why doesn't that work? Because link ID is cleared from the bitmap well before link stop is called. As mentioned in commit message, this is the flow - nl80211_remove_link > cfg80211_remove_link -> link ID gets updated here > ieee80211_del_intf_link > ieee80211_vif_set_links > ieee80211_vif_update_links > ieee80211_link_stop -> this ultimately tries to stop CAC if it is ongoing. > >> Will it be >> okay to add a new NL command to stop radar detection? Something opposite >> of what start_radar_detection command does? >> > > No, obviously not. > > johannes
On Wed, 2024-11-13 at 21:50 +0530, Aditya Kumar Singh wrote: > > Because link ID is cleared from the bitmap well before link stop is > called. As mentioned in commit message, this is the flow - > > nl80211_remove_link > > cfg80211_remove_link -> link ID gets updated here > > ieee80211_del_intf_link > > ieee80211_vif_set_links > > ieee80211_vif_update_links > > ieee80211_link_stop -> this ultimately tries to stop > CAC if it is ongoing. > OK, but why does it have to be that way? It's all under wiphy mutex, so perhaps we could just clear it later? There's necessarily going to be some temporary inconsistency here, I'm not sure it matters too much if it isn't very visible? Alternatively, could do something like wdev->valid_links &= ~BIT(link_id); wdev->removing_link = link_id; ... wdev->removing_link = -1; and accept the wdev->removing_link in these APIs like CAC? johannes
On 11/15/24 13:44, Johannes Berg wrote: > On Wed, 2024-11-13 at 21:50 +0530, Aditya Kumar Singh wrote: >> >> Because link ID is cleared from the bitmap well before link stop is >> called. As mentioned in commit message, this is the flow - >> >> nl80211_remove_link >> > cfg80211_remove_link -> link ID gets updated here >> > ieee80211_del_intf_link >> > ieee80211_vif_set_links >> > ieee80211_vif_update_links >> > ieee80211_link_stop -> this ultimately tries to stop >> CAC if it is ongoing. >> > > OK, but why does it have to be that way? It's all under wiphy mutex, so > perhaps we could just clear it later? > Yeah. I tried below diff, hwsim test cases shows no regression. --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -5046,10 +5046,11 @@ static void ieee80211_del_intf_link(struct wiphy *wiphy, unsigned int link_id) { struct ieee80211_sub_if_data *sdata = IEEE80211_WDEV_TO_SUB_IF(wdev); + u16 new_links = wdev->valid_links & ~BIT(link_id); lockdep_assert_wiphy(sdata->local->hw.wiphy); - ieee80211_vif_set_links(sdata, wdev->valid_links, 0); + ieee80211_vif_set_links(sdata, new_links, 0); } static int diff --git a/net/wireless/util.c b/net/wireless/util.c index 040d62051eb9..65c8e47246b7 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -2843,10 +2843,9 @@ void cfg80211_remove_link(struct wireless_dev *wdev, unsigned int link_id) break; } - wdev->valid_links &= ~BIT(link_id); - rdev_del_intf_link(rdev, wdev, link_id); + wdev->valid_links &= ~BIT(link_id); eth_zero_addr(wdev->links[link_id].addr); } So I will submit this as patch then? > There's necessarily going to be some temporary inconsistency here, I'm > not sure it matters too much if it isn't very visible? > Any particular case you suspect and want me to test? > Alternatively, could do something like > > wdev->valid_links &= ~BIT(link_id); > wdev->removing_link = link_id; > ... > wdev->removing_link = -1; > > and accept the wdev->removing_link in these APIs like CAC? Umm.. will work for CAC stopping from user space. But if radar is detected, in this flow (driver -> mac80211 -> cfg80211), valid_link bitmap is still valid and hence valid_bitmap check works and there will be no removing_links set. So then may be both needs to be checked? Like either the link_id should be present in valid_link or it should be the removing_link? if (WARN_ON(wdev->removing_link != link_id && wdev->valid_links && !(wdev->valid_links & BIT(link_id)))) return; I'm more inclining towards the first suggestion you gave - clearing the bitmap later. What's your suggestion?
On Tue, 2024-11-19 at 11:23 +0530, Aditya Kumar Singh wrote: > > --- a/net/mac80211/cfg.c > +++ b/net/mac80211/cfg.c > @@ -5046,10 +5046,11 @@ static void ieee80211_del_intf_link(struct wiphy > *wiphy, > unsigned int link_id) > { > struct ieee80211_sub_if_data *sdata = > IEEE80211_WDEV_TO_SUB_IF(wdev); > + u16 new_links = wdev->valid_links & ~BIT(link_id); > > lockdep_assert_wiphy(sdata->local->hw.wiphy); > > - ieee80211_vif_set_links(sdata, wdev->valid_links, 0); > + ieee80211_vif_set_links(sdata, new_links, 0); > } Needing this part is sort of clear then, but maybe not entirely obvious? Should probably be accompanied by some documentation updates. > So I will submit this as patch then? Seems reasonable. > > There's necessarily going to be some temporary inconsistency here, I'm > > not sure it matters too much if it isn't very visible? > > > > Any particular case you suspect and want me to test? No, not really. > I'm more inclining towards the first suggestion you gave - clearing the > bitmap later. What's your suggestion? > Sure. johannes
diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c index a5eb92d93074e6ce1e08fcc2790b80cf04ff08f8..2a932a036225a3e0587cf5c18a4e80e91552313b 100644 --- a/net/wireless/mlme.c +++ b/net/wireless/mlme.c @@ -1112,10 +1112,6 @@ void cfg80211_cac_event(struct net_device *netdev, struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); unsigned long timeout; - if (WARN_ON(wdev->valid_links && - !(wdev->valid_links & BIT(link_id)))) - return; - trace_cfg80211_cac_event(netdev, event, link_id); if (WARN_ON(!wdev->links[link_id].cac_started &&