Message ID | 1426061656-17546-1-git-send-email-sven@narfation.org (mailing list archive) |
---|---|
State | RFC |
Headers | show |
On Wednesday 11 March 2015 09:14:15 Sven Eckelmann wrote: > This is a modified version of RFCv2 with an addition from > mid.gmane.org/8006741.C7YlhOg3U7@bentobox This was near but still a miss. The correct link is http://mid.gmane.org/2736671.Y6Lt54SxyK@bentobox Kind regards, Sven -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-03-11 at 09:14 +0100, Sven Eckelmann wrote: > Since fixing all rate control algorithms will be very difficult, > just provide locking for invocations. This protects the internal > data structures the algorithms maintain. So ... we've been running with this and found that it causes a potential deadlock. The sta->lock is held in many places (particularly mesh, but in one case also for delBA) while transmitting frames, so this causes a lock->rate_ctrl_lock dependency. At the same time, that lock is acquired for aggregation state machines, i.e. the exported function calls ieee80211_stop_rx_ba_session(), ieee80211_start_tx_ba_session() and ieee80211_stop_tx_ba_session(). These are often called from the rate control algorithm (minstrel, rtlwifi, iwlwifi, ...) and thus cause a rate_ctrl_lock->lock dependency. Combined, this clearly creates potential for classic ABBA deadlocks. I'm not really sure yet how to fix this. One way would be to avoid all TX under the sta->lock, which is a relatively simple patch for the aggregation case since there's just one place (http://p.sipsolutions.net/f951b8da4e78112a.txt) but mesh is far more complex and has many places that send frames while holding sta->lock. Another way would to avoid all such things would be to just make all the internal mac80211 TX asynchronous by punting to the TX tasklet we have anyway, but that seems overkill. Another option might be to let mesh use a different spinlock, but then we'd have to be careful not to cause a lock->mesh_lock dependency since that would again lead to a lock->mesh_lock->rate_ctrl_lock dependency, which is still buggy since the aggregation code paths cause the other dependency. Yet another option might be to make the three mentioned aggregation functions asynchronous entirely, but that might cause interesting races in the aggregation state machines again, and some drivers even rely on the return value which could obviously then not be given. If anyone has any good ideas ... johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/mac80211/rate.c b/net/mac80211/rate.c index d53355b..de69adf 100644 --- a/net/mac80211/rate.c +++ b/net/mac80211/rate.c @@ -683,7 +683,13 @@ void rate_control_get_rate(struct ieee80211_sub_if_data *sdata, if (sdata->local->hw.flags & IEEE80211_HW_HAS_RATE_CONTROL) return; - ref->ops->get_rate(ref->priv, ista, priv_sta, txrc); + if (ista) { + spin_lock_bh(&sta->rate_ctrl_lock); + ref->ops->get_rate(ref->priv, ista, priv_sta, txrc); + spin_unlock_bh(&sta->rate_ctrl_lock); + } else { + ref->ops->get_rate(ref->priv, NULL, NULL, txrc); + } if (sdata->local->hw.flags & IEEE80211_HW_SUPPORTS_RC_TABLE) return; diff --git a/net/mac80211/rate.h b/net/mac80211/rate.h index 38652f0..25c9be5 100644 --- a/net/mac80211/rate.h +++ b/net/mac80211/rate.h @@ -42,10 +42,12 @@ static inline void rate_control_tx_status(struct ieee80211_local *local, if (!ref || !test_sta_flag(sta, WLAN_STA_RATE_CONTROL)) return; + spin_lock_bh(&sta->rate_ctrl_lock); if (ref->ops->tx_status) ref->ops->tx_status(ref->priv, sband, ista, priv_sta, skb); else ref->ops->tx_status_noskb(ref->priv, sband, ista, priv_sta, info); + spin_unlock_bh(&sta->rate_ctrl_lock); } static inline void @@ -64,7 +66,9 @@ rate_control_tx_status_noskb(struct ieee80211_local *local, if (WARN_ON_ONCE(!ref->ops->tx_status_noskb)) return; + spin_lock_bh(&sta->rate_ctrl_lock); ref->ops->tx_status_noskb(ref->priv, sband, ista, priv_sta, info); + spin_unlock_bh(&sta->rate_ctrl_lock); } static inline void rate_control_rate_init(struct sta_info *sta) @@ -91,8 +95,10 @@ static inline void rate_control_rate_init(struct sta_info *sta) sband = local->hw.wiphy->bands[chanctx_conf->def.chan->band]; + spin_lock_bh(&sta->rate_ctrl_lock); ref->ops->rate_init(ref->priv, sband, &chanctx_conf->def, ista, priv_sta); + spin_unlock_bh(&sta->rate_ctrl_lock); rcu_read_unlock(); set_sta_flag(sta, WLAN_STA_RATE_CONTROL); } @@ -115,18 +121,20 @@ static inline void rate_control_rate_update(struct ieee80211_local *local, return; } + spin_lock_bh(&sta->rate_ctrl_lock); ref->ops->rate_update(ref->priv, sband, &chanctx_conf->def, ista, priv_sta, changed); + spin_unlock_bh(&sta->rate_ctrl_lock); rcu_read_unlock(); } drv_sta_rc_update(local, sta->sdata, &sta->sta, changed); } static inline void *rate_control_alloc_sta(struct rate_control_ref *ref, - struct ieee80211_sta *sta, - gfp_t gfp) + struct sta_info *sta, gfp_t gfp) { - return ref->ops->alloc_sta(ref->priv, sta, gfp); + spin_lock_init(&sta->rate_ctrl_lock); + return ref->ops->alloc_sta(ref->priv, &sta->sta, gfp); } static inline void rate_control_free_sta(struct sta_info *sta) diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index 00ca8dc..8e8f4c2 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -282,7 +282,7 @@ static int sta_prepare_rate_control(struct ieee80211_local *local, sta->rate_ctrl = local->rate_ctrl; sta->rate_ctrl_priv = rate_control_alloc_sta(sta->rate_ctrl, - &sta->sta, gfp); + sta, gfp); if (!sta->rate_ctrl_priv) return -ENOMEM; diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h index 925e68f..a8f037f 100644 --- a/net/mac80211/sta_info.h +++ b/net/mac80211/sta_info.h @@ -368,6 +368,7 @@ struct sta_info { u8 ptk_idx; struct rate_control_ref *rate_ctrl; void *rate_ctrl_priv; + spinlock_t rate_ctrl_lock; spinlock_t lock; struct work_struct drv_deliver_wk;