Message ID | iwlwifi.20211129152938.d5fceeb7e166.I555fef8e67d93fff3d9a304886c4a9f8b322e591@changeid (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Johannes Berg |
Headers | show |
Series | cfg80211/mac80211 patches from our internal tree 2021-11-29 | expand |
Luca Coelho <luca@coelho.fi> writes: > From: Johannes Berg <johannes.berg@intel.com> > > When we call ieee80211_agg_start_txq(), that will in turn call > schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb() > this is done under sta->lock, which leads to certain circular > lock dependencies, as reported by Chris Murphy: > https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com > > In general, ieee80211_agg_start_txq() is usually not called > with sta->lock held, only in this one place. But it's always > called with sta->ampdu_mlme.mtx held, and that's therefore > clearly sufficient. > > Change ieee80211_stop_tx_ba_cb() to also call it without the > sta->lock held, by factoring it out of ieee80211_remove_tid_tx() > (which is only called in this one place). > > This breaks the locking chain and makes it less likely that > we'll have similar locking chain problems in the future. > > Reported-by: Chris Murphy <lists@colorremedies.com> > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Does this need a fixes: tag? -Toke
On Mon, 2021-11-29 at 14:54 +0100, Toke Høiland-Jørgensen wrote: > Luca Coelho <luca@coelho.fi> writes: > > > From: Johannes Berg <johannes.berg@intel.com> > > > > When we call ieee80211_agg_start_txq(), that will in turn call > > schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb() > > this is done under sta->lock, which leads to certain circular > > lock dependencies, as reported by Chris Murphy: > > https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com > > > > In general, ieee80211_agg_start_txq() is usually not called > > with sta->lock held, only in this one place. But it's always > > called with sta->ampdu_mlme.mtx held, and that's therefore > > clearly sufficient. > > > > Change ieee80211_stop_tx_ba_cb() to also call it without the > > sta->lock held, by factoring it out of ieee80211_remove_tid_tx() > > (which is only called in this one place). > > > > This breaks the locking chain and makes it less likely that > > we'll have similar locking chain problems in the future. > > > > Reported-by: Chris Murphy <lists@colorremedies.com> > > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> > > Does this need a fixes: tag? Hi Toke, Neither Johannes nor Chris pointed to any specific patch that this commit is fixing. If you know the exact commit that broke this, I can add the tag and send v2. Thanks! -- Cheers, Luca.
Luca Coelho <luca@coelho.fi> writes: > On Mon, 2021-11-29 at 14:54 +0100, Toke Høiland-Jørgensen wrote: >> Luca Coelho <luca@coelho.fi> writes: >> >> > From: Johannes Berg <johannes.berg@intel.com> >> > >> > When we call ieee80211_agg_start_txq(), that will in turn call >> > schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb() >> > this is done under sta->lock, which leads to certain circular >> > lock dependencies, as reported by Chris Murphy: >> > https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com >> > >> > In general, ieee80211_agg_start_txq() is usually not called >> > with sta->lock held, only in this one place. But it's always >> > called with sta->ampdu_mlme.mtx held, and that's therefore >> > clearly sufficient. >> > >> > Change ieee80211_stop_tx_ba_cb() to also call it without the >> > sta->lock held, by factoring it out of ieee80211_remove_tid_tx() >> > (which is only called in this one place). >> > >> > This breaks the locking chain and makes it less likely that >> > we'll have similar locking chain problems in the future. >> > >> > Reported-by: Chris Murphy <lists@colorremedies.com> >> > Signed-off-by: Johannes Berg <johannes.berg@intel.com> >> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> >> >> Does this need a fixes: tag? > > Hi Toke, > > Neither Johannes nor Chris pointed to any specific patch that this > commit is fixing. If you know the exact commit that broke this, I can > add the tag and send v2. Well, it looks like the code you're changing comes all the way back from: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") Maybe Johannes can comment on whether it's appropriate to include this, or if the code changed too much in the meantime... -Toke
On Tue, 2021-11-30 at 12:32 +0100, Toke Høiland-Jørgensen wrote: > Luca Coelho <luca@coelho.fi> writes: > > > On Mon, 2021-11-29 at 14:54 +0100, Toke Høiland-Jørgensen wrote: > > > Luca Coelho <luca@coelho.fi> writes: > > > > > > > From: Johannes Berg <johannes.berg@intel.com> > > > > > > > > When we call ieee80211_agg_start_txq(), that will in turn call > > > > schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb() > > > > this is done under sta->lock, which leads to certain circular > > > > lock dependencies, as reported by Chris Murphy: > > > > https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com > > > > > > > > In general, ieee80211_agg_start_txq() is usually not called > > > > with sta->lock held, only in this one place. But it's always > > > > called with sta->ampdu_mlme.mtx held, and that's therefore > > > > clearly sufficient. > > > > > > > > Change ieee80211_stop_tx_ba_cb() to also call it without the > > > > sta->lock held, by factoring it out of ieee80211_remove_tid_tx() > > > > (which is only called in this one place). > > > > > > > > This breaks the locking chain and makes it less likely that > > > > we'll have similar locking chain problems in the future. > > > > > > > > Reported-by: Chris Murphy <lists@colorremedies.com> > > > > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> > > > > > > Does this need a fixes: tag? > > > > Hi Toke, > > > > Neither Johannes nor Chris pointed to any specific patch that this > > commit is fixing. If you know the exact commit that broke this, I can > > add the tag and send v2. > > Well, it looks like the code you're changing comes all the way back from: > ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") > > Maybe Johannes can comment on whether it's appropriate to include this, > or if the code changed too much in the meantime... > I think that probably makes sense, and I guess I can include that tag when I apply it. I suspect the reason I didn't do it internally (we have a different tag though, so that's no excuse) is that there we didn't care until iwlwifi actually gained TXQ support. johannes
Johannes Berg <johannes@sipsolutions.net> writes: > On Tue, 2021-11-30 at 12:32 +0100, Toke Høiland-Jørgensen wrote: >> Luca Coelho <luca@coelho.fi> writes: >> >> > On Mon, 2021-11-29 at 14:54 +0100, Toke Høiland-Jørgensen wrote: >> > > Luca Coelho <luca@coelho.fi> writes: >> > > >> > > > From: Johannes Berg <johannes.berg@intel.com> >> > > > >> > > > When we call ieee80211_agg_start_txq(), that will in turn call >> > > > schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb() >> > > > this is done under sta->lock, which leads to certain circular >> > > > lock dependencies, as reported by Chris Murphy: >> > > > https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com >> > > > >> > > > In general, ieee80211_agg_start_txq() is usually not called >> > > > with sta->lock held, only in this one place. But it's always >> > > > called with sta->ampdu_mlme.mtx held, and that's therefore >> > > > clearly sufficient. >> > > > >> > > > Change ieee80211_stop_tx_ba_cb() to also call it without the >> > > > sta->lock held, by factoring it out of ieee80211_remove_tid_tx() >> > > > (which is only called in this one place). >> > > > >> > > > This breaks the locking chain and makes it less likely that >> > > > we'll have similar locking chain problems in the future. >> > > > >> > > > Reported-by: Chris Murphy <lists@colorremedies.com> >> > > > Signed-off-by: Johannes Berg <johannes.berg@intel.com> >> > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> >> > > >> > > Does this need a fixes: tag? >> > >> > Hi Toke, >> > >> > Neither Johannes nor Chris pointed to any specific patch that this >> > commit is fixing. If you know the exact commit that broke this, I can >> > add the tag and send v2. >> >> Well, it looks like the code you're changing comes all the way back from: >> ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") >> >> Maybe Johannes can comment on whether it's appropriate to include this, >> or if the code changed too much in the meantime... >> > > I think that probably makes sense, and I guess I can include that tag > when I apply it. > > I suspect the reason I didn't do it internally (we have a different tag > though, so that's no excuse) is that there we didn't care until iwlwifi > actually gained TXQ support. Right, makes sense - thanks! :) -Toke
On Tue, 2021-11-30 at 12:52 +0100, Johannes Berg wrote: > On Tue, 2021-11-30 at 12:32 +0100, Toke Høiland-Jørgensen wrote: > > Luca Coelho <luca@coelho.fi> writes: > > > > > On Mon, 2021-11-29 at 14:54 +0100, Toke Høiland-Jørgensen wrote: > > > > Luca Coelho <luca@coelho.fi> writes: > > > > > > > > > From: Johannes Berg <johannes.berg@intel.com> > > > > > > > > > > When we call ieee80211_agg_start_txq(), that will in turn call > > > > > schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb() > > > > > this is done under sta->lock, which leads to certain circular > > > > > lock dependencies, as reported by Chris Murphy: > > > > > https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com > > > > > > > > > > In general, ieee80211_agg_start_txq() is usually not called > > > > > with sta->lock held, only in this one place. But it's always > > > > > called with sta->ampdu_mlme.mtx held, and that's therefore > > > > > clearly sufficient. > > > > > > > > > > Change ieee80211_stop_tx_ba_cb() to also call it without the > > > > > sta->lock held, by factoring it out of ieee80211_remove_tid_tx() > > > > > (which is only called in this one place). > > > > > > > > > > This breaks the locking chain and makes it less likely that > > > > > we'll have similar locking chain problems in the future. > > > > > > > > > > Reported-by: Chris Murphy <lists@colorremedies.com> > > > > > Signed-off-by: Johannes Berg <johannes.berg@intel.com> > > > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com> > > > > > > > > Does this need a fixes: tag? > > > > > > Hi Toke, > > > > > > Neither Johannes nor Chris pointed to any specific patch that this > > > commit is fixing. If you know the exact commit that broke this, I can > > > add the tag and send v2. > > > > Well, it looks like the code you're changing comes all the way back from: > > ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") > > > > Maybe Johannes can comment on whether it's appropriate to include this, > > or if the code changed too much in the meantime... > > > > I think that probably makes sense, and I guess I can include that tag > when I apply it. > > I suspect the reason I didn't do it internally (we have a different tag > though, so that's no excuse) is that there we didn't care until iwlwifi > actually gained TXQ support. Thanks, guys! Johannes, do you want me to send v2 or do you want to add the tag yourself? There is one more patch that is broken (ugh, sorry) that I need to fix anyway... -- Cheers, Luca.
On Tue, 2021-11-30 at 13:57 +0200, Luca Coelho wrote: > > > > > > Johannes, do you want me to send v2 or do you want to add the tag > yourself? There is one more patch that is broken (ugh, sorry) that I > need to fix anyway... > Well if you resend I don't have to remember, so that wouldn't hurt either ;-) johannes
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c index 430a58587538..4dd56daed89b 100644 --- a/net/mac80211/agg-tx.c +++ b/net/mac80211/agg-tx.c @@ -9,7 +9,7 @@ * Copyright 2007, Michael Wu <flamingice@sourmilk.net> * Copyright 2007-2010, Intel Corporation * Copyright(c) 2015-2017 Intel Deutschland GmbH - * Copyright (C) 2018 - 2020 Intel Corporation + * Copyright (C) 2018 - 2021 Intel Corporation */ #include <linux/ieee80211.h> @@ -213,6 +213,8 @@ ieee80211_agg_start_txq(struct sta_info *sta, int tid, bool enable) struct ieee80211_txq *txq = sta->sta.txq[tid]; struct txq_info *txqi; + lockdep_assert_held(&sta->ampdu_mlme.mtx); + if (!txq) return; @@ -290,7 +292,6 @@ static void ieee80211_remove_tid_tx(struct sta_info *sta, int tid) ieee80211_assign_tid_tx(sta, tid, NULL); ieee80211_agg_splice_finish(sta->sdata, tid); - ieee80211_agg_start_txq(sta, tid, false); kfree_rcu(tid_tx, rcu_head); } @@ -889,6 +890,7 @@ void ieee80211_stop_tx_ba_cb(struct sta_info *sta, int tid, { struct ieee80211_sub_if_data *sdata = sta->sdata; bool send_delba = false; + bool start_txq = false; ht_dbg(sdata, "Stopping Tx BA session for %pM tid %d\n", sta->sta.addr, tid); @@ -906,10 +908,14 @@ void ieee80211_stop_tx_ba_cb(struct sta_info *sta, int tid, send_delba = true; ieee80211_remove_tid_tx(sta, tid); + start_txq = true; unlock_sta: spin_unlock_bh(&sta->lock); + if (start_txq) + ieee80211_agg_start_txq(sta, tid, false); + if (send_delba) ieee80211_send_delba(sdata, sta->sta.addr, tid, WLAN_BACK_INITIATOR, WLAN_REASON_QSTA_NOT_USE);