
mac80211: fix pending queue hang due to TX_DROP

Message ID: 20180905102259.16089-1-me@bobcopeland.com (mailing list archive)
State: Accepted
Delegated to: Johannes Berg
Series: mac80211: fix pending queue hang due to TX_DROP

Commit Message

Bob Copeland Sept. 5, 2018, 10:22 a.m. UTC
In our environment running lots of mesh nodes, we are seeing the
pending queue hang periodically, with the debugfs queues file showing
lines such as:

    00: 0x00000000/348

i.e. there are a large number of frames but no stop reason set.

One way this could happen is if queue processing from the pending
tasklet exited early without processing all frames, and without having
some future event (incoming frame, stop reason flag, ...) to reschedule
it.

Exactly this can occur today if ieee80211_tx() returns false due to
packet drops or power-save buffering in the tx handlers.  In the
past, this function would return true in such cases, and the change
to false doesn't seem to be intentional.  Fix this case by reverting
to the previous behavior.

Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
Signed-off-by: Bob Copeland <bobcopeland@fb.com>
---
 net/mac80211/tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
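
The early-exit scenario described in the commit message can be illustrated
with a small toy model. This is only a sketch, not the kernel code: struct
frame, model_tx() and drain_pending() are hypothetical names made up for
illustration. It assumes, as the commit message describes, that the pending
tasklet stops working through its queue as soon as the tx function returns
false, and that a frame consumed by the tx handlers (dropped or buffered for
power save) is gone either way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one frame sitting on the pending queue. */
struct frame {
	/* Models the tx handlers consuming the frame (drop or PS buffering). */
	bool consumed_by_handler;
};

/*
 * Toy model of the tx function's return value. consumed_result is what
 * is returned when the handlers consume the frame: false models the
 * behaviour before the patch, true models the behaviour after it.
 */
static bool model_tx(const struct frame *f, bool consumed_result)
{
	if (f->consumed_by_handler)
		return consumed_result;

	return true;	/* frame handed on for transmission */
}

/*
 * Toy model of the pending-queue loop. Returns how many frames are
 * still pending when the loop exits.
 */
static size_t drain_pending(const struct frame *q, size_t n,
			    bool consumed_result)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!model_tx(&q[i], consumed_result)) {
			/* The frame itself is gone; the rest is left behind. */
			i++;
			break;
		}
	}

	return n - i;
}
```

In this model, a queue of five frames whose second frame is consumed by a
handler leaves three frames stranded when consumed_result is false, and none
when it is true. Nothing in the model reschedules the loop, which mirrors the
"frames pending but no stop reason" symptom in the debugfs output.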

Comments

Toke Høiland-Jørgensen Sept. 5, 2018, 10:52 a.m. UTC | #1
Bob Copeland <me@bobcopeland.com> writes:

> In our environment running lots of mesh nodes, we are seeing the
> pending queue hang periodically, with the debugfs queues file showing
> lines such as:
>
>     00: 0x00000000/348
>
> i.e. there are a large number of frames but no stop reason set.
>
> One way this could happen is if queue processing from the pending
> tasklet exited early without processing all frames, and without having
> some future event (incoming frame, stop reason flag, ...) to reschedule
> it.
>
> Exactly this can occur today if ieee80211_tx() returns false due to
> packet drops or power-save buffering in the tx handlers.  In the
> past, this function would return true in such cases, and the change
> to false doesn't seem to be intentional.

Can confirm that this was not intentional; nice catch! :)

Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>

-Toke

Patch

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index e88547842239..6b83dc397c3e 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1907,7 +1907,7 @@  static bool ieee80211_tx(struct ieee80211_sub_if_data *sdata,
 			sdata->vif.hw_queue[skb_get_queue_mapping(skb)];
 
 	if (invoke_tx_handlers_early(&tx))
-		return false;
+		return true;
 
 	if (ieee80211_queue_skb(local, sdata, tx.sta, tx.skb))
 		return true;