diff mbox

[1/2] mac80211: fix txq queue related crashes

Message ID 1453382588-27105-1-git-send-email-michal.kazior@tieto.com (mailing list archive)
State Accepted
Delegated to: Johannes Berg
Headers show

Commit Message

Michal Kazior Jan. 21, 2016, 1:23 p.m. UTC
The driver can access the queue simultanously
while mac80211 tears down the interface. Without
spinlock protection this could lead to corrupting
sk_buff_head and subsequently to an invalid
pointer dereference.

Fixes: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation")
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 net/mac80211/iface.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Ben Greear Jan. 25, 2016, 5:59 p.m. UTC | #1
On 01/21/2016 05:23 AM, Michal Kazior wrote:
> The driver can access the queue simultanously
> while mac80211 tears down the interface. Without
> spinlock protection this could lead to corrupting
> sk_buff_head and subsequently to an invalid
> pointer dereference.

Hard to know for certain, but this *appears* to fix the unexpectedly large
amount of CE/AXI ath10k firmware crashes that we saw in the 4.2 kernel (4.0 previously
ran much better han 4.2 for us).

We'll continue testing, in case we are just getting lucky so far.

Thanks,
Ben

>
> Fixes: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation")
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> ---
>   net/mac80211/iface.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
> index 33ae3c81bfc5..0451f120746e 100644
> --- a/net/mac80211/iface.c
> +++ b/net/mac80211/iface.c
> @@ -977,7 +977,10 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
>   	if (sdata->vif.txq) {
>   		struct txq_info *txqi = to_txq_info(sdata->vif.txq);
>
> +		spin_lock_bh(&txqi->queue.lock);
>   		ieee80211_purge_tx_queue(&local->hw, &txqi->queue);
> +		spin_unlock_bh(&txqi->queue.lock);
> +
>   		atomic_set(&sdata->txqs_len[txqi->txq.ac], 0);
>   	}
>
>
Michal Kazior Jan. 26, 2016, 6:35 a.m. UTC | #2
On 25 January 2016 at 18:59, Ben Greear <greearb@candelatech.com> wrote:
> On 01/21/2016 05:23 AM, Michal Kazior wrote:
>>
>> The driver can access the queue simultanously
>> while mac80211 tears down the interface. Without
>> spinlock protection this could lead to corrupting
>> sk_buff_head and subsequently to an invalid
>> pointer dereference.
>
> Hard to know for certain, but this *appears* to fix the unexpectedly large
> amount of CE/AXI ath10k firmware crashes that we saw in the 4.2 kernel (4.0
> previously
> ran much better han 4.2 for us).

That's impossible.

Without wake_tx_queue() txqs aren't even allocated (sdata->vif.txq is NULL).


Micha?
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Jan. 26, 2016, 1:11 p.m. UTC | #3
On Thu, 2016-01-21 at 14:23 +0100, Michal Kazior wrote:
> The driver can access the queue simultanously
> while mac80211 tears down the interface. Without
> spinlock protection this could lead to corrupting
> sk_buff_head and subsequently to an invalid
> pointer dereference.
> 
Applied.

johannes
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear Jan. 26, 2016, 3:29 p.m. UTC | #4
On 01/25/2016 10:35 PM, Michal Kazior wrote:
> On 25 January 2016 at 18:59, Ben Greear <greearb@candelatech.com> wrote:
>> On 01/21/2016 05:23 AM, Michal Kazior wrote:
>>>
>>> The driver can access the queue simultanously
>>> while mac80211 tears down the interface. Without
>>> spinlock protection this could lead to corrupting
>>> sk_buff_head and subsequently to an invalid
>>> pointer dereference.
>>
>> Hard to know for certain, but this *appears* to fix the unexpectedly large
>> amount of CE/AXI ath10k firmware crashes that we saw in the 4.2 kernel (4.0
>> previously
>> ran much better han 4.2 for us).
>
> That's impossible.
>
> Without wake_tx_queue() txqs aren't even allocated (sdata->vif.txq is NULL).

You are right.  But while testing, one of my guys did find a way to reproduce the
crash very quickly in 4.2.  Happens fastest when I use the HTT-MGT variant
of my firmware, but same firmware works good-ish in 4.0.  Seems I have something
to bisect now if I can get a minimal patch to apply each time to enable my
htt-mgt firmware feature...

The latest test case is to just to change the channel of the AP while station
is connected.  Station sends some null-funcs, firmware resets it's low-level
stuff a bunch because it doesn't get AKCs, then CE/AXI crashes.  Could be
my firmware or kernel modifications of course, though similar crash scenarios have been seen forever
in all sorts of firmwares and kernels.

Thanks,
Ben
diff mbox

Patch

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 33ae3c81bfc5..0451f120746e 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -977,7 +977,10 @@  static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
 	if (sdata->vif.txq) {
 		struct txq_info *txqi = to_txq_info(sdata->vif.txq);
 
+		spin_lock_bh(&txqi->queue.lock);
 		ieee80211_purge_tx_queue(&local->hw, &txqi->queue);
+		spin_unlock_bh(&txqi->queue.lock);
+
 		atomic_set(&sdata->txqs_len[txqi->txq.ac], 0);
 	}