diff mbox

[4/4] cfg80211: prevent race condition on scan request cleanup

Message ID 1386235289-27278-4-git-send-email-eliad@wizery.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Eliad Peller Dec. 5, 2013, 9:21 a.m. UTC
___cfg80211_scan_done() can be called in some cases
(e.g. on NETDEV_DOWN) before the low level driver
notified scan completion (which is indicated by
passing leak=true).

Clearing rdev->scan_req in this case is buggy, as
scan_done_wk might have already being queued/running
(and can't be flushed as it takes rtnl()).

If a new scan will be requested at this stage, the
scan_done_wk will try freeing it (instead of the
previous scan), and this will later result in
a use after free.

Solve it by freeing scan_req (and clearing it) only
when leak=false. Otherwise, instead of freeing it
mark the request as pending_cleanup, and free it
later on (by the work).

An example backtrace after such crash:
Unable to handle kernel paging request at virtual address fffffee5
pgd = c0004000
[fffffee5] *pgd=9fdf6821, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] SMP ARM
PC is at cfg80211_scan_done+0x28/0xc4 [cfg80211]
LR is at __ieee80211_scan_completed+0xe4/0x2dc [mac80211]
[<bf0077b0>] (cfg80211_scan_done+0x28/0xc4 [cfg80211])
[<bf0973d4>] (__ieee80211_scan_completed+0xe4/0x2dc [mac80211])
[<bf0982cc>] (ieee80211_scan_work+0x94/0x4f0 [mac80211])
[<c005fd10>] (process_one_work+0x1b0/0x4a8)
[<c0060404>] (worker_thread+0x138/0x37c)
[<c0066d70>] (kthread+0xa4/0xb0)

Signed-off-by: Eliad Peller <eliad@wizery.com>
---
i encountered this one while adding some intentional delays in order
to reproduce a different issue. this is probably not a real-world scenario.

 include/net/cfg80211.h |  2 +-
 net/wireless/scan.c    | 13 ++++++++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

Comments

Johannes Berg Dec. 5, 2013, 2:31 p.m. UTC | #1
On Thu, 2013-12-05 at 11:21 +0200, Eliad Peller wrote:

> @@ -219,8 +221,13 @@ void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
>  	 * the scan request or not ... if it accesses the dev
>  	 * in there (it shouldn't anyway) then it may crash.
>  	 */
> -	if (!leak)
> -		kfree(request);
> +	if (leak) {
> +		request->pending_cleanup = true;
> +		return;

This seems insufficient, if the driver never indicates completion, we'd
never clear rdev->scan_req?

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eliad Peller Dec. 5, 2013, 2:36 p.m. UTC | #2
On Thu, Dec 5, 2013 at 4:31 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2013-12-05 at 11:21 +0200, Eliad Peller wrote:
>
>> @@ -219,8 +221,13 @@ void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
>>        * the scan request or not ... if it accesses the dev
>>        * in there (it shouldn't anyway) then it may crash.
>>        */
>> -     if (!leak)
>> -             kfree(request);
>> +     if (leak) {
>> +             request->pending_cleanup = true;
>> +             return;
>
> This seems insufficient, if the driver never indicates completion, we'd
> never clear rdev->scan_req?
>
right, but i think it somehow makes sense (i.e. the driver must
indicate completion...)?

Eliad.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 5, 2013, 2:43 p.m. UTC | #3
On Thu, 2013-12-05 at 16:36 +0200, Eliad Peller wrote:
> On Thu, Dec 5, 2013 at 4:31 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Thu, 2013-12-05 at 11:21 +0200, Eliad Peller wrote:
> >
> >> @@ -219,8 +221,13 @@ void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
> >>        * the scan request or not ... if it accesses the dev
> >>        * in there (it shouldn't anyway) then it may crash.
> >>        */
> >> -     if (!leak)
> >> -             kfree(request);
> >> +     if (leak) {
> >> +             request->pending_cleanup = true;
> >> +             return;
> >
> > This seems insufficient, if the driver never indicates completion, we'd
> > never clear rdev->scan_req?
> >
> right, but i think it somehow makes sense (i.e. the driver must
> indicate completion...)?

But the whole thing was intended to catch buggy drivers :)

Btw, should any of this go to 3.13?

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eliad Peller Dec. 5, 2013, 3:39 p.m. UTC | #4
On Thu, Dec 5, 2013 at 4:43 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2013-12-05 at 16:36 +0200, Eliad Peller wrote:
>> On Thu, Dec 5, 2013 at 4:31 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
>> > On Thu, 2013-12-05 at 11:21 +0200, Eliad Peller wrote:
>> >
>> >> @@ -219,8 +221,13 @@ void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
>> >>        * the scan request or not ... if it accesses the dev
>> >>        * in there (it shouldn't anyway) then it may crash.
>> >>        */
>> >> -     if (!leak)
>> >> -             kfree(request);
>> >> +     if (leak) {
>> >> +             request->pending_cleanup = true;
>> >> +             return;
>> >
>> > This seems insufficient, if the driver never indicates completion, we'd
>> > never clear rdev->scan_req?
>> >
>> right, but i think it somehow makes sense (i.e. the driver must
>> indicate completion...)?
>
> But the whole thing was intended to catch buggy drivers :)
>
yeah, you have a point here :)
anyway, i guess it's either leaking scan_req and hoping the driver
really forgot about it, or keeping it and hoping the driver will
finally indicate completion.

since i don't think this is a real-world scenario, i'm ok with
dropping this patch.

> Btw, should any of this go to 3.13?
maybe the first one. it's the only "real" issue.

Eliad.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 5, 2013, 3:45 p.m. UTC | #5
On Thu, 2013-12-05 at 17:39 +0200, Eliad Peller wrote:
> On Thu, Dec 5, 2013 at 4:43 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Thu, 2013-12-05 at 16:36 +0200, Eliad Peller wrote:
> >> On Thu, Dec 5, 2013 at 4:31 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> >> > On Thu, 2013-12-05 at 11:21 +0200, Eliad Peller wrote:
> >> >
> >> >> @@ -219,8 +221,13 @@ void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
> >> >>        * the scan request or not ... if it accesses the dev
> >> >>        * in there (it shouldn't anyway) then it may crash.
> >> >>        */
> >> >> -     if (!leak)
> >> >> -             kfree(request);
> >> >> +     if (leak) {
> >> >> +             request->pending_cleanup = true;
> >> >> +             return;
> >> >
> >> > This seems insufficient, if the driver never indicates completion, we'd
> >> > never clear rdev->scan_req?
> >> >
> >> right, but i think it somehow makes sense (i.e. the driver must
> >> indicate completion...)?
> >
> > But the whole thing was intended to catch buggy drivers :)
> >
> yeah, you have a point here :)
> anyway, i guess it's either leaking scan_req and hoping the driver
> really forgot about it, or keeping it and hoping the driver will
> finally indicate completion.
> 
> since i don't think this is a real-world scenario, i'm ok with
> dropping this patch.

Well, it can be made to crash, so ...

Can we maybe avoid the crash in a different way? Disallow a new scan
somehow?

> > Btw, should any of this go to 3.13?
> maybe the first one. it's the only "real" issue.

Thanks.

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arik Nemtsov Dec. 5, 2013, 3:52 p.m. UTC | #6
On Thu, Dec 5, 2013 at 5:45 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2013-12-05 at 17:39 +0200, Eliad Peller wrote:
>> On Thu, Dec 5, 2013 at 4:43 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
>> > On Thu, 2013-12-05 at 16:36 +0200, Eliad Peller wrote:
>> >> On Thu, Dec 5, 2013 at 4:31 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
>> >> > On Thu, 2013-12-05 at 11:21 +0200, Eliad Peller wrote:
>> >> >
>> >> >> @@ -219,8 +221,13 @@ void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
>> >> >>        * the scan request or not ... if it accesses the dev
>> >> >>        * in there (it shouldn't anyway) then it may crash.
>> >> >>        */
>> >> >> -     if (!leak)
>> >> >> -             kfree(request);
>> >> >> +     if (leak) {
>> >> >> +             request->pending_cleanup = true;
>> >> >> +             return;
>> >> >
>> >> > This seems insufficient, if the driver never indicates completion, we'd
>> >> > never clear rdev->scan_req?
>> >> >
>> >> right, but i think it somehow makes sense (i.e. the driver must
>> >> indicate completion...)?
>> >
>> > But the whole thing was intended to catch buggy drivers :)
>> >
>> yeah, you have a point here :)
>> anyway, i guess it's either leaking scan_req and hoping the driver
>> really forgot about it, or keeping it and hoping the driver will
>> finally indicate completion.
>>
>> since i don't think this is a real-world scenario, i'm ok with
>> dropping this patch.
>
> Well, it can be made to crash, so ...
>
> Can we maybe avoid the crash in a different way? Disallow a new scan
> somehow?

Maybe we should drop the whole netdev-notified doing ___cfg80211_scan_done?
I mean if a workaround for buggy drivers is causing bugs for
legitimate drivers..

Something simple for buggy drivers would be doing this in the notifier
- BUG_ON(!rdev->scan_req->notified)

Just my 2c.

Arik
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 5, 2013, 3:53 p.m. UTC | #7
On Thu, 2013-12-05 at 17:52 +0200, Arik Nemtsov wrote:

> >> > But the whole thing was intended to catch buggy drivers :)
> >> >
> >> yeah, you have a point here :)
> >> anyway, i guess it's either leaking scan_req and hoping the driver
> >> really forgot about it, or keeping it and hoping the driver will
> >> finally indicate completion.
> >>
> >> since i don't think this is a real-world scenario, i'm ok with
> >> dropping this patch.
> >
> > Well, it can be made to crash, so ...
> >
> > Can we maybe avoid the crash in a different way? Disallow a new scan
> > somehow?
> 
> Maybe we should drop the whole netdev-notified doing ___cfg80211_scan_done?
> I mean if a workaround for buggy drivers is causing bugs for
> legitimate drivers..
> 
> Something simple for buggy drivers would be doing this in the notifier
> - BUG_ON(!rdev->scan_req->notified)

BUG_ON is probably a bit heavy-handed, but yeah, I suppose we can drop
this. We used to have more bugs with drivers and even mac80211, but that
should be a thing of the past.

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 5, 2013, 4:14 p.m. UTC | #8
On Thu, 2013-12-05 at 17:39 +0200, Eliad Peller wrote:

> > Btw, should any of this go to 3.13?
> maybe the first one. it's the only "real" issue.

Actually, I'm not going to take that into 3.13 - we defined the API so
that the driver is always allowed to drop the scheduled scan, this isn't
really any different, is it?

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eliad Peller Dec. 5, 2013, 4:32 p.m. UTC | #9
On Thu, Dec 5, 2013 at 6:14 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Thu, 2013-12-05 at 17:39 +0200, Eliad Peller wrote:
>
>> > Btw, should any of this go to 3.13?
>> maybe the first one. it's the only "real" issue.
>
> Actually, I'm not going to take that into 3.13 - we defined the API so
> that the driver is always allowed to drop the scheduled scan, this isn't
> really any different, is it?
>
it doesn't getting stopped in normal cases (by the driver), and
wpa_supplicant doesn't seem to handle it well.
anyway, i guess we can wait with this patch as well.

Eliad.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 3f4eff0..c12259e 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -1376,7 +1376,7 @@  struct cfg80211_scan_request {
 	/* internal */
 	struct wiphy *wiphy;
 	unsigned long scan_start;
-	bool aborted, notified;
+	bool aborted, notified, pending_cleanup;
 	bool no_cck;
 
 	u32 min_dwell;
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index d960e4a..1ec43a8 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -176,6 +176,9 @@  void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
 	if (!request)
 		return;
 
+	if (request->pending_cleanup)
+		goto free_request;
+
 	wdev = request->wdev;
 
 	/*
@@ -209,7 +212,6 @@  void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
 	if (wdev->netdev)
 		dev_put(wdev->netdev);
 
-	rdev->scan_req = NULL;
 
 	/*
 	 * OK. If this is invoked with "leak" then we can't
@@ -219,8 +221,13 @@  void ___cfg80211_scan_done(struct cfg80211_registered_device *rdev, bool leak)
 	 * the scan request or not ... if it accesses the dev
 	 * in there (it shouldn't anyway) then it may crash.
 	 */
-	if (!leak)
-		kfree(request);
+	if (leak) {
+		request->pending_cleanup = true;
+		return;
+	}
+free_request:
+	rdev->scan_req = NULL;
+	kfree(request);
 }
 
 void __cfg80211_scan_done(struct work_struct *wk)