Message ID | 80e88f61ca68c36ebce5d17dfcaa8e956e19fb2f.1655196227.git.lukas@wunner.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] sierra_net: Fix use-after-free on unbind | expand |
On 14.06.22 10:50, Lukas Wunner wrote: > @@ -758,6 +758,8 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf) > > dev_dbg(&dev->udev->dev, "%s", __func__); > > + usbnet_status_stop(dev); > + > /* kill the timer and work */ > del_timer_sync(&priv->sync_timer); > cancel_work_sync(&priv->sierra_net_kevent); Hi, as far as I can see the following race condition exists: CPU A: intr_complete() -> static void sierra_net_status() -> defer_kevent() CPU B: usbnet_stop_status() ---- kills the URB but only the URB, kevent scheduled CPU A: sierra_net_kevent -> sierra_net_dosync() -> CPU B: -> del_timer_sync(&priv->sync_timer); ---- NOP, too early CPU A: add_timer(&priv->sync_timer); CPU B: cancel_work_sync(&priv->sierra_net_kevent); ---- NOP, too late Regards Oliver
[adding Jann as UAF connoisseur to cc] On Tue, Jun 14, 2022 at 12:48:23PM +0200, Oliver Neukum wrote: > On 14.06.22 10:50, Lukas Wunner wrote: > > @@ -758,6 +758,8 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf) > > > > dev_dbg(&dev->udev->dev, "%s", __func__); > > > > + usbnet_status_stop(dev); > > + > > /* kill the timer and work */ > > del_timer_sync(&priv->sync_timer); > > cancel_work_sync(&priv->sierra_net_kevent); > > as far as I can see the following race condition exists: > > CPU A: > intr_complete() -> static void sierra_net_status() -> defer_kevent() > > CPU B: > usbnet_stop_status() ---- kills the URB but only the URB, kevent scheduled > > CPU A: > sierra_net_kevent -> sierra_net_dosync() -> > > CPU B: > -> del_timer_sync(&priv->sync_timer); ---- NOP, too early > > CPU A: > add_timer(&priv->sync_timer); > > CPU B: > cancel_work_sync(&priv->sierra_net_kevent); ---- NOP, too late I see your point, but what's the solution? I could call netif_device_detach() on ->disconnect(), then avoid scheduling sierra_net_kevent in the timer if !netif_device_present(), and also avoid arming the timer in sierra_net_kevent under the same condition. Still, I think I'd need 3 calls to make this bulletproof, either del_timer_sync(&priv->sync_timer); cancel_work_sync(&priv->sierra_net_kevent); del_timer_sync(&priv->sync_timer); or cancel_work_sync(&priv->sierra_net_kevent); del_timer_sync(&priv->sync_timer); cancel_work_sync(&priv->sierra_net_kevent); Doesn't really matter which of these two. Am I right? Is there a better (simpler) approach? FWIW, the logic in usbnet.c looks similarly flawed: There's a timer, a tasklet and a work. (Sounds like one of those "... walk into a bar" jokes.) The timer is armed by the tx/rx URB completion callbacks. Those URBs are terminated in usbnet_stop() and the timer is deleted. So far so good. But: The tasklet schedules the work. The work schedules the tasklet. The tasklet also schedules itself. We kill the tasklet in usbnet_stop() and afterwards cancel the work in usbnet_disconnect(). What happens if the work schedules the tasklet again? Another UAF. That may happen in the EVENT_RX_HALT, EVENT_RX_MEMORY, EVENT_LINK_RESET and EVENT_LINK_CHANGE code paths. A few netif_device_present() safeguards may help to prevent rescheduling the killed tasklet, but I suspect we may again need 3 calls here (tasklet_kill() / cancel_work_sync() / tasklet_kill()) to make it bulletproof. What do you think? As a heads-up, I'm going to move the cancel_work_sync() to usbnet_stop() in an upcoming patch. That seems to be Jakub's preferred approach to tackle the linkwatch UAF. I fear it may increase the risk of encountering the issues outlined above as the time between tasklet_kill() and cancel_work_sync() is reduced: https://github.com/l1k/linux/commit/89988b499ab9 Thanks, Lukas
On 21.06.22 18:43, Lukas Wunner wrote: > [adding Jann as UAF connoisseur to cc] > > On Tue, Jun 14, 2022 at 12:48:23PM +0200, Oliver Neukum wrote: >> >> as far as I can see the following race condition exists: >> >> CPU A: >> intr_complete() -> static void sierra_net_status() -> defer_kevent() >> >> CPU B: >> usbnet_stop_status() ---- kills the URB but only the URB, kevent scheduled >> >> CPU A: >> sierra_net_kevent -> sierra_net_dosync() -> >> >> CPU B: >> -> del_timer_sync(&priv->sync_timer); ---- NOP, too early >> >> CPU A: >> add_timer(&priv->sync_timer); >> >> CPU B: >> cancel_work_sync(&priv->sierra_net_kevent); ---- NOP, too late > > I see your point, but what's the solution? That is hard to say. I will go as far as saying that this will need a flag indicating a status of currently being disconnected. > I could call netif_device_detach() on ->disconnect(), then avoid > scheduling sierra_net_kevent in the timer if !netif_device_present(), > and also avoid arming the timer in sierra_net_kevent under the same > condition. A flag you get by using netif_device_present() as a flag. Yet the idea of shifting things around in the disconnect() code path of a USB network driver just to solve some other issue makes me very nervous. If you decide that this needs a flag, please add a dedicated flag. > > Still, I think I'd need 3 calls to make this bulletproof, either > > del_timer_sync(&priv->sync_timer); > cancel_work_sync(&priv->sierra_net_kevent); > del_timer_sync(&priv->sync_timer); > > or > > cancel_work_sync(&priv->sierra_net_kevent); > del_timer_sync(&priv->sync_timer); > cancel_work_sync(&priv->sierra_net_kevent); > > Doesn't really matter which of these two. Am I right? Yes, that is right. > Is there a better (simpler) approach? I am thinking about causing the timer and the kevent fail their URB submissions. > FWIW, the logic in usbnet.c looks similarly flawed: > There's a timer, a tasklet and a work. > (Sounds like one of those "... walk into a bar" jokes.) We need somebody to start a web comic based on that. > The timer is armed by the tx/rx URB completion callbacks. > Those URBs are terminated in usbnet_stop() and the timer is > deleted. So far so good. But: > > The tasklet schedules the work. > The work schedules the tasklet. > The tasklet also schedules itself. This test is supposed to make the kevent save from itself: if (netif_running (dev->net) && netif_device_present (dev->net) && Do you think it is insufficient? I must admit the logic in usbnet is not easy to understand. In fact it may not be possible to understand at all. > We kill the tasklet in usbnet_stop() and afterwards cancel the > work in usbnet_disconnect(). What happens if the work schedules > the tasklet again? Another UAF. That may happen in the EVENT_RX_HALT, > EVENT_RX_MEMORY, EVENT_LINK_RESET and EVENT_LINK_CHANGE code paths. > A few netif_device_present() safeguards may help to prevent > rescheduling the killed tasklet, but I suspect we may again need > 3 calls here (tasklet_kill() / cancel_work_sync() / tasklet_kill()) > to make it bulletproof. What do you think? I think that this is unsalvagaeble. We'd fare better with a clear "zombie" flag we test before firing off anything. > As a heads-up, I'm going to move the cancel_work_sync() to usbnet_stop() > in an upcoming patch. That seems to be Jakub's preferred approach to > tackle the linkwatch UAF. I fear it may increase the risk of encountering > the issues outlined above as the time between tasklet_kill() and > cancel_work_sync() is reduced: > > https://github.com/l1k/linux/commit/89988b499ab9 Please do go ahead to adapt it to the way the big network drivers need it. You have now convinced me that usbnet needs major surgery. This means work in merging for me in any case. Feel free to do what serves the users best. Usbnet is a framework. It should be formed by what drivers need, not the other way round. Regards Oliver
diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c index bb4cbe8fc846..197e1356ae98 100644 --- a/drivers/net/usb/sierra_net.c +++ b/drivers/net/usb/sierra_net.c @@ -758,6 +758,8 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf) dev_dbg(&dev->udev->dev, "%s", __func__); + usbnet_status_stop(dev); + /* kill the timer and work */ del_timer_sync(&priv->sync_timer); cancel_work_sync(&priv->sierra_net_kevent); @@ -769,8 +771,6 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf) netdev_err(dev->net, "usb_control_msg failed, status %d\n", status); - usbnet_status_stop(dev); - sierra_net_set_private(dev, NULL); kfree(priv); }
On unbind, the Sierra USB WWAN driver cancels the sierra_net_kevent() work, then stops polling for interrupts by calling usbnet_status_stop(). However the interrupt handler sierra_net_status() may re-schedule the work after it's been canceled and thus cause a use-after-free. Fix by inverting the teardown order. Fixes: 7b0c5f21f348 ("sierra_net: keep status interrupt URB active") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.10+ Cc: Dan Williams <dan.j.williams@intel.com> --- drivers/net/usb/sierra_net.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)