Message ID | 20170504122304.11735-1-vkuznets@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
From: Vitaly Kuznetsov <vkuznets@redhat.com> Date: Thu, 4 May 2017 14:23:04 +0200 > Unavoidable crashes in netfront_resume() and netback_changed() after a > previous fail in talk_to_netback() (e.g. when we fail to read MAC from > xenstore) were discovered. The failure path in talk_to_netback() does > unregister/free for netdev but we don't reset drvdata and we try accessing > it again after resume. > > Reset drvdata in netback_changed() the same way we reset it in > netfront_probe() and check for NULL in both netfront_resume() and > netback_changed() to properly handle the situation. > > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> The circumstances under which netfront_probe() NULLs out the device private is different than what you propose here, which is to do it on a live device in netback_changed() whilst mutliple susbsytems have a reference to this device and can call into the driver still. It is only legal to do this in the probe function because such references and execution possibilities do not exist at that point. What really needs to happen is that the xenbus_driver must be told to unregister this xen device and stop making calls into the driver for it before you release the netdev state. That is the only reasonable way to fix this bug. Thanks.
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 6ffc482..92f746c 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -1426,6 +1426,9 @@ static int netfront_resume(struct xenbus_device *dev) { struct netfront_info *info = dev_get_drvdata(&dev->dev); + if (!info) + return 0; + dev_dbg(&dev->dev, "%s\n", dev->nodename); xennet_disconnect_backend(info); @@ -1936,6 +1939,7 @@ static int talk_to_netback(struct xenbus_device *dev, out: unregister_netdev(info->netdev); xennet_free_netdev(info->netdev); + dev_set_drvdata(&dev->dev, NULL); return err; } @@ -1997,7 +2001,12 @@ static void netback_changed(struct xenbus_device *dev, enum xenbus_state backend_state) { struct netfront_info *np = dev_get_drvdata(&dev->dev); - struct net_device *netdev = np->netdev; + struct net_device *netdev; + + if (!np) + return; + + netdev = np->netdev; dev_dbg(&dev->dev, "%s\n", xenbus_strstate(backend_state));
Unavoidable crashes in netfront_resume() and netback_changed() after a previous fail in talk_to_netback() (e.g. when we fail to read MAC from xenstore) were discovered. The failure path in talk_to_netback() does unregister/free for netdev but we don't reset drvdata and we try accessing it again after resume. Reset drvdata in netback_changed() the same way we reset it in netfront_probe() and check for NULL in both netfront_resume() and netback_changed() to properly handle the situation. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> --- I apologize for sending this during the merge window, I'm not sure if this needs to go through xen or netdev tree. I think the fix is simple enough and it would make sense to squeeze it in 4.12 if possible. Thanks. --- drivers/net/xen-netfront.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)