From patchwork Wed Apr 2 21:41:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cosmin Ratiu
X-Patchwork-Id: 14036547
X-Patchwork-Delegate: kuba@kernel.org
From: Cosmin Ratiu
To: "netdev@vger.kernel.org", "sdf@fomichev.me"
CC: "edumazet@google.com", "davem@davemloft.net", "kuba@kernel.org",
 "pabeni@redhat.com"
Subject: another netdev instance lock bug in ipv6_add_dev
Date: Wed, 2 Apr 2025 21:41:54 +0000
Reply-To: Cosmin Ratiu
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Hi,

Not sure if it's reported already, but I encountered a bug while
testing with the new locking scheme.
This is the call trace:

[ 3454.975672] WARNING: CPU: 1 PID: 58237 at ./include/net/netdev_lock.h:54 ipv6_add_dev+0x370/0x620
[ 3455.008776] ? ipv6_add_dev+0x370/0x620
[ 3455.010097] ipv6_find_idev+0x96/0xe0
[ 3455.010725] addrconf_add_dev+0x1e/0xa0
[ 3455.011382] addrconf_init_auto_addrs+0xb0/0x720
[ 3455.013537] addrconf_notify+0x35f/0x8d0
[ 3455.014214] notifier_call_chain+0x38/0xf0
[ 3455.014903] netdev_state_change+0x65/0x90
[ 3455.015586] linkwatch_do_dev+0x5a/0x70
[ 3455.016238] rtnl_getlink+0x241/0x3e0
[ 3455.019046] rtnetlink_rcv_msg+0x177/0x5e0

The call chain is rtnl_getlink -> linkwatch_sync_dev ->
linkwatch_do_dev -> netdev_state_change -> ...

Nothing on this path acquires the netdev lock, resulting in a warning.
Perhaps rtnl_getlink should acquire it, in addition to the RTNL already
held by rtnetlink_rcv_msg?

The same thing can be seen from the regular linkwatch wq:

[ 3456.637014] WARNING: CPU: 16 PID: 83257 at ./include/net/netdev_lock.h:54 ipv6_add_dev+0x370/0x620
[ 3456.655305] Call Trace:
[ 3456.655610]
[ 3456.655890] ? __warn+0x89/0x1b0
[ 3456.656261] ? ipv6_add_dev+0x370/0x620
[ 3456.660039] ipv6_find_idev+0x96/0xe0
[ 3456.660445] addrconf_add_dev+0x1e/0xa0
[ 3456.660861] addrconf_init_auto_addrs+0xb0/0x720
[ 3456.661803] addrconf_notify+0x35f/0x8d0
[ 3456.662236] notifier_call_chain+0x38/0xf0
[ 3456.662676] netdev_state_change+0x65/0x90
[ 3456.663112] linkwatch_do_dev+0x5a/0x70
[ 3456.663529] __linkwatch_run_queue+0xeb/0x200
[ 3456.663990] linkwatch_event+0x21/0x30
[ 3456.664399] process_one_work+0x211/0x610
[ 3456.664828] worker_thread+0x1cc/0x380
[ 3456.665691] kthread+0xf4/0x210

In this case, __linkwatch_run_queue seems like a good place to grab the
device lock before calling linkwatch_do_dev.

The proposed patch is below. I'll let you reason through the
implications of calling NETDEV_CHANGE notifiers from linkwatch with the
instance lock held; you have thought about this much longer than I have.

Signed-off-by: Cosmin Ratiu
---
 net/core/link_watch.c | 2 ++
 net/core/rtnetlink.c  | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index cb04ef2b9807..002f18b11d85 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -240,7 +240,9 @@ static void __linkwatch_run_queue(int urgent_only)
 		 */
 		netdev_tracker_free(dev, &dev->linkwatch_dev_tracker);
 		spin_unlock_irq(&lweventlist_lock);
+		netdev_lock_ops(dev);
 		linkwatch_do_dev(dev);
+		netdev_unlock_ops(dev);
 		do_dev--;
 		spin_lock_irq(&lweventlist_lock);
 	}
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a2736e434712..c77b37d897eb 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4175,7 +4175,9 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	 * only TX if link watch work has run, but without this we'd
 	 * already report carrier on, even if it doesn't work yet.
 	 */
+	netdev_lock_ops(dev);
 	linkwatch_sync_dev(dev);
+	netdev_unlock_ops(dev);
 
 	err = rtnl_fill_ifinfo(nskb, dev, net,
 			       RTM_NEWLINK, NETLINK_CB(skb).portid,
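
For readers less familiar with this class of bug, here is a rough userspace sketch of the pattern behind both traces: a callee checks that the instance lock is held, and a caller reaches it without taking the lock. This is only a single-threaded model, not kernel code. Every name in it (model_dev, model_lock, model_state_change and so on) is hypothetical, and a plain flag stands in for the real lock and for the check that fires at include/net/netdev_lock.h:54.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a net_device that has an instance lock. */
struct model_dev {
	bool lock_held;    /* assumption: a flag models the lock state */
	int warnings;      /* times the "lock not held" check fired */
	int notifications; /* times the notifier path ran */
};

static void model_dev_init(struct model_dev *dev)
{
	dev->lock_held = false;
	dev->warnings = 0;
	dev->notifications = 0;
}

/* Model of taking and releasing the instance lock. */
static void model_lock(struct model_dev *dev)
{
	dev->lock_held = true;
}

static void model_unlock(struct model_dev *dev)
{
	dev->lock_held = false;
}

/* Model of the check in the callee: warn (do not abort) if unlocked. */
static void model_assert_locked(struct model_dev *dev)
{
	if (!dev->lock_held) {
		dev->warnings++;
		fprintf(stderr, "WARNING: instance lock not held\n");
	}
}

/* Model of the notifier path that ends in the lock check. */
static void model_state_change(struct model_dev *dev)
{
	model_assert_locked(dev);
	dev->notifications++;
}

/* Caller as it behaves before the patch: no lock around the call. */
static void model_run_queue_unpatched(struct model_dev *dev)
{
	model_state_change(dev);
}

/* Caller as proposed in the patch: lock, call, unlock. */
static void model_run_queue_patched(struct model_dev *dev)
{
	model_lock(dev);
	model_state_change(dev);
	model_unlock(dev);
}
```

Calling model_run_queue_unpatched() on a freshly initialised model_dev bumps the warning counter once, while model_run_queue_patched() runs the same notifier path without a warning. The model says nothing about lock ordering against RTNL or about what the notifiers themselves do under the lock, which is the open question raised above.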