Message ID | 20240730061638.1831002-6-tariqt@nvidia.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 572f9caa9e7295f8c8822e4122c7ae8f1c412ff9 |
Delegated to: | Netdev Maintainers |
Series | mlx5 misc fixes 2024-07-30 |
On 30.07.2024 08:16, Tariq Toukan wrote:
> From: Moshe Shemesh <moshe@nvidia.com>
>
> On sync reset reload work, when remote host updates devlink on reload
> actions performed on that host, it misses taking devlink lock before
> calling devlink_remote_reload_actions_performed() which results in
> triggering lock assert like the following:
>
> WARNING: CPU: 4 PID: 1164 at net/devlink/core.c:261 devl_assert_locked+0x3e/0x50
> …
> CPU: 4 PID: 1164 Comm: kworker/u96:6 Tainted: G S W 6.10.0-rc2+ #116
> Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0 12/18/2015
> Workqueue: mlx5_fw_reset_events mlx5_sync_reset_reload_work [mlx5_core]
> RIP: 0010:devl_assert_locked+0x3e/0x50
> …
> Call Trace:
> <TASK>
> ? __warn+0xa4/0x210
> ? devl_assert_locked+0x3e/0x50
> ? report_bug+0x160/0x280
> ? handle_bug+0x3f/0x80
> ? exc_invalid_op+0x17/0x40
> ? asm_exc_invalid_op+0x1a/0x20
> ? devl_assert_locked+0x3e/0x50
> devlink_notify+0x88/0x2b0
> ? mlx5_attach_device+0x20c/0x230 [mlx5_core]
> ? __pfx_devlink_notify+0x10/0x10
> ? process_one_work+0x4b6/0xbb0
> process_one_work+0x4b6/0xbb0
> […]
>
> Fixes: 84a433a40d0e ("net/mlx5: Lock mlx5 devlink reload callbacks")
> Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
> Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>

> drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
> index 979c49ae6b5c..b43ca0b762c3 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
> @@ -207,6 +207,7 @@ int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev)
>  static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unloaded)
>  {
>  	struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
> +	struct devlink *devlink = priv_to_devlink(dev);
>
>  	/* if this is the driver that initiated the fw reset, devlink completed the reload */
>  	if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) {
> @@ -218,9 +219,11 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unload
>  			mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
>  		else
>  			mlx5_load_one(dev, true);
> -		devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
> +		devl_lock(devlink);
> +		devlink_remote_reload_actions_performed(devlink, 0,
>  							BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
>  							BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
> +		devl_unlock(devlink);
>  	}
>  }
>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 979c49ae6b5c..b43ca0b762c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -207,6 +207,7 @@ int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev)
 static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unloaded)
 {
 	struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
+	struct devlink *devlink = priv_to_devlink(dev);

 	/* if this is the driver that initiated the fw reset, devlink completed the reload */
 	if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) {
@@ -218,9 +219,11 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unload
 			mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
 		else
 			mlx5_load_one(dev, true);
-		devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
+		devl_lock(devlink);
+		devlink_remote_reload_actions_performed(devlink, 0,
 							BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
 							BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
+		devl_unlock(devlink);
 	}
 }