diff mbox series

[net,v2] ice: Fix deadlock on the rtnl_mutex

Message ID 20230105120518.29776-1-mateusz.palczewski@intel.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [net,v2] ice: Fix deadlock on the rtnl_mutex | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 1 blamed authors not CCed: anirudh.venkataramanan@intel.com; 7 maintainers not CCed: edumazet@google.com davem@davemloft.net kuba@kernel.org pabeni@redhat.com anthony.l.nguyen@intel.com jesse.brandeburg@intel.com anirudh.venkataramanan@intel.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch fail ERROR: space required before the open brace '{'
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Mateusz Palczewski Jan. 5, 2023, 12:05 p.m. UTC
There is a deadlock on rtnl_mutex when attempting to take the lock
in unregister_netdev() after it has already been taken by
ethnl_set_channels(). This happened when unregister_netdev() was
called inside of ice_vsi_rebuild().
Fix that by removing the unregister_netdev() usage and replace it with
ice_vsi_clear_rings() that deallocates the tx and rx rings for the VSI.

Fixes: df0f847915b4 ("ice: Move common functions out of ice_main.c part 6/7")
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
---
 v2: Fixed goto unwind to remove code redundancy
---
 drivers/net/ethernet/intel/ice/ice_lib.c | 35 ++++++++++++------------
 1 file changed, 17 insertions(+), 18 deletions(-)

Comments

Leon Romanovsky Jan. 5, 2023, 1:18 p.m. UTC | #1
On Thu, Jan 05, 2023 at 07:05:18AM -0500, Mateusz Palczewski wrote:
> There is a deadlock on rtnl_mutex when attempting to take the lock
> in unregister_netdev() after it has already been taken by
> ethnl_set_channels(). This happened when unregister_netdev() was
> called inside of ice_vsi_rebuild().
> Fix that by removing the unregister_netdev() usage and replace it with
> ice_vsi_clear_rings() that deallocates the tx and rx rings for the VSI.
> 
> Fixes: df0f847915b4 ("ice: Move common functions out of ice_main.c part 6/7")
> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> ---
>  v2: Fixed goto unwind to remove code redundancy
> ---
>  drivers/net/ethernet/intel/ice/ice_lib.c | 35 ++++++++++++------------
>  1 file changed, 17 insertions(+), 18 deletions(-)
> 

I think that it will be beneficial to have lockdep trace in commit message too.

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Jakub Kicinski Jan. 5, 2023, 6:45 p.m. UTC | #2
On Thu,  5 Jan 2023 07:05:18 -0500 Mateusz Palczewski wrote:
>  		ret = ice_vsi_alloc_q_vectors(vsi);
> -		if (ret)
> -			goto err_rings;
> +		if (ret){
> +			ice_vsi_clear_rings(vsi);
> +			goto err_reset;
> +		}
>  
>  		ret = ice_vsi_setup_vector_base(vsi);
>  		if (ret)

Why do cases which jump to err_vectors no need any changes?
Mateusz Palczewski Jan. 12, 2023, 1:31 p.m. UTC | #3
Hi,
Sorry for late response
>
>
>>  		ret = ice_vsi_alloc_q_vectors(vsi);
>> -		if (ret)
>> -			goto err_rings;
>> +		if (ret){
>> +			ice_vsi_clear_rings(vsi);
>> +			goto err_reset;
>> +		}
>>  
>>  		ret = ice_vsi_setup_vector_base(vsi);
>>  		if (ret)
>
>Why do cases which jump to err_vectors no need any changes?
During my testing I saw no issues with cases when goto err_vectors were used. Did You manage to have other results? 
>

Regards,
Mateusz
Jakub Kicinski Jan. 12, 2023, 7:14 p.m. UTC | #4
On Thu, 12 Jan 2023 13:31:06 +0000 Palczewski, Mateusz wrote:
> >Why do cases which jump to err_vectors no need any changes?  
>
> During my testing I saw no issues with cases when goto err_vectors
> were used. Did You manage to have other results? 

I'm just reviewing the code.

Exhaustively testing all the cases is probably very hard,
which is why we generally try to reason about the code
from first principles.

IOW "it didn't fail in my testing" is rarely a sufficient proof
upstream.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 94aa834cd9a6..e5e96dad3563 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -3479,8 +3479,10 @@  int ice_vsi_rebuild(struct ice_vsi *vsi, bool init_vsi)
 		ice_vsi_set_num_qs(vsi, NULL);
 
 	ret = ice_vsi_alloc_arrays(vsi);
-	if (ret < 0)
-		goto err_vsi;
+	if (ret < 0){
+		ice_vsi_clear(vsi);
+		goto err_reset;
+	}
 
 	ice_vsi_get_qs(vsi);
 
@@ -3489,16 +3491,19 @@  int ice_vsi_rebuild(struct ice_vsi *vsi, bool init_vsi)
 
 	/* Initialize VSI struct elements and create VSI in FW */
 	ret = ice_vsi_init(vsi, init_vsi);
-	if (ret < 0)
-		goto err_vsi;
-
+	if (ret < 0){
+		ice_vsi_clear(vsi);
+		goto err_reset;
+	}
 	switch (vtype) {
 	case ICE_VSI_CTRL:
 	case ICE_VSI_SWITCHDEV_CTRL:
 	case ICE_VSI_PF:
 		ret = ice_vsi_alloc_q_vectors(vsi);
-		if (ret)
-			goto err_rings;
+		if (ret){
+			ice_vsi_clear_rings(vsi);
+			goto err_reset;
+		}
 
 		ret = ice_vsi_setup_vector_base(vsi);
 		if (ret)
@@ -3544,8 +3549,10 @@  int ice_vsi_rebuild(struct ice_vsi *vsi, bool init_vsi)
 		break;
 	case ICE_VSI_VF:
 		ret = ice_vsi_alloc_q_vectors(vsi);
-		if (ret)
-			goto err_rings;
+		if (ret){
+			ice_vsi_clear_rings(vsi);
+			goto err_reset;
+		}
 
 		ret = ice_vsi_set_q_vectors_reg_idx(vsi);
 		if (ret)
@@ -3618,15 +3625,7 @@  int ice_vsi_rebuild(struct ice_vsi *vsi, bool init_vsi)
 
 err_vectors:
 	ice_vsi_free_q_vectors(vsi);
-err_rings:
-	if (vsi->netdev) {
-		vsi->current_netdev_flags = 0;
-		unregister_netdev(vsi->netdev);
-		free_netdev(vsi->netdev);
-		vsi->netdev = NULL;
-	}
-err_vsi:
-	ice_vsi_clear(vsi);
+err_reset:
 	set_bit(ICE_RESET_FAILED, pf->state);
 	kfree(coalesce);
 	return ret;