
[net,1/2] bnxt_en: Fix race between firmware reset and driver remove.

Message ID 1614332590-17865-2-git-send-email-michael.chan@broadcom.com (mailing list archive)
State Accepted
Commit d20cd745218cde1b268bef5282095ec6c95a3ea2
Delegated to: Netdev Maintainers
Series bnxt_en: Error recovery bug fixes.

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers fail 2 blamed authors not CCed: edwin.peer@broadcom.com willemb@google.com; 2 maintainers not CCed: edwin.peer@broadcom.com willemb@google.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Michael Chan Feb. 26, 2021, 9:43 a.m. UTC
From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

The driver's error recovery reset sequence can take many seconds to
complete and only the critical sections are protected by rtnl_lock.
A recent change has introduced a regression in this sequence.

bnxt_remove_one() may be called while the recovery is in progress.
Normally, unregister_netdev() would cause bnxt_close_nic() to be
called and this would cause the error recovery to safely abort
with the BNXT_STATE_ABORT_ERR flag set in bnxt_close_nic().

Recently, we added bnxt_reinit_after_abort() to allow the user to
reopen the device after an aborted recovery.  This causes the
regression in the scenario described above because we would
attempt to re-open even after the netdev has been unregistered.

Fix it by checking the netdev reg_state in
bnxt_reinit_after_abort() and returning -ENODEV if the netdev is
already unregistered.

Fixes: 6882c36cf82e ("bnxt_en: attempt to reinitialize after aborted reset")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
 1 file changed, 3 insertions(+)
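
The ordering of the checks described in the commit message can be sketched in userspace as follows. This is only an illustrative model, not the driver code: the mock_* types, the in_fw_reset flag, and the reinit_calls counter are hypothetical stand-ins for struct bnxt, the BNXT_STATE_IN_FW_RESET bit, and the real reinit path (bnxt_fw_init_one() and what follows it).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's netdev registration states. */
enum mock_reg_state {
	MOCK_NETREG_REGISTERED,
	MOCK_NETREG_UNREGISTERED,
};

/* Hypothetical, minimal model of struct net_device. */
struct mock_netdev {
	enum mock_reg_state reg_state;
};

/* Hypothetical, minimal model of struct bnxt. */
struct mock_bnxt {
	struct mock_netdev *dev;
	bool in_fw_reset;  /* models test_bit(BNXT_STATE_IN_FW_RESET, &bp->state) */
	int reinit_calls;  /* counts how often the reinit path actually ran */
};

/*
 * Models the guard order in bnxt_reinit_after_abort() after this patch:
 * refuse while a firmware reset is in flight, then refuse if the netdev
 * has already been unregistered, and only then attempt to reinitialize.
 */
int mock_reinit_after_abort(struct mock_bnxt *bp)
{
	assert(bp && bp->dev);

	if (bp->in_fw_reset)
		return -EBUSY;

	if (bp->dev->reg_state == MOCK_NETREG_UNREGISTERED)
		return -ENODEV;

	bp->reinit_calls++;  /* stands in for the real reinit sequence */
	return 0;
}
```

In this model, a device that was removed while recovery was in progress never reaches the reinit step, which is the behavior the patch is meant to restore.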

Patch

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index a680fd9c68ea..c55189c7bb36 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9890,6 +9890,9 @@  static int bnxt_reinit_after_abort(struct bnxt *bp)
 	if (test_bit(BNXT_STATE_IN_FW_RESET, &bp->state))
 		return -EBUSY;
 
+	if (bp->dev->reg_state == NETREG_UNREGISTERED)
+		return -ENODEV;
+
 	rc = bnxt_fw_init_one(bp);
 	if (!rc) {
 		bnxt_clear_int_mode(bp);