diff mbox series

[net,v4,1/1] mlxbf_gige: Fix kernel panic at shutdown

Message ID 20230721141956.29842-1-asmaa@nvidia.com (mailing list archive)
State Deferred
Delegated to: Netdev Maintainers
Headers show
Series [net,v4,1/1] mlxbf_gige: Fix kernel panic at shutdown | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1342 this patch: 1342
netdev/cc_maintainers fail 1 blamed authors not CCed: limings@nvidia.com; 1 maintainers not CCed: limings@nvidia.com
netdev/build_clang success Errors and warnings before: 1365 this patch: 1365
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1365 this patch: 1365
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 39 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Asmaa Mnebhi July 21, 2023, 2:19 p.m. UTC
There is a race condition happening during shutdown due to pending
napi transactions. Since mlxbf_gige_poll is still running, it tries
to access a NULL pointer and as a result causes a kernel panic.
To fix this during shutdown, invoke mlxbf_gige_remove to disable and
dequeue napi.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
---
v3-v4:
 - replace .remove with .remove_new to be able to return a void to
   .shutdown as well.
v2-v3:
 - remove mlxbf_gige_shutdown() since it is redundant with mlxbf_gige_remove().
v1->v2:
 - fixed the tag from net-next to net
 - check that the "priv" pointer is not NULL in mlxbf_gige_remove()

 .../mellanox/mlxbf_gige/mlxbf_gige_main.c     | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

Comments

Vladimir Oltean July 21, 2023, 3:02 p.m. UTC | #1
Hi Asmaa,

On Fri, Jul 21, 2023 at 10:19:56AM -0400, Asmaa Mnebhi wrote:
> There is a race condition happening during shutdown due to pending
> napi transactions. Since mlxbf_gige_poll is still running, it tries
> to access a NULL pointer and as a result causes a kernel panic.
> To fix this during shutdown, invoke mlxbf_gige_remove to disable and
> dequeue napi.
> 
> Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> ---

What is the race condition; what does the stack trace of the NPD look like?
Asmaa Mnebhi July 21, 2023, 7:06 p.m. UTC | #2
> What is the race condition; what does the stack trace of the NPD look like?

Hi Vladimir,

[  OK  ] Reached target Shutdown.
[  OK  ] Reached target Final Step.
[  OK  ] Started Reboot.
[  OK  ] Reached target Reboot.
...
[  285.126250] mlxbf_gige MLNXBF17:00: shutdown
[  285.130669] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
[  285.139447] Mem abort info:
[  285.142228]   ESR = 0x0000000096000004
[  285.145964]   EC = 0x25: DABT (current EL), IL = 32 bits
[  285.151261]   SET = 0, FnV = 0
[  285.154303]   EA = 0, S1PTW = 0
[  285.157430]   FSC = 0x04: level 0 translation fault
[  285.162293] Data abort info:
[  285.165159]   ISV = 0, ISS = 0x00000004
[  285.168980]   CM = 0, WnR = 0
[  285.171932] user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000
[  285.178358] [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
[  285.185134] Internal error: Oops: 96000004 [#1] SMP
Vladimir Oltean July 21, 2023, 10:32 p.m. UTC | #3
Hi Asmaa,

On Fri, Jul 21, 2023 at 07:06:03PM +0000, Asmaa Mnebhi wrote:
> > What is the race condition; what does the stack trace of the NPD look like?
> 
> Hi Vladimir,
> 
> [  OK  ] Reached target Shutdown.
> [  OK  ] Reached target Final Step.
> [  OK  ] Started Reboot.
> [  OK  ] Reached target Reboot.
> ...
> [  285.126250] mlxbf_gige MLNXBF17:00: shutdown
> [  285.130669] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
> [  285.139447] Mem abort info:
> [  285.142228]   ESR = 0x0000000096000004
> [  285.145964]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  285.151261]   SET = 0, FnV = 0
> [  285.154303]   EA = 0, S1PTW = 0
> [  285.157430]   FSC = 0x04: level 0 translation fault
> [  285.162293] Data abort info:
> [  285.165159]   ISV = 0, ISS = 0x00000004
> [  285.168980]   CM = 0, WnR = 0
> [  285.171932] user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000
> [  285.178358] [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
> [  285.185134] Internal error: Oops: 96000004 [#1] SMP

That is not a stack trace. The stack trace is what appears between the
lines "Call trace:" and "---[ end trace 0000000000000000 ]---", and it
is missing from what you've posted.

It would be nice if you could also post-process the stack trace you get
through the command below (run in a kernel tree):

cat stack_trace | scripts/decode_stacktrace.sh path/to/vmlinux .

so that we could see the actual C code line numbers, and not just the
offsets within your particular binary image.
Asmaa Mnebhi July 21, 2023, 11:27 p.m. UTC | #4
> > [  285.126250] mlxbf_gige MLNXBF17:00: shutdown [  285.130669] Unable
> > to handle kernel NULL pointer dereference at virtual address
> > 0000000000000070 [  285.139447] Mem abort info:
> > [  285.142228]   ESR = 0x0000000096000004
> > [  285.145964]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [  285.151261]   SET = 0, FnV = 0
> > [  285.154303]   EA = 0, S1PTW = 0
> > [  285.157430]   FSC = 0x04: level 0 translation fault
> > [  285.162293] Data abort info:
> > [  285.165159]   ISV = 0, ISS = 0x00000004
> > [  285.168980]   CM = 0, WnR = 0
> > [  285.171932] user pgtable: 4k pages, 48-bit VAs,
> > pgdp=000000011d373000 [  285.178358] [0000000000000070]
> > pgd=0000000000000000, p4d=0000000000000000 [  285.185134] Internal
> > error: Oops: 96000004 [#1] SMP
> 
> That is not a stack trace. The stack trace is what appears between the lines "Call
> trace:" and "---[ end trace 0000000000000000 ]---", and it is missing from what
> you've posted.
> 
> It would be nice if you could also post-process the stack trace you get through
> the command below (run in a kernel tree):
> 
> cat stack_trace | scripts/decode_stacktrace.sh path/to/vmlinux .
> 
> so that we could see the actual C code line numbers, and not just the offsets
> within your particular binary image.

Hi Vladimir,

Do you want me to just send it as a reply or send a v5 and add it to the commit message?
Vladimir Oltean July 21, 2023, 11:29 p.m. UTC | #5
On Fri, Jul 21, 2023 at 11:27:10PM +0000, Asmaa Mnebhi wrote:
> > > [  285.126250] mlxbf_gige MLNXBF17:00: shutdown [  285.130669] Unable
> > > to handle kernel NULL pointer dereference at virtual address
> > > 0000000000000070 [  285.139447] Mem abort info:
> > > [  285.142228]   ESR = 0x0000000096000004
> > > [  285.145964]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > [  285.151261]   SET = 0, FnV = 0
> > > [  285.154303]   EA = 0, S1PTW = 0
> > > [  285.157430]   FSC = 0x04: level 0 translation fault
> > > [  285.162293] Data abort info:
> > > [  285.165159]   ISV = 0, ISS = 0x00000004
> > > [  285.168980]   CM = 0, WnR = 0
> > > [  285.171932] user pgtable: 4k pages, 48-bit VAs,
> > > pgdp=000000011d373000 [  285.178358] [0000000000000070]
> > > pgd=0000000000000000, p4d=0000000000000000 [  285.185134] Internal
> > > error: Oops: 96000004 [#1] SMP
> > 
> > That is not a stack trace. The stack trace is what appears between the lines "Call
> > trace:" and "---[ end trace 0000000000000000 ]---", and it is missing from what
> > you've posted.
> > 
> > It would be nice if you could also post-process the stack trace you get through
> > the command below (run in a kernel tree):
> > 
> > cat stack_trace | scripts/decode_stacktrace.sh path/to/vmlinux .
> > 
> > so that we could see the actual C code line numbers, and not just the offsets
> > within your particular binary image.
> 
> Hi Vladimir,
> 
> Do you want me to just send it as a reply or send a v5 and add it to the commit message?

As a reply, for now, please. I haven't requested any changes to the patch.
Jakub Kicinski July 26, 2023, 4:27 p.m. UTC | #6
On Fri, 21 Jul 2023 10:19:56 -0400 Asmaa Mnebhi wrote:
> There is a race condition happening during shutdown due to pending
> napi transactions. Since mlxbf_gige_poll is still running, it tries
> to access a NULL pointer and as a result causes a kernel panic.
> To fix this during shutdown, invoke mlxbf_gige_remove to disable and
> dequeue napi.
> 
> Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>

Asmaa, you need to reply to Vladimir, I'm dropping this patch :(
Asmaa Mnebhi Aug. 8, 2023, 5:45 p.m. UTC | #7
> > Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> 
> Asmaa, you need to reply to Vladimir, I'm dropping this patch :(
> --
Hi Jakub,
 
Apologies for the delay. Actually this worked out for the best, because this change caused another issue on our systems. 
I will submit a different patch once I test it more thoroughly and will attach the panic as requested previously.

Thanks again for your help and feedback.
Asmaa
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
index 694de9513b9f..5e677bd32956 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
@@ -471,24 +471,19 @@  static int mlxbf_gige_probe(struct platform_device *pdev)
 	return err;
 }
 
-static int mlxbf_gige_remove(struct platform_device *pdev)
+static void mlxbf_gige_remove(struct platform_device *pdev)
 {
 	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
 
+	if (!priv)
+		return;
+
 	unregister_netdev(priv->netdev);
 	phy_disconnect(priv->netdev->phydev);
 	mlxbf_gige_mdio_remove(priv);
 	platform_set_drvdata(pdev, NULL);
 
-	return 0;
-}
-
-static void mlxbf_gige_shutdown(struct platform_device *pdev)
-{
-	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
-
-	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
-	mlxbf_gige_clean_port(priv);
+	return;
 }
 
 static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {
@@ -499,8 +494,8 @@  MODULE_DEVICE_TABLE(acpi, mlxbf_gige_acpi_match);
 
 static struct platform_driver mlxbf_gige_driver = {
 	.probe = mlxbf_gige_probe,
-	.remove = mlxbf_gige_remove,
-	.shutdown = mlxbf_gige_shutdown,
+	.remove_new = mlxbf_gige_remove,
+	.shutdown = mlxbf_gige_remove,
 	.driver = {
 		.name = KBUILD_MODNAME,
 		.acpi_match_table = ACPI_PTR(mlxbf_gige_acpi_match),