[net] lan966x: Fix unloading/loading of the driver

Message ID: 20230522120038.3749026-1-horatiu.vultur@microchip.com
State: Accepted
Commit: 600761245952d7f70280add6ce02894f1528992b
Delegated to: Netdev Maintainers
Series: [net] lan966x: Fix unloading/loading of the driver

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers           success  CCed 7 of 7 maintainers
netdev/build_clang              success  Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 8 this patch: 8
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Horatiu Vultur May 22, 2023, noon UTC
It was noticed that, after a while of unloading/loading the driver and
sending traffic through the switch, the switch would stop working. It
would stop forwarding any traffic and the only way to recover was to
power cycle the board. The root cause seems to be that the switch core
is initialized twice. Apparently initializing the switch core twice
disturbs the pointers in the queue system in the HW, so after a while
it would stop sending traffic.
Unfortunately, it is not possible to use a reset of the switch here,
because the reset line is connected to multiple devices like MDIO,
SGPIO, FAN, etc., so all of those devices would be reset whenever the
network driver is loaded.
So the fix is to check whether the core is already initialized and, if
that is the case, not initialize it again.

Fixes: db8bcaad5393 ("net: lan966x: add the basic lan966x driver")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
 drivers/net/ethernet/microchip/lan966x/lan966x_main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
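
As a rough illustration of the check described in the commit message, the
following self-contained C sketch models the guard in user space. The
struct sim_switch type, the SIM_CORE_ENA bit and both sim_* functions are
hypothetical stand-ins, not the driver's lan_rd()/lan_wr() accessors or its
real register layout. The only thing taken from the patch is the idea of
returning early when the core-enable bit is already set.

```c
#include <stdint.h>

/* Hypothetical bit standing in for SYS_RESET_CFG_CORE_ENA; the position is
 * illustrative only.
 */
#define SIM_CORE_ENA (1u << 0)

/* Models hardware state that survives a driver unload/load cycle. */
struct sim_switch {
	uint32_t reset_cfg;      /* stands in for the SYS_RESET_CFG register */
	unsigned int init_count; /* how many times the core init sequence ran */
};

/* Models the init sequence: disable the core, initialize it, enable it. */
void sim_core_init(struct sim_switch *sw)
{
	sw->reset_cfg &= ~SIM_CORE_ENA;
	sw->init_count++;
	sw->reset_cfg |= SIM_CORE_ENA;
}

/* Models the guarded reset path: skip the init sequence when the core is
 * already enabled, so it runs at most once per power cycle.
 */
int sim_reset_switch(struct sim_switch *sw)
{
	if (sw->reset_cfg & SIM_CORE_ENA)
		return 0;

	sim_core_init(sw);
	return 0;
}
```

Calling sim_reset_switch() twice on the same struct sim_switch, which plays
the role of hardware state surviving a driver reload, runs the init sequence
only on the first call and leaves init_count at 1.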

Comments

Simon Horman May 22, 2023, 1:46 p.m. UTC | #1
On Mon, May 22, 2023 at 02:00:38PM +0200, Horatiu Vultur wrote:
> It was noticing that after a while when unloading/loading the driver and
> sending traffic through the switch, it would stop working. It would stop
> forwarding any traffic and the only way to get out of this was to do a
> power cycle of the board. The root cause seems to be that the switch
> core is initialized twice. Apparently initializing twice the switch core
> disturbs the pointers in the queue systems in the HW, so after a while
> it would stop sending the traffic.

Ouch.

> Unfortunetly, it is not possible to use a reset of the switch here,

nit: s/Unfortunetly/Unfortunately/

> because the reset line is connected to multiple devices like MDIO,
> SGPIO, FAN, etc. So then all the devices will get reseted when the

nit: s/reseted/reset/

> network driver will be loaded.
> So the fix is to check if the core is initialized already and if that is
> the case don't initialize it again.
> 
> Fixes: db8bcaad5393 ("net: lan966x: add the basic lan966x driver")
> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>

...
patchwork-bot+netdevbpf@kernel.org May 23, 2023, 1:40 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Mon, 22 May 2023 14:00:38 +0200 you wrote:
> It was noticing that after a while when unloading/loading the driver and
> sending traffic through the switch, it would stop working. It would stop
> forwarding any traffic and the only way to get out of this was to do a
> power cycle of the board. The root cause seems to be that the switch
> core is initialized twice. Apparently initializing twice the switch core
> disturbs the pointers in the queue systems in the HW, so after a while
> it would stop sending the traffic.
> Unfortunetly, it is not possible to use a reset of the switch here,
> because the reset line is connected to multiple devices like MDIO,
> SGPIO, FAN, etc. So then all the devices will get reseted when the
> network driver will be loaded.
> So the fix is to check if the core is initialized already and if that is
> the case don't initialize it again.
> 
> [...]

Here is the summary with links:
  - [net] lan966x: Fix unloading/loading of the driver
    https://git.kernel.org/netdev/net/c/600761245952

You are awesome, thank you!

Patch

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
index 5f01b21acdd1b..f6931dfb3e68e 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
@@ -1039,6 +1039,16 @@  static int lan966x_reset_switch(struct lan966x *lan966x)
 
 	reset_control_reset(switch_reset);
 
+	/* Don't reinitialize the switch core, if it is already initialized. In
+	 * case it is initialized twice, some pointers inside the queue system
+	 * in HW will get corrupted and then after a while the queue system gets
+	 * full and no traffic is passing through the switch. The issue is seen
+	 * when loading and unloading the driver and sending traffic through the
+	 * switch.
+	 */
+	if (lan_rd(lan966x, SYS_RESET_CFG) & SYS_RESET_CFG_CORE_ENA)
+		return 0;
+
 	lan_wr(SYS_RESET_CFG_CORE_ENA_SET(0), lan966x, SYS_RESET_CFG);
 	lan_wr(SYS_RAM_INIT_RAM_INIT_SET(1), lan966x, SYS_RAM_INIT);
 	ret = readx_poll_timeout(lan966x_ram_init, lan966x,