diff mbox series

[net-next] net: sfp: add debugfs support

Message ID E1khJyS-0003UU-9O@rmk-PC.armlinux.org.uk (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: sfp: add debugfs support | expand

Checks

Context Check Description
netdev/apply fail Patch does not apply to net-next
netdev/tree_selection success Clearly marked for net-next

Commit Message

Russell King (Oracle) Nov. 23, 2020, 10:06 p.m. UTC
Add debugfs support to SFP so that the internal state of the SFP state
machines and hardware signal state can be viewed from userspace, rather
than having to compile a debug kernel to view state state transitions
in the kernel log.  The 'state' output looks like:

Module state: empty
Module probe attempts: 0 0
Device state: up
Main state: down
Fault recovery remaining retries: 5
PHY probe remaining retries: 12
moddef0: 0
rx_los: 1
tx_fault: 1
tx_disable: 1

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
This is useful to view the state of the SFP cage when e.g., debugging
a module that the link doesn't seem to be coming up for. Rather than
sending this patch each time there is a query, it seems sensible to
merge it into mainline instead.

 drivers/net/phy/sfp.c | 55 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

Comments

Andrew Lunn Nov. 24, 2020, 12:14 a.m. UTC | #1
On Mon, Nov 23, 2020 at 10:06:16PM +0000, Russell King wrote:
> Add debugfs support to SFP so that the internal state of the SFP state
> machines and hardware signal state can be viewed from userspace, rather
> than having to compile a debug kernel to view state state transitions
> in the kernel log.  The 'state' output looks like:
> 
> Module state: empty
> Module probe attempts: 0 0
> Device state: up
> Main state: down
> Fault recovery remaining retries: 5
> PHY probe remaining retries: 12
> moddef0: 0
> rx_los: 1
> tx_fault: 1
> tx_disable: 1
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Hi Russell

This looks useful. I always seem to end up recompiling the kernel,
which as you said, this should avoid.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew
Ido Schimmel Nov. 24, 2020, 8:41 a.m. UTC | #2
On Tue, Nov 24, 2020 at 01:14:31AM +0100, Andrew Lunn wrote:
> On Mon, Nov 23, 2020 at 10:06:16PM +0000, Russell King wrote:
> > Add debugfs support to SFP so that the internal state of the SFP state
> > machines and hardware signal state can be viewed from userspace, rather
> > than having to compile a debug kernel to view state state transitions
> > in the kernel log.  The 'state' output looks like:
> > 
> > Module state: empty
> > Module probe attempts: 0 0
> > Device state: up
> > Main state: down
> > Fault recovery remaining retries: 5
> > PHY probe remaining retries: 12
> > moddef0: 0
> > rx_los: 1
> > tx_fault: 1
> > tx_disable: 1
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> 
> Hi Russell
> 
> This looks useful. I always seem to end up recompiling the kernel,
> which as you said, this should avoid.

FWIW, another option is to use drgn [1]. Especially when the state is
queried from the kernel and not hardware. We are using that in mlxsw
[2][3].

[1] https://github.com/osandov/drgn
[2] https://github.com/Mellanox/mlxsw/blob/master/Debugging/hdroom_dump.txt
[3] https://github.com/Mellanox/mlxsw/blob/master/Debugging/hdroom_dump.py
Russell King (Oracle) Nov. 24, 2020, 9:49 a.m. UTC | #3
On Tue, Nov 24, 2020 at 10:41:51AM +0200, Ido Schimmel wrote:
> On Tue, Nov 24, 2020 at 01:14:31AM +0100, Andrew Lunn wrote:
> > On Mon, Nov 23, 2020 at 10:06:16PM +0000, Russell King wrote:
> > > Add debugfs support to SFP so that the internal state of the SFP state
> > > machines and hardware signal state can be viewed from userspace, rather
> > > than having to compile a debug kernel to view state state transitions
> > > in the kernel log.  The 'state' output looks like:
> > > 
> > > Module state: empty
> > > Module probe attempts: 0 0
> > > Device state: up
> > > Main state: down
> > > Fault recovery remaining retries: 5
> > > PHY probe remaining retries: 12
> > > moddef0: 0
> > > rx_los: 1
> > > tx_fault: 1
> > > tx_disable: 1
> > > 
> > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > 
> > Hi Russell
> > 
> > This looks useful. I always seem to end up recompiling the kernel,
> > which as you said, this should avoid.
> 
> FWIW, another option is to use drgn [1]. Especially when the state is
> queried from the kernel and not hardware. We are using that in mlxsw
> [2][3].

Presumably that requires /proc/kcore support, which 32-bit ARM doesn't
have.
Ido Schimmel Nov. 24, 2020, 10:46 a.m. UTC | #4
On Tue, Nov 24, 2020 at 09:49:16AM +0000, Russell King - ARM Linux admin wrote:
> On Tue, Nov 24, 2020 at 10:41:51AM +0200, Ido Schimmel wrote:
> > On Tue, Nov 24, 2020 at 01:14:31AM +0100, Andrew Lunn wrote:
> > > On Mon, Nov 23, 2020 at 10:06:16PM +0000, Russell King wrote:
> > > > Add debugfs support to SFP so that the internal state of the SFP state
> > > > machines and hardware signal state can be viewed from userspace, rather
> > > > than having to compile a debug kernel to view state state transitions
> > > > in the kernel log.  The 'state' output looks like:
> > > > 
> > > > Module state: empty
> > > > Module probe attempts: 0 0
> > > > Device state: up
> > > > Main state: down
> > > > Fault recovery remaining retries: 5
> > > > PHY probe remaining retries: 12
> > > > moddef0: 0
> > > > rx_los: 1
> > > > tx_fault: 1
> > > > tx_disable: 1
> > > > 
> > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > > 
> > > Hi Russell
> > > 
> > > This looks useful. I always seem to end up recompiling the kernel,
> > > which as you said, this should avoid.
> > 
> > FWIW, another option is to use drgn [1]. Especially when the state is
> > queried from the kernel and not hardware. We are using that in mlxsw
> > [2][3].
> 
> Presumably that requires /proc/kcore support, which 32-bit ARM doesn't
> have.

Yes, it does seem to be required for live debugging. I mostly work with
x86 systems, I guess it's completely different for Andrew and you.
Russell King (Oracle) Dec. 2, 2020, 1:03 p.m. UTC | #5
Jakub,

What's your opinion on this patch? It seems to have stalled...

Regards,
Russell

On Tue, Nov 24, 2020 at 12:46:40PM +0200, Ido Schimmel wrote:
> On Tue, Nov 24, 2020 at 09:49:16AM +0000, Russell King - ARM Linux admin wrote:
> > On Tue, Nov 24, 2020 at 10:41:51AM +0200, Ido Schimmel wrote:
> > > On Tue, Nov 24, 2020 at 01:14:31AM +0100, Andrew Lunn wrote:
> > > > On Mon, Nov 23, 2020 at 10:06:16PM +0000, Russell King wrote:
> > > > > Add debugfs support to SFP so that the internal state of the SFP state
> > > > > machines and hardware signal state can be viewed from userspace, rather
> > > > > than having to compile a debug kernel to view state state transitions
> > > > > in the kernel log.  The 'state' output looks like:
> > > > > 
> > > > > Module state: empty
> > > > > Module probe attempts: 0 0
> > > > > Device state: up
> > > > > Main state: down
> > > > > Fault recovery remaining retries: 5
> > > > > PHY probe remaining retries: 12
> > > > > moddef0: 0
> > > > > rx_los: 1
> > > > > tx_fault: 1
> > > > > tx_disable: 1
> > > > > 
> > > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > > > 
> > > > Hi Russell
> > > > 
> > > > This looks useful. I always seem to end up recompiling the kernel,
> > > > which as you said, this should avoid.
> > > 
> > > FWIW, another option is to use drgn [1]. Especially when the state is
> > > queried from the kernel and not hardware. We are using that in mlxsw
> > > [2][3].
> > 
> > Presumably that requires /proc/kcore support, which 32-bit ARM doesn't
> > have.
> 
> Yes, it does seem to be required for live debugging. I mostly work with
> x86 systems, I guess it's completely different for Andrew and you.
>
Jakub Kicinski Dec. 2, 2020, 4:59 p.m. UTC | #6
On Wed, 2 Dec 2020 13:03:18 +0000 Russell King - ARM Linux admin wrote:
> Jakub,
> 
> What's your opinion on this patch? It seems to have stalled...

Sorry, I think I expected someone to do the obvious questioning..

> On Tue, Nov 24, 2020 at 12:46:40PM +0200, Ido Schimmel wrote:
> > On Tue, Nov 24, 2020 at 09:49:16AM +0000, Russell King - ARM Linux admin wrote:  
> > > On Tue, Nov 24, 2020 at 10:41:51AM +0200, Ido Schimmel wrote:  
> > > > On Tue, Nov 24, 2020 at 01:14:31AM +0100, Andrew Lunn wrote:  
> > > > > On Mon, Nov 23, 2020 at 10:06:16PM +0000, Russell King wrote:  
> > > > > > Add debugfs support to SFP so that the internal state of the SFP state
> > > > > > machines and hardware signal state can be viewed from userspace, rather
> > > > > > than having to compile a debug kernel to view state state transitions
> > > > > > in the kernel log.  The 'state' output looks like:
> > > > > > 
> > > > > > Module state: empty
> > > > > > Module probe attempts: 0 0
> > > > > > Device state: up
> > > > > > Main state: down
> > > > > > Fault recovery remaining retries: 5
> > > > > > PHY probe remaining retries: 12

Perfectly reasonable, no objections.

> > > > > > moddef0: 0
> > > > > > rx_los: 1
> > > > > > tx_fault: 1
> > > > > > tx_disable: 1

These, tho, are standard SFP signals, right? Maybe we should put them
in struct ethtool_link_ext_state_info? I remember that various "vendor
tools" report those, maybe it'd be nice to have a standard way of
exposing those signals tru ethtool APIs?

Opinions welcome (let me CC more NIC people)!

> > > > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>  
> > > > > 
> > > > > Hi Russell
> > > > > 
> > > > > This looks useful. I always seem to end up recompiling the kernel,
> > > > > which as you said, this should avoid.  
> > > > 
> > > > FWIW, another option is to use drgn [1]. Especially when the state is
> > > > queried from the kernel and not hardware. We are using that in mlxsw
> > > > [2][3].  
> > > 
> > > Presumably that requires /proc/kcore support, which 32-bit ARM doesn't
> > > have.  
> > 
> > Yes, it does seem to be required for live debugging. I mostly work with
> > x86 systems, I guess it's completely different for Andrew and you.
Jakub Kicinski Dec. 2, 2020, 5:01 p.m. UTC | #7
On Wed, 2 Dec 2020 08:59:13 -0800 Jakub Kicinski wrote:
> On Wed, 2 Dec 2020 13:03:18 +0000 Russell King - ARM Linux admin wrote:
> > Jakub,
> > 
> > What's your opinion on this patch? It seems to have stalled...  
> 
> Sorry, I think I expected someone to do the obvious questioning..

Ah, no! I know what happened... Check out patchwork:

https://patchwork.kernel.org/project/netdevbpf/patch/E1khJyS-0003UU-9O@rmk-PC.armlinux.org.uk/

It says the patch did not apply cleanly to net-next ;)

Regardless let's hear what people thing about using ext_link (or
similar) for the SFP signals.
Russell King (Oracle) Dec. 7, 2020, 4:06 p.m. UTC | #8
On Wed, Dec 02, 2020 at 09:01:47AM -0800, Jakub Kicinski wrote:
> On Wed, 2 Dec 2020 08:59:13 -0800 Jakub Kicinski wrote:
> > On Wed, 2 Dec 2020 13:03:18 +0000 Russell King - ARM Linux admin wrote:
> > > Jakub,
> > > 
> > > What's your opinion on this patch? It seems to have stalled...  
> > 
> > Sorry, I think I expected someone to do the obvious questioning..
> 
> Ah, no! I know what happened... Check out patchwork:
> 
> https://patchwork.kernel.org/project/netdevbpf/patch/E1khJyS-0003UU-9O@rmk-PC.armlinux.org.uk/
> 
> It says the patch did not apply cleanly to net-next ;)
> 
> Regardless let's hear what people thing about using ext_link (or
> similar) for the SFP signals.

There seems to have been no replies... I think I'll resend with
Andrew's r-b tag. If something better comes along in the future, we
can always change it - debugfs isn't an API.
diff mbox series

Patch

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 1e347afa951e..2c32aa891f17 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1,6 +1,7 @@ 
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/acpi.h>
 #include <linux/ctype.h>
+#include <linux/debugfs.h>
 #include <linux/delay.h>
 #include <linux/gpio/consumer.h>
 #include <linux/hwmon.h>
@@ -258,6 +259,9 @@  struct sfp {
 	char *hwmon_name;
 #endif
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	struct dentry *debugfs_dir;
+#endif
 };
 
 static bool sff_module_supported(const struct sfp_eeprom_id *id)
@@ -1390,6 +1394,54 @@  static void sfp_module_tx_enable(struct sfp *sfp)
 	sfp_set_state(sfp, sfp->state);
 }
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+static int sfp_debug_state_show(struct seq_file *s, void *data)
+{
+	struct sfp *sfp = s->private;
+
+	seq_printf(s, "Module state: %s\n",
+		   mod_state_to_str(sfp->sm_mod_state));
+	seq_printf(s, "Module probe attempts: %d %d\n",
+		   R_PROBE_RETRY_INIT - sfp->sm_mod_tries_init,
+		   R_PROBE_RETRY_SLOW - sfp->sm_mod_tries);
+	seq_printf(s, "Device state: %s\n",
+		   dev_state_to_str(sfp->sm_dev_state));
+	seq_printf(s, "Main state: %s\n",
+		   sm_state_to_str(sfp->sm_state));
+	seq_printf(s, "Fault recovery remaining retries: %d\n",
+		   sfp->sm_fault_retries);
+	seq_printf(s, "PHY probe remaining retries: %d\n",
+		   sfp->sm_phy_retries);
+	seq_printf(s, "moddef0: %d\n", !!(sfp->state & SFP_F_PRESENT));
+	seq_printf(s, "rx_los: %d\n", !!(sfp->state & SFP_F_LOS));
+	seq_printf(s, "tx_fault: %d\n", !!(sfp->state & SFP_F_TX_FAULT));
+	seq_printf(s, "tx_disable: %d\n", !!(sfp->state & SFP_F_TX_DISABLE));
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(sfp_debug_state);
+
+static void sfp_debugfs_init(struct sfp *sfp)
+{
+	sfp->debugfs_dir = debugfs_create_dir(dev_name(sfp->dev), NULL);
+
+	debugfs_create_file("state", 0600, sfp->debugfs_dir, sfp,
+			    &sfp_debug_state_fops);
+}
+
+static void sfp_debugfs_exit(struct sfp *sfp)
+{
+	debugfs_remove_recursive(sfp->debugfs_dir);
+}
+#else
+static void sfp_debugfs_init(struct sfp *sfp)
+{
+}
+
+static void sfp_debugfs_exit(struct sfp *sfp)
+{
+}
+#endif
+
 static void sfp_module_tx_fault_reset(struct sfp *sfp)
 {
 	unsigned int state = sfp->state;
@@ -2472,6 +2524,8 @@  static int sfp_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 
+	sfp_debugfs_init(sfp);
+
 	return 0;
 }
 
@@ -2479,6 +2533,7 @@  static int sfp_remove(struct platform_device *pdev)
 {
 	struct sfp *sfp = platform_get_drvdata(pdev);
 
+	sfp_debugfs_exit(sfp);
 	sfp_unregister_socket(sfp->sfp_bus);
 
 	rtnl_lock();