diff mbox series

[net-next,v4,2/2] netcons: Add udp send fail statistics to netconsole

Message ID 20241027-netcons-add-udp-send-fail-statistics-to-netconsole-v4-2-a8065a43c897@kutsevol.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series netcons: Add udp send fail statistics to netconsole | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5 this patch: 5
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 150 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-10-30--15-00 (tests: 777)

Commit Message

Maksym Kutsevol Oct. 27, 2024, 7:59 p.m. UTC
Enhance observability of netconsole. Packet sends can fail.
Start tracking at least two failure possibilities: ENOMEM and
NET_XMIT_DROP for every target. Stats are exposed via an additional
attribute in CONFIGFS.

The exposed statistics allows easier debugging of cases when netconsole
messages were not seen by receivers, eliminating the guesswork if the
sender thinks that messages in question were sent out.

Stats are not reset on enable/disable/change remote ip/etc, they
belong to the netcons target itself.

Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/all/ZsWoUzyK5du9Ffl+@gmail.com/
Signed-off-by: Maksym Kutsevol <max@kutsevol.com>
---
 Documentation/networking/netconsole.rst |  5 +--
 drivers/net/netconsole.c                | 61 +++++++++++++++++++++++++++++++--
 2 files changed, 61 insertions(+), 5 deletions(-)

Comments

Breno Leitao Oct. 28, 2024, 10:45 a.m. UTC | #1
On Sun, Oct 27, 2024 at 12:59:42PM -0700, Maksym Kutsevol wrote:
> Enhance observability of netconsole. Packet sends can fail.
> Start tracking at least two failure possibilities: ENOMEM and
> NET_XMIT_DROP for every target. Stats are exposed via an additional
> attribute in CONFIGFS.
> 
> The exposed statistics allows easier debugging of cases when netconsole
> messages were not seen by receivers, eliminating the guesswork if the
> sender thinks that messages in question were sent out.
> 
> Stats are not reset on enable/disable/change remote ip/etc, they
> belong to the netcons target itself.
> 
> Reported-by: Breno Leitao <leitao@debian.org>
> Closes: https://lore.kernel.org/all/ZsWoUzyK5du9Ffl+@gmail.com/
> Signed-off-by: Maksym Kutsevol <max@kutsevol.com>

Reviewed-by: Breno Leitao <leitao@debian.org>
Jakub Kicinski Oct. 31, 2024, 1:49 a.m. UTC | #2
On Sun, 27 Oct 2024 12:59:42 -0700 Maksym Kutsevol wrote:
> +struct netconsole_target_stats  {
> +	u64_stats_t xmit_drop_count;
> +	u64_stats_t enomem_count;
> +	struct u64_stats_sync syncp;
> +} __aligned(2 * sizeof(u64));

Why the alignment?

> +static void netpoll_send_udp_count_errs(struct netconsole_target *nt,
> +					const char *msg, int len)

This is defined in the netconsole driver, it should not use the
netpoll_ prefix for the function name.
Maksym Kutsevol Nov. 1, 2024, 1:19 a.m. UTC | #3
On Wed, Oct 30, 2024 at 9:49 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 27 Oct 2024 12:59:42 -0700 Maksym Kutsevol wrote:
> > +struct netconsole_target_stats  {
> > +     u64_stats_t xmit_drop_count;
> > +     u64_stats_t enomem_count;
> > +     struct u64_stats_sync syncp;
> > +} __aligned(2 * sizeof(u64));
>
> Why the alignment?
Hi Jakub,

Thanks for looking into this!

Parroting examples, e.g.
struct pcpu_lstats {
u64_stats_t packets;
u64_stats_t bytes;
struct u64_stats_sync syncp;
} __aligned(2 * sizeof(u64));

in netdevice.h https://github.com/torvalds/linux/blob/master/include/linux/netdevice.h#L2743-L2747
I don't have any strongly held opinion about this. I'd appreciate an
explanation (a link would suffice) why this is a bad idea.


> > +static void netpoll_send_udp_count_errs(struct netconsole_target *nt,
> > +                                     const char *msg, int len)
>
> This is defined in the netconsole driver, it should not use the
> netpoll_ prefix for the function name.'

netconsole_send_udp_count_errs sounds better?

Regards,
Maksym
Jakub Kicinski Nov. 1, 2024, 1:49 a.m. UTC | #4
On Thu, 31 Oct 2024 21:19:26 -0400 Maksym Kutsevol wrote:
> Thanks for looking into this!
> 
> Parroting examples, e.g.
> struct pcpu_lstats {
> u64_stats_t packets;
> u64_stats_t bytes;
> struct u64_stats_sync syncp;
> } __aligned(2 * sizeof(u64));
> 
> in netdevice.h https://github.com/torvalds/linux/blob/master/include/linux/netdevice.h#L2743-L2747
> I don't have any strongly held opinion about this. I'd appreciate an
> explanation (a link would suffice) why this is a bad idea.

No entirely sure why the pcpu stats in netdev are aligned like this,
but yours are not per cpu, and not fast path of any sort. So aligning
is a premature optimization in the first place.

> > > +static void netpoll_send_udp_count_errs(struct netconsole_target *nt,
> > > +                                     const char *msg, int len)  
> >
> > This is defined in the netconsole driver, it should not use the
> > netpoll_ prefix for the function name.'  
> 
> netconsole_send_udp_count_errs sounds better?

I don't think the _count_errs() suffix is needed any more
diff mbox series

Patch

diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
index d55c2a22ec7a..94c4680fdf3e 100644
--- a/Documentation/networking/netconsole.rst
+++ b/Documentation/networking/netconsole.rst
@@ -124,7 +124,7 @@  To remove a target::
 
 The interface exposes these parameters of a netconsole target to userspace:
 
-	==============  =================================       ============
+	=============== =================================       ============
 	enabled		Is this target currently enabled?	(read-write)
 	extended	Extended mode enabled			(read-write)
 	release		Prepend kernel release to message	(read-write)
@@ -135,7 +135,8 @@  The interface exposes these parameters of a netconsole target to userspace:
 	remote_ip	Remote agent's IP address		(read-write)
 	local_mac	Local interface's MAC address		(read-only)
 	remote_mac	Remote agent's MAC address		(read-write)
-	==============  =================================       ============
+	transmit_errors	Number of packet send errors		(read-only)
+	=============== =================================       ============
 
 The "enabled" attribute is also used to control whether the parameters of
 a target can be updated or not -- you can modify the parameters of only
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 4ea44a2f48f7..5f5e5a8cd896 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -36,6 +36,7 @@ 
 #include <linux/inet.h>
 #include <linux/configfs.h>
 #include <linux/etherdevice.h>
+#include <linux/u64_stats_sync.h>
 #include <linux/utsname.h>
 #include <linux/rtnetlink.h>
 
@@ -90,6 +91,12 @@  static DEFINE_MUTEX(target_cleanup_list_lock);
  */
 static struct console netconsole_ext;
 
+struct netconsole_target_stats  {
+	u64_stats_t xmit_drop_count;
+	u64_stats_t enomem_count;
+	struct u64_stats_sync syncp;
+} __aligned(2 * sizeof(u64));
+
 /**
  * struct netconsole_target - Represents a configured netconsole target.
  * @list:	Links this target into the target_list.
@@ -97,6 +104,7 @@  static struct console netconsole_ext;
  * @userdata_group:	Links to the userdata configfs hierarchy
  * @userdata_complete:	Cached, formatted string of append
  * @userdata_length:	String length of userdata_complete
+ * @stats:	Packet send stats for the target. Used for debugging.
  * @enabled:	On / off knob to enable / disable target.
  *		Visible from userspace (read-write).
  *		We maintain a strict 1:1 correspondence between this and
@@ -124,6 +132,7 @@  struct netconsole_target {
 	char userdata_complete[MAX_USERDATA_ENTRY_LENGTH * MAX_USERDATA_ITEMS];
 	size_t			userdata_length;
 #endif
+	struct netconsole_target_stats stats;
 	bool			enabled;
 	bool			extended;
 	bool			release;
@@ -262,6 +271,7 @@  static void netconsole_process_cleanups_core(void)
  *				|	remote_ip
  *				|	local_mac
  *				|	remote_mac
+ *				|	transmit_errors
  *				|	userdata/
  *				|		<key>/
  *				|			value
@@ -371,6 +381,21 @@  static ssize_t remote_mac_show(struct config_item *item, char *buf)
 	return sysfs_emit(buf, "%pM\n", to_target(item)->np.remote_mac);
 }
 
+static ssize_t transmit_errors_show(struct config_item *item, char *buf)
+{
+	struct netconsole_target *nt = to_target(item);
+	u64 xmit_drop_count, enomem_count;
+	unsigned int start;
+
+	do {
+		start = u64_stats_fetch_begin(&nt->stats.syncp);
+		xmit_drop_count = u64_stats_read(&nt->stats.xmit_drop_count);
+		enomem_count = u64_stats_read(&nt->stats.enomem_count);
+	} while (u64_stats_fetch_retry(&nt->stats.syncp, start));
+
+	return sysfs_emit(buf, "%llu\n", xmit_drop_count + enomem_count);
+}
+
 /*
  * This one is special -- targets created through the configfs interface
  * are not enabled (and the corresponding netpoll activated) by default.
@@ -842,6 +867,7 @@  CONFIGFS_ATTR(, remote_ip);
 CONFIGFS_ATTR_RO(, local_mac);
 CONFIGFS_ATTR(, remote_mac);
 CONFIGFS_ATTR(, release);
+CONFIGFS_ATTR_RO(, transmit_errors);
 
 static struct configfs_attribute *netconsole_target_attrs[] = {
 	&attr_enabled,
@@ -854,6 +880,7 @@  static struct configfs_attribute *netconsole_target_attrs[] = {
 	&attr_remote_ip,
 	&attr_local_mac,
 	&attr_remote_mac,
+	&attr_transmit_errors,
 	NULL,
 };
 
@@ -1058,6 +1085,34 @@  static struct notifier_block netconsole_netdev_notifier = {
 	.notifier_call  = netconsole_netdev_event,
 };
 
+/**
+ * netpoll_send_udp_count_errs - Wrapper for netpoll_send_udp that counts errors
+ * @nt: target to send message to
+ * @msg: message to send
+ * @len: length of message
+ *
+ * Calls netpoll_send_udp and classifies the return value. If an error
+ * occurred it increments statistics in nt->stats accordingly.
+ * Only calls netpoll_send_udp if CONFIG_NETCONSOLE_DYNAMIC is disabled.
+ */
+static void netpoll_send_udp_count_errs(struct netconsole_target *nt,
+					const char *msg, int len)
+{
+	int result = netpoll_send_udp(&nt->np, msg, len);
+
+	if (IS_ENABLED(CONFIG_NETCONSOLE_DYNAMIC)) {
+		if (result == NET_XMIT_DROP) {
+			u64_stats_update_begin(&nt->stats.syncp);
+			u64_stats_inc(&nt->stats.xmit_drop_count);
+			u64_stats_update_end(&nt->stats.syncp);
+		} else if (result == -ENOMEM) {
+			u64_stats_update_begin(&nt->stats.syncp);
+			u64_stats_inc(&nt->stats.enomem_count);
+			u64_stats_update_end(&nt->stats.syncp);
+		}
+	}
+}
+
 static void send_msg_no_fragmentation(struct netconsole_target *nt,
 				      const char *msg,
 				      int msg_len,
@@ -1085,7 +1140,7 @@  static void send_msg_no_fragmentation(struct netconsole_target *nt,
 				     MAX_PRINT_CHUNK - msg_len,
 				     "%s", userdata);
 
-	netpoll_send_udp(&nt->np, buf, msg_len);
+	netpoll_send_udp_count_errs(nt, buf, msg_len);
 }
 
 static void append_release(char *buf)
@@ -1178,7 +1233,7 @@  static void send_fragmented_body(struct netconsole_target *nt, char *buf,
 			this_offset += this_chunk;
 		}
 
-		netpoll_send_udp(&nt->np, buf, this_header + this_offset);
+		netpoll_send_udp_count_errs(nt, buf, this_header + this_offset);
 		offset += this_offset;
 	}
 }
@@ -1288,7 +1343,7 @@  static void write_msg(struct console *con, const char *msg, unsigned int len)
 			tmp = msg;
 			for (left = len; left;) {
 				frag = min(left, MAX_PRINT_CHUNK);
-				netpoll_send_udp(&nt->np, tmp, frag);
+				netpoll_send_udp_count_errs(nt, tmp, frag);
 				tmp += frag;
 				left -= frag;
 			}