Message ID | 20250204-netconsole-v2-2-5ef5eb5f6056@purestorage.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Johannes Berg |
Headers | show |
Series | netconsole: allow selection of egress interface via MAC address | expand |
On Tue, Feb 04, 2025 at 02:41:45PM -0700, Uday Shankar wrote: > Currently, netconsole has two methods of configuration - module > parameter and configfs. The former interface allows for netconsole > activation earlier during boot (by specifying the module parameter on > the kernel command line), so it is preferred for debugging issues which > arise before userspace is up/the configfs interface can be used. The > module parameter syntax requires specifying the egress interface name. > This requirement makes it hard to use for a couple reasons: > - The egress interface name can be hard or impossible to predict. For > example, installing a new network card in a system can change the > interface names assigned by the kernel. > - When constructing the module parameter, one may have trouble > determining the original (kernel-assigned) name of the interface > (which is the name that should be given to netconsole) if some stable > interface naming scheme is in effect. A human can usually look at > kernel logs to determine the original name, but this is very painful > if automation is constructing the parameter. > > For these reasons, allow selection of the egress interface via MAC > address when configuring netconsole using the module parameter. Update > the netconsole documentation with an example of the new syntax. > Selection of egress interface by MAC address via configfs is far less > interesting (since when this interface can be used, one should be able > to easily convert between MAC address and interface name), so it is left > unimplemented. > > Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Breno Leitao <leitao@debian.org> Tested-by: Breno Leitao <leitao@debian.org> > int netpoll_setup(struct netpoll *np) > { > + struct net *net = current->nsproxy->net_ns; > struct net_device *ndev = NULL; > bool ip_overwritten = false; > + char buf[MAC_ADDR_LEN + 1]; > struct in_device *in_dev; > int err; > > rtnl_lock(); > - if (np->dev_name[0]) { > - struct net *net = current->nsproxy->net_ns; > + if (np->dev_name[0]) > ndev = __dev_get_by_name(net, np->dev_name); > - } > + else if (is_valid_ether_addr(np->dev_mac)) > + ndev = dev_getbyhwaddr_rcu(net, ARPHRD_ETHER, np->dev_mac); You do not have the RCU read lock here. You have the rtnl(), which is sufficient, but, CONFIG_PROVE_RCU_LIST will show something as: WARNING: suspicious RCU usage 6.13.0-09701-g6610c7be45bb-dirty #18 Not tainted ----------------------------- net/core/dev.c:1143 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by swapper/0/1: #0: ffffffff832795b8 (rtnl_mutex){+.+.}-{4:4}, at: netpoll_setup+0x48/0x540 stack backtrace: CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.13.0-virtme-09701-g6610c7be45bb-dirty #18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x9f/0xf0 lockdep_rcu_suspicious+0x11a/0x150 dev_getbyhwaddr_rcu+0xb6/0xc0 netpoll_setup+0x8a/0x540 ? netpoll_parse_options+0x2bd/0x310 This is not a problem per-se, since you have RTNL. We probably need to tell for_each_netdev_rcu() to not comply about "RCU-list traversed in non-reader section" if RTNL is held. Not sure why we didn't hit in the test infrastructure, tho: https://patchwork.kernel.org/project/netdevbpf/patch/20250204-netconsole-v2-2-5ef5eb5f6056@purestorage.com/ Anyway, no action item for you here. I am talking to Jakub on a way to solve it, and I should send a fix soon. Thanks for the patch, --breno
On Wed, Feb 05, 2025 at 11:07:45AM -0800, Breno Leitao wrote: > > + else if (is_valid_ether_addr(np->dev_mac)) > > + ndev = dev_getbyhwaddr_rcu(net, ARPHRD_ETHER, np->dev_mac); > > You do not have the RCU read lock here. You have the rtnl(), which is > sufficient, but, CONFIG_PROVE_RCU_LIST will show something as: > > WARNING: suspicious RCU usage > 6.13.0-09701-g6610c7be45bb-dirty #18 Not tainted > ----------------------------- > net/core/dev.c:1143 RCU-list traversed in non-reader section!! > other info that might help us debug this: > rcu_scheduler_active = 2, debug_locks = 1 > 1 lock held by swapper/0/1: > #0: ffffffff832795b8 (rtnl_mutex){+.+.}-{4:4}, at: netpoll_setup+0x48/0x540 > stack backtrace: > CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.13.0-virtme-09701-g6610c7be45bb-dirty #18 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0x9f/0xf0 > lockdep_rcu_suspicious+0x11a/0x150 > dev_getbyhwaddr_rcu+0xb6/0xc0 > netpoll_setup+0x8a/0x540 > ? netpoll_parse_options+0x2bd/0x310 > > This is not a problem per-se, since you have RTNL. We probably need to > tell for_each_netdev_rcu() to not comply about "RCU-list traversed in > non-reader section" if RTNL is held. Not sure why we didn't hit in the > test infrastructure, tho: > > https://patchwork.kernel.org/project/netdevbpf/patch/20250204-netconsole-v2-2-5ef5eb5f6056@purestorage.com/ I don't think there is an automated test that will hit this path yet. I guess you got this trace from your manual testing? > > Anyway, no action item for you here. I am talking to Jakub on a way to > solve it, and I should send a fix soon. /** * list_for_each_entry_rcu - iterate over rcu list of given type * @pos: the type * to use as a loop cursor. * @head: the head for your list. * @member: the name of the list_head within the struct. * @cond: optional lockdep expression if called from non-RCU protection. * * This list-traversal primitive may safely run concurrently with * the _rcu list-mutation primitives such as list_add_rcu() * as long as the traversal is guarded by rcu_read_lock(). */ #define list_for_each_entry_rcu(pos, head, member, cond...) \ for (__list_check_rcu(dummy, ## cond, 0), \ pos = list_entry_rcu((head)->next, typeof(*pos), member); \ &pos->member != (head); \ pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) If we do something like list_for_each_entry_rcu(..., lockdep_rtnl_is_held()) ... I think that code will be okay with being called with either rcu or rtnl held. Of course, we need to plumb it through the net-specific helpers.
diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst index 94c4680fdf3e7e1a0020d11b44547acfd68072a5..90a1bbb52918a0163828f4e96c89781e0bc6856b 100644 --- a/Documentation/networking/netconsole.rst +++ b/Documentation/networking/netconsole.rst @@ -45,7 +45,7 @@ following format:: r if present, prepend kernel version (release) to the message src-port source for UDP packets (defaults to 6665) src-ip source IP to use (interface address) - dev network interface (eth0) + dev network interface name (eth0) or MAC address tgt-port port for logging agent (6666) tgt-ip IP address for logging agent tgt-macaddr ethernet MAC address for logging agent (broadcast) @@ -62,6 +62,10 @@ or using IPv6:: insmod netconsole netconsole=@/,@fd00:1:2:3::1/ +or using a MAC address to select the egress interface:: + + linux netconsole=4444@10.0.0.1/22:33:44:55:66:77,9353@10.0.0.2/12:34:56:78:9a:bc + It also supports logging to multiple remote agents by specifying parameters for the multiple agents separated by semicolons and the complete string enclosed in "quotes", thusly:: diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h index f91e50a76efd4b016381c632456397eea1ea877f..1ade65b59be49cfdcf86ed6e938287b949aa9f58 100644 --- a/include/linux/netpoll.h +++ b/include/linux/netpoll.h @@ -25,7 +25,13 @@ union inet_addr { struct netpoll { struct net_device *dev; netdevice_tracker dev_tracker; + /* + * Either dev_name or dev_mac can be used to specify the local + * interface - dev_name is used if it is a nonempty string, else + * dev_mac is used. + */ char dev_name[IFNAMSIZ]; + u8 dev_mac[ETH_ALEN]; const char *name; union inet_addr local_ip, remote_ip; diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 62b4041aae1ae8c7dc47c89fb40b14bbd4ad0e0e..c4b329fa874748eddff64bbba13902893faebbce 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -501,7 +501,8 @@ void netpoll_print_options(struct netpoll *np) np_info(np, "local IPv6 address %pI6c\n", &np->local_ip.in6); else np_info(np, "local IPv4 address %pI4\n", &np->local_ip.ip); - np_info(np, "interface '%s'\n", np->dev_name); + np_info(np, "interface name '%s'\n", np->dev_name); + np_info(np, "local ethernet address '%pM'\n", np->dev_mac); np_info(np, "remote port %d\n", np->remote_port); if (np->ipv6) np_info(np, "remote IPv6 address %pI6c\n", &np->remote_ip.in6); @@ -570,11 +571,18 @@ int netpoll_parse_options(struct netpoll *np, char *opt) cur++; if (*cur != ',') { - /* parse out dev name */ + /* parse out dev_name or dev_mac */ if ((delim = strchr(cur, ',')) == NULL) goto parse_failed; *delim = 0; - strscpy(np->dev_name, cur, sizeof(np->dev_name)); + + np->dev_name[0] = '\0'; + eth_broadcast_addr(np->dev_mac); + if (!strchr(cur, ':')) + strscpy(np->dev_name, cur, sizeof(np->dev_name)); + else if (!mac_pton(cur, np->dev_mac)) + goto parse_failed; + cur = delim; } cur++; @@ -679,27 +687,45 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev) } EXPORT_SYMBOL_GPL(__netpoll_setup); +/* + * Returns a pointer to a string representation of the identifier used + * to select the egress interface for the given netpoll instance. buf + * must be a buffer of length at least MAC_ADDR_LEN + 1. + */ +static char *egress_dev(struct netpoll *np, char *buf) +{ + if (np->dev_name[0]) + return np->dev_name; + + snprintf(buf, MAC_ADDR_LEN, "%pM", np->dev_mac); + return buf; +} + int netpoll_setup(struct netpoll *np) { + struct net *net = current->nsproxy->net_ns; struct net_device *ndev = NULL; bool ip_overwritten = false; + char buf[MAC_ADDR_LEN + 1]; struct in_device *in_dev; int err; rtnl_lock(); - if (np->dev_name[0]) { - struct net *net = current->nsproxy->net_ns; + if (np->dev_name[0]) ndev = __dev_get_by_name(net, np->dev_name); - } + else if (is_valid_ether_addr(np->dev_mac)) + ndev = dev_getbyhwaddr_rcu(net, ARPHRD_ETHER, np->dev_mac); + if (!ndev) { - np_err(np, "%s doesn't exist, aborting\n", np->dev_name); + np_err(np, "%s doesn't exist, aborting\n", egress_dev(np, buf)); err = -ENODEV; goto unlock; } netdev_hold(ndev, &np->dev_tracker, GFP_KERNEL); if (netdev_master_upper_dev_get(ndev)) { - np_err(np, "%s is a slave device, aborting\n", np->dev_name); + np_err(np, "%s is a slave device, aborting\n", + egress_dev(np, buf)); err = -EBUSY; goto put; } @@ -707,7 +733,8 @@ int netpoll_setup(struct netpoll *np) if (!netif_running(ndev)) { unsigned long atmost; - np_info(np, "device %s not up yet, forcing it\n", np->dev_name); + np_info(np, "device %s not up yet, forcing it\n", + egress_dev(np, buf)); err = dev_open(ndev, NULL); @@ -741,7 +768,7 @@ int netpoll_setup(struct netpoll *np) if (!ifa) { put_noaddr: np_err(np, "no IP address for %s, aborting\n", - np->dev_name); + egress_dev(np, buf)); err = -EDESTADDRREQ; goto put; } @@ -772,13 +799,13 @@ int netpoll_setup(struct netpoll *np) } if (err) { np_err(np, "no IPv6 address for %s, aborting\n", - np->dev_name); + egress_dev(np, buf)); goto put; } else np_info(np, "local IPv6 %pI6c\n", &np->local_ip.in6); #else np_err(np, "IPv6 is not supported %s, aborting\n", - np->dev_name); + egress_dev(np, buf)); err = -EINVAL; goto put; #endif
Currently, netconsole has two methods of configuration - module parameter and configfs. The former interface allows for netconsole activation earlier during boot (by specifying the module parameter on the kernel command line), so it is preferred for debugging issues which arise before userspace is up/the configfs interface can be used. The module parameter syntax requires specifying the egress interface name. This requirement makes it hard to use for a couple reasons: - The egress interface name can be hard or impossible to predict. For example, installing a new network card in a system can change the interface names assigned by the kernel. - When constructing the module parameter, one may have trouble determining the original (kernel-assigned) name of the interface (which is the name that should be given to netconsole) if some stable interface naming scheme is in effect. A human can usually look at kernel logs to determine the original name, but this is very painful if automation is constructing the parameter. For these reasons, allow selection of the egress interface via MAC address when configuring netconsole using the module parameter. Update the netconsole documentation with an example of the new syntax. Selection of egress interface by MAC address via configfs is far less interesting (since when this interface can be used, one should be able to easily convert between MAC address and interface name), so it is left unimplemented. Signed-off-by: Uday Shankar <ushankar@purestorage.com> --- Documentation/networking/netconsole.rst | 6 +++- include/linux/netpoll.h | 6 ++++ net/core/netpoll.c | 51 +++++++++++++++++++++++++-------- 3 files changed, 50 insertions(+), 13 deletions(-)