From patchwork Wed Dec 11 08:22:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Yuyang Huang X-Patchwork-Id: 13903108 X-Patchwork-Delegate: kuba@kernel.org Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 350591D63FF for ; Wed, 11 Dec 2024 08:22:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733905369; cv=none; b=VKDsJuWOKIE0eWAFnxqvg/4xMu2Kz/HvyKYyBXIPZMfGM23r1tdsK2uJ/W82eXqJNGRAzKAQ/6zsa4+uG/wQ6ExgHB8+uStgoFhFawFtp/y8LNkPLxcoWP3mQFHeXZ6TSN+pTq5D69cKmKgYvEkfPtfuAJqUNT5wRIXAIYWKyiE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733905369; c=relaxed/simple; bh=yQk9lnuDk5QkJJ1vnnO6yCjpyl7DoLRhV7AanwnQKDw=; h=Date:Mime-Version:Message-ID:Subject:From:To:Cc:Content-Type; b=nlbLoML2ArxW37/HuJAQZfMWw2Lg+kDJD46Xy24cIUjKSF6jfQvFCvRKXN9kvS53Blm7z89Tt9zjBmIC3s+ywC6io6lDbz4jBZ4A1VVKi/sgKoqLHvy4GNhivRBUQhhdQw+0v9jgWwGbq5G9VF9x3wC2UIaT9HUTrgGZ8EHj9S0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--yuyanghuang.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Oid95MxH; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--yuyanghuang.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Oid95MxH" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-216430a88b0so31229275ad.0 for ; Wed, 11 Dec 2024 00:22:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733905366; x=1734510166; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:from:subject:message-id :mime-version:date:from:to:cc:subject:date:message-id:reply-to; bh=XKziddB+Jnwlg1Zk3J4SdGKXZ/1Nj6HoN4s1cv7Ynq0=; b=Oid95MxHm2AABGq6+kj6n0nUOtonPXbwPnaspbRdPaCGdH6zcQ34r1q2sn6ABdGQrF QsDQhGEJsAhSba9APvYz4rWlWI/4/dJmtmC56aZbN0Vfnei0B08J8qbhc7rs8s0czp9Q 958oBlsKxyQsoFCNKIJIas8kF0RAavBEg3wYuRi6po4vUlUIitOSp3Wtf9gfzN53qaAx rzgEi6Y9+ad4n3vSnGOf218Zqy3ZoC1kGu1DnRKAOalxDgmDF0bkJS/Ld0xHiX0jX+Dj aEb02s/jjAujoam84BBmE06NOiCRLIHro4hskelXdzWXY2zLCpkiRCwfAXOUOhauo3mm /Www== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733905366; x=1734510166; h=content-transfer-encoding:cc:to:from:subject:message-id :mime-version:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=XKziddB+Jnwlg1Zk3J4SdGKXZ/1Nj6HoN4s1cv7Ynq0=; b=h7MDbtV6snNJwW6gJWJojbsqrTGRiMRYsxMzF8vq1mGlG+ramI8oV2THDq4G55BzfT LxZCW/hTd2qkeILNohCXiTJXbHewrEfUOcl6pjc1sp0utg96mbTiVK0Uit8Kk1/EpOte domXzmhSEnX2REP8j+te+L2JZwBrcM2HWu2xOGmcD5sPiHs+J3DmJ5MYeFIrFsX9G2fe by4S8WfuT8vBJAff10h3b0DVI/V1nntvzVj4FtZAUQMYTDN/g/1OHCz1Xx3XpfMyGdiO 8Cm4H5fByPysd/ArjT0SWOPb1l+GEjb6ifo4f4h2xFMykIL2frbPfhwpaxWODZ2mi/ch cC8Q== X-Forwarded-Encrypted: i=1; AJvYcCXW7T/3BwD9gYM1DBw0+AisDHD4Z7D/N0hPf4co0Ps4KtnaW/cQ4psgqijamHfWepHdpwUEXlE=@vger.kernel.org X-Gm-Message-State: AOJu0YxM003pJP7/1GbFFWuCh/5rUXnHRPY5kHxnyy1vyBBKBuOfdcMG 52TWP/YEOuvVG1e7IX/NHRqMc/Bn4aMGZgZUVcsesjInLRrwO/4YGhMhzK1ZhGCADb8xbfyBMFr jexA/O5lpP7+Pb+XQfvJzdA== X-Google-Smtp-Source: AGHT+IEBDyzzHWxUbnf0z90RqyKgYt79NZEO/5HQgIO2/osvdxmbw80skDH0N8SiADUlTmYMPLDJKcIZh8mJj94/ng== X-Received: from plwg5.prod.google.com ([2002:a17:902:f745:b0:216:271b:1b41]) (user=yuyanghuang job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:98e:b0:215:6f5d:b756 with SMTP id d9443c01a7336-2177839352emr29119645ad.7.1733905366479; Wed, 11 Dec 2024 00:22:46 -0800 (PST) Date: Wed, 11 Dec 2024 17:22:41 +0900 Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241211082241.3372537-1-yuyanghuang@google.com> Subject: [PATCH net-next, v6] netlink: add IGMP/MLD join/leave notifications From: Yuyang Huang To: Yuyang Huang Cc: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , roopa@cumulusnetworks.com, jiri@resnulli.us, stephen@networkplumber.org, jimictw@google.com, prohr@google.com, liuhangbin@gmail.com, nicolas.dichtel@6wind.com, andrew@lunn.ch, netdev@vger.kernel.org, " =?utf-8?q?Maciej_=C5=BBenczykow?= =?utf-8?q?ski?= " , Lorenzo Colitti , Patrick Ruddy X-Patchwork-Delegate: kuba@kernel.org This change introduces netlink notifications for multicast address changes. The following features are included: * Addition and deletion of multicast addresses are reported using RTM_NEWMULTICAST and RTM_DELMULTICAST messages with AF_INET and AF_INET6. * Two new notification groups: RTNLGRP_IPV4_MCADDR and RTNLGRP_IPV6_MCADDR are introduced for receiving these events. This change allows user space applications (e.g., ip monitor) to efficiently track multicast group memberships by listening for netlink events. Previously, applications relied on inefficient polling of procfs, introducing delays. With netlink notifications, applications receive realtime updates on multicast group membership changes, enabling more precise metrics collection and system monitoring.  This change also unlocks the potential for implementing a wide range of sophisticated multicast related features in user space by allowing applications to combine kernel provided multicast address information with user space data and communicate decisions back to the kernel for more fine grained control. This mechanism can be used for various purposes, including multicast filtering, IGMP/MLD offload, and IGMP/MLD snooping. Cc: Maciej Żenczykowski Cc: Lorenzo Colitti Co-developed-by: Patrick Ruddy Signed-off-by: Patrick Ruddy Link: https://lore.kernel.org/r/20180906091056.21109-1-pruddy@vyatta.att-mail.com Signed-off-by: Yuyang Huang --- Changelog since v5: - Update the inet_fill_ifmcaddr() function to populate IFA_CACHEINFO, aligning it with the behavior of inet6_fill_ifmcaddr(). - Reuse inet6_fill_ifmcaddr() in mcast.c for keeping notifications and getting responses in sync. - Use -ENOMEM instead of -ENOBUFS to initialize the error number. - Use nlmsg_free() instead of kfree_skb() on exist. - Minor style fix: Moved "+" to the end of the line. Changelog since v4: - Use WARN_ON_ONCE() instead of WARN_ON() Changelog since v3: - Remove unused variable 'scope' declaration. - Align RTM_NEWMULTICAST and RTM_GETMULTICAST enum definitions with existing code style. Changelog since v2: - Use RT_SCOPE_UNIVERSE for both IGMP and MLD notification messages for consistency. Changelog since v1: - Implement MLD join/leave notifications. - Revise the comment message to make it generic. - Fix netdev/source_inline error. - Reorder local variables according to "reverse xmas tree” style. include/linux/igmp.h | 2 ++ include/net/addrconf.h | 21 +++++++++++ include/uapi/linux/rtnetlink.h | 10 +++++- net/ipv4/igmp.c | 64 ++++++++++++++++++++++++++++++++++ net/ipv6/addrconf.c | 29 +++++---------- net/ipv6/mcast.c | 39 +++++++++++++++++++++ 6 files changed, 144 insertions(+), 21 deletions(-) diff --git a/include/linux/igmp.h b/include/linux/igmp.h index 5171231f70a8..073b30a9b850 100644 --- a/include/linux/igmp.h +++ b/include/linux/igmp.h @@ -87,6 +87,8 @@ struct ip_mc_list { char loaded; unsigned char gsquery; /* check source marks? */ unsigned char crcount; + unsigned long mca_cstamp; + unsigned long mca_tstamp; struct rcu_head rcu; }; diff --git a/include/net/addrconf.h b/include/net/addrconf.h index 363dd63babe7..58337898fa21 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -88,6 +88,23 @@ struct ifa6_config { u16 scope; }; +enum addr_type_t { + UNICAST_ADDR, + MULTICAST_ADDR, + ANYCAST_ADDR, +}; + +struct inet6_fill_args { + u32 portid; + u32 seq; + int event; + unsigned int flags; + int netnsid; + int ifindex; + enum addr_type_t type; + bool force_rt_scope_universe; +}; + int addrconf_init(void); void addrconf_cleanup(void); @@ -525,4 +542,8 @@ int if6_proc_init(void); void if6_proc_exit(void); #endif +int inet6_fill_ifmcaddr(struct sk_buff *skb, + const struct ifmcaddr6 *ifmca, + struct inet6_fill_args *args); + #endif diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index db7254d52d93..eccc0e7dcb7d 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -93,7 +93,11 @@ enum { RTM_NEWPREFIX = 52, #define RTM_NEWPREFIX RTM_NEWPREFIX - RTM_GETMULTICAST = 58, + RTM_NEWMULTICAST = 56, +#define RTM_NEWMULTICAST RTM_NEWMULTICAST + RTM_DELMULTICAST, +#define RTM_DELMULTICAST RTM_DELMULTICAST + RTM_GETMULTICAST, #define RTM_GETMULTICAST RTM_GETMULTICAST RTM_GETANYCAST = 62, @@ -774,6 +778,10 @@ enum rtnetlink_groups { #define RTNLGRP_TUNNEL RTNLGRP_TUNNEL RTNLGRP_STATS, #define RTNLGRP_STATS RTNLGRP_STATS + RTNLGRP_IPV4_MCADDR, +#define RTNLGRP_IPV4_MCADDR RTNLGRP_IPV4_MCADDR + RTNLGRP_IPV6_MCADDR, +#define RTNLGRP_IPV6_MCADDR RTNLGRP_IPV6_MCADDR __RTNLGRP_MAX }; #define RTNLGRP_MAX (__RTNLGRP_MAX - 1) diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 6a238398acc9..8a370ef37d3f 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -88,6 +88,8 @@ #include #include +#include +#include #include #include #include @@ -1430,6 +1432,63 @@ static void ip_mc_hash_remove(struct in_device *in_dev, *mc_hash = im->next_hash; } +static int inet_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev, + const struct ip_mc_list *im, int event) +{ + struct ifa_cacheinfo ci; + struct ifaddrmsg *ifm; + struct nlmsghdr *nlh; + + nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0); + if (!nlh) + return -EMSGSIZE; + + ifm = nlmsg_data(nlh); + ifm->ifa_family = AF_INET; + ifm->ifa_prefixlen = 32; + ifm->ifa_flags = IFA_F_PERMANENT; + ifm->ifa_scope = RT_SCOPE_UNIVERSE; + ifm->ifa_index = dev->ifindex; + + ci.cstamp = (READ_ONCE(im->mca_cstamp) - INITIAL_JIFFIES) * 100UL / HZ; + ci.tstamp = ci.cstamp; + ci.ifa_prefered = INFINITY_LIFE_TIME; + ci.ifa_valid = INFINITY_LIFE_TIME; + + if (nla_put_in_addr(skb, IFA_MULTICAST, im->multiaddr) < 0 || + nla_put(skb, IFA_CACHEINFO, sizeof(ci), &ci) < 0) { + nlmsg_cancel(skb, nlh); + return -EMSGSIZE; + } + + nlmsg_end(skb, nlh); + return 0; +} + +static void inet_ifmcaddr_notify(struct net_device *dev, + const struct ip_mc_list *im, int event) +{ + struct net *net = dev_net(dev); + struct sk_buff *skb; + int err = -ENOMEM; + + skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg)) + + nla_total_size(sizeof(__be32)), GFP_ATOMIC); + if (!skb) + goto error; + + err = inet_fill_ifmcaddr(skb, dev, im, event); + if (err < 0) { + WARN_ON_ONCE(err == -EMSGSIZE); + nlmsg_free(skb); + goto error; + } + + rtnl_notify(skb, net, 0, RTNLGRP_IPV4_MCADDR, NULL, GFP_ATOMIC); + return; +error: + rtnl_set_sk_err(net, RTNLGRP_IPV4_MCADDR, err); +} /* * A socket has joined a multicast group on device dev. @@ -1473,6 +1532,8 @@ static void ____ip_mc_inc_group(struct in_device *in_dev, __be32 addr, im->interface = in_dev; in_dev_hold(in_dev); im->multiaddr = addr; + im->mca_cstamp = jiffies; + im->mca_tstamp = im->mca_cstamp; /* initial mode is (EX, empty) */ im->sfmode = mode; im->sfcount[mode] = 1; @@ -1492,6 +1553,7 @@ static void ____ip_mc_inc_group(struct in_device *in_dev, __be32 addr, igmpv3_del_delrec(in_dev, im); #endif igmp_group_added(im); + inet_ifmcaddr_notify(in_dev->dev, im, RTM_NEWMULTICAST); if (!in_dev->dead) ip_rt_multicast_event(in_dev); out: @@ -1705,6 +1767,8 @@ void __ip_mc_dec_group(struct in_device *in_dev, __be32 addr, gfp_t gfp) *ip = i->next_rcu; in_dev->mc_count--; __igmp_group_dropped(i, gfp); + inet_ifmcaddr_notify(in_dev->dev, i, + RTM_DELMULTICAST); ip_mc_clear_src(i); if (!in_dev->dead) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index c489a1e6aec9..ac8ddda9d14f 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -5126,22 +5126,6 @@ static inline int inet6_ifaddr_msgsize(void) + nla_total_size(4) /* IFA_RT_PRIORITY */; } -enum addr_type_t { - UNICAST_ADDR, - MULTICAST_ADDR, - ANYCAST_ADDR, -}; - -struct inet6_fill_args { - u32 portid; - u32 seq; - int event; - unsigned int flags; - int netnsid; - int ifindex; - enum addr_type_t type; -}; - static int inet6_fill_ifaddr(struct sk_buff *skb, const struct inet6_ifaddr *ifa, struct inet6_fill_args *args) @@ -5220,15 +5204,16 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, return -EMSGSIZE; } -static int inet6_fill_ifmcaddr(struct sk_buff *skb, - const struct ifmcaddr6 *ifmca, - struct inet6_fill_args *args) +int inet6_fill_ifmcaddr(struct sk_buff *skb, + const struct ifmcaddr6 *ifmca, + struct inet6_fill_args *args) { int ifindex = ifmca->idev->dev->ifindex; u8 scope = RT_SCOPE_UNIVERSE; struct nlmsghdr *nlh; - if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE) + if (!args->force_rt_scope_universe && + ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE) scope = RT_SCOPE_SITE; nlh = nlmsg_put(skb, args->portid, args->seq, args->event, @@ -5253,6 +5238,7 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, nlmsg_end(skb, nlh); return 0; } +EXPORT_SYMBOL(inet6_fill_ifmcaddr); static int inet6_fill_ifacaddr(struct sk_buff *skb, const struct ifacaddr6 *ifaca, @@ -5417,6 +5403,7 @@ static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb, .flags = NLM_F_MULTI, .netnsid = -1, .type = type, + .force_rt_scope_universe = false, }; struct { unsigned long ifindex; @@ -5545,6 +5532,7 @@ static int inet6_rtm_getaddr(struct sk_buff *in_skb, struct nlmsghdr *nlh, .event = RTM_NEWADDR, .flags = 0, .netnsid = -1, + .force_rt_scope_universe = false, }; struct ifaddrmsg *ifm; struct nlattr *tb[IFA_MAX+1]; @@ -5616,6 +5604,7 @@ static void inet6_ifa_notify(int event, struct inet6_ifaddr *ifa) .event = event, .flags = 0, .netnsid = -1, + .force_rt_scope_universe = false, }; int err = -ENOBUFS; diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index b244dbf61d5f..a65da675e16d 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -33,8 +33,10 @@ #include #include #include +#include #include #include +#include #include #include #include @@ -47,6 +49,7 @@ #include #include +#include #include #include @@ -901,6 +904,39 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev, return mc; } +static void inet6_ifmcaddr_notify(struct net_device *dev, + const struct ifmcaddr6 *ifmca, int event) +{ + struct inet6_fill_args fillargs = { + .portid = 0, + .seq = 0, + .event = event, + .flags = 0, + .netnsid = -1, + .force_rt_scope_universe = true, + }; + struct net *net = dev_net(dev); + struct sk_buff *skb; + int err = -ENOMEM; + + skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg)) + + nla_total_size(16), GFP_ATOMIC); + if (!skb) + goto error; + + err = inet6_fill_ifmcaddr(skb, ifmca, &fillargs); + if (err < 0) { + WARN_ON_ONCE(err == -EMSGSIZE); + nlmsg_free(skb); + goto error; + } + + rtnl_notify(skb, net, 0, RTNLGRP_IPV6_MCADDR, NULL, GFP_ATOMIC); + return; +error: + rtnl_set_sk_err(net, RTNLGRP_IPV6_MCADDR, err); +} + /* * device multicast group inc (add if not found) */ @@ -948,6 +984,7 @@ static int __ipv6_dev_mc_inc(struct net_device *dev, mld_del_delrec(idev, mc); igmp6_group_added(mc); + inet6_ifmcaddr_notify(dev, mc, RTM_NEWMULTICAST); mutex_unlock(&idev->mc_lock); ma_put(mc); return 0; @@ -977,6 +1014,8 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr) *map = ma->next; igmp6_group_dropped(ma); + inet6_ifmcaddr_notify(idev->dev, ma, + RTM_DELMULTICAST); ip6_mc_clear_src(ma); mutex_unlock(&idev->mc_lock);