Message ID | 20241117141137.2072899-1-yuyanghuang@google.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net-next,v2] netlink: add IGMP/MLD join/leave notifications |
Hi Yuyang,

On Sun, Nov 17, 2024 at 11:11:37PM +0900, Yuyang Huang wrote:
> +static int inet_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev,
> +			      __be32 addr, int event)
> +{
> +	struct ifaddrmsg *ifm;
> +	struct nlmsghdr *nlh;
> +
> +	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0);
> +	if (!nlh)
> +		return -EMSGSIZE;
> +
> +	ifm = nlmsg_data(nlh);
> +	ifm->ifa_family = AF_INET;
> +	ifm->ifa_prefixlen = 32;
> +	ifm->ifa_flags = IFA_F_PERMANENT;
> +	ifm->ifa_scope = RT_SCOPE_LINK;

Why the IPv4 scope use RT_SCOPE_LINK,

> +static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev,
> +			       const struct in6_addr *addr, int event)
> +{
> +	struct ifaddrmsg *ifm;
> +	struct nlmsghdr *nlh;
> +	u8 scope;
> +
> +	scope = RT_SCOPE_UNIVERSE;
> +	if (ipv6_addr_scope(addr) & IFA_SITE)
> +		scope = RT_SCOPE_SITE;

And IPv6 use RT_SCOPE_UNIVERSE by default?

> +
> +	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0);
> +	if (!nlh)
> +		return -EMSGSIZE;
> +
> +	ifm = nlmsg_data(nlh);
> +	ifm->ifa_family = AF_INET6;
> +	ifm->ifa_prefixlen = 128;
> +	ifm->ifa_flags = IFA_F_PERMANENT;
> +	ifm->ifa_scope = scope;
> +	ifm->ifa_index = dev->ifindex;
> +
> +static void inet6_ifmcaddr_notify(struct net_device *dev,
> +				  const struct in6_addr *addr, int event)
> +{
> +	struct net *net = dev_net(dev);
> +	struct sk_buff *skb;
> +	int err = -ENOBUFS;
> +
> +	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg))
> +			+ nla_total_size(16), GFP_ATOMIC);
> +	if (!skb)
> +		goto error;
> +
> +	err = inet6_fill_ifmcaddr(skb, dev, addr, event);
> +	if (err < 0) {
> +		WARN_ON(err == -EMSGSIZE);

Not sure if we really need this WARN_ON. Wait for others comments.

Thanks
Hangbin
Hi Hangbin

Thanks for the review feedback.

>Why the IPv4 scope use RT_SCOPE_LINK,

I'm unsure if I'm setting the IPv4 rt scope correctly.

I read the following document for rtm_scope:

```
/* rtm_scope

   Really it is not scope, but sort of distance to the destination.
   NOWHERE are reserved for not existing destinations, HOST is our
   local addresses, LINK are destinations, located on directly attached
   link and UNIVERSE is everywhere in the Universe.

   Intermediate values are also possible f.e. interior routes
   could be assigned a value between UNIVERSE and LINK.
*/
```

I believe RT_SCOPE_LINK is the closest match to the use case. IGMP
packets have a TTL of 1, so they are not forwarded to other networks.
I saw that RT_SCOPE_LINK was chosen in the original patch, so I
followed the same pattern.

Link: https://lore.kernel.org/r/20180906091056.21109-1-pruddy@vyatta.att-mail.com

Please advise if there is a more appropriate choice.

>And IPv6 use RT_SCOPE_UNIVERSE by default?

Since IPv6 provides the
`static inline int ipv6_addr_scope(const struct in6_addr *addr)`
helper function, we should use it to determine the address scope
correctly. addrconf.c has the following logic to determine the
rt_scope:

```
static inline int rt_scope(int ifa_scope)
{
	if (ifa_scope & IFA_HOST)
		return RT_SCOPE_HOST;
	else if (ifa_scope & IFA_LINK)
		return RT_SCOPE_LINK;
	else if (ifa_scope & IFA_SITE)
		return RT_SCOPE_SITE;
	else
		return RT_SCOPE_UNIVERSE;
}
```

However, I found that addrconf.c:inet6_fill_ifmcaddr() uses the
following logic, so I am trying to keep the notification logic
consistent with the dump logic. Maybe we should update
`inet6_fill_ifmcaddr()` to use `ipv6_addr_scope()` to determine the
scope properly?

```
static int inet6_fill_ifmcaddr(struct sk_buff *skb,
			       const struct ifmcaddr6 *ifmca,
			       struct inet6_fill_args *args)
{
	int ifindex = ifmca->idev->dev->ifindex;
	u8 scope = RT_SCOPE_UNIVERSE;
	struct nlmsghdr *nlh;

	if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE)
		scope = RT_SCOPE_SITE;
```

In general, I am not sure if the scope information is truly necessary
for IPv4 and IPv6 multicast notifications. In my experience, only the
address itself is needed. The `ip maddr` command also omits scope.
Perhaps I'm missing some use cases where scope is essential.

>Not sure if we really need this WARN_ON. Wait for others comments.

I tried to follow the existing code pattern in addrconf.c; for example:

```
	err = inet6_fill_ifaddr(skb, ifa, &fillargs);
	if (err < 0) {
		/* -EMSGSIZE implies BUG in inet6_ifaddr_msgsize() */
		WARN_ON(err == -EMSGSIZE);
		kfree_skb(skb);
		goto errout_ifa;
	}
```

Thanks,
Yuyang

On Tue, Nov 19, 2024 at 4:39 PM Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> Hi Yuyang,
> On Sun, Nov 17, 2024 at 11:11:37PM +0900, Yuyang Huang wrote:
> > +static int inet_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev,
> > +			      __be32 addr, int event)
> > +{
> > +	struct ifaddrmsg *ifm;
> > +	struct nlmsghdr *nlh;
> > +
> > +	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0);
> > +	if (!nlh)
> > +		return -EMSGSIZE;
> > +
> > +	ifm = nlmsg_data(nlh);
> > +	ifm->ifa_family = AF_INET;
> > +	ifm->ifa_prefixlen = 32;
> > +	ifm->ifa_flags = IFA_F_PERMANENT;
> > +	ifm->ifa_scope = RT_SCOPE_LINK;
>
> Why the IPv4 scope use RT_SCOPE_LINK,
>
> > +static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev,
> > +			       const struct in6_addr *addr, int event)
> > +{
> > +	struct ifaddrmsg *ifm;
> > +	struct nlmsghdr *nlh;
> > +	u8 scope;
> > +
> > +	scope = RT_SCOPE_UNIVERSE;
> > +	if (ipv6_addr_scope(addr) & IFA_SITE)
> > +		scope = RT_SCOPE_SITE;
>
> And IPv6 use RT_SCOPE_UNIVERSE by default?
>
> > +
> > +	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0);
> > +	if (!nlh)
> > +		return -EMSGSIZE;
> > +
> > +	ifm = nlmsg_data(nlh);
> > +	ifm->ifa_family = AF_INET6;
> > +	ifm->ifa_prefixlen = 128;
> > +	ifm->ifa_flags = IFA_F_PERMANENT;
> > +	ifm->ifa_scope = scope;
> > +	ifm->ifa_index = dev->ifindex;
> > +
> > +static void inet6_ifmcaddr_notify(struct net_device *dev,
> > +				  const struct in6_addr *addr, int event)
> > +{
> > +	struct net *net = dev_net(dev);
> > +	struct sk_buff *skb;
> > +	int err = -ENOBUFS;
> > +
> > +	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg))
> > +			+ nla_total_size(16), GFP_ATOMIC);
> > +	if (!skb)
> > +		goto error;
> > +
> > +	err = inet6_fill_ifmcaddr(skb, dev, addr, event);
> > +	if (err < 0) {
> > +		WARN_ON(err == -EMSGSIZE);
>
> Not sure if we really need this WARN_ON. Wait for others comments.
>
> Thanks
> Hangbin
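
For readers following the scope question, here is a minimal userspace sketch of how an IPv6 multicast address could be mapped to an rtnetlink scope value from the 4-bit scope field in the second address byte (RFC 4291). It is an illustration only, not the kernel's `ipv6_addr_scope()` or `rt_scope()` code. The helper name `mcast_rt_scope()` and the `SKETCH_RT_SCOPE_*` constants are hypothetical; their values are assumed to mirror the `RT_SCOPE_*` definitions in include/uapi/linux/rtnetlink.h.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Local copies of the uapi RT_SCOPE_* values (assumed to match rtnetlink.h). */
enum sketch_rt_scope {
	SKETCH_RT_SCOPE_UNIVERSE = 0,
	SKETCH_RT_SCOPE_SITE = 200,
	SKETCH_RT_SCOPE_LINK = 253,
	SKETCH_RT_SCOPE_HOST = 254,
};

/*
 * Hypothetical helper: derive a scope from the low nibble of the second
 * byte of an IPv6 multicast address (the RFC 4291 "scop" field).
 * Returns -1 if the address is not multicast (first byte is not 0xff).
 */
static inline int mcast_rt_scope(const struct in6_addr *addr)
{
	assert(addr != NULL);

	if (addr->s6_addr[0] != 0xff)
		return -1;

	switch (addr->s6_addr[1] & 0x0f) {
	case 0x1:	/* interface-local */
		return SKETCH_RT_SCOPE_HOST;
	case 0x2:	/* link-local */
		return SKETCH_RT_SCOPE_LINK;
	case 0x5:	/* site-local */
		return SKETCH_RT_SCOPE_SITE;
	default:	/* everything else is treated as global here */
		return SKETCH_RT_SCOPE_UNIVERSE;
	}
}
```

With this mapping, a group such as ff02::1 would be reported with link scope, whereas the v2 code quoted above reports every group that is not site-local as RT_SCOPE_UNIVERSE.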
On 11/19/24 10:21, Yuyang Huang wrote:
>> Why the IPv4 scope use RT_SCOPE_LINK,
>
> I'm unsure if I'm setting the IPv4 rt scope correctly.
>
> I read the following document for rtm_scope:
>
> ```
> /* rtm_scope
>
>    Really it is not scope, but sort of distance to the destination.
>    NOWHERE are reserved for not existing destinations, HOST is our
>    local addresses, LINK are destinations, located on directly attached
>    link and UNIVERSE is everywhere in the Universe.
>
>    Intermediate values are also possible f.e. interior routes
>    could be assigned a value between UNIVERSE and LINK.
> */
> ```

I think the most important thing is consistency. This patch is
inconsistent WRT rtm_scope among ipv4 and ipv6, you should ensure
similar behavior among them.

Existing ip-related notification always use RT_SCOPE_UNIVERSE with the
rather suspect exception of mctp. Possibly using RT_SCOPE_UNIVERSE here
too could be fitting.

/P
Hi Paolo

> I think the most important thing is consistency. This patch is
> inconsistent WRT rtm_scope among ipv4 and ipv6, you should ensure
> similar behavior among them.
> Existing ip-related notification always use RT_SCOPE_UNIVERSE with the
> rather suspect exception of mctp. Possibly using RT_SCOPE_UNIVERSE here
> too could be fitting.

Thank you very much for the suggestion. To ensure consistency, I'll use
RT_SCOPE_UNIVERSE for both IPv4 and IPv6 notifications, unless other
reviewers have concerns.

Thanks,
Yuyang

On Tue, Nov 19, 2024 at 9:10 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 11/19/24 10:21, Yuyang Huang wrote:
> >> Why the IPv4 scope use RT_SCOPE_LINK,
> >
> > I'm unsure if I'm setting the IPv4 rt scope correctly.
> >
> > I read the following document for rtm_scope:
> >
> > ```
> > /* rtm_scope
> >
> >    Really it is not scope, but sort of distance to the destination.
> >    NOWHERE are reserved for not existing destinations, HOST is our
> >    local addresses, LINK are destinations, located on directly attached
> >    link and UNIVERSE is everywhere in the Universe.
> >
> >    Intermediate values are also possible f.e. interior routes
> >    could be assigned a value between UNIVERSE and LINK.
> > */
> > ```
>
> I think the most important thing is consistency. This patch is
> inconsistent WRT rtm_scope among ipv4 and ipv6, you should ensure
> similar behavior among them.
>
> Existing ip-related notification always use RT_SCOPE_UNIVERSE with the
> rather suspect exception of mctp. Possibly using RT_SCOPE_UNIVERSE here
> too could be fitting.
>
> /P
>
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index db7254d52d93..92964a9d2388 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -93,6 +93,10 @@ enum {
 	RTM_NEWPREFIX	= 52,
 #define RTM_NEWPREFIX	RTM_NEWPREFIX
 
+	RTM_NEWMULTICAST,
+#define RTM_NEWMULTICAST RTM_NEWMULTICAST
+	RTM_DELMULTICAST,
+#define RTM_DELMULTICAST RTM_DELMULTICAST
 	RTM_GETMULTICAST = 58,
 #define RTM_GETMULTICAST RTM_GETMULTICAST
 
@@ -774,6 +778,10 @@ enum rtnetlink_groups {
 #define RTNLGRP_TUNNEL		RTNLGRP_TUNNEL
 	RTNLGRP_STATS,
 #define RTNLGRP_STATS		RTNLGRP_STATS
+	RTNLGRP_IPV4_MCADDR,
+#define RTNLGRP_IPV4_MCADDR	RTNLGRP_IPV4_MCADDR
+	RTNLGRP_IPV6_MCADDR,
+#define RTNLGRP_IPV6_MCADDR	RTNLGRP_IPV6_MCADDR
 	__RTNLGRP_MAX
 };
 #define RTNLGRP_MAX	(__RTNLGRP_MAX - 1)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 6a238398acc9..e843b65bc7b5 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -88,6 +88,7 @@
 #include <linux/byteorder/generic.h>
 
 #include <net/net_namespace.h>
+#include <net/netlink.h>
 #include <net/arp.h>
 #include <net/ip.h>
 #include <net/protocol.h>
@@ -1430,6 +1431,55 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
 	*mc_hash = im->next_hash;
 }
 
+static int inet_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev,
+			      __be32 addr, int event)
+{
+	struct ifaddrmsg *ifm;
+	struct nlmsghdr *nlh;
+
+	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	ifm = nlmsg_data(nlh);
+	ifm->ifa_family = AF_INET;
+	ifm->ifa_prefixlen = 32;
+	ifm->ifa_flags = IFA_F_PERMANENT;
+	ifm->ifa_scope = RT_SCOPE_LINK;
+	ifm->ifa_index = dev->ifindex;
+
+	if (nla_put_in_addr(skb, IFA_MULTICAST, addr) < 0) {
+		nlmsg_cancel(skb, nlh);
+		return -EMSGSIZE;
+	}
+
+	nlmsg_end(skb, nlh);
+	return 0;
+}
+
+static void inet_ifmcaddr_notify(struct net_device *dev, __be32 addr, int event)
+{
+	struct net *net = dev_net(dev);
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg))
+			+ nla_total_size(sizeof(__be32)), GFP_ATOMIC);
+	if (!skb)
+		goto error;
+
+	err = inet_fill_ifmcaddr(skb, dev, addr, event);
+	if (err < 0) {
+		WARN_ON(err == -EMSGSIZE);
+		kfree_skb(skb);
+		goto error;
+	}
+
+	rtnl_notify(skb, net, 0, RTNLGRP_IPV4_MCADDR, NULL, GFP_ATOMIC);
+	return;
+error:
+	rtnl_set_sk_err(net, RTNLGRP_IPV4_MCADDR, err);
+}
 
 /*
  *	A socket has joined a multicast group on device dev.
@@ -1492,6 +1542,7 @@ static void ____ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
 	igmpv3_del_delrec(in_dev, im);
 #endif
 	igmp_group_added(im);
+	inet_ifmcaddr_notify(in_dev->dev, addr, RTM_NEWMULTICAST);
 	if (!in_dev->dead)
 		ip_rt_multicast_event(in_dev);
 out:
@@ -1705,6 +1756,8 @@ void __ip_mc_dec_group(struct in_device *in_dev, __be32 addr, gfp_t gfp)
 			*ip = i->next_rcu;
 			in_dev->mc_count--;
 			__igmp_group_dropped(i, gfp);
+			inet_ifmcaddr_notify(in_dev->dev, addr,
+					     RTM_DELMULTICAST);
 			ip_mc_clear_src(i);
 
 			if (!in_dev->dead)
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index b244dbf61d5f..33f3d8a32282 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -33,8 +33,10 @@
 #include <linux/in.h>
 #include <linux/in6.h>
 #include <linux/netdevice.h>
+#include <linux/if_addr.h>
 #include <linux/if_arp.h>
 #include <linux/route.h>
+#include <linux/rtnetlink.h>
 #include <linux/init.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -47,6 +49,7 @@
 #include <linux/netfilter_ipv6.h>
 
 #include <net/net_namespace.h>
+#include <net/netlink.h>
 #include <net/sock.h>
 #include <net/snmp.h>
 
@@ -901,6 +904,62 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev,
 	return mc;
 }
 
+static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct net_device *dev,
+			       const struct in6_addr *addr, int event)
+{
+	struct ifaddrmsg *ifm;
+	struct nlmsghdr *nlh;
+	u8 scope;
+
+	scope = RT_SCOPE_UNIVERSE;
+	if (ipv6_addr_scope(addr) & IFA_SITE)
+		scope = RT_SCOPE_SITE;
+
+	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct ifaddrmsg), 0);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	ifm = nlmsg_data(nlh);
+	ifm->ifa_family = AF_INET6;
+	ifm->ifa_prefixlen = 128;
+	ifm->ifa_flags = IFA_F_PERMANENT;
+	ifm->ifa_scope = scope;
+	ifm->ifa_index = dev->ifindex;
+
+	if (nla_put_in6_addr(skb, IFA_MULTICAST, addr) < 0) {
+		nlmsg_cancel(skb, nlh);
+		return -EMSGSIZE;
+	}
+
+	nlmsg_end(skb, nlh);
+	return 0;
+}
+
+static void inet6_ifmcaddr_notify(struct net_device *dev,
+				  const struct in6_addr *addr, int event)
+{
+	struct net *net = dev_net(dev);
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg))
+			+ nla_total_size(16), GFP_ATOMIC);
+	if (!skb)
+		goto error;
+
+	err = inet6_fill_ifmcaddr(skb, dev, addr, event);
+	if (err < 0) {
+		WARN_ON(err == -EMSGSIZE);
+		kfree_skb(skb);
+		goto error;
+	}
+
+	rtnl_notify(skb, net, 0, RTNLGRP_IPV6_MCADDR, NULL, GFP_ATOMIC);
+	return;
+error:
+	rtnl_set_sk_err(net, RTNLGRP_IPV6_MCADDR, err);
+}
+
 /*
  *	device multicast group inc (add if not found)
  */
@@ -948,6 +1007,7 @@ static int __ipv6_dev_mc_inc(struct net_device *dev,
 
 	mld_del_delrec(idev, mc);
 	igmp6_group_added(mc);
+	inet6_ifmcaddr_notify(dev, addr, RTM_NEWMULTICAST);
 	mutex_unlock(&idev->mc_lock);
 	ma_put(mc);
 	return 0;
@@ -977,6 +1037,8 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr)
 				*map = ma->next;
 
 				igmp6_group_dropped(ma);
+				inet6_ifmcaddr_notify(idev->dev, addr,
+						      RTM_DELMULTICAST);
 				ip6_mc_clear_src(ma);
 				mutex_unlock(&idev->mc_lock);
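
The `WARN_ON(err == -EMSGSIZE)` questioned in the review is only meaningful if the size passed to `nlmsg_new()` always covers what the fill function writes. The arithmetic can be checked with a small userspace sketch. The `sk_*` names are hypothetical, and the sizes used (16-byte `struct nlmsghdr`, 4-byte `struct nlattr`, 8-byte `struct ifaddrmsg`, 4-byte alignment) are assumed to match the uapi definitions rather than taken from the kernel headers.

```c
#include <assert.h>

/* Assumed netlink layout constants; see the lead-in for what they mirror. */
#define SK_ALIGNTO		4u
#define SK_ALIGN(len)		(((len) + SK_ALIGNTO - 1u) & ~(SK_ALIGNTO - 1u))
#define SK_NLMSG_HDRLEN		16u	/* sizeof(struct nlmsghdr) */
#define SK_NLA_HDRLEN		4u	/* sizeof(struct nlattr) */
#define SK_IFADDRMSG_LEN	8u	/* sizeof(struct ifaddrmsg) */

/* Attribute header plus payload, padded to the netlink alignment. */
static inline unsigned int sk_nla_total_size(unsigned int payload)
{
	return SK_ALIGN(SK_NLA_HDRLEN + payload);
}

/* Payload size handed to nlmsg_new() in the patch: ifaddrmsg + one address. */
static inline unsigned int sk_mcaddr_payload(unsigned int addr_len)
{
	return SK_ALIGN(SK_IFADDRMSG_LEN) + sk_nla_total_size(addr_len);
}

/* Full message length once the netlink header is included. */
static inline unsigned int sk_mcaddr_msg_len(unsigned int addr_len)
{
	return SK_ALIGN(SK_NLMSG_HDRLEN + sk_mcaddr_payload(addr_len));
}
```

Under these assumptions the IPv4 notification needs 16 bytes of payload (32 with the netlink header) and the IPv6 one needs 28 (44 with the header). The fill functions write one `struct ifaddrmsg` and one address attribute, which is the same amount, so an -EMSGSIZE return would point to a sizing bug rather than a runtime condition.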