Message ID | 20210225234631.2547776-1-Jason@zx2c4.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: always use icmp{,v6}_ndo_send from ndo_start_xmit | expand |
On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > There were a few remaining tunnel drivers that didn't receive the prior > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to > memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from > icmp{,v6}_ndo_send before sending") for details), there's even more > imperative to have these all converted. So this commit goes through the > remaining cases that I could find and does a boring translation to the > ndo variety. > > Cc: Willem de Bruijn <willemb@google.com> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Using a stack variable over skb->cb[] is definitely the right fix for all of these. Thanks Jason. Only part that I don't fully know is the conntrack conversion. That is a behavioral change. What is the context behind that? I assume it's fine. In that if needed, that is the case for all devices, nothing specific about the couple that call icmp(v6)_ndo_send already.
On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > There were a few remaining tunnel drivers that didn't receive the prior > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to > > memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from > > icmp{,v6}_ndo_send before sending") for details), there's even more > > imperative to have these all converted. So this commit goes through the > > remaining cases that I could find and does a boring translation to the > > ndo variety. > > > > Cc: Willem de Bruijn <willemb@google.com> > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > > Using a stack variable over skb->cb[] is definitely the right fix for > all of these. Thanks Jason. > > Only part that I don't fully know is the conntrack conversion. That is > a behavioral change. What is the context behind that? I assume it's > fine. In that if needed, that is the case for all devices, nothing > specific about the couple that call icmp(v6)_ndo_send already. That's actually a sensible change anyway. icmp_send does something bogus if the packet has already passed through netfilter, which is why the ndo variant was adopted. So it's good and correct for these to change in that way. Jason
On Fri, Feb 26, 2021 at 5:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > > > There were a few remaining tunnel drivers that didn't receive the prior > > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to > > > memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from > > > icmp{,v6}_ndo_send before sending") for details), there's even more > > > imperative to have these all converted. So this commit goes through the > > > remaining cases that I could find and does a boring translation to the > > > ndo variety. > > > > > > Cc: Willem de Bruijn <willemb@google.com> > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > > > > Using a stack variable over skb->cb[] is definitely the right fix for > > all of these. Thanks Jason. > > > > Only part that I don't fully know is the conntrack conversion. That is > > a behavioral change. What is the context behind that? I assume it's > > fine. In that if needed, that is the case for all devices, nothing > > specific about the couple that call icmp(v6)_ndo_send already. > > That's actually a sensible change anyway. icmp_send does something > bogus if the packet has already passed through netfilter, which is why > the ndo variant was adopted. So it's good and correct for these to > change in that way. > > Jason Something bogus, how? Does this apply to all uses of conntrack? Specifically NAT? Not trying to be obtuse, but I really find it hard to evaluate that part. Please cc: the maintainers for patches that are meant to be merged, btw.
On Fri, 26 Feb 2021 18:28:56 -0500 Willem de Bruijn wrote:
> Please cc: the maintainers for patches that are meant to be merged, btw.
I was about to say. Please repost.
On Sat, Feb 27, 2021 at 12:29 AM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Fri, Feb 26, 2021 at 5:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn > > <willemdebruijn.kernel@gmail.com> wrote: > > > > > > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > > > > > There were a few remaining tunnel drivers that didn't receive the prior > > > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to > > > > memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from > > > > icmp{,v6}_ndo_send before sending") for details), there's even more > > > > imperative to have these all converted. So this commit goes through the > > > > remaining cases that I could find and does a boring translation to the > > > > ndo variety. > > > > > > > > Cc: Willem de Bruijn <willemb@google.com> > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > > > > > > Using a stack variable over skb->cb[] is definitely the right fix for > > > all of these. Thanks Jason. > > > > > > Only part that I don't fully know is the conntrack conversion. That is > > > a behavioral change. What is the context behind that? I assume it's > > > fine. In that if needed, that is the case for all devices, nothing > > > specific about the couple that call icmp(v6)_ndo_send already. > > > > That's actually a sensible change anyway. icmp_send does something > > bogus if the packet has already passed through netfilter, which is why > > the ndo variant was adopted. So it's good and correct for these to > > change in that way. > > > > Jason > > Something bogus, how? Does this apply to all uses of conntrack? > Specifically NAT? Not trying to be obtuse, but I really find it hard > to evaluate that part. By the time packets hit ndo_start_xmit, the src address has changed, and icmp can't deliver to the actual source, and its rate limiting works against the wrong source. All of this was explained, justified, and discussed on the original icmp_ndo_start patchset, which included the function and converted drivers to use it. However, a few spots were missed, which this patchset cleans up. Here's the merge with details: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=803381f9f117493d6204d82445a530c834040fe6 The network devices that this patch here adjusts are no different than the four I originally fixed up in that series -- xfrmi, gtp, sunvnet, wireguard. > Please cc: the maintainers for patches that are meant to be merged, btw. Whoops. I'll do so and repost.
On Fri, Feb 26, 2021 at 7:42 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Sat, Feb 27, 2021 at 12:29 AM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > On Fri, Feb 26, 2021 at 5:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > > > On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn > > > <willemdebruijn.kernel@gmail.com> wrote: > > > > > > > > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > > > > > > > There were a few remaining tunnel drivers that didn't receive the prior > > > > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to > > > > > memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from > > > > > icmp{,v6}_ndo_send before sending") for details), there's even more > > > > > imperative to have these all converted. So this commit goes through the > > > > > remaining cases that I could find and does a boring translation to the > > > > > ndo variety. > > > > > > > > > > Cc: Willem de Bruijn <willemb@google.com> > > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > > > > > > > > Using a stack variable over skb->cb[] is definitely the right fix for > > > > all of these. Thanks Jason. > > > > > > > > Only part that I don't fully know is the conntrack conversion. That is > > > > a behavioral change. What is the context behind that? I assume it's > > > > fine. In that if needed, that is the case for all devices, nothing > > > > specific about the couple that call icmp(v6)_ndo_send already. > > > > > > That's actually a sensible change anyway. icmp_send does something > > > bogus if the packet has already passed through netfilter, which is why > > > the ndo variant was adopted. So it's good and correct for these to > > > change in that way. > > > > > > Jason > > > > Something bogus, how? Does this apply to all uses of conntrack? > > Specifically NAT? Not trying to be obtuse, but I really find it hard > > to evaluate that part. > > By the time packets hit ndo_start_xmit, the src address has changed, > and icmp can't deliver to the actual source, and its rate limiting > works against the wrong source. All of this was explained, justified, > and discussed on the original icmp_ndo_start patchset, which included > the function and converted drivers to use it. However, a few spots > were missed, which this patchset cleans up. Here's the merge with > details: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=803381f9f117493d6204d82445a530c834040fe6 Thanks for the link. I used git blame to a look at the patch that introduced the code before asking for more context, but this state was in the merge commit, so I missed it. > The network devices that this patch here adjusts are no different than > the four I originally fixed up in that series -- xfrmi, gtp, sunvnet, > wireguard. Agreed. > > Please cc: the maintainers for patches that are meant to be merged, btw. > > Whoops. I'll do so and repost.
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 76a420c76f16..f6cc26de5ed3 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -502,8 +502,7 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb, if (!skb_is_gso(skb) && (inner_iph->frag_off & htons(IP_DF)) && mtu < pkt_size) { - memset(IPCB(skb), 0, sizeof(*IPCB(skb))); - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); return -E2BIG; } } @@ -527,7 +526,7 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb, if (!skb_is_gso(skb) && mtu >= IPV6_MIN_MTU && mtu < pkt_size) { - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); return -E2BIG; } } diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index abc171e79d3e..eb207089ece0 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -238,13 +238,13 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev, if (skb->len > mtu) { skb_dst_update_pmtu_no_confirm(skb, mtu); if (skb->protocol == htons(ETH_P_IP)) { - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, - htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, + htonl(mtu)); } else { if (mtu < IPV6_MIN_MTU) mtu = IPV6_MIN_MTU; - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); } dst_release(dst); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index c3bc89b6b1a1..1baf43aacb2e 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -678,8 +678,8 @@ static int prepare_ip6gre_xmit_ipv6(struct sk_buff *skb, tel = (struct ipv6_tlv_tnl_enc_lim *)&skb_network_header(skb)[offset]; if (tel->encap_limit == 0) { - icmpv6_send(skb, ICMPV6_PARAMPROB, - ICMPV6_HDR_FIELD, offset + 2); + icmpv6_ndo_send(skb, ICMPV6_PARAMPROB, + ICMPV6_HDR_FIELD, offset + 2); return -1; } *encap_limit = tel->encap_limit - 1; @@ -805,8 +805,8 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev) if (err != 0) { /* XXX: send ICMP error even if DF is not set. */ if (err == -EMSGSIZE) - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, - htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, + htonl(mtu)); return -1; } @@ -837,7 +837,7 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev) &mtu, skb->protocol); if (err != 0) { if (err == -EMSGSIZE) - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); return -1; } @@ -1063,10 +1063,10 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb, /* XXX: send ICMP error even if DF is not set. */ if (err == -EMSGSIZE) { if (skb->protocol == htons(ETH_P_IP)) - icmp_send(skb, ICMP_DEST_UNREACH, - ICMP_FRAG_NEEDED, htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, + ICMP_FRAG_NEEDED, htonl(mtu)); else - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); } goto tx_err; diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index a7950baa05e5..3fa0eca5a06f 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1332,8 +1332,8 @@ ipxip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, tel = (void *)&skb_network_header(skb)[offset]; if (tel->encap_limit == 0) { - icmpv6_send(skb, ICMPV6_PARAMPROB, - ICMPV6_HDR_FIELD, offset + 2); + icmpv6_ndo_send(skb, ICMPV6_PARAMPROB, + ICMPV6_HDR_FIELD, offset + 2); return -1; } encap_limit = tel->encap_limit - 1; @@ -1385,11 +1385,11 @@ ipxip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, if (err == -EMSGSIZE) switch (protocol) { case IPPROTO_IPIP: - icmp_send(skb, ICMP_DEST_UNREACH, - ICMP_FRAG_NEEDED, htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, + ICMP_FRAG_NEEDED, htonl(mtu)); break; case IPPROTO_IPV6: - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); break; default: break; diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 0225fd694192..f10e7a72ea62 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -521,10 +521,10 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) if (mtu < IPV6_MIN_MTU) mtu = IPV6_MIN_MTU; - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); } else { - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, - htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, + htonl(mtu)); } err = -EMSGSIZE; diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 93636867aee2..63ccd9f2dccc 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -987,7 +987,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb, skb_dst_update_pmtu_no_confirm(skb, mtu); if (skb->len > mtu && !skb_is_gso(skb)) { - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); ip_rt_put(rt); goto tx_error; }
There were a few remaining tunnel drivers that didn't receive the prior conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending") for details), there's even more imperative to have these all converted. So this commit goes through the remaining cases that I could find and does a boring translation to the ndo variety. Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> --- net/ipv4/ip_tunnel.c | 5 ++--- net/ipv4/ip_vti.c | 6 +++--- net/ipv6/ip6_gre.c | 16 ++++++++-------- net/ipv6/ip6_tunnel.c | 10 +++++----- net/ipv6/ip6_vti.c | 6 +++--- net/ipv6/sit.c | 2 +- 6 files changed, 22 insertions(+), 23 deletions(-)