Message ID | 20250221013648.35716-1-lulie@linux.alibaba.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] ipvs: Always clear ipvs_property flag in skb_scrub_packet() | expand |
Le 21/02/2025 à 02:36, Philo Lu a écrit : > We found an issue when using bpf_redirect with ipvs NAT mode after > commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within > the same name space"). Particularly, we use bpf_redirect to return > the skb directly back to the netif it comes from, i.e., xnet is > false in skb_scrub_packet(), and then ipvs_property is preserved > and SNAT is skipped in the rx path. > > ipvs_property has been already cleared when netns is changed in > commit 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when > SKB net namespace changed"). This patch just clears it in spite of > netns. > > Signed-off-by: Philo Lu <lulie@linux.alibaba.com> > --- > This is in fact a fix patch, and the issue was found after commit > ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within > the same name space"). But I'm not sure if a "Fixes" tag should be > added to that commit. > --- > net/core/skbuff.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 7b03b64fdcb2..b1c81687e9d8 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -6033,11 +6033,11 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet) > skb->offload_fwd_mark = 0; > skb->offload_l3_fwd_mark = 0; > #endif > + ipvs_reset(skb); > > if (!xnet) > return; > > - ipvs_reset(skb); I don't know IPVS, but I wonder if this patch will not introduce a regression for other users. skb_scrub_packet() is used by a lot of tunnels, it's not specific to bpf_redirect(). Regards, Nicolas
Hello, On Fri, 21 Feb 2025, Philo Lu wrote: > We found an issue when using bpf_redirect with ipvs NAT mode after > commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within > the same name space"). Particularly, we use bpf_redirect to return > the skb directly back to the netif it comes from, i.e., xnet is > false in skb_scrub_packet(), and then ipvs_property is preserved > and SNAT is skipped in the rx path. > > ipvs_property has been already cleared when netns is changed in > commit 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when > SKB net namespace changed"). This patch just clears it in spite of > netns. > > Signed-off-by: Philo Lu <lulie@linux.alibaba.com> > --- > This is in fact a fix patch, and the issue was found after commit > ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within > the same name space"). But I'm not sure if a "Fixes" tag should be > added to that commit. You can add 2b5ec1a5f973 as a Fixes tag in v2 and I'll ack it. Nowadays, ipvs_property prevents unneeded connection lookups while the packet traverses the hooks (different IPVS hook handlers can see the same packet in same or next hook). This includes the case where packet can be routed to local server, eg. via loopback_xmit(), for example: Packet from device: LOCAL_IN: set ipvs_property=1 Reroute to local server ? Then to hit the local server continue the NF hook Otherwise, route via device: LOCAL_OUT: use ipvs_property to avoid conn lookup and rerouting LOCAL_OUT: Unexpected divert to local server by someone else? => LOCAL_IN: use ipvs_property to avoid conn lookup and rerouting Packet from local client: LOCAL_OUT: set ipvs_property=1 Reroute to local server? Then continue via loopback_xmit() without skb_scrub_packet() LOCAL_IN: use ipvs_property to avoid conn lookup and rerouting Hit the local server Otherwise, route via device: Others at LOCAL_OUT: Unexpected divert to local server by someone else ? => LOCAL_IN: use ipvs_property to avoid conn lookup and rerouting So, we need ipvs_property set only during the normal path where packet is received and sent, possibly to local server via local route. We do not care if the packet leaves this path. Currently, ipvs_property was cleared if netns changes but it is a safe option. What matters is if packet is routed via local route. Other paths can reset the flag safely. > --- > net/core/skbuff.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 7b03b64fdcb2..b1c81687e9d8 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -6033,11 +6033,11 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet) > skb->offload_fwd_mark = 0; > skb->offload_l3_fwd_mark = 0; > #endif > + ipvs_reset(skb); > > if (!xnet) > return; > > - ipvs_reset(skb); > skb->mark = 0; > skb_clear_tstamp(skb); > } > -- > 2.32.0.3.g01195cf9f Regards -- Julian Anastasov <ja@ssi.bg>
On 2025/2/21 19:42, Julian Anastasov wrote: > > Hello, > > On Fri, 21 Feb 2025, Philo Lu wrote: > >> We found an issue when using bpf_redirect with ipvs NAT mode after >> commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within >> the same name space"). Particularly, we use bpf_redirect to return >> the skb directly back to the netif it comes from, i.e., xnet is >> false in skb_scrub_packet(), and then ipvs_property is preserved >> and SNAT is skipped in the rx path. >> >> ipvs_property has been already cleared when netns is changed in >> commit 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when >> SKB net namespace changed"). This patch just clears it in spite of >> netns. >> >> Signed-off-by: Philo Lu <lulie@linux.alibaba.com> >> --- >> This is in fact a fix patch, and the issue was found after commit >> ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within >> the same name space"). But I'm not sure if a "Fixes" tag should be >> added to that commit. > > You can add 2b5ec1a5f973 as a Fixes tag in v2 and I'll ack it. Thank you, Julian. You also solve my worries. I'll post v2 soon.
Hello, On Fri, 21 Feb 2025, Nicolas Dichtel wrote: > Le 21/02/2025 à 02:36, Philo Lu a écrit : > > We found an issue when using bpf_redirect with ipvs NAT mode after > > commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within > > the same name space"). Particularly, we use bpf_redirect to return > > the skb directly back to the netif it comes from, i.e., xnet is > > false in skb_scrub_packet(), and then ipvs_property is preserved > > and SNAT is skipped in the rx path. > > > > ipvs_property has been already cleared when netns is changed in > > commit 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when > > SKB net namespace changed"). This patch just clears it in spite of > > netns. > > > > Signed-off-by: Philo Lu <lulie@linux.alibaba.com> > > --- > > This is in fact a fix patch, and the issue was found after commit > > ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within > > the same name space"). But I'm not sure if a "Fixes" tag should be > > added to that commit. > > --- > > net/core/skbuff.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > > index 7b03b64fdcb2..b1c81687e9d8 100644 > > --- a/net/core/skbuff.c > > +++ b/net/core/skbuff.c > > @@ -6033,11 +6033,11 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet) > > skb->offload_fwd_mark = 0; > > skb->offload_l3_fwd_mark = 0; > > #endif > > + ipvs_reset(skb); > > > > if (!xnet) > > return; > > > > - ipvs_reset(skb); > I don't know IPVS, but I wonder if this patch will not introduce a regression > for other users. skb_scrub_packet() is used by a lot of tunnels, it's not > specific to bpf_redirect(). Indeed, now we will start to clear the flag for tunnels and it can cause IPVS to attempt rerouting for UDP tunnels, i.e. once the packet is routed by IPVS to tunnel and second time later after tunneling again when ip_local_out() is called and we see ipvs_property=0 for the outer UDP header. Before such patch ipvs_property remains 1 and we do not try to balance the UDP traffic. So, for now, this patch may be can spend more cycles for the traffic via UDP tunnels but this looks like a rare IPVS setup, i.e. real servers reachable via UDP tunnels. Note that IPVS has own support for UDP tunnels where it is set as forwarding method + tunnel type GUE for the real server. It is not affected by this patch. Regards -- Julian Anastasov <ja@ssi.bg>
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 7b03b64fdcb2..b1c81687e9d8 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -6033,11 +6033,11 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet) skb->offload_fwd_mark = 0; skb->offload_l3_fwd_mark = 0; #endif + ipvs_reset(skb); if (!xnet) return; - ipvs_reset(skb); skb->mark = 0; skb_clear_tstamp(skb); }
We found an issue when using bpf_redirect with ipvs NAT mode after commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within the same name space"). Particularly, we use bpf_redirect to return the skb directly back to the netif it comes from, i.e., xnet is false in skb_scrub_packet(), and then ipvs_property is preserved and SNAT is skipped in the rx path. ipvs_property has been already cleared when netns is changed in commit 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed"). This patch just clears it in spite of netns. Signed-off-by: Philo Lu <lulie@linux.alibaba.com> --- This is in fact a fix patch, and the issue was found after commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within the same name space"). But I'm not sure if a "Fixes" tag should be added to that commit. --- net/core/skbuff.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)