Message ID | 28cb906436e87eada712f55e63ae5c420bea0ecb.1692153515.git.yan@cloudflare.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 29b22badb7a84b783e3a4fffca16f7768fb31205 |
Series | lwt: fix return values of BPF ops |
On Tue, Aug 15, 2023 at 9:54 PM Yan Zhai <yan@cloudflare.com> wrote:
>
> BPF encap ops can return different types of positive values, such like
> NET_RX_DROP, NET_XMIT_CN, NETDEV_TX_BUSY, and so on, from function
> skb_do_redirect and bpf_lwt_xmit_reroute. At the xmit hook, such return
> values would be treated implicitly as LWTUNNEL_XMIT_CONTINUE in
> ip(6)_finish_output2. When this happens, skbs that have been freed would
> continue to the neighbor subsystem, causing use-after-free bug and
> kernel crashes.
>
> To fix the incorrect behavior, skb_do_redirect return values can be
> simply discarded, the same as tc-egress behavior. On the other hand,
> bpf_lwt_xmit_reroute returns useful errors to local senders, e.g. PMTU
> information. Thus convert its return values to avoid the conflict with
> LWTUNNEL_XMIT_CONTINUE.
>
> Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
> Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Reported-by: Jordan Griege <jgriege@cloudflare.com>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>
> ---
> * v5: discards skb_do_redirect return instead; convert
>       bpf_lwt_xmit_reroute return;
> * v4: minor commit message changes
> * v3: converts skb_do_redirect statuses from both ingress and egress
> * v2: code style amend
> ---
>  net/core/lwt_bpf.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
> index 8b6b5e72b217..4a0797f0a154 100644
> --- a/net/core/lwt_bpf.c
> +++ b/net/core/lwt_bpf.c
> @@ -60,9 +60,8 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
>  			ret = BPF_OK;
>  		} else {
>  			skb_reset_mac_header(skb);
> -			ret = skb_do_redirect(skb);
> -			if (ret == 0)
> -				ret = BPF_REDIRECT;
> +			skb_do_redirect(skb);
> +			ret = BPF_REDIRECT;
>  		}
>  		break;
>
> @@ -255,7 +254,7 @@ static int bpf_lwt_xmit_reroute(struct sk_buff *skb)
>
>  	err = dst_output(dev_net(skb_dst(skb)->dev), skb->sk, skb);
>  	if (unlikely(err))
> -		return err;
> +		return net_xmit_errno(err);
>
>  	/* ip[6]_finish_output2 understand LWTUNNEL_XMIT_DONE */
>  	return LWTUNNEL_XMIT_DONE;
> --
> 2.30.2
>

No idea why this one appears nested and without a subject on the lore link.
Let me double-check what went wrong with my mutt settings :(

--
Yan
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 8b6b5e72b217..4a0797f0a154 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -60,9 +60,8 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 			ret = BPF_OK;
 		} else {
 			skb_reset_mac_header(skb);
-			ret = skb_do_redirect(skb);
-			if (ret == 0)
-				ret = BPF_REDIRECT;
+			skb_do_redirect(skb);
+			ret = BPF_REDIRECT;
 		}
 		break;
 
@@ -255,7 +254,7 @@ static int bpf_lwt_xmit_reroute(struct sk_buff *skb)
 
 	err = dst_output(dev_net(skb_dst(skb)->dev), skb->sk, skb);
 	if (unlikely(err))
-		return err;
+		return net_xmit_errno(err);
 
 	/* ip[6]_finish_output2 understand LWTUNNEL_XMIT_DONE */
 	return LWTUNNEL_XMIT_DONE;
BPF encap ops can return different types of positive values, such as
NET_RX_DROP, NET_XMIT_CN, NETDEV_TX_BUSY, and so on, from the functions
skb_do_redirect and bpf_lwt_xmit_reroute. At the xmit hook, such return
values would be treated implicitly as LWTUNNEL_XMIT_CONTINUE in
ip(6)_finish_output2. When this happens, skbs that have been freed would
continue to the neighbor subsystem, causing a use-after-free bug and
kernel crashes.

To fix the incorrect behavior, skb_do_redirect return values can be
simply discarded, the same as tc-egress behavior. On the other hand,
bpf_lwt_xmit_reroute returns useful errors to local senders, e.g. PMTU
information. Thus convert its return values to avoid the conflict with
LWTUNNEL_XMIT_CONTINUE.

Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Suggested-by: Stanislav Fomichev <sdf@google.com>
Reported-by: Jordan Griege <jgriege@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
* v5: discards skb_do_redirect return instead; convert
      bpf_lwt_xmit_reroute return;
* v4: minor commit message changes
* v3: converts skb_do_redirect statuses from both ingress and egress
* v2: code style amend
---
 net/core/lwt_bpf.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
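
The return-value conflict described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_`/`MODEL_` name is hypothetical, the numeric values are assumed to match the kernel headers at the time of this patch (in particular LWTUNNEL_XMIT_CONTINUE being 1, the same value as NET_XMIT_DROP), and `model_caller_continues` is an assumed simplification of the check in ip(6)_finish_output2.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed values, mirroring kernel headers around the time of this patch. */
enum { MODEL_NET_XMIT_SUCCESS = 0, MODEL_NET_XMIT_DROP = 1, MODEL_NET_XMIT_CN = 2 };
enum { MODEL_LWTUNNEL_XMIT_DONE = 0, MODEL_LWTUNNEL_XMIT_CONTINUE = 1 };

/* Userspace copy of the logic in the kernel's net_xmit_errno() macro. */
static int model_net_xmit_errno(int e)
{
	return e != MODEL_NET_XMIT_CN ? -ENOBUFS : 0;
}

/*
 * Simplified model of the caller: a negative value or DONE stops
 * processing; any other value lets the skb continue to the neighbor
 * subsystem, i.e. it is treated implicitly as CONTINUE.
 */
static bool model_caller_continues(int res)
{
	return !(res < 0 || res == MODEL_LWTUNNEL_XMIT_DONE);
}

/* Before the patch: the raw dst_output() status is returned. */
static int model_reroute_before(int err)
{
	return err ? err : MODEL_LWTUNNEL_XMIT_DONE;
}

/* After the patch: the status is converted, so it cannot look like CONTINUE. */
static int model_reroute_after(int err)
{
	return err ? model_net_xmit_errno(err) : MODEL_LWTUNNEL_XMIT_DONE;
}

/* Checks the two behaviours against each other for a dropped packet. */
void model_demo(void)
{
	/* Old behaviour: a drop (skb already freed) is mistaken for CONTINUE. */
	assert(model_caller_continues(model_reroute_before(MODEL_NET_XMIT_DROP)));
	/* New behaviour: the same drop becomes a negative errno and stops. */
	assert(!model_caller_continues(model_reroute_after(MODEL_NET_XMIT_DROP)));
}
```

Calling `model_demo()` from a test harness exercises both paths. In this model, a congestion notification (`MODEL_NET_XMIT_CN`) maps to 0 after conversion, which the caller reads as DONE rather than as an error.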