Message ID | 20230509135546.580158-1-dongchenchen2@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net,v3] net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment() |
On Tue, May 9, 2023 at 3:55 PM Dong Chenchen <dongchenchen2@huawei.com> wrote:
>
> As the call trace shows, skb_panic was caused by wrong skb->mac_header
> in nsh_gso_segment():
>
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
> RIP: 0010:skb_panic+0xda/0xe0
> call Trace:
>  skb_push+0x91/0xa0
>  nsh_gso_segment+0x4f3/0x570
>  skb_mac_gso_segment+0x19e/0x270
>  __skb_gso_segment+0x1e8/0x3c0
>  validate_xmit_skb+0x452/0x890
>  validate_xmit_skb_list+0x99/0xd0
>  sch_direct_xmit+0x294/0x7c0
>  __dev_queue_xmit+0x16f0/0x1d70
>  packet_xmit+0x185/0x210
>  packet_snd+0xc15/0x1170
>  packet_sendmsg+0x7b/0xa0
>  sock_sendmsg+0x14f/0x160
>
> The root cause is:
> nsh_gso_segment() use skb->network_header - nhoff to reset mac_header
> in skb_gso_error_unwind() if inner-layer protocol gso fails.
> However, skb->network_header may be reset by inner-layer protocol
> gso function e.g. mpls_gso_segment. skb->mac_header reset by the
> inaccurate network_header will be larger than skb headroom.
>
> nsh_gso_segment
>     nhoff = skb->network_header - skb->mac_header;
>     __skb_pull(skb,nsh_len)
>     skb_mac_gso_segment
>         mpls_gso_segment
>             skb_reset_network_header(skb);//skb->network_header+=nsh_len
>             return -EINVAL;
>     skb_gso_error_unwind
>         skb_push(skb, nsh_len);
>         skb->mac_header = skb->network_header - nhoff;
>         // skb->mac_header > skb->headroom, cause skb_push panic
>
> Use correct mac_offset to restore mac_header to fix it.
>
> Fixes: c411ed854584 ("nsh: add GSO support")
> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
>
> ---
> v2:
>   - Use skb->mac_header not skb->network_header-nhoff for mac_offset.
>
> v3:
>   - 'net' is noted in the subject.
>   - arrange local variable following reverse xmas tree order
> ---
>  net/nsh/nsh.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
> index e9ca007718b7..793e0bd94558 100644
> --- a/net/nsh/nsh.c
> +++ b/net/nsh/nsh.c
> @@ -77,6 +77,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
>  				       netdev_features_t features)
>  {
>  	struct sk_buff *segs = ERR_PTR(-EINVAL);
> +	u16 mac_offset = skb->mac_header;
>  	unsigned int nsh_len, mac_len;
>  	__be16 proto;
>  	int nhoff;
> @@ -108,8 +109,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
>  	segs = skb_mac_gso_segment(skb, features);
>  	if (IS_ERR_OR_NULL(segs)) {
>  		skb_gso_error_unwind(skb, htons(ETH_P_NSH), nsh_len,
> -				     skb->network_header - nhoff,
> -				     mac_len);
> +				     mac_offset, mac_len);
>  		goto out;
>  	}
>

I do not think this patch is enough ?

This is still not nice, because mac_header == 0xFFFF

nhoff = skb->network_header - skb->mac_header;
...
skb_set_mac_header(skb, -nhoff);

I would simply restore mac_header with "skb->mac_header = mac_offset"
and get rid of nhoff.

(Accept the fact that GSO layer should not rely on skb mac_header being set)

In the future, we might be able to rewrite GSO without any assumptions
on skb->mac_header.
> >
> > As the call trace shows, skb_panic was caused by wrong skb->mac_header
> > in nsh_gso_segment():
> >
> > invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> > CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
> > RIP: 0010:skb_panic+0xda/0xe0
> > call Trace:
> >  skb_push+0x91/0xa0
> >  nsh_gso_segment+0x4f3/0x570
> >  skb_mac_gso_segment+0x19e/0x270
> >  __skb_gso_segment+0x1e8/0x3c0
> >  validate_xmit_skb+0x452/0x890
> >  validate_xmit_skb_list+0x99/0xd0
> >  sch_direct_xmit+0x294/0x7c0
> >  __dev_queue_xmit+0x16f0/0x1d70
> >  packet_xmit+0x185/0x210
> >  packet_snd+0xc15/0x1170
> >  packet_sendmsg+0x7b/0xa0
> >  sock_sendmsg+0x14f/0x160
> >
> > The root cause is:
> > nsh_gso_segment() use skb->network_header - nhoff to reset mac_header
> > in skb_gso_error_unwind() if inner-layer protocol gso fails.
> > However, skb->network_header may be reset by inner-layer protocol
> > gso function e.g. mpls_gso_segment. skb->mac_header reset by the
> > inaccurate network_header will be larger than skb headroom.
> >
> > nsh_gso_segment
> >     nhoff = skb->network_header - skb->mac_header;
> >     __skb_pull(skb,nsh_len)
> >     skb_mac_gso_segment
> >         mpls_gso_segment
> >             skb_reset_network_header(skb);//skb->network_header+=nsh_len
> >             return -EINVAL;
> >     skb_gso_error_unwind
> >         skb_push(skb, nsh_len);
> >         skb->mac_header = skb->network_header - nhoff;
> >         // skb->mac_header > skb->headroom, cause skb_push panic
> >
> > Use correct mac_offset to restore mac_header to fix it.
> >
> > Fixes: c411ed854584 ("nsh: add GSO support")
> > Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
> >
> > ---
> > v2:
> >   - Use skb->mac_header not skb->network_header-nhoff for mac_offset.
> >
> > v3:
> >   - 'net' is noted in the subject.
> >   - arrange local variable following reverse xmas tree order
> > ---
> >  net/nsh/nsh.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
> > index e9ca007718b7..793e0bd94558 100644
> > --- a/net/nsh/nsh.c
> > +++ b/net/nsh/nsh.c
> > @@ -77,6 +77,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> >  				       netdev_features_t features)
> >  {
> >  	struct sk_buff *segs = ERR_PTR(-EINVAL);
> > +	u16 mac_offset = skb->mac_header;
> >  	unsigned int nsh_len, mac_len;
> >  	__be16 proto;
> >  	int nhoff;
> > @@ -108,8 +109,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> >  	segs = skb_mac_gso_segment(skb, features);
> >  	if (IS_ERR_OR_NULL(segs)) {
> >  		skb_gso_error_unwind(skb, htons(ETH_P_NSH), nsh_len,
> > -				     skb->network_header - nhoff,
> > -				     mac_len);
> > +				     mac_offset, mac_len);
> >  		goto out;
> >  	}
> >
>
> I do not think this patch is enough ?
>
> This is still not nice, because mac_header == 0xFFFF
>
> nhoff = skb->network_header - skb->mac_header;
> ...
> skb_set_mac_header(skb, -nhoff);
>
> I would simply restore mac_header with "skb->mac_header = mac_offset"
> and get rid of nhoff.
>
> (Accept the fact that GSO layer should not rely on skb mac_header being set)
>
> In the future, we might be able to rewrite GSO without any assumptions
> on skb->mac_header.

Thank you very much for your suggestions!
I will revise it in v4.

Dong Chenchen
diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
index e9ca007718b7..793e0bd94558 100644
--- a/net/nsh/nsh.c
+++ b/net/nsh/nsh.c
@@ -77,6 +77,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
 				       netdev_features_t features)
 {
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
+	u16 mac_offset = skb->mac_header;
 	unsigned int nsh_len, mac_len;
 	__be16 proto;
 	int nhoff;
@@ -108,8 +109,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
 	segs = skb_mac_gso_segment(skb, features);
 	if (IS_ERR_OR_NULL(segs)) {
 		skb_gso_error_unwind(skb, htons(ETH_P_NSH), nsh_len,
-				     skb->network_header - nhoff,
-				     mac_len);
+				     mac_offset, mac_len);
 		goto out;
 	}
 
As the call trace shows, skb_panic was caused by a wrong skb->mac_header
in nsh_gso_segment():

invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
RIP: 0010:skb_panic+0xda/0xe0
Call Trace:
 skb_push+0x91/0xa0
 nsh_gso_segment+0x4f3/0x570
 skb_mac_gso_segment+0x19e/0x270
 __skb_gso_segment+0x1e8/0x3c0
 validate_xmit_skb+0x452/0x890
 validate_xmit_skb_list+0x99/0xd0
 sch_direct_xmit+0x294/0x7c0
 __dev_queue_xmit+0x16f0/0x1d70
 packet_xmit+0x185/0x210
 packet_snd+0xc15/0x1170
 packet_sendmsg+0x7b/0xa0
 sock_sendmsg+0x14f/0x160

The root cause is:
nsh_gso_segment() uses skb->network_header - nhoff to reset mac_header
in skb_gso_error_unwind() if the inner-layer protocol GSO fails.
However, skb->network_header may be reset by the inner-layer protocol
GSO function, e.g. mpls_gso_segment(). skb->mac_header, restored from the
inaccurate network_header, will then be larger than the skb headroom.

nsh_gso_segment
    nhoff = skb->network_header - skb->mac_header;
    __skb_pull(skb,nsh_len)
    skb_mac_gso_segment
        mpls_gso_segment
            skb_reset_network_header(skb);//skb->network_header+=nsh_len
            return -EINVAL;
    skb_gso_error_unwind
        skb_push(skb, nsh_len);
        skb->mac_header = skb->network_header - nhoff;
        // skb->mac_header > skb->headroom, cause skb_push panic

Use the correct mac_offset to restore mac_header to fix it.

Fixes: c411ed854584 ("nsh: add GSO support")
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>

---
v2:
  - Use skb->mac_header not skb->network_header-nhoff for mac_offset.

v3:
  - 'net' is noted in the subject.
  - arrange local variables in reverse xmas tree order
---
 net/nsh/nsh.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)