Message ID | 1400077432.7973.85.camel@edumazet-glaptop2.roam.corp.google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 14 May 2014 07:23:52 -0700 > On Wed, 2014-05-14 at 14:40 +0100, Zoltan Kiss wrote: >> Hi, >> >> Recently I've investigated issues around SKBTX_DEV_ZEROCOPY skbs where >> the frags list were modified. I came across this function skb_shift(), >> which moves frags between skbs. And there are a lot more of such kind, >> skb_split or skb_try_coalesce, for example. >> It could be a dangerous thing if a frag is referenced from an skb which >> doesn't have the original destructor_arg, and to avoid that >> skb_orphan_frags should be called. Although probably these functions are >> not normally touched in usual usecases, I think it would be useful to >> review core skb functions proactively and add an skb_orphan_frags >> everywhere where the frags could be referenced from other places. >> Any opinion about this? > > > For skb_shift(), it is currently used from tcp stack only, where > this SKBTX_DEV_ZEROCOPY thing is not used, so I do not think there is a > bug for the moment. > > I already gave a patch for skb_try_coalesce() : For this one we do not > wan skb_orphan_frags() overhead. Its simply better in this case to > abort. Eric can you please submit this formally? It is second time I've seen it posted as RFC :-) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2014-05-14 at 13:42 -0400, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Wed, 14 May 2014 07:23:52 -0700 > > > On Wed, 2014-05-14 at 14:40 +0100, Zoltan Kiss wrote: > >> Hi, > >> > >> Recently I've investigated issues around SKBTX_DEV_ZEROCOPY skbs where > >> the frags list were modified. I came across this function skb_shift(), > >> which moves frags between skbs. And there are a lot more of such kind, > >> skb_split or skb_try_coalesce, for example. > >> It could be a dangerous thing if a frag is referenced from an skb which > >> doesn't have the original destructor_arg, and to avoid that > >> skb_orphan_frags should be called. Although probably these functions are > >> not normally touched in usual usecases, I think it would be useful to > >> review core skb functions proactively and add an skb_orphan_frags > >> everywhere where the frags could be referenced from other places. > >> Any opinion about this? > > > > > > For skb_shift(), it is currently used from tcp stack only, where > > this SKBTX_DEV_ZEROCOPY thing is not used, so I do not think there is a > > bug for the moment. > > > > I already gave a patch for skb_try_coalesce() : For this one we do not > > wan skb_orphan_frags() overhead. Its simply better in this case to > > abort. > > Eric can you please submit this formally? It is second time I've seen > it posted as RFC :-) Sure, I was kind of waiting to make sure it was needed. It looks like it is ;) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 14/05/14 15:23, Eric Dumazet wrote: > On Wed, 2014-05-14 at 14:40 +0100, Zoltan Kiss wrote: >> Hi, >> >> Recently I've investigated issues around SKBTX_DEV_ZEROCOPY skbs where >> the frags list were modified. I came across this function skb_shift(), >> which moves frags between skbs. And there are a lot more of such kind, >> skb_split or skb_try_coalesce, for example. >> It could be a dangerous thing if a frag is referenced from an skb which >> doesn't have the original destructor_arg, and to avoid that >> skb_orphan_frags should be called. Although probably these functions are >> not normally touched in usual usecases, I think it would be useful to >> review core skb functions proactively and add an skb_orphan_frags >> everywhere where the frags could be referenced from other places. >> Any opinion about this? > > > For skb_shift(), it is currently used from tcp stack only, where > this SKBTX_DEV_ZEROCOPY thing is not used, so I do not think there is a > bug for the moment. It is called from tcp_input.c, which suggests it can be called on incoming TCP packets. If the backend domain communicates with the frontend through sockets, zerocopy packets can turn up here. But here is the thing: deliver_skb calls orphan_frags for every packet delivered to the local stack, so we are safe IF these functions are called before the IP stack. So we are safe now, but things can go wrong, if: - such a frag-mangling function is called before deliver_skb, now or in the future - if someone wants to take advantage of zerocopy in the guest<->backend path > > I already gave a patch for skb_try_coalesce() : For this one we do not > wan skb_orphan_frags() overhead. Its simply better in this case to > abort. > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 1b62343f5837..85995a14aafc 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -3838,7 +3839,10 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from, > return true; > } > > - if (skb_has_frag_list(to) || skb_has_frag_list(from)) > + if (skb_has_frag_list(to) || > + skb_has_frag_list(from) || > + (skb_shinfo(to)->tx_flags & SKBTX_DEV_ZEROCOPY) || > + (skb_shinfo(from)->tx_flags & SKBTX_DEV_ZEROCOPY)) > return false; > > if (skb_headlen(from) != 0) { > > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2014-05-14 at 20:41 +0100, Zoltan Kiss wrote: > On 14/05/14 15:23, Eric Dumazet wrote: > > On Wed, 2014-05-14 at 14:40 +0100, Zoltan Kiss wrote: > >> Hi, > >> > >> Recently I've investigated issues around SKBTX_DEV_ZEROCOPY skbs where > >> the frags list were modified. I came across this function skb_shift(), > >> which moves frags between skbs. And there are a lot more of such kind, > >> skb_split or skb_try_coalesce, for example. > >> It could be a dangerous thing if a frag is referenced from an skb which > >> doesn't have the original destructor_arg, and to avoid that > >> skb_orphan_frags should be called. Although probably these functions are > >> not normally touched in usual usecases, I think it would be useful to > >> review core skb functions proactively and add an skb_orphan_frags > >> everywhere where the frags could be referenced from other places. > >> Any opinion about this? > > > > > > For skb_shift(), it is currently used from tcp stack only, where > > this SKBTX_DEV_ZEROCOPY thing is not used, so I do not think there is a > > bug for the moment. > It is called from tcp_input.c, which suggests it can be called on > incoming TCP packets. Nope. We split outgoing packets, stored in the socket write queue. These packets are locally generated by tcp_sendmsg() and tcp_sendpage(), no way we use SKBTX_DEV_ZEROCOPY yet. This split happens when we receive an ACK, that's why it is in tcp_input.c -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 14/05/14 20:41, Zoltan Kiss wrote: > But here is the thing: deliver_skb calls orphan_frags for every packet > delivered to the local stack, so we are safe IF these functions are > called before the IP stack. So we are safe now, but things can go wrong, > if: > - such a frag-mangling function is called before deliver_skb, now or in > the future > - if someone wants to take advantage of zerocopy in the guest<->backend > path Running through the code I've found the following core functions can shuffle frags between skbs (and don't handle zerocopy skbs already): skb_gro_receive skb_shift skb_split None of them can meet at the moment with zerocopy skbs, but it's better to keep it in mind for the future, that would blow up these kind of skbs. Zoli -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 1b62343f5837..85995a14aafc 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3838,7 +3839,10 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from, return true; } - if (skb_has_frag_list(to) || skb_has_frag_list(from)) + if (skb_has_frag_list(to) || + skb_has_frag_list(from) || + (skb_shinfo(to)->tx_flags & SKBTX_DEV_ZEROCOPY) || + (skb_shinfo(from)->tx_flags & SKBTX_DEV_ZEROCOPY)) return false; if (skb_headlen(from) != 0) {