Message ID | 20230802092340.9640-1-edward.cree@amd.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,net] net-gro: restore check for NULL skb in napi_gro_frags | expand |
On Wed, Aug 2, 2023 at 11:42 AM <edward.cree@amd.com> wrote: > > From: Edward Cree <ecree.xilinx@gmail.com> > > Cited commit removed the check on the grounds that napi_gro_frags must > not be called by drivers if napi_get_frags failed. But skb can also > be NULL if napi_frags_skb fails to pull the ethernet header ("dropping > impossible skb" message). In this case return GRO_CONSUMED, as > otherwise continuing on would cause a NULL dereference panic in > dev_gro_receive(). > > Fixes: 1d11fa696733 ("net-gro: remove GRO_DROP") > Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> > Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> > --- > An sfc customer has encountered this panic in the wild; we're still > investigating exactly how it happened (we have a reproducer) but it > seems wise to have the core handle this check rather than requiring > it in every driver. An ethernet driver feeding non-ethernet packets to the upper stacks seems weird to me, but given napi_frags_skb() does output a warning, I would say this patch would be acceptable until the real bug is fixed :/ Note that eth_type_trans() does not double-check that at least ETH_HLEN bytes are present in skb->data skb_pull_inline(skb, ETH_HLEN); So eth_type_trans() would definitely crash. Not sure why a napi_gro_frags() enabled driver would be allowed to cook arbitrary packets with length < ETH_HLEN Mixed feelings here, especially if for some reason the compiler would not inline napi_frags_skb().
On Wed, Aug 02, 2023 at 10:23:40AM +0100, edward.cree@amd.com wrote: > From: Edward Cree <ecree.xilinx@gmail.com> > > Cited commit removed the check on the grounds that napi_gro_frags must > not be called by drivers if napi_get_frags failed. But skb can also > be NULL if napi_frags_skb fails to pull the ethernet header ("dropping > impossible skb" message). In this case return GRO_CONSUMED, as > otherwise continuing on would cause a NULL dereference panic in > dev_gro_receive(). > > Fixes: 1d11fa696733 ("net-gro: remove GRO_DROP") > Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> > Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> > --- > An sfc customer has encountered this panic in the wild; we're still > investigating exactly how it happened (we have a reproducer) but it > seems wise to have the core handle this check rather than requiring > it in every driver. > --- > net/core/gro.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/core/gro.c b/net/core/gro.c > index 0759277dc14e..0159972038da 100644 > --- a/net/core/gro.c > +++ b/net/core/gro.c > @@ -731,6 +731,9 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) > gro_result_t ret; > struct sk_buff *skb = napi_frags_skb(napi); > > + if (!skb) > + return GRO_CONSUMED; > + > trace_napi_gro_frags_entry(skb); > > ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); Given that this case is rarely hit, and given that there are some performance concerns raised wrt this change, it may be beneficial to hint that this branch is unlikely.
On Fri, Aug 4, 2023 at 4:36 PM Tyler Stachecki <stachecki.tyler@gmail.com> wrote: > > Given that this case is rarely hit, and given that there are some performance > concerns raised wrt this change, it may be beneficial to hint that this > branch is unlikely. This is already done, a compiler should already infer this from this code: if (unlikely(skb_gro_header_hard(skb, hlen))) { eth = skb_gro_header_slow(skb, hlen, 0); if (unlikely(!eth)) { net_warn_ratelimited("%s: dropping impossible skb from %s\n", __func__, napi->dev->name); napi_reuse_skb(napi, skb); return NULL; }
On Fri, Aug 04, 2023 at 04:45:13PM +0200, Eric Dumazet wrote: > This is already done, a compiler should already infer this from this code: > > if (unlikely(skb_gro_header_hard(skb, hlen))) { > eth = skb_gro_header_slow(skb, hlen, 0); > if (unlikely(!eth)) { > net_warn_ratelimited("%s: dropping impossible skb from %s\n", > __func__, napi->dev->name); > napi_reuse_skb(napi, skb); > return NULL; > } It is a good point - though, FWIW, at least with clang-16 I am observing that it does not leverage this fact (fully?). Mostly for my own curiosity, I took a look... In both cases, the generated code for the NULL check is emitted as a a forward branch, meaning that (at least on x86), the branch direction hint is rendered as desired. However, without unlikely(...), the code for the taken branch is clumped roughly after the inlined code for napi_frags_finish and before the (hot paths of) napi_frags_skb. With unlikely added, the code for the taken NULL check is seated right next to the code generated for the unlikely paths in your comment. So, it does seem to have an effect, however minor! --- Anyways: perhaps the more important note here is that, at least with clang-16, I can confirm that everything is still inlined with this change.
On 02/08/2023 11:22, Eric Dumazet wrote: > On Wed, Aug 2, 2023 at 11:42 AM <edward.cree@amd.com> wrote: >> An sfc customer has encountered this panic in the wild; we're still >> investigating exactly how it happened (we have a reproducer) but it >> seems wise to have the core handle this check rather than requiring >> it in every driver. > > An ethernet driver feeding non-ethernet packets to the upper stacks > seems weird to me, ... > Not sure why a napi_gro_frags() enabled driver would be allowed to > cook arbitrary packets with length < ETH_HLEN ... > Mixed feelings here Fwiw we now have some more understanding of what caused this: the device produced an RX descriptor with a zero in its buffer length field (there was actually data in the buffer, but a — we think — firmware bug caused the length to be stored in the wrong place). And the driver, blithely trusting the device, attached the RX page to the skb from napi_get_frags, passing the buffer length it had straight to skb_fill_page_desc without checking it. (After all, the device had told us through RX event flags that the packet was TCP, so it had to have seen a complete set of headers.) This certainly can be fixed in the driver, by adding a length check before calling napi_gro_frags(), but I see two reasons to put it in the core instead. 1) The same issue is likely to be present in any napi_gro_frags() driver; I've just looked at e1000 and mlx4 (as examples) and it looks like they could both be susceptible if hw misbehaved in a similar way. 2) The core can recycle the SKB, but drivers can't because napi_reuse_skb is static. Instead, if they've already called napi_get_frags when they discover the problem, they have to do kfree_skb(skb); napi->skb = NULL; which is not pretty. Of course we could always export napi_reuse_skb... Another open question is whether, if we do put this in the core, we should do the same for the other RX entry points (napi_gro_receive, eth_type_trans) that could see something similar. Anyway, I'll post v2, with unlikely() per Tyler's analysis, and you can ack or nack as the mood takes you. -ed
On Wed, Aug 16, 2023 at 7:46 PM Edward Cree <ecree.xilinx@gmail.com> wrote: > > On 02/08/2023 11:22, Eric Dumazet wrote: > > On Wed, Aug 2, 2023 at 11:42 AM <edward.cree@amd.com> wrote: > >> An sfc customer has encountered this panic in the wild; we're still > >> investigating exactly how it happened (we have a reproducer) but it > >> seems wise to have the core handle this check rather than requiring > >> it in every driver. > > > > An ethernet driver feeding non-ethernet packets to the upper stacks > > seems weird to me, > ... > > Not sure why a napi_gro_frags() enabled driver would be allowed to > > cook arbitrary packets with length < ETH_HLEN > ... > > Mixed feelings here > Fwiw we now have some more understanding of what caused this: the > device produced an RX descriptor with a zero in its buffer length > field (there was actually data in the buffer, but a — we think — > firmware bug caused the length to be stored in the wrong place). > And the driver, blithely trusting the device, attached the RX page > to the skb from napi_get_frags, passing the buffer length it had > straight to skb_fill_page_desc without checking it. (After all, > the device had told us through RX event flags that the packet was > TCP, so it had to have seen a complete set of headers.) > This certainly can be fixed in the driver, by adding a length > check before calling napi_gro_frags(), but I see two reasons to > put it in the core instead. > 1) The same issue is likely to be present in any napi_gro_frags() > driver; I've just looked at e1000 and mlx4 (as examples) and it > looks like they could both be susceptible if hw misbehaved in a > similar way. Yet, it seems they do not misbehave ? How XDP will react to such bug btw ? I would vote to add the fix in a driver first. If really the same disease seems to be common to more than one vendor, perhaps we could harden core ?
diff --git a/net/core/gro.c b/net/core/gro.c index 0759277dc14e..0159972038da 100644 --- a/net/core/gro.c +++ b/net/core/gro.c @@ -731,6 +731,9 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) gro_result_t ret; struct sk_buff *skb = napi_frags_skb(napi); + if (!skb) + return GRO_CONSUMED; + trace_napi_gro_frags_entry(skb); ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));