Message ID | 01000163d08f00b4-068f6b54-5d34-447d-90c6-010a24fc36d5-000000@email.amazonses.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Christopher, This eliminates the failure as expected (at the source). I do not think such a solution is required, and it probably affect performance. As Matthew said, slab objects should not be used in sk_buff fragments. The source of these is my kernel TCP sockets, where kernel_sendpage() is used with slab payload. I eliminated this, and the failure disappeared, even though with this kind of fine timing issues, no failure does not mean anything Moreover, I tried triggering on slab in sk_buff fragments and nothing came up. So far: 1) Use of slab payload in kernel_sendpage() is not polite, even though we do not BUG on this and documentation does not tell it was just wrong. 2) RX path cannot bring sk_buffs in slab: drivers use alloc_pagexxx or page_frag_alloc(). What I am still wondering about (and investigating), is how kernel_sendpage() with slab payload results in slab payload on another socket RX. Do you see how page ref-counting can be broken with extra references taken on a slab page containing the fragments, and dropped when networking is done with them? Thanks, Anton On Tue, Jun 5, 2018 at 8:27 AM, Christopher Lameter <cl@linux.com> wrote: > On Fri, 1 Jun 2018, Anton Eidelman wrote: > > > I do not have a way of reproducing this decent enough to recommend: I'll > > keep digging. > > If you can reproduce it: Could you try the following patch? > > > > Subject: [NET] Fix false positives of skb_can_coalesce > > Skb fragments may be slab objects. Two slab objects may reside > in the same slab page. In that case skb_can_coalesce() may return > true althought the skb cannot be expanded because it would > cross a slab boundary. > > Enabling slab debugging will avoid the issue since red zones will > be inserted and thus the skb_can_coalesce() check will not detect > neighboring objects and return false. > > Signed-off-by: Christoph Lameter <cl@linux.com> > > Index: linux/include/linux/skbuff.h > =================================================================== > --- linux.orig/include/linux/skbuff.h > +++ linux/include/linux/skbuff.h > @@ -3010,8 +3010,29 @@ static inline bool skb_can_coalesce(stru > if (i) { > const struct skb_frag_struct *frag = > &skb_shinfo(skb)->frags[i - 1]; > > - return page == skb_frag_page(frag) && > - off == frag->page_offset + skb_frag_size(frag); > + if (page != skb_frag_page(frag)) > + return false; > + > + if (off != frag->page_offset + skb_frag_size(frag)) > + return false; > + > + /* > + * This may be a slab page and we may have pointers > + * to different slab objects in the same page > + */ > + if (!PageSlab(skb_frag_page(frag))) > + return true; > + > + /* > + * We could still return true if we would check here > + * if the two fragments are within the same > + * slab object. But that is complicated and > + * I guess we would need a new slab function > + * to check if two pointers are within the same > + * object. > + */ > + return false; > + > } > return false; > } >
On Tue, 5 Jun 2018, Anton Eidelman wrote: > What I am still wondering about (and investigating), is how kernel_sendpage() > with slab payload results in slab payload on another socket RX. > Do you see how page ref-counting can be broken with extra references taken > on a slab page containing the fragments, and dropped when networking is > done with them? The slab allocators do not use page refcounting. The objects may be destroyed via poisioning etc if you use kfree() while still holding a refcount on the page. Even without poisoning the slab allocator may overwrite the object.
Index: linux/include/linux/skbuff.h =================================================================== --- linux.orig/include/linux/skbuff.h +++ linux/include/linux/skbuff.h @@ -3010,8 +3010,29 @@ static inline bool skb_can_coalesce(stru if (i) { const struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i - 1]; - return page == skb_frag_page(frag) && - off == frag->page_offset + skb_frag_size(frag); + if (page != skb_frag_page(frag)) + return false; + + if (off != frag->page_offset + skb_frag_size(frag)) + return false; + + /* + * This may be a slab page and we may have pointers + * to different slab objects in the same page + */ + if (!PageSlab(skb_frag_page(frag))) + return true; + + /* + * We could still return true if we would check here + * if the two fragments are within the same + * slab object. But that is complicated and + * I guess we would need a new slab function + * to check if two pointers are within the same + * object. + */ + return false; + } return false; }