Message ID | 20240611063618.106485-5-ofir.gal@volumez.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v4,1/4] net: introduce helper sendpages_ok() |
Xiubo/Ilya please take a look

On 6/11/24 09:36, Ofir Gal wrote:
> Currently ceph_tcp_sendpage() and do_try_sendpage() use sendpage_ok() to
> decide whether to enable MSG_SPLICE_PAGES. That check covers only the
> first page of the iterator, but the iterator may represent several
> contiguous pages.
>
> MSG_SPLICE_PAGES enables skb_splice_from_iter(), which checks every page
> it sends with sendpage_ok().
>
> When ceph_tcp_sendpage() or do_try_sendpage() sends an iterator whose
> first page is sendable but one of the other pages isn't,
> skb_splice_from_iter() warns and aborts the data transfer.
>
> Using the new helper sendpages_ok() to decide whether to enable
> MSG_SPLICE_PAGES solves the issue.
>
> Signed-off-by: Ofir Gal <ofir.gal@volumez.com>
> ---
>  net/ceph/messenger_v1.c | 2 +-
>  net/ceph/messenger_v2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 0cb61c76b9b8..a6788f284cd7 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -94,7 +94,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
>  	 * coalescing neighboring slab objects into a single frag which
>  	 * triggers one of hardened usercopy checks.
>  	 */
> -	if (sendpage_ok(page))
> +	if (sendpages_ok(page, size, offset))
>  		msg.msg_flags |= MSG_SPLICE_PAGES;
>
>  	bvec_set_page(&bvec, page, size, offset);
> diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
> index bd608ffa0627..27f8f6c8eb60 100644
> --- a/net/ceph/messenger_v2.c
> +++ b/net/ceph/messenger_v2.c
> @@ -165,7 +165,7 @@ static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
>  		 * coalescing neighboring slab objects into a single frag
>  		 * which triggers one of hardened usercopy checks.
>  		 */
> -		if (sendpage_ok(bv.bv_page))
> +		if (sendpages_ok(bv.bv_page, bv.bv_len, bv.bv_offset))
>  			msg.msg_flags |= MSG_SPLICE_PAGES;
>  		else
>  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
On Tue, Jul 16, 2024 at 2:46 PM Ofir Gal <ofir.gal@volumez.com> wrote:
>
> Xiubo/Ilya please take a look
>
> On 6/11/24 09:36, Ofir Gal wrote:
> > [...]

Hi Ofir,

Ceph should be fine as is -- there is an internal "cursor" abstraction
that is limited to PAGE_SIZE chunks, using bvec_iter_bvec() instead
of mp_bvec_iter_bvec(), etc.  This means that both do_try_sendpage() and
ceph_tcp_sendpage() should be called only with

  page_off + len <= PAGE_SIZE

being true even if the page is contiguous (and that we lose out on the
potential performance benefit, of course...).

That said, if the plan is to remove sendpage_ok() so that it doesn't
accidentally grow new users who are unaware of this pitfall, consider
this

Acked-by: Ilya Dryomov <idryomov@gmail.com>

Thanks,

                Ilya
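To illustrate the invariant described above, here is a minimal userspace sketch of page-bounded chunking. It is not the libceph cursor code: `model_chunk`, `model_next_chunk()` and `MODEL_PAGE_SIZE` are hypothetical names, and a 4096-byte page is assumed. It only shows why a range that is handed out in chunks clamped to the page boundary always satisfies page_off + len <= PAGE_SIZE, so each chunk touches exactly one page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The real PAGE_SIZE depends on the architecture. */
#define MODEL_PAGE_SIZE ((size_t)4096)

/* Hypothetical chunk descriptor: which page, where in it, how many bytes. */
struct model_chunk {
	size_t page_idx;
	size_t page_off;
	size_t len;
};

/*
 * Return the next chunk of a byte range that starts at absolute position
 * `pos` and has `remaining` bytes left. The chunk is clamped so that it
 * never crosses a page boundary.
 */
struct model_chunk model_next_chunk(size_t pos, size_t remaining)
{
	struct model_chunk c;

	assert(remaining > 0);
	c.page_idx = pos / MODEL_PAGE_SIZE;
	c.page_off = pos % MODEL_PAGE_SIZE;
	/* At most the bytes left in this page... */
	c.len = MODEL_PAGE_SIZE - c.page_off;
	/* ...and never more than the bytes left in the range. */
	if (c.len > remaining)
		c.len = remaining;
	return c;
}
```

With chunks produced this way, a single-page check on the chunk's page covers every byte that is sent, which is why the multi-page check is not strictly needed for such callers.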
On 17/07/2024 23:26, Ilya Dryomov wrote:
> On Tue, Jul 16, 2024 at 2:46 PM Ofir Gal <ofir.gal@volumez.com> wrote:
>> Xiubo/Ilya please take a look
>>
>> On 6/11/24 09:36, Ofir Gal wrote:
>>> [...]
> Hi Ofir,
>
> Ceph should be fine as is -- there is an internal "cursor" abstraction
> that is limited to PAGE_SIZE chunks, using bvec_iter_bvec() instead
> of mp_bvec_iter_bvec(), etc.  This means that both do_try_sendpage() and
> ceph_tcp_sendpage() should be called only with
>
>   page_off + len <= PAGE_SIZE
>
> being true even if the page is contiguous (and that we lose out on the
> potential performance benefit, of course...).
>
> That said, if the plan is to remove sendpage_ok() so that it doesn't
> accidentally grow new users who are unaware of this pitfall, consider
> this
>
> Acked-by: Ilya Dryomov <idryomov@gmail.com>

Which tree should this go through? We can take it via the nvme tree,
unless someone else wants to queue it up...
On 7/17/24 23:26, Ilya Dryomov wrote:
> On Tue, Jul 16, 2024 at 2:46 PM Ofir Gal <ofir.gal@volumez.com> wrote:
>>
>> Xiubo/Ilya please take a look
>>
>> On 6/11/24 09:36, Ofir Gal wrote:
>>> [...]
>
> Hi Ofir,
>
> Ceph should be fine as is -- there is an internal "cursor" abstraction
> that is limited to PAGE_SIZE chunks, using bvec_iter_bvec() instead
> of mp_bvec_iter_bvec(), etc.  This means that both do_try_sendpage() and
> ceph_tcp_sendpage() should be called only with
>
>   page_off + len <= PAGE_SIZE
>
> being true even if the page is contiguous (and that we lose out on the
> potential performance benefit, of course...).
>
> That said, if the plan is to remove sendpage_ok() so that it doesn't
> accidentally grow new users who are unaware of this pitfall, consider
> this
>
> Acked-by: Ilya Dryomov <idryomov@gmail.com>
>
> Thanks,
>
>                 Ilya

I don't think the plan is to remove sendpage_ok() (unless someone says
otherwise). I'm sending v5 without the libceph patch.

Thanks
diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
index 0cb61c76b9b8..a6788f284cd7 100644
--- a/net/ceph/messenger_v1.c
+++ b/net/ceph/messenger_v1.c
@@ -94,7 +94,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
 	 * coalescing neighboring slab objects into a single frag which
 	 * triggers one of hardened usercopy checks.
 	 */
-	if (sendpage_ok(page))
+	if (sendpages_ok(page, size, offset))
 		msg.msg_flags |= MSG_SPLICE_PAGES;
 
 	bvec_set_page(&bvec, page, size, offset);
diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index bd608ffa0627..27f8f6c8eb60 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -165,7 +165,7 @@ static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
 		 * coalescing neighboring slab objects into a single frag
 		 * which triggers one of hardened usercopy checks.
 		 */
-		if (sendpage_ok(bv.bv_page))
+		if (sendpages_ok(bv.bv_page, bv.bv_len, bv.bv_offset))
 			msg.msg_flags |= MSG_SPLICE_PAGES;
 		else
 			msg.msg_flags &= ~MSG_SPLICE_PAGES;
Currently ceph_tcp_sendpage() and do_try_sendpage() use sendpage_ok() to
decide whether to enable MSG_SPLICE_PAGES. That check covers only the
first page of the iterator, but the iterator may represent several
contiguous pages.

MSG_SPLICE_PAGES enables skb_splice_from_iter(), which checks every page
it sends with sendpage_ok().

When ceph_tcp_sendpage() or do_try_sendpage() sends an iterator whose
first page is sendable but one of the other pages isn't,
skb_splice_from_iter() warns and aborts the data transfer.

Using the new helper sendpages_ok() to decide whether to enable
MSG_SPLICE_PAGES solves the issue.

Signed-off-by: Ofir Gal <ofir.gal@volumez.com>
---
 net/ceph/messenger_v1.c | 2 +-
 net/ceph/messenger_v2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
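The difference between checking only the first page and checking every page in the range can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel helpers themselves: `struct model_page`, `model_sendpage_ok()` and `model_sendpages_ok()` are hypothetical stand-ins, a 4096-byte page is assumed, and a page is treated as sendable when it is not a slab page and has a reference count of at least one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for struct page: only the properties checked here. */
struct model_page {
	bool slab;     /* page belongs to the slab allocator */
	int  refcount; /* page reference count */
};

/* Single-page check: not a slab page and at least one reference held. */
bool model_sendpage_ok(const struct model_page *page)
{
	return !page->slab && page->refcount >= 1;
}

/*
 * Range check: every page touched by `len` bytes starting `offset` bytes
 * into the contiguous array `pages` must pass the single-page check.
 */
bool model_sendpages_ok(const struct model_page *pages, size_t len,
			size_t offset)
{
	size_t first, last, i;

	assert(len > 0);
	first = offset >> MODEL_PAGE_SHIFT;
	last = (offset + len - 1) >> MODEL_PAGE_SHIFT;

	for (i = first; i <= last; i++) {
		if (!model_sendpage_ok(&pages[i]))
			return false;
	}
	return true;
}
```

In this model, a range whose first page is sendable but whose second page is a slab page passes the single-page check and fails the range check, which is the situation the commit message describes.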