
[net] tcp: take care of mixed splice()/sendmsg(MSG_ZEROCOPY) case

Message ID 20220203225547.665114-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit f8d9d938514f46c4892aff6bfe32f425e84d81cc
Delegated to: Netdev Maintainers
Series [net] tcp: take care of mixed splice()/sendmsg(MSG_ZEROCOPY) case

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers warning 2 maintainers not CCed: dsahern@kernel.org yoshfuji@linux-ipv6.org
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 7 this patch: 7
netdev/checkpatch warning WARNING: return of an errno should typically be negative (ie: return -ENOMEM)
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eric Dumazet Feb. 3, 2022, 10:55 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

syzbot found that mixing sendpage() and sendmsg(MSG_ZEROCOPY)
calls over the same TCP socket would again trigger the
infamous warning in inet_sock_destruct()

	WARN_ON(sk_forward_alloc_get(sk));

While Talal's fix took a mix of regular copied data
and MSG_ZEROCOPY data in the same skb into account,
the sendpage() path was overlooked.

We want the charging to happen for sendpage(), because
the pages could be coming from a pipe. What is missing is the
downgrade of the pure zerocopy status, to make sure
sk_forward_alloc stays in sync.

Add a tcp_downgrade_zcopy_pure() helper so that both
callers can use it.

Fixes: 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Talal Ahmad <talalahmad@google.com>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)
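
For context, the pattern that triggers the warning is userspace interleaving
the two transmit paths on one socket: queue a pure MSG_ZEROCOPY skb, then
splice() pipe pages onto the same socket so the sendpage() path coalesces
data into it. Below is a hypothetical minimal sketch, not syzbot's actual
reproducer; the loopback destination (it assumes a listener on
127.0.0.1:8000) and the omission of error and zerocopy-completion handling
are illustration-only shortcuts:

/* sketch: mix sendmsg(MSG_ZEROCOPY) and splice() on one TCP socket */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
	static char buf[4096];
	int one = 1, pfd[2];
	struct sockaddr_in dst = {
		.sin_family = AF_INET,
		.sin_port = htons(8000),	/* assumed test port */
	};
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
	connect(fd, (struct sockaddr *)&dst, sizeof(dst));

	/* 1) queue data as a pure zerocopy skb */
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	sendmsg(fd, &msg, MSG_ZEROCOPY);

	/* 2) splice pipe pages onto the same socket: this goes through
	 * the sendpage() path and can coalesce into the zerocopy skb,
	 * the case this patch makes sure gets charged correctly
	 */
	pipe(pfd);
	write(pfd[1], buf, sizeof(buf));
	splice(pfd[0], NULL, fd, NULL, sizeof(buf), 0);

	close(fd);
	return 0;
}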

Comments

Soheil Hassas Yeganeh Feb. 4, 2022, 2:39 p.m. UTC | #1
On Thu, Feb 3, 2022 at 5:55 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> syzbot found that mixing sendpage() and sendmsg(MSG_ZEROCOPY)
> calls over the same TCP socket would again trigger the
> infamous warning in inet_sock_destruct()
>
>         WARN_ON(sk_forward_alloc_get(sk));
>
> While Talal's fix took a mix of regular copied data
> and MSG_ZEROCOPY data in the same skb into account,
> the sendpage() path was overlooked.
>
> We want the charging to happen for sendpage(), because
> the pages could be coming from a pipe. What is missing is the
> downgrade of the pure zerocopy status, to make sure
> sk_forward_alloc stays in sync.
>
> Add a tcp_downgrade_zcopy_pure() helper so that both
> callers can use it.
>
> Fixes: 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Cc: Talal Ahmad <talalahmad@google.com>
> Cc: Arjun Roy <arjunroy@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Nice catch!

> ---
>  net/ipv4/tcp.c | 33 +++++++++++++++++++--------------
>  1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index bdf108f544a45a2aa24bc962fb81dfd0ca1e0682..e1f259da988df7493ce7d71ad8743ec5025e4e7c 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -937,6 +937,22 @@ void tcp_remove_empty_skb(struct sock *sk)
>         }
>  }
>
> +/* skb changing from pure zc to mixed, must charge zc */
> +static int tcp_downgrade_zcopy_pure(struct sock *sk, struct sk_buff *skb)
> +{
> +       if (unlikely(skb_zcopy_pure(skb))) {
> +               u32 extra = skb->truesize -
> +                           SKB_TRUESIZE(skb_end_offset(skb));
> +
> +               if (!sk_wmem_schedule(sk, extra))
> +                       return ENOMEM;
> +
> +               sk_mem_charge(sk, extra);
> +               skb_shinfo(skb)->flags &= ~SKBFL_PURE_ZEROCOPY;
> +       }
> +       return 0;
> +}
> +
>  static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
>                                       struct page *page, int offset, size_t *size)
>  {
> @@ -972,7 +988,7 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
>                 tcp_mark_push(tp, skb);
>                 goto new_segment;
>         }
> -       if (!sk_wmem_schedule(sk, copy))
> +       if (tcp_downgrade_zcopy_pure(sk, skb) || !sk_wmem_schedule(sk, copy))
>                 return NULL;
>
>         if (can_coalesce) {
> @@ -1320,19 +1336,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>
>                         copy = min_t(int, copy, pfrag->size - pfrag->offset);
>
> -                       /* skb changing from pure zc to mixed, must charge zc */
> -                       if (unlikely(skb_zcopy_pure(skb))) {
> -                               u32 extra = skb->truesize -
> -                                           SKB_TRUESIZE(skb_end_offset(skb));
> -
> -                               if (!sk_wmem_schedule(sk, extra))
> -                                       goto wait_for_space;
> -
> -                               sk_mem_charge(sk, extra);
> -                               skb_shinfo(skb)->flags &= ~SKBFL_PURE_ZEROCOPY;
> -                       }
> -
> -                       if (!sk_wmem_schedule(sk, copy))
> +                       if (tcp_downgrade_zcopy_pure(sk, skb) ||
> +                           !sk_wmem_schedule(sk, copy))
>                                 goto wait_for_space;
>
>                         err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
> --
> 2.35.0.263.gb82422642f-goog
>
Jakub Kicinski Feb. 5, 2022, 4:06 a.m. UTC | #2
On Thu,  3 Feb 2022 14:55:47 -0800 Eric Dumazet wrote:
> +		if (!sk_wmem_schedule(sk, extra))
> +			return ENOMEM;

Let me make this negative when applying, looks like an inconsequential
typo with current callers.
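
The sign is indeed inconsequential with the current callers: both call
sites only test the helper's return value for nonzero, so a positive
ENOMEM and a proper -ENOMEM take the same failure path. Restating the
call-site pattern from the patch:

	/* any nonzero return from the helper takes the failure path,
	 * whatever the sign of the errno it carries */
	if (tcp_downgrade_zcopy_pure(sk, skb) ||
	    !sk_wmem_schedule(sk, copy))
		goto wait_for_space;

The sign would only start to matter if a future caller propagated the
value to userspace or compared it against a negative errno.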
Eric Dumazet Feb. 5, 2022, 4:20 a.m. UTC | #3
On Fri, Feb 4, 2022 at 8:06 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu,  3 Feb 2022 14:55:47 -0800 Eric Dumazet wrote:
> > +             if (!sk_wmem_schedule(sk, extra))
> > +                     return ENOMEM;
>
> Let me make this negative when applying, looks like an inconsequential
> typo with current callers.

Agreed, at first I was returning 0/1, and changed the 1 to ENOMEM at
the last moment.

Thanks !
patchwork-bot+netdevbpf@kernel.org Feb. 5, 2022, 4:30 a.m. UTC | #4
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu,  3 Feb 2022 14:55:47 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> syzbot found that mixing sendpage() and sendmsg(MSG_ZEROCOPY)
> calls over the same TCP socket would again trigger the
> infamous warning in inet_sock_destruct()
> 
> 	WARN_ON(sk_forward_alloc_get(sk));
> 
> [...]

Here is the summary with links:
  - [net] tcp: take care of mixed splice()/sendmsg(MSG_ZEROCOPY) case
    https://git.kernel.org/netdev/net/c/f8d9d938514f

You are awesome, thank you!

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index bdf108f544a45a2aa24bc962fb81dfd0ca1e0682..e1f259da988df7493ce7d71ad8743ec5025e4e7c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -937,6 +937,22 @@ void tcp_remove_empty_skb(struct sock *sk)
 	}
 }
 
+/* skb changing from pure zc to mixed, must charge zc */
+static int tcp_downgrade_zcopy_pure(struct sock *sk, struct sk_buff *skb)
+{
+	if (unlikely(skb_zcopy_pure(skb))) {
+		u32 extra = skb->truesize -
+			    SKB_TRUESIZE(skb_end_offset(skb));
+
+		if (!sk_wmem_schedule(sk, extra))
+			return ENOMEM;
+
+		sk_mem_charge(sk, extra);
+		skb_shinfo(skb)->flags &= ~SKBFL_PURE_ZEROCOPY;
+	}
+	return 0;
+}
+
 static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
 				      struct page *page, int offset, size_t *size)
 {
@@ -972,7 +988,7 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
 		tcp_mark_push(tp, skb);
 		goto new_segment;
 	}
-	if (!sk_wmem_schedule(sk, copy))
+	if (tcp_downgrade_zcopy_pure(sk, skb) || !sk_wmem_schedule(sk, copy))
 		return NULL;
 
 	if (can_coalesce) {
@@ -1320,19 +1336,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 			copy = min_t(int, copy, pfrag->size - pfrag->offset);
 
-			/* skb changing from pure zc to mixed, must charge zc */
-			if (unlikely(skb_zcopy_pure(skb))) {
-				u32 extra = skb->truesize -
-					    SKB_TRUESIZE(skb_end_offset(skb));
-
-				if (!sk_wmem_schedule(sk, extra))
-					goto wait_for_space;
-
-				sk_mem_charge(sk, extra);
-				skb_shinfo(skb)->flags &= ~SKBFL_PURE_ZEROCOPY;
-			}
-
-			if (!sk_wmem_schedule(sk, copy))
+			if (tcp_downgrade_zcopy_pure(sk, skb) ||
+			    !sk_wmem_schedule(sk, copy))
 				goto wait_for_space;
 
 			err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
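
A closing note on the accounting the helper performs. Under commit
9b65b17db723, a pure-zerocopy skb is charged to the socket only for its
own head, SKB_TRUESIZE(skb_end_offset(skb)); the user pages it references
are deliberately not charged. Once copied bytes are merged into that skb
it becomes mixed, so the remainder of its truesize has to be scheduled
and charged before the pure flag is cleared. A sketch of the arithmetic,
with purely illustrative sizes:

	/* Illustrative sizes only; real values depend on arch and config.
	 *   charged while pure zc: SKB_TRUESIZE(skb_end_offset(skb)), e.g.  ~768
	 *   full cost of the skb : skb->truesize (head + frags),      e.g. ~4864
	 *   still to be charged  : extra = 4864 - 768                     = 4096
	 */
	u32 extra = skb->truesize - SKB_TRUESIZE(skb_end_offset(skb));

	if (sk_wmem_schedule(sk, extra)) {	/* reserve forward alloc first */
		sk_mem_charge(sk, extra);	/* then consume it */
		skb_shinfo(skb)->flags &= ~SKBFL_PURE_ZEROCOPY;
	}

Skipping that charge on the sendpage() path is exactly what left
sk_forward_alloc nonzero at socket destruction and fired the WARN_ON
quoted above.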