From patchwork Fri Mar 12 08:33:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 12133947
X-Patchwork-Delegate: kuba@kernel.org
From: Eric Dumazet
To: Greg Kroah-Hartman
Cc: "David S . Miller", netdev, Eric Dumazet, Eric Dumazet,
 stable@vger.kernel.org
Subject: [PATCH 4.19-stable 1/3] tcp: annotate tp->copied_seq lockless reads
Date: Fri, 12 Mar 2021 00:33:21 -0800
Message-Id: <20210312083323.3720479-1-eric.dumazet@gmail.com>
X-Mailer: git-send-email 2.31.0.rc2.261.g7f71774620-goog
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Eric Dumazet

[ Upstream commit 7db48e983930285b765743ebd665aecf9850582b ]

There are few places where we fetch tp->copied_seq while
this field can change from IRQ or other cpu.

We need to add READ_ONCE() annotations, and also make
sure write sides use corresponding WRITE_ONCE() to avoid
store-tearing.

Note that tcp_inq_hint() was already using READ_ONCE(tp->copied_seq)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
---
 net/ipv4/tcp.c           | 18 +++++++++---------
 net/ipv4/tcp_diag.c      |  3 ++-
 net/ipv4/tcp_input.c     |  6 +++---
 net/ipv4/tcp_ipv4.c      |  2 +-
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c    |  2 +-
 net/ipv6/tcp_ipv6.c      |  2 +-
 7 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 98e8ee8bb7593a30442c2e8d9694424c818840ed..f639c7d6083821c8725f5e28312eff3cbfa82e54 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -567,7 +567,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 	    (state != TCP_SYN_RECV || tp->fastopen_rsk)) {
 		int target = sock_rcvlowat(sk, 0, INT_MAX);
 
-		if (tp->urg_seq == tp->copied_seq &&
+		if (tp->urg_seq == READ_ONCE(tp->copied_seq) &&
 		    !sock_flag(sk, SOCK_URGINLINE) &&
 		    tp->urg_data)
 			target++;
@@ -628,7 +628,7 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 		unlock_sock_fast(sk, slow);
 		break;
 	case SIOCATMARK:
-		answ = tp->urg_data && tp->urg_seq == tp->copied_seq;
+		answ = tp->urg_data && tp->urg_seq == READ_ONCE(tp->copied_seq);
 		break;
 	case SIOCOUTQ:
 		if (sk->sk_state == TCP_LISTEN)
@@ -1696,9 +1696,9 @@ int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
 		sk_eat_skb(sk, skb);
 		if (!desc->count)
 			break;
-		tp->copied_seq = seq;
+		WRITE_ONCE(tp->copied_seq, seq);
 	}
-	tp->copied_seq = seq;
+	WRITE_ONCE(tp->copied_seq, seq);
 
 	tcp_rcv_space_adjust(sk);
 
@@ -1835,7 +1835,7 @@ static int tcp_zerocopy_receive(struct sock *sk,
 out:
 	up_read(&current->mm->mmap_sem);
 	if (length) {
-		tp->copied_seq = seq;
+		WRITE_ONCE(tp->copied_seq, seq);
 		tcp_rcv_space_adjust(sk);
 
 		/* Clean up data we have read: This will do ACK frames. */
@@ -2112,7 +2112,7 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 			if (urg_offset < used) {
 				if (!urg_offset) {
 					if (!sock_flag(sk, SOCK_URGINLINE)) {
-						++*seq;
+						WRITE_ONCE(*seq, *seq + 1);
 						urg_hole++;
 						offset++;
 						used--;
@@ -2134,7 +2134,7 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 			}
 		}
 
-		*seq += used;
+		WRITE_ONCE(*seq, *seq + used);
 		copied += used;
 		len -= used;
 
@@ -2163,7 +2163,7 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 
 found_fin_ok:
 		/* Process the FIN. */
-		++*seq;
+		WRITE_ONCE(*seq, *seq + 1);
 		if (!(flags & MSG_PEEK))
 			sk_eat_skb(sk, skb);
 		break;
@@ -2578,7 +2578,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 
 	tcp_clear_xmit_timers(sk);
 	__skb_queue_purge(&sk->sk_receive_queue);
-	tp->copied_seq = tp->rcv_nxt;
+	WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 	tp->urg_data = 0;
 	tcp_write_queue_purge(sk);
 	tcp_fastopen_active_disable_ofo_check(sk);
diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c
index c9e97f304f984b9339a683c45dea85a84ccba5bf..a96b252c742cb58108123a5ccad15511634dcfc5 100644
--- a/net/ipv4/tcp_diag.c
+++ b/net/ipv4/tcp_diag.c
@@ -30,7 +30,8 @@ static void tcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r,
 	} else if (sk->sk_type == SOCK_STREAM) {
 		const struct tcp_sock *tp = tcp_sk(sk);
 
-		r->idiag_rqueue = max_t(int, READ_ONCE(tp->rcv_nxt) - tp->copied_seq, 0);
+		r->idiag_rqueue = max_t(int, READ_ONCE(tp->rcv_nxt) -
+					     READ_ONCE(tp->copied_seq), 0);
 		r->idiag_wqueue = tp->write_seq - tp->snd_una;
 	}
 	if (info)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 5ffc85c8952a04905af7e560ce83135e73980bd3..36bff9291530b19539936609f001a6674acc8f72 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5889,7 +5889,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		/* Remember, tcp_poll() does not lock socket!
 		 * Change state from SYN-SENT only after copied_seq
 		 * is initialized. */
-		tp->copied_seq = tp->rcv_nxt;
+		WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 
 		smc_check_reset_syn(tp);
 
@@ -5964,7 +5964,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		}
 
 		WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1);
-		tp->copied_seq = tp->rcv_nxt;
+		WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 		tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1;
 
 		/* RFC1323: The window in SYN & SYN/ACK segments is
@@ -6126,7 +6126,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			tcp_rearm_rto(sk);
 		} else {
 			tcp_init_transfer(sk, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
-			tp->copied_seq = tp->rcv_nxt;
+			WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 		}
 		smp_mb();
 		tcp_set_state(sk, TCP_ESTABLISHED);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7536f4c0bbf4f4b9ff60be4e4c83bca71945382b..627b5fb1eac8ecf808ea33d2994198b4551f3fc9 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2340,7 +2340,7 @@ static void get_tcp4_sock(struct sock *sk, struct seq_file *f, int i)
 		 * we might find a transient negative value.
 		 */
 		rx_queue = max_t(int, READ_ONCE(tp->rcv_nxt) -
-				      tp->copied_seq, 0);
+				      READ_ONCE(tp->copied_seq), 0);
 
 	seq_printf(f, "%4d: %08X:%04X %08X:%04X %02X %08X:%08X %02X:%08lX "
 			"%08X %5u %8d %lu %d %pK %lu %lu %u %u %d",
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 7ba8a90772b0a631e9d6af29699f99039fbce705..0b1a04fa54392fdb3325d45fd1d5e0aaa3170b6c 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -470,7 +470,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 
 	seq = treq->rcv_isn + 1;
 	newtp->rcv_wup = seq;
-	newtp->copied_seq = seq;
+	WRITE_ONCE(newtp->copied_seq, seq);
 	WRITE_ONCE(newtp->rcv_nxt, seq);
 	newtp->segs_in = 1;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 3cfefec819758989c225b131d845ad1e76552ea7..662aa48173b8197e006127e4d84fb2c1961836a4 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3381,7 +3381,7 @@ static void tcp_connect_init(struct sock *sk)
 	else
 		tp->rcv_tstamp = tcp_jiffies32;
 	tp->rcv_wup = tp->rcv_nxt;
-	tp->copied_seq = tp->rcv_nxt;
+	WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 
 	inet_csk(sk)->icsk_rto = tcp_timeout_init(sk);
 	inet_csk(sk)->icsk_retransmits = 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2e76ebfdc907dde33475310b42fac7fd56eee3a8..de9b9c0bf18f721f44d7ee0e4eec74c0b5576947 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1839,7 +1839,7 @@ static void get_tcp6_sock(struct seq_file *seq, struct sock *sp, int i)
 		 * we might find a transient negative value.
 		 */
 		rx_queue = max_t(int, READ_ONCE(tp->rcv_nxt) -
-				      tp->copied_seq, 0);
+				      READ_ONCE(tp->copied_seq), 0);
 
 	seq_printf(seq,
 		   "%4d: %08X%08X%08X%08X:%04X %08X%08X%08X%08X:%04X "

From patchwork Fri Mar 12 08:33:22 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 12133943
X-Patchwork-Delegate: kuba@kernel.org
From: Eric Dumazet
To: Greg Kroah-Hartman
Cc: "David S .
 Miller", netdev, Eric Dumazet, Eric Dumazet, stable@vger.kernel.org
Subject: [PATCH 4.19-stable 2/3] tcp: annotate tp->write_seq lockless reads
Date: Fri, 12 Mar 2021 00:33:22 -0800
Message-Id: <20210312083323.3720479-2-eric.dumazet@gmail.com>
X-Mailer: git-send-email 2.31.0.rc2.261.g7f71774620-goog
In-Reply-To: <20210312083323.3720479-1-eric.dumazet@gmail.com>
References: <20210312083323.3720479-1-eric.dumazet@gmail.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Eric Dumazet

[ Upstream commit 0f31746452e6793ad6271337438af8f4defb8940 ]

There are few places where we fetch tp->write_seq while
this field can change from IRQ or other cpu.

We need to add READ_ONCE() annotations, and also make
sure write sides use corresponding WRITE_ONCE() to avoid
store-tearing.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
---
 include/net/tcp.h        |  2 +-
 net/ipv4/tcp.c           | 20 ++++++++++++--------
 net/ipv4/tcp_diag.c      |  2 +-
 net/ipv4/tcp_ipv4.c      | 21 ++++++++++++---------
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c    |  4 ++--
 net/ipv6/tcp_ipv6.c      | 13 +++++++------
 7 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4fe3ab47b4803700e50346c0a85bc347046f6730..3f0d654984cf43fbbc5a51ebd4d654803d0e3649 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1880,7 +1880,7 @@ static inline u32 tcp_notsent_lowat(const struct tcp_sock *tp)
 static inline bool tcp_stream_memory_free(const struct sock *sk)
 {
 	const struct tcp_sock *tp = tcp_sk(sk);
-	u32 notsent_bytes = tp->write_seq - tp->snd_nxt;
+	u32 notsent_bytes = READ_ONCE(tp->write_seq) - tp->snd_nxt;
 
 	return notsent_bytes < tcp_notsent_lowat(tp);
 }
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f639c7d6083821c8725f5e28312eff3cbfa82e54..370faff782cd363e82014969331df459b8188d94 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -637,7 +637,7 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 		if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
 			answ = 0;
 		else
-			answ = tp->write_seq - tp->snd_una;
+			answ = READ_ONCE(tp->write_seq) - tp->snd_una;
 		break;
 	case SIOCOUTQNSD:
 		if (sk->sk_state == TCP_LISTEN)
@@ -646,7 +646,7 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 		if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
 			answ = 0;
 		else
-			answ = tp->write_seq - tp->snd_nxt;
+			answ = READ_ONCE(tp->write_seq) - tp->snd_nxt;
 		break;
 	default:
 		return -ENOIOCTLCMD;
@@ -1037,7 +1037,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 		sk->sk_wmem_queued += copy;
 		sk_mem_charge(sk, copy);
 		skb->ip_summed = CHECKSUM_PARTIAL;
-		tp->write_seq += copy;
+		WRITE_ONCE(tp->write_seq, tp->write_seq + copy);
 		TCP_SKB_CB(skb)->end_seq += copy;
 		tcp_skb_pcount_set(skb, 0);
 
@@ -1391,7 +1391,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 		if (!copied)
 			TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH;
 
-		tp->write_seq += copy;
+		WRITE_ONCE(tp->write_seq, tp->write_seq + copy);
 		TCP_SKB_CB(skb)->end_seq += copy;
 		tcp_skb_pcount_set(skb, 0);
 
@@ -2556,6 +2556,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	int old_state = sk->sk_state;
+	u32 seq;
 
 	if (old_state != TCP_CLOSE)
 		tcp_set_state(sk, TCP_CLOSE);
@@ -2593,9 +2594,12 @@ int tcp_disconnect(struct sock *sk, int flags)
 	sock_reset_flag(sk, SOCK_DONE);
 	tp->srtt_us = 0;
 	tp->rcv_rtt_last_tsecr = 0;
-	tp->write_seq += tp->max_window + 2;
-	if (tp->write_seq == 0)
-		tp->write_seq = 1;
+
+	seq = tp->write_seq + tp->max_window + 2;
+	if (!seq)
+		seq = 1;
+	WRITE_ONCE(tp->write_seq, seq);
+
 	tp->snd_cwnd = 2;
 	icsk->icsk_probes_out = 0;
 	tp->snd_ssthresh = TCP_INFINITE_SSTHRESH;
@@ -2885,7 +2889,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		if (sk->sk_state != TCP_CLOSE)
 			err = -EPERM;
 		else if (tp->repair_queue == TCP_SEND_QUEUE)
-			tp->write_seq = val;
+			WRITE_ONCE(tp->write_seq, val);
 		else if (tp->repair_queue == TCP_RECV_QUEUE) {
 			WRITE_ONCE(tp->rcv_nxt, val);
 			WRITE_ONCE(tp->copied_seq, val);
diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c
index a96b252c742cb58108123a5ccad15511634dcfc5..2a46f9f81ba09278195e4fe310cc06c1b3c772f0 100644
--- a/net/ipv4/tcp_diag.c
+++ b/net/ipv4/tcp_diag.c
@@ -32,7 +32,7 @@ static void tcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r,
 
 		r->idiag_rqueue = max_t(int, READ_ONCE(tp->rcv_nxt) -
 					     READ_ONCE(tp->copied_seq), 0);
-		r->idiag_wqueue = tp->write_seq - tp->snd_una;
+		r->idiag_wqueue = READ_ONCE(tp->write_seq) - tp->snd_una;
 	}
 	if (info)
 		tcp_get_info(sk, info);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 627b5fb1eac8ecf808ea33d2994198b4551f3fc9..ac6135555e24a9b9244d1cd27e3d5619f8252490 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -169,9 +169,11 @@ int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
 		 * without appearing to create any others.
 		 */
 		if (likely(!tp->repair)) {
-			tp->write_seq = tcptw->tw_snd_nxt + 65535 + 2;
-			if (tp->write_seq == 0)
-				tp->write_seq = 1;
+			u32 seq = tcptw->tw_snd_nxt + 65535 + 2;
+
+			if (!seq)
+				seq = 1;
+			WRITE_ONCE(tp->write_seq, seq);
 			tp->rx_opt.ts_recent = tcptw->tw_ts_recent;
 			tp->rx_opt.ts_recent_stamp = tcptw->tw_ts_recent_stamp;
 		}
@@ -258,7 +260,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 		tp->rx_opt.ts_recent = 0;
 		tp->rx_opt.ts_recent_stamp = 0;
 		if (likely(!tp->repair))
-			tp->write_seq = 0;
+			WRITE_ONCE(tp->write_seq, 0);
 	}
 
 	inet->inet_dport = usin->sin_port;
@@ -296,10 +298,11 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 
 	if (likely(!tp->repair)) {
 		if (!tp->write_seq)
-			tp->write_seq = secure_tcp_seq(inet->inet_saddr,
-						       inet->inet_daddr,
-						       inet->inet_sport,
-						       usin->sin_port);
+			WRITE_ONCE(tp->write_seq,
+				   secure_tcp_seq(inet->inet_saddr,
+						  inet->inet_daddr,
+						  inet->inet_sport,
+						  usin->sin_port));
 		tp->tsoffset = secure_tcp_ts_off(sock_net(sk),
 						 inet->inet_saddr,
 						 inet->inet_daddr);
@@ -2345,7 +2348,7 @@ static void get_tcp4_sock(struct sock *sk, struct seq_file *f, int i)
 	seq_printf(f, "%4d: %08X:%04X %08X:%04X %02X %08X:%08X %02X:%08lX "
 			"%08X %5u %8d %lu %d %pK %lu %lu %u %u %d",
 		i, src, srcp, dest, destp, state,
-		tp->write_seq - tp->snd_una,
+		READ_ONCE(tp->write_seq) - tp->snd_una,
 		rx_queue,
 		timer_active,
 		jiffies_delta_to_clock_t(timer_expires - jiffies),
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 0b1a04fa54392fdb3325d45fd1d5e0aaa3170b6c..9436fb9b6a3d384b265b080fc41246987fbb0ea4 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -510,7 +510,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 	newtp->app_limited = ~0U;
 
 	tcp_init_xmit_timers(newsk);
-	newtp->write_seq = newtp->pushed_seq = treq->snt_isn + 1;
+	WRITE_ONCE(newtp->write_seq, newtp->pushed_seq = treq->snt_isn + 1);
 
 	newtp->rx_opt.saw_tstamp = 0;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 662aa48173b8197e006127e4d84fb2c1961836a4..9b74041e8dd100a0123f89025cee3bed7c58d30e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1175,7 +1175,7 @@ static void tcp_queue_skb(struct sock *sk, struct sk_buff *skb)
 	struct tcp_sock *tp = tcp_sk(sk);
 
 	/* Advance write_seq and place onto the write_queue. */
-	tp->write_seq = TCP_SKB_CB(skb)->end_seq;
+	WRITE_ONCE(tp->write_seq, TCP_SKB_CB(skb)->end_seq);
 	__skb_header_release(skb);
 	tcp_add_write_queue_tail(sk, skb);
 	sk->sk_wmem_queued += skb->truesize;
@@ -3397,7 +3397,7 @@ static void tcp_connect_queue_skb(struct sock *sk, struct sk_buff *skb)
 	__skb_header_release(skb);
 	sk->sk_wmem_queued += skb->truesize;
 	sk_mem_charge(sk, skb->truesize);
-	tp->write_seq = tcb->end_seq;
+	WRITE_ONCE(tp->write_seq, tcb->end_seq);
 	tp->packets_out += tcp_skb_pcount(skb);
 }
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index de9b9c0bf18f721f44d7ee0e4eec74c0b5576947..6e84f2eb08d643c2c79f34e90b05329a3dbabb56 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -206,7 +206,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 	    !ipv6_addr_equal(&sk->sk_v6_daddr, &usin->sin6_addr)) {
 		tp->rx_opt.ts_recent = 0;
 		tp->rx_opt.ts_recent_stamp = 0;
-		tp->write_seq = 0;
+		WRITE_ONCE(tp->write_seq, 0);
 	}
 
 	sk->sk_v6_daddr = usin->sin6_addr;
@@ -304,10 +304,11 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 	if (likely(!tp->repair)) {
 		if (!tp->write_seq)
-			tp->write_seq = secure_tcpv6_seq(np->saddr.s6_addr32,
-							 sk->sk_v6_daddr.s6_addr32,
-							 inet->inet_sport,
-							 inet->inet_dport);
+			WRITE_ONCE(tp->write_seq,
+				   secure_tcpv6_seq(np->saddr.s6_addr32,
+						    sk->sk_v6_daddr.s6_addr32,
+						    inet->inet_sport,
+						    inet->inet_dport));
 		tp->tsoffset = secure_tcpv6_ts_off(sock_net(sk),
 						   np->saddr.s6_addr32,
 						   sk->sk_v6_daddr.s6_addr32);
@@ -1850,7 +1851,7 @@ static void get_tcp6_sock(struct seq_file *seq, struct sock *sp, int i)
 		   dest->s6_addr32[0], dest->s6_addr32[1],
 		   dest->s6_addr32[2], dest->s6_addr32[3], destp,
 		   state,
-		   tp->write_seq - tp->snd_una,
+		   READ_ONCE(tp->write_seq) - tp->snd_una,
 		   rx_queue,
 		   timer_active,
 		   jiffies_delta_to_clock_t(timer_expires - jiffies),

From patchwork Fri Mar 12 08:33:23 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 12133945
X-Patchwork-Delegate: kuba@kernel.org
From: Eric Dumazet
To: Greg Kroah-Hartman
Cc: "David S .
 Miller", netdev, Eric Dumazet, Eric Dumazet, stable@vger.kernel.org,
 Pavel Emelyanov, Qingyu Li
Subject: [PATCH 4.19-stable 3/3] tcp: add sanity tests to TCP_QUEUE_SEQ
Date: Fri, 12 Mar 2021 00:33:23 -0800
Message-Id: <20210312083323.3720479-3-eric.dumazet@gmail.com>
X-Mailer: git-send-email 2.31.0.rc2.261.g7f71774620-goog
In-Reply-To: <20210312083323.3720479-1-eric.dumazet@gmail.com>
References: <20210312083323.3720479-1-eric.dumazet@gmail.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Eric Dumazet

[ Upstream commit 8811f4a9836e31c14ecdf79d9f3cb7c5d463265d ]

Qingyu Li reported a syzkaller bug where the repro
changes RCV SEQ _after_ restoring data in the receive queue.

mprotect(0x4aa000, 12288, PROT_READ) = 0
mmap(0x1ffff000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x1ffff000
mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
mmap(0x21000000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x21000000
socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="0x0000000000000003\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20
setsockopt(3, SOL_TCP, TCP_REPAIR, [0], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [128], 4) = 0
recvfrom(3, NULL, 20, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer)

syslog shows:
[ 111.205099] TCP recvmsg seq # bug 2: copied 80, seq 0, rcvnxt 80, fl 0
[ 111.207894] WARNING: CPU: 1 PID: 356 at net/ipv4/tcp.c:2343 tcp_recvmsg_locked+0x90e/0x29a0

This should not be allowed. TCP_QUEUE_SEQ should only be used
when queues are empty.

This patch fixes this case, and the tx path as well.

Fixes: ee9952831cfd ("tcp: Initial repair mode")
Signed-off-by: Eric Dumazet
Cc: Pavel Emelyanov
Link: https://bugzilla.kernel.org/show_bug.cgi?id=212005
Reported-by: Qingyu Li
Signed-off-by: David S. Miller
---
 net/ipv4/tcp.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 370faff782cd363e82014969331df459b8188d94..769e1f683471ace151855df8c682c1b784cc59f7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2886,16 +2886,23 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		break;
 
 	case TCP_QUEUE_SEQ:
-		if (sk->sk_state != TCP_CLOSE)
+		if (sk->sk_state != TCP_CLOSE) {
 			err = -EPERM;
-		else if (tp->repair_queue == TCP_SEND_QUEUE)
-			WRITE_ONCE(tp->write_seq, val);
-		else if (tp->repair_queue == TCP_RECV_QUEUE) {
-			WRITE_ONCE(tp->rcv_nxt, val);
-			WRITE_ONCE(tp->copied_seq, val);
-		}
-		else
+		} else if (tp->repair_queue == TCP_SEND_QUEUE) {
+			if (!tcp_rtx_queue_empty(sk))
+				err = -EPERM;
+			else
+				WRITE_ONCE(tp->write_seq, val);
+		} else if (tp->repair_queue == TCP_RECV_QUEUE) {
+			if (tp->rcv_nxt != tp->copied_seq) {
+				err = -EPERM;
+			} else {
+				WRITE_ONCE(tp->rcv_nxt, val);
+				WRITE_ONCE(tp->copied_seq, val);
+			}
+		} else {
 			err = -EINVAL;
+		}
 		break;
 
 	case TCP_REPAIR_OPTIONS:
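
For readers without a kernel tree at hand, the TCP_QUEUE_SEQ checks from
patch 3/3 and the lockless-read pattern from patches 1/3 and 2/3 can be
modelled in userspace C. This is only a rough sketch under stated
assumptions: struct fake_tcp_sock, queue_seq_set(), rx_queue_len() and the
enum are hypothetical names invented for illustration, rtx_queue_len stands
in for !tcp_rtx_queue_empty(sk), and the READ_ONCE()/WRITE_ONCE() macros
below are simplified volatile-cast stand-ins (GCC/Clang __typeof__), not the
kernel's definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins (assumption): force exactly one load or store
 * through a volatile-qualified lvalue so the compiler cannot tear,
 * fuse or repeat the access. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical model of tp->repair_queue values. */
enum fake_repair_queue { FAKE_NO_QUEUE, FAKE_RECV_QUEUE, FAKE_SEND_QUEUE };

/* Hypothetical, heavily reduced model of struct tcp_sock. */
struct fake_tcp_sock {
	int closed;                     /* models sk->sk_state == TCP_CLOSE */
	enum fake_repair_queue repair_queue;
	unsigned int rtx_queue_len;     /* models !tcp_rtx_queue_empty(sk) */
	uint32_t write_seq;
	uint32_t rcv_nxt;
	uint32_t copied_seq;
};

/* Writer side: mirrors the decision tree the patch adds. The sequence
 * numbers may only be overwritten while the matching queue is empty. */
static int queue_seq_set(struct fake_tcp_sock *tp, uint32_t val)
{
	if (!tp->closed)
		return -EPERM;

	switch (tp->repair_queue) {
	case FAKE_SEND_QUEUE:
		if (tp->rtx_queue_len)
			return -EPERM;
		WRITE_ONCE(tp->write_seq, val);
		return 0;
	case FAKE_RECV_QUEUE:
		/* Unread data is queued whenever rcv_nxt != copied_seq. */
		if (tp->rcv_nxt != tp->copied_seq)
			return -EPERM;
		WRITE_ONCE(tp->rcv_nxt, val);
		WRITE_ONCE(tp->copied_seq, val);
		return 0;
	default:
		return -EINVAL;
	}
}

/* Reader side: lockless snapshot in the style of get_tcp4_sock().
 * The two loads are not atomic as a pair, so clamp at zero. */
static int rx_queue_len(const struct fake_tcp_sock *tp)
{
	int len = (int)(READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq));

	return len > 0 ? len : 0;
}
```

In this model, a receive queue holding unread bytes (rcv_nxt ahead of
copied_seq) makes queue_seq_set() return -EPERM, which corresponds to the
condition the syzkaller repro above relied on being accepted before the fix.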