From patchwork Tue Nov 14 11:41:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pengcheng Yang X-Patchwork-Id: 13455207 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E58983A28E; Tue, 14 Nov 2023 11:42:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=none Received: from wangsu.com (unknown [180.101.34.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 795DEA9; Tue, 14 Nov 2023 03:42:22 -0800 (PST) Received: from 102.wangsu.com (unknown [59.61.78.234]) by app2 (Coremail) with SMTP id SyJltADX3QkZXVNlXr9cAA--.24900S3; Tue, 14 Nov 2023 19:42:18 +0800 (CST) From: Pengcheng Yang To: John Fastabend , Jakub Sitnicki , Eric Dumazet , Jakub Kicinski , bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Pengcheng Yang Subject: [PATCH bpf-next 1/3] skmsg: Calculate the data length in ingress_msg Date: Tue, 14 Nov 2023 19:41:58 +0800 Message-Id: <1699962120-3390-2-git-send-email-yangpc@wangsu.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1699962120-3390-1-git-send-email-yangpc@wangsu.com> References: <1699962120-3390-1-git-send-email-yangpc@wangsu.com> X-CM-TRANSID: SyJltADX3QkZXVNlXr9cAA--.24900S3 X-Coremail-Antispam: 1UD129KBjvJXoWxWw4UWF1UGw15uw45WryUJrb_yoW5Xw4xpF 1DAa1UAa1UAFWxW393tFZ8Aw1agw1kGFyjkry7Aw4Sqr90kr1rJas8Jr1a9F1rtr1kCa1a qrsFgrs0kF13Ww7anT9S1TB71UUUUUDqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUvF1xkIjI8I6I8E6xAIw20EY4v20xvaj40_Wr0E3s1l8cAvFVAK 0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVWDJVCq3wA2z4 x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l 84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I 8CrVACY4xI64kE6c02F40Ex7xfMcIj64x0Y40En7xvr7AKxVWUJVW8JwAv7VCjz48v1sIE Y20_Gr4lYx0Ec7CjxVAajcxG14v26r4j6F4UMcvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0x vY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lc2xSY4AK67AK6r4kMxAIw28IcxkI 7VAKI48JMxAIw28IcVCjz48v1sIEY20_Gr4l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxV Aqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r12 6r1DMIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Gr0_Xr1lIxAIcVC0I7IYx2IY6x kF7I0E14v26r4j6F4UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AK xVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvj fU2CD7DUUUU X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/ Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: X-Patchwork-Delegate: bpf@iogearbox.net Currently we cannot get the data length in ingress_msg, we introduce sk_msg_queue_len() to do this. Signed-off-by: Pengcheng Yang --- include/linux/skmsg.h | 24 ++++++++++++++++++++++-- net/core/skmsg.c | 4 ++++ 2 files changed, 26 insertions(+), 2 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index c1637515a8a4..3023a573859d 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -82,6 +82,7 @@ struct sk_psock { u32 apply_bytes; u32 cork_bytes; u32 eval; + u32 msg_len; bool redir_ingress; /* undefined if sk_redir is null */ struct sk_msg *cork; struct sk_psock_progs progs; @@ -131,6 +132,11 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, int len, int flags); bool sk_msg_is_readable(struct sock *sk); +static inline void sk_msg_queue_consumed(struct sk_psock *psock, u32 len) +{ + psock->msg_len -= len; +} + static inline void sk_msg_check_to_free(struct sk_msg *msg, u32 i, u32 bytes) { WARN_ON(i == msg->sg.end && bytes); @@ -311,9 +317,10 @@ static inline void sk_psock_queue_msg(struct sk_psock *psock, struct sk_msg *msg) { spin_lock_bh(&psock->ingress_lock); - if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) + if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) { list_add_tail(&msg->list, &psock->ingress_msg); - else { + psock->msg_len += msg->sg.size; + } else { sk_msg_free(psock->sk, msg); kfree(msg); } @@ -368,6 +375,19 @@ static inline void kfree_sk_msg(struct sk_msg *msg) kfree(msg); } +static inline u32 sk_msg_queue_len(struct sock *sk) +{ + struct sk_psock *psock; + u32 len = 0; + + rcu_read_lock(); + psock = sk_psock(sk); + if (psock) + len = psock->msg_len; + rcu_read_unlock(); + return len; +} + static inline void sk_psock_report_error(struct sk_psock *psock, int err) { struct sock *sk = psock->sk; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 6c31eefbd777..b3de17e99b67 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -481,6 +481,8 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, msg_rx = sk_psock_peek_msg(psock); } out: + if (likely(!peek) && copied > 0) + sk_msg_queue_consumed(psock, copied); return copied; } EXPORT_SYMBOL_GPL(sk_msg_recvmsg); @@ -771,9 +773,11 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock) list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) { list_del(&msg->list); + sk_msg_queue_consumed(psock, msg->sg.size); sk_msg_free(psock->sk, msg); kfree(msg); } + WARN_ON_ONCE(psock->msg_len != 0); } static void __sk_psock_zap_ingress(struct sk_psock *psock) From patchwork Tue Nov 14 11:41:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pengcheng Yang X-Patchwork-Id: 13455206 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E58E83B282; Tue, 14 Nov 2023 11:42:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=none Received: from wangsu.com (unknown [180.101.34.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 0E1EAAD; Tue, 14 Nov 2023 03:42:22 -0800 (PST) Received: from 102.wangsu.com (unknown [59.61.78.234]) by app2 (Coremail) with SMTP id SyJltADX3QkZXVNlXr9cAA--.24900S4; Tue, 14 Nov 2023 19:42:18 +0800 (CST) From: Pengcheng Yang To: John Fastabend , Jakub Sitnicki , Eric Dumazet , Jakub Kicinski , bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Pengcheng Yang Subject: [PATCH bpf-next 2/3] tcp: Add the data length in skmsg to SIOCINQ ioctl Date: Tue, 14 Nov 2023 19:41:59 +0800 Message-Id: <1699962120-3390-3-git-send-email-yangpc@wangsu.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1699962120-3390-1-git-send-email-yangpc@wangsu.com> References: <1699962120-3390-1-git-send-email-yangpc@wangsu.com> X-CM-TRANSID: SyJltADX3QkZXVNlXr9cAA--.24900S4 X-Coremail-Antispam: 1UD129KBjvdXoW7XF18KrW3Xr4xGFW5ZF48Crg_yoWfKFgE93 9rGF48G3yxWr1IvanFyFZ5t3WS9w18ur1fWF43Ca47ta4UJry5Crs3J3s8Crsrua9FkrZ5 WF93G3y3urya9jkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbxAFc2x0x2IEx4CE42xK8VAvwI8IcIk0rVWrJVCq3wA2ocxC64kI II0Yj41l84x0c7CEw4AK67xGY2AK021l84ACjcxK6xIIjxv20xvE14v26w1j6s0DM28EF7 xvwVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2 z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4 xG64xvF2IEw4CE5I8CrVC2j2WlYx0EF7xvrVAajcxG14v26r1j6r4UMcIj6x8ErcxFaVAv 8VW8GwAv7VCY1x0262k0Y48FwI0_Gr0_Cr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48Icx kI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCY02Avz4vE14v_Gw4l42xK82IYc2Ij 64vIr41l42xK82IY6x8ErcxFaVAv8VW8GwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c 02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_JF0_ Jw1lIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI42IY6xIIjxv20xvEc7 CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v2 6r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0J jesjbUUUUU= X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/ Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: X-Patchwork-Delegate: bpf@iogearbox.net SIOCINQ ioctl returns the number unread bytes of the receive queue but does not include the ingress_msg queue. With the sk_msg redirect, an application may get a value 0 if it calls SIOCINQ ioctl before recv() to determine the readable size. Signed-off-by: Pengcheng Yang --- net/ipv4/tcp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 3d3a24f79573..04da0684c397 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -267,6 +267,7 @@ #include #include #include +#include #include #include @@ -613,7 +614,7 @@ int tcp_ioctl(struct sock *sk, int cmd, int *karg) return -EINVAL; slow = lock_sock_fast(sk); - answ = tcp_inq(sk); + answ = tcp_inq(sk) + sk_msg_queue_len(sk); unlock_sock_fast(sk, slow); break; case SIOCATMARK: From patchwork Tue Nov 14 11:42:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pengcheng Yang X-Patchwork-Id: 13455205 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83F25224E1; Tue, 14 Nov 2023 11:42:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=none Received: from wangsu.com (unknown [180.101.34.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 7965FAB; Tue, 14 Nov 2023 03:42:22 -0800 (PST) Received: from 102.wangsu.com (unknown [59.61.78.234]) by app2 (Coremail) with SMTP id SyJltADX3QkZXVNlXr9cAA--.24900S5; Tue, 14 Nov 2023 19:42:19 +0800 (CST) From: Pengcheng Yang To: John Fastabend , Jakub Sitnicki , Eric Dumazet , Jakub Kicinski , bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Pengcheng Yang Subject: [PATCH bpf-next 3/3] tcp_diag: Add the data length in skmsg to rx_queue Date: Tue, 14 Nov 2023 19:42:00 +0800 Message-Id: <1699962120-3390-4-git-send-email-yangpc@wangsu.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1699962120-3390-1-git-send-email-yangpc@wangsu.com> References: <1699962120-3390-1-git-send-email-yangpc@wangsu.com> X-CM-TRANSID: SyJltADX3QkZXVNlXr9cAA--.24900S5 X-Coremail-Antispam: 1UD129KBjvdXoWrZF4fAr13AF1fCr47Jr17Awb_yoW3Kwb_uw n7ZrW8W3srXr1xta1xZFZxJFyYk34IyFn5W3WS9a4qy34DJF9xuw4rXF98Ars7CanxCrZ5 ur1DJryUG34rujkaLaAFLSUrUUUUbb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbxAFc2x0x2IEx4CE42xK8VAvwI8IcIk0rVWrJVCq3wA2ocxC64kI II0Yj41l84x0c7CEw4AK67xGY2AK021l84ACjcxK6xIIjxv20xvE14v26w1j6s0DM28EF7 xvwVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2 z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4 xG64xvF2IEw4CE5I8CrVC2j2WlYx0EF7xvrVAajcxG14v26r1j6r4UMcIj6x8ErcxFaVAv 8VW8GwAv7VCY1x0262k0Y48FwI0_Gr0_Cr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48Icx kI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCY02Avz4vE14v_Gw4l42xK82IYc2Ij 64vIr41l42xK82IY6x8ErcxFaVAv8VW8GwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c 02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_JF0_ Jw1lIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI42IY6xIIjxv20xvEc7 CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v2 6r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0J jesjbUUUUU= X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/ Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: X-Patchwork-Delegate: bpf@iogearbox.net In order to help us track the data length in ingress_msg when using sk_msg redirect. Signed-off-by: Pengcheng Yang --- net/ipv4/tcp_diag.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c index 01b50fa79189..b22382820a4b 100644 --- a/net/ipv4/tcp_diag.c +++ b/net/ipv4/tcp_diag.c @@ -11,6 +11,7 @@ #include #include +#include #include #include @@ -28,6 +29,7 @@ static void tcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r, r->idiag_rqueue = max_t(int, READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq), 0); + r->idiag_rqueue += sk_msg_queue_len(sk); r->idiag_wqueue = READ_ONCE(tp->write_seq) - tp->snd_una; } if (info)