From patchwork Thu Oct 17 00:57:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zijian Zhang
X-Patchwork-Id: 13839170
X-Patchwork-Delegate: bpf@iogearbox.net
From: zijianzhang@bytedance.com
To: bpf@vger.kernel.org
Cc: john.fastabend@gmail.com, jakub@cloudflare.com, davem@davemloft.net,
 edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, cong.wang@bytedance.com,
 zijianzhang@bytedance.com
Subject: [PATCH bpf 1/2] tcp_bpf: charge receive socket buffer in
 bpf_tcp_ingress()
Date: Thu, 17 Oct 2024 00:57:41 +0000
Message-Id: <20241017005742.3374075-2-zijianzhang@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20241017005742.3374075-1-zijianzhang@bytedance.com>
References: <20241017005742.3374075-1-zijianzhang@bytedance.com>
X-Mailing-List: bpf@vger.kernel.org

From: Cong Wang

When bpf_tcp_ingress() is called, the skmsg is being redirected to the
ingress of the destination socket. Therefore, we should charge its
receive socket buffer instead of its send socket buffer.

Because sk_rmem_schedule() tests pfmemalloc of an skb, introduce a
wrapper, __sk_rmem_schedule(), that takes the pfmemalloc flag directly,
and call it for skmsg.

Signed-off-by: Cong Wang
---
 include/net/sock.h | 10 ++++++++--
 net/ipv4/tcp_bpf.c |  2 +-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index c58ca8dd561b..4e796b1a92d2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1519,7 +1519,7 @@ static inline bool sk_wmem_schedule(struct sock *sk, int size)
 }
 
 static inline bool
-sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
+__sk_rmem_schedule(struct sock *sk, int size, bool pfmemalloc)
 {
 	int delta;
 
@@ -1527,7 +1527,13 @@ sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
 		return true;
 	delta = size - sk->sk_forward_alloc;
 	return delta <= 0 || __sk_mem_schedule(sk, delta, SK_MEM_RECV) ||
-		skb_pfmemalloc(skb);
+		pfmemalloc;
+}
+
+static inline bool
+sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
+{
+	return __sk_rmem_schedule(sk, size, skb_pfmemalloc(skb));
 }
 
 static inline int sk_unused_reserved_mem(const struct sock *sk)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index e7658c5d6b79..48c412744f77 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -49,7 +49,7 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 		sge = sk_msg_elem(msg, i);
 		size = (apply && apply_bytes < sge->length) ?
 			apply_bytes : sge->length;
-		if (!sk_wmem_schedule(sk, size)) {
+		if (!__sk_rmem_schedule(sk, size, false)) {
 			if (!copied)
 				ret = -ENOMEM;
 			break;

From patchwork Thu Oct 17 00:57:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zijian Zhang
X-Patchwork-Id: 13839171
X-Patchwork-Delegate: bpf@iogearbox.net
From: zijianzhang@bytedance.com
To: bpf@vger.kernel.org
Cc: john.fastabend@gmail.com, jakub@cloudflare.com, davem@davemloft.net,
 edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, cong.wang@bytedance.com,
 zijianzhang@bytedance.com
Subject: [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for
 ingress redirection
Date: Thu, 17 Oct 2024 00:57:42 +0000
Message-Id: <20241017005742.3374075-3-zijianzhang@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20241017005742.3374075-1-zijianzhang@bytedance.com>
References: <20241017005742.3374075-1-zijianzhang@bytedance.com>
X-Mailing-List: bpf@vger.kernel.org

From: Zijian Zhang

Although we call sk_rmem_schedule() and add the sk_msg to the
ingress_msg list of sk_redir in bpf_tcp_ingress(), we do not update
sk_rmem_alloc. As a result, apart from the global memory limit, the
rmem of sk_redir is nearly unlimited.

Thus, add sk_rmem_alloc related logic to limit the recv buffer.

Signed-off-by: Zijian Zhang
---
 include/linux/skmsg.h | 11 ++++++++---
 net/core/skmsg.c      |  6 +++++-
 net/ipv4/tcp_bpf.c    |  4 +++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index d9b03e0746e7..2cbe0c22a32f 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -317,17 +317,22 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
 	kfree_skb(skb);
 }
 
-static inline void sk_psock_queue_msg(struct sk_psock *psock,
+static inline bool sk_psock_queue_msg(struct sk_psock *psock,
 				      struct sk_msg *msg)
 {
+	bool ret;
+
 	spin_lock_bh(&psock->ingress_lock);
-	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
+	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
 		list_add_tail(&msg->list, &psock->ingress_msg);
-	else {
+		ret = true;
+	} else {
 		sk_msg_free(psock->sk, msg);
 		kfree(msg);
+		ret = false;
 	}
 	spin_unlock_bh(&psock->ingress_lock);
+	return ret;
 }
 
 static inline struct sk_msg *sk_psock_dequeue_msg(struct sk_psock *psock)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index b1dcbd3be89e..110ee0abcfe0 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -445,8 +445,10 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
 			if (likely(!peek)) {
 				sge->offset += copy;
 				sge->length -= copy;
-				if (!msg_rx->skb)
+				if (!msg_rx->skb) {
 					sk_mem_uncharge(sk, copy);
+					atomic_sub(copy, &sk->sk_rmem_alloc);
+				}
 				msg_rx->sg.size -= copy;
 
 				if (!sge->length) {
@@ -772,6 +774,8 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock)
 
 	list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) {
 		list_del(&msg->list);
+		if (!msg->skb)
+			atomic_sub(msg->sg.size, &psock->sk->sk_rmem_alloc);
 		sk_msg_free(psock->sk, msg);
 		kfree(msg);
 	}
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 48c412744f77..39155bec746f 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -56,6 +56,7 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 		}
 
 		sk_mem_charge(sk, size);
+		atomic_add(size, &sk->sk_rmem_alloc);
 		sk_msg_xfer(tmp, msg, i, size);
 		copied += size;
 		if (sge->length)
@@ -74,7 +75,8 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 
 	if (!ret) {
 		msg->sg.start = i;
-		sk_psock_queue_msg(psock, tmp);
+		if (!sk_psock_queue_msg(psock, tmp))
+			atomic_sub(copied, &sk->sk_rmem_alloc);
 		sk_psock_data_ready(sk, psock);
 	} else {
 		sk_msg_free(sk, tmp);
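
Not part of the patch: the sk_rmem_alloc accounting that patch 2/2 adds
can be sketched as a small user-space model. Everything named model_*
below is hypothetical and exists only for illustration; plain ints stand
in for the kernel's atomic_t, locking is omitted, and the return value of
model_ingress() is a convention of this sketch, not the kernel's. The
point it tries to show is that each byte queued on ingress_msg is charged
once in bpf_tcp_ingress() and released on exactly one of three paths:
recvmsg, purge, or a failed enqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields this series touches. */
struct model_sock {
	int rmem_alloc;		/* models sk->sk_rmem_alloc */
	int queued;		/* bytes on the modeled psock->ingress_msg list */
	bool tx_enabled;	/* models the SK_PSOCK_TX_ENABLED state bit */
};

/* Intended invariant: every queued byte is charged exactly once. */
static void model_check(const struct model_sock *sk)
{
	assert(sk->rmem_alloc == sk->queued);
}

/* Models sk_psock_queue_msg(): false means the message was freed. */
static bool model_queue_msg(struct model_sock *sk, int bytes)
{
	if (!sk->tx_enabled)
		return false;
	sk->queued += bytes;
	return true;
}

/*
 * Models the charging in bpf_tcp_ingress(): charge per element, then
 * undo the whole charge if the message could not be queued.
 * Returns the bytes left queued (a convention of this model only).
 */
static int model_ingress(struct model_sock *sk, const int *sizes, size_t n)
{
	int copied = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		sk->rmem_alloc += sizes[i];	/* atomic_add(size, ...) */
		copied += sizes[i];
	}
	if (!model_queue_msg(sk, copied)) {
		sk->rmem_alloc -= copied;	/* atomic_sub(copied, ...) */
		copied = 0;
	}
	model_check(sk);
	return copied;
}

/* Models the non-peek path of sk_msg_recvmsg() for messages without skb. */
static int model_recvmsg(struct model_sock *sk, int want)
{
	int copy = want < sk->queued ? want : sk->queued;

	sk->queued -= copy;
	sk->rmem_alloc -= copy;		/* atomic_sub(copy, ...) */
	model_check(sk);
	return copy;
}

/* Models __sk_psock_purge_ingress_msg(): drop whatever is still queued. */
static void model_purge(struct model_sock *sk)
{
	sk->rmem_alloc -= sk->queued;	/* atomic_sub(msg->sg.size, ...) */
	sk->queued = 0;
	model_check(sk);
}
```

To try it, call these helpers from a main() with a zero-initialized
struct model_sock and tx_enabled set to true; model_check() aborts if a
charge is ever leaked or released twice.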