From patchwork Mon Nov 18 01:07:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878000 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A3D4D517 for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; cv=none; b=ucVN3swS4aTl41gdeyNw9U+UexCXQ7qcKdcSxNxlib8oPXV4XquwR896ML7FTQEVdc3XBCEsa2c7PxmmK5iZTOGgZQvy9lcafOD096sluLcRPDhtauD0WAiORZE/2th9LrEew+BgnPPq+J5FDlDn/vGcHuJTZbssxzHT/DcCJOI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; c=relaxed/simple; bh=75mNHbnjXH1zfTVi9SVBeaPbDWG8GXIRthteSNiCK8E=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=DVoodQJfwva4HthmGAYpCh/9sZsMfcskWg/i+fAHWCTxj66UCcVitLwuSMsRvvblqPSSZu0/5AYQ3UQn6o8wb3DirmjIY8BLLq8/V7WYKEdvs1SHo6eXEOKrsTwy3OJY9mPoJB/VP9oN2Z2JJsGUoLC9/1Ouk8yXBoOBKCty7fg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lh16tVz4f3kvw for ; Mon, 18 Nov 2024 08:55:40 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 794ED1A018D for ; Mon, 18 Nov 2024 08:55:59 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S5; Mon, 18 Nov 2024 08:55:59 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 01/10] bpf: Remove unnecessary check when updating LPM trie Date: Mon, 18 Nov 2024 09:07:59 +0800 Message-Id: <20241118010808.2243555-2-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S5 X-Coremail-Antispam: 1UD129KBjvJXoW7Xr18Jr13KFy8JFyDCrW3Wrg_yoW8JF13pF Wrtr1Yqa15JF1fCwnayw4rGr98Jw48Ww42qa4kWryYkry8Xr93Kr1rur4Sqa18Jr4xJFnx JrWUJryfKw1DXaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPFb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUGw A2048vs2IY020Ec7CjxVAFwI0_Gr0_Xr1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxS w2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxV W8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMc Ij6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_ Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2AFwI 0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG 67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6rW5MI IYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E 14v26r4j6F4UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjxU3cTm DUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao When "node->prefixlen == matchlen" is true, it means that the node is fully matched. If "node->prefixlen == key->prefixlen" is false, it means the prefix length of key is greater than the prefix length of node, otherwise, matchlen will not be equal with node->prefixlen. However, it also implies that the prefix length of node must be less than max_prefixlen. Therefore, "node->prefixlen == trie->max_prefixlen" will always be false when the check of "node->prefixlen == key->prefixlen" returns false. Remove this unnecessary comparison. Signed-off-by: Hou Tao Reviewed-by: Toke Høiland-Jørgensen --- kernel/bpf/lpm_trie.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index 9b60eda0f727..73fd593d3745 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -364,8 +364,7 @@ static long trie_update_elem(struct bpf_map *map, matchlen = longest_prefix_match(trie, node, key); if (node->prefixlen != matchlen || - node->prefixlen == key->prefixlen || - node->prefixlen == trie->max_prefixlen) + node->prefixlen == key->prefixlen) break; next_bit = extract_bit(key->data, node->prefixlen); From patchwork Mon Nov 18 01:08:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13877999 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A43314F90 for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; cv=none; b=su64oyMmg+vzhTnd4W33jyKHNKKsZML3YFYYBBJTCebT6Fm05qSqVp0P/KLxMYb8aemLgpc7/4Av+1x+PQTN1mLHo+FSrh1lfQXeN5mXNCxJpPltMDlxV4hMfpPxW/hDS/0jXXh9bRzv9LrFydmZv/M3tLlZzyxDYQC3GLo6YLY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; c=relaxed/simple; bh=ux9RPWoDbSeBk2aKBY7zdhgp2sZTekAnpbGIKfOcY/A=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=f0EADbhPxKHhOfNWL0Ap/EK/LBW1y34ZIZHXOfP+16FGx12gN/wUcg4tLX/cTAxJ5cxk6EYM2TFz9ekpr9lSnlzOrSW/aNp2K8Ye6piGDX20EmIFwIFu7sjePxOjhtJRMj5m4GdN40ucJoqX2q/ZD3uA06VJwwSpDl7epyKsjFw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lh62NHz4f3lfR for ; Mon, 18 Nov 2024 08:55:40 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 2D0891A0568 for ; Mon, 18 Nov 2024 08:56:00 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S6; Mon, 18 Nov 2024 08:55:59 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 02/10] bpf: Remove unnecessary kfree(im_node) in lpm_trie_update_elem Date: Mon, 18 Nov 2024 09:08:00 +0800 Message-Id: <20241118010808.2243555-3-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S6 X-Coremail-Antispam: 1UD129KBjvkXoW8ZFW3Ary7Ar15Kw4fCr17p5X_CFWfpF4xKa s0yw4Fqw1Uua1vqrs5ur98G34xGw42g3WkKa93tFy5K3s3Ww18ua4YgF45XFWSyrs8AF1Y qrZYyw1ru3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy29KBj DU0xBIdaVrnRJUUUP2b4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxK x2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUXwA204 8vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7 M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr 0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl 6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6x IIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_ Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2AFwI0_GF v_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AK xVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6rW5MIIYrx kI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E14v2 6F4j6r4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr 1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7IU0I385UU UUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao There is no need to call kfree(im_node) when updating element fails, because im_node must be NULL. Remove the unnecessary kfree() for im_node. Signed-off-by: Hou Tao Reviewed-by: Toke Høiland-Jørgensen --- kernel/bpf/lpm_trie.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index 73fd593d3745..c6f036e3044b 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -431,9 +431,7 @@ static long trie_update_elem(struct bpf_map *map, if (ret) { if (new_node) trie->n_entries--; - kfree(new_node); - kfree(im_node); } spin_unlock_irqrestore(&trie->lock, irq_flags); From patchwork Mon Nov 18 01:08:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878007 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A40CEEBB for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892439; cv=none; b=rOXk1KfDcN7D0eyMRs0GB6lMjKdWirDR+eOFd93HZK3YsBy0TkrqTIG3GUHz1w8FWWoKyipH9T08iRiDtnvQ8DX7IYIW7hBYw3tmzE4bOc0R7UGcNY3ysVorvTpylJ2vJK/5cEjF39F1XYAwRTHDwDzQ9XvaelwCzrnDaiP8isU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892439; c=relaxed/simple; bh=5SF4SXib61J8oIfFNFEz4RzsVwOlAf+XGfIVJWzhiak=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=FzeKx05zE1iJodgDjvrK6Kvcj6vcjh5p/cMYiQ2U4bvFTeN9tfIxW1LuGTjcQjBJSw5de1iZ7lgrKha9VKwkODiozIxsMEhG2Pd48V0q9IzYBsLSOKLz+lfd9rO3T9E/2tr5/5vTVziOXTAaRiDrSS4aawdDxY/kuaSX0jp96yQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lj3kfqz4f3lfV for ; Mon, 18 Nov 2024 08:55:41 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id D59BE1A0196 for ; Mon, 18 Nov 2024 08:56:00 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S7; Mon, 18 Nov 2024 08:56:00 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 03/10] bpf: Handle BPF_EXIST and BPF_NOEXIST for LPM trie Date: Mon, 18 Nov 2024 09:08:01 +0800 Message-Id: <20241118010808.2243555-4-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S7 X-Coremail-Antispam: 1UD129KBjvJXoW7Cr1xGryUWrWrCr45Kr1rZwb_yoW8XFyfpF 1SkFWYkrW8Wr13Ca1Sv3Z8Jry5Ga4rG3y29FyxA34ayrWjyF9IgF1ruay5tr43trWxZrW3 AF4rtFWDKryq9rJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP2b4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUWw A2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxS w2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxV W8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMc Ij6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_ Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2AFwI 0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG 67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6rW5MI IYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E 14v26F4j6r4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7IU04x RDUUUUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao There is exact match during the update of LPM trie, therefore, add the missed handling for BPF_EXIST and BPF_NOEXIST flags. Signed-off-by: Hou Tao Reviewed-by: Toke Høiland-Jørgensen --- kernel/bpf/lpm_trie.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index c6f036e3044b..4300bd51ec6e 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -375,6 +375,10 @@ static long trie_update_elem(struct bpf_map *map, * simply assign the @new_node to that slot and be done. */ if (!node) { + if (flags == BPF_EXIST) { + ret = -ENOENT; + goto out; + } rcu_assign_pointer(*slot, new_node); goto out; } @@ -383,18 +387,31 @@ static long trie_update_elem(struct bpf_map *map, * which already has the correct data array set. */ if (node->prefixlen == matchlen) { + if (!(node->flags & LPM_TREE_NODE_FLAG_IM)) { + if (flags == BPF_NOEXIST) { + ret = -EEXIST; + goto out; + } + trie->n_entries--; + } else if (flags == BPF_EXIST) { + ret = -ENOENT; + goto out; + } + new_node->child[0] = node->child[0]; new_node->child[1] = node->child[1]; - if (!(node->flags & LPM_TREE_NODE_FLAG_IM)) - trie->n_entries--; - rcu_assign_pointer(*slot, new_node); free_node = node; goto out; } + if (flags == BPF_EXIST) { + ret = -ENOENT; + goto out; + } + /* If the new node matches the prefix completely, it must be inserted * as an ancestor. Simply insert it between @node and *@slot. */ From patchwork Mon Nov 18 01:08:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878001 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A34E8BFF for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; cv=none; b=iAwmWv5rtdMh+m0vEDgQdJ/VZ2uxhQ0H5K2aH7KzCAdFQ4zqui5SD7dl+mQeQni07EmoClFeJRwF19Vdqv89I/ZfRRa0Poejan0UGkxbVbK0IXpdCa30mOCqv1LefK4RAni7jVFQzfMJ1wzIuErntSZ/buYiNFebCGSjBn+9KFI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; c=relaxed/simple; bh=iT8ZUSEM7IlxGZyL8663ba3PW85q+uAow0c5s95HLEk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=twHxpkGUJUx8GqtVP6HqD0eTkt5AtcCqk2dsZO4qdQYF/1pm0beiB1EKSHqfVkq+J5hknQMjRp5nJOJ6oWkuCCD5DZzawdlaNzQG+NShmLp56e7WheSMgIn2IwMHS9I9VY/qzsB5/+NEXa5kmPCsx/nFpOF2u11JxeWhxVlJfCM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lk1kt2z4f3lW5 for ; Mon, 18 Nov 2024 08:55:42 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 8D1851A0568 for ; Mon, 18 Nov 2024 08:56:01 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S8; Mon, 18 Nov 2024 08:56:01 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 04/10] bpf: Handle in-place update for full LPM trie correctly Date: Mon, 18 Nov 2024 09:08:02 +0800 Message-Id: <20241118010808.2243555-5-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S8 X-Coremail-Antispam: 1UD129KBjvJXoWxCrWxKF1DCFWfXr48AF1DWrg_yoW5tryfpF WSkF90yr18Jr4UWr4Fva98JrZ8Zw4rG3yj9F93W347AFySkr93tF18uayFyrW5ArWUXrWY yF45tFyqgrZ5ZrJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPSb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF 7I0E14v26F4j6r4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI 0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7I U0sqXPUUUUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao When a LPM trie is full, in-place updates of existing elements incorrectly return -ENOSPC. Fix this by deferring the check of trie->n_entries. For new insertions, n_entries must not exceed max_entries. However, in-place updates are allowed even when the trie is full. Signed-off-by: Hou Tao --- kernel/bpf/lpm_trie.c | 46 +++++++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index 4300bd51ec6e..ff57e35357ae 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -310,6 +310,15 @@ static struct lpm_trie_node *lpm_trie_node_alloc(const struct lpm_trie *trie, return node; } +static int trie_check_noreplace_update(const struct lpm_trie *trie, u64 flags) +{ + if (flags == BPF_EXIST) + return -ENOENT; + if (trie->n_entries == trie->map.max_entries) + return -ENOSPC; + return 0; +} + /* Called from syscall or from eBPF program */ static long trie_update_elem(struct bpf_map *map, void *_key, void *value, u64 flags) @@ -333,20 +342,12 @@ static long trie_update_elem(struct bpf_map *map, spin_lock_irqsave(&trie->lock, irq_flags); /* Allocate and fill a new node */ - - if (trie->n_entries == trie->map.max_entries) { - ret = -ENOSPC; - goto out; - } - new_node = lpm_trie_node_alloc(trie, value); if (!new_node) { ret = -ENOMEM; goto out; } - trie->n_entries++; - new_node->prefixlen = key->prefixlen; RCU_INIT_POINTER(new_node->child[0], NULL); RCU_INIT_POINTER(new_node->child[1], NULL); @@ -375,10 +376,11 @@ static long trie_update_elem(struct bpf_map *map, * simply assign the @new_node to that slot and be done. */ if (!node) { - if (flags == BPF_EXIST) { - ret = -ENOENT; + ret = trie_check_noreplace_update(trie, flags); + if (ret) goto out; - } + + trie->n_entries++; rcu_assign_pointer(*slot, new_node); goto out; } @@ -392,10 +394,11 @@ static long trie_update_elem(struct bpf_map *map, ret = -EEXIST; goto out; } - trie->n_entries--; - } else if (flags == BPF_EXIST) { - ret = -ENOENT; - goto out; + } else { + ret = trie_check_noreplace_update(trie, flags); + if (ret) + goto out; + trie->n_entries++; } new_node->child[0] = node->child[0]; @@ -407,15 +410,15 @@ static long trie_update_elem(struct bpf_map *map, goto out; } - if (flags == BPF_EXIST) { - ret = -ENOENT; + ret = trie_check_noreplace_update(trie, flags); + if (ret) goto out; - } /* If the new node matches the prefix completely, it must be inserted * as an ancestor. Simply insert it between @node and *@slot. */ if (matchlen == key->prefixlen) { + trie->n_entries++; next_bit = extract_bit(node->data, matchlen); rcu_assign_pointer(new_node->child[next_bit], node); rcu_assign_pointer(*slot, new_node); @@ -428,6 +431,7 @@ static long trie_update_elem(struct bpf_map *map, goto out; } + trie->n_entries++; im_node->prefixlen = matchlen; im_node->flags |= LPM_TREE_NODE_FLAG_IM; memcpy(im_node->data, node->data, trie->data_size); @@ -445,12 +449,8 @@ static long trie_update_elem(struct bpf_map *map, rcu_assign_pointer(*slot, im_node); out: - if (ret) { - if (new_node) - trie->n_entries--; + if (ret) kfree(new_node); - } - spin_unlock_irqrestore(&trie->lock, irq_flags); kfree_rcu(free_node, rcu); From patchwork Mon Nov 18 01:08:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878004 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9606B17C7C for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892438; cv=none; b=dg0EXxoOyk3rkNNAzUJSjj6djft9i5q+2VZxPN2DpNWr9TjttZZLKwHb3ZqvSdxY6MDpHt5PPXojh00kLrjeagvT8kr+VugmMccT2WP3HTp50sf43LwwMB2fwla9d52xkUmsMGwHpT/NwbIIQ09ofNMWbpBUtSUmW9J7hXOSBjU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892438; c=relaxed/simple; bh=PA/YwfbpQN/qPJIRU4H+1nER0wextFckJt8yyHtLk8E=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=O5My0y5Q0dqO8saCtoJ7G6kGnyKzid++kapOJRcWn8ENqP9IaLH5QEfi3geW4jcV8vHIZ3RydZQADZ/1A69G2uN+t6VHtX8k1Mo+YpJURP8ts+40cvbMFw18bORLv6DfSJvSnJDjn9NIm0y4N90JQ9V3F4mF22qx1RMjW1g6hzI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Ll5kQfz4f3kFP for ; Mon, 18 Nov 2024 08:55:43 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 3B2251A0194 for ; Mon, 18 Nov 2024 08:56:02 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S9; Mon, 18 Nov 2024 08:56:01 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 05/10] bpf: Fix exact match conditions in trie_get_next_key() Date: Mon, 18 Nov 2024 09:08:03 +0800 Message-Id: <20241118010808.2243555-6-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S9 X-Coremail-Antispam: 1UD129KBjvJXoW7WFWDXFW5JFWxKF18CF1fCrg_yoW8CFWfpr 4rtryqyr45Jay2kanYqa18WryrtF13Kw4xJas5G3sYk3s8Xr93Kr10gFWYga43AFZ3JF1a yr40qr9Yqw4DXaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPvb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14 v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuY vjxUI-eODUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao trie_get_next_key() uses node->prefixlen == key->prefixlen to identify an exact match, However, it is incorrect because when the target key doesn't fully match the found node (e.g., node->prefixlen != matchlen), these two nodes may also have the same prefixlen. It will return expected result when the passed key exist in the trie. However when a recently-deleted key or nonexistent key is passed to trie_get_next_key(), it may skip keys and return incorrect result. Fix it by using node->prefixlen == matchlen to identify exact matches. When the condition is true after the search, it also implies node->prefixlen equals key->prefixlen, otherwise, the search would return NULL instead. Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") Signed-off-by: Hou Tao Reviewed-by: Toke Høiland-Jørgensen --- kernel/bpf/lpm_trie.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index ff57e35357ae..d447a6dab83b 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -647,7 +647,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key) struct lpm_trie_node **node_stack = NULL; int err = 0, stack_ptr = -1; unsigned int next_bit; - size_t matchlen; + size_t matchlen = 0; /* The get_next_key follows postorder. For the 4 node example in * the top of this file, the trie_get_next_key() returns the following @@ -686,7 +686,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key) next_bit = extract_bit(key->data, node->prefixlen); node = rcu_dereference(node->child[next_bit]); } - if (!node || node->prefixlen != key->prefixlen || + if (!node || node->prefixlen != matchlen || (node->flags & LPM_TREE_NODE_FLAG_IM)) goto find_leftmost; From patchwork Mon Nov 18 01:08:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878006 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEDCD1946B for ; Mon, 18 Nov 2024 01:13:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892438; cv=none; b=VoxCEAdqCmlksSO2TyZItJWJ9dr0TIrwSlU14nVHDl7I9CSxUqS69FBWViXiAELPBa5W76tcbydKRpGIVeJPWBOxFcUJnbXCwLa1eTrp7WohqESlTYs5nV+Kz1BdJ6G5+8hTX22gVj/8GkTl4nel48P2E5YWGrtg1Kz3Vrj+WPo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892438; c=relaxed/simple; bh=YkIbSCeD1PXhVNI3KRAxnUyBFPS+VmerbPTNXeFCgJA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=kSJvFkZ3K8Ak5zxeia3DMet1f2jEv4eSqJO0bdvaMUX4T5MD84U0QgNeoa9z4kcAHc7LToL/KtAcU+YmX/fwmQzjDH1D2Y1hxhSbHEv3FmA2YBYNNdh2AWX1Kex1k0+qv6E/98nd2AeJltkT7IH8qY4CHqhpcJtewLZ+Wev5mtY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Ls40S7z4f3jt9 for ; Mon, 18 Nov 2024 08:55:49 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id DFF951A0568 for ; Mon, 18 Nov 2024 08:56:02 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S10; Mon, 18 Nov 2024 08:56:02 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 06/10] bpf: Add bpf_mem_cache_is_mergeable() helper Date: Mon, 18 Nov 2024 09:08:04 +0800 Message-Id: <20241118010808.2243555-7-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S10 X-Coremail-Antispam: 1UD129KBjvJXoW7Kw4DCryUJFWkXF1UXFW8WFg_yoW8Ww4fpF W2kr18AF4FvF48Xw47Wrn2ya98Xw4Ig3W7Ka43XryUur13urnrGws8Gry7XF9Yvrs8KF40 kr1DKr4fC348ZrJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPvb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14 v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuY vjxUI-eODUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao Add bpf_mem_cache_is_mergeable() to check whether two bpf mem allocator for fixed-size objects are mergeable or not. The merging could reduce the memory overhead of bpf mem allocator. Signed-off-by: Hou Tao --- include/linux/bpf_mem_alloc.h | 1 + kernel/bpf/memalloc.c | 12 ++++++++++++ 2 files changed, 13 insertions(+) diff --git a/include/linux/bpf_mem_alloc.h b/include/linux/bpf_mem_alloc.h index e45162ef59bb..faa54b9c7a04 100644 --- a/include/linux/bpf_mem_alloc.h +++ b/include/linux/bpf_mem_alloc.h @@ -47,5 +47,6 @@ void bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr); void bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr); void bpf_mem_cache_raw_free(void *ptr); void *bpf_mem_cache_alloc_flags(struct bpf_mem_alloc *ma, gfp_t flags); +bool bpf_mem_cache_is_mergeable(size_t size, size_t new_size, bool percpu); #endif /* _BPF_MEM_ALLOC_H */ diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c index 889374722d0a..49dd08ad1d4f 100644 --- a/kernel/bpf/memalloc.c +++ b/kernel/bpf/memalloc.c @@ -1014,3 +1014,15 @@ int bpf_mem_alloc_check_size(bool percpu, size_t size) return 0; } + +bool bpf_mem_cache_is_mergeable(size_t size, size_t new_size, bool percpu) +{ + /* Only for fixed-size object allocator */ + if (!size || !new_size) + return false; + + return (percpu && ALIGN(size, PCPU_MIN_ALLOC_SIZE) == + ALIGN(new_size, PCPU_MIN_ALLOC_SIZE)) || + (!percpu && kmalloc_size_roundup(size + LLIST_NODE_SZ) == + kmalloc_size_roundup(new_size + LLIST_NODE_SZ)); +} From patchwork Mon Nov 18 01:08:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878002 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 798B017BA2 for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; cv=none; b=MJMdafZ1B1UtztWwUXQtWuWKwzbc/EHXyjLskHKCVWbVdDaoXvY6MHgx75UxEMcIvyMxp5169RelUG5I0BY2ILZgHkpYeTIKhSA8htw/+gWqufR8uKj8kRSHLIF+0Oa1FE81cuWKRwHN+q/SQq++gwZ0RGE2cCNPc5XbMz0BSxw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892437; c=relaxed/simple; bh=yiEvNYiru2ATyOLOdq3Vtg6kKl69Qpq4T919M+UGlV0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=b77qP3ARcsD9AJaZ8/2k1KMOr5B2YQb08dl0eKmwatNW45s4byHiFtjf6KZCorpiTJAl4NIDwQrE3j6X7vmnFS6grYhvKKPgTnX2Dbejr/6RbIMsOOIzs2t1dhTgSwNon7pRujrpXjCC9UhIC0hXlrtONnE7j/8ivYtw3h5d7Ro= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lq31Xgz4f3jsg for ; Mon, 18 Nov 2024 08:55:47 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id D08951A058E for ; Mon, 18 Nov 2024 08:56:05 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S11; Mon, 18 Nov 2024 08:56:03 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 07/10] bpf: Switch to bpf mem allocator for LPM trie Date: Mon, 18 Nov 2024 09:08:05 +0800 Message-Id: <20241118010808.2243555-8-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S11 X-Coremail-Antispam: 1UD129KBjvJXoWxtF4kCrWfCw1kZF47urWxWFg_yoWfWryUpF ZxK34fArs8Xr45Wrs2qrs8Z345Zw40gw4UGas5WayrZF90vr9xJF18ZFW8ZFyYkFWkAa15 tF1DK3y0vr4UCr7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPvb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14 v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuY vjxUI-eODUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao Multiple syzbot warnings have been reported. These warnings are mainly about the lock order between trie->lock and kmalloc()'s internal lock. See report [1] as an example: ====================================================== WARNING: possible circular locking dependency detected 6.10.0-rc7-syzkaller-00003-g4376e966ecb7 #0 Not tainted ------------------------------------------------------ syz.3.2069/15008 is trying to acquire lock: ffff88801544e6d8 (&n->list_lock){-.-.}-{2:2}, at: get_partial_node ... but task is already holding lock: ffff88802dcc89f8 (&trie->lock){-.-.}-{2:2}, at: trie_update_elem ... which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&trie->lock){-.-.}-{2:2}: __raw_spin_lock_irqsave _raw_spin_lock_irqsave+0x3a/0x60 trie_delete_elem+0xb0/0x820 ___bpf_prog_run+0x3e51/0xabd0 __bpf_prog_run32+0xc1/0x100 bpf_dispatcher_nop_func ...... bpf_trace_run2+0x231/0x590 __bpf_trace_contention_end+0xca/0x110 trace_contention_end.constprop.0+0xea/0x170 __pv_queued_spin_lock_slowpath+0x28e/0xcc0 pv_queued_spin_lock_slowpath queued_spin_lock_slowpath queued_spin_lock do_raw_spin_lock+0x210/0x2c0 __raw_spin_lock_irqsave _raw_spin_lock_irqsave+0x42/0x60 __put_partials+0xc3/0x170 qlink_free qlist_free_all+0x4e/0x140 kasan_quarantine_reduce+0x192/0x1e0 __kasan_slab_alloc+0x69/0x90 kasan_slab_alloc slab_post_alloc_hook slab_alloc_node kmem_cache_alloc_node_noprof+0x153/0x310 __alloc_skb+0x2b1/0x380 ...... -> #0 (&n->list_lock){-.-.}-{2:2}: check_prev_add check_prevs_add validate_chain __lock_acquire+0x2478/0x3b30 lock_acquire lock_acquire+0x1b1/0x560 __raw_spin_lock_irqsave _raw_spin_lock_irqsave+0x3a/0x60 get_partial_node.part.0+0x20/0x350 get_partial_node get_partial ___slab_alloc+0x65b/0x1870 __slab_alloc.constprop.0+0x56/0xb0 __slab_alloc_node slab_alloc_node __do_kmalloc_node __kmalloc_node_noprof+0x35c/0x440 kmalloc_node_noprof bpf_map_kmalloc_node+0x98/0x4a0 lpm_trie_node_alloc trie_update_elem+0x1ef/0xe00 bpf_map_update_value+0x2c1/0x6c0 map_update_elem+0x623/0x910 __sys_bpf+0x90c/0x49a0 ... other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&trie->lock); lock(&n->list_lock); lock(&trie->lock); lock(&n->list_lock); *** DEADLOCK *** [1]: https://syzkaller.appspot.com/bug?extid=9045c0a3d5a7f1b119f7 A bpf program attached to trace_contention_end() triggers after acquiring &n->list_lock. The program invokes trie_delete_elem(), which then acquires trie->lock. However, it is possible that another process is invoking trie_update_elem(). trie_update_elem() will acquire trie->lock first, then invoke kmalloc_node(). kmalloc_node() may invoke get_partial_node() and try to acquire &n->list_lock (not necessarily the same lock object). Therefore, lockdep warns about the circular locking dependency. Fix these warnings by replacing kmalloc()/kfree()/kfree_rcu() with equivalent bpf memory allocator APIs. Since intermediate node and leaf node have fixed sizes, fixed-size allocation APIs are used. Two aspects of this change require explanation: 1. A new flag LPM_TREE_NODE_FLAG_ALLOC_LEAF is added to track the original allocator. This is necessary because during deletion, a leaf node may be used as an intermediate node. These nodes must be freed through the leaf allocator. 2. The intermediate node allocator and leaf node allocator may be merged because value_size for LPM trie is usually small. The merging reduces the memory overhead of bpf memory allocator. Signed-off-by: Hou Tao --- kernel/bpf/lpm_trie.c | 90 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 74 insertions(+), 16 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index d447a6dab83b..d8995acecedf 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -15,23 +15,34 @@ #include #include #include +#include /* Intermediate node */ #define LPM_TREE_NODE_FLAG_IM BIT(0) +/* Allocated as leaf node. It may be used as intermediate node after deletion */ +#define LPM_TREE_NODE_FLAG_ALLOC_LEAF BIT(1) struct lpm_trie_node; struct lpm_trie_node { - struct rcu_head rcu; struct lpm_trie_node __rcu *child[2]; u32 prefixlen; u32 flags; u8 data[]; }; +enum { + LPM_TRIE_MA_IM = 0, + LPM_TRIE_MA_LEAF, + LPM_TRIE_MA_CNT, +}; + struct lpm_trie { struct bpf_map map; struct lpm_trie_node __rcu *root; + struct bpf_mem_alloc ma[LPM_TRIE_MA_CNT]; + struct bpf_mem_alloc *im_ma; + struct bpf_mem_alloc *leaf_ma; size_t n_entries; size_t max_prefixlen; size_t data_size; @@ -287,25 +298,21 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key) return found->data + trie->data_size; } -static struct lpm_trie_node *lpm_trie_node_alloc(const struct lpm_trie *trie, - const void *value) +static struct lpm_trie_node *lpm_trie_node_alloc(struct lpm_trie *trie, const void *value) { + struct bpf_mem_alloc *ma = value ? trie->leaf_ma : trie->im_ma; struct lpm_trie_node *node; - size_t size = sizeof(struct lpm_trie_node) + trie->data_size; - - if (value) - size += trie->map.value_size; - node = bpf_map_kmalloc_node(&trie->map, size, GFP_NOWAIT | __GFP_NOWARN, - trie->map.numa_node); + node = bpf_mem_cache_alloc(ma); if (!node) return NULL; node->flags = 0; - - if (value) + if (value) { + node->flags |= LPM_TREE_NODE_FLAG_ALLOC_LEAF; memcpy(node->data + trie->data_size, value, trie->map.value_size); + } return node; } @@ -319,6 +326,25 @@ static int trie_check_noreplace_update(const struct lpm_trie *trie, u64 flags) return 0; } +static void lpm_trie_node_free(struct lpm_trie *trie, + struct lpm_trie_node *node, bool defer) +{ + struct bpf_mem_alloc *ma; + + if (!node) + return; + + ma = (node->flags & LPM_TREE_NODE_FLAG_ALLOC_LEAF) ? trie->leaf_ma : + trie->im_ma; + + migrate_disable(); + if (defer) + bpf_mem_cache_free_rcu(ma, node); + else + bpf_mem_cache_free(ma, node); + migrate_enable(); +} + /* Called from syscall or from eBPF program */ static long trie_update_elem(struct bpf_map *map, void *_key, void *value, u64 flags) @@ -450,9 +476,10 @@ static long trie_update_elem(struct bpf_map *map, out: if (ret) - kfree(new_node); + bpf_mem_cache_free(trie->leaf_ma, new_node); + spin_unlock_irqrestore(&trie->lock, irq_flags); - kfree_rcu(free_node, rcu); + lpm_trie_node_free(trie, free_node, true); return ret; } @@ -550,8 +577,8 @@ static long trie_delete_elem(struct bpf_map *map, void *_key) out: spin_unlock_irqrestore(&trie->lock, irq_flags); - kfree_rcu(free_parent, rcu); - kfree_rcu(free_node, rcu); + lpm_trie_node_free(trie, free_parent, true); + lpm_trie_node_free(trie, free_node, true); return ret; } @@ -572,7 +599,9 @@ static long trie_delete_elem(struct bpf_map *map, void *_key) static struct bpf_map *trie_alloc(union bpf_attr *attr) { + size_t size, leaf_size; struct lpm_trie *trie; + int err; /* check sanity of attributes */ if (attr->max_entries == 0 || @@ -597,7 +626,30 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr) spin_lock_init(&trie->lock); + size = sizeof(struct lpm_trie_node) + trie->data_size; + err = bpf_mem_alloc_init(&trie->ma[LPM_TRIE_MA_IM], size, false); + if (err) + goto free_out; + trie->im_ma = &trie->ma[LPM_TRIE_MA_IM]; + + leaf_size = size + trie->map.value_size; + if (bpf_mem_cache_is_mergeable(size, leaf_size, false)) { + trie->leaf_ma = trie->im_ma; + } else { + err = bpf_mem_alloc_init(&trie->ma[LPM_TRIE_MA_LEAF], + leaf_size, false); + if (err) + goto destroy_ma_out; + trie->leaf_ma = &trie->ma[LPM_TRIE_MA_LEAF]; + } + return &trie->map; + +destroy_ma_out: + bpf_mem_alloc_destroy(trie->im_ma); +free_out: + bpf_map_area_free(trie); + return ERR_PTR(err); } static void trie_free(struct bpf_map *map) @@ -629,13 +681,19 @@ static void trie_free(struct bpf_map *map) continue; } - kfree(node); + /* No bpf program may access the map, so freeing the + * node without waiting for the extra RCU GP. + */ + lpm_trie_node_free(trie, node, false); RCU_INIT_POINTER(*slot, NULL); break; } } out: + bpf_mem_alloc_destroy(trie->im_ma); + if (trie->leaf_ma != trie->im_ma) + bpf_mem_alloc_destroy(trie->leaf_ma); bpf_map_area_free(trie); } From patchwork Mon Nov 18 01:08:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13877997 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 70C468C07 for ; Mon, 18 Nov 2024 00:56:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731891372; cv=none; b=fxP1JoEghIIKOG5P6QOsITlrURj9J4sQO9/TCAcy88Tsj+6+EZY93poqSsiJLVnUDUSTEzrfucXL4RMH5Vgj4HJTjBJOCC7uU6QrctDQEGnVjDVECJbldMs7ZelB/o8KN7WyZ0RlOHfEzWcNwGQkxSfQLgTxlxvbAzHOzj1tLfU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731891372; c=relaxed/simple; bh=IGjvc+sJPUpcJEIdMIioYXehKMCaHh0dsUy/W8AwIAQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=fQm/zpmaBsy8jFLoCvEtittowCeV5b3L29FlnXevVivghMj7VuZofX2gxJfPDiSRz1mBIxmKeDNq0fu5E69qOk0B9JFGk4TRKlnFLnGIk3LVoQfbmrTzQqYPk9JsJrcsqGkIviipsLcuwCtETHnoAtWCMs02k3QD0FW9vxupcGc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lq1H8rz4f3lfZ for ; Mon, 18 Nov 2024 08:55:47 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 7ED8C1A0197 for ; Mon, 18 Nov 2024 08:56:06 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S12; Mon, 18 Nov 2024 08:56:06 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 08/10] bpf: Use raw_spinlock_t for LPM trie Date: Mon, 18 Nov 2024 09:08:06 +0800 Message-Id: <20241118010808.2243555-9-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S12 X-Coremail-Antispam: 1UD129KBjvJXoWxZw4Utr17Wr1kWr43Aw4fKrg_yoW5GryDpF 4xG3s0yw48Ar4Ygw4vqrZ0q3y5Ww4kWw4UGas5Wa4xJr90q3sxJr18AFy09F15AFWIyrsx tF15tFWSvrW8uFJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPvb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14 v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuY vjxUI-eODUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao After switching from kmalloc() to the bpf memory allocator, there will be no blocking operation during the update of LPM trie. Therefore, change trie->lock from spinlock_t to raw_spinlock_t to make LPM trie usable in atomic context, even on RT kernels. Signed-off-by: Hou Tao --- kernel/bpf/lpm_trie.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index d8995acecedf..f8ff2bfcd704 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -46,7 +46,7 @@ struct lpm_trie { size_t n_entries; size_t max_prefixlen; size_t data_size; - spinlock_t lock; + raw_spinlock_t lock; }; /* This trie implements a longest prefix match algorithm that can be used to @@ -365,7 +365,7 @@ static long trie_update_elem(struct bpf_map *map, if (key->prefixlen > trie->max_prefixlen) return -EINVAL; - spin_lock_irqsave(&trie->lock, irq_flags); + raw_spin_lock_irqsave(&trie->lock, irq_flags); /* Allocate and fill a new node */ new_node = lpm_trie_node_alloc(trie, value); @@ -478,7 +478,7 @@ static long trie_update_elem(struct bpf_map *map, if (ret) bpf_mem_cache_free(trie->leaf_ma, new_node); - spin_unlock_irqrestore(&trie->lock, irq_flags); + raw_spin_unlock_irqrestore(&trie->lock, irq_flags); lpm_trie_node_free(trie, free_node, true); return ret; @@ -500,7 +500,7 @@ static long trie_delete_elem(struct bpf_map *map, void *_key) if (key->prefixlen > trie->max_prefixlen) return -EINVAL; - spin_lock_irqsave(&trie->lock, irq_flags); + raw_spin_lock_irqsave(&trie->lock, irq_flags); /* Walk the tree looking for an exact key/length match and keeping * track of the path we traverse. We will need to know the node @@ -576,7 +576,7 @@ static long trie_delete_elem(struct bpf_map *map, void *_key) free_node = node; out: - spin_unlock_irqrestore(&trie->lock, irq_flags); + raw_spin_unlock_irqrestore(&trie->lock, irq_flags); lpm_trie_node_free(trie, free_parent, true); lpm_trie_node_free(trie, free_node, true); @@ -624,7 +624,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr) offsetof(struct bpf_lpm_trie_key_u8, data); trie->max_prefixlen = trie->data_size * 8; - spin_lock_init(&trie->lock); + raw_spin_lock_init(&trie->lock); size = sizeof(struct lpm_trie_node) + trie->data_size; err = bpf_mem_alloc_init(&trie->ma[LPM_TRIE_MA_IM], size, false); From patchwork Mon Nov 18 01:08:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13878005 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7986D175BF for ; Mon, 18 Nov 2024 01:13:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892438; cv=none; b=jnN/eeGEHifT2sKnG/qmRPCZ7Ps//VTetQwzF0ZkM5p3zNkT301063QL1dl+IhvnlzWdI0SQZ215YXGKhySRfmL35U+lVBw4mJi1cd/YNsI4ZjybeF7kPiobc4pwuHBvZD7eJWyXICxvp5EnqM2ZQ3JxRiMqUs09XorPrlFtUl0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731892438; c=relaxed/simple; bh=DdA1PbFjubhIDgObznYrbOhMHrVixV9Yw1JtsAWK3rg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=USnqJSMyKcBnY0kKX/vh+5KPpL2m2O9hHsr2C0g9i2Mzvng84hOeJIIOAywZ2vN/m+D69H9AzoRCSqUiPRGTfWF2Ax1w2Xju4FzKfUzL/RYM71gjmOKJR+CaWu1D28Ms0krRALBlU3SywW7kdpnYaG7Vhz+hpuDpITtuArhdKto= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Lr5Tmfz4f3jsg for ; Mon, 18 Nov 2024 08:55:48 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 326C61A0359 for ; Mon, 18 Nov 2024 08:56:07 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S13; Mon, 18 Nov 2024 08:56:06 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 09/10] selftests/bpf: Move test_lpm_map.c to map_tests Date: Mon, 18 Nov 2024 09:08:07 +0800 Message-Id: <20241118010808.2243555-10-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S13 X-Coremail-Antispam: 1UD129KBjvJXoWxCw45tFyrWF1kXw4xZFykuFg_yoW5WFWfpa 48t3WqkF1SvFnxX3WxuayUZF40gFsrXw40y3W8try5Zrn7Jr1xtr1xKFW7JFnxurZ3XFnx Aas3KFyfZFy8JaUanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPvb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14 v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuY vjxUI-eODUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao Move test_lpm_map.c to map_tests/ to include LPM trie test cases in regular test_maps run. Most code remains unchanged, including the use of assert(). Only reduce n_lookups from 64K to 512, which decreases test_lpm_map runtime from 37s to 0.7s. Signed-off-by: Hou Tao --- tools/testing/selftests/bpf/.gitignore | 1 - tools/testing/selftests/bpf/Makefile | 2 +- .../lpm_trie_map_basic_ops.c} | 10 +++------- 3 files changed, 4 insertions(+), 9 deletions(-) rename tools/testing/selftests/bpf/{test_lpm_map.c => map_tests/lpm_trie_map_basic_ops.c} (99%) diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index c2a1842c3d8b..e9c377001f93 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -5,7 +5,6 @@ bpf-syscall* test_verifier test_maps test_lru_map -test_lpm_map test_tag FEATURE-DUMP.libbpf FEATURE-DUMP.selftests diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 6ad3b1ba1920..7eeb3cbe18c7 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -83,7 +83,7 @@ CLANG_CPUV4 := 1 endif # Order correspond to 'make run_tests' order -TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \ +TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_progs \ test_sockmap \ test_tcpnotify_user test_sysctl \ test_progs-no_alu32 diff --git a/tools/testing/selftests/bpf/test_lpm_map.c b/tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c similarity index 99% rename from tools/testing/selftests/bpf/test_lpm_map.c rename to tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c index d98c72dc563e..f375c89d78a4 100644 --- a/tools/testing/selftests/bpf/test_lpm_map.c +++ b/tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c @@ -223,7 +223,7 @@ static void test_lpm_map(int keysize) n_matches = 0; n_matches_after_delete = 0; n_nodes = 1 << 8; - n_lookups = 1 << 16; + n_lookups = 1 << 9; data = alloca(keysize); memset(data, 0, keysize); @@ -770,16 +770,13 @@ static void test_lpm_multi_thread(void) close(map_fd); } -int main(void) +void test_lpm_trie_map_basic_ops(void) { int i; /* we want predictable, pseudo random tests */ srand(0xf00ba1); - /* Use libbpf 1.0 API mode */ - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); - test_lpm_basic(); test_lpm_order(); @@ -792,6 +789,5 @@ int main(void) test_lpm_get_next_key(); test_lpm_multi_thread(); - printf("test_lpm: OK\n"); - return 0; + printf("%s: PASS\n", __func__); } From patchwork Mon Nov 18 01:08:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13877998 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E2113D517 for ; Mon, 18 Nov 2024 00:56:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731891373; cv=none; b=a5hb+s538ape5YoMVtpRdmrOCff+9wjLZCi+13neFkn7x8t2OT2DdLi4XZazMBs6XPOb67VZhRzID/96hlLypYg14LA6otu8kOJ1RJQHO310iiJ/Pmn8B2t+aPdBC/EsThN+vcoMdix1IMCcKNckcT1JPpg3E+84MEbplpdK0ro= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731891373; c=relaxed/simple; bh=FvDTvhfIuWLEyb6na+3CTeXY7LBRFnkOoVM8Cr2HYb8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=osl5IdSqLWyNUhphu6Ynop/eP7ASGc6eQGd2zx5wJOvCykrIEJIkOGWvTp5+sN8211UeQXXTOtoQGK4NBylDH/EGtcSHX3i+/ZxKJWE8LsuD+AmLuF7LeXemTS9Ov4hx3Ph57XCWLcN2/Z5z1qcFgTkF69Q1ybomPYyDm2x2xS0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Xs8Ly3h1Qz4f3jkk for ; Mon, 18 Nov 2024 08:55:54 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id D6A8F1A0197 for ; Mon, 18 Nov 2024 08:56:07 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgCXc4eckDpn2G5fCA--.44635S14; Mon, 18 Nov 2024 08:56:07 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , Thomas Gleixner , =?utf-8?q?Thomas_Wei=C3=9Fschuh?= , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next 10/10] selftests/bpf: Add more test cases for LPM trie Date: Mon, 18 Nov 2024 09:08:08 +0800 Message-Id: <20241118010808.2243555-11-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20241118010808.2243555-1-houtao@huaweicloud.com> References: <20241118010808.2243555-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgCXc4eckDpn2G5fCA--.44635S14 X-Coremail-Antispam: 1UD129KBjvAXoWfGw43AFW3XF15GrWxAr43Wrg_yoW8XF18Wo WfWwsxtw1FgryUZ3y8Wa4kuw15GF4Uu34kJr9aqwsxtw4UtrykAa18GFW3Gay8WF12kryq v3sFvr9ayr9YkrWfn29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUOY7kC6x804xWl14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK 8VAvwI8IcIk0rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF 0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vE j48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxV AFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x02 67AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I 80ewAv7VC0I7IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCj c4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4IIrI8v6xkF7I0E8cxan2IY04v7MxkF7I0En4 kS14v26r4a6rW5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E 5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVW8ZV WrXwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r4j6ryUMIIF0xvE2Ix0cI8IcVCY 1x0267AKxVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67 AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZE Xa7IU0sqXPUUUUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao Add more test cases for LPM trie in test_maps: 1) test_lpm_trie_update_flags() It constructs various use cases for BPF_EXIST and BPF_NOEXIST and check whether the return value of update operation is expected. 2) test_lpm_trie_reuse_leaf_node It tests whether the reuse of leaf node as intermediate node is handled correctly when freeing the intermediate node. 3) test_lpm_trie_iterate_strs & test_lpm_trie_iterate_ints There two test case test whether the iteration through get_next_key is sorted and expected. These two test cases delete the minimal key after each iteration and check whether next iteration returns the second minimal key. The only difference between these two test cases is the former one saves strings in the LPM trie and the latter saves integers. Signed-off-by: Hou Tao --- .../bpf/map_tests/lpm_trie_map_basic_ops.c | 391 ++++++++++++++++++ 1 file changed, 391 insertions(+) diff --git a/tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c b/tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c index f375c89d78a4..eb17c711b4c8 100644 --- a/tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c +++ b/tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c @@ -20,10 +20,12 @@ #include #include #include +#include #include #include #include +#include #include "bpf_util.h" @@ -33,6 +35,22 @@ struct tlpm_node { uint8_t key[]; }; +struct lpm_trie_bytes_key { + union { + struct bpf_lpm_trie_key_hdr hdr; + __u32 prefixlen; + }; + unsigned char data[8]; +}; + +struct lpm_trie_int_key { + union { + struct bpf_lpm_trie_key_hdr hdr; + __u32 prefixlen; + }; + unsigned int data; +}; + static struct tlpm_node *tlpm_match(struct tlpm_node *list, const uint8_t *key, size_t n_bits); @@ -770,6 +788,374 @@ static void test_lpm_multi_thread(void) close(map_fd); } +static int lpm_trie_create(unsigned int key_size, unsigned int value_size, unsigned int max_entries) +{ + LIBBPF_OPTS(bpf_map_create_opts, opts); + int fd; + + opts.map_flags = BPF_F_NO_PREALLOC; + fd = bpf_map_create(BPF_MAP_TYPE_LPM_TRIE, "lpm_trie", key_size, value_size, max_entries, + &opts); + CHECK(fd < 0, "bpf_map_create", "error %d\n", errno); + + return fd; +} + +static void test_lpm_trie_update_flags(void) +{ + struct lpm_trie_int_key key; + unsigned int value, got; + int fd, err; + + fd = lpm_trie_create(sizeof(key), sizeof(value), 3); + + /* invalid flags (Error) */ + key.prefixlen = 32; + key.data = 0; + value = 0; + err = bpf_map_update_elem(fd, &key, &value, BPF_F_LOCK); + CHECK(err != -EINVAL, "invalid update flag", "error %d\n", err); + + /* invalid flags (Error) */ + key.prefixlen = 32; + key.data = 0; + value = 0; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST | BPF_EXIST); + CHECK(err != -EINVAL, "invalid update flag", "error %d\n", err); + + /* overwrite an empty qp-trie (Error) */ + key.prefixlen = 32; + key.data = 0; + value = 2; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + CHECK(err != -ENOENT, "overwrite empty qp-trie", "error %d\n", err); + + /* add a new node */ + key.prefixlen = 16; + key.data = 0; + value = 1; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + CHECK(err, "add new elem", "error %d\n", err); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup elem", "error %d\n", err); + CHECK(got != value, "check value", "got %d exp %d\n", got, value); + + /* add the same node as new node (Error) */ + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + CHECK(err != -EEXIST, "add new elem again", "error %d\n", err); + + /* overwrite the existed node */ + value = 4; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + CHECK(err, "overwrite elem", "error %d\n", err); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup elem", "error %d\n", err); + CHECK(got != value, "check value", "got %d exp %d\n", got, value); + + /* overwrite the node */ + value = 1; + err = bpf_map_update_elem(fd, &key, &value, BPF_ANY); + CHECK(err, "update elem", "error %d\n", err); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup elem", "error %d\n", err); + CHECK(got != value, "check value", "got %d exp %d\n", got, value); + + /* overwrite a non-existent node which is the prefix of the first + * node (Error). + */ + key.prefixlen = 8; + key.data = 0; + value = 2; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + CHECK(err != -ENOENT, "overwrite nonexistent elem", "error %d\n", err); + + /* add a new node which is the prefix of the first node */ + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + CHECK(err, "add new elem", "error %d\n", err); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup key", "error %d\n", err); + CHECK(got != value, "check value", "got %d exp %d\n", got, value); + + /* add another new node which will be the sibling of the first node */ + key.prefixlen = 9; + key.data = htobe32(1 << 23); + value = 5; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + CHECK(err, "add new elem", "error %d\n", err); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup key", "error %d\n", err); + CHECK(got != value, "check value", "got %d exp %d\n", got, value); + + /* overwrite the third node */ + value = 3; + err = bpf_map_update_elem(fd, &key, &value, BPF_ANY); + CHECK(err, "overwrite elem", "error %d\n", err); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup key", "error %d\n", err); + CHECK(got != value, "check value", "got %d exp %d\n", got, value); + + /* delete the second node to make it an intermediate node */ + key.prefixlen = 8; + key.data = 0; + err = bpf_map_delete_elem(fd, &key); + CHECK(err, "del elem", "error %d\n", err); + + /* overwrite the intermediate node (Error) */ + value = 2; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + CHECK(err != -ENOENT, "overwrite nonexistent elem", "error %d\n", err); + + close(fd); +} + +static void test_lpm_trie_reuse_leaf_node(void) +{ + /* Use a big value_size to prevent the merging of intermediate node + * allocator and leaf node allocator. + */ + char value[64] = {}, got[64] = {}; + struct lpm_trie_int_key key; + int fd, err; + + fd = lpm_trie_create(sizeof(key), sizeof(value), 3); + + /* add a new node */ + key.prefixlen = 16; + key.data = 0; + memset(value, 'a', sizeof(value)); + err = bpf_map_update_elem(fd, &key, value, BPF_NOEXIST); + CHECK(err, "add new elem", "error %d\n", err); + memset(got, 0, sizeof(got)); + err = bpf_map_lookup_elem(fd, &key, got); + CHECK(err, "lookup elem", "error %d\n", err); + CHECK(memcmp(value, got, sizeof(value)), "check value", "got %.*s exp %.*s\n", + (int)sizeof(value), value, (int)sizeof(got), got); + + /* add a new node which is the prefix of the first node */ + key.prefixlen = 8; + key.data = 0; + memset(value, 'b', sizeof(value)); + err = bpf_map_update_elem(fd, &key, value, BPF_NOEXIST); + CHECK(err, "add new elem", "error %d\n", err); + memset(got, 0, sizeof(got)); + err = bpf_map_lookup_elem(fd, &key, got); + CHECK(err, "lookup elem", "error %d\n", err); + CHECK(memcmp(value, got, sizeof(value)), "check value", "got %.*s exp %.*s\n", + (int)sizeof(value), value, (int)sizeof(got), got); + + /* add another new node which will be the sibling of the first node */ + key.prefixlen = 9; + key.data = htobe32(1 << 23); + memset(value, 'c', sizeof(value)); + err = bpf_map_update_elem(fd, &key, value, BPF_NOEXIST); + CHECK(err, "add new elem", "error %d\n", err); + memset(got, 0, sizeof(got)); + err = bpf_map_lookup_elem(fd, &key, got); + CHECK(err, "lookup elem", "error %d\n", err); + CHECK(memcmp(value, got, sizeof(value)), "check value", "got %.*s exp %.*s\n", + (int)sizeof(value), value, (int)sizeof(got), got); + + /* try to add more node (Error) */ + key.prefixlen = 32; + key.data = 0; + memset(value, 'd', sizeof(value)); + err = bpf_map_update_elem(fd, &key, value, BPF_NOEXIST); + CHECK(err != -ENOSPC, "add to full trie", "error %d\n", err); + + /* delete the second node to make it an intermediate node */ + key.prefixlen = 8; + key.data = 0; + err = bpf_map_delete_elem(fd, &key); + CHECK(err, "del elem", "error %d\n", err); + + /* delete the first node to free the intermediate node */ + key.prefixlen = 16; + key.data = 0; + err = bpf_map_delete_elem(fd, &key); + CHECK(err, "del elem", "error %d\n", err); + + close(fd); +} + +static int cmp_str(const void *a, const void *b) +{ + const char *str_a = *(const char **)a, *str_b = *(const char **)b; + + return strcmp(str_a, str_b); +} + +/* Save strings in LPM trie. The trailing '\0' for each string will be + * accounted in the prefixlen. The strings returned during the iteration + * should be sorted as expected. + */ +static void test_lpm_trie_iterate_strs(void) +{ + static const char * const keys[] = { + "ab", "abc", "abo", "abS", "abcd", + }; + const char *sorted_keys[ARRAY_SIZE(keys)]; + struct lpm_trie_bytes_key key, next_key; + unsigned int value, got, i, j, len; + struct lpm_trie_bytes_key *cur; + int fd, err; + + fd = lpm_trie_create(sizeof(key), sizeof(value), ARRAY_SIZE(keys)); + + for (i = 0; i < ARRAY_SIZE(keys); i++) { + unsigned int flags; + + /* add i-th element */ + flags = i % 2 ? BPF_NOEXIST : 0; + len = strlen(keys[i]); + /* include the trailing '\0' */ + key.prefixlen = (len + 1) * 8; + memset(key.data, 0, sizeof(key.data)); + memcpy(key.data, keys[i], len); + value = i + 100; + err = bpf_map_update_elem(fd, &key, &value, flags); + CHECK(err, "add elem", "#%u error %d\n", i, err); + + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup elem", "#%u error %d\n", i, err); + CHECK(got != value, "lookup elem", "#%u expect %u got %u\n", i, value, got); + + /* re-add i-th element (Error) */ + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + CHECK(err != -EEXIST, "re-add elem", "#%u error %d\n", i, err); + + /* Overwrite i-th element */ + flags = i % 2 ? 0 : BPF_EXIST; + value = i; + err = bpf_map_update_elem(fd, &key, &value, flags); + CHECK(err, "update elem", "error %d\n", err); + + /* Lookup #[0~i] elements */ + for (j = 0; j <= i; j++) { + len = strlen(keys[j]); + key.prefixlen = (len + 1) * 8; + memset(key.data, 0, sizeof(key.data)); + memcpy(key.data, keys[j], len); + err = bpf_map_lookup_elem(fd, &key, &got); + CHECK(err, "lookup elem", "#%u/%u error %d\n", i, j, err); + CHECK(got != j, "lookup elem", "#%u/%u expect %u got %u\n", + i, j, value, got); + } + } + + /* Add element to a full qp-trie (Error) */ + key.prefixlen = sizeof(key.data) * 8; + memset(key.data, 0, sizeof(key.data)); + value = 0; + err = bpf_map_update_elem(fd, &key, &value, 0); + CHECK(err != -ENOSPC, "add to full qp-trie", "error %d\n", err); + + /* Iterate sorted elements: no deletion */ + memcpy(sorted_keys, keys, sizeof(keys)); + qsort(sorted_keys, ARRAY_SIZE(sorted_keys), sizeof(sorted_keys[0]), cmp_str); + cur = NULL; + for (i = 0; i < ARRAY_SIZE(sorted_keys); i++) { + len = strlen(sorted_keys[i]); + err = bpf_map_get_next_key(fd, cur, &next_key); + CHECK(err, "iterate", "#%u error %d\n", i, err); + CHECK(next_key.prefixlen != (len + 1) * 8, "iterate", + "#%u invalid len %u expect %u\n", + i, next_key.prefixlen, (len + 1) * 8); + CHECK(memcmp(sorted_keys[i], next_key.data, len + 1), "iterate", + "#%u got %.*s exp %.*s\n", i, len, next_key.data, len, sorted_keys[i]); + + cur = &next_key; + } + err = bpf_map_get_next_key(fd, cur, &next_key); + CHECK(err != -ENOENT, "more element", "error %d\n", err); + + /* Iterate sorted elements: delete the found key after each iteration */ + cur = NULL; + for (i = 0; i < ARRAY_SIZE(sorted_keys); i++) { + len = strlen(sorted_keys[i]); + err = bpf_map_get_next_key(fd, cur, &next_key); + CHECK(err, "iterate", "#%u error %d\n", i, err); + CHECK(next_key.prefixlen != (len + 1) * 8, "iterate", + "#%u invalid len %u expect %u\n", + i, next_key.prefixlen, (len + 1) * 8); + CHECK(memcmp(sorted_keys[i], next_key.data, len + 1), "iterate", + "#%u got %.*s exp %.*s\n", i, len, next_key.data, len, sorted_keys[i]); + + cur = &next_key; + + err = bpf_map_delete_elem(fd, cur); + CHECK(err, "delete", "#%u error %d\n", i, err); + } + err = bpf_map_get_next_key(fd, cur, &next_key); + CHECK(err != -ENOENT, "non-empty qp-trie", "error %d\n", err); + + close(fd); +} + +/* Use the fixed prefixlen (32) and save integers in LPM trie. The iteration of + * LPM trie will return these integers in big-endian order, therefore, convert + * these integers to big-endian before update. After each iteration, delete the + * found key (the smallest integer) and expect the next iteration will return + * the second smallest number. + */ +static void test_lpm_trie_iterate_ints(void) +{ + struct lpm_trie_int_key key, next_key; + unsigned int i, max_entries; + struct lpm_trie_int_key *cur; + unsigned int *data_set; + int fd, err; + bool value; + + max_entries = 4096; + data_set = calloc(max_entries, sizeof(*data_set)); + CHECK(!data_set, "malloc", "no mem\n"); + for (i = 0; i < max_entries; i++) + data_set[i] = i; + + fd = lpm_trie_create(sizeof(key), sizeof(value), max_entries); + value = true; + for (i = 0; i < max_entries; i++) { + key.prefixlen = 32; + key.data = htobe32(data_set[i]); + + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + CHECK(err, "add elem", "#%u error %d\n", i, err); + } + + cur = NULL; + for (i = 0; i < max_entries; i++) { + err = bpf_map_get_next_key(fd, cur, &next_key); + CHECK(err, "iterate", "#%u error %d\n", i, err); + CHECK(next_key.prefixlen != 32, "iterate", "#%u invalid len %u\n", + i, next_key.prefixlen); + CHECK(be32toh(next_key.data) != data_set[i], "iterate", "#%u got 0x%x exp 0x%x\n", + i, be32toh(next_key.data), data_set[i]); + cur = &next_key; + + /* + * Delete the minimal key, the next call of bpf_get_next_key() + * will return the second minimal key. + */ + err = bpf_map_delete_elem(fd, &next_key); + CHECK(err, "del elem", "#%u elem error %d\n", i, err); + } + err = bpf_map_get_next_key(fd, cur, &next_key); + CHECK(err != -ENOENT, "more element", "error %d\n", err); + + err = bpf_map_get_next_key(fd, NULL, &next_key); + CHECK(err != -ENOENT, "no-empty qp-trie", "error %d\n", err); + + free(data_set); + + close(fd); +} + void test_lpm_trie_map_basic_ops(void) { int i; @@ -789,5 +1175,10 @@ void test_lpm_trie_map_basic_ops(void) test_lpm_get_next_key(); test_lpm_multi_thread(); + test_lpm_trie_update_flags(); + test_lpm_trie_reuse_leaf_node(); + test_lpm_trie_iterate_strs(); + test_lpm_trie_iterate_ints(); + printf("%s: PASS\n", __func__); }