From patchwork Thu Dec 5 00:28:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 13894615 X-Patchwork-Delegate: kuba@kernel.org Received: from mail.netfilter.org (mail.netfilter.org [217.70.188.207]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 2E7D7D268; Thu, 5 Dec 2024 00:29:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.70.188.207 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358554; cv=none; b=ruZ4562tcDJAvVshL7y23r9jH/AjrTBAvAZLZB3hW+LAZJfYX8Iy4kdwBMPYX0yth+SpXIR9VpivfibqMLlYiG4m8FdKrZIjrjCh+8JED3PdzmrwXTYC6sj/IM/BEvvrbPf7gA56seCZLB1qrYfFgfyDhnxLOEnD5wyfhfkWvY4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358554; c=relaxed/simple; bh=rl6K2qrpxyJybEOdvULFU5pYulet5pDFonGnFQj0GuE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=g+966O8eTu9LabCj5PXzBu93bovpapSoUsnj78D2x+4rcPfxuZgu/9QQkpho2cJoQsxVUtiydtZ8idDfdTljKqdvwq7HPv0ppHIhEip3bDcwAkCeBRbHGKsznQKkLv9t7p3lIanLanERMItgEC42dQrKNxyz98fbQK4C7ZU70wE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org; spf=pass smtp.mailfrom=netfilter.org; arc=none smtp.client-ip=217.70.188.207 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=netfilter.org From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, fw@strlen.de Subject: [PATCH net 1/6] ipvs: fix UB due to uninitialized stack access in ip_vs_protocol_init() Date: Thu, 5 Dec 2024 01:28:49 +0100 Message-Id: <20241205002854.162490-2-pablo@netfilter.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20241205002854.162490-1-pablo@netfilter.org> References: <20241205002854.162490-1-pablo@netfilter.org> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org From: Jinghao Jia Under certain kernel configurations when building with Clang/LLVM, the compiler does not generate a return or jump as the terminator instruction for ip_vs_protocol_init(), triggering the following objtool warning during build time: vmlinux.o: warning: objtool: ip_vs_protocol_init() falls through to next function __initstub__kmod_ip_vs_rr__935_123_ip_vs_rr_init6() At runtime, this either causes an oops when trying to load the ipvs module or a boot-time panic if ipvs is built-in. This same issue has been reported by the Intel kernel test robot previously. Digging deeper into both LLVM and the kernel code reveals this to be a undefined behavior problem. ip_vs_protocol_init() uses a on-stack buffer of 64 chars to store the registered protocol names and leaves it uninitialized after definition. The function calls strnlen() when concatenating protocol names into the buffer. With CONFIG_FORTIFY_SOURCE strnlen() performs an extra step to check whether the last byte of the input char buffer is a null character (commit 3009f891bb9f ("fortify: Allow strlen() and strnlen() to pass compile-time known lengths")). This, together with possibly other configurations, cause the following IR to be generated: define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #5 section ".init.text" align 16 !kcfi_type !29 { %1 = alloca [64 x i8], align 16 ... 14: ; preds = %11 %15 = getelementptr inbounds i8, ptr %1, i64 63 %16 = load i8, ptr %15, align 1 %17 = tail call i1 @llvm.is.constant.i8(i8 %16) %18 = icmp eq i8 %16, 0 %19 = select i1 %17, i1 %18, i1 false br i1 %19, label %20, label %23 20: ; preds = %14 %21 = call i64 @strlen(ptr noundef nonnull dereferenceable(1) %1) #23 ... 23: ; preds = %14, %11, %20 %24 = call i64 @strnlen(ptr noundef nonnull dereferenceable(1) %1, i64 noundef 64) #24 ... } The above code calculates the address of the last char in the buffer (value %15) and then loads from it (value %16). Because the buffer is never initialized, the LLVM GVN pass marks value %16 as undefined: %13 = getelementptr inbounds i8, ptr %1, i64 63 br i1 undef, label %14, label %17 This gives later passes (SCCP, in particular) more DCE opportunities by propagating the undef value further, and eventually removes everything after the load on the uninitialized stack location: define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #0 section ".init.text" align 16 !kcfi_type !11 { %1 = alloca [64 x i8], align 16 ... 12: ; preds = %11 %13 = getelementptr inbounds i8, ptr %1, i64 63 unreachable } In this way, the generated native code will just fall through to the next function, as LLVM does not generate any code for the unreachable IR instruction and leaves the function without a terminator. Zero the on-stack buffer to avoid this possible UB. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202402100205.PWXIz1ZK-lkp@intel.com/ Co-developed-by: Ruowen Qin Signed-off-by: Ruowen Qin Signed-off-by: Jinghao Jia Acked-by: Julian Anastasov Signed-off-by: Pablo Neira Ayuso --- net/netfilter/ipvs/ip_vs_proto.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c index f100da4ba3bc..a9fd1d3fc2cb 100644 --- a/net/netfilter/ipvs/ip_vs_proto.c +++ b/net/netfilter/ipvs/ip_vs_proto.c @@ -340,7 +340,7 @@ void __net_exit ip_vs_protocol_net_cleanup(struct netns_ipvs *ipvs) int __init ip_vs_protocol_init(void) { - char protocols[64]; + char protocols[64] = { 0 }; #define REGISTER_PROTOCOL(p) \ do { \ register_ip_vs_protocol(p); \ @@ -348,8 +348,6 @@ int __init ip_vs_protocol_init(void) strcat(protocols, (p)->name); \ } while (0) - protocols[0] = '\0'; - protocols[2] = '\0'; #ifdef CONFIG_IP_VS_PROTO_TCP REGISTER_PROTOCOL(&ip_vs_protocol_tcp); #endif From patchwork Thu Dec 5 00:28:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 13894614 X-Patchwork-Delegate: kuba@kernel.org Received: from mail.netfilter.org (mail.netfilter.org [217.70.188.207]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 783AFD26D; Thu, 5 Dec 2024 00:29:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.70.188.207 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358554; cv=none; b=a8DmUOanPfFZkVsxTuX8dldzVLG/UoBoktu52lrWMeapSb/ELdbnU5HGvghEZ4jjwIIb/rA+hA4F2fRPA05ZD4EGw+o5TmxqdCrLU+AAO7M3oRwuIc4XDieHCowp5e7Qa6CE5gFb+luwW6m9tLbrFn4AQZ+qjNl4ELVpYuDFMfg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358554; c=relaxed/simple; bh=hFk7DcrCWku1f8Amlp053OgJaZqRDekiB+fJXOGIWn4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Sf2yZZwzcLb4787KhRHKyIsSZVtAwgIcYz6MOnsTwMvex4NG+aaYb4YtMuEPIT6JGmVNW2yQRpbqSRI4BgkPdDYTfqxbSVptPLE5GCJmBp9nB1rir1YjQ7yKNkiP/ozs7P1Hm1voJa5Ka0bS8CY3rCoqbOPXrYTpXJxWgqyHFIA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org; spf=pass smtp.mailfrom=netfilter.org; arc=none smtp.client-ip=217.70.188.207 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=netfilter.org From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, fw@strlen.de Subject: [PATCH net 2/6] netfilter: x_tables: fix LED ID check in led_tg_check() Date: Thu, 5 Dec 2024 01:28:50 +0100 Message-Id: <20241205002854.162490-3-pablo@netfilter.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20241205002854.162490-1-pablo@netfilter.org> References: <20241205002854.162490-1-pablo@netfilter.org> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org From: Dmitry Antipov Syzbot has reported the following BUG detected by KASAN: BUG: KASAN: slab-out-of-bounds in strlen+0x58/0x70 Read of size 1 at addr ffff8881022da0c8 by task repro/5879 ... Call Trace: dump_stack_lvl+0x241/0x360 ? __pfx_dump_stack_lvl+0x10/0x10 ? __pfx__printk+0x10/0x10 ? _printk+0xd5/0x120 ? __virt_addr_valid+0x183/0x530 ? __virt_addr_valid+0x183/0x530 print_report+0x169/0x550 ? __virt_addr_valid+0x183/0x530 ? __virt_addr_valid+0x183/0x530 ? __virt_addr_valid+0x45f/0x530 ? __phys_addr+0xba/0x170 ? strlen+0x58/0x70 kasan_report+0x143/0x180 ? strlen+0x58/0x70 strlen+0x58/0x70 kstrdup+0x20/0x80 led_tg_check+0x18b/0x3c0 xt_check_target+0x3bb/0xa40 ? __pfx_xt_check_target+0x10/0x10 ? stack_depot_save_flags+0x6e4/0x830 ? nft_target_init+0x174/0xc30 nft_target_init+0x82d/0xc30 ? __pfx_nft_target_init+0x10/0x10 ? nf_tables_newrule+0x1609/0x2980 ? nf_tables_newrule+0x1609/0x2980 ? rcu_is_watching+0x15/0xb0 ? nf_tables_newrule+0x1609/0x2980 ? nf_tables_newrule+0x1609/0x2980 ? __kmalloc_noprof+0x21a/0x400 nf_tables_newrule+0x1860/0x2980 ? __pfx_nf_tables_newrule+0x10/0x10 ? __nla_parse+0x40/0x60 nfnetlink_rcv+0x14e5/0x2ab0 ? __pfx_validate_chain+0x10/0x10 ? __pfx_nfnetlink_rcv+0x10/0x10 ? __lock_acquire+0x1384/0x2050 ? netlink_deliver_tap+0x2e/0x1b0 ? __pfx_lock_release+0x10/0x10 ? netlink_deliver_tap+0x2e/0x1b0 netlink_unicast+0x7f8/0x990 ? __pfx_netlink_unicast+0x10/0x10 ? __virt_addr_valid+0x183/0x530 ? __check_object_size+0x48e/0x900 netlink_sendmsg+0x8e4/0xcb0 ? __pfx_netlink_sendmsg+0x10/0x10 ? aa_sock_msg_perm+0x91/0x160 ? __pfx_netlink_sendmsg+0x10/0x10 __sock_sendmsg+0x223/0x270 ____sys_sendmsg+0x52a/0x7e0 ? __pfx_____sys_sendmsg+0x10/0x10 __sys_sendmsg+0x292/0x380 ? __pfx___sys_sendmsg+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x43d/0x780 ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10 ? exc_page_fault+0x590/0x8c0 ? do_syscall_64+0xb6/0x230 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... Since an invalid (without '\0' byte at all) byte sequence may be passed from userspace, add an extra check to ensure that such a sequence is rejected as possible ID and so never passed to 'kstrdup()' and further. Reported-by: syzbot+6c8215822f35fdb35667@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6c8215822f35fdb35667 Fixes: 268cb38e1802 ("netfilter: x_tables: add LED trigger target") Signed-off-by: Dmitry Antipov Signed-off-by: Pablo Neira Ayuso --- net/netfilter/xt_LED.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/netfilter/xt_LED.c b/net/netfilter/xt_LED.c index f7b0286d106a..8a80fd76fe45 100644 --- a/net/netfilter/xt_LED.c +++ b/net/netfilter/xt_LED.c @@ -96,7 +96,9 @@ static int led_tg_check(const struct xt_tgchk_param *par) struct xt_led_info_internal *ledinternal; int err; - if (ledinfo->id[0] == '\0') + /* Bail out if empty string or not a string at all. */ + if (ledinfo->id[0] == '\0' || + !memchr(ledinfo->id, '\0', sizeof(ledinfo->id))) return -EINVAL; mutex_lock(&xt_led_mutex); From patchwork Thu Dec 5 00:28:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 13894616 X-Patchwork-Delegate: kuba@kernel.org Received: from mail.netfilter.org (mail.netfilter.org [217.70.188.207]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 42CB5D51C; Thu, 5 Dec 2024 00:29:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.70.188.207 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358555; cv=none; b=UDkOZT8ryfW1AbP+lfaYRMrD2cJfaotJahdyxkfL5LT6+4FeEMquUes4hkv3r90Hwfl22rAOvUWI/Q5q4aHHjnYN+ZzoTZPHo2dYglgrJi/4yuFm9QMugKTsfV52+QAuFjoIOcTIN1JjW8SJeVDy/bi12qxV2TmQ4rrsFlb1iCA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358555; c=relaxed/simple; bh=QkK+4o+GxL8Vwgw3KI84zT5wXflQ89/VDVRsXipyGzA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=V08ZOv7QAGJKU/dMOC9XYZJkJCWNvmTaoMMDe319N8yjhYgPbG/rbWb+k6E8TLsdD4o/WZxUA1/Uki38DK3il/4D58Wb8IE3jg12Np5YDsCLb/VE4wxVnPU2Ew396BZJsscdVBhok/JJGNeOTW4AJWUDzfSagldeTKbQ1BrpRhs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org; spf=pass smtp.mailfrom=netfilter.org; arc=none smtp.client-ip=217.70.188.207 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=netfilter.org From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, fw@strlen.de Subject: [PATCH net 3/6] netfilter: nft_socket: remove WARN_ON_ONCE on maximum cgroup level Date: Thu, 5 Dec 2024 01:28:51 +0100 Message-Id: <20241205002854.162490-4-pablo@netfilter.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20241205002854.162490-1-pablo@netfilter.org> References: <20241205002854.162490-1-pablo@netfilter.org> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org cgroup maximum depth is INT_MAX by default, there is a cgroup toggle to restrict this maximum depth to a more reasonable value not to harm performance. Remove unnecessary WARN_ON_ONCE which is reachable from userspace. Fixes: 7f3287db6543 ("netfilter: nft_socket: make cgroupsv2 matching work with namespaces") Reported-by: syzbot+57bac0866ddd99fe47c0@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso --- net/netfilter/nft_socket.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c index f5da0c1775f2..35d0409b0095 100644 --- a/net/netfilter/nft_socket.c +++ b/net/netfilter/nft_socket.c @@ -68,7 +68,7 @@ static noinline int nft_socket_cgroup_subtree_level(void) cgroup_put(cgrp); - if (WARN_ON_ONCE(level > 255)) + if (level > 255) return -ERANGE; if (WARN_ON_ONCE(level < 0)) From patchwork Thu Dec 5 00:28:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 13894617 X-Patchwork-Delegate: kuba@kernel.org Received: from mail.netfilter.org (mail.netfilter.org [217.70.188.207]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4250F36C; Thu, 5 Dec 2024 00:29:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.70.188.207 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358556; cv=none; b=qPfpBm2OWQdtI0VkmzlnpvVfXoJzEVtIUP9CtWDCB3Sab3hQ8vZ+76Btkxqq2/DIvpjNsml0Ns0s5+SJxECw6RdD/EWmlvv26Yy+uTAp0nVZReqXvnJrIaR40LEUQr3nlC0iDRARNGJH3Qy+IPrFsq4PwYa4DisopdpMM8Oaudo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358556; c=relaxed/simple; bh=MiVoOMPXNAHaS5CBMEMuXHJXczLtzML6OeycE5AAwZQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Do6TB55PQKBPhO/Eh1cHZz2vWF1IMObzEiGhfDNKXvqB3X4EJ9xCyqFDPTolVk4EHwy5A3aYX7Gh+Qi8s6eu+1Ebl9hSgqJAphx3nXJNF3jCOOa71RzPm3+Bu9SAJkCldTSVMN+malocCRkune3dv09gMNtpy4N3hbhbJqA9fXM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org; spf=pass smtp.mailfrom=netfilter.org; arc=none smtp.client-ip=217.70.188.207 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=netfilter.org From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, fw@strlen.de Subject: [PATCH net 4/6] netfilter: nft_inner: incorrect percpu area handling under softirq Date: Thu, 5 Dec 2024 01:28:52 +0100 Message-Id: <20241205002854.162490-5-pablo@netfilter.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20241205002854.162490-1-pablo@netfilter.org> References: <20241205002854.162490-1-pablo@netfilter.org> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org Softirq can interrupt ongoing packet from process context that is walking over the percpu area that contains inner header offsets. Disable bh and perform three checks before restoring the percpu inner header offsets to validate that the percpu area is valid for this skbuff: 1) If the NFT_PKTINFO_INNER_FULL flag is set on, then this skbuff has already been parsed before for inner header fetching to register. 2) Validate that the percpu area refers to this skbuff using the skbuff pointer as a cookie. If there is a cookie mismatch, then this skbuff needs to be parsed again. 3) Finally, validate if the percpu area refers to this tunnel type. Only after these three checks the percpu area is restored to a on-stack copy and bh is enabled again. After inner header fetching, the on-stack copy is stored back to the percpu area. Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Reported-by: syzbot+84d0441b9860f0d63285@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso --- include/net/netfilter/nf_tables_core.h | 1 + net/netfilter/nft_inner.c | 57 ++++++++++++++++++++------ 2 files changed, 46 insertions(+), 12 deletions(-) diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h index ff27cb2e1662..03b6165756fc 100644 --- a/include/net/netfilter/nf_tables_core.h +++ b/include/net/netfilter/nf_tables_core.h @@ -161,6 +161,7 @@ enum { }; struct nft_inner_tun_ctx { + unsigned long cookie; u16 type; u16 inner_tunoff; u16 inner_lloff; diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c index 928312d01eb1..817ab978d24a 100644 --- a/net/netfilter/nft_inner.c +++ b/net/netfilter/nft_inner.c @@ -210,35 +210,66 @@ static int nft_inner_parse(const struct nft_inner *priv, struct nft_pktinfo *pkt, struct nft_inner_tun_ctx *tun_ctx) { - struct nft_inner_tun_ctx ctx = {}; u32 off = pkt->inneroff; if (priv->flags & NFT_INNER_HDRSIZE && - nft_inner_parse_tunhdr(priv, pkt, &ctx, &off) < 0) + nft_inner_parse_tunhdr(priv, pkt, tun_ctx, &off) < 0) return -1; if (priv->flags & (NFT_INNER_LL | NFT_INNER_NH)) { - if (nft_inner_parse_l2l3(priv, pkt, &ctx, off) < 0) + if (nft_inner_parse_l2l3(priv, pkt, tun_ctx, off) < 0) return -1; } else if (priv->flags & NFT_INNER_TH) { - ctx.inner_thoff = off; - ctx.flags |= NFT_PAYLOAD_CTX_INNER_TH; + tun_ctx->inner_thoff = off; + tun_ctx->flags |= NFT_PAYLOAD_CTX_INNER_TH; } - *tun_ctx = ctx; tun_ctx->type = priv->type; + tun_ctx->cookie = (unsigned long)pkt->skb; pkt->flags |= NFT_PKTINFO_INNER_FULL; return 0; } +static bool nft_inner_restore_tun_ctx(const struct nft_pktinfo *pkt, + struct nft_inner_tun_ctx *tun_ctx) +{ + struct nft_inner_tun_ctx *this_cpu_tun_ctx; + + local_bh_disable(); + this_cpu_tun_ctx = this_cpu_ptr(&nft_pcpu_tun_ctx); + if (this_cpu_tun_ctx->cookie != (unsigned long)pkt->skb) { + local_bh_enable(); + return false; + } + *tun_ctx = *this_cpu_tun_ctx; + local_bh_enable(); + + return true; +} + +static void nft_inner_save_tun_ctx(const struct nft_pktinfo *pkt, + const struct nft_inner_tun_ctx *tun_ctx) +{ + struct nft_inner_tun_ctx *this_cpu_tun_ctx; + + local_bh_disable(); + this_cpu_tun_ctx = this_cpu_ptr(&nft_pcpu_tun_ctx); + if (this_cpu_tun_ctx->cookie != tun_ctx->cookie) + *this_cpu_tun_ctx = *tun_ctx; + local_bh_enable(); +} + static bool nft_inner_parse_needed(const struct nft_inner *priv, const struct nft_pktinfo *pkt, - const struct nft_inner_tun_ctx *tun_ctx) + struct nft_inner_tun_ctx *tun_ctx) { if (!(pkt->flags & NFT_PKTINFO_INNER_FULL)) return true; + if (!nft_inner_restore_tun_ctx(pkt, tun_ctx)) + return true; + if (priv->type != tun_ctx->type) return true; @@ -248,27 +279,29 @@ static bool nft_inner_parse_needed(const struct nft_inner *priv, static void nft_inner_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { - struct nft_inner_tun_ctx *tun_ctx = this_cpu_ptr(&nft_pcpu_tun_ctx); const struct nft_inner *priv = nft_expr_priv(expr); + struct nft_inner_tun_ctx tun_ctx = {}; if (nft_payload_inner_offset(pkt) < 0) goto err; - if (nft_inner_parse_needed(priv, pkt, tun_ctx) && - nft_inner_parse(priv, (struct nft_pktinfo *)pkt, tun_ctx) < 0) + if (nft_inner_parse_needed(priv, pkt, &tun_ctx) && + nft_inner_parse(priv, (struct nft_pktinfo *)pkt, &tun_ctx) < 0) goto err; switch (priv->expr_type) { case NFT_INNER_EXPR_PAYLOAD: - nft_payload_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, tun_ctx); + nft_payload_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, &tun_ctx); break; case NFT_INNER_EXPR_META: - nft_meta_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, tun_ctx); + nft_meta_inner_eval((struct nft_expr *)&priv->expr, regs, pkt, &tun_ctx); break; default: WARN_ON_ONCE(1); goto err; } + nft_inner_save_tun_ctx(pkt, &tun_ctx); + return; err: regs->verdict.code = NFT_BREAK; From patchwork Thu Dec 5 00:28:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 13894618 X-Patchwork-Delegate: kuba@kernel.org Received: from mail.netfilter.org (mail.netfilter.org [217.70.188.207]) by smtp.subspace.kernel.org (Postfix) with ESMTP id EA6BD125DF; Thu, 5 Dec 2024 00:29:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.70.188.207 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358556; cv=none; b=sJenPrjYOn4ZTnxryymKcVP4bCW9Lfxz4IrzRtlqyhBR7oUs0B6vH5r4lMXs+9RlIrt+KyKo8p1RN8ubhoe//mw9fwjPs37JK2VSwLOUfrwALjhWYXy231hol+EA1w/16Y2UrWMxFTwkiv3W19AFgcmIwbAk6ef28ADbKi09lAg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358556; c=relaxed/simple; bh=YqX/GhFyR7gxKAglN1gOFlmBUFRXX3aA6t/0OiPaq6U=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=uJoSWpvCs5intu34FlQDNI0AcQ4jAYGdlaUokbi4JNaNAcVc2Ep+TbH//A3KuqZE5iZ3SotUCPGiZVqpTAiu7qZ5cwYF/DJMsR8kL7KzW+pNSy+i7YVA6SESpiNrHPfZs7+IrTuw1jFn2n4XqyCkc56p0eIXFXjJSB1IbBhXwsI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org; spf=pass smtp.mailfrom=netfilter.org; arc=none smtp.client-ip=217.70.188.207 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=netfilter.org From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, fw@strlen.de Subject: [PATCH net 5/6] netfilter: ipset: Hold module reference while requesting a module Date: Thu, 5 Dec 2024 01:28:53 +0100 Message-Id: <20241205002854.162490-6-pablo@netfilter.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20241205002854.162490-1-pablo@netfilter.org> References: <20241205002854.162490-1-pablo@netfilter.org> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org From: Phil Sutter User space may unload ip_set.ko while it is itself requesting a set type backend module, leading to a kernel crash. The race condition may be provoked by inserting an mdelay() right after the nfnl_unlock() call. Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Phil Sutter Acked-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso --- net/netfilter/ipset/ip_set_core.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 61431690cbd5..cc20e6d56807 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -104,14 +104,19 @@ find_set_type(const char *name, u8 family, u8 revision) static bool load_settype(const char *name) { + if (!try_module_get(THIS_MODULE)) + return false; + nfnl_unlock(NFNL_SUBSYS_IPSET); pr_debug("try to load ip_set_%s\n", name); if (request_module("ip_set_%s", name) < 0) { pr_warn("Can't find ip_set type %s\n", name); nfnl_lock(NFNL_SUBSYS_IPSET); + module_put(THIS_MODULE); return false; } nfnl_lock(NFNL_SUBSYS_IPSET); + module_put(THIS_MODULE); return true; } From patchwork Thu Dec 5 00:28:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 13894619 X-Patchwork-Delegate: kuba@kernel.org Received: from mail.netfilter.org (mail.netfilter.org [217.70.188.207]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 29E5E179BC; Thu, 5 Dec 2024 00:29:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.70.188.207 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358557; cv=none; b=dCXHZuuXzAxO4+KWNW2HSQ8inmNLElyTXHr/8XHtBAsvd+o1H8bOwmlQTgy2fTjo68V/i4Fnhc+c83zVD1nGNB2roFkj12v/li03rwqAD4GEhWGjvgFGPlVLttNQ7QNnRYawRcpRNHjsu68IeXKnOzFVUOt8YcjUCmr3/I8YctQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733358557; c=relaxed/simple; bh=40RGB94M9/76/dZ2UTiaprGwmKwXuTVwg+ARJnC2I9c=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=k6xIJSTZqZ1oiC/sI/mZTiRUcOO9Q2Wr9Ggz6tM0r/PYlEBQ6cjUCGjCIU3CswrCphyIIGJksBLQQP/qbIj5P/W7gzQwCjG0vyNQzRqNF2n55ha6K34Utr73q/SaJIRDoBx4Zi9sj8DN1IFZag9HCBUKrKBcpt9vscT1EWeKd6k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org; spf=pass smtp.mailfrom=netfilter.org; arc=none smtp.client-ip=217.70.188.207 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=netfilter.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=netfilter.org From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, fw@strlen.de Subject: [PATCH net 6/6] netfilter: nft_set_hash: skip duplicated elements pending gc run Date: Thu, 5 Dec 2024 01:28:54 +0100 Message-Id: <20241205002854.162490-7-pablo@netfilter.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20241205002854.162490-1-pablo@netfilter.org> References: <20241205002854.162490-1-pablo@netfilter.org> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org rhashtable does not provide stable walk, duplicated elements are possible in case of resizing. I considered that checking for errors when calling rhashtable_walk_next() was sufficient to detect the resizing. However, rhashtable_walk_next() returns -EAGAIN only at the end of the iteration, which is too late, because a gc work containing duplicated elements could have been already scheduled for removal to the worker. Add a u32 gc worker sequence number per set, bump it on every workqueue run. Annotate gc worker sequence number on the expired element. Use it to skip those already seen in this gc workqueue run. Note that this new field is never reset in case gc transaction fails, so next gc worker run on the expired element overrides it. Wraparound of gc worker sequence number should not be an issue with stale gc worker sequence number in the element, that would just postpone the element removal in one gc run. Note that it is not possible to use flags to annotate that element is pending gc run to detect duplicates, given that gc transaction can be invalidated in case of update from the control plane, therefore, not allowing to clear such flag. On x86_64, pahole reports no changes in the size of nft_rhash_elem. Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Reported-by: Laurent Fasnacht Tested-by: Laurent Fasnacht Signed-off-by: Pablo Neira Ayuso --- net/netfilter/nft_set_hash.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c index 65bd291318f2..8bfac4185ac7 100644 --- a/net/netfilter/nft_set_hash.c +++ b/net/netfilter/nft_set_hash.c @@ -24,11 +24,13 @@ struct nft_rhash { struct rhashtable ht; struct delayed_work gc_work; + u32 wq_gc_seq; }; struct nft_rhash_elem { struct nft_elem_priv priv; struct rhash_head node; + u32 wq_gc_seq; struct nft_set_ext ext; }; @@ -338,6 +340,10 @@ static void nft_rhash_gc(struct work_struct *work) if (!gc) goto done; + /* Elements never collected use a zero gc worker sequence number. */ + if (unlikely(++priv->wq_gc_seq == 0)) + priv->wq_gc_seq++; + rhashtable_walk_enter(&priv->ht, &hti); rhashtable_walk_start(&hti); @@ -355,6 +361,14 @@ static void nft_rhash_gc(struct work_struct *work) goto try_later; } + /* rhashtable walk is unstable, already seen in this gc run? + * Then, skip this element. In case of (unlikely) sequence + * wraparound and stale element wq_gc_seq, next gc run will + * just find this expired element. + */ + if (he->wq_gc_seq == priv->wq_gc_seq) + continue; + if (nft_set_elem_is_dead(&he->ext)) goto dead_elem; @@ -371,6 +385,8 @@ static void nft_rhash_gc(struct work_struct *work) if (!gc) goto try_later; + /* annotate gc sequence for this attempt. */ + he->wq_gc_seq = priv->wq_gc_seq; nft_trans_gc_elem_add(gc, he); }