From patchwork Mon Mar 3 13:28:34 2025
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13998885
From: Menglong Dong
To: peterz@infradead.org, rostedt@goodmis.org, mark.rutland@arm.com,
 alexei.starovoitov@gmail.com
Cc: catalin.marinas@arm.com, will@kernel.org, mhiramat@kernel.org,
 tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
 ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 martin.lau@linux.dev, eddyz87@gmail.com, yonghong.song@linux.dev,
 john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
 jolsa@kernel.org, davem@davemloft.net, dsahern@kernel.org,
 mathieu.desnoyers@efficios.com, nathan@kernel.org,
 nick.desaulniers+lkml@gmail.com, morbo@google.com,
 samitolvanen@google.com, kees@kernel.org, dongml2@chinatelecom.cn,
 akpm@linux-foundation.org, riel@surriel.com, rppt@kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org,
 netdev@vger.kernel.org, llvm@lists.linux.dev
Subject: [PATCH v4 1/4] x86/ibt: factor out cfi and fineibt offset
Date: Mon, 3 Mar 2025 21:28:34 +0800
Message-Id: <20250303132837.498938-2-dongml2@chinatelecom.cn>
In-Reply-To: <20250303132837.498938-1-dongml2@chinatelecom.cn>
References: <20250303132837.498938-1-dongml2@chinatelecom.cn>

Currently, the layout of cfi and fineibt is hard coded, and the padding
is fixed at 16 bytes.

Factor out FINEIBT_INSN_OFFSET and CFI_INSN_OFFSET. CFI_INSN_OFFSET is
the offset of cfi, which is the same as FUNCTION_ALIGNMENT when
CALL_PADDING is enabled. FINEIBT_INSN_OFFSET is the offset at which we
place the fineibt preamble, which is 16 for now.

When FUNCTION_ALIGNMENT is bigger than 16, we place the fineibt preamble
in the last 16 bytes of the padding for better performance, which means
the fineibt preamble doesn't use the space that cfi uses.

FINEIBT_INSN_OFFSET is not used in fineibt_caller_start and
fineibt_paranoid_start, as it is always "0x10". Note that we need to
update the offset in fineibt_caller_start and fineibt_paranoid_start if
FINEIBT_INSN_OFFSET changes.
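
As a rough illustration of the offset arithmetic described above (not part
of the patch), here is a small userspace model in C. The struct and helper
names (cfi_layout, fineibt_preamble_shift, kcfi_nop_count,
cfi_layout_selfcheck) are hypothetical and exist only for this sketch; the
constants 16 and 5 and the subtractions are taken from the diff that
follows.

```c
#include <assert.h>

/* Illustrative userspace model only; none of these names exist in the kernel. */
struct cfi_layout {
	int cfi_insn_offset;     /* bytes from __cfi_func to func */
	int fineibt_insn_offset; /* bytes from the FineIBT preamble to func */
};

/* Mirrors the #ifdef CONFIG_CALL_PADDING block the patch adds to asm/cfi.h. */
static struct cfi_layout cfi_layout(int call_padding, int function_alignment)
{
	struct cfi_layout l;

	if (call_padding) {
		l.fineibt_insn_offset = 16;
		l.cfi_insn_offset = function_alignment;
	} else {
		l.fineibt_insn_offset = 0;
		l.cfi_insn_offset = 5;
	}
	return l;
}

/* How far cfi_rewrite_preamble() advances addr before poking the preamble. */
static int fineibt_preamble_shift(struct cfi_layout l)
{
	return l.cfi_insn_offset - l.fineibt_insn_offset;
}

/* NOPs that emit_kcfi() places after the 5-byte "movl $hash, %eax". */
static int kcfi_nop_count(struct cfi_layout l)
{
	return l.cfi_insn_offset - 5;
}

/* Usage: compare 16-byte and 32-byte function alignment. */
static void cfi_layout_selfcheck(void)
{
	/* 16-byte padding: preamble starts at __cfi_func, 11 NOPs as before. */
	assert(fineibt_preamble_shift(cfi_layout(1, 16)) == 0);
	assert(kcfi_nop_count(cfi_layout(1, 16)) == 11);
	/* 32-byte padding: preamble moves into the last 16 bytes. */
	assert(fineibt_preamble_shift(cfi_layout(1, 32)) == 16);
	assert(kcfi_nop_count(cfi_layout(1, 32)) == 27);
}
```

With 16-byte alignment the model reproduces the old hard-coded behaviour
(eleven NOPs, preamble at the start of the padding); with 32-byte alignment
the first 16 bytes stay free for other users.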

Signed-off-by: Menglong Dong
---
v4:
- rebase to the newest tip/x86/core, the fineibt has some updating
---
 arch/x86/include/asm/cfi.h    | 13 +++++++++----
 arch/x86/kernel/alternative.c | 18 +++++++++++-------
 arch/x86/net/bpf_jit_comp.c   | 22 +++++++++++-----------
 3 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/cfi.h b/arch/x86/include/asm/cfi.h
index 2f6a01f098b5..04525f2f6bf2 100644
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -108,6 +108,14 @@ extern bhi_thunk __bhi_args_end[];

 struct pt_regs;

+#ifdef CONFIG_CALL_PADDING
+#define FINEIBT_INSN_OFFSET 16
+#define CFI_INSN_OFFSET CONFIG_FUNCTION_ALIGNMENT
+#else
+#define FINEIBT_INSN_OFFSET 0
+#define CFI_INSN_OFFSET 5
+#endif
+
 #ifdef CONFIG_CFI_CLANG
 enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
 #define __bpfcall
@@ -118,11 +126,8 @@ static inline int cfi_get_offset(void)
 {
 	switch (cfi_mode) {
 	case CFI_FINEIBT:
-		return 16;
 	case CFI_KCFI:
-		if (IS_ENABLED(CONFIG_CALL_PADDING))
-			return 16;
-		return 5;
+		return CFI_INSN_OFFSET;
 	default:
 		return 0;
 	}
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 32e4b801db99..0088d2313f33 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -917,7 +917,7 @@ void __init_or_module noinline apply_seal_endbr(s32 *start, s32 *end)

 		poison_endbr(addr);
 		if (IS_ENABLED(CONFIG_FINEIBT))
-			poison_cfi(addr - 16);
+			poison_cfi(addr);
 	}
 }

@@ -980,12 +980,13 @@ u32 cfi_get_func_hash(void *func)
 {
 	u32 hash;

-	func -= cfi_get_offset();
 	switch (cfi_mode) {
 	case CFI_FINEIBT:
+		func -= FINEIBT_INSN_OFFSET;
 		func += 7;
 		break;
 	case CFI_KCFI:
+		func -= CFI_INSN_OFFSET;
 		func += 1;
 		break;
 	default:
@@ -1372,7 +1373,7 @@ static int cfi_rewrite_preamble(s32 *start, s32 *end)
 		 * have determined there are no indirect calls to it and we
 		 * don't need no CFI either.
 		 */
-		if (!is_endbr(addr + 16))
+		if (!is_endbr(addr + CFI_INSN_OFFSET))
 			continue;

 		hash = decode_preamble_hash(addr, &arity);
@@ -1380,6 +1381,7 @@ static int cfi_rewrite_preamble(s32 *start, s32 *end)
 			 addr, addr, 5, addr))
 			return -EINVAL;

+		addr += (CFI_INSN_OFFSET - FINEIBT_INSN_OFFSET);
 		text_poke_early(addr, fineibt_preamble_start, fineibt_preamble_size);
 		WARN_ON(*(u32 *)(addr + fineibt_preamble_hash) != 0x12345678);
 		text_poke_early(addr + fineibt_preamble_hash, &hash, 4);
@@ -1402,10 +1404,10 @@ static void cfi_rewrite_endbr(s32 *start, s32 *end)
 	for (s = start; s < end; s++) {
 		void *addr = (void *)s + *s;

-		if (!exact_endbr(addr + 16))
+		if (!exact_endbr(addr + CFI_INSN_OFFSET))
 			continue;

-		poison_endbr(addr + 16);
+		poison_endbr(addr + CFI_INSN_OFFSET);
 	}
 }

@@ -1543,12 +1545,12 @@ static void __apply_fineibt(s32 *start_retpoline, s32 *end_retpoline,
 		return;

 	case CFI_FINEIBT:
-		/* place the FineIBT preamble at func()-16 */
+		/* place the FineIBT preamble at func()-FINEIBT_INSN_OFFSET */
 		ret = cfi_rewrite_preamble(start_cfi, end_cfi);
 		if (ret)
 			goto err;

-		/* rewrite the callers to target func()-16 */
+		/* rewrite the callers to target func()-FINEIBT_INSN_OFFSET */
 		ret = cfi_rewrite_callers(start_retpoline, end_retpoline);
 		if (ret)
 			goto err;
@@ -1588,6 +1590,7 @@ static void poison_cfi(void *addr)
 	 */
 	switch (cfi_mode) {
 	case CFI_FINEIBT:
+		addr -= FINEIBT_INSN_OFFSET;
 		/*
 		 * FineIBT prefix should start with an ENDBR.
 		 */
@@ -1607,6 +1610,7 @@ static void poison_cfi(void *addr)
 		break;

 	case CFI_KCFI:
+		addr -= CFI_INSN_OFFSET;
 		/*
 		 * kCFI prefix should start with a valid hash.
 		 */
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 72776dcb75aa..ee86a5df5ffb 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -415,6 +415,12 @@ static int emit_call(u8 **prog, void *func, void *ip);
 static void emit_fineibt(u8 **pprog, u8 *ip, u32 hash, int arity)
 {
 	u8 *prog = *pprog;
+#ifdef CONFIG_CALL_PADDING
+	int i;
+
+	for (i = 0; i < CFI_INSN_OFFSET - 16; i++)
+		EMIT1(0x90);
+#endif

 	EMIT_ENDBR();
 	EMIT3_off32(0x41, 0x81, 0xea, hash); /* subl $hash, %r10d */
@@ -432,20 +438,14 @@ static void emit_fineibt(u8 **pprog, u8 *ip, u32 hash, int arity)
 static void emit_kcfi(u8 **pprog, u32 hash)
 {
 	u8 *prog = *pprog;
+#ifdef CONFIG_CALL_PADDING
+	int i;
+#endif

 	EMIT1_off32(0xb8, hash); /* movl $hash, %eax */
 #ifdef CONFIG_CALL_PADDING
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
-	EMIT1(0x90);
+	for (i = 0; i < CFI_INSN_OFFSET - 5; i++)
+		EMIT1(0x90);
 #endif
 	EMIT_ENDBR();


From patchwork Mon Mar 3 13:28:35 2025
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13998886
From: Menglong Dong
Subject: [PATCH v4 2/4] add per-function metadata storage support
Date: Mon, 3 Mar 2025 21:28:35 +0800
Message-Id: <20250303132837.498938-3-dongml2@chinatelecom.cn>
In-Reply-To: <20250303132837.498938-1-dongml2@chinatelecom.cn>
References: <20250303132837.498938-1-dongml2@chinatelecom.cn>

Currently, there is no low-overhead way to set and get per-function
metadata, which is inconvenient in some situations. Take the BPF
trampoline as an example: we need to create a trampoline for each kernel
function, because some information about the function, such as the BPF
progs and the function arg count, has to be stored in the trampoline. The
performance overhead and memory consumption of creating these trampolines
can be high.

With per-function metadata storage, we can store this information in the
metadata and create a single global BPF trampoline for all kernel
functions. In the global trampoline, we get the information that we need
from the function metadata through the ip (function address) with almost
no overhead.

Another beneficiary can be ftrace. Currently, all the kernel functions
that are enabled by dynamic ftrace are added to a filter hash if there is
more than one callback. A hash lookup then happens whenever a traced
function is called, which has an impact on performance; see
__ftrace_ops_list_func() -> ftrace_ops_test(). With per-function
metadata, we can record in the metadata whether a callback is enabled on
the kernel function.

Support per-function metadata storage in the function padding. The
previous discussion can be found in [1]. Generally speaking, we have two
ways to implement this feature:

1. Create a function metadata array, and prepend an insn that holds the
   index of the function metadata in the array. Store the insn in the
   function padding.
2. Allocate the function metadata with kmalloc(), and prepend an insn
   that holds the pointer to the metadata. Store the insn in the function
   padding.

Compared with way 2, way 1 consumes less space, but we need to do more
work on the global function metadata array. We implement the feature in
way 1.
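
To make way 1 concrete, the following is a minimal userspace sketch in C
(not the patch's code). It assumes a 5-byte "movl $index, %eax" insn
(opcode 0xb8 plus a 32-bit immediate) placed directly in front of the
function entry; the MD_* offsets, the fixed-size table and the md_*
helpers are hypothetical stand-ins for the KFUNC_MD_* macros and kfunc_md
functions, whose real values come from the arch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MD_INSN_OFFSET 5 /* assumed: insn sits directly before the entry */
#define MD_DATA_OFFSET 4 /* assumed: immediate is the last 4 bytes */
#define MD_MAX         8 /* fixed table size, for the sketch only */

struct func_md {
	int users;
	void *func;
};

static struct func_md md_table[MD_MAX];

/* Write "movl $index, %eax" (0xb8 imm32) into the padding before ip. */
static void md_store_index(uint8_t *ip, uint32_t index)
{
	uint8_t *insn = ip - MD_INSN_OFFSET;

	insn[0] = 0xb8;
	memcpy(insn + 1, &index, sizeof(index));
}

/* Look up the metadata for ip; NULL if no index insn has been stored. */
static struct func_md *md_find(uint8_t *ip)
{
	uint32_t index;

	if (*(ip - MD_INSN_OFFSET) != 0xb8)
		return NULL;
	memcpy(&index, ip - MD_DATA_OFFSET, sizeof(index));
	return index < MD_MAX ? &md_table[index] : NULL;
}

/* Usage: fake text with 16 bytes of NOP padding before the "function". */
static void md_selfcheck(void)
{
	uint8_t text[32];
	uint8_t *ip = text + 16;

	memset(text, 0x90, sizeof(text));
	assert(md_find(ip) == NULL);   /* padding is still all NOPs */
	md_store_index(ip, 3);
	md_table[3].func = ip;
	assert(md_find(ip) == &md_table[3]);
	assert(md_find(ip)->func == ip);
}
```

The lookup is a single load from a fixed offset plus an array index, which
is where the "almost no overhead" claim comes from; the cost moves to
maintaining and growing the shared array.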

Link: https://lore.kernel.org/bpf/CADxym3anLzM6cAkn_z71GDd_VeKiqqk1ts=xuiP7pr4PO6USPA@mail.gmail.com/ [1]
Signed-off-by: Menglong Dong
---
v2:
- add supporting for arm64
- split out arch relevant code
- refactor the commit log
---
 include/linux/kfunc_md.h |  25 ++++
 kernel/Makefile          |   1 +
 kernel/trace/Makefile    |   1 +
 kernel/trace/kfunc_md.c  | 239 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 266 insertions(+)
 create mode 100644 include/linux/kfunc_md.h
 create mode 100644 kernel/trace/kfunc_md.c

diff --git a/include/linux/kfunc_md.h b/include/linux/kfunc_md.h
new file mode 100644
index 000000000000..df616f0fcb36
--- /dev/null
+++ b/include/linux/kfunc_md.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_KFUNC_MD_H
+#define _LINUX_KFUNC_MD_H
+
+#include
+
+struct kfunc_md {
+	int users;
+	/* we can use this field later, make sure it is 8-bytes aligned
+	 * for now.
+	 */
+	int pad0;
+	void *func;
+};
+
+extern struct kfunc_md *kfunc_mds;
+
+struct kfunc_md *kfunc_md_find(void *ip);
+struct kfunc_md *kfunc_md_get(void *ip);
+void kfunc_md_put(struct kfunc_md *meta);
+void kfunc_md_put_by_ip(void *ip);
+void kfunc_md_lock(void);
+void kfunc_md_unlock(void);
+
+#endif
diff --git a/kernel/Makefile b/kernel/Makefile
index 87866b037fbe..7435674d5da3 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -108,6 +108,7 @@ obj-$(CONFIG_TRACE_CLOCK) += trace/
 obj-$(CONFIG_RING_BUFFER) += trace/
 obj-$(CONFIG_TRACEPOINTS) += trace/
 obj-$(CONFIG_RETHOOK) += trace/
+obj-$(CONFIG_FUNCTION_METADATA) += trace/
 obj-$(CONFIG_IRQ_WORK) += irq_work.o
 obj-$(CONFIG_CPU_PM) += cpu_pm.o
 obj-$(CONFIG_BPF) += bpf/
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 057cd975d014..9780ee3f8d8d 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -106,6 +106,7 @@ obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o
 obj-$(CONFIG_FPROBE) += fprobe.o
 obj-$(CONFIG_RETHOOK) += rethook.o
 obj-$(CONFIG_FPROBE_EVENTS) += trace_fprobe.o
+obj-$(CONFIG_FUNCTION_METADATA) += kfunc_md.o
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 obj-$(CONFIG_RV) += rv/

diff --git a/kernel/trace/kfunc_md.c b/kernel/trace/kfunc_md.c
new file mode 100644
index 000000000000..7ec25bcf778d
--- /dev/null
+++ b/kernel/trace/kfunc_md.c
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+#include
+#include
+
+#define ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(struct kfunc_md))
+
+static u32 kfunc_md_count = ENTRIES_PER_PAGE, kfunc_md_used;
+struct kfunc_md __rcu *kfunc_mds;
+EXPORT_SYMBOL_GPL(kfunc_mds);
+
+static DEFINE_MUTEX(kfunc_md_mutex);
+
+
+void kfunc_md_unlock(void)
+{
+	mutex_unlock(&kfunc_md_mutex);
+}
+EXPORT_SYMBOL_GPL(kfunc_md_unlock);
+
+void kfunc_md_lock(void)
+{
+	mutex_lock(&kfunc_md_mutex);
+}
+EXPORT_SYMBOL_GPL(kfunc_md_lock);
+
+static u32 kfunc_md_get_index(void *ip)
+{
+	return *(u32 *)(ip - KFUNC_MD_DATA_OFFSET);
+}
+
+static void kfunc_md_init(struct kfunc_md *mds, u32 start, u32 end)
+{
+	u32 i;
+
+	for (i = start; i < end; i++)
+		mds[i].users = 0;
+}
+
+static int kfunc_md_page_order(void)
+{
+	return fls(DIV_ROUND_UP(kfunc_md_count, ENTRIES_PER_PAGE)) - 1;
+}
+
+/* Get next usable function metadata. On success, return the usable
+ * kfunc_md and store the index of it to *index. If no usable kfunc_md is
+ * found in kfunc_mds, a larger array will be allocated.
+ */
+static struct kfunc_md *kfunc_md_get_next(u32 *index)
+{
+	struct kfunc_md *new_mds, *mds;
+	u32 i, order;
+
+	mds = rcu_dereference(kfunc_mds);
+	if (mds == NULL) {
+		order = kfunc_md_page_order();
+		new_mds = (void *)__get_free_pages(GFP_KERNEL, order);
+		if (!new_mds)
+			return NULL;
+		kfunc_md_init(new_mds, 0, kfunc_md_count);
+		/* The first time to initialize kfunc_mds, so it is not
+		 * used anywhere yet, and we can update it directly.
+		 */
+		rcu_assign_pointer(kfunc_mds, new_mds);
+		mds = new_mds;
+	}
+
+	if (likely(kfunc_md_used < kfunc_md_count)) {
+		/* maybe we can manage the used function metadata entry
+		 * with a bit map ?
+		 */
+		for (i = 0; i < kfunc_md_count; i++) {
+			if (!mds[i].users) {
+				kfunc_md_used++;
+				*index = i;
+				mds[i].users++;
+				return mds + i;
+			}
+		}
+	}
+
+	order = kfunc_md_page_order();
+	/* no available function metadata, so allocate a bigger function
+	 * metadata array.
+	 */
+	new_mds = (void *)__get_free_pages(GFP_KERNEL, order + 1);
+	if (!new_mds)
+		return NULL;
+
+	memcpy(new_mds, mds, kfunc_md_count * sizeof(*new_mds));
+	kfunc_md_init(new_mds, kfunc_md_count, kfunc_md_count * 2);
+
+	rcu_assign_pointer(kfunc_mds, new_mds);
+	synchronize_rcu();
+	free_pages((u64)mds, order);
+
+	mds = new_mds + kfunc_md_count;
+	*index = kfunc_md_count;
+	kfunc_md_count <<= 1;
+	kfunc_md_used++;
+	mds->users++;
+
+	return mds;
+}
+
+static int kfunc_md_text_poke(void *ip, void *insn, void *nop)
+{
+	void *target;
+	int ret = 0;
+	u8 *prog;
+
+	target = ip - KFUNC_MD_INSN_OFFSET;
+	mutex_lock(&text_mutex);
+	if (insn) {
+		if (!memcmp(target, insn, KFUNC_MD_INSN_SIZE))
+			goto out;
+
+		if (memcmp(target, nop, KFUNC_MD_INSN_SIZE)) {
+			ret = -EBUSY;
+			goto out;
+		}
+		prog = insn;
+	} else {
+		if (!memcmp(target, nop, KFUNC_MD_INSN_SIZE))
+			goto out;
+		prog = nop;
+	}
+
+	ret = kfunc_md_arch_poke(target, prog);
+out:
+	mutex_unlock(&text_mutex);
+	return ret;
+}
+
+static bool __kfunc_md_put(struct kfunc_md *md)
+{
+	u8 nop_insn[KFUNC_MD_INSN_SIZE];
+
+	if (WARN_ON_ONCE(md->users <= 0))
+		return false;
+
+	md->users--;
+	if (md->users > 0)
+		return false;
+
+	if (!kfunc_md_arch_exist(md->func))
+		return false;
+
+	kfunc_md_arch_nops(nop_insn);
+	/* release the metadata by recovering the function padding to NOPS */
+	kfunc_md_text_poke(md->func, NULL, nop_insn);
+	/* TODO: we need a way to shrink the array "kfunc_mds" */
+	kfunc_md_used--;
+
+	return true;
+}
+
+/* Decrease the reference of the md, release it if "md->users <= 0" */
+void kfunc_md_put(struct kfunc_md *md)
+{
+	mutex_lock(&kfunc_md_mutex);
+	__kfunc_md_put(md);
+	mutex_unlock(&kfunc_md_mutex);
+}
+EXPORT_SYMBOL_GPL(kfunc_md_put);
+
+/* Get a exist metadata by the function address, and NULL will be returned
+ * if not exist.
+ *
+ * NOTE: rcu lock should be held during reading the metadata, and
+ * kfunc_md_lock should be held if writing happens.
+ */
+struct kfunc_md *kfunc_md_find(void *ip)
+{
+	struct kfunc_md *md;
+	u32 index;
+
+	if (kfunc_md_arch_exist(ip)) {
+		index = kfunc_md_get_index(ip);
+		if (WARN_ON_ONCE(index >= kfunc_md_count))
+			return NULL;
+
+		md = rcu_dereference(kfunc_mds) + index;
+		return md;
+	}
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(kfunc_md_find);
+
+void kfunc_md_put_by_ip(void *ip)
+{
+	struct kfunc_md *md;
+
+	mutex_lock(&kfunc_md_mutex);
+	md = kfunc_md_find(ip);
+	if (md)
+		__kfunc_md_put(md);
+	mutex_unlock(&kfunc_md_mutex);
+}
+EXPORT_SYMBOL_GPL(kfunc_md_put_by_ip);
+
+/* Get a exist metadata by the function address, and create one if not
+ * exist. Reference of the metadata will increase 1.
+ *
+ * NOTE: always call this function with kfunc_md_lock held, and all
+ * updating to metadata should also hold the kfunc_md_lock.
+ */ +struct kfunc_md *kfunc_md_get(void *ip) +{ + u8 nop_insn[KFUNC_MD_INSN_SIZE], insn[KFUNC_MD_INSN_SIZE]; + struct kfunc_md *md; + u32 index; + + md = kfunc_md_find(ip); + if (md) { + md->users++; + return md; + } + + md = kfunc_md_get_next(&index); + if (!md) + return NULL; + + kfunc_md_arch_pretend(insn, index); + kfunc_md_arch_nops(nop_insn); + + if (kfunc_md_text_poke(ip, insn, nop_insn)) { + kfunc_md_used--; + md->users = 0; + return NULL; + } + md->func = ip; + + return md; +} +EXPORT_SYMBOL_GPL(kfunc_md_get); From patchwork Mon Mar 3 13:28:36 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Menglong Dong X-Patchwork-Id: 13998893 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0B031C282CD for ; Mon, 3 Mar 2025 13:57:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=e0whn35vPm8J+vjBTbfA5u1AHwSTrw7swcARc6I++Hc=; b=x2b2XoQS/sSMcITlQ+2KBCsM71 RyuG4QI01yBksIGZmPVR50w7960f9imXQ/5/y6u+5ooon12OaV7kBtNVC6I+KJfpQ/IHaGrmjE4gz pUFrp4MyaskvS05v78sclejdmAXplqw0o1nRMMTB8QP8/OgDwAYh3bGX7ja6fRxbabsPT56EZ0F17 pjXSkRcVEjKAPdl8B3gocGh/qSRdcXbxMMW6rpu0k9eCfiH/VqevERHVaJo/0txUqW8EYUMk/uvqh S9CAzQlOsgCnBkHixZEYGKzQm02GjZUS6UnxwYCAxpfLivMoxVzNvubOOMyeMk7ffLRdqjpk39irU FQZ101mQ==; Received: from localhost ([::1] 
helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tp6IN-00000000zzG-0aNS; Mon, 03 Mar 2025 13:57:23 +0000 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tp5sj-00000000vUy-13DR for linux-arm-kernel@lists.infradead.org; Mon, 03 Mar 2025 13:30:54 +0000 Received: by mail-pl1-x641.google.com with SMTP id d9443c01a7336-2232b12cd36so58594285ad.0 for ; Mon, 03 Mar 2025 05:30:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1741008653; x=1741613453; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=e0whn35vPm8J+vjBTbfA5u1AHwSTrw7swcARc6I++Hc=; b=ezIBxddQV/PXltfPFFd1cNrOfpnGY6zdeIIaqrNA655DG0EZz8PTR7OQ5Usgm2HnKX 3lCOQFjvkOHamBLFP6O1syJCpmPtKDrYvAsEWb6MQPUtbCVrHMdMCyvnd07m+ejEesGl B+z8O16jgo9jVvzAIsVz78ic/ZiZhCEFUNUmRAwnEMFrVBekKiNV8tsfeJXoY4ggo7Pr DoY3FyfCnFeAdySnCI/SvDFJgJNVZfztjMiMIlCbfVV68t1X50JPDu/fBNvuU4H3q2KL XuSfk496XDTrpEEUPGVsm5HKu2bSI/AG03yn1U7o1iOh8iOLRa+PVoLVNuo/7OqzBEd1 vG/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741008653; x=1741613453; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=e0whn35vPm8J+vjBTbfA5u1AHwSTrw7swcARc6I++Hc=; b=KHysmFwjxRxeinRw3wW1CJ7TsZaosXMcgTu2kklnArzNv3RRER75ZXhXGhaaXzTdLS TzttdNQpLaNt7fxafp5idsNi2me8W8xVQGFBjEqMUPdaXw0zHyxGJOqXfs/Aolpe9t8l Vi1P9QOPfz61BKOP6QEw15/HQNFHzzZpBoWWGv9MAOXLbhWAaLHFbvVdjynA4geJ55f7 FeTbod6dTd1mBLUx+zYXPf62dCaNV+F4cE9s4TUpAqA2RoE0UB/Kw4rOm1kYDQXqLeOm 1bjg0VZfcGnB+snVLFatgwVig4QttRmiA5kBbJoNQer9qrWqbDB1PMLrNuD5swmoXKe3 GzYw== X-Forwarded-Encrypted: i=1; 
AJvYcCUQxD/rL8bgBtyQ8FukYDW4xr93QpQMJ9zgGS5LIybvzwf/pGrxLhCIO21jK1eyTRTyuZ2HfyV8BqZ0qWwMYttE@lists.infradead.org X-Gm-Message-State: AOJu0Yx9LSjr65jsa7wTuPa6EQaR1r9xchE8jYZ3cHpvSCfrxDeM7iQx IQGN6hI/fDRKYuPbjxMkWU494PD8KIixhTLwhezYnRJdUHNSOUW2 X-Gm-Gg: ASbGncux6jWcLTFxDU5vVoOk/Uqyzfs1dAexjTdNVYXjLOfhweAfdLo7q5zkOud3cGz uOcirb+F7y9vtPf65D0FojASWygYqMWlBRDU9EN1/7Vi4YzKHqBrvslcX4M7rbJWvJZSewa7Owo xv5A+cWvWCLpkX6xrJXrAV6CMf96WPyDiemwxZNhnAXOXy2CXrTIfgSPtZv8KD2z9chtJnn1jWo 1R31ehMTGV3brEKSflnnFwzEzcUU9NDIpqT0NcxtdMkpypNUqyW9kH67y74WMlIktdWCjeCpIEw +aMQoiapElS4PAWcSG9UOw512kfzB9wIexzFpvj92vQ9i/cOQxRZNcu3jyQIzA== X-Google-Smtp-Source: AGHT+IECVEaRvzz7jFgM9Xs6Qsz7AN1fGJ1Lrd2ZRMORcgHaHlhwWOyshFl83O3butQP+uNtOa6yXg== X-Received: by 2002:a17:902:e5c5:b0:223:4bd6:3863 with SMTP id d9443c01a7336-22368f6a174mr217254605ad.10.1741008652539; Mon, 03 Mar 2025 05:30:52 -0800 (PST) Received: from localhost.localdomain ([43.129.244.20]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-223505359b8sm77297035ad.253.2025.03.03.05.30.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Mar 2025 05:30:52 -0800 (PST) From: Menglong Dong X-Google-Original-From: Menglong Dong To: peterz@infradead.org, rostedt@goodmis.org, mark.rutland@arm.com, alexei.starovoitov@gmail.com Cc: catalin.marinas@arm.com, will@kernel.org, mhiramat@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, eddyz87@gmail.com, yonghong.song@linux.dev, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me, jolsa@kernel.org, davem@davemloft.net, dsahern@kernel.org, mathieu.desnoyers@efficios.com, nathan@kernel.org, nick.desaulniers+lkml@gmail.com, morbo@google.com, samitolvanen@google.com, kees@kernel.org, dongml2@chinatelecom.cn, akpm@linux-foundation.org, riel@surriel.com, rppt@kernel.org, linux-arm-kernel@lists.infradead.org, 
 linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, llvm@lists.linux.dev
Subject: [PATCH v4 3/4] x86: implement per-function metadata storage for x86
Date: Mon, 3 Mar 2025 21:28:36 +0800
Message-Id: <20250303132837.498938-4-dongml2@chinatelecom.cn>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250303132837.498938-1-dongml2@chinatelecom.cn>
References: <20250303132837.498938-1-dongml2@chinatelecom.cn>
MIME-Version: 1.0

With CONFIG_CALL_PADDING enabled, there are 16 bytes of padding before
every kernel function. Several kernel features use this space, such as
MITIGATION_CALL_DEPTH_TRACKING, CFI_CLANG and FINEIBT.

From what I found, MITIGATION_CALL_DEPTH_TRACKING consumes the tail 9
bytes of the function padding, CFI_CLANG consumes the head 5 bytes, and
FINEIBT consumes all 16 bytes when it is enabled. So there is no space
left for us if MITIGATION_CALL_DEPTH_TRACKING and CFI_CLANG are both
enabled, or if FINEIBT is enabled.

On x86, we need 5 bytes to prepend a "mov $index, %eax" insn (opcode
0xB8), which can hold a 4-byte index. So we have the following logic:

1. use the head 5 bytes if CFI_CLANG is not enabled
2. use the tail 5 bytes if MITIGATION_CALL_DEPTH_TRACKING and FINEIBT
   are not enabled
3. compile the kernel with FUNCTION_ALIGNMENT_32B otherwise

In the third case, kernel functions are 32-byte aligned and there are
32 bytes of padding before each function. In my testing, the text size
barely increased in this case, which is surprising.
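
The three cases above can be sketched in plain C as follows. This is only an illustration: the patch itself selects the offset with preprocessor macros in arch/x86/include/asm/ftrace.h, and the function names kfunc_md_insn_offset(), kfunc_md_encode() and kfunc_md_decode() are hypothetical userspace stand-ins, not kernel APIs. The padding sizes (16 and 32) are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define KFUNC_MD_INSN_SIZE 5	/* 1 opcode byte + 4 index bytes */
#define MOV_EAX_IMM32	   0xB8	/* opcode of "mov $imm32, %eax" */

/*
 * Hypothetical runtime version of the offset selection. Returns how many
 * bytes before the function entry the metadata instruction starts.
 */
static int kfunc_md_insn_offset(bool cfi_clang, bool call_thunks, bool fineibt)
{
	if ((cfi_clang && call_thunks) || fineibt)
		return 32 - 8;			/* case 3: 32-byte padding, bytes [8, 13) */
	if (cfi_clang)
		return KFUNC_MD_INSN_SIZE;	/* case 2: tail 5 bytes of 16 */
	return 16;				/* case 1: head 5 bytes of 16 */
}

/* Encode the index as the immediate of the mov instruction. */
static void kfunc_md_encode(uint8_t insn[KFUNC_MD_INSN_SIZE], uint32_t index)
{
	insn[0] = MOV_EAX_IMM32;
	memcpy(insn + 1, &index, sizeof(index));
}

/* Return true and fill *index if the slot holds a metadata instruction. */
static bool kfunc_md_decode(const uint8_t *entry, int offset, uint32_t *index)
{
	assert(offset >= KFUNC_MD_INSN_SIZE);
	if (entry[-offset] != MOV_EAX_IMM32)
		return false;
	memcpy(index, entry - offset + 1, sizeof(*index));
	return true;
}
```

Looking up the metadata for a function address then costs one byte compare and one 4-byte load, which is where the "almost no overhead" claim comes from.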

With 16 bytes of padding:

-rwxr-xr-x 1 401190688 x86-dev/vmlinux*
-rw-r--r-- 1 251068 x86-dev/vmlinux.a
-rw-r--r-- 1 851892992 x86-dev/vmlinux.o
-rw-r--r-- 1 12395008 x86-dev/arch/x86/boot/bzImage

With 32 bytes of padding:

-rwxr-xr-x 1 401318128 x86-dev/vmlinux*
-rw-r--r-- 1 251154 x86-dev/vmlinux.a
-rw-r--r-- 1 853636704 x86-dev/vmlinux.o
-rw-r--r-- 1 12509696 x86-dev/arch/x86/boot/bzImage

I believe the way I tested is right, so this is good news for us.

In the third case, the padding space is laid out like this when FineIBT
is enabled:

__cfi_func:
	mov	--	5	-- cfi, not used anymore
	nop
	nop
	nop
	mov	--	5	-- function metadata
	nop
	nop
	nop
	fineibt	--	16	-- fineibt
func:
	nopw	--	4
	......

I tested FineIBT with the "cfi=fineibt" cmdline, and it works well
together with FUNCTION_METADATA enabled. I also measured the cost of
setting metadata for all kernel functions: it takes 0.7s for 70k+
functions, which is not bad.

I can't find a machine that supports IBT, so I didn't test IBT itself.
I'd appreciate it if someone could do this testing for me.

Signed-off-by: Menglong Dong
---
v3:
- select FUNCTION_ALIGNMENT_32B on case3, instead of extra 5-bytes
---
 arch/x86/Kconfig              | 18 ++++++++++++
 arch/x86/include/asm/ftrace.h | 54 +++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5c277261507e..b0614188c80b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2518,6 +2518,24 @@ config PREFIX_SYMBOLS
 	def_bool y
 	depends on CALL_PADDING && !CFI_CLANG

+config FUNCTION_METADATA
+	bool "Per-function metadata storage support"
+	default y
+	depends on CC_HAS_ENTRY_PADDING && OBJTOOL
+	select CALL_PADDING
+	select FUNCTION_ALIGNMENT_32B if ((CFI_CLANG && CALL_THUNKS) || FINEIBT)
+	help
+	  Support per-function metadata storage for kernel functions, and
+	  get the metadata of the function by its address with almost no
+	  overhead.
+
+	  The index of the metadata will be stored in the function padding
+	  and consumes 5 bytes. FUNCTION_ALIGNMENT_32B will be selected if
+	  "(CFI_CLANG && CALL_THUNKS) || FINEIBT" to make sure there is
+	  enough available padding space for this function. However, it
+	  seems that the text size barely changes compared with
+	  FUNCTION_ALIGNMENT_16B.
+
 menuconfig CPU_MITIGATIONS
 	bool "Mitigations for CPU vulnerabilities"
 	default y
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index f2265246249a..700bb729e949 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -4,6 +4,28 @@

 #include

+#ifdef CONFIG_FUNCTION_METADATA
+#if (defined(CONFIG_CFI_CLANG) && defined(CONFIG_CALL_THUNKS)) || (defined(CONFIG_FINEIBT))
+	/* the CONFIG_FUNCTION_PADDING_BYTES is 32 in this case, use the
+	 * range: [align + 8, align + 13].
+	 */
+	#define KFUNC_MD_INSN_OFFSET		(CONFIG_FUNCTION_PADDING_BYTES - 8)
+	#define KFUNC_MD_DATA_OFFSET		(CONFIG_FUNCTION_PADDING_BYTES - 9)
+#else
+	#ifdef CONFIG_CFI_CLANG
+		/* use the space that CALL_THUNKS is supposed to use */
+		#define KFUNC_MD_INSN_OFFSET	(5)
+		#define KFUNC_MD_DATA_OFFSET	(4)
+	#else
+		/* use the space that CFI_CLANG is supposed to use */
+		#define KFUNC_MD_INSN_OFFSET	(CONFIG_FUNCTION_PADDING_BYTES)
+		#define KFUNC_MD_DATA_OFFSET	(CONFIG_FUNCTION_PADDING_BYTES - 1)
+	#endif
+#endif
+
+#define KFUNC_MD_INSN_SIZE		(5)
+#endif
+
 #ifdef CONFIG_FUNCTION_TRACER
 #ifndef CC_USING_FENTRY
 # error Compiler does not support fentry?
@@ -156,4 +178,36 @@ static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
 #endif /* !COMPILE_OFFSETS */
 #endif /* !__ASSEMBLY__ */

+#if !defined(__ASSEMBLY__) && defined(CONFIG_FUNCTION_METADATA)
+#include
+
+static inline bool kfunc_md_arch_exist(void *ip)
+{
+	return *(u8 *)(ip - KFUNC_MD_INSN_OFFSET) == 0xB8;
+}
+
+static inline void kfunc_md_arch_pretend(u8 *insn, u32 index)
+{
+	*insn = 0xB8;
+	*(u32 *)(insn + 1) = index;
+}
+
+static inline void kfunc_md_arch_nops(u8 *insn)
+{
+	*(insn++) = BYTES_NOP1;
+	*(insn++) = BYTES_NOP1;
+	*(insn++) = BYTES_NOP1;
+	*(insn++) = BYTES_NOP1;
+	*(insn++) = BYTES_NOP1;
+}
+
+static inline int kfunc_md_arch_poke(void *ip, u8 *insn)
+{
+	text_poke(ip, insn, KFUNC_MD_INSN_SIZE);
+	text_poke_sync();
+	return 0;
+}
+
+#endif
+
 #endif /* _ASM_X86_FTRACE_H */

From patchwork Mon Mar 3 13:28:37 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13998894
From: Menglong Dong
To: peterz@infradead.org, rostedt@goodmis.org, mark.rutland@arm.com, alexei.starovoitov@gmail.com
Cc: catalin.marinas@arm.com, will@kernel.org, mhiramat@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, eddyz87@gmail.com, yonghong.song@linux.dev, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me, jolsa@kernel.org, davem@davemloft.net,
 dsahern@kernel.org, mathieu.desnoyers@efficios.com, nathan@kernel.org, nick.desaulniers+lkml@gmail.com, morbo@google.com, samitolvanen@google.com, kees@kernel.org, dongml2@chinatelecom.cn, akpm@linux-foundation.org, riel@surriel.com, rppt@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, llvm@lists.linux.dev
Subject: [PATCH v4 4/4] arm64: implement per-function metadata storage for arm64
Date: Mon, 3 Mar 2025 21:28:37 +0800
Message-Id: <20250303132837.498938-5-dongml2@chinatelecom.cn>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250303132837.498938-1-dongml2@chinatelecom.cn>
References: <20250303132837.498938-1-dongml2@chinatelecom.cn>
MIME-Version: 1.0

ftrace already uses per-function storage when
CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS is enabled: since commit
baaf553d3bc3 ("arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS"), it
stores the pointer of the callback directly in the function padding,
which consumes 8 bytes. So we can store the index directly in the
function padding too, without prepending an instruction.

With CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS enabled, functions are 8-byte
aligned, and we compile the kernel with an extra 8 bytes (2 NOPs) of
padding space. Otherwise, functions are 4-byte aligned, and only an
extra 4 bytes (1 NOP) is needed.
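
As a rough illustration of how these NOP counts combine with the ones ftrace already needs, the arithmetic done by the arch/arm64/Makefile change in this patch can be sketched in C. The struct pfe_flag and both function names are hypothetical and exist only for this example; the real computation is done with make conditionals and bc.

```c
#include <assert.h>
#include <stdbool.h>

/* Values passed as -fpatchable-function-entry=<total>,<before>. */
struct pfe_flag {
	int total;	/* total NOPs emitted per function */
	int before;	/* NOPs placed before the function entry */
};

/* Extra pre-function NOPs reserved for the metadata index. */
static int kfunc_md_padding_nops(bool metadata, bool align_8b)
{
	if (!metadata)
		return 0;
	return align_8b ? 2 : 1;
}

/* Mirrors the ifeq / else ifeq chain of the Makefile hunk. */
static struct pfe_flag arm64_pfe_flag(bool metadata, bool align_8b,
				      bool call_ops, bool with_args)
{
	int pad = kfunc_md_padding_nops(metadata, align_8b);
	struct pfe_flag f = { 0, 0 };

	if (call_ops) {
		pad += 2;		/* 8 bytes already used by ftrace */
		f.total = pad + 2;	/* plus 2 NOPs after the entry */
		f.before = pad;
	} else if (with_args) {
		f.total = pad + 2;
		f.before = pad;
	} else if (metadata) {
		f.total = pad;		/* all NOPs sit before the entry */
		f.before = pad;
	}
	assert(f.before <= f.total);
	return f;
}
```

With metadata disabled this reproduces the previous flags: "4,2" for CALL_OPS and 2 NOPs (none before the entry) for DYNAMIC_FTRACE_WITH_ARGS.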

However, we hit the same problem that Mark described in the commit
above: the function padding can't be used together with CFI_CLANG,
because it makes clang compute a wrong offset to the pre-function type
hash. He said that he was working with others on this problem 2 years
ago. Hi Mark, is there any progress on this problem?

Signed-off-by: Menglong Dong
---
 arch/arm64/Kconfig              | 15 +++++++++++++++
 arch/arm64/Makefile             | 23 ++++++++++++++++++++--
 arch/arm64/include/asm/ftrace.h | 34 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/ftrace.c      | 13 +++++++++++--
 4 files changed, 81 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 940343beb3d4..7ed80f5eb267 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1536,6 +1536,21 @@ config NODES_SHIFT
 	  Specify the maximum number of NUMA Nodes available on the target
 	  system. Increases memory reserved to accommodate various tables.

+config FUNCTION_METADATA
+	bool "Per-function metadata storage support"
+	default y
+	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE if !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
+	depends on !CFI_CLANG
+	help
+	  Support per-function metadata storage for kernel functions, and
+	  get the metadata of the function by its address with almost no
+	  overhead.
+
+	  The index of the metadata will be stored in the function padding,
+	  which will consume 4-bytes. If FUNCTION_ALIGNMENT_8B is enabled,
+	  extra 8-bytes function padding will be reserved during compiling.
+	  Otherwise, only extra 4-bytes function padding is needed.
+
 source "kernel/Kconfig.hz"

 config ARCH_SPARSEMEM_ENABLE
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 2b25d671365f..2df2b0f4dd90 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -144,12 +144,31 @@ endif

 CHECKFLAGS += -D__aarch64__

+ifeq ($(CONFIG_FUNCTION_METADATA),y)
+  ifeq ($(CONFIG_FUNCTION_ALIGNMENT_8B),y)
+  __padding_nops := 2
+  else
+  __padding_nops := 1
+  endif
+else
+  __padding_nops := 0
+endif
+
 ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS),y)
+  __padding_nops := $(shell echo $(__padding_nops) + 2 | bc)
   KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
-  CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
+  CC_FLAGS_FTRACE := -fpatchable-function-entry=$(shell echo $(__padding_nops) + 2 | bc),$(__padding_nops)
 else ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_ARGS),y)
+  CC_FLAGS_FTRACE := -fpatchable-function-entry=$(shell echo $(__padding_nops) + 2 | bc),$(__padding_nops)
   KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
-  CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+else ifeq ($(CONFIG_FUNCTION_METADATA),y)
+  CC_FLAGS_FTRACE += -fpatchable-function-entry=$(__padding_nops),$(__padding_nops)
+  ifneq ($(CONFIG_FUNCTION_TRACER),y)
+    KBUILD_CFLAGS += $(CC_FLAGS_FTRACE)
+    # some files need to remove this cflag when CONFIG_FUNCTION_TRACER
+    # is not enabled, so we need to export it here
+    export CC_FLAGS_FTRACE
+  endif
 endif

 ifeq ($(CONFIG_KASAN_SW_TAGS), y)
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index bfe3ce9df197..aa3eaa91bf82 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -24,6 +24,16 @@
 #define FTRACE_PLT_IDX		0
 #define NR_FTRACE_PLTS		1

+#ifdef CONFIG_FUNCTION_METADATA
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+#define KFUNC_MD_DATA_OFFSET	(AARCH64_INSN_SIZE * 3)
+#else
+#define KFUNC_MD_DATA_OFFSET	AARCH64_INSN_SIZE
+#endif
+#define KFUNC_MD_INSN_SIZE	AARCH64_INSN_SIZE
+#define KFUNC_MD_INSN_OFFSET	KFUNC_MD_DATA_OFFSET
+#endif
+
 /*
  * Currently, gcc tends to save the link register after the local variables
  * on the stack. This causes the max stack tracer to report the function
@@ -216,6 +226,30 @@ static inline bool arch_syscall_match_sym_name(const char *sym,
 	 */
 	return !strcmp(sym + 8, name);
 }
+
+#ifdef CONFIG_FUNCTION_METADATA
+#include
+
+static inline bool kfunc_md_arch_exist(void *ip)
+{
+	return !aarch64_insn_is_nop(*(u32 *)(ip - KFUNC_MD_INSN_OFFSET));
+}
+
+static inline void kfunc_md_arch_pretend(u8 *insn, u32 index)
+{
+	*(u32 *)insn = index;
+}
+
+static inline void kfunc_md_arch_nops(u8 *insn)
+{
+	*(u32 *)insn = aarch64_insn_gen_nop();
+}
+
+static inline int kfunc_md_arch_poke(void *ip, u8 *insn)
+{
+	return aarch64_insn_patch_text_nosync(ip, *(u32 *)insn);
+}
+#endif
 #endif /* ifndef __ASSEMBLY__ */

 #ifndef __ASSEMBLY__
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index d7c0d023dfe5..4191ff0037f5 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -88,8 +88,10 @@ unsigned long ftrace_call_adjust(unsigned long addr)
 	 * to `BL `, which is at `addr + 4` bytes in either case.
 	 *
 	 */
-	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
-		return addr + AARCH64_INSN_SIZE;
+	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)) {
+		addr += AARCH64_INSN_SIZE;
+		goto out;
+	}

 	/*
 	 * When using patchable-function-entry with pre-function NOPs, addr is
@@ -139,6 +141,13 @@ unsigned long ftrace_call_adjust(unsigned long addr)
 	/* Skip the first NOP after function entry */
 	addr += AARCH64_INSN_SIZE;

+out:
+	if (IS_ENABLED(CONFIG_FUNCTION_METADATA)) {
+		if (IS_ENABLED(CONFIG_FUNCTION_ALIGNMENT_8B))
+			addr += 2 * AARCH64_INSN_SIZE;
+		else
+			addr += AARCH64_INSN_SIZE;
+	}
 	return addr;
 }
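
The arm64 helpers above treat "the padding word is not a NOP" as "metadata is present", and store the index as the raw 32-bit word. A minimal userspace sketch of that scheme follows. It assumes a plain byte buffer stands in for kernel text, and the names read_padding_word(), kfunc_md_exists(), kfunc_md_store() and kfunc_md_clear() are hypothetical. The constant 0xd503201f is the A64 encoding of NOP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AARCH64_INSN_SIZE	4
#define AARCH64_NOP		0xd503201fu	/* A64 encoding of NOP */

/* Read the 32-bit word that starts `offset` bytes before the entry. */
static uint32_t read_padding_word(const uint8_t *entry, size_t offset)
{
	uint32_t word;

	memcpy(&word, entry - offset, sizeof(word));
	return word;
}

/* Metadata is considered present when the slot no longer holds a NOP. */
static bool kfunc_md_exists(const uint8_t *entry, size_t offset)
{
	return read_padding_word(entry, offset) != AARCH64_NOP;
}

/* Store the raw index; it must not collide with the NOP encoding. */
static void kfunc_md_store(uint8_t *entry, size_t offset, uint32_t index)
{
	assert(index != AARCH64_NOP);
	memcpy(entry - offset, &index, sizeof(index));
}

/* Restore the NOP so the slot reads as empty again. */
static void kfunc_md_clear(uint8_t *entry, size_t offset)
{
	uint32_t nop = AARCH64_NOP;

	memcpy(entry - offset, &nop, sizeof(nop));
}
```

In this sketch the only index value that cannot be represented is the NOP encoding itself, which is why the store helper asserts against it.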