From patchwork Sat Dec 24 11:43:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081370 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 50B12C4332F for ; Sat, 24 Dec 2022 15:24:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=uCAaSFmXAE2oPw0mT1wuTXkJJA09P0rKThf7Sy4zsN4=; b=DcgtDJUh0mvdJZ UgJc39gtfXjHD2x+kdBHu1CntxXXljNoln8Rb6vrZhHn0MEzUZxUsf4cOnU2+PKyHFOyuN6DProsx aVFv6f4AfERXtnJdbb9K9pSDUzRYye7SvbT7rd6i2ifwD5InoAOoNYtfmtrpSEdnyp1C/fYtXGDAG YcZjJPSgzztgEvUPstnUcx3Dq16jreRabWGA/3AaDSLxdtKPo0QfLS9KNLfnCB0FC+iqSuKV4JxOP chwg5TBMnjlFTCBKUUNCkx6pG6Se/dTwC+O8/L7CaQYunEsEUwrMN5WSjHNxUkysjvI/gkmPHgaV4 ti0L5JOYFCIZN9AsNuRA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96Nu-0014AL-86; Sat, 24 Dec 2022 15:24:26 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5ui-PF for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:58 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S5; Sat, 24 Dec 2022 19:43:27 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 1/9] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES feature Date: Sat, 24 Dec 2022 19:43:07 +0800 Message-Id: <20221224114315.850130-2-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S5 X-Coremail-Antispam: 1UD129KBjvJXoW3XrykCF1UCr4UJry7JryrJFb_yoW7KFy7pF s5Cwn5WrWkA3s3G3y3A3WkurWFqan5Wa17KrWDZry5Jw45tr45Awn2grWUXrWxJrs0gryf WF1Fyry5Cry3X3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmK14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1l84 ACjcxK6I8E87Iv67AKxVWxJr0_GcWl84ACjcxK6I8E87Iv6xkF7I0E14v26F4UJVW0owAa c4AC62xK8xCEY4vEwIxC4wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzV Aqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S 6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxw ACI402YVCY1x02628vn2kIc2xKxwCY02Avz4vE14v_Gw4l42xK82IYc2Ij64vIr41l4I8I 3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxV WUGVWUWwC2zVAF1VAY17CE14v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAF wI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E14v26r4j6F4UMIIF0xvE42xK8VAvwI8IcI k0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j 6r4UJbIYCTnIWIevJa73UjIFyTuYvjfU0c_-DUUUU X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiCQYKE2OmnPBfhgABsn X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_269104_557D1D5F X-CRM114-Status: GOOD ( 15.88 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang Prepare skeleton to implement optimized kprobe on RISCV, it is consist of Makfile, Kconfig and some architecture specific files: kprobe.h and opt.c opt.c include some macro, type definition and functions required by kprobe framework, opt_trampoline.S provide a piece of assembly code template used to construct the detour buffer as the target of long jump instruction(s) for each optimzed kprobe. Since the jump range of PC-relative instruction JAL is +/-2M, that is too small to reach the detour buffer, hence the foudamental idea to address OPTPROBES on RISCV is replace 'EBREAK' with 'AUIPC/JALR'. which means it needs to clobber one more instruction beside the kprobe instruction, furthermore, RISCV supports hybird RVI and RVC in single kernel binary, so in theory a pair of 'AUIPC/JALR' is about to clobber 10 bytes(3 RVC and 1 RVI, 2 bytes is padding for alignment) at worst case. The second hardsome problem is looking for one integer register as the destination of 'AUIPC/JALR' without any side-effect. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/kprobes.h | 32 ++++++++++++++ arch/riscv/kernel/probes/Makefile | 1 + arch/riscv/kernel/probes/opt.c | 51 +++++++++++++++++++++++ arch/riscv/kernel/probes/opt_trampoline.S | 12 ++++++ 5 files changed, 97 insertions(+) create mode 100644 arch/riscv/kernel/probes/opt.c create mode 100644 arch/riscv/kernel/probes/opt_trampoline.S diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index e2b656043abf..5fa3094d55bc 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -103,6 +103,7 @@ config RISCV select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL select HAVE_KRETPROBES if !XIP_KERNEL select HAVE_RETHOOK if !XIP_KERNEL + select HAVE_OPTPROBES if !XIP_KERNEL select HAVE_MOVE_PMD select HAVE_MOVE_PUD select HAVE_PCI diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h index e7882ccb0fd4..e85130c9112f 100644 --- a/arch/riscv/include/asm/kprobes.h +++ b/arch/riscv/include/asm/kprobes.h @@ -41,5 +41,37 @@ int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr); bool kprobe_breakpoint_handler(struct pt_regs *regs); bool kprobe_single_step_handler(struct pt_regs *regs); +#ifdef CONFIG_OPTPROBES + +/* optinsn template addresses */ +extern __visible kprobe_opcode_t optprobe_template_entry[]; +extern __visible kprobe_opcode_t optprobe_template_end[]; + +#define MAX_OPTINSN_SIZE \ + ((unsigned long)optprobe_template_end - \ + (unsigned long)optprobe_template_entry) + +/* + * For RVI and RVC hybird encoding kernel, althought long jump just needs + * 2 RVI instructions(AUIPC+JALR), optimized instructions is 10 bytes long + * at most to ensure no RVI would be truncated actually, so it means four + * combinations: + * - 2 RVI + * - 4 RVC + * - 2 RVC + 1 RVI + * - 3 RVC + 1 RVI (truncated, need padding) + */ +#define MAX_COPIED_INSN 4 +#define MAX_OPTIMIZED_LENGTH 10 + +struct arch_optimized_insn { + kprobe_opcode_t copied_insn[MAX_COPIED_INSN]; + /* detour code buffer */ + kprobe_opcode_t *insn; + unsigned long length; + int rd; +}; + +#endif /* CONFIG_OPTPROBES */ #endif /* CONFIG_KPROBES */ #endif /* _ASM_RISCV_KPROBES_H */ diff --git a/arch/riscv/kernel/probes/Makefile b/arch/riscv/kernel/probes/Makefile index c40139e9ca47..3d837eb5f9be 100644 --- a/arch/riscv/kernel/probes/Makefile +++ b/arch/riscv/kernel/probes/Makefile @@ -3,4 +3,5 @@ obj-$(CONFIG_KPROBES) += kprobes.o decode-insn.o simulate-insn.o obj-$(CONFIG_RETHOOK) += rethook.o rethook_trampoline.o obj-$(CONFIG_KPROBES_ON_FTRACE) += ftrace.o obj-$(CONFIG_UPROBES) += uprobes.o decode-insn.o simulate-insn.o +obj-$(CONFIG_OPTPROBES) += opt.o opt_trampoline.o CFLAGS_REMOVE_simulate-insn.o = $(CC_FLAGS_FTRACE) diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c new file mode 100644 index 000000000000..56c8a227c857 --- /dev/null +++ b/arch/riscv/kernel/probes/opt.c @@ -0,0 +1,51 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Kernel Probes Jump Optimization (Optprobes) + * + * Copyright (C) Guokai Chen, 2022 + * Author: Guokai Chen chenguokai17@mails.ucas.ac.cn + */ + +#define pr_fmt(fmt) "optprobe: " fmt + +#include +#include + +int arch_prepared_optinsn(struct arch_optimized_insn *optinsn) +{ + return 0; +} + +int arch_check_optimized_kprobe(struct optimized_kprobe *op) +{ + return 0; +} + +int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, + struct kprobe *orig) +{ + return 0; +} + +void arch_remove_optimized_kprobe(struct optimized_kprobe *op) +{ +} + +void arch_optimize_kprobes(struct list_head *oplist) +{ +} + +void arch_unoptimize_kprobes(struct list_head *oplist, + struct list_head *done_list) +{ +} + +void arch_unoptimize_kprobe(struct optimized_kprobe *op) +{ +} + +int arch_within_optimized_kprobe(struct optimized_kprobe *op, + kprobe_opcode_t *addr) +{ + return 0; +} diff --git a/arch/riscv/kernel/probes/opt_trampoline.S b/arch/riscv/kernel/probes/opt_trampoline.S new file mode 100644 index 000000000000..16160c4367ff --- /dev/null +++ b/arch/riscv/kernel/probes/opt_trampoline.S @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2022 Guokai Chen + */ + +#include + +#incldue +#include + +SYM_ENTRY(optprobe_template_entry, SYM_L_GLOBAL, SYM_A_NONE) +SYM_ENTRY(optprobe_template_end, SYM_L_GLOBAL, SYM_A_NONE) From patchwork Sat Dec 24 11:43:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081369 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3B09AC4332F for ; Sat, 24 Dec 2022 15:24:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=283rYX2/l8wLotYDQageGvltSOmPWlZHQ069wldDao8=; b=3NHlIzg5AIE29V D8giPFCVWaJz4XmK81t9TeEZpqOGf/c3BtNn5PzpAQlGtlG03xyAD457v5SCKRX3L8o3AvR4I2/9d NV7UQgIbyxI894rfFzLMPuZg7wPYRuGw835p66kUc4VGGDBlUY8jVb8j1+ArH+5O6DbaEu9/N3ezb 9YCcSfh+aYv1k0YXk//5GNAVEOwP7P60iNpiDNNU3OWOIvcvZXpGsUQeNI1OHTEmap8YbGhgNNzcH HMzy1OoQ/WRVZmWvOx7fJHwiXcnX0Ks9fiepLMpS02g5pcNDbYKwxOOjgoL6Lay/VC2coG542evN2 PH8hMC0VYrRA5b93O3Ig==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96NX-0013xT-TP; Sat, 24 Dec 2022 15:24:04 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wP-00H5uf-Kk for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:57 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S6; Sat, 24 Dec 2022 19:43:27 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 2/9] riscv/kprobe: Allocate detour buffer from module area Date: Sat, 24 Dec 2022 19:43:08 +0800 Message-Id: <20221224114315.850130-3-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S6 X-Coremail-Antispam: 1UD129KBjvJXoW7ur1kJFWrKrWfCrW7Wr15urg_yoW8urW3pF 4DCr45JrZ8Aw13G3yayw1kZr9YvaykXw47KrW2vF1UZw47tr43Zw4v934Fqrn5trW29FyY qr4q9ryS9ayUZaDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmK14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1l84 ACjcxK6I8E87Iv67AKxVWxJr0_GcWl84ACjcxK6I8E87Iv6xkF7I0E14v26F4UJVW0owAa c4AC62xK8xCEY4vEwIxC4wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzV Aqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S 6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxw ACI402YVCY1x02628vn2kIc2xKxwCY02Avz4vE14v_Gw4l42xK82IYc2Ij64vIr41l4I8I 3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxV WUGVWUWwC2zVAF1VAY17CE14v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAF wI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E14v26r4j6F4UMIIF0xvE42xK8VAvwI8IcI k0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j 6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUjSdgDUUUU X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiAwEKE2OmnOhnLgAAsj X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034350_179072_5BC3E1EF X-CRM114-Status: UNSURE ( 7.38 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang To address the limitation of PC-relative branch instruction on riscv architecture, detour buffer slot used for optprobes is allocated from the region, the distance of which from kernel should be less than 4GB. For the time being, Modules region always live before the kernel. But Vmalloc region reside far from kernel, the distance is half of the kernel address space (See Documentation/riscv/vm-layout.rst), hence it needs to override the alloc_optinsn_page() to make sure allocate detour buffer from jump-safe region. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/kernel/probes/kprobes.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c index f21592d20306..e1856b04db04 100644 --- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -84,6 +85,29 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) } #ifdef CONFIG_MMU +#if defined(CONFIG_OPTPROBES) && defined(CONFIG_64BIT) +void *alloc_optinsn_page(void) +{ + void *page; + + page = __vmalloc_node_range(PAGE_SIZE, 1, MODULES_VADDR, + MODULES_END, GFP_KERNEL, + PAGE_KERNEL, 0, NUMA_NO_NODE, + __builtin_return_address(0)); + if (!page) + return NULL; + + set_vm_flush_reset_perms(page); + /* + * First make the page read-only, and only then make it executable to + * prevent it from being W+X in between. + */ + set_memory_rox((unsigned long)page, 1); + + return page; +} +#endif + void *alloc_insn_page(void) { return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END, From patchwork Sat Dec 24 11:43:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081365 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 183DBC4332F for ; Sat, 24 Dec 2022 15:22:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=/e9GPyEmrXJvfx4yzAZfifCPcSdc5fqzFpXmKOkQoc0=; b=C2KAgXGRpEoFie 7DcdPiewhuPcIkDHQiMvJAfarK8ltOujoPuMH5MpLXiUNa2VVaN+HQFleA6TdoJstjcPlnm+1B42X /pY+9tZuIAo+dVH0PoPrNy/rouTm8Xcj4Z/odfrdb+994NqDFrICGcjLK9D2fJkF9H0KFJc6ZtEez +9Cgz9S5kkBtVBV5xWGodXhEr50/D73ZqgyqXQoWSx1jsOVzpknKV5q2i6BsaCzjqNpwIu6YtZDe7 BeCqBFixswO1pzfLdV4b2rjkApeAZjFnlQYgZC4VflPHqAW6WsL/i7PiFdFUs2Ae2Eh3+bw8FxBnQ 85Ye/si12Hr+WHQtB9Tw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96MB-00139d-UT; Sat, 24 Dec 2022 15:22:40 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5um-KD for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:56 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S7; Sat, 24 Dec 2022 19:43:28 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 3/9] riscv/kprobe: Prepare the skeleton to prepare optimized kprobe Date: Sat, 24 Dec 2022 19:43:09 +0800 Message-Id: <20221224114315.850130-4-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S7 X-Coremail-Antispam: 1UD129KBjvJXoWxGrWDZw13XF1UJw1DWFy3Jwb_yoWrtw13pF s3C3W5trWDJayfurWfZwsYqrySkFsYqr47CrWUJ34rZr13JrZ0vas7Kr4YvF98GrWS9rW3 JrsYvry5uF43ZrJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JrWl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UM2 8EF7xvwVC2z280aVAFwI0_Cr1j6rxdM28EF7xvwVC2z280aVCY1x0267AKxVWxJr0_GcWl nxkEFVAIw20F6cxK64vIFxWle2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4 CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4UMcvj eVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I 1lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc2xSY4AK67AK6r4kMxAIw28IcxkI7VAKI48JMxC2 0s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI 0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE 14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v20x vaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8 JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUjEdgPUUUUU== X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiAxAKE2OmnOhnLQAAsx X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_241210_E90A2C29 X-CRM114-Status: GOOD ( 18.46 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang This patch provide a skeleton to prepare optimized kprobe instruction slot, it is consist of two major parts, the first part is check if current kprobe satifies the requirement to optimize. The kprobe bases on breakpoint just require the instrumented instruction supports execute out-of-line or simulation, however optimized kprobe bases on long-jump needs more requirements, it includes: - The target of long-jump in the range of 'AUIPC/JALR'. - No near instruction jump to any instruction replaced by 'AUIPC/JALR' - It managed to find one free register to form 'AUIPC/JALR' jumping to detour buffer. - It managed to find one free register to form 'JR' jumping back from detour buffer The second part is allocate a larger instruction slot for each optimized kprobe, the payload of which is patched with the assembly code defined in opt_trampoline.S, a call to kprobe pre_handler and these instructions replaced by 'AUIPC/JALR'. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/kernel/probes/opt.c | 107 ++++++++++++++++++++++++++++++++- 1 file changed, 106 insertions(+), 1 deletion(-) diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c index 56c8a227c857..a4271e6033ba 100644 --- a/arch/riscv/kernel/probes/opt.c +++ b/arch/riscv/kernel/probes/opt.c @@ -10,6 +10,54 @@ #include #include +#include + +static inline int in_auipc_jalr_range(long val) +{ +#ifdef CONFIG_ARCH_RV32I + return 1; +#else + /* + * Note that the set of address offsets that can be formed + * by pairing LUI with LD, AUIPC with JALR, etc. in RV64I is + * [−2^31−2^11, 2^31−2^11−1]. + */ + return ((-(1L << 31) - (1L << 11)) <= val) && + (val < ((1L << 31) - (1L << 11))); +#endif +} + +/* + * Copy optprobe assembly code template into detour buffer and modify some + * instructions for each kprobe. + */ +static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot, + int rd, struct optimized_kprobe *op, + kprobe_opcode_t opcode) +{ +} + +/* + * In RISC-V ISA, AUIPC/JALR clobber one register to form target address, + * by inspired by register renaming in OoO processor, this involves search + * backwards that is not previously used as a source register and is used + * as a destination register before any branch or jump instruction. + */ +static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op, + int *rd, int *ra) +{ +} + +/* + * If two free registers can be found at the beginning of both + * the start and the end of replaced code, it can be optimized + * Also, in-function jumps need to be checked to make sure that + * there is no jump to the second instruction to be replaced + */ +static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op) +{ + return false; +} int arch_prepared_optinsn(struct arch_optimized_insn *optinsn) { @@ -24,7 +72,64 @@ int arch_check_optimized_kprobe(struct optimized_kprobe *op) int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *orig) { - return 0; + long rel; + int rd, ra, ret; + kprobe_opcode_t *code = NULL, *slot = NULL; + + if (!can_optimize((unsigned long)orig->addr, op)) + return -EILSEQ; + + code = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL); + slot = get_optinsn_slot(); + if (!code || !slot) { + ret = -ENOMEM; + goto on_error; + } + + /* + * Verify if the address gap is within 4GB range, because this uses + * a auipc+jalr pair. + */ + rel = (unsigned long)slot - (unsigned long)orig->addr; + if (!in_auipc_jalr_range(rel)) { + /* + * Different from x86, we free code buf directly instead of + * calling __arch_remove_optimized_kprobe() because + * we have not fill any field in op. + */ + ret = -ERANGE; + goto on_error; + } + + /* + * Search two free registers, rd is used as to form AUIPC/JALR jumping + * to detour buffer, ra is used as to form JR jumping back from detour + * buffer. + */ + find_free_registers(orig, op, &rd, &ra); + if (rd == 0 || ra == 0) { + ret = -EILSEQ; + goto on_error; + } + + op->optinsn.rd = rd; + prepare_detour_buffer(code, slot, ra, op, orig->opcode); + + ret = patch_text_nosync((void *)slot, code, MAX_OPTINSN_SIZE); + if (!ret) { + op->optinsn.insn = slot; + kfree(code); + return 0; + } + +on_error: + if (slot) { + free_optinsn_slot(slot, 0); + op->optinsn.insn = NULL; + op->optinsn.length = 0; + } + kfree(code); + return ret; } void arch_remove_optimized_kprobe(struct optimized_kprobe *op) From patchwork Sat Dec 24 11:43:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081366 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B9347C4332F for ; Sat, 24 Dec 2022 15:23:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=uhcN+AhIOChCn2ehaFDFXcqp6pWL8tVZKwUgJWXJQhs=; b=LDTP3jrQMEDHle FxUnZ1YQRdyhv5fqgsN0TTpN/AkWIl9f9g+EL32HiLgBrDRpX5/6BJB83PaQlClBkFNqZiTqHSdRP DEzVACC2dZ9TqJJp67tWjdDWpjfb1xOnDsxYAiO1XPnRRqqOA8sM9agz/bjnecE+nSU9B7ivcNEVT PIq3YAlu29yTwfyAMeI+J7Fzd4OVtuOdzhp0wFjP+EDzzlKU8jFs1EXtJayjRMu0TC0lii8HaTucT 4wGw3AOODEnGfdeq8bP0kyhBEeDrPZ0tD00DOfDuI8iGF7C9Ds6j9H6dbX5zh4DGgib9K+3J89SDu WQR8QQXMz83KPHYLEoCw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96MY-0013Ma-12; Sat, 24 Dec 2022 15:23:02 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5ue-JO for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:56 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S8; Sat, 24 Dec 2022 19:43:28 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 4/9] riscv/kprobe: Add common RVI and RVC instruction decoder code Date: Sat, 24 Dec 2022 19:43:10 +0800 Message-Id: <20221224114315.850130-5-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S8 X-Coremail-Antispam: 1UD129KBjvJXoW3Xw15GryxCF1xKFyDWr1fXrb_yoW3tFy8pa n5Cw1Y93ykGF95CrZ7tr48Cr4Fqw4rGrs8Kay0ga1ayF1Iqr4UXr93try3tF4kWFWFgr47 CFZ8JrWkG3y2y3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQY14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr 1UM28EF7xvwVC2z280aVAFwI0_Cr1j6rxdM28EF7xvwVC2z280aVCY1x0267AKxVWxJr0_ GcWlnxkEFVAIw20F6cxK64vIFxWle2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2 IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4U McvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I64 8v4I1lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc2xSY4AK67AK6r4kMxAIw28IcxkI7VAKI48J MxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwV AFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv2 0xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4 v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AK xVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUjuc_3UUUUU== X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiBwYKE2OmnMtotwAAsV X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_181006_CB822FA6 X-CRM114-Status: GOOD ( 11.24 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang This patch add code that can be used to decode RVI and RVC instructions in searching one register for 'AUIPC/JALR'. As mentioned in previous patch, kprobe can't be optimized until one free integer register can be found out to save the jump target, in order to figure out the register searching, all instructions starts from the kprobe to the last one of function needs to decode and test if contains one candidate register. For all RVI instruction format, the position and length of 'rs1', 'rs2' ,'rd' and 'opcode' part are uniform, but the rule of RVC instruction format is more complicated, so it address a couple of inline functions to decode rs1/rs2/rd for RVC. These instruction decoder suppose to be consistent with the RVC and RV32/RV64G instruction set list specified in the riscv instruction reference published at August 25, 2022. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/include/asm/bug.h | 5 +- arch/riscv/kernel/probes/decode-insn.h | 148 +++++++++++++++++++++++ arch/riscv/kernel/probes/simulate-insn.h | 41 +++++++ 3 files changed, 193 insertions(+), 1 deletion(-) diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h index 1aaea81fb141..9c33d3b58225 100644 --- a/arch/riscv/include/asm/bug.h +++ b/arch/riscv/include/asm/bug.h @@ -19,11 +19,14 @@ #define __BUG_INSN_32 _UL(0x00100073) /* ebreak */ #define __BUG_INSN_16 _UL(0x9002) /* c.ebreak */ +#define RVI_INSN_LEN 4UL +#define RVC_INSN_LEN 2UL + #define GET_INSN_LENGTH(insn) \ ({ \ unsigned long __len; \ __len = ((insn & __INSN_LENGTH_MASK) == __INSN_LENGTH_32) ? \ - 4UL : 2UL; \ + RVI_INSN_LEN : RVC_INSN_LEN; \ __len; \ }) diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h index 42269a7d676d..785b023a62ea 100644 --- a/arch/riscv/kernel/probes/decode-insn.h +++ b/arch/riscv/kernel/probes/decode-insn.h @@ -3,6 +3,7 @@ #ifndef _RISCV_KERNEL_KPROBES_DECODE_INSN_H #define _RISCV_KERNEL_KPROBES_DECODE_INSN_H +#include #include #include @@ -15,4 +16,151 @@ enum probe_insn { enum probe_insn __kprobes riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi); +#ifdef CONFIG_KPROBES + +static inline u16 rvi_rs1(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 15) & 0x1f); +} + +static inline u16 rvi_rs2(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 20) & 0x1f); +} + +static inline u16 rvi_rd(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 7) & 0x1f); +} + +static inline s32 rvi_branch_imme(kprobe_opcode_t opcode) +{ + u32 imme = 0; + + imme |= (((opcode >> 8) & 0xf) << 1) | + (((opcode >> 25) & 0x3f) << 5) | + (((opcode >> 7) & 0x1) << 11) | + (((opcode >> 31) & 0x1) << 12); + + return sign_extend32(imme, 13); +} + +static inline s32 rvi_jal_imme(kprobe_opcode_t opcode) +{ + u32 imme = 0; + + imme |= (((opcode >> 21) & 0x3ff) << 1) | + (((opcode >> 20) & 0x1) << 11) | + (((opcode >> 12) & 0xff) << 12) | + (((opcode >> 31) & 0x1) << 20); + + return sign_extend32(imme, 21); +} + +#ifdef CONFIG_RISCV_ISA_C +static inline u16 rvc_r_rs1(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 2) & 0x1f); +} + +static inline u16 rvc_r_rs2(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 2) & 0x1f); +} + +static inline u16 rvc_r_rd(kprobe_opcode_t opcode) +{ + return rvc_r_rs1(opcode); +} + +static inline u16 rvc_i_rs1(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 7) & 0x1f); +} + +static inline u16 rvc_i_rd(kprobe_opcode_t opcode) +{ + return rvc_i_rs1(opcode); +} + +static inline u16 rvc_ss_rs2(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 2) & 0x1f); +} + +static inline u16 rvc_l_rd(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 2) & 0x7); +} + +static inline u16 rvc_l_rs(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 7) & 0x7); +} + +static inline u16 rvc_s_rs2(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 2) & 0x7); +} + +static inline u16 rvc_s_rs1(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 7) & 0x7); +} + +static inline u16 rvc_a_rs2(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 2) & 0x7); +} + +static inline u16 rvc_a_rs1(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 7) & 0x7); +} + +static inline u16 rvc_a_rd(kprobe_opcode_t opcode) +{ + return rvc_a_rs1(opcode); +} + +static inline u16 rvc_b_rd(kprobe_opcode_t opcode) +{ + return (u16)((opcode >> 7) & 0x7); +} + +static inline u16 rvc_b_rs(kprobe_opcode_t opcode) +{ + return rvc_b_rd(opcode); +} + +static inline s32 rvc_branch_imme(kprobe_opcode_t opcode) +{ + u32 imme = 0; + + imme |= (((opcode >> 3) & 0x3) << 1) | + (((opcode >> 10) & 0x3) << 3) | + (((opcode >> 2) & 0x1) << 5) | + (((opcode >> 5) & 0x3) << 6) | + (((opcode >> 12) & 0x1) << 8); + + return sign_extend32(imme, 9); +} + +static inline s32 rvc_jal_imme(kprobe_opcode_t opcode) +{ + u32 imme = 0; + + imme |= (((opcode >> 3) & 0x3) << 1) | + (((opcode >> 11) & 0x1) << 4) | + (((opcode >> 2) & 0x1) << 5) | + (((opcode >> 7) & 0x1) << 6) | + (((opcode >> 6) & 0x1) << 7) | + (((opcode >> 9) & 0x3) << 8) | + (((opcode >> 8) & 0x1) << 10) | + (((opcode >> 12) & 0x1) << 11); + + return sign_extend32(imme, 12); +} +#endif /* CONFIG_KPROBES */ +#endif /* CONFIG_RISCV_ISA_C */ #endif /* _RISCV_KERNEL_KPROBES_DECODE_INSN_H */ diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h index cb6ff7dccb92..74d8c1ba9064 100644 --- a/arch/riscv/kernel/probes/simulate-insn.h +++ b/arch/riscv/kernel/probes/simulate-insn.h @@ -37,6 +37,40 @@ __RISCV_INSN_FUNCS(c_jalr, 0xf007, 0x9002); __RISCV_INSN_FUNCS(c_beqz, 0xe003, 0xc001); __RISCV_INSN_FUNCS(c_bnez, 0xe003, 0xe001); __RISCV_INSN_FUNCS(c_ebreak, 0xffff, 0x9002); +/* RVC(S) instructions contain rs1 and rs2 */ +__RISCV_INSN_FUNCS(c_sq, 0xe003, 0xa000); +__RISCV_INSN_FUNCS(c_sw, 0xe003, 0xc000); +__RISCV_INSN_FUNCS(c_sd, 0xe003, 0xe000); +/* RVC(A) instructions contain rs1 and rs2 */ +__RISCV_INSN_FUNCS(c_sub, 0xfc03, 0x8c01); +__RISCV_INSN_FUNCS(c_subw, 0xfc43, 0x9c01); +/* RVC(L) instructions contain rs1 */ +__RISCV_INSN_FUNCS(c_lq, 0xe003, 0x2000); +__RISCV_INSN_FUNCS(c_lw, 0xe003, 0x4000); +__RISCV_INSN_FUNCS(c_ld, 0xe003, 0x6000); +/* RVC(I) instructions contain rs1 */ +__RISCV_INSN_FUNCS(c_addi, 0xe003, 0x0001); +__RISCV_INSN_FUNCS(c_addiw, 0xe003, 0x2001); +__RISCV_INSN_FUNCS(c_addi16sp, 0xe183, 0x6101); +__RISCV_INSN_FUNCS(c_slli, 0xe003, 0x0002); +/* RVC(B) instructions contain rs1 */ +__RISCV_INSN_FUNCS(c_sri, 0xe803, 0x8001); +__RISCV_INSN_FUNCS(c_andi, 0xec03, 0x8801); +/* RVC(SS) instructions contain rs2 */ +__RISCV_INSN_FUNCS(c_sqsp, 0xe003, 0xa002); +__RISCV_INSN_FUNCS(c_swsp, 0xe003, 0xc002); +__RISCV_INSN_FUNCS(c_sdsp, 0xe003, 0xe002); +/* RVC(R) instructions contain rs2 and rd */ +__RISCV_INSN_FUNCS(c_mv, 0xe003, 0x8002); +/* RVC(I) instructions contain sp and rd */ +__RISCV_INSN_FUNCS(c_lqsp, 0xe003, 0x2002); +__RISCV_INSN_FUNCS(c_lwsp, 0xe003, 0x4002); +__RISCV_INSN_FUNCS(c_ldsp, 0xe003, 0x6002); +/* RVC(CW) instructions contain sp and rd */ +__RISCV_INSN_FUNCS(c_addi4spn, 0xe003, 0x0000); +/* RVC(I) instructions contain rd */ +__RISCV_INSN_FUNCS(c_li, 0xe003, 0x4001); +__RISCV_INSN_FUNCS(c_lui, 0xe003, 0x6001); __RISCV_INSN_FUNCS(auipc, 0x7f, 0x17); __RISCV_INSN_FUNCS(branch, 0x7f, 0x63); @@ -44,4 +78,11 @@ __RISCV_INSN_FUNCS(branch, 0x7f, 0x63); __RISCV_INSN_FUNCS(jal, 0x7f, 0x6f); __RISCV_INSN_FUNCS(jalr, 0x707f, 0x67); +__RISCV_INSN_FUNCS(arith_rr, 0x77, 0x33); +__RISCV_INSN_FUNCS(arith_ri, 0x77, 0x13); +__RISCV_INSN_FUNCS(lui, 0x7f, 0x37); +__RISCV_INSN_FUNCS(load, 0x7f, 0x03); +__RISCV_INSN_FUNCS(store, 0x7f, 0x23); +__RISCV_INSN_FUNCS(amo, 0x7f, 0x2f); + #endif /* _RISCV_KERNEL_PROBES_SIMULATE_INSN_H */ From patchwork Sat Dec 24 11:43:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081362 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A5B60C4332F for ; Sat, 24 Dec 2022 15:22:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=XlKPf/ul8CSWk8aDQ7T+fJcEhR0BIEPSAbYRxwAbrgY=; b=AjyaKIRGsIxASq Bq5u89/SsZH5vmbPu0QSQNtpgKmvCqcLk7K00k2Mv3uKIxk4D9Tp0d8iX3mT4zG+w+0UVIDzu+iQz RINIfVDLsMwiri22ITPZVSoRHLAjUYcp48PRKkKeo5nAeaWjbgdANWaVt4feq+L9/0z+MpYXnrmeG 4yLN4Qlg7xsCGJGHndUjWAJP6yVEwewCrkvximlH0Ab5hJIWnHQebLEgTMz9pVk2fKcuHHirRCYxH Bj77kidc5J0/0EPypW8OSqCqeNxXVV+eE2xhTQ4FyEDAbjP5FpGKf3AJa+71SE6gjYR/JjCOW5Hc6 9+ZTGssGiCvaFzCccI9g==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96LP-0012jc-8E; Sat, 24 Dec 2022 15:21:51 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5uj-IY for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:53 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S9; Sat, 24 Dec 2022 19:43:28 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 5/9] riscv/kprobe: Search free register(s) to clobber for 'AUIPC/JALR' Date: Sat, 24 Dec 2022 19:43:11 +0800 Message-Id: <20221224114315.850130-6-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S9 X-Coremail-Antispam: 1UD129KBjvJXoWfJry3uryfWFy5Kr1kJryxXwb_yoWDAr15pF ZxGw4rtF4Utrs5W3y3tF1kJrWSgFs3Grs8Ar15t3yUZw43G3ySqFWvga43Zr1DCF13Zr48 Jr4Y9rWI9r4DAFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr 1UM28EF7xvwVC2z280aVAFwI0_Cr1j6rxdM28EF7xvwVC2z280aVCY1x0267AKxVWxJr0_ GcWlnxkEFVAIw20F6cxK64vIFxWle2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2 IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4U McvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I64 8v4I1lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc2xSY4AK67AK6r4kMxAIw28IcxkI7VAKI48J MxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwV AFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv2 0xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWxJVW8Jr1lIxAIcVCF04k26c xKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAF wI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUSApnUUUUU= X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiBwUKE2OmnMtouAAAsZ X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_051247_C4EC6E74 X-CRM114-Status: GOOD ( 27.74 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang This patch implement the algorithm of searching free register(s) to form a long-jump instruction pair. AUIPC/JALR instruction pair is introduced with a much wider jump range (4GB), where auipc loads the upper 20 bits to a free register and jalr appends the lower 12 bits to form a 32 bit immediate. Since kprobes can be instrumented at anywhere in kernel space, hence the free register should be found in a generic way, not depending on the calling convention or any other regulations. The algorithm for finding the free register is inspired by the register renaming in modern processors. From the perspective of register renaming, a register could be represented as two different registers if two neighbour instructions both write to it but no one ever reads. Extending this fact, a register is considered to be free if there is no read before its next write in the execution flow. We are free to change its value without interfering normal execution. In order to do jump optimization, it needs to search two free registers, the first one is used to form AUIPC/JALR jumping to detour buffer, the second one is used to form JR jumping back from detour buffer. If first one never been updated by any instructions replaced by 'AUIPC/JALR', both register supposes to the same one. Let's use the example below to explain how the algorithm work. Given kernel is RVI and RCV hybrid binary, and one kprobe is instrumented at the entry of function idle_dummy. Before Optimized Detour buffer : ... #1 add sp,sp,-16 auipc a0, #? add sp,sp,-16 #2 sd s0,8(sp) sd s0,8(sp) #3 addi s0,sp,16 jalr a0, #?(a0) addi s0,sp,16 #4 ld s0,8(sp) ld s0,8(sp) #5 li a0,0 li a0,0 auipc a0, #? #6 addi sp,sp,16 addi sp,sp,16 jr x0, #?(a0) #7 ret ret For regular kprobe, it is trival to replace the first instruction with C.EREABK, no more instruction and register will be clobber, in order to optimize kprobe with long-jump, it used to patch the first 8 bytes with AUIPC/JALR, and a0 will be chosen to save the address jumping to, because from #1 to #7, a0 is the only one register that satifies two conditions: (1) No read before write (2) Never been updated in detour buffer. While s0 has been used as the source register at #2, so it is not free to clobber. The searching starts from the kprobe and stop at the last instruction of function or the first branch/jump instruction, it decodes out the 'rs' and 'rd' part of each visited instruction. If the 'rd' never been read before, then record it to bitmask 'write'; if the 'rs' never been written before, then record it to another bitmask 'read'. When searching stops, the remaining bits of 'write' are the free registers to form AUIPC/JALR or JR. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/kernel/probes/opt.c | 223 +++++++++++++++++++++++++++++++++ 1 file changed, 223 insertions(+) diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c index a4271e6033ba..a0d2ab39e3fa 100644 --- a/arch/riscv/kernel/probes/opt.c +++ b/arch/riscv/kernel/probes/opt.c @@ -12,6 +12,9 @@ #include #include +#include "simulate-insn.h" +#include "decode-insn.h" + static inline int in_auipc_jalr_range(long val) { #ifdef CONFIG_ARCH_RV32I @@ -37,15 +40,235 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot, { } +/* Registers the first usage of which is the destination of instruction */ +#define WRITE_ON(reg) \ + (*write |= (((*read >> (reg)) ^ 1UL) & 1) << (reg)) +/* Registers the first usage of which is the source of instruction */ +#define READ_ON(reg) \ + (*read |= (((*write >> (reg)) ^ 1UL) & 1) << (reg)) + /* * In RISC-V ISA, AUIPC/JALR clobber one register to form target address, * by inspired by register renaming in OoO processor, this involves search * backwards that is not previously used as a source register and is used * as a destination register before any branch or jump instruction. */ +static void find_register(unsigned long start, unsigned long end, + unsigned long *write, unsigned long *read) +{ + kprobe_opcode_t insn; + unsigned long addr, offset = 0UL; + + for (addr = start; addr < end; addr += offset) { + insn = *(kprobe_opcode_t *)addr; + offset = GET_INSN_LENGTH(insn); + +#ifdef CONFIG_RISCV_ISA_C + if (offset == RVI_INSN_LEN) + goto is_rvi; + + insn &= __COMPRESSED_INSN_MASK; + /* Stop searching until any control transfer instruction */ + if (riscv_insn_is_c_ebreak(insn) || riscv_insn_is_c_j(insn)) + break; + + if (riscv_insn_is_c_jal(insn)) { + /* The rd of C.JAL is x1 by default */ + WRITE_ON(1); + break; + } + + if (riscv_insn_is_c_jr(insn)) { + READ_ON(rvc_r_rs1(insn)); + break; + } + + if (riscv_insn_is_c_jalr(insn)) { + READ_ON(rvc_r_rs1(insn)); + /* The rd of C.JALR is x1 by default */ + WRITE_ON(1); + break; + } + + if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn)) { + READ_ON(rvc_b_rs(insn)); + break; + } + + /* + * Decode RVC instructions that encode integer registers, try + * to find out some destination register, the number of which + * are equal with 'least' and never be used as source register. + */ + if (riscv_insn_is_c_sub(insn) || riscv_insn_is_c_subw(insn)) { + READ_ON(rvc_a_rs1(insn)); + READ_ON(rvc_a_rs2(insn)); + continue; + } else if (riscv_insn_is_c_sq(insn) || + riscv_insn_is_c_sw(insn) || + riscv_insn_is_c_sd(insn)) { + READ_ON(rvc_s_rs1(insn)); + READ_ON(rvc_s_rs2(insn)); + continue; + } else if (riscv_insn_is_c_addi16sp(insn) || + riscv_insn_is_c_addi(insn) || + riscv_insn_is_c_addiw(insn) || + riscv_insn_is_c_slli(insn)) { + READ_ON(rvc_i_rs1(insn)); + continue; + } else if (riscv_insn_is_c_sri(insn) || + riscv_insn_is_c_andi(insn)) { + READ_ON(rvc_b_rs(insn)); + continue; + } else if (riscv_insn_is_c_sqsp(insn) || + riscv_insn_is_c_swsp(insn) || + riscv_insn_is_c_sdsp(insn)) { + READ_ON(rvc_ss_rs2(insn)); + /* The rs2 of C.SQSP/SWSP/SDSP are x2 by default */ + READ_ON(2); + continue; + } else if (riscv_insn_is_c_mv(insn)) { + READ_ON(rvc_r_rs2(insn)); + WRITE_ON(rvc_r_rd(insn)); + } else if (riscv_insn_is_c_addi4spn(insn)) { + /* The rs of C.ADDI4SPN is x2 by default */ + READ_ON(2); + WRITE_ON(rvc_l_rd(insn)); + } else if (riscv_insn_is_c_lq(insn) || + riscv_insn_is_c_lw(insn) || + riscv_insn_is_c_ld(insn)) { + /* FIXME: c.lw/c.ld share opcode with c.flw/c.fld */ + READ_ON(rvc_l_rs(insn)); + WRITE_ON(rvc_l_rd(insn)); + } else if (riscv_insn_is_c_lqsp(insn) || + riscv_insn_is_c_lwsp(insn) || + riscv_insn_is_c_ldsp(insn)) { + /* + * FIXME: c.lwsp/c.ldsp share opcode with c.flwsp/c.fldsp + * The rs of C.LQSP/C.LWSP/C.LDSP is x2 by default. + */ + READ_ON(2); + WRITE_ON(rvc_i_rd(insn)); + } else if (riscv_insn_is_c_li(insn) || + riscv_insn_is_c_lui(insn)) { + WRITE_ON(rvc_i_rd(insn)); + } + + if ((*write > 1UL) && __builtin_ctzl(*write & ~1UL)) + return; +is_rvi: +#endif + /* Stop searching until any control transfer instruction */ + if (riscv_insn_is_branch(insn)) { + READ_ON(rvi_rs1(insn)); + READ_ON(rvi_rs2(insn)); + break; + } + + if (riscv_insn_is_jal(insn)) { + WRITE_ON(rvi_rd(insn)); + break; + } + + if (riscv_insn_is_jalr(insn)) { + READ_ON(rvi_rs1(insn)); + WRITE_ON(rvi_rd(insn)); + break; + } + + if (riscv_insn_is_system(insn)) { + /* csrrw, csrrs, csrrc */ + if (rvi_rs1(insn)) + READ_ON(rvi_rs1(insn)); + /* csrrwi, csrrsi, csrrci, csrrw, csrrs, csrrc */ + if (rvi_rd(insn)) + WRITE_ON(rvi_rd(insn)); + break; + } + + /* + * Decode RVC instructions that has rd and rs, try to find out + * some rd, the number of which are equal with 'least' and never + * be used as rs. + */ + if (riscv_insn_is_lui(insn) || riscv_insn_is_auipc(insn)) { + WRITE_ON(rvi_rd(insn)); + } else if (riscv_insn_is_arith_ri(insn) || + riscv_insn_is_load(insn)) { + READ_ON(rvi_rs1(insn)); + WRITE_ON(rvi_rd(insn)); + } else if (riscv_insn_is_arith_rr(insn) || + riscv_insn_is_store(insn) || + riscv_insn_is_amo(insn)) { + READ_ON(rvi_rs1(insn)); + READ_ON(rvi_rs2(insn)); + WRITE_ON(rvi_rd(insn)); + } + + if ((*write > 1UL) && __builtin_ctzl(*write & ~1UL)) + return; + } +} + static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op, int *rd, int *ra) { + unsigned long start, end; + /* + * Searching algorithm explanation: + * + * 1. Define two types of instruction area firstly: + * + * +-----+ + * + + + * + + ---> instrunctions modified by optprobe, named 'O-Area'. + * + + + * +-----+ + * + + + * + + ---> instructions after optprobe, named 'K-Area'. + * + + + * + ~ + + * + * 2. There are two usages for each GPR in given instruction area. + * + * - W: GPR is used as the RD oprand at first emergence. + * - R: GPR is used as the RS oprand at first emergence. + * + * Then there are 4 different usages for each GPR totally: + * + * 1. Used as W in O-Area, Used as W in K-Area. + * 2. Used as W in O-Area, Used as R in K-Area. + * 3. Used as R in O-Area, Used as W in K-Area. + * 4. Used as R in O-Area, Used as R in K-Area. + * + * All registers satisfy #1 or #3 could be chosen to form 'AUIPC/JALR' + * jumping to detour buffer. + * + * All registers satisfy #1 or #2, could be chosen to form 'JR' jumping + * back from detour buffer. + */ + unsigned long kw = 0UL, kr = 0UL, ow = 0UL, or = 0UL; + + /* Search one free register used to form AUIPC/JALR */ + start = (unsigned long)&kp->opcode; + end = start + GET_INSN_LENGTH(kp->opcode); + find_register(start, end, &ow, &or); + + start = (unsigned long)kp->addr + GET_INSN_LENGTH(kp->opcode); + end = (unsigned long)kp->addr + op->optinsn.length; + find_register(start, end, &ow, &or); + + /* Search one free register used to form JR */ + find_register(end, (unsigned long)_end, &kw, &kr); + + if ((kw & ow) > 1UL) { + *rd = __builtin_ctzl((kw & ow) & ~1UL); + *ra = *rd; + return; + } + + *rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL); + *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL); } /* From patchwork Sat Dec 24 11:43:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081368 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 23D21C4332F for ; Sat, 24 Dec 2022 15:23:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=psf4QA2naKLKHl4Rqr0MqjMH4o763e7cSiFZQ8QwhPI=; b=RoPBATsgUmqqky rIbbODziG3SdXzJUW4F5kbHxEoc3f8K2B6U393UtGwpYH+VdqsOjfJg6rBr9aoEYqZucTz8QMt9fr aU8p+/CkDxY+QFsdREO4F4EAB07T3y4OusVFFvjNRxuLK3hTj0Av5UVi7iiBoXY6bHwQ5cbMG9Q/+ DrEV5EDVyBKLri7ArPjALneQmLEdITEqGtaSsVHvYdu/861YXJocW+p37OIweRvzEjYFiRVS02RNS 7snHSk4rXnOBQEjkvKl84OjNLGlyS+AcjS08PRiIaXaUfXfBvWUdhsUFUutjoaYF6X5t3t/4zpAeT x8TUCoqtBsIo8PZZm+KA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96NC-0013lN-VG; Sat, 24 Dec 2022 15:23:43 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5ul-MP for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:57 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S10; Sat, 24 Dec 2022 19:43:28 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 6/9] riscv/kprobe: Add code to check if kprobe can be optimized Date: Sat, 24 Dec 2022 19:43:12 +0800 Message-Id: <20221224114315.850130-7-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S10 X-Coremail-Antispam: 1UD129KBjvJXoWxGw4xZw4DJr1rKr15AryDGFg_yoW7JFyxpF sYka4YqrW8JFZagrWfAws5Jr4Syws5Gr48try7K34Fyw12qr9Iganakr4avFnxGF40gr17 AF40yry8uFy3ZrJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr 1UM28EF7xvwVC2z280aVAFwI0_Cr1j6rxdM28EF7xvwVC2z280aVCY1x0267AKxVWxJr0_ GcWlnxkEFVAIw20F6cxK64vIFxWle2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2 IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4U McvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I64 8v4I1lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc2xSY4AK67AK6r4kMxAIw28IcxkI7VAKI48J MxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwV AFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv2 0xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWxJVW8Jr1lIxAIcVCF04k26c xKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAF wI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUSApnUUUUU= X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiBwMKE2OmnMtotgAAsR X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_273416_0224E056 X-CRM114-Status: GOOD ( 19.17 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang This patch add code to check if kprobe can be optimized, regular kprobe replaces single instruction with EBREAK or C.EBREAK, it just requires the instrumented instruction support execute out-of-line or simulation, while optimized kprobe patch AUIPC/JALR pair to do a long jump, it makes everything more compilated, espeically for kernel that is hybrid RVI and RVC binary, although AUIPC/JALR just need 8 bytes space, the bytes to patch are 10 bytes long at worst case to ensure no RVI would be truncated, so there are four methods to patch optimized kprobe. - Replace 2 RVI with AUIPC/JALR. - Replace 4 RVC with AUIPC/JALR. - Replace 2 RVC and 1 RVI with AUIPC/JALR. - Replace 3 RVC and 1 RVI with AUIPC/JALR, and patch C.NOP into last two bytes for alignment. So it has to find out a instruction window large enough to patch AUIPC/JALR from the address instrumented breakpoint, meanwhile, ensure no instruction has chance to jump into the range of patched window. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/kernel/probes/opt.c | 98 ++++++++++++++++++++++++++++++++-- 1 file changed, 93 insertions(+), 5 deletions(-) diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c index a0d2ab39e3fa..258a283c906d 100644 --- a/arch/riscv/kernel/probes/opt.c +++ b/arch/riscv/kernel/probes/opt.c @@ -271,15 +271,103 @@ static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op, *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL); } +static bool insn_jump_into_range(unsigned long addr, unsigned long start, + unsigned long end) +{ + kprobe_opcode_t insn = *(kprobe_opcode_t *)addr; + unsigned long target, offset = GET_INSN_LENGTH(insn); + +#ifdef CONFIG_RISCV_ISA_C + if (offset == RVC_INSN_LEN) { + if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn)) + target = addr + rvc_branch_imme(insn); + else if (riscv_insn_is_c_jal(insn) || riscv_insn_is_c_j(insn)) + target = addr + rvc_jal_imme(insn); + else + target = 0; + return (target >= start) && (target < end); + } +#endif + + if (riscv_insn_is_branch(insn)) + target = addr + rvi_branch_imme(insn); + else if (riscv_insn_is_jal(insn)) + target = addr + rvi_jal_imme(insn); + else + target = 0; + return (target >= start) && (target < end); +} + +static int search_copied_insn(unsigned long paddr, struct optimized_kprobe *op) +{ + int i = 1; + unsigned long offset = GET_INSN_LENGTH(*(kprobe_opcode_t *)paddr); + + while ((i++ < MAX_COPIED_INSN) && (offset < 2 * RVI_INSN_LEN)) { + if (riscv_probe_decode_insn((probe_opcode_t *)paddr + offset, + NULL) != INSN_GOOD) + return -1; + offset += GET_INSN_LENGTH(*(kprobe_opcode_t *)(paddr + offset)); + } + + op->optinsn.length = offset; + return 0; +} + /* - * If two free registers can be found at the beginning of both - * the start and the end of replaced code, it can be optimized - * Also, in-function jumps need to be checked to make sure that - * there is no jump to the second instruction to be replaced + * The kprobe can be optimized when no in-function jump reaches to the + * instructions replaced by optimized jump instructions(AUIPC/JALR). */ static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op) { - return false; + int ret; + unsigned long addr, size = 0, offset = 0; + struct kprobe *kp = get_kprobe((kprobe_opcode_t *)paddr); + + /* + * Skip optimization if kprobe has been disarmed or instrumented + * instruction support XOI. + */ + if (!kp || (riscv_probe_decode_insn(&kp->opcode, NULL) != INSN_GOOD)) + return false; + + /* + * Find a instruction window large enough to contain a pair + * of AUIPC/JALR, and ensure each instruction in this window + * supports XOI. + */ + ret = search_copied_insn(paddr, op); + if (ret) + return false; + + if (!kallsyms_lookup_size_offset(paddr, &size, &offset)) + return false; + + /* Check there is enough space for relative jump(AUIPC/JALR) */ + if (size - offset <= op->optinsn.length) + return false; + + /* + * Decode instructions until function end, check any instruction + * don't jump into the window used to emit optprobe(AUIPC/JALR). + */ + addr = paddr - offset; + while (addr < paddr) { + if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN, + paddr + op->optinsn.length)) + return false; + addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr); + } + + addr = paddr + op->optinsn.length; + while (addr < paddr - offset + size) { + if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN, + paddr + op->optinsn.length)) + return false; + addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr); + } + + return true; } int arch_prepared_optinsn(struct arch_optimized_insn *optinsn) From patchwork Sat Dec 24 11:43:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081367 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A5DD4C4332F for ; Sat, 24 Dec 2022 15:23:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=b2fxq5IVJiKW7oYTdH7yj22lnhHn/IrAqI9HK/i4HI4=; b=V7LBmOJI26SY+o X4czQQh/mA6tgd7oYUnZ6N/OVRTPBL5ybcr5kn+od8cg721Tv+HfSMhTe6Nhm+ZBCMvEtCXI5Y1Wf f59pPyHYib9NUEL/5XgIr04R8Kj4iN7EtTj2ZvkIE6IDgJ/3+AfxjjwdOMCNoZOXV0yc2fbF0TCzD YFvjqhw6NTZK9Q4V1jUZBoCtiuoyzzZVA/TWk1/cy2MTuoYr5dCBMCQNXI+cAhRmaZlH18BW61nlZ pZVx1QD2rgDBfUY1ZMQclfhzjXJk775071HfACVS6kXcAoP1C6G9zM6VinyYm1t7XGRSo4Kif+34S u6eX6l5R89v0PWnXn3sw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96Ms-0013Zg-Av; Sat, 24 Dec 2022 15:23:22 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5un-L8 for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:56 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S11; Sat, 24 Dec 2022 19:43:28 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 7/9] riscv/kprobe: Prepare detour buffer for optimized kprobe Date: Sat, 24 Dec 2022 19:43:13 +0800 Message-Id: <20221224114315.850130-8-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S11 X-Coremail-Antispam: 1UD129KBjvJXoW3Gr1fWw45Xw48JF4DZF4fXwb_yoWDJF45pa 95CwnxXF4UZFn7Cry7t3Z5Jr4Sqws5Jws7Grs7GFW5tan7KrW5Xan2g345Xr15G3yqgry7 ua45JryUuF9xXrJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr 1UM28EF7xvwVC2z280aVAFwI0_Cr1j6rxdM28EF7xvwVC2z280aVCY1x0267AKxVWxJr0_ GcWlnxkEFVAIw20F6cxK64vIFxWle2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2 IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4U McvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I64 8v4I1lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc2xSY4AK67AK6r4kMxAIw28IcxkI7VAKI48J MxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwV AFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv2 0xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWxJVW8Jr1lIxAIcVCF04k26c xKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAF wI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUSApnUUUUU= X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiCQUKE2OmnPBm9QAAsv X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_182761_02C06A09 X-CRM114-Status: GOOD ( 17.71 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang This patch introduce code to prepare instruction slot for optimized kprobe, the instruction slot for regular kprobe just records two instructions, first one is the original instruction replaced by EBREAK, the second one is EBREAK for single-step. While instruction slot for optimized kprobe is larger, beside execute instruction out-of-line, it also contains a standalone stackframe for calling kprobe handler. All optimized instruction slots consis of 5 major parts, which copied from the assembly code template in opt_trampoline.S. SAVE REGS CALL optimized_callback RESTORE REGS EXECUTE INSNS OUT-OF-LINE RETURN BACK Although most instructions in each slot are same, these slots still have a bit difference in their payload, it is result from three parts: - 'CALL optimized_callback', the relative offset for 'call' instruction is different for each kprobe. - 'EXECUTE INSN OUT-OF-LINE', no doubt. - 'RETURN BACK', the chosen free register is reused here as the destination register of jumping back. So it also need to customize the slot payload for each optimized kprobe. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/include/asm/kprobes.h | 16 +++ arch/riscv/kernel/probes/opt.c | 76 +++++++++++++ arch/riscv/kernel/probes/opt_trampoline.S | 125 ++++++++++++++++++++++ 3 files changed, 217 insertions(+) diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h index e85130c9112f..e40c837d0a1d 100644 --- a/arch/riscv/include/asm/kprobes.h +++ b/arch/riscv/include/asm/kprobes.h @@ -46,10 +46,26 @@ bool kprobe_single_step_handler(struct pt_regs *regs); /* optinsn template addresses */ extern __visible kprobe_opcode_t optprobe_template_entry[]; extern __visible kprobe_opcode_t optprobe_template_end[]; +extern __visible kprobe_opcode_t optprobe_template_save[]; +extern __visible kprobe_opcode_t optprobe_template_call[]; +extern __visible kprobe_opcode_t optprobe_template_insn[]; +extern __visible kprobe_opcode_t optprobe_template_return[]; #define MAX_OPTINSN_SIZE \ ((unsigned long)optprobe_template_end - \ (unsigned long)optprobe_template_entry) +#define DETOUR_SAVE_OFFSET \ + ((unsigned long)optprobe_template_save - \ + (unsigned long)optprobe_template_entry) +#define DETOUR_CALL_OFFSET \ + ((unsigned long)optprobe_template_call - \ + (unsigned long)optprobe_template_entry) +#define DETOUR_INSN_OFFSET \ + ((unsigned long)optprobe_template_insn - \ + (unsigned long)optprobe_template_entry) +#define DETOUR_RETURN_OFFSET \ + ((unsigned long)optprobe_template_return - \ + (unsigned long)optprobe_template_entry) /* * For RVI and RVC hybird encoding kernel, althought long jump just needs diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c index 258a283c906d..bc232fce5b39 100644 --- a/arch/riscv/kernel/probes/opt.c +++ b/arch/riscv/kernel/probes/opt.c @@ -11,9 +11,37 @@ #include #include #include +#include #include "simulate-insn.h" #include "decode-insn.h" +#include "../../net/bpf_jit.h" + +static void +optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs) +{ + unsigned long flags; + struct kprobe_ctlblk *kcb; + + /* Save skipped registers */ + regs->epc = (unsigned long)op->kp.addr; + regs->orig_a0 = ~0UL; + + local_irq_save(flags); + kcb = get_kprobe_ctlblk(); + + if (kprobe_running()) { + kprobes_inc_nmissed_count(&op->kp); + } else { + __this_cpu_write(current_kprobe, &op->kp); + kcb->kprobe_status = KPROBE_HIT_ACTIVE; + opt_pre_handler(&op->kp, regs); + __this_cpu_write(current_kprobe, NULL); + } + local_irq_restore(flags); +} + +NOKPROBE_SYMBOL(optimized_callback) static inline int in_auipc_jalr_range(long val) { @@ -30,6 +58,11 @@ static inline int in_auipc_jalr_range(long val) #endif } +#define DETOUR_ADDR(code, offs) \ + ((void *)((unsigned long)(code) + (offs))) +#define DETOUR_INSN(code, offs) \ + (*(kprobe_opcode_t *)((unsigned long)(code) + (offs))) + /* * Copy optprobe assembly code template into detour buffer and modify some * instructions for each kprobe. @@ -38,6 +71,49 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot, int rd, struct optimized_kprobe *op, kprobe_opcode_t opcode) { + long offs; + unsigned long data; + + memcpy(code, optprobe_template_entry, MAX_OPTINSN_SIZE); + + /* Step1: record optimized_kprobe pointer into detour buffer */ + memcpy(DETOUR_ADDR(code, DETOUR_SAVE_OFFSET), &op, sizeof(op)); + + /* + * Step2 + * auipc ra, 0 --> aupic ra, HI20.{optimized_callback - pc} + * jalr ra, 0(ra) --> jalr ra, LO12.{optimized_callback - pc}(ra) + */ + offs = (unsigned long)&optimized_callback - + (unsigned long)DETOUR_ADDR(slot, DETOUR_CALL_OFFSET); + DETOUR_INSN(code, DETOUR_CALL_OFFSET) = + rv_auipc(1, (offs + (1 << 11)) >> 12); + DETOUR_INSN(code, DETOUR_CALL_OFFSET + 0x4) = + rv_jalr(1, 1, offs & 0xFFF); + + /* Step3: copy replaced instructions into detour buffer */ + memcpy(DETOUR_ADDR(code, DETOUR_INSN_OFFSET), op->kp.addr, + op->optinsn.length); + memcpy(DETOUR_ADDR(code, DETOUR_INSN_OFFSET), &opcode, + GET_INSN_LENGTH(opcode)); + + /* Step4: record return address of long jump into detour buffer */ + data = (unsigned long)op->kp.addr + op->optinsn.length; + memcpy(DETOUR_ADDR(code, DETOUR_RETURN_OFFSET), &data, sizeof(data)); + + /* + * Step5 + * auipc ra, 0 --> auipc rd, 0 + * ld/w ra, -4(ra) --> ld/w rd, -8(rd) + * jalr x0, 0(ra) --> jalr x0, 0(rd) + */ + DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0x8) = rv_auipc(rd, 0); +#if __riscv_xlen == 32 + DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0xC) = rv_lw(rd, -8, rd); +#else + DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0xC) = rv_ld(rd, -8, rd); +#endif + DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0x10) = rv_jalr(0, rd, 0); } /* Registers the first usage of which is the destination of instruction */ diff --git a/arch/riscv/kernel/probes/opt_trampoline.S b/arch/riscv/kernel/probes/opt_trampoline.S index 16160c4367ff..75e34e373cf2 100644 --- a/arch/riscv/kernel/probes/opt_trampoline.S +++ b/arch/riscv/kernel/probes/opt_trampoline.S @@ -1,12 +1,137 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* * Copyright (C) 2022 Guokai Chen + * Copyright (C) 2022 Liao, Chang */ #include +#include #incldue #include SYM_ENTRY(optprobe_template_entry, SYM_L_GLOBAL, SYM_A_NONE) + addi sp, sp, -(PT_SIZE_ON_STACK) + REG_S x1, PT_RA(sp) + REG_S x2, PT_SP(sp) + REG_S x3, PT_GP(sp) + REG_S x4, PT_TP(sp) + REG_S x5, PT_T0(sp) + REG_S x6, PT_T1(sp) + REG_S x7, PT_T2(sp) + REG_S x8, PT_S0(sp) + REG_S x9, PT_S1(sp) + REG_S x10, PT_A0(sp) + REG_S x11, PT_A1(sp) + REG_S x12, PT_A2(sp) + REG_S x13, PT_A3(sp) + REG_S x14, PT_A4(sp) + REG_S x15, PT_A5(sp) + REG_S x16, PT_A6(sp) + REG_S x17, PT_A7(sp) + REG_S x18, PT_S2(sp) + REG_S x19, PT_S3(sp) + REG_S x20, PT_S4(sp) + REG_S x21, PT_S5(sp) + REG_S x22, PT_S6(sp) + REG_S x23, PT_S7(sp) + REG_S x24, PT_S8(sp) + REG_S x25, PT_S9(sp) + REG_S x26, PT_S10(sp) + REG_S x27, PT_S11(sp) + REG_S x28, PT_T3(sp) + REG_S x29, PT_T4(sp) + REG_S x30, PT_T5(sp) + REG_S x31, PT_T6(sp) + /* Update fp is friendly for stacktrace */ + addi s0, sp, (PT_SIZE_ON_STACK) + j 1f + +SYM_ENTRY(optprobe_template_save, SYM_L_GLOBAL, SYM_A_NONE) + /* + * Step1: + * Filled with the pointer to optimized_kprobe data + */ + .dword 0 +1: + /* Load optimize_kprobe pointer from .dword below */ + auipc a0, 0 + REG_L a0, -8(a0) + add a1, sp, x0 + +SYM_ENTRY(optprobe_template_call, SYM_L_GLOBAL, SYM_A_NONE) + /* + * Step2: + * of AUIPC/JALR are modified to the offset to optimized_callback + * jump target is loaded from above .dword. + */ + auipc ra, 0 + jalr ra, 0(ra) + + REG_L x1, PT_RA(sp) + REG_L x3, PT_GP(sp) + REG_L x4, PT_TP(sp) + REG_L x5, PT_T0(sp) + REG_L x6, PT_T1(sp) + REG_L x7, PT_T2(sp) + REG_L x8, PT_S0(sp) + REG_L x9, PT_S1(sp) + REG_L x10, PT_A0(sp) + REG_L x11, PT_A1(sp) + REG_L x12, PT_A2(sp) + REG_L x13, PT_A3(sp) + REG_L x14, PT_A4(sp) + REG_L x15, PT_A5(sp) + REG_L x16, PT_A6(sp) + REG_L x17, PT_A7(sp) + REG_L x18, PT_S2(sp) + REG_L x19, PT_S3(sp) + REG_L x20, PT_S4(sp) + REG_L x21, PT_S5(sp) + REG_L x22, PT_S6(sp) + REG_L x23, PT_S7(sp) + REG_L x24, PT_S8(sp) + REG_L x25, PT_S9(sp) + REG_L x26, PT_S10(sp) + REG_L x27, PT_S11(sp) + REG_L x28, PT_T3(sp) + REG_L x29, PT_T4(sp) + REG_L x30, PT_T5(sp) + REG_L x31, PT_T6(sp) + REG_L x2, PT_SP(sp) + addi sp, sp, (PT_SIZE_ON_STACK) + +SYM_ENTRY(optprobe_template_insn, SYM_L_GLOBAL, SYM_A_NONE) + /* + * Step3: + * NOPS will be replaced by the probed instruction, at worst case 3 RVC + * and 1 RVI instructions is about to execute out of line. + */ + nop + nop + nop + nop + nop + nop + nop + nop + nop + nop + j 2f + +SYM_ENTRY(optprobe_template_return, SYM_L_GLOBAL, SYM_A_NONE) + /* + * Step4: + * Filled with the return address of long jump(AUIPC/JALR) + */ + .dword 0 +2: + /* + * Step5: + * The of AUIPC/LD/JALR will be replaced for each kprobe, + * used to read return address saved in .dword above. + */ + auipc ra, 0 + REG_L ra, -8(ra) + jalr x0, 0(ra) SYM_ENTRY(optprobe_template_end, SYM_L_GLOBAL, SYM_A_NONE) From patchwork Sat Dec 24 11:43:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081363 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30961C4332F for ; Sat, 24 Dec 2022 15:22:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=a31PTAYz5tNvNi7+mO3x8aOE/F4USM5WduNBNBVMt6I=; b=JJtvi4NPWsaDeN xeOFxQuC5DbD3NotNjjNbV1NpqnV5a0u2iP37LasMVnXHDNmzmpEjtAlSyvGsBgfO+K8I1SaHMsBV /lvtDIpfau+dQfeTuA7NnZgYtyn6oAJmerlZCfSmZYKAl9rUwr8aW3qLmqK8KTGN3cv5Muk+d0YM4 P5NvO6W+xxlMki5qHTn2goarg7/x+5guT4EMXD+mJ0Gi1jzTmMDfArby9Ii/LOL0ue43LWe3a9/og c9J1nnhb+/FWIJapKVVXiE5b8+vIxT8VNmg1h6f9z+0cRA23aLywakNTr8f96nkVNNEvIezYkcz1y ATKJ2WJA7lbug0aWDBWQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96Le-0012rI-DU; Sat, 24 Dec 2022 15:22:06 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5uh-JP for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:55 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S12; Sat, 24 Dec 2022 19:43:29 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai Subject: [PATCH v5 8/9] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe Date: Sat, 24 Dec 2022 19:43:14 +0800 Message-Id: <20221224114315.850130-9-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S12 X-Coremail-Antispam: 1UD129KBjvJXoW3JFW8JrW7Xw4ruF4fWr18uFg_yoW7CryrpF s8GwnxJrWYkFn2grZxAws5ur1rKwsYyay3K3yDGrWfArsrJrs8Wwnakwn8Zr15GF1Fgr13 ArnYkryruay7JFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQ214x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26F4UJVW0owA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Cr1j 6rxdM2vYz4IE04k24VAvwVAKI4IrM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4 xI64kE6c02F40Ex7xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8 JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20V AGYxC7M4IIrI8v6xkF7I0E8cxan2IY04v7MxkIecxEwVAFwVW8KwCF04k20xvY0x0EwIxG rwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4 vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IY x2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8V AvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E 14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUeSoGDUUUU X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiCQMKE2OmnPBm9AAAso X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_183482_D8BE3DB2 X-CRM114-Status: GOOD ( 14.65 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Liao Chang The patch optimize 'EBREAK' with 'AUIPC/JALR', introduce new patching function to modify multiple instructions. Signed-off-by: Liao Chang Co-developed-by: Chen Guokai Signed-off-by: Chen Guokai --- arch/riscv/include/asm/patch.h | 1 + arch/riscv/kernel/patch.c | 23 ++++++++++--- arch/riscv/kernel/probes/opt.c | 63 ++++++++++++++++++++++++++++++++-- 3 files changed, 81 insertions(+), 6 deletions(-) diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h index 9a7d7346001e..ee31539de65f 100644 --- a/arch/riscv/include/asm/patch.h +++ b/arch/riscv/include/asm/patch.h @@ -8,5 +8,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len); int patch_text(void *addr, u32 insn); +int patch_text_batch(void *addr, const void *insn, size_t size); #endif /* _ASM_RISCV_PATCH_H */ diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c index 765004b60513..ce324b6a6998 100644 --- a/arch/riscv/kernel/patch.c +++ b/arch/riscv/kernel/patch.c @@ -15,7 +15,8 @@ struct patch_insn { void *addr; - u32 insn; + const void *insn; + size_t size; atomic_t cpu_count; }; @@ -106,8 +107,7 @@ static int patch_text_cb(void *data) if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) { ret = - patch_text_nosync(patch->addr, &patch->insn, - GET_INSN_LENGTH(patch->insn)); + patch_text_nosync(patch->addr, patch->insn, patch->size); atomic_inc(&patch->cpu_count); } else { while (atomic_read(&patch->cpu_count) <= num_online_cpus()) @@ -123,7 +123,8 @@ int patch_text(void *addr, u32 insn) { struct patch_insn patch = { .addr = addr, - .insn = insn, + .insn = &insn, + .size = GET_INSN_LENGTH(insn), .cpu_count = ATOMIC_INIT(0), }; @@ -131,3 +132,17 @@ int patch_text(void *addr, u32 insn) &patch, cpu_online_mask); } NOKPROBE_SYMBOL(patch_text); + +int patch_text_batch(void *addr, const void *insn, size_t size) +{ + struct patch_insn patch = { + .addr = addr, + .insn = insn, + .size = size, + .cpu_count = ATOMIC_INIT(0), + }; + + return stop_machine_cpuslocked(patch_text_cb, &patch, cpu_online_mask); +} + +NOKPROBE_SYMBOL(patch_text_batch); diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c index bc232fce5b39..1c0e9d218f6f 100644 --- a/arch/riscv/kernel/probes/opt.c +++ b/arch/riscv/kernel/probes/opt.c @@ -448,11 +448,19 @@ static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op) int arch_prepared_optinsn(struct arch_optimized_insn *optinsn) { - return 0; + return optinsn->length; } int arch_check_optimized_kprobe(struct optimized_kprobe *op) { + unsigned long i; + struct kprobe *p; + + for (i = RVC_INSN_LEN; i < op->optinsn.length; i += RVC_INSN_LEN) { + p = get_kprobe(op->kp.addr + i); + if (p && !kprobe_disabled(p)) + return -EEXIST; + } return 0; } @@ -521,23 +529,74 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, void arch_remove_optimized_kprobe(struct optimized_kprobe *op) { + if (op->optinsn.insn) { + free_optinsn_slot(op->optinsn.insn, 1); + op->optinsn.insn = NULL; + op->optinsn.length = 0; + } } void arch_optimize_kprobes(struct list_head *oplist) { + long offs; + kprobe_opcode_t insn[3]; + struct optimized_kprobe *op, *tmp; + + list_for_each_entry_safe(op, tmp, oplist, list) { + WARN_ON(kprobe_disabled(&op->kp)); + + /* Backup instructions which will be replaced by jump address */ + memcpy(op->optinsn.copied_insn, + DETOUR_ADDR(op->kp.addr, GET_INSN_LENGTH(op->kp.opcode)), + op->optinsn.length - GET_INSN_LENGTH(op->kp.opcode)); + + /* + * After patch, it should be: + * auipc free_register, %hi(detour_buffer) + * jalr free_register, free_register, %lo(detour_buffer) + * where free_register will eventually save the return address + */ + offs = (unsigned long)op->optinsn.insn - + (unsigned long)op->kp.addr; + insn[0] = rv_auipc(op->optinsn.rd, (offs + (1 << 11)) >> 12); + insn[1] = rv_jalr(op->optinsn.rd, op->optinsn.rd, offs & 0xFFF); + /* For 3 RVC + 1 RVI scenario, need C.NOP for padding */ + if (op->optinsn.length > 2 * RVI_INSN_LEN) + insn[2] = rvc_addi(0, 0); + + patch_text_batch(op->kp.addr, insn, op->optinsn.length); + if (memcmp(op->kp.addr, insn, op->optinsn.length)) + continue; + + list_del_init(&op->list); + } } void arch_unoptimize_kprobes(struct list_head *oplist, struct list_head *done_list) { + struct optimized_kprobe *op, *tmp; + + list_for_each_entry_safe(op, tmp, oplist, list) { + arch_unoptimize_kprobe(op); + list_move(&op->list, done_list); + } } void arch_unoptimize_kprobe(struct optimized_kprobe *op) { + kprobe_opcode_t buf[MAX_COPIED_INSN]; + unsigned long offset = GET_INSN_LENGTH(op->kp.opcode); + + buf[0] = (offset == RVI_INSN_LEN) ? __BUG_INSN_32 : __BUG_INSN_16; + memcpy(DETOUR_ADDR(buf, offset), op->optinsn.copied_insn, + op->optinsn.length - offset); + patch_text_batch(op->kp.addr, buf, op->optinsn.length); } int arch_within_optimized_kprobe(struct optimized_kprobe *op, kprobe_opcode_t *addr) { - return 0; + return (op->kp.addr <= addr && + op->kp.addr + op->optinsn.length > addr); } From patchwork Sat Dec 24 11:43:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Xim X-Patchwork-Id: 13081361 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3C6E2C4332F for ; Sat, 24 Dec 2022 15:21:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=wiBgUXcbILv+AApX34fsPEhzsbYXyZJP+w+BqojEkgM=; b=hMFNeeddrzFXAw oxT30wcb+lLEjJP4J0JN7R7OCN55PvqngHYSxJWRFgqQ2372vxBtm8ACfim70NX1nhH+axQ8vEasx +Fmk4vbE3xYBdmn5C9KsrHzl67MPNOJ5XcrICsKQg5LHHbUK1SEjYApc8XYuur3rP5jZ8BnAwXOwq v3EK39cvP89eD3BUBnKk9nmpvW/sPr3Zm4VzdhOCo+x4xKoerHxNtfS47LP0OWf1JH5Qb9N7ZL3KG KBKZ4fducA4TOFyYMQub7zaDk+ROX3myHRu03o0vG8qu2JV718UOKib257hfNrdv90xJjH97KBMu6 a/2WAQHFY2ln1elawKlA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p96L6-0012W5-Fz; Sat, 24 Dec 2022 15:21:32 +0000 Received: from smtp23.cstnet.cn ([159.226.251.23] helo=cstnet.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p92wO-00H5ug-L1 for linux-riscv@lists.infradead.org; Sat, 24 Dec 2022 11:43:53 +0000 Received: from cgk-Precision-3650-Tower.. (unknown [219.141.235.82]) by APP-03 (Coremail) with SMTP id rQCowABXXpbf5aZj9dVkCA--.18955S13; Sat, 24 Dec 2022 19:43:29 +0800 (CST) From: Chen Guokai To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com, sfr@canb.auug.org.au Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, liaochang1@huawei.com, Chen Guokai , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= Subject: [PATCH v5 9/9] riscv/kprobe: Search free registers from unused caller-saved ones Date: Sat, 24 Dec 2022 19:43:15 +0800 Message-Id: <20221224114315.850130-10-chenguokai17@mails.ucas.ac.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> References: <20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn> MIME-Version: 1.0 X-CM-TRANSID: rQCowABXXpbf5aZj9dVkCA--.18955S13 X-Coremail-Antispam: 1UD129KBjvJXoWxtF48Kw45tr17Kr4UGr1xZrb_yoW3uw1xpF sxGw4FvrWUJa4kurZrAw1kGr4fKrs8Kr40vFyUtFWSqr4aq39Ivayv93W3Xr1DtF40vr4x ArW5Cry8urW3AFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQa14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26F4UJVW0owA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Cr1j 6rxdM2vYz4IE04k24VAvwVAKI4IrM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4 xI64kE6c02F40Ex7xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8 JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20V AGYxC7M4IIrI8v6xkF7I0E8cxan2IY04v7MxkIecxEwVAFwVW8KwCF04k20xvY0x0EwIxG rwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4 vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IY x2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw2 0EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x02 67AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUSApnUUUUU= X-Originating-IP: [219.141.235.82] X-CM-SenderInfo: xfkh0w5xrntxyrx6ztxlovh3xfdvhtffof0/1tbiAwgKE2OmnOhnMQAAs1 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221224_034349_159886_F20323EB X-CRM114-Status: GOOD ( 18.34 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org This patch further allows optprobe to use caller-saved registers that is not used across the function being optimized as free registers. Signed-off-by: Chen Guokai Co-developed-by: Liao Chang Signed-off-by: Liao Chang Reported-by: Björn Töpel --- arch/riscv/include/asm/kprobes.h | 1 + arch/riscv/kernel/probes/decode-insn.h | 5 + arch/riscv/kernel/probes/opt.c | 121 ++++++++++++++++++++++--- 3 files changed, 112 insertions(+), 15 deletions(-) diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h index e40c837d0a1d..7fecec799077 100644 --- a/arch/riscv/include/asm/kprobes.h +++ b/arch/riscv/include/asm/kprobes.h @@ -86,6 +86,7 @@ struct arch_optimized_insn { kprobe_opcode_t *insn; unsigned long length; int rd; + u32 free_reg; }; #endif /* CONFIG_OPTPROBES */ diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h index 785b023a62ea..907b951f2c86 100644 --- a/arch/riscv/kernel/probes/decode-insn.h +++ b/arch/riscv/kernel/probes/decode-insn.h @@ -13,6 +13,11 @@ enum probe_insn { INSN_GOOD, }; +#define NRREG 32 +#define ALL_REG_OCCUPIED 0xffffffffu +/* If a register is not caller-saved, its corresponding bit is set */ +#define NON_CALLER_SAVED_MASK 0xffc031d + enum probe_insn __kprobes riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi); diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c index 1c0e9d218f6f..884e77d2df4c 100644 --- a/arch/riscv/kernel/probes/opt.c +++ b/arch/riscv/kernel/probes/opt.c @@ -12,6 +12,7 @@ #include #include #include +#include #include "simulate-insn.h" #include "decode-insn.h" @@ -130,7 +131,7 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot, * as a destination register before any branch or jump instruction. */ static void find_register(unsigned long start, unsigned long end, - unsigned long *write, unsigned long *read) + unsigned long *write, unsigned long *read) { kprobe_opcode_t insn; unsigned long addr, offset = 0UL; @@ -390,16 +391,99 @@ static int search_copied_insn(unsigned long paddr, struct optimized_kprobe *op) return 0; } +static void update_free_reg(unsigned long addr, uint32_t *used_reg) +{ + kprobe_opcode_t insn = *(kprobe_opcode_t *)addr; + unsigned long offset = GET_INSN_LENGTH(insn); + +#ifdef CONFIG_RISCV_ISA_C + if (offset == RVI_INSN_LEN) + goto is_rvi; + + insn &= __COMPRESSED_INSN_MASK; + if (riscv_insn_is_c_jal(insn)) { + *used_reg |= 1 << 1; + } else if (riscv_insn_is_c_jr(insn)) { + *used_reg |= 1 << rvc_r_rs1(insn); + } else if (riscv_insn_is_c_jalr(insn)) { + *used_reg |= 1 << rvc_r_rs1(insn); + } else if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn)) { + *used_reg |= 1 << rvc_b_rs(insn); + } else if (riscv_insn_is_c_sub(insn) || riscv_insn_is_c_subw(insn)) { + *used_reg |= 1 << rvc_a_rs1(insn); + *used_reg |= 1 << rvc_a_rs2(insn); + } else if (riscv_insn_is_c_sq(insn) || riscv_insn_is_c_sw(insn) || + riscv_insn_is_c_sd(insn)) { + *used_reg |= 1 << rvc_s_rs1(insn); + *used_reg |= 1 << rvc_s_rs2(insn); + } else if (riscv_insn_is_c_addi16sp(insn) || riscv_insn_is_c_addi(insn) || + riscv_insn_is_c_addiw(insn) || + riscv_insn_is_c_slli(insn)) { + *used_reg |= 1 << rvc_i_rs1(insn); + } else if (riscv_insn_is_c_sri(insn) || + riscv_insn_is_c_andi(insn)) { + *used_reg |= 1 << rvc_b_rs(insn); + } else if (riscv_insn_is_c_sqsp(insn) || riscv_insn_is_c_swsp(insn) || + riscv_insn_is_c_sdsp(insn)) { + *used_reg |= 1 << rvc_ss_rs2(insn); + *used_reg |= 1 << 2; + } else if (riscv_insn_is_c_mv(insn)) { + *used_reg |= 1 << rvc_r_rs2(insn); + } else if (riscv_insn_is_c_addi4spn(insn)) { + *used_reg |= 1 << 2; + } else if (riscv_insn_is_c_lq(insn) || riscv_insn_is_c_lw(insn) || + riscv_insn_is_c_ld(insn)) { + *used_reg |= 1 << rvc_l_rs(insn); + } else if (riscv_insn_is_c_lqsp(insn) || riscv_insn_is_c_lwsp(insn) || + riscv_insn_is_c_ldsp(insn)) { + *used_reg |= 1 << 2; + } + /* li and lui does not have source reg */ + return; +is_rvi: +#endif + if (riscv_insn_is_arith_ri(insn) || riscv_insn_is_load(insn)) { + *used_reg |= 1 << rvi_rs1(insn); + } else if (riscv_insn_is_arith_rr(insn) || riscv_insn_is_store(insn) || + riscv_insn_is_amo(insn)) { + *used_reg |= 1 << rvi_rs1(insn); + *used_reg |= 1 << rvi_rs2(insn); + } else if (riscv_insn_is_branch(insn)) { + *used_reg |= 1 << rvi_rs1(insn); + *used_reg |= 1 << rvi_rs2(insn); + } else if (riscv_insn_is_jalr(insn)) { + *used_reg |= 1 << rvi_rs1(insn); + } +} + +static bool scan_code(unsigned long *addr, unsigned long paddr, + struct optimized_kprobe *op, uint32_t *used_reg) +{ + if (insn_jump_into_range(*addr, paddr + RVC_INSN_LEN, + paddr + op->optinsn.length)) + return false; + if (search_exception_tables(*addr)) + return false; + update_free_reg(*addr, used_reg); + *addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr); + return true; +} + /* * The kprobe can be optimized when no in-function jump reaches to the * instructions replaced by optimized jump instructions(AUIPC/JALR). */ -static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op) +static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op, uint32_t *used_reg) { int ret; unsigned long addr, size = 0, offset = 0; struct kprobe *kp = get_kprobe((kprobe_opcode_t *)paddr); + /* + * All callee + */ + *used_reg = NON_CALLER_SAVED_MASK; + /* * Skip optimization if kprobe has been disarmed or instrumented * instruction support XOI. @@ -429,18 +513,14 @@ static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op) */ addr = paddr - offset; while (addr < paddr) { - if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN, - paddr + op->optinsn.length)) + if (!scan_code(&addr, paddr, op, used_reg)) return false; - addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr); } - - addr = paddr + op->optinsn.length; + update_free_reg((unsigned long)&kp->opcode, used_reg); + addr = paddr + GET_INSN_LENGTH(*(kprobe_opcode_t *)&kp->opcode); while (addr < paddr - offset + size) { - if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN, - paddr + op->optinsn.length)) + if (!scan_code(&addr, paddr, op, used_reg)) return false; - addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr); } return true; @@ -469,10 +549,13 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, { long rel; int rd, ra, ret; + u32 used_reg; kprobe_opcode_t *code = NULL, *slot = NULL; - if (!can_optimize((unsigned long)orig->addr, op)) + if (!can_optimize((unsigned long)orig->addr, op, &used_reg)) { + op->optinsn.rd = -1; return -EILSEQ; + } code = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL); slot = get_optinsn_slot(); @@ -497,11 +580,17 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, } /* - * Search two free registers, rd is used as to form AUIPC/JALR jumping - * to detour buffer, ra is used as to form JR jumping back from detour - * buffer. + * Search two free registers if no unused ones, rd is used as to form + * AUIPC/JALR jumping to detour buffer, ra is used as to form JR jumping + * back from detour buffer. */ - find_free_registers(orig, op, &rd, &ra); + if (used_reg == ALL_REG_OCCUPIED) { + find_free_registers(orig, op, &rd, &ra); + } else { + rd = ffz(used_reg); + ra = rd; + } + if (rd == 0 || ra == 0) { ret = -EILSEQ; goto on_error; @@ -545,6 +634,8 @@ void arch_optimize_kprobes(struct list_head *oplist) list_for_each_entry_safe(op, tmp, oplist, list) { WARN_ON(kprobe_disabled(&op->kp)); + if (op->optinsn.rd < 0) + continue; /* Backup instructions which will be replaced by jump address */ memcpy(op->optinsn.copied_insn, DETOUR_ADDR(op->kp.addr, GET_INSN_LENGTH(op->kp.opcode)),