From patchwork Sun Jun 16 17:05:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jisheng Zhang X-Patchwork-Id: 13699608 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8802CC27C53 for ; Sun, 16 Jun 2024 17:19:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=t+eyk42N1wMTySIMnfRKB83KpzWYdKWBkRy/iNyw9uI=; b=o2sArEOMDPLcg3 +SWA9X2TrNgUiVOEB2ZcsVRBdhb2lz7NWOZV7st3zs1G0n4zy3yZeotXLc9SRzdotJm1E4lBQZLeu Z91x5LETYJ0Fs94dYAQ49VL1krNN4QhTYnrgJtT+Effzxw7Gdh+iXKs7tSoE/iE6rsWRbjW3cB5ZO skqGUGjcoif6cd9PCzO32/VU2pPubtC3IKlKkL6B43yJ2nWuG0YJppfpr5c7dJre5BjD3TBEIB1Lg 56La5VPKHMTdMh8en/8DECsAx563sde91AbvOAW+NVHcsVcP7+EVmhr35SXwZWmYobhdVmLhU7TIA IogaBheYbzYruSnYtPFQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sItXm-00000007xDW-3aFC; Sun, 16 Jun 2024 17:19:54 +0000 Received: from dfw.source.kernel.org ([2604:1380:4641:c500::1]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sItXi-00000007xBl-2cgz for linux-riscv@lists.infradead.org; Sun, 16 Jun 2024 17:19:52 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id E319160E75; Sun, 16 Jun 2024 17:19:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3D620C4AF1D; Sun, 16 Jun 2024 17:19:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1718558389; bh=Pd+tiD1d1gKrweB2fH2zKG5W19i1XxJlZfBI9MyqDG4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Fr3gZpJIF+mNWTE/k8gjb6h8WqjKYJ5XHk32WwBuZc8BBz2lyf54J0u3VdCN3acew /hXcpsTrkkeyioYogaABM3hRLgX5jMh5nyo541LwzOl7nqgg3xPNdowHIgpz8vh0HJ YLvDhpEBndaex5SskayRd60TjumzaeTGrse3Ahb5p+qNy2ITuhWY3988dww4zuyzZk 0apSPidmNpl3UzXFVrJ6T0ByZcogNbLLpOmVo0JXxhNKvlOYtriq5F2wEgGdo/vHEf WZB0ITPpZX0Cl86QiYJt+EuWMOtmg/0mElBaj+f0gMl1wJkgDK/pbZmRWjq78KK68u X5vBu6LEFW/kw== From: Jisheng Zhang To: Paul Walmsley , Palmer Dabbelt , Albert Ou , Samuel Holland Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Anton Blanchard , Cyril Bur Subject: [PATCH 1/6] riscv: Improve exception and system call latency Date: Mon, 17 Jun 2024 01:05:48 +0800 Message-ID: <20240616170553.2832-2-jszhang@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240616170553.2832-1-jszhang@kernel.org> References: <20240616170553.2832-1-jszhang@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240616_101951_057632_C5ACC45B X-CRM114-Status: GOOD ( 14.38 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Anton Blanchard Many CPUs implement return address branch prediction as a stack. The RISCV architecture refers to this as a return address stack (RAS). If this gets corrupted then the CPU will mispredict at least one but potentally many function returns. There are two issues with the current RISCV exception code: - We are using the alternate link stack (x5/t0) for the indirect branch which makes the hardware think this is a function return. This will corrupt the RAS. - We modify the return address of handle_exception to point to ret_from_exception. This will also corrupt the RAS. Testing the null system call latency before and after the patch: Visionfive2 (StarFive JH7110 / U74) baseline: 189.87 ns patched: 176.76 ns Lichee pi 4a (T-Head TH1520 / C910) baseline: 666.58 ns patched: 636.90 ns Just over 7% on the U74 and just over 4% on the C910. Signed-off-by: Anton Blanchard Signed-off-by: Cyril Bur Reviewed-by: Charlie Jenkins --- arch/riscv/kernel/entry.S | 17 ++++++++++------- arch/riscv/kernel/stacktrace.c | 4 ++-- 2 files changed, 12 insertions(+), 9 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 68a24cf9481a..c933460ed3e9 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -88,7 +88,6 @@ SYM_CODE_START(handle_exception) call riscv_v_context_nesting_start #endif move a0, sp /* pt_regs */ - la ra, ret_from_exception /* * MSB of cause differentiates between @@ -97,7 +96,8 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ - tail do_irq + call do_irq + j ret_from_exception 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -105,11 +105,14 @@ SYM_CODE_START(handle_exception) la t2, excp_vect_table_end add t0, t1, t0 /* Check if exception code lies within bounds */ - bgeu t0, t2, 1f - REG_L t0, 0(t0) - jr t0 -1: - tail do_trap_unknown + bgeu t0, t2, 3f + REG_L t1, 0(t0) +2: jalr t1 + j ret_from_exception +3: + + la t1, do_trap_unknown + j 2b SYM_CODE_END(handle_exception) ASM_NOKPROBE(handle_exception) diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c index 528ec7cc9a62..5eb3d135b717 100644 --- a/arch/riscv/kernel/stacktrace.c +++ b/arch/riscv/kernel/stacktrace.c @@ -16,7 +16,7 @@ #ifdef CONFIG_FRAME_POINTER -extern asmlinkage void ret_from_exception(void); +extern asmlinkage void handle_exception(void); static inline int fp_is_valid(unsigned long fp, unsigned long sp) { @@ -70,7 +70,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, fp = frame->fp; pc = ftrace_graph_ret_addr(current, NULL, frame->ra, &frame->ra); - if (pc == (unsigned long)ret_from_exception) { + if (pc == (unsigned long)handle_exception) { if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) break;