From patchwork Thu Feb 13 13:00:00 2025
X-Patchwork-Submitter: Jinjie Ruan
X-Patchwork-Id: 13973246
From: Jinjie Ruan
Subject: [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry
Date: Thu, 13 Feb 2025 21:00:00 +0800
Message-ID: <20250213130007.1418890-2-ruanjinjie@huawei.com>
In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com>
References: <20250213130007.1418890-1-ruanjinjie@huawei.com>

Currently CONFIG_GENERIC_ENTRY enables both the generic exception entry
logic and the generic syscall entry logic, which are otherwise loosely
coupled. Introduce separate config options for these so that
architectures can select the two independently. This will make it easier
for architectures to migrate to the generic entry code.

Suggested-by: Mark Rutland
Signed-off-by: Jinjie Ruan
---
v6:
- Update the commit message.
- Have this before the arm64 changes.
- Make GENERIC_SYSCALL depend on GENERIC_IRQ_ENTRY.
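To illustrate what the split enables: an architecture that only wants the
generic exception entry helpers can select GENERIC_IRQ_ENTRY without also
selecting GENERIC_SYSCALL. The following minimal sketch is not part of this
patch; the handler name is hypothetical, and only the helpers declared in
the new <linux/irq-entry-common.h> are used.

static void noinstr arch_handle_el1_irq(struct pt_regs *regs)
{
	/* Helpers below come from <linux/irq-entry-common.h>. */
	irqentry_state_t state = irqentry_enter(regs);

	instrumentation_begin();
	/* dispatch the interrupt; instrumentable code is allowed here */
	instrumentation_end();

	irqentry_exit(regs, state);
}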
---
 MAINTAINERS                      |   1 +
 arch/Kconfig                     |   9 +
 include/linux/entry-common.h     | 382 +-----------------------------
 include/linux/irq-entry-common.h | 389 +++++++++++++++++++++++++++++++
 kernel/entry/Makefile            |   3 +-
 kernel/entry/common.c            | 160 +------------
 kernel/entry/syscall-common.c    | 159 +++++++++++++
 kernel/sched/core.c              |   8 +-
 8 files changed, 566 insertions(+), 545 deletions(-)
 create mode 100644 include/linux/irq-entry-common.h
 create mode 100644 kernel/entry/syscall-common.c

diff --git a/MAINTAINERS b/MAINTAINERS index 92fc0eca7061..56e72dab6655 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9666,6 +9666,7 @@ S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/entry F: include/linux/entry-common.h F: include/linux/entry-kvm.h +F: include/linux/irq-entry-common.h F: kernel/entry/ GENERIC GPIO I2C DRIVER diff --git a/arch/Kconfig b/arch/Kconfig index b8a4ff365582..b59c23594342 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -64,8 +64,17 @@ config HOTPLUG_PARALLEL bool select HOTPLUG_SPLIT_STARTUP +config GENERIC_IRQ_ENTRY + bool + +config GENERIC_SYSCALL + bool + depends on GENERIC_IRQ_ENTRY + config GENERIC_ENTRY bool + select GENERIC_IRQ_ENTRY + select GENERIC_SYSCALL config KPROBES bool "Kprobes" diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h index fc61d0205c97..b3233e8328c5 100644 --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -2,27 +2,15 @@ #ifndef __LINUX_ENTRYCOMMON_H #define __LINUX_ENTRYCOMMON_H -#include +#include #include -#include #include #include -#include #include #include -#include -#include #include -/* - * Define dummy _TIF work flags if not defined by the architecture or for - * disabled functionality.
- */ -#ifndef _TIF_PATCH_PENDING -# define _TIF_PATCH_PENDING (0) -#endif - #ifndef _TIF_UPROBE # define _TIF_UPROBE (0) #endif @@ -55,69 +43,6 @@ SYSCALL_WORK_SYSCALL_EXIT_TRAP | \ ARCH_SYSCALL_WORK_EXIT) -/* - * TIF flags handled in exit_to_user_mode_loop() - */ -#ifndef ARCH_EXIT_TO_USER_MODE_WORK -# define ARCH_EXIT_TO_USER_MODE_WORK (0) -#endif - -#define EXIT_TO_USER_MODE_WORK \ - (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ - _TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \ - _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \ - ARCH_EXIT_TO_USER_MODE_WORK) - -/** - * arch_enter_from_user_mode - Architecture specific sanity check for user mode regs - * @regs: Pointer to currents pt_regs - * - * Defaults to an empty implementation. Can be replaced by architecture - * specific code. - * - * Invoked from syscall_enter_from_user_mode() in the non-instrumentable - * section. Use __always_inline so the compiler cannot push it out of line - * and make it instrumentable. - */ -static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs); - -#ifndef arch_enter_from_user_mode -static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {} -#endif - -/** - * enter_from_user_mode - Establish state when coming from user mode - * - * Syscall/interrupt entry disables interrupts, but user mode is traced as - * interrupts enabled. Also with NO_HZ_FULL RCU might be idle. - * - * 1) Tell lockdep that interrupts are disabled - * 2) Invoke context tracking if enabled to reactivate RCU - * 3) Trace interrupts off state - * - * Invoked from architecture specific syscall entry code with interrupts - * disabled. The calling code has to be non-instrumentable. When the - * function returns all state is correct and interrupts are still - * disabled. The subsequent functions can be instrumented. - * - * This is invoked when there is architecture specific functionality to be - * done between establishing state and enabling interrupts. The caller must - * enable interrupts before invoking syscall_enter_from_user_mode_work(). - */ -static __always_inline void enter_from_user_mode(struct pt_regs *regs) -{ - arch_enter_from_user_mode(regs); - lockdep_hardirqs_off(CALLER_ADDR0); - - CT_WARN_ON(__ct_state() != CT_STATE_USER); - user_exit_irqoff(); - - instrumentation_begin(); - kmsan_unpoison_entry_regs(regs); - trace_hardirqs_off_finish(); - instrumentation_end(); -} - /** * syscall_enter_from_user_mode_prepare - Establish state and enable interrupts * @regs: Pointer to currents pt_regs @@ -202,170 +127,6 @@ static __always_inline long syscall_enter_from_user_mode(struct pt_regs *regs, l return ret; } -/** - * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable() - * @ti_work: Cached TIF flags gathered with interrupts disabled - * - * Defaults to local_irq_enable(). Can be supplied by architecture specific - * code. - */ -static inline void local_irq_enable_exit_to_user(unsigned long ti_work); - -#ifndef local_irq_enable_exit_to_user -static inline void local_irq_enable_exit_to_user(unsigned long ti_work) -{ - local_irq_enable(); -} -#endif - -/** - * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable() - * - * Defaults to local_irq_disable(). Can be supplied by architecture specific - * code. 
- */ -static inline void local_irq_disable_exit_to_user(void); - -#ifndef local_irq_disable_exit_to_user -static inline void local_irq_disable_exit_to_user(void) -{ - local_irq_disable(); -} -#endif - -/** - * arch_exit_to_user_mode_work - Architecture specific TIF work for exit - * to user mode. - * @regs: Pointer to currents pt_regs - * @ti_work: Cached TIF flags gathered with interrupts disabled - * - * Invoked from exit_to_user_mode_loop() with interrupt enabled - * - * Defaults to NOOP. Can be supplied by architecture specific code. - */ -static inline void arch_exit_to_user_mode_work(struct pt_regs *regs, - unsigned long ti_work); - -#ifndef arch_exit_to_user_mode_work -static inline void arch_exit_to_user_mode_work(struct pt_regs *regs, - unsigned long ti_work) -{ -} -#endif - -/** - * arch_exit_to_user_mode_prepare - Architecture specific preparation for - * exit to user mode. - * @regs: Pointer to currents pt_regs - * @ti_work: Cached TIF flags gathered with interrupts disabled - * - * Invoked from exit_to_user_mode_prepare() with interrupt disabled as the last - * function before return. Defaults to NOOP. - */ -static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, - unsigned long ti_work); - -#ifndef arch_exit_to_user_mode_prepare -static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, - unsigned long ti_work) -{ -} -#endif - -/** - * arch_exit_to_user_mode - Architecture specific final work before - * exit to user mode. - * - * Invoked from exit_to_user_mode() with interrupt disabled as the last - * function before return. Defaults to NOOP. - * - * This needs to be __always_inline because it is non-instrumentable code - * invoked after context tracking switched to user mode. - * - * An architecture implementation must not do anything complex, no locking - * etc. The main purpose is for speculation mitigations. - */ -static __always_inline void arch_exit_to_user_mode(void); - -#ifndef arch_exit_to_user_mode -static __always_inline void arch_exit_to_user_mode(void) { } -#endif - -/** - * arch_do_signal_or_restart - Architecture specific signal delivery function - * @regs: Pointer to currents pt_regs - * - * Invoked from exit_to_user_mode_loop(). 
- */ -void arch_do_signal_or_restart(struct pt_regs *regs); - -/** - * exit_to_user_mode_loop - do any pending work before leaving to user space - */ -unsigned long exit_to_user_mode_loop(struct pt_regs *regs, - unsigned long ti_work); - -/** - * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required - * @regs: Pointer to pt_regs on entry stack - * - * 1) check that interrupts are disabled - * 2) call tick_nohz_user_enter_prepare() - * 3) call exit_to_user_mode_loop() if any flags from - * EXIT_TO_USER_MODE_WORK are set - * 4) check that interrupts are still disabled - */ -static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs) -{ - unsigned long ti_work; - - lockdep_assert_irqs_disabled(); - - /* Flush pending rcuog wakeup before the last need_resched() check */ - tick_nohz_user_enter_prepare(); - - ti_work = read_thread_flags(); - if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK)) - ti_work = exit_to_user_mode_loop(regs, ti_work); - - arch_exit_to_user_mode_prepare(regs, ti_work); - - /* Ensure that kernel state is sane for a return to userspace */ - kmap_assert_nomap(); - lockdep_assert_irqs_disabled(); - lockdep_sys_exit(); -} - -/** - * exit_to_user_mode - Fixup state when exiting to user mode - * - * Syscall/interrupt exit enables interrupts, but the kernel state is - * interrupts disabled when this is invoked. Also tell RCU about it. - * - * 1) Trace interrupts on state - * 2) Invoke context tracking if enabled to adjust RCU state - * 3) Invoke architecture specific last minute exit code, e.g. speculation - * mitigations, etc.: arch_exit_to_user_mode() - * 4) Tell lockdep that interrupts are enabled - * - * Invoked from architecture specific code when syscall_exit_to_user_mode() - * is not suitable as the last step before returning to userspace. Must be - * invoked with interrupts disabled and the caller must be - * non-instrumentable. - * The caller has to invoke syscall_exit_to_user_mode_work() before this. - */ -static __always_inline void exit_to_user_mode(void) -{ - instrumentation_begin(); - trace_hardirqs_on_prepare(); - lockdep_hardirqs_on_prepare(); - instrumentation_end(); - - user_enter_irqoff(); - arch_exit_to_user_mode(); - lockdep_hardirqs_on(CALLER_ADDR0); -} - /** * syscall_exit_to_user_mode_work - Handle work before returning to user mode * @regs: Pointer to currents pt_regs @@ -412,145 +173,4 @@ void syscall_exit_to_user_mode_work(struct pt_regs *regs); */ void syscall_exit_to_user_mode(struct pt_regs *regs); -/** - * irqentry_enter_from_user_mode - Establish state before invoking the irq handler - * @regs: Pointer to currents pt_regs - * - * Invoked from architecture specific entry code with interrupts disabled. - * Can only be called when the interrupt entry came from user mode. The - * calling code must be non-instrumentable. When the function returns all - * state is correct and the subsequent functions can be instrumented. - * - * The function establishes state (lockdep, RCU (context tracking), tracing) - */ -void irqentry_enter_from_user_mode(struct pt_regs *regs); - -/** - * irqentry_exit_to_user_mode - Interrupt exit work - * @regs: Pointer to current's pt_regs - * - * Invoked with interrupts disabled and fully valid regs. Returns with all - * work handled, interrupts disabled such that the caller can immediately - * switch to user mode. Called from architecture specific interrupt - * handling code. - * - * The call order is #2 and #3 as described in syscall_exit_to_user_mode(). 
- * Interrupt exit is not invoking #1 which is the syscall specific one time - * work. - */ -void irqentry_exit_to_user_mode(struct pt_regs *regs); - -#ifndef irqentry_state -/** - * struct irqentry_state - Opaque object for exception state storage - * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the - * exit path has to invoke ct_irq_exit(). - * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that - * lockdep state is restored correctly on exit from nmi. - * - * This opaque object is filled in by the irqentry_*_enter() functions and - * must be passed back into the corresponding irqentry_*_exit() functions - * when the exception is complete. - * - * Callers of irqentry_*_[enter|exit]() must consider this structure opaque - * and all members private. Descriptions of the members are provided to aid in - * the maintenance of the irqentry_*() functions. - */ -typedef struct irqentry_state { - union { - bool exit_rcu; - bool lockdep; - }; -} irqentry_state_t; -#endif - -/** - * irqentry_enter - Handle state tracking on ordinary interrupt entries - * @regs: Pointer to pt_regs of interrupted context - * - * Invokes: - * - lockdep irqflag state tracking as low level ASM entry disabled - * interrupts. - * - * - Context tracking if the exception hit user mode. - * - * - The hardirq tracer to keep the state consistent as low level ASM - * entry disabled interrupts. - * - * As a precondition, this requires that the entry came from user mode, - * idle, or a kernel context in which RCU is watching. - * - * For kernel mode entries RCU handling is done conditional. If RCU is - * watching then the only RCU requirement is to check whether the tick has - * to be restarted. If RCU is not watching then ct_irq_enter() has to be - * invoked on entry and ct_irq_exit() on exit. - * - * Avoiding the ct_irq_enter/exit() calls is an optimization but also - * solves the problem of kernel mode pagefaults which can schedule, which - * is not possible after invoking ct_irq_enter() without undoing it. - * - * For user mode entries irqentry_enter_from_user_mode() is invoked to - * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit - * would not be possible. - * - * Returns: An opaque object that must be passed to idtentry_exit() - */ -irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs); - -/** - * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt - * - * Conditional reschedule with additional sanity checks. 
- */ -void raw_irqentry_exit_cond_resched(void); -#ifdef CONFIG_PREEMPT_DYNAMIC -#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL) -#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched -#define irqentry_exit_cond_resched_dynamic_disabled NULL -DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched); -#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)() -#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY) -DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); -void dynamic_irqentry_exit_cond_resched(void); -#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched() -#endif -#else /* CONFIG_PREEMPT_DYNAMIC */ -#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched() -#endif /* CONFIG_PREEMPT_DYNAMIC */ - -/** - * irqentry_exit - Handle return from exception that used irqentry_enter() - * @regs: Pointer to pt_regs (exception entry regs) - * @state: Return value from matching call to irqentry_enter() - * - * Depending on the return target (kernel/user) this runs the necessary - * preemption and work checks if possible and required and returns to - * the caller with interrupts disabled and no further work pending. - * - * This is the last action before returning to the low level ASM code which - * just needs to return to the appropriate context. - * - * Counterpart to irqentry_enter(). - */ -void noinstr irqentry_exit(struct pt_regs *regs, irqentry_state_t state); - -/** - * irqentry_nmi_enter - Handle NMI entry - * @regs: Pointer to currents pt_regs - * - * Similar to irqentry_enter() but taking care of the NMI constraints. - */ -irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs); - -/** - * irqentry_nmi_exit - Handle return from NMI handling - * @regs: Pointer to pt_regs (NMI entry regs) - * @irq_state: Return value from matching call to irqentry_nmi_enter() - * - * Last action before returning to the low level assembly code. - * - * Counterpart to irqentry_nmi_enter(). - */ -void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state); - #endif diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h new file mode 100644 index 000000000000..8af374331900 --- /dev/null +++ b/include/linux/irq-entry-common.h @@ -0,0 +1,389 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_IRQENTRYCOMMON_H +#define __LINUX_IRQENTRYCOMMON_H + +#include +#include +#include +#include +#include + +#include + +/* + * Define dummy _TIF work flags if not defined by the architecture or for + * disabled functionality. + */ +#ifndef _TIF_PATCH_PENDING +# define _TIF_PATCH_PENDING (0) +#endif + +/* + * TIF flags handled in exit_to_user_mode_loop() + */ +#ifndef ARCH_EXIT_TO_USER_MODE_WORK +# define ARCH_EXIT_TO_USER_MODE_WORK (0) +#endif + +#define EXIT_TO_USER_MODE_WORK \ + (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ + _TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \ + _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \ + ARCH_EXIT_TO_USER_MODE_WORK) + +/** + * arch_enter_from_user_mode - Architecture specific sanity check for user mode regs + * @regs: Pointer to currents pt_regs + * + * Defaults to an empty implementation. Can be replaced by architecture + * specific code. + * + * Invoked from syscall_enter_from_user_mode() in the non-instrumentable + * section. Use __always_inline so the compiler cannot push it out of line + * and make it instrumentable. 
+ */ +static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs); + +#ifndef arch_enter_from_user_mode +static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {} +#endif + +/** + * enter_from_user_mode - Establish state when coming from user mode + * + * Syscall/interrupt entry disables interrupts, but user mode is traced as + * interrupts enabled. Also with NO_HZ_FULL RCU might be idle. + * + * 1) Tell lockdep that interrupts are disabled + * 2) Invoke context tracking if enabled to reactivate RCU + * 3) Trace interrupts off state + * + * Invoked from architecture specific syscall entry code with interrupts + * disabled. The calling code has to be non-instrumentable. When the + * function returns all state is correct and interrupts are still + * disabled. The subsequent functions can be instrumented. + * + * This is invoked when there is architecture specific functionality to be + * done between establishing state and enabling interrupts. The caller must + * enable interrupts before invoking syscall_enter_from_user_mode_work(). + */ +static __always_inline void enter_from_user_mode(struct pt_regs *regs) +{ + arch_enter_from_user_mode(regs); + lockdep_hardirqs_off(CALLER_ADDR0); + + CT_WARN_ON(__ct_state() != CT_STATE_USER); + user_exit_irqoff(); + + instrumentation_begin(); + kmsan_unpoison_entry_regs(regs); + trace_hardirqs_off_finish(); + instrumentation_end(); +} + +/** + * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable() + * @ti_work: Cached TIF flags gathered with interrupts disabled + * + * Defaults to local_irq_enable(). Can be supplied by architecture specific + * code. + */ +static inline void local_irq_enable_exit_to_user(unsigned long ti_work); + +#ifndef local_irq_enable_exit_to_user +static inline void local_irq_enable_exit_to_user(unsigned long ti_work) +{ + local_irq_enable(); +} +#endif + +/** + * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable() + * + * Defaults to local_irq_disable(). Can be supplied by architecture specific + * code. + */ +static inline void local_irq_disable_exit_to_user(void); + +#ifndef local_irq_disable_exit_to_user +static inline void local_irq_disable_exit_to_user(void) +{ + local_irq_disable(); +} +#endif + +/** + * arch_exit_to_user_mode_work - Architecture specific TIF work for exit + * to user mode. + * @regs: Pointer to currents pt_regs + * @ti_work: Cached TIF flags gathered with interrupts disabled + * + * Invoked from exit_to_user_mode_loop() with interrupt enabled + * + * Defaults to NOOP. Can be supplied by architecture specific code. + */ +static inline void arch_exit_to_user_mode_work(struct pt_regs *regs, + unsigned long ti_work); + +#ifndef arch_exit_to_user_mode_work +static inline void arch_exit_to_user_mode_work(struct pt_regs *regs, + unsigned long ti_work) +{ +} +#endif + +/** + * arch_exit_to_user_mode_prepare - Architecture specific preparation for + * exit to user mode. + * @regs: Pointer to currents pt_regs + * @ti_work: Cached TIF flags gathered with interrupts disabled + * + * Invoked from exit_to_user_mode_prepare() with interrupt disabled as the last + * function before return. Defaults to NOOP. 
+ */ +static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, + unsigned long ti_work); + +#ifndef arch_exit_to_user_mode_prepare +static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, + unsigned long ti_work) +{ +} +#endif + +/** + * arch_exit_to_user_mode - Architecture specific final work before + * exit to user mode. + * + * Invoked from exit_to_user_mode() with interrupt disabled as the last + * function before return. Defaults to NOOP. + * + * This needs to be __always_inline because it is non-instrumentable code + * invoked after context tracking switched to user mode. + * + * An architecture implementation must not do anything complex, no locking + * etc. The main purpose is for speculation mitigations. + */ +static __always_inline void arch_exit_to_user_mode(void); + +#ifndef arch_exit_to_user_mode +static __always_inline void arch_exit_to_user_mode(void) { } +#endif + +/** + * arch_do_signal_or_restart - Architecture specific signal delivery function + * @regs: Pointer to currents pt_regs + * + * Invoked from exit_to_user_mode_loop(). + */ +void arch_do_signal_or_restart(struct pt_regs *regs); + +/** + * exit_to_user_mode_loop - do any pending work before leaving to user space + */ +unsigned long exit_to_user_mode_loop(struct pt_regs *regs, + unsigned long ti_work); + +/** + * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required + * @regs: Pointer to pt_regs on entry stack + * + * 1) check that interrupts are disabled + * 2) call tick_nohz_user_enter_prepare() + * 3) call exit_to_user_mode_loop() if any flags from + * EXIT_TO_USER_MODE_WORK are set + * 4) check that interrupts are still disabled + */ +static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs) +{ + unsigned long ti_work; + + lockdep_assert_irqs_disabled(); + + /* Flush pending rcuog wakeup before the last need_resched() check */ + tick_nohz_user_enter_prepare(); + + ti_work = read_thread_flags(); + if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK)) + ti_work = exit_to_user_mode_loop(regs, ti_work); + + arch_exit_to_user_mode_prepare(regs, ti_work); + + /* Ensure that kernel state is sane for a return to userspace */ + kmap_assert_nomap(); + lockdep_assert_irqs_disabled(); + lockdep_sys_exit(); +} + +/** + * exit_to_user_mode - Fixup state when exiting to user mode + * + * Syscall/interrupt exit enables interrupts, but the kernel state is + * interrupts disabled when this is invoked. Also tell RCU about it. + * + * 1) Trace interrupts on state + * 2) Invoke context tracking if enabled to adjust RCU state + * 3) Invoke architecture specific last minute exit code, e.g. speculation + * mitigations, etc.: arch_exit_to_user_mode() + * 4) Tell lockdep that interrupts are enabled + * + * Invoked from architecture specific code when syscall_exit_to_user_mode() + * is not suitable as the last step before returning to userspace. Must be + * invoked with interrupts disabled and the caller must be + * non-instrumentable. + * The caller has to invoke syscall_exit_to_user_mode_work() before this. 
+ */ +static __always_inline void exit_to_user_mode(void) +{ + instrumentation_begin(); + trace_hardirqs_on_prepare(); + lockdep_hardirqs_on_prepare(); + instrumentation_end(); + + user_enter_irqoff(); + arch_exit_to_user_mode(); + lockdep_hardirqs_on(CALLER_ADDR0); +} + +/** + * irqentry_enter_from_user_mode - Establish state before invoking the irq handler + * @regs: Pointer to currents pt_regs + * + * Invoked from architecture specific entry code with interrupts disabled. + * Can only be called when the interrupt entry came from user mode. The + * calling code must be non-instrumentable. When the function returns all + * state is correct and the subsequent functions can be instrumented. + * + * The function establishes state (lockdep, RCU (context tracking), tracing) + */ +void irqentry_enter_from_user_mode(struct pt_regs *regs); + +/** + * irqentry_exit_to_user_mode - Interrupt exit work + * @regs: Pointer to current's pt_regs + * + * Invoked with interrupts disabled and fully valid regs. Returns with all + * work handled, interrupts disabled such that the caller can immediately + * switch to user mode. Called from architecture specific interrupt + * handling code. + * + * The call order is #2 and #3 as described in syscall_exit_to_user_mode(). + * Interrupt exit is not invoking #1 which is the syscall specific one time + * work. + */ +void irqentry_exit_to_user_mode(struct pt_regs *regs); + +#ifndef irqentry_state +/** + * struct irqentry_state - Opaque object for exception state storage + * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the + * exit path has to invoke ct_irq_exit(). + * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that + * lockdep state is restored correctly on exit from nmi. + * + * This opaque object is filled in by the irqentry_*_enter() functions and + * must be passed back into the corresponding irqentry_*_exit() functions + * when the exception is complete. + * + * Callers of irqentry_*_[enter|exit]() must consider this structure opaque + * and all members private. Descriptions of the members are provided to aid in + * the maintenance of the irqentry_*() functions. + */ +typedef struct irqentry_state { + union { + bool exit_rcu; + bool lockdep; + }; +} irqentry_state_t; +#endif + +/** + * irqentry_enter - Handle state tracking on ordinary interrupt entries + * @regs: Pointer to pt_regs of interrupted context + * + * Invokes: + * - lockdep irqflag state tracking as low level ASM entry disabled + * interrupts. + * + * - Context tracking if the exception hit user mode. + * + * - The hardirq tracer to keep the state consistent as low level ASM + * entry disabled interrupts. + * + * As a precondition, this requires that the entry came from user mode, + * idle, or a kernel context in which RCU is watching. + * + * For kernel mode entries RCU handling is done conditional. If RCU is + * watching then the only RCU requirement is to check whether the tick has + * to be restarted. If RCU is not watching then ct_irq_enter() has to be + * invoked on entry and ct_irq_exit() on exit. + * + * Avoiding the ct_irq_enter/exit() calls is an optimization but also + * solves the problem of kernel mode pagefaults which can schedule, which + * is not possible after invoking ct_irq_enter() without undoing it. + * + * For user mode entries irqentry_enter_from_user_mode() is invoked to + * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit + * would not be possible. 
+ * + * Returns: An opaque object that must be passed to idtentry_exit() + */ +irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs); + +/** + * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt + * + * Conditional reschedule with additional sanity checks. + */ +void raw_irqentry_exit_cond_resched(void); +#ifdef CONFIG_PREEMPT_DYNAMIC +#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL) +#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched +#define irqentry_exit_cond_resched_dynamic_disabled NULL +DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched); +#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)() +#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY) +DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); +void dynamic_irqentry_exit_cond_resched(void); +#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched() +#endif +#else /* CONFIG_PREEMPT_DYNAMIC */ +#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched() +#endif /* CONFIG_PREEMPT_DYNAMIC */ + +/** + * irqentry_exit - Handle return from exception that used irqentry_enter() + * @regs: Pointer to pt_regs (exception entry regs) + * @state: Return value from matching call to irqentry_enter() + * + * Depending on the return target (kernel/user) this runs the necessary + * preemption and work checks if possible and required and returns to + * the caller with interrupts disabled and no further work pending. + * + * This is the last action before returning to the low level ASM code which + * just needs to return to the appropriate context. + * + * Counterpart to irqentry_enter(). + */ +void noinstr irqentry_exit(struct pt_regs *regs, irqentry_state_t state); + +/** + * irqentry_nmi_enter - Handle NMI entry + * @regs: Pointer to currents pt_regs + * + * Similar to irqentry_enter() but taking care of the NMI constraints. + */ +irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs); + +/** + * irqentry_nmi_exit - Handle return from NMI handling + * @regs: Pointer to pt_regs (NMI entry regs) + * @irq_state: Return value from matching call to irqentry_nmi_enter() + * + * Last action before returning to the low level assembly code. + * + * Counterpart to irqentry_nmi_enter(). 
+ */ +void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state); + +#endif diff --git a/kernel/entry/Makefile b/kernel/entry/Makefile index 095c775e001e..d38f3a7e7396 100644 --- a/kernel/entry/Makefile +++ b/kernel/entry/Makefile @@ -9,5 +9,6 @@ KCOV_INSTRUMENT := n CFLAGS_REMOVE_common.o = -fstack-protector -fstack-protector-strong CFLAGS_common.o += -fno-stack-protector -obj-$(CONFIG_GENERIC_ENTRY) += common.o syscall_user_dispatch.o +obj-$(CONFIG_GENERIC_IRQ_ENTRY) += common.o +obj-$(CONFIG_GENERIC_SYSCALL) += syscall-common.o syscall_user_dispatch.o obj-$(CONFIG_KVM_XFER_TO_GUEST_WORK) += kvm.o diff --git a/kernel/entry/common.c b/kernel/entry/common.c index 20154572ede9..b82032777310 100644 --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -1,84 +1,13 @@ // SPDX-License-Identifier: GPL-2.0 -#include -#include +#include #include #include #include #include #include -#include #include -#include "common.h" - -#define CREATE_TRACE_POINTS -#include - -static inline void syscall_enter_audit(struct pt_regs *regs, long syscall) -{ - if (unlikely(audit_context())) { - unsigned long args[6]; - - syscall_get_arguments(current, regs, args); - audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]); - } -} - -long syscall_trace_enter(struct pt_regs *regs, long syscall, - unsigned long work) -{ - long ret = 0; - - /* - * Handle Syscall User Dispatch. This must comes first, since - * the ABI here can be something that doesn't make sense for - * other syscall_work features. - */ - if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) { - if (syscall_user_dispatch(regs)) - return -1L; - } - - /* Handle ptrace */ - if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) { - ret = ptrace_report_syscall_entry(regs); - if (ret || (work & SYSCALL_WORK_SYSCALL_EMU)) - return -1L; - } - - /* Do seccomp after ptrace, to catch any tracer changes. */ - if (work & SYSCALL_WORK_SECCOMP) { - ret = __secure_computing(); - if (ret == -1L) - return ret; - } - - /* Either of the above might have changed the syscall number */ - syscall = syscall_get_nr(current, regs); - - if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) { - trace_sys_enter(regs, syscall); - /* - * Probes or BPF hooks in the tracepoint may have changed the - * system call number as well. - */ - syscall = syscall_get_nr(current, regs); - } - - syscall_enter_audit(regs, syscall); - - return ret ? : syscall; -} - -noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs) -{ - enter_from_user_mode(regs); - instrumentation_begin(); - local_irq_enable(); - instrumentation_end(); -} - /* Workaround to allow gradual conversion of architecture code */ void __weak arch_do_signal_or_restart(struct pt_regs *regs) { } @@ -133,93 +62,6 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs, return ti_work; } -/* - * If SYSCALL_EMU is set, then the only reason to report is when - * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall - * instruction has been already reported in syscall_enter_from_user_mode(). 
- */ -static inline bool report_single_step(unsigned long work) -{ - if (work & SYSCALL_WORK_SYSCALL_EMU) - return false; - - return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP; -} - -static void syscall_exit_work(struct pt_regs *regs, unsigned long work) -{ - bool step; - - /* - * If the syscall was rolled back due to syscall user dispatching, - * then the tracers below are not invoked for the same reason as - * the entry side was not invoked in syscall_trace_enter(): The ABI - * of these syscalls is unknown. - */ - if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) { - if (unlikely(current->syscall_dispatch.on_dispatch)) { - current->syscall_dispatch.on_dispatch = false; - return; - } - } - - audit_syscall_exit(regs); - - if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT) - trace_sys_exit(regs, syscall_get_return_value(current, regs)); - - step = report_single_step(work); - if (step || work & SYSCALL_WORK_SYSCALL_TRACE) - ptrace_report_syscall_exit(regs, step); -} - -/* - * Syscall specific exit to user mode preparation. Runs with interrupts - * enabled. - */ -static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs) -{ - unsigned long work = READ_ONCE(current_thread_info()->syscall_work); - unsigned long nr = syscall_get_nr(current, regs); - - CT_WARN_ON(ct_state() != CT_STATE_KERNEL); - - if (IS_ENABLED(CONFIG_PROVE_LOCKING)) { - if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr)) - local_irq_enable(); - } - - rseq_syscall(regs); - - /* - * Do one-time syscall specific work. If these work items are - * enabled, we want to run them exactly once per syscall exit with - * interrupts enabled. - */ - if (unlikely(work & SYSCALL_WORK_EXIT)) - syscall_exit_work(regs, work); -} - -static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs) -{ - syscall_exit_to_user_mode_prepare(regs); - local_irq_disable_exit_to_user(); - exit_to_user_mode_prepare(regs); -} - -void syscall_exit_to_user_mode_work(struct pt_regs *regs) -{ - __syscall_exit_to_user_mode_work(regs); -} - -__visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs) -{ - instrumentation_begin(); - __syscall_exit_to_user_mode_work(regs); - instrumentation_end(); - exit_to_user_mode(); -} - noinstr void irqentry_enter_from_user_mode(struct pt_regs *regs) { enter_from_user_mode(regs); diff --git a/kernel/entry/syscall-common.c b/kernel/entry/syscall-common.c new file mode 100644 index 000000000000..88edab20820f --- /dev/null +++ b/kernel/entry/syscall-common.c @@ -0,0 +1,159 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include "common.h" + +#define CREATE_TRACE_POINTS +#include + +static inline void syscall_enter_audit(struct pt_regs *regs, long syscall) +{ + if (unlikely(audit_context())) { + unsigned long args[6]; + + syscall_get_arguments(current, regs, args); + audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]); + } +} + +long syscall_trace_enter(struct pt_regs *regs, long syscall, + unsigned long work) +{ + long ret = 0; + + /* + * Handle Syscall User Dispatch. This must comes first, since + * the ABI here can be something that doesn't make sense for + * other syscall_work features. + */ + if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) { + if (syscall_user_dispatch(regs)) + return -1L; + } + + /* Handle ptrace */ + if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) { + ret = ptrace_report_syscall_entry(regs); + if (ret || (work & SYSCALL_WORK_SYSCALL_EMU)) + return -1L; + } + + /* Do seccomp after ptrace, to catch any tracer changes. 
*/ + if (work & SYSCALL_WORK_SECCOMP) { + ret = __secure_computing(); + if (ret == -1L) + return ret; + } + + /* Either of the above might have changed the syscall number */ + syscall = syscall_get_nr(current, regs); + + if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) { + trace_sys_enter(regs, syscall); + /* + * Probes or BPF hooks in the tracepoint may have changed the + * system call number as well. + */ + syscall = syscall_get_nr(current, regs); + } + + syscall_enter_audit(regs, syscall); + + return ret ? : syscall; +} + +noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs) +{ + enter_from_user_mode(regs); + instrumentation_begin(); + local_irq_enable(); + instrumentation_end(); +} + +/* + * If SYSCALL_EMU is set, then the only reason to report is when + * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall + * instruction has been already reported in syscall_enter_from_user_mode(). + */ +static inline bool report_single_step(unsigned long work) +{ + if (work & SYSCALL_WORK_SYSCALL_EMU) + return false; + + return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP; +} + +static void syscall_exit_work(struct pt_regs *regs, unsigned long work) +{ + bool step; + + /* + * If the syscall was rolled back due to syscall user dispatching, + * then the tracers below are not invoked for the same reason as + * the entry side was not invoked in syscall_trace_enter(): The ABI + * of these syscalls is unknown. + */ + if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) { + if (unlikely(current->syscall_dispatch.on_dispatch)) { + current->syscall_dispatch.on_dispatch = false; + return; + } + } + + audit_syscall_exit(regs); + + if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT) + trace_sys_exit(regs, syscall_get_return_value(current, regs)); + + step = report_single_step(work); + if (step || work & SYSCALL_WORK_SYSCALL_TRACE) + ptrace_report_syscall_exit(regs, step); +} + +/* + * Syscall specific exit to user mode preparation. Runs with interrupts + * enabled. + */ +static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs) +{ + unsigned long work = READ_ONCE(current_thread_info()->syscall_work); + unsigned long nr = syscall_get_nr(current, regs); + + CT_WARN_ON(ct_state() != CT_STATE_KERNEL); + + if (IS_ENABLED(CONFIG_PROVE_LOCKING)) { + if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr)) + local_irq_enable(); + } + + rseq_syscall(regs); + + /* + * Do one-time syscall specific work. If these work items are + * enabled, we want to run them exactly once per syscall exit with + * interrupts enabled. 
+ */ + if (unlikely(work & SYSCALL_WORK_EXIT)) + syscall_exit_work(regs, work); +} + +static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs) +{ + syscall_exit_to_user_mode_prepare(regs); + local_irq_disable_exit_to_user(); + exit_to_user_mode_prepare(regs); +} + +void syscall_exit_to_user_mode_work(struct pt_regs *regs) +{ + __syscall_exit_to_user_mode_work(regs); +} + +__visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs) +{ + instrumentation_begin(); + __syscall_exit_to_user_mode_work(regs); + instrumentation_end(); + exit_to_user_mode(); +} diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8a478903dea7..09d6712c3ed3 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -68,8 +68,8 @@ #include #ifdef CONFIG_PREEMPT_DYNAMIC -# ifdef CONFIG_GENERIC_ENTRY -# include +# ifdef CONFIG_GENERIC_IRQ_ENTRY +# include # endif #endif @@ -7407,8 +7407,8 @@ EXPORT_SYMBOL(__cond_resched_rwlock_write); #ifdef CONFIG_PREEMPT_DYNAMIC -#ifdef CONFIG_GENERIC_ENTRY -#include +#ifdef CONFIG_GENERIC_IRQ_ENTRY +#include #endif /* From patchwork Thu Feb 13 13:00:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinjie Ruan X-Patchwork-Id: 13973248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E16B4C021A4 for ; Thu, 13 Feb 2025 13:14:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-ID:Date :Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=bkaMhcNmEWpgp1MChWKWck85+/ejIU8zUdpdj5tOMuo=; b=GqUlKZgixMHQgyfZU1hOETiUDL 9bQzsquJWimUsIaHMGyhKl+91w9Qd/i69NlKj97pkyIk6PpGe7mmh0ELuLJA8Bb2jhNEN57cieDCY VpogzllIghjGWEfCAfcZLKRwbhSf6sABevF+ONX9uF8W9h/3psZ0p6Ld0z1ue590BnXelXoc0o1sH tLPLXmjOxH+um6f86yaTzcFTDuGzpgbRdXnOJ7OIY+a+jvIedtI8y4z62REGkFil+TKctvYgIcOnd bzwqVH5n+mxZy8jQlW/+UfA0cDVoZTMZbLNvWI3vA3vOIHQbSrSFYNiJ0fIqKmNRBCCxzUDOnIoLe k/9UI1RQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tiZ3I-0000000B7H7-0AjD; Thu, 13 Feb 2025 13:14:48 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqr-0000000B52B-43Fs for linux-arm-kernel@bombadil.infradead.org; Thu, 13 Feb 2025 13:01:58 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:Content-Transfer-Encoding :MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From: Sender:Reply-To:Content-ID:Content-Description; bh=bkaMhcNmEWpgp1MChWKWck85+/ejIU8zUdpdj5tOMuo=; b=J7mVStZLX1nUMAaUg7KksdWjaw xdHcxcjn0hNK15k1caCaUttGpmtaQyDAJcE5ax9QOSd35X6EGfMFQGXs4j5IafpcGH+DB6nwNbCnS Ly3WZtRJgJrlVd3Z2gkiWb9sTFtcdpxpHwlVVawn7tg4heCO2LXQ4ZDfr2sBL0+Y4aoUVLq66eIud 7e8V6rUuOIwoNnFF8pRMWxU1PQrRhqc076mG3dHdm+9pdkfFsgVLXQYzDEJGq5PQ9Qt68EtrwbHrd 
ZL6swyw/o0oMMZPUudSHhDQrwcFZw8Q+YODBz5g9yeVO0K8tSxKVH/AOe/7X1cCuCf1TLs6+5w2wx GS04taRA==; Received: from szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqm-00000000zkW-3krv for linux-arm-kernel@lists.infradead.org; Thu, 13 Feb 2025 13:01:56 +0000 Received: from mail.maildlp.com (unknown [172.19.163.17]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4YtwFn2H2dz1ltXq; Thu, 13 Feb 2025 20:57:49 +0800 (CST) Received: from kwepemg200008.china.huawei.com (unknown [7.202.181.35]) by mail.maildlp.com (Postfix) with ESMTPS id 5E36A1A0188; Thu, 13 Feb 2025 21:01:36 +0800 (CST) Received: from huawei.com (10.90.53.73) by kwepemg200008.china.huawei.com (7.202.181.35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Thu, 13 Feb 2025 21:01:34 +0800 From: Jinjie Ruan To: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , CC: Subject: [PATCH -next v6 2/8] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Date: Thu, 13 Feb 2025 21:00:01 +0800 Message-ID: <20250213130007.1418890-3-ruanjinjie@huawei.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com> References: <20250213130007.1418890-1-ruanjinjie@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.90.53.73] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemg200008.china.huawei.com (7.202.181.35) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250213_130155_395448_6D9D417F X-CRM114-Status: GOOD ( 16.03 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The generic entry code expects architecture code to provide regs_irqs_disabled(regs) function, but arm64 does not have this and provides inerrupts_enabled(regs), which has the opposite polarity. In preparation for moving arm64 over to the generic entry code, relace arm64's interrupts_enabled() with regs_irqs_disabled() and update its callers under arch/arm64. For the moment, a definition of interrupts_enabled() is provided for the GICv3 driver. Once arch/arm implement regs_irqs_disabled(), this can be removed. Delete the fast_interrupts_enabled() macro as it is unused and we don't want any new users to show up. No functional changes. Acked-by: Mark Rutland Suggested-by: Mark Rutland Signed-off-by: Jinjie Ruan --- v6: - Define regs_irqs_disabled() by inline function. - Define interrupts_enabled() in terms of regs_irqs_disabled(). - Delete the fast_interrupts_enabled() macro. 
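The polarity flip described above is mechanical for callers: the new helper
reports "IRQs disabled", so tests of the old macro simply invert. A minimal
sketch of the conversion, mirroring the debug-monitors.c hunk below (the
wrapper function here is hypothetical):

static void hypothetical_caller(struct pt_regs *regs)
{
	/* Old form: if (interrupts_enabled(regs)) local_irq_enable(); */
	if (!regs_irqs_disabled(regs))
		local_irq_enable();
}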
--- arch/arm64/include/asm/daifflags.h | 2 +- arch/arm64/include/asm/ptrace.h | 9 +++++---- arch/arm64/include/asm/xen/events.h | 2 +- arch/arm64/kernel/acpi.c | 2 +- arch/arm64/kernel/debug-monitors.c | 2 +- arch/arm64/kernel/entry-common.c | 4 ++-- arch/arm64/kernel/sdei.c | 2 +- 7 files changed, 12 insertions(+), 11 deletions(-) diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h index fbb5c99eb2f9..5fca48009043 100644 --- a/arch/arm64/include/asm/daifflags.h +++ b/arch/arm64/include/asm/daifflags.h @@ -128,7 +128,7 @@ static inline void local_daif_inherit(struct pt_regs *regs) { unsigned long flags = regs->pstate & DAIF_MASK; - if (interrupts_enabled(regs)) + if (!regs_irqs_disabled(regs)) trace_hardirqs_on(); if (system_uses_irq_prio_masking()) diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 47ff8654c5ec..8b915d4a9d4b 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -214,11 +214,12 @@ static inline void forget_syscall(struct pt_regs *regs) (regs)->pmr == GIC_PRIO_IRQON : \ true) -#define interrupts_enabled(regs) \ - (!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs)) +static __always_inline bool regs_irqs_disabled(const struct pt_regs *regs) +{ + return (regs->pstate & PSR_I_BIT) || !irqs_priority_unmasked(regs); +} -#define fast_interrupts_enabled(regs) \ - (!((regs)->pstate & PSR_F_BIT)) +#define interrupts_enabled(regs) (!regs_irqs_disabled(regs)) static inline unsigned long user_stack_pointer(struct pt_regs *regs) { diff --git a/arch/arm64/include/asm/xen/events.h b/arch/arm64/include/asm/xen/events.h index 2788e95d0ff0..2977b5fe068d 100644 --- a/arch/arm64/include/asm/xen/events.h +++ b/arch/arm64/include/asm/xen/events.h @@ -14,7 +14,7 @@ enum ipi_vector { static inline int xen_irqs_disabled(struct pt_regs *regs) { - return !interrupts_enabled(regs); + return regs_irqs_disabled(regs); } #define xchg_xen_ulong(ptr, val) xchg((ptr), (val)) diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c index e6f66491fbe9..732f89daae23 100644 --- a/arch/arm64/kernel/acpi.c +++ b/arch/arm64/kernel/acpi.c @@ -403,7 +403,7 @@ int apei_claim_sea(struct pt_regs *regs) return_to_irqs_enabled = !irqs_disabled_flags(arch_local_save_flags()); if (regs) - return_to_irqs_enabled = interrupts_enabled(regs); + return_to_irqs_enabled = !regs_irqs_disabled(regs); /* * SEA can interrupt SError, mask it and describe this as an NMI so diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c index 58f047de3e1c..460c09d03a73 100644 --- a/arch/arm64/kernel/debug-monitors.c +++ b/arch/arm64/kernel/debug-monitors.c @@ -231,7 +231,7 @@ static void send_user_sigtrap(int si_code) if (WARN_ON(!user_mode(regs))) return; - if (interrupts_enabled(regs)) + if (!regs_irqs_disabled(regs)) local_irq_enable(); arm64_force_sig_fault(SIGTRAP, si_code, instruction_pointer(regs), diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index b260ddc4d3e9..c547e70428d3 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -73,7 +73,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs) { lockdep_assert_irqs_disabled(); - if (interrupts_enabled(regs)) { + if (!regs_irqs_disabled(regs)) { if (regs->exit_rcu) { trace_hardirqs_on_prepare(); lockdep_hardirqs_on_prepare(); @@ -569,7 +569,7 @@ static void noinstr el1_interrupt(struct pt_regs *regs, { write_sysreg(DAIF_PROCCTX_NOIRQ, daif); - if 
(IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs)) + if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && regs_irqs_disabled(regs)) __el1_pnmi(regs, handler); else __el1_irq(regs, handler); diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c index 255d12f881c2..27a17da635d8 100644 --- a/arch/arm64/kernel/sdei.c +++ b/arch/arm64/kernel/sdei.c @@ -247,7 +247,7 @@ unsigned long __kprobes do_sdei_event(struct pt_regs *regs, * If we interrupted the kernel with interrupts masked, we always go * back to wherever we came from. */ - if (mode == kernel_mode && !interrupts_enabled(regs)) + if (mode == kernel_mode && regs_irqs_disabled(regs)) return SDEI_EV_HANDLED; /*

From patchwork Thu Feb 13 13:00:02 2025
X-Patchwork-Submitter: Jinjie Ruan
X-Patchwork-Id: 13973235
From: Jinjie Ruan
Subject: [PATCH -next v6 3/8] arm64: entry: Refactor the entry and exit for exceptions from EL1
Date: Thu, 13 Feb 2025 21:00:02 +0800
Message-ID: <20250213130007.1418890-4-ruanjinjie@huawei.com>
In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com>
References: <20250213130007.1418890-1-ruanjinjie@huawei.com>

The generic entry code uses irqentry_state_t to track lockdep and RCU
state across exception entry and return. For historical reasons, arm64
embeds similar fields within its pt_regs structure.

In preparation for moving arm64 over to the generic entry code, pull
these fields out of arm64's pt_regs and use a separate structure,
matching the style of the generic entry code.

No functional changes.

Suggested-by: Mark Rutland
Signed-off-by: Jinjie Ruan
---
v6:
- irqentry_state_t -> arm64_irqentry_state_t.
---
 arch/arm64/include/asm/ptrace.h  |   4 -
 arch/arm64/kernel/entry-common.c | 136 +++++++++++++++++++------------
 2 files changed, 85 insertions(+), 55 deletions(-)

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 8b915d4a9d4b..65b053a24d82 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -169,10 +169,6 @@ struct pt_regs { u64 sdei_ttbr1; struct frame_record_meta stackframe; - - /* Only valid for some EL1 exceptions. */ - u64 lockdep_hardirqs; - u64 exit_rcu; }; /* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */ diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index c547e70428d3..8e597d32433d 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -28,6 +28,13 @@ #include #include +typedef struct irqentry_state { + union { + bool exit_rcu; + bool lockdep; + }; +} arm64_irqentry_state_t; + /* * Handle IRQ/context state management when entering from kernel mode. * Before this function is called it is not safe to call regular kernel code, @@ -36,29 +43,36 @@ * This is intended to match the logic in irqentry_enter(), handling the kernel * mode transitions only.
*/ -static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs) +static __always_inline arm64_irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs) { - regs->exit_rcu = false; + arm64_irqentry_state_t state = { + .exit_rcu = false, + }; if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) { lockdep_hardirqs_off(CALLER_ADDR0); ct_irq_enter(); trace_hardirqs_off_finish(); - regs->exit_rcu = true; - return; + state.exit_rcu = true; + return state; } lockdep_hardirqs_off(CALLER_ADDR0); rcu_irq_enter_check_tick(); trace_hardirqs_off_finish(); + + return state; } -static void noinstr enter_from_kernel_mode(struct pt_regs *regs) +static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs) { - __enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = __enter_from_kernel_mode(regs); + mte_check_tfsr_entry(); mte_disable_tco_entry(current); + + return state; } /* @@ -69,12 +83,13 @@ static void noinstr enter_from_kernel_mode(struct pt_regs *regs) * This is intended to match the logic in irqentry_exit(), handling the kernel * mode transitions only, and with preemption handled elsewhere. */ -static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs) +static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs, + arm64_irqentry_state_t state) { lockdep_assert_irqs_disabled(); if (!regs_irqs_disabled(regs)) { - if (regs->exit_rcu) { + if (state.exit_rcu) { trace_hardirqs_on_prepare(); lockdep_hardirqs_on_prepare(); ct_irq_exit(); @@ -84,15 +99,16 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs) trace_hardirqs_on(); } else { - if (regs->exit_rcu) + if (state.exit_rcu) ct_irq_exit(); } } -static void noinstr exit_to_kernel_mode(struct pt_regs *regs) +static void noinstr exit_to_kernel_mode(struct pt_regs *regs, + arm64_irqentry_state_t state) { mte_check_tfsr_exit(); - __exit_to_kernel_mode(regs); + __exit_to_kernel_mode(regs, state); } /* @@ -190,9 +206,11 @@ asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs) * mode. Before this function is called it is not safe to call regular kernel * code, instrumentable code, or any code which may trigger an exception. */ -static void noinstr arm64_enter_nmi(struct pt_regs *regs) +static noinstr arm64_irqentry_state_t arm64_enter_nmi(struct pt_regs *regs) { - regs->lockdep_hardirqs = lockdep_hardirqs_enabled(); + arm64_irqentry_state_t state; + + state.lockdep = lockdep_hardirqs_enabled(); __nmi_enter(); lockdep_hardirqs_off(CALLER_ADDR0); @@ -201,6 +219,8 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs) trace_hardirqs_off_finish(); ftrace_nmi_enter(); + + return state; } /* @@ -208,19 +228,18 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs) * mode. After this function returns it is not safe to call regular kernel * code, instrumentable code, or any code which may trigger an exception. */ -static void noinstr arm64_exit_nmi(struct pt_regs *regs) +static void noinstr arm64_exit_nmi(struct pt_regs *regs, + arm64_irqentry_state_t state) { - bool restore = regs->lockdep_hardirqs; - ftrace_nmi_exit(); - if (restore) { + if (state.lockdep) { trace_hardirqs_on_prepare(); lockdep_hardirqs_on_prepare(); } ct_nmi_exit(); lockdep_hardirq_exit(); - if (restore) + if (state.lockdep) lockdep_hardirqs_on(CALLER_ADDR0); __nmi_exit(); } @@ -230,14 +249,18 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs) * kernel mode. 
Before this function is called it is not safe to call regular * kernel code, instrumentable code, or any code which may trigger an exception. */ -static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs) +static noinstr arm64_irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs) { - regs->lockdep_hardirqs = lockdep_hardirqs_enabled(); + arm64_irqentry_state_t state; + + state.lockdep = lockdep_hardirqs_enabled(); lockdep_hardirqs_off(CALLER_ADDR0); ct_nmi_enter(); trace_hardirqs_off_finish(); + + return state; } /* @@ -245,17 +268,16 @@ static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs) * kernel mode. After this function returns it is not safe to call regular * kernel code, instrumentable code, or any code which may trigger an exception. */ -static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs) +static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs, + arm64_irqentry_state_t state) { - bool restore = regs->lockdep_hardirqs; - - if (restore) { + if (state.lockdep) { trace_hardirqs_on_prepare(); lockdep_hardirqs_on_prepare(); } ct_nmi_exit(); - if (restore) + if (state.lockdep) lockdep_hardirqs_on(CALLER_ADDR0); } @@ -426,78 +448,86 @@ UNHANDLED(el1t, 64, error) static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr) { unsigned long far = read_sysreg(far_el1); + arm64_irqentry_state_t state; - enter_from_kernel_mode(regs); + state = enter_from_kernel_mode(regs); local_daif_inherit(regs); do_mem_abort(far, esr, regs); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr) { unsigned long far = read_sysreg(far_el1); + arm64_irqentry_state_t state; - enter_from_kernel_mode(regs); + state = enter_from_kernel_mode(regs); local_daif_inherit(regs); do_sp_pc_abort(far, esr, regs); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr) { - enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = enter_from_kernel_mode(regs); + local_daif_inherit(regs); do_el1_undef(regs, esr); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr) { - enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = enter_from_kernel_mode(regs); + local_daif_inherit(regs); do_el1_bti(regs, esr); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr) { - enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = enter_from_kernel_mode(regs); + local_daif_inherit(regs); do_el1_gcs(regs, esr); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr) { - enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = enter_from_kernel_mode(regs); + local_daif_inherit(regs); do_el1_mops(regs, esr); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr) { unsigned long far = read_sysreg(far_el1); + arm64_irqentry_state_t state; - arm64_enter_el1_dbg(regs); + state = arm64_enter_el1_dbg(regs); if (!cortex_a76_erratum_1463225_debug_handler(regs)) do_debug_exception(far, esr, regs); - arm64_exit_el1_dbg(regs); + arm64_exit_el1_dbg(regs, state); } static void noinstr el1_fpac(struct 
pt_regs *regs, unsigned long esr) { - enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = enter_from_kernel_mode(regs); + local_daif_inherit(regs); do_el1_fpac(regs, esr); local_daif_mask(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs) @@ -546,15 +576,16 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs) static __always_inline void __el1_pnmi(struct pt_regs *regs, void (*handler)(struct pt_regs *)) { - arm64_enter_nmi(regs); + arm64_irqentry_state_t state = arm64_enter_nmi(regs); + do_interrupt_handler(regs, handler); - arm64_exit_nmi(regs); + arm64_exit_nmi(regs, state); } static __always_inline void __el1_irq(struct pt_regs *regs, void (*handler)(struct pt_regs *)) { - enter_from_kernel_mode(regs); + arm64_irqentry_state_t state = enter_from_kernel_mode(regs); irq_enter_rcu(); do_interrupt_handler(regs, handler); @@ -562,7 +593,7 @@ static __always_inline void __el1_irq(struct pt_regs *regs, arm64_preempt_schedule_irq(); - exit_to_kernel_mode(regs); + exit_to_kernel_mode(regs, state); } static void noinstr el1_interrupt(struct pt_regs *regs, void (*handler)(struct pt_regs *)) @@ -588,11 +619,12 @@ asmlinkage void noinstr el1h_64_fiq_handler(struct pt_regs *regs) asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs) { unsigned long esr = read_sysreg(esr_el1); + arm64_irqentry_state_t state; local_daif_restore(DAIF_ERRCTX); - arm64_enter_nmi(regs); + state = arm64_enter_nmi(regs); do_serror(regs, esr); - arm64_exit_nmi(regs); + arm64_exit_nmi(regs, state); } static void noinstr el0_da(struct pt_regs *regs, unsigned long esr) @@ -855,12 +887,13 @@ asmlinkage void noinstr el0t_64_fiq_handler(struct pt_regs *regs) static void noinstr __el0_error_handler_common(struct pt_regs *regs) { unsigned long esr = read_sysreg(esr_el1); + arm64_irqentry_state_t state; enter_from_user_mode(regs); local_daif_restore(DAIF_ERRCTX); - arm64_enter_nmi(regs); + state = arm64_enter_nmi(regs); do_serror(regs, esr); - arm64_exit_nmi(regs); + arm64_exit_nmi(regs, state); local_daif_restore(DAIF_PROCCTX); exit_to_user_mode(regs); } @@ -968,6 +1001,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs) asmlinkage noinstr unsigned long __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg) { + arm64_irqentry_state_t state; unsigned long ret; /* @@ -992,9 +1026,9 @@ __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg) else if (cpu_has_pan()) set_pstate_pan(0); - arm64_enter_nmi(regs); + state = arm64_enter_nmi(regs); ret = do_sdei_event(regs, arg); - arm64_exit_nmi(regs); + arm64_exit_nmi(regs, state); return ret; } From patchwork Thu Feb 13 13:00:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinjie Ruan X-Patchwork-Id: 13973227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 71F46C0219D for ; Thu, 13 Feb 2025 13:04:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: 
Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-ID:Date :Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=+ngcEsE6JF4+dmZ0YI80Nqbofow0t/Jx1yYb7c1T/wY=; b=p4s+D5O4skapBJusr2hHKrJvmd M9cEV+faWql3yzLNJCUKZL16Uj23LWiBzAPXT0Inn1hqbpWVaD43b9e4Z7UK8fQLQfFiBiiV7MkGx 9Ezgan+P9S6XkknAP/dl17i3Hx8hT2sRzVVPEI5+RZRLDxO/39dYLyCEng0aKFESeDnR/YKcQ42eM GsiyMu06Eq/7csyJN8J12TVBOhDaK9/vnzDdV5kRcsyKJybbIpJz+FS32n5XHIKCYGwy+ArLubZNg iUj5PcReNBF80GrU7kOJmziM8P3KpJT1K5RPQtFVu5ehcWSxNJVBU0IFgT+dTLcjgO5yVAIhZEx1m El8jHPsQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tiYtT-0000000B5aT-0jJv; Thu, 13 Feb 2025 13:04:39 +0000 Received: from szxga01-in.huawei.com ([45.249.212.187]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqf-0000000B4xs-34I2 for linux-arm-kernel@lists.infradead.org; Thu, 13 Feb 2025 13:01:47 +0000 Received: from mail.maildlp.com (unknown [172.19.162.254]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4YtwF20b00z11Pxg; Thu, 13 Feb 2025 20:57:10 +0800 (CST) Received: from kwepemg200008.china.huawei.com (unknown [7.202.181.35]) by mail.maildlp.com (Postfix) with ESMTPS id BBD2618010B; Thu, 13 Feb 2025 21:01:38 +0800 (CST) Received: from huawei.com (10.90.53.73) by kwepemg200008.china.huawei.com (7.202.181.35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Thu, 13 Feb 2025 21:01:37 +0800 From: Jinjie Ruan To: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , CC: Subject: [PATCH -next v6 4/8] arm64: entry: Rework arm64_preempt_schedule_irq() Date: Thu, 13 Feb 2025 21:00:03 +0800 Message-ID: <20250213130007.1418890-5-ruanjinjie@huawei.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com> References: <20250213130007.1418890-1-ruanjinjie@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.90.53.73] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemg200008.china.huawei.com (7.202.181.35) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250213_050146_135154_E5FD1B48 X-CRM114-Status: GOOD ( 13.37 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The generic entry code has the form: | raw_irqentry_exit_cond_resched() | { | if (!preempt_count()) { | ... | if (need_resched()) | preempt_schedule_irq(); | } | } In preparation for moving arm64 over to the generic entry code, align the structure of the arm64 code with raw_irqentry_exit_cond_resched() from the generic entry code. Signed-off-by: Jinjie Ruan --- v6: - Update the commit message. 
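For clarity, the reworked arm64 code mirrors the generic helper quoted above: the check becomes a predicate and the caller performs the actual reschedule. A condensed sketch of the resulting shape (not a literal excerpt from the hunks below):

| static inline bool arm64_preempt_schedule_irq(void)
| {
| 	if (!need_irq_preemption())
| 		return false;
|
| 	if (READ_ONCE(current_thread_info()->preempt_count) != 0)
| 		return false;
|
| 	if (system_uses_irq_prio_masking() && read_sysreg(daif))
| 		return false;
|
| 	if (!system_capabilities_finalized())
| 		return false;
|
| 	return true;
| }
|
| /* caller, in __el1_irq() */
| 	if (arm64_preempt_schedule_irq())
| 		preempt_schedule_irq();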
--- arch/arm64/kernel/entry-common.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 8e597d32433d..94e4132213ce 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -289,10 +289,10 @@ DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); #define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION)) #endif -static void __sched arm64_preempt_schedule_irq(void) +static inline bool arm64_preempt_schedule_irq(void) { if (!need_irq_preemption()) - return; + return false; /* * Note: thread_info::preempt_count includes both thread_info::count @@ -300,7 +300,7 @@ static void __sched arm64_preempt_schedule_irq(void) * preempt_count(). */ if (READ_ONCE(current_thread_info()->preempt_count) != 0) - return; + return false; /* * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC @@ -309,7 +309,7 @@ static void __sched arm64_preempt_schedule_irq(void) * DAIF we must have handled an NMI, so skip preemption. */ if (system_uses_irq_prio_masking() && read_sysreg(daif)) - return; + return false; /* * Preempting a task from an IRQ means we leave copies of PSTATE @@ -319,8 +319,10 @@ static void __sched arm64_preempt_schedule_irq(void) * Only allow a task to be preempted once cpufeatures have been * enabled. */ - if (system_capabilities_finalized()) - preempt_schedule_irq(); + if (!system_capabilities_finalized()) + return false; + + return true; } static void do_interrupt_handler(struct pt_regs *regs, @@ -591,7 +593,8 @@ static __always_inline void __el1_irq(struct pt_regs *regs, do_interrupt_handler(regs, handler); irq_exit_rcu(); - arm64_preempt_schedule_irq(); + if (arm64_preempt_schedule_irq()) + preempt_schedule_irq(); exit_to_kernel_mode(regs, state); } From patchwork Thu Feb 13 13:00:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinjie Ruan X-Patchwork-Id: 13973233 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4D9DBC0219D for ; Thu, 13 Feb 2025 13:07:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-ID:Date :Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=M5sGTGijkijKquhsZpeteo9Z+KqHf8edhKrasj8jgT8=; b=OXZTGCdYRi3P+dGcAD1/hGxEaD Xez1AvAvLKHVfNako8dlaS5DO1GLxcLBWZKXDGR/1IxQoLfT4q+dVGSGMhwUMx9WbE5zEUd6ZpRWM +KSmnh7o7ABqh0hVLDmzSxUABUEH6/MF3xDtMWh9B8fY1BZvoJo17ggwbfXdaLMCfEDLGGMkOpaLM RTE/SZOazGL1gLlV/pSGtsz6i4ReRvPLR71hVe5NcBb8+oJcnx8PLTUPfXn2mHOPShuoRDxH3gzqF tdi678McVtKOiAZecxfwoMabqS01tzY6uBoenoNaD8/G2iSSl5pgavFFP05qDTTMOkWGJJiQVlyRw 9LwaZ9MQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tiYwG-0000000B6BB-27My; Thu, 13 Feb 2025 13:07:32 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by 
bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqq-0000000B517-233I for linux-arm-kernel@bombadil.infradead.org; Thu, 13 Feb 2025 13:01:56 +0000
From: Jinjie Ruan
Subject: [PATCH -next v6 5/8] arm64: entry: Use preempt_count() and need_resched() helper
Date: Thu, 13 Feb 2025 21:00:04 +0800
Message-ID: <20250213130007.1418890-6-ruanjinjie@huawei.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com>
References: <20250213130007.1418890-1-ruanjinjie@huawei.com>

The generic entry code uses the preempt_count() and need_resched() helpers to check whether it should do preempt_schedule_irq(). Currently, arm64 uses its own check, "READ_ONCE(current_thread_info()->preempt_count) == 0", which is equivalent to "preempt_count() == 0 && need_resched()".

In preparation for moving arm64 over to the generic entry code, use these helpers to replace arm64's own check and hoist it out to the call site, ahead of the remaining arm64-specific checks.

No functional changes.

Signed-off-by: Jinjie Ruan
---
v6:
- Update the commit message.
- Move this ahead before we change the preemption logic to preempt non-IRQ exceptions.
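On arm64 the 64-bit thread_info::preempt_count folds the need_resched state into its upper half (PREEMPT_NEED_RESCHED is BIT(32)), so the single READ_ONCE(...) == 0 test covered both conditions at once; the generic helpers simply spell the two conditions out separately. The resulting call site then has roughly the following shape (condensed sketch of the hunk below):

| 	if (!preempt_count() && need_resched()) {
| 		if (arm64_preempt_schedule_irq())
| 			preempt_schedule_irq();
| 	}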
--- arch/arm64/kernel/entry-common.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 94e4132213ce..dceef4cb140b 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -294,14 +294,6 @@ static inline bool arm64_preempt_schedule_irq(void) if (!need_irq_preemption()) return false; - /* - * Note: thread_info::preempt_count includes both thread_info::count - * and thread_info::need_resched, and is not equivalent to - * preempt_count(). - */ - if (READ_ONCE(current_thread_info()->preempt_count) != 0) - return false; - /* * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC * priority masking is used the GIC irqchip driver will clear DAIF.IF @@ -593,8 +585,10 @@ static __always_inline void __el1_irq(struct pt_regs *regs, do_interrupt_handler(regs, handler); irq_exit_rcu(); - if (arm64_preempt_schedule_irq()) - preempt_schedule_irq(); + if (!preempt_count() && need_resched()) { + if (arm64_preempt_schedule_irq()) + preempt_schedule_irq(); + } exit_to_kernel_mode(regs, state); } From patchwork Thu Feb 13 13:00:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinjie Ruan X-Patchwork-Id: 13973234 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0E9C3C021A0 for ; Thu, 13 Feb 2025 13:09:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-ID:Date :Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=/8C9xP3j9vVHObZ0nbWK6/+zkEBNpOvThUrqh9AzmzU=; b=g96D27eynG7hjFlL5Hd2k6TU4m 3vM4n14gi8uRM2PShGPUM7PPenRjd1u2g0AY2Q8mk92zXYQl0FS423aJLooItBdK6YaTonUsIIJoS u9Yq3j59qoGrN1Gmb2NeM99vsf0HuMF5kI8enZpNZSjZWXP0Xxk/JGrs3txWeX1vVRaqELzW86DeE dScAoAC5cunLqu/iMNuBn4nQiwPBNIl7cnYeS2LrF9rVT+jNgkComb/3kuurP+ce9JgISMlP+AJmI lWKjSgr2uyzoyjwM91zSWCQuA35uu9N1t/I2lS4xReRnZWhzzq2P2Yey23RTtrkc+uimFPHPJ344u 2+aqknFw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tiYxg-0000000B6Oq-0wIO; Thu, 13 Feb 2025 13:09:00 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqq-0000000B516-1dZt for linux-arm-kernel@bombadil.infradead.org; Thu, 13 Feb 2025 13:01:56 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:Content-Transfer-Encoding :MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From: Sender:Reply-To:Content-ID:Content-Description; bh=/8C9xP3j9vVHObZ0nbWK6/+zkEBNpOvThUrqh9AzmzU=; b=CT+MPDKRMNaFGAZRizcc0MSRFS rogqsw266Bg7mT/oVpEqersIiU3SrWjFCTSIapJ+V/AO1qJYdKmrm7LB/Jl+lFOOA0veoEWC++kbO dBlHoAeuNm2AZGbBJuIBHjLM50tVAT70RsGKO+t9VCbYBlFzKvcZt2tVTOwnwgpADLF3P3qAFzJfn 
x0ypa5/zEKNyBeZwwQIviBJC7CRdpwhhc5cMd4uUZNlU7TKBnNcGBEYRWRseD9iGtGk/4fBJmJWqi 1EU40ssh/qebMAdVedv1Q/pzbz8aOI8RHfy8gHDVcpaHJ9IbLLzr9vLK9PqWuPVYFoSaP5kB2C3sP uTGGBIOA==;
From: Jinjie Ruan
Subject: [PATCH -next v6 6/8] arm64: entry: Refactor preempt_schedule_irq() check code
Date: Thu, 13 Feb 2025 21:00:05 +0800
Message-ID: <20250213130007.1418890-7-ruanjinjie@huawei.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com>
References: <20250213130007.1418890-1-ruanjinjie@huawei.com>

arm64 requires an additional check to decide whether to reschedule on return from an interrupt. Add arch_irqentry_exit_need_resched(), with a default implementation that simply returns true, and hook it into the need_resched() condition in raw_irqentry_exit_cond_resched(). This allows arm64 to supply its architecture-specific check when it switches over to the generic entry code.

To align the structure of the code with irqentry_exit_cond_resched() from the generic entry code, hoist the need_irq_preemption() and IS_ENABLED() checks earlier, and define separate preemption check functions depending on whether dynamic preemption is enabled.

Suggested-by: Mark Rutland
Suggested-by: Kevin Brodsky
Suggested-by: Thomas Gleixner
Signed-off-by: Jinjie Ruan
---
v6:
- Update the commit message.
- Hoist the IS_ENABLED() and need_irq_preemption() check earlier.
- Merge the 4 patches.
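In outline, the generic side of the hook added by this patch, and the arm64 copy that keeps the same structure, look roughly as follows (condensed sketch of the hunks below):

| /* kernel/entry/common.c (generic) */
| #ifndef arch_irqentry_exit_need_resched
| static inline bool arch_irqentry_exit_need_resched(void) { return true; }
| #endif
|
| void raw_irqentry_exit_cond_resched(void)
| {
| 	if (!preempt_count()) {
| 		...
| 		if (need_resched() && arch_irqentry_exit_need_resched())
| 			preempt_schedule_irq();
| 	}
| }
|
| /* arch/arm64/kernel/entry-common.c (arm64 copy, same structure) */
| void raw_irqentry_exit_cond_resched(void)
| {
| 	if (!preempt_count()) {
| 		if (need_resched() && arm64_preempt_schedule_irq())
| 			preempt_schedule_irq();
| 	}
| }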
--- arch/arm64/include/asm/preempt.h | 4 ++++ arch/arm64/kernel/entry-common.c | 35 ++++++++++++++++++-------------- kernel/entry/common.c | 16 ++++++++++++++- 3 files changed, 39 insertions(+), 16 deletions(-) diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h index 0159b625cc7f..0f0ba250efe8 100644 --- a/arch/arm64/include/asm/preempt.h +++ b/arch/arm64/include/asm/preempt.h @@ -85,6 +85,7 @@ static inline bool should_resched(int preempt_offset) void preempt_schedule(void); void preempt_schedule_notrace(void); +void raw_irqentry_exit_cond_resched(void); #ifdef CONFIG_PREEMPT_DYNAMIC DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); @@ -92,11 +93,14 @@ void dynamic_preempt_schedule(void); #define __preempt_schedule() dynamic_preempt_schedule() void dynamic_preempt_schedule_notrace(void); #define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace() +void dynamic_irqentry_exit_cond_resched(void); +#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched() #else /* CONFIG_PREEMPT_DYNAMIC */ #define __preempt_schedule() preempt_schedule() #define __preempt_schedule_notrace() preempt_schedule_notrace() +#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched() #endif /* CONFIG_PREEMPT_DYNAMIC */ #endif /* CONFIG_PREEMPTION */ diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index dceef4cb140b..1b4936d4cf6e 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -281,19 +281,8 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs, lockdep_hardirqs_on(CALLER_ADDR0); } -#ifdef CONFIG_PREEMPT_DYNAMIC -DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); -#define need_irq_preemption() \ - (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) -#else -#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION)) -#endif - static inline bool arm64_preempt_schedule_irq(void) { - if (!need_irq_preemption()) - return false; - /* * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC * priority masking is used the GIC irqchip driver will clear DAIF.IF @@ -576,6 +565,24 @@ static __always_inline void __el1_pnmi(struct pt_regs *regs, arm64_exit_nmi(regs, state); } +void raw_irqentry_exit_cond_resched(void) +{ + if (!preempt_count()) { + if (need_resched() && arm64_preempt_schedule_irq()) + preempt_schedule_irq(); + } +} + +#ifdef CONFIG_PREEMPT_DYNAMIC +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); +void dynamic_irqentry_exit_cond_resched(void) +{ + if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) + return; + raw_irqentry_exit_cond_resched(); +} +#endif + static __always_inline void __el1_irq(struct pt_regs *regs, void (*handler)(struct pt_regs *)) { @@ -585,10 +592,8 @@ static __always_inline void __el1_irq(struct pt_regs *regs, do_interrupt_handler(regs, handler); irq_exit_rcu(); - if (!preempt_count() && need_resched()) { - if (arm64_preempt_schedule_irq()) - preempt_schedule_irq(); - } + if (IS_ENABLED(CONFIG_PREEMPTION)) + irqentry_exit_cond_resched(); exit_to_kernel_mode(regs, state); } diff --git a/kernel/entry/common.c b/kernel/entry/common.c index b82032777310..4aa9656fa1b4 100644 --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -142,6 +142,20 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs) return ret; } +/** + * arch_irqentry_exit_need_resched - Architecture specific need resched function + * + * Invoked from raw_irqentry_exit_cond_resched() to 
check if need resched. + * Defaults return true. + * + * The main purpose is to permit arch to skip preempt a task from an IRQ. + */ +static inline bool arch_irqentry_exit_need_resched(void); + +#ifndef arch_irqentry_exit_need_resched +static inline bool arch_irqentry_exit_need_resched(void) { return true; } +#endif + void raw_irqentry_exit_cond_resched(void) { if (!preempt_count()) { @@ -149,7 +163,7 @@ void raw_irqentry_exit_cond_resched(void) rcu_irq_exit_check_preempt(); if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) WARN_ON_ONCE(!on_thread_stack()); - if (need_resched()) + if (need_resched() && arch_irqentry_exit_need_resched()) preempt_schedule_irq(); } } From patchwork Thu Feb 13 13:00:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinjie Ruan X-Patchwork-Id: 13973247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2A705C021A0 for ; Thu, 13 Feb 2025 13:13:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-ID:Date :Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=vEqHYah2qP1v0vIIHodnQ4Ix31VPIWVY8Rv8d3MGHgk=; b=MaP/DGIwbVe94QPTCFjp8negU5 h6rnLA9Yl20YFJMf793v5DQYFb+/BVhZH5BcHVmWhSHxttNKjbvsut+PeEJ5v+yLrTPICajvpxj5z k2EBZtg5NrLh+OS4xmPxQpM9SQxulI7gjtOmNibhTJo6HRCs/EAN43xkLCtNGG1kfaIq6De6p6c0P XGwSMnkEyvv0mDuPrvXqkeT1QImDaQxMghwJgkh/Bbe5kT5GGYFKMmnWtAR38b/JSE8nwRAXPorbm Yqdg0JpbriAg6Pi+YR7hSW2skua5h9o8ohQ6qZjhyRTe46RwrA869PgGsK5X1eAh8ZNS8Mbna07zk BC75yzqQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tiZ1t-0000000B773-0t71; Thu, 13 Feb 2025 13:13:21 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqr-0000000B52A-42fY for linux-arm-kernel@bombadil.infradead.org; Thu, 13 Feb 2025 13:01:58 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:Content-Transfer-Encoding :MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From: Sender:Reply-To:Content-ID:Content-Description; bh=vEqHYah2qP1v0vIIHodnQ4Ix31VPIWVY8Rv8d3MGHgk=; b=hCfSmQmMFbaqA++TNVMHK5iPtf vKnafqTQzl70AYFlgZOSrj3kuFyiUuhyh2tRqygmEHhqqLxIPW/NT8orti4nqdYDhqLqh1xd6H/pr JN/hc+PluMmIriUwv5w6te+HsGXVe8i0SX60cuEJeAOD9S4k6rM/uHsZpINAiaBuSLX1Aq+rPhPIg utCwIE+iiUQQcSZwRvfn0wYHc5VzTEzev692JiKGdRilVWJ/bttescFUDaSORY18as+fPQ0Suefip TVqd6vCQt9v6ancXC+sLvXRuhWJtAsGhZNplLzU3CsnoOi0mF1sPrMzSW3XrOxxjVQsUU/NiCFV/v yyZmxOyA==; Received: from szxga06-in.huawei.com ([45.249.212.32]) by desiato.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqo-00000000zkx-0YdA for linux-arm-kernel@lists.infradead.org; Thu, 13 Feb 2025 13:01:56 +0000 Received: from mail.maildlp.com (unknown [172.19.88.214]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4YtwLq73fsz20qN4; Thu, 
13 Feb 2025 21:02:11 +0800 (CST)
From: Jinjie Ruan
Subject: [PATCH -next v6 7/8] arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode()
Date: Thu, 13 Feb 2025 21:00:06 +0800
Message-ID: <20250213130007.1418890-8-ruanjinjie@huawei.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com>
References: <20250213130007.1418890-1-ruanjinjie@huawei.com>

The arm64 entry code only preempts a kernel context upon return from a regular IRQ exception. The generic entry code may preempt a kernel context for any exception return where irqentry_exit() is used, and so may preempt after other exceptions, such as faults.

In preparation for moving arm64 over to the generic entry code, align arm64 with the generic behaviour by calling arm64_preempt_schedule_irq() from exit_to_kernel_mode(). To make this possible, arm64_preempt_schedule_irq() and dynamic/raw_irqentry_exit_cond_resched() are moved earlier in the file, with no changes.

As Mark pointed out, this change has two key impacts:

- "We'll preempt even without taking a "real" interrupt. That shouldn't result in preemption that wasn't possible before, but it does change the probability of preempting at certain points, and might have a performance impact, so probably warrants a benchmark."

- "We will not preempt when taking interrupts from a region of kernel code where IRQs are enabled but RCU is not watching, matching the behaviour of the generic entry code. This has the potential to introduce livelock if we can ever have a screaming interrupt in such a region, so we'll need to go figure out whether that's actually a problem. Having this as a separate patch will make it easier to test/bisect for that specifically."

Suggested-by: Mark Rutland
Signed-off-by: Jinjie Ruan
---
v6:
- Update the commit message.
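The net effect is that the conditional reschedule now runs from the common kernel-exit path rather than from __el1_irq(). A condensed sketch of the resulting __exit_to_kernel_mode() (not a literal hunk):

| static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
| 						   arm64_irqentry_state_t state)
| {
| 	lockdep_assert_irqs_disabled();
|
| 	if (!regs_irqs_disabled(regs)) {
| 		if (state.exit_rcu) {
| 			/* lockdep/RCU exit as before */
| 			return;
| 		}
|
| 		if (IS_ENABLED(CONFIG_PREEMPTION))
| 			irqentry_exit_cond_resched();
|
| 		trace_hardirqs_on();
| 	} else {
| 		if (state.exit_rcu)
| 			ct_irq_exit();
| 	}
| }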
--- arch/arm64/kernel/entry-common.c | 92 ++++++++++++++++---------------- 1 file changed, 46 insertions(+), 46 deletions(-) diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 1b4936d4cf6e..7056c584f59c 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -75,6 +75,49 @@ static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *reg return state; } +static inline bool arm64_preempt_schedule_irq(void) +{ + /* + * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC + * priority masking is used the GIC irqchip driver will clear DAIF.IF + * using gic_arch_enable_irqs() for normal IRQs. If anything is set in + * DAIF we must have handled an NMI, so skip preemption. + */ + if (system_uses_irq_prio_masking() && read_sysreg(daif)) + return false; + + /* + * Preempting a task from an IRQ means we leave copies of PSTATE + * on the stack. cpufeature's enable calls may modify PSTATE, but + * resuming one of these preempted tasks would undo those changes. + * + * Only allow a task to be preempted once cpufeatures have been + * enabled. + */ + if (!system_capabilities_finalized()) + return false; + + return true; +} + +void raw_irqentry_exit_cond_resched(void) +{ + if (!preempt_count()) { + if (need_resched() && arm64_preempt_schedule_irq()) + preempt_schedule_irq(); + } +} + +#ifdef CONFIG_PREEMPT_DYNAMIC +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); +void dynamic_irqentry_exit_cond_resched(void) +{ + if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) + return; + raw_irqentry_exit_cond_resched(); +} +#endif + /* * Handle IRQ/context state management when exiting to kernel mode. * After this function returns it is not safe to call regular kernel code, @@ -97,6 +140,9 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs, return; } + if (IS_ENABLED(CONFIG_PREEMPTION)) + irqentry_exit_cond_resched(); + trace_hardirqs_on(); } else { if (state.exit_rcu) @@ -281,31 +327,6 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs, lockdep_hardirqs_on(CALLER_ADDR0); } -static inline bool arm64_preempt_schedule_irq(void) -{ - /* - * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC - * priority masking is used the GIC irqchip driver will clear DAIF.IF - * using gic_arch_enable_irqs() for normal IRQs. If anything is set in - * DAIF we must have handled an NMI, so skip preemption. - */ - if (system_uses_irq_prio_masking() && read_sysreg(daif)) - return false; - - /* - * Preempting a task from an IRQ means we leave copies of PSTATE - * on the stack. cpufeature's enable calls may modify PSTATE, but - * resuming one of these preempted tasks would undo those changes. - * - * Only allow a task to be preempted once cpufeatures have been - * enabled. 
- */ - if (!system_capabilities_finalized()) - return false; - - return true; -} - static void do_interrupt_handler(struct pt_regs *regs, void (*handler)(struct pt_regs *)) { @@ -565,24 +586,6 @@ static __always_inline void __el1_pnmi(struct pt_regs *regs, arm64_exit_nmi(regs, state); } -void raw_irqentry_exit_cond_resched(void) -{ - if (!preempt_count()) { - if (need_resched() && arm64_preempt_schedule_irq()) - preempt_schedule_irq(); - } -} - -#ifdef CONFIG_PREEMPT_DYNAMIC -DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); -void dynamic_irqentry_exit_cond_resched(void) -{ - if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) - return; - raw_irqentry_exit_cond_resched(); -} -#endif - static __always_inline void __el1_irq(struct pt_regs *regs, void (*handler)(struct pt_regs *)) { @@ -592,9 +595,6 @@ static __always_inline void __el1_irq(struct pt_regs *regs, do_interrupt_handler(regs, handler); irq_exit_rcu(); - if (IS_ENABLED(CONFIG_PREEMPTION)) - irqentry_exit_cond_resched(); - exit_to_kernel_mode(regs, state); } static void noinstr el1_interrupt(struct pt_regs *regs, From patchwork Thu Feb 13 13:00:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinjie Ruan X-Patchwork-Id: 13973232 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ACAF4C0219D for ; Thu, 13 Feb 2025 13:06:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-ID:Date :Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=4pe4H/aswzP+VVwCej36T7RUVNvKGWBmX5Ew2vb2Z8U=; b=Fu7FD+R6v8Nay2szga02Ul18mV i7iZQ1roxtRuK28MtD9JZG7Ebi6/pZdK9f091M7LN44L1P6FZZEQQPmizMJT6Z0aQm+I+0ts5xfmO qXsmfIDNAWUffXk8Ti+Mtstlqx8XmePDaSA0FFXlpm1P5AP/owfphyC/DTc58toLGAwyE9LjabgrY qQQL5MxS8/yiIByP98j9Ir4yhTm5lUnobblwsPvM+AfGnaV4/9yLLhF8AfAj5qcJGcvPX2NOClSA+ actYPXZwuMI9xqB1DKUKDEp3mSgm9r+RkguyQ65qKw1UlzfzZG9WXOICH4ctkJ7y83f/h4Mpvp/po GBxY+QYg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tiYur-0000000B5wU-3XAK; Thu, 13 Feb 2025 13:06:05 +0000 Received: from szxga05-in.huawei.com ([45.249.212.191]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tiYqg-0000000B4yB-1P0R for linux-arm-kernel@lists.infradead.org; Thu, 13 Feb 2025 13:01:48 +0000 Received: from mail.maildlp.com (unknown [172.19.163.44]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4YtwJg2GNzz1JK16; Thu, 13 Feb 2025 21:00:19 +0800 (CST) Received: from kwepemg200008.china.huawei.com (unknown [7.202.181.35]) by mail.maildlp.com (Postfix) with ESMTPS id 84A471400D2; Thu, 13 Feb 2025 21:01:43 +0800 (CST) Received: from huawei.com (10.90.53.73) by kwepemg200008.china.huawei.com (7.202.181.35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Thu, 13 Feb 2025 21:01:42 +0800 From: Jinjie Ruan To: , , , 
CC:
Subject: [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry
Date: Thu, 13 Feb 2025 21:00:07 +0800
Message-ID: <20250213130007.1418890-9-ruanjinjie@huawei.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250213130007.1418890-1-ruanjinjie@huawei.com>
References: <20250213130007.1418890-1-ruanjinjie@huawei.com>

Currently x86, RISC-V and LoongArch use the generic entry code. Convert arm64 to use the generic entry infrastructure from kernel/entry/*. The generic entry code makes maintainers' work easier and the code cleaner.

Switch arm64 over to the generic IRQ entry code first. This removes over 100 lines of duplicated code and makes lazy preemption available on arm64 by adding a _TIF_NEED_RESCHED_LAZY bit and enabling ARCH_HAS_PREEMPT_LAZY. A follow-up patch series will complete the switch to the generic entry code. Switching in two steps, as Mark suggested, makes the series easier to review.

The changes are below:

- Remove *enter_from/exit_to_kernel_mode(), and wrap with the generic irqentry_enter/exit(). Also remove *enter_from/exit_to_user_mode(), and wrap with the generic enter_from/exit_to_user_mode(), because they are exactly the same so far.

- Remove arm64_enter/exit_nmi() and use the generic irqentry_nmi_enter/exit(), because they are exactly the same, so the temporary arm64 version of irqentry_state can also be removed.

- Remove the PREEMPT_DYNAMIC code, as the generic entry code does the same thing once arm64 implements arch_irqentry_exit_need_resched().

Tested OK with the following test cases on a QEMU virt platform:

- Perf tests.
- Different `dynamic preempt` mode switches.
- Pseudo NMI tests.
- Stress-ng CPU stress test.
- MTE test cases in Documentation/arch/arm64/memory-tagging-extension.rst and all test cases in tools/testing/selftests/arm64/mte/*.

Suggested-by: Mark Rutland
Signed-off-by: Jinjie Ruan
---
v6:
- Remove arch_exit_to_user_mode_prepare() and pull local_daif_mask() later in the arm64 exit sequence so that we can have it explicit in entry-common.c.
- Update the commit message.
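After this patch the arm64 entry/exit wrappers reduce to thin shims over the generic helpers plus the arm64-specific MTE and DAIF handling. A condensed sketch (the full diff follows; the real code keeps enter_from_kernel_mode() split into two helpers):

| static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
| {
| 	irqentry_state_t state = irqentry_enter(regs);
|
| 	mte_check_tfsr_entry();
| 	mte_disable_tco_entry(current);
|
| 	return state;
| }
|
| static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs)
| {
| 	exit_to_user_mode_prepare(regs);
| 	local_daif_mask();
| 	mte_check_tfsr_exit();
| 	exit_to_user_mode();
| }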
--- arch/arm64/Kconfig | 1 + arch/arm64/include/asm/entry-common.h | 56 +++++ arch/arm64/include/asm/preempt.h | 6 - arch/arm64/kernel/entry-common.c | 350 +++++++------------------- arch/arm64/kernel/signal.c | 3 +- 5 files changed, 143 insertions(+), 273 deletions(-) create mode 100644 arch/arm64/include/asm/entry-common.h diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c997b27b7da1..f234e3e9e956 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -150,6 +150,7 @@ config ARM64 select GENERIC_EARLY_IOREMAP select GENERIC_IDLE_POLL_SETUP select GENERIC_IOREMAP + select GENERIC_IRQ_ENTRY select GENERIC_IRQ_IPI select GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD select GENERIC_IRQ_PROBE diff --git a/arch/arm64/include/asm/entry-common.h b/arch/arm64/include/asm/entry-common.h new file mode 100644 index 000000000000..93c30b8d653d --- /dev/null +++ b/arch/arm64/include/asm/entry-common.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _ASM_ARM64_ENTRY_COMMON_H +#define _ASM_ARM64_ENTRY_COMMON_H + +#include + +#include +#include +#include +#include + +#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE) + +static __always_inline void arch_exit_to_user_mode_work(struct pt_regs *regs, + unsigned long ti_work) +{ + if (ti_work & _TIF_MTE_ASYNC_FAULT) { + clear_thread_flag(TIF_MTE_ASYNC_FAULT); + send_sig_fault(SIGSEGV, SEGV_MTEAERR, (void __user *)NULL, current); + } + + if (ti_work & _TIF_FOREIGN_FPSTATE) + fpsimd_restore_current_state(); +} + +#define arch_exit_to_user_mode_work arch_exit_to_user_mode_work + +static inline bool arch_irqentry_exit_need_resched(void) +{ + /* + * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC + * priority masking is used the GIC irqchip driver will clear DAIF.IF + * using gic_arch_enable_irqs() for normal IRQs. If anything is set in + * DAIF we must have handled an NMI, so skip preemption. + */ + if (system_uses_irq_prio_masking() && read_sysreg(daif)) + return false; + + /* + * Preempting a task from an IRQ means we leave copies of PSTATE + * on the stack. cpufeature's enable calls may modify PSTATE, but + * resuming one of these preempted tasks would undo those changes. + * + * Only allow a task to be preempted once cpufeatures have been + * enabled. 
+ */ + if (!system_capabilities_finalized()) + return false; + + return true; +} + +#define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched + +#endif /* _ASM_ARM64_ENTRY_COMMON_H */ diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h index 0f0ba250efe8..932ea4b62042 100644 --- a/arch/arm64/include/asm/preempt.h +++ b/arch/arm64/include/asm/preempt.h @@ -2,7 +2,6 @@ #ifndef __ASM_PREEMPT_H #define __ASM_PREEMPT_H -#include #include #define PREEMPT_NEED_RESCHED BIT(32) @@ -85,22 +84,17 @@ static inline bool should_resched(int preempt_offset) void preempt_schedule(void); void preempt_schedule_notrace(void); -void raw_irqentry_exit_cond_resched(void); #ifdef CONFIG_PREEMPT_DYNAMIC -DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); void dynamic_preempt_schedule(void); #define __preempt_schedule() dynamic_preempt_schedule() void dynamic_preempt_schedule_notrace(void); #define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace() -void dynamic_irqentry_exit_cond_resched(void); -#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched() #else /* CONFIG_PREEMPT_DYNAMIC */ #define __preempt_schedule() preempt_schedule() #define __preempt_schedule_notrace() preempt_schedule_notrace() -#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched() #endif /* CONFIG_PREEMPT_DYNAMIC */ #endif /* CONFIG_PREEMPTION */ diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 7056c584f59c..c3583524c37a 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -6,6 +6,7 @@ */ #include +#include #include #include #include @@ -28,13 +29,6 @@ #include #include -typedef struct irqentry_state { - union { - bool exit_rcu; - bool lockdep; - }; -} arm64_irqentry_state_t; - /* * Handle IRQ/context state management when entering from kernel mode. * Before this function is called it is not safe to call regular kernel code, @@ -43,31 +37,14 @@ typedef struct irqentry_state { * This is intended to match the logic in irqentry_enter(), handling the kernel * mode transitions only. */ -static __always_inline arm64_irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs) +static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs) { - arm64_irqentry_state_t state = { - .exit_rcu = false, - }; - - if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) { - lockdep_hardirqs_off(CALLER_ADDR0); - ct_irq_enter(); - trace_hardirqs_off_finish(); - - state.exit_rcu = true; - return state; - } - - lockdep_hardirqs_off(CALLER_ADDR0); - rcu_irq_enter_check_tick(); - trace_hardirqs_off_finish(); - - return state; + return irqentry_enter(regs); } -static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs) +static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs) { - arm64_irqentry_state_t state = __enter_from_kernel_mode(regs); + irqentry_state_t state = __enter_from_kernel_mode(regs); mte_check_tfsr_entry(); mte_disable_tco_entry(current); @@ -75,49 +52,6 @@ static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *reg return state; } -static inline bool arm64_preempt_schedule_irq(void) -{ - /* - * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC - * priority masking is used the GIC irqchip driver will clear DAIF.IF - * using gic_arch_enable_irqs() for normal IRQs. If anything is set in - * DAIF we must have handled an NMI, so skip preemption. 
- */
-	if (system_uses_irq_prio_masking() && read_sysreg(daif))
-		return false;
-
-	/*
-	 * Preempting a task from an IRQ means we leave copies of PSTATE
-	 * on the stack. cpufeature's enable calls may modify PSTATE, but
-	 * resuming one of these preempted tasks would undo those changes.
-	 *
-	 * Only allow a task to be preempted once cpufeatures have been
-	 * enabled.
-	 */
-	if (!system_capabilities_finalized())
-		return false;
-
-	return true;
-}
-
-void raw_irqentry_exit_cond_resched(void)
-{
-	if (!preempt_count()) {
-		if (need_resched() && arm64_preempt_schedule_irq())
-			preempt_schedule_irq();
-	}
-}
-
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_irqentry_exit_cond_resched(void)
-{
-	if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
-		return;
-	raw_irqentry_exit_cond_resched();
-}
-#endif
-
 /*
  * Handle IRQ/context state management when exiting to kernel mode.
  * After this function returns it is not safe to call regular kernel code,
@@ -127,31 +61,13 @@ void dynamic_irqentry_exit_cond_resched(void)
  * mode transitions only, and with preemption handled elsewhere.
  */
 static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
-						  arm64_irqentry_state_t state)
-{
-	lockdep_assert_irqs_disabled();
-
-	if (!regs_irqs_disabled(regs)) {
-		if (state.exit_rcu) {
-			trace_hardirqs_on_prepare();
-			lockdep_hardirqs_on_prepare();
-			ct_irq_exit();
-			lockdep_hardirqs_on(CALLER_ADDR0);
-			return;
-		}
-
-		if (IS_ENABLED(CONFIG_PREEMPTION))
-			irqentry_exit_cond_resched();
-
-		trace_hardirqs_on();
-	} else {
-		if (state.exit_rcu)
-			ct_irq_exit();
-	}
+						  irqentry_state_t state)
+{
+	irqentry_exit(regs, state);
 }
 
 static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
-					arm64_irqentry_state_t state)
+					irqentry_state_t state)
 {
 	mte_check_tfsr_exit();
 	__exit_to_kernel_mode(regs, state);
@@ -162,18 +78,15 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
  * Before this function is called it is not safe to call regular kernel code,
  * instrumentable code, or any code which may trigger an exception.
  */
-static __always_inline void __enter_from_user_mode(void)
+static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
 {
-	lockdep_hardirqs_off(CALLER_ADDR0);
-	CT_WARN_ON(ct_state() != CT_STATE_USER);
-	user_exit_irqoff();
-	trace_hardirqs_off_finish();
+	enter_from_user_mode(regs);
 	mte_disable_tco_entry(current);
 }
 
-static __always_inline void enter_from_user_mode(struct pt_regs *regs)
+static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs)
 {
-	__enter_from_user_mode();
+	__enter_from_user_mode(regs);
 }
 
 /*
@@ -181,113 +94,18 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
  * After this function returns it is not safe to call regular kernel code,
  * instrumentable code, or any code which may trigger an exception.
  */
-static __always_inline void __exit_to_user_mode(void)
-{
-	trace_hardirqs_on_prepare();
-	lockdep_hardirqs_on_prepare();
-	user_enter_irqoff();
-	lockdep_hardirqs_on(CALLER_ADDR0);
-}
-
-static void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
-{
-	do {
-		local_irq_enable();
-
-		if (thread_flags & _TIF_NEED_RESCHED)
-			schedule();
-
-		if (thread_flags & _TIF_UPROBE)
-			uprobe_notify_resume(regs);
-
-		if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
-			clear_thread_flag(TIF_MTE_ASYNC_FAULT);
-			send_sig_fault(SIGSEGV, SEGV_MTEAERR,
-				       (void __user *)NULL, current);
-		}
-
-		if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
-			do_signal(regs);
-
-		if (thread_flags & _TIF_NOTIFY_RESUME)
-			resume_user_mode_work(regs);
-
-		if (thread_flags & _TIF_FOREIGN_FPSTATE)
-			fpsimd_restore_current_state();
-
-		local_irq_disable();
-		thread_flags = read_thread_flags();
-	} while (thread_flags & _TIF_WORK_MASK);
-}
-
-static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
+static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs)
 {
-	unsigned long flags;
-	local_irq_disable();
-
-	flags = read_thread_flags();
-	if (unlikely(flags & _TIF_WORK_MASK))
-		do_notify_resume(regs, flags);
-
-	local_daif_mask();
-
-	lockdep_sys_exit();
-}
-
-static __always_inline void exit_to_user_mode(struct pt_regs *regs)
-{
 	exit_to_user_mode_prepare(regs);
+	local_daif_mask();
 	mte_check_tfsr_exit();
-	__exit_to_user_mode();
+	exit_to_user_mode();
 }
 
 asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
 {
-	exit_to_user_mode(regs);
-}
-
-/*
- * Handle IRQ/context state management when entering an NMI from user/kernel
- * mode. Before this function is called it is not safe to call regular kernel
- * code, instrumentable code, or any code which may trigger an exception.
- */
-static noinstr arm64_irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
-{
-	arm64_irqentry_state_t state;
-
-	state.lockdep = lockdep_hardirqs_enabled();
-
-	__nmi_enter();
-	lockdep_hardirqs_off(CALLER_ADDR0);
-	lockdep_hardirq_enter();
-	ct_nmi_enter();
-
-	trace_hardirqs_off_finish();
-	ftrace_nmi_enter();
-
-	return state;
-}
-
-/*
- * Handle IRQ/context state management when exiting an NMI from user/kernel
- * mode. After this function returns it is not safe to call regular kernel
- * code, instrumentable code, or any code which may trigger an exception.
- */
-static void noinstr arm64_exit_nmi(struct pt_regs *regs,
-				   arm64_irqentry_state_t state)
-{
-	ftrace_nmi_exit();
-	if (state.lockdep) {
-		trace_hardirqs_on_prepare();
-		lockdep_hardirqs_on_prepare();
-	}
-
-	ct_nmi_exit();
-	lockdep_hardirq_exit();
-	if (state.lockdep)
-		lockdep_hardirqs_on(CALLER_ADDR0);
-	__nmi_exit();
+	arm64_exit_to_user_mode(regs);
 }
 
 /*
@@ -295,9 +113,9 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs,
  * kernel mode. Before this function is called it is not safe to call regular
  * kernel code, instrumentable code, or any code which may trigger an exception.
  */
-static noinstr arm64_irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
+static noinstr irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
 {
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 
 	state.lockdep = lockdep_hardirqs_enabled();
 
@@ -315,7 +133,7 @@ static noinstr arm64_irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
  * kernel code, instrumentable code, or any code which may trigger an exception.
  */
 static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
-				       arm64_irqentry_state_t state)
+				       irqentry_state_t state)
 {
 	if (state.lockdep) {
 		trace_hardirqs_on_prepare();
@@ -346,7 +164,7 @@ extern void (*handle_arch_fiq)(struct pt_regs *);
 static void noinstr __panic_unhandled(struct pt_regs *regs, const char *vector,
 				      unsigned long esr)
 {
-	arm64_enter_nmi(regs);
+	irqentry_nmi_enter(regs);
 
 	console_verbose();
 
@@ -452,7 +270,7 @@ UNHANDLED(el1t, 64, error)
 static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
 {
 	unsigned long far = read_sysreg(far_el1);
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 
 	state = enter_from_kernel_mode(regs);
 	local_daif_inherit(regs);
@@ -464,7 +282,7 @@ static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
 static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr)
 {
 	unsigned long far = read_sysreg(far_el1);
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 
 	state = enter_from_kernel_mode(regs);
 	local_daif_inherit(regs);
@@ -475,7 +293,7 @@ static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr)
 
 static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr)
 {
-	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+	irqentry_state_t state = enter_from_kernel_mode(regs);
 
 	local_daif_inherit(regs);
 	do_el1_undef(regs, esr);
@@ -485,7 +303,7 @@ static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr)
 
 static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
 {
-	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+	irqentry_state_t state = enter_from_kernel_mode(regs);
 
 	local_daif_inherit(regs);
 	do_el1_bti(regs, esr);
@@ -495,7 +313,7 @@ static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
 
 static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
 {
-	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+	irqentry_state_t state = enter_from_kernel_mode(regs);
 
 	local_daif_inherit(regs);
 	do_el1_gcs(regs, esr);
@@ -505,7 +323,7 @@ static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
 
 static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
 {
-	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+	irqentry_state_t state = enter_from_kernel_mode(regs);
 
 	local_daif_inherit(regs);
 	do_el1_mops(regs, esr);
@@ -516,7 +334,7 @@ static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
 static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
 {
 	unsigned long far = read_sysreg(far_el1);
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 
 	state = arm64_enter_el1_dbg(regs);
 	if (!cortex_a76_erratum_1463225_debug_handler(regs))
@@ -526,7 +344,7 @@ static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
 
 static void noinstr el1_fpac(struct pt_regs *regs, unsigned long esr)
 {
-	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+	irqentry_state_t state = enter_from_kernel_mode(regs);
 
 	local_daif_inherit(regs);
 	do_el1_fpac(regs, esr);
@@ -580,16 +398,16 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
 static __always_inline void __el1_pnmi(struct pt_regs *regs,
 				       void (*handler)(struct pt_regs *))
 {
-	arm64_irqentry_state_t state = arm64_enter_nmi(regs);
+	irqentry_state_t state = irqentry_nmi_enter(regs);
 
 	do_interrupt_handler(regs, handler);
-	arm64_exit_nmi(regs, state);
+	irqentry_nmi_exit(regs, state);
 }
 
 static __always_inline void __el1_irq(struct pt_regs *regs,
 				      void (*handler)(struct pt_regs *))
 {
-	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+	irqentry_state_t state = enter_from_kernel_mode(regs);
 
 	irq_enter_rcu();
 	do_interrupt_handler(regs, handler);
@@ -621,22 +439,22 @@ asmlinkage void noinstr el1h_64_fiq_handler(struct pt_regs *regs)
 asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs)
 {
 	unsigned long esr = read_sysreg(esr_el1);
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 
 	local_daif_restore(DAIF_ERRCTX);
-	state = arm64_enter_nmi(regs);
+	state = irqentry_nmi_enter(regs);
 	do_serror(regs, esr);
-	arm64_exit_nmi(regs, state);
+	irqentry_nmi_exit(regs, state);
 }
 
 static void noinstr el0_da(struct pt_regs *regs, unsigned long esr)
 {
 	unsigned long far = read_sysreg(far_el1);
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_mem_abort(far, esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_ia(struct pt_regs *regs, unsigned long esr)
@@ -651,50 +469,50 @@ static void noinstr el0_ia(struct pt_regs *regs, unsigned long esr)
 	if (!is_ttbr0_addr(far))
 		arm64_apply_bp_hardening();
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_mem_abort(far, esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_fpsimd_acc(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_fpsimd_acc(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_sve_acc(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_sme_acc(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_fpsimd_exc(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_sys(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_sys(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
@@ -704,58 +522,58 @@ static void noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
 	if (!is_ttbr0_addr(instruction_pointer(regs)))
 		arm64_apply_bp_hardening();
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_sp_pc_abort(far, esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_sp(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_sp_pc_abort(regs->sp, esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_undef(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_undef(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_bti(struct pt_regs *regs)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_bti(regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_mops(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_gcs(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_gcs(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	bad_el0_sync(regs, 0, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
@@ -763,28 +581,28 @@ static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
 	/* Only watchpoints write FAR_EL1, otherwise its UNKNOWN */
 	unsigned long far = read_sysreg(far_el1);
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	do_debug_exception(far, esr, regs);
 	local_daif_restore(DAIF_PROCCTX);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_svc(struct pt_regs *regs)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	cortex_a76_erratum_1463225_svc_handler();
 	fp_user_discard();
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_svc(regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_fpac(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
@@ -852,7 +670,7 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
 static void noinstr el0_interrupt(struct pt_regs *regs,
 				  void (*handler)(struct pt_regs *))
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 
 	write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
 
@@ -863,7 +681,7 @@ static void noinstr el0_interrupt(struct pt_regs *regs,
 	do_interrupt_handler(regs, handler);
 	irq_exit_rcu();
 
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr __el0_irq_handler_common(struct pt_regs *regs)
@@ -889,15 +707,15 @@ asmlinkage void noinstr el0t_64_fiq_handler(struct pt_regs *regs)
 static void noinstr __el0_error_handler_common(struct pt_regs *regs)
 {
 	unsigned long esr = read_sysreg(esr_el1);
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_ERRCTX);
-	state = arm64_enter_nmi(regs);
+	state = irqentry_nmi_enter(regs);
 	do_serror(regs, esr);
-	arm64_exit_nmi(regs, state);
+	irqentry_nmi_exit(regs, state);
 	local_daif_restore(DAIF_PROCCTX);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
@@ -908,19 +726,19 @@ asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
 #ifdef CONFIG_COMPAT
 static void noinstr el0_cp15(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_cp15(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_svc_compat(struct pt_regs *regs)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	cortex_a76_erratum_1463225_svc_handler();
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_svc_compat(regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 asmlinkage void noinstr el0t_32_sync_handler(struct pt_regs *regs)
@@ -994,7 +812,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
 	unsigned long esr = read_sysreg(esr_el1);
 	unsigned long far = read_sysreg(far_el1);
 
-	arm64_enter_nmi(regs);
+	irqentry_nmi_enter(regs);
 	panic_bad_stack(regs, esr, far);
 }
 #endif /* CONFIG_VMAP_STACK */
@@ -1003,7 +821,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
 asmlinkage noinstr unsigned long
 __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
 {
-	arm64_irqentry_state_t state;
+	irqentry_state_t state;
 	unsigned long ret;
 
 	/*
@@ -1028,9 +846,9 @@ __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
 	else if (cpu_has_pan())
 		set_pstate_pan(0);
 
-	state = arm64_enter_nmi(regs);
+	state = irqentry_nmi_enter(regs);
 	ret = do_sdei_event(regs, arg);
-	arm64_exit_nmi(regs, state);
+	irqentry_nmi_exit(regs, state);
 
 	return ret;
 }
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 99ea26d400ff..e1c1abc2cb3f 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1616,7 +1617,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
  * the kernel can handle, and then we build all the user-level signal handling
  * stack-frames in one go after that.
  */
-void do_signal(struct pt_regs *regs)
+void arch_do_signal_or_restart(struct pt_regs *regs)
 {
 	unsigned long continue_addr = 0, restart_addr = 0;
 	int retval = 0;