From patchwork Tue Dec 20 06:36:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32CBAC4332F for ; Tue, 20 Dec 2022 07:01:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233214AbiLTHBg (ORCPT ); Tue, 20 Dec 2022 02:01:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53238 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229824AbiLTHB2 (ORCPT ); Tue, 20 Dec 2022 02:01:28 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E3002ADE; Mon, 19 Dec 2022 23:01:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519687; x=1703055687; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4bpIJXn+Cqwm7qv6s3tf9xXrKBIQ0tJR6b3uTXiGGZk=; b=U9n3NMKd9/M1MJ5C0BdK509iNeNj9CjX5a+22Os/TrITjAk5O6EKPoMm 5ddPJ3U/A03upZnxUz1ddMtjKzD1UOM+U7zIQRu5gaUEA1/nefq/3Kb50 GDJq9qry0InPbbW2iibuZN2ruwd1c51c8OYPkpDy58nrgSAcE9ZejzM/U Emn9UQ/zeX1ELyTjGlDgo09BYXRWeGE3JZds0UU9c2X2Ze5I+L7XBrcDR QCQHlf+IV3rH+CObMN/AauIo9DNiJ80trMojSSNrRXwsqY3l/z3cuBOSE 829ER6gudPEmuHQYtcGtLI97SBZqT/LRk9rd/MVbw0KHWMDMiKcfjLLc/ w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971889" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971889" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326404" X-IronPort-AV: 
E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326404" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:08 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 01/32] x86/traps: let common_interrupt() handle IRQ_MOVE_CLEANUP_VECTOR Date: Mon, 19 Dec 2022 22:36:27 -0800 Message-Id: <20221220063658.19271-2-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" IRQ_MOVE_CLEANUP_VECTOR is the only one of the system IRQ vectors that is *below* FIRST_SYSTEM_VECTOR. It is a slow path, so just push it into common_interrupt() just before the spurious interrupt handling. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/kernel/irq.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 766ffe3ba313..7e125fff45ab 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -248,6 +248,10 @@ DEFINE_IDTENTRY_IRQ(common_interrupt) desc = __this_cpu_read(vector_irq[vector]); if (likely(!IS_ERR_OR_NULL(desc))) { handle_irq(desc, regs); +#ifdef CONFIG_SMP + } else if (vector == IRQ_MOVE_CLEANUP_VECTOR) { + sysvec_irq_move_cleanup(regs); +#endif } else { ack_APIC_irq(); From patchwork Tue Dec 20 06:36:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077527 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21086C4167B for ; Tue, 20 Dec 2022 07:01:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233270AbiLTHBm (ORCPT ); Tue, 20 Dec 2022 02:01:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53250 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232633AbiLTHBa (ORCPT ); Tue, 20 Dec 2022 02:01:30 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20D2A140AC; Mon, 19 Dec 2022 23:01:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519688; x=1703055688; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ntf8jaiwEE+fGtM3gCkXz+ISQnT30q6joOVeczDYVtI=; b=daPFWhWCJKgA6FbvS9+sInhSNpV8QO8mpipgfamFV6D5Tk4Q9L82l1Zn vL3dGlmQ+2u6fzCFfgJr673HaU2g8cygNPQaOpJZyN6bwUaof1NR5klRw 
pRYU/PCIwAQG3mbR2cL3+IpclcBOmjYvVohkgC4s1WytFddIpTFAMMRJf L7mEoGr17XPAoPJXFMMFN4FlgZK5RK35HIrmZGas20xGThhmZ5fmQwSIT OODXyspzM3HsV2iJ9gcDRbQRbrRI7Wo1HgtjxAto1zgbSCdqkz1ahca/j 3EMBYGLofCzXBv2iF4MJH5nnb/yvie6WF50dmT7qVZF1HBKwNnS3oRrk8 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971899" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971899" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326409" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326409" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:08 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 02/32] x86/traps: add a system interrupt table for system interrupt dispatch Date: Mon, 19 Dec 2022 22:36:28 -0800 Message-Id: <20221220063658.19271-3-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Upon receiving an external interrupt, KVM VMX reinjects it by calling the interrupt handler in its IDT descriptor on the current kernel stack, which essentially uses the IDT as an interrupt dispatch table. However, the IDT is one of the lowest level critical data structures between an x86 CPU and the Linux kernel, so we should avoid using it *directly* whenever possible, especially in a software defined manner. 
On x86, external interrupts are divided into the following groups:

  1) system interrupts
  2) external device interrupts

With the IDT, system interrupts are dispatched through the IDT directly,
while external device interrupts are all routed to the external interrupt
dispatch function common_interrupt(), which dispatches external device
interrupts through a per-CPU external interrupt dispatch table vector_irq.

To eliminate dispatching external interrupts through the IDT, add a system
interrupt handler table for dispatching a system interrupt to its
corresponding handler directly. Thus a software based dispatch function
will be:

  void external_interrupt(struct pt_regs *regs, u8 vector)
  {
          if (is_system_interrupt(vector))
                  system_interrupt_handlers[vector_to_sysvec(vector)](regs);
          else /* external device interrupt */
                  common_interrupt(regs, vector);
  }

What's more, with the Intel FRED (Flexible Return and Event Delivery)
architecture, the IDT, the hardware based event dispatch table, is gone,
and the Linux kernel needs to dispatch events to their handlers through
vector to handler mappings, so the dispatch function external_interrupt()
is needed there as well.

Signed-off-by: H. 
Peter Anvin (Intel) Co-developed-by: Xin Li Signed-off-by: Xin Li --- arch/x86/include/asm/idtentry.h | 56 +++++++++++++++++++++++++++------ arch/x86/include/asm/traps.h | 7 +++++ arch/x86/kernel/traps.c | 40 +++++++++++++++++++++++ 3 files changed, 93 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 72184b0b2219..966d720046f1 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -167,18 +167,24 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code) /** * DECLARE_IDTENTRY_IRQ - Declare functions for device interrupt IDT entry - * points (common/spurious) + * points (common/spurious) and their corresponding + * software based dispatch handlers in non-noinstr + * text section * @vector: Vector number (ignored for C) * @func: Function name of the entry point * * Maps to DECLARE_IDTENTRY_ERRORCODE() */ #define DECLARE_IDTENTRY_IRQ(vector, func) \ - DECLARE_IDTENTRY_ERRORCODE(vector, func) + DECLARE_IDTENTRY_ERRORCODE(vector, func); \ + void dispatch_##func(struct pt_regs *regs, unsigned long error_code) /** * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points - * @func: Function name of the entry point + * and their corresponding software based dispatch + * handlers in non-noinstr text section. 
+ * @func: Function name of the IDT entry point + * @dispatch_func: Function name of the software based dispatch handler * * The vector number is pushed by the low level entry stub and handed * to the function as error_code argument which needs to be truncated @@ -204,10 +210,20 @@ __visible noinstr void func(struct pt_regs *regs, \ irqentry_exit(regs, state); \ } \ \ +void dispatch_##func(struct pt_regs *regs, unsigned long error_code) \ +{ \ + u32 vector = (u32)(u8)error_code; \ + \ + kvm_set_cpu_l1tf_flush_l1d(); \ + run_irq_on_irqstack_cond(__##func, regs, vector); \ +} \ + \ static noinline void __##func(struct pt_regs *regs, u32 vector) /** * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points + * and their corresponding software based dispatch + * handlers in non-noinstr text section * @vector: Vector number (ignored for C) * @func: Function name of the entry point * @@ -215,15 +231,20 @@ static noinline void __##func(struct pt_regs *regs, u32 vector) * - The ASM entry point: asm_##func * - The XEN PV trap entry point: xen_##func (maybe unused) * - The C handler called from the ASM entry point + * - The C handler used in the system interrupt handler table * * Maps to DECLARE_IDTENTRY(). */ #define DECLARE_IDTENTRY_SYSVEC(vector, func) \ - DECLARE_IDTENTRY(vector, func) + DECLARE_IDTENTRY(vector, func); \ + void dispatch_table_##func(struct pt_regs *regs) /** * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points - * @func: Function name of the entry point + * and their corresponding software based dispatch + * handlers in non-noinstr text section + * @func: Function name of the IDT entry point + * @dispatch_table_func:Function name of the software based dispatch handler * * irqentry_enter/exit() and irq_enter/exit_rcu() are invoked before the * function body. KVM L1D flush request is set. 
@@ -244,12 +265,21 @@ __visible noinstr void func(struct pt_regs *regs) \ irqentry_exit(regs, state); \ } \ \ +void dispatch_table_##func(struct pt_regs *regs) \ +{ \ + kvm_set_cpu_l1tf_flush_l1d(); \ + run_sysvec_on_irqstack_cond(__##func, regs); \ +} \ + \ static noinline void __##func(struct pt_regs *regs) /** * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT - * entry points - * @func: Function name of the entry point + * entry points and their corresponding + * software based dispatch handlers in + * non-noinstr text section + * @func: Function name of the IDT entry point + * @dispatch_table_func:Function name of the software based dispatch handler * * Runs the function on the interrupted stack. No switch to IRQ stack and * only the minimal __irq_enter/exit() handling. @@ -273,6 +303,14 @@ __visible noinstr void func(struct pt_regs *regs) \ irqentry_exit(regs, state); \ } \ \ +void dispatch_table_##func(struct pt_regs *regs) \ +{ \ + __irq_enter_raw(); \ + kvm_set_cpu_l1tf_flush_l1d(); \ + __##func (regs); \ + __irq_exit_raw(); \ +} \ + \ static __always_inline void __##func(struct pt_regs *regs) /** @@ -638,9 +676,7 @@ DECLARE_IDTENTRY(X86_TRAP_VE, exc_virtualization_exception); /* Device interrupts common/spurious */ DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, common_interrupt); -#ifdef CONFIG_X86_LOCAL_APIC DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, spurious_interrupt); -#endif /* System vector entry points */ #ifdef CONFIG_X86_LOCAL_APIC @@ -651,7 +687,7 @@ DECLARE_IDTENTRY_SYSVEC(X86_PLATFORM_IPI_VECTOR, sysvec_x86_platform_ipi); #endif #ifdef CONFIG_SMP -DECLARE_IDTENTRY(RESCHEDULE_VECTOR, sysvec_reschedule_ipi); +DECLARE_IDTENTRY_SYSVEC(RESCHEDULE_VECTOR, sysvec_reschedule_ipi); DECLARE_IDTENTRY_SYSVEC(IRQ_MOVE_CLEANUP_VECTOR, sysvec_irq_move_cleanup); DECLARE_IDTENTRY_SYSVEC(REBOOT_VECTOR, sysvec_reboot); DECLARE_IDTENTRY_SYSVEC(CALL_FUNCTION_SINGLE_VECTOR, sysvec_call_function_single); diff --git a/arch/x86/include/asm/traps.h 
b/arch/x86/include/asm/traps.h index 47ecfff2c83d..28c8ba5fd81c 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -47,4 +47,11 @@ void __noreturn handle_stack_overflow(struct pt_regs *regs, struct stack_info *info); #endif +/* + * How system interrupt handlers are called. + */ +#define DECLARE_SYSTEM_INTERRUPT_HANDLER(f) \ + void f (struct pt_regs *regs) +typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); + #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index d3fdec706f1d..8f751c06c052 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1451,6 +1451,46 @@ DEFINE_IDTENTRY_SW(iret_error) } #endif +#define SYSV(x,y) [(x) - FIRST_SYSTEM_VECTOR] = y + +static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { +#ifdef CONFIG_SMP + SYSV(RESCHEDULE_VECTOR, dispatch_table_sysvec_reschedule_ipi), + SYSV(CALL_FUNCTION_VECTOR, dispatch_table_sysvec_call_function), + SYSV(CALL_FUNCTION_SINGLE_VECTOR, dispatch_table_sysvec_call_function_single), + SYSV(REBOOT_VECTOR, dispatch_table_sysvec_reboot), +#endif + +#ifdef CONFIG_X86_THERMAL_VECTOR + SYSV(THERMAL_APIC_VECTOR, dispatch_table_sysvec_thermal), +#endif + +#ifdef CONFIG_X86_MCE_THRESHOLD + SYSV(THRESHOLD_APIC_VECTOR, dispatch_table_sysvec_threshold), +#endif + +#ifdef CONFIG_X86_MCE_AMD + SYSV(DEFERRED_ERROR_VECTOR, dispatch_table_sysvec_deferred_error), +#endif + +#ifdef CONFIG_X86_LOCAL_APIC + SYSV(LOCAL_TIMER_VECTOR, dispatch_table_sysvec_apic_timer_interrupt), + SYSV(X86_PLATFORM_IPI_VECTOR, dispatch_table_sysvec_x86_platform_ipi), +# ifdef CONFIG_HAVE_KVM + SYSV(POSTED_INTR_VECTOR, dispatch_table_sysvec_kvm_posted_intr_ipi), + SYSV(POSTED_INTR_WAKEUP_VECTOR, dispatch_table_sysvec_kvm_posted_intr_wakeup_ipi), + SYSV(POSTED_INTR_NESTED_VECTOR, dispatch_table_sysvec_kvm_posted_intr_nested_ipi), +# endif +# ifdef CONFIG_IRQ_WORK + SYSV(IRQ_WORK_VECTOR, dispatch_table_sysvec_irq_work), +# 
endif + SYSV(SPURIOUS_APIC_VECTOR, dispatch_table_sysvec_spurious_apic_interrupt), + SYSV(ERROR_APIC_VECTOR, dispatch_table_sysvec_error_interrupt), +#endif +}; + +#undef SYSV + void __init trap_init(void) { /* Init cpu_entry_area before IST entries are set up */ From patchwork Tue Dec 20 06:36:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077526 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 298BAC4332F for ; Tue, 20 Dec 2022 07:01:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233244AbiLTHBi (ORCPT ); Tue, 20 Dec 2022 02:01:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232561AbiLTHBa (ORCPT ); Tue, 20 Dec 2022 02:01:30 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ABDF413DD5; Mon, 19 Dec 2022 23:01:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519687; x=1703055687; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hrHn73+7WwsxSUNYDH7X4fW8QMTERBDkB3EEBTnAey4=; b=ACXeGdjI8PhbNXCYfIagmFFeD/th+MR5N6bK3A9i+OmSrQVNXjjQ8i/H jxA4sneQ/jjrVAlfbS6d8hzNUpR/C1w3iQBOnL1ES5i33kkl+HOtvcgn+ 3YhiG5ue6jQFhixmyv6YNbHmvv0QEHzeesoD6tMQvf4mkJzSKU9Wd3V/Z cDNHrrz4CZPpbBVCDlxT2F5ThObzPCRXwlDpS6a6lLCNUPwMWSNEG2blU xOcujj+d1E3Z67UqWlAfs2DmBPPDww08LX5L0fgJXd7+84enLkIjZ7Its /rEm89XpXsA82siqF+f6Fore2RNWphzX0YeHjN+O1l22VXiD/LjS/TGTo A==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971910" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; 
d="scan'208";a="302971910" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326415" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326415" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:09 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 03/32] x86/traps: add install_system_interrupt_handler() Date: Mon, 19 Dec 2022 22:36:29 -0800 Message-Id: <20221220063658.19271-4-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Some kernel components install system interrupt handlers into the IDT, and we need to do the same for system_interrupt_handlers. A new function install_system_interrupt_handler() is added to install a system interrupt handler into both the IDT and system_interrupt_handlers. 
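As a rough user-space illustration only (not kernel code), the dual installation described above can be modelled as below. The vector values, the model_idt[] array, the asm_entry_stub type and the sample_* functions are assumptions made up for this sketch; the real function takes untyped addresses and calls alloc_intr_gate() rather than writing to an array.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in constants for illustration; the kernel derives these from its config. */
#define NR_VECTORS                 256
#define FIRST_SYSTEM_VECTOR        0xec	/* assumed value */
#define NR_SYSTEM_VECTORS          (NR_VECTORS - FIRST_SYSTEM_VECTOR)
#define HYPERVISOR_CALLBACK_VECTOR 0xf3	/* assumed value */

struct pt_regs { unsigned long ip; };	/* placeholder, not the real layout */

typedef void (*system_interrupt_handler)(struct pt_regs *regs);
typedef void (*asm_entry_stub)(void);	/* hypothetical type for an asm entry point */

/* Hypothetical stand-in for the IDT: one entry stub per vector. */
static asm_entry_stub model_idt[NR_VECTORS];

/* Software dispatch table, indexed by (vector - FIRST_SYSTEM_VECTOR). */
static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS];

/* Install one handler into both tables, mirroring the idea of the patch. */
void install_system_interrupt_handler(unsigned int n, asm_entry_stub asm_addr,
				      system_interrupt_handler handler)
{
	/* The kernel uses BUG_ON(); the upper bound check is added for this model. */
	assert(n >= FIRST_SYSTEM_VECTOR && n < NR_VECTORS);

	system_interrupt_handlers[n - FIRST_SYSTEM_VECTOR] = handler;
	model_idt[n] = asm_addr;
}

/* Sample handler pair used only to exercise the model. */
static int callback_count;

void asm_sample_hv_callback(void)
{
}

void sample_hv_callback(struct pt_regs *regs)
{
	(void)regs;
	callback_count++;
}
```

A caller would install the pair once at init time, e.g. install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, asm_sample_hv_callback, sample_hv_callback), after which both the hardware path (IDT) and the software path (table lookup) reach the same C handler.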
Signed-off-by: Xin Li --- arch/x86/include/asm/traps.h | 2 ++ arch/x86/kernel/cpu/acrn.c | 7 +++++-- arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++-------- arch/x86/kernel/kvm.c | 4 +++- arch/x86/kernel/traps.c | 8 ++++++++ drivers/xen/events/events_base.c | 5 ++++- 6 files changed, 36 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 28c8ba5fd81c..46f5e4e2a346 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -41,6 +41,8 @@ void math_emulate(struct math_emu_info *); bool fault_in_kernel_space(unsigned long address); +void install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr); + #ifdef CONFIG_VMAP_STACK void __noreturn handle_stack_overflow(struct pt_regs *regs, unsigned long fault_address, diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c index 485441b7f030..9351bf183a9e 100644 --- a/arch/x86/kernel/cpu/acrn.c +++ b/arch/x86/kernel/cpu/acrn.c @@ -18,6 +18,7 @@ #include #include #include +#include static u32 __init acrn_detect(void) { @@ -26,8 +27,10 @@ static u32 __init acrn_detect(void) static void __init acrn_init_platform(void) { - /* Setup the IDT for ACRN hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_acrn_hv_callback); + /* Install system interrupt handler for ACRN hypervisor callback */ + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_acrn_hv_callback, + sysvec_acrn_hv_callback); x86_platform.calibrate_tsc = acrn_get_tsc_khz; x86_platform.calibrate_cpu = acrn_get_tsc_khz; diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 831613959a92..144b4a622188 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -415,19 +416,24 @@ static void __init ms_hyperv_init_platform(void) */ x86_platform.apic_post_init = 
hyperv_init; hyperv_setup_mmu_ops(); - /* Setup the IDT for hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_hyperv_callback); - /* Setup the IDT for reenlightenment notifications */ + /* Install system interrupt handler for hypervisor callback */ + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_hyperv_callback, + sysvec_hyperv_callback); + + /* Install system interrupt handler for reenlightenment notifications */ if (ms_hyperv.features & HV_ACCESS_REENLIGHTENMENT) { - alloc_intr_gate(HYPERV_REENLIGHTENMENT_VECTOR, - asm_sysvec_hyperv_reenlightenment); + install_system_interrupt_handler(HYPERV_REENLIGHTENMENT_VECTOR, + asm_sysvec_hyperv_reenlightenment, + sysvec_hyperv_reenlightenment); } - /* Setup the IDT for stimer0 */ + /* Install system interrupt handler for stimer0 */ if (ms_hyperv.misc_features & HV_STIMER_DIRECT_MODE_AVAILABLE) { - alloc_intr_gate(HYPERV_STIMER0_VECTOR, - asm_sysvec_hyperv_stimer0); + install_system_interrupt_handler(HYPERV_STIMER0_VECTOR, + asm_sysvec_hyperv_stimer0, + sysvec_hyperv_stimer0); } # ifdef CONFIG_SMP diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index d4e48b4a438b..b7388ed2a980 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -835,7 +835,9 @@ static void __init kvm_guest_init(void) if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT) && kvmapf) { static_branch_enable(&kvm_async_pf_enabled); - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_kvm_asyncpf_interrupt); + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_kvm_asyncpf_interrupt, + sysvec_kvm_asyncpf_interrupt); } #ifdef CONFIG_SMP diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 8f751c06c052..2b8530235e47 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1491,6 +1491,14 @@ static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { #undef SYSV +void __init 
install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr) +{ + BUG_ON(n < FIRST_SYSTEM_VECTOR); + + system_interrupt_handlers[n - FIRST_SYSTEM_VECTOR] = (system_interrupt_handler)addr; + alloc_intr_gate(n, asm_addr); +} + void __init trap_init(void) { /* Init cpu_entry_area before IST entries are set up */ diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c443f04aaad7..1a9eaf417acc 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include #endif @@ -2246,7 +2247,9 @@ static __init void xen_alloc_callback_vector(void) return; pr_info("Xen HVM callback vector for event delivery is enabled\n"); - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_xen_hvm_callback); + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_xen_hvm_callback, + sysvec_xen_hvm_callback); } #else void xen_setup_callback_vector(void) {} From patchwork Tue Dec 20 06:36:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077529 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90844C4167B for ; Tue, 20 Dec 2022 07:01:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233324AbiLTHBq (ORCPT ); Tue, 20 Dec 2022 02:01:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53266 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232858AbiLTHBb (ORCPT ); Tue, 20 Dec 2022 02:01:31 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0596E2ADE; Mon, 19 Dec 2022 23:01:29 -0800 
(PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519690; x=1703055690; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DCWXhKrktyBv8aibjttu3M7wa+RfYSum0IIVl7wgtzY=; b=ZbEX1h4+x9FlSbSJjDtzmwJSAx/UxRoI4QK35371x6/3wnU95Fs6iM4N QQnsV5Jb5Sld5P41pfAU145ybYCoBnOSeVmF+h+LufYj6k5Fc2TL2Im/t nqHkyWiLTNSRq9Ezpz89wFycqpWsU8fMAO0WklXFXR1aE9wzt5IYE4mWD hAVOfWoP0WrMQdS8AbDzn3fKtXsO+eIiGpdSKlQcmDamBO08OpoV9hiSZ 621E07VVw3r+g3+cPEj5/LI9W1XXEle2cvH2ZUvrJ1qNybSE6TpizqIFu VicJhpc4My/f4oMkVwZqPf9mQtpQiMvqI5JtbiTs0au1LaeYBXo504dcy A==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971919" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971919" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326421" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326421" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:09 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 04/32] x86/traps: add external_interrupt() to dispatch external interrupts Date: Mon, 19 Dec 2022 22:36:30 -0800 Message-Id: <20221220063658.19271-5-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. 
Peter Anvin (Intel)" Add external_interrupt() to dispatch external interrupts to their handlers. If an external interrupt is a system interrupt, dispatch it through the system_interrupt_handlers table; otherwise call into dispatch_common_interrupt(). Signed-off-by: H. Peter Anvin (Intel) Co-developed-by: Xin Li Signed-off-by: Xin Li --- arch/x86/kernel/traps.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 2b8530235e47..c35dd2b4d146 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1499,6 +1499,43 @@ void __init install_system_interrupt_handler(unsigned int n, const void *asm_add alloc_intr_gate(n, asm_addr); } +#ifndef CONFIG_X86_LOCAL_APIC +DEFINE_IDTENTRY_IRQ(spurious_interrupt) +{ + pr_info("Spurious interrupt (vector 0x%x) on CPU#%d, should never happen.\n", + vector, smp_processor_id()); +} +#endif + +/* + * External interrupt dispatch function. + * + * Until/unless dispatch_common_interrupt() can be taught to deal with the + * special system vectors, split the dispatch. + * + * Note: dispatch_common_interrupt() already deals with IRQ_MOVE_CLEANUP_VECTOR. 
+ */ +int external_interrupt(struct pt_regs *regs, unsigned int vector) +{ + unsigned int sysvec = vector - FIRST_SYSTEM_VECTOR; + + if (vector < FIRST_EXTERNAL_VECTOR) { + pr_err("invalid external interrupt vector %d\n", vector); + return -EINVAL; + } + + if (sysvec < NR_SYSTEM_VECTORS) { + if (system_interrupt_handlers[sysvec]) + system_interrupt_handlers[sysvec](regs); + else + dispatch_spurious_interrupt(regs, vector); + } else { + dispatch_common_interrupt(regs, vector); + } + + return 0; +} + void __init trap_init(void) { /* Init cpu_entry_area before IST entries are set up */ From patchwork Tue Dec 20 06:36:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077531 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28C33C4332F for ; Tue, 20 Dec 2022 07:01:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233354AbiLTHBv (ORCPT ); Tue, 20 Dec 2022 02:01:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53252 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232678AbiLTHBa (ORCPT ); Tue, 20 Dec 2022 02:01:30 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F4CC14D2E; Mon, 19 Dec 2022 23:01:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519688; x=1703055688; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b2h40hZlbzAYlSPYHfekdfFFlQufXWZBJiq+lRFxfWo=; b=HrRLsLqEhd0sRfW/H3OVWvOCi8quRwLkvMASsElbRI4DrOpksHhUCkjO WpOwAu9J4/VteZ71fKbQ4ZdsY1TBWHu9Fp+GanBgvCNPhWGNC8wNLZbhR 
EZaD0yzzhQb4m75w6f7QnQD6WNklg9tFx8YvYXKD5DBhx3+2F58dq6tEr zt/iqsgfy+KDXSGwFi5sMTDC5cim4aaX3RyxPj9f4Fl0BGCjpF8fm/rw7 cdNV2hgoyf3ZmP+l+yp9AcmYNF+QD3DJWCA2gthPRnBhIb61Af6zRCCPX oATmulEOr4V0Zv7hptLLUdaH8Z/mioVjw+V3RvMoX8QIswvRkcN0IQKoC w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971929" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971929" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326426" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326426" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:09 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 05/32] x86/traps: add exc_raise_irq() for VMX IRQ reinjection Date: Mon, 19 Dec 2022 22:36:31 -0800 Message-Id: <20221220063658.19271-6-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To eliminate dispatching IRQ through the IDT, add exc_raise_irq(), which calls external_interrupt() for IRQ reinjection. 
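For illustration, a user-space model of the resulting call chain (exc_raise_irq() -> external_interrupt() -> handler) might look like the sketch below. It is not kernel code: the vector values, the dispatch_path bookkeeping and the model_* functions are assumptions invented for this example, and the real handlers run on IRQ stacks with the usual entry/exit accounting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in constants for illustration; real values depend on the kernel config. */
#define NR_VECTORS            256
#define FIRST_EXTERNAL_VECTOR 0x20
#define FIRST_SYSTEM_VECTOR   0xec	/* assumed value */
#define NR_SYSTEM_VECTORS     (NR_VECTORS - FIRST_SYSTEM_VECTOR)
#define MODEL_TIMER_VECTOR    0xec	/* hypothetical system vector with a handler */

struct pt_regs { unsigned long flags; };	/* placeholder, not the real layout */

typedef void (*system_interrupt_handler)(struct pt_regs *regs);

/* Records which path the last dispatch took, so the model can be observed. */
enum dispatch_path { PATH_NONE, PATH_SYSTEM, PATH_SPURIOUS, PATH_COMMON };
static enum dispatch_path last_path = PATH_NONE;

static void model_timer_handler(struct pt_regs *regs)
{
	(void)regs;
	last_path = PATH_SYSTEM;
}

static void model_spurious_interrupt(struct pt_regs *regs, unsigned int vector)
{
	(void)regs;
	(void)vector;
	last_path = PATH_SPURIOUS;
}

static void model_common_interrupt(struct pt_regs *regs, unsigned int vector)
{
	(void)regs;
	(void)vector;
	last_path = PATH_COMMON;
}

static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = {
	[MODEL_TIMER_VECTOR - FIRST_SYSTEM_VECTOR] = model_timer_handler,
};

int external_interrupt(struct pt_regs *regs, unsigned int vector)
{
	/* Unsigned subtraction wraps for vectors below FIRST_SYSTEM_VECTOR,
	 * so a single "< NR_SYSTEM_VECTORS" test identifies system vectors. */
	unsigned int sysvec = vector - FIRST_SYSTEM_VECTOR;

	if (vector < FIRST_EXTERNAL_VECTOR)
		return -EINVAL;

	if (sysvec < NR_SYSTEM_VECTORS) {
		if (system_interrupt_handlers[sysvec])
			system_interrupt_handlers[sysvec](regs);
		else
			model_spurious_interrupt(regs, vector);
	} else {
		model_common_interrupt(regs, vector);
	}

	return 0;
}

/* Thin wrapper, as in the patch: the caller supplies the pt_regs. */
int exc_raise_irq(struct pt_regs *regs, unsigned int vector)
{
	return external_interrupt(regs, vector);
}
```

In this model a vector below FIRST_EXTERNAL_VECTOR is rejected with -EINVAL, a system vector without an installed handler falls through to the spurious path, and everything else is treated as a device interrupt.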
Signed-off-by: Xin Li --- arch/x86/include/asm/traps.h | 2 ++ arch/x86/kernel/traps.c | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 46f5e4e2a346..366b1675c033 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -56,4 +56,6 @@ void __noreturn handle_stack_overflow(struct pt_regs *regs, void f (struct pt_regs *regs) typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); +int exc_raise_irq(struct pt_regs *regs, u32 vector); + #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index c35dd2b4d146..99386836b02e 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1536,6 +1536,24 @@ int external_interrupt(struct pt_regs *regs, unsigned int vector) return 0; } +#if IS_ENABLED(CONFIG_KVM_INTEL) +/* + * KVM VMX reinjects IRQ on its current stack, it's a sync call + * thus the values in the pt_regs structure are not used in + * executing IRQ handlers, except cs.RPL and flags.IF, which + * are both always 0 in the VMX IRQ reinjection context. + * + * However, the pt_regs structure is sometimes used in stack + * dump, e.g., show_regs(). So let the caller, i.e., KVM VMX + * decide how to initialize the input pt_regs structure. 
+ */ +int exc_raise_irq(struct pt_regs *regs, u32 vector) +{ + return external_interrupt(regs, vector); +} +EXPORT_SYMBOL_GPL(exc_raise_irq); +#endif + void __init trap_init(void) { /* Init cpu_entry_area before IST entries are set up */ From patchwork Tue Dec 20 06:36:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077528 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A2CDC4332F for ; Tue, 20 Dec 2022 07:01:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233288AbiLTHBn (ORCPT ); Tue, 20 Dec 2022 02:01:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53268 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232920AbiLTHBb (ORCPT ); Tue, 20 Dec 2022 02:01:31 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05AC1DEC6; Mon, 19 Dec 2022 23:01:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519690; x=1703055690; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=t538ktauL1cAgq6AUr7Ip99zqgMTE2CVEc4jgszwTWk=; b=PoBoDKCfIBPEZDfKN8E6HndXBXY4ahxuMMcR5ndmeNEE3PqSgc8Tz3tY pll2fsGdDfWW2aKx+0oQc7V5fT7fkQWeDN5OPzn0bvtYjRwhUNcGEuOel TiaZmsQ+GEdnQtwLtSfpMtY+I/i8lPPQQ29gC6yoMhlqUAWque54DSbdA WYmKKsjKtslfr1OTvwviNJAWNjRrZaIulE5ZmUfjH4iii5JOHaGBW7yyK Cj0uo4T0SxCp+tB1VSeKE+KBtBoxcrZWUIXwIKkkYlHpDp+zvYDCHSPia mJ8pihvlbR4mXfTr9XbGjSUVqdJUzaz3NGDClufqspXRKHayy3C79S3mt g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971938" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971938" 
Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326431" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326431" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:10 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 06/32] x86/cpufeature: add the cpu feature bit for FRED Date: Mon, 19 Dec 2022 22:36:32 -0800 Message-Id: <20221220063658.19271-7-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add the CPU feature bit for FRED (Flexible Return and Event Delivery). The Intel flexible return and event delivery (FRED) architecture defines simple new transitions that change privilege level (ring transitions). The FRED architecture was designed with the following goals: 1) Improve overall performance and response time by replacing event delivery through the interrupt descriptor table (IDT event delivery) and event return by the IRET instruction with lower latency transitions. 2) Improve software robustness by ensuring that event delivery establishes the full supervisor context and that event return establishes the full user context. The new transitions defined by the FRED architecture are FRED event delivery and, for returning from events, two FRED return instructions. 
FRED event delivery can effect a transition from ring 3 to ring 0, but it is used also to deliver events incident to ring 0. One FRED instruction (ERETU) effects a return from ring 0 to ring 3, while the other (ERETS) returns while remaining in ring 0. The Intel FRED architecture spec can be downloaded from: https://cdrdv2.intel.com/v1/dl/getContent/678938 Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/cpufeatures.h | 1 + tools/arch/x86/include/asm/cpufeatures.h | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 29f53b31056e..6148e8a94d24 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -312,6 +312,7 @@ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */ #define X86_FEATURE_LKGS (12*32+ 18) /* "" Load "kernel" (userspace) gs */ +#define X86_FEATURE_FRED (12*32+ 17) /* Flexible Return and Event Delivery */ /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ #define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */ diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index 3dc1a48c2796..41d1e1b4a6cb 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -308,6 +308,7 @@ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */ +#define X86_FEATURE_FRED (12*32+ 17) /* Flexible Return and Event Delivery */ #define X86_FEATURE_LKGS (12*32+ 18) /* "" Load "kernel" (userspace) gs */ /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ From patchwork Tue Dec 20 06:36:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077530 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2C5AC4167B for ; Tue, 20 Dec 2022 07:01:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233336AbiLTHBs (ORCPT ); Tue, 20 Dec 2022 02:01:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233071AbiLTHBb (ORCPT ); Tue, 20 Dec 2022 02:01:31 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05C70DEE4; Mon, 19 Dec 2022 23:01:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519690; x=1703055690; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HUHrRZTDjkecGuHDXK4PBl+TXz2Ml8+ie04d+kki1CQ=; b=AvQvql6m16PcNdYAt6aalDgU3E5Zew9nhn7vlDa9yVktC2F1fn8NLOS8 hGge+DyvS6Etdt10Jo/trvRJv981jB3az1W0BtipOFF9JpP7MjaE2uk38 ky9hg9er9tDdZMkmioNgM5ZZYT4Um28S2sl06mQ8M3CBukXCM22ASoZLT oaVHTVdHbqP1jJTviNwdJOr3Ep4RVzeqFdajlCvE/mD9kF71m6teUJCM1 ei4wnZ8fl3dl1QgEyG5LOuNfS0mkkMi5eMZuPH2FFLa0Y35FqOqGcl0bj LlbS7zirExi11p6Xnl0pVDPb07Cc13Cay6Nhqbe+FXLulKo71o20reR32 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971948" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971948" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326436" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326436" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:10 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 07/32] x86/opcode: add ERETU, ERETS instructions to x86-opcode-map Date: Mon, 19 Dec 2022 22:36:33 -0800 Message-Id: <20221220063658.19271-8-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add the instruction opcodes used by FRED: ERETU and ERETS. The opcode numbers follow the public FRED draft spec v3.0: https://cdrdv2.intel.com/v1/dl/getContent/678938 Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/lib/x86-opcode-map.txt | 2 +- tools/arch/x86/lib/x86-opcode-map.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 5168ee0360b2..7a269e269dc0 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -1052,7 +1052,7 @@ EndTable GrpTable: Grp7 0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B) 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B) 3: LIDT Ms 4: SMSW Mw/Rv diff --git 
a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 5168ee0360b2..7a269e269dc0 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -1052,7 +1052,7 @@ EndTable GrpTable: Grp7 0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B) 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B) 3: LIDT Ms 4: SMSW Mw/Rv From patchwork Tue Dec 20 06:36:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077532 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41325C4708D for ; Tue, 20 Dec 2022 07:02:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233364AbiLTHBy (ORCPT ); Tue, 20 Dec 2022 02:01:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233142AbiLTHBe (ORCPT ); Tue, 20 Dec 2022 02:01:34 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D799DEE4; Mon, 19 Dec 2022 23:01:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519693; x=1703055693; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=o7a+0760MneBwxPKzFiaj9Nq7KFWmVxmzGnbcDhAUFY=; b=DLiKuLaS+NtpbVg2WFWwFmhTDlSJQHLW62FhwMqTvKIidYiWVsPOCa4r CD3YcsCOp3edUNsSjSl8+eONlND7eWBb9mMpNuc2MV957JaVg+Edxeq49 nT+ffZ7LJib2oZFnSn7FvMZNn9+4xmLnK6aO1OmZu1WRlWsln2Pa0OO0H vxuGoGPO22EdT00u4iYoXHRSwKFKjEztL8ykt9Cc/tJPlU90YAWBLBOah 1/YWig30ndsEAUTWVqx2b49d9Il+kTuIhsUvbCO6le90cv18sV/u4V16h yBnYwV/nUK6/3WUK1U00ydQuHAT0AUjFSHFk6VKDf08tRJRXYMs+STcpm g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971959" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971959" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326440" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326440" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:11 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 08/32] x86/objtool: teach objtool about ERETU and ERETS Date: Mon, 19 Dec 2022 22:36:34 -0800 Message-Id: <20221220063658.19271-9-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Update the objtool decoder to know about the ERETU and ERETS instructions (type INSN_CONTEXT_SWITCH.) Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- tools/objtool/arch/x86/decode.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 1c253b4b7ce0..fbfe0a39599a 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -480,12 +480,22 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec case 0x0f: if (op2 == 0x01) { - - if (modrm == 0xca) - *type = INSN_CLAC; - else if (modrm == 0xcb) - *type = INSN_STAC; - + switch (insn_last_prefix_id(&insn)) { + case INAT_PFX_REPE: + case INAT_PFX_REPNE: + if (modrm == 0xca) { + /* eretu/erets */ + *type = INSN_CONTEXT_SWITCH; + } + break; + default: + if (modrm == 0xca) { + *type = INSN_CLAC; + } else if (modrm == 0xcb) { + *type = INSN_STAC; + } + break; + } } else if (op2 >= 0x80 && op2 <= 0x8f) { *type = INSN_JUMP_CONDITIONAL; From patchwork Tue Dec 20 06:36:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4064DC4332F for ; Tue, 20 Dec 2022 07:02:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233411AbiLTHCA (ORCPT ); Tue, 20 Dec 2022 02:02:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53300 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233158AbiLTHBe (ORCPT ); Tue, 20 Dec 2022 02:01:34 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D5762ADE; Mon, 19 Dec 2022 23:01:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; 
i=@intel.com; q=dns/txt; s=Intel; t=1671519693; x=1703055693; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=msUUkkdZ4tNoxXqmpYk+fBav1/cjI+hkwg5c4Hl6h2s=; b=TM0H2CXgXGMtsXVQ5l26GZT3ZWEm4wUTmJbrEcbR1iQFWDDzKNTA3iQf 2fqu653LqprU44/sBs4rJcj5WlOXAWf8ayp9ozxO2U/HHm9oua4/JaQK7 DQ1ppY1bte+gZVNQcRQCYVoN6DUpuvey9udTPzP5RpfEbUpdPXenz7Xw8 khn2cWW0hBICcNJczZi4idx+SDdqF8TwYeLNMI9uCQOA0p1zXxOnZrMZc jGYGdNvDgmk4RWoC20TCuuZZ+y2ppYQ3mHqaOfxO+ioTw0dnhUCL0VbyP lZwhZX/PrejiDT4ttOSKcxJT5z7TXw9KurtkebGt6Nv6Qi90ma7kvs9/F g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971969" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971969" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326445" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326445" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:11 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 09/32] x86/cpu: add X86_CR4_FRED macro Date: Mon, 19 Dec 2022 22:36:35 -0800 Message-Id: <20221220063658.19271-10-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add X86_CR4_FRED macro for the FRED bit in %cr4. This bit should be a pinned bit, not to be changed after initialization. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/uapi/asm/processor-flags.h | 2 ++ arch/x86/kernel/cpu/common.c | 11 ++++++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index c47cc7f2feeb..a90933f1ff41 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -132,6 +132,8 @@ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) #define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */ #define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) +#define X86_CR4_FRED_BIT 32 /* enable FRED kernel entry */ +#define X86_CR4_FRED _BITULL(X86_CR4_FRED_BIT) /* * x86-64 Task Priority Register, CR8 diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index d6eb4f60b47d..05a5538052ad 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -411,10 +411,15 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c) cr4_clear_bits(X86_CR4_UMIP); } -/* These bits should not change their value after CPU init is finished. */ +/* + * These bits should not change their value after CPU init is + * finished. The explicit cast to unsigned long suppresses a warning + * on i386 for x86-64 only feature bits >= 32. 
+ */ static const unsigned long cr4_pinned_mask = - X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | - X86_CR4_FSGSBASE | X86_CR4_CET; + (unsigned long) + (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | + X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED); static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); static unsigned long cr4_pinned_bits __ro_after_init; From patchwork Tue Dec 20 06:36:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077534 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64EDCC3DA79 for ; Tue, 20 Dec 2022 07:02:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233379AbiLTHB4 (ORCPT ); Tue, 20 Dec 2022 02:01:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233150AbiLTHBe (ORCPT ); Tue, 20 Dec 2022 02:01:34 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D62EDEC6; Mon, 19 Dec 2022 23:01:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519693; x=1703055693; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=q9cIpOvm3xSuQ2LLiyXWWa3lHFFJYVxzwwgrBF18MB8=; b=jvfiATR6D/T6Bf9n/iNZgvKwDMjZH61tHuIVWh9WDhfpmVhaYQmU02RO H6Z5UA40j6yIm63zqVsJ8/4ciVw8+K/tClK2PF1wz0IFpFgYPvI7aHJVp H4H/nRijHHHU4RmZZnTa7K0Hu4WxyzdmLx1FYCIUXPM7D2HeN637PQnlS 9fvmBcwwl9afmCR0CVykWeF+OvUzI0ijLlErO3slfBZhOxLXjvnhzDFiE 58h6SgXVzne/VkBHzHSClkc3qM1fRm5RuutWlXVe5bqVBtX0CB7EVmgMo yOuCJjLQR84Xkude4GjMYpOddqxkr+kZPJg7fDf+rsZPhNOjiPp3PIAHY w==; X-IronPort-AV: 
E=McAfee;i="6500,9779,10566"; a="302971980" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971980" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:12 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326450" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326450" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:11 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 10/32] x86/fred: add Kconfig option for FRED (CONFIG_X86_FRED) Date: Mon, 19 Dec 2022 22:36:36 -0800 Message-Id: <20221220063658.19271-11-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add the configuration option CONFIG_X86_FRED to enable FRED. Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/Kconfig | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 67745ceab0db..1155d2e06fd1 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -500,6 +500,15 @@ config X86_CPU_RESCTRL Say N if unsure. +config X86_FRED + bool "Flexible Return and Event Delivery" + depends on X86_64 + help + When enabled, try to use Flexible Return and Event Delivery + instead of the legacy SYSCALL/SYSENTER/IDT architecture for + ring transitions and exception/interrupt handling if the + system supports it. 
+ if X86_32 config X86_BIGSMP bool "Support for big SMP systems with more than 8 CPUs" From patchwork Tue Dec 20 06:36:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077535 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 410F4C4332F for ; Tue, 20 Dec 2022 07:02:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233420AbiLTHCE (ORCPT ); Tue, 20 Dec 2022 02:02:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233179AbiLTHBf (ORCPT ); Tue, 20 Dec 2022 02:01:35 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 890901408A; Mon, 19 Dec 2022 23:01:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519694; x=1703055694; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fxruyRHpnh5K7ebWUalaOrQmsWiBP+Qwng9F8EjMSTU=; b=URr4+EbUrG3DjgWQaCq8wFj7kmFglHGofo2Y3Galv6tjYdFEizPohHSJ 6/CC3fk4ejwdWeB7iBi9SBXqD7xEKjyKv7vK49YLoGBYsHcIcltlE/PkC TefE/xpMcliGVw/QeWWy2A9Ee42fdOwFRWjaBAtLm4xKPyTMblmQKvQ5x 8GKjsLRX5bC21AY6Z+FrUk59OMf/rkLoCrYvXDkhe8JUgKx+Tq1Eis1gI PyawqWLKTFHWdnV1zjLqIshu0jx1Vv0ZVW+PTk7gDmgnN0vT8+elqFbpg f4jPOiN/ixwlLx2E/4luzfxAFbzccYFKns7jWubpmILm8ZR9qP56dH7lo Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302971992" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302971992" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:12 -0800 
X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326454" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326454" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:12 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 11/32] x86/fred: if CONFIG_X86_FRED is disabled, disable FRED support Date: Mon, 19 Dec 2022 22:36:37 -0800 Message-Id: <20221220063658.19271-12-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add CONFIG_X86_FRED to asm/disabled-features.h to make cpu_feature_enabled() work correctly with FRED. Originally-by: Megha Dey Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/disabled-features.h | 8 +++++++- tools/arch/x86/include/asm/disabled-features.h | 8 +++++++- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 33d2cd04d254..3a2d0ad63332 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -87,6 +87,12 @@ # define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31)) #endif +#ifdef CONFIG_X86_FRED +# define DISABLE_FRED 0 +#else +# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -102,7 +108,7 @@ #define DISABLED_MASK9 (DISABLE_SGX) #define DISABLED_MASK10 0 #define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET) -#define DISABLED_MASK12 0 +#define DISABLED_MASK12 (DISABLE_FRED) #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h index 33d2cd04d254..3a2d0ad63332 100644 --- a/tools/arch/x86/include/asm/disabled-features.h +++ b/tools/arch/x86/include/asm/disabled-features.h @@ -87,6 +87,12 @@ # define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31)) #endif +#ifdef CONFIG_X86_FRED +# define DISABLE_FRED 0 +#else +# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -102,7 +108,7 @@ #define DISABLED_MASK9 (DISABLE_SGX) #define DISABLED_MASK10 0 #define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET) -#define DISABLED_MASK12 0 +#define DISABLED_MASK12 (DISABLE_FRED) #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 From patchwork Tue Dec 20 06:36:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" 
X-Patchwork-Id: 13077536 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF99BC4332F for ; Tue, 20 Dec 2022 07:02:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233426AbiLTHCF (ORCPT ); Tue, 20 Dec 2022 02:02:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233215AbiLTHBg (ORCPT ); Tue, 20 Dec 2022 02:01:36 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D343140AC; Mon, 19 Dec 2022 23:01:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519695; x=1703055695; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=o65+j06rKvTvMzwdKVOrgkkK4xTrmwbR1mLB3PAtwFI=; b=SdCJBRj1XRf/FOIHQsq9wzdLAGCA1LYDFIrOnY8W4OhA4WbCxOFJ6JXA sOXm44uS/4BiznPtHHhUAx56+heIw231TDpKQhRSNMS4+hIjPWJ2qodlb ie9LCuLk7ONKaWGIUne7FA7eORisT4e2BeGI7a2MIaGdPYi4AREfYKvLy 1LtEQl6Vy1kDCOlC7Sf5U9huLsrf3NcthL/kA9DddeELDWj6LwThv84Ub iV04vFI7wCLciZmBqbNYvwm65pAUmr6uiykWUyTZmvMFhQ6H1h0MIqcLf AphFF2FG6YzEMQCtOsezoqjmj3v0lniW3SbOBsmk9G70eCr9UtUI+pEbX g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972002" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972002" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:13 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326459" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326459" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:12 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 12/32] x86/cpu: add MSR numbers for FRED configuration Date: Mon, 19 Dec 2022 22:36:38 -0800 Message-Id: <20221220063658.19271-13-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add MSR numbers for the FRED configuration registers. Originally-by: Megha Dey Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/msr-index.h | 12 +++++++++++- tools/arch/x86/include/asm/msr-index.h | 12 +++++++++++- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 4a2af82553e4..dea9223ec9ba 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -39,8 +39,18 @@ #define EFER_LMSLE (1<<_EFER_LMSLE) #define EFER_FFXSR (1<<_EFER_FFXSR) -/* Intel MSRs. 
Some also available on other CPUs */ +/* FRED MSRs */ +#define MSR_IA32_FRED_RSP0 0x1cc /* Level 0 stack pointer */ +#define MSR_IA32_FRED_RSP1 0x1cd /* Level 1 stack pointer */ +#define MSR_IA32_FRED_RSP2 0x1ce /* Level 2 stack pointer */ +#define MSR_IA32_FRED_RSP3 0x1cf /* Level 3 stack pointer */ +#define MSR_IA32_FRED_STKLVLS 0x1d0 /* Exception stack levels */ +#define MSR_IA32_FRED_SSP1 0x1d1 /* Level 1 shadow stack pointer */ +#define MSR_IA32_FRED_SSP2 0x1d2 /* Level 2 shadow stack pointer */ +#define MSR_IA32_FRED_SSP3 0x1d3 /* Level 3 shadow stack pointer */ +#define MSR_IA32_FRED_CONFIG 0x1d4 /* Entrypoint and interrupt stack level */ +/* Intel MSRs. Some also available on other CPUs */ #define MSR_TEST_CTRL 0x00000033 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT 29 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT BIT(MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT) diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index f17ade084720..5c9d9040dd04 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -39,8 +39,18 @@ #define EFER_LMSLE (1<<_EFER_LMSLE) #define EFER_FFXSR (1<<_EFER_FFXSR) -/* Intel MSRs. Some also available on other CPUs */ +/* FRED MSRs */ +#define MSR_IA32_FRED_RSP0 0x1cc /* Level 0 stack pointer */ +#define MSR_IA32_FRED_RSP1 0x1cd /* Level 1 stack pointer */ +#define MSR_IA32_FRED_RSP2 0x1ce /* Level 2 stack pointer */ +#define MSR_IA32_FRED_RSP3 0x1cf /* Level 3 stack pointer */ +#define MSR_IA32_FRED_STKLVLS 0x1d0 /* Exception stack levels */ +#define MSR_IA32_FRED_SSP1 0x1d1 /* Level 1 shadow stack pointer */ +#define MSR_IA32_FRED_SSP2 0x1d2 /* Level 2 shadow stack pointer */ +#define MSR_IA32_FRED_SSP3 0x1d3 /* Level 3 shadow stack pointer */ +#define MSR_IA32_FRED_CONFIG 0x1d4 /* Entrypoint and interrupt stack level */ +/* Intel MSRs. 
Some also available on other CPUs */ #define MSR_TEST_CTRL 0x00000033 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT 29 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT BIT(MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT) From patchwork Tue Dec 20 06:36:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A31EC10F1E for ; Tue, 20 Dec 2022 07:02:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233319AbiLTHCr (ORCPT ); Tue, 20 Dec 2022 02:02:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233223AbiLTHBg (ORCPT ); Tue, 20 Dec 2022 02:01:36 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C4DE14D2E; Mon, 19 Dec 2022 23:01:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519695; x=1703055695; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UOsdS7+owJQ73WNU9ofGpL7BxDN6CoX4IXCHRYhbv+k=; b=aHZqzgyTtlJDLGCJrPpbmKXLxccoGPXsrkIsCGJxQaXRpIN5dKV0I4g1 VZ1+7L4thjzkQhHeHTFrA6V4tRY1o76Sqf0XKnk7EJiXYaH8E3S9zGZLa AwigcRFNHkUtBox02rkBxLkTEsSbML+eTD5Sp9INeYoDeiQqS1+RX4qVn YkeJ4qudG7vnsDO2KNiC67Gw8kyDojbVCrvT/NwNTqxS2oFjZKbj8ttmi FUsc05Sbu07DjXzquKpQS/O7AmoRpq3wXnTg4IaECueqT5W6PkGcOL+si 6pADefwcBAXwPac9bOEpQ6mG/7cuWFS759QVl2fa/+zhzt72KRXAThwHu Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972012" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972012" Received: from orsmga007.jf.intel.com 
([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:13 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326465" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326465" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:12 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 13/32] x86/fred: header file for event types Date: Mon, 19 Dec 2022 22:36:39 -0800 Message-Id: <20221220063658.19271-14-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org FRED inherits the Intel VT-x enhancement of classified events with a two-level event dispatch logic. The first-level dispatch is on the event type, not the event vector as used in the IDT architecture. This also means that vectors in different event types are orthogonal, e.g., vectors 0x10-0x1f become available as hardware interrupts. Add a header file for event types, and also use it in <asm/vmx.h>. Suggested-by: H.
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/event-type.h | 17 +++++++++++++++++ arch/x86/include/asm/vmx.h | 17 +++++++++-------- 2 files changed, 26 insertions(+), 8 deletions(-) create mode 100644 arch/x86/include/asm/event-type.h diff --git a/arch/x86/include/asm/event-type.h b/arch/x86/include/asm/event-type.h new file mode 100644 index 000000000000..fedaa0e492c5 --- /dev/null +++ b/arch/x86/include/asm/event-type.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_EVENT_TYPE_H +#define _ASM_X86_EVENT_TYPE_H + +/* + * Event type codes: these are the same that are used by VTx. + */ +#define EVENT_TYPE_HWINT 0 /* Maskable external interrupt */ +#define EVENT_TYPE_RESERVED 1 +#define EVENT_TYPE_NMI 2 /* Non-maskable interrupt */ +#define EVENT_TYPE_HWFAULT 3 /* Hardware exceptions (e.g., page fault) */ +#define EVENT_TYPE_SWINT 4 /* Software interrupt (INT n) */ +#define EVENT_TYPE_PRIVSW 5 /* INT1 (ICEBP) */ +#define EVENT_TYPE_SWFAULT 6 /* Software exception (INT3 or INTO) */ +#define EVENT_TYPE_OTHER 7 /* FRED: SYSCALL/SYSENTER */ + +#endif diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 498dc600bd5c..8d9b8b0d8e56 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -15,6 +15,7 @@ #include #include #include +#include <asm/event-type.h> #include #define VMCS_CONTROL_BIT(x) BIT(VMX_FEATURE_##x & 0x1f) @@ -372,14 +373,14 @@ enum vmcs_field { #define VECTORING_INFO_DELIVER_CODE_MASK INTR_INFO_DELIVER_CODE_MASK #define VECTORING_INFO_VALID_MASK INTR_INFO_VALID_MASK -#define INTR_TYPE_EXT_INTR (0 << 8) /* external interrupt */ -#define INTR_TYPE_RESERVED (1 << 8) /* reserved */ -#define INTR_TYPE_NMI_INTR (2 << 8) /* NMI */ -#define INTR_TYPE_HARD_EXCEPTION (3 << 8) /* processor exception */ -#define INTR_TYPE_SOFT_INTR (4 << 8) /* software interrupt */ -#define INTR_TYPE_PRIV_SW_EXCEPTION (5 << 8) /* ICE breakpoint - undocumented */ -#define INTR_TYPE_SOFT_EXCEPTION (6 << 8) /* software
exception */ -#define INTR_TYPE_OTHER_EVENT (7 << 8) /* other event */ +#define INTR_TYPE_EXT_INTR (EVENT_TYPE_HWINT << 8) /* external interrupt */ +#define INTR_TYPE_RESERVED (EVENT_TYPE_RESERVED << 8) /* reserved */ +#define INTR_TYPE_NMI_INTR (EVENT_TYPE_NMI << 8) /* NMI */ +#define INTR_TYPE_HARD_EXCEPTION (EVENT_TYPE_HWFAULT << 8) /* processor exception */ +#define INTR_TYPE_SOFT_INTR (EVENT_TYPE_SWINT << 8) /* software interrupt */ +#define INTR_TYPE_PRIV_SW_EXCEPTION (EVENT_TYPE_PRIVSW << 8) /* ICE breakpoint - undocumented */ +#define INTR_TYPE_SOFT_EXCEPTION (EVENT_TYPE_SWFAULT << 8) /* software exception */ +#define INTR_TYPE_OTHER_EVENT (EVENT_TYPE_OTHER << 8) /* other event */ /* GUEST_INTERRUPTIBILITY_INFO flags. */ #define GUEST_INTR_STATE_STI 0x00000001 From patchwork Tue Dec 20 06:36:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077537 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24B2AC4332F for ; Tue, 20 Dec 2022 07:02:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233434AbiLTHCL (ORCPT ); Tue, 20 Dec 2022 02:02:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53336 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229746AbiLTHBh (ORCPT ); Tue, 20 Dec 2022 02:01:37 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E9451581B; Mon, 19 Dec 2022 23:01:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519695; x=1703055695; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=HmlZIFRY86SsuxVZ3f2cHdcHUA8c4g3To7+jNRITrUo=; b=E6payz2VgowUdCzRGPgWkiXUWnis+v8GmVUqmj77PWlPp+p6gTAQzcps 4dHir9pP4YvgZtV5QlvFqITV+wU2VMi1z2MFL4S0Eubid/hGE6AsBObhP OPZQ5yhSj1vK6I7+lWYH5sm1tG4b7TlMLOjy5dyFCKholaFwBL6ss9Xks KJATmvjlNIP5iQKnU+zG5G2A3la2sNKV26mULpwZNh4N+N0o9KeZZ97NA i55qroHf2arNj7b1GjMWOaP3yVtTdcGe3IVbBkru5dPdsjziZGYPAingQ l9E7Kp4NQXqtpViVhKDBd0UDpN134exdaPrsmcD/lKvtlQdwiY3C9GxiL w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972022" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972022" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:13 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326473" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326473" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:13 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 14/32] x86/fred: header file with FRED definitions Date: Mon, 19 Dec 2022 22:36:40 -0800 Message-Id: <20221220063658.19271-15-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add a header file for FRED prototypes and definitions. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 99 +++++++++++++++++++++++++++++++++++++ 1 file changed, 99 insertions(+) create mode 100644 arch/x86/include/asm/fred.h diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h new file mode 100644 index 000000000000..6292b28d461d --- /dev/null +++ b/arch/x86/include/asm/fred.h @@ -0,0 +1,99 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/x86/include/asm/fred.h + * + * Macros for Flexible Return and Event Delivery (FRED) + */ + +#ifndef ASM_X86_FRED_H +#define ASM_X86_FRED_H + +#ifdef CONFIG_X86_FRED + +#include +#include + +/* + * FRED return instructions + * + * Replace with "ERETS"/"ERETU" once binutils support FRED return instructions. + */ +#define ERETS _ASM_BYTES(0xf2,0x0f,0x01,0xca) +#define ERETU _ASM_BYTES(0xf3,0x0f,0x01,0xca) + +/* + * Event stack level macro for the FRED_STKLVLS MSR. + * Usage example: FRED_STKLVL(X86_TRAP_DF, 3) + * Multiple values can be ORd together. + */ +#define FRED_STKLVL(v,l) (_AT(unsigned long, l) << (2*(v))) + +/* FRED_CONFIG MSR */ +#define FRED_CONFIG_CSL_MASK 0x3 +#define FRED_CONFIG_SHADOW_STACK_SPACE _BITUL(3) +#define FRED_CONFIG_REDZONE(b) __ALIGN_KERNEL_MASK((b), _UL(0x3f)) +#define FRED_CONFIG_INT_STKLVL(l) (_AT(unsigned long, l) << 9) +#define FRED_CONFIG_ENTRYPOINT(p) _AT(unsigned long, (p)) + +/* FRED event type and vector bit width and counts */ +#define FRED_EVENT_TYPE_BITS 3 /* only 3 bits used in FRED 3.0 */ +#define FRED_EVENT_TYPE_COUNT _BITUL(FRED_EVENT_TYPE_BITS) +#define FRED_EVENT_VECTOR_BITS 8 +#define FRED_EVENT_VECTOR_COUNT _BITUL(FRED_EVENT_VECTOR_BITS) + +/* FRED EVENT_TYPE_OTHER vector numbers */ +#define FRED_SYSCALL 1 +#define FRED_SYSENTER 2 + +/* Flags above the CS selector (regs->csl) */ +#define FRED_CSL_ENABLE_NMI _BITUL(28) +#define FRED_CSL_ALLOW_SINGLE_STEP _BITUL(25) +#define FRED_CSL_INTERRUPT_SHADOW _BITUL(24) + +#ifndef __ASSEMBLY__ + +#include +#include + +/* FRED stack frame information 
*/ +struct fred_info { + unsigned long edata; /* Event data: CR2, DR6, ... */ + unsigned long resv; +}; + +/* Full format of the FRED stack frame */ +struct fred_frame { + struct pt_regs regs; + struct fred_info info; +}; + +/* Getting the FRED frame information from a pt_regs pointer */ +static __always_inline struct fred_info *fred_info(struct pt_regs *regs) +{ + return &container_of(regs, struct fred_frame, regs)->info; +} + +static __always_inline unsigned long fred_event_data(struct pt_regs *regs) +{ + return fred_info(regs)->edata; +} + +/* + * How FRED event handlers are called. + * + * FRED event delivery establishes the full supervisor context + * by pushing everything related to the event being delivered + * to the FRED stack frame, e.g., the faulting linear address + * of a #PF is pushed as event data of the FRED #PF stack frame. + * Thus a struct pt_regs has everything needed and it's the only + * input parameter required for a FRED event handler. + */ +#define DECLARE_FRED_HANDLER(f) void f (struct pt_regs *regs) +#define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) +typedef DECLARE_FRED_HANDLER((*fred_handler)); + +#endif /* __ASSEMBLY__ */ + +#endif /* CONFIG_X86_FRED */ + +#endif /* ASM_X86_FRED_H */ From patchwork Tue Dec 20 06:36:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077544 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6086C4332F for ; Tue, 20 Dec 2022 07:02:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233382AbiLTHCo (ORCPT ); Tue, 20 Dec 2022 02:02:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with 
ESMTP id S233238AbiLTHBh (ORCPT ); Tue, 20 Dec 2022 02:01:37 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 912D113DD5; Mon, 19 Dec 2022 23:01:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519696; x=1703055696; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WILQEUcGGh0f3+f/4EusBZPXm98ZhSdCi01R/9hMgCA=; b=R2aKI80caPQZa4KRIzO31xq6/EzJ0uT1GTmkkaU0vu7GcJ/0w1S9xpQb gxliCZswJ93+zNrE2SD78BUE84x//qC2lTWE/NjK2GQXbDLVFFauPLdlw OXe9peFBjrheFYaztaMZOCIBnmPgTx3aPoj5in1VBgE/WXmNxtX3RAnXT hn+0fJCnzsJeCeMzqA94IXrqZrYmm8ELeyTjobc8sPZS2JkXbhX8BmozG NVZFXyeIJtoVEXUedmjLJYn+FhwArXd66uds2OxXfQy4GJN7N8PG10qLY pOKN4v2qj7+Kcb7p4XNmgK8UXE5tjMYRJ5DVmPZw/vxjSJjNwVVAM+pj8 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972033" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972033" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326477" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326477" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:13 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 15/32] x86/fred: make unions for the cs and ss fields in struct pt_regs Date: Mon, 19 Dec 2022 22:36:41 -0800 Message-Id: <20221220063658.19271-16-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Make the cs and ss fields in struct pt_regs unions between the actual selector and the unsigned long stack slot. FRED uses this space to store additional flags. The printk format changes are needed simply because the cs and ss fields change from unsigned long to unsigned short. Signed-off-by: H.
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/entry/vsyscall/vsyscall_64.c | 2 +- arch/x86/include/asm/ptrace.h | 36 ++++++++++++++++++++++++--- arch/x86/kernel/process_64.c | 2 +- 3 files changed, 34 insertions(+), 6 deletions(-) diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index 4af81df133ee..6349c818d20a 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -76,7 +76,7 @@ static void warn_bad_vsyscall(const char *level, struct pt_regs *regs, if (!show_unhandled_signals) return; - printk_ratelimited("%s%s[%d] %s ip:%lx cs:%lx sp:%lx ax:%lx si:%lx di:%lx\n", + printk_ratelimited("%s%s[%d] %s ip:%lx cs:%x sp:%lx ax:%lx si:%lx di:%lx\n", level, current->comm, task_pid_nr(current), message, regs->ip, regs->cs, regs->sp, regs->ax, regs->si, regs->di); diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index f4db78b09c8f..341e44847cc1 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -82,13 +82,41 @@ struct pt_regs { * On hw interrupt, it's IRQ number: */ unsigned long orig_ax; -/* Return frame for iretq */ + + /* Return frame for iretq/eretu/erets */ unsigned long ip; - unsigned long cs; + union { + unsigned long csl; /* CS + any fields above it */ + struct __attribute__((__packed__)) { + unsigned short cs; /* CS selector proper */ + unsigned int current_stack_level: 2; + unsigned int __csl_resv1 : 6; + unsigned int interrupt_shadowed : 1; + unsigned int software_initiated : 1; + unsigned int __csl_resv2 : 2; + unsigned int nmi : 1; + unsigned int __csl_resv3 : 3; + unsigned int __csl_resv4 : 32; + }; + }; unsigned long flags; unsigned long sp; - unsigned long ss; -/* top of stack page */ + union { + unsigned long ssl; /* SS + any fields above it */ + struct __attribute__((__packed__)) { + unsigned short ss; /* SS selector proper */ + unsigned int __ssl_resv1: 16; + unsigned int vector : 8; + unsigned int 
__ssl_resv2: 8; + unsigned int type : 4; + unsigned int __ssl_resv3: 4; + unsigned int enclv : 1; + unsigned int long_mode : 1; + unsigned int nested : 1; + unsigned int __ssl_resv4: 1; + unsigned int instr_len : 4; + }; + }; }; #endif /* !__i386__ */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 6b3418bff326..bfe6179b7a17 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -116,7 +116,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode, printk("%sFS: %016lx(%04x) GS:%016lx(%04x) knlGS:%016lx\n", log_lvl, fs, fsindex, gs, gsindex, shadowgs); - printk("%sCS: %04lx DS: %04x ES: %04x CR0: %016lx\n", + printk("%sCS: %04x DS: %04x ES: %04x CR0: %016lx\n", log_lvl, regs->cs, ds, es, cr0); printk("%sCR2: %016lx CR3: %016lx CR4: %016lx\n", log_lvl, cr2, cr3, cr4); From patchwork Tue Dec 20 06:36:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077541 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0AFFC10F1E for ; Tue, 20 Dec 2022 07:02:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233470AbiLTHCd (ORCPT ); Tue, 20 Dec 2022 02:02:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233242AbiLTHBi (ORCPT ); Tue, 20 Dec 2022 02:01:38 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CCEEB15828; Mon, 19 Dec 2022 23:01:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519696; x=1703055696; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=itDwNcmuOavaLbxBus/tcxgByCKbUIXcxoUL6CHXDYI=; b=nXRAP5+nQHfjuE3qu6ylwfOwQv7rjWkinxcQhDQiH3RdQIBpXlNPmPNj hDk6t4gdCmkBmUzJsBSEqZppRW6oUfqJ5FUyTVpSO9dI2JlxQiLsRjfVv 2RkIAt7WyXkOiZCD/S7pCJIWy8qTapa6piRWK5lRaOL9S+crXLgPU1uMD VAwMxWV6/2DXbxfLM9q3owHOfWvUx/NnwJP+Uu+CYzcYhhdF8ws4HRatF BJcO/f09yNWSqut8NcRIrNzPbmp7yGEvXXVcMAbtcIV9w++K2gQ9YDBIt 21HD3dWQy144gL+a5iboSgGH3479qk7K1XM2BVixVjFxTey6FLtU6ed7o w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972042" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972042" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326481" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326481" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:14 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 16/32] x86/fred: reserve space for the FRED stack frame Date: Mon, 19 Dec 2022 22:36:42 -0800 Message-Id: <20221220063658.19271-17-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" When using FRED, reserve space at the top of the stack frame, just like i386 does. 
A future version of FRED might have dynamic frame sizes, though, in which case it might be necessary to make TOP_OF_KERNEL_STACK_PADDING a variable instead of a constant. Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/thread_info.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index f0cb881c1d69..fea0e69fc3d4 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -31,7 +31,9 @@ * In vm86 mode, the hardware frame is much longer still, so add 16 * bytes to make room for the real-mode segments. * - * x86_64 has a fixed-length stack frame. + * x86-64 has a fixed-length stack frame, but it depends on whether + * or not FRED is enabled. Future versions of FRED might make this + * dynamic, but for now it is always 2 words longer. */ #ifdef CONFIG_X86_32 # ifdef CONFIG_VM86 @@ -39,8 +41,12 @@ # else # define TOP_OF_KERNEL_STACK_PADDING 8 # endif -#else -# define TOP_OF_KERNEL_STACK_PADDING 0 +#else /* x86-64 */ +# ifdef CONFIG_X86_FRED +# define TOP_OF_KERNEL_STACK_PADDING (2*8) +# else +# define TOP_OF_KERNEL_STACK_PADDING 0 +# endif #endif /* From patchwork Tue Dec 20 06:36:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FF0CC4332F for ; Tue, 20 Dec 2022 07:02:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233506AbiLTHCl (ORCPT ); Tue, 20 Dec 2022 02:02:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S233240AbiLTHBi (ORCPT ); Tue, 20 Dec 2022 02:01:38 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 073B215A19; Mon, 19 Dec 2022 23:01:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519697; x=1703055697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=umQlTxKGNKLlev+BqP1ZCZCF7yuytMs0GSGunAXbGS4=; b=QnK6o9uitIBKnBttTN7XFM2Q15edqEEB49EJTmwB88fLoaGEZPzOqxbO EU+dd/AGTnmeSBPbyDmyrWqsH5v9yYBTsW1pOc0ccbIYt46GUKA1MpwTy qLkBx6akv2w/IJZvH1p5Dh0cPRXTYQQ9+gRtAbysJRw99ERBRb34R9NiX V2A/palY2o5GcoPhEfU7+Jo1XEiBxTAZ7sKuTJ0HIow/2nO1AXX19C3Rz oMPG9egcqdPaLA/+1k9//kmt5Etw+/y+xkCPqm7V031X5W0Khe0BDA2SH N0lfwgRcdLYhcDiaoEwOAckp8wEhUAzrjhfuh5hA1crUggobIs21Hol0X g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972051" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972051" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326487" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326487" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:14 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 17/32] x86/fred: add a page fault entry stub for FRED Date: Mon, 19 Dec 2022 22:36:43 -0800 Message-Id: <20221220063658.19271-18-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add a page fault entry stub for FRED. On a FRED system, the faulting address (CR2) is passed on the stack, to avoid the problem of transient state. Thus we get the page fault address from the stack instead of CR2. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 2 ++ arch/x86/mm/fault.c | 20 ++++++++++++++++++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 6292b28d461d..38a90eae7c0f 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -92,6 +92,8 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) #define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) typedef DECLARE_FRED_HANDLER((*fred_handler)); +DECLARE_FRED_HANDLER(fred_exc_page_fault); + #endif /* __ASSEMBLY__ */ #endif /* CONFIG_X86_FRED */ diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 7b0d4ab894c8..f31053f32048 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -33,6 +33,7 @@ #include /* kvm_handle_async_pf */ #include /* fixup_vdso_exception() */ #include +#include <asm/fred.h> /* fred_event_data() */ #define CREATE_TRACE_POINTS #include @@ -1528,9 +1529,10 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code, } } -DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault) +static __always_inline void page_fault_common(struct pt_regs *regs, + unsigned int error_code, + unsigned long address) { - unsigned long address = read_cr2(); irqentry_state_t state; prefetchw(&current->mm->mmap_lock); @@ -1577,3 +1579,17 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault) irqentry_exit(regs, state); } + +DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault) +{ + page_fault_common(regs, error_code, read_cr2()); +} + +#ifdef CONFIG_X86_FRED + +DEFINE_FRED_HANDLER(fred_exc_page_fault) +{ + page_fault_common(regs, regs->orig_ax, fred_event_data(regs)); +} + +#endif /* CONFIG_X86_FRED */ From patchwork Tue Dec 20 06:36:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077542 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58375C4332F for ; Tue, 20 Dec 2022 07:02:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233493AbiLTHCh (ORCPT ); Tue, 20 Dec 2022 02:02:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233250AbiLTHBi (ORCPT ); Tue, 20 Dec 2022 02:01:38 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 82CEBDEE4; Mon, 19 Dec 2022 23:01:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519697; x=1703055697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GhhTMX7j/CtwNJ9iPX8pNCtQRNtO0Tx2XqCNn06W3ZE=; b=c8aMwiG4MCZOmIKLUmp80NE66dol0Bu5EP/TL1jURiocA6OYEH95dtGb WSCZkx/2QQuL4y25KcpR3DLNugeKENnhMM/WoLDFkOYD9DSXtkGXqz5Gn +huJSNtRdYRoWCfMc85rmcYSQfrU4UiDyfkBEy9UonxTBYjGe4SQ99lYq 226oQu+9Z5vmzhRpLaks11rTCxNHOTKs3c8sQN+LB9ABqVlatS0eQT7g9 ZTbzjBxebrz6vJGc5FcOqG+Pembx4nB+UA60ropI9WP1yvMyvXtH5f9Uc QJki4gw5n7iGDD609UBcEmQNZ3btuziziG65+Wnp8hnRY0jJhXWXZQKRC w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972067" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972067" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326494" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326494" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:14 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 18/32] x86/fred: add a debug fault entry stub for FRED Date: Mon, 19 Dec 2022 22:36:44 -0800 Message-Id: <20221220063658.19271-19-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add a debug fault entry stub for FRED. On a FRED system, the debug trap status information (DR6) is passed on the stack, to avoid the problem of transient state. Furthermore, FRED transitions avoid a lot of ugly corner cases the handling of which can, and should be, skipped. The FRED debug trap status information saved on the stack differs from DR6 in both stickiness and polarity; it is exactly what debug_read_clear_dr6() returns, and exc_debug_user()/exc_debug_kernel() expect. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 1 + arch/x86/kernel/traps.c | 61 ++++++++++++++++++++++++++----------- 2 files changed, 45 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 38a90eae7c0f..3089d1c70771 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -92,6 +92,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) #define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) typedef DECLARE_FRED_HANDLER((*fred_handler)); +DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 99386836b02e..b0ee83bab9e6 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -47,6 +47,7 @@ #include #include #include +#include <asm/fred.h> #include #include #include @@ -1020,22 +1021,9 @@ static bool notify_debug(struct pt_regs *regs, unsigned long *dr6) return false; } -static __always_inline void exc_debug_kernel(struct pt_regs *regs, - unsigned long dr6) +static __always_inline void debug_kernel_common(struct pt_regs *regs, + unsigned long dr6) { - /* - * Disable breakpoints during exception handling; recursive exceptions - * are exceedingly 'fun'. - * - * Since this function is NOKPROBE, and that also applies to - * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a - * HW_BREAKPOINT_W on our stack) - * - * Entry text is excluded for HW_BP_X and cpu_entry_area, which - * includes the entry stack is excluded for everything. - */ - unsigned long dr7 = local_db_save(); - irqentry_state_t irq_state = irqentry_nmi_enter(regs); instrumentation_begin(); /* @@ -1062,7 +1050,8 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs, * Catch SYSENTER with TF set and clear DR_STEP. If this hit a * watchpoint at the same time then that will still be handled.
*/ - if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs)) + if (!cpu_feature_enabled(X86_FEATURE_FRED) && + (dr6 & DR_STEP) && is_sysenter_singlestep(regs)) dr6 &= ~DR_STEP; /* @@ -1089,8 +1078,28 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs, regs->flags &= ~X86_EFLAGS_TF; out: instrumentation_end(); - irqentry_nmi_exit(regs, irq_state); +} + +static __always_inline void exc_debug_kernel(struct pt_regs *regs, + unsigned long dr6) +{ + /* + * Disable breakpoints during exception handling; recursive exceptions + * are exceedingly 'fun'. + * + * Since this function is NOKPROBE, and that also applies to + * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a + * HW_BREAKPOINT_W on our stack) + * + * Entry text is excluded for HW_BP_X and cpu_entry_area, which + * includes the entry stack is excluded for everything. + */ + unsigned long dr7 = local_db_save(); + irqentry_state_t irq_state = irqentry_nmi_enter(regs); + + debug_kernel_common(regs, dr6); + irqentry_nmi_exit(regs, irq_state); local_db_restore(dr7); } @@ -1179,6 +1188,24 @@ DEFINE_IDTENTRY_DEBUG_USER(exc_debug) { exc_debug_user(regs, debug_read_clear_dr6()); } + +# ifdef CONFIG_X86_FRED +DEFINE_FRED_HANDLER(fred_exc_debug) +{ + /* + * The FRED debug information saved onto stack differs from + * DR6 in both stickiness and polarity; it is exactly what + * debug_read_clear_dr6() returns. + */ + unsigned long dr6 = fred_event_data(regs); + + if (user_mode(regs)) + exc_debug_user(regs, dr6); + else + debug_kernel_common(regs, dr6); +} +# endif /* CONFIG_X86_FRED */ + #else /* 32 bit does not have separate entry points. 
*/ DEFINE_IDTENTRY_RAW(exc_debug) From patchwork Tue Dec 20 06:36:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077539 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A10CEC4332F for ; Tue, 20 Dec 2022 07:02:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233346AbiLTHCX (ORCPT ); Tue, 20 Dec 2022 02:02:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233257AbiLTHBj (ORCPT ); Tue, 20 Dec 2022 02:01:39 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D494315A2C; Mon, 19 Dec 2022 23:01:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519697; x=1703055697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Aj4SBgFDxPyxr8I0rdhDIUq7owde47aPDFKuZiENGTs=; b=Rc/SPy8Hg5wH4yerMUlfgNjm5r+WYA46wDlpbMSnQVOxOD4kxBY2Qbzy 20ZO5YjYv6Rmh2575lennzpoO++qhi2XuXsb2wslmleArmkUzddX3Joye Q5hYk2jbYzXq2vRrA4bYP7qpCz3gjj+18oqiXzvd5XMVlGrLp/k6G3K2L 1fPj410dvsl8W8y4/DxkaBQ+iWaFzolIJYKL2/KxBZSMKYB7aAFq07WlP 0ukzuOoOD+ewG0IN/lP2wMnnjsNq2UijmaMXxd7xMTNoba7ze5ZD8WIS/ 23mzTjlQ/0+BeEkM4M1+gAz9rlXL3CFipC0KwnIaS1JYRIl5EnO4Vav2k Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972071" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972071" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: 
E=McAfee;i="6500,9779,10566"; a="644326501" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326501" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:15 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 19/32] x86/fred: add a NMI entry stub for FRED Date: Mon, 19 Dec 2022 22:36:45 -0800 Message-Id: <20221220063658.19271-20-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" On a FRED system, NMIs nest both with themselves and with faults, transient information is saved into the stack frame, and NMI unblocking happens only when the stack frame indicates that it should. Thus, the NMI entry stub for FRED is really quite small... Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 1 + arch/x86/kernel/nmi.c | 28 ++++++++++++++++++++++++++++ 2 files changed, 29 insertions(+) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 3089d1c70771..66c274a12e26 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -92,6 +92,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) #define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) typedef DECLARE_FRED_HANDLER((*fred_handler)); +DECLARE_FRED_HANDLER(fred_exc_nmi); DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c index cec0bfa3bc04..d497071a79f2 100644 --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.c @@ -34,6 +34,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -537,6 +538,33 @@ DEFINE_IDTENTRY_RAW(exc_nmi_noist) EXPORT_SYMBOL_GPL(asm_exc_nmi_noist); #endif +#ifdef CONFIG_X86_FRED +DEFINE_FRED_HANDLER(fred_exc_nmi) +{ + /* + * With FRED, CR2 and DR6 are pushed atomically on faults, + * so we don't have to worry about saving and restoring them. + * Breakpoint faults nest, so assume it is OK to leave DR7 + * enabled. + */ + irqentry_state_t irq_state = irqentry_nmi_enter(regs); + + /* + * VM exit induced by a NMI keeps NMI blocked, and we do + * "int $2" to reinject the NMI w/ NMI kept being blocked. + * However "int $2" doesn't set the nmi bit in the FRED + * stack frame, so we explicitly set it to make sure a + * later ERETS will unblock NMI immediately. 
+ */ + regs->nmi = 1; + + inc_irq_stat(__nmi_count); + default_do_nmi(regs); + + irqentry_nmi_exit(regs, irq_state); +} +#endif + void stop_nmi(void) { ignore_nmis++; From patchwork Tue Dec 20 06:36:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077540 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1229DC4332F for ; Tue, 20 Dec 2022 07:02:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233462AbiLTHC3 (ORCPT ); Tue, 20 Dec 2022 02:02:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233253AbiLTHBj (ORCPT ); Tue, 20 Dec 2022 02:01:39 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B469813D4E; Mon, 19 Dec 2022 23:01:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519697; x=1703055697; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SHEDrPLJu3KGjUhs/QSp7RJL1HsF029t/QfkfJr+3aI=; b=k7i3JM+nxRRxBzUXH410RKTO16WPvdVyP4CR+zFbOr6bYFuSnHSEowpN p/G/FP6ZReC1b1pksK9HlQb/ORsVdWWfoNnqQbgLZYkbe7RnCyGnb1K8O IYjOB/1MgJ3VJdF77DdQhex10Nrc0JHTKCPXRWLozFs77UlOlHpMg1IfK V2a5RKwI7STVDq3WuzggGeycUbt7/n7rB1J/3cSpgoFEZQ5/xL2SF7//+ GFGmKO4gk5Rp1XfFw6VwJ1Yf95Nruje9jdS/4lTOL3lIWPvB7AqGr3SJP kJWOrHtRcH4NHRm3ZV+cnsWArnM9fDHHJvzYG9XLlCg/inLFzwRA9fKf1 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972080" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972080" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by 
orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326509" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326509" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:15 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 20/32] x86/fred: add a machine check entry stub for FRED Date: Mon, 19 Dec 2022 22:36:46 -0800 Message-Id: <20221220063658.19271-21-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a machine check entry stub for FRED. Unlike with the IDT, there is no need to save/restore DR7 in the FRED machine check handler. 
Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 1 + arch/x86/kernel/cpu/mce/core.c | 11 +++++++++++ 2 files changed, 12 insertions(+) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 66c274a12e26..01678ced5451 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -95,6 +95,7 @@ typedef DECLARE_FRED_HANDLER((*fred_handler)); DECLARE_FRED_HANDLER(fred_exc_nmi); DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); +DECLARE_FRED_HANDLER(fred_exc_machine_check); #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 2c8ec5c71712..0186c9b39f5f 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -52,6 +52,7 @@ #include #include #include +#include #include "internal.h" @@ -2121,6 +2122,16 @@ DEFINE_IDTENTRY_MCE_USER(exc_machine_check) exc_machine_check_user(regs); local_db_restore(dr7); } + +#ifdef CONFIG_X86_FRED +DEFINE_FRED_HANDLER(fred_exc_machine_check) +{ + if (user_mode(regs)) + exc_machine_check_user(regs); + else + exc_machine_check_kernel(regs); +} +#endif #else /* 32bit unified entry point */ DEFINE_IDTENTRY_RAW(exc_machine_check) From patchwork Tue Dec 20 06:36:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077538 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B36CC4332F for ; Tue, 20 Dec 2022 07:02:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233443AbiLTHCQ (ORCPT ); Tue, 20 Dec 2022 02:02:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S233283AbiLTHBn (ORCPT ); Tue, 20 Dec 2022 02:01:43 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C7D6615FF0; Mon, 19 Dec 2022 23:01:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519698; x=1703055698; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Hb71FjnS4IGfRJOJBzO+RW509BXu1SCIOAS8RAhI6Mo=; b=hK/pFesHsVnEbL2DNbR9Tc9yzKDuFyfR335gpxcyy7EFSxwiOLHfA+yv bcFogcs12MCajbG/CeSF/20dIwSV/UPgkRyu82QNTG+sfoDyfk1ogb86o XG1/YE2zfmIFbuBp7UpUh7fEJMrmHcouTwg2F9RISCt5WT5ywN7UmMDtT 5YBVu3pFJfx6y8m0M0H6TBeR/Sv2kT6XI767Zb3ab7V377sV7HtomMivZ 26N2yGri8nEGDnBgZ3R5v3RsJ0yCvSDEc9Tm1F5a/lstE1JhtcfjyNBfS iwFv+vBzvvVNzboYjiaELH/sAo27XAAu2t10G4NFVckoJbKcD7OgdptFQ w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972090" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972090" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326513" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326513" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:16 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 21/32] x86/fred: FRED entry/exit and dispatch code Date: Mon, 19 Dec 2022 22:36:47 -0800 Message-Id: <20221220063658.19271-22-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" The code to actually handle kernel and event entry/exit using FRED. It is split up into two files thus: - entry_64_fred.S contains the actual entrypoints and exit code, and saves and restores registers. - entry_fred.c contains the event multi-level dispatch code for FRED. The first-level dispatch is on the event type, and the second-level dispatch is on the event vector. Some event handlers, #DB/#BP/#DF/#PF/#MC/#UD, start instrumentation in their own ways. Dave Hansen suggested using an exception bitmap in the exception dispatch framework to check whether to start instrumentation. Originally-by: Megha Dey Signed-off-by: H. 
Peter Anvin (Intel) Co-developed-by: Xin Li Signed-off-by: Xin Li --- arch/x86/entry/Makefile | 5 +- arch/x86/entry/entry_64_fred.S | 55 +++++++ arch/x86/entry/entry_fred.c | 270 ++++++++++++++++++++++++++++++++ arch/x86/include/asm/idtentry.h | 2 + arch/x86/include/asm/traps.h | 2 + 5 files changed, 333 insertions(+), 1 deletion(-) create mode 100644 arch/x86/entry/entry_64_fred.S create mode 100644 arch/x86/entry/entry_fred.c diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile index ca2fe186994b..c93e7f5c2a06 100644 --- a/arch/x86/entry/Makefile +++ b/arch/x86/entry/Makefile @@ -18,6 +18,9 @@ obj-y += vdso/ obj-y += vsyscall/ obj-$(CONFIG_PREEMPTION) += thunk_$(BITS).o +CFLAGS_entry_fred.o += -fno-stack-protector +CFLAGS_REMOVE_entry_fred.o += -pg $(CC_FLAGS_FTRACE) +obj-$(CONFIG_X86_FRED) += entry_64_fred.o entry_fred.o + obj-$(CONFIG_IA32_EMULATION) += entry_64_compat.o syscall_32.o obj-$(CONFIG_X86_X32_ABI) += syscall_x32.o - diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S new file mode 100644 index 000000000000..1fb765fd3871 --- /dev/null +++ b/arch/x86/entry/entry_64_fred.S @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/x86/entry/entry_64_fred.S + * + * The actual FRED entry points. + */ +#include +#include +#include +#include + +#include "calling.h" + + .code64 + .section ".noinstr.text", "ax" + +.macro FRED_ENTER + UNWIND_HINT_EMPTY + PUSH_AND_CLEAR_REGS + movq %rsp, %rdi /* %rdi -> pt_regs */ +.endm + +.macro FRED_EXIT + UNWIND_HINT_REGS + POP_REGS + addq $8,%rsp /* Drop error code */ +.endm + +/* + * The new RIP value that FRED event delivery establishes is + * IA32_FRED_CONFIG & ~FFFH for events that occur in ring 3. + * Thus the FRED ring 3 entry point must be 4K page aligned. 
+ */ + .align 4096 + +SYM_CODE_START_NOALIGN(fred_entrypoint_user) + FRED_ENTER + call fred_entry_from_user +SYM_INNER_LABEL(fred_exit_user, SYM_L_GLOBAL) + FRED_EXIT + ERETU +SYM_CODE_END(fred_entrypoint_user) + +/* + * The new RIP value that FRED event delivery establishes is + * (IA32_FRED_CONFIG & ~FFFH) + 256 for events that occur in + * ring 0, i.e., fred_entrypoint_user + 256. + */ + .org fred_entrypoint_user+256 +SYM_CODE_START_NOALIGN(fred_entrypoint_kernel) + FRED_ENTER + call fred_entry_from_kernel + FRED_EXIT + ERETS +SYM_CODE_END(fred_entrypoint_kernel) diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c new file mode 100644 index 000000000000..56814ab0b825 --- /dev/null +++ b/arch/x86/entry/entry_fred.c @@ -0,0 +1,270 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/x86/entry/entry_fred.c + * + * This contains the dispatch functions called from the entry point + * assembly. + */ + +#include +#include /* oops_begin/end, ... */ +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * Badness... 
+ */ +static DEFINE_FRED_HANDLER(fred_bad_event) +{ + irqentry_state_t irq_state = irqentry_nmi_enter(regs); + + instrumentation_begin(); + + /* Panic on events from a high stack level */ + if (regs->current_stack_level > 0) { + pr_emerg("PANIC: invalid or fatal FRED event; event type %u " + "vector %u error 0x%lx aux 0x%lx at %04x:%016lx\n", + regs->type, regs->vector, regs->orig_ax, + fred_event_data(regs), regs->cs, regs->ip); + die("invalid or fatal FRED event", regs, regs->orig_ax); + panic("invalid or fatal FRED event"); + } else { + unsigned long flags = oops_begin(); + int sig = SIGKILL; + + pr_alert("BUG: invalid or fatal FRED event; event type %u " + "vector %u error 0x%lx aux 0x%lx at %04x:%016lx\n", + regs->type, regs->vector, regs->orig_ax, + fred_event_data(regs), regs->cs, regs->ip); + + if (__die("Invalid or fatal FRED event", regs, regs->orig_ax)) + sig = 0; + + oops_end(flags, regs, sig); + } + + instrumentation_end(); + irqentry_nmi_exit(regs, irq_state); +} + +#define DEFINE_FRED_EXCEPTION_HANDLER(func) \ +static void fred_##func(struct pt_regs *regs) \ +{ \ + func (regs); \ +} + +DEFINE_FRED_EXCEPTION_HANDLER(exc_divide_error); +DEFINE_FRED_EXCEPTION_HANDLER(exc_overflow); +DEFINE_FRED_EXCEPTION_HANDLER(exc_bounds); +DEFINE_FRED_EXCEPTION_HANDLER(exc_device_not_available); +DEFINE_FRED_EXCEPTION_HANDLER(exc_coprocessor_error); +DEFINE_FRED_EXCEPTION_HANDLER(exc_simd_coprocessor_error); + +#define DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(func) \ +static void fred_##func(struct pt_regs *regs) \ +{ \ + func (regs, regs->orig_ax); \ +} + +DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(exc_invalid_tss); +DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(exc_segment_not_present); +noinstr DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(exc_double_fault); +DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(exc_stack_segment); +DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(exc_general_protection); +DEFINE_FRED_EXCEPTION_HANDLER_ERRORCODE(exc_alignment_check); + +/* + * Exception entry + */ 
+static DEFINE_FRED_HANDLER(fred_exception) +{ + /* + * This intentionally omits exceptions that cannot happen on FRED h/w: + * vectors _NOT_ listed are set to NULL. + */ + static const fred_handler exception_handlers[NUM_EXCEPTION_VECTORS] = { + [X86_TRAP_DE] = fred_exc_divide_error, + [X86_TRAP_DB] = fred_exc_debug, + [X86_TRAP_NMI] = NULL, /* A separate event type, not handled here */ + [X86_TRAP_BP] = exc_int3, + [X86_TRAP_OF] = fred_exc_overflow, + [X86_TRAP_BR] = fred_exc_bounds, + [X86_TRAP_UD] = exc_invalid_op, + [X86_TRAP_NM] = fred_exc_device_not_available, + [X86_TRAP_DF] = fred_exc_double_fault, + [X86_TRAP_OLD_MF] = NULL, /* 387 only! */ + [X86_TRAP_TS] = fred_exc_invalid_tss, + [X86_TRAP_NP] = fred_exc_segment_not_present, + [X86_TRAP_SS] = fred_exc_stack_segment, + [X86_TRAP_GP] = fred_exc_general_protection, + [X86_TRAP_PF] = fred_exc_page_fault, + [X86_TRAP_SPURIOUS] = NULL, /* Interrupts are their own event type */ + [X86_TRAP_MF] = fred_exc_coprocessor_error, + [X86_TRAP_AC] = fred_exc_alignment_check, + [X86_TRAP_MC] = fred_exc_machine_check, + [X86_TRAP_XF] = fred_exc_simd_coprocessor_error + }; + static const u32 noinstr_mask = BIT(X86_TRAP_DB) | BIT(X86_TRAP_BP) | + BIT(X86_TRAP_DF) | BIT(X86_TRAP_PF) | + BIT(X86_TRAP_MC) | BIT(X86_TRAP_UD); + u8 vector = array_index_nospec((u8)regs->vector, NUM_EXCEPTION_VECTORS); + irqentry_state_t state; + + if (likely(exception_handlers[vector])) { + if (!(BIT(vector) & noinstr_mask)) { + state = irqentry_enter(regs); + instrumentation_begin(); + } + + exception_handlers[vector](regs); + + if (!(BIT(vector) & noinstr_mask)) { + instrumentation_end(); + irqentry_exit(regs, state); + } + } else { + return fred_bad_event(regs); + } +} + +static __always_inline void fred_emulate_trap(struct pt_regs *regs) +{ + regs->type = EVENT_TYPE_SWFAULT; + regs->orig_ax = 0; + fred_exception(regs); +} + +static __always_inline void fred_emulate_fault(struct pt_regs *regs) +{ + regs->ip -= regs->instr_len; + 
fred_emulate_trap(regs); +} + +/* + * Emulate SYSENTER if applicable. This is not the preferred system + * call in 32-bit mode under FRED, rather int $0x80 is preferred and + * exported in the vdso. SYSCALL proper has a hard-coded early out in + * fred_entry_from_user(). + */ +static DEFINE_FRED_HANDLER(fred_syscall_slow) +{ + if (IS_ENABLED(CONFIG_IA32_EMULATION) && + likely(regs->vector == FRED_SYSENTER)) { + /* Convert frame to a syscall frame */ + regs->orig_ax = regs->ax; + regs->ax = -ENOSYS; + do_fast_syscall_32(regs); + } else { + regs->vector = X86_TRAP_UD; + fred_emulate_fault(regs); + } +} + +/* + * Some software exceptions can also be triggered as int instructions, + * for historical reasons. Implement those here. The performance-critical + * int $0x80 (32-bit system call) has a hard-coded early out. + */ +static DEFINE_FRED_HANDLER(fred_sw_interrupt_user) +{ + if (likely(regs->vector == IA32_SYSCALL_VECTOR)) { + /* Convert frame to a syscall frame */ + regs->orig_ax = regs->ax; + regs->ax = -ENOSYS; + return do_int80_syscall_32(regs); + } + + switch (regs->vector) { + case X86_TRAP_BP: + case X86_TRAP_OF: + fred_emulate_trap(regs); + break; + default: + regs->vector = X86_TRAP_GP; + fred_emulate_fault(regs); + break; + } +} + +static DEFINE_FRED_HANDLER(fred_hw_interrupt) +{ + irqentry_state_t state = irqentry_enter(regs); + + instrumentation_begin(); + external_interrupt(regs, regs->vector); + instrumentation_end(); + irqentry_exit(regs, state); +} + +__visible noinstr void fred_entry_from_user(struct pt_regs *regs) +{ + static const fred_handler user_handlers[FRED_EVENT_TYPE_COUNT] = + { + [EVENT_TYPE_HWINT] = fred_hw_interrupt, + [EVENT_TYPE_RESERVED] = fred_bad_event, + [EVENT_TYPE_NMI] = fred_exc_nmi, + [EVENT_TYPE_SWINT] = fred_sw_interrupt_user, + [EVENT_TYPE_HWFAULT] = fred_exception, + [EVENT_TYPE_SWFAULT] = fred_exception, + [EVENT_TYPE_PRIVSW] = fred_exception, + [EVENT_TYPE_OTHER] = fred_syscall_slow + }; + + /* + * FRED employs a two-level 
event dispatch mechanism, with + * the first-level on the type of an event and the second-level + * on its vector. Thus a dispatch typically induces 2 calls. + * We optimize it by using early outs for the most frequent + * events, and syscalls are the first. We may also need early + * outs for page faults. + */ + if (likely(regs->type == EVENT_TYPE_OTHER && + regs->vector == FRED_SYSCALL)) { + /* Convert frame to a syscall frame */ + regs->orig_ax = regs->ax; + regs->ax = -ENOSYS; + do_syscall_64(regs, regs->orig_ax); + } else { + /* Not a system call */ + u8 type = array_index_nospec((u8)regs->type, FRED_EVENT_TYPE_COUNT); + + user_handlers[type](regs); + } +} + +static DEFINE_FRED_HANDLER(fred_sw_interrupt_kernel) +{ + switch (regs->vector) { + case X86_TRAP_NMI: + fred_exc_nmi(regs); + break; + default: + fred_bad_event(regs); + break; + } +} + +__visible noinstr void fred_entry_from_kernel(struct pt_regs *regs) +{ + static const fred_handler kernel_handlers[FRED_EVENT_TYPE_COUNT] = + { + [EVENT_TYPE_HWINT] = fred_hw_interrupt, + [EVENT_TYPE_RESERVED] = fred_bad_event, + [EVENT_TYPE_NMI] = fred_exc_nmi, + [EVENT_TYPE_SWINT] = fred_sw_interrupt_kernel, + [EVENT_TYPE_HWFAULT] = fred_exception, + [EVENT_TYPE_SWFAULT] = fred_exception, + [EVENT_TYPE_PRIVSW] = fred_exception, + [EVENT_TYPE_OTHER] = fred_bad_event + }; + u8 type = array_index_nospec((u8)regs->type, FRED_EVENT_TYPE_COUNT); + + /* The pt_regs frame on entry here is an exception frame */ + kernel_handlers[type](regs); +} diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 966d720046f1..5b3b8402e0c5 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -616,6 +616,8 @@ DECLARE_IDTENTRY_RAW(X86_TRAP_MC, exc_machine_check); #ifdef CONFIG_XEN_PV DECLARE_IDTENTRY_RAW(X86_TRAP_MC, xenpv_exc_machine_check); #endif +#else +#define fred_exc_machine_check (NULL) #endif /* NMI */ diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h 
index 366b1675c033..77ffc580e821 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -58,4 +58,6 @@ typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); int exc_raise_irq(struct pt_regs *regs, u32 vector); +int external_interrupt(struct pt_regs *regs, unsigned int vector); + #endif /* _ASM_X86_TRAPS_H */ From patchwork Tue Dec 20 06:36:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077548 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4798AC10F1E for ; Tue, 20 Dec 2022 07:02:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233555AbiLTHCz (ORCPT ); Tue, 20 Dec 2022 02:02:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233291AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1702A15FF8; Mon, 19 Dec 2022 23:01:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519699; x=1703055699; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qQ9Z77NM1zplaDcAHAltOb9XPQ/w7BLCugheNo/3+b4=; b=Syxx1tduIPJWZ1i0y+IVYoj+oRxg4VyQGr76cCkNYeXkCQZNwuFpLFiL p7mp3G7HgaZ6iQdbiWcRMqMAh37Znoi6N8SsA9jAYmrC/NVvcGpy8/3Oq JEpgOVRQfTyxc73WoQ+zXK1Yr6z/5vPxFjp0JE3VtUoKoytSeXEgecNuP +GaOsG6RxJEqUUQg+reL7Nn70pfOLKLE0+/u+57u+zX0u1YvhdbR1cTwE TFWfWzEIZid5RXyb6pxz6qbYvXf22Vk2/NDNyj1kYNI81LFSTPSRGyAMZ zjDrVgzszwU26orL4qjAZq6fdEm+o2GGqcRyoLp2dpfcprXGn14GS8/mw Q==; X-IronPort-AV: 
E=McAfee;i="6500,9779,10566"; a="302972099" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972099" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:17 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326519" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326519" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:16 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 22/32] x86/fred: FRED initialization code Date: Mon, 19 Dec 2022 22:36:48 -0800 Message-Id: <20221220063658.19271-23-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" The code to initialize FRED when it's available and _not_ disabled. cpu_init_fred_exceptions() is the core function to initialize FRED, which 1. Sets up FRED entrypoints for events happening in ring 0 and 3. 2. Sets up a default stack for event handling. 3. Sets up dedicated event stacks for DB/NMI/MC/DF, equivalent to the IDT IST stacks. 4. Forces 32-bit system calls to use "int $0x80" only. 5. Enables FRED and invalidates the IDT. When FRED is used, cpu_init_exception_handling() initializes FRED by calling cpu_init_fred_exceptions(); otherwise it sets up the TSS IST and loads the IDT. As FRED uses the ring 3 FRED entrypoint for SYSCALL and SYSENTER, it skips setting up SYSCALL/SYSENTER related MSRs, e.g., MSR_LSTAR. Signed-off-by: H. 
Peter Anvin (Intel) Co-developed-by: Xin Li Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 14 +++++++ arch/x86/include/asm/traps.h | 2 + arch/x86/kernel/Makefile | 1 + arch/x86/kernel/cpu/common.c | 74 +++++++++++++++++++++++------------- arch/x86/kernel/fred.c | 73 +++++++++++++++++++++++++++++++++++ arch/x86/kernel/irqinit.c | 7 +++- arch/x86/kernel/traps.c | 16 +++++++- 7 files changed, 157 insertions(+), 30 deletions(-) create mode 100644 arch/x86/kernel/fred.c diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 01678ced5451..b6308e351e14 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -97,8 +97,22 @@ DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); DECLARE_FRED_HANDLER(fred_exc_machine_check); +/* + * The actual assembly entry and exit points + */ +extern __visible void fred_entrypoint_user(void); + +/* + * Initialization + */ +void cpu_init_fred_exceptions(void); +void fred_setup_apic(void); + #endif /* __ASSEMBLY__ */ +#else +#define cpu_init_fred_exceptions() BUG() +#define fred_setup_apic() BUG() #endif /* CONFIG_X86_FRED */ #endif /* ASM_X86_FRED_H */ diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 77ffc580e821..963c51e680bd 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -56,6 +56,8 @@ void __noreturn handle_stack_overflow(struct pt_regs *regs, void f (struct pt_regs *regs) typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); +system_interrupt_handler get_system_interrupt_handler(unsigned int i); + int exc_raise_irq(struct pt_regs *regs, u32 vector); int external_interrupt(struct pt_regs *regs, unsigned int vector); diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index f901658d9f7c..1d9e669e288b 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -48,6 +48,7 @@ obj-y += process_$(BITS).o signal.o obj-$(CONFIG_COMPAT) += signal_compat.o obj-y += 
traps.o idt.o irq.o irq_$(BITS).o dumpstack_$(BITS).o obj-y += time.o ioport.o dumpstack.o nmi.o +obj-$(CONFIG_X86_FRED) += fred.o obj-$(CONFIG_MODIFY_LDT_SYSCALL) += ldt.o obj-y += setup.o x86_init.o i8259.o irqinit.o obj-$(CONFIG_JUMP_LABEL) += jump_label.o diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 05a5538052ad..5de68356fe62 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -2034,28 +2035,6 @@ static void wrmsrl_cstar(unsigned long val) /* May not be marked __init: used by software suspend */ void syscall_init(void) { - wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS); - wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64); - -#ifdef CONFIG_IA32_EMULATION - wrmsrl_cstar((unsigned long)entry_SYSCALL_compat); - /* - * This only works on Intel CPUs. - * On AMD CPUs these MSRs are 32-bit, CPU truncates MSR_IA32_SYSENTER_EIP. - * This does not cause SYSENTER to jump to the wrong location, because - * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit). - */ - wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); - wrmsrl_safe(MSR_IA32_SYSENTER_ESP, - (unsigned long)(cpu_entry_stack(smp_processor_id()) + 1)); - wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); -#else - wrmsrl_cstar((unsigned long)ignore_sysret); - wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); - wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); - wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); -#endif - /* * Flags to clear on syscall; clear as much as possible * to minimize user space-kernel interference. 
@@ -2066,6 +2045,41 @@ void syscall_init(void) X86_EFLAGS_IF|X86_EFLAGS_DF|X86_EFLAGS_OF| X86_EFLAGS_IOPL|X86_EFLAGS_NT|X86_EFLAGS_RF| X86_EFLAGS_AC|X86_EFLAGS_ID); + + /* + * The default user and kernel segments + */ + wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS); + + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + /* Both sysexit and sysret cause #UD when FRED is enabled */ + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); + } else { + wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64); + +#ifdef CONFIG_IA32_EMULATION + wrmsrl_cstar((unsigned long)entry_SYSCALL_compat); + /* + * This only works on Intel CPUs. + * On AMD CPUs these MSRs are 32-bit, CPU truncates + * MSR_IA32_SYSENTER_EIP. + * This does not cause SYSENTER to jump to the wrong + * location, because AMD doesn't allow SYSENTER in + * long mode (either 32- or 64-bit). + */ + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, + (unsigned long)(cpu_entry_stack(smp_processor_id()) + 1)); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); +#else + wrmsrl_cstar((unsigned long)ignore_sysret); + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); +#endif + } } #else /* CONFIG_X86_64 */ @@ -2214,18 +2228,24 @@ void cpu_init_exception_handling(void) /* paranoid_entry() gets the CPU number from the GDT */ setup_getcpu(cpu); - /* IST vectors need TSS to be set up. */ - tss_setup_ist(tss); + /* Set up the TSS */ tss_setup_io_bitmap(tss); set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); - load_TR_desc(); /* GHCB needs to be setup to handle #VC. 
*/ setup_ghcb(); - /* Finally load the IDT */ - load_current_idt(); + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + /* Set up FRED exception handling */ + cpu_init_fred_exceptions(); + } else { + /* IST vectors need TSS to be set up. */ + tss_setup_ist(tss); + + /* Finally load the IDT */ + load_current_idt(); + } } /* diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c new file mode 100644 index 000000000000..827b58fd98d4 --- /dev/null +++ b/arch/x86/kernel/fred.c @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include +#include +#include +#include /* For cr4_set_bits() */ +#include + +/* + * Initialize FRED on this CPU. This cannot be __init as it is called + * during CPU hotplug. + */ +void cpu_init_fred_exceptions(void) +{ + wrmsrl(MSR_IA32_FRED_CONFIG, + FRED_CONFIG_ENTRYPOINT(fred_entrypoint_user) | + FRED_CONFIG_REDZONE(8) | /* Reserve for CALL emulation */ + FRED_CONFIG_INT_STKLVL(0)); + + wrmsrl(MSR_IA32_FRED_STKLVLS, + FRED_STKLVL(X86_TRAP_DB, 1) | + FRED_STKLVL(X86_TRAP_NMI, 2) | + FRED_STKLVL(X86_TRAP_MC, 2) | + FRED_STKLVL(X86_TRAP_DF, 3)); + + /* The FRED equivalents to IST stacks... */ + wrmsrl(MSR_IA32_FRED_RSP1, __this_cpu_ist_top_va(DB)); + wrmsrl(MSR_IA32_FRED_RSP2, __this_cpu_ist_top_va(NMI)); + wrmsrl(MSR_IA32_FRED_RSP3, __this_cpu_ist_top_va(DF)); + + /* Not used with FRED */ + wrmsrl(MSR_LSTAR, 0ULL); + wrmsrl(MSR_CSTAR, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); + + /* Enable FRED */ + cr4_set_bits(X86_CR4_FRED); + idt_invalidate(); /* Any further IDT use is a bug */ + + /* Use int $0x80 for 32-bit system calls in FRED mode */ + setup_clear_cpu_cap(X86_FEATURE_SYSENTER32); + setup_clear_cpu_cap(X86_FEATURE_SYSCALL32); +} + +/* + * Initialize system vectors from a FRED perspective, so + * lapic_assign_system_vectors() can do its job. 
+ */ +void __init fred_setup_apic(void) +{ + int i; + + for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++) + set_bit(i, system_vectors); + + /* + * Don't set the non assigned system vectors in the + * system_vectors bitmap. Otherwise they show up in + * /proc/interrupts. + */ +#ifdef CONFIG_SMP + set_bit(IRQ_MOVE_CLEANUP_VECTOR, system_vectors); +#endif + + for (i = 0; i < NR_SYSTEM_VECTORS; i++) { + if (get_system_interrupt_handler(i) != NULL) { + set_bit(i + FIRST_SYSTEM_VECTOR, system_vectors); + } + } + + /* The rest are fair game... */ +} diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c index beb1bada1b0a..bb59661f0278 100644 --- a/arch/x86/kernel/irqinit.c +++ b/arch/x86/kernel/irqinit.c @@ -28,6 +28,7 @@ #include #include #include +#include #include /* @@ -94,7 +95,11 @@ void __init native_init_IRQ(void) /* Execute any quirks before the call gates are initialised: */ x86_init.irqs.pre_vector_init(); - idt_setup_apic_and_irq_gates(); + if (cpu_feature_enabled(X86_FEATURE_FRED)) + fred_setup_apic(); + else + idt_setup_apic_and_irq_gates(); + lapic_assign_system_vectors(); if (!acpi_ioapic && !of_ioapic && nr_legacy_irqs()) { diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index b0ee83bab9e6..36a15df9b5e5 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1518,12 +1518,21 @@ static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { #undef SYSV +system_interrupt_handler get_system_interrupt_handler(unsigned int i) +{ + if (i >= NR_SYSTEM_VECTORS) + return NULL; + + return system_interrupt_handlers[i]; +} + void __init install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr) { BUG_ON(n < FIRST_SYSTEM_VECTOR); system_interrupt_handlers[n - FIRST_SYSTEM_VECTOR] = (system_interrupt_handler)addr; - alloc_intr_gate(n, asm_addr); + if (!cpu_feature_enabled(X86_FEATURE_FRED)) + alloc_intr_gate(n, asm_addr); } #ifndef CONFIG_X86_LOCAL_APIC @@ -1591,7 +1600,10 @@ void 
__init trap_init(void) /* Initialize TSS before setting up traps so ISTs work */ cpu_init_exception_handling(); + /* Setup traps as cpu_init() might #GP */ - idt_setup_traps(); + if (!cpu_feature_enabled(X86_FEATURE_FRED)) + idt_setup_traps(); + cpu_init(); } From patchwork Tue Dec 20 06:36:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077546 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 789F7C4332F for ; Tue, 20 Dec 2022 07:02:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233430AbiLTHCu (ORCPT ); Tue, 20 Dec 2022 02:02:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53534 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233284AbiLTHBn (ORCPT ); Tue, 20 Dec 2022 02:01:43 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BED616486; Mon, 19 Dec 2022 23:01:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519699; x=1703055699; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TP8n9pSqR9d/xRUDBC0wSmLuk1j6QLJ8duTmvudCB0o=; b=a6fcbdcH8dy0QBrarpcl7TcMNcyf3uvfacmtICPy340s8gAWo+0Zpl75 ECNaPzGkh7O5TMF/e0KEvcYHzBiunO8Tp1Z1oLsHvZh/oZH5/5chnzgi9 bJ9xtCx+j+qEHaxuizUj1XD8BxBXWL/3/5rFMVK+R/ybDe8065Fj+/mRF uFw17yu3wkyZGt01HhXDUYS7TTWeY3PVbRN8/CFIJzt2WUvYSfyRbxzo2 c5PnMzT+X9AJsBFpziEPwTBftCI1gXW5xcsUzEwN7HyKOXIptwv3hGdny ug/67oIGW4/edNN1fg+h4dxGAYDYmIYCpPbYKmFj0bjsOxKCRItAJQFjL g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972109" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; 
d="scan'208";a="302972109" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:17 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326522" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326522" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:17 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 23/32] x86/fred: update MSR_IA32_FRED_RSP0 during task switch Date: Mon, 19 Dec 2022 22:36:49 -0800 Message-Id: <20221220063658.19271-24-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" MSR_IA32_FRED_RSP0 is used during ring 3 event delivery, and must be updated to point to the top of the next task's stack during a task switch. Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/switch_to.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h index c08eb0fdd11f..c28170d4fbba 100644 --- a/arch/x86/include/asm/switch_to.h +++ b/arch/x86/include/asm/switch_to.h @@ -71,9 +71,13 @@ static inline void update_task_stack(struct task_struct *task) else this_cpu_write(cpu_tss_rw.x86_tss.sp1, task->thread.sp0); #else - /* Xen PV enters the kernel on the thread stack.
*/ - if (static_cpu_has(X86_FEATURE_XENPV)) + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + wrmsrl(MSR_IA32_FRED_RSP0, + task_top_of_stack(task) + TOP_OF_KERNEL_STACK_PADDING); + } else if (static_cpu_has(X86_FEATURE_XENPV)) { + /* Xen PV enters the kernel on the thread stack. */ load_sp0(task_top_of_stack(task)); + } #endif } From patchwork Tue Dec 20 06:36:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 242F4C3DA79 for ; Tue, 20 Dec 2022 07:02:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233569AbiLTHC4 (ORCPT ); Tue, 20 Dec 2022 02:02:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233290AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 106A915F04; Mon, 19 Dec 2022 23:01:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519700; x=1703055700; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9Ow+AUlF3YAQbms59Th+YlBw4taXZzsqN7sPL++pLxY=; b=lsxv/yEdrsDp/zOeI2dLcYmfjwWoq3PqZ8oSwV2aseNzMKwVPYr8xQbx QlCgLscpeE8LuntB8mxdw0B3Ek9k39J0peruw1wNcI5AmMwBA9ezJtBtj IQQZ50Zb7+VPFQHiVR1fdH5r8mQexbKmD++HZhvxZLCUoC4djv4YROqnU bmr3I/x44LXRtFpnWqT78swRK8+8r5WV3/6wseYb0tksOkOJjxeaDghjW Gb8urLO1sjp2vgZFEO1l/4uaWUmobTkOCuSIp3viPdzVUWnmr9n1YNdyT qjfEUb9cxcbyvcC1FBAtQDMObn9hsv0CqJYc2ulWS+75Oy7mkl/4Sz+Ls Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; 
a="302972119" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972119" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326526" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326526" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:17 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 24/32] x86/fred: let ret_from_fork() jmp to fred_exit_user when FRED is enabled Date: Mon, 19 Dec 2022 22:36:50 -0800 Message-Id: <20221220063658.19271-25-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Let ret_from_fork() jmp to fred_exit_user when FRED is enabled, otherwise the existing IDT code is chosen. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/entry/entry_64.S | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index e0c48998d2fb..cdb696cbb2a0 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -297,7 +297,12 @@ SYM_CODE_START(ret_from_fork) UNWIND_HINT_REGS movq %rsp, %rdi call syscall_exit_to_user_mode /* returns with IRQs disabled */ +#ifdef CONFIG_X86_FRED + ALTERNATIVE "jmp swapgs_restore_regs_and_return_to_usermode", \ + "jmp fred_exit_user", X86_FEATURE_FRED +#else jmp swapgs_restore_regs_and_return_to_usermode +#endif 1: /* kernel thread */ From patchwork Tue Dec 20 06:36:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077547 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D15A6C4332F for ; Tue, 20 Dec 2022 07:02:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233545AbiLTHCw (ORCPT ); Tue, 20 Dec 2022 02:02:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53336 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232887AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D0B8F15FDF; Mon, 19 Dec 2022 23:01:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519700; x=1703055700; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cFa67ljZDOj6B4hnKmSyOvFw/+uBlbUZju1/l5Toj94=; b=bssOtzC1EihsWMNGwXyi8KsQIb6P94dulS/RQW2o/Hu3Ei3wZpXo3uUz 
BR3lzhuL3u+3YdCULK/GcQ6GGLxGGGq+hOD4aKzABEas9Nd5YUZE3kztV XUo5TScwIQ8Wlrzmtg1mSNzgS07safNFvOZ7lAaPnVYryvS9FmZq/lMfF 9XfUbSVJUloTIJ14MlBRAGTny/4c5zSkk4JEYRYFM5zinB4tUN9AegAKz yUg2mgnWZZ/6xfo7Qwc4pLw8ipQ/LJ8wIVjmVi3WFCA4saaPVlVSdXSmq d3mI/2TE4XyDKjh5PqsZprm+Ws4wSjhNCLfIWqaBwB+KRGLvaGDq1XZG4 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972128" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972128" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326530" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326530" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:17 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 25/32] x86/fred: disallow the swapgs instruction when FRED is enabled Date: Mon, 19 Dec 2022 22:36:51 -0800 Message-Id: <20221220063658.19271-26-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" The FRED architecture establishes the full supervisor/user context through: 1) FRED event delivery swaps the value of the GS base address and that of the IA32_KERNEL_GS_BASE MSR. 2) ERETU swaps the value of the GS base address and that of the IA32_KERNEL_GS_BASE MSR. Thus, the swapgs instruction is disallowed when FRED is enabled; executing it causes #UD. Signed-off-by: H.
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/kernel/process_64.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index bfe6179b7a17..5b6cfd2ca630 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -165,7 +165,8 @@ static noinstr unsigned long __rdgsbase_inactive(void) lockdep_assert_irqs_disabled(); - if (!static_cpu_has(X86_FEATURE_XENPV)) { + if (!cpu_feature_enabled(X86_FEATURE_FRED) && + !static_cpu_has(X86_FEATURE_XENPV)) { native_swapgs(); gsbase = rdgsbase(); native_swapgs(); @@ -190,7 +191,8 @@ static noinstr void __wrgsbase_inactive(unsigned long gsbase) { lockdep_assert_irqs_disabled(); - if (!static_cpu_has(X86_FEATURE_XENPV)) { + if (!cpu_feature_enabled(X86_FEATURE_FRED) && + !static_cpu_has(X86_FEATURE_XENPV)) { native_swapgs(); wrgsbase(gsbase); native_swapgs(); From patchwork Tue Dec 20 06:36:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077550 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A03E7C10F1E for ; Tue, 20 Dec 2022 07:03:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233576AbiLTHDA (ORCPT ); Tue, 20 Dec 2022 02:03:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233293AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1B1B15FFB; Mon, 19 Dec 2022 23:01:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; 
s=Intel; t=1671519700; x=1703055700; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=JKAIJtKpEV0pDtyUyzmPScPoDyIj7KB0n/KR4J5yOTE=; b=lxSZZffo6R2IVG4O66/yYyCQDH/75enSFundJ+3UDvT3CGtyDDogOvYO zNPFQE96TlkBVEN0xdWuV/zA9dS/kqzS3NPeSLAzhvEGm279K6fJlL6qI i5GF6fS+3UCiGEOVN2iG2YA+ZTRatjZQJwfqpJLh3VqUc9/TVdhbdGpwj ditvax7jv5oWk8b+7KXshFgF8I2O3A3Q0rqMz4+/4rxR1WsAFH4LluMr/ ekhfinBM2KUM/aX5qtCGv41Ikg5fCJ8/knfsPe3rQmvYbOraVZxRtYR44 GCN+PloMttvehJWkpc/WhKO+S1ESphp7LiQbs3ydXhVZuYDAOoAwVpoem Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972137" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972137" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326537" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326537" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:18 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 26/32] x86/fred: no ESPFIX needed when FRED is enabled Date: Mon, 19 Dec 2022 22:36:52 -0800 Message-Id: <20221220063658.19271-27-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Because FRED always restores the full value of %rsp, ESPFIX is no longer needed when it's enabled. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/kernel/espfix_64.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c index 9417d5aa7305..b594fcc0a4b7 100644 --- a/arch/x86/kernel/espfix_64.c +++ b/arch/x86/kernel/espfix_64.c @@ -116,6 +116,10 @@ void __init init_espfix_bsp(void) pgd_t *pgd; p4d_t *p4d; + /* FRED systems don't need ESPFIX */ + if (cpu_feature_enabled(X86_FEATURE_FRED)) + return; + /* Install the espfix pud into the kernel page directory */ pgd = &init_top_pgt[pgd_index(ESPFIX_BASE_ADDR)]; p4d = p4d_alloc(&init_mm, pgd, ESPFIX_BASE_ADDR); @@ -139,6 +143,10 @@ void init_espfix_ap(int cpu) void *stack_page; pteval_t ptemask; + /* FRED systems don't need ESPFIX */ + if (cpu_feature_enabled(X86_FEATURE_FRED)) + return; + /* We only have to do this once... */ if (likely(per_cpu(espfix_stack, cpu))) return; /* Already initialized */ From patchwork Tue Dec 20 06:36:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077551 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48522C4332F for ; Tue, 20 Dec 2022 07:03:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233583AbiLTHDC (ORCPT ); Tue, 20 Dec 2022 02:03:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233298AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1DAAA15F1B; Mon, 19 Dec 2022 23:01:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; 
q=dns/txt; s=Intel; t=1671519701; x=1703055701; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=w5B5k+hmkxOaK+wejq61l/Q17ercfMb2xEXn5kh7ENU=; b=n5nY4xcVJ7PQZzrLyCeEiXggq9YuQ60yQZO60xDcad0vNkFKtrnqK6bn kUSlh+R03k6lePnwWHZhEeRWhDBT26oWM9crsdYRkY/a5+ZY1OF8R6FQ+ RAzlB/RHfgmGGdSSm34u+npxooK070+IvmOj5RrVFZHeuxnIHNZmrWZkm d6yLevDCuuTv/h3u0VSCbKZjHnjMkd/qhi4mLkt1CbLKcUxKogqB2jnyW +615M1oyiB79LQ7+zU47xdjVjTV0G1C7V1ubE9mU2kgTaedzXm8BJyDdq smXYYTRoowhJ8BX/hwTzTaVRs6jCXfpiwGf12igo0buNpMSREpsO75egu Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972146" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972146" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326542" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326542" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:18 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 27/32] x86/fred: allow single-step trap and NMI when starting a new thread Date: Mon, 19 Dec 2022 22:36:53 -0800 Message-Id: <20221220063658.19271-28-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. 
Peter Anvin (Intel)" Allow single-step trap and NMI when starting a new thread, thus once the new thread returns to ring3, single-step trap and NMI are both enabled immediately. High-order 48 bits above the lowest 16 bit CS are discarded by the legacy IRET instruction, thus can be set unconditionally, even when FRED is not enabled. Signed-off-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 11 +++++++++++ arch/x86/kernel/process_64.c | 13 +++++++------ 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index b6308e351e14..730e69d2bb87 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -50,6 +50,14 @@ #define FRED_CSL_ALLOW_SINGLE_STEP _BITUL(25) #define FRED_CSL_INTERRUPT_SHADOW _BITUL(24) +/* + * High-order 48 bits above the lowest 16 bit CS are discarded by the + * legacy IRET instruction, thus can be set unconditionally, even when + * FRED is not enabled. 
+ */ +#define CSL_PROCESS_START \ + (FRED_CSL_ENABLE_NMI | FRED_CSL_ALLOW_SINGLE_STEP) + #ifndef __ASSEMBLY__ #include @@ -113,6 +121,9 @@ void fred_setup_apic(void); #else #define cpu_init_fred_exceptions() BUG() #define fred_setup_apic() BUG() + +#define CSL_PROCESS_START 0 + #endif /* CONFIG_X86_FRED */ #endif /* ASM_X86_FRED_H */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 5b6cfd2ca630..128dafc04acf 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -55,6 +55,7 @@ #include #include #include +#include #ifdef CONFIG_IA32_EMULATION /* Not included via unistd.h */ #include @@ -506,7 +507,7 @@ void x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase) static void start_thread_common(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp, - unsigned int _cs, unsigned int _ss, unsigned int _ds) + u16 _cs, u16 _ss, u16 _ds) { WARN_ON_ONCE(regs != current_pt_regs()); @@ -521,11 +522,11 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip, loadsegment(ds, _ds); load_gs_index(0); - regs->ip = new_ip; - regs->sp = new_sp; - regs->cs = _cs; - regs->ss = _ss; - regs->flags = X86_EFLAGS_IF; + regs->ip = new_ip; + regs->sp = new_sp; + regs->csl = _cs | CSL_PROCESS_START; + regs->ssl = _ss; + regs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED; } void From patchwork Tue Dec 20 06:36:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077552 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8042CC46467 for ; Tue, 20 Dec 2022 07:03:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233283AbiLTHDE (ORCPT ); Tue, 20 Dec 2022 02:03:04 -0500 Received: from 
lindbergh.monkeyblade.net ([23.128.96.19]:53382 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233296AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E09B15FFC; Mon, 19 Dec 2022 23:01:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519702; x=1703055702; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=uggmxMn6oJUIBEOGX26R5VtrTEJ+CYj34lHFi9H82Hs=; b=nFU7lkN3HiG5nD4TTdl2XA1ZZHOIShVI9Yv3dSezTsTr4iBLB3O4BK1E ZXysdbDlOSPxZHvHv89JAr6hoQ/iQF4MboQf9zBSFZYLEeeHKtnBj/SDz L/x8j5Mpj7eaG6OJ8AoGmgwrvipO+Vqc1b5cjGAnUpxEqK4U9mbpl4/2q V6ZW83XzwPlbMnK+XtddkUJFoptyqxYA/BX2uXw0L0VA7BkC2PPiKXq9m Me8JUGPsQXdyfwVXiZjr1INt4GAySMM9CjjbxNq8cKn2Sy3VMGb+geGPD sisWt3hc2b4CWazxbBf+xOeh8NK3ZzV9xJUQWqFZ1hXrsMWYh1nUPRxqW Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972155" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972155" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326547" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326547" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:19 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 28/32] x86/fred: fixup fault on ERETU by jumping to fred_entrypoint_user Date: Mon, 19 Dec 2022 22:36:54 -0800 Message-Id: <20221220063658.19271-29-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If the stack frame contains an invalid user context (e.g. due to invalid SS, a non-canonical RIP, etc.) the ERETU instruction will trap (#SS or #GP). From a Linux point of view, this really should be considered a user space failure, so use the standard fault fixup mechanism to intercept the fault, fix up the exception frame, and redirect execution to fred_entrypoint_user. The end result is that it appears just as if the hardware had taken the exception immediately after completing the transition to user space. Suggested-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/entry/entry_64_fred.S | 8 +++++-- arch/x86/include/asm/extable_fixup_types.h | 4 ++++ arch/x86/mm/extable.c | 28 ++++++++++++++++++++++ 3 files changed, 38 insertions(+), 2 deletions(-) diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S index 1fb765fd3871..027ef8f1e600 100644 --- a/arch/x86/entry/entry_64_fred.S +++ b/arch/x86/entry/entry_64_fred.S @@ -5,8 +5,10 @@ * The actual FRED entry points. 
*/ #include -#include +#include #include +#include +#include #include #include "calling.h" @@ -38,7 +40,9 @@ SYM_CODE_START_NOALIGN(fred_entrypoint_user) call fred_entry_from_user SYM_INNER_LABEL(fred_exit_user, SYM_L_GLOBAL) FRED_EXIT - ERETU +1: ERETU + + _ASM_EXTABLE_TYPE(1b, fred_entrypoint_user, EX_TYPE_ERETU) SYM_CODE_END(fred_entrypoint_user) /* diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h index 991e31cfde94..ddebd5b8b340 100644 --- a/arch/x86/include/asm/extable_fixup_types.h +++ b/arch/x86/include/asm/extable_fixup_types.h @@ -66,4 +66,8 @@ #define EX_TYPE_ZEROPAD 20 /* longword load with zeropad on fault */ +#ifdef CONFIG_X86_FRED +#define EX_TYPE_ERETU 21 +#endif + #endif diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index 60814e110a54..be9d75358f50 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -6,6 +6,7 @@ #include #include +#include #include #include #include @@ -195,6 +196,29 @@ static bool ex_handler_ucopy_len(const struct exception_table_entry *fixup, return ex_handler_uaccess(fixup, regs, trapnr); } +#ifdef CONFIG_X86_FRED +static bool ex_handler_eretu(const struct exception_table_entry *fixup, + struct pt_regs *regs, unsigned long error_code) +{ + struct pt_regs *uregs = (struct pt_regs *)(regs->sp - offsetof(struct pt_regs, ip)); + unsigned short ss = uregs->ss; + unsigned short cs = uregs->cs; + + fred_info(uregs)->edata = fred_event_data(regs); + uregs->ssl = regs->ssl; + uregs->ss = ss; + uregs->csl = regs->csl; + uregs->current_stack_level = 0; + uregs->cs = cs; + uregs->orig_ax = error_code; + + /* drop error code */ + regs->sp -= 8; + + return ex_handler_default(fixup, regs); +} +#endif + int ex_get_fixup_type(unsigned long ip) { const struct exception_table_entry *e = search_exception_tables(ip); @@ -272,6 +296,10 @@ int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code, return ex_handler_ucopy_len(e, regs, trapnr, reg, imm); 
case EX_TYPE_ZEROPAD: return ex_handler_zeropad(e, regs, fault_addr); +#ifdef CONFIG_X86_FRED + case EX_TYPE_ERETU: + return ex_handler_eretu(e, regs, error_code); +#endif } BUG(); } From patchwork Tue Dec 20 06:36:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077553 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41A47C4332F for ; Tue, 20 Dec 2022 07:03:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233597AbiLTHDG (ORCPT ); Tue, 20 Dec 2022 02:03:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233300AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3DF8715FEC; Mon, 19 Dec 2022 23:01:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519702; x=1703055702; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yvHd27MPzoRPhJLMrlSGr6kQm8k0tLId/1KA4wMkuYk=; b=BDPoVh9AaDqOWI32Qx9QseFcXRmMDZzl+jRzot659QlJF1SQmh+e5Tv4 iK0CDvJijF5bIQ1Q3D6DHcoJxWtybByVwKKgnlXgVEL+GT8Vt8GhnC2Ch ojDJUW+Y+68EKQ8R69JGxQASLspsfQfNdBDLV82O9hMs54HLcDxxOv4ki BHQq5K1Q6/t6tSQBaYKaduyjNGlfnXsUtvPpPllqCnnozU/0Lj0aAe0F+ FQMaA9ScdhOEgWQ+DMpJnPACX/39T8ifL3azyFfVO4IRqkroGqSmsqrbg 0+cmA6weMp6dtUOxJZcSe/+sEmMJ5QykUQ3zBgUcHSjdBIyT5Je2QixkF A==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972165" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972165" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by 
orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326551" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326551" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:19 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 29/32] x86/ia32: do not modify the DPL bits for a null selector Date: Mon, 19 Dec 2022 22:36:55 -0800 Message-Id: <20221220063658.19271-30-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When a null selector is to be loaded into a segment register, reload_segments() sets its DPL bits to 3. Later, when the IRET instruction loads it, IRET zeroes the segment register. The two operations cancel each other out, so the net effect is a nop. Unlike IRET, ERETU does not make any of DS, ES, FS, or GS null if it is found to have DPL < 3. It is expected that a FRED-enabled operating system will return to ring 3 (in compatibility mode) only when those segments all have DPL = 3. Thus, when FRED is enabled, we end up with 3 in a segment register even when it was initially set to 0. Fix it by not modifying the DPL bits for a null selector.
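For illustration only, the selector rule described above can be sketched in stand-alone user-space C. This is not the kernel code: usrseg_sketch is a hypothetical name that mirrors the usrseg() helper this patch adds, and uint16_t stands in for the kernel's u16. The sketch assumes that a selector value of 0 through 3 is a null selector (index 0 in the GDT, with any value in the low two bits).

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone mirror of the usrseg() helper added by this
 * patch (assumption: not the kernel implementation itself).
 *
 * Selectors 0..3 are treated as null selectors and returned unchanged.
 * Every other selector has its low two privilege bits forced to 3, which
 * is what the old unconditional "sel | 0x03" did for all selectors.
 */
static inline uint16_t usrseg_sketch(uint16_t sel)
{
	return sel <= 3 ? sel : (uint16_t)(sel | 3);
}
```

With this rule a saved selector of 0 stays 0 and is loaded as a null selector, while a non-null selector such as 0x28 still becomes 0x2b, as it did before the change.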
Signed-off-by: Xin Li --- arch/x86/ia32/ia32_signal.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c index 14c739303099..31f5bbb59441 100644 --- a/arch/x86/ia32/ia32_signal.c +++ b/arch/x86/ia32/ia32_signal.c @@ -36,22 +36,27 @@ #include #include +static inline u16 usrseg(u16 sel) +{ + return sel <= 3 ? sel : sel | 3; +} + static inline void reload_segments(struct sigcontext_32 *sc) { unsigned int cur; savesegment(gs, cur); - if ((sc->gs | 0x03) != cur) - load_gs_index(sc->gs | 0x03); + if (usrseg(sc->gs) != cur) + load_gs_index(usrseg(sc->gs)); savesegment(fs, cur); - if ((sc->fs | 0x03) != cur) - loadsegment(fs, sc->fs | 0x03); + if (usrseg(sc->fs) != cur) + loadsegment(fs, usrseg(sc->fs)); savesegment(ds, cur); - if ((sc->ds | 0x03) != cur) - loadsegment(ds, sc->ds | 0x03); + if (usrseg(sc->ds) != cur) + loadsegment(ds, usrseg(sc->ds)); savesegment(es, cur); - if ((sc->es | 0x03) != cur) - loadsegment(es, sc->es | 0x03); + if (usrseg(sc->es) != cur) + loadsegment(es, usrseg(sc->es)); } /* From patchwork Tue Dec 20 06:36:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077556 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69E4BC4332F for ; Tue, 20 Dec 2022 07:03:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233621AbiLTHDP (ORCPT ); Tue, 20 Dec 2022 02:03:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233316AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95AAFDEC6; Mon, 19 Dec 2022 23:01:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519703; x=1703055703; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=08DMJTwR5UqrP5eaEjFHiFkQPR9Chw6/A6KZptBXKkQ=; b=U6O9NU09qtHuQyDoz12atI16k9Rlxzm8Z8C64RwuEknxQqZQW1BPj86H 9yxT1BnEaC4XdoNT0OfFOYYXiPprINR6AxGhBvMgQVw6WqV8FlXY7NdY2 PMJKzgREn4bmrDCRdetD8+Bzrf5C2RJVk5AyDFkhhUYATd+2oYLnE9FKM 2lp8atgVY6oyqnkH42fXOsKhBvu1Pm1gnAIYdqXc+sG6nWA7NSdzh3Hxx Y9DkjS7DGje+sPgkPmkxI8V69t3D1DYVVXBzOJAnfy1RO0WRcXOfl+Ec/ bjhh6m4/AXoa5kyOFRw/cwRqcS14BayHv4o31m1DaeYQPXMAw3Hw8LzVn g==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972174" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972174" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:20 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326559" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326559" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:19 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 30/32] x86/fred: allow FRED systems to use interrupt vectors 0x10-0x1f Date: Mon, 19 Dec 2022 22:36:56 -0800 Message-Id: <20221220063658.19271-31-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" FRED inherits the Intel VT-x enhancement of classified events with a two-level event dispatch logic. The first-level dispatch is on the event type, and the second-level is on the event vector. This also means that vectors in different event types are orthogonal, thus, vectors 0x10-0x1f become available as hardware interrupts. Enable interrupt vectors 0x10-0x1f on FRED systems (interrupt 0x80 is already enabled.) Most of these changes are about removing the assumption that the lowest-priority vector is hard-wired to 0x20. Signed-off-by: H. 
Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/include/asm/idtentry.h | 4 ++-- arch/x86/include/asm/irq.h | 5 +++++ arch/x86/include/asm/irq_vectors.h | 15 +++++++++++---- arch/x86/kernel/apic/apic.c | 11 ++++++++--- arch/x86/kernel/apic/vector.c | 8 +++++++- arch/x86/kernel/fred.c | 4 ++-- arch/x86/kernel/idt.c | 6 +++--- arch/x86/kernel/irq.c | 2 +- arch/x86/kernel/traps.c | 2 ++ 9 files changed, 41 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 5b3b8402e0c5..a3660f2f85ca 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -534,8 +534,8 @@ __visible noinstr void func(struct pt_regs *regs, \ */ .align IDT_ALIGN SYM_CODE_START(irq_entries_start) - vector=FIRST_EXTERNAL_VECTOR - .rept NR_EXTERNAL_VECTORS + vector=FIRST_EXTERNAL_VECTOR_IDT + .rept FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR_IDT UNWIND_HINT_IRET_REGS 0 : ENDBR diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h index 768aa234cbb4..e4be6f8409ad 100644 --- a/arch/x86/include/asm/irq.h +++ b/arch/x86/include/asm/irq.h @@ -11,6 +11,11 @@ #include #include +/* + * The first available IRQ vector + */ +extern unsigned int __ro_after_init first_external_vector; + /* * The irq entry code is in the noinstr section and the start/end of * __irqentry_text is emitted via labels. Make the build fail if diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 43dcb9284208..cb3670a7c18f 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -31,15 +31,23 @@ /* * IDT vectors usable for external interrupt sources start at 0x20. - * (0x80 is the syscall vector, 0x30-0x3f are for ISA) + * (0x80 is the syscall vector, 0x30-0x3f are for ISA). + * + * With FRED we can also use 0x10-0x1f even though those overlap + * exception vectors as FRED distinguishes exceptions and interrupts. 
+ * Therefore, FIRST_EXTERNAL_VECTOR is no longer a constant. */ -#define FIRST_EXTERNAL_VECTOR 0x20 +#define FIRST_EXTERNAL_VECTOR_IDT 0x20 +#define FIRST_EXTERNAL_VECTOR_FRED 0x10 +#define FIRST_EXTERNAL_VECTOR first_external_vector /* * Reserve the lowest usable vector (and hence lowest priority) 0x20 for * triggering cleanup after irq migration. 0x21-0x2f will still be used * for device interrupts. */ +#define IRQ_MOVE_CLEANUP_VECTOR_IDT FIRST_EXTERNAL_VECTOR_IDT +#define IRQ_MOVE_CLEANUP_VECTOR_FRED FIRST_EXTERNAL_VECTOR_FRED #define IRQ_MOVE_CLEANUP_VECTOR FIRST_EXTERNAL_VECTOR #define IA32_SYSCALL_VECTOR 0x80 @@ -48,7 +56,7 @@ * Vectors 0x30-0x3f are used for ISA interrupts. * round up to the next 16-vector boundary */ -#define ISA_IRQ_VECTOR(irq) (((FIRST_EXTERNAL_VECTOR + 16) & ~15) + irq) +#define ISA_IRQ_VECTOR(irq) (((FIRST_EXTERNAL_VECTOR_IDT + 16) & ~15) + irq) /* * Special IRQ vectors used by the SMP architecture, 0xf0-0xff @@ -114,7 +122,6 @@ #define FIRST_SYSTEM_VECTOR NR_VECTORS #endif -#define NR_EXTERNAL_VECTORS (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR) #define NR_SYSTEM_VECTORS (NR_VECTORS - FIRST_SYSTEM_VECTOR) /* diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index c6876d3ea4b1..1fbf6e3ed6c7 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1621,12 +1621,17 @@ static void setup_local_APIC(void) /* * Set Task Priority to 'accept all except vectors 0-31'. An APIC * vector in the 16-31 range could be delivered if TPR == 0, but we - * would think it's an exception and terrible things will happen. We - * never change this later on. + * would think it's an exception and terrible things will happen, + * unless we are using FRED in which case interrupts and + * exceptions are distinguished by type code. + * + * We never change this later on. 
*/ + BUG_ON(!first_external_vector); + value = apic_read(APIC_TASKPRI); value &= ~APIC_TPRI_MASK; - value |= 0x10; + value |= (first_external_vector - 0x10) & APIC_TPRI_MASK; apic_write(APIC_TASKPRI, value); /* Clear eventually stale ISR/IRR bits */ diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 3e6f6b448f6a..1d7374fa8a1c 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -46,6 +46,7 @@ static struct irq_matrix *vector_matrix; #ifdef CONFIG_SMP static DEFINE_PER_CPU(struct hlist_head, cleanup_list); #endif +unsigned int first_external_vector = FIRST_EXTERNAL_VECTOR_IDT; void lock_vector_lock(void) { @@ -800,7 +801,12 @@ int __init arch_early_irq_init(void) * Allocate the vector matrix allocator data structure and limit the * search area. */ - vector_matrix = irq_alloc_matrix(NR_VECTORS, FIRST_EXTERNAL_VECTOR, + if (cpu_feature_enabled(X86_FEATURE_FRED)) + first_external_vector = FIRST_EXTERNAL_VECTOR_FRED; + else + first_external_vector = FIRST_EXTERNAL_VECTOR_IDT; + + vector_matrix = irq_alloc_matrix(NR_VECTORS, first_external_vector, FIRST_SYSTEM_VECTOR); BUG_ON(!vector_matrix); diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c index 827b58fd98d4..04f057219c6e 100644 --- a/arch/x86/kernel/fred.c +++ b/arch/x86/kernel/fred.c @@ -51,7 +51,7 @@ void __init fred_setup_apic(void) { int i; - for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++) + for (i = 0; i < FIRST_EXTERNAL_VECTOR_FRED; i++) set_bit(i, system_vectors); /* @@ -60,7 +60,7 @@ void __init fred_setup_apic(void) * /proc/interrupts. 
*/ #ifdef CONFIG_SMP - set_bit(IRQ_MOVE_CLEANUP_VECTOR, system_vectors); + set_bit(IRQ_MOVE_CLEANUP_VECTOR_FRED, system_vectors); #endif for (i = 0; i < NR_SYSTEM_VECTORS; i++) { diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index a58c6bc1cd68..d3fd86f85de9 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -131,7 +131,7 @@ static const __initconst struct idt_data apic_idts[] = { INTG(RESCHEDULE_VECTOR, asm_sysvec_reschedule_ipi), INTG(CALL_FUNCTION_VECTOR, asm_sysvec_call_function), INTG(CALL_FUNCTION_SINGLE_VECTOR, asm_sysvec_call_function_single), - INTG(IRQ_MOVE_CLEANUP_VECTOR, asm_sysvec_irq_move_cleanup), + INTG(IRQ_MOVE_CLEANUP_VECTOR_IDT, asm_sysvec_irq_move_cleanup), INTG(REBOOT_VECTOR, asm_sysvec_reboot), #endif @@ -274,13 +274,13 @@ static void __init idt_map_in_cea(void) */ void __init idt_setup_apic_and_irq_gates(void) { - int i = FIRST_EXTERNAL_VECTOR; + int i = FIRST_EXTERNAL_VECTOR_IDT; void *entry; idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts), true); for_each_clear_bit_from(i, system_vectors, FIRST_SYSTEM_VECTOR) { - entry = irq_entries_start + IDT_ALIGN * (i - FIRST_EXTERNAL_VECTOR); + entry = irq_entries_start + IDT_ALIGN * (i - FIRST_EXTERNAL_VECTOR_IDT); set_intr_gate(i, entry); } diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 7e125fff45ab..b7511e02959c 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -359,7 +359,7 @@ void fixup_irqs(void) * vector_lock because the cpu is already marked !online, so * nothing else will touch it. 
*/ - for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { + for (vector = first_external_vector; vector < NR_VECTORS; vector++) { if (IS_ERR_OR_NULL(__this_cpu_read(vector_irq[vector]))) continue; diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 36a15df9b5e5..c6e60e888de7 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1541,6 +1541,8 @@ DEFINE_IDTENTRY_IRQ(spurious_interrupt) pr_info("Spurious interrupt (vector 0x%x) on CPU#%d, should never happen.\n", vector, smp_processor_id()); } + +unsigned int first_external_vector = FIRST_EXTERNAL_VECTOR_IDT; #endif /* From patchwork Tue Dec 20 06:36:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077555 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EE7CC4332F for ; Tue, 20 Dec 2022 07:03:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233612AbiLTHDM (ORCPT ); Tue, 20 Dec 2022 02:03:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53568 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233308AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 489DB16488; Mon, 19 Dec 2022 23:01:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519702; x=1703055702; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hhWyyWvCyG2CMXr63ox6YBTeAh5qJwMmT5ESLvyJHRI=; b=UwXjxYWuXnuDpDwDGoqu85wqos44Fpw8qp4ht8kmmTi/kCrZTC4eHyz5 
vsMTDbaslXQ6gtBEMmViDp39vk8DGuKShlPlvzQ/s3wsmFYuLE/dSM6Sn 2VrV5z/fHjiHI/Mn9VMfX+kI6pxXLeja0/yeRbRVkhj0GaMBg9IZBOzOU XUKCOJMEcD5Jo9Z5DFGSbqkSp+7SEQ/GNarJPiH64RCW2+Sduyn1BbWiS 1niyQ46u2XHWmsLkCMa4oFUPqpWw3Vq2+PfjWlq5dgX9QGf170hAumdj7 HcgYjYw8C2ZcgM6vm+NIGUO8LeOhJnyjSoYWg6RlaHUXo2znX3Iis213N w==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972183" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972183" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:20 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326565" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326565" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:20 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 31/32] x86/fred: allow dynamic stack frame size Date: Mon, 19 Dec 2022 22:36:57 -0800 Message-Id: <20221220063658.19271-32-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org A FRED stack frame could contain different amounts of information for different event types, or perhaps even for different instances of the same event type. Thus the code must not rely on any advance knowledge of the stack frame size, so that the frame size can be dynamic. Implement it through: 1) add a new field user_pt_regs to thread_info, and initialize it with a pointer to a virtual pt_regs structure at the top of a thread stack.
2) save a pointer to the user-space pt_regs structure created by fred_entrypoint_user() to user_pt_regs in fred_entry_from_user(). 3) initialize the init_thread_info's user_pt_regs with a pointer to a virtual pt_regs structure at the top of init stack. This approach also works for IDT, thus we unify the code. Suggested-by: H. Peter Anvin (Intel) Signed-off-by: Xin Li --- arch/x86/entry/entry_32.S | 2 +- arch/x86/entry/entry_fred.c | 2 ++ arch/x86/include/asm/entry-common.h | 3 +++ arch/x86/include/asm/processor.h | 12 +++------ arch/x86/include/asm/switch_to.h | 3 +-- arch/x86/include/asm/thread_info.h | 41 ++++------------------------- arch/x86/kernel/head_32.S | 3 +-- arch/x86/kernel/process.c | 5 ++++ kernel/fork.c | 6 +++++ 9 files changed, 27 insertions(+), 50 deletions(-) diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index e309e7156038..d98cc64ca82b 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -1244,7 +1244,7 @@ SYM_CODE_START(rewind_stack_and_make_dead) xorl %ebp, %ebp movl PER_CPU_VAR(cpu_current_top_of_stack), %esi - leal -TOP_OF_KERNEL_STACK_PADDING-PTREGS_SIZE(%esi), %esp + leal -PTREGS_SIZE(%esi), %esp call make_task_dead 1: jmp 1b diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c index 56814ab0b825..140d9110bc39 100644 --- a/arch/x86/entry/entry_fred.c +++ b/arch/x86/entry/entry_fred.c @@ -216,6 +216,8 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs) [EVENT_TYPE_OTHER] = fred_syscall_slow }; + current->thread_info.user_pt_regs = regs; + /* * FRED employs a two-level event dispatch mechanism, with * the first-level on the type of an event and the second-level diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h index 674ed46d3ced..21e1e3ef9e33 100644 --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -12,6 +12,9 @@ /* Check that the stack and regs on entry from user mode are sane. 
*/ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) { + if (!cpu_feature_enabled(X86_FEATURE_FRED)) + current->thread_info.user_pt_regs = regs; + if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) { /* * Make sure that the entry code gave us a sensible EFLAGS diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 67c9d73b31fa..6d573eeea074 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -747,17 +747,11 @@ static inline void spin_lock_prefetch(const void *x) prefetchw(x); } -#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \ - TOP_OF_KERNEL_STACK_PADDING) +#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack)) -#define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1)) +#define task_top_of_stack(task) ((unsigned long)task_stack_page(task) + THREAD_SIZE) -#define task_pt_regs(task) \ -({ \ - unsigned long __ptr = (unsigned long)task_stack_page(task); \ - __ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING; \ - ((struct pt_regs *)__ptr) - 1; \ -}) +#define task_pt_regs(task) ((task)->thread_info.user_pt_regs) #ifdef CONFIG_X86_32 #define INIT_THREAD { \ diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h index c28170d4fbba..8ad5788da416 100644 --- a/arch/x86/include/asm/switch_to.h +++ b/arch/x86/include/asm/switch_to.h @@ -72,8 +72,7 @@ static inline void update_task_stack(struct task_struct *task) this_cpu_write(cpu_tss_rw.x86_tss.sp1, task->thread.sp0); #else if (cpu_feature_enabled(X86_FEATURE_FRED)) { - wrmsrl(MSR_IA32_FRED_RSP0, - task_top_of_stack(task) + TOP_OF_KERNEL_STACK_PADDING); + wrmsrl(MSR_IA32_FRED_RSP0, task_top_of_stack(task)); } else if (static_cpu_has(X86_FEATURE_XENPV)) { /* Xen PV enters the kernel on the thread stack. 
*/ load_sp0(task_top_of_stack(task)); diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index fea0e69fc3d4..9b88b7a04fda 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -13,42 +13,6 @@ #include #include -/* - * TOP_OF_KERNEL_STACK_PADDING is a number of unused bytes that we - * reserve at the top of the kernel stack. We do it because of a nasty - * 32-bit corner case. On x86_32, the hardware stack frame is - * variable-length. Except for vm86 mode, struct pt_regs assumes a - * maximum-length frame. If we enter from CPL 0, the top 8 bytes of - * pt_regs don't actually exist. Ordinarily this doesn't matter, but it - * does in at least one case: - * - * If we take an NMI early enough in SYSENTER, then we can end up with - * pt_regs that extends above sp0. On the way out, in the espfix code, - * we can read the saved SS value, but that value will be above sp0. - * Without this offset, that can result in a page fault. (We are - * careful that, in this case, the value we read doesn't matter.) - * - * In vm86 mode, the hardware frame is much longer still, so add 16 - * bytes to make room for the real-mode segments. - * - * x86-64 has a fixed-length stack frame, but it depends on whether - * or not FRED is enabled. Future versions of FRED might make this - * dynamic, but for now it is always 2 words longer. 
- */ -#ifdef CONFIG_X86_32 -# ifdef CONFIG_VM86 -# define TOP_OF_KERNEL_STACK_PADDING 16 -# else -# define TOP_OF_KERNEL_STACK_PADDING 8 -# endif -#else /* x86-64 */ -# ifdef CONFIG_X86_FRED -# define TOP_OF_KERNEL_STACK_PADDING (2*8) -# else -# define TOP_OF_KERNEL_STACK_PADDING 0 -# endif -#endif - /* * low level task data that entry.S needs immediate access to * - this struct should fit entirely inside of one cache line @@ -56,6 +20,7 @@ */ #ifndef __ASSEMBLY__ struct task_struct; +struct pt_regs; #include #include @@ -66,11 +31,14 @@ struct thread_info { #ifdef CONFIG_SMP u32 cpu; /* current CPU */ #endif + struct pt_regs *user_pt_regs; }; +#define INIT_TASK_PT_REGS ((struct pt_regs *)TOP_OF_INIT_STACK - 1) #define INIT_THREAD_INFO(tsk) \ { \ .flags = 0, \ + .user_pt_regs = INIT_TASK_PT_REGS, \ } #else /* !__ASSEMBLY__ */ @@ -235,6 +203,7 @@ static inline int arch_within_stack_frames(const void * const stack, extern void arch_task_cache_init(void); extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src); +extern void arch_init_user_pt_regs(struct task_struct *tsk); extern void arch_release_task_struct(struct task_struct *tsk); extern void arch_setup_new_exec(void); #define arch_setup_new_exec arch_setup_new_exec diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S index 9b7acc9c7874..8961946f1418 100644 --- a/arch/x86/kernel/head_32.S +++ b/arch/x86/kernel/head_32.S @@ -539,8 +539,7 @@ SYM_DATA_END(initial_page_table) * reliably detect the end of the stack. 
*/ SYM_DATA(initial_stack, - .long init_thread_union + THREAD_SIZE - - SIZEOF_PTREGS - TOP_OF_KERNEL_STACK_PADDING) + .long init_thread_union + THREAD_SIZE - SIZEOF_PTREGS) __INITRODATA int_msg: diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index e436c9c1ef3b..6294d41f7691 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -97,6 +97,11 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) return 0; } +void arch_init_user_pt_regs(struct task_struct *tsk) +{ + tsk->thread_info.user_pt_regs = (struct pt_regs *)task_top_of_stack(tsk)- 1; +} + #ifdef CONFIG_X86_64 void arch_release_task_struct(struct task_struct *tsk) { diff --git a/kernel/fork.c b/kernel/fork.c index 08969f5aa38d..00bd585a4e07 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -948,6 +948,10 @@ int __weak arch_dup_task_struct(struct task_struct *dst, return 0; } +void __weak arch_init_user_pt_regs(struct task_struct *tsk) +{ +} + void set_task_stack_end_magic(struct task_struct *tsk) { unsigned long *stackend; @@ -975,6 +979,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) if (err) goto free_tsk; + arch_init_user_pt_regs(tsk); + #ifdef CONFIG_THREAD_INFO_IN_TASK refcount_set(&tsk->stack_refcount, 1); #endif From patchwork Tue Dec 20 06:36:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13077554 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E684C10F1E for ; Tue, 20 Dec 2022 07:03:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233257AbiLTHDJ (ORCPT ); Tue, 20 Dec 2022 02:03:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53570 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233311AbiLTHBo (ORCPT ); Tue, 20 Dec 2022 02:01:44 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 661B0164B3; Mon, 19 Dec 2022 23:01:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1671519703; x=1703055703; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dG+PmrxecS8Zp0V+8AcHi6OKB+U1bLaxzvtp836DcuY=; b=JZkPLpFGoEmHdbTgHrNgwx1CUPms/OqPSC8Xihf6b/8ZGC1BwZ/L4dnV wQpyDnwJ1vSVnsNMmjJl7WpHWTi3ghhfmT51aTiMS0xyGPLCLa/IWeLss +o60EHLsellGZR4lO7eKfcxLHmQcloP2QEbfnqAquSFGMM6zl3nF14kAc p9YKn9PYJV/maB6O71WiLgNo+xMpFm9WXv1znVS4LPtoXPalK/y+hZrYu UF1TF47Zft1iFEb5eSmEtm1+vRYAT2qbcafpxMeP53/60doRizMbgbiNG iXvtlMP9Mx3wspWXWXq9hbYDO9Jfmr7zQDU5sWGPHJoAYmdInellKWEqT Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="302972194" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="302972194" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2022 23:01:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10566"; a="644326571" X-IronPort-AV: E=Sophos;i="5.96,258,1665471600"; d="scan'208";a="644326571" Received: from unknown (HELO fred..) 
([172.25.112.68]) by orsmga007.jf.intel.com with ESMTP; 19 Dec 2022 23:01:20 -0800 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com Subject: [RFC PATCH 32/32] x86/fred: disable FRED by default in its early stage Date: Mon, 19 Dec 2022 22:36:58 -0800 Message-Id: <20221220063658.19271-33-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221220063658.19271-1-xin3.li@intel.com> References: <20221220063658.19271-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Disable FRED by default in its early stage. To enable FRED, a new kernel command line option "fred" needs to be added. Signed-off-by: Xin Li --- Documentation/admin-guide/kernel-parameters.txt | 4 ++++ arch/x86/kernel/cpu/common.c | 3 +++ 2 files changed, 7 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 42af9ca0127e..0bc76d926dd4 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1506,6 +1506,10 @@ Warning: use of this parameter will taint the kernel and may cause unknown problems. + fred + Forcefully enable flexible return and event delivery, + which is otherwise disabled by default. 
+ ftrace=[tracer] [FTRACE] will set and start the specified tracer as early as possible in order to facilitate early diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 5de68356fe62..1a160337ad41 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1434,6 +1434,9 @@ static void __init cpu_parse_early_param(void) char *argptr = arg, *opt; int arglen, taint = 0; + if (!cmdline_find_option_bool(boot_command_line, "fred")) + setup_clear_cpu_cap(X86_FEATURE_FRED); + #ifdef CONFIG_X86_32 if (cmdline_find_option_bool(boot_command_line, "no387")) #ifdef CONFIG_MATH_EMULATION