From patchwork Mon Mar 27 07:58:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3EEECC761A6 for ; Mon, 27 Mar 2023 08:24:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232978AbjC0IYe (ORCPT ); Mon, 27 Mar 2023 04:24:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232019AbjC0IYc (ORCPT ); Mon, 27 Mar 2023 04:24:32 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 614572D49; Mon, 27 Mar 2023 01:24:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905471; x=1711441471; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CJ5cn83oehVUqkHaIiQkGH139f3gJuf37y85qUnwXYY=; b=A183ZBs83JmI5q9u2KTkcLFnTZh6rMxQXmv4lqdI191STg61+j47pG4w BeAJzwjTgv3LqRmQ+MVncDc/CWWGIl0APK8SkWnZwo14lpo5whQZLjT80 2yEp3h1I2SoZeagDdws2/GirMKR2c9u2EV+KgqKxcrbCMmGsXFsjPXbWq 6mcyAg1OVNVRd8EY82pOAYkfXwODQXRHaCpZhRbqICx3pqJRIhSshw85r cAdelbU5t0QQjBZoN5aJpOxGu8L40oXVSpqGoyiNszqGuwpIxYXkK6hAH q/AHbYya9EL8tlbLm7NMzAD8Ee4+XlQWhkLU62wV6Of8YqoeulSYTq1Ay w==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930092" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930092" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787027" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787027" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:29 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 01/33] x86/traps: let common_interrupt() handle IRQ_MOVE_CLEANUP_VECTOR Date: Mon, 27 Mar 2023 00:58:06 -0700 Message-Id: <20230327075838.5403-2-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" IRQ_MOVE_CLEANUP_VECTOR is the only one of the system IRQ vectors that is *below* FIRST_SYSTEM_VECTOR. It is a slow path, so just push it into common_interrupt() just before the spurious interrupt handling. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/kernel/irq.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 766ffe3ba313..7e125fff45ab 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -248,6 +248,10 @@ DEFINE_IDTENTRY_IRQ(common_interrupt) desc = __this_cpu_read(vector_irq[vector]); if (likely(!IS_ERR_OR_NULL(desc))) { handle_irq(desc, regs); +#ifdef CONFIG_SMP + } else if (vector == IRQ_MOVE_CLEANUP_VECTOR) { + sysvec_irq_move_cleanup(regs); +#endif } else { ack_APIC_irq(); From patchwork Mon Mar 27 07:58:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188728 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5317EC77B61 for ; Mon, 27 Mar 2023 08:24:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233030AbjC0IYg (ORCPT ); Mon, 27 Mar 2023 04:24:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59572 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232345AbjC0IYc (ORCPT ); Mon, 27 Mar 2023 04:24:32 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A17B2D57; Mon, 27 Mar 2023 01:24:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905471; x=1711441471; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Nl6wMwkpqmIYYyq7+MpcXBB5xF17anU3m57tPiay76c=; b=oHip5YYqgHVdRLtplvrXazOcS7wTP7e0N2o2DLRqyrqJgKZiRmMo5He2 HyebUHGgJYqF/xpOroteFY+Ly3Vlcsp/VqPl4yDcFwnhqqUbnagmCd5+g fWKt/QciLnGEQpFe9X0xb3yaE506rQ2/33Woe2uev8dYQbjdsi47OAtix qa4lqSYCrmp6aAGti4zUY76qJiXPICQyft5TxBTRm/mev16gjF79CUEjG tgGYk2RbuFCF087Vd89Nb1/4u2GuW+ybw92y7rdMs9lgYOqCJg9+G7m5E 2Hz/Dzhm1v+hAaKDypTl9q/oRSEeXcKK7AcoreXT59ZCJ0e0gjqAq2W5X Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930093" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930093" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787032" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787032" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:29 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 02/33] x86/fred: make unions for the cs and ss fields in struct pt_regs Date: Mon, 27 Mar 2023 00:58:07 -0700 Message-Id: <20230327075838.5403-3-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Make the cs and ss fields in struct pt_regs unions between the actual selector and the unsigned long stack slot. FRED uses this space to store additional flags. The printk changes are simply due to the cs and ss fields changed to unsigned short from unsigned long. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v3: * Rename csl/ssl of the pt_regs structure to csx/ssx (x for extended) (Andrew Cooper). --- arch/x86/entry/vsyscall/vsyscall_64.c | 2 +- arch/x86/include/asm/ptrace.h | 34 ++++++++++++++++++++++++--- arch/x86/kernel/process_64.c | 2 +- 3 files changed, 33 insertions(+), 5 deletions(-) diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index d234ca797e4a..2429ad0df068 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -76,7 +76,7 @@ static void warn_bad_vsyscall(const char *level, struct pt_regs *regs, if (!show_unhandled_signals) return; - printk_ratelimited("%s%s[%d] %s ip:%lx cs:%lx sp:%lx ax:%lx si:%lx di:%lx\n", + printk_ratelimited("%s%s[%d] %s ip:%lx cs:%x sp:%lx ax:%lx si:%lx di:%lx\n", level, current->comm, task_pid_nr(current), message, regs->ip, regs->cs, regs->sp, regs->ax, regs->si, regs->di); diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index f4db78b09c8f..2abb23e6c1e2 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -82,12 +82,40 @@ struct pt_regs { * On hw interrupt, it's IRQ number: */ unsigned long orig_ax; -/* Return frame for iretq */ +/* Return frame for iretq/eretu/erets */ unsigned long ip; - unsigned long cs; + union { + unsigned long csx; /* cs extended: CS + any fields above it */ + struct __attribute__((__packed__)) { + unsigned short cs; /* CS selector proper */ + unsigned int current_stack_level: 2; + unsigned int __csx_resv1 : 6; + unsigned int interrupt_shadowed : 1; + unsigned int software_initiated : 1; + unsigned int __csx_resv2 : 2; + unsigned int nmi : 1; + unsigned int __csx_resv3 : 3; + unsigned int __csx_resv4 : 32; + }; + }; unsigned long flags; unsigned long sp; - unsigned long ss; + union { + unsigned long ssx; /* ss extended: SS + any fields above it */ + struct __attribute__((__packed__)) { + unsigned short ss; /* SS selector proper */ + unsigned int __ssx_resv1 : 16; + unsigned int vector : 8; + unsigned int __ssx_resv2 : 8; + unsigned int type : 4; + unsigned int __ssx_resv3 : 4; + unsigned int enclv : 1; + unsigned int long_mode : 1; + unsigned int nested : 1; + unsigned int __ssx_resv4 : 1; + unsigned int instr_len : 4; + }; + }; /* top of stack page */ }; diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index bb65a68b4b49..a1aa74864c8b 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -116,7 +116,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode, printk("%sFS: %016lx(%04x) GS:%016lx(%04x) knlGS:%016lx\n", log_lvl, fs, fsindex, gs, gsindex, shadowgs); - printk("%sCS: %04lx DS: %04x ES: %04x CR0: %016lx\n", + printk("%sCS: %04x DS: %04x ES: %04x CR0: %016lx\n", log_lvl, regs->cs, ds, es, cr0); printk("%sCR2: %016lx CR3: %016lx CR4: %016lx\n", log_lvl, cr2, cr3, cr4); From patchwork Mon Mar 27 07:58:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188730 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3B85C7619A for ; Mon, 27 Mar 2023 08:24:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233057AbjC0IYk (ORCPT ); Mon, 27 Mar 2023 04:24:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232701AbjC0IYd (ORCPT ); Mon, 27 Mar 2023 04:24:33 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 07E3F2D63; Mon, 27 Mar 2023 01:24:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905472; x=1711441472; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z5qENiAMJjZqe4SfagAzXU8H0qxh2oDTfz+gWyWVxMM=; b=RZZiL0GHqqWe/ERmCctpD4G+l91MzjZcpkvj9vQc+r/OiKAbW0r/YJFu 51qqoyTEMPVNdV2m9W5+n3lA2XbasQLAgjBa2PkPq+GAr9iWxNa50Buch Xk+w6igASj2tzfM9AR9djGSHenAf25XzjjYo7FFBoms9EEeav6JOFXsrn pOpYhHF1m8cCxgzyhxOxAQU07HkhS2Vatx08b2K4OnQ6Suw7AsTYuGkiP 0O5TpHVbJMbGpbJegpTnvZm454UTYpRM1mkD8jYBc6gjM8B9e95ie4VA5 Hw7l4awC67Lmlx+Gf4rHdltTUIN3AbcX2fZDpUjCJCcvSew+707gwRnXd Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930104" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930104" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787035" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787035" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:30 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 03/33] x86/traps: add a system interrupt table for system interrupt dispatch Date: Mon, 27 Mar 2023 00:58:08 -0700 Message-Id: <20230327075838.5403-4-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" On x86, external interrupts are divided into the following two groups 1) system interrupts 2) external device interrupts External device interrupts are all routed to common_interrupt(), which dispatches external device interrupts through a per-CPU external interrupt dispatch table vector_irq. For system interrupts, add a system interrupt handler table for dispatching a system interrupt to its corresponding handler directly. Thus a software based dispatch function will be: void external_interrupt(struct pt_regs *regs) { u8 vector = regs->vector; if (is_system_interrupt(vector)) system_interrupt_handlers[vector_to_sysvec(vector)](regs); else /* external device interrupt */ common_interrupt(regs); } Signed-off-by: H. Peter Anvin (Intel) Co-developed-by: Xin Li Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/idtentry.h | 64 +++++++++++++++++++++++++++------ arch/x86/include/asm/traps.h | 7 ++++ arch/x86/kernel/traps.c | 62 ++++++++++++++++++++++++++++++++ 3 files changed, 122 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index b241af4ce9b4..2876ddae02bc 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -167,17 +167,22 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code) /** * DECLARE_IDTENTRY_IRQ - Declare functions for device interrupt IDT entry - * points (common/spurious) + * points (common/spurious) and their corresponding + * software based dispatch handlers in the non-noinstr + * text section * @vector: Vector number (ignored for C) * @func: Function name of the entry point * - * Maps to DECLARE_IDTENTRY_ERRORCODE() + * Maps to DECLARE_IDTENTRY_ERRORCODE(), plus a dispatch function prototype */ #define DECLARE_IDTENTRY_IRQ(vector, func) \ - DECLARE_IDTENTRY_ERRORCODE(vector, func) + DECLARE_IDTENTRY_ERRORCODE(vector, func); \ + void dispatch_##func(struct pt_regs *regs, unsigned long error_code) /** * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points + * and their corresponding software based dispatch + * handlers in the non-noinstr text section * @func: Function name of the entry point * * The vector number is pushed by the low level entry stub and handed @@ -187,6 +192,9 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code) * irq_enter/exit_rcu() are invoked before the function body and the * KVM L1D flush request is set. Stack switching to the interrupt stack * has to be done in the function body if necessary. + * + * dispatch_func() is a software based dispatch handler in the non-noinstr + * text section. */ #define DEFINE_IDTENTRY_IRQ(func) \ static void __##func(struct pt_regs *regs, u32 vector); \ @@ -204,31 +212,48 @@ __visible noinstr void func(struct pt_regs *regs, \ irqentry_exit(regs, state); \ } \ \ +void dispatch_##func(struct pt_regs *regs, unsigned long error_code) \ +{ \ + u32 vector = (u32)(u8)error_code; \ + \ + kvm_set_cpu_l1tf_flush_l1d(); \ + run_irq_on_irqstack_cond(__##func, regs, vector); \ +} \ + \ static noinline void __##func(struct pt_regs *regs, u32 vector) /** * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points + * and their corresponding software based dispatch + * handlers in the non-noinstr text section * @vector: Vector number (ignored for C) * @func: Function name of the entry point * - * Declares three functions: + * Declares four functions: * - The ASM entry point: asm_##func * - The XEN PV trap entry point: xen_##func (maybe unused) * - The C handler called from the ASM entry point + * - The C handler used in the system interrupt handler table * - * Maps to DECLARE_IDTENTRY(). + * Maps to DECLARE_IDTENTRY(), plus a dispatch table function prototype */ #define DECLARE_IDTENTRY_SYSVEC(vector, func) \ - DECLARE_IDTENTRY(vector, func) + DECLARE_IDTENTRY(vector, func); \ + void dispatch_table_##func(struct pt_regs *regs) /** * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points + * and their corresponding software based dispatch + * handlers in the non-noinstr text section * @func: Function name of the entry point * * irqentry_enter/exit() and irq_enter/exit_rcu() are invoked before the * function body. KVM L1D flush request is set. * - * Runs the function on the interrupt stack if the entry hit kernel mode + * Runs the function on the interrupt stack if the entry hit kernel mode. + * + * dispatch_table_func() is used in the system interrupt handler table for + * system interrupts dispatching. */ #define DEFINE_IDTENTRY_SYSVEC(func) \ static void __##func(struct pt_regs *regs); \ @@ -244,11 +269,19 @@ __visible noinstr void func(struct pt_regs *regs) \ irqentry_exit(regs, state); \ } \ \ +void dispatch_table_##func(struct pt_regs *regs) \ +{ \ + kvm_set_cpu_l1tf_flush_l1d(); \ + run_sysvec_on_irqstack_cond(__##func, regs); \ +} \ + \ static noinline void __##func(struct pt_regs *regs) /** * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT - * entry points + * entry points and their corresponding + * software based dispatch handlers in + * the non-noinstr text section * @func: Function name of the entry point * * Runs the function on the interrupted stack. No switch to IRQ stack and @@ -256,6 +289,9 @@ static noinline void __##func(struct pt_regs *regs) * * Only use for 'empty' vectors like reschedule IPI and KVM posted * interrupt vectors. + * + * dispatch_table_func() is used in the system interrupt handler table for + * system interrupts dispatching. */ #define DEFINE_IDTENTRY_SYSVEC_SIMPLE(func) \ static __always_inline void __##func(struct pt_regs *regs); \ @@ -273,6 +309,14 @@ __visible noinstr void func(struct pt_regs *regs) \ irqentry_exit(regs, state); \ } \ \ +void dispatch_table_##func(struct pt_regs *regs) \ +{ \ + __irq_enter_raw(); \ + kvm_set_cpu_l1tf_flush_l1d(); \ + __##func (regs); \ + __irq_exit_raw(); \ +} \ + \ static __always_inline void __##func(struct pt_regs *regs) /** @@ -634,9 +678,7 @@ DECLARE_IDTENTRY(X86_TRAP_VE, exc_virtualization_exception); /* Device interrupts common/spurious */ DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, common_interrupt); -#ifdef CONFIG_X86_LOCAL_APIC DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, spurious_interrupt); -#endif /* System vector entry points */ #ifdef CONFIG_X86_LOCAL_APIC @@ -647,7 +689,7 @@ DECLARE_IDTENTRY_SYSVEC(X86_PLATFORM_IPI_VECTOR, sysvec_x86_platform_ipi); #endif #ifdef CONFIG_SMP -DECLARE_IDTENTRY(RESCHEDULE_VECTOR, sysvec_reschedule_ipi); +DECLARE_IDTENTRY_SYSVEC(RESCHEDULE_VECTOR, sysvec_reschedule_ipi); DECLARE_IDTENTRY_SYSVEC(IRQ_MOVE_CLEANUP_VECTOR, sysvec_irq_move_cleanup); DECLARE_IDTENTRY_SYSVEC(REBOOT_VECTOR, sysvec_reboot); DECLARE_IDTENTRY_SYSVEC(CALL_FUNCTION_SINGLE_VECTOR, sysvec_call_function_single); diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 47ecfff2c83d..28c8ba5fd81c 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -47,4 +47,11 @@ void __noreturn handle_stack_overflow(struct pt_regs *regs, struct stack_info *info); #endif +/* + * How system interrupt handlers are called. + */ +#define DECLARE_SYSTEM_INTERRUPT_HANDLER(f) \ + void f (struct pt_regs *regs) +typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); + #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index d317dc3d06a3..2cbe7e7e8b96 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1451,6 +1451,68 @@ DEFINE_IDTENTRY_SW(iret_error) } #endif +#ifdef CONFIG_X86_64 + +#ifndef CONFIG_X86_LOCAL_APIC +/* + * Used when local APIC is not configured to build into the kernel, but + * dispatch_table_spurious_interrupt() needs dispatch_spurious_interrupt(). + */ +DEFINE_IDTENTRY_IRQ(spurious_interrupt) +{ + pr_info("Spurious interrupt (vector 0x%x) on CPU#%d, should never happen.\n", + vector, smp_processor_id()); +} +#endif + +static void dispatch_table_spurious_interrupt(struct pt_regs *regs) +{ + dispatch_spurious_interrupt(regs, regs->vector); +} + +#define SYSV(x,y) [(x) - FIRST_SYSTEM_VECTOR] = y + +static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { + [0 ... NR_SYSTEM_VECTORS-1] = dispatch_table_spurious_interrupt, +#ifdef CONFIG_SMP + SYSV(RESCHEDULE_VECTOR, dispatch_table_sysvec_reschedule_ipi), + SYSV(CALL_FUNCTION_VECTOR, dispatch_table_sysvec_call_function), + SYSV(CALL_FUNCTION_SINGLE_VECTOR, dispatch_table_sysvec_call_function_single), + SYSV(REBOOT_VECTOR, dispatch_table_sysvec_reboot), +#endif + +#ifdef CONFIG_X86_THERMAL_VECTOR + SYSV(THERMAL_APIC_VECTOR, dispatch_table_sysvec_thermal), +#endif + +#ifdef CONFIG_X86_MCE_THRESHOLD + SYSV(THRESHOLD_APIC_VECTOR, dispatch_table_sysvec_threshold), +#endif + +#ifdef CONFIG_X86_MCE_AMD + SYSV(DEFERRED_ERROR_VECTOR, dispatch_table_sysvec_deferred_error), +#endif + +#ifdef CONFIG_X86_LOCAL_APIC + SYSV(LOCAL_TIMER_VECTOR, dispatch_table_sysvec_apic_timer_interrupt), + SYSV(X86_PLATFORM_IPI_VECTOR, dispatch_table_sysvec_x86_platform_ipi), +# ifdef CONFIG_HAVE_KVM + SYSV(POSTED_INTR_VECTOR, dispatch_table_sysvec_kvm_posted_intr_ipi), + SYSV(POSTED_INTR_WAKEUP_VECTOR, dispatch_table_sysvec_kvm_posted_intr_wakeup_ipi), + SYSV(POSTED_INTR_NESTED_VECTOR, dispatch_table_sysvec_kvm_posted_intr_nested_ipi), +# endif +# ifdef CONFIG_IRQ_WORK + SYSV(IRQ_WORK_VECTOR, dispatch_table_sysvec_irq_work), +# endif + SYSV(SPURIOUS_APIC_VECTOR, dispatch_table_sysvec_spurious_apic_interrupt), + SYSV(ERROR_APIC_VECTOR, dispatch_table_sysvec_error_interrupt), +#endif +}; + +#undef SYSV + +#endif /* CONFIG_X86_64 */ + void __init trap_init(void) { /* Init cpu_entry_area before IST entries are set up */ From patchwork Mon Mar 27 07:58:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F27EC761A6 for ; Mon, 27 Mar 2023 08:24:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233050AbjC0IYh (ORCPT ); Mon, 27 Mar 2023 04:24:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59604 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232797AbjC0IYd (ORCPT ); Mon, 27 Mar 2023 04:24:33 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E549273A; Mon, 27 Mar 2023 01:24:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905472; x=1711441472; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/O/qXmHKArMjAWeJZsFCxyoE2+USZF2ezTlQhvzsStE=; b=VD2/hyTDjbaRSd1c0VZ98zPJYCFNup19n9CE1sulY3kXZSiU1yAj6pSA PMKP0RAW7ueKZgYnS2r/S/4mIpAl4lJ4XgWGOXYk5VQ1Vuhkyodd+tHl5 XlgBVdRTZ4pdw7cBSvKlB3cWlUHJRYqXRumbuzBSraXYQd1DTZtGvClSa pJAo6WRM1FlPar2UfgePcWEKXq5x52JKiICJqH2fZ03V8bMWiRk1XY47w Jvq8QTepNFp206RpJzIqOBD+pHp8KweBHPB3mOZtN71YedU4y+OQz1k3S lZakRtJBrW+fovpNlrbhzSgT9M9u1g+7fVlLfcbIQgUF7jhKkRARb/u/M Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930115" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930115" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787040" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787040" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:30 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 04/33] x86/traps: add install_system_interrupt_handler() Date: Mon, 27 Mar 2023 00:58:09 -0700 Message-Id: <20230327075838.5403-5-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Some kernel components install system interrupt handlers into the IDT, and we need to do the same for system_interrupt_handlers. A new function install_system_interrupt_handler() is added to install a system interrupt handler into both the IDT and system_interrupt_handlers. Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/traps.h | 2 ++ arch/x86/kernel/cpu/acrn.c | 7 +++++-- arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++-------- arch/x86/kernel/kvm.c | 4 +++- arch/x86/kernel/traps.c | 10 ++++++++++ drivers/xen/events/events_base.c | 5 ++++- 6 files changed, 38 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 28c8ba5fd81c..46f5e4e2a346 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -41,6 +41,8 @@ void math_emulate(struct math_emu_info *); bool fault_in_kernel_space(unsigned long address); +void install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr); + #ifdef CONFIG_VMAP_STACK void __noreturn handle_stack_overflow(struct pt_regs *regs, unsigned long fault_address, diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c index 485441b7f030..9351bf183a9e 100644 --- a/arch/x86/kernel/cpu/acrn.c +++ b/arch/x86/kernel/cpu/acrn.c @@ -18,6 +18,7 @@ #include #include #include +#include static u32 __init acrn_detect(void) { @@ -26,8 +27,10 @@ static u32 __init acrn_detect(void) static void __init acrn_init_platform(void) { - /* Setup the IDT for ACRN hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_acrn_hv_callback); + /* Install system interrupt handler for ACRN hypervisor callback */ + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_acrn_hv_callback, + sysvec_acrn_hv_callback); x86_platform.calibrate_tsc = acrn_get_tsc_khz; x86_platform.calibrate_cpu = acrn_get_tsc_khz; diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index f36dc2f796c5..63282f4bfdcd 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -487,19 +488,24 @@ static void __init ms_hyperv_init_platform(void) */ x86_platform.apic_post_init = hyperv_init; hyperv_setup_mmu_ops(); - /* Setup the IDT for hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_hyperv_callback); - /* Setup the IDT for reenlightenment notifications */ + /* Install system interrupt handler for hypervisor callback */ + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_hyperv_callback, + sysvec_hyperv_callback); + + /* Install system interrupt handler for reenlightenment notifications */ if (ms_hyperv.features & HV_ACCESS_REENLIGHTENMENT) { - alloc_intr_gate(HYPERV_REENLIGHTENMENT_VECTOR, - asm_sysvec_hyperv_reenlightenment); + install_system_interrupt_handler(HYPERV_REENLIGHTENMENT_VECTOR, + asm_sysvec_hyperv_reenlightenment, + sysvec_hyperv_reenlightenment); } - /* Setup the IDT for stimer0 */ + /* Install system interrupt handler for stimer0 */ if (ms_hyperv.misc_features & HV_STIMER_DIRECT_MODE_AVAILABLE) { - alloc_intr_gate(HYPERV_STIMER0_VECTOR, - asm_sysvec_hyperv_stimer0); + install_system_interrupt_handler(HYPERV_STIMER0_VECTOR, + asm_sysvec_hyperv_stimer0, + sysvec_hyperv_stimer0); } # ifdef CONFIG_SMP diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 1cceac5984da..5c684df6de7a 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -829,7 +829,9 @@ static void __init kvm_guest_init(void) if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT) && kvmapf) { static_branch_enable(&kvm_async_pf_enabled); - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_kvm_asyncpf_interrupt); + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_kvm_asyncpf_interrupt, + sysvec_kvm_asyncpf_interrupt); } #ifdef CONFIG_SMP diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 2cbe7e7e8b96..12072e2af4a6 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1513,6 +1513,16 @@ static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { #endif /* CONFIG_X86_64 */ +void __init install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr) +{ + BUG_ON(n < FIRST_SYSTEM_VECTOR); + +#ifdef CONFIG_X86_64 + system_interrupt_handlers[n - FIRST_SYSTEM_VECTOR] = (system_interrupt_handler)addr; +#endif + alloc_intr_gate(n, asm_addr); +} + void __init trap_init(void) { /* Init cpu_entry_area before IST entries are set up */ diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c7715f8bd452..cf1a5ca3bf62 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include #endif @@ -2249,7 +2250,9 @@ static __init void xen_alloc_callback_vector(void) return; pr_info("Xen HVM callback vector for event delivery is enabled\n"); - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, asm_sysvec_xen_hvm_callback); + install_system_interrupt_handler(HYPERVISOR_CALLBACK_VECTOR, + asm_sysvec_xen_hvm_callback, + sysvec_xen_hvm_callback); } #else void xen_setup_callback_vector(void) {} From patchwork Mon Mar 27 07:58:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188731 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CE21C761AF for ; Mon, 27 Mar 2023 08:24:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233064AbjC0IYm (ORCPT ); Mon, 27 Mar 2023 04:24:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232893AbjC0IYd (ORCPT ); Mon, 27 Mar 2023 04:24:33 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F57C2D49; Mon, 27 Mar 2023 01:24:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905472; x=1711441472; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BCbf0Vj9ZyZPtbZTSgD/ohjbP2Y9NHqyEC0o1Sj/JgM=; b=DfounY/Lqz2ECExkXBI9AFNS1uXRY6W7VCx4p0APrkfWFo7w5XoWvhEe yc+SG98lCJ1BZYGxbLG2v9hIY5bZs26HAErU1qo79Pu77H5eVcGQXjjmd SfDIlUVJk/FltwhDibNDgY3pKIEytk0gcNXpkB+ly226bU4ciwZRDyU2m puESNzkfnURJqDdD83BidDO/FpLoWw5KOziadaL6mTG3H/Ahe5zoVHbbe 08FXSGIlXW7ww5whWlsQVZ0qBPtq7kRVe6wVqetqqCeYMj9FBk288VmbO eY7v2jIB51u50GwA5J4yWNhiQbOhOFYfvw/MuPLleaDykrBVpYwVFxTkL g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930126" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930126" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787045" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787045" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:31 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 05/33] x86/traps: add external_interrupt() to dispatch external interrupts Date: Mon, 27 Mar 2023 00:58:10 -0700 Message-Id: <20230327075838.5403-6-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add external_interrupt() to dispatch external interrupts to their handlers. If an external interrupt is a system interrupt, dipatch it through system_interrupt_handlers table, otherwise to dispatch_common_interrupt(). Signed-off-by: H. Peter Anvin (Intel) Co-developed-by: Xin Li Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v5: * Initialize system_interrupt_handlers with dispatch_table_spurious_interrupt() instead of NULL to get rid of a branch (Peter Zijlstra). --- arch/x86/kernel/traps.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 12072e2af4a6..f86cd233b00b 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1511,6 +1511,32 @@ static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { #undef SYSV +/* + * External interrupt dispatch function. + * + * Until/unless dispatch_common_interrupt() can be taught to deal with the + * special system vectors, split the dispatch. + * + * Note: dispatch_common_interrupt() already deals with IRQ_MOVE_CLEANUP_VECTOR. + */ +int external_interrupt(struct pt_regs *regs) +{ + unsigned int vector = regs->vector; + unsigned int sysvec = vector - FIRST_SYSTEM_VECTOR; + + if (vector < FIRST_EXTERNAL_VECTOR) { + pr_err("invalid external interrupt vector %d\n", vector); + return -EINVAL; + } + + if (sysvec < NR_SYSTEM_VECTORS) + system_interrupt_handlers[sysvec](regs); + else + dispatch_common_interrupt(regs, vector); + + return 0; +} + #endif /* CONFIG_X86_64 */ void __init install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr) From patchwork Mon Mar 27 07:58:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188732 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF447C761A6 for ; Mon, 27 Mar 2023 08:24:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233076AbjC0IYn (ORCPT ); Mon, 27 Mar 2023 04:24:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59624 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232938AbjC0IYe (ORCPT ); Mon, 27 Mar 2023 04:24:34 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F35F2D57; Mon, 27 Mar 2023 01:24:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905473; x=1711441473; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=g68cyMdX2debPTjgjq16HL+D6AhwC/hj42OGPtnTvek=; b=m6CNPt6IPMO+2Wpk8wDqph8p+THn6N3vu//8K7kuPiqBrOsVhTxzer9T X8qU5ILzWATOX2jw9I4NEuryoAlb+P0uGyStlGCi+VVtVMviIhJAE/XpC m3lEXNjVoreC185e2xtAZtrm6HJGQvPkTIbYcSqM1Ui6CIZ0TpHR9A3xv UIHUJ7PnYb9O7zb86BlY9bJF9mcc2i7d+55TT6SuQAcIF691QDNMDIJBB AmvvPcfSiNMk10X5LU8i+FnrHsDg0WiAq0795SPN+TiJKMkqUOi5UTr/7 KCLRLqiHfvL++utVIoZT790E73sc2K+WbHC2jiXfPHx686iyCyZsoXIix w==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930137" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930137" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787049" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787049" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:31 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 06/33] x86/cpufeature: add the cpu feature bit for FRED Date: Mon, 27 Mar 2023 00:58:11 -0700 Message-Id: <20230327075838.5403-7-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add the CPU feature bit for FRED (Flexible Return and Event Delivery). The Intel flexible return and event delivery (FRED) architecture defines simple new transitions that change privilege level (ring transitions). The FRED architecture was designed with the following goals: 1) Improve overall performance and response time by replacing event delivery through the interrupt descriptor table (IDT event delivery) and event return by the IRET instruction with lower latency transitions. 2) Improve software robustness by ensuring that event delivery establishes the full supervisor context and that event return establishes the full user context. The new transitions defined by the FRED architecture are FRED event delivery and, for returning from events, two FRED return instructions. FRED event delivery can effect a transition from ring 3 to ring 0, but it is used also to deliver events incident to ring 0. One FRED instruction (ERETU) effects a return from ring 0 to ring 3, while the other (ERETS) returns while remaining in ring 0. Search for the latest FRED spec in most search engines with this search pattern: site:intel.com FRED (flexible return and event delivery) specification Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/cpufeatures.h | 1 + tools/arch/x86/include/asm/cpufeatures.h | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 73c9672c123b..1fa444478d33 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -318,6 +318,7 @@ #define X86_FEATURE_FZRM (12*32+10) /* "" Fast zero-length REP MOVSB */ #define X86_FEATURE_FSRS (12*32+11) /* "" Fast short REP STOSB */ #define X86_FEATURE_FSRC (12*32+12) /* "" Fast short REP {CMPSB,SCASB} */ +#define X86_FEATURE_FRED (12*32+17) /* Flexible Return and Event Delivery */ #define X86_FEATURE_LKGS (12*32+18) /* "" Load "kernel" (userspace) GS */ #define X86_FEATURE_AMX_FP16 (12*32+21) /* "" AMX fp16 Support */ #define X86_FEATURE_AVX_IFMA (12*32+23) /* "" Support for VPMADD52[H,L]UQ */ diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index b89005819cd5..e9064f4a011a 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -312,6 +312,7 @@ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */ #define X86_FEATURE_CMPCCXADD (12*32+ 7) /* "" CMPccXADD instructions */ +#define X86_FEATURE_FRED (12*32+17) /* Flexible Return and Event Delivery */ #define X86_FEATURE_LKGS (12*32+18) /* "" Load "kernel" (userspace) GS */ #define X86_FEATURE_AMX_FP16 (12*32+21) /* "" AMX fp16 Support */ #define X86_FEATURE_AVX_IFMA (12*32+23) /* "" Support for VPMADD52[H,L]UQ */ From patchwork Mon Mar 27 07:58:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188734 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 493ABC76195 for ; Mon, 27 Mar 2023 08:25:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233085AbjC0IYt (ORCPT ); Mon, 27 Mar 2023 04:24:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232985AbjC0IYf (ORCPT ); Mon, 27 Mar 2023 04:24:35 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76F692D6B; Mon, 27 Mar 2023 01:24:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905473; x=1711441473; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9JZohEEZ3OOYA+uz2zlIPReipLAVM4kv9u7LWFHKKew=; b=C9krPeJ7yDNs8cmsXdTzBQ+G8QdUOPL1007L4L8UJY2AmupdXNEs3z7V LjCwBVzW2FWnA0pXj57IJ+Lscx6W6RBCj96Cg89WIwGmjNYs2xf4QYnLL 3L0b0qg52L4vvFlGJVD0ENDH0+CnqP4Uqs7YNpJXUJR3A7wYIP4MHVa73 JE4dW2Ld4ZDtHMM8Nqu4I+kSv5YAgfWtU+GUmCqIkcQBwCzz6jGDv7SvE vrnx4sXZqyYUtPaiSYBmdxqISdineYr20xn4C7YmVDk4Y2lFn6OsJ1Jxd qDH9mYYydHCYCnOjQQ5LDuJDXTg1Yrk7F09NDMk1ok1t3btkTGZNNJlc6 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930148" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930148" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787052" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787052" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:31 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 07/33] x86/opcode: add ERETU, ERETS instructions to x86-opcode-map Date: Mon, 27 Mar 2023 00:58:12 -0700 Message-Id: <20230327075838.5403-8-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add the instruction opcodes used by FRED: ERETU, ERETS. Opcode number is per public FRED draft spec v3.0. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/lib/x86-opcode-map.txt | 2 +- tools/arch/x86/lib/x86-opcode-map.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 5168ee0360b2..7a269e269dc0 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -1052,7 +1052,7 @@ EndTable GrpTable: Grp7 0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B) 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B) 3: LIDT Ms 4: SMSW Mw/Rv diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 5168ee0360b2..7a269e269dc0 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -1052,7 +1052,7 @@ EndTable GrpTable: Grp7 0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B) 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B) 3: LIDT Ms 4: SMSW Mw/Rv From patchwork Mon Mar 27 07:58:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE640C761AF for ; Mon, 27 Mar 2023 08:24:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232836AbjC0IYr (ORCPT ); Mon, 27 Mar 2023 04:24:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232999AbjC0IYf (ORCPT ); Mon, 27 Mar 2023 04:24:35 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B63432D63; Mon, 27 Mar 2023 01:24:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905473; x=1711441473; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4lCYxmSNxB+oTxZx5ZOjZmvc7OZS0xLkR1RLWDKH14M=; b=Y4pZbYuQEnePmFsRAnGKw6Ek/Iogi1dP8YxEgpemugKr6X2D40NRiXdo 54J0iwE2prgF1Z/EPG8zHjvhltGZI/rBq7NhuidbKon6bGaRhKZykVBjT RRSvqofXHyxA15fMMDNf95nNgo7/nCko43P64RvzKFUaUf3bWt6cxbznO 49EaW0IWy3EjonJk1v7UZRtevH8YuICzmicyURyM29gIxOjT1iKhb+pXQ wkHThgwdl35ARYeqqlo0mxm8H2yCycU8A03b+inr4bLBa9GiPLRGjBM5n XYCfwxFdkBGqk5qsV6nVTpML6tGjLVee5jooDAu0F+WTD2eI9q5TVsxCw w==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930158" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930158" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787056" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787056" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:32 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 08/33] x86/objtool: teach objtool about ERETU and ERETS Date: Mon, 27 Mar 2023 00:58:13 -0700 Message-Id: <20230327075838.5403-9-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Update the objtool decoder to know about the ERETU and ERETS instructions (type INSN_CONTEXT_SWITCH.) Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- tools/objtool/arch/x86/decode.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 9ef024fd648c..8e9c802f78ec 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -509,11 +509,20 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec if (op2 == 0x01) { - if (modrm == 0xca) - insn->type = INSN_CLAC; - else if (modrm == 0xcb) - insn->type = INSN_STAC; - + switch (insn_last_prefix_id(&ins)) { + case INAT_PFX_REPE: + case INAT_PFX_REPNE: + if (modrm == 0xca) + /* eretu/erets */ + insn->type = INSN_CONTEXT_SWITCH; + break; + default: + if (modrm == 0xca) + insn->type = INSN_CLAC; + else if (modrm == 0xcb) + insn->type = INSN_STAC; + break; + } } else if (op2 >= 0x80 && op2 <= 0x8f) { insn->type = INSN_JUMP_CONDITIONAL; From patchwork Mon Mar 27 07:58:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188735 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4F8FC761AF for ; Mon, 27 Mar 2023 08:25:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233094AbjC0IYw (ORCPT ); Mon, 27 Mar 2023 04:24:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59672 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233004AbjC0IYf (ORCPT ); Mon, 27 Mar 2023 04:24:35 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6BFC3A91; Mon, 27 Mar 2023 01:24:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905474; x=1711441474; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xePdNXHU682aUfMVCFmcmPRxw9pVvyjrZ2d9bkzRB0U=; b=HC2p1kBcGnUg0XxAGVB3fZ9z+bvwTw3FFOZu+KvQ7tTUREHWQEVPw/Wq TBzxcQMBQ1F+1uc58hwhmw5av2x0/d/E5DHcbPb4qux1NuVcqPaFgf9pC eWjzkRP/g4Ksvh8zMp9+IKVvsnBbjc9JC8SRtq9JZL6z67z9YR9un8KzH Lvg9BdPRjFMQAjccnP5AzaXVQaHE9tDZ6oLrS+FBaf8rOd2nFV9lzUboa 2HmGw2/v0NyvpS4/4cUnQGWWs3j3p/zU7BnF9B+eZiMKntcu0Rl0plbde Ep0LWD/3fwQi3XATA2Nh0dUhkJqNjLglDLmNqmLbDftmGUz+b27MZ4cuk g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930170" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930170" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787060" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787060" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:32 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 09/33] x86/cpu: add X86_CR4_FRED macro Date: Mon, 27 Mar 2023 00:58:14 -0700 Message-Id: <20230327075838.5403-10-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add X86_CR4_FRED macro for the FRED bit in %cr4. This bit should be a pinned bit, not to be changed after initialization. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/uapi/asm/processor-flags.h | 2 ++ arch/x86/kernel/cpu/common.c | 11 ++++++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index c47cc7f2feeb..a90933f1ff41 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -132,6 +132,8 @@ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) #define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */ #define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) +#define X86_CR4_FRED_BIT 32 /* enable FRED kernel entry */ +#define X86_CR4_FRED _BITULL(X86_CR4_FRED_BIT) /* * x86-64 Task Priority Register, CR8 diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 8cd4126d8253..e8cf6f4cfb52 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -412,10 +412,15 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c) cr4_clear_bits(X86_CR4_UMIP); } -/* These bits should not change their value after CPU init is finished. */ +/* + * These bits should not change their value after CPU init is finished. + * The explicit cast to unsigned long suppresses a warning on i386 for + * x86-64 only feature bits >= 32. + */ static const unsigned long cr4_pinned_mask = - X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | - X86_CR4_FSGSBASE | X86_CR4_CET; + (unsigned long) + (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | + X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED); static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); static unsigned long cr4_pinned_bits __ro_after_init; From patchwork Mon Mar 27 07:58:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188737 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE058C761A6 for ; Mon, 27 Mar 2023 08:25:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233089AbjC0IZG (ORCPT ); Mon, 27 Mar 2023 04:25:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233026AbjC0IYg (ORCPT ); Mon, 27 Mar 2023 04:24:36 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 427BC2D57; Mon, 27 Mar 2023 01:24:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905475; x=1711441475; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8lktqN9Ow6nF5dYh+N7FoY+KesvEnknJu52aiBKcpfg=; b=ZXke2SCfe+AQphBUYh8eKzJK1aSmls7bCF+lrK+oBXqGCcsCdOxdeqg8 3ttcSfAUsZLz0OnXqRV1FQtJcuuEEuHzYt6CKb+CkIIXnBdbnpt8cZOL+ RcAGMIoNFRpUECoNj+2jRoS7CM+fZevW3osVWeBsyrp0qN/tlwkqdvxXN xak3o+0wQ+W9WAszv4gz2FAx+5DGEu1Jw1kxfFhLzWjuE5qifePhD59a9 wDAwwWQwGfo6oIQwbjRF/BH0bL0jLJd6evwMHHzv+WyzX9xVNVZP6qMlE peH/phoDBq2SHXIu6dVV9/OCVabKLrOXkt1ShHhrQZ3r0Z9xYYdQQdyq+ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930182" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930182" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787064" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787064" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:33 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 10/33] x86/fred: add Kconfig option for FRED (CONFIG_X86_FRED) Date: Mon, 27 Mar 2023 00:58:15 -0700 Message-Id: <20230327075838.5403-11-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add the configuration option CONFIG_X86_FRED to enable FRED. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/Kconfig | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index a825bf031f49..da62178bb246 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -500,6 +500,15 @@ config X86_CPU_RESCTRL Say N if unsure. +config X86_FRED + bool "Flexible Return and Event Delivery" + depends on X86_64 + help + When enabled, try to use Flexible Return and Event Delivery + instead of the legacy SYSCALL/SYSENTER/IDT architecture for + ring transitions and exception/interrupt handling if the + system supports. + if X86_32 config X86_BIGSMP bool "Support for big SMP systems with more than 8 CPUs" From patchwork Mon Mar 27 07:58:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188736 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D739EC7619A for ; Mon, 27 Mar 2023 08:25:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233106AbjC0IZD (ORCPT ); Mon, 27 Mar 2023 04:25:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59722 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233027AbjC0IYg (ORCPT ); Mon, 27 Mar 2023 04:24:36 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B0502D53; Mon, 27 Mar 2023 01:24:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905475; x=1711441475; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QmoU6lab7p0QYpuYROKRU1bljKMMFBm03a1BVvi5hdc=; b=d8AZ8YocVDtYp1ktyqPG0bWac+jMa47HM4sFfZ+P7cjl8E3OfHyuUsmg n8gdHM+r3vwC/KdZNX2JjC6+rZVUu8ttYi5S+1re07XA00bL1B1jB2i4s L9qN469ZrzNuyaGaVdi+nuVpRbWBs4uthSCAUlicwxDpA8f1kTca071q+ GCD4cyRpSnwa4m2LK3RgApnEXepmqOJ7XFjLXT11+6mYw6y8x0daB+Ens BwY8uhuUlSsZEZV3LT37eMtABytYSNzq41G1uG82N9tn1d1VF17kHkROA JNZTOFkEwF7QVvgrXmCpIPmF9PUsyUs1wKWm3ORklGSJwch5lAm/jCHtG A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930194" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930194" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787068" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787068" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:33 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 11/33] x86/fred: if CONFIG_X86_FRED is disabled, disable FRED support Date: Mon, 27 Mar 2023 00:58:16 -0700 Message-Id: <20230327075838.5403-12-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add CONFIG_X86_FRED to to make cpu_feature_enabled() work correctly with FRED. Originally-by: Megha Dey Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/disabled-features.h | 8 +++++++- tools/arch/x86/include/asm/disabled-features.h | 8 +++++++- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 5dfa4fb76f4b..56838de9cb23 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -99,6 +99,12 @@ # define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31)) #endif +#ifdef CONFIG_X86_FRED +# define DISABLE_FRED 0 +#else +# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -115,7 +121,7 @@ #define DISABLED_MASK10 0 #define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \ DISABLE_CALL_DEPTH_TRACKING) -#define DISABLED_MASK12 0 +#define DISABLED_MASK12 (DISABLE_FRED) #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h index 5dfa4fb76f4b..56838de9cb23 100644 --- a/tools/arch/x86/include/asm/disabled-features.h +++ b/tools/arch/x86/include/asm/disabled-features.h @@ -99,6 +99,12 @@ # define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31)) #endif +#ifdef CONFIG_X86_FRED +# define DISABLE_FRED 0 +#else +# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -115,7 +121,7 @@ #define DISABLED_MASK10 0 #define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \ DISABLE_CALL_DEPTH_TRACKING) -#define DISABLED_MASK12 0 +#define DISABLED_MASK12 (DISABLE_FRED) #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 From patchwork Mon Mar 27 07:58:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188739 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E104CC7619A for ; Mon, 27 Mar 2023 08:25:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233133AbjC0IZK (ORCPT ); Mon, 27 Mar 2023 04:25:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233037AbjC0IYg (ORCPT ); Mon, 27 Mar 2023 04:24:36 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74E723C3A; Mon, 27 Mar 2023 01:24:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905475; x=1711441475; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9ZDzco+F9bEe6GLOCDycLaH8JYPs/ssjd/Vxju2hLBE=; b=AVk5Dy2u7kc49KCx4b8CP5X9BoJtIvNjGqyHtXVnquufi3mPEpwfSCEQ c0E7d2MkRgByVwTrW3sheewgjIoFO7uCPmqEiwVq0KP4UW8MuWWQdS9Fp 57DUyvtiAYCd+yNrktt1E5u1FE8DeOPmk6Zp7MtJYBlFohmkoHjzVpIg3 +XNjRwoxlpbEKO+xyUNyxWnlLJmNaRBlqOn9wAERVauVUTH24D999Upl8 zlQFTFmobdVj1Hz9dfc633qgJer1g6W6gcbbMP5c76lEnKe+Kgupw8XiS UKpt4LROeEwcjJOkCm4lPgdzI1x4BUM3PnhZtFoSuA38jPGCGxLjni8tY Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930206" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930206" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787071" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787071" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:33 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 12/33] x86/cpu: add MSR numbers for FRED configuration Date: Mon, 27 Mar 2023 00:58:17 -0700 Message-Id: <20230327075838.5403-13-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add MSR numbers for the FRED configuration registers. Originally-by: Megha Dey Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/msr-index.h | 13 ++++++++++++- tools/arch/x86/include/asm/msr-index.h | 13 ++++++++++++- 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index ad35355ee43e..87db728f8bbc 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -36,8 +36,19 @@ #define EFER_FFXSR (1<<_EFER_FFXSR) #define EFER_AUTOIBRS (1<<_EFER_AUTOIBRS) -/* Intel MSRs. Some also available on other CPUs */ +/* FRED MSRs */ +#define MSR_IA32_FRED_RSP0 0x1cc /* Level 0 stack pointer */ +#define MSR_IA32_FRED_RSP1 0x1cd /* Level 1 stack pointer */ +#define MSR_IA32_FRED_RSP2 0x1ce /* Level 2 stack pointer */ +#define MSR_IA32_FRED_RSP3 0x1cf /* Level 3 stack pointer */ +#define MSR_IA32_FRED_STKLVLS 0x1d0 /* Exception stack levels */ +#define MSR_IA32_FRED_SSP0 MSR_IA32_PL0_SSP /* Level 0 shadow stack pointer */ +#define MSR_IA32_FRED_SSP1 0x1d1 /* Level 1 shadow stack pointer */ +#define MSR_IA32_FRED_SSP2 0x1d2 /* Level 2 shadow stack pointer */ +#define MSR_IA32_FRED_SSP3 0x1d3 /* Level 3 shadow stack pointer */ +#define MSR_IA32_FRED_CONFIG 0x1d4 /* Entrypoint and interrupt stack level */ +/* Intel MSRs. Some also available on other CPUs */ #define MSR_TEST_CTRL 0x00000033 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT 29 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT BIT(MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT) diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index ad35355ee43e..87db728f8bbc 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -36,8 +36,19 @@ #define EFER_FFXSR (1<<_EFER_FFXSR) #define EFER_AUTOIBRS (1<<_EFER_AUTOIBRS) -/* Intel MSRs. Some also available on other CPUs */ +/* FRED MSRs */ +#define MSR_IA32_FRED_RSP0 0x1cc /* Level 0 stack pointer */ +#define MSR_IA32_FRED_RSP1 0x1cd /* Level 1 stack pointer */ +#define MSR_IA32_FRED_RSP2 0x1ce /* Level 2 stack pointer */ +#define MSR_IA32_FRED_RSP3 0x1cf /* Level 3 stack pointer */ +#define MSR_IA32_FRED_STKLVLS 0x1d0 /* Exception stack levels */ +#define MSR_IA32_FRED_SSP0 MSR_IA32_PL0_SSP /* Level 0 shadow stack pointer */ +#define MSR_IA32_FRED_SSP1 0x1d1 /* Level 1 shadow stack pointer */ +#define MSR_IA32_FRED_SSP2 0x1d2 /* Level 2 shadow stack pointer */ +#define MSR_IA32_FRED_SSP3 0x1d3 /* Level 3 shadow stack pointer */ +#define MSR_IA32_FRED_CONFIG 0x1d4 /* Entrypoint and interrupt stack level */ +/* Intel MSRs. Some also available on other CPUs */ #define MSR_TEST_CTRL 0x00000033 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT 29 #define MSR_TEST_CTRL_SPLIT_LOCK_DETECT BIT(MSR_TEST_CTRL_SPLIT_LOCK_DETECT_BIT) From patchwork Mon Mar 27 07:58:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188738 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98A19C761A6 for ; Mon, 27 Mar 2023 08:25:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232508AbjC0IZH (ORCPT ); Mon, 27 Mar 2023 04:25:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233042AbjC0IYh (ORCPT ); Mon, 27 Mar 2023 04:24:37 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 071224204; Mon, 27 Mar 2023 01:24:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905476; x=1711441476; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tP9iIQUbrhMc+5l0AHCXnkXnhn3TgNNmkDoitFhCFbw=; b=VBWmxcQE5gLSk+Bgv76xE1onydzJlwD/7pnuCbidUQNwn1cXtU9++4rT 01b7sAcs7U2/88I7hBnISvQcAzd5T/vU5qPkGqofsr4XKP4/Z+FEaZABT EVPw7ZkSs34H6pjqIOQJp0xvpBTBGBKhOPPyYnxMOyOdqMIH5pGLqtMkG RMCXEN+0K6Xjsu+w0ox+0tNabnrTYqAJH8dI5s//BskDeIV4MjwpZqs9B dP0pBtyy5z3S5N6Y7/VVRDzsR8yMd7NuWfAsQaYf30vQ86+v3PW22kUvH Y5OqC5EaOCIDv4XyzfhkuMs63170nktdJezyjWdZjNmuw17O/cNpdwa2P Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930218" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930218" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787075" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787075" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:34 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 13/33] x86/fred: header file for event types Date: Mon, 27 Mar 2023 00:58:18 -0700 Message-Id: <20230327075838.5403-14-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org FRED inherits the Intel VT-x enhancement of classified events with a two-level event dispatch logic. The first-level dispatch is on the event type, not the event vector as used in the IDT architecture. This also means that vectors in different event types are orthogonal, e.g., vectors 0x10-0x1f become available as hardware interrupts. Add a header file for event types, and also use it in . Suggested-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/event-type.h | 17 +++++++++++++++++ arch/x86/include/asm/vmx.h | 17 +++++++++-------- 2 files changed, 26 insertions(+), 8 deletions(-) create mode 100644 arch/x86/include/asm/event-type.h diff --git a/arch/x86/include/asm/event-type.h b/arch/x86/include/asm/event-type.h new file mode 100644 index 000000000000..fedaa0e492c5 --- /dev/null +++ b/arch/x86/include/asm/event-type.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_EVENT_TYPE_H +#define _ASM_X86_EVENT_TYPE_H + +/* + * Event type codes: these are the same that are used by VTx. + */ +#define EVENT_TYPE_HWINT 0 /* Maskable external interrupt */ +#define EVENT_TYPE_RESERVED 1 +#define EVENT_TYPE_NMI 2 /* Non-maskable interrupt */ +#define EVENT_TYPE_HWFAULT 3 /* Hardware exceptions (e.g., page fault) */ +#define EVENT_TYPE_SWINT 4 /* Software interrupt (INT n) */ +#define EVENT_TYPE_PRIVSW 5 /* INT1 (ICEBP) */ +#define EVENT_TYPE_SWFAULT 6 /* Software exception (INT3 or INTO) */ +#define EVENT_TYPE_OTHER 7 /* FRED: SYSCALL/SYSENTER */ + +#endif diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 498dc600bd5c..8d9b8b0d8e56 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -15,6 +15,7 @@ #include #include #include +#include #include #define VMCS_CONTROL_BIT(x) BIT(VMX_FEATURE_##x & 0x1f) @@ -372,14 +373,14 @@ enum vmcs_field { #define VECTORING_INFO_DELIVER_CODE_MASK INTR_INFO_DELIVER_CODE_MASK #define VECTORING_INFO_VALID_MASK INTR_INFO_VALID_MASK -#define INTR_TYPE_EXT_INTR (0 << 8) /* external interrupt */ -#define INTR_TYPE_RESERVED (1 << 8) /* reserved */ -#define INTR_TYPE_NMI_INTR (2 << 8) /* NMI */ -#define INTR_TYPE_HARD_EXCEPTION (3 << 8) /* processor exception */ -#define INTR_TYPE_SOFT_INTR (4 << 8) /* software interrupt */ -#define INTR_TYPE_PRIV_SW_EXCEPTION (5 << 8) /* ICE breakpoint - undocumented */ -#define INTR_TYPE_SOFT_EXCEPTION (6 << 8) /* software exception */ -#define INTR_TYPE_OTHER_EVENT (7 << 8) /* other event */ +#define INTR_TYPE_EXT_INTR (EVENT_TYPE_HWINT << 8) /* external interrupt */ +#define INTR_TYPE_RESERVED (EVENT_TYPE_RESERVED << 8) /* reserved */ +#define INTR_TYPE_NMI_INTR (EVENT_TYPE_NMI << 8) /* NMI */ +#define INTR_TYPE_HARD_EXCEPTION (EVENT_TYPE_HWFAULT << 8) /* processor exception */ +#define INTR_TYPE_SOFT_INTR (EVENT_TYPE_SWINT << 8) /* software interrupt */ +#define INTR_TYPE_PRIV_SW_EXCEPTION (EVENT_TYPE_PRIVSW << 8) /* ICE breakpoint - undocumented */ +#define INTR_TYPE_SOFT_EXCEPTION (EVENT_TYPE_SWFAULT << 8) /* software exception */ +#define INTR_TYPE_OTHER_EVENT (EVENT_TYPE_OTHER << 8) /* other event */ /* GUEST_INTERRUPTIBILITY_INFO flags. */ #define GUEST_INTR_STATE_STI 0x00000001 From patchwork Mon Mar 27 07:58:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188741 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37C9DC761AF for ; Mon, 27 Mar 2023 08:25:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233137AbjC0IZP (ORCPT ); Mon, 27 Mar 2023 04:25:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59812 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233048AbjC0IYh (ORCPT ); Mon, 27 Mar 2023 04:24:37 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 72B122D57; Mon, 27 Mar 2023 01:24:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905476; x=1711441476; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LJAIMdwNFTUrVjDLTbKC/eMN5WktrEucLVbwGrEfGpc=; b=cucVf7zphc68Y/LESTYvYE4RGvaBZtNiC++ILyMRnjLeD97kCR+Ok71Q oAinxkGjd7zT8cLurvXMVsxNCbodQcixYUPYvxajJl7ZBe2Jnz7yhuqJB w90WyrUAxHoS1yWHhMxssKSY8a/3TAq7GOWR8tC9+9cBf6A6p6w6L7aug 41UJPJOnTAPDDIrAkzE4er/e0pSmiPizgLrjBz2QZ/qavudiwoofcxGIa TlJ3HG2S+aSUvc/gEfvdE5AQJnFhzd1r/1PCHpBYObxst3D4uoy66K4Vf hpfFxIvy+NDCC9on0kHDO5UMqqbYfVY+7idgUR1H4MTITiVz3+GN5vT4p A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930229" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930229" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787079" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787079" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:34 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 14/33] x86/fred: header file with FRED definitions Date: Mon, 27 Mar 2023 00:58:19 -0700 Message-Id: <20230327075838.5403-15-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add a header file for FRED prototypes and definitions. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 101 ++++++++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) create mode 100644 arch/x86/include/asm/fred.h diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h new file mode 100644 index 000000000000..2f337162da73 --- /dev/null +++ b/arch/x86/include/asm/fred.h @@ -0,0 +1,101 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/x86/include/asm/fred.h + * + * Macros for Flexible Return and Event Delivery (FRED) + */ + +#ifndef ASM_X86_FRED_H +#define ASM_X86_FRED_H + +#ifdef CONFIG_X86_FRED + +#include +#include + +/* + * FRED return instructions + * + * Replace with "ERETS"/"ERETU" once binutils support FRED return instructions. + * The binutils version supporting FRED instructions is still TBD, and will + * update once we have it. + */ +#define ERETS _ASM_BYTES(0xf2,0x0f,0x01,0xca) +#define ERETU _ASM_BYTES(0xf3,0x0f,0x01,0xca) + +/* + * Event stack level macro for the FRED_STKLVLS MSR. + * Usage example: FRED_STKLVL(X86_TRAP_DF, 3) + * Multiple values can be ORd together. + */ +#define FRED_STKLVL(v,l) (_AT(unsigned long, l) << (2*(v))) + +/* FRED_CONFIG MSR */ +#define FRED_CONFIG_CSL_MASK 0x3 +#define FRED_CONFIG_SHADOW_STACK_SPACE _BITUL(3) +#define FRED_CONFIG_REDZONE(b) __ALIGN_KERNEL_MASK((b), _UL(0x3f)) +#define FRED_CONFIG_INT_STKLVL(l) (_AT(unsigned long, l) << 9) +#define FRED_CONFIG_ENTRYPOINT(p) _AT(unsigned long, (p)) + +/* FRED event type and vector bit width and counts */ +#define FRED_EVENT_TYPE_BITS 3 /* only 3 bits used in FRED 3.0 */ +#define FRED_EVENT_TYPE_COUNT _BITUL(FRED_EVENT_TYPE_BITS) +#define FRED_EVENT_VECTOR_BITS 8 +#define FRED_EVENT_VECTOR_COUNT _BITUL(FRED_EVENT_VECTOR_BITS) + +/* FRED EVENT_TYPE_OTHER vector numbers */ +#define FRED_SYSCALL 1 +#define FRED_SYSENTER 2 + +/* Flags above the CS selector (regs->csx) */ +#define FRED_CSL_ENABLE_NMI _BITUL(28) +#define FRED_CSL_ALLOW_SINGLE_STEP _BITUL(25) +#define FRED_CSL_INTERRUPT_SHADOW _BITUL(24) + +#ifndef __ASSEMBLY__ + +#include +#include + +/* FRED stack frame information */ +struct fred_info { + unsigned long edata; /* Event data: CR2, DR6, ... */ + unsigned long resv; +}; + +/* Full format of the FRED stack frame */ +struct fred_frame { + struct pt_regs regs; + struct fred_info info; +}; + +/* Getting the FRED frame information from a pt_regs pointer */ +static __always_inline struct fred_info *fred_info(struct pt_regs *regs) +{ + return &container_of(regs, struct fred_frame, regs)->info; +} + +static __always_inline unsigned long fred_event_data(struct pt_regs *regs) +{ + return fred_info(regs)->edata; +} + +/* + * How FRED event handlers are called. + * + * FRED event delivery establishes the full supervisor context + * by pushing everything related to the event being delivered + * to the FRED stack frame, e.g., the faulting linear address + * of a #PF is pushed as event data of the FRED #PF stack frame. + * Thus a struct pt_regs has everything needed and it's the only + * input parameter required for a FRED event handler. + */ +#define DECLARE_FRED_HANDLER(f) void f (struct pt_regs *regs) +#define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) +typedef DECLARE_FRED_HANDLER((*fred_handler)); + +#endif /* __ASSEMBLY__ */ + +#endif /* CONFIG_X86_FRED */ + +#endif /* ASM_X86_FRED_H */ From patchwork Mon Mar 27 07:58:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188740 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5194AC761A6 for ; Mon, 27 Mar 2023 08:25:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233128AbjC0IZM (ORCPT ); Mon, 27 Mar 2023 04:25:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59822 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233049AbjC0IYh (ORCPT ); Mon, 27 Mar 2023 04:24:37 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3A5444AE; Mon, 27 Mar 2023 01:24:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905476; x=1711441476; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=voFQVjFSRNzDP3of726cqSfwzN6e3VmowuyZWcsyuKM=; b=O/GbxqLrYaW8G296BKcVpenILyRSEK2Q4XazWhgjj/v+/8Wx/8W/8psr et74+Yk09b5tPGaMK4NyFOHHJpLqGl2XTRPPr1oQD3yGpFuKIDiroWJWm e2kgJNi/5pwJIb2lR0nYqFs8g4IfSYIAS6G5l5oNVACRwcHoSsMO+kcAu 28dSjH7pqgxEo3WOWhECYHXeqUYPf7/7REqTghgTbnKHFE890fEviM+SH fQ230CKL3SNtl6+WqdxLWVMqTflasn+aa3dFQ9ReSNcKjjHL7N90HWNzg LztzM3phmZL6yE5xO8UOM/8Yubswxi5hUOs856wB8OVFqVgUXLkrf/71q Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930240" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930240" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787084" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787084" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:35 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 15/33] x86/fred: reserve space for the FRED stack frame Date: Mon, 27 Mar 2023 00:58:20 -0700 Message-Id: <20230327075838.5403-16-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" When using FRED, reserve space at the top of the stack frame, just like i386 does. A future version of FRED might have dynamic frame sizes, though, in which case it might be necessary to make TOP_OF_KERNEL_STACK_PADDING a variable instead of a constant. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/thread_info.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index f1cccba52eb9..998483078d5f 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -31,7 +31,9 @@ * In vm86 mode, the hardware frame is much longer still, so add 16 * bytes to make room for the real-mode segments. * - * x86_64 has a fixed-length stack frame. + * x86-64 has a fixed-length stack frame, but it depends on whether + * or not FRED is enabled. Future versions of FRED might make this + * dynamic, but for now it is always 2 words longer. */ #ifdef CONFIG_X86_32 # ifdef CONFIG_VM86 @@ -39,8 +41,12 @@ # else # define TOP_OF_KERNEL_STACK_PADDING 8 # endif -#else -# define TOP_OF_KERNEL_STACK_PADDING 0 +#else /* x86-64 */ +# ifdef CONFIG_X86_FRED +# define TOP_OF_KERNEL_STACK_PADDING (2*8) +# else +# define TOP_OF_KERNEL_STACK_PADDING 0 +# endif #endif /* From patchwork Mon Mar 27 07:58:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188744 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91819C77B61 for ; Mon, 27 Mar 2023 08:25:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233004AbjC0IZX (ORCPT ); Mon, 27 Mar 2023 04:25:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233099AbjC0IYw (ORCPT ); Mon, 27 Mar 2023 04:24:52 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0997D46B5; Mon, 27 Mar 2023 01:24:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905477; x=1711441477; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zbDdVunyGdhXyhjUE3Gbpn/xbHM27xiNyk5ThtGHdLg=; b=HzGOXXTqUZbl1OS/GKzzsWpBOVOmTPyW3TjE7fFidqkG8lPBDWLiOAB9 HFgql7yMKNM0wQB2/LBS84VWAuWuyJKSEakiFwOrmM3YpZ2XNWkAMvobb ELJzbuJr5w3oTGEJCEPc4olqZ0Wibb8P3dhNvi446qkvFOHqEHplvfG3p Y0IBYKbTZSp0XWmcCNCdFHMkOsqsKFhfRLGzxZm7yTsetuKBUe1G0TedI 20lmcfzo/ZyRbp4IDl+uKK46HnZvp7miqsAOVH65Ta3WIxuIie6F7QzGm TgIydXJBY0+OD0pFmz3hfLE1GwiGFzqAqQRGj/tWrj7o/tHYzMW3S6mjt g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930250" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930250" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787091" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787091" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:35 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 16/33] x86/fred: add a page fault entry stub for FRED Date: Mon, 27 Mar 2023 00:58:21 -0700 Message-Id: <20230327075838.5403-17-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add a page fault entry stub for FRED. On a FRED system, the faulting address (CR2) is passed on the stack, to avoid the problem of transient state. Thus we get the page fault address from the stack instead of CR2. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 2 ++ arch/x86/mm/fault.c | 20 ++++++++++++++++++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 2f337162da73..57affbf80ced 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -94,6 +94,8 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) #define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) typedef DECLARE_FRED_HANDLER((*fred_handler)); +DECLARE_FRED_HANDLER(fred_exc_page_fault); + #endif /* __ASSEMBLY__ */ #endif /* CONFIG_X86_FRED */ diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index a498ae1fbe66..0f946121de14 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -33,6 +33,7 @@ #include /* kvm_handle_async_pf */ #include /* fixup_vdso_exception() */ #include +#include /* fred_event_data() */ #define CREATE_TRACE_POINTS #include @@ -1507,9 +1508,10 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code, } } -DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault) +static __always_inline void page_fault_common(struct pt_regs *regs, + unsigned int error_code, + unsigned long address) { - unsigned long address = read_cr2(); irqentry_state_t state; prefetchw(¤t->mm->mmap_lock); @@ -1556,3 +1558,17 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault) irqentry_exit(regs, state); } + +DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault) +{ + page_fault_common(regs, error_code, read_cr2()); +} + +#ifdef CONFIG_X86_FRED + +DEFINE_FRED_HANDLER(fred_exc_page_fault) +{ + page_fault_common(regs, regs->orig_ax, fred_event_data(regs)); +} + +#endif /* CONFIG_X86_FRED */ From patchwork Mon Mar 27 07:58:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4700C761A6 for ; Mon, 27 Mar 2023 08:25:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233203AbjC0IZ0 (ORCPT ); Mon, 27 Mar 2023 04:25:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233098AbjC0IYw (ORCPT ); Mon, 27 Mar 2023 04:24:52 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E4382D70; Mon, 27 Mar 2023 01:24:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905477; x=1711441477; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TAj+Us21b88uxMR/HQ4b1urOhW5qC0Uu268Yt8fMR3c=; b=PKOMWo18TRKEO61D3cpDyl+re6YKzLsQld4nAD5rAWizMDvoUBOW2S/I l25I2jLZ8Hf3rBZoLvIQELg/tjxLEiutFNrqM5yx1vFJYgLD3qkd6I9vD xNSOd6UH+8nbw7qgUlRD70nHfGihfooFaFKZy30KtEKG6rIKQ134aiZJr tF/adyanaIagveDbfqbldc3EaMMVhvRYuIyKWszxxepPrJK6JiZHSfMhy 3RCxY1NeJzJtGZoJxTHrcj0qImfdsepQNkLfKSYHuSqAAPmtheli2G7ip EVFt1xBQcIwMzEVzw5dObzcZhg6YQdapf3pzJq5g1K0kb2c3jp6dve4Pc A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930261" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930261" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787096" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787096" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:35 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 17/33] x86/fred: add a debug fault entry stub for FRED Date: Mon, 27 Mar 2023 00:58:22 -0700 Message-Id: <20230327075838.5403-18-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Add a debug fault entry stub for FRED. On a FRED system, the debug trap status information (DR6) is passed on the stack, to avoid the problem of transient state. Furthermore, FRED transitions avoid a lot of ugly corner cases the handling of which can, and should be, skipped. The FRED debug trap status information saved on the stack differs from DR6 in both stickiness and polarity; it is exactly what debug_read_clear_dr6() returns, and exc_debug_user()/exc_debug_kernel() expect. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v1: * call irqentry_nmi_{enter,exit}() in both IDT and FRED debug fault kernel handler (Peter Zijlstra). --- arch/x86/include/asm/fred.h | 1 + arch/x86/kernel/traps.c | 56 +++++++++++++++++++++++++++---------- 2 files changed, 42 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 57affbf80ced..633dd9e6a68e 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -94,6 +94,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) #define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) typedef DECLARE_FRED_HANDLER((*fred_handler)); +DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index f86cd233b00b..549f7f962f8f 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -47,6 +47,7 @@ #include #include #include +#include #include #include #include @@ -1020,21 +1021,9 @@ static bool notify_debug(struct pt_regs *regs, unsigned long *dr6) return false; } -static __always_inline void exc_debug_kernel(struct pt_regs *regs, - unsigned long dr6) +static __always_inline void debug_kernel_common(struct pt_regs *regs, + unsigned long dr6) { - /* - * Disable breakpoints during exception handling; recursive exceptions - * are exceedingly 'fun'. - * - * Since this function is NOKPROBE, and that also applies to - * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a - * HW_BREAKPOINT_W on our stack) - * - * Entry text is excluded for HW_BP_X and cpu_entry_area, which - * includes the entry stack is excluded for everything. - */ - unsigned long dr7 = local_db_save(); irqentry_state_t irq_state = irqentry_nmi_enter(regs); instrumentation_begin(); @@ -1062,7 +1051,8 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs, * Catch SYSENTER with TF set and clear DR_STEP. If this hit a * watchpoint at the same time then that will still be handled. */ - if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs)) + if (!cpu_feature_enabled(X86_FEATURE_FRED) && + (dr6 & DR_STEP) && is_sysenter_singlestep(regs)) dr6 &= ~DR_STEP; /* @@ -1090,7 +1080,25 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs, out: instrumentation_end(); irqentry_nmi_exit(regs, irq_state); +} +static __always_inline void exc_debug_kernel(struct pt_regs *regs, + unsigned long dr6) +{ + /* + * Disable breakpoints during exception handling; recursive exceptions + * are exceedingly 'fun'. + * + * Since this function is NOKPROBE, and that also applies to + * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a + * HW_BREAKPOINT_W on our stack) + * + * Entry text is excluded for HW_BP_X and cpu_entry_area, which + * includes the entry stack is excluded for everything. + */ + unsigned long dr7 = local_db_save(); + + debug_kernel_common(regs, dr6); local_db_restore(dr7); } @@ -1179,6 +1187,24 @@ DEFINE_IDTENTRY_DEBUG_USER(exc_debug) { exc_debug_user(regs, debug_read_clear_dr6()); } + +# ifdef CONFIG_X86_FRED +DEFINE_FRED_HANDLER(fred_exc_debug) +{ + /* + * The FRED debug information saved onto stack differs from + * DR6 in both stickiness and polarity; it is exactly what + * debug_read_clear_dr6() returns. + */ + unsigned long dr6 = fred_event_data(regs); + + if (user_mode(regs)) + exc_debug_user(regs, dr6); + else + debug_kernel_common(regs, dr6); +} +# endif /* CONFIG_X86_FRED */ + #else /* 32 bit does not have separate entry points. */ DEFINE_IDTENTRY_RAW(exc_debug) From patchwork Mon Mar 27 07:58:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188743 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78493C761A6 for ; Mon, 27 Mar 2023 08:25:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233165AbjC0IZV (ORCPT ); Mon, 27 Mar 2023 04:25:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233102AbjC0IYx (ORCPT ); Mon, 27 Mar 2023 04:24:53 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C20449D7; Mon, 27 Mar 2023 01:24:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905477; x=1711441477; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z2X5VK9fQIUiWFOXZ/Z/uPBIrWUORMITtKdHy89L3Wo=; b=Z7oLudKy5gVHckeP44adoQ7ZTo+YnBF3t5Oo1ih6KQDnlWkJvQfN9Vvh 3eMXTfySEJPgx4wP17+JwbdfeiXpn9CLxb65fcrY9meH5moAjZsqCu9px 6DUjuBeRr4evLnUrQ3Meta1lZUWr3yvata5WRbV9O4f+o8KERpxXI+Sv4 aOp8bTTA0Q5RNwrk9g73FajYOOATvGU5co0lZjgmRBwK9CGBPx8DS/1eA RfMDuyTNleZNdMOk4gbeZyIoHhq0uZlA1rVVs+rVu78Go6UZuIY2sYUYa kM2XGyAe0O8xdIegWdOVwvslmHXnHCtHi/BE+X0AfRg0ZXptDopoqp1hv g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930273" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930273" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787101" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787101" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:36 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 18/33] x86/fred: add a NMI entry stub for FRED Date: Mon, 27 Mar 2023 00:58:23 -0700 Message-Id: <20230327075838.5403-19-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" On a FRED system, NMIs nest both with themselves and faults, transient information is saved into the stack frame, and NMI unblocking only happens when the stack frame indicates that so should happen. Thus, the NMI entry stub for FRED is really quite small... Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 1 + arch/x86/kernel/nmi.c | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 633dd9e6a68e..f928a03082af 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -94,6 +94,7 @@ static __always_inline unsigned long fred_event_data(struct pt_regs *regs) #define DEFINE_FRED_HANDLER(f) noinstr DECLARE_FRED_HANDLER(f) typedef DECLARE_FRED_HANDLER((*fred_handler)); +DECLARE_FRED_HANDLER(fred_exc_nmi); DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c index 776f4b1e395b..e7b2abe42583 100644 --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.c @@ -34,6 +34,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -643,6 +644,24 @@ void nmi_backtrace_stall_check(const struct cpumask *btp) #endif +#ifdef CONFIG_X86_FRED +DEFINE_FRED_HANDLER(fred_exc_nmi) +{ + /* + * With FRED, CR2 and DR6 are pushed atomically on faults, + * so we don't have to worry about saving and restoring them. + * Breakpoint faults nest, so assume it is OK to leave DR7 + * enabled. + */ + irqentry_state_t irq_state = irqentry_nmi_enter(regs); + + inc_irq_stat(__nmi_count); + default_do_nmi(regs); + + irqentry_nmi_exit(regs, irq_state); +} +#endif + void stop_nmi(void) { ignore_nmis++; From patchwork Mon Mar 27 07:58:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188742 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C846C7619A for ; Mon, 27 Mar 2023 08:25:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233157AbjC0IZT (ORCPT ); Mon, 27 Mar 2023 04:25:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233104AbjC0IYx (ORCPT ); Mon, 27 Mar 2023 04:24:53 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3124330E0; Mon, 27 Mar 2023 01:24:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905478; x=1711441478; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Vo6Y4KRnake/BwUScJbQdgNXqARSX4a72bEPCU1WlFI=; b=nwbf75yHdOzlvPni92iNUJ1hhhO2QIRfooXInYALgdtKmhV5j6pJJ87r u7R3383QwbZtQPIHWSRkEu72tqOIldA4vn85VP4SKT4T4Gj6F49PQbLYu 7/+UEoBeQKzTsliTPIPqrZf5T+Syt0N7xJQ2KaSMbC5T/nRBTJOcQhH11 Yxt0F5QXcbpOI/ZE+KqnxwlOo9j96qLP2f9ryQJ3C4+lhGffZUGV35OC+ uwzs39AHR5TdGCV/zXASS68Tb5JWLogd9dYe7Xi27jxTxvf9H93gH6MiV aiQQqTz2abVrzgpd+8e6euDSM7QWAy0ACHcIkfOoTcYtmsxKCwZ2rPzcb Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930284" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930284" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787104" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787104" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:36 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 19/33] x86/fred: add a machine check entry stub for FRED Date: Mon, 27 Mar 2023 00:58:24 -0700 Message-Id: <20230327075838.5403-20-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a machine check entry stub for FRED. Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v5: * Disallow #DB inside #MCE for robustness sake (Peter Zijlstra). --- arch/x86/include/asm/fred.h | 1 + arch/x86/kernel/cpu/mce/core.c | 15 +++++++++++++++ 2 files changed, 16 insertions(+) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index f928a03082af..54746e8c0a17 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -97,6 +97,7 @@ typedef DECLARE_FRED_HANDLER((*fred_handler)); DECLARE_FRED_HANDLER(fred_exc_nmi); DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); +DECLARE_FRED_HANDLER(fred_exc_machine_check); #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 2eec60f50057..859331a6a7ad 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -52,6 +52,7 @@ #include #include #include +#include #include "internal.h" @@ -2111,6 +2112,20 @@ DEFINE_IDTENTRY_MCE_USER(exc_machine_check) exc_machine_check_user(regs); local_db_restore(dr7); } + +#ifdef CONFIG_X86_FRED +DEFINE_FRED_HANDLER(fred_exc_machine_check) +{ + unsigned long dr7; + + dr7 = local_db_save(); + if (user_mode(regs)) + exc_machine_check_user(regs); + else + exc_machine_check_kernel(regs); + local_db_restore(dr7); +} +#endif #else /* 32bit unified entry point */ DEFINE_IDTENTRY_RAW(exc_machine_check) From patchwork Mon Mar 27 07:58:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188746 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B009C76195 for ; Mon, 27 Mar 2023 08:25:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233152AbjC0IZ2 (ORCPT ); Mon, 27 Mar 2023 04:25:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233113AbjC0IYy (ORCPT ); Mon, 27 Mar 2023 04:24:54 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6AC322D63; Mon, 27 Mar 2023 01:24:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905478; x=1711441478; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YScIvyCvP63IrCWW2JMUZyi9KLrv2/ev8QcHj+YBcOs=; b=BaWqQ2cXtEOrXtKw/ye54/Oku3XtY+6j4Ho+rKWAITD8SeXOny+ItkPD g6/8RsBaFZnOjm8SUPwsQWX/hDhgESgvYZXTLuDH0R185Q9Ra36RJ4ZGS 3wXfwhaCg3MvDJWBcqzukVrV9RG6Yg0xiO9V2vph2Yw236E9rSa6R6AqB IPnpSsdskP4e0SEbIzXF2CVGxKrMREkanSv4cTNsBsXqyzc+o+VEmPOQ9 WfNyuyFcvuTbFYumeqtKFmKz8+EUi5Kc+yTIljn7sPknrpkmfh99rss5N U5ncOxz7mmjfKGGs29blWW4/JqKBYBctSzvYeBWd3aYJBof/zsbYPOxzZ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930294" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930294" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787107" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787107" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:37 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 20/33] x86/fred: FRED entry/exit and dispatch code Date: Mon, 27 Mar 2023 00:58:25 -0700 Message-Id: <20230327075838.5403-21-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" The code to actually handle kernel and event entry/exit using FRED. It is split up into two files thus: - entry_64_fred.S contains the actual entrypoints and exit code, and saves and restores registers. - entry_fred.c contains the two-level event dispatch code for FRED. The first-level dispatch is on the event type, and the second-level is on the event vector. Originally-by: Megha Dey Signed-off-by: H. Peter Anvin (Intel) Co-developed-by: Xin Li Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v1: * Initialize a FRED exception handler to fred_bad_event() instead of NULL if no FRED handler defined for an exception vector (Peter Zijlstra). * Push calling irqentry_{enter,exit}() and instrumentation_{begin,end}() down into individual FRED exception handlers, instead of in the dispatch framework (Peter Zijlstra). --- arch/x86/entry/Makefile | 5 +- arch/x86/entry/entry_64_fred.S | 57 ++++++++ arch/x86/entry/entry_fred.c | 232 ++++++++++++++++++++++++++++++++ arch/x86/include/asm/idtentry.h | 8 ++ 4 files changed, 301 insertions(+), 1 deletion(-) create mode 100644 arch/x86/entry/entry_64_fred.S create mode 100644 arch/x86/entry/entry_fred.c diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile index ca2fe186994b..c93e7f5c2a06 100644 --- a/arch/x86/entry/Makefile +++ b/arch/x86/entry/Makefile @@ -18,6 +18,9 @@ obj-y += vdso/ obj-y += vsyscall/ obj-$(CONFIG_PREEMPTION) += thunk_$(BITS).o +CFLAGS_entry_fred.o += -fno-stack-protector +CFLAGS_REMOVE_entry_fred.o += -pg $(CC_FLAGS_FTRACE) +obj-$(CONFIG_X86_FRED) += entry_64_fred.o entry_fred.o + obj-$(CONFIG_IA32_EMULATION) += entry_64_compat.o syscall_32.o obj-$(CONFIG_X86_X32_ABI) += syscall_x32.o - diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S new file mode 100644 index 000000000000..d975cacd060f --- /dev/null +++ b/arch/x86/entry/entry_64_fred.S @@ -0,0 +1,57 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/x86/entry/entry_64_fred.S + * + * The actual FRED entry points. + */ +#include +#include +#include +#include + +#include "calling.h" + + .code64 + .section ".noinstr.text", "ax" + +.macro FRED_ENTER + UNWIND_HINT_EMPTY + PUSH_AND_CLEAR_REGS + movq %rsp, %rdi /* %rdi -> pt_regs */ +.endm + +.macro FRED_EXIT + UNWIND_HINT_REGS + POP_REGS + addq $8,%rsp /* Drop error code */ +.endm + +/* + * The new RIP value that FRED event delivery establishes is + * IA32_FRED_CONFIG & ~FFFH for events that occur in ring 3. + * Thus the FRED ring 3 entry point must be 4K page aligned. + */ + .align 4096 + +SYM_CODE_START_NOALIGN(fred_entrypoint_user) + FRED_ENTER + call fred_entry_from_user +SYM_INNER_LABEL(fred_exit_user, SYM_L_GLOBAL) + FRED_EXIT + ERETU +SYM_CODE_END(fred_entrypoint_user) + +.fill fred_entrypoint_kernel - ., 1, 0xcc + +/* + * The new RIP value that FRED event delivery establishes is + * (IA32_FRED_CONFIG & ~FFFH) + 256 for events that occur in + * ring 0, i.e., fred_entrypoint_user + 256. + */ + .org fred_entrypoint_user+256 +SYM_CODE_START_NOALIGN(fred_entrypoint_kernel) + FRED_ENTER + call fred_entry_from_kernel + FRED_EXIT + ERETS +SYM_CODE_END(fred_entrypoint_kernel) diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c new file mode 100644 index 000000000000..1862bd7a9fc1 --- /dev/null +++ b/arch/x86/entry/entry_fred.c @@ -0,0 +1,232 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * arch/x86/entry/entry_fred.c + * + * This contains the dispatch functions called from the entry point + * assembly. + */ + +#include +#include /* oops_begin/end, ... */ +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * Badness... + */ +static DEFINE_FRED_HANDLER(fred_bad_event) +{ + irqentry_state_t irq_state = irqentry_nmi_enter(regs); + + instrumentation_begin(); + + /* Panic on events from a high stack level */ + if (regs->current_stack_level > 0) { + pr_emerg("PANIC: invalid or fatal FRED event; event type %u " + "vector %u error 0x%lx aux 0x%lx at %04x:%016lx\n", + regs->type, regs->vector, regs->orig_ax, + fred_event_data(regs), regs->cs, regs->ip); + die("invalid or fatal FRED event", regs, regs->orig_ax); + panic("invalid or fatal FRED event"); + } else { + unsigned long flags = oops_begin(); + int sig = SIGKILL; + + pr_alert("BUG: invalid or fatal FRED event; event type %u " + "vector %u error 0x%lx aux 0x%lx at %04x:%016lx\n", + regs->type, regs->vector, regs->orig_ax, + fred_event_data(regs), regs->cs, regs->ip); + + if (__die("Invalid or fatal FRED event", regs, regs->orig_ax)) + sig = 0; + + oops_end(flags, regs, sig); + } + + instrumentation_end(); + irqentry_nmi_exit(regs, irq_state); +} + +noinstr void fred_exc_double_fault(struct pt_regs *regs) +{ + exc_double_fault(regs, regs->orig_ax); +} + +/* + * Exception entry + */ +static DEFINE_FRED_HANDLER(fred_exception) +{ + /* + * Exceptions that cannot happen on FRED h/w are set to fred_bad_event(). + */ + static const fred_handler exception_handlers[NUM_EXCEPTION_VECTORS] = { + [X86_TRAP_DE] = exc_divide_error, + [X86_TRAP_DB] = fred_exc_debug, + [X86_TRAP_NMI] = fred_bad_event, /* A separate event type, not handled here */ + [X86_TRAP_BP] = exc_int3, + [X86_TRAP_OF] = exc_overflow, + [X86_TRAP_BR] = exc_bounds, + [X86_TRAP_UD] = exc_invalid_op, + [X86_TRAP_NM] = exc_device_not_available, + [X86_TRAP_DF] = fred_exc_double_fault, + [X86_TRAP_OLD_MF] = fred_bad_event, /* 387 only! */ + [X86_TRAP_TS] = fred_exc_invalid_tss, + [X86_TRAP_NP] = fred_exc_segment_not_present, + [X86_TRAP_SS] = fred_exc_stack_segment, + [X86_TRAP_GP] = fred_exc_general_protection, + [X86_TRAP_PF] = fred_exc_page_fault, + [X86_TRAP_SPURIOUS] = fred_bad_event, /* Interrupts are their own event type */ + [X86_TRAP_MF] = exc_coprocessor_error, + [X86_TRAP_AC] = fred_exc_alignment_check, + [X86_TRAP_MC] = fred_exc_machine_check, + [X86_TRAP_XF] = exc_simd_coprocessor_error, + [X86_TRAP_VE...NUM_EXCEPTION_VECTORS-1] = fred_bad_event + }; + u8 vector = array_index_nospec((u8)regs->vector, NUM_EXCEPTION_VECTORS); + + exception_handlers[vector](regs); +} + +static __always_inline void fred_emulate_trap(struct pt_regs *regs) +{ + regs->type = EVENT_TYPE_SWFAULT; + regs->orig_ax = 0; + fred_exception(regs); +} + +static __always_inline void fred_emulate_fault(struct pt_regs *regs) +{ + regs->ip -= regs->instr_len; + fred_emulate_trap(regs); +} + +/* + * Emulate SYSENTER if applicable. This is not the preferred system + * call in 32-bit mode under FRED, rather int $0x80 is preferred and + * exported in the vdso. SYSCALL proper has a hard-coded early out in + * fred_entry_from_user(). + */ +static DEFINE_FRED_HANDLER(fred_syscall_slow) +{ + if (IS_ENABLED(CONFIG_IA32_EMULATION) && + likely(regs->vector == FRED_SYSENTER)) { + /* Convert frame to a syscall frame */ + regs->orig_ax = regs->ax; + regs->ax = -ENOSYS; + do_fast_syscall_32(regs); + } else { + regs->vector = X86_TRAP_UD; + fred_emulate_fault(regs); + } +} + +/* + * Some software exceptions can also be triggered as int instructions, + * for historical reasons. Implement those here. The performance-critical + * int $0x80 (32-bit system call) has a hard-coded early out. + */ +static DEFINE_FRED_HANDLER(fred_sw_interrupt_user) +{ + if (IS_ENABLED(CONFIG_IA32_EMULATION) && + likely(regs->vector == IA32_SYSCALL_VECTOR)) { + /* Convert frame to a syscall frame */ + regs->orig_ax = regs->ax; + regs->ax = -ENOSYS; + return do_int80_syscall_32(regs); + } + + switch (regs->vector) { + case X86_TRAP_BP: + case X86_TRAP_OF: + fred_emulate_trap(regs); + break; + default: + regs->vector = X86_TRAP_GP; + fred_emulate_fault(regs); + break; + } +} + +static DEFINE_FRED_HANDLER(fred_hw_interrupt) +{ + irqentry_state_t state = irqentry_enter(regs); + + instrumentation_begin(); + external_interrupt(regs); + instrumentation_end(); + irqentry_exit(regs, state); +} + +__visible noinstr void fred_entry_from_user(struct pt_regs *regs) +{ + static const fred_handler user_handlers[FRED_EVENT_TYPE_COUNT] = + { + [EVENT_TYPE_HWINT] = fred_hw_interrupt, + [EVENT_TYPE_RESERVED] = fred_bad_event, + [EVENT_TYPE_NMI] = fred_exc_nmi, + [EVENT_TYPE_SWINT] = fred_sw_interrupt_user, + [EVENT_TYPE_HWFAULT] = fred_exception, + [EVENT_TYPE_SWFAULT] = fred_exception, + [EVENT_TYPE_PRIVSW] = fred_exception, + [EVENT_TYPE_OTHER] = fred_syscall_slow + }; + + /* + * FRED employs a two-level event dispatch mechanism, with + * the first-level on the type of an event and the second-level + * on its vector. Thus a dispatch typically induces 2 calls. + * We optimize it by using early outs for the most frequent + * events, and syscalls are the first. We may also need early + * outs for page faults. + */ + if (likely(regs->type == EVENT_TYPE_OTHER && + regs->vector == FRED_SYSCALL)) { + /* Convert frame to a syscall frame */ + regs->orig_ax = regs->ax; + regs->ax = -ENOSYS; + do_syscall_64(regs, regs->orig_ax); + } else { + /* Not a system call */ + u8 type = array_index_nospec((u8)regs->type, FRED_EVENT_TYPE_COUNT); + + user_handlers[type](regs); + } +} + +static DEFINE_FRED_HANDLER(fred_sw_interrupt_kernel) +{ + switch (regs->vector) { + case X86_TRAP_NMI: + fred_exc_nmi(regs); + break; + default: + fred_bad_event(regs); + break; + } +} + +__visible noinstr void fred_entry_from_kernel(struct pt_regs *regs) +{ + static const fred_handler kernel_handlers[FRED_EVENT_TYPE_COUNT] = + { + [EVENT_TYPE_HWINT] = fred_hw_interrupt, + [EVENT_TYPE_RESERVED] = fred_bad_event, + [EVENT_TYPE_NMI] = fred_exc_nmi, + [EVENT_TYPE_SWINT] = fred_sw_interrupt_kernel, + [EVENT_TYPE_HWFAULT] = fred_exception, + [EVENT_TYPE_SWFAULT] = fred_exception, + [EVENT_TYPE_PRIVSW] = fred_exception, + [EVENT_TYPE_OTHER] = fred_bad_event + }; + u8 type = array_index_nospec((u8)regs->type, FRED_EVENT_TYPE_COUNT); + + /* The pt_regs frame on entry here is an exception frame */ + kernel_handlers[type](regs); +} diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 2876ddae02bc..bd43866f9c3e 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -82,6 +82,7 @@ static __always_inline void __##func(struct pt_regs *regs) #define DECLARE_IDTENTRY_ERRORCODE(vector, func) \ asmlinkage void asm_##func(void); \ asmlinkage void xen_asm_##func(void); \ + __visible void fred_##func(struct pt_regs *regs); \ __visible void func(struct pt_regs *regs, unsigned long error_code) /** @@ -106,6 +107,11 @@ __visible noinstr void func(struct pt_regs *regs, \ irqentry_exit(regs, state); \ } \ \ +__visible noinstr void fred_##func(struct pt_regs *regs) \ +{ \ + func (regs, regs->orig_ax); \ +} \ + \ static __always_inline void __##func(struct pt_regs *regs, \ unsigned long error_code) @@ -622,6 +628,8 @@ DECLARE_IDTENTRY_RAW(X86_TRAP_MC, exc_machine_check); #ifdef CONFIG_XEN_PV DECLARE_IDTENTRY_RAW(X86_TRAP_MC, xenpv_exc_machine_check); #endif +#else +#define fred_exc_machine_check fred_bad_event #endif /* NMI */ From patchwork Mon Mar 27 07:58:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188748 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49279C761A6 for ; Mon, 27 Mar 2023 08:25:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233231AbjC0IZc (ORCPT ); Mon, 27 Mar 2023 04:25:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233119AbjC0IYz (ORCPT ); Mon, 27 Mar 2023 04:24:55 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ADB495586; Mon, 27 Mar 2023 01:24:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905478; x=1711441478; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RB/5r74H9yfWwhN9uD7BZ0JW6OZMEtwepKGs972tkRY=; b=QMQVdfhozeHFym3wzNKojdTVRhZ2VcbyLCidDnjgl9Ajp7z06dUNn+YD LB2pwSJ1MCTuidTQuc6GhwTh0ZBBFcSVtuB6AQ4PYI6ypH8cb4v1wqaTG uqQiW5YQbFqK1JKo8Dn3St8nPyHgyvfSpPzufeoZR3T6on6r8g8ecrG7C 1MWiQjBh97v7Q6ZHcVauDOpjj7BhciIkduwDzuxaB+WNb0ml2LbQMPOBN +wBC0YtJQQ8rYzNTHChS2FBVZeMXa+DqMPe/D4jEsVupAACHZMLCSJCfL s8hLD9/bUAGNIU4JwwcKGmhj0gj76Fd7St3/5O/pf2LAFmha4L4HyaVAp w==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930304" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930304" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787111" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787111" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:37 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 21/33] x86/fred: FRED initialization code Date: Mon, 27 Mar 2023 00:58:26 -0700 Message-Id: <20230327075838.5403-22-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" The code to initialize FRED when it's available and _not_ disabled. cpu_init_fred_exceptions() is the core function to initialize FRED, which 1. Sets up FRED entrypoints for events happening in ring 0 and 3. 2. Sets up a default stack for event handling. 3. Sets up dedicated event stacks for DB/NMI/MC/DF, equivalent to the IDT IST stacks. 4. Forces 32-bit system calls to use "int $0x80" only. 5. Enables FRED and invalidtes IDT. When the FRED is used, cpu_init_exception_handling() initializes FRED through calling cpu_init_fred_exceptions(), otherwise it sets up TSS IST and loads IDT. As FRED uses the ring 3 FRED entrypoint for SYSCALL and SYSENTER, it skips setting up SYSCALL/SYSENTER related MSRs, e.g., MSR_LSTAR. Signed-off-by: H. Peter Anvin (Intel) Co-developed-by: Xin Li Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v5: * Add a comment for FRED stack level settings (Lai Jiangshan). --- arch/x86/include/asm/fred.h | 14 ++++++ arch/x86/include/asm/traps.h | 2 + arch/x86/kernel/Makefile | 1 + arch/x86/kernel/cpu/common.c | 74 ++++++++++++++++++++----------- arch/x86/kernel/fred.c | 86 ++++++++++++++++++++++++++++++++++++ arch/x86/kernel/irqinit.c | 7 ++- arch/x86/kernel/traps.c | 16 ++++++- 7 files changed, 170 insertions(+), 30 deletions(-) create mode 100644 arch/x86/kernel/fred.c diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index 54746e8c0a17..cd974edc8e8a 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -99,8 +99,22 @@ DECLARE_FRED_HANDLER(fred_exc_debug); DECLARE_FRED_HANDLER(fred_exc_page_fault); DECLARE_FRED_HANDLER(fred_exc_machine_check); +/* + * The actual assembly entry and exit points + */ +extern __visible void fred_entrypoint_user(void); + +/* + * Initialization + */ +void cpu_init_fred_exceptions(void); +void fred_setup_apic(void); + #endif /* __ASSEMBLY__ */ +#else +#define cpu_init_fred_exceptions() BUG() +#define fred_setup_apic() BUG() #endif /* CONFIG_X86_FRED */ #endif /* ASM_X86_FRED_H */ diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 46f5e4e2a346..612b3d6fec53 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -56,4 +56,6 @@ void __noreturn handle_stack_overflow(struct pt_regs *regs, void f (struct pt_regs *regs) typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); +system_interrupt_handler get_system_interrupt_handler(unsigned int i); + #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index dd61752f4c96..08d9c0a0bfbe 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -47,6 +47,7 @@ obj-y += platform-quirks.o obj-y += process_$(BITS).o signal.o signal_$(BITS).o obj-y += traps.o idt.o irq.o irq_$(BITS).o dumpstack_$(BITS).o obj-y += time.o ioport.o dumpstack.o nmi.o +obj-$(CONFIG_X86_FRED) += fred.o obj-$(CONFIG_MODIFY_LDT_SYSCALL) += ldt.o obj-y += setup.o x86_init.o i8259.o irqinit.o obj-$(CONFIG_JUMP_LABEL) += jump_label.o diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index e8cf6f4cfb52..eea41cb8722e 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -58,6 +58,7 @@ #include #include #include +#include #include #include #include @@ -2054,28 +2055,6 @@ static void wrmsrl_cstar(unsigned long val) /* May not be marked __init: used by software suspend */ void syscall_init(void) { - wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS); - wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64); - -#ifdef CONFIG_IA32_EMULATION - wrmsrl_cstar((unsigned long)entry_SYSCALL_compat); - /* - * This only works on Intel CPUs. - * On AMD CPUs these MSRs are 32-bit, CPU truncates MSR_IA32_SYSENTER_EIP. - * This does not cause SYSENTER to jump to the wrong location, because - * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit). - */ - wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); - wrmsrl_safe(MSR_IA32_SYSENTER_ESP, - (unsigned long)(cpu_entry_stack(smp_processor_id()) + 1)); - wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); -#else - wrmsrl_cstar((unsigned long)ignore_sysret); - wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); - wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); - wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); -#endif - /* * Flags to clear on syscall; clear as much as possible * to minimize user space-kernel interference. @@ -2086,6 +2065,41 @@ void syscall_init(void) X86_EFLAGS_IF|X86_EFLAGS_DF|X86_EFLAGS_OF| X86_EFLAGS_IOPL|X86_EFLAGS_NT|X86_EFLAGS_RF| X86_EFLAGS_AC|X86_EFLAGS_ID); + + /* + * The default user and kernel segments + */ + wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS); + + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + /* Both sysexit and sysret cause #UD when FRED is enabled */ + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); + } else { + wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64); + +#ifdef CONFIG_IA32_EMULATION + wrmsrl_cstar((unsigned long)entry_SYSCALL_compat); + /* + * This only works on Intel CPUs. + * On AMD CPUs these MSRs are 32-bit, CPU truncates + * MSR_IA32_SYSENTER_EIP. + * This does not cause SYSENTER to jump to the wrong + * location, because AMD doesn't allow SYSENTER in + * long mode (either 32- or 64-bit). + */ + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, + (unsigned long)(cpu_entry_stack(smp_processor_id()) + 1)); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); +#else + wrmsrl_cstar((unsigned long)ignore_sysret); + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); +#endif + } } #else /* CONFIG_X86_64 */ @@ -2218,18 +2232,24 @@ void cpu_init_exception_handling(void) /* paranoid_entry() gets the CPU number from the GDT */ setup_getcpu(cpu); - /* IST vectors need TSS to be set up. */ - tss_setup_ist(tss); + /* Set up the TSS */ tss_setup_io_bitmap(tss); set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); - load_TR_desc(); /* GHCB needs to be setup to handle #VC. */ setup_ghcb(); - /* Finally load the IDT */ - load_current_idt(); + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + /* Set up FRED exception handling */ + cpu_init_fred_exceptions(); + } else { + /* IST vectors need TSS to be set up. */ + tss_setup_ist(tss); + + /* Finally load the IDT */ + load_current_idt(); + } } /* diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c new file mode 100644 index 000000000000..5b4272235f2e --- /dev/null +++ b/arch/x86/kernel/fred.c @@ -0,0 +1,86 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include +#include +#include +#include /* For cr4_set_bits() */ +#include + +/* + * Initialize FRED on this CPU. This cannot be __init as it is called + * during CPU hotplug. + */ +void cpu_init_fred_exceptions(void) +{ + wrmsrl(MSR_IA32_FRED_CONFIG, + FRED_CONFIG_ENTRYPOINT(fred_entrypoint_user) | + FRED_CONFIG_REDZONE(8) | /* Reserve for CALL emulation */ + FRED_CONFIG_INT_STKLVL(0)); + + /* + * The purpose of separate stacks for NMI, #DB and #MC *in the kernel* + * (remember that user space faults are always taken on stack level 0) + * is to avoid overflowing the kernel stack. + * + * #DB in the kernel would imply the use of a kernel debugger. + * + * #DF is the highest level because a #DF means "something went wrong + * *while delivering an exception*." The number of cases for which that + * can happen with FRED is drastically reduced and basically amounts to + * "the stack you pointed me to is broken." Thus, always change stacks + * on #DF, which means it should be at the highest level. + */ + wrmsrl(MSR_IA32_FRED_STKLVLS, + FRED_STKLVL(X86_TRAP_DB, 1) | + FRED_STKLVL(X86_TRAP_NMI, 2) | + FRED_STKLVL(X86_TRAP_MC, 2) | + FRED_STKLVL(X86_TRAP_DF, 3)); + + /* The FRED equivalents to IST stacks... */ + wrmsrl(MSR_IA32_FRED_RSP1, __this_cpu_ist_top_va(DB)); + wrmsrl(MSR_IA32_FRED_RSP2, __this_cpu_ist_top_va(NMI)); + wrmsrl(MSR_IA32_FRED_RSP3, __this_cpu_ist_top_va(DF)); + + /* Not used with FRED */ + wrmsrl(MSR_LSTAR, 0ULL); + wrmsrl(MSR_CSTAR, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL); + + /* Enable FRED */ + cr4_set_bits(X86_CR4_FRED); + idt_invalidate(); /* Any further IDT use is a bug */ + + /* Use int $0x80 for 32-bit system calls in FRED mode */ + setup_clear_cpu_cap(X86_FEATURE_SYSENTER32); + setup_clear_cpu_cap(X86_FEATURE_SYSCALL32); +} + +/* + * Initialize system vectors from a FRED perspective, so + * lapic_assign_system_vectors() can do its job. + */ +void __init fred_setup_apic(void) +{ + int i; + + for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++) + set_bit(i, system_vectors); + + /* + * Don't set the non assigned system vectors in the + * system_vectors bitmap. Otherwise they show up in + * /proc/interrupts. + */ +#ifdef CONFIG_SMP + set_bit(IRQ_MOVE_CLEANUP_VECTOR, system_vectors); +#endif + + for (i = 0; i < NR_SYSTEM_VECTORS; i++) { + if (get_system_interrupt_handler(i) != NULL) { + set_bit(i + FIRST_SYSTEM_VECTOR, system_vectors); + } + } + + /* The rest are fair game... */ +} diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c index c683666876f1..2a510f72dd11 100644 --- a/arch/x86/kernel/irqinit.c +++ b/arch/x86/kernel/irqinit.c @@ -28,6 +28,7 @@ #include #include #include +#include #include /* @@ -96,7 +97,11 @@ void __init native_init_IRQ(void) /* Execute any quirks before the call gates are initialised: */ x86_init.irqs.pre_vector_init(); - idt_setup_apic_and_irq_gates(); + if (cpu_feature_enabled(X86_FEATURE_FRED)) + fred_setup_apic(); + else + idt_setup_apic_and_irq_gates(); + lapic_assign_system_vectors(); if (!acpi_ioapic && !of_ioapic && nr_legacy_irqs()) { diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 549f7f962f8f..ecfaf4d647bb 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1537,6 +1537,14 @@ static system_interrupt_handler system_interrupt_handlers[NR_SYSTEM_VECTORS] = { #undef SYSV +system_interrupt_handler get_system_interrupt_handler(unsigned int i) +{ + if (i >= NR_SYSTEM_VECTORS) + return NULL; + + return system_interrupt_handlers[i]; +} + /* * External interrupt dispatch function. * @@ -1572,7 +1580,8 @@ void __init install_system_interrupt_handler(unsigned int n, const void *asm_add #ifdef CONFIG_X86_64 system_interrupt_handlers[n - FIRST_SYSTEM_VECTOR] = (system_interrupt_handler)addr; #endif - alloc_intr_gate(n, asm_addr); + if (!cpu_feature_enabled(X86_FEATURE_FRED)) + alloc_intr_gate(n, asm_addr); } void __init trap_init(void) @@ -1585,7 +1594,10 @@ void __init trap_init(void) /* Initialize TSS before setting up traps so ISTs work */ cpu_init_exception_handling(); + /* Setup traps as cpu_init() might #GP */ - idt_setup_traps(); + if (!cpu_feature_enabled(X86_FEATURE_FRED)) + idt_setup_traps(); + cpu_init(); } From patchwork Mon Mar 27 07:58:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188747 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 875DEC761AF for ; Mon, 27 Mar 2023 08:25:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233223AbjC0IZa (ORCPT ); Mon, 27 Mar 2023 04:25:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233120AbjC0IYz (ORCPT ); Mon, 27 Mar 2023 04:24:55 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 317E2559F; Mon, 27 Mar 2023 01:24:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905479; x=1711441479; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SfNRJW38HST/J2vYBy+d2Rz7YNwfS+VaJzsap6qMLmE=; b=fvgsSzQg2y5+8N4JOrf6ltJVVeIToOnneOUJoxpYYCCUW5KZkhkG9Ocb K7ehhSjB6sbKJqxXc3ZicSKnM4AL9q4QJzgnXMN/DKGZFJULH14eaJqA7 cBDoxuxTgizgE0ywkJJXIMj5494rMjDnshr/E5x14nMgIgeKCMCkQeT1z Y+JZtpPfzNO1v5zYCjHw5SpUrBb3lEHSXWRha0+VEZnqpiRsXGV6/xp8F nECDULAFymj8ZqOn/GW9UXGZNZuwIY1c8UMf5DBX3y3kYH6dMOgIzw6aG vjlzfmtWk4PXfjh3k6lm4VEWzgbc9eU9H2D728oJQ8xjnwr0buOOEO906 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930315" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930315" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787115" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787115" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:37 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 22/33] x86/fred: update MSR_IA32_FRED_RSP0 during task switch Date: Mon, 27 Mar 2023 00:58:27 -0700 Message-Id: <20230327075838.5403-23-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" MSR_IA32_FRED_RSP0 is used during ring 3 event delivery, and needs to be updated to point to the top of next task stack during task switch. Update MSR_IA32_FRED_RSP0 with WRMSR instruction for now, and will use WRMSRNS/WRMSRLIST for performance once it gets upstreamed. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/switch_to.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h index 5c91305d09d2..00fd85abc1d2 100644 --- a/arch/x86/include/asm/switch_to.h +++ b/arch/x86/include/asm/switch_to.h @@ -68,9 +68,16 @@ static inline void update_task_stack(struct task_struct *task) #ifdef CONFIG_X86_32 this_cpu_write(cpu_tss_rw.x86_tss.sp1, task->thread.sp0); #else - /* Xen PV enters the kernel on the thread stack. */ - if (cpu_feature_enabled(X86_FEATURE_XENPV)) + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + /* + * Will use WRMSRNS/WRMSRLIST for performance once it's upstreamed. + */ + wrmsrl(MSR_IA32_FRED_RSP0, + task_top_of_stack(task) + TOP_OF_KERNEL_STACK_PADDING); + } else if (cpu_feature_enabled(X86_FEATURE_XENPV)) { + /* Xen PV enters the kernel on the thread stack. */ load_sp0(task_top_of_stack(task)); + } #endif } From patchwork Mon Mar 27 07:58:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188750 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4FD4AC761A6 for ; Mon, 27 Mar 2023 08:25:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233254AbjC0IZh (ORCPT ); Mon, 27 Mar 2023 04:25:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32992 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233143AbjC0IY5 (ORCPT ); Mon, 27 Mar 2023 04:24:57 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CEF5A49F5; Mon, 27 Mar 2023 01:24:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905480; x=1711441480; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=O9acOeCuSr/+wKyZ7069qEc1yb3O4hIxP1jgg1zr2lE=; b=JDudpN/wLOK1gz2rJGiwhSLnC2/pgicIip9UV6MViOd4+IsGn3yJI/Yy EQbf4fUheKh8tDW0XYbTzJ6ItpVMTIwkMoYqm/jIof8pylc8EOHPX7Blj 6d1O1J8IgzwgmUprz1nbcVK8kpH1zKAzER1YTdaPNHMG9/+eihnSgd2zx jAJIzy1aNB2YmGCH4P8vJz+KYJsTBAbqdUvWNODEwB1tstzWYgIqYq0ww 3oGyouVBmLfLETSBgkzjxnOoQZPuSa5MwX15TyXlre1tfrnREF3z9VpTK O736mw9goo7yXW1BiV0Oh07wtIJS50cA49D0JX5j9GHo+6vgKg0KZkFAq g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930328" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930328" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787118" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787118" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:38 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 23/33] x86/fred: let ret_from_fork() jmp to fred_exit_user when FRED is enabled Date: Mon, 27 Mar 2023 00:58:28 -0700 Message-Id: <20230327075838.5403-24-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Let ret_from_fork() jmp to fred_exit_user when FRED is enabled, otherwise the existing IDT code is chosen. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/entry/entry_64.S | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index eccc3431e515..5b595a9b2ffb 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -299,7 +299,12 @@ SYM_CODE_START_NOALIGN(ret_from_fork) UNWIND_HINT_REGS movq %rsp, %rdi call syscall_exit_to_user_mode /* returns with IRQs disabled */ +#ifdef CONFIG_X86_FRED + ALTERNATIVE "jmp swapgs_restore_regs_and_return_to_usermode", \ + "jmp fred_exit_user", X86_FEATURE_FRED +#else jmp swapgs_restore_regs_and_return_to_usermode +#endif 1: /* kernel thread */ From patchwork Mon Mar 27 07:58:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188749 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3319CC761AF for ; Mon, 27 Mar 2023 08:25:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233243AbjC0IZf (ORCPT ); Mon, 27 Mar 2023 04:25:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59822 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233142AbjC0IY5 (ORCPT ); Mon, 27 Mar 2023 04:24:57 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E82B05BAE; Mon, 27 Mar 2023 01:24:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905481; x=1711441481; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=T7eEyAoHvwiW3Acg9V6KJ8qZ7RgqMyuTGxhG3JO7gws=; b=VMbCHpXCgAVEzNVydp1vFudTY35BnyVXFoCu3v06MiC4LNVC7HRQCkvr 7wUSFG5eCZnzSg8uHUdx/9tGXR+9jNW/jSJ74k1tYeEhVK0JVzM5sRqdB 0NrmcsClK1GGvtRtjoRd+sA4TxUhRiROac7pDm8NU7bK/T0Ue9SJlM1Il BKpWcMIgTj9airG3DRPfAfe4jvbFpYAQf3CPFSnGh5kDK/Ye4aDTHIdIU ZihUMSQZ3Kl0Uco59OmzBVfygEmAFS7cJqMLOMenqTkbRO2ddxcuKCqKj lGSq9D01lkOEOL6Uju32Sq71Phx4MedQOel5itO2+RUBMCcfdQY5WQk4X A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930340" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930340" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787121" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787121" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:38 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 24/33] x86/fred: disallow the swapgs instruction when FRED is enabled Date: Mon, 27 Mar 2023 00:58:29 -0700 Message-Id: <20230327075838.5403-25-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" The FRED architecture establishes the full supervisor/user through: 1) FRED event delivery swaps the value of the GS base address and that of the IA32_KERNEL_GS_BASE MSR. 2) ERETU swaps the value of the GS base address and that of the IA32_KERNEL_GS_BASE MSR. Thus, the swapgs instruction is disallowed when FRED is enabled, otherwise it causes #UD. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/kernel/process_64.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index a1aa74864c8b..2bea86073646 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -165,7 +165,8 @@ static noinstr unsigned long __rdgsbase_inactive(void) lockdep_assert_irqs_disabled(); - if (!cpu_feature_enabled(X86_FEATURE_XENPV)) { + if (!cpu_feature_enabled(X86_FEATURE_FRED) && + !cpu_feature_enabled(X86_FEATURE_XENPV)) { native_swapgs(); gsbase = rdgsbase(); native_swapgs(); @@ -190,7 +191,8 @@ static noinstr void __wrgsbase_inactive(unsigned long gsbase) { lockdep_assert_irqs_disabled(); - if (!cpu_feature_enabled(X86_FEATURE_XENPV)) { + if (!cpu_feature_enabled(X86_FEATURE_FRED) && + !cpu_feature_enabled(X86_FEATURE_XENPV)) { native_swapgs(); wrgsbase(gsbase); native_swapgs(); From patchwork Mon Mar 27 07:58:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188751 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4DCEC761AF for ; Mon, 27 Mar 2023 08:25:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233037AbjC0IZj (ORCPT ); Mon, 27 Mar 2023 04:25:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233146AbjC0IY5 (ORCPT ); Mon, 27 Mar 2023 04:24:57 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EBB6B5BAF; Mon, 27 Mar 2023 01:24:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905481; x=1711441481; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YTgEgmA3f1uzY8PacyZ6XmpZcGIgDW5/dB0vDrivilc=; b=nMsYT/+D516rMU7OdG+zZpqVDvNBVMJL14MLn9WoRIC9GHtR12TK+89U PclzXEfeE7swfjmZofCNYMMqH8fjrFInzf23LcXwbOKivAsdxZ85FwVh4 R8s++5TtT5YUdY97E/H0EMBIQAcOUyOKY8Ak32NQxS1K0vVcu2TuPbBAI AEps6S7LV09c1Rogdbj3DLyOHvDqpc5duXPnqt3BW2z5IMZnst1jom/wu B8mgwGjRrwnIbGp9JocAast4w6tNlfhIkdhOgANo2c1AvyU9ncd2R6qAv y+rgcEatElrZJVNq/8rkBgaJtGTTPUW+D+4B395T9rDm3esFU63mWe0DF Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930351" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930351" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787126" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787126" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:39 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 25/33] x86/fred: no ESPFIX needed when FRED is enabled Date: Mon, 27 Mar 2023 00:58:30 -0700 Message-Id: <20230327075838.5403-26-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Because FRED always restores the full value of %rsp, ESPFIX is no longer needed when it's enabled. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/kernel/espfix_64.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c index 16f9814c9be0..48d133a54f45 100644 --- a/arch/x86/kernel/espfix_64.c +++ b/arch/x86/kernel/espfix_64.c @@ -106,6 +106,10 @@ void __init init_espfix_bsp(void) pgd_t *pgd; p4d_t *p4d; + /* FRED systems don't need ESPFIX */ + if (cpu_feature_enabled(X86_FEATURE_FRED)) + return; + /* Install the espfix pud into the kernel page directory */ pgd = &init_top_pgt[pgd_index(ESPFIX_BASE_ADDR)]; p4d = p4d_alloc(&init_mm, pgd, ESPFIX_BASE_ADDR); @@ -129,6 +133,10 @@ void init_espfix_ap(int cpu) void *stack_page; pteval_t ptemask; + /* FRED systems don't need ESPFIX */ + if (cpu_feature_enabled(X86_FEATURE_FRED)) + return; + /* We only have to do this once... */ if (likely(per_cpu(espfix_stack, cpu))) return; /* Already initialized */ From patchwork Mon Mar 27 07:58:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188752 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13663C76195 for ; Mon, 27 Mar 2023 08:25:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233263AbjC0IZk (ORCPT ); Mon, 27 Mar 2023 04:25:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233163AbjC0IY7 (ORCPT ); Mon, 27 Mar 2023 04:24:59 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5342035B8; Mon, 27 Mar 2023 01:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905482; x=1711441482; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0wimZRjNi3uiq5fVA6+RsDM4ZmskElbCpQsewoI4FTQ=; b=oICGSYfMHDKqxXA827UYkD+kyqtxIKupa5LBQx0fhC2uA5hyaNDUCihC U24ToFeatRvhltMJS67Exglc1Nb43w1z7XcFxykclzRJSZuKfOEobUDon R4uLfWJ8occ8gANF6oJSlcz5wsrepcs4unTu/H38EpDOEbmrSStwoqOLp Suuy48+9J0WfUzARsmIiKOYPek2lCD1/mLpfCfgum44WrcPlpFgYRtO48 jEu+9M8qPs31TX0zQiDOoyhmK22ZnF75oo53nqLBN+rTR9Sc3DtOZ1ApS KCM5dku5uJTZawn3SDtyr4uhstpO58IKE9ZYINqJ93tOwrFPka8RRd0Op Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930372" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930372" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787130" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787130" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:39 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 26/33] x86/fred: allow single-step trap and NMI when starting a new thread Date: Mon, 27 Mar 2023 00:58:31 -0700 Message-Id: <20230327075838.5403-27-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" Allow single-step trap and NMI when starting a new thread, thus once the new thread returns to ring3, single-step trap and NMI are both enabled immediately. High-order 48 bits above the lowest 16 bit CS are discarded by the legacy IRET instruction, thus can be set unconditionally, even when FRED is not enabled. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/fred.h | 11 +++++++++++ arch/x86/kernel/process_64.c | 13 +++++++------ 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h index cd974edc8e8a..12449448e9bf 100644 --- a/arch/x86/include/asm/fred.h +++ b/arch/x86/include/asm/fred.h @@ -52,6 +52,14 @@ #define FRED_CSL_ALLOW_SINGLE_STEP _BITUL(25) #define FRED_CSL_INTERRUPT_SHADOW _BITUL(24) +/* + * High-order 48 bits above the lowest 16 bit CS are discarded by the + * legacy IRET instruction, thus can be set unconditionally, even when + * FRED is not enabled. + */ +#define CSL_PROCESS_START \ + (FRED_CSL_ENABLE_NMI | FRED_CSL_ALLOW_SINGLE_STEP) + #ifndef __ASSEMBLY__ #include @@ -115,6 +123,9 @@ void fred_setup_apic(void); #else #define cpu_init_fred_exceptions() BUG() #define fred_setup_apic() BUG() + +#define CSL_PROCESS_START 0 + #endif /* CONFIG_X86_FRED */ #endif /* ASM_X86_FRED_H */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 2bea86073646..58addf3c78bb 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -55,6 +55,7 @@ #include #include #include +#include #ifdef CONFIG_IA32_EMULATION /* Not included via unistd.h */ #include @@ -506,7 +507,7 @@ void x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase) static void start_thread_common(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp, - unsigned int _cs, unsigned int _ss, unsigned int _ds) + u16 _cs, u16 _ss, u16 _ds) { WARN_ON_ONCE(regs != current_pt_regs()); @@ -521,11 +522,11 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip, loadsegment(ds, _ds); load_gs_index(0); - regs->ip = new_ip; - regs->sp = new_sp; - regs->cs = _cs; - regs->ss = _ss; - regs->flags = X86_EFLAGS_IF; + regs->ip = new_ip; + regs->sp = new_sp; + regs->csx = _cs | CSL_PROCESS_START; + regs->ssx = _ss; + regs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED; } void From patchwork Mon Mar 27 07:58:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188753 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCD93C761AF for ; Mon, 27 Mar 2023 08:25:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233270AbjC0IZm (ORCPT ); Mon, 27 Mar 2023 04:25:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233167AbjC0IY7 (ORCPT ); Mon, 27 Mar 2023 04:24:59 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5348F55AF; Mon, 27 Mar 2023 01:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905482; x=1711441482; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VwyNhQio5Socw2IUc5t0oP8YMATfSIZwdBkoKsUZAk4=; b=TmApJqE8b8xFgRgi+hOhjSdyzxapoI4qJdTvMTVCZv31IVGjI2z4cTaH xGeJFRhctrFdRKr/QzHx88zUip9Kc035Zl/Qv/wc0AaLUOxOf84ZxvOnV 2zhs342DF7lWxsz0+irVj9olkiUaGFRuXYkNgmfNFow+LYiKcYR6A2QDJ eiDeSJaP2Q83MQ9Sm5xnlkPJipBj0JWW5Y/9tOHxIhNF5I/i9b/sL4QPK KT4gSHtSfgKgvUPUk6V+aJkABm9i3YnOsKzOV99vZKFo7qLvLDrOyBRc0 TL2WJJo6mntAtJz9xaZEA777kze+tOxl+rDGS5JAfUBNa3CJMIlwc1gU3 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930375" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930375" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787138" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787138" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:39 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 27/33] x86/fred: fixup fault on ERETU by jumping to fred_entrypoint_user Date: Mon, 27 Mar 2023 00:58:32 -0700 Message-Id: <20230327075838.5403-28-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If the stack frame contains an invalid user context (e.g. due to invalid SS, a non-canonical RIP, etc.) the ERETU instruction will trap (#SS or #GP). From a Linux point of view, this really should be considered a user space failure, so use the standard fault fixup mechanism to intercept the fault, fix up the exception frame, and redirect execution to fred_entrypoint_user. The end result is that it appears just as if the hardware had taken the exception immediately after completing the transition to user space. Suggested-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v5: * Move the NMI bit from an invalid stack frame, which caused ERETU to fault, to the fault handler's stack frame, thus to unblock NMI ASAP if NMI is blocked (Lai Jiangshan). --- arch/x86/entry/entry_64_fred.S | 8 +++-- arch/x86/include/asm/extable_fixup_types.h | 4 ++- arch/x86/mm/extable.c | 36 ++++++++++++++++++++++ 3 files changed, 45 insertions(+), 3 deletions(-) diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S index d975cacd060f..efe2bcd11273 100644 --- a/arch/x86/entry/entry_64_fred.S +++ b/arch/x86/entry/entry_64_fred.S @@ -5,8 +5,10 @@ * The actual FRED entry points. */ #include -#include +#include #include +#include +#include #include #include "calling.h" @@ -38,7 +40,9 @@ SYM_CODE_START_NOALIGN(fred_entrypoint_user) call fred_entry_from_user SYM_INNER_LABEL(fred_exit_user, SYM_L_GLOBAL) FRED_EXIT - ERETU +1: ERETU + + _ASM_EXTABLE_TYPE(1b, fred_entrypoint_user, EX_TYPE_ERETU) SYM_CODE_END(fred_entrypoint_user) .fill fred_entrypoint_kernel - ., 1, 0xcc diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h index 991e31cfde94..1585c798a02f 100644 --- a/arch/x86/include/asm/extable_fixup_types.h +++ b/arch/x86/include/asm/extable_fixup_types.h @@ -64,6 +64,8 @@ #define EX_TYPE_UCOPY_LEN4 (EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4)) #define EX_TYPE_UCOPY_LEN8 (EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8)) -#define EX_TYPE_ZEROPAD 20 /* longword load with zeropad on fault */ +#define EX_TYPE_ZEROPAD 20 /* longword load with zeropad on fault */ + +#define EX_TYPE_ERETU 21 #endif diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index 60814e110a54..a5d75b27a993 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -6,6 +6,7 @@ #include #include +#include #include #include #include @@ -195,6 +196,37 @@ static bool ex_handler_ucopy_len(const struct exception_table_entry *fixup, return ex_handler_uaccess(fixup, regs, trapnr); } +#ifdef CONFIG_X86_FRED +static bool ex_handler_eretu(const struct exception_table_entry *fixup, + struct pt_regs *regs, unsigned long error_code) +{ + struct pt_regs *uregs = (struct pt_regs *)(regs->sp - offsetof(struct pt_regs, ip)); + unsigned short ss = uregs->ss; + unsigned short cs = uregs->cs; + + /* + * Move the NMI bit from the invalid stack frame, which caused ERETU + * to fault, to the fault handler's stack frame, thus to unblock NMI + * with the fault handler's ERETS instruction ASAP if NMI is blocked. + */ + regs->nmi = uregs->nmi; + + fred_info(uregs)->edata = fred_event_data(regs); + uregs->ssx = regs->ssx; + uregs->ss = ss; + uregs->csx = regs->csx; + uregs->nmi = 0; /* The NMI bit was moved away above */ + uregs->current_stack_level = 0; + uregs->cs = cs; + + /* Copy error code to uregs and adjust stack pointer accordingly */ + uregs->orig_ax = error_code; + regs->sp -= 8; + + return ex_handler_default(fixup, regs); +} +#endif + int ex_get_fixup_type(unsigned long ip) { const struct exception_table_entry *e = search_exception_tables(ip); @@ -272,6 +304,10 @@ int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code, return ex_handler_ucopy_len(e, regs, trapnr, reg, imm); case EX_TYPE_ZEROPAD: return ex_handler_zeropad(e, regs, fault_addr); +#ifdef CONFIG_X86_FRED + case EX_TYPE_ERETU: + return ex_handler_eretu(e, regs, error_code); +#endif } BUG(); } From patchwork Mon Mar 27 07:58:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188754 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB046C7619A for ; Mon, 27 Mar 2023 08:25:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233274AbjC0IZo (ORCPT ); Mon, 27 Mar 2023 04:25:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233170AbjC0IY7 (ORCPT ); Mon, 27 Mar 2023 04:24:59 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A5A0E5B8B; Mon, 27 Mar 2023 01:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905482; x=1711441482; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5IBl0XY5xGj0Ul5jZJTZVlHtGnpVju3gJ8HZ07Stwk0=; b=MhuTPWr+7THHdZI9zsPhgiWedqMcjcaQhkNoXGXGj0csVvzYKn5FctUZ xucUkg9DV50r8wdzcuaX9RtMWXnJQW88pr2TvfZrvakhGnbFEc+QxOcQQ isfKyrFd+u+P4tHFa+V6vwLHkB6gNjYSXEGER+J89Y0biKCaQDHlKiZ1h F4k+z4TIxsV9UuR/ILYbT5Tjm6C+4Wf+X1c/Fh32ZfS4GQ8kXUKZE19T4 SKpGzYY8gVCtbnjKRZPhNBXUcSMXxwvbQ+1xgWaYGnU91pagRG222x6tF yfyZ44gBCwbGUDDJ9P5j+yIw4tcM1CVKs3Bqb5O+iC+b+sdU1PQ36M8ik g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930391" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930391" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787142" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787142" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:40 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 28/33] x86/ia32: do not modify the DPL bits for a null selector Date: Mon, 27 Mar 2023 00:58:33 -0700 Message-Id: <20230327075838.5403-29-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When a null selector is to be loaded into a segment register, reload_segments() sets its DPL bits to 3. Later when the IRET instruction loads it, it zeros the segment register. The two operations offset each other to actually effect a nop. Unlike IRET, ERETU does not make any of DS, ES, FS, or GS null if it is found to have DPL < 3. It is expected that a FRED-enabled operating system will return to ring 3 (in compatibility mode) only when those segments all have DPL = 3. Thus when FRED is enabled, we end up with having 3 in a segment register even when it is initially set to 0. Fix it by not modifying the DPL bits for a null selector. Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/kernel/signal_32.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c index 9027fc088f97..7796cf84fca2 100644 --- a/arch/x86/kernel/signal_32.c +++ b/arch/x86/kernel/signal_32.c @@ -36,22 +36,27 @@ #ifdef CONFIG_IA32_EMULATION #include +static inline u16 usrseg(u16 sel) +{ + return sel <= 3 ? sel : sel | 3; +} + static inline void reload_segments(struct sigcontext_32 *sc) { unsigned int cur; savesegment(gs, cur); - if ((sc->gs | 0x03) != cur) - load_gs_index(sc->gs | 0x03); + if (usrseg(sc->gs) != cur) + load_gs_index(usrseg(sc->gs)); savesegment(fs, cur); - if ((sc->fs | 0x03) != cur) - loadsegment(fs, sc->fs | 0x03); + if (usrseg(sc->fs) != cur) + loadsegment(fs, usrseg(sc->fs)); savesegment(ds, cur); - if ((sc->ds | 0x03) != cur) - loadsegment(ds, sc->ds | 0x03); + if (usrseg(sc->ds) != cur) + loadsegment(ds, usrseg(sc->ds)); savesegment(es, cur); - if ((sc->es | 0x03) != cur) - loadsegment(es, sc->es | 0x03); + if (usrseg(sc->es) != cur) + loadsegment(es, usrseg(sc->es)); } #define sigset32_t compat_sigset_t From patchwork Mon Mar 27 07:58:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188755 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F80DC7619A for ; Mon, 27 Mar 2023 08:26:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233360AbjC0I0r (ORCPT ); Mon, 27 Mar 2023 04:26:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233206AbjC0IZ0 (ORCPT ); Mon, 27 Mar 2023 04:25:26 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B4B83581; Mon, 27 Mar 2023 01:24:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905485; x=1711441485; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZfrrJeS7iXSZm5q7fN9/blE2JnxKZ2C428XwyL9v4Mw=; b=TuvE3pdrkPHEiOW0foOUwfPKN4uZUZrEjqlql8/okLIwUcKkVf2sA4Dt hLq06FN28CnlgYLx+GntMBD1CytlhMHSKqC9r+vCAngq/DSkwff/NP8q1 NHndbwpNHT4LhwvsQJ8BfZ9+cmnqsu4qQcT8QqyM85dHVs8/ZVkdjXa1o EQkRdNwW0WKIyTZU1ZVF43xdxjWzPlU8V1DG1+UH0OcJ/Ki5BD3cdfnl4 yAcffkIYpJaZemX/ZdGayV6pyroyDd3hEgEcPU9KXbJWRePUJNeBw1IUt I343ZOV2K4RkqzUrcDpiVlRTCBQr72c3uvAx7XClxp660zjvTFJskvtMR w==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930395" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930395" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787147" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787147" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:40 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 29/33] x86/fred: allow FRED systems to use interrupt vectors 0x10-0x1f Date: Mon, 27 Mar 2023 00:58:34 -0700 Message-Id: <20230327075838.5403-30-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: "H. Peter Anvin (Intel)" FRED inherits the Intel VT-x enhancement of classified events with a two-level event dispatch logic. The first-level dispatch is on the event type, and the second-level is on the event vector. This also means that vectors in different event types are orthogonal, thus, vectors 0x10-0x1f become available as hardware interrupts. Enable interrupt vectors 0x10-0x1f on FRED systems (interrupt 0x80 is already enabled.) Most of these changes are about removing the assumption that the lowest-priority vector is hard-wired to 0x20. Signed-off-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/idtentry.h | 4 ++-- arch/x86/include/asm/irq.h | 5 +++++ arch/x86/include/asm/irq_vectors.h | 15 +++++++++++---- arch/x86/kernel/apic/apic.c | 11 ++++++++--- arch/x86/kernel/apic/vector.c | 8 +++++++- arch/x86/kernel/fred.c | 4 ++-- arch/x86/kernel/idt.c | 6 +++--- arch/x86/kernel/irq.c | 2 +- arch/x86/kernel/traps.c | 2 ++ 9 files changed, 41 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index bd43866f9c3e..57c891148b59 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -546,8 +546,8 @@ __visible noinstr void func(struct pt_regs *regs, \ */ .align IDT_ALIGN SYM_CODE_START(irq_entries_start) - vector=FIRST_EXTERNAL_VECTOR - .rept NR_EXTERNAL_VECTORS + vector=FIRST_EXTERNAL_VECTOR_IDT + .rept FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR_IDT UNWIND_HINT_IRET_REGS 0 : ENDBR diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h index 768aa234cbb4..e4be6f8409ad 100644 --- a/arch/x86/include/asm/irq.h +++ b/arch/x86/include/asm/irq.h @@ -11,6 +11,11 @@ #include #include +/* + * The first available IRQ vector + */ +extern unsigned int __ro_after_init first_external_vector; + /* * The irq entry code is in the noinstr section and the start/end of * __irqentry_text is emitted via labels. Make the build fail if diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 43dcb9284208..cb3670a7c18f 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -31,15 +31,23 @@ /* * IDT vectors usable for external interrupt sources start at 0x20. - * (0x80 is the syscall vector, 0x30-0x3f are for ISA) + * (0x80 is the syscall vector, 0x30-0x3f are for ISA). + * + * With FRED we can also use 0x10-0x1f even though those overlap + * exception vectors as FRED distinguishes exceptions and interrupts. + * Therefore, FIRST_EXTERNAL_VECTOR is no longer a constant. */ -#define FIRST_EXTERNAL_VECTOR 0x20 +#define FIRST_EXTERNAL_VECTOR_IDT 0x20 +#define FIRST_EXTERNAL_VECTOR_FRED 0x10 +#define FIRST_EXTERNAL_VECTOR first_external_vector /* * Reserve the lowest usable vector (and hence lowest priority) 0x20 for * triggering cleanup after irq migration. 0x21-0x2f will still be used * for device interrupts. */ +#define IRQ_MOVE_CLEANUP_VECTOR_IDT FIRST_EXTERNAL_VECTOR_IDT +#define IRQ_MOVE_CLEANUP_VECTOR_FRED FIRST_EXTERNAL_VECTOR_FRED #define IRQ_MOVE_CLEANUP_VECTOR FIRST_EXTERNAL_VECTOR #define IA32_SYSCALL_VECTOR 0x80 @@ -48,7 +56,7 @@ * Vectors 0x30-0x3f are used for ISA interrupts. * round up to the next 16-vector boundary */ -#define ISA_IRQ_VECTOR(irq) (((FIRST_EXTERNAL_VECTOR + 16) & ~15) + irq) +#define ISA_IRQ_VECTOR(irq) (((FIRST_EXTERNAL_VECTOR_IDT + 16) & ~15) + irq) /* * Special IRQ vectors used by the SMP architecture, 0xf0-0xff @@ -114,7 +122,6 @@ #define FIRST_SYSTEM_VECTOR NR_VECTORS #endif -#define NR_EXTERNAL_VECTORS (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR) #define NR_SYSTEM_VECTORS (NR_VECTORS - FIRST_SYSTEM_VECTOR) /* diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 20d9a604da7c..eef67f64aa81 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1621,12 +1621,17 @@ static void setup_local_APIC(void) /* * Set Task Priority to 'accept all except vectors 0-31'. An APIC * vector in the 16-31 range could be delivered if TPR == 0, but we - * would think it's an exception and terrible things will happen. We - * never change this later on. + * would think it's an exception and terrible things will happen, + * unless we are using FRED in which case interrupts and + * exceptions are distinguished by type code. + * + * We never change this later on. */ + BUG_ON(!first_external_vector); + value = apic_read(APIC_TASKPRI); value &= ~APIC_TPRI_MASK; - value |= 0x10; + value |= (first_external_vector - 0x10) & APIC_TPRI_MASK; apic_write(APIC_TASKPRI, value); /* Clear eventually stale ISR/IRR bits */ diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index c1efebd27e6c..f4325445fd78 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -46,6 +46,7 @@ static struct irq_matrix *vector_matrix; #ifdef CONFIG_SMP static DEFINE_PER_CPU(struct hlist_head, cleanup_list); #endif +unsigned int first_external_vector = FIRST_EXTERNAL_VECTOR_IDT; void lock_vector_lock(void) { @@ -796,7 +797,12 @@ int __init arch_early_irq_init(void) * Allocate the vector matrix allocator data structure and limit the * search area. */ - vector_matrix = irq_alloc_matrix(NR_VECTORS, FIRST_EXTERNAL_VECTOR, + if (cpu_feature_enabled(X86_FEATURE_FRED)) + first_external_vector = FIRST_EXTERNAL_VECTOR_FRED; + else + first_external_vector = FIRST_EXTERNAL_VECTOR_IDT; + + vector_matrix = irq_alloc_matrix(NR_VECTORS, first_external_vector, FIRST_SYSTEM_VECTOR); BUG_ON(!vector_matrix); diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c index 5b4272235f2e..a3b678667b07 100644 --- a/arch/x86/kernel/fred.c +++ b/arch/x86/kernel/fred.c @@ -64,7 +64,7 @@ void __init fred_setup_apic(void) { int i; - for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++) + for (i = 0; i < FIRST_EXTERNAL_VECTOR_FRED; i++) set_bit(i, system_vectors); /* @@ -73,7 +73,7 @@ void __init fred_setup_apic(void) * /proc/interrupts. */ #ifdef CONFIG_SMP - set_bit(IRQ_MOVE_CLEANUP_VECTOR, system_vectors); + set_bit(IRQ_MOVE_CLEANUP_VECTOR_FRED, system_vectors); #endif for (i = 0; i < NR_SYSTEM_VECTORS; i++) { diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index a58c6bc1cd68..d3fd86f85de9 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -131,7 +131,7 @@ static const __initconst struct idt_data apic_idts[] = { INTG(RESCHEDULE_VECTOR, asm_sysvec_reschedule_ipi), INTG(CALL_FUNCTION_VECTOR, asm_sysvec_call_function), INTG(CALL_FUNCTION_SINGLE_VECTOR, asm_sysvec_call_function_single), - INTG(IRQ_MOVE_CLEANUP_VECTOR, asm_sysvec_irq_move_cleanup), + INTG(IRQ_MOVE_CLEANUP_VECTOR_IDT, asm_sysvec_irq_move_cleanup), INTG(REBOOT_VECTOR, asm_sysvec_reboot), #endif @@ -274,13 +274,13 @@ static void __init idt_map_in_cea(void) */ void __init idt_setup_apic_and_irq_gates(void) { - int i = FIRST_EXTERNAL_VECTOR; + int i = FIRST_EXTERNAL_VECTOR_IDT; void *entry; idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts), true); for_each_clear_bit_from(i, system_vectors, FIRST_SYSTEM_VECTOR) { - entry = irq_entries_start + IDT_ALIGN * (i - FIRST_EXTERNAL_VECTOR); + entry = irq_entries_start + IDT_ALIGN * (i - FIRST_EXTERNAL_VECTOR_IDT); set_intr_gate(i, entry); } diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 7e125fff45ab..b7511e02959c 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -359,7 +359,7 @@ void fixup_irqs(void) * vector_lock because the cpu is already marked !online, so * nothing else will touch it. */ - for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { + for (vector = first_external_vector; vector < NR_VECTORS; vector++) { if (IS_ERR_OR_NULL(__this_cpu_read(vector_irq[vector]))) continue; diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index ecfaf4d647bb..73471053ed02 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1489,6 +1489,8 @@ DEFINE_IDTENTRY_IRQ(spurious_interrupt) pr_info("Spurious interrupt (vector 0x%x) on CPU#%d, should never happen.\n", vector, smp_processor_id()); } + +unsigned int first_external_vector = FIRST_EXTERNAL_VECTOR_IDT; #endif static void dispatch_table_spurious_interrupt(struct pt_regs *regs) From patchwork Mon Mar 27 07:58:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188756 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 19726C76195 for ; Mon, 27 Mar 2023 08:27:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233368AbjC0I1E (ORCPT ); Mon, 27 Mar 2023 04:27:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33122 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233185AbjC0I0C (ORCPT ); Mon, 27 Mar 2023 04:26:02 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0E8D658A; Mon, 27 Mar 2023 01:24:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905490; x=1711441490; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=nPnfSEbo6BTspdqPq0giRPFRqa1bDec0cuE9fytvdhg=; b=I5/u01oRa9TSnGFyP8y+2Cuxq2uCiDpBYOAXjlqNWPMPKhWuIwmnRbct 8mJpSGVIm0bfn4n1Wqv9BR614ZzbQ2TbkwhUwD09HHlDbU3fwIY+OB0Jf sBD7oYQmnMCuniWA0AFz6tHmj/s3bYZzAX7NM3NY4ZLWJv9CYkrSdd2c7 I8NQsL+zlw+Ls9HwsU2ZBlYaOFzBJVK/GIKrn9xgpV7evcjs5zDavy0Oo 8LrZChvzLkU+TNQ8TX07bZ999QxzCT7S+CWe06tC4rkRVLUpDXX+j9CW/ BKK19Uou0fzNqiLeZ8K+OX6OI4D0biwMkm5yCRp0E9SUwm6sOxKzZUcLI A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930407" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930407" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787154" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787154" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:41 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 30/33] x86/fred: allow dynamic stack frame size Date: Mon, 27 Mar 2023 00:58:35 -0700 Message-Id: <20230327075838.5403-31-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org A FRED stack frame could contain different amount of information for different event types, or perhaps even for different instances of the same event type. Thus we need to eliminate the need of any advance information of the stack frame size to allow dynamic stack frame size. Implement it through: 1) add a new field user_pt_regs to thread_info, and initialize it with a pointer to a virtual pt_regs structure at the top of a thread stack. 2) save a pointer to the user-space pt_regs structure created by fred_entrypoint_user() to user_pt_regs in fred_entry_from_user(). 3) initialize the init_thread_info's user_pt_regs with a pointer to a virtual pt_regs structure at the top of init stack. This approach also works for IDT, thus we unify the code. Suggested-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/entry/entry_32.S | 2 +- arch/x86/entry/entry_fred.c | 2 ++ arch/x86/include/asm/entry-common.h | 3 +++ arch/x86/include/asm/processor.h | 12 +++------ arch/x86/include/asm/switch_to.h | 3 +-- arch/x86/include/asm/thread_info.h | 41 ++++------------------------- arch/x86/kernel/head_32.S | 3 +-- arch/x86/kernel/process.c | 5 ++++ kernel/fork.c | 6 +++++ 9 files changed, 27 insertions(+), 50 deletions(-) diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index 91397f58ac30..5adc4cf33d92 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -1244,7 +1244,7 @@ SYM_CODE_START(rewind_stack_and_make_dead) xorl %ebp, %ebp movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esi - leal -TOP_OF_KERNEL_STACK_PADDING-PTREGS_SIZE(%esi), %esp + leal -PTREGS_SIZE(%esi), %esp call make_task_dead 1: jmp 1b diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c index 1862bd7a9fc1..2754835b0efe 100644 --- a/arch/x86/entry/entry_fred.c +++ b/arch/x86/entry/entry_fred.c @@ -178,6 +178,8 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs) [EVENT_TYPE_OTHER] = fred_syscall_slow }; + current->thread_info.user_pt_regs = regs; + /* * FRED employs a two-level event dispatch mechanism, with * the first-level on the type of an event and the second-level diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h index 117903881fe4..5b7d0f47f188 100644 --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -12,6 +12,9 @@ /* Check that the stack and regs on entry from user mode are sane. */ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) { + if (!cpu_feature_enabled(X86_FEATURE_FRED)) + current->thread_info.user_pt_regs = regs; + if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) { /* * Make sure that the entry code gave us a sensible EFLAGS diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 8d73004e4cac..4a50d2a2c14b 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -626,17 +626,11 @@ static inline void spin_lock_prefetch(const void *x) prefetchw(x); } -#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \ - TOP_OF_KERNEL_STACK_PADDING) +#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack)) -#define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1)) +#define task_top_of_stack(task) ((unsigned long)task_stack_page(task) + THREAD_SIZE) -#define task_pt_regs(task) \ -({ \ - unsigned long __ptr = (unsigned long)task_stack_page(task); \ - __ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING; \ - ((struct pt_regs *)__ptr) - 1; \ -}) +#define task_pt_regs(task) ((task)->thread_info.user_pt_regs) #ifdef CONFIG_X86_32 #define INIT_THREAD { \ diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h index 00fd85abc1d2..0a31da150808 100644 --- a/arch/x86/include/asm/switch_to.h +++ b/arch/x86/include/asm/switch_to.h @@ -72,8 +72,7 @@ static inline void update_task_stack(struct task_struct *task) /* * Will use WRMSRNS/WRMSRLIST for performance once it's upstreamed. */ - wrmsrl(MSR_IA32_FRED_RSP0, - task_top_of_stack(task) + TOP_OF_KERNEL_STACK_PADDING); + wrmsrl(MSR_IA32_FRED_RSP0, task_top_of_stack(task)); } else if (cpu_feature_enabled(X86_FEATURE_XENPV)) { /* Xen PV enters the kernel on the thread stack. */ load_sp0(task_top_of_stack(task)); diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 998483078d5f..ced0a01e0a3e 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -13,42 +13,6 @@ #include #include -/* - * TOP_OF_KERNEL_STACK_PADDING is a number of unused bytes that we - * reserve at the top of the kernel stack. We do it because of a nasty - * 32-bit corner case. On x86_32, the hardware stack frame is - * variable-length. Except for vm86 mode, struct pt_regs assumes a - * maximum-length frame. If we enter from CPL 0, the top 8 bytes of - * pt_regs don't actually exist. Ordinarily this doesn't matter, but it - * does in at least one case: - * - * If we take an NMI early enough in SYSENTER, then we can end up with - * pt_regs that extends above sp0. On the way out, in the espfix code, - * we can read the saved SS value, but that value will be above sp0. - * Without this offset, that can result in a page fault. (We are - * careful that, in this case, the value we read doesn't matter.) - * - * In vm86 mode, the hardware frame is much longer still, so add 16 - * bytes to make room for the real-mode segments. - * - * x86-64 has a fixed-length stack frame, but it depends on whether - * or not FRED is enabled. Future versions of FRED might make this - * dynamic, but for now it is always 2 words longer. - */ -#ifdef CONFIG_X86_32 -# ifdef CONFIG_VM86 -# define TOP_OF_KERNEL_STACK_PADDING 16 -# else -# define TOP_OF_KERNEL_STACK_PADDING 8 -# endif -#else /* x86-64 */ -# ifdef CONFIG_X86_FRED -# define TOP_OF_KERNEL_STACK_PADDING (2*8) -# else -# define TOP_OF_KERNEL_STACK_PADDING 0 -# endif -#endif - /* * low level task data that entry.S needs immediate access to * - this struct should fit entirely inside of one cache line @@ -56,6 +20,7 @@ */ #ifndef __ASSEMBLY__ struct task_struct; +struct pt_regs; #include #include @@ -66,11 +31,14 @@ struct thread_info { #ifdef CONFIG_SMP u32 cpu; /* current CPU */ #endif + struct pt_regs *user_pt_regs; }; +#define INIT_TASK_PT_REGS ((struct pt_regs *)TOP_OF_INIT_STACK - 1) #define INIT_THREAD_INFO(tsk) \ { \ .flags = 0, \ + .user_pt_regs = INIT_TASK_PT_REGS, \ } #else /* !__ASSEMBLY__ */ @@ -240,6 +208,7 @@ static inline int arch_within_stack_frames(const void * const stack, extern void arch_task_cache_init(void); extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src); +extern void arch_init_user_pt_regs(struct task_struct *tsk); extern void arch_release_task_struct(struct task_struct *tsk); extern void arch_setup_new_exec(void); #define arch_setup_new_exec arch_setup_new_exec diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S index 67c8ed99144b..0201ddcd7576 100644 --- a/arch/x86/kernel/head_32.S +++ b/arch/x86/kernel/head_32.S @@ -517,8 +517,7 @@ SYM_DATA_END(initial_page_table) * reliably detect the end of the stack. */ SYM_DATA(initial_stack, - .long init_thread_union + THREAD_SIZE - - SIZEOF_PTREGS - TOP_OF_KERNEL_STACK_PADDING) + .long init_thread_union + THREAD_SIZE - SIZEOF_PTREGS) __INITRODATA int_msg: diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index b650cde3f64d..e1c6350290ae 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -98,6 +98,11 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) return 0; } +void arch_init_user_pt_regs(struct task_struct *tsk) +{ + tsk->thread_info.user_pt_regs = (struct pt_regs *)task_top_of_stack(tsk)- 1; +} + #ifdef CONFIG_X86_64 void arch_release_task_struct(struct task_struct *tsk) { diff --git a/kernel/fork.c b/kernel/fork.c index c0257cbee093..c5a6c23ec6d4 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -953,6 +953,10 @@ int __weak arch_dup_task_struct(struct task_struct *dst, return 0; } +void __weak arch_init_user_pt_regs(struct task_struct *tsk) +{ +} + void set_task_stack_end_magic(struct task_struct *tsk) { unsigned long *stackend; @@ -980,6 +984,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) if (err) goto free_tsk; + arch_init_user_pt_regs(tsk); + #ifdef CONFIG_THREAD_INFO_IN_TASK refcount_set(&tsk->stack_refcount, 1); #endif From patchwork Mon Mar 27 07:58:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188757 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B78ECC761AF for ; Mon, 27 Mar 2023 08:27:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233351AbjC0I1G (ORCPT ); Mon, 27 Mar 2023 04:27:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233115AbjC0I0W (ORCPT ); Mon, 27 Mar 2023 04:26:22 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F83E5FC3; Mon, 27 Mar 2023 01:24:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905492; x=1711441492; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qlKMvNMl6phwDM/NFmlFjCkIxGABaABfFRVavP867bI=; b=NEserNeaek6yzaixPJBN4iyfZ8SxQl4y3g9koSN3zMM7T6hpKzGq7t9S xaOhLvE730k/NDbglyFqkXbMrOU8sMNmBKYbnglKrj6EJcNY3YFbTsVuV 5f5agu6r9eJLTHhkigND+//z14gV6i/KTTf2Ssd0nd7aSfGSTrOaxMbPH jvEllAGHG3vqW2XxogesV4HsCA4RmOV/PlRsnsK2mQmKtnZysYpbTjvH2 25twTTMwOIIBEV0MsHiJ+cL/ozOq3cfgU3vGW3HkPW4V9zBbIUS/31YXa 8hjpiA3sZ+EwUeAXjAkSZCgRvBQRzb+XhC823M9p49eH17Wq3V+Nzub3T A==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930418" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930418" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787159" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787159" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:41 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 31/33] x86/fred: BUG() when ERETU with %rsp not equal to that when the ring 3 event was just delivered Date: Mon, 27 Mar 2023 00:58:36 -0700 Message-Id: <20230327075838.5403-32-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org A FRED stack frame generated by a ring 3 event should never be messed up, and the first thing we must make sure is that at the time an ERETU instruction is executed, %rsp must have the same address as that when the user level event was just delivered. However we don't want to bother the normal code path of ERETU because it's on the hotest code path, a good choice is to do this check when ERETU faults. Suggested-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/mm/extable.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index a5d75b27a993..bf8005558935 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -204,6 +204,14 @@ static bool ex_handler_eretu(const struct exception_table_entry *fixup, unsigned short ss = uregs->ss; unsigned short cs = uregs->cs; + /* + * A FRED stack frame generated by a ring 3 event should never be + * messed up, and the first thing we must make sure is that at the + * time an ERETU instruction is executed, %rsp must have the same + * address as that when the user level event was just delivered. + */ + BUG_ON(uregs != current->thread_info.user_pt_regs); + /* * Move the NMI bit from the invalid stack frame, which caused ERETU * to fault, to the fault handler's stack frame, thus to unblock NMI From patchwork Mon Mar 27 07:58:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188758 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C88DC7619A for ; Mon, 27 Mar 2023 08:27:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233385AbjC0I1Y (ORCPT ); Mon, 27 Mar 2023 04:27:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233240AbjC0I0z (ORCPT ); Mon, 27 Mar 2023 04:26:55 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3321655BE; Mon, 27 Mar 2023 01:25:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905500; x=1711441500; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=nd37h2vxkhLXg5jhh870M6BiFI8TxVn0dDXlR1ukIGo=; b=CaDQqz4biBKIUebVXOjp08vSpEgbZB5LohEewk8IHf34Z+alF3B6P/Rl q8jlLHgOIpAvGdkbPuBs+c3cpXLuezuXZonSguqepRAkLKqaT3MJfeqki FhYCZlbOYuHnUe66bWAZn50ARIv2+8/tcDQZ4ggca5OBRZ7UPCyxeABC/ eP+ioHpGvNByNNJ+PMlJjeEVlHow/JlWuHsUzyqDMYUD2KSEf2NMGUTIN aa/2BUUOli+Z91rBKsrDGANLyGhEo3jaJX3taQT43V2KLLIs8yEk/9wta Ki6pFhKZIkoRhlEYQ9GOPPjqZRB+4EOgTEjFfoAl5CTI9WNeCpeSR6vIa g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930428" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930428" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787165" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787165" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:41 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 32/33] x86/fred: disable FRED by default in its early stage Date: Mon, 27 Mar 2023 00:58:37 -0700 Message-Id: <20230327075838.5403-33-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Disable FRED by default in its early stage. To enable FRED, a new kernel command line option "fred" needs to be added. Tested-by: Shan Kang Signed-off-by: Xin Li --- Documentation/admin-guide/kernel-parameters.txt | 4 ++++ arch/x86/kernel/cpu/common.c | 3 +++ 2 files changed, 7 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 6221a1d057dd..c55ea60e1a0c 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1498,6 +1498,10 @@ Warning: use of this parameter will taint the kernel and may cause unknown problems. + fred + Forcefully enable flexible return and event delivery, + which is otherwise disabled by default. + ftrace=[tracer] [FTRACE] will set and start the specified tracer as early as possible in order to facilitate early diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index eea41cb8722e..4db5e619fc97 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1467,6 +1467,9 @@ static void __init cpu_parse_early_param(void) char *argptr = arg, *opt; int arglen, taint = 0; + if (!cmdline_find_option_bool(boot_command_line, "fred")) + setup_clear_cpu_cap(X86_FEATURE_FRED); + #ifdef CONFIG_X86_32 if (cmdline_find_option_bool(boot_command_line, "no387")) #ifdef CONFIG_MATH_EMULATION From patchwork Mon Mar 27 07:58:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 13188759 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5F87C7619A for ; Mon, 27 Mar 2023 08:27:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233341AbjC0I1h (ORCPT ); Mon, 27 Mar 2023 04:27:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233052AbjC0I1F (ORCPT ); Mon, 27 Mar 2023 04:27:05 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2F8F5FF9; Mon, 27 Mar 2023 01:25:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1679905505; x=1711441505; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MdNLDINmriyqE7KeCKfADER0gVCNbMyHdH3hw5/HrHQ=; b=kKmkED1czx7Sfskt/cv4466E6hEYlpFGNlvDFQLHjHXR7ealU3UEenCM v0S4SrdykcnVEwE+/GeLaBUhQkq2EnDYxrouP6Nai948tonm57Lf6qETd TIfGHS2nU3+sfZFz+SDRFcO/C6N4cbULYHsrLIwN7TcaKVIkfx/TKVgFP LMyhiKb3WsAGDqojJcJXmRXvFF5HOxVc1LsjzntouKb/hwwv1ue2pZJBa YaPlR2AC9NmXIIeI1udGwtQoiqDCL/OQ+oRikWBcJF7NCBogja5WkLUzC MWWqdIdq705/7OLyxGs1hsucPYm1S1nsD8P4iSnfJ+rcj6SMz/gluIxM+ g==; X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="338930440" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="338930440" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2023 01:24:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10661"; a="713787170" X-IronPort-AV: E=Sophos;i="5.98,294,1673942400"; d="scan'208";a="713787170" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga008.jf.intel.com with ESMTP; 27 Mar 2023 01:24:42 -0700 From: Xin Li To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com, ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com Subject: [PATCH v6 33/33] KVM: x86/vmx: refactor VMX_DO_EVENT_IRQOFF to generate FRED stack frames Date: Mon, 27 Mar 2023 00:58:38 -0700 Message-Id: <20230327075838.5403-34-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230327075838.5403-1-xin3.li@intel.com> References: <20230327075838.5403-1-xin3.li@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Comparing to an IDT stack frame, a FRED stack frame has extra 16 bytes of information pushed at the regular stack top and 8 bytes of error code _always_ pushed at the regular stack bottom, VMX_DO_EVENT_IRQOFF can be refactored to generate FRED stack frames with event type and vector properly set. Thus, IRQ/NMI can be handled with the existing approach when FRED is enabled. As a FRED stack frame always contains an error code pushed by hardware, call a trampoline function first to have the return instruction address pushed on the regular stack. Then the trampoline function pushes an error code (0 for both IRQ and NMI) and jumps to fred_entrypoint_kernel() for NMI handling or calls external_interrupt() for IRQ handling. The trampoline function for IRQ handling pushes general purpose registers to form a pt_regs structure and then use it to call external_interrupt(). As a result, IRQ handling does not execute any noinstr code. Of course external_interrupt() needs to be exported. Tested-by: Shan Kang Signed-off-by: Xin Li --- arch/x86/include/asm/traps.h | 2 ++ arch/x86/kernel/traps.c | 5 +++ arch/x86/kvm/vmx/vmenter.S | 59 ++++++++++++++++++++++++++++++++++-- arch/x86/kvm/vmx/vmx.c | 8 ++++- 4 files changed, 70 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 612b3d6fec53..017b95624325 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -58,4 +58,6 @@ typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler)); system_interrupt_handler get_system_interrupt_handler(unsigned int i); +int external_interrupt(struct pt_regs *regs); + #endif /* _ASM_X86_TRAPS_H */ diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 73471053ed02..0f1fcd53cb52 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1573,6 +1573,11 @@ int external_interrupt(struct pt_regs *regs) return 0; } +#if IS_ENABLED(CONFIG_KVM_INTEL) +/* For KVM VMX to handle IRQs in IRQ induced VM exits. */ +EXPORT_SYMBOL_GPL(external_interrupt); +#endif + #endif /* CONFIG_X86_64 */ void __init install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr) diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S index 631fd7da2bc3..43c9da9c9c24 100644 --- a/arch/x86/kvm/vmx/vmenter.S +++ b/arch/x86/kvm/vmx/vmenter.S @@ -8,6 +8,7 @@ #include #include "kvm-asm-offsets.h" #include "run_flags.h" +#include "../../entry/calling.h" #define WORD_SIZE (BITS_PER_LONG / 8) @@ -31,7 +32,7 @@ #define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE #endif -.macro VMX_DO_EVENT_IRQOFF call_insn call_target +.macro VMX_DO_EVENT_IRQOFF call_insn call_target fred=1 nmi=0 /* * Unconditionally create a stack frame, getting the correct RSP on the * stack (for x86-64) would take two instructions anyways, and RBP can @@ -46,11 +47,34 @@ * creating the synthetic interrupt stack frame for the IRQ/NMI. */ and $-16, %rsp + + .if \fred + push $0 /* Reserved by FRED, must be 0 */ + push $0 /* FRED event data, 0 for NMI and external interrupts */ + + .if \nmi + mov $(2 << 32 | 2 << 48), %_ASM_AX /* NMI event type and vector */ + .else + mov %_ASM_ARG1, %_ASM_AX + shl $32, %_ASM_AX /* external interrupt vector */ + .endif + add $__KERNEL_DS, %_ASM_AX + bts $57, %_ASM_AX /* bit 57: 64-bit mode */ + push %_ASM_AX + .else push $__KERNEL_DS + .endif + push %rbp #endif pushf + .if \nmi + mov $__KERNEL_CS, %_ASM_AX + bts $28, %_ASM_AX /* set the NMI bit */ + push %_ASM_AX + .else push $__KERNEL_CS + .endif \call_insn \call_target /* @@ -299,8 +323,19 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL) SYM_FUNC_END(__vmx_vcpu_run) +SYM_FUNC_START(vmx_do_nmi_trampoline) +#ifdef CONFIG_X86_FRED + ALTERNATIVE "jmp .Lno_errorcode_push", "", X86_FEATURE_FRED + push $0 /* FRED error code, 0 for NMI */ + jmp fred_entrypoint_kernel +#endif + +.Lno_errorcode_push: + jmp asm_exc_nmi_kvm_vmx +SYM_FUNC_END(vmx_do_nmi_trampoline) + SYM_FUNC_START(vmx_do_nmi_irqoff) - VMX_DO_EVENT_IRQOFF call asm_exc_nmi_kvm_vmx + VMX_DO_EVENT_IRQOFF call vmx_do_nmi_trampoline nmi=1 SYM_FUNC_END(vmx_do_nmi_irqoff) @@ -358,5 +393,23 @@ SYM_FUNC_END(vmread_error_trampoline) #endif SYM_FUNC_START(vmx_do_interrupt_irqoff) - VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1 + VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1 fred=0 SYM_FUNC_END(vmx_do_interrupt_irqoff) + +#ifdef CONFIG_X86_64 +SYM_FUNC_START(vmx_do_fred_interrupt_trampoline) + push $0 /* FRED error code, 0 for NMI and external interrupts */ + PUSH_REGS + + movq %rsp, %rdi /* %rdi -> pt_regs */ + call external_interrupt + + POP_REGS + addq $8,%rsp /* Drop FRED error code */ + RET +SYM_FUNC_END(vmx_do_fred_interrupt_trampoline) + +SYM_FUNC_START(vmx_do_fred_interrupt_irqoff) + VMX_DO_EVENT_IRQOFF call vmx_do_fred_interrupt_trampoline +SYM_FUNC_END(vmx_do_fred_interrupt_irqoff) +#endif diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d2d6e1b6c788..5addfee5cc6d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6875,6 +6875,7 @@ static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu) } void vmx_do_interrupt_irqoff(unsigned long entry); +void vmx_do_fred_interrupt_irqoff(unsigned int vector); void vmx_do_nmi_irqoff(void); static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu) @@ -6923,7 +6924,12 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) return; kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ); - vmx_do_interrupt_irqoff(gate_offset(desc)); +#ifdef CONFIG_X86_64 + if (cpu_feature_enabled(X86_FEATURE_FRED)) + vmx_do_fred_interrupt_irqoff(vector); + else +#endif + vmx_do_interrupt_irqoff(gate_offset(desc)); kvm_after_interrupt(vcpu); vcpu->arch.at_instruction_boundary = true;