From patchwork Tue Nov 14 17:49:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455757 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AC572C41535 for ; Tue, 14 Nov 2023 17:57:51 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633087.987635 (Exim 4.92) (envelope-from ) id 1r2xfO-00011L-DZ; Tue, 14 Nov 2023 17:57:38 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633087.987635; Tue, 14 Nov 2023 17:57:38 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfO-0000zw-56; Tue, 14 Nov 2023 17:57:38 +0000 Received: by outflank-mailman (input) for mailman id 633087; Tue, 14 Nov 2023 17:50:40 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYe-0004wk-SX for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:40 +0000 Received: from 12.mo583.mail-out.ovh.net (12.mo583.mail-out.ovh.net [46.105.39.65]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 51dedb92-8316-11ee-9b0e-b553b5be7939; Tue, 14 Nov 2023 18:50:38 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.108.4.72]) by mo583.mail-out.ovh.net (Postfix) with ESMTP id E7C18293F1 for ; Tue, 14 Nov 2023 17:50:37 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id 72C7E1FD24; Tue, 14 Nov 2023 17:50:37 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id aBEEGW2zU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:37 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 51dedb92-8316-11ee-9b0e-b553b5be7939 Authentication-Results: garm.ovh; auth=pass (GARM-103G0055adfc20a-9646-4e70-96a1-0204e862fa82, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Andrew Cooper , Jan Beulich , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 01/10] x86/spec-ctrl: Remove conditional IRQs-on-ness for INT $0x80/0x82 paths Date: Tue, 14 Nov 2023 18:49:57 +0100 Message-ID: X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12928990108487231856 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudefucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhggtgfgsehtkeertdertdejnecuhfhrohhmpefmrhihshhtihgrnhcujfgvsggvlhcuoehkrhihshhtihgrnhdrhhgvsggvlhesfehmuggvsgdrtghomheqnecuggftrfgrthhtvghrnhepuddtieefheevleefuefgheffkeeivdeggffgleehjeelkeevuefgieevfeejvdeknecuffhomhgrihhnpegvnhhtrhihrdhssgenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheekfedpmhhouggvpehsmhhtphhouhht From: Andrew Cooper Before speculation defences, some paths in Xen could genuinely get away with being IRQs-on at entry. But XPTI invalidated this property on most paths, and attempting to maintain it on the remaining paths was a mistake. Fast forward, and DO_SPEC_CTRL_COND_IBPB (protection for AMD BTC/SRSO) is not IRQ-safe, running with IRQs enabled in some cases. The other actions taken on these paths happen to be IRQ-safe. Make entry_int82() and int80_direct_trap() unconditionally Interrupt Gates rather than Trap Gates. Remove the conditional re-adjustment of int80_direct_trap() in smp_prepare_cpus(), and have entry_int82() explicitly enable interrupts when safe to do so. In smp_prepare_cpus(), with the conditional re-adjustment removed, the clearing of pv_cr3 is the only remaining action gated on XPTI, and it is out of place anyway, repeating work already done by smp_prepare_boot_cpu(). Drop the entire if() condition to avoid leaving an incorrect vestigial remnant. Also drop comments which make incorrect statements about when its safe to enable interrupts. This is XSA-446 / CVE-2023-46836 Signed-off-by: Andrew Cooper Reviewed-by: Roger Pau Monné --- xen/arch/x86/pv/traps.c | 4 ++-- xen/arch/x86/smpboot.c | 14 -------------- xen/arch/x86/x86_64/compat/entry.S | 2 ++ xen/arch/x86/x86_64/entry.S | 1 - 4 files changed, 4 insertions(+), 17 deletions(-) diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c index 74f333da7e1c..240d1a2db7a3 100644 --- a/xen/arch/x86/pv/traps.c +++ b/xen/arch/x86/pv/traps.c @@ -139,11 +139,11 @@ void __init pv_trap_init(void) #ifdef CONFIG_PV32 /* The 32-on-64 hypercall vector is only accessible from ring 1. */ _set_gate(idt_table + HYPERCALL_VECTOR, - SYS_DESC_trap_gate, 1, entry_int82); + SYS_DESC_irq_gate, 1, entry_int82); #endif /* Fast trap for int80 (faster than taking the #GP-fixup path). */ - _set_gate(idt_table + LEGACY_SYSCALL_VECTOR, SYS_DESC_trap_gate, 3, + _set_gate(idt_table + LEGACY_SYSCALL_VECTOR, SYS_DESC_irq_gate, 3, &int80_direct_trap); open_softirq(NMI_SOFTIRQ, nmi_softirq); diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 3a1a659082c6..4c54ecbc91d7 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -1158,20 +1158,6 @@ void __init smp_prepare_cpus(void) stack_base[0] = (void *)((unsigned long)stack_start & ~(STACK_SIZE - 1)); - if ( opt_xpti_hwdom || opt_xpti_domu ) - { - get_cpu_info()->pv_cr3 = 0; - -#ifdef CONFIG_PV - /* - * All entry points which may need to switch page tables have to start - * with interrupts off. Re-write what pv_trap_init() has put there. - */ - _set_gate(idt_table + LEGACY_SYSCALL_VECTOR, SYS_DESC_irq_gate, 3, - &int80_direct_trap); -#endif - } - set_nr_sockets(); socket_cpumask = xzalloc_array(cpumask_t *, nr_sockets); diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S index bd5abd8040bd..fcc3a721f147 100644 --- a/xen/arch/x86/x86_64/compat/entry.S +++ b/xen/arch/x86/x86_64/compat/entry.S @@ -21,6 +21,8 @@ ENTRY(entry_int82) SPEC_CTRL_ENTRY_FROM_PV /* Req: %rsp=regs/cpuinfo, %rdx=0, Clob: acd */ /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */ + sti + CR4_PV32_RESTORE GET_CURRENT(bx) diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S index 5ca74f5f62b2..9a7b129aa7e4 100644 --- a/xen/arch/x86/x86_64/entry.S +++ b/xen/arch/x86/x86_64/entry.S @@ -327,7 +327,6 @@ ENTRY(sysenter_entry) #ifdef CONFIG_XEN_SHSTK ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK #endif - /* sti could live here when we don't switch page tables below. */ pushq $FLAT_USER_SS pushq $0 pushfq From patchwork Tue Nov 14 17:49:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455762 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 22C56C07D59 for ; Tue, 14 Nov 2023 17:57:53 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633081.987613 (Exim 4.92) (envelope-from ) id 1r2xfN-0000aJ-8J; Tue, 14 Nov 2023 17:57:37 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633081.987613; Tue, 14 Nov 2023 17:57:37 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfN-0000Yj-2i; Tue, 14 Nov 2023 17:57:37 +0000 Received: by outflank-mailman (input) for mailman id 633081; Tue, 14 Nov 2023 17:50:39 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYd-0004mK-H8 for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:39 +0000 Received: from 8.mo550.mail-out.ovh.net (8.mo550.mail-out.ovh.net [178.33.110.239]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 5233b874-8316-11ee-98db-6d05b1d4d9a1; Tue, 14 Nov 2023 18:50:38 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.108.16.30]) by mo550.mail-out.ovh.net (Postfix) with ESMTP id 83145294F8 for ; Tue, 14 Nov 2023 17:50:38 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id EFD131FDF1; Tue, 14 Nov 2023 17:50:37 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id CGqmN22zU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:37 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 5233b874-8316-11ee-98db-6d05b1d4d9a1 Authentication-Results: garm.ovh; auth=pass (GARM-103G00598bb5702-cfe7-4f88-8cef-640e0a4e1820, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 02/10] x86/boot: choose AP stack based on APIC ID Date: Tue, 14 Nov 2023 18:49:58 +0100 Message-ID: <0e7dd957b6f26fa7b752bdce1ef6ebe97c825903.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12929271583775369584 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpefhgfdtvdeigedtvedvudehgefgkeefffetfeekkedthfdvveeiteduteehtedvgfenucffohhmrghinhepthhrrghmphholhhinhgvrdhssgdpgiekiegpieegrdhssgenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheehtddpmhhouggvpehsmhhtphhouhht This is made as first step of making parallel AP bring-up possible. It should be enough for pre-C code. Signed-off-by: Krystian Hebel --- xen/arch/x86/boot/trampoline.S | 20 ++++++++++++++++++++ xen/arch/x86/boot/x86_64.S | 28 +++++++++++++++++++++++++++- xen/arch/x86/setup.c | 7 +++++++ 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/boot/trampoline.S b/xen/arch/x86/boot/trampoline.S index b8ab0ffdcbb0..ec254125016d 100644 --- a/xen/arch/x86/boot/trampoline.S +++ b/xen/arch/x86/boot/trampoline.S @@ -72,6 +72,26 @@ trampoline_protmode_entry: mov $X86_CR4_PAE,%ecx mov %ecx,%cr4 + /* + * Get APIC ID while we're in non-paged mode. Start by checking if + * x2APIC is enabled. + */ + mov $MSR_APIC_BASE, %ecx + rdmsr + and $APIC_BASE_EXTD, %eax + jnz .Lx2apic + + /* Not x2APIC, read from MMIO */ + mov 0xfee00020, %esp + shr $24, %esp + jmp 1f + +.Lx2apic: + mov $(MSR_X2APIC_FIRST + (0x20 >> 4)), %ecx + rdmsr + mov %eax, %esp +1: + /* Load pagetable base register. */ mov $sym_offs(idle_pg_table),%eax add bootsym_rel(trampoline_xen_phys_start,4,%eax) diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S index 04bb62ae8680..b85b47b5c1a0 100644 --- a/xen/arch/x86/boot/x86_64.S +++ b/xen/arch/x86/boot/x86_64.S @@ -15,7 +15,33 @@ ENTRY(__high_start) mov $XEN_MINIMAL_CR4,%rcx mov %rcx,%cr4 - mov stack_start(%rip),%rsp + test %ebx,%ebx + cmovz stack_start(%rip), %rsp + jz .L_stack_set + + /* APs only: get stack base from APIC ID saved in %esp. */ + mov $-1, %rax + lea x86_cpu_to_apicid(%rip), %rcx +1: + add $1, %rax + cmp $NR_CPUS, %eax + jb 2f + hlt +2: + cmp %esp, (%rcx, %rax, 4) + jne 1b + + /* %eax is now Xen CPU index. */ + lea stack_base(%rip), %rcx + mov (%rcx, %rax, 8), %rsp + + test %rsp,%rsp + jnz 1f + hlt +1: + add $(STACK_SIZE - CPUINFO_sizeof), %rsp + +.L_stack_set: /* Reset EFLAGS (subsumes CLI and CLD). */ pushq $0 diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index a3d3f797bb1e..1285969901e0 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1951,6 +1951,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) */ if ( !pv_shim ) { + /* Separate loop to make parallel AP bringup possible. */ for_each_present_cpu ( i ) { /* Set up cpu_to_node[]. */ @@ -1958,6 +1959,12 @@ void __init noreturn __start_xen(unsigned long mbi_p) /* Set up node_to_cpumask based on cpu_to_node[]. */ numa_add_cpu(i); + if ( stack_base[i] == NULL ) + stack_base[i] = cpu_alloc_stack(i); + } + + for_each_present_cpu ( i ) + { if ( (park_offline_cpus || num_online_cpus() < max_cpus) && !cpu_online(i) ) { From patchwork Tue Nov 14 17:50:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455754 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BA5E5C4167D for ; Tue, 14 Nov 2023 17:57:50 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633091.987649 (Exim 4.92) (envelope-from ) id 1r2xfP-0001MF-Ez; Tue, 14 Nov 2023 17:57:39 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633091.987649; Tue, 14 Nov 2023 17:57:39 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfP-0001I5-3m; Tue, 14 Nov 2023 17:57:39 +0000 Received: by outflank-mailman (input) for mailman id 633091; Tue, 14 Nov 2023 17:50:42 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYg-0004mK-Hs for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:42 +0000 Received: from 1.mo575.mail-out.ovh.net (1.mo575.mail-out.ovh.net [46.105.41.146]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 533029ce-8316-11ee-98db-6d05b1d4d9a1; Tue, 14 Nov 2023 18:50:40 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.108.1.219]) by mo575.mail-out.ovh.net (Postfix) with ESMTP id 382D729417 for ; Tue, 14 Nov 2023 17:50:40 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id AE2A41FDE8; Tue, 14 Nov 2023 17:50:39 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id kAWdJ2+zU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:39 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 533029ce-8316-11ee-98db-6d05b1d4d9a1 Authentication-Results: garm.ovh; auth=pass (GARM-103G005504fe416-d68e-45d1-9e76-8851a34994cd, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 03/10] x86: don't access x86_cpu_to_apicid[] directly, use cpu_physical_id(cpu) Date: Tue, 14 Nov 2023 18:50:01 +0100 Message-ID: <705574ddb7f18bae9ed3f60ddf2e4bda02c70388.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12929834530462607728 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeevtdevieehieeiveekvefhlefftdfhteefueelhfdvhedtjeegkedugfefvdekffenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheejhedpmhhouggvpehsmhhtphhouhht This is done in preparation to move data from x86_cpu_to_apicid[] elsewhere. Signed-off-by: Krystian Hebel --- xen/arch/x86/acpi/cpu_idle.c | 4 ++-- xen/arch/x86/acpi/lib.c | 2 +- xen/arch/x86/apic.c | 2 +- xen/arch/x86/cpu/mwait-idle.c | 4 ++-- xen/arch/x86/domain.c | 2 +- xen/arch/x86/mpparse.c | 6 +++--- xen/arch/x86/numa.c | 2 +- xen/arch/x86/platform_hypercall.c | 2 +- xen/arch/x86/setup.c | 14 +++++++------- xen/arch/x86/smpboot.c | 4 ++-- xen/arch/x86/spec_ctrl.c | 2 +- xen/arch/x86/sysctl.c | 2 +- 12 files changed, 23 insertions(+), 23 deletions(-) diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c index cfce4cc0408f..d03e54bcef38 100644 --- a/xen/arch/x86/acpi/cpu_idle.c +++ b/xen/arch/x86/acpi/cpu_idle.c @@ -1256,7 +1256,7 @@ int get_cpu_id(u32 acpi_id) for ( i = 0; i < nr_cpu_ids; i++ ) { - if ( apic_id == x86_cpu_to_apicid[i] ) + if ( apic_id == cpu_physical_id(i) ) return i; } @@ -1362,7 +1362,7 @@ long set_cx_pminfo(uint32_t acpi_id, struct xen_processor_power *power) if ( !cpu_online(cpu_id) ) { - uint32_t apic_id = x86_cpu_to_apicid[cpu_id]; + uint32_t apic_id = cpu_physical_id(cpu_id); /* * If we've just learned of more available C states, wake the CPU if diff --git a/xen/arch/x86/acpi/lib.c b/xen/arch/x86/acpi/lib.c index 51cb082ca02a..87a1f38f3f5a 100644 --- a/xen/arch/x86/acpi/lib.c +++ b/xen/arch/x86/acpi/lib.c @@ -91,7 +91,7 @@ unsigned int acpi_get_processor_id(unsigned int cpu) { unsigned int acpiid, apicid; - if ((apicid = x86_cpu_to_apicid[cpu]) == BAD_APICID) + if ((apicid = cpu_physical_id(cpu)) == BAD_APICID) return INVALID_ACPIID; for (acpiid = 0; acpiid < ARRAY_SIZE(x86_acpiid_to_apicid); acpiid++) diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c index 6acdd0ec1468..63e18968e54c 100644 --- a/xen/arch/x86/apic.c +++ b/xen/arch/x86/apic.c @@ -950,7 +950,7 @@ __next: */ if (boot_cpu_physical_apicid == -1U) boot_cpu_physical_apicid = get_apic_id(); - x86_cpu_to_apicid[0] = get_apic_id(); + cpu_physical_id(0) = get_apic_id(); ioapic_init(); } diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c index ff5c808bc952..88168da8a0ca 100644 --- a/xen/arch/x86/cpu/mwait-idle.c +++ b/xen/arch/x86/cpu/mwait-idle.c @@ -1202,8 +1202,8 @@ static void __init ivt_idle_state_table_update(void) unsigned int cpu, max_apicid = boot_cpu_physical_apicid; for_each_present_cpu(cpu) - if (max_apicid < x86_cpu_to_apicid[cpu]) - max_apicid = x86_cpu_to_apicid[cpu]; + if (max_apicid < cpu_physical_id(cpu)) + max_apicid = cpu_physical_id(cpu); switch (apicid_to_socket(max_apicid)) { case 0: case 1: /* 1 and 2 socket systems use default ivt_cstates */ diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 3712e36df930..f45437511a46 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1615,7 +1615,7 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) break; cpu_id.phys_id = - (uint64_t)x86_cpu_to_apicid[v->vcpu_id] | + (uint64_t)cpu_physical_id(v->vcpu_id) | ((uint64_t)acpi_get_processor_id(v->vcpu_id) << 32); rc = -EFAULT; diff --git a/xen/arch/x86/mpparse.c b/xen/arch/x86/mpparse.c index d8ccab2449c6..b8cabebe7bf9 100644 --- a/xen/arch/x86/mpparse.c +++ b/xen/arch/x86/mpparse.c @@ -187,7 +187,7 @@ static int MP_processor_info_x(struct mpc_config_processor *m, " Processor with apicid %i ignored\n", apicid); return cpu; } - x86_cpu_to_apicid[cpu] = apicid; + cpu_physical_id(cpu) = apicid; cpumask_set_cpu(cpu, &cpu_present_map); } @@ -822,12 +822,12 @@ void mp_unregister_lapic(uint32_t apic_id, uint32_t cpu) if (!cpu || (apic_id == boot_cpu_physical_apicid)) return; - if (x86_cpu_to_apicid[cpu] != apic_id) + if (cpu_physical_id(cpu) != apic_id) return; physid_clear(apic_id, phys_cpu_present_map); - x86_cpu_to_apicid[cpu] = BAD_APICID; + cpu_physical_id(cpu) = BAD_APICID; cpumask_clear_cpu(cpu, &cpu_present_map); } diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c index 4b0b297c7e09..39e131cb4f35 100644 --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -70,7 +70,7 @@ void __init init_cpu_to_node(void) for ( i = 0; i < nr_cpu_ids; i++ ) { - u32 apicid = x86_cpu_to_apicid[i]; + u32 apicid = cpu_physical_id(i); if ( apicid == BAD_APICID ) continue; node = apicid < MAX_LOCAL_APIC ? apicid_to_node[apicid] : NUMA_NO_NODE; diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c index 9469de9045c7..9a52e048f67c 100644 --- a/xen/arch/x86/platform_hypercall.c +++ b/xen/arch/x86/platform_hypercall.c @@ -588,7 +588,7 @@ ret_t do_platform_op( } else { - g_info->apic_id = x86_cpu_to_apicid[g_info->xen_cpuid]; + g_info->apic_id = cpu_physical_id(g_info->xen_cpuid); g_info->acpi_id = acpi_get_processor_id(g_info->xen_cpuid); ASSERT(g_info->apic_id != BAD_APICID); g_info->flags = 0; diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index 1285969901e0..a19fe219bbf6 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -319,7 +319,7 @@ static void __init init_idle_domain(void) void srat_detect_node(int cpu) { nodeid_t node; - u32 apicid = x86_cpu_to_apicid[cpu]; + u32 apicid = cpu_physical_id(cpu); node = apicid < MAX_LOCAL_APIC ? apicid_to_node[apicid] : NUMA_NO_NODE; if ( node == NUMA_NO_NODE ) @@ -346,7 +346,7 @@ static void __init normalise_cpu_order(void) for_each_present_cpu ( i ) { - apicid = x86_cpu_to_apicid[i]; + apicid = cpu_physical_id(i); min_diff = min_cpu = ~0u; /* @@ -357,12 +357,12 @@ static void __init normalise_cpu_order(void) j < nr_cpu_ids; j = cpumask_next(j, &cpu_present_map) ) { - diff = x86_cpu_to_apicid[j] ^ apicid; + diff = cpu_physical_id(j) ^ apicid; while ( diff & (diff-1) ) diff &= diff-1; if ( (diff < min_diff) || ((diff == min_diff) && - (x86_cpu_to_apicid[j] < x86_cpu_to_apicid[min_cpu])) ) + (cpu_physical_id(j) < cpu_physical_id(min_cpu))) ) { min_diff = diff; min_cpu = j; @@ -378,9 +378,9 @@ static void __init normalise_cpu_order(void) /* Switch the best-matching CPU with the next CPU in logical order. */ j = cpumask_next(i, &cpu_present_map); - apicid = x86_cpu_to_apicid[min_cpu]; - x86_cpu_to_apicid[min_cpu] = x86_cpu_to_apicid[j]; - x86_cpu_to_apicid[j] = apicid; + apicid = cpu_physical_id(min_cpu); + cpu_physical_id(min_cpu) = cpu_physical_id(j); + cpu_physical_id(j) = apicid; } } diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 4c54ecbc91d7..de87c5a41926 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -1154,7 +1154,7 @@ void __init smp_prepare_cpus(void) print_cpu_info(0); boot_cpu_physical_apicid = get_apic_id(); - x86_cpu_to_apicid[0] = boot_cpu_physical_apicid; + cpu_physical_id(0) = boot_cpu_physical_apicid; stack_base[0] = (void *)((unsigned long)stack_start & ~(STACK_SIZE - 1)); @@ -1374,7 +1374,7 @@ int __cpu_up(unsigned int cpu) { int apicid, ret; - if ( (apicid = x86_cpu_to_apicid[cpu]) == BAD_APICID ) + if ( (apicid = cpu_physical_id(cpu)) == BAD_APICID ) return -ENODEV; if ( (!x2apic_enabled && apicid >= APIC_ALL_CPUS) || diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c index a8d8af22f6d8..d54c8d93cff0 100644 --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -589,7 +589,7 @@ static bool __init check_smt_enabled(void) * has a non-zero thread id component indicates that SMT is active. */ for_each_present_cpu ( cpu ) - if ( x86_cpu_to_apicid[cpu] & (boot_cpu_data.x86_num_siblings - 1) ) + if ( cpu_physical_id(cpu) & (boot_cpu_data.x86_num_siblings - 1) ) return true; return false; diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c index c107f40c6283..67d8ab3f824a 100644 --- a/xen/arch/x86/sysctl.c +++ b/xen/arch/x86/sysctl.c @@ -58,7 +58,7 @@ static long cf_check smt_up_down_helper(void *data) for_each_present_cpu ( cpu ) { /* Skip primary siblings (those whose thread id is 0). */ - if ( !(x86_cpu_to_apicid[cpu] & sibling_mask) ) + if ( !(cpu_physical_id(cpu) & sibling_mask) ) continue; if ( !up && core_parking_remove(cpu) ) From patchwork Tue Nov 14 17:50:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455764 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0CC83C07D5A for ; Tue, 14 Nov 2023 17:57:54 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633095.987676 (Exim 4.92) (envelope-from ) id 1r2xfR-00020g-LO; Tue, 14 Nov 2023 17:57:41 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633095.987676; Tue, 14 Nov 2023 17:57:41 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfQ-0001wp-TT; Tue, 14 Nov 2023 17:57:40 +0000 Received: by outflank-mailman (input) for mailman id 633095; Tue, 14 Nov 2023 17:50:44 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYi-0004mK-ID for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:44 +0000 Received: from 8.mo575.mail-out.ovh.net (8.mo575.mail-out.ovh.net [46.105.74.219]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 53cf22dc-8316-11ee-98db-6d05b1d4d9a1; Tue, 14 Nov 2023 18:50:41 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.109.138.246]) by mo575.mail-out.ovh.net (Postfix) with ESMTP id 4424B293D2 for ; Tue, 14 Nov 2023 17:50:41 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id CBEAA1FDE8; Tue, 14 Nov 2023 17:50:40 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id SFnkLnCzU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:40 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 53cf22dc-8316-11ee-98db-6d05b1d4d9a1 Authentication-Results: garm.ovh; auth=pass (GARM-103G005a4eb1ba5-6e31-46c5-bb87-74b65d95b45a, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 04/10] x86/smp: drop x86_cpu_to_apicid, use cpu_data[cpu].apicid instead Date: Tue, 14 Nov 2023 18:50:03 +0100 Message-ID: <8121d9b472b305be751158aa3af3fed98ff0572e.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12930116006647343472 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeehleekveevvdfhgeetlefhjedtjefgjedtkeekffeitdffkeffueetkedtgfeiueenucffohhmrghinhepgiekiegpieegrdhssgenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheejhedpmhhouggvpehsmhhtphhouhht Both fields held the same data. Signed-off-by: Krystian Hebel --- xen/arch/x86/boot/x86_64.S | 8 +++++--- xen/arch/x86/include/asm/asm_defns.h | 2 +- xen/arch/x86/include/asm/processor.h | 2 ++ xen/arch/x86/include/asm/smp.h | 4 ---- xen/arch/x86/numa.c | 15 +++++++-------- xen/arch/x86/smpboot.c | 8 ++++---- xen/arch/x86/x86_64/asm-offsets.c | 4 +++- 7 files changed, 22 insertions(+), 21 deletions(-) diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S index b85b47b5c1a0..195550b5c0ea 100644 --- a/xen/arch/x86/boot/x86_64.S +++ b/xen/arch/x86/boot/x86_64.S @@ -20,15 +20,17 @@ ENTRY(__high_start) jz .L_stack_set /* APs only: get stack base from APIC ID saved in %esp. */ - mov $-1, %rax - lea x86_cpu_to_apicid(%rip), %rcx + mov $0, %rax + lea cpu_data(%rip), %rcx + /* cpu_data[0] is BSP, skip it. */ 1: add $1, %rax + add $CPUINFO_X86_sizeof, %rcx cmp $NR_CPUS, %eax jb 2f hlt 2: - cmp %esp, (%rcx, %rax, 4) + cmp %esp, CPUINFO_X86_apicid(%rcx) jne 1b /* %eax is now Xen CPU index. */ diff --git a/xen/arch/x86/include/asm/asm_defns.h b/xen/arch/x86/include/asm/asm_defns.h index baaaccb26e17..6b05d9d140b8 100644 --- a/xen/arch/x86/include/asm/asm_defns.h +++ b/xen/arch/x86/include/asm/asm_defns.h @@ -158,7 +158,7 @@ register unsigned long current_stack_pointer asm("rsp"); #endif #define CPUINFO_FEATURE_OFFSET(feature) \ - (CPUINFO_features + (cpufeat_word(feature) * 4)) + (CPUINFO_X86_features + (cpufeat_word(feature) * 4)) #else diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h index b0d2a62c075f..8345d58094da 100644 --- a/xen/arch/x86/include/asm/processor.h +++ b/xen/arch/x86/include/asm/processor.h @@ -92,6 +92,8 @@ struct x86_cpu_id { extern struct cpuinfo_x86 cpu_data[]; #define current_cpu_data cpu_data[smp_processor_id()] +#define cpu_physical_id(cpu) cpu_data[cpu].apicid + extern bool probe_cpuid_faulting(void); extern void ctxt_switch_levelling(const struct vcpu *next); extern void (*ctxt_switch_masking)(const struct vcpu *next); diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index c0b5d7cdd8dd..94c557491860 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -39,10 +39,6 @@ extern void (*mtrr_hook) (void); extern void zap_low_mappings(void); -extern u32 x86_cpu_to_apicid[]; - -#define cpu_physical_id(cpu) x86_cpu_to_apicid[cpu] - #define cpu_is_offline(cpu) unlikely(!cpu_online(cpu)) extern void cpu_exit_clear(unsigned int cpu); extern void cpu_uninit(unsigned int cpu); diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c index 39e131cb4f35..91527be5b406 100644 --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -54,14 +54,13 @@ bool __init arch_numa_unavailable(void) /* * Setup early cpu_to_node. * - * Populate cpu_to_node[] only if x86_cpu_to_apicid[], - * and apicid_to_node[] tables have valid entries for a CPU. - * This means we skip cpu_to_node[] initialisation for NUMA - * emulation and faking node case (when running a kernel compiled - * for NUMA on a non NUMA box), which is OK as cpu_to_node[] - * is already initialized in a round robin manner at numa_init_array, - * prior to this call, and this initialization is good enough - * for the fake NUMA cases. + * Populate cpu_to_node[] only if cpu_data[], and apicid_to_node[] + * tables have valid entries for a CPU. This means we skip + * cpu_to_node[] initialisation for NUMA emulation and faking node + * case (when running a kernel compiled for NUMA on a non NUMA box), + * which is OK as cpu_to_node[] is already initialized in a round + * robin manner at numa_init_array, prior to this call, and this + * initialization is good enough for the fake NUMA cases. */ void __init init_cpu_to_node(void) { diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index de87c5a41926..f061486e56eb 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -61,10 +61,8 @@ unsigned int __read_mostly nr_sockets; cpumask_t **__read_mostly socket_cpumask; static cpumask_t *secondary_socket_cpumask; -struct cpuinfo_x86 cpu_data[NR_CPUS]; - -u32 x86_cpu_to_apicid[NR_CPUS] __read_mostly = - { [0 ... NR_CPUS-1] = BAD_APICID }; +struct cpuinfo_x86 cpu_data[NR_CPUS] = + { [0 ... NR_CPUS-1] .apicid = BAD_APICID }; static int cpu_error; static enum cpu_state { @@ -81,7 +79,9 @@ void *stack_base[NR_CPUS]; void initialize_cpu_data(unsigned int cpu) { + uint32_t apicid = cpu_physical_id(cpu); cpu_data[cpu] = boot_cpu_data; + cpu_physical_id(cpu) = apicid; } static bool smp_store_cpu_info(unsigned int id) diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c index 57b73a4e6214..e881cd5de0a0 100644 --- a/xen/arch/x86/x86_64/asm-offsets.c +++ b/xen/arch/x86/x86_64/asm-offsets.c @@ -159,7 +159,9 @@ void __dummy__(void) OFFSET(IRQSTAT_softirq_pending, irq_cpustat_t, __softirq_pending); BLANK(); - OFFSET(CPUINFO_features, struct cpuinfo_x86, x86_capability); + OFFSET(CPUINFO_X86_features, struct cpuinfo_x86, x86_capability); + OFFSET(CPUINFO_X86_apicid, struct cpuinfo_x86, apicid); + DEFINE(CPUINFO_X86_sizeof, sizeof(struct cpuinfo_x86)); BLANK(); OFFSET(MB_flags, multiboot_info_t, flags); From patchwork Tue Nov 14 17:50:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455758 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BB866C07545 for ; Tue, 14 Nov 2023 17:57:52 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633096.987683 (Exim 4.92) (envelope-from ) id 1r2xfS-0002G8-8t; Tue, 14 Nov 2023 17:57:42 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633096.987683; Tue, 14 Nov 2023 17:57:42 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfR-0002CH-Tg; Tue, 14 Nov 2023 17:57:41 +0000 Received: by outflank-mailman (input) for mailman id 633096; Tue, 14 Nov 2023 17:50:45 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYj-0004wk-7W for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:45 +0000 Received: from 9.mo575.mail-out.ovh.net (9.mo575.mail-out.ovh.net [46.105.78.111]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 550bf8cc-8316-11ee-9b0e-b553b5be7939; Tue, 14 Nov 2023 18:50:43 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.109.138.246]) by mo575.mail-out.ovh.net (Postfix) with ESMTP id 515E429385 for ; Tue, 14 Nov 2023 17:50:43 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id 98AB31FDF1; Tue, 14 Nov 2023 17:50:42 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id kPhUInKzU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:42 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 550bf8cc-8316-11ee-9b0e-b553b5be7939 Authentication-Results: garm.ovh; auth=pass (GARM-103G005f5dee295-d4f4-4524-b5c3-7761c38da2df, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu , George Dunlap , Julien Grall , Stefano Stabellini Subject: [PATCH 05/10] x86/smp: move stack_base to cpu_data Date: Tue, 14 Nov 2023 18:50:06 +0100 Message-ID: <70e3b7c84a69a7ec52b3ed6314395165c281734c.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12930678958905928138 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeehleekveevvdfhgeetlefhjedtjefgjedtkeekffeitdffkeffueetkedtgfeiueenucffohhmrghinhepgiekiegpieegrdhssgenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgepheenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheejhedpmhhouggvpehsmhhtphhouhht This location is easier to access from assembly. Having it close to other data required during initialization has also positive (although rather small) impact on prefetching data from RAM. Signed-off-by: Krystian Hebel --- xen/arch/x86/boot/x86_64.S | 5 ++--- xen/arch/x86/include/asm/cpufeature.h | 1 + xen/arch/x86/include/asm/smp.h | 2 +- xen/arch/x86/setup.c | 6 +++--- xen/arch/x86/smpboot.c | 25 +++++++++++++------------ xen/arch/x86/traps.c | 4 ++-- xen/arch/x86/x86_64/asm-offsets.c | 1 + xen/include/xen/smp.h | 2 -- 8 files changed, 23 insertions(+), 23 deletions(-) diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S index 195550b5c0ea..8d61f270761f 100644 --- a/xen/arch/x86/boot/x86_64.S +++ b/xen/arch/x86/boot/x86_64.S @@ -33,9 +33,8 @@ ENTRY(__high_start) cmp %esp, CPUINFO_X86_apicid(%rcx) jne 1b - /* %eax is now Xen CPU index. */ - lea stack_base(%rip), %rcx - mov (%rcx, %rax, 8), %rsp + /* %rcx is now cpu_data[cpu], read stack base from it. */ + mov CPUINFO_X86_stack_base(%rcx), %rsp test %rsp,%rsp jnz 1f diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h index 06e1dd7f3332..ff0e18864cc7 100644 --- a/xen/arch/x86/include/asm/cpufeature.h +++ b/xen/arch/x86/include/asm/cpufeature.h @@ -37,6 +37,7 @@ struct cpuinfo_x86 { unsigned int phys_proc_id; /* package ID of each logical CPU */ unsigned int cpu_core_id; /* core ID of each logical CPU */ unsigned int compute_unit_id; /* AMD compute unit ID of each logical CPU */ + void *stack_base; unsigned short x86_clflush_size; } __cacheline_aligned; diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index 94c557491860..98739028a6ed 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -69,7 +69,7 @@ extern cpumask_t **socket_cpumask; * by certain scheduling code only. */ #define get_cpu_current(cpu) \ - (get_cpu_info_from_stack((unsigned long)stack_base[cpu])->current_vcpu) + (get_cpu_info_from_stack((unsigned long)cpu_data[cpu].stack_base)->current_vcpu) extern unsigned int disabled_cpus; extern bool unaccounted_cpus; diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index a19fe219bbf6..b2c0679725ea 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -798,7 +798,7 @@ static void __init noreturn reinit_bsp_stack(void) /* Update SYSCALL trampolines */ percpu_traps_init(); - stack_base[0] = stack; + cpu_data[0].stack_base = stack; rc = setup_cpu_root_pgt(0); if ( rc ) @@ -1959,8 +1959,8 @@ void __init noreturn __start_xen(unsigned long mbi_p) /* Set up node_to_cpumask based on cpu_to_node[]. */ numa_add_cpu(i); - if ( stack_base[i] == NULL ) - stack_base[i] = cpu_alloc_stack(i); + if ( cpu_data[i].stack_base == NULL ) + cpu_data[i].stack_base = cpu_alloc_stack(i); } for_each_present_cpu ( i ) diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index f061486e56eb..8ae65ab1769f 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -75,13 +75,15 @@ static enum cpu_state { } cpu_state; #define set_cpu_state(state) do { smp_mb(); cpu_state = (state); } while (0) -void *stack_base[NR_CPUS]; - void initialize_cpu_data(unsigned int cpu) { uint32_t apicid = cpu_physical_id(cpu); + void *stack = cpu_data[cpu].stack_base; + cpu_data[cpu] = boot_cpu_data; + cpu_physical_id(cpu) = apicid; + cpu_data[cpu].stack_base = stack; } static bool smp_store_cpu_info(unsigned int id) @@ -579,8 +581,6 @@ static int do_boot_cpu(int apicid, int cpu) printk("Booting processor %d/%d eip %lx\n", cpu, apicid, start_eip); - stack_start = stack_base[cpu] + STACK_SIZE - sizeof(struct cpu_info); - /* This grunge runs the startup process for the targeted processor. */ set_cpu_state(CPU_STATE_INIT); @@ -856,7 +856,7 @@ int setup_cpu_root_pgt(unsigned int cpu) /* Install direct map page table entries for stack, IDT, and TSS. */ for ( off = rc = 0; !rc && off < STACK_SIZE; off += PAGE_SIZE ) - rc = clone_mapping(__va(__pa(stack_base[cpu])) + off, rpt); + rc = clone_mapping(__va(__pa(cpu_data[cpu].stack_base)) + off, rpt); if ( !rc ) rc = clone_mapping(idt_tables[cpu], rpt); @@ -1007,10 +1007,10 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) FREE_XENHEAP_PAGE(per_cpu(gdt, cpu)); FREE_XENHEAP_PAGE(idt_tables[cpu]); - if ( stack_base[cpu] ) + if ( cpu_data[cpu].stack_base ) { - memguard_unguard_stack(stack_base[cpu]); - FREE_XENHEAP_PAGES(stack_base[cpu], STACK_ORDER); + memguard_unguard_stack(cpu_data[cpu].stack_base); + FREE_XENHEAP_PAGES(cpu_data[cpu].stack_base, STACK_ORDER); } } } @@ -1044,11 +1044,11 @@ static int cpu_smpboot_alloc(unsigned int cpu) if ( node != NUMA_NO_NODE ) memflags = MEMF_node(node); - if ( stack_base[cpu] == NULL && - (stack_base[cpu] = cpu_alloc_stack(cpu)) == NULL ) + if ( cpu_data[cpu].stack_base == NULL && + (cpu_data[cpu].stack_base = cpu_alloc_stack(cpu)) == NULL ) goto out; - info = get_cpu_info_from_stack((unsigned long)stack_base[cpu]); + info = get_cpu_info_from_stack((unsigned long)cpu_data[cpu].stack_base); info->processor_id = cpu; info->per_cpu_offset = __per_cpu_offset[cpu]; @@ -1156,7 +1156,8 @@ void __init smp_prepare_cpus(void) boot_cpu_physical_apicid = get_apic_id(); cpu_physical_id(0) = boot_cpu_physical_apicid; - stack_base[0] = (void *)((unsigned long)stack_start & ~(STACK_SIZE - 1)); + cpu_data[0].stack_base = (void *) + ((unsigned long)stack_start & ~(STACK_SIZE - 1)); set_nr_sockets(); diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index e1356f696aba..90d9201d1c52 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -611,9 +611,9 @@ void show_stack_overflow(unsigned int cpu, const struct cpu_user_regs *regs) unsigned long curr_stack_base = esp & ~(STACK_SIZE - 1); unsigned long esp_top, esp_bottom; - if ( _p(curr_stack_base) != stack_base[cpu] ) + if ( _p(curr_stack_base) != cpu_data[cpu].stack_base ) printk("Current stack base %p differs from expected %p\n", - _p(curr_stack_base), stack_base[cpu]); + _p(curr_stack_base), cpu_data[cpu].stack_base); esp_bottom = (esp | (STACK_SIZE - 1)) + 1; esp_top = esp_bottom - PRIMARY_STACK_SIZE; diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c index e881cd5de0a0..d81a30344677 100644 --- a/xen/arch/x86/x86_64/asm-offsets.c +++ b/xen/arch/x86/x86_64/asm-offsets.c @@ -161,6 +161,7 @@ void __dummy__(void) OFFSET(CPUINFO_X86_features, struct cpuinfo_x86, x86_capability); OFFSET(CPUINFO_X86_apicid, struct cpuinfo_x86, apicid); + OFFSET(CPUINFO_X86_stack_base, struct cpuinfo_x86, stack_base); DEFINE(CPUINFO_X86_sizeof, sizeof(struct cpuinfo_x86)); BLANK(); diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h index 0a9219173f0f..994fdc474200 100644 --- a/xen/include/xen/smp.h +++ b/xen/include/xen/smp.h @@ -67,8 +67,6 @@ void smp_send_call_function_mask(const cpumask_t *mask); int alloc_cpu_id(void); -extern void *stack_base[NR_CPUS]; - void initialize_cpu_data(unsigned int cpu); int setup_cpu_root_pgt(unsigned int cpu); From patchwork Tue Nov 14 17:50:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455768 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8CD39C4167D for ; Tue, 14 Nov 2023 17:57:58 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633102.987717 (Exim 4.92) (envelope-from ) id 1r2xfV-0003D0-Im; Tue, 14 Nov 2023 17:57:45 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633102.987717; Tue, 14 Nov 2023 17:57:45 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfU-00036u-QE; Tue, 14 Nov 2023 17:57:44 +0000 Received: by outflank-mailman (input) for mailman id 633102; Tue, 14 Nov 2023 17:50:46 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYk-0004wk-Me for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:46 +0000 Received: from 5.mo583.mail-out.ovh.net (5.mo583.mail-out.ovh.net [87.98.173.103]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 55eb02bf-8316-11ee-9b0e-b553b5be7939; Tue, 14 Nov 2023 18:50:45 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.108.4.72]) by mo583.mail-out.ovh.net (Postfix) with ESMTP id 831882530C for ; Tue, 14 Nov 2023 17:50:44 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id 053BA1FD62; Tue, 14 Nov 2023 17:50:43 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id +FLBOXOzU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:43 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 55eb02bf-8316-11ee-9b0e-b553b5be7939 Authentication-Results: garm.ovh; auth=pass (GARM-103G0053f19ddfa-4a8e-402b-a711-632ec1da18cf, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 06/10] x86/smp: call x2apic_ap_setup() earlier Date: Tue, 14 Nov 2023 18:50:08 +0100 Message-ID: <7c13554e60cc76516922992b7faf911b91f99a2a.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12930960431237409136 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudefucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeevtdevieehieeiveekvefhlefftdfhteefueelhfdvhedtjeegkedugfefvdekffenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgepgeenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheekfedpmhhouggvpehsmhhtphhouhht It used to be called from smp_callin(), however BUG_ON() was invoked on multiple occasions before that. It may end up calling machine_restart() which tries to get APIC ID for CPU running this code. If BSP detected that x2APIC is enabled, get_apic_id() will try to use it for all CPUs. Enabling x2APIC on secondary CPUs earlier protects against an endless loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID MSR while x2APIC is disabled in IA32_APIC_BASE. Signed-off-by: Krystian Hebel --- xen/arch/x86/smpboot.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 8ae65ab1769f..a3895dafa267 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -184,7 +184,6 @@ static void smp_callin(void) * update until we finish. We are free to set up this CPU: first the APIC. */ Dprintk("CALLIN, before setup_local_APIC().\n"); - x2apic_ap_setup(); setup_local_APIC(false); /* Save our processor parameters. */ @@ -351,6 +350,14 @@ void start_secondary(void *unused) get_cpu_info()->xen_cr3 = 0; get_cpu_info()->pv_cr3 = 0; + /* + * BUG_ON() used in load_system_tables() and later code may end up calling + * machine_restart() which tries to get APIC ID for CPU running this code. + * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it + * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up + * with endless #GP loop. + */ + x2apic_ap_setup(); load_system_tables(); /* Full exception support from here on in. */ From patchwork Tue Nov 14 17:50:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455770 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DF7CBC07524 for ; Tue, 14 Nov 2023 17:58:01 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633107.987740 (Exim 4.92) (envelope-from ) id 1r2xfY-0003nq-8N; Tue, 14 Nov 2023 17:57:48 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633107.987740; Tue, 14 Nov 2023 17:57:47 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfX-0003ki-7o; Tue, 14 Nov 2023 17:57:47 +0000 Received: by outflank-mailman (input) for mailman id 633107; Tue, 14 Nov 2023 17:50:47 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYl-0004mK-Iv for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:47 +0000 Received: from 4.mo576.mail-out.ovh.net (4.mo576.mail-out.ovh.net [46.105.42.102]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 562ff767-8316-11ee-98db-6d05b1d4d9a1; Tue, 14 Nov 2023 18:50:45 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.109.146.213]) by mo576.mail-out.ovh.net (Postfix) with ESMTP id 33ED12BB9F for ; Tue, 14 Nov 2023 17:50:45 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id AD9971FE62; Tue, 14 Nov 2023 17:50:44 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id aOi5InSzU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:44 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 562ff767-8316-11ee-98db-6d05b1d4d9a1 Authentication-Results: garm.ovh; auth=pass (GARM-103G0052b324679-71f1-4ee1-bb31-3382685591f2, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 07/10] x86/shutdown: protect against recurrent machine_restart() Date: Tue, 14 Nov 2023 18:50:09 +0100 Message-ID: <87b0e650f28038c2fb64c5eb607c8fdaa7b4db07.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12931241908158114160 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudefucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeevtdevieehieeiveekvefhlefftdfhteefueelhfdvhedtjeegkedugfefvdekffenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgepheenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheejiedpmhhouggvpehsmhhtphhouhht If multiple CPUs called machine_restart() before actual restart took place, but after boot CPU declared itself not online, ASSERT in on_selected_cpus() will fail. Few calls later execution would end up in machine_restart() again, with another frame on call stack for new exception. To protect against running out of stack, code checks if boot CPU is still online before calling on_selected_cpus(). Signed-off-by: Krystian Hebel --- xen/arch/x86/shutdown.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c index 7619544d14da..32c70505ed77 100644 --- a/xen/arch/x86/shutdown.c +++ b/xen/arch/x86/shutdown.c @@ -577,9 +577,23 @@ void machine_restart(unsigned int delay_millisecs) /* Ensure we are the boot CPU. */ if ( get_apic_id() != boot_cpu_physical_apicid ) { - /* Send IPI to the boot CPU (logical cpu 0). */ - on_selected_cpus(cpumask_of(0), __machine_restart, - &delay_millisecs, 0); + /* + * Send IPI to the boot CPU (logical cpu 0). + * + * If multiple CPUs called machine_restart() before actual restart + * took place, but after boot CPU declared itself not online, ASSERT + * in on_selected_cpus() will fail. Few calls later we would end up + * here again, with another frame on call stack for new exception. + * To protect against running out of stack, check if boot CPU is + * online. + * + * Note this is not an atomic operation, so it is possible for + * on_selected_cpus() to be called once after boot CPU is offline + * before we hit halt() below. + */ + if ( cpu_online(0) ) + on_selected_cpus(cpumask_of(0), __machine_restart, + &delay_millisecs, 0); for ( ; ; ) halt(); } From patchwork Tue Nov 14 17:50:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455773 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 36834C4332F for ; Tue, 14 Nov 2023 17:59:26 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633141.987791 (Exim 4.92) (envelope-from ) id 1r2xh0-0003DN-5I; Tue, 14 Nov 2023 17:59:18 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633141.987791; Tue, 14 Nov 2023 17:59:18 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xh0-0003DG-1r; Tue, 14 Nov 2023 17:59:18 +0000 Received: by outflank-mailman (input) for mailman id 633141; Tue, 14 Nov 2023 17:59:16 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYm-0004mK-J6 for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:48 +0000 Received: from 2.mo575.mail-out.ovh.net (2.mo575.mail-out.ovh.net [46.105.52.162]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 56e500ac-8316-11ee-98db-6d05b1d4d9a1; Tue, 14 Nov 2023 18:50:46 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.108.1.219]) by mo575.mail-out.ovh.net (Postfix) with ESMTP id 6591E291FE for ; Tue, 14 Nov 2023 17:50:46 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id D435F1FE62; Tue, 14 Nov 2023 17:50:45 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id wKTdMHWzU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:45 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 56e500ac-8316-11ee-98db-6d05b1d4d9a1 Authentication-Results: garm.ovh; auth=pass (GARM-103G0050aa832c6-640f-4b02-9663-a128ddce512f, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 08/10] x86/smp: drop booting_cpu variable Date: Tue, 14 Nov 2023 18:50:11 +0100 Message-ID: <22109ebd7edef1140cb438a6ec5fa1726cdf2c12.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12931523382006163824 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeehleekveevvdfhgeetlefhjedtjefgjedtkeekffeitdffkeffueetkedtgfeiueenucffohhmrghinhepgiekiegpieegrdhssgenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgepheenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheejhedpmhhouggvpehsmhhtphhouhht CPU id is obtained as a side effect of searching for appropriate stack for AP. It can be used as a parameter to start_secondary(). Coincidentally this also makes further work on making AP bring-up code parallel easier. Signed-off-by: Krystian Hebel --- xen/arch/x86/boot/x86_64.S | 13 +++++++++---- xen/arch/x86/smpboot.c | 15 +++++---------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S index 8d61f270761f..ad01f20d548d 100644 --- a/xen/arch/x86/boot/x86_64.S +++ b/xen/arch/x86/boot/x86_64.S @@ -20,20 +20,24 @@ ENTRY(__high_start) jz .L_stack_set /* APs only: get stack base from APIC ID saved in %esp. */ - mov $0, %rax + mov $0, %rbx lea cpu_data(%rip), %rcx /* cpu_data[0] is BSP, skip it. */ 1: - add $1, %rax + add $1, %rbx add $CPUINFO_X86_sizeof, %rcx - cmp $NR_CPUS, %eax + cmp $NR_CPUS, %rbx jb 2f hlt 2: cmp %esp, CPUINFO_X86_apicid(%rcx) jne 1b - /* %rcx is now cpu_data[cpu], read stack base from it. */ + /* + * At this point: + * - %rcx is cpu_data[cpu], read stack base from it, + * - %rbx (callee-save) is Xen cpu number, pass it to start_secondary(). + */ mov CPUINFO_X86_stack_base(%rcx), %rsp test %rsp,%rsp @@ -101,6 +105,7 @@ ENTRY(__high_start) .L_ap_cet_done: #endif /* CONFIG_XEN_SHSTK || CONFIG_XEN_IBT */ + mov %rbx, %rdi tailcall start_secondary .L_bsp: diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index a3895dafa267..39ffd356dbbc 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -222,8 +222,6 @@ static void smp_callin(void) cpu_relax(); } -static int booting_cpu; - /* CPUs for which sibling maps can be computed. */ static cpumask_t cpu_sibling_setup_map; @@ -311,15 +309,14 @@ static void set_cpu_sibling_map(unsigned int cpu) } } -void start_secondary(void *unused) +void start_secondary(unsigned int cpu) { struct cpu_info *info = get_cpu_info(); /* - * Dont put anything before smp_callin(), SMP booting is so fragile that we + * Don't put anything before smp_callin(), SMP booting is so fragile that we * want to limit the things done here to the most necessary things. */ - unsigned int cpu = booting_cpu; /* Critical region without IDT or TSS. Any fault is deadly! */ @@ -346,9 +343,9 @@ void start_secondary(void *unused) */ spin_debug_disable(); - get_cpu_info()->use_pv_cr3 = false; - get_cpu_info()->xen_cr3 = 0; - get_cpu_info()->pv_cr3 = 0; + info->use_pv_cr3 = false; + info->xen_cr3 = 0; + info->pv_cr3 = 0; /* * BUG_ON() used in load_system_tables() and later code may end up calling @@ -575,8 +572,6 @@ static int do_boot_cpu(int apicid, int cpu) */ mtrr_save_state(); - booting_cpu = cpu; - start_eip = bootsym_phys(trampoline_realmode_entry); /* start_eip needs be page aligned, and below the 1M boundary. */ From patchwork Tue Nov 14 17:50:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455771 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 95AE9C07545 for ; Tue, 14 Nov 2023 17:58:01 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633110.987746 (Exim 4.92) (envelope-from ) id 1r2xfZ-000428-SJ; Tue, 14 Nov 2023 17:57:49 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633110.987746; Tue, 14 Nov 2023 17:57:49 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xfY-0003ug-2Z; Tue, 14 Nov 2023 17:57:48 +0000 Received: by outflank-mailman (input) for mailman id 633110; Tue, 14 Nov 2023 17:50:49 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYn-0004wk-IS for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:49 +0000 Received: from 20.mo550.mail-out.ovh.net (20.mo550.mail-out.ovh.net [188.165.45.168]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 57962a2f-8316-11ee-9b0e-b553b5be7939; Tue, 14 Nov 2023 18:50:47 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.108.16.30]) by mo550.mail-out.ovh.net (Postfix) with ESMTP id 9591C29502 for ; Tue, 14 Nov 2023 17:50:47 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id 1B4B41FE62; Tue, 14 Nov 2023 17:50:47 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id KN3PAnezU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:47 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 57962a2f-8316-11ee-9b0e-b553b5be7939 Authentication-Results: garm.ovh; auth=pass (GARM-103G0055119ad6f-2ad2-4238-88e6-1a3ab37426bc, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 09/10] x86/smp: make cpu_state per-CPU Date: Tue, 14 Nov 2023 18:50:13 +0100 Message-ID: <52083114d4cbbc75f021e8c61763ad0e166cf05b.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12931804858740550000 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeevtdevieehieeiveekvefhlefftdfhteefueelhfdvhedtjeegkedugfefvdekffenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgepkeenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheehtddpmhhouggvpehsmhhtphhouhht This will be used for parallel AP bring-up. CPU_STATE_INIT changed direction. It was previously set by BSP and never consumed by AP. Now it signals that AP got through assembly part of initialization and waits for BSP to call notifiers that set up data structures required for further initialization. Signed-off-by: Krystian Hebel --- xen/arch/x86/include/asm/cpufeature.h | 1 + xen/arch/x86/smpboot.c | 80 ++++++++++++++++----------- 2 files changed, 49 insertions(+), 32 deletions(-) diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h index ff0e18864cc7..1831b5fb368f 100644 --- a/xen/arch/x86/include/asm/cpufeature.h +++ b/xen/arch/x86/include/asm/cpufeature.h @@ -38,6 +38,7 @@ struct cpuinfo_x86 { unsigned int cpu_core_id; /* core ID of each logical CPU */ unsigned int compute_unit_id; /* AMD compute unit ID of each logical CPU */ void *stack_base; + unsigned int cpu_state; unsigned short x86_clflush_size; } __cacheline_aligned; diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 39ffd356dbbc..cbea2d45f70d 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -65,15 +65,18 @@ struct cpuinfo_x86 cpu_data[NR_CPUS] = { [0 ... NR_CPUS-1] .apicid = BAD_APICID }; static int cpu_error; -static enum cpu_state { +enum cpu_state { CPU_STATE_DYING, /* slave -> master: I am dying */ CPU_STATE_DEAD, /* slave -> master: I am completely dead */ - CPU_STATE_INIT, /* master -> slave: Early bringup phase 1 */ - CPU_STATE_CALLOUT, /* master -> slave: Early bringup phase 2 */ + CPU_STATE_INIT, /* slave -> master: Early bringup phase 1 completed */ + CPU_STATE_CALLOUT, /* master -> slave: Start early bringup phase 2 */ CPU_STATE_CALLIN, /* slave -> master: Completed phase 2 */ CPU_STATE_ONLINE /* master -> slave: Go fully online now. */ -} cpu_state; -#define set_cpu_state(state) do { smp_mb(); cpu_state = (state); } while (0) +}; +#define set_cpu_state(cpu, state) do { \ + smp_mb(); \ + cpu_data[cpu].cpu_state = (state); \ +} while (0) void initialize_cpu_data(unsigned int cpu) { @@ -168,16 +171,7 @@ static void synchronize_tsc_slave(unsigned int slave) static void smp_callin(void) { unsigned int cpu = smp_processor_id(); - int i, rc; - - /* Wait 2s total for startup. */ - Dprintk("Waiting for CALLOUT.\n"); - for ( i = 0; cpu_state != CPU_STATE_CALLOUT; i++ ) - { - BUG_ON(i >= 200); - cpu_relax(); - mdelay(10); - } + int rc; /* * The boot CPU has finished the init stage and is spinning on cpu_state @@ -213,12 +207,12 @@ static void smp_callin(void) } /* Allow the master to continue. */ - set_cpu_state(CPU_STATE_CALLIN); + set_cpu_state(cpu, CPU_STATE_CALLIN); synchronize_tsc_slave(cpu); /* And wait for our final Ack. */ - while ( cpu_state != CPU_STATE_ONLINE ) + while ( cpu_data[cpu].cpu_state != CPU_STATE_ONLINE ) cpu_relax(); } @@ -313,6 +307,9 @@ void start_secondary(unsigned int cpu) { struct cpu_info *info = get_cpu_info(); + /* Tell BSP that we are awake. */ + set_cpu_state(cpu, CPU_STATE_INIT); + /* * Don't put anything before smp_callin(), SMP booting is so fragile that we * want to limit the things done here to the most necessary things. @@ -320,6 +317,10 @@ void start_secondary(unsigned int cpu) /* Critical region without IDT or TSS. Any fault is deadly! */ + /* Wait until data set up by CPU_UP_PREPARE notifiers is ready. */ + while ( cpu_data[cpu].cpu_state != CPU_STATE_CALLOUT ) + cpu_relax(); + set_current(idle_vcpu[cpu]); this_cpu(curr_vcpu) = idle_vcpu[cpu]; rdmsrl(MSR_EFER, this_cpu(efer)); @@ -585,26 +586,35 @@ static int do_boot_cpu(int apicid, int cpu) /* This grunge runs the startup process for the targeted processor. */ - set_cpu_state(CPU_STATE_INIT); - /* Starting actual IPI sequence... */ boot_error = wakeup_secondary_cpu(apicid, start_eip); if ( !boot_error ) { - /* Allow AP to start initializing. */ - set_cpu_state(CPU_STATE_CALLOUT); - Dprintk("After Callout %d.\n", cpu); - - /* Wait 5s total for a response. */ - for ( timeout = 0; timeout < 50000; timeout++ ) + /* Wait 2s total for a response. */ + for ( timeout = 0; timeout < 20000; timeout++ ) { - if ( cpu_state != CPU_STATE_CALLOUT ) + if ( cpu_data[cpu].cpu_state == CPU_STATE_INIT ) break; udelay(100); } - if ( cpu_state == CPU_STATE_CALLIN ) + if ( cpu_data[cpu].cpu_state == CPU_STATE_INIT ) + { + /* Allow AP to start initializing. */ + set_cpu_state(cpu, CPU_STATE_CALLOUT); + Dprintk("After Callout %d.\n", cpu); + + /* Wait 5s total for a response. */ + for ( timeout = 0; timeout < 500000; timeout++ ) + { + if ( cpu_data[cpu].cpu_state != CPU_STATE_CALLOUT ) + break; + udelay(10); + } + } + + if ( cpu_data[cpu].cpu_state == CPU_STATE_CALLIN ) { /* number CPUs logically, starting from 1 (BSP is 0) */ Dprintk("OK.\n"); @@ -612,7 +622,7 @@ static int do_boot_cpu(int apicid, int cpu) synchronize_tsc_master(cpu); Dprintk("CPU has booted.\n"); } - else if ( cpu_state == CPU_STATE_DEAD ) + else if ( cpu_data[cpu].cpu_state == CPU_STATE_DEAD ) { smp_rmb(); rc = cpu_error; @@ -683,7 +693,7 @@ unsigned long alloc_stub_page(unsigned int cpu, unsigned long *mfn) void cpu_exit_clear(unsigned int cpu) { cpu_uninit(cpu); - set_cpu_state(CPU_STATE_DEAD); + set_cpu_state(cpu, CPU_STATE_DEAD); } static int clone_mapping(const void *ptr, root_pgentry_t *rpt) @@ -1161,6 +1171,12 @@ void __init smp_prepare_cpus(void) cpu_data[0].stack_base = (void *) ((unsigned long)stack_start & ~(STACK_SIZE - 1)); + /* Set state as CALLOUT so APs won't change it in initialize_cpu_data() */ + boot_cpu_data.cpu_state = CPU_STATE_CALLOUT; + + /* Not really used anywhere, but set it just in case. */ + set_cpu_state(0, CPU_STATE_ONLINE); + set_nr_sockets(); socket_cpumask = xzalloc_array(cpumask_t *, nr_sockets); @@ -1267,7 +1283,7 @@ void __cpu_disable(void) { int cpu = smp_processor_id(); - set_cpu_state(CPU_STATE_DYING); + set_cpu_state(cpu, CPU_STATE_DYING); local_irq_disable(); clear_local_APIC(); @@ -1292,7 +1308,7 @@ void __cpu_die(unsigned int cpu) unsigned int i = 0; enum cpu_state seen_state; - while ( (seen_state = cpu_state) != CPU_STATE_DEAD ) + while ( (seen_state = cpu_data[cpu].cpu_state) != CPU_STATE_DEAD ) { BUG_ON(seen_state != CPU_STATE_DYING); mdelay(100); @@ -1393,7 +1409,7 @@ int __cpu_up(unsigned int cpu) time_latch_stamps(); - set_cpu_state(CPU_STATE_ONLINE); + set_cpu_state(cpu, CPU_STATE_ONLINE); while ( !cpu_online(cpu) ) { cpu_relax(); From patchwork Tue Nov 14 17:50:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krystian Hebel X-Patchwork-Id: 13455789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 23528C4167B for ; Tue, 14 Nov 2023 18:08:21 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.633170.987810 (Exim 4.92) (envelope-from ) id 1r2xpa-0007sZ-6n; Tue, 14 Nov 2023 18:08:10 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 633170.987810; Tue, 14 Nov 2023 18:08:10 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xpa-0007sS-41; Tue, 14 Nov 2023 18:08:10 +0000 Received: by outflank-mailman (input) for mailman id 633170; Tue, 14 Nov 2023 18:08:08 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r2xYo-0004mK-JY for xen-devel@lists.xenproject.org; Tue, 14 Nov 2023 17:50:50 +0000 Received: from 9.mo575.mail-out.ovh.net (9.mo575.mail-out.ovh.net [46.105.78.111]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 583befc2-8316-11ee-98db-6d05b1d4d9a1; Tue, 14 Nov 2023 18:50:49 +0100 (CET) Received: from director8.ghost.mail-out.ovh.net (unknown [10.109.138.246]) by mo575.mail-out.ovh.net (Postfix) with ESMTP id ABD9329246 for ; Tue, 14 Nov 2023 17:50:48 +0000 (UTC) Received: from ghost-submission-6684bf9d7b-x5j2z (unknown [10.110.115.90]) by director8.ghost.mail-out.ovh.net (Postfix) with ESMTPS id 3D13F1FD16; Tue, 14 Nov 2023 17:50:48 +0000 (UTC) Received: from 3mdeb.com ([37.59.142.103]) by ghost-submission-6684bf9d7b-x5j2z with ESMTPSA id aHObC3izU2V/lwcATVRwWg (envelope-from ); Tue, 14 Nov 2023 17:50:48 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 583befc2-8316-11ee-98db-6d05b1d4d9a1 Authentication-Results: garm.ovh; auth=pass (GARM-103G005909cbe0d-2db8-4cee-a3fc-f8a2f45a27b0, 1C6EC45AC3E1968723EBE40916FD99D0F8B07574) smtp.auth=krystian.hebel@3mdeb.com X-OVh-ClientIp: 213.192.77.249 From: Krystian Hebel To: xen-devel@lists.xenproject.org Cc: Krystian Hebel , Jan Beulich , Andrew Cooper , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Wei Liu Subject: [PATCH 10/10] x86/smp: start APs in parallel during boot Date: Tue, 14 Nov 2023 18:50:15 +0100 Message-ID: <77c9199eabf3a30ebcf89356b2dd35abd611a3a9.1699981248.git.krystian.hebel@3mdeb.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 X-Ovh-Tracer-Id: 12932086331639900528 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvkedrudeffedgudegucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhrhihsthhirghnucfjvggsvghluceokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqeenucggtffrrghtthgvrhhnpeevtdevieehieeiveekvefhlefftdfhteefueelhfdvhedtjeegkedugfefvdekffenucfkphepuddvjedrtddrtddruddpvddufedrudelvddrjeejrddvgeelpdefjedrheelrddugedvrddutdefnecuvehluhhsthgvrhfuihiivgepleenucfrrghrrghmpehinhgvthepuddvjedrtddrtddruddpmhgrihhlfhhrohhmpeeokhhrhihsthhirghnrdhhvggsvghlseefmhguvggsrdgtohhmqedpnhgspghrtghpthhtohepuddprhgtphhtthhopeigvghnqdguvghvvghlsehlihhsthhsrdigvghnphhrohhjvggtthdrohhrghdpoffvtefjohhsthepmhhoheejhedpmhhouggvpehsmhhtphhouhht Multiple delays are required when sending IPIs and waiting for responses. During boot, 4 such IPIs were sent per each AP. With this change, only one set of broadcast IPIs is sent. This reduces boot time, especially for platforms with large number of cores. Single CPU initialization is still possible, it is used for hotplug. During wakeup from S3 APs are started one by one. It should be possible to enable parallel execution there as well, but I don't have a way of properly testing it as of now. Signed-off-by: Krystian Hebel --- xen/arch/x86/include/asm/smp.h | 1 + xen/arch/x86/setup.c | 2 + xen/arch/x86/smpboot.c | 68 ++++++++++++++++++++++++---------- 3 files changed, 51 insertions(+), 20 deletions(-) diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index 98739028a6ed..6ca0158a368d 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -31,6 +31,7 @@ DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask); extern bool park_offline_cpus; void smp_send_nmi_allbutself(void); +void smp_send_init_sipi_sipi_allbutself(void); void send_IPI_mask(const cpumask_t *, int vector); void send_IPI_self(int vector); diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index b2c0679725ea..42a9067b81eb 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1963,6 +1963,8 @@ void __init noreturn __start_xen(unsigned long mbi_p) cpu_data[i].stack_base = cpu_alloc_stack(i); } + smp_send_init_sipi_sipi_allbutself(); + for_each_present_cpu ( i ) { if ( (park_offline_cpus || num_online_cpus() < max_cpus) && diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index cbea2d45f70d..e9a7f78a5a6f 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -425,7 +425,7 @@ void start_secondary(unsigned int cpu) static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) { - unsigned long send_status = 0, accept_status = 0; + unsigned long send_status = 0, accept_status = 0, sh = 0; int maxlvt, timeout, i; /* @@ -445,6 +445,12 @@ static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) if ( tboot_in_measured_env() && !tboot_wake_ap(phys_apicid, start_eip) ) return 0; + /* + * Use destination shorthand for broadcasting IPIs during boot. + */ + if ( phys_apicid == BAD_APICID ) + sh = APIC_DEST_ALLBUT; + /* * Be paranoid about clearing APIC errors. */ @@ -458,7 +464,7 @@ static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) /* * Turn INIT on target chip via IPI */ - apic_icr_write(APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT, + apic_icr_write(APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT | sh, phys_apicid); if ( !x2apic_enabled ) @@ -475,7 +481,7 @@ static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) Dprintk("Deasserting INIT.\n"); - apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT, phys_apicid); + apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT | sh, phys_apicid); Dprintk("Waiting for send to finish...\n"); timeout = 0; @@ -512,7 +518,7 @@ static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) * STARTUP IPI * Boot on the stack */ - apic_icr_write(APIC_DM_STARTUP | (start_eip >> 12), phys_apicid); + apic_icr_write(APIC_DM_STARTUP | (start_eip >> 12) | sh, phys_apicid); if ( !x2apic_enabled ) { @@ -565,7 +571,6 @@ int alloc_cpu_id(void) static int do_boot_cpu(int apicid, int cpu) { int timeout, boot_error = 0, rc = 0; - unsigned long start_eip; /* * Save current MTRR state in case it was changed since early boot @@ -573,21 +578,31 @@ static int do_boot_cpu(int apicid, int cpu) */ mtrr_save_state(); - start_eip = bootsym_phys(trampoline_realmode_entry); + /* Check if AP is already up. */ + if ( cpu_data[cpu].cpu_state != CPU_STATE_INIT ) + { + /* This grunge runs the startup process for the targeted processor. */ + unsigned long start_eip; + start_eip = bootsym_phys(trampoline_realmode_entry); - /* start_eip needs be page aligned, and below the 1M boundary. */ - if ( start_eip & ~0xff000 ) - panic("AP trampoline %#lx not suitably positioned\n", start_eip); + /* start_eip needs be page aligned, and below the 1M boundary. */ + if ( start_eip & ~0xff000 ) + panic("AP trampoline %#lx not suitably positioned\n", start_eip); - /* So we see what's up */ - if ( opt_cpu_info ) - printk("Booting processor %d/%d eip %lx\n", - cpu, apicid, start_eip); + /* So we see what's up */ + if ( opt_cpu_info ) + printk("AP trampoline at %lx\n", start_eip); - /* This grunge runs the startup process for the targeted processor. */ + /* mark "stuck" area as not stuck */ + bootsym(trampoline_cpu_started) = 0; + smp_mb(); - /* Starting actual IPI sequence... */ - boot_error = wakeup_secondary_cpu(apicid, start_eip); + /* Starting actual IPI sequence... */ + boot_error = wakeup_secondary_cpu(apicid, start_eip); + } + + if ( opt_cpu_info ) + printk("Booting processor %d/%d\n", cpu, apicid); if ( !boot_error ) { @@ -646,10 +661,6 @@ static int do_boot_cpu(int apicid, int cpu) rc = -EIO; } - /* mark "stuck" area as not stuck */ - bootsym(trampoline_cpu_started) = 0; - smp_mb(); - return rc; } @@ -1155,6 +1166,23 @@ static struct notifier_block cpu_smpboot_nfb = { .notifier_call = cpu_smpboot_callback }; +void smp_send_init_sipi_sipi_allbutself(void) +{ + unsigned long start_eip; + start_eip = bootsym_phys(trampoline_realmode_entry); + + /* start_eip needs be page aligned, and below the 1M boundary. */ + if ( start_eip & ~0xff000 ) + panic("AP trampoline %#lx not suitably positioned\n", start_eip); + + /* So we see what's up */ + if ( opt_cpu_info ) + printk("Booting APs in parallel, eip %lx\n", start_eip); + + /* Starting actual broadcast IPI sequence... */ + wakeup_secondary_cpu(BAD_APICID, start_eip); +} + void __init smp_prepare_cpus(void) { register_cpu_notifier(&cpu_smpboot_nfb);