| Message ID | 20230308171328.1562857-11-usama.arif@bytedance.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | Parallel CPU bringup for x86_64 |
On Wed, Mar 08 2023 at 17:13, Usama Arif wrote:
>
> +/* Bringup step one: Send INIT/SIPI to the target AP */
> +static int native_cpu_kick(unsigned int cpu)
> +{
> +	return do_cpu_up(cpu, idle_thread_get(cpu));

This idle_thread_get() is not sufficient. bringup_cpu() does:

	struct task_struct *idle = idle_thread_get(cpu);

	/*
	 * Reset stale stack state from the last time this CPU was online.
	 */
	scs_task_reset(idle);
	kasan_unpoison_task_stack(idle);

But with this new model neither the shadow stack gets reset nor the
kasan unpoisoning happens _before_ the to be kicked CPU starts
executing.

That needs a new function which does the get() and the above.

Thanks,

        tglx
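For illustration only, a minimal sketch of the kind of helper being asked
for here: combine the get() with the stack-state reset so both happen
before the AP is kicked. The name idle_thread_get_reset() is invented for
this sketch; the series ultimately folds the reset into idle_thread_get()
itself via a 'bool unpoison' argument, as the patch later in this thread
shows.

	/*
	 * Hypothetical helper (name invented for illustration): fetch the
	 * idle thread and reset its stale stack state *before* the AP is
	 * sent INIT/SIPI. Mirrors what bringup_cpu() does today, and like
	 * bringup_cpu() it assumes a non-NULL return -- see the NULL
	 * discussion that follows.
	 */
	static struct task_struct *idle_thread_get_reset(unsigned int cpu)
	{
		struct task_struct *idle = idle_thread_get(cpu);

		/* Reset stale stack state from the last time this CPU was online. */
		scs_task_reset(idle);
		kasan_unpoison_task_stack(idle);

		return idle;
	}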
On Sat, 2023-03-11 at 10:54 +0200, Thomas Gleixner wrote:
> On Wed, Mar 08 2023 at 17:13, Usama Arif wrote:
> >
> > +/* Bringup step one: Send INIT/SIPI to the target AP */
> > +static int native_cpu_kick(unsigned int cpu)
> > +{
> > +	return do_cpu_up(cpu, idle_thread_get(cpu));
>
> This idle_thread_get() is not sufficient. bringup_cpu() does:
>
>	struct task_struct *idle = idle_thread_get(cpu);
>
>	/*
>	 * Reset stale stack state from the last time this CPU was online.
>	 */
>	scs_task_reset(idle);
>	kasan_unpoison_task_stack(idle);
>
> But with this new model neither the shadow stack gets reset nor the
> kasan unpoisoning happens _before_ the to be kicked CPU starts
> executing.
>
> That needs a new function which does the get() and the above.

Ah, good catch. Those were added after we started on this journey :)

I think I'll do it with a 'bool unpoison' argument to
idle_thread_get(). Or just make it unconditional; they're idempotent
anyway and cheap enough? Kind of weird to be doing it from finish_cpu()
though, so I'll probably stick with the argument.

....*types*....

Erm, there are circumstances (!CONFIG_GENERIC_SMP_IDLE_THREAD) when
idle_thread_get() just unconditionally returns NULL.

At first glance, it doesn't look like scs_task_reset() copes with being
passed a NULL. Am I missing something?

$ grep -c GENERIC_SMP_IDLE_THREAD `grep -l SMP arch/*/Kconfig`
arch/alpha/Kconfig:1
arch/arc/Kconfig:1
arch/arm64/Kconfig:1
arch/arm/Kconfig:1
arch/csky/Kconfig:1
arch/hexagon/Kconfig:1
arch/ia64/Kconfig:1
arch/loongarch/Kconfig:1
arch/mips/Kconfig:1
arch/openrisc/Kconfig:1
arch/parisc/Kconfig:1
arch/powerpc/Kconfig:1
arch/riscv/Kconfig:1
arch/s390/Kconfig:1
arch/sh/Kconfig:1
arch/sparc/Kconfig:1
arch/um/Kconfig:0
arch/x86/Kconfig:1
arch/xtensa/Kconfig:1

Maybe just nobody but UM cares?
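For reference, the stub in question lives in kernel/smpboot.h before this
series (these are the same lines the patch below removes): without
CONFIG_GENERIC_SMP_IDLE_THREAD, callers simply get NULL, which
scs_task_reset() was never written to accept.

	#ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
	struct task_struct *idle_thread_get(unsigned int cpu);
	#else
	static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
	#endif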
On Sat, 2023-03-11 at 10:54 +0200, Thomas Gleixner wrote:
> On Wed, Mar 08 2023 at 17:13, Usama Arif wrote:
> >
> > +/* Bringup step one: Send INIT/SIPI to the target AP */
> > +static int native_cpu_kick(unsigned int cpu)
> > +{
> > +	return do_cpu_up(cpu, idle_thread_get(cpu));
>
> This idle_thread_get() is not sufficient. bringup_cpu() does:
>
>	struct task_struct *idle = idle_thread_get(cpu);
>
>	/*
>	 * Reset stale stack state from the last time this CPU was online.
>	 */
>	scs_task_reset(idle);
>	kasan_unpoison_task_stack(idle);
>
> But with this new model neither the shadow stack gets reset nor the
> kasan unpoisoning happens _before_ the to be kicked CPU starts
> executing.
>
> That needs a new function which does the get() and the above.

From f2ea5c62be5f63a8c701e0eda8accd177939e087 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw@amazon.co.uk>
Date: Thu, 23 Feb 2023 19:11:30 +0000
Subject: [PATCH 02/10] cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>

Instead of relying purely on the special-case wrapper in bringup_cpu()
to pass the idle thread to __cpu_up(), expose idle_thread_get() so that
the architecture code can obtain it directly when necessary.

This will be useful when the existing __cpu_up() is split into multiple
phases, only *one* of which will actually need the idle thread. If the
architecture code is to register its new pre-bringup states with the
cpuhp core, having a special-case wrapper to pass extra arguments is
non-trivial and it's easier just to let the arch register its function
pointer to be invoked with the standard API.

To reduce duplication, move the shadow stack reset and kasan unpoisoning
into idle_thread_get() too.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---
 include/linux/smpboot.h | 10 ++++++++++
 kernel/cpu.c            | 13 +++----------
 kernel/smpboot.c        | 11 ++++++++++-
 kernel/smpboot.h        |  2 --
 4 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/include/linux/smpboot.h b/include/linux/smpboot.h
index 9d1bc65d226c..df6417703e4c 100644
--- a/include/linux/smpboot.h
+++ b/include/linux/smpboot.h
@@ -5,6 +5,16 @@
 #include <linux/types.h>
 
 struct task_struct;
+
+#ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
+struct task_struct *idle_thread_get(unsigned int cpu, bool unpoison);
+#else
+static inline struct task_struct *idle_thread_get(unsigned int cpu, bool unpoison)
+{
+	return NULL;
+}
+#endif
+
 /* Cookie handed to the thread_fn*/
 struct smpboot_thread_data;
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb5..6b3dccb4a888 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -31,7 +31,6 @@
 #include <linux/smpboot.h>
 #include <linux/relay.h>
 #include <linux/slab.h>
-#include <linux/scs.h>
 #include <linux/percpu-rwsem.h>
 #include <linux/cpuset.h>
 #include <linux/random.h>
@@ -588,15 +587,9 @@ static int bringup_wait_for_ap(unsigned int cpu)
 
 static int bringup_cpu(unsigned int cpu)
 {
-	struct task_struct *idle = idle_thread_get(cpu);
+	struct task_struct *idle = idle_thread_get(cpu, true);
 	int ret;
 
-	/*
-	 * Reset stale stack state from the last time this CPU was online.
-	 */
-	scs_task_reset(idle);
-	kasan_unpoison_task_stack(idle);
-
 	/*
 	 * Some architectures have to walk the irq descriptors to
 	 * setup the vector space for the cpu which comes online.
@@ -614,7 +607,7 @@ static int bringup_cpu(unsigned int cpu)
 
 static int finish_cpu(unsigned int cpu)
 {
-	struct task_struct *idle = idle_thread_get(cpu);
+	struct task_struct *idle = idle_thread_get(cpu, false);
 	struct mm_struct *mm = idle->active_mm;
 
 	/*
@@ -1378,7 +1371,7 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
 
 	if (st->state == CPUHP_OFFLINE) {
 		/* Let it fail before we try to bring the cpu up */
-		idle = idle_thread_get(cpu);
+		idle = idle_thread_get(cpu, false);
 		if (IS_ERR(idle)) {
 			ret = PTR_ERR(idle);
 			goto out;
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 2c7396da470c..24e81c725e7b 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -11,6 +11,7 @@
 #include <linux/slab.h>
 #include <linux/sched.h>
 #include <linux/sched/task.h>
+#include <linux/scs.h>
 #include <linux/export.h>
 #include <linux/percpu.h>
 #include <linux/kthread.h>
@@ -27,12 +28,20 @@
  */
 static DEFINE_PER_CPU(struct task_struct *, idle_threads);
 
-struct task_struct *idle_thread_get(unsigned int cpu)
+struct task_struct *idle_thread_get(unsigned int cpu, bool unpoison)
 {
 	struct task_struct *tsk = per_cpu(idle_threads, cpu);
 
 	if (!tsk)
 		return ERR_PTR(-ENOMEM);
+
+	if (unpoison) {
+		/*
+		 * Reset stale stack state from last time this CPU was online.
+		 */
+		scs_task_reset(tsk);
+		kasan_unpoison_task_stack(tsk);
+	}
 	return tsk;
 }
 
diff --git a/kernel/smpboot.h b/kernel/smpboot.h
index 34dd3d7ba40b..60c609318ad6 100644
--- a/kernel/smpboot.h
+++ b/kernel/smpboot.h
@@ -5,11 +5,9 @@
 struct task_struct;
 
 #ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
-struct task_struct *idle_thread_get(unsigned int cpu);
 void idle_thread_set_boot_cpu(void);
 void idle_threads_init(void);
 #else
-static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
 static inline void idle_thread_set_boot_cpu(void) { }
 static inline void idle_threads_init(void) { }
 #endif
On Sat, Mar 11 2023 at 09:55, David Woodhouse wrote:
> On Sat, 2023-03-11 at 10:54 +0200, Thomas Gleixner wrote:
> I think I'll do it with a 'bool unpoison' argument to
> idle_thread_get(). Or just make it unconditional; they're idempotent
> anyway and cheap enough? Kind of weird to be doing it from finish_cpu()
> though, so I'll probably stick with the argument.

Eew.

> ....*types*....
>
> Erm, there are circumstances (!CONFIG_GENERIC_SMP_IDLE_THREAD) when
> idle_thread_get() just unconditionally returns NULL.
>
> At first glance, it doesn't look like scs_task_reset() copes with being
> passed a NULL. Am I missing something?

Shadow call stacks are only enabled by arm64 today, and that uses the
generic idle threads.

Thanks,

        tglx
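For context, scs_task_reset() in <linux/scs.h> has roughly the following
shape (sketched from memory; exact details vary by kernel version). The
real variant dereferences the task unconditionally, but it only exists
with CONFIG_SHADOW_CALL_STACK=y, and the one architecture selecting that
(arm64) also selects GENERIC_SMP_IDLE_THREAD, so the NULL-returning
idle_thread_get() stub never pairs with it in practice:

	#ifdef CONFIG_SHADOW_CALL_STACK
	static inline void scs_task_reset(struct task_struct *tsk)
	{
		/* Rewind the shadow stack pointer to its base; oopses if tsk is NULL. */
		task_scs_sp(tsk) = task_scs(tsk);
	}
	#else
	static inline void scs_task_reset(struct task_struct *tsk) {}	/* no-op */
	#endif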
On 11 March 2023 14:14:53 GMT, Thomas Gleixner <tglx@linutronix.de> wrote:
>On Sat, Mar 11 2023 at 09:55, David Woodhouse wrote:
>> On Sat, 2023-03-11 at 10:54 +0200, Thomas Gleixner wrote:
>> I think I'll do it with a 'bool unpoison' argument to
>> idle_thread_get(). Or just make it unconditional; they're idempotent
>> anyway and cheap enough? Kind of weird to be doing it from finish_cpu()
>> though, so I'll probably stick with the argument.
>
>Eew.

Hm? I prefer the idea that idle_thread_get() is able to just return a
*usable* one, and that we don't rely on architectures to have the *same*
set of functions to unpoison/prepare it, and keep those duplicates in
sync...

I suppose we could make a separate
make_that_idle_thread_you_gave_me_actually_useful() function and avoid
the duplication of anything but *that* call... but meh.
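A rough sketch of that alternative, with a hypothetical helper name:
leave idle_thread_get() untouched and hoist only the unpoison/reset pair
into one shared function, at the cost of every pre-bringup call site
having to remember to call it.

	/*
	 * Hypothetical standalone helper -- the duplication-avoiding
	 * alternative alluded to above. Callers would do:
	 *	idle = idle_thread_get(cpu);
	 *	idle_thread_unpoison(idle);
	 */
	static inline void idle_thread_unpoison(struct task_struct *idle)
	{
		scs_task_reset(idle);
		kasan_unpoison_task_stack(idle);
	}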
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index fd4e678b6588..a3572b2ebfd3 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -57,6 +57,7 @@
 #include <linux/pgtable.h>
 #include <linux/overflow.h>
 #include <linux/stackprotector.h>
+#include <linux/smpboot.h>
 
 #include <asm/acpi.h>
 #include <asm/cacheinfo.h>
@@ -992,7 +993,8 @@ static void announce_cpu(int cpu, int apicid)
 		node_width = num_digits(num_possible_nodes()) + 1; /* + '#' */
 
 	if (cpu == 1)
-		printk(KERN_INFO "x86: Booting SMP configuration:\n");
+		printk(KERN_INFO "x86: Booting SMP configuration in %s:\n",
+		       do_parallel_bringup ? "parallel" : "series");
 
 	if (system_state < SYSTEM_RUNNING) {
 		if (node != current_node) {
@@ -1325,9 +1327,12 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 {
 	int ret;
 
-	ret = do_cpu_up(cpu, tidle);
-	if (ret)
-		return ret;
+	/* If parallel AP bringup isn't enabled, perform the first steps now. */
+	if (!do_parallel_bringup) {
+		ret = do_cpu_up(cpu, tidle);
+		if (ret)
+			return ret;
+	}
 
 	ret = do_wait_cpu_initialized(cpu);
 	if (ret)
@@ -1349,6 +1354,12 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 	return ret;
 }
 
+/* Bringup step one: Send INIT/SIPI to the target AP */
+static int native_cpu_kick(unsigned int cpu)
+{
+	return do_cpu_up(cpu, idle_thread_get(cpu));
+}
+
 /**
  * arch_disable_smp_support() - disables SMP support for x86 at runtime
  */
@@ -1517,6 +1528,8 @@ static bool prepare_parallel_bringup(void)
 		smpboot_control = STARTUP_APICID_CPUID_01;
 	}
 
+	cpuhp_setup_state_nocalls(CPUHP_BP_PARALLEL_DYN, "x86/cpu:kick",
+				  native_cpu_kick, NULL);
 	return true;
 }
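For orientation, the registered "x86/cpu:kick" state is consumed
elsewhere in the series by the cpuhp core, which runs every dynamically
registered CPUHP_BP_PARALLEL_DYN state across all present CPUs before
falling back to the traditional serialized bringup. The following is a
paraphrased sketch, not a quote from the series; the CPUHP_BP_PARALLEL_DYN_END
bound and the exact loop conditions are recalled from memory, and the
setup_max_cpus clamping is elided for brevity:

	/*
	 * Paraphrased sketch of the series' bringup_nonboot_cpus() flow:
	 * pass 1 runs each registered parallel pre-bringup state (e.g.
	 * "x86/cpu:kick", which sends INIT/SIPI) for every CPU, so all APs
	 * start coming up concurrently; pass 2 then completes the
	 * one-at-a-time bringup to CPUHP_ONLINE as before.
	 */
	void bringup_nonboot_cpus(unsigned int setup_max_cpus)
	{
		enum cpuhp_state st;
		unsigned int cpu;

		/* Pass 1: kick every AP. */
		for (st = CPUHP_BP_PARALLEL_DYN; st < CPUHP_BP_PARALLEL_DYN_END; st++)
			for_each_present_cpu(cpu)
				if (!cpu_online(cpu))
					cpu_up(cpu, st);	/* e.g. native_cpu_kick() */

		/* Pass 2: serialized wait + online, as before. */
		for_each_present_cpu(cpu)
			if (!cpu_online(cpu))
				cpu_up(cpu, CPUHP_ONLINE);
	}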