Message ID | 20191125112754.25223-2-qais.yousef@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2,01/14] smp: Create a new function to shutdown nonboot cpus |
On Mon, Nov 25, 2019 at 11:27:41AM +0000, Qais Yousef wrote:
> This function will be used later in machine_shutdown() for some archs.
>
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Josh Poimboeuf <jpoimboe@redhat.com>
> CC: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> CC: Jiri Kosina <jkosina@suse.cz>
> CC: Nicholas Piggin <npiggin@gmail.com>
> CC: Daniel Lezcano <daniel.lezcano@linaro.org>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Eiichi Tsukata <devel@etsukata.com>
> CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>
> CC: Nadav Amit <namit@vmware.com>
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> CC: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> CC: Tony Luck <tony.luck@intel.com>
> CC: Fenghua Yu <fenghua.yu@intel.com>
> CC: Russell King <linux@armlinux.org.uk>
> CC: Catalin Marinas <catalin.marinas@arm.com>
> CC: Will Deacon <will@kernel.org>
> CC: linux-arm-kernel@lists.infradead.org
> CC: linux-ia64@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
>  include/linux/cpu.h |  2 ++
>  kernel/cpu.c        | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index bc6c879bd110..8229932fb053 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -118,6 +118,7 @@ extern void cpu_hotplug_disable(void);
>  extern void cpu_hotplug_enable(void);
>  void clear_tasks_mm_cpumask(int cpu);
>  int cpu_down(unsigned int cpu);
> +extern void smp_shutdown_nonboot_cpus(unsigned int primary_cpu);
>
>  #else /* CONFIG_HOTPLUG_CPU */
>
> @@ -129,6 +130,7 @@ static inline int cpus_read_trylock(void) { return true; }
>  static inline void lockdep_assert_cpus_held(void) { }
>  static inline void cpu_hotplug_disable(void) { }
>  static inline void cpu_hotplug_enable(void) { }
> +static inline void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) { }
>  #endif /* !CONFIG_HOTPLUG_CPU */
>
>  /* Wrappers which go away once all code is converted */
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index e2cad3ee2ead..94055a0d989e 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1058,6 +1058,23 @@ int cpu_down(unsigned int cpu)
>  }
>  EXPORT_SYMBOL(cpu_down);
>
> +void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
> +{
> +	unsigned int cpu;
> +
> +	if (!cpu_online(primary_cpu)) {
> +		pr_info("Attempting to shutdodwn nonboot cpus while boot cpu is offline!\n");
> +		cpu_online(primary_cpu);
> +	}
> +
> +	for_each_present_cpu(cpu) {
> +		if (cpu == primary_cpu)
> +			continue;
> +		if (cpu_online(cpu))
> +			cpu_down(cpu);
> +	}

How does this avoid racing with userspace attempting to restart CPUs
that have already been taken down by this function?

On 01/21/20 17:03, Russell King - ARM Linux admin wrote:
> On Mon, Nov 25, 2019 at 11:27:41AM +0000, Qais Yousef wrote:
> > +void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
> > +{
> > +	unsigned int cpu;
> > +
> > +	if (!cpu_online(primary_cpu)) {
> > +		pr_info("Attempting to shutdodwn nonboot cpus while boot cpu is offline!\n");
> > +		cpu_online(primary_cpu);

Eh, that should be cpu_up(primary_cpu)!

Which I have to say I'm not sure is the right thing to do.
migrate_to_reboot_cpu() picks the first online cpu if reboot_cpu (assumed 0) is
offline

migrate_to_reboot_cpu():

	/* Make certain the cpu I'm about to reboot on is online */
	if (!cpu_online(cpu))
		cpu = cpumask_first(cpu_online_mask);

> > +	}
> > +
> > +	for_each_present_cpu(cpu) {
> > +		if (cpu == primary_cpu)
> > +			continue;
> > +		if (cpu_online(cpu))
> > +			cpu_down(cpu);
> > +	}
>
> How does this avoid racing with userspace attempting to restart CPUs
> that have already been taken down by this function?

This is meant to be called from machine_shutdown() only.

But you've got a point.

The previous logic that used disable_nonboot_cpus(), which in turn called
freeze_secondary_cpus() didn't hold hotplug lock. So I assumed the higher level
logic of machine_shutdown() ensures that hotplug lock is held to synchronize
with potential other hotplug operations.

But I can see now that it doesn't.

With this series that migrates users to use device_{online,offline}, holding
the lock_device_hotplug() should protect against such races.

Worth noting that this is an existing problem in the code and not something
I introduced, of course it makes sense to fix it properly as part of this
series.

I'm not sure how the other archs deal with this TBH.

Thanks for having a look!

Cheers

--
Qais Yousef

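The fallback that migrate_to_reboot_cpu() applies, as quoted in the message above, can be modelled in a few lines of plain C. This is a minimal userspace sketch and not kernel code: pick_reboot_cpu(), first_online_cpu(), MODEL_NR_CPUS and the bitmask representation of the online mask are hypothetical names invented for illustration. Only the behaviour (keep the preferred CPU if it is online, otherwise take the first online one) comes from the quoted snippet.

```c
#include <assert.h>

/* Hypothetical model size; a real kernel uses nr_cpu_ids and cpumasks. */
#define MODEL_NR_CPUS 8

/*
 * Return the lowest-numbered CPU whose bit is set in online_mask.
 * Returning -1 for an empty mask is a choice made for this model only.
 */
int first_online_cpu(unsigned int online_mask)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (online_mask & (1u << cpu))
			return cpu;
	}
	return -1;
}

/*
 * Model of the fallback: use the preferred reboot CPU when it is
 * online, otherwise fall back to the first online CPU.
 */
int pick_reboot_cpu(unsigned int online_mask, int preferred)
{
	assert(preferred >= 0 && preferred < MODEL_NR_CPUS);

	if (online_mask & (1u << preferred))
		return preferred;
	return first_online_cpu(online_mask);
}
```

With CPUs 1-3 online and CPU 0 offline (mask 0x0E), pick_reboot_cpu(0x0E, 0) selects CPU 1 rather than bringing CPU 0 back up, which is the contrast with the cpu_up(primary_cpu) idea raised in the message.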
On Tue, Jan 21, 2020 at 05:47:52PM +0000, Qais Yousef wrote:
> On 01/21/20 17:03, Russell King - ARM Linux admin wrote:
> > On Mon, Nov 25, 2019 at 11:27:41AM +0000, Qais Yousef wrote:
> > > +void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
> > > +{
> > > +	unsigned int cpu;
> > > +
> > > +	if (!cpu_online(primary_cpu)) {
> > > +		pr_info("Attempting to shutdodwn nonboot cpus while boot cpu is offline!\n");
> > > +		cpu_online(primary_cpu);
>
> Eh, that should be cpu_up(primary_cpu)!
>
> Which I have to say I'm not sure is the right thing to do.
> migrate_to_reboot_cpu() picks the first online cpu if reboot_cpu (assumed 0) is
> offline
>
> migrate_to_reboot_cpu():
>
> 	/* Make certain the cpu I'm about to reboot on is online */
> 	if (!cpu_online(cpu))
> 		cpu = cpumask_first(cpu_online_mask);
>
> > > +	}
> > > +
> > > +	for_each_present_cpu(cpu) {
> > > +		if (cpu == primary_cpu)
> > > +			continue;
> > > +		if (cpu_online(cpu))
> > > +			cpu_down(cpu);
> > > +	}
> >
> > How does this avoid racing with userspace attempting to restart CPUs
> > that have already been taken down by this function?
>
> This is meant to be called from machine_shutdown() only.
>
> But you've got a point.
>
> The previous logic that used disable_nonboot_cpus(), which in turn called
> freeze_secondary_cpus() didn't hold hotplug lock. So I assumed the higher level
> logic of machine_shutdown() ensures that hotplug lock is held to synchronize
> with potential other hotplug operations.

freeze_secondary_cpus() takes the CPU maps lock while it takes CPUs
down, and then disables cpu hotplug by incrementing
cpu_hotplug_disabled. Incrementing that prevents cpu_up() and
cpu_down() being used, thereby preventing userspace from changing the
online state of any CPU in the system.

> But I can see now that it doesn't.
>
> With this series that migrates users to use device_{online,offline}, holding
> the lock_device_hotplug() should protect against such races.
>
> Worth noting that this is an existing problem in the code and not something
> I introduced, of course it makes sense to fix it properly as part of this
> series.
>
> I'm not sure how the other archs deal with this TBH.
>
> Thanks for having a look!
>
> Cheers
>
> --
> Qais Yousef
>

On 01/21/20 18:09, Russell King - ARM Linux admin wrote:
> On Tue, Jan 21, 2020 at 05:47:52PM +0000, Qais Yousef wrote:
> > On 01/21/20 17:03, Russell King - ARM Linux admin wrote:
> > > On Mon, Nov 25, 2019 at 11:27:41AM +0000, Qais Yousef wrote:
> > > > +void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
> > > > +{
> > > > +	unsigned int cpu;
> > > > +
> > > > +	if (!cpu_online(primary_cpu)) {
> > > > +		pr_info("Attempting to shutdodwn nonboot cpus while boot cpu is offline!\n");
> > > > +		cpu_online(primary_cpu);
> >
> > Eh, that should be cpu_up(primary_cpu)!
> >
> > Which I have to say I'm not sure is the right thing to do.
> > migrate_to_reboot_cpu() picks the first online cpu if reboot_cpu (assumed 0) is
> > offline
> >
> > migrate_to_reboot_cpu():
> >
> > 	/* Make certain the cpu I'm about to reboot on is online */
> > 	if (!cpu_online(cpu))
> > 		cpu = cpumask_first(cpu_online_mask);
> >
> > > > +	}
> > > > +
> > > > +	for_each_present_cpu(cpu) {
> > > > +		if (cpu == primary_cpu)
> > > > +			continue;
> > > > +		if (cpu_online(cpu))
> > > > +			cpu_down(cpu);
> > > > +	}
> > >
> > > How does this avoid racing with userspace attempting to restart CPUs
> > > that have already been taken down by this function?
> >
> > This is meant to be called from machine_shutdown() only.
> >
> > But you've got a point.
> >
> > The previous logic that used disable_nonboot_cpus(), which in turn called
> > freeze_secondary_cpus() didn't hold hotplug lock. So I assumed the higher level
> > logic of machine_shutdown() ensures that hotplug lock is held to synchronize
> > with potential other hotplug operations.
>
> freeze_secondary_cpus() takes the CPU maps lock while it takes CPUs
> down, and then disables cpu hotplug by incrementing
> cpu_hotplug_disabled. Incrementing that prevents cpu_up() and
> cpu_down() being used, thereby preventing userspace from changing the
> online state of any CPU in the system.

I see. Sorry I missed the CPU maps lock.

Yes this makes sense and should work here too.

Thanks for the help.

Thomas, I'll wait for your comment on this and potentially other patches before
sending v3.

Thanks

--
Qais Yousef

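The mechanism described in the thread (take the secondary CPUs down, then bump cpu_hotplug_disabled so that later cpu_up()/cpu_down() requests are refused) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code that was eventually merged: struct hotplug_model, model_cpu_up(), model_cpu_down() and model_shutdown_nonboot_cpus() are hypothetical names, the -EBUSY return value is an assumption of the model, and the CPU maps lock is not modelled because the example is single-threaded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model size; not related to any real configuration. */
#define MODEL_NR_CPUS 4

struct hotplug_model {
	bool online[MODEL_NR_CPUS];	/* stands in for cpu_online_mask */
	int disabled;			/* stands in for cpu_hotplug_disabled */
};

/* Model of a userspace-triggered online request. */
int model_cpu_up(struct hotplug_model *m, unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	if (m->disabled)
		return -EBUSY;	/* assumption: hotplug disabled means refuse */
	m->online[cpu] = true;
	return 0;
}

/* Model of a userspace-triggered offline request. */
int model_cpu_down(struct hotplug_model *m, unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	if (m->disabled)
		return -EBUSY;
	m->online[cpu] = false;
	return 0;
}

/*
 * Take every CPU except primary_cpu offline, then disable further
 * hotplug. In the kernel both steps would happen under the CPU maps
 * lock; that lock is deliberately left out of this model.
 */
void model_shutdown_nonboot_cpus(struct hotplug_model *m,
				 unsigned int primary_cpu)
{
	assert(primary_cpu < MODEL_NR_CPUS);

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu != primary_cpu)
			m->online[cpu] = false;
	}
	m->disabled++;
}
```

After model_shutdown_nonboot_cpus() returns, a call such as model_cpu_up(&m, 1) is refused and CPU 1 stays offline, which is the property the posted loop of plain cpu_down() calls lacks.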
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index bc6c879bd110..8229932fb053 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -118,6 +118,7 @@ extern void cpu_hotplug_disable(void);
 extern void cpu_hotplug_enable(void);
 void clear_tasks_mm_cpumask(int cpu);
 int cpu_down(unsigned int cpu);
+extern void smp_shutdown_nonboot_cpus(unsigned int primary_cpu);

 #else /* CONFIG_HOTPLUG_CPU */

@@ -129,6 +130,7 @@ static inline int cpus_read_trylock(void) { return true; }
 static inline void lockdep_assert_cpus_held(void) { }
 static inline void cpu_hotplug_disable(void) { }
 static inline void cpu_hotplug_enable(void) { }
+static inline void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) { }
 #endif /* !CONFIG_HOTPLUG_CPU */

 /* Wrappers which go away once all code is converted */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e2cad3ee2ead..94055a0d989e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1058,6 +1058,23 @@ int cpu_down(unsigned int cpu)
 }
 EXPORT_SYMBOL(cpu_down);

+void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
+{
+	unsigned int cpu;
+
+	if (!cpu_online(primary_cpu)) {
+		pr_info("Attempting to shutdodwn nonboot cpus while boot cpu is offline!\n");
+		cpu_online(primary_cpu);
+	}
+
+	for_each_present_cpu(cpu) {
+		if (cpu == primary_cpu)
+			continue;
+		if (cpu_online(cpu))
+			cpu_down(cpu);
+	}
+}
+
 #else
 #define takedown_cpu NULL
 #endif /*CONFIG_HOTPLUG_CPU*/

This function will be used later in machine_shutdown() for some archs.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Josh Poimboeuf <jpoimboe@redhat.com>
CC: "Peter Zijlstra (Intel)" <peterz@infradead.org>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Nicholas Piggin <npiggin@gmail.com>
CC: Daniel Lezcano <daniel.lezcano@linaro.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Eiichi Tsukata <devel@etsukata.com>
CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>
CC: Nadav Amit <namit@vmware.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
CC: Tony Luck <tony.luck@intel.com>
CC: Fenghua Yu <fenghua.yu@intel.com>
CC: Russell King <linux@armlinux.org.uk>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will@kernel.org>
CC: linux-arm-kernel@lists.infradead.org
CC: linux-ia64@vger.kernel.org
CC: linux-kernel@vger.kernel.org
---
 include/linux/cpu.h |  2 ++
 kernel/cpu.c        | 17 +++++++++++++++++
 2 files changed, 19 insertions(+)