Message ID | 1380292153-10480-1-git-send-email-lorenzo.pieralisi@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Sep 27, 2013 at 03:29:13PM +0100, Lorenzo Pieralisi wrote: > On the TC2 testchip, when all CPUs in a cluster enter standbywfi and > commit a power down request, the power controller will wait for [...] (Minor nit: please wrap the commit message to a shorter length for git log. I use 67, but I can't remember where that recommendation came from) [...] > This patch moves the GIC CPU IF disable call in the TC2 MCPM implementation > from the suspend method to the power down method to fix the mentioned > race condition. Reviewed-by: Dave Martin <Dave.Martin@arm.com> Tested-by: Dave Martn <Dave.Martin@arm.com> (for kexec) It's worth briefly summarising the kexec scenario too: kexec hotplugs secondary CPUs out during kernel reload/restart. Because kexec may (deliberately) trash the old kernel text, it is not OK for CPUs to follow the MCPM soft reboot path, since instructions after the WFI may have been replaced by kexec. If tc2_pm_down() does not disable the GIC cpu interface, there is a race between CPU powerdown in the old kernel and the IPI from the new kernel that triggers secondary boot, particluarly if the powerdown is slow (due to L2 cache cleaning for example). If the new kernel wins the race, the affected CPU(s) will not really be reset and may execute garbage after the WFI. (some comments below) > > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > --- > arch/arm/mach-vexpress/tc2_pm.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/arch/arm/mach-vexpress/tc2_pm.c b/arch/arm/mach-vexpress/tc2_pm.c > index 7aeb5d6..e6eb481 100644 > --- a/arch/arm/mach-vexpress/tc2_pm.c > +++ b/arch/arm/mach-vexpress/tc2_pm.c > @@ -131,6 +131,16 @@ static void tc2_pm_down(u64 residency) > } else > BUG(); > > + /* > + * If the CPU is committed to power down, make sure > + * the power controller will be in charge of waking it > + * up upon IRQ, ie IRQ lines are cut from GIC CPU IF > + * to the CPU by disabling the GIC CPU IF to prevent wfi > + * from completing execution behind power controller back > + */ > + if (!skip_wfi) > + gic_cpu_if_down(); > + In my version of this fix, I disabled the CPU interface much earlier, just after entering the function. I think it probably works either way. Do you think the location is critical, or should it not matter? As far as I could tell, it matters only that this is done sometime between committing to WFI and actually doing it, but if you think there are extra requirements then I would like to understand them. [...] Cheers ---Dave
On Fri, Sep 27, 2013 at 04:10:49PM +0100, Dave Martin wrote: > On Fri, Sep 27, 2013 at 03:29:13PM +0100, Lorenzo Pieralisi wrote: > > On the TC2 testchip, when all CPUs in a cluster enter standbywfi and > > commit a power down request, the power controller will wait for > > [...] > > (Minor nit: please wrap the commit message to a shorter length for git > log. I use 67, but I can't remember where that recommendation came from) Ok, will do. > [...] > > > This patch moves the GIC CPU IF disable call in the TC2 MCPM implementation > > from the suspend method to the power down method to fix the mentioned > > race condition. > > Reviewed-by: Dave Martin <Dave.Martin@arm.com> > Tested-by: Dave Martn <Dave.Martin@arm.com> (for kexec) Thanks ! > It's worth briefly summarising the kexec scenario too: > > kexec hotplugs secondary CPUs out during kernel reload/restart. > Because kexec may (deliberately) trash the old kernel text, it is > not OK for CPUs to follow the MCPM soft reboot path, since > instructions after the WFI may have been replaced by kexec. > > If tc2_pm_down() does not disable the GIC cpu interface, there is a > race between CPU powerdown in the old kernel and the IPI from the > new kernel that triggers secondary boot, particluarly if the > powerdown is slow (due to L2 cache cleaning for example). If the > new kernel wins the race, the affected CPU(s) will not really be > reset and may execute garbage after the WFI. > > (some comments below) > > > > > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > > --- > > arch/arm/mach-vexpress/tc2_pm.c | 11 ++++++++++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/arch/arm/mach-vexpress/tc2_pm.c b/arch/arm/mach-vexpress/tc2_pm.c > > index 7aeb5d6..e6eb481 100644 > > --- a/arch/arm/mach-vexpress/tc2_pm.c > > +++ b/arch/arm/mach-vexpress/tc2_pm.c > > @@ -131,6 +131,16 @@ static void tc2_pm_down(u64 residency) > > } else > > BUG(); > > > > + /* > > + * If the CPU is committed to power down, make sure > > + * the power controller will be in charge of waking it > > + * up upon IRQ, ie IRQ lines are cut from GIC CPU IF > > + * to the CPU by disabling the GIC CPU IF to prevent wfi > > + * from completing execution behind power controller back > > + */ > > + if (!skip_wfi) > > + gic_cpu_if_down(); > > + > > In my version of this fix, I disabled the CPU interface much earlier, > just after entering the function. I think it probably works either > way. Do you think the location is critical, or should it not matter? > > As far as I could tell, it matters only that this is done sometime > between committing to WFI and actually doing it, but if you think > there are extra requirements then I would like to understand them. You are right, it can be added anywhere before executing wfi; I just added it there for two reasons: - MCPM checked if wfi should be skipped. If wfi should be skipped disabling GIC CPU IF is useless - I want to issue the write to the GIC before we enter the no-fly zone (disable C-bit, clean cache, exit SMP (and disable CCI)). I know we could disable the GIC CPU IF right before executing wfi, but let's not play with fire, if that write hides a barrier we are in a bind. Lorenzo
diff --git a/arch/arm/mach-vexpress/tc2_pm.c b/arch/arm/mach-vexpress/tc2_pm.c index 7aeb5d6..e6eb481 100644 --- a/arch/arm/mach-vexpress/tc2_pm.c +++ b/arch/arm/mach-vexpress/tc2_pm.c @@ -131,6 +131,16 @@ static void tc2_pm_down(u64 residency) } else BUG(); + /* + * If the CPU is committed to power down, make sure + * the power controller will be in charge of waking it + * up upon IRQ, ie IRQ lines are cut from GIC CPU IF + * to the CPU by disabling the GIC CPU IF to prevent wfi + * from completing execution behind power controller back + */ + if (!skip_wfi) + gic_cpu_if_down(); + if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) { arch_spin_unlock(&tc2_pm_lock); @@ -231,7 +241,6 @@ static void tc2_pm_suspend(u64 residency) cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0); cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); ve_spc_set_resume_addr(cluster, cpu, virt_to_phys(mcpm_entry_point)); - gic_cpu_if_down(); tc2_pm_down(residency); }