Message ID | 53A007F7.6020103@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Srivatsa, Thanks for your prompt reply. On Tue, Jun 17, 2014 at 2:48 PM, Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> wrote: > Hi Sachin, > > On 06/17/2014 01:39 PM, Sachin Kamat wrote: >> Hi, >> >> I observe the below warnings while trying to boot Exynos5420 based boards >> since yesterday's linux-next (next-20140616) using multi_v7_defconfig. Looks > > I guess you meant next-20140617. I meant I started observing this warning next-20140616 onwards (next-20140617 as well). > >> like it is triggered by the commit 56e6921829 ("CPU hotplug, smp: >> flush any pending IPI callbacks before CPU offline"). Any ideas? >> >> >> ************************************************************* >> [ 0.046521] Exynos MCPM support installed >> [ 0.048939] CPU1: Booted secondary processor >> [ 0.065005] CPU1: update cpu_capacity 1535 >> [ 0.065011] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001 >> [ 0.065660] CPU2: Booted secondary processor >> [ 0.085005] CPU2: update cpu_capacity 1535 >> [ 0.085012] CPU2: thread -1, cpu 2, socket 0, mpidr 80000002 >> [ 0.085662] CPU3: Booted secondary processor >> [ 0.105005] CPU3: update cpu_capacity 1535 >> [ 0.105011] CPU3: thread -1, cpu 3, socket 0, mpidr 80000003 >> [ 1.105031] CPU4: failed to come online >> [ 1.105081] ------------[ cut here ]------------ >> [ 1.105104] WARNING: CPU: 0 PID: 1 at kernel/smp.c:228 >> flush_smp_call_function_queue+0xc0/0x178() >> [ 1.105112] Modules linked in: >> [ 1.105129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted >> 3.15.0-next-20140616-00002-g38f9385a061b #2035 >> [ 1.105157] [<c02160f0>] (unwind_backtrace) from [<c0211c8c>] >> (show_stack+0x10/0x14) >> [ 1.105179] [<c0211c8c>] (show_stack) from [<c0853794>] >> (dump_stack+0x8c/0x9c) >> [ 1.105198] [<c0853794>] (dump_stack) from [<c024bdf4>] >> (warn_slowpath_common+0x70/0x8c) >> [ 1.105216] [<c024bdf4>] (warn_slowpath_common) from [<c024beac>] >> (warn_slowpath_null+0x1c/0x24) >> [ 1.105235] [<c024beac>] (warn_slowpath_null) from [<c02a3944>] >> (flush_smp_call_function_queue+0xc0/0x178) >> [ 1.105253] [<c02a3944>] (flush_smp_call_function_queue) from >> [<c02a3a94>] (hotplug_cfd+0x98/0xd8) >> [ 1.105269] [<c02a3a94>] (hotplug_cfd) from [<c026b064>] >> (notifier_call_chain+0x44/0x84) >> [ 1.105285] [<c026b064>] (notifier_call_chain) from [<c024c1a4>] >> (_cpu_up+0x120/0x170) >> [ 1.105302] [<c024c1a4>] (_cpu_up) from [<c024c264>] (cpu_up+0x70/0x94) >> [ 1.105319] [<c024c264>] (cpu_up) from [<c0b5839c>] (smp_init+0xac/0xb0) >> [ 1.105337] [<c0b5839c>] (smp_init) from [<c0b2fc54>] >> (kernel_init_freeable+0x90/0x1dc) >> [ 1.105353] [<c0b2fc54>] (kernel_init_freeable) from [<c0851248>] >> (kernel_init+0xc/0xe8) >> [ 1.105368] [<c0851248>] (kernel_init) from [<c020e7f8>] >> (ret_from_fork+0x14/0x3c) >> [ 1.105389] ---[ end trace bc66942e4ab63168 ]--- > > Argh! I had put the switch-case handling for CPU_DYING at the 'wrong' place, > since I hadn't noticed that CPU_UP_CANCELED silently falls-through to CPU_DEAD. > This is what happens when people don't explicitly write "fall-through" in the > comments in a switch-case statement :-( > > Below is an updated patch, please let me know how it goes. (You'll have to > revert c47a9d7cca first, and then 56e692182, before trying this patch). I am unable to apply your below patch on top of the above 2 reverts. Applying: CPU hotplug, smp: Execute any pending IPI callbacks before CPU offline fatal: corrupt patch at line 106 Patch failed at 0001 CPU hotplug, smp: Execute any pending IPI callbacks before CPU offline Even with 'patch' I get the below failures: patching file kernel/smp.c Hunk #2 FAILED at 53. Hunk #3 FAILED at 179. 2 out of 3 hunks FAILED -- saving rejects to file kernel/smp.c.rej Regards, Sachin.
diff --git a/kernel/smp.c b/kernel/smp.c index 306f818..80c33f8 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -29,6 +29,8 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_function_data, cfd_data); static DEFINE_PER_CPU_SHARED_ALIGNED(struct llist_head, call_single_queue); +static void flush_smp_call_function_queue(bool warn_cpu_offline); + static int hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu) { @@ -51,12 +53,27 @@ hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu) #ifdef CONFIG_HOTPLUG_CPU case CPU_UP_CANCELED: case CPU_UP_CANCELED_FROZEN: + /* Fall-through to the CPU_DEAD[_FROZEN] case. */ case CPU_DEAD: case CPU_DEAD_FROZEN: free_cpumask_var(cfd->cpumask); free_percpu(cfd->csd); break; + + case CPU_DYING: + case CPU_DYING_FROZEN: + /* + * The IPIs for the smp-call-function callbacks queued by other + * CPUs might arrive late, either due to hardware latencies or + * because this CPU disabled interrupts (inside stop-machine) + * before the IPIs were sent. So flush out any pending callbacks + * explicitly (without waiting for the IPIs to arrive), to + * ensure that the outgoing CPU doesn't go offline with work + * still pending. + */ + flush_smp_call_function_queue(false); + break; #endif }; @@ -177,23 +194,47 @@ static int generic_exec_single(int cpu, struct call_single_data *csd, return 0; } -/* - * Invoked by arch to handle an IPI for call function single. Must be - * called from the arch with interrupts disabled. +/** + * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks + * + * Invoked by arch to handle an IPI for call function single. + * Must be called with interrupts disabled. */ void generic_smp_call_function_single_interrupt(void) { + flush_smp_call_function_queue(true); +} + +/** + * flush_smp_call_function_queue - Flush pending smp-call-function callbacks + * + * @warn_cpu_offline: If set to 'true', warn if callbacks were queued on an + * offline CPU. Skip this check if set to 'false'. + * + * Flush any pending smp-call-function callbacks queued on this CPU. This is + * invoked by the generic IPI handler, as well as by a CPU about to go offline, + * to ensure that all pending IPI callbacks are run before it goes completely + * offline. + * + * Loop through the call_single_queue and run all the queued callbacks. + * Must be called with interrupts disabled. + */ +static void flush_smp_call_function_queue(bool warn_cpu_offline) +{ + struct llist_head *head; struct llist_node *entry; struct call_single_data *csd, *csd_next; static bool warned; - entry = llist_del_all(&__get_cpu_var(call_single_queue)); + WARN_ON(!irqs_disabled()); + + head = &__get_cpu_var(call_single_queue); + entry = llist_del_all(head); entry = llist_reverse_order(entry); - /* - * Shouldn't receive this interrupt on a cpu that is not yet online. - */ - if (unlikely(!cpu_online(smp_processor_id()) && !warned)) { + /* There shouldn't be any pending callbacks on an offline CPU. */ + if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) && + !warned && !llist_empty(head))) { warned = true; WARN(1, "IPI on offline CPU %d\n", smp_processor_id());