From patchwork Fri Feb 22 12:07:30 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 2175821 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) by patchwork2.kernel.org (Postfix) with ESMTP id 81F5ADFABD for ; Fri, 22 Feb 2013 12:10:32 +0000 (UTC) Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1U8rPf-0005Hz-GQ; Fri, 22 Feb 2013 12:07:39 +0000 Received: from www.linutronix.de ([62.245.132.108] helo=Galois.linutronix.de) by merlin.infradead.org with esmtps (Exim 4.76 #1 (Red Hat Linux)) id 1U8rPb-0005HJ-Kw for linux-arm-kernel@lists.infradead.org; Fri, 22 Feb 2013 12:07:36 +0000 Received: from localhost ([127.0.0.1]) by Galois.linutronix.de with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1U8rPX-0005L6-4o; Fri, 22 Feb 2013 13:07:31 +0100 Date: Fri, 22 Feb 2013 13:07:30 +0100 (CET) From: Thomas Gleixner To: Santosh Shilimkar Subject: Re: too many timer retries happen when do local timer swtich with broadcast timer In-Reply-To: <51275058.7010809@ti.com> Message-ID: References: <51263975.20906@ti.com> <5127436E.4040100@ti.com> <20130222103149.GC12140@e102568-lin.cambridge.arm.com> <51275058.7010809@ti.com> User-Agent: Alpine 2.02 (LFD 1266 2009-07-14) MIME-Version: 1.0 X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20130222_070735_984120_E4C4B385 X-CRM114-Status: GOOD ( 29.29 ) X-Spam-Score: -4.9 (----) X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary: Content analysis details: (-4.9 points) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium trust [62.245.132.108 listed in list.dnswl.org] -0.7 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] Cc: Jason Liu , Lorenzo Pieralisi , LKML , "linux-arm-kernel@lists.infradead.org" X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linux-arm-kernel-bounces@lists.infradead.org Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org On Fri, 22 Feb 2013, Santosh Shilimkar wrote: > On Friday 22 February 2013 04:01 PM, Lorenzo Pieralisi wrote: > > On Fri, Feb 22, 2013 at 10:24:00AM +0000, Thomas Gleixner wrote: > > > On Fri, 22 Feb 2013, Santosh Shilimkar wrote: > > > > BTW, Lorenzo off-list mentioned to me about warning in boot-up > > > > which I missed while testing your patch. It will take bit more > > > > time for me to look into it and hence thought of reporting it. > > > > > > > > [ 2.186126] ------------[ cut here ]------------ > > > > [ 2.190979] WARNING: at kernel/time/tick-broadcast.c:501 > > > > tick_broadcast_oneshot_control+0x1c0/0x21c() > > > > > > Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ? > > > > It is the tick_force_broadcast_mask and I think that's because on all > > systems we are testing, the broadcast timer IRQ is a thundering herd, > > all CPUs get out of idle at once and try to get out of broadcast mode > > at more or less the same time. > > > So the issue comes ups only when the idle state used where CPU wakeup > more or less at same time as Lorenzo mentioned. I have two platforms > where I could test the patch and see the issue only with one platform. > > Yesterday I didn't notice the warning because it wasn't seen on that > platform :-) OMAP4 idle entry and exit in deep state is staggered > between CPUs and hence the warning isn't seen. On OMAP5 though, > there is an additional C-state where idle entry/exit for CPU > isn't staggered and I see the issue in that case. > > Actually the broad-cast code doesn't expect such a behavior > from CPUs since only the broad-cast affine CPU should wake > up and rest of the CPU should be woken up by the broad-cast > IPIs. That's what I feared. We might have the same issue on x86, depending on the cpu model. But thinking more about it. It's actually not a real problem, just pointless burning of cpu cycles. So on the CPU which gets woken along with the target CPU of the broadcast the following happens: deep_idle() <-- spurious wakeup broadcast_exit() set forced bit enable interrupts <-- Nothing happens disable interrupts broadcast_enter() <-- Here we observe the forced bit is set deep_idle() Now after that the target CPU of the broadcast runs the broadcast handler and finds the other CPU in both the broadcast and the forced mask, sends the IPI and stuff gets back to normal. So it's not actually harmful, just more evidence for the theory, that hardware designers have access to very special drug supplies. Now we could make use of that and avoid going deep idle just to come back right away via the IPI. Unfortunately the notification thingy has no return value, but we can fix that. To confirm that theory, could you please try the hack below and add some instrumentation (trace_printk)? Thanks, tglx Index: linux-2.6/arch/arm/kernel/process.c =================================================================== --- linux-2.6.orig/arch/arm/kernel/process.c +++ linux-2.6/arch/arm/kernel/process.c @@ -199,7 +199,7 @@ void cpu_idle(void) #ifdef CONFIG_PL310_ERRATA_769419 wmb(); #endif - if (hlt_counter) { + if (hlt_counter || tick_check_broadcast_pending()) { local_irq_enable(); cpu_relax(); } else if (!need_resched()) { Index: linux-2.6/include/linux/clockchips.h =================================================================== --- linux-2.6.orig/include/linux/clockchips.h +++ linux-2.6/include/linux/clockchips.h @@ -170,6 +170,12 @@ extern void tick_broadcast(const struct extern int tick_receive_broadcast(void); #endif +#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT) +extern int tick_check_broadcast_pending(void); +#else +static inline int tick_check_broadcast_pending(void) { return 0; } +#endif + #ifdef CONFIG_GENERIC_CLOCKEVENTS extern void clockevents_notify(unsigned long reason, void *arg); #else Index: linux-2.6/kernel/time/tick-broadcast.c =================================================================== --- linux-2.6.orig/kernel/time/tick-broadcast.c +++ linux-2.6/kernel/time/tick-broadcast.c @@ -418,6 +418,15 @@ static int tick_broadcast_set_event(ktim return clockevents_program_event(bc, expires, force); } +/* + * Called before going idle with interrupts disabled. Checks whether a + * broadcast event from the other core is about to happen. + */ +int tick_check_broadcast_pending(void) +{ + return test_bit(smp_processor_id(), tick_force_broadcast_mask); +} + int tick_resume_broadcast_oneshot(struct clock_event_device *bc) { clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);