Message ID | 20130814115602.5193.60835.stgit@preeti.in.ibm.com (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On Wed, 2013-08-14 at 17:26 +0530, Preeti U Murthy wrote: > -static irqreturn_t unused_action(int irq, void *data) > +static irqreturn_t timer_action(int irq, void *data) > { > - /* This slot is unused and hence available for use, if needed > */ > + timer_interrupt(); > return IRQ_HANDLED; > } > That means we'll do irq_enter/irq_exit twice no ? And things like may_hard_irq_enable() are also already done by do_IRQ so you don't need timer_interrupt() to do it again. We probably are better off breaking timer_interrupt in two: void __timer_interrupt(struct pt_regs * regs) Does the current stuff between irq_enter and irq_exit, timer_interrupt does the remaining around it and calls __timer_interrupt. Then from timer_action, you call __timer_interrupt() Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Ben On 08/22/2013 08:40 AM, Benjamin Herrenschmidt wrote: > On Wed, 2013-08-14 at 17:26 +0530, Preeti U Murthy wrote: >> -static irqreturn_t unused_action(int irq, void *data) >> +static irqreturn_t timer_action(int irq, void *data) >> { >> - /* This slot is unused and hence available for use, if needed >> */ >> + timer_interrupt(); >> return IRQ_HANDLED; >> } >> > > That means we'll do irq_enter/irq_exit twice no ? And things like > may_hard_irq_enable() are also already done by do_IRQ so you > don't need timer_interrupt() to do it again. > > We probably are better off breaking timer_interrupt in two: > > void __timer_interrupt(struct pt_regs * regs) > > Does the current stuff between irq_enter and irq_exit, timer_interrupt > does the remaining around it and calls __timer_interrupt. > > Then from timer_action, you call __timer_interrupt() We actually tried out this approach. The implementation was have a set_dec(0) in the timer_action(). This would ensure that we actually do get a timer interrupt. But the problem with either this approach or the one that you suggest,i.e. calling __timer_interrupt is in the following flow. do_IRQ() -> irq_exit() -> tick_irq_exit() -> tick_nohz_irq_exit() -> tick_nohz_stop_sched_tick() The problem lies in the function tick_nohz_stop_sched_tick(). This function checks for the next timer interrupt pending on this cpu, and programs the decrementer_next_event to the time of the next event, which is of course > now. As a result when in the timer_action() above, we do call __timer_interrupt() or try to trigger a timer interrupt through set_dec(0), the condition if(now >= *next_tb) in timer_interrupt() or __timer_interrupt() will fail, and will not call the timer interrupt event handler. ---> if (now >= *next_tb) { *next_tb = ~(u64)0; if (evt->event_handler) evt->event_handler(evt); } else { The broadcast IPI , is meant to make the target CPU of this IPI believe that it woke up from a timer interrupt, and not from an IPI. (The reason for this I will explain in the reply to the next patch). The target CPU should then ideally do what it would have done had it received a real timer interrupt, call the timer interrupt event handler. But due to the above code flow this does not happen. Hence as the next patch PATCH 3/6 shows, we simply call the event handler of a timer interrupt without this explicit now >= *next_tb check. This problem arises only in the implementation of this patchset, because a timer interrupt is pseudo triggered from an IPI. So the effects of the IPI handler will be felt on the timer interrupt handler triggered from this IPI, like above. Regards Preeti U Murthy -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 51bf017..d877b69 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -117,7 +117,7 @@ extern int cpu_to_core_id(int cpu); * * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up * in /proc/interrupts will be wrong!!! --Troy */ -#define PPC_MSG_UNUSED 0 +#define PPC_MSG_TIMER 0 #define PPC_MSG_RESCHEDULE 1 #define PPC_MSG_CALL_FUNC_SINGLE 2 #define PPC_MSG_DEBUGGER_BREAK 3 @@ -190,6 +190,7 @@ extern struct smp_ops_t *smp_ops; extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); +extern void arch_send_tick_broadcast(const struct cpumask *mask); /* Definitions relative to the secondary CPU spin loop * and entry point. Not all of them exist on both 32 and diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index bc41e9f..6a68ca4 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -35,6 +35,7 @@ #include <asm/ptrace.h> #include <linux/atomic.h> #include <asm/irq.h> +#include <asm/hw_irq.h> #include <asm/page.h> #include <asm/pgtable.h> #include <asm/prom.h> @@ -111,9 +112,9 @@ int smp_generic_kick_cpu(int nr) } #endif /* CONFIG_PPC64 */ -static irqreturn_t unused_action(int irq, void *data) +static irqreturn_t timer_action(int irq, void *data) { - /* This slot is unused and hence available for use, if needed */ + timer_interrupt(); return IRQ_HANDLED; } @@ -144,14 +145,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data) } static irq_handler_t smp_ipi_action[] = { - [PPC_MSG_UNUSED] = unused_action, /* Slot available for future use */ + [PPC_MSG_TIMER] = timer_action, [PPC_MSG_RESCHEDULE] = reschedule_action, [PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action, [PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action, }; const char *smp_ipi_name[] = { - [PPC_MSG_UNUSED] = "ipi unused", + [PPC_MSG_TIMER] = "ipi timer", [PPC_MSG_RESCHEDULE] = "ipi reschedule", [PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single", [PPC_MSG_DEBUGGER_BREAK] = "ipi debugger", @@ -221,6 +222,8 @@ irqreturn_t smp_ipi_demux(void) all = xchg(&info->messages, 0); #ifdef __BIG_ENDIAN + if (all & (1 << (24 - 8 * PPC_MSG_TIMER))) + timer_interrupt(); if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE))) scheduler_ipi(); if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE))) @@ -266,6 +269,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask) do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE); } +void arch_send_tick_broadcast(const struct cpumask *mask) +{ + unsigned int cpu; + + for_each_cpu(cpu, mask) + do_message_pass(cpu, PPC_MSG_TIMER); +} + #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC) void smp_send_debugger_break(void) { diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c index 28166e4..1359113 100644 --- a/arch/powerpc/platforms/cell/interrupt.c +++ b/arch/powerpc/platforms/cell/interrupt.c @@ -213,7 +213,7 @@ static void iic_request_ipi(int msg) void iic_request_IPIs(void) { - iic_request_ipi(PPC_MSG_UNUSED); + iic_request_ipi(PPC_MSG_TIMER); iic_request_ipi(PPC_MSG_RESCHEDULE); iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE); iic_request_ipi(PPC_MSG_DEBUGGER_BREAK); diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c index 488f069..5cb742a 100644 --- a/arch/powerpc/platforms/ps3/smp.c +++ b/arch/powerpc/platforms/ps3/smp.c @@ -74,7 +74,7 @@ static int __init ps3_smp_probe(void) * to index needs to be setup. */ - BUILD_BUG_ON(PPC_MSG_UNUSED != 0); + BUILD_BUG_ON(PPC_MSG_TIMER != 0); BUILD_BUG_ON(PPC_MSG_RESCHEDULE != 1); BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2); BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK != 3);