Message ID | 1396011962-4467-1-git-send-email-k.kozlowski@samsung.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On pi?, 2014-03-28 at 14:06 +0100, Krzysztof Kozlowski wrote: > Fix stall after hotplugging CPU1. Affected are SoCs where Multi Core Timer > interrupts are shared (SPI), e.g. Exynos 4210. The stall was a result of > starting the CPU1 local timer not in L1 timer but in L0 (which is used > by CPU0). Hi, Do you have any comments on these 3 patches? They fix the CPU stall on Exynos4210 and also on Exynos3250 (Chanwoo Choi sent patches for it recently). Best regards, Krzysztof > Trigger: > $ echo 0 > /sys/bus/cpu/devices/cpu1/online && echo 1 > /sys/bus/cpu/devices/cpu1/online > > Stall information: > [ 530.045259] INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 530.045618] 1: (6 GPs behind) idle=6d0/0/0 softirq=369/369 > [ 530.050987] (detected by 0, t=6589 jiffies, g=33, c=32, q=0) > [ 530.056721] Task dump for CPU 1: > [ 530.059928] swapper/1 R running 0 0 1 0x00001000 > [ 530.066377] [<c0524e14>] (__schedule+0x414/0x9b4) from [<c00b6610>] (rcu_idle_enter+0x18/0x38) > [ 530.074955] [<c00b6610>] (rcu_idle_enter+0x18/0x38) from [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) > [ 530.084069] [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) from [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) > [ 530.094029] [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) from [<40517244>] (0x40517244) > > The timers for CPU1 were missed: > [ 591.668436] cpu: 1 > [ 591.670430] clock 0: > [ 591.672691] .base: c0ab7750 > [ 591.676160] .index: 0 > [ 591.679025] .resolution: 1 nsecs > [ 591.682404] .get_time: ktime_get > [ 591.685970] .offset: 0 nsecs > [ 591.689349] active timers: > [ 591.692045] #0: <dfb51f40>, hrtimer_wakeup, S:01 > [ 591.696759] # expires at 454687834257-454687884257 nsecs [in -136770537232 to -136770487232 nsecs] > > And the event_handler for next event was wrong: > [ 591.917120] Tick Device: mode: 1 > [ 591.920676] Per CPU device: 0 > [ 591.923621] Clock Event Device: mct_tick0 > [ 591.927623] max_delta_ns: 178956969027 > [ 591.931613] min_delta_ns: 1249 > [ 591.934913] mult: 51539608 > [ 591.938557] shift: 32 > [ 591.941681] mode: 3 > [ 591.944724] next_event: 595025000000 nsecs > [ 591.949227] set_next_event: exynos4_tick_set_next_event > [ 591.954522] set_mode: exynos4_tick_set_mode > [ 591.959296] event_handler: hrtimer_interrupt > [ 591.963730] retries: 0 > [ 591.966761] > [ 591.968245] Tick Device: mode: 0 > [ 591.971801] Per CPU device: 1 > [ 591.974746] Clock Event Device: mct_tick1 > [ 591.978750] max_delta_ns: 178956969027 > [ 591.982739] min_delta_ns: 1249 > [ 591.986037] mult: 51539608 > [ 591.989681] shift: 32 > [ 591.992806] mode: 3 > [ 591.995848] next_event: 453685000000 nsecs > [ 592.000353] set_next_event: exynos4_tick_set_next_event > [ 592.005648] set_mode: exynos4_tick_set_mode > [ 592.010421] event_handler: tick_handle_periodic > [ 592.015115] retries: 0 > [ 592.018145] > > After turning off the CPU1, the MCT L1 local timer was disabled but the > interrupt was not cleared. Turning on the CPU1 enabled the IRQ > with setup_irq() but, before setting affinity to CPU1, the pending L1 timer > interrupt was processed by CPU0 in exynos4_mct_tick_isr(). > > The ISR then called event handler which set up the next timer event for > current CPU (CPU0). Therefore the MCT L1 timer wasn't actually started. > > Fix the stall by: > 1. Setting next timer event not on current CPU but on the CPU indicated > by cpumask in 'clock_event_device'. > 2. Clearing the timer interrupt upon stopping the local timer. > > The patch also moves around the call to exynos4_mct_tick_stop() but this > is done only for the code readability as it is not essential for the fix. > > Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> > Cc: <stable@vger.kernel.org> > --- > drivers/clocksource/exynos_mct.c | 33 ++++++++++++++++++++------------- > 1 file changed, 20 insertions(+), 13 deletions(-) > > diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c > index 48f76bc05da0..0b49b09dd1a9 100644 > --- a/drivers/clocksource/exynos_mct.c > +++ b/drivers/clocksource/exynos_mct.c > @@ -339,7 +339,14 @@ static void exynos4_mct_tick_start(unsigned long cycles, > static int exynos4_tick_set_next_event(unsigned long cycles, > struct clock_event_device *evt) > { > - struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); > + /* > + * In case of hotplugging non-boot CPU, the set_next_event could be > + * called on CPU0 by ISR before IRQ affinity is set to proper CPU. > + * Thus for accessing proper MCT Lx timer, 'per_cpu' for cpumask > + * in event must be used instead of 'this_cpu_ptr'. > + */ > + struct mct_clock_event_device *mevt = &per_cpu(percpu_mct_tick, > + cpumask_first(evt->cpumask)); > > exynos4_mct_tick_start(cycles, mevt); > > @@ -371,23 +378,13 @@ static inline void exynos4_tick_set_mode(enum clock_event_mode mode, > > static int exynos4_mct_tick_clear(struct mct_clock_event_device *mevt) > { > - struct clock_event_device *evt = &mevt->evt; > - > - /* > - * This is for supporting oneshot mode. > - * Mct would generate interrupt periodically > - * without explicit stopping. > - */ > - if (evt->mode != CLOCK_EVT_MODE_PERIODIC) > - exynos4_mct_tick_stop(mevt); > - > /* Clear the MCT tick interrupt */ > if (__raw_readl(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1) { > exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET); > return 1; > - } else { > - return 0; > } > + > + return 0; > } > > static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) > @@ -395,6 +392,13 @@ static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) > struct mct_clock_event_device *mevt = dev_id; > struct clock_event_device *evt = &mevt->evt; > > + /* > + * This is for supporting oneshot mode. > + * Mct would generate interrupt periodically > + * without explicit stopping. > + */ > + if (evt->mode != CLOCK_EVT_MODE_PERIODIC) > + exynos4_mct_tick_stop(mevt); > exynos4_mct_tick_clear(mevt); > > evt->event_handler(evt); > @@ -441,7 +445,10 @@ static int exynos4_local_timer_setup(struct clock_event_device *evt) > > static void exynos4_local_timer_stop(struct clock_event_device *evt) > { > + struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); > + > evt->set_mode(CLOCK_EVT_MODE_UNUSED, evt); > + exynos4_mct_tick_clear(mevt); > if (mct_int_type == MCT_INT_SPI) > free_irq(evt->irq, this_cpu_ptr(&percpu_mct_tick)); > else -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/15/2014 11:34 AM, Krzysztof Kozlowski wrote: > On pi?, 2014-03-28 at 14:06 +0100, Krzysztof Kozlowski wrote: >> Fix stall after hotplugging CPU1. Affected are SoCs where Multi Core Timer >> interrupts are shared (SPI), e.g. Exynos 4210. The stall was a result of >> starting the CPU1 local timer not in L1 timer but in L0 (which is used >> by CPU0). > > Hi, > > Do you have any comments on these 3 patches? They fix the CPU stall on > Exynos4210 and also on Exynos3250 (Chanwoo Choi sent patches for it > recently). You describe this issue as impacting different SoC not only the exynos, right ? Do you know what other SoCs are impacted by this ? I guess this issue is not reproducible just with the line below, we need a timer to expire right at the moment CPU1 is hotplugged, right ? >> Trigger: >> $ echo 0 > /sys/bus/cpu/devices/cpu1/online && echo 1 > /sys/bus/cpu/devices/cpu1/online >> >> Stall information: >> [ 530.045259] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 530.045618] 1: (6 GPs behind) idle=6d0/0/0 softirq=369/369 >> [ 530.050987] (detected by 0, t=6589 jiffies, g=33, c=32, q=0) >> [ 530.056721] Task dump for CPU 1: >> [ 530.059928] swapper/1 R running 0 0 1 0x00001000 >> [ 530.066377] [<c0524e14>] (__schedule+0x414/0x9b4) from [<c00b6610>] (rcu_idle_enter+0x18/0x38) >> [ 530.074955] [<c00b6610>] (rcu_idle_enter+0x18/0x38) from [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) >> [ 530.084069] [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) from [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) >> [ 530.094029] [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) from [<40517244>] (0x40517244) >> >> The timers for CPU1 were missed: >> [ 591.668436] cpu: 1 >> [ 591.670430] clock 0: >> [ 591.672691] .base: c0ab7750 >> [ 591.676160] .index: 0 >> [ 591.679025] .resolution: 1 nsecs >> [ 591.682404] .get_time: ktime_get >> [ 591.685970] .offset: 0 nsecs >> [ 591.689349] active timers: >> [ 591.692045] #0: <dfb51f40>, hrtimer_wakeup, S:01 >> [ 591.696759] # expires at 454687834257-454687884257 nsecs [in -136770537232 to -136770487232 nsecs] >> >> And the event_handler for next event was wrong: >> [ 591.917120] Tick Device: mode: 1 >> [ 591.920676] Per CPU device: 0 >> [ 591.923621] Clock Event Device: mct_tick0 >> [ 591.927623] max_delta_ns: 178956969027 >> [ 591.931613] min_delta_ns: 1249 >> [ 591.934913] mult: 51539608 >> [ 591.938557] shift: 32 >> [ 591.941681] mode: 3 >> [ 591.944724] next_event: 595025000000 nsecs >> [ 591.949227] set_next_event: exynos4_tick_set_next_event >> [ 591.954522] set_mode: exynos4_tick_set_mode >> [ 591.959296] event_handler: hrtimer_interrupt >> [ 591.963730] retries: 0 >> [ 591.966761] >> [ 591.968245] Tick Device: mode: 0 >> [ 591.971801] Per CPU device: 1 >> [ 591.974746] Clock Event Device: mct_tick1 >> [ 591.978750] max_delta_ns: 178956969027 >> [ 591.982739] min_delta_ns: 1249 >> [ 591.986037] mult: 51539608 >> [ 591.989681] shift: 32 >> [ 591.992806] mode: 3 >> [ 591.995848] next_event: 453685000000 nsecs >> [ 592.000353] set_next_event: exynos4_tick_set_next_event >> [ 592.005648] set_mode: exynos4_tick_set_mode >> [ 592.010421] event_handler: tick_handle_periodic >> [ 592.015115] retries: 0 >> [ 592.018145] >> >> After turning off the CPU1, the MCT L1 local timer was disabled but the >> interrupt was not cleared. Turning on the CPU1 enabled the IRQ >> with setup_irq() but, before setting affinity to CPU1, the pending L1 timer >> interrupt was processed by CPU0 in exynos4_mct_tick_isr(). >> >> The ISR then called event handler which set up the next timer event for >> current CPU (CPU0). Therefore the MCT L1 timer wasn't actually started. >> >> Fix the stall by: >> 1. Setting next timer event not on current CPU but on the CPU indicated >> by cpumask in 'clock_event_device'. >> 2. Clearing the timer interrupt upon stopping the local timer. >> >> The patch also moves around the call to exynos4_mct_tick_stop() but this >> is done only for the code readability as it is not essential for the fix. >> >> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> >> Cc: <stable@vger.kernel.org> >> --- >> drivers/clocksource/exynos_mct.c | 33 ++++++++++++++++++++------------- >> 1 file changed, 20 insertions(+), 13 deletions(-) >> >> diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c >> index 48f76bc05da0..0b49b09dd1a9 100644 >> --- a/drivers/clocksource/exynos_mct.c >> +++ b/drivers/clocksource/exynos_mct.c >> @@ -339,7 +339,14 @@ static void exynos4_mct_tick_start(unsigned long cycles, >> static int exynos4_tick_set_next_event(unsigned long cycles, >> struct clock_event_device *evt) >> { >> - struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); >> + /* >> + * In case of hotplugging non-boot CPU, the set_next_event could be >> + * called on CPU0 by ISR before IRQ affinity is set to proper CPU. >> + * Thus for accessing proper MCT Lx timer, 'per_cpu' for cpumask >> + * in event must be used instead of 'this_cpu_ptr'. >> + */ >> + struct mct_clock_event_device *mevt = &per_cpu(percpu_mct_tick, >> + cpumask_first(evt->cpumask)); >> >> exynos4_mct_tick_start(cycles, mevt); >> >> @@ -371,23 +378,13 @@ static inline void exynos4_tick_set_mode(enum clock_event_mode mode, >> >> static int exynos4_mct_tick_clear(struct mct_clock_event_device *mevt) >> { >> - struct clock_event_device *evt = &mevt->evt; >> - >> - /* >> - * This is for supporting oneshot mode. >> - * Mct would generate interrupt periodically >> - * without explicit stopping. >> - */ >> - if (evt->mode != CLOCK_EVT_MODE_PERIODIC) >> - exynos4_mct_tick_stop(mevt); >> - >> /* Clear the MCT tick interrupt */ >> if (__raw_readl(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1) { >> exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET); >> return 1; >> - } else { >> - return 0; >> } >> + >> + return 0; >> } >> >> static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) >> @@ -395,6 +392,13 @@ static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) >> struct mct_clock_event_device *mevt = dev_id; >> struct clock_event_device *evt = &mevt->evt; >> >> + /* >> + * This is for supporting oneshot mode. >> + * Mct would generate interrupt periodically >> + * without explicit stopping. >> + */ >> + if (evt->mode != CLOCK_EVT_MODE_PERIODIC) >> + exynos4_mct_tick_stop(mevt); >> exynos4_mct_tick_clear(mevt); >> >> evt->event_handler(evt); >> @@ -441,7 +445,10 @@ static int exynos4_local_timer_setup(struct clock_event_device *evt) >> >> static void exynos4_local_timer_stop(struct clock_event_device *evt) >> { >> + struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); >> + >> evt->set_mode(CLOCK_EVT_MODE_UNUSED, evt); >> + exynos4_mct_tick_clear(mevt); >> if (mct_int_type == MCT_INT_SPI) >> free_irq(evt->irq, this_cpu_ptr(&percpu_mct_tick)); >> else >
On wto, 2014-04-15 at 14:28 +0200, Daniel Lezcano wrote: > On 04/15/2014 11:34 AM, Krzysztof Kozlowski wrote: > > On pi?, 2014-03-28 at 14:06 +0100, Krzysztof Kozlowski wrote: > >> Fix stall after hotplugging CPU1. Affected are SoCs where Multi Core Timer > >> interrupts are shared (SPI), e.g. Exynos 4210. The stall was a result of > >> starting the CPU1 local timer not in L1 timer but in L0 (which is used > >> by CPU0). > > > > Hi, > > > > Do you have any comments on these 3 patches? They fix the CPU stall on > > Exynos4210 and also on Exynos3250 (Chanwoo Choi sent patches for it > > recently). > > You describe this issue as impacting different SoC not only the exynos, > right ? > > Do you know what other SoCs are impacted by this ? No, affected are only Exynos SoC-s. It was confirmed on Exynos4210 (Trats board) and Exynos3250 (new SoC, patches for it were recently posted by Chanwoo). Other Exynos SoC-s where MCT local timers use shared interrupts (SPI) can also be affected. Candidates are Exynos 5250 and 5420 but I haven't tested them. > I guess this issue is not reproducible just with the line below, we need > a timer to expire right at the moment CPU1 is hotplugged, right ? Right. The timer must fire in short time between enabling local timer for CPU1 and setting the affinity for IRQ. In my case on ~1 GHz 2 core CPU (during idle state, system booted with init=/bin/sh) it was easily triggered with: for i in `seq 200`; do echo 0 > /sys/bus/cpu/devices/cpu1/online echo 1 > /sys/bus/cpu/devices/cpu1/online sleep 1 done The stall happened typically after 10-30 seconds of such test. Best regards, Krzysztof > > >> Trigger: > >> $ echo 0 > /sys/bus/cpu/devices/cpu1/online && echo 1 > /sys/bus/cpu/devices/cpu1/online > >> > >> Stall information: > >> [ 530.045259] INFO: rcu_preempt detected stalls on CPUs/tasks: > >> [ 530.045618] 1: (6 GPs behind) idle=6d0/0/0 softirq=369/369 > >> [ 530.050987] (detected by 0, t=6589 jiffies, g=33, c=32, q=0) > >> [ 530.056721] Task dump for CPU 1: > >> [ 530.059928] swapper/1 R running 0 0 1 0x00001000 > >> [ 530.066377] [<c0524e14>] (__schedule+0x414/0x9b4) from [<c00b6610>] (rcu_idle_enter+0x18/0x38) > >> [ 530.074955] [<c00b6610>] (rcu_idle_enter+0x18/0x38) from [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) > >> [ 530.084069] [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) from [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) > >> [ 530.094029] [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) from [<40517244>] (0x40517244) > >> > >> The timers for CPU1 were missed: > >> [ 591.668436] cpu: 1 > >> [ 591.670430] clock 0: > >> [ 591.672691] .base: c0ab7750 > >> [ 591.676160] .index: 0 > >> [ 591.679025] .resolution: 1 nsecs > >> [ 591.682404] .get_time: ktime_get > >> [ 591.685970] .offset: 0 nsecs > >> [ 591.689349] active timers: > >> [ 591.692045] #0: <dfb51f40>, hrtimer_wakeup, S:01 > >> [ 591.696759] # expires at 454687834257-454687884257 nsecs [in -136770537232 to -136770487232 nsecs] > >> > >> And the event_handler for next event was wrong: > >> [ 591.917120] Tick Device: mode: 1 > >> [ 591.920676] Per CPU device: 0 > >> [ 591.923621] Clock Event Device: mct_tick0 > >> [ 591.927623] max_delta_ns: 178956969027 > >> [ 591.931613] min_delta_ns: 1249 > >> [ 591.934913] mult: 51539608 > >> [ 591.938557] shift: 32 > >> [ 591.941681] mode: 3 > >> [ 591.944724] next_event: 595025000000 nsecs > >> [ 591.949227] set_next_event: exynos4_tick_set_next_event > >> [ 591.954522] set_mode: exynos4_tick_set_mode > >> [ 591.959296] event_handler: hrtimer_interrupt > >> [ 591.963730] retries: 0 > >> [ 591.966761] > >> [ 591.968245] Tick Device: mode: 0 > >> [ 591.971801] Per CPU device: 1 > >> [ 591.974746] Clock Event Device: mct_tick1 > >> [ 591.978750] max_delta_ns: 178956969027 > >> [ 591.982739] min_delta_ns: 1249 > >> [ 591.986037] mult: 51539608 > >> [ 591.989681] shift: 32 > >> [ 591.992806] mode: 3 > >> [ 591.995848] next_event: 453685000000 nsecs > >> [ 592.000353] set_next_event: exynos4_tick_set_next_event > >> [ 592.005648] set_mode: exynos4_tick_set_mode > >> [ 592.010421] event_handler: tick_handle_periodic > >> [ 592.015115] retries: 0 > >> [ 592.018145] > >> > >> After turning off the CPU1, the MCT L1 local timer was disabled but the > >> interrupt was not cleared. Turning on the CPU1 enabled the IRQ > >> with setup_irq() but, before setting affinity to CPU1, the pending L1 timer > >> interrupt was processed by CPU0 in exynos4_mct_tick_isr(). > >> > >> The ISR then called event handler which set up the next timer event for > >> current CPU (CPU0). Therefore the MCT L1 timer wasn't actually started. > >> > >> Fix the stall by: > >> 1. Setting next timer event not on current CPU but on the CPU indicated > >> by cpumask in 'clock_event_device'. > >> 2. Clearing the timer interrupt upon stopping the local timer. > >> > >> The patch also moves around the call to exynos4_mct_tick_stop() but this > >> is done only for the code readability as it is not essential for the fix. > >> > >> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> > >> Cc: <stable@vger.kernel.org> > >> --- > >> drivers/clocksource/exynos_mct.c | 33 ++++++++++++++++++++------------- > >> 1 file changed, 20 insertions(+), 13 deletions(-) > >> > >> diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c > >> index 48f76bc05da0..0b49b09dd1a9 100644 > >> --- a/drivers/clocksource/exynos_mct.c > >> +++ b/drivers/clocksource/exynos_mct.c > >> @@ -339,7 +339,14 @@ static void exynos4_mct_tick_start(unsigned long cycles, > >> static int exynos4_tick_set_next_event(unsigned long cycles, > >> struct clock_event_device *evt) > >> { > >> - struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); > >> + /* > >> + * In case of hotplugging non-boot CPU, the set_next_event could be > >> + * called on CPU0 by ISR before IRQ affinity is set to proper CPU. > >> + * Thus for accessing proper MCT Lx timer, 'per_cpu' for cpumask > >> + * in event must be used instead of 'this_cpu_ptr'. > >> + */ > >> + struct mct_clock_event_device *mevt = &per_cpu(percpu_mct_tick, > >> + cpumask_first(evt->cpumask)); > >> > >> exynos4_mct_tick_start(cycles, mevt); > >> > >> @@ -371,23 +378,13 @@ static inline void exynos4_tick_set_mode(enum clock_event_mode mode, > >> > >> static int exynos4_mct_tick_clear(struct mct_clock_event_device *mevt) > >> { > >> - struct clock_event_device *evt = &mevt->evt; > >> - > >> - /* > >> - * This is for supporting oneshot mode. > >> - * Mct would generate interrupt periodically > >> - * without explicit stopping. > >> - */ > >> - if (evt->mode != CLOCK_EVT_MODE_PERIODIC) > >> - exynos4_mct_tick_stop(mevt); > >> - > >> /* Clear the MCT tick interrupt */ > >> if (__raw_readl(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1) { > >> exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET); > >> return 1; > >> - } else { > >> - return 0; > >> } > >> + > >> + return 0; > >> } > >> > >> static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) > >> @@ -395,6 +392,13 @@ static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) > >> struct mct_clock_event_device *mevt = dev_id; > >> struct clock_event_device *evt = &mevt->evt; > >> > >> + /* > >> + * This is for supporting oneshot mode. > >> + * Mct would generate interrupt periodically > >> + * without explicit stopping. > >> + */ > >> + if (evt->mode != CLOCK_EVT_MODE_PERIODIC) > >> + exynos4_mct_tick_stop(mevt); > >> exynos4_mct_tick_clear(mevt); > >> > >> evt->event_handler(evt); > >> @@ -441,7 +445,10 @@ static int exynos4_local_timer_setup(struct clock_event_device *evt) > >> > >> static void exynos4_local_timer_stop(struct clock_event_device *evt) > >> { > >> + struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); > >> + > >> evt->set_mode(CLOCK_EVT_MODE_UNUSED, evt); > >> + exynos4_mct_tick_clear(mevt); > >> if (mct_int_type == MCT_INT_SPI) > >> free_irq(evt->irq, this_cpu_ptr(&percpu_mct_tick)); > >> else > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c index 48f76bc05da0..0b49b09dd1a9 100644 --- a/drivers/clocksource/exynos_mct.c +++ b/drivers/clocksource/exynos_mct.c @@ -339,7 +339,14 @@ static void exynos4_mct_tick_start(unsigned long cycles, static int exynos4_tick_set_next_event(unsigned long cycles, struct clock_event_device *evt) { - struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); + /* + * In case of hotplugging non-boot CPU, the set_next_event could be + * called on CPU0 by ISR before IRQ affinity is set to proper CPU. + * Thus for accessing proper MCT Lx timer, 'per_cpu' for cpumask + * in event must be used instead of 'this_cpu_ptr'. + */ + struct mct_clock_event_device *mevt = &per_cpu(percpu_mct_tick, + cpumask_first(evt->cpumask)); exynos4_mct_tick_start(cycles, mevt); @@ -371,23 +378,13 @@ static inline void exynos4_tick_set_mode(enum clock_event_mode mode, static int exynos4_mct_tick_clear(struct mct_clock_event_device *mevt) { - struct clock_event_device *evt = &mevt->evt; - - /* - * This is for supporting oneshot mode. - * Mct would generate interrupt periodically - * without explicit stopping. - */ - if (evt->mode != CLOCK_EVT_MODE_PERIODIC) - exynos4_mct_tick_stop(mevt); - /* Clear the MCT tick interrupt */ if (__raw_readl(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1) { exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET); return 1; - } else { - return 0; } + + return 0; } static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) @@ -395,6 +392,13 @@ static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) struct mct_clock_event_device *mevt = dev_id; struct clock_event_device *evt = &mevt->evt; + /* + * This is for supporting oneshot mode. + * Mct would generate interrupt periodically + * without explicit stopping. + */ + if (evt->mode != CLOCK_EVT_MODE_PERIODIC) + exynos4_mct_tick_stop(mevt); exynos4_mct_tick_clear(mevt); evt->event_handler(evt); @@ -441,7 +445,10 @@ static int exynos4_local_timer_setup(struct clock_event_device *evt) static void exynos4_local_timer_stop(struct clock_event_device *evt) { + struct mct_clock_event_device *mevt = this_cpu_ptr(&percpu_mct_tick); + evt->set_mode(CLOCK_EVT_MODE_UNUSED, evt); + exynos4_mct_tick_clear(mevt); if (mct_int_type == MCT_INT_SPI) free_irq(evt->irq, this_cpu_ptr(&percpu_mct_tick)); else
Fix stall after hotplugging CPU1. Affected are SoCs where Multi Core Timer interrupts are shared (SPI), e.g. Exynos 4210. The stall was a result of starting the CPU1 local timer not in L1 timer but in L0 (which is used by CPU0). Trigger: $ echo 0 > /sys/bus/cpu/devices/cpu1/online && echo 1 > /sys/bus/cpu/devices/cpu1/online Stall information: [ 530.045259] INFO: rcu_preempt detected stalls on CPUs/tasks: [ 530.045618] 1: (6 GPs behind) idle=6d0/0/0 softirq=369/369 [ 530.050987] (detected by 0, t=6589 jiffies, g=33, c=32, q=0) [ 530.056721] Task dump for CPU 1: [ 530.059928] swapper/1 R running 0 0 1 0x00001000 [ 530.066377] [<c0524e14>] (__schedule+0x414/0x9b4) from [<c00b6610>] (rcu_idle_enter+0x18/0x38) [ 530.074955] [<c00b6610>] (rcu_idle_enter+0x18/0x38) from [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) [ 530.084069] [<c0079a18>] (cpu_startup_entry+0x60/0x3bc) from [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) [ 530.094029] [<c0517d34>] (secondary_start_kernel+0x164/0x1a0) from [<40517244>] (0x40517244) The timers for CPU1 were missed: [ 591.668436] cpu: 1 [ 591.670430] clock 0: [ 591.672691] .base: c0ab7750 [ 591.676160] .index: 0 [ 591.679025] .resolution: 1 nsecs [ 591.682404] .get_time: ktime_get [ 591.685970] .offset: 0 nsecs [ 591.689349] active timers: [ 591.692045] #0: <dfb51f40>, hrtimer_wakeup, S:01 [ 591.696759] # expires at 454687834257-454687884257 nsecs [in -136770537232 to -136770487232 nsecs] And the event_handler for next event was wrong: [ 591.917120] Tick Device: mode: 1 [ 591.920676] Per CPU device: 0 [ 591.923621] Clock Event Device: mct_tick0 [ 591.927623] max_delta_ns: 178956969027 [ 591.931613] min_delta_ns: 1249 [ 591.934913] mult: 51539608 [ 591.938557] shift: 32 [ 591.941681] mode: 3 [ 591.944724] next_event: 595025000000 nsecs [ 591.949227] set_next_event: exynos4_tick_set_next_event [ 591.954522] set_mode: exynos4_tick_set_mode [ 591.959296] event_handler: hrtimer_interrupt [ 591.963730] retries: 0 [ 591.966761] [ 591.968245] Tick Device: mode: 0 [ 591.971801] Per CPU device: 1 [ 591.974746] Clock Event Device: mct_tick1 [ 591.978750] max_delta_ns: 178956969027 [ 591.982739] min_delta_ns: 1249 [ 591.986037] mult: 51539608 [ 591.989681] shift: 32 [ 591.992806] mode: 3 [ 591.995848] next_event: 453685000000 nsecs [ 592.000353] set_next_event: exynos4_tick_set_next_event [ 592.005648] set_mode: exynos4_tick_set_mode [ 592.010421] event_handler: tick_handle_periodic [ 592.015115] retries: 0 [ 592.018145] After turning off the CPU1, the MCT L1 local timer was disabled but the interrupt was not cleared. Turning on the CPU1 enabled the IRQ with setup_irq() but, before setting affinity to CPU1, the pending L1 timer interrupt was processed by CPU0 in exynos4_mct_tick_isr(). The ISR then called event handler which set up the next timer event for current CPU (CPU0). Therefore the MCT L1 timer wasn't actually started. Fix the stall by: 1. Setting next timer event not on current CPU but on the CPU indicated by cpumask in 'clock_event_device'. 2. Clearing the timer interrupt upon stopping the local timer. The patch also moves around the call to exynos4_mct_tick_stop() but this is done only for the code readability as it is not essential for the fix. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Cc: <stable@vger.kernel.org> --- drivers/clocksource/exynos_mct.c | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-)