Message ID | 1508331506-23782-5-git-send-email-benjamin.gaignard@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, 18 Oct 2017, Benjamin Gaignard wrote: > Rework driver code to be able to implement clocksource and clockevent > on the same hardware block. > Before this patch only the counter of the hardware block was used to > generate clock events. Now counter will be used to provide a 32 bits > clock source and a comparator will provide clock events. Again. Read, understand and comply with the patch submission documentation. Proper changelogs are not optional. "Before this patch ...." is bogus because it suggests that the patch is already applied which is obviously not the case. Let me give you an example. The stm32 timer hardware is currently only used as a clock event device, but it can be utilized as a clocksource as well. Implement this by enabling the free running counter in the hardware block and converting the clock event part from a count down event timer to a comparator based timer. Can you see the difference? > -static int stm32_clock_event_set_periodic(struct clock_event_device *evt) > +static int stm32_clock_event_set_next_event(unsigned long evt, > + struct clock_event_device *clkevt) > { > - struct timer_of *to = to_timer_of(evt); > + struct timer_of *to = to_timer_of(clkevt); > + unsigned long cnt; > > - writel_relaxed(timer_of_period(to), timer_of_base(to) + TIM_ARR); > - writel_relaxed(TIM_CR1_ARPE | TIM_CR1_CEN, timer_of_base(to) + TIM_CR1); > + cnt = readl_relaxed(timer_of_base(to) + TIM_CNT); > + writel_relaxed(cnt + evt, timer_of_base(to) + TIM_CCR1); > + writel_relaxed(TIM_DIER_CC1IE, timer_of_base(to) + TIM_DIER); This implementation is doomed. You cannot rely on the assumption that the read/modify/write sequence is 'atomic'. Bus/pipeline delays, FIQs, hypervisor exits and whatever can delay it enough so that the write comes too late which means that you have to wait for a full wraparound of the counter for the next interrupt. See the big fat comment in hpet_next_event() for gory details of issues caused by comparator based timers. Your change of min delay in one of the previous patches is papering over this problem and I really wonder if your argumentation of 'required because the CPU can't keep up otherwise' is just wrong and you failed to decode the RMW issue proper. Though fact is, that your implementation CANNOT cover all potential RMW disturbances safely. You need a proper safety net for these cases. To be honest. I prefer having a slow, inaccurate down counting timer over a fast comparator based one any time as long as the comparator is not cleverly implemented and can do less than equal comparisons which take the wraparound of the counter into account. It's not rocket science to do that, it just takes a few more gates, but hardware people can't be bothered to think about the consequences of their cheap implementations ever. Thanks, tglx
2017-10-18 20:59 GMT+02:00 Thomas Gleixner <tglx@linutronix.de>: > On Wed, 18 Oct 2017, Benjamin Gaignard wrote: > >> Rework driver code to be able to implement clocksource and clockevent >> on the same hardware block. >> Before this patch only the counter of the hardware block was used to >> generate clock events. Now counter will be used to provide a 32 bits >> clock source and a comparator will provide clock events. > > Again. Read, understand and comply with the patch submission > documentation. Proper changelogs are not optional. > > "Before this patch ...." is bogus because it suggests that the patch is > already applied which is obviously not the case. > > Let me give you an example. > > The stm32 timer hardware is currently only used as a clock event device, > but it can be utilized as a clocksource as well. > > Implement this by enabling the free running counter in the hardware block > and converting the clock event part from a count down event timer to a > comparator based timer. > > Can you see the difference? I will rework the commit message > >> -static int stm32_clock_event_set_periodic(struct clock_event_device *evt) >> +static int stm32_clock_event_set_next_event(unsigned long evt, >> + struct clock_event_device *clkevt) >> { >> - struct timer_of *to = to_timer_of(evt); >> + struct timer_of *to = to_timer_of(clkevt); >> + unsigned long cnt; >> >> - writel_relaxed(timer_of_period(to), timer_of_base(to) + TIM_ARR); >> - writel_relaxed(TIM_CR1_ARPE | TIM_CR1_CEN, timer_of_base(to) + TIM_CR1); >> + cnt = readl_relaxed(timer_of_base(to) + TIM_CNT); >> + writel_relaxed(cnt + evt, timer_of_base(to) + TIM_CCR1); >> + writel_relaxed(TIM_DIER_CC1IE, timer_of_base(to) + TIM_DIER); > > This implementation is doomed. You cannot rely on the assumption that the > read/modify/write sequence is 'atomic'. > > Bus/pipeline delays, FIQs, hypervisor exits and whatever can delay it > enough so that the write comes too late which means that you have to wait > for a full wraparound of the counter for the next interrupt. > > See the big fat comment in hpet_next_event() for gory details of issues > caused by comparator based timers. Other drivers like prima2 have the same problem. They have solve it by checking if the current time is still below next event time after wirting it, so the function will be like that: unsigned long now, next; next = readl_relaxed(timer_of_base(to) + TIM_CNT) + evt; writel_relaxed(next, timer_of_base(to) + TIM_CCR1); writel_relaxed(TIM_DIER_CC1IE, timer_of_base(to) + TIM_DIER); now = readl_relaxed(timer_of_base(to) + TIM_CNT); return next - now > delta ? -ETIME : 0; > > Your change of min delay in one of the previous patches is papering over > this problem and I really wonder if your argumentation of 'required because > the CPU can't keep up otherwise' is just wrong and you failed to decode the > RMW issue proper. The CPU is a CortexM4 @ 200MHZ and the clocks driving the timers are at 90MHZ with a min delta at 1 you could have an interrupt each 0.01 ms which is really to much. By increase it to 0x60 it give time to CPU to handle the interrupt. Also want to remove 16 bits counters because the maximum period is around 750 ms which is a short period for a clocksource. With 32 bits this period is close 47 secondes. > > Though fact is, that your implementation CANNOT cover all potential RMW > disturbances safely. You need a proper safety net for these cases. > > To be honest. I prefer having a slow, inaccurate down counting timer over a > fast comparator based one any time as long as the comparator is not > cleverly implemented and can do less than equal comparisons which take the > wraparound of the counter into account. It's not rocket science to do that, > it just takes a few more gates, but hardware people can't be bothered to > think about the consequences of their cheap implementations ever. I will forward you point of to the hardware designer but I will have to deal the hardware I have anyway. Benjamin > > Thanks, > > tglx > >
On Thu, 19 Oct 2017, Benjamin Gaignard wrote: > 2017-10-18 20:59 GMT+02:00 Thomas Gleixner <tglx@linutronix.de>: > >> -static int stm32_clock_event_set_periodic(struct clock_event_device *evt) > >> +static int stm32_clock_event_set_next_event(unsigned long evt, > >> + struct clock_event_device *clkevt) > >> { > >> - struct timer_of *to = to_timer_of(evt); > >> + struct timer_of *to = to_timer_of(clkevt); > >> + unsigned long cnt; > >> > >> - writel_relaxed(timer_of_period(to), timer_of_base(to) + TIM_ARR); > >> - writel_relaxed(TIM_CR1_ARPE | TIM_CR1_CEN, timer_of_base(to) + TIM_CR1); > >> + cnt = readl_relaxed(timer_of_base(to) + TIM_CNT); > >> + writel_relaxed(cnt + evt, timer_of_base(to) + TIM_CCR1); > >> + writel_relaxed(TIM_DIER_CC1IE, timer_of_base(to) + TIM_DIER); > > > > This implementation is doomed. You cannot rely on the assumption that the > > read/modify/write sequence is 'atomic'. > > > > Bus/pipeline delays, FIQs, hypervisor exits and whatever can delay it > > enough so that the write comes too late which means that you have to wait > > for a full wraparound of the counter for the next interrupt. > > > > See the big fat comment in hpet_next_event() for gory details of issues > > caused by comparator based timers. > > Other drivers like prima2 have the same problem. That does not make it any better. > > Your change of min delay in one of the previous patches is papering over > > this problem and I really wonder if your argumentation of 'required because > > the CPU can't keep up otherwise' is just wrong and you failed to decode the > > RMW issue proper. > > The CPU is a CortexM4 @ 200MHZ and the clocks driving the timers are at 90MHZ > with a min delta at 1 you could have an interrupt each 0.01 ms which > is really to much. > By increase it to 0x60 it give time to CPU to handle the interrupt. Fair enough, but exactly this information wants to be in the changelog. And still, if the hardware only supports 16 bit you still can use the clock events part and not initialize the clocksource. > Also want to remove 16 bits counters because the maximum period is around > 750 ms which is a short period for a clocksource. With 32 bits this > period is close 47 secondes. Again. The changelog is missing this information. You cannot expect reviewers to crystal ball your reasonings. > > To be honest. I prefer having a slow, inaccurate down counting timer over a > > fast comparator based one any time as long as the comparator is not > > cleverly implemented and can do less than equal comparisons which take the > > wraparound of the counter into account. It's not rocket science to do that, > > it just takes a few more gates, but hardware people can't be bothered to > > think about the consequences of their cheap implementations ever. > > I will forward you point of to the hardware designer but I will have to deal the > hardware I have anyway. I know that it's to late. Just wanted to mention it as a general note. Thanks, tglx
diff --git a/drivers/clocksource/timer-stm32.c b/drivers/clocksource/timer-stm32.c index c834648..461b3ba 100644 --- a/drivers/clocksource/timer-stm32.c +++ b/drivers/clocksource/timer-stm32.c @@ -16,6 +16,8 @@ #include <linux/of_irq.h> #include <linux/clk.h> #include <linux/reset.h> +#include <linux/sched_clock.h> +#include <linux/slab.h> #include "timer-of.h" @@ -23,16 +25,16 @@ #define TIM_DIER 0x0c #define TIM_SR 0x10 #define TIM_EGR 0x14 +#define TIM_CNT 0x24 #define TIM_PSC 0x28 #define TIM_ARR 0x2c +#define TIM_CCR1 0x34 #define TIM_CR1_CEN BIT(0) -#define TIM_CR1_OPM BIT(3) +#define TIM_CR1_UDIS BIT(1) #define TIM_CR1_ARPE BIT(7) -#define TIM_DIER_UIE BIT(0) - -#define TIM_SR_UIF BIT(0) +#define TIM_DIER_CC1IE BIT(1) #define TIM_EGR_UG BIT(0) @@ -40,30 +42,34 @@ static int stm32_clock_event_shutdown(struct clock_event_device *evt) { struct timer_of *to = to_timer_of(evt); - writel_relaxed(0, timer_of_base(to) + TIM_CR1); + writel_relaxed(0, timer_of_base(to) + TIM_DIER); + return 0; } -static int stm32_clock_event_set_periodic(struct clock_event_device *evt) +static int stm32_clock_event_set_next_event(unsigned long evt, + struct clock_event_device *clkevt) { - struct timer_of *to = to_timer_of(evt); + struct timer_of *to = to_timer_of(clkevt); + unsigned long cnt; - writel_relaxed(timer_of_period(to), timer_of_base(to) + TIM_ARR); - writel_relaxed(TIM_CR1_ARPE | TIM_CR1_CEN, timer_of_base(to) + TIM_CR1); + cnt = readl_relaxed(timer_of_base(to) + TIM_CNT); + writel_relaxed(cnt + evt, timer_of_base(to) + TIM_CCR1); + writel_relaxed(TIM_DIER_CC1IE, timer_of_base(to) + TIM_DIER); return 0; } -static int stm32_clock_event_set_next_event(unsigned long evt, - struct clock_event_device *clkevt) +static int stm32_clock_event_set_periodic(struct clock_event_device *evt) { - struct timer_of *to = to_timer_of(clkevt); + struct timer_of *to = to_timer_of(evt); - writel_relaxed(evt, timer_of_base(to) + TIM_ARR); - writel_relaxed(TIM_CR1_ARPE | TIM_CR1_OPM | TIM_CR1_CEN, - timer_of_base(to) + TIM_CR1); + return stm32_clock_event_set_next_event(timer_of_period(to), evt); +} - return 0; +static int stm32_clock_event_set_oneshot(struct clock_event_device *evt) +{ + return stm32_clock_event_set_next_event(0, evt); } static irqreturn_t stm32_clock_event_handler(int irq, void *dev_id) @@ -73,12 +79,57 @@ static irqreturn_t stm32_clock_event_handler(int irq, void *dev_id) writel_relaxed(0, timer_of_base(to) + TIM_SR); + if (clockevent_state_periodic(evt)) + stm32_clock_event_set_periodic(evt); + else + stm32_clock_event_shutdown(evt); + evt->event_handler(evt); return IRQ_HANDLED; } -static int __init stm32_clockevent_init(struct device_node *node) +static void __init stm32_clockevent_init(struct timer_of *to) +{ + writel_relaxed(0, timer_of_base(to) + TIM_DIER); + writel_relaxed(0, timer_of_base(to) + TIM_SR); + + clockevents_config_and_register(&to->clkevt, + timer_of_rate(to), 0x60, ~0U); +} + +static void __iomem *stm32_timer_cnt __read_mostly; +static u64 notrace stm32_read_sched_clock(void) +{ + return readl_relaxed(stm32_timer_cnt); +} + +static int __init stm32_clocksource_init(struct timer_of *to) +{ + writel_relaxed(~0U, timer_of_base(to) + TIM_ARR); + writel_relaxed(0, timer_of_base(to) + TIM_PSC); + writel_relaxed(0, timer_of_base(to) + TIM_SR); + writel_relaxed(0, timer_of_base(to) + TIM_DIER); + writel_relaxed(0, timer_of_base(to) + TIM_SR); + writel_relaxed(TIM_CR1_ARPE | TIM_CR1_UDIS, + timer_of_base(to) + TIM_CR1); + + /* Make sure that registers are updated */ + writel_relaxed(TIM_EGR_UG, timer_of_base(to) + TIM_EGR); + + /* Enable controller */ + writel_relaxed(TIM_CR1_ARPE | TIM_CR1_UDIS | TIM_CR1_CEN, + timer_of_base(to) + TIM_CR1); + + stm32_timer_cnt = timer_of_base(to) + TIM_CNT; + sched_clock_register(stm32_read_sched_clock, 32, timer_of_rate(to)); + + return clocksource_mmio_init(stm32_timer_cnt, "stm32_timer", + timer_of_rate(to), 250, 32, + clocksource_mmio_readl_up); +} + +static int __init stm32_timer_init(struct device_node *node) { struct reset_control *rstc; unsigned long max_arr; @@ -90,12 +141,13 @@ static int __init stm32_clockevent_init(struct device_node *node) return -ENOMEM; to->flags = TIMER_OF_IRQ | TIMER_OF_CLOCK | TIMER_OF_BASE; + to->clkevt.name = "stm32_clockevent"; to->clkevt.rating = 200; - to->clkevt.features = CLOCK_EVT_FEAT_PERIODIC; + to->clkevt.features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC; to->clkevt.set_state_shutdown = stm32_clock_event_shutdown; to->clkevt.set_state_periodic = stm32_clock_event_set_periodic; - to->clkevt.set_state_oneshot = stm32_clock_event_shutdown; + to->clkevt.set_state_oneshot = stm32_clock_event_set_oneshot; to->clkevt.tick_resume = stm32_clock_event_shutdown; to->clkevt.set_next_event = stm32_clock_event_set_next_event; @@ -120,15 +172,11 @@ static int __init stm32_clockevent_init(struct device_node *node) goto deinit; } - writel_relaxed(0, timer_of_base(to) + TIM_ARR); - - writel_relaxed(0, timer_of_base(to) + TIM_PSC); - writel_relaxed(TIM_EGR_UG, timer_of_base(to) + TIM_EGR); - writel_relaxed(TIM_DIER_UIE, timer_of_base(to) + TIM_DIER); - writel_relaxed(0, timer_of_base(to) + TIM_SR); + ret = stm32_clocksource_init(to); + if (ret) + goto deinit; - clockevents_config_and_register(&to->clkevt, - timer_of_period(to), 0x60, ~0U); + stm32_clockevent_init(to); return 0; @@ -139,4 +187,4 @@ static int __init stm32_clockevent_init(struct device_node *node) return ret; } -TIMER_OF_DECLARE(stm32, "st,stm32-timer", stm32_clockevent_init); +TIMER_OF_DECLARE(stm32, "st,stm32-timer", stm32_timer_init);
Rework driver code to be able to implement clocksource and clockevent on the same hardware block. Before this patch only the counter of the hardware block was used to generate clock events. Now counter will be used to provide a 32 bits clock source and a comparator will provide clock events. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> --- drivers/clocksource/timer-stm32.c | 104 ++++++++++++++++++++++++++++---------- 1 file changed, 76 insertions(+), 28 deletions(-)