Message ID: 1443196253-22765-1-git-send-email-suzuki.poulose@arm.com (mailing list archive)
State: New, archived
As you're posting an update to CCI500, I thought I'd hijack this
thread to complain about the same thing that I complained about
with CCI400:

+config ARM_CCI500_PMU
+	bool "ARM CCI500 PMU support"
+	default y
+	depends on (ARM && CPU_V7) || ARM64
+	depends on PERF_EVENTS
+	select ARM_CCI_PMU

Why "default y" ? Not all ARMv7 CPUs have the CCI bus, and what it
means is that when moving forward, it's a pain to remember to turn
the option off amongst all the other Kconfig questions.

Only make things "default y" if it was something that was built in
a previous kernel and not building it in a later kernel would cause
a regression, or where it's absolutely necessary to do so.

Thanks.
On 25/09/15 17:05, Russell King - ARM Linux wrote:
> As you're posting an update to CCI500, I thought I'd hijack this
> thread to complain about the same thing that I complained about
> with CCI400:
>
> +config ARM_CCI500_PMU
> +	bool "ARM CCI500 PMU support"
> +	default y
> +	depends on (ARM && CPU_V7) || ARM64
> +	depends on PERF_EVENTS
> +	select ARM_CCI_PMU
>
> Why "default y" ? Not all ARMv7 CPUs have the CCI bus, and what it
> means is that when moving forward, it's a pain to remember to turn
> the option off amongst all the other Kconfig questions.
>
> Only make things "default y" if it was something that was built in
> a previous kernel and not building it in a later kernel would cause
> a regression, or where it's absolutely necessary to do so.

Sure, we can turn that off by default. I will send a patch.

Suzuki
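Presumably the follow-up patch would simply drop the "default y" line, so the option has to be enabled explicitly; a minimal sketch of what that Kconfig entry might then look like:

	config ARM_CCI500_PMU
		bool "ARM CCI500 PMU support"
		depends on (ARM && CPU_V7) || ARM64
		depends on PERF_EVENTS
		select ARM_CCI_PMU

With that, defconfigs that want the CCI-500 PMU opt in explicitly instead of everyone having to remember to switch it off.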
On Fri, Sep 25, 2015 at 04:50:53PM +0100, Suzuki K. Poulose wrote:
> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>
> The CCI PMU driver sets the event counter to half of the maximum
> value (2^31) it can count before we start the counters via
> pmu_event_set_period(). This is done to give us the best chance to
> handle the overflow interrupt, taking care of extreme interrupt latencies.
>
> However, CCI-500 comes with advanced power saving schemes, which disable
> the clock to the event counters unless the counters are enabled to count
> (PMCR.CEN). This prevents the driver from writing the period to the
> counters before starting them. Also, there is no way we can reset an
> individual event counter to 0 (PMCR.RST resets all the counters, losing
> their current readings). However, the value of the counter is preserved and
> can be read back when the counters are not enabled.

Does reset work while the counters are clock gated?

I take it that we can configure the counters while they are disabled,
even if we can't alter the counter value?

> So we cannot reliably use the counters and compute the number of events
> generated during the sampling period, since we don't have the value of the
> counter at the start.
>
> Here are the possible solutions:
>
> 1) Disable clock gating on CCI-500 by setting Control_Override_Reg[bit3].
>    - The Control_Override_Reg is secure (and hence not programmable from
>      Linux), and also has an impact on power consumption.
>
> 2) Change the order of operations, i.e.:
>    a) Program and enable individual counters
>    b) Enable counting on all the counters by setting PMCR.CEN
>    c) Write the period to the individual counters
>    d) Disable the counters
>    - This could cause unnecessary noise in the other counters and is
>      costly (we would have to repeat this for all enabled counters).
>
> 3) Don't set the counter value; instead use the current count as the
>    starting count and compute the delta at the end of sampling.

This leaves us less broken than we currently are, but as far as I can
tell, this leaves us back where we started, with the overflow issues
seen elsewhere, cf:

a737823d37666255 ("ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency")
5727347180ebc6b4 ("ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode")

> @@ -792,15 +803,33 @@ static void pmu_read(struct perf_event *event)
>  void pmu_event_set_period(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> +	struct cci_pmu *cci_pmu = to_cci_pmu(event->pmu);
> +	u64 val = 0;
> +
>  	/*
> -	 * The CCI PMU counters have a period of 2^32. To account for the
> -	 * possibility of extreme interrupt latency we program for a period of
> -	 * half that. Hopefully we can handle the interrupt before another 2^31
> -	 * events occur and the counter overtakes its previous value.
> +	 * If the PMU gates the clock to the PMU event counters when the counters
> +	 * are not enabled, any write to them is ineffective. Hence we cannot
> +	 * set a value reliably. Also, we cannot reset an individual event
> +	 * counter; PMCR.RST resets all the counters. However, the existing
> +	 * counter values can be read back. Hence, we use the existing counter
> +	 * value as the period and set the prev_count accordingly. This is
> +	 * safe, since we don't support sampling of events anyway.
>  	 */
> -	u64 val = 1ULL << 31;
> +	if (pmu_ignores_cntr_write(cci_pmu)) {
> +		val = pmu_read_counter(event);
> +	} else {
> +		/*
> +		 * The CCI PMU counters have a period of 2^32. To account for
> +		 * the possibility of extreme interrupt latency we program for
> +		 * a period of half that. Hopefully we can handle the interrupt
> +		 * before another 2^31 events occur and the counter overtakes
> +		 * its previous value.
> +		 */
> +		val = 1ULL << 31;
> +		pmu_write_counter(event, val);
> +	}
> +
>  	local64_set(&hwc->prev_count, val);
> -	pmu_write_counter(event, val);
>  }

We could get rid of the flag and always do:

	pmu_write_counter(event, 1UL << 31);
	val = pmu_read_counter(event);

That should work for CCI-400 and CCI-500, and if the CCI-500 counters
aren't clock-gated (e.g. because of FW configuration) then we get the
period that we actually wanted.

However, can't we use an event which won't count to solve the race with
option 2?

Per the TRM, events for unimplemented interfaces don't count, though I'm
not sure if there's an event that's guaranteed not to count in all
configurations. Perhaps we might be able to dig one up.

Thanks,
Mark.
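For concreteness, here is a rough, untested sketch of the flag-free variant suggested above, reusing the driver's existing pmu_write_counter()/pmu_read_counter() helpers. On CCI-400 (or an ungated CCI-500) the read-back returns the 2^31 value that was just written; on a clock-gated CCI-500 the write is silently ignored and the current counter value becomes the reference point instead:

	void pmu_event_set_period(struct perf_event *event)
	{
		struct hw_perf_event *hwc = &event->hw;
		u64 val;

		/*
		 * Try to program half the maximum period. If the counter's
		 * clock is gated the write has no effect, so read the value
		 * back and record whatever the counter actually holds.
		 */
		pmu_write_counter(event, 1ULL << 31);
		val = pmu_read_counter(event);
		local64_set(&hwc->prev_count, val);
	}

Either way, prev_count ends up matching the hardware counter, so the later delta computation stays correct.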
On 28/09/15 16:38, Mark Rutland wrote:
> On Fri, Sep 25, 2015 at 04:50:53PM +0100, Suzuki K. Poulose wrote:
>> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>>
>> The CCI PMU driver sets the event counter to half of the maximum
>> value (2^31) it can count before we start the counters via
>> pmu_event_set_period(). This is done to give us the best chance to
>> handle the overflow interrupt, taking care of extreme interrupt latencies.
>>
>> However, CCI-500 comes with advanced power saving schemes, which disable
>> the clock to the event counters unless the counters are enabled to count
>> (PMCR.CEN). This prevents the driver from writing the period to the
>> counters before starting them. Also, there is no way we can reset an
>> individual event counter to 0 (PMCR.RST resets all the counters, losing
>> their current readings). However, the value of the counter is preserved and
>> can be read back when the counters are not enabled.
>
> Does reset work while the counters are clock gated?

No, it doesn't. Both PMCR.CEN and Count Control must be turned on to
alter the value of the counter.

>
> I take it that we can configure the counters while they are disabled,
> even if we can't alter the counter value?

We can program the events for the counter, but we can't alter the counter
value.

>
>> So we cannot reliably use the counters and compute the number of events
>> generated during the sampling period, since we don't have the value of the
>> counter at the start.
>>
>> Here are the possible solutions:
>>
>> 1) Disable clock gating on CCI-500 by setting Control_Override_Reg[bit3].
>>    - The Control_Override_Reg is secure (and hence not programmable from
>>      Linux), and also has an impact on power consumption.
>>
>> 2) Change the order of operations, i.e.:
>>    a) Program and enable individual counters
>>    b) Enable counting on all the counters by setting PMCR.CEN
>>    c) Write the period to the individual counters
>>    d) Disable the counters
>>    - This could cause unnecessary noise in the other counters and is
>>      costly (we would have to repeat this for all enabled counters).
>>
>> 3) Don't set the counter value; instead use the current count as the
>>    starting count and compute the delta at the end of sampling.
>
> This leaves us less broken than we currently are, but as far as I can
> tell, this leaves us back where we started, with the overflow issues
> seen elsewhere, cf:
>
> a737823d37666255 ("ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency")
> 5727347180ebc6b4 ("ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode")

Thanks for the pointers. Even now, without this change, we could
theoretically run into the overflow, couldn't we? So we should fix it
once and for all.

>> @@ -792,15 +803,33 @@ static void pmu_read(struct perf_event *event)
>>  void pmu_event_set_period(struct perf_event *event)
>>  {
>>  	struct hw_perf_event *hwc = &event->hw;
>> +	struct cci_pmu *cci_pmu = to_cci_pmu(event->pmu);
>> +	u64 val = 0;
>> +
>>  	/*
>> -	 * The CCI PMU counters have a period of 2^32. To account for the
>> -	 * possibility of extreme interrupt latency we program for a period of
>> -	 * half that. Hopefully we can handle the interrupt before another 2^31
>> -	 * events occur and the counter overtakes its previous value.
>> +	 * If the PMU gates the clock to the PMU event counters when the counters
>> +	 * are not enabled, any write to them is ineffective. Hence we cannot
>> +	 * set a value reliably. Also, we cannot reset an individual event
>> +	 * counter; PMCR.RST resets all the counters. However, the existing
>> +	 * counter values can be read back. Hence, we use the existing counter
>> +	 * value as the period and set the prev_count accordingly. This is
>> +	 * safe, since we don't support sampling of events anyway.
>>  	 */
>> -	u64 val = 1ULL << 31;
>> +	if (pmu_ignores_cntr_write(cci_pmu)) {
>> +		val = pmu_read_counter(event);
>> +	} else {
>> +		/*
>> +		 * The CCI PMU counters have a period of 2^32. To account for
>> +		 * the possibility of extreme interrupt latency we program for
>> +		 * a period of half that. Hopefully we can handle the interrupt
>> +		 * before another 2^31 events occur and the counter overtakes
>> +		 * its previous value.
>> +		 */
>> +		val = 1ULL << 31;
>> +		pmu_write_counter(event, val);
>> +	}
>> +
>>  	local64_set(&hwc->prev_count, val);
>> -	pmu_write_counter(event, val);
>>  }
>
> We could get rid of the flag and always do:
>
> 	pmu_write_counter(event, 1UL << 31);
> 	val = pmu_read_counter(event);
>
> That should work for CCI-400 and CCI-500, and if the CCI-500 counters
> aren't clock-gated (e.g. because of FW configuration) then we get the
> period that we actually wanted.

Yes, that looks neater. I will change it and add some comments there.

>
> However, can't we use an event which won't count to solve the race with
> option 2?

Even if we found one, wouldn't we have to set the 'no-op' event on all the
counters if we want to change a counter's value, to prevent the other
counters from counting? And it gets complicated with/without event
grouping.

>
> Per the TRM, events for unimplemented interfaces don't count, though I'm
> not sure if there's an event that's guaranteed not to count in all
> configurations. Perhaps we might be able to dig one up.

IIRC, there wasn't any mention of a single event which falls into that
category. I think solution 3 is much cleaner than finding and using a
dummy event.

Suzuki
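Solution 3 works for the counting (non-sampling) case because the read path only ever consumes the difference between the current counter value and prev_count; the absolute starting value does not matter as long as fewer than 2^32 events occur between two reads. A sketch of that delta computation, modeled on the usual perf-driver read pattern (the function name and the 32-bit CCI_PMU_CNTR_MASK follow the existing driver's conventions, but treat this as an illustration rather than the exact code):

	static u64 pmu_event_update(struct perf_event *event)
	{
		struct hw_perf_event *hwc = &event->hw;
		u64 delta, prev_raw_count, new_raw_count;

		/* Retry until prev_count is updated without racing another reader. */
		do {
			prev_raw_count = local64_read(&hwc->prev_count);
			new_raw_count = pmu_read_counter(event);
		} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
			 new_raw_count) != prev_raw_count);

		/* 32-bit counter: the subtraction wraps naturally modulo 2^32. */
		delta = (new_raw_count - prev_raw_count) & CCI_PMU_CNTR_MASK;
		local64_add(delta, &event->count);

		return new_raw_count;
	}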
Hi,

> > Does reset work while the counters are clock gated?
>
> No, it doesn't. Both PMCR.CEN and Count Control must be turned on to
> alter the value of the counter.

Ok. I take it PMCR.CEN is global and Count Control is per-counter?

> > I take it that we can configure the counters while they are disabled,
> > even if we can't alter the counter value?
>
> We can program the events for the counter, but we can't alter the counter
> value.

Ok, that's something, at least.

> >> So we cannot reliably use the counters and compute the number of events
> >> generated during the sampling period, since we don't have the value of the
> >> counter at the start.
> >>
> >> Here are the possible solutions:
> >>
> >> 1) Disable clock gating on CCI-500 by setting Control_Override_Reg[bit3].
> >>    - The Control_Override_Reg is secure (and hence not programmable from
> >>      Linux), and also has an impact on power consumption.
> >>
> >> 2) Change the order of operations, i.e.:
> >>    a) Program and enable individual counters
> >>    b) Enable counting on all the counters by setting PMCR.CEN
> >>    c) Write the period to the individual counters
> >>    d) Disable the counters
> >>    - This could cause unnecessary noise in the other counters and is
> >>      costly (we would have to repeat this for all enabled counters).
> >>
> >> 3) Don't set the counter value; instead use the current count as the
> >>    starting count and compute the delta at the end of sampling.
> >
> > This leaves us less broken than we currently are, but as far as I can
> > tell, this leaves us back where we started, with the overflow issues
> > seen elsewhere, cf:
> >
> > a737823d37666255 ("ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency")
> > 5727347180ebc6b4 ("ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode")
>
> Thanks for the pointers. Even now, without this change, we could
> theoretically run into the overflow, couldn't we? So we should fix it
> once and for all.

It's not really fixable as such (given sufficiently frequent events and
large enough IRQ latency there's simply nothing you can do). The current
behaviour on CCI-400 is about the best we can do, so we should aim for
that on CCI-500 if possible.

> > We could get rid of the flag and always do:
> >
> > 	pmu_write_counter(event, 1UL << 31);
> > 	val = pmu_read_counter(event);
> >
> > That should work for CCI-400 and CCI-500, and if the CCI-500 counters
> > aren't clock-gated (e.g. because of FW configuration) then we get the
> > period that we actually wanted.
>
> Yes, that looks neater. I will change it and add some comments there.

Great. That still leaves us with racy overflow handling on a clock-gated
CCI-500 PMU, which we should aim to avoid if possible.

> > However, can't we use an event which won't count to solve the race with
> > option 2?
>
> Even if we found one, wouldn't we have to set the 'no-op' event on all the
> counters if we want to change a counter's value, to prevent the other
> counters from counting?

Can't we just clear the counter_enable bits for the other counters? We
could handle that in pmu_{disable,enable} to amortize the cost (or even
do it lazily on the first re-programming).

> And it gets complicated with/without event grouping.

I don't see why grouping would be made any more complicated by this.
What am I missing?

> > Per the TRM, events for unimplemented interfaces don't count, though I'm
> > not sure if there's an event that's guaranteed not to count in all
> > configurations. Perhaps we might be able to dig one up.
>
> IIRC, there wasn't any mention of a single event which falls into that
> category. I think solution 3 is much cleaner than finding and using a
> dummy event.

I agree that it is simpler, but it leaves us with a worryingly high
chance of coming up with erroneous values, which detracts from the
usefulness of the PMU.

Thanks,
Mark.
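To make the cost of option (2) concrete, here is a purely hypothetical sketch of what reprogramming one counter might look like if the other counters are parked first, along the lines suggested above. Only pmu_write_counter() exists in the current driver; the save/restore, per-counter enable and PMCR.CEN helpers are invented names, and the target counter can still count a handful of events in the short window while PMCR.CEN is set:

	/* All helpers other than pmu_write_counter() are hypothetical. */
	static void pmu_force_write_counter(struct cci_pmu *cci_pmu,
					    struct perf_event *event, u64 val)
	{
		unsigned long enabled = pmu_save_counter_enables(cci_pmu);
		int idx = event->hw.idx;

		/* Park every counter except the one we want to program. */
		pmu_disable_all_counters(cci_pmu);
		pmu_set_counter_enable(cci_pmu, idx);

		/* Both PMCR.CEN and the counter enable must be on for the write. */
		cci_pmu_set_cen(cci_pmu, true);
		pmu_write_counter(event, val);
		cci_pmu_set_cen(cci_pmu, false);

		pmu_restore_counter_enables(cci_pmu, enabled);
	}

Doing the save/restore in pmu_enable()/pmu_disable(), as suggested, would amortize the cost across all counters being reprogrammed rather than paying it per event.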
diff --git a/drivers/bus/arm-cci.c b/drivers/bus/arm-cci.c
index 577cc4b..7db99dc 100644
--- a/drivers/bus/arm-cci.c
+++ b/drivers/bus/arm-cci.c
@@ -88,6 +88,10 @@ static const struct of_device_id arm_cci_matches[] = {
 #define CCI_PMU_MAX_HW_CNTRS(model) \
 	((model)->num_hw_cntrs + (model)->fixed_hw_cntrs)
 
+/* CCI PMU flags */
+/* Writes to CNTR are ignored, see pmu_event_set_period() */
+#define CCI_PMU_IGNORE_CNTR_WRITE	(1 << 0)
+
 /* Types of interfaces that can generate events */
 enum {
 	CCI_IF_SLAVE,
@@ -118,6 +122,7 @@ struct cci_pmu;
  */
 struct cci_pmu_model {
 	char *name;
+	u32 flags;
 	u32 fixed_hw_cntrs;
 	u32 num_hw_cntrs;
 	u32 cntr_size;
@@ -472,6 +477,7 @@ static inline struct cci_pmu_model *probe_cci_model(struct platform_device *pdev
 
 #define CCI500_GLOBAL_PORT_MIN_EV	0x00
 #define CCI500_GLOBAL_PORT_MAX_EV	0x0f
+#define CCI500_PMU_FLAGS		(CCI_PMU_IGNORE_CNTR_WRITE)
 
 #define CCI500_GLOBAL_EVENT_EXT_ATTR_ENTRY(_name, _config) \
 	CCI_EXT_ATTR_ENTRY(_name, cci500_pmu_global_event_show, \
@@ -619,6 +625,11 @@ static ssize_t cci_pmu_event_show(struct device *dev,
 			(unsigned long)eattr->var);
 }
 
+static inline bool pmu_ignores_cntr_write(struct cci_pmu *cci_pmu)
+{
+	return !!(cci_pmu->model->flags & CCI_PMU_IGNORE_CNTR_WRITE);
+}
+
 static int pmu_is_valid_counter(struct cci_pmu *cci_pmu, int idx)
 {
 	return 0 <= idx && idx <= CCI_PMU_CNTR_LAST(cci_pmu);
@@ -792,15 +803,33 @@ static void pmu_read(struct perf_event *event)
 void pmu_event_set_period(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct cci_pmu *cci_pmu = to_cci_pmu(event->pmu);
+	u64 val = 0;
+
 	/*
-	 * The CCI PMU counters have a period of 2^32. To account for the
-	 * possibility of extreme interrupt latency we program for a period of
-	 * half that. Hopefully we can handle the interrupt before another 2^31
-	 * events occur and the counter overtakes its previous value.
+	 * If the PMU gates the clock to the PMU event counters when the counters
+	 * are not enabled, any write to them is ineffective. Hence we cannot
+	 * set a value reliably. Also, we cannot reset an individual event
+	 * counter; PMCR.RST resets all the counters. However, the existing
+	 * counter values can be read back. Hence, we use the existing counter
+	 * value as the period and set the prev_count accordingly. This is
+	 * safe, since we don't support sampling of events anyway.
 	 */
-	u64 val = 1ULL << 31;
+	if (pmu_ignores_cntr_write(cci_pmu)) {
+		val = pmu_read_counter(event);
+	} else {
+		/*
+		 * The CCI PMU counters have a period of 2^32. To account for
+		 * the possibility of extreme interrupt latency we program for
+		 * a period of half that. Hopefully we can handle the interrupt
+		 * before another 2^31 events occur and the counter overtakes
+		 * its previous value.
+		 */
+		val = 1ULL << 31;
+		pmu_write_counter(event, val);
+	}
+
 	local64_set(&hwc->prev_count, val);
-	pmu_write_counter(event, val);
 }
 
 static irqreturn_t pmu_handle_irq(int irq_num, void *dev)
@@ -1380,6 +1409,7 @@ static struct cci_pmu_model cci_pmu_models[] = {
 		.name = "CCI_500",
 		.fixed_hw_cntrs = 0,
 		.num_hw_cntrs = 8,
+		.flags = CCI500_PMU_FLAGS,
 		.cntr_size = SZ_64K,
 		.format_attrs = cci500_pmu_format_attrs,
 		.nformat_attrs = ARRAY_SIZE(cci500_pmu_format_attrs),