Message ID | 1599472439-22770-1-git-send-email-qiangqing.zhang@nxp.com (mailing list archive) |
---|---|
State | New, archived |
Series | perf/imx_ddr: Add stop event counters support for i.MX8MP |
On Mon, Sep 07, 2020 at 05:53:59PM +0800, Joakim Zhang wrote:
> DDR Perf driver only supports free-running event counters(counter1/2/3)
> now, this patch adds support for stop event counters.
>
> Legacy SoCs:
> Cycle counter(counter0) is a special counter, only count cycles. When
> cycle counter overflow, it will lock all counters and generate an
> interrupt. In ddr_perf_irq_handler, disable cycle counter then all
> counters would stop at the same time, update all counters' count, then
> enable cycle counter that all counters count again. During this process,
> only clear cycle counter, no need to clear event counters since they are
> free-running counters. They would continue counting after overflow and
> do/while loop from ddr_perf_event_update can handle event counters
> overflow case.
>
> i.MX8MP:
> Almost all is the same as legacy SoCs, the only difference is that, event
> counters are not free-running any more. Like cycle counter, when event
> counters overflow, they would stop counting unless clear the counter,
> and no interrupt generate for event counters. So we should clear event
> counters that let them re-count when cycle counter overflow, which ensure
> event counters will not lose data.

Was this supposed to be an improvement over the "Legacy SoCs"
implementation? It seems even worse...

Do you _have_ to write zeroes back to the event counters to get them going
again, or will any value do?

> diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c
> index 90884d14f95f..057e361eb391 100644
> --- a/drivers/perf/fsl_imx8_ddr_perf.c
> +++ b/drivers/perf/fsl_imx8_ddr_perf.c
> @@ -14,6 +14,7 @@
>  #include <linux/of_device.h>
>  #include <linux/of_irq.h>
>  #include <linux/perf_event.h>
> +#include <linux/spinlock.h>
>  #include <linux/slab.h>
>
>  #define COUNTER_CNTL	0x0
> @@ -82,6 +83,7 @@ struct ddr_pmu {
>  	const struct fsl_ddr_devtype_data *devtype_data;
>  	int irq;
>  	int id;
> +	spinlock_t lock;
>  };
>
>  enum ddr_perf_filter_capabilities {
> @@ -368,16 +370,19 @@ static void ddr_perf_event_update(struct perf_event *event)
>  	struct hw_perf_event *hwc = &event->hw;
>  	u64 delta, prev_raw_count, new_raw_count;
>  	int counter = hwc->idx;
> +	unsigned long flags;
>
> -	do {
> -		prev_raw_count = local64_read(&hwc->prev_count);
> -		new_raw_count = ddr_perf_read_counter(pmu, counter);
> -	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> -			new_raw_count) != prev_raw_count);
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	prev_raw_count = local64_read(&hwc->prev_count);
> +	new_raw_count = ddr_perf_read_counter(pmu, counter);
>
>  	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
>
>  	local64_add(delta, &event->count);
> +	local64_set(&hwc->prev_count, new_raw_count);

Hmm, assuming that the event counters never overflow, why do we care about
the prev count at all? In other words, why don't we just add the counter
value to event->count and reset the hardware to zero every time?

Will
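For context, the do/while update that the patch removes tolerates a 32-bit wrap of a free-running counter purely arithmetically. Below is the same logic as the removed hunk, with explanatory comments added here for illustration (not part of the driver source):

	do {
		/* Snapshot the last value we accounted for and the current raw value... */
		prev_raw_count = local64_read(&hwc->prev_count);
		new_raw_count = ddr_perf_read_counter(pmu, counter);
		/* ...and retry if another updater raced us and moved prev_count. */
	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
				 new_raw_count) != prev_raw_count);

	/* Masking to 32 bits makes the subtraction come out right even when the
	 * free-running 32-bit hardware counter wrapped between the two reads. */
	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
	local64_add(delta, &event->count);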
> -----Original Message-----
> From: Will Deacon <will@kernel.org>
> Sent: September 8, 2020 1:07
> To: Joakim Zhang <qiangqing.zhang@nxp.com>
> Cc: mark.rutland@arm.com; robin.murphy@arm.com; dl-linux-imx
> <linux-imx@nxp.com>; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH] perf/imx_ddr: Add stop event counters support for
> i.MX8MP
>
> On Mon, Sep 07, 2020 at 05:53:59PM +0800, Joakim Zhang wrote:
> > DDR Perf driver only supports free-running event
> > counters(counter1/2/3) now, this patch adds support for stop event counters.
> >
> > Legacy SoCs:
> > Cycle counter(counter0) is a special counter, only count cycles. When
> > cycle counter overflow, it will lock all counters and generate an
> > interrupt. In ddr_perf_irq_handler, disable cycle counter then all
> > counters would stop at the same time, update all counters' count, then
> > enable cycle counter that all counters count again. During this
> > process, only clear cycle counter, no need to clear event counters
> > since they are free-running counters. They would continue counting
> > after overflow and do/while loop from ddr_perf_event_update can handle
> > event counters overflow case.
> >
> > i.MX8MP:
> > Almost all is the same as legacy SoCs, the only difference is that,
> > event counters are not free-running any more. Like cycle counter, when
> > event counters overflow, they would stop counting unless clear the
> > counter, and no interrupt generate for event counters. So we should
> > clear event counters that let them re-count when cycle counter
> > overflow, which ensure event counters will not lose data.
>
> Was this supposed to be an improvement over the "Legacy SoCs"
> implementation? It seems even worse...

From the IC designers' perspective it is an improvement: they consider that
event counters should also stop counting when they overflow, so they fixed
it as a bug. From a software perspective we would prefer the event counters
to stay free-running. However, the IC designers did not inform us when they
made this change.

> Do you _have_ to write zeroes back to the event counters to get them going
> again, or will any value do?

No, the event counters also have a CLEAR bit; we only need to clear this
CLEAR bit and the event counters start counting again from zero.
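To make the CLEAR-bit remark concrete, restarting an event counter could look roughly like the sketch below. ddr_perf_event_counter_clear() is a hypothetical helper written for this illustration, and CNTL_CLEAR is assumed to be the driver's existing mask for that control bit; the bit behaviour follows Joakim's description above rather than the posted patch:

	/* Hypothetical helper: restart an event counter from zero by writing 0 to
	 * its CLEAR bit, as described in the reply above. */
	static void ddr_perf_event_counter_clear(struct ddr_pmu *pmu, int counter)
	{
		u32 val;

		val = readl_relaxed(pmu->base + counter * 4 + COUNTER_CNTL);
		val &= ~CNTL_CLEAR;
		writel(val, pmu->base + counter * 4 + COUNTER_CNTL);
	}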
> > diff --git a/drivers/perf/fsl_imx8_ddr_perf.c
> > b/drivers/perf/fsl_imx8_ddr_perf.c
> > index 90884d14f95f..057e361eb391 100644
> > --- a/drivers/perf/fsl_imx8_ddr_perf.c
> > +++ b/drivers/perf/fsl_imx8_ddr_perf.c
> > @@ -14,6 +14,7 @@
> >  #include <linux/of_device.h>
> >  #include <linux/of_irq.h>
> >  #include <linux/perf_event.h>
> > +#include <linux/spinlock.h>
> >  #include <linux/slab.h>
> >
> >  #define COUNTER_CNTL	0x0
> > @@ -82,6 +83,7 @@ struct ddr_pmu {
> >  	const struct fsl_ddr_devtype_data *devtype_data;
> >  	int irq;
> >  	int id;
> > +	spinlock_t lock;
> >  };
> >
> >  enum ddr_perf_filter_capabilities {
> > @@ -368,16 +370,19 @@ static void ddr_perf_event_update(struct
> perf_event *event)
> >  	struct hw_perf_event *hwc = &event->hw;
> >  	u64 delta, prev_raw_count, new_raw_count;
> >  	int counter = hwc->idx;
> > +	unsigned long flags;
> >
> > -	do {
> > -		prev_raw_count = local64_read(&hwc->prev_count);
> > -		new_raw_count = ddr_perf_read_counter(pmu, counter);
> > -	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> > -			new_raw_count) != prev_raw_count);
> > +	spin_lock_irqsave(&pmu->lock, flags);
> > +
> > +	prev_raw_count = local64_read(&hwc->prev_count);
> > +	new_raw_count = ddr_perf_read_counter(pmu, counter);
> >
> >  	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
> >
> >  	local64_add(delta, &event->count);
> > +	local64_set(&hwc->prev_count, new_raw_count);
>
> Hmm, assuming that the event counters never overflow, why do we care about
> the prev count at all? In other words, why don't we just add the counter value
> to event->count and reset the hardware to zero every time?

Do you mean: for the cycle counter, keep the original routine; for the event
counters, add the counter value to event->count and then clear the event
counters so they count from zero again?

Sounds great! I will give it a try. Thanks.

Best Regards,
Joakim Zhang

> Will
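The approach Joakim agrees to try could look something like the sketch below for the event-counter path. This is illustrative only, not the posted patch: ddr_perf_event_counter_update() is a hypothetical function name, it reuses the hypothetical ddr_perf_event_counter_clear() from the earlier sketch, and the cycle counter is assumed to keep the original free-running do/while update.

	/* Sketch: event-counter update only; the cycle counter keeps the
	 * original free-running update. */
	static void ddr_perf_event_counter_update(struct perf_event *event)
	{
		struct ddr_pmu *pmu = to_ddr_pmu(event->pmu);
		struct hw_perf_event *hwc = &event->hw;
		int counter = hwc->idx;
		u64 new_raw_count;

		/* Fold the raw value straight into the event count... */
		new_raw_count = ddr_perf_read_counter(pmu, counter);
		local64_add(new_raw_count, &event->count);

		/* ...then clear the hardware counter so it restarts from zero and,
		 * on i.MX8MP, cannot sit stopped at an overflowed value. */
		ddr_perf_event_counter_clear(pmu, counter);
	}

This removes the need to track prev_count for the event counters at all, which is what Will's question points at.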
diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c
index 90884d14f95f..057e361eb391 100644
--- a/drivers/perf/fsl_imx8_ddr_perf.c
+++ b/drivers/perf/fsl_imx8_ddr_perf.c
@@ -14,6 +14,7 @@
 #include <linux/of_device.h>
 #include <linux/of_irq.h>
 #include <linux/perf_event.h>
+#include <linux/spinlock.h>
 #include <linux/slab.h>
 
 #define COUNTER_CNTL	0x0
@@ -82,6 +83,7 @@ struct ddr_pmu {
 	const struct fsl_ddr_devtype_data *devtype_data;
 	int irq;
 	int id;
+	spinlock_t lock;
 };
 
 enum ddr_perf_filter_capabilities {
@@ -368,16 +370,19 @@ static void ddr_perf_event_update(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	u64 delta, prev_raw_count, new_raw_count;
 	int counter = hwc->idx;
+	unsigned long flags;
 
-	do {
-		prev_raw_count = local64_read(&hwc->prev_count);
-		new_raw_count = ddr_perf_read_counter(pmu, counter);
-	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-			new_raw_count) != prev_raw_count);
+	spin_lock_irqsave(&pmu->lock, flags);
+
+	prev_raw_count = local64_read(&hwc->prev_count);
+	new_raw_count = ddr_perf_read_counter(pmu, counter);
 
 	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
 
 	local64_add(delta, &event->count);
+	local64_set(&hwc->prev_count, new_raw_count);
+
+	spin_unlock_irqrestore(&pmu->lock, flags);
 }
 
 static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config,
@@ -404,6 +409,15 @@ static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config,
 	}
 }
 
+static bool ddr_perf_counter_overflow(struct ddr_pmu *pmu, int counter)
+{
+	int val;
+
+	val = readl_relaxed(pmu->base + counter * 4 + COUNTER_CNTL);
+
+	return val & CNTL_OVER ? true : false;
+}
+
 static void ddr_perf_event_start(struct perf_event *event, int flags)
 {
 	struct ddr_pmu *pmu = to_ddr_pmu(event->pmu);
@@ -534,7 +548,7 @@ static int ddr_perf_init(struct ddr_pmu *pmu, void __iomem *base,
 
 static irqreturn_t ddr_perf_irq_handler(int irq, void *p)
 {
-	int i;
+	int i, ret;
 	struct ddr_pmu *pmu = (struct ddr_pmu *) p;
 	struct perf_event *event, *cycle_event = NULL;
 
@@ -546,7 +560,7 @@ static irqreturn_t ddr_perf_irq_handler(int irq, void *p)
 	/*
 	 * When the cycle counter overflows, all counters are stopped,
 	 * and an IRQ is raised. If any other counter overflows, it
-	 * continues counting, and no IRQ is raised.
+	 * continues counting (stop counting for i.MX8MP), and no IRQ is raised.
 	 *
 	 * Cycles occur at least 4 times as often as other events, so we
 	 * can update all events on a cycle counter overflow and not
@@ -566,6 +580,29 @@ static irqreturn_t ddr_perf_irq_handler(int irq, void *p)
 			cycle_event = event;
 	}
 
+	/* Clear event counters to avoid they stop counting when overflow, such as i.MX8MP */
+	spin_lock(&pmu->lock);
+	for (i = 0; i < NUM_COUNTERS; i++) {
+		if (!pmu->events[i])
+			continue;
+
+		event = pmu->events[i];
+
+		if (event->hw.idx == EVENT_CYCLES_COUNTER)
+			continue;
+
+		/* check event counters overflow */
+		ret = ddr_perf_counter_overflow(pmu, event->hw.idx);
+		if (ret)
+			dev_warn(pmu->dev, "Event Counter%d overflow happened, data incorrect!!\n", i);
+
+		/* clear event counters */
+		ddr_perf_counter_enable(pmu, event->attr.config, event->hw.idx, true);
+
+		local64_set(&event->hw.prev_count, 0);
+	}
+	spin_unlock(&pmu->lock);
+
 	ddr_perf_counter_enable(pmu,
 			      EVENT_CYCLES_ID,
 			      EVENT_CYCLES_COUNTER,
@@ -619,6 +656,7 @@ static int ddr_perf_probe(struct platform_device *pdev)
 	num = ddr_perf_init(pmu, base, &pdev->dev);
 
 	platform_set_drvdata(pdev, pmu);
+	spin_lock_init(&pmu->lock);
 
 	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, DDR_PERF_DEV_NAME "%d",
 			      num);
The DDR Perf driver currently only supports free-running event counters
(counter1/2/3); this patch adds support for stop event counters.

Legacy SoCs:
The cycle counter (counter0) is a special counter that only counts cycles.
When the cycle counter overflows, it locks all counters and generates an
interrupt. In ddr_perf_irq_handler, we disable the cycle counter so that all
counters stop at the same time, update every counter's count, and then
re-enable the cycle counter so that all counters start counting again.
During this process only the cycle counter is cleared; there is no need to
clear the event counters since they are free-running: they keep counting
after an overflow, and the do/while loop in ddr_perf_event_update handles
the event counter overflow case.

i.MX8MP:
Almost everything is the same as on legacy SoCs; the only difference is that
the event counters are no longer free-running. Like the cycle counter, when
an event counter overflows it stops counting unless the counter is cleared,
and no interrupt is generated for event counters. So we should clear the
event counters when the cycle counter overflows, letting them count again,
which ensures the event counters do not lose data.

Consider one case: in the cycle counter interrupt context, we invoke
ddr_perf_counter_enable to clear an event counter but have not yet set
prev_count to 0. Concurrently, ddr_perf_event_update in another thread
context invokes ddr_perf_read_counter to read that event counter and gets 0,
so the delta calculation (new_raw_count - prev_raw_count) is incorrect. I
therefore add a spinlock so that clearing the event counters and updating
them never happen concurrently. It is safe for the cycle counter to be
cleared and then updated, since at that point it has just overflowed.

This patch adds stop event counter support in a way that stays compatible
with free-running event counters.

Hi Will,

I am resending the patch for your review since the time span from the last
mail is too long. I am not sure whether this is a proper solution or not; if
there is a better one, please share how to implement it. Thanks.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
---
 drivers/perf/fsl_imx8_ddr_perf.c | 52 +++++++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 7 deletions(-)
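To make the race described in the cover letter concrete, these are the two sequences the new pmu->lock serializes, distilled from the diff above. The example_* wrappers are illustrative only, not functions in the driver. Without the lock, the update side can read a just-cleared counter (value 0) while prev_count still holds the old value, so the masked subtraction produces a huge bogus delta.

	/* IRQ-handler side: clear the event counter and reset its prev_count. */
	static void example_clear_side(struct ddr_pmu *pmu, struct perf_event *event)
	{
		spin_lock(&pmu->lock);
		/* Re-enabling with the CLEAR handling restarts the counter from zero. */
		ddr_perf_counter_enable(pmu, event->attr.config, event->hw.idx, true);
		local64_set(&event->hw.prev_count, 0);
		spin_unlock(&pmu->lock);
	}

	/* Update side: the locked body of ddr_perf_event_update() from the patch. */
	static void example_update_side(struct ddr_pmu *pmu, struct perf_event *event)
	{
		struct hw_perf_event *hwc = &event->hw;
		u64 prev, now, delta;
		unsigned long flags;

		spin_lock_irqsave(&pmu->lock, flags);
		prev = local64_read(&hwc->prev_count);
		now = ddr_perf_read_counter(pmu, hwc->idx);
		delta = (now - prev) & 0xFFFFFFFF;
		local64_add(delta, &event->count);
		local64_set(&hwc->prev_count, now);
		spin_unlock_irqrestore(&pmu->lock, flags);
	}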