From patchwork Thu Jun 13 06:17:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anshuman Khandual
X-Patchwork-Id: 13696178
From: Anshuman Khandual
To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	will@kernel.org, catalin.marinas@arm.com, mark.rutland@arm.com
Cc: Anshuman Khandual, Mark Brown, James Clark, Rob Herring,
	Marc Zyngier, Suzuki Poulose, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, linux-perf-users@vger.kernel.org
Subject: [PATCH V18 3/9] drivers: perf: arm_pmu: Add infrastructure for
	branch stack sampling
Date: Thu, 13 Jun 2024 11:47:25 +0530
Message-Id: <20240613061731.3109448-4-anshuman.khandual@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20240613061731.3109448-1-anshuman.khandual@arm.com>
References: <20240613061731.3109448-1-anshuman.khandual@arm.com>

In order to support the Branch Record Buffer Extension (BRBE), we need
to extend the arm_pmu framework with some basic infrastructure for
branch stack sampling which arm_pmu
drivers can opt-in to using. Subsequent patches will use this to add
support for BRBE in the PMUv3 driver.

With BRBE, the hardware records branches into a hardware FIFO, which
will be sampled by software when perf events overflow. A task may be
context-switched an arbitrary number of times between overflows, and
to avoid losing samples we need to save the current records when a
task is context-switched out. To do this we'll need to use the
pmu::sched_task() callback, and we'll also need to allocate some
per-task storage space via the event flag PERF_ATTACH_TASK_DATA.

Cc: Catalin Marinas
Cc: Will Deacon
Cc: Mark Rutland
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual
---
Changes in V18:
- Scan valid branch stack events in armpmu_start() to create merged filter
- Updated the commit message

 drivers/perf/arm_pmu.c       | 42 +++++++++++++++++++++++++++++++++---
 include/linux/perf/arm_pmu.h | 32 ++++++++++++++++++++++++++-
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 8458fe2cebb4..219c1e276327 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -289,6 +289,23 @@ static void armpmu_start(struct perf_event *event, int flags)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *cpuc = this_cpu_ptr(armpmu->hw_events);
+	int idx;
+
+	/*
+	 * Merge all branch filter requests from different perf
+	 * events being added into this PMU. This includes both
+	 * privilege and branch type filters.
+	 */
+	if (armpmu->has_branch_stack) {
+		cpuc->branch_sample_type = 0;
+		for (idx = 0; idx < ARMPMU_MAX_HWEVENTS; idx++) {
+			struct perf_event *event_idx = cpuc->events[idx];
+
+			if (event_idx && has_branch_stack(event_idx))
+				cpuc->branch_sample_type |= event_idx->attr.branch_sample_type;
+		}
+	}
 
 	/*
 	 * ARM pmu always has to reprogram the period, so ignore
@@ -317,6 +334,9 @@ armpmu_del(struct perf_event *event, int flags)
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
+	if (has_branch_stack(event))
+		armpmu->branch_stack_del(event, hw_events);
+
 	armpmu_stop(event, PERF_EF_UPDATE);
 	hw_events->events[idx] = NULL;
 	armpmu->clear_event_idx(hw_events, event);
@@ -342,6 +362,9 @@ armpmu_add(struct perf_event *event, int flags)
 	if (idx < 0)
 		return idx;
 
+	if (has_branch_stack(event))
+		armpmu->branch_stack_add(event, hw_events);
+
 	/*
 	 * If there is an event in the counter we are going to use then make
 	 * sure it is disabled.
@@ -511,13 +534,25 @@ static int armpmu_event_init(struct perf_event *event)
 	    !cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
 		return -ENOENT;
 
-	/* does not support taken branch sampling */
-	if (has_branch_stack(event))
-		return -EOPNOTSUPP;
+	if (has_branch_stack(event)) {
+		if (!armpmu->has_branch_stack)
+			return -EOPNOTSUPP;
+
+		if (!armpmu->branch_stack_init(event))
+			return -EOPNOTSUPP;
+	}
 
 	return __hw_perf_event_init(event);
 }
 
+static void armpmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
+
+	if (armpmu->sched_task)
+		armpmu->sched_task(pmu_ctx, sched_in);
+}
+
 static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
@@ -864,6 +899,7 @@ struct arm_pmu *armpmu_alloc(void)
 	}
 
 	pmu->pmu = (struct pmu) {
+		.sched_task = armpmu_sched_task,
 		.pmu_enable = armpmu_enable,
 		.pmu_disable = armpmu_disable,
 		.event_init = armpmu_event_init,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index b3b34f6670cf..9eda16dd684e 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -46,6 +46,18 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_63BIT) == ARMPMU_EVT_63BIT);
 	},								\
 }
 
+/*
+ * Maximum branch record entries which could be processed
+ * for core perf branch stack sampling support, regardless
+ * of the hardware support available on a given ARM PMU.
+ */
+#define MAX_BRANCH_RECORDS 64
+
+struct branch_records {
+	struct perf_branch_stack branch_stack;
+	struct perf_branch_entry branch_entries[MAX_BRANCH_RECORDS];
+};
+
 /* The events for a given PMU register set. */
 struct pmu_hw_events {
 	/*
@@ -66,6 +78,17 @@ struct pmu_hw_events {
 	struct arm_pmu *percpu_pmu;
 
 	int irq;
+
+	struct branch_records *branches;
+
+	/* Active context for task events */
+	void *branch_context;
+
+	/* Active events requesting branch records */
+	unsigned int branch_users;
+
+	/* Active branch sample type filters */
+	unsigned long branch_sample_type;
 };
 
 enum armpmu_attr_groups {
@@ -96,8 +119,15 @@ struct arm_pmu {
 	void (*stop)(struct arm_pmu *);
 	void (*reset)(void *);
 	int (*map_event)(struct perf_event *event);
+	void (*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
+	bool (*branch_stack_init)(struct perf_event *event);
+	void (*branch_stack_add)(struct perf_event *event, struct pmu_hw_events *cpuc);
+	void (*branch_stack_del)(struct perf_event *event, struct pmu_hw_events *cpuc);
+	void (*branch_stack_reset)(void);
 	int num_events;
-	bool secure_access; /* 32-bit ARM only */
+	unsigned int secure_access : 1, /* 32-bit ARM only */
+		has_branch_stack : 1, /* 64-bit ARM only */
+		reserved : 30;
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
 	DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
 #define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
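
As an aside, a minimal sketch of how an arm_pmu driver could opt in to this
infrastructure follows. It is purely illustrative and uses hypothetical
example_* names; the real PMUv3/BRBE wiring is added by the subsequent
patches in this series.

#include <linux/perf_event.h>
#include <linux/perf/arm_pmu.h>

static bool example_branch_stack_init(struct perf_event *event)
{
	/*
	 * A real driver would validate the requested branch_sample_type
	 * against what the hardware can filter. Request per-task storage
	 * so records can be saved/restored across context switches.
	 */
	event->attach_state |= PERF_ATTACH_TASK_DATA;
	return true;
}

static void example_branch_stack_add(struct perf_event *event,
				     struct pmu_hw_events *cpuc)
{
	/* Remember the active task context and count branch users. */
	if (event->ctx->task && !cpuc->branch_context)
		cpuc->branch_context = event->ctx;
	cpuc->branch_users++;
}

static void example_branch_stack_del(struct perf_event *event,
				     struct pmu_hw_events *cpuc)
{
	/* Drop per-CPU branch state once the last branch user goes away. */
	if (!--cpuc->branch_users) {
		cpuc->branch_context = NULL;
		cpuc->branch_sample_type = 0;
	}
}

static void example_sched_task(struct perf_event_pmu_context *pmu_ctx,
			       bool sched_in)
{
	/* Save (sched out) or discard (sched in) pending hardware records. */
}

static void example_pmu_setup_branch_stack(struct arm_pmu *pmu)
{
	pmu->has_branch_stack	= 1;
	pmu->branch_stack_init	= example_branch_stack_init;
	pmu->branch_stack_add	= example_branch_stack_add;
	pmu->branch_stack_del	= example_branch_stack_del;
	pmu->sched_task		= example_sched_task;
}

With has_branch_stack set, armpmu_event_init() calls branch_stack_init()
instead of rejecting branch sampling, armpmu_add()/armpmu_del() invoke the
add/del hooks, and armpmu_start() merges the branch_sample_type filters of
all active branch stack events on the CPU.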