From patchwork Wed Oct 17 13:04:01 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sharat Masetty X-Patchwork-Id: 10645547 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 72835181D for ; Wed, 17 Oct 2018 13:04:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 67FFB2851D for ; Wed, 17 Oct 2018 13:04:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5BCAA2AF9F; Wed, 17 Oct 2018 13:04:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CF9152851D for ; Wed, 17 Oct 2018 13:04:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727263AbeJQU74 (ORCPT ); Wed, 17 Oct 2018 16:59:56 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:56424 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727000AbeJQU74 (ORCPT ); Wed, 17 Oct 2018 16:59:56 -0400 Received: by smtp.codeaurora.org (Postfix, from userid 1000) id E107061307; Wed, 17 Oct 2018 13:04:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1539781458; bh=bSgNWDkDTGZdhQjy0H7j0lZXn9CIoYJUfUo++lGQ7ns=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ngv4Or7PUca/KVmuWEMC2AO5quX7w3eSUmxWX1jxdMvsZiEf4JlqxKFE7o87asux0 4qQtc5Gv+QrrTv8BbMRrNFFjRjzKklWz8R8FPO54h6TxNzjcQDcS3RjIBZAFeb+C3o p2xxAS86F0fWU1e6/+AlBQTtb818y/2+fSns+Bjc= Received: from smasetty-linux.qualcomm.com (blr-c-bdr-fw-01_globalnat_allzones-outside.qualcomm.com [103.229.19.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: smasetty@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id A867861307; Wed, 17 Oct 2018 13:04:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1539781457; bh=bSgNWDkDTGZdhQjy0H7j0lZXn9CIoYJUfUo++lGQ7ns=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TQyySlVJgz245H3A0BMlVXqlPeSJMngvyLSqJeoHVWevipC+ra6dbJbLUUTH6znmF tIl+B1oV2Ch5krgsIvak2PdzsRbQ06qiOAG9JWgueQbRLGKPwZmYygYMhJv7a6mCmB iiWaTplsjqAcPD8Vp9qtVoyttHdrK9GCPZKTsoH0= DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org A867861307 Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=smasetty@codeaurora.org From: Sharat Masetty To: freedreno@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, jcrouse@codeaurora.org, Sharat Masetty Subject: [PATCH 3/3] drm/msm: Use Hardware counters for perf profiling Date: Wed, 17 Oct 2018 18:34:01 +0530 Message-Id: <1539781441-13076-4-git-send-email-smasetty@codeaurora.org> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1539781441-13076-1-git-send-email-smasetty@codeaurora.org> References: <1539781441-13076-1-git-send-email-smasetty@codeaurora.org> Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch attempts to make use of the hardware counters for GPU busy % estimation when possible and skip using the software counters as it also accounts for software side delays. This should help give more accurate representation of the GPU workload. Signed-off-by: Sharat Masetty --- drivers/gpu/drm/msm/msm_gpu.c | 30 ++++++++++++++++++++++++++---- drivers/gpu/drm/msm/msm_gpu.h | 5 +++-- drivers/gpu/drm/msm/msm_perf.c | 10 +++++----- 3 files changed, 34 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index e9b5426..a896541 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -592,6 +592,9 @@ static void update_sw_cntrs(struct msm_gpu *gpu) uint32_t elapsed; unsigned long flags; + if (gpu->funcs->gpu_busy) + return; + spin_lock_irqsave(&gpu->perf_lock, flags); if (!gpu->perfcntr_active) goto out; @@ -620,6 +623,7 @@ void msm_gpu_perfcntr_start(struct msm_gpu *gpu) /* we could dynamically enable/disable perfcntr registers too.. */ gpu->last_sample.active = msm_gpu_active(gpu); gpu->last_sample.time = ktime_get(); + gpu->last_sample.busy_cycles = 0; gpu->activetime = gpu->totaltime = 0; gpu->perfcntr_active = true; update_hw_cntrs(gpu, 0, NULL); @@ -632,9 +636,22 @@ void msm_gpu_perfcntr_stop(struct msm_gpu *gpu) pm_runtime_put_sync(&gpu->pdev->dev); } +static void msm_gpu_hw_sample(struct msm_gpu *gpu, uint64_t *activetime, + uint64_t *totaltime) +{ + ktime_t time; + + *activetime = gpu->funcs->gpu_busy(gpu, + &gpu->last_sample.busy_cycles); + + time = ktime_get(); + *totaltime = ktime_us_delta(time, gpu->last_sample.time); + gpu->last_sample.time = time; +} + /* returns -errno or # of cntrs sampled */ -int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime, - uint32_t *totaltime, uint32_t ncntrs, uint32_t *cntrs) +int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint64_t *activetime, + uint64_t *totaltime, uint32_t ncntrs, uint32_t *cntrs) { unsigned long flags; int ret; @@ -646,13 +663,18 @@ int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime, goto out; } + ret = update_hw_cntrs(gpu, ncntrs, cntrs); + + if (gpu->funcs->gpu_busy) { + msm_gpu_hw_sample(gpu, activetime, totaltime); + goto out; + } + *activetime = gpu->activetime; *totaltime = gpu->totaltime; gpu->activetime = gpu->totaltime = 0; - ret = update_hw_cntrs(gpu, ncntrs, cntrs); - out: spin_unlock_irqrestore(&gpu->perf_lock, flags); diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 0ff23ca..7dc775f 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -90,6 +90,7 @@ struct msm_gpu { struct { bool active; ktime_t time; + u64 busy_cycles; } last_sample; uint32_t totaltime, activetime; /* sw counters */ uint32_t last_cntrs[5]; /* hw counters */ @@ -275,8 +276,8 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val) void msm_gpu_perfcntr_start(struct msm_gpu *gpu); void msm_gpu_perfcntr_stop(struct msm_gpu *gpu); -int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime, - uint32_t *totaltime, uint32_t ncntrs, uint32_t *cntrs); +int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint64_t *activetime, + uint64_t *totaltime, uint32_t ncntrs, uint32_t *cntrs); void msm_gpu_retire(struct msm_gpu *gpu); void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit, diff --git a/drivers/gpu/drm/msm/msm_perf.c b/drivers/gpu/drm/msm/msm_perf.c index 5ab21bd..318f7dd 100644 --- a/drivers/gpu/drm/msm/msm_perf.c +++ b/drivers/gpu/drm/msm/msm_perf.c @@ -17,7 +17,7 @@ /* For profiling, userspace can: * - * tail -f /sys/kernel/debug/dri//gpu + * tail -f /sys/kernel/debug/dri//perf * * This will enable performance counters/profiling to track the busy time * and any gpu specific performance counters that are supported. @@ -85,9 +85,9 @@ static int refill_buf(struct msm_perf_state *perf) } } else { /* Sample line: */ - uint32_t activetime = 0, totaltime = 0; + uint64_t activetime = 0, totaltime = 0; uint32_t cntrs[5]; - uint32_t val; + uint64_t val; int ret; /* sleep until next sample time: */ @@ -101,14 +101,14 @@ static int refill_buf(struct msm_perf_state *perf) return ret; val = totaltime ? 1000 * activetime / totaltime : 0; - n = snprintf(ptr, rem, "%3d.%d%%", val / 10, val % 10); + n = snprintf(ptr, rem, "%3llu.%llu%%", val / 10, val % 10); ptr += n; rem -= n; for (i = 0; i < ret; i++) { /* cycle counters (I think).. convert to MHz.. */ val = cntrs[i] / 10000; - n = snprintf(ptr, rem, "\t%5d.%02d", + n = snprintf(ptr, rem, "\t%5llu.%02llu", val / 100, val % 100); ptr += n; rem -= n;