From patchwork Sat Sep 9 01:16:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dixit, Ashutosh" X-Patchwork-Id: 13377932 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BE252EEB565 for ; Sat, 9 Sep 2023 01:16:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7E1E810E0DB; Sat, 9 Sep 2023 01:16:32 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2AEAF10E093 for ; Sat, 9 Sep 2023 01:16:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694222190; x=1725758190; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hECeGbA6UIBWlITx3oG7l2xUJXmpAQPQBt5ZnZ68IIk=; b=dgNG9PQ88UqJ4aEDrtGQ1UJR9JQOsqsN28pGtOKdqnOfkJKWwyc2OdFK rQGrHoJgFU9uexHe6FS0sNj4VqM6SzXoGCAKE3ns8sLHdnzMqbLbWWfPI IEYJEzXFVvVZd8GvkELnG10Dy7Bg0lSPo3oAJyN2MHi29uFwrN+kXrO1T APHpsaxcOEMIh9X2zUBn/MA4IEtYEpj08ajNidrhUbTV9+tVlZQ9ocPSP v2MPHzXd9gcHY081h2ibg9j2tt7NGlrGsRWvc9U1zZ+kBaYQVYUBO/zbt aV3Lm15iFfu2u5c5QidU90jedydobPvYR5w6V8vJCdBJFudVTSgH88QwD w==; X-IronPort-AV: E=McAfee;i="6600,9927,10827"; a="441792746" X-IronPort-AV: E=Sophos;i="6.02,238,1688454000"; d="scan'208";a="441792746" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2023 18:16:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10827"; a="719342290" X-IronPort-AV: E=Sophos;i="6.02,238,1688454000"; d="scan'208";a="719342290" Received: from orsosgc001.jf.intel.com (HELO unerlige-ril.jf.intel.com) ([10.165.21.138]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2023 18:16:29 -0700 From: Ashutosh Dixit To: intel-gfx@lists.freedesktop.org Date: Fri, 8 Sep 2023 18:16:24 -0700 Message-ID: <20230909011626.1643734-2-ashutosh.dixit@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230909011626.1643734-1-ashutosh.dixit@intel.com> References: <20230909011626.1643734-1-ashutosh.dixit@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/3] drm/i915/perf: Subtract gtt_offset from hw_tail X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" The code in oa_buffer_check_unlocked() is correct only if the OA buffer is 16 MB aligned (which seems to be the case today in i915). However when the 16 MB alignment is dropped, when we "Subtract partial amount off the tail", the "& (OA_BUFFER_SIZE - 1)" operation in OA_TAKEN() will result in an incorrect hw_tail value. Therefore hw_tail must be brought to the same base as head and read_tail prior to OA_TAKEN by subtracting gtt_offset from hw_tail. Signed-off-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 018f42fff4cc0..ec0fc2934045a 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -565,6 +565,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) partial_report_size %= report_size; /* Subtract partial amount off the tail */ + hw_tail -= gtt_offset; hw_tail = OA_TAKEN(hw_tail, partial_report_size); /* NB: The head we observe here might effectively be a little From patchwork Sat Sep 9 01:16:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Dixit, Ashutosh" X-Patchwork-Id: 13377931 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 509C3EEB57A for ; Sat, 9 Sep 2023 01:16:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9FC1710E093; Sat, 9 Sep 2023 01:16:31 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7B28110E093 for ; Sat, 9 Sep 2023 01:16:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694222190; x=1725758190; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Bhf+7XaSRTDEX6nElJomyMGfpwM34lEfs43D5uAAVMU=; b=Xj3ZeL02ZmyTVcRS6umSdoWtPuV5Yjf1fl3JYR2gEqAUlUGeeH/2nMCr GPxCzClGEjbgTZ4z/cpzpumE6OyC+Yctdyjl4+HEgBMmbBv2IlX3LBOvB 3oLsSgA2niMFwWKMdymJMeT5lvUtZXMUBUNqGOe67yzEYeKBW4Y0EAprd ln9LqFTrVacRXNfIrt4wVAPG1FW7gkzlJH/R4c08MT27cDh7WLx6pAtfr SFK9c+4i9TyS9e2k+9SpajOps/Mq9QSX9ts8fXFfWJuDL4iIwWpvpt5BG Zq4rNh7H97p5/CcmPGlb83jz71xowryRF7bMMtIzRxy6Ug5yL82gcZol5 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10827"; a="441792747" X-IronPort-AV: E=Sophos;i="6.02,238,1688454000"; d="scan'208";a="441792747" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2023 18:16:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10827"; a="719342294" X-IronPort-AV: E=Sophos;i="6.02,238,1688454000"; d="scan'208";a="719342294" Received: from orsosgc001.jf.intel.com (HELO unerlige-ril.jf.intel.com) ([10.165.21.138]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2023 18:16:29 -0700 From: Ashutosh Dixit To: intel-gfx@lists.freedesktop.org Date: Fri, 8 Sep 2023 18:16:25 -0700 Message-ID: <20230909011626.1643734-3-ashutosh.dixit@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230909011626.1643734-1-ashutosh.dixit@intel.com> References: <20230909011626.1643734-1-ashutosh.dixit@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/3] drm/i915/perf: Remove gtt_offset from stream->oa_buffer.head/.tail X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" There is no reason to add gtt_offset to the cached head/tail pointers stream->oa_buffer.head and stream->oa_buffer.tail. This causes the code to constantly add gtt_offset and subtract gtt_offset and is error prone (e.g. see previous patch). It is much simpler to maintain stream->oa_buffer.head and stream->oa_buffer.tail without adding gtt_offset to them and just allow for the gtt_offset when reading/writing from/to HW registers. Signed-off-by: Ashutosh Dixit Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/i915/i915_perf.c | 53 ++++++++------------------------ 1 file changed, 13 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index ec0fc2934045a..1347e4ec9dd5a 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -543,10 +543,9 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) { u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); int report_size = stream->oa_buffer.format->size; - u32 head, tail, read_tail; + u32 tail, hw_tail; unsigned long flags; bool pollin; - u32 hw_tail; u32 partial_report_size; /* We have to consider the (unlikely) possibility that read() errors @@ -556,6 +555,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); hw_tail = stream->perf->ops.oa_hw_tail_read(stream); + hw_tail -= gtt_offset; /* The tail pointer increases in 64 byte increments, not in report_size * steps. Also the report size may not be a power of 2. Compute @@ -565,16 +565,8 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) partial_report_size %= report_size; /* Subtract partial amount off the tail */ - hw_tail -= gtt_offset; hw_tail = OA_TAKEN(hw_tail, partial_report_size); - /* NB: The head we observe here might effectively be a little - * out of date. If a read() is in progress, the head could be - * anywhere between this head and stream->oa_buffer.tail. - */ - head = stream->oa_buffer.head - gtt_offset; - read_tail = stream->oa_buffer.tail - gtt_offset; - tail = hw_tail; /* Walk the stream backward until we find a report with report @@ -588,7 +580,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) * memory in the order they were written to. * If not : (╯°□°)╯︵ ┻━┻ */ - while (OA_TAKEN(tail, read_tail) >= report_size) { + while (OA_TAKEN(tail, stream->oa_buffer.tail) >= report_size) { void *report = stream->oa_buffer.vaddr + tail; if (oa_report_id(stream, report) || @@ -602,9 +594,9 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) __ratelimit(&stream->perf->tail_pointer_race)) drm_notice(&stream->uncore->i915->drm, "unlanded report(s) head=0x%x tail=0x%x hw_tail=0x%x\n", - head, tail, hw_tail); + stream->oa_buffer.head, tail, hw_tail); - stream->oa_buffer.tail = gtt_offset + tail; + stream->oa_buffer.tail = tail; pollin = OA_TAKEN(stream->oa_buffer.tail, stream->oa_buffer.head) >= report_size; @@ -754,13 +746,6 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); - /* - * NB: oa_buffer.head/tail include the gtt_offset which we don't want - * while indexing relative to oa_buf_base. - */ - head -= gtt_offset; - tail -= gtt_offset; - /* * An out of bounds or misaligned head or tail pointer implies a driver * bug since we validate + align the tail pointers we read from the @@ -896,9 +881,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream, * We removed the gtt_offset for the copy loop above, indexing * relative to oa_buf_base so put back here... */ - head += gtt_offset; intel_uncore_write(uncore, oaheadptr, - head & GEN12_OAG_OAHEADPTR_MASK); + (head + gtt_offset) & GEN12_OAG_OAHEADPTR_MASK); stream->oa_buffer.head = head; spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); @@ -1043,12 +1027,6 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); - /* NB: oa_buffer.head/tail include the gtt_offset which we don't want - * while indexing relative to oa_buf_base. - */ - head -= gtt_offset; - tail -= gtt_offset; - /* An out of bounds or misaligned head or tail pointer implies a driver * bug since we validate + align the tail pointers we read from the * hardware and we are in full control of the head pointer which should @@ -1111,13 +1089,8 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, if (start_offset != *offset) { spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); - /* We removed the gtt_offset for the copy loop above, indexing - * relative to oa_buf_base so put back here... - */ - head += gtt_offset; - intel_uncore_write(uncore, GEN7_OASTATUS2, - (head & GEN7_OASTATUS2_HEAD_MASK) | + ((head + gtt_offset) & GEN7_OASTATUS2_HEAD_MASK) | GEN7_OASTATUS2_MEM_SELECT_GGTT); stream->oa_buffer.head = head; @@ -1705,7 +1678,7 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream) */ intel_uncore_write(uncore, GEN7_OASTATUS2, /* head */ gtt_offset | GEN7_OASTATUS2_MEM_SELECT_GGTT); - stream->oa_buffer.head = gtt_offset; + stream->oa_buffer.head = 0; intel_uncore_write(uncore, GEN7_OABUFFER, gtt_offset); @@ -1713,7 +1686,7 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream) gtt_offset | OABUFFER_SIZE_16M); /* Mark that we need updated tail pointers to read from... */ - stream->oa_buffer.tail = gtt_offset; + stream->oa_buffer.tail = 0; spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); @@ -1747,7 +1720,7 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream) intel_uncore_write(uncore, GEN8_OASTATUS, 0); intel_uncore_write(uncore, GEN8_OAHEADPTR, gtt_offset); - stream->oa_buffer.head = gtt_offset; + stream->oa_buffer.head = 0; intel_uncore_write(uncore, GEN8_OABUFFER_UDW, 0); @@ -1764,7 +1737,7 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream) intel_uncore_write(uncore, GEN8_OATAILPTR, gtt_offset & GEN8_OATAILPTR_MASK); /* Mark that we need updated tail pointers to read from... */ - stream->oa_buffer.tail = gtt_offset; + stream->oa_buffer.tail = 0; /* * Reset state used to recognise context switches, affecting which @@ -1801,7 +1774,7 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream) intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0); intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr, gtt_offset & GEN12_OAG_OAHEADPTR_MASK); - stream->oa_buffer.head = gtt_offset; + stream->oa_buffer.head = 0; /* * PRM says: @@ -1817,7 +1790,7 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream) gtt_offset & GEN12_OAG_OATAILPTR_MASK); /* Mark that we need updated tail pointers to read from... */ - stream->oa_buffer.tail = gtt_offset; + stream->oa_buffer.tail = 0; /* * Reset state used to recognise context switches, affecting which From patchwork Sat Sep 9 01:16:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dixit, Ashutosh" X-Patchwork-Id: 13377934 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DF81AEEB57A for ; Sat, 9 Sep 2023 01:16:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4789110E24A; Sat, 9 Sep 2023 01:16:35 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id A476010E0DB for ; Sat, 9 Sep 2023 01:16:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694222190; x=1725758190; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pm9VQ7vKSeJHM0OblBpciqL0JoHjXzpxwBTYtBFTOUI=; b=O9NdBzU25I5RZY415+Z4Q7eKc5nEp8c0utxB1AmQQrl1HQ8r3uPzRp8O OayKR7LyiUifJsh2kn75ArHJQ6JOCEggSS6BuustlAe9OdVbFHde/uJru SwRPsgwIYaUxnxjVqqWIwXlvyVe6PGy1qU5ooE6ekSTtk5aXYDvgebqEP GmQp4xV++kYf9fBMZFT3yYFpnVHyKbHGYGdSsojSbAOa8fFpjfxC8nHE5 7uPhAeItaMuJ66zvcLdXZ7+gAS0ltrQODHgcOvcqShj56MGEioljf+V7W caScFQhO3ZgWB6aNOnvqID3HN5DrBgzkR6g0wTR5PIKs99gUMDrpSR7ge A==; X-IronPort-AV: E=McAfee;i="6600,9927,10827"; a="441792748" X-IronPort-AV: E=Sophos;i="6.02,238,1688454000"; d="scan'208";a="441792748" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2023 18:16:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10827"; a="719342296" X-IronPort-AV: E=Sophos;i="6.02,238,1688454000"; d="scan'208";a="719342296" Received: from orsosgc001.jf.intel.com (HELO unerlige-ril.jf.intel.com) ([10.165.21.138]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2023 18:16:29 -0700 From: Ashutosh Dixit To: intel-gfx@lists.freedesktop.org Date: Fri, 8 Sep 2023 18:16:26 -0700 Message-ID: <20230909011626.1643734-4-ashutosh.dixit@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230909011626.1643734-1-ashutosh.dixit@intel.com> References: <20230909011626.1643734-1-ashutosh.dixit@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/3] drm/i915/perf: Initialize gen12 OA buffer unconditionally X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Umesh Nerlige Ramappa Correct values for OAR counters are still dependent on enabling the GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE in OAG_OACONTROL. Enabling this bit means OAG unit will write reports to the OAG buffer, so initialize the OAG buffer unconditionally for all use cases. BSpec: 46822 Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_perf.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 1347e4ec9dd5a..30cf37d6e79be 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3032,12 +3032,12 @@ static void gen12_oa_enable(struct i915_perf_stream *stream) u32 val; /* - * If we don't want OA reports from the OA buffer, then we don't even - * need to program the OAG unit. + * BSpec: 46822 + * Correct values for OAR counters are still dependent on enabling the + * GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE in OAG_OACONTROL. Enabling this + * bit means OAG unit will write reports to the OAG buffer, so + * initialize the OAG buffer correctly. */ - if (!(stream->sample_flags & SAMPLE_OA_REPORT)) - return; - gen12_init_oa_buffer(stream); regs = __oa_regs(stream);