diff mbox series

[3/3] i915/perf: Drop the aged_tail from rewind logic

Message ID 20230531235634.1309525-4-umesh.nerlige.ramappa@intel.com (mailing list archive)
State New, archived
Headers show
Series Avoid reading OA reports before they land | expand

Commit Message

Umesh Nerlige Ramappa May 31, 2023, 11:56 p.m. UTC
Instead of aged_tail use an iterator that starts from the hw_tail and
goes backward until the oa_buffer.tail looking for valid reports.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

Comments

Dixit, Ashutosh June 1, 2023, 4:13 a.m. UTC | #1
On Wed, 31 May 2023 16:56:34 -0700, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> Instead of aged_tail use an iterator that starts from the hw_tail and
> goes backward until the oa_buffer.tail looking for valid reports.

Hmm I don't think this description is correct. All this patch is doing is
the following:

a. s/aged_tail/tail/
b. s/tail/iter/

So basically I don't think we need this patch. All we want to do here is
change the variable name aged_tail to something else (to completely remove
the concept of aging from the OA code) but other changes such as name
change to iter etc. is unnecessary.

So I would just keep the patch simple and change the name aged_tail to
advertized_tail or exported_tail or read_tail, because basically
stream->oa_buffer.tail is the tail which the writer updates (or advertizes
or exports) for the reader.

So we only should rename aged_tail here, the other changes are not needed.

We could even squash this change into Patch 1 or Patch 2, since it is
really a trivial variable rename.

Thanks.
--
Ashutosh


>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_perf.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index beb1269422ca..39f5ab1911c8 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -543,7 +543,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>  {
>	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
>	int report_size = stream->oa_buffer.format->size;
> -	u32 head, tail, aged_tail;
> +	u32 head, tail, iter;
>	unsigned long flags;
>	bool pollin;
>	u32 hw_tail;
> @@ -567,15 +567,14 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>	/* Subtract partial amount off the tail */
>	hw_tail = OA_TAKEN(hw_tail, partial_report_size);
>
> -
>	/* NB: The head we observe here might effectively be a little
>	 * out of date. If a read() is in progress, the head could be
>	 * anywhere between this head and stream->oa_buffer.tail.
>	 */
>	head = stream->oa_buffer.head - gtt_offset;
> -	aged_tail = stream->oa_buffer.tail - gtt_offset;
> +	tail = stream->oa_buffer.tail - gtt_offset;
>
> -	tail = hw_tail;
> +	iter = hw_tail;
>
>		/* Walk the stream backward until we find a report with report
>		 * id and timestmap not at 0. Since the circular buffer pointers
> @@ -588,23 +587,23 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>	 * memory in the order they were written to.
>	 * If not : (╯°□°)╯︵ ┻━┻
>	 */
> -	while (OA_TAKEN(tail, aged_tail) >= report_size) {
> -		void *report = stream->oa_buffer.vaddr + tail;
> +	while (OA_TAKEN(iter, tail) >= report_size) {
> +		void *report = stream->oa_buffer.vaddr + iter;
>
>		if (oa_report_id(stream, report) ||
>		    oa_timestamp(stream, report))
>			break;
>
> -		tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
> +		iter = (iter - report_size) & (OA_BUFFER_SIZE - 1);
>	}
>
> -	if (OA_TAKEN(hw_tail, tail) > report_size &&
> +	if (OA_TAKEN(hw_tail, iter) > report_size &&
>	    __ratelimit(&stream->perf->tail_pointer_race))
>		drm_notice(&stream->uncore->i915->drm,
>			   "unlanded report(s) head=0x%x tail=0x%x hw_tail=0x%x\n",
>		 head, tail, hw_tail);
>
> -	stream->oa_buffer.tail = gtt_offset + tail;
> +	stream->oa_buffer.tail = gtt_offset + iter;
>
>	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>			  stream->oa_buffer.head - gtt_offset) >= report_size;
> --
> 2.36.1
>
Umesh Nerlige Ramappa June 1, 2023, 6:16 p.m. UTC | #2
On Wed, May 31, 2023 at 09:13:02PM -0700, Dixit, Ashutosh wrote:
>On Wed, 31 May 2023 16:56:34 -0700, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> Instead of aged_tail use an iterator that starts from the hw_tail and
>> goes backward until the oa_buffer.tail looking for valid reports.
>
>Hmm I don't think this description is correct. All this patch is doing is
>the following:
>
>a. s/aged_tail/tail/
>b. s/tail/iter/
>
>So basically I don't think we need this patch. All we want to do here is
>change the variable name aged_tail to something else (to completely remove
>the concept of aging from the OA code) but other changes such as name
>change to iter etc. is unnecessary.
>
>So I would just keep the patch simple and change the name aged_tail to
>advertized_tail or exported_tail or read_tail, because basically
>stream->oa_buffer.tail is the tail which the writer updates (or advertizes
>or exports) for the reader.
>
>So we only should rename aged_tail here, the other changes are not needed.
>
>We could even squash this change into Patch 1 or Patch 2, since it is
>really a trivial variable rename.

The whole point was just readability. head/tail point to what the user 
consumes. hw_tail points to the actual hw register value and iter is 
just loop iterator.

Since the intent of the series is to just get rid of aging/aged logic, I 
can just s/aged_tail/read_tail/ and squash it with 1 since it belongs 
more to 1 than 2, although, I still like the my current patch (maybe 
with additional description in the commit message to clarify that the 
patch is just renames for readability).

Will post next rev with the simple rename and squash.

Thanks,
Umesh

>
>Thanks.
>--
>Ashutosh
>
>
>>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_perf.c | 17 ++++++++---------
>>  1 file changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index beb1269422ca..39f5ab1911c8 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -543,7 +543,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>  {
>>	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
>>	int report_size = stream->oa_buffer.format->size;
>> -	u32 head, tail, aged_tail;
>> +	u32 head, tail, iter;
>>	unsigned long flags;
>>	bool pollin;
>>	u32 hw_tail;
>> @@ -567,15 +567,14 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>	/* Subtract partial amount off the tail */
>>	hw_tail = OA_TAKEN(hw_tail, partial_report_size);
>>
>> -
>>	/* NB: The head we observe here might effectively be a little
>>	 * out of date. If a read() is in progress, the head could be
>>	 * anywhere between this head and stream->oa_buffer.tail.
>>	 */
>>	head = stream->oa_buffer.head - gtt_offset;
>> -	aged_tail = stream->oa_buffer.tail - gtt_offset;
>> +	tail = stream->oa_buffer.tail - gtt_offset;
>>
>> -	tail = hw_tail;
>> +	iter = hw_tail;
>>
>>		/* Walk the stream backward until we find a report with report
>>		 * id and timestmap not at 0. Since the circular buffer pointers
>> @@ -588,23 +587,23 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>	 * memory in the order they were written to.
>>	 * If not : (╯°□°)╯︵ ┻━┻
>>	 */
>> -	while (OA_TAKEN(tail, aged_tail) >= report_size) {
>> -		void *report = stream->oa_buffer.vaddr + tail;
>> +	while (OA_TAKEN(iter, tail) >= report_size) {
>> +		void *report = stream->oa_buffer.vaddr + iter;
>>
>>		if (oa_report_id(stream, report) ||
>>		    oa_timestamp(stream, report))
>>			break;
>>
>> -		tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
>> +		iter = (iter - report_size) & (OA_BUFFER_SIZE - 1);
>>	}
>>
>> -	if (OA_TAKEN(hw_tail, tail) > report_size &&
>> +	if (OA_TAKEN(hw_tail, iter) > report_size &&
>>	    __ratelimit(&stream->perf->tail_pointer_race))
>>		drm_notice(&stream->uncore->i915->drm,
>>			   "unlanded report(s) head=0x%x tail=0x%x hw_tail=0x%x\n",
>>		 head, tail, hw_tail);
>>
>> -	stream->oa_buffer.tail = gtt_offset + tail;
>> +	stream->oa_buffer.tail = gtt_offset + iter;
>>
>>	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
>>			  stream->oa_buffer.head - gtt_offset) >= report_size;
>> --
>> 2.36.1
>>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index beb1269422ca..39f5ab1911c8 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -543,7 +543,7 @@  static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 {
 	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
 	int report_size = stream->oa_buffer.format->size;
-	u32 head, tail, aged_tail;
+	u32 head, tail, iter;
 	unsigned long flags;
 	bool pollin;
 	u32 hw_tail;
@@ -567,15 +567,14 @@  static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 	/* Subtract partial amount off the tail */
 	hw_tail = OA_TAKEN(hw_tail, partial_report_size);
 
-
 	/* NB: The head we observe here might effectively be a little
 	 * out of date. If a read() is in progress, the head could be
 	 * anywhere between this head and stream->oa_buffer.tail.
 	 */
 	head = stream->oa_buffer.head - gtt_offset;
-	aged_tail = stream->oa_buffer.tail - gtt_offset;
+	tail = stream->oa_buffer.tail - gtt_offset;
 
-	tail = hw_tail;
+	iter = hw_tail;
 
 		/* Walk the stream backward until we find a report with report
 		 * id and timestmap not at 0. Since the circular buffer pointers
@@ -588,23 +587,23 @@  static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 	 * memory in the order they were written to.
 	 * If not : (╯°□°)╯︵ ┻━┻
 	 */
-	while (OA_TAKEN(tail, aged_tail) >= report_size) {
-		void *report = stream->oa_buffer.vaddr + tail;
+	while (OA_TAKEN(iter, tail) >= report_size) {
+		void *report = stream->oa_buffer.vaddr + iter;
 
 		if (oa_report_id(stream, report) ||
 		    oa_timestamp(stream, report))
 			break;
 
-		tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
+		iter = (iter - report_size) & (OA_BUFFER_SIZE - 1);
 	}
 
-	if (OA_TAKEN(hw_tail, tail) > report_size &&
+	if (OA_TAKEN(hw_tail, iter) > report_size &&
 	    __ratelimit(&stream->perf->tail_pointer_race))
 		drm_notice(&stream->uncore->i915->drm,
 			   "unlanded report(s) head=0x%x tail=0x%x hw_tail=0x%x\n",
 		 head, tail, hw_tail);
 
-	stream->oa_buffer.tail = gtt_offset + tail;
+	stream->oa_buffer.tail = gtt_offset + iter;
 
 	pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
 			  stream->oa_buffer.head - gtt_offset) >= report_size;