[7/7] drm/i915/perf: add flushing ioctl

Message ID 20200303221905.25866-8-umesh.nerlige.ramappa@intel.com (mailing list archive)
State New, archived
Series drm/i915/perf: add OA interrupt support

Commit Message

Umesh Nerlige Ramappa March 3, 2020, 10:19 p.m. UTC
From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

With the parameters currently available for the i915-perf stream,
there are still situations that are not well covered:

If an application opens the stream with polling disabled (or at a very
low frequency) and the OA interrupt enabled, no data will be available
even though somewhere between nothing and half of the OA buffer worth of
data might have landed in memory.

To solve this issue we add a new flush ioctl on the perf stream that
forces the i915-perf driver to look at the state of the buffer when
called and makes any data available through both poll() & read() type
syscalls.

v2: Version the ioctl (Joonas)
v3: Rebase (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 18 ++++++++++++++++++
 include/uapi/drm/i915_drm.h      | 21 +++++++++++++++++++++
 2 files changed, 39 insertions(+)
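
A userspace caller might use the proposed ioctl roughly as follows. This is
only a sketch: flush_and_read() is a hypothetical helper, the stream fd is
assumed to have been opened elsewhere with non-blocking reads, and the ioctl
number is copied from this patch rather than from a released uapi header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Value proposed by this patch; assumed absent from installed headers. */
#ifndef I915_PERF_IOCTL_FLUSH_DATA
#define I915_PERF_IOCTL_FLUSH_DATA _IO('i', 0x3)
#endif

/*
 * Ask the driver to re-check the OA buffer, then read whatever it exposes.
 * stream_fd is assumed to be a non-blocking i915-perf stream fd.
 * Returns bytes read, 0 if nothing was available, or -errno on failure.
 */
static ssize_t flush_and_read(int stream_fd, void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL && len > 0);

	/* The perf ioctl handler in this patch returns -EINVAL for unknown requests. */
	if (ioctl(stream_fd, I915_PERF_IOCTL_FLUSH_DATA) < 0)
		return -errno;

	n = read(stream_fd, buf, len);
	if (n < 0)
		return (errno == EAGAIN) ? 0 : -errno;
	return n;
}
```

An application that has disabled the hrtimer would call such a helper at its
own pace, for example once per frame, instead of waiting for the next OA
interrupt.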

Comments

Dixit, Ashutosh March 4, 2020, 5:48 a.m. UTC | #1
On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:

[snip]

> +/**
> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
> + * @stream: An enabled i915 perf stream
> + *
> + * The intention is to flush all the data available for reading from the OA
> + * buffer
> + */
> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
> +{
> +	stream->pollin = oa_buffer_check(stream, true);
> +}

Since this function doesn't actually wake up any thread (which can anyway
be done by sending a signal to the blocked thread), is its only purpose
to update the OA buffer head/tail? If so, it is not clear why a separate
ioctl should be created for this. Can't the read() call itself call
oa_buffer_check() to update the OA buffer head/tail?

Again, just trying to minimize uapi changes if possible.
Lionel Landwerlin March 4, 2020, 8:52 a.m. UTC | #2
On 04/03/2020 07:48, Dixit, Ashutosh wrote:
> [snip]
>
>> +/**
>> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
>> + * @stream: An enabled i915 perf stream
>> + *
>> + * The intention is to flush all the data available for reading from the OA
>> + * buffer
>> + */
>> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
>> +{
>> +	stream->pollin = oa_buffer_check(stream, true);
>> +}
> Since this function doesn't actually wake up any thread (which anyway can
> be done by sending a signal to the blocked thread), is the only purpose of
> this function to update OA buffer head/tail? But in that it is not clear
> why a separate ioctl should be created for this, can't the read() call
> itself call oa_buffer_check() to update the OA buffer head/tail?
>
> Again just trying to minimize uapi changes if possible.

Most applications will call read() after being notified by 
poll()/select() that some data is available.

Changing that behavior will break some of the existing perf tests.


If any data is available, this new ioctl will wake up existing waiters 
on poll()/select().


-Lionel
Dixit, Ashutosh March 5, 2020, 5:56 a.m. UTC | #3
On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>
> Most applications will call read() after being notified by poll()/select()
> that some data is available.

Correct, this is the standard non-blocking read behavior.

> Changing that behavior will break some of the existing perf tests .

I am not suggesting changing that (the standard non-blocking read
behavior).

> If any data is available, this new ioctl will wake up existing waiters on
> poll()/select().

The issue is that we are not calling wake_up() in the above function to
wake up any blocked waiters. The ioctl will just update the OA buffer
head/tail so that (a) a subsequent blocking read will not block, (b) a
subsequent non-blocking read will return valid data (not -EAGAIN), or
(c) a poll/select will not block but return immediately, saying data is
available.

That is why it seems to me the ioctl is not required: updating the OA
buffer head/tail can be done as part of the read() (and the poll/select)
calls themselves.

We will investigate if this can be done and update the patches in the next
revision accordingly. Thanks!
Umesh Nerlige Ramappa March 9, 2020, 7:51 p.m. UTC | #4
On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
>On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>>
>> Most applications will call read() after being notified by poll()/select()
>> that some data is available.
>
>Correct this is the standard non blocking read behavior.
>
>> Changing that behavior will break some of the existing perf tests .
>
>I am not suggesting changing that (that standard non blocking read
>behavior).
>
>> If any data is available, this new ioctl will wake up existing waiters on
>> poll()/select().
>
>The issue is we are not calling wake_up() in the above function to wake up
>any blocked waiters. The ioctl will just update the OA buffer head/tail so
>that (a) a subsequent blocking read will not block, or (b) a subsequent non
>blocking read will return valid data (not -EAGAIN), or (c) a poll/select
>will not block but return immediately saying data is available.
>
>That is why it seems to me the ioctl is not required, updating the OA
>buffer head/tail can be done as part of the read() (and the poll/select)
>calls themselves.
>
>We will investigate if this can be done and update the patches in the next
>revision accordingly. Thanks!

In this case, where we are trying to determine whether there is any data
in the OA buffer before the next interrupt has fired, the user could call
poll() with a reasonable timeout to determine whether data is available.
That would eliminate the need for the flush ioctl. Thoughts?

Thanks,
Umesh
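
A minimal sketch of that approach, assuming a hypothetical helper name
wait_for_data() and any pollable fd in place of the perf stream:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/*
 * Wait up to timeout_ms for fd to become readable.
 * Returns 1 if data is available, 0 on timeout, or -errno on failure.
 * On EINTR the wait is simply restarted with the full timeout.
 */
static int wait_for_data(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	assert(fd >= 0);

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -errno;
	return ret > 0 && (pfd.revents & POLLIN);
}
```

Whether a timeout alone is enough depends on the driver re-checking the
buffer when poll() is called, which is the point debated in the replies.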
Umesh Nerlige Ramappa March 9, 2020, 9:15 p.m. UTC | #5
On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
>On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>>
>> Most applications will call read() after being notified by poll()/select()
>> that some data is available.
>
>Correct this is the standard non blocking read behavior.
>
>> Changing that behavior will break some of the existing perf tests .
>
>I am not suggesting changing that (that standard non blocking read
>behavior).
>
>> If any data is available, this new ioctl will wake up existing waiters on
>> poll()/select().
>
>The issue is we are not calling wake_up() in the above function to wake up
>any blocked waiters. The ioctl will just update the OA buffer head/tail so
>that (a) a subsequent blocking read will not block, or (b) a subsequent non
>blocking read will return valid data (not -EAGAIN), or (c) a poll/select
>will not block but return immediately saying data is available.
>
>That is why it seems to me the ioctl is not required, updating the OA
>buffer head/tail can be done as part of the read() (and the poll/select)
>calls themselves.
>
>We will investigate if this can be done and update the patches in the next
>revision accordingly. Thanks!

resending (cc: Lionel)..

In this case, where we are trying to determine whether there is any data
in the OA buffer before the next interrupt has fired, the user could call
poll() with a reasonable timeout to determine whether data is available.
That would eliminate the need for the flush ioctl. Thoughts?

Thanks,
Umesh
Lionel Landwerlin March 10, 2020, 8:44 p.m. UTC | #6
On 09/03/2020 21:51, Umesh Nerlige Ramappa wrote:
> In this case, where we are trying to determine if there is any data in 
> the oa buffer before the next interrupt has fired, user could call 
> poll with a reasonable timeout to determine if data is available or 
> not.  That would eliminate the need for the flush ioctl. Thoughts?
>
> Thanks,
> Umesh


I almost forgot why this would cause a problem.

Checking the state of the buffer every time you call poll() will pretty
much guarantee you have at least one report to read every time.

So that would lead to a lot more wakeups :(


The whole system has to stay "unidirectional", with either interrupts or
timeouts driving the wakeups.

This additional ioctl is the only solution I could find to add one more
input to the wakeup mechanism.


-Lionel
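
The wakeup concern can be illustrated with a toy model. This is not driver
code: count_wakeups() and its parameters are hypothetical, time is counted
in abstract ticks, and the reader is assumed to drain every pending report
on each wakeup.

```c
#include <assert.h>

/*
 * Count how often a waiter would be woken over `ticks` time units when the
 * hardware lands one report every `report_period` ticks and the buffer
 * state is examined every `check_period` ticks.
 */
static unsigned count_wakeups(unsigned ticks, unsigned report_period,
			      unsigned check_period)
{
	unsigned pending = 0, wakeups = 0;

	assert(report_period > 0 && check_period > 0);

	for (unsigned t = 1; t <= ticks; t++) {
		if (t % report_period == 0)
			pending++;		/* a new report landed */
		if (t % check_period == 0 && pending) {
			wakeups++;		/* waiter sees data */
			pending = 0;		/* and drains the buffer */
		}
	}
	return wakeups;
}
```

In this model, with a report every 2 ticks over 1000 ticks, examining the
buffer on every tick (as a check inside every poll() call would tend to do)
gives 500 wakeups, while a check every 10 ticks gives 100.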
Dixit, Ashutosh March 11, 2020, 3:05 a.m. UTC | #7
On Tue, 10 Mar 2020 13:44:30 -0700, Lionel Landwerlin wrote:
>
> On 09/03/2020 21:51, Umesh Nerlige Ramappa wrote:
> > In this case, where we are trying to determine if there is any data in
> > the oa buffer before the next interrupt has fired, user could call poll
> > with a reasonable timeout to determine if data is available or not.  That
> > would eliminate the need for the flush ioctl. Thoughts?
> >
> > Thanks,
> > Umesh
>
>
> I almost forgot why this would cause problem.
>
> Checking the state of the buffer every time you call poll() will pretty
> much guarantee you have at least one report to read every time.
>
> So that would lead to lot more wakeups :(
>
> The whole system has to stay "unidirectional" with either interrupts or
> timeout driving the wakeups.
>
> This additional ioctl is the only solution I could find to add one more
> input to the wakeup mechanism.

Well, aren't we asking the app to sleep for time T and then call flush
(followed by read)? Then we might as well ask them to sleep for time T and
call poll. Or we can ask them to set the hrtimer to T, skip the sleep and
call poll (followed by read). Aren't these 3 mechanisms equivalent? To me
the last option seems to be the cleanest. Thanks!
Lionel Landwerlin March 11, 2020, 4:35 p.m. UTC | #8
On 11/03/2020 05:05, Dixit, Ashutosh wrote:
> On Tue, 10 Mar 2020 13:44:30 -0700, Lionel Landwerlin wrote:
>> I almost forgot why this would cause problem.
>>
>> Checking the state of the buffer every time you call poll() will pretty
>> much guarantee you have at least one report to read every time.
>>
>> So that would lead to lot more wakeups :(
>>
>> The whole system has to stay "unidirectional" with either interrupts or
>> timeout driving the wakeups.
>>
>> This additional ioctl is the only solution I could find to add one more
>> input to the wakeup mechanism.
> Well, aren't we asking the app to sleep for time T and then call flush
> (followed by read)? Then we might as well ask them to sleep for time T and
> call poll? Or we can ask them set the hrtimer to T, skip the sleep and call
> poll (followed by read)? Aren't these 3 mechanisms equivalent? To me the
> last option seems to be the cleanest. Thanks!


I guess one thing that could work is to call oa_buffer_check() at the
top of i915_oa_read().

Maybe you proposed this in a previous email.


That way poll() doesn't trigger too many wakeups, and a read() right
before closing the FD will pull the remaining data.

One side effect of this is that read() might almost always return
something, because the HW writes to the OA buffer faster than the app
can call read().

Still, I think it works out.


What do you think?


-Lionel
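
The idea of checking the buffer at the top of read() can be sketched with a
toy userspace model. None of this is i915 code: struct toy_stream and the
toy_* functions are hypothetical stand-ins, with toy_buffer_check() playing
the role of oa_buffer_check().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the stream state; field names are hypothetical. */
struct toy_stream {
	unsigned hw_tail;	/* reports written by the hardware so far */
	unsigned sw_tail;	/* reports the driver has noticed */
	unsigned head;		/* reports already consumed by read() */
};

/* Catch up with the hardware tail; true if anything is unread. */
static bool toy_buffer_check(struct toy_stream *s)
{
	s->sw_tail = s->hw_tail;
	return s->sw_tail != s->head;
}

/* read() without a check: only sees what an earlier check made visible. */
static int toy_read(struct toy_stream *s)
{
	unsigned n;

	assert(s != NULL);
	n = s->sw_tail - s->head;
	if (n == 0)
		return -EAGAIN;
	s->head = s->sw_tail;
	return (int)n;
}

/* Variant discussed above: refresh the tail at the top of read(). */
static int toy_read_checked(struct toy_stream *s)
{
	toy_buffer_check(s);
	return toy_read(s);
}
```

In this model the poll path is untouched, so wakeups stay driven by the
interrupt or timer, while a final toy_read_checked() still picks up reports
that landed since the last check.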
Patch

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index ab41cba85b40..b6cb47e80b86 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3221,6 +3221,18 @@  static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 		stream->ops->disable(stream);
 }
 
+/**
+ * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
+ * @stream: An enabled i915 perf stream
+ *
+ * The intention is to flush all the data available for reading from the OA
+ * buffer
+ */
+static void i915_perf_flush_data(struct i915_perf_stream *stream)
+{
+	stream->pollin = oa_buffer_check(stream, true);
+}
+
 static long i915_perf_config_locked(struct i915_perf_stream *stream,
 				    unsigned long metrics_set)
 {
@@ -3282,6 +3294,9 @@  static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 		return 0;
 	case I915_PERF_IOCTL_CONFIG:
 		return i915_perf_config_locked(stream, arg);
+	case I915_PERF_IOCTL_FLUSH_DATA:
+		i915_perf_flush_data(stream);
+		return 0;
 	}
 
 	return -EINVAL;
@@ -4551,6 +4566,9 @@  int i915_perf_ioctl_version(void)
 	 *
 	 * 5: Add DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT paramter to
 	 *    enable/disable interrupts in OA.
+	 *
+	 * 6: Add ioctl to flush OA data before reading.
+	 *    I915_PERF_IOCTL_FLUSH_DATA
 	 */
 	return 5;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index f609ff4ceccb..3fd6bb189248 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2044,6 +2044,27 @@  struct drm_i915_perf_open_param {
  */
 #define I915_PERF_IOCTL_CONFIG	_IO('i', 0x2)
 
+/**
+ * Actively check the availability of data from a stream.
+ *
+ * A stream's data availability can be driven by two types of events:
+ *
+ *   - if enabled, the kernel's hrtimer checking the amount of available data
+ *     in the OA buffer through head/tail registers.
+ *
+ *   - if enabled, the OA unit's interrupt mechanism.
+ *
+ * The kernel hrtimer incurs the cost of running a callback at fixed time
+ * intervals, while the OA interrupt might only happen rarely. In the
+ * situation where the application has disabled the kernel's hrtimer and only
+ * uses the OA interrupt to know about available data, the application can
+ * request an active check of the available OA data through this ioctl. This
+ * will make any data in the OA buffer available with either poll() or read().
+ *
+ * This ioctl is available in perf revision 6.
+ */
+#define I915_PERF_IOCTL_FLUSH_DATA _IO('i', 0x3)
+
 /**
  * Common to all i915 perf records
  */