diff mbox

prime_self_import: Assure no pending requests before object counting

Message ID 1383310422-30702-1-git-send-email-oscar.mateo@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

oscar.mateo@intel.com Nov. 1, 2013, 12:53 p.m. UTC
From: Oscar Mateo <oscar.mateo@intel.com>

We don't want a previously used object to be freed in the middle of a
before/after object counting operation (or we would get a "-1 objects
leaked" message). We have seen this happening, e.g., when a context
from a previous run dies, but its backing object is alive waiting for
a retire_work to kick in.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
---
 tests/prime_self_import.c |   29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

Comments

Daniel Vetter Nov. 1, 2013, 4:08 p.m. UTC | #1
On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> We don't want a previously used object to be freed in the middle of a
> before/after object counting operation (or we would get a "-1 objects
> leaked" message). We have seen this happening, e.g., when a context
> from a previous run dies, but its backing object is alive waiting for
> a retire_work to kick in.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>

Nice catch. Should we do this in general as part of our gem_quiescent_gpu
helper? All i-g-t testcase are written under the assumption that they
completel own the gpu and that the gtt is completely empty besides the few
driver-allocated and pinned objects. So trying really hard to get rid of
any residual stuff sounds like a good idea.

Then we could just add a call to gem_quiescent_gpu to the get_object_count
function.

Care to rework the patch a bit and make sure it all still works?

Thanks, Daniel

> ---
>  tests/prime_self_import.c |   29 +++++++++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/prime_self_import.c b/tests/prime_self_import.c
> index 481a809..e48abd1 100644
> --- a/tests/prime_self_import.c
> +++ b/tests/prime_self_import.c
> @@ -211,6 +211,25 @@ static void test_with_one_bo(void)
>  	check_bo(fd2, handle_import1, fd2, handle_import1);
>  }
>  
> +static void retire_requests(void)
> +{
> +	char fname[FILENAME_MAX];
> +	int drop_caches_fd;
> +	const char *data = "0x4";
> +
> +	snprintf(fname, FILENAME_MAX, "%s/%i/%s",
> +		 "/sys/kernel/debug/dri", drm_get_card(),
> +		 "i915_gem_drop_caches");
> +
> +	drop_caches_fd = open(fname, O_WRONLY);
> +
> +	if (drop_caches_fd >= 0)
> +	{
> +		write(drop_caches_fd, data, strlen(data) + 1);
> +		close(drop_caches_fd);
> +	}
> +}
> +
>  static int get_object_count(void)
>  {
>  	FILE *file;
> @@ -252,10 +271,13 @@ static void test_reimport_close_race(void)
>  	pthread_t *threads;
>  	int r, i, num_threads;
>  	int fds[2];
> -	int obj_count = get_object_count();
> +	int obj_count;
>  	void *status;
>  	uint32_t handle;
>  
> +	retire_requests();
> +	obj_count = get_object_count();
> +
>  	num_threads = sysconf(_SC_NPROCESSORS_ONLN);
>  
>  	threads = calloc(num_threads, sizeof(pthread_t));
> @@ -330,9 +352,12 @@ static void test_export_close_race(void)
>  	pthread_t *threads;
>  	int r, i, num_threads;
>  	int fd;
> -	int obj_count = get_object_count();
> +	int obj_count;
>  	void *status;
>  
> +	retire_requests();
> +	obj_count = get_object_count();
> +
>  	num_threads = sysconf(_SC_NPROCESSORS_ONLN);
>  
>  	threads = calloc(num_threads, sizeof(pthread_t));
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Ben Widawsky Nov. 1, 2013, 4:16 p.m. UTC | #2
On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> We don't want a previously used object to be freed in the middle of a
> before/after object counting operation (or we would get a "-1 objects
> leaked" message). We have seen this happening, e.g., when a context
> from a previous run dies, but its backing object is alive waiting for
> a retire_work to kick in.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>

Since as far as we know this can't happen until the 'context per fd'
code lands, I think we may as well hold off.

Instead how about 3 patches:

1. Introduce a library version of drop caches in igt. Move existing
callers.
2. Update prime_self_import.c
3. Update gem_flink_close

Meanwhile, I will merge this patch to the PPGTT IGT branch, so QA will
use it.

What do you think? If you don't want to do that, then simply merging
this afterwards is also fine with me.

[snip]
Ben Widawsky Nov. 1, 2013, 4:18 p.m. UTC | #3
On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> > 
> > We don't want a previously used object to be freed in the middle of a
> > before/after object counting operation (or we would get a "-1 objects
> > leaked" message). We have seen this happening, e.g., when a context
> > from a previous run dies, but its backing object is alive waiting for
> > a retire_work to kick in.
> > 
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> 
> Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> helper? All i-g-t testcase are written under the assumption that they
> completel own the gpu and that the gtt is completely empty besides the few
> driver-allocated and pinned objects. So trying really hard to get rid of
> any residual stuff sounds like a good idea.

I was going to address this in the other mail thread.... in any case, I
think not. I believe a separate helper is the way to go, and we should
only call it when we absolutely want to.

Though it's not the intention, I've seen many tests fail because of
previous state, and I don't want to miss out on those in the future. It
would also slow down the run unnecessarily further.

> 
> Then we could just add a call to gem_quiescent_gpu to the get_object_count
> function.
> 
> Care to rework the patch a bit and make sure it all still works?
> 
> Thanks, Daniel
> 
> > ---
> >  tests/prime_self_import.c |   29 +++++++++++++++++++++++++++--
> >  1 file changed, 27 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tests/prime_self_import.c b/tests/prime_self_import.c
> > index 481a809..e48abd1 100644
> > --- a/tests/prime_self_import.c
> > +++ b/tests/prime_self_import.c
> > @@ -211,6 +211,25 @@ static void test_with_one_bo(void)
> >  	check_bo(fd2, handle_import1, fd2, handle_import1);
> >  }
> >  
> > +static void retire_requests(void)
> > +{
> > +	char fname[FILENAME_MAX];
> > +	int drop_caches_fd;
> > +	const char *data = "0x4";
> > +
> > +	snprintf(fname, FILENAME_MAX, "%s/%i/%s",
> > +		 "/sys/kernel/debug/dri", drm_get_card(),
> > +		 "i915_gem_drop_caches");
> > +
> > +	drop_caches_fd = open(fname, O_WRONLY);
> > +
> > +	if (drop_caches_fd >= 0)
> > +	{
> > +		write(drop_caches_fd, data, strlen(data) + 1);
> > +		close(drop_caches_fd);
> > +	}
> > +}
> > +
> >  static int get_object_count(void)
> >  {
> >  	FILE *file;
> > @@ -252,10 +271,13 @@ static void test_reimport_close_race(void)
> >  	pthread_t *threads;
> >  	int r, i, num_threads;
> >  	int fds[2];
> > -	int obj_count = get_object_count();
> > +	int obj_count;
> >  	void *status;
> >  	uint32_t handle;
> >  
> > +	retire_requests();
> > +	obj_count = get_object_count();
> > +
> >  	num_threads = sysconf(_SC_NPROCESSORS_ONLN);
> >  
> >  	threads = calloc(num_threads, sizeof(pthread_t));
> > @@ -330,9 +352,12 @@ static void test_export_close_race(void)
> >  	pthread_t *threads;
> >  	int r, i, num_threads;
> >  	int fd;
> > -	int obj_count = get_object_count();
> > +	int obj_count;
> >  	void *status;
> >  
> > +	retire_requests();
> > +	obj_count = get_object_count();
> > +
> >  	num_threads = sysconf(_SC_NPROCESSORS_ONLN);
> >  
> >  	threads = calloc(num_threads, sizeof(pthread_t));
> > -- 
> > 1.7.9.5
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Lespiau, Damien Nov. 1, 2013, 4:22 p.m. UTC | #4
On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> +static void retire_requests(void)
> +{
> +	char fname[FILENAME_MAX];
> +	int drop_caches_fd;
> +	const char *data = "0x4";
> +
> +	snprintf(fname, FILENAME_MAX, "%s/%i/%s",
> +		 "/sys/kernel/debug/dri", drm_get_card(),
> +		 "i915_gem_drop_caches");
> +
> +	drop_caches_fd = open(fname, O_WRONLY);

This can be replaced by the newly introduced igt_debugfs_t facilities
(that take care or where to find the debugfs mount point and allows to
have fixes aroud that located in one place):

    igt_debugfs_t debugfs;

    igt_debugfs_init(&debugfs);
    fd = igt_debugfs_open(&debugfs, "i915_gem_drop_caches", O_WRONLY);
Daniel Vetter Nov. 1, 2013, 6:42 p.m. UTC | #5
On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > 
> > > We don't want a previously used object to be freed in the middle of a
> > > before/after object counting operation (or we would get a "-1 objects
> > > leaked" message). We have seen this happening, e.g., when a context
> > > from a previous run dies, but its backing object is alive waiting for
> > > a retire_work to kick in.
> > > 
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > 
> > Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> > helper? All i-g-t testcase are written under the assumption that they
> > completel own the gpu and that the gtt is completely empty besides the few
> > driver-allocated and pinned objects. So trying really hard to get rid of
> > any residual stuff sounds like a good idea.
> 
> I was going to address this in the other mail thread.... in any case, I
> think not. I believe a separate helper is the way to go, and we should
> only call it when we absolutely want to.
> 
> Though it's not the intention, I've seen many tests fail because of
> previous state, and I don't want to miss out on those in the future. It
> would also slow down the run unnecessarily further.

We already do rather eregious stuff in quiescent. So I want hard numbers
on your claim that it slows down stuff further - there really shouldn't be
much at all to retire/evict.
-Daniel
Ben Widawsky Nov. 1, 2013, 6:44 p.m. UTC | #6
On Fri, Nov 01, 2013 at 07:42:59PM +0100, Daniel Vetter wrote:
> On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> > On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > 
> > > > We don't want a previously used object to be freed in the middle of a
> > > > before/after object counting operation (or we would get a "-1 objects
> > > > leaked" message). We have seen this happening, e.g., when a context
> > > > from a previous run dies, but its backing object is alive waiting for
> > > > a retire_work to kick in.
> > > > 
> > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> > > helper? All i-g-t testcase are written under the assumption that they
> > > completel own the gpu and that the gtt is completely empty besides the few
> > > driver-allocated and pinned objects. So trying really hard to get rid of
> > > any residual stuff sounds like a good idea.
> > 
> > I was going to address this in the other mail thread.... in any case, I
> > think not. I believe a separate helper is the way to go, and we should
> > only call it when we absolutely want to.
> > 
> > Though it's not the intention, I've seen many tests fail because of
> > previous state, and I don't want to miss out on those in the future. It
> > would also slow down the run unnecessarily further.
> 
> We already do rather eregious stuff in quiescent. So I want hard numbers
> on your claim that it slows down stuff further - there really shouldn't be
> much at all to retire/evict.
> -Daniel

I don't like any of those arbitrary calls to quiescent either fwiw.

Can't I make the same demand for data BEFORE we merge the patch that it
doesn't slow anything down?
Daniel Vetter Nov. 1, 2013, 6:47 p.m. UTC | #7
On Fri, Nov 01, 2013 at 11:44:40AM -0700, Ben Widawsky wrote:
> On Fri, Nov 01, 2013 at 07:42:59PM +0100, Daniel Vetter wrote:
> > On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> > > On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > > > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > > 
> > > > > We don't want a previously used object to be freed in the middle of a
> > > > > before/after object counting operation (or we would get a "-1 objects
> > > > > leaked" message). We have seen this happening, e.g., when a context
> > > > > from a previous run dies, but its backing object is alive waiting for
> > > > > a retire_work to kick in.
> > > > > 
> > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > > 
> > > > Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> > > > helper? All i-g-t testcase are written under the assumption that they
> > > > completel own the gpu and that the gtt is completely empty besides the few
> > > > driver-allocated and pinned objects. So trying really hard to get rid of
> > > > any residual stuff sounds like a good idea.
> > > 
> > > I was going to address this in the other mail thread.... in any case, I
> > > think not. I believe a separate helper is the way to go, and we should
> > > only call it when we absolutely want to.
> > > 
> > > Though it's not the intention, I've seen many tests fail because of
> > > previous state, and I don't want to miss out on those in the future. It
> > > would also slow down the run unnecessarily further.
> > 
> > We already do rather eregious stuff in quiescent. So I want hard numbers
> > on your claim that it slows down stuff further - there really shouldn't be
> > much at all to retire/evict.
> > -Daniel
> 
> I don't like any of those arbitrary calls to quiescent either fwiw.
> 
> Can't I make the same demand for data BEFORE we merge the patch that it
> doesn't slow anything down?

All those "arbitrary calls to quiescent" actually fixed spurious igt
failures. igts are written under the assumption that _nothing_ else is
going on in gpu-land, since otherwise it's just impossible to hit some
races. So this is matter of correctness first and speed second.
-Daniel
Ben Widawsky Nov. 1, 2013, 6:52 p.m. UTC | #8
On Fri, Nov 01, 2013 at 07:47:37PM +0100, Daniel Vetter wrote:
> On Fri, Nov 01, 2013 at 11:44:40AM -0700, Ben Widawsky wrote:
> > On Fri, Nov 01, 2013 at 07:42:59PM +0100, Daniel Vetter wrote:
> > > On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> > > > On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > > > > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> > > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > 
> > > > > > We don't want a previously used object to be freed in the middle of a
> > > > > > before/after object counting operation (or we would get a "-1 objects
> > > > > > leaked" message). We have seen this happening, e.g., when a context
> > > > > > from a previous run dies, but its backing object is alive waiting for
> > > > > > a retire_work to kick in.
> > > > > > 
> > > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > > > 
> > > > > Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> > > > > helper? All i-g-t testcase are written under the assumption that they
> > > > > completel own the gpu and that the gtt is completely empty besides the few
> > > > > driver-allocated and pinned objects. So trying really hard to get rid of
> > > > > any residual stuff sounds like a good idea.
> > > > 
> > > > I was going to address this in the other mail thread.... in any case, I
> > > > think not. I believe a separate helper is the way to go, and we should
> > > > only call it when we absolutely want to.
> > > > 
> > > > Though it's not the intention, I've seen many tests fail because of
> > > > previous state, and I don't want to miss out on those in the future. It
> > > > would also slow down the run unnecessarily further.
> > > 
> > > We already do rather eregious stuff in quiescent. So I want hard numbers
> > > on your claim that it slows down stuff further - there really shouldn't be
> > > much at all to retire/evict.
> > > -Daniel
> > 
> > I don't like any of those arbitrary calls to quiescent either fwiw.
> > 
> > Can't I make the same demand for data BEFORE we merge the patch that it
> > doesn't slow anything down?
> 
> All those "arbitrary calls to quiescent" actually fixed spurious igt
> failures. igts are written under the assumption that _nothing_ else is
> going on in gpu-land, since otherwise it's just impossible to hit some
> races. So this is matter of correctness first and speed second.
> -Daniel

They should be called where there are "spurious" errors and have an
understanding why it's required to do. Sprinkling synchronizing code all
over the place and calling it a fix is false. It's a "workaround" at
best, but more likely dearth of time to do it properly. I can live with
either honestly. I can't live with the statement that it's the proper
thing to do.

Very few tests we have will actually care that _nothing_ else is
running, and if they do, annotations in code via quiescent calls is a
nice way to document it.
Daniel Vetter Nov. 1, 2013, 7:07 p.m. UTC | #9
On Fri, Nov 01, 2013 at 11:52:48AM -0700, Ben Widawsky wrote:
> On Fri, Nov 01, 2013 at 07:47:37PM +0100, Daniel Vetter wrote:
> > On Fri, Nov 01, 2013 at 11:44:40AM -0700, Ben Widawsky wrote:
> > > On Fri, Nov 01, 2013 at 07:42:59PM +0100, Daniel Vetter wrote:
> > > > On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> > > > > On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > > > > > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com wrote:
> > > > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > > 
> > > > > > > We don't want a previously used object to be freed in the middle of a
> > > > > > > before/after object counting operation (or we would get a "-1 objects
> > > > > > > leaked" message). We have seen this happening, e.g., when a context
> > > > > > > from a previous run dies, but its backing object is alive waiting for
> > > > > > > a retire_work to kick in.
> > > > > > > 
> > > > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > > > > 
> > > > > > Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> > > > > > helper? All i-g-t testcase are written under the assumption that they
> > > > > > completel own the gpu and that the gtt is completely empty besides the few
> > > > > > driver-allocated and pinned objects. So trying really hard to get rid of
> > > > > > any residual stuff sounds like a good idea.
> > > > > 
> > > > > I was going to address this in the other mail thread.... in any case, I
> > > > > think not. I believe a separate helper is the way to go, and we should
> > > > > only call it when we absolutely want to.
> > > > > 
> > > > > Though it's not the intention, I've seen many tests fail because of
> > > > > previous state, and I don't want to miss out on those in the future. It
> > > > > would also slow down the run unnecessarily further.
> > > > 
> > > > We already do rather eregious stuff in quiescent. So I want hard numbers
> > > > on your claim that it slows down stuff further - there really shouldn't be
> > > > much at all to retire/evict.
> > > > -Daniel
> > > 
> > > I don't like any of those arbitrary calls to quiescent either fwiw.
> > > 
> > > Can't I make the same demand for data BEFORE we merge the patch that it
> > > doesn't slow anything down?
> > 
> > All those "arbitrary calls to quiescent" actually fixed spurious igt
> > failures. igts are written under the assumption that _nothing_ else is
> > going on in gpu-land, since otherwise it's just impossible to hit some
> > races. So this is matter of correctness first and speed second.
> > -Daniel
> 
> They should be called where there are "spurious" errors and have an
> understanding why it's required to do. Sprinkling synchronizing code all
> over the place and calling it a fix is false. It's a "workaround" at
> best, but more likely dearth of time to do it properly. I can live with
> either honestly. I can't live with the statement that it's the proper
> thing to do.
> 
> Very few tests we have will actually care that _nothing_ else is
> running, and if they do, annotations in code via quiescent calls is a
> nice way to document it.

Atm a call to quiescent_gpu on an idle machine takes roughly 25us (in a
loop of 100k, snb laptop). You're optimizing the wrong thing.

Also, as long as everyone bitches and moans about igt tests being unstable
I'm leaning _massively_ towards stable tests results. And I've really seen
too many igt tests fail spuriously so that I've decided to go back to
an unconditional to quiescent_gpu (it wasn't like that originally).
-Daniel
oscar.mateo@intel.com Nov. 4, 2013, 4:45 p.m. UTC | #10
Ok, I have sent a series of four patches that address Ben´s and Damien´s comments, plus a fix for a typo I found in gem_*_reloc. I have left out the igt_drop_caches_set() call inside gem_quiescent_gpu() because it is still being disputed, but I could easily include this patch as well.

IMHO, gem_quiescent_gpu() does need the drop cache call, otherwise it doesn´t really do what it advertises. However, calling gem_quiescent_gpu() inside get_object_count() is probably overkill, and could potentially mask something going wrong (not so much when called at the beginning of a subtest, but definitely when called at the end!).

What do you think?

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Friday, November 01, 2013 7:08 PM
> To: Ben Widawsky
> Cc: Daniel Vetter; Mateo Lozano, Oscar; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH] prime_self_import: Assure no pending
> requests before object counting
> 
> On Fri, Nov 01, 2013 at 11:52:48AM -0700, Ben Widawsky wrote:
> > On Fri, Nov 01, 2013 at 07:47:37PM +0100, Daniel Vetter wrote:
> > > On Fri, Nov 01, 2013 at 11:44:40AM -0700, Ben Widawsky wrote:
> > > > On Fri, Nov 01, 2013 at 07:42:59PM +0100, Daniel Vetter wrote:
> > > > > On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> > > > > > On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > > > > > > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo@intel.com
> wrote:
> > > > > > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > > >
> > > > > > > > We don't want a previously used object to be freed in the
> > > > > > > > middle of a before/after object counting operation (or we
> > > > > > > > would get a "-1 objects leaked" message). We have seen
> > > > > > > > this happening, e.g., when a context from a previous run
> > > > > > > > dies, but its backing object is alive waiting for a retire_work to
> kick in.
> > > > > > > >
> > > > > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > > > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > > > > >
> > > > > > > Nice catch. Should we do this in general as part of our
> > > > > > > gem_quiescent_gpu helper? All i-g-t testcase are written
> > > > > > > under the assumption that they completel own the gpu and
> > > > > > > that the gtt is completely empty besides the few
> > > > > > > driver-allocated and pinned objects. So trying really hard to get rid
> of any residual stuff sounds like a good idea.
> > > > > >
> > > > > > I was going to address this in the other mail thread.... in
> > > > > > any case, I think not. I believe a separate helper is the way
> > > > > > to go, and we should only call it when we absolutely want to.
> > > > > >
> > > > > > Though it's not the intention, I've seen many tests fail
> > > > > > because of previous state, and I don't want to miss out on
> > > > > > those in the future. It would also slow down the run unnecessarily
> further.
> > > > >
> > > > > We already do rather eregious stuff in quiescent. So I want hard
> > > > > numbers on your claim that it slows down stuff further - there
> > > > > really shouldn't be much at all to retire/evict.
> > > > > -Daniel
> > > >
> > > > I don't like any of those arbitrary calls to quiescent either fwiw.
> > > >
> > > > Can't I make the same demand for data BEFORE we merge the patch
> > > > that it doesn't slow anything down?
> > >
> > > All those "arbitrary calls to quiescent" actually fixed spurious igt
> > > failures. igts are written under the assumption that _nothing_ else
> > > is going on in gpu-land, since otherwise it's just impossible to hit
> > > some races. So this is matter of correctness first and speed second.
> > > -Daniel
> >
> > They should be called where there are "spurious" errors and have an
> > understanding why it's required to do. Sprinkling synchronizing code
> > all over the place and calling it a fix is false. It's a "workaround"
> > at best, but more likely dearth of time to do it properly. I can live
> > with either honestly. I can't live with the statement that it's the
> > proper thing to do.
> >
> > Very few tests we have will actually care that _nothing_ else is
> > running, and if they do, annotations in code via quiescent calls is a
> > nice way to document it.
> 
> Atm a call to quiescent_gpu on an idle machine takes roughly 25us (in a loop
> of 100k, snb laptop). You're optimizing the wrong thing.
> 
> Also, as long as everyone bitches and moans about igt tests being unstable
> I'm leaning _massively_ towards stable tests results. And I've really seen too
> many igt tests fail spuriously so that I've decided to go back to an
> unconditional to quiescent_gpu (it wasn't like that originally).
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Daniel Vetter Nov. 4, 2013, 5:13 p.m. UTC | #11
On Mon, Nov 4, 2013 at 5:45 PM, Mateo Lozano, Oscar
<oscar.mateo@intel.com> wrote:
> IMHO, gem_quiescent_gpu() does need the drop cache call, otherwise it doesn´t really do what it advertises. However, calling gem_quiescent_gpu() inside get_object_count() is probably overkill, and could potentially mask something going wrong (not so much when called at the beginning of a subtest, but definitely when called at the end!).

Well i-g-t testcase run under the assumption that nothing else does,
so most often the drop_cache call would be a noop. Save when there's
indeed something left to retire, in which case it matters actually.
I'll add it if you don't feel like doing it yourself.
-Danile
oscar.mateo@intel.com Nov. 4, 2013, 5:26 p.m. UTC | #12
> > IMHO, gem_quiescent_gpu() does need the drop cache call, otherwise it
> doesn´t really do what it advertises. However, calling gem_quiescent_gpu()
> inside get_object_count() is probably overkill, and could potentially mask
> something going wrong (not so much when called at the beginning of a
> subtest, but definitely when called at the end!).
> 
> Well i-g-t testcase run under the assumption that nothing else does, so most
> often the drop_cache call would be a noop. Save when there's indeed
> something left to retire, in which case it matters actually.
> I'll add it if you don't feel like doing it yourself.

It´s ok, I´ll add it. But... which call do you want?:

A) igt_drop_caches_set() inside gem_quiescent_gpu()
B) gem_quiescent_gpu() inside get_object_count()
C) Both of them
Daniel Vetter Nov. 4, 2013, 5:43 p.m. UTC | #13
On Mon, Nov 4, 2013 at 6:26 PM, Mateo Lozano, Oscar
<oscar.mateo@intel.com> wrote:
>> > IMHO, gem_quiescent_gpu() does need the drop cache call, otherwise it
>> doesn´t really do what it advertises. However, calling gem_quiescent_gpu()
>> inside get_object_count() is probably overkill, and could potentially mask
>> something going wrong (not so much when called at the beginning of a
>> subtest, but definitely when called at the end!).
>>
>> Well i-g-t testcase run under the assumption that nothing else does, so most
>> often the drop_cache call would be a noop. Save when there's indeed
>> something left to retire, in which case it matters actually.
>> I'll add it if you don't feel like doing it yourself.
>
> It´s ok, I´ll add it. But... which call do you want?:
>
> A) igt_drop_caches_set() inside gem_quiescent_gpu()
> B) gem_quiescent_gpu() inside get_object_count()
> C) Both of them

I'd go with drop_caches(RETIRE) in get_object_count + in gem_quiescent_gpu.
-Daniel
diff mbox

Patch

diff --git a/tests/prime_self_import.c b/tests/prime_self_import.c
index 481a809..e48abd1 100644
--- a/tests/prime_self_import.c
+++ b/tests/prime_self_import.c
@@ -211,6 +211,25 @@  static void test_with_one_bo(void)
 	check_bo(fd2, handle_import1, fd2, handle_import1);
 }
 
+static void retire_requests(void)
+{
+	char fname[FILENAME_MAX];
+	int drop_caches_fd;
+	const char *data = "0x4";
+
+	snprintf(fname, FILENAME_MAX, "%s/%i/%s",
+		 "/sys/kernel/debug/dri", drm_get_card(),
+		 "i915_gem_drop_caches");
+
+	drop_caches_fd = open(fname, O_WRONLY);
+
+	if (drop_caches_fd >= 0)
+	{
+		write(drop_caches_fd, data, strlen(data) + 1);
+		close(drop_caches_fd);
+	}
+}
+
 static int get_object_count(void)
 {
 	FILE *file;
@@ -252,10 +271,13 @@  static void test_reimport_close_race(void)
 	pthread_t *threads;
 	int r, i, num_threads;
 	int fds[2];
-	int obj_count = get_object_count();
+	int obj_count;
 	void *status;
 	uint32_t handle;
 
+	retire_requests();
+	obj_count = get_object_count();
+
 	num_threads = sysconf(_SC_NPROCESSORS_ONLN);
 
 	threads = calloc(num_threads, sizeof(pthread_t));
@@ -330,9 +352,12 @@  static void test_export_close_race(void)
 	pthread_t *threads;
 	int r, i, num_threads;
 	int fd;
-	int obj_count = get_object_count();
+	int obj_count;
 	void *status;
 
+	retire_requests();
+	obj_count = get_object_count();
+
 	num_threads = sysconf(_SC_NPROCESSORS_ONLN);
 
 	threads = calloc(num_threads, sizeof(pthread_t));