Message ID | 1418922221-18789-1-git-send-email-david.s.gordon@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Dec 18, 2014 at 05:03:41PM +0000, Dave Gordon wrote: > In {intel,logical}_ring_wait_request(), we try to find a request > whose completion will release the amount of ring space required. > If we find such a request, we wait for it, and then retire it, in > the expectation that we will now have at least the required amount > of free space in the ring. But it's good to check that this has > actually happened, so we can back out with a meaningful error > code if something unexpected has happened, such as wait_request > returning early. It's pretty pointless. It's a programming bug, if anything it should WARN if it happens. -Chris
On 18/12/14 17:18, Chris Wilson wrote: > On Thu, Dec 18, 2014 at 05:03:41PM +0000, Dave Gordon wrote: >> In {intel,logical}_ring_wait_request(), we try to find a request >> whose completion will release the amount of ring space required. >> If we find such a request, we wait for it, and then retire it, in >> the expectation that we will now have at least the required amount >> of free space in the ring. But it's good to check that this has >> actually happened, so we can back out with a meaningful error >> code if something unexpected has happened, such as wait_request >> returning early. > > It's pretty pointless. It's a programming bug, if anything it should > WARN if it happens. > -Chris I don't regard preventing the propagation of errors as pointless, whether or not it's a programming bug. If this case arose, then without this check we would go ahead and trample over whatever was next in the ring. Quite likely TAIL would overtake HEAD, and then all sorts of bizarre things would happen some time later. In particular, this may be able to catch bugs introduced by /future features/, such as TDR, scheduling, and preemption, all of which are in preparation now and all of which have the potential to disrupt the predictable flow of requests. So it's good to be resilient to upcoming changes and make it as easy as possible to catch bugs at the earliest opportunity. This is also a step towards the deduplication of the ringbuffer and LRC paths by making sure that they are as similar as possible, and thus simple to merge at some later opportunity. .Dave.
On Thu, Dec 18, 2014 at 06:09:18PM +0000, Dave Gordon wrote: > On 18/12/14 17:18, Chris Wilson wrote: > > On Thu, Dec 18, 2014 at 05:03:41PM +0000, Dave Gordon wrote: > >> In {intel,logical}_ring_wait_request(), we try to find a request > >> whose completion will release the amount of ring space required. > >> If we find such a request, we wait for it, and then retire it, in > >> the expectation that we will now have at least the required amount > >> of free space in the ring. But it's good to check that this has > >> actually happened, so we can back out with a meaningful error > >> code if something unexpected has happened, such as wait_request > >> returning early. > > > > It's pretty pointless. It's a programming bug, if anything it should > > WARN if it happens. > > -Chris > > I don't regard preventing the propagation of errors as pointless, > whether or not it's a programming bug. If this case arose, then without > this check we would go ahead and trample over whatever was next in the > ring. Quite likely TAIL would overtake HEAD, and then all sorts of > bizarre things would happen some time later. > > In particular, this may be able to catch bugs introduced by /future > features/, such as TDR, scheduling, and preemption, all of which are in > preparation now and all of which have the potential to disrupt the > predictable flow of requests. So it's good to be resilient to upcoming > changes and make it as easy as possible to catch bugs at the earliest > opportunity. > > This is also a step towards the deduplication of the ringbuffer and LRC > paths by making sure that they are as similar as possible, and thus > simple to merge at some later opportunity. Overwriting the TAIL is a fairly benign disaster as far as gpus are concerned, the gpu tends to survive those. I concur with Chris that a simple WARN_ON for paranoia is good enough. And I don't think we need the comments, imo the code is clear enough as-is. -Daniel
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 90867b2..329cb5c 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -955,6 +955,15 @@ static int logical_ring_wait_request(struct intel_ringbuffer *ringbuf, i915_gem_retire_requests_ring(ring); + /* + * According to our calculation above, retiring the request we just + * waited for should have resulted in there being enough space in + * the ringbuffer; but let's check. + * + * If there is not now enough space, something has gone horribly worng + * (such as wait_request returning early, but with no error, or + * retire_requests failing to retire the request we expected it to). + */ return intel_ring_space(ringbuf) >= bytes ? 0 : -ENOSPC; } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 018a37e..c95bf3a 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1932,6 +1932,7 @@ static int intel_ring_wait_request(struct intel_engine_cs *ring, int n) return 0; list_for_each_entry(request, &ring->request_list, list) { + /* Would completion of this request free enough space? */ if (__intel_ring_space(request->tail, ringbuf->tail, ringbuf->effective_size) >= n) { break; @@ -1947,7 +1948,16 @@ static int intel_ring_wait_request(struct intel_engine_cs *ring, int n) i915_gem_retire_requests_ring(ring); - return 0; + /* + * According to our calculation above, retiring the request we just + * waited for should have resulted in there being enough space in + * the ringbuffer; but let's check. + * + * If there is not now enough space, something has gone horribly worng + * (such as wait_request returning early, but with no error, or + * retire_requests failing to retire the request we expected it to). + */ + return intel_ring_space(ringbuf) >= n ? 0 : -ENOSPC; } static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
In {intel,logical}_ring_wait_request(), we try to find a request whose completion will release the amount of ring space required. If we find such a request, we wait for it, and then retire it, in the expectation that we will now have at least the required amount of free space in the ring. But it's good to check that this has actually happened, so we can back out with a meaningful error code if something unexpected has happened, such as wait_request returning early. This code was already in the execlist version, so the change to intel_lrc.c is just to add a comment; but we want the same check in the legacy ringbuffer mode too. v2: rebase to latest drm-intel-nightly Signed-off-by: Dave Gordon <david.s.gordon@intel.com> --- drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++++ drivers/gpu/drm/i915/intel_ringbuffer.c | 12 +++++++++++- 2 files changed, 20 insertions(+), 1 deletion(-)