Message ID | 1476469424-25618-1-git-send-email-akash.goel@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Oct 14, 2016 at 11:53:44PM +0530, akash.goel@intel.com wrote: > From: Akash Goel <akash.goel@intel.com> > > Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are > pinned in mappable aperture portion of GGTT and for ringbuffer pages > allocated from Stolen memory, access can only be done through GMADR BAR. > In case of GuC based submission, updates done in ringbuffer via GMADR > may not get commited to memory by the time the Command streamer starts > reading them, resulting in fetching of stale data. Please leave a blank line between paragraphs, or try to not leave so much whitespace at the end of a sentence. > For Host based submission, such problem is not there as the write to Ring > Tail or ELSP register happens from the Host side prior to submission. > Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already > enforces the ordering between outstanding GMADR writes & new GTTMADR access. > MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to > registers within GT is contained within GT, so ordering is not enforced > resulting in a race, which can manifest in form of a hang. > To ensure the flush of in flight GMADR writes, a POSTING READ is done to > GuC register prior to doorbell ring. > There is already a similar WA in i915_gem_object_flush_gtt_write_domain(), > which takes care of GMADR writes from User space to GEM buffers, but not the > ringbuffer writes from KMD. > This WA is needed on all recent HW. > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> > Signed-off-by: Akash Goel <akash.goel@intel.com> > --- > drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c > index a1f76c8..43c8a72 100644 > --- a/drivers/gpu/drm/i915/i915_guc_submission.c > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c > @@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc) > */ > static void i915_guc_submit(struct drm_i915_gem_request *rq) > { > + struct drm_i915_private *dev_priv = rq->i915; > unsigned int engine_id = rq->engine->id; > struct intel_guc *guc = &rq->i915->guc; > struct i915_guc_client *client = guc->execbuf_client; > @@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq) > > spin_lock(&client->wq_lock); > guc_wq_item_append(client, rq); > + > + /* WA to flush out the pending GMADR writes to ring buffer. */ > + if (i915_vma_is_map_and_fenceable(rq->ring->vma)) > + POSTING_READ(GUC_STATUS); Did you test POSTING_READ_FW() ? Otherwise it makes an unfortunate amount of sense, and I feel justified in what I had to do in flush_gtt_write_domwin! :) -Chris
On 10/14/2016 11:45 PM, Chris Wilson wrote: > On Fri, Oct 14, 2016 at 11:53:44PM +0530, akash.goel@intel.com wrote: >> From: Akash Goel <akash.goel@intel.com> >> >> Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are >> pinned in mappable aperture portion of GGTT and for ringbuffer pages >> allocated from Stolen memory, access can only be done through GMADR BAR. >> In case of GuC based submission, updates done in ringbuffer via GMADR >> may not get commited to memory by the time the Command streamer starts >> reading them, resulting in fetching of stale data. > > Please leave a blank line between paragraphs, or try to not leave so > much whitespace at the end of a sentence. > I am sorry. Will be mindful of this from now. >> For Host based submission, such problem is not there as the write to Ring >> Tail or ELSP register happens from the Host side prior to submission. >> Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already >> enforces the ordering between outstanding GMADR writes & new GTTMADR access. >> MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to >> registers within GT is contained within GT, so ordering is not enforced >> resulting in a race, which can manifest in form of a hang. >> To ensure the flush of in flight GMADR writes, a POSTING READ is done to >> GuC register prior to doorbell ring. >> There is already a similar WA in i915_gem_object_flush_gtt_write_domain(), >> which takes care of GMADR writes from User space to GEM buffers, but not the >> ringbuffer writes from KMD. >> This WA is needed on all recent HW. >> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> >> Signed-off-by: Akash Goel <akash.goel@intel.com> >> --- >> drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c >> index a1f76c8..43c8a72 100644 >> --- a/drivers/gpu/drm/i915/i915_guc_submission.c >> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c >> @@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc) >> */ >> static void i915_guc_submit(struct drm_i915_gem_request *rq) >> { >> + struct drm_i915_private *dev_priv = rq->i915; >> unsigned int engine_id = rq->engine->id; >> struct intel_guc *guc = &rq->i915->guc; >> struct i915_guc_client *client = guc->execbuf_client; >> @@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq) >> >> spin_lock(&client->wq_lock); >> guc_wq_item_append(client, rq); >> + >> + /* WA to flush out the pending GMADR writes to ring buffer. */ >> + if (i915_vma_is_map_and_fenceable(rq->ring->vma)) >> + POSTING_READ(GUC_STATUS); > > Did you test POSTING_READ_FW() ? Sorry though we haven't explicitly tried POSTING_READ_FW() but it should work since, as per the __gen9_fw_ranges[] table, GuC registers (C000-Cxxx) do not lie in any Forcewake domain range. > > Otherwise it makes an unfortunate amount of sense, and I feel justified > in what I had to do in flush_gtt_write_domwin! :) Yes your hunch, expectedly, was spot on :). Best regards Akash > -Chris >
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index a1f76c8..43c8a72 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc) */ static void i915_guc_submit(struct drm_i915_gem_request *rq) { + struct drm_i915_private *dev_priv = rq->i915; unsigned int engine_id = rq->engine->id; struct intel_guc *guc = &rq->i915->guc; struct i915_guc_client *client = guc->execbuf_client; @@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq) spin_lock(&client->wq_lock); guc_wq_item_append(client, rq); + + /* WA to flush out the pending GMADR writes to ring buffer. */ + if (i915_vma_is_map_and_fenceable(rq->ring->vma)) + POSTING_READ(GUC_STATUS); + b_ret = guc_ring_doorbell(client); client->submissions[engine_id] += 1;