Message ID | 20220913232212.894826-1-daniele.ceraolospurio@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/huc: stall media submission until HuC is loaded |
On Tue, 2022-09-13 at 16:22 -0700, Ceraolo Spurio, Daniele wrote:
> Wait on the fence to be signalled to avoid the submissions finding HuC
> not yet loaded.
>
> v2: use dedicated wait_queue_entry for waiting in HuC load, as submitq
> can't be re-used for it.
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Tony Ye <tony.ye@intel.com>
> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> #v1
> Acked-by: Tony Ye <tony.ye@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_huc.h |  6 ++++++
>  drivers/gpu/drm/i915/i915_request.c    | 24 ++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_request.h    |  5 +++++
>  3 files changed, 35 insertions(+)
>

[snip]

> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 47041ec68df8..f5e1bb5e857a 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -348,6 +348,11 @@ struct i915_request {
>  #define GUC_PRIO_FINI 0xfe
>  	u8 guc_prio;
>
> +	/**
> +	 * @hucq: wait queue entry used to wait on the HuC load to complete
> +	 */
> +	wait_queue_entry_t hucq;
> +
>

I believe that in the future, if we have multiple engines that require a
similar stalled-initialization wait, we should have an array of pointers here
and a non-HuC-specific helper that can sort out adding fence-signalled
waiters. But for now this is a very rare race condition that only happens
with HuC, so this hucq-specific wait entry will do fine. Thus:

Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.h b/drivers/gpu/drm/i915/gt/uc/intel_huc.h
index 915d281c1c72..52db03620c60 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.h
@@ -81,6 +81,12 @@ static inline bool intel_huc_is_loaded_by_gsc(const struct intel_huc *huc)
 	return huc->fw.loaded_via_gsc;
 }
 
+static inline bool intel_huc_wait_required(struct intel_huc *huc)
+{
+	return intel_huc_is_used(huc) && intel_huc_is_loaded_by_gsc(huc) &&
+	       !intel_huc_is_authenticated(huc);
+}
+
 void intel_huc_load_status(struct intel_huc *huc, struct drm_printer *p);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 62fad16a55e8..f949a9495758 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1621,6 +1621,20 @@ i915_request_await_object(struct i915_request *to,
 	return ret;
 }
 
+static void i915_request_await_huc(struct i915_request *rq)
+{
+	struct intel_huc *huc = &rq->context->engine->gt->uc.huc;
+
+	/* don't stall kernel submissions! */
+	if (!rcu_access_pointer(rq->context->gem_context))
+		return;
+
+	if (intel_huc_wait_required(huc))
+		i915_sw_fence_await_sw_fence(&rq->submit,
+					     &huc->delayed_load.fence,
+					     &rq->hucq);
+}
+
 static struct i915_request *
 __i915_request_ensure_parallel_ordering(struct i915_request *rq,
 					struct intel_timeline *timeline)
@@ -1702,6 +1716,16 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 	struct intel_timeline *timeline = i915_request_timeline(rq);
 	struct i915_request *prev;
 
+	/*
+	 * Media workloads may require HuC, so stall them until HuC loading is
+	 * complete. Note that HuC not being loaded when a user submission
+	 * arrives can only happen when HuC is loaded via GSC and in that case
+	 * we still expect the window between us starting to accept submissions
+	 * and HuC loading completion to be small (a few hundred ms).
+	 */
+	if (rq->engine->class == VIDEO_DECODE_CLASS)
+		i915_request_await_huc(rq);
+
 	/*
 	 * Dependency tracking and request ordering along the timeline
 	 * is special cased so that we can eliminate redundant ordering
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 47041ec68df8..f5e1bb5e857a 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -348,6 +348,11 @@ struct i915_request {
 #define GUC_PRIO_FINI 0xfe
 	u8 guc_prio;
 
+	/**
+	 * @hucq: wait queue entry used to wait on the HuC load to complete
+	 */
+	wait_queue_entry_t hucq;
+
 	I915_SELFTEST_DECLARE(struct {
 		struct list_head link;
 		unsigned long delay;
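
For readers following the gating logic without the i915 tree at hand, the decision the patch makes can be modelled in plain userspace C. This is only a rough sketch: `huc_model`, `request_model`, `engine_class_model`, `huc_wait_required()` and `request_must_stall()` are hypothetical stand-ins invented for illustration, not driver types or functions, and the boolean result replaces the fence wait that the patch sets up through `i915_sw_fence_await_sw_fence()`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the HuC state the patch consults. */
struct huc_model {
	bool used;          /* models intel_huc_is_used() */
	bool loaded_by_gsc; /* models intel_huc_is_loaded_by_gsc() */
	bool authenticated; /* models intel_huc_is_authenticated() */
};

/* Hypothetical engine classes; only the video decode case matters here. */
enum engine_class_model {
	MODEL_RENDER,
	MODEL_VIDEO_DECODE,
};

/* Hypothetical stand-in for the request fields the patch looks at. */
struct request_model {
	enum engine_class_model engine_class;
	bool has_user_context; /* models a non-NULL rq->context->gem_context */
};

/* Same condition as intel_huc_wait_required() in the patch. */
static bool huc_wait_required(const struct huc_model *huc)
{
	return huc->used && huc->loaded_by_gsc && !huc->authenticated;
}

/*
 * Returns true when the request would be made to wait on the HuC load.
 * The real patch attaches a fence wait instead of returning a flag.
 */
static bool request_must_stall(const struct request_model *rq,
			       const struct huc_model *huc)
{
	/* Only video decode engines are gated. */
	if (rq->engine_class != MODEL_VIDEO_DECODE)
		return false;

	/* Kernel submissions (no user context) are never stalled. */
	if (!rq->has_user_context)
		return false;

	return huc_wait_required(huc);
}
```

In this model, a user request on the video decode class stalls only while HuC is in use, loaded via GSC, and not yet authenticated; once `authenticated` becomes true, `request_must_stall()` returns false for every request.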