Message ID | 20210820224446.30620-8-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Parallel submission aka multi-bb execbuf |
On 8/20/2021 15:44, Matthew Brost wrote:
> Calling switch_to_kernel_context isn't needed if the engine PM reference
> is taken while all contexts are pinned. By not calling
> switch_to_kernel_context we save on issuing a request to the engine.

I thought the intention of the switch_to_kernel was to ensure that the GPU is not touching any user context and is basically idle. That is not a valid assumption with an external scheduler such as GuC. So why is the description above only mentioning PM references? What is the connection between the PM ref and the switch_to_kernel?

Also, the comment in the code does not mention anything about PM references, it just says 'not necessary with GuC' but no explanation at all.

> v2:
> (Daniel Vetter)
> - Add FIXME comment about pushing switch_to_kernel_context to backend
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
> drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 1f07ac4e0672..11fee66daf60 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>  	unsigned long flags;
>  	bool result = true;
>
> +	/*
> +	 * No need to switch_to_kernel_context if GuC submission
> +	 *
> +	 * FIXME: This execlists specific backend behavior in generic code, this

"This execlists" -> "This is execlist"
"this should be" -> "it should be"

John.

> +	 * should be pushed to the backend.
> +	 */
> +	if (intel_engine_uses_guc(engine))
> +		return true;
> +
>  	/* GPU is pointing to the void, as good as in the kernel context. */
>  	if (intel_gt_is_wedged(engine->gt))
>  		return true;

On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > Calling switch_to_kernel_context isn't needed if the engine PM reference
> > is taken while all contexts are pinned. By not calling
> > switch_to_kernel_context we save on issuing a request to the engine.
> I thought the intention of the switch_to_kernel was to ensure that the GPU
> is not touching any user context and is basically idle. That is not a valid
> assumption with an external scheduler such as GuC. So why is the description
> above only mentioning PM references? What is the connection between the PM
> ref and the switch_to_kernel?
>
> Also, the comment in the code does not mention anything about PM references,
> it just says 'not necessary with GuC' but no explanation at all.
>

Yeah, this needs to be explained better. How about this?

Calling switch_to_kernel_context isn't needed if the engine PM reference
is taken while all user contexts have scheduling enabled. Once scheduling
is disabled on all user contexts the GuC is guaranteed to not touch any
user context state, which is effectively the same as pointing to a kernel
context.

Matt

>
> > v2:
> > (Daniel Vetter)
> > - Add FIXME comment about pushing switch_to_kernel_context to backend
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 1f07ac4e0672..11fee66daf60 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >  	unsigned long flags;
> >  	bool result = true;
> > +	/*
> > +	 * No need to switch_to_kernel_context if GuC submission
> > +	 *
> > +	 * FIXME: This execlists specific backend behavior in generic code, this
> "This execlists" -> "This is execlist"
>
> "this should be" -> "it should be"
>
> John.
>
> > +	 * should be pushed to the backend.
> > +	 */
> > +	if (intel_engine_uses_guc(engine))
> > +		return true;
> > +
> >  	/* GPU is pointing to the void, as good as in the kernel context. */
> >  	if (intel_gt_is_wedged(engine->gt))
> >  		return true;
>

On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > Calling switch_to_kernel_context isn't needed if the engine PM reference
> > is taken while all contexts are pinned. By not calling
> > switch_to_kernel_context we save on issuing a request to the engine.
> I thought the intention of the switch_to_kernel was to ensure that the GPU
> is not touching any user context and is basically idle. That is not a valid
> assumption with an external scheduler such as GuC. So why is the description
> above only mentioning PM references? What is the connection between the PM
> ref and the switch_to_kernel?
>
> Also, the comment in the code does not mention anything about PM references,
> it just says 'not necessary with GuC' but no explanation at all.
>
>
> > v2:
> > (Daniel Vetter)
> > - Add FIXME comment about pushing switch_to_kernel_context to backend
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 1f07ac4e0672..11fee66daf60 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >  	unsigned long flags;
> >  	bool result = true;
> > +	/*
> > +	 * No need to switch_to_kernel_context if GuC submission
> > +	 *
> > +	 * FIXME: This execlists specific backend behavior in generic code, this
> "This execlists" -> "This is execlist"
>
> "this should be" -> "it should be"
>

Missed this. Will fix in next rev.

Matt

> John.
>
> > +	 * should be pushed to the backend.
> > +	 */
> > +	if (intel_engine_uses_guc(engine))
> > +		return true;
> > +
> >  	/* GPU is pointing to the void, as good as in the kernel context. */
> >  	if (intel_gt_is_wedged(engine->gt))
> >  		return true;
>

On 9/13/2021 09:54, Matthew Brost wrote:
> On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
>> On 8/20/2021 15:44, Matthew Brost wrote:
>>> Calling switch_to_kernel_context isn't needed if the engine PM reference
>>> is taken while all contexts are pinned. By not calling
>>> switch_to_kernel_context we save on issuing a request to the engine.
>> I thought the intention of the switch_to_kernel was to ensure that the GPU
>> is not touching any user context and is basically idle. That is not a valid
>> assumption with an external scheduler such as GuC. So why is the description
>> above only mentioning PM references? What is the connection between the PM
>> ref and the switch_to_kernel?
>>
>> Also, the comment in the code does not mention anything about PM references,
>> it just says 'not necessary with GuC' but no explanation at all.
>>
> Yeah, this needs to be explained better. How about this?
>
> Calling switch_to_kernel_context isn't needed if the engine PM reference
> is taken while all user contexts have scheduling enabled. Once scheduling
> is disabled on all user contexts the GuC is guaranteed to not touch any
> user context state, which is effectively the same as pointing to a kernel
> context.
>
> Matt

I'm still not seeing how the PM reference is involved?

Also, IMHO the focus is wrong in the above text. The fundamental requirement is to ensure the hardware is idle. Execlist achieves this by switching to a safe context. GuC achieves it by disabling scheduling. Indeed, switching to a 'safe' context really has no effect with GuC submission. So 'effectively the same as pointing to a kernel context' is an incorrect description. I would go with something like:

"This is execlist specific behaviour intended to ensure the GPU is idle by switching to a known 'safe' context. With GuC submission, the same idle guarantee is achieved by other means (disabling scheduling). Further, switching to a 'safe' context has no effect with GuC submission as the scheduler can just switch back again.
FIXME: Move this backend scheduler specific behaviour into the scheduler backend."

John.

>
>>> v2:
>>> (Daniel Vetter)
>>> - Add FIXME comment about pushing switch_to_kernel_context to backend
>>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>> drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
>>> 1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> index 1f07ac4e0672..11fee66daf60 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>>>  	unsigned long flags;
>>>  	bool result = true;
>>> +	/*
>>> +	 * No need to switch_to_kernel_context if GuC submission
>>> +	 *
>>> +	 * FIXME: This execlists specific backend behavior in generic code, this
>> "This execlists" -> "This is execlist"
>>
>> "this should be" -> "it should be"
>>
>> John.
>>
>>> +	 * should be pushed to the backend.
>>> +	 */
>>> +	if (intel_engine_uses_guc(engine))
>>> +		return true;
>>> +
>>>  	/* GPU is pointing to the void, as good as in the kernel context. */
>>>  	if (intel_gt_is_wedged(engine->gt))
>>>  		return true;

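The FIXME in the proposed comment asks for the idle-guarantee step to live in the submission backend rather than in generic engine PM code. A minimal sketch of that shape follows, as a standalone toy model and not i915 code: every name in it (`toy_engine`, `toy_backend_ops`, `ensure_idle`, `toy_engine_park`) is hypothetical, and the convention that `false` means "a request was queued, parking has to wait" is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in i915. */
struct toy_engine;

struct toy_backend_ops {
	/* Guarantee the hardware no longer touches user context state. */
	bool (*ensure_idle)(struct toy_engine *engine);
};

struct toy_engine {
	const struct toy_backend_ops *ops;
	unsigned int requests_issued;	/* requests sent to the engine */
	bool scheduling_enabled;	/* used by the GuC-like backend only */
};

/* Execlists-like backend: reach idle by queuing a switch to a 'safe' context. */
static bool toy_execlists_ensure_idle(struct toy_engine *engine)
{
	engine->requests_issued++;
	return false;	/* assumed: caller must wait for the request */
}

/* GuC-like backend: idle already follows from scheduling being disabled. */
static bool toy_guc_ensure_idle(struct toy_engine *engine)
{
	assert(!engine->scheduling_enabled);
	return true;	/* nothing to submit */
}

const struct toy_backend_ops toy_execlists_ops = {
	.ensure_idle = toy_execlists_ensure_idle,
};

const struct toy_backend_ops toy_guc_ops = {
	.ensure_idle = toy_guc_ensure_idle,
};

/* Generic park path: no backend check, it only calls the hook. */
bool toy_engine_park(struct toy_engine *engine)
{
	return engine->ops->ensure_idle(engine);
}
```

In this model, parking an engine that uses `toy_execlists_ops` issues one request and returns `false`, while parking one that uses `toy_guc_ops` issues nothing and returns `true`, which is the saving the commit message describes. The generic function no longer needs an `intel_engine_uses_guc()`-style test.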
On Mon, Sep 13, 2021 at 03:38:44PM -0700, John Harrison wrote:
> On 9/13/2021 09:54, Matthew Brost wrote:
>
> On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
>
> On 8/20/2021 15:44, Matthew Brost wrote:
>
> Calling switch_to_kernel_context isn't needed if the engine PM reference
> is taken while all contexts are pinned. By not calling
> switch_to_kernel_context we save on issuing a request to the engine.
>
> I thought the intention of the switch_to_kernel was to ensure that the GPU
> is not touching any user context and is basically idle. That is not a valid
> assumption with an external scheduler such as GuC. So why is the description
> above only mentioning PM references? What is the connection between the PM
> ref and the switch_to_kernel?
>
> Also, the comment in the code does not mention anything about PM references,
> it just says 'not necessary with GuC' but no explanation at all.
>
>
> Yeah, this needs to be explained better. How about this?
>
> Calling switch_to_kernel_context isn't needed if the engine PM reference
> is taken while all user contexts have scheduling enabled. Once scheduling
> is disabled on all user contexts the GuC is guaranteed to not touch any
> user context state, which is effectively the same as pointing to a kernel
> context.
>
> Matt
>
> I'm still not seeing how the PM reference is involved?
>

We shouldn't trap into the GT PM park code while a user context has scheduling enabled, as the GT PM park code may have side effects we don't want to execute if a user context still has scheduling enabled. I guess that isn't explained very well.

> Also, IMHO the focus is wrong in the above text. The fundamental requirement is
> to ensure the hardware is idle. Execlist achieves this by switching to a safe
> context. GuC achieves it by disabling scheduling. Indeed, switching to a 'safe'
> context really has no effect with GuC submission. So 'effectively the same as
> pointing to a kernel context' is an incorrect description. I would go with
> something like:
>
> "This is execlist specific behaviour intended to ensure the GPU is idle by
> switching to a known 'safe' context. With GuC submission, the same idle
> guarantee is achieved by other means (disabling scheduling). Further,
> switching to a 'safe' context has no effect with GuC submission as the
> scheduler can just switch back again.
> FIXME: Move this backend scheduler specific behaviour into the scheduler
> backend."
>

That is worded better. Will pull into the next rev.

Matt

>
> John.
>
>
> v2:
> (Daniel Vetter)
> - Add FIXME comment about pushing switch_to_kernel_context to backend
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
> drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 1f07ac4e0672..11fee66daf60 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>  	unsigned long flags;
>  	bool result = true;
> +	/*
> +	 * No need to switch_to_kernel_context if GuC submission
> +	 *
> +	 * FIXME: This execlists specific backend behavior in generic code, this
>
> "This execlists" -> "This is execlist"
>
> "this should be" -> "it should be"
>
> John.
>
>
> +	 * should be pushed to the backend.
> +	 */
> +	if (intel_engine_uses_guc(engine))
> +		return true;
> +
>  	/* GPU is pointing to the void, as good as in the kernel context. */
>  	if (intel_gt_is_wedged(engine->gt))
>  		return true;

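The PM-reference argument in the reply above can be illustrated with a small reference-counting model: if every user context holds an engine PM reference for as long as its scheduling is enabled, the park path cannot run until scheduling has been disabled on all of them. The sketch below is a toy under that assumption, not the driver's wakeref implementation; `toy_engine_pm`, `toy_context` and the helper names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a bare counter standing in for an engine PM reference. */
struct toy_engine_pm {
	unsigned int wakeref;		/* outstanding PM references */
	unsigned int park_calls;	/* how many times the park path ran */
};

struct toy_context {
	struct toy_engine_pm *pm;
	bool sched_enabled;
};

void toy_pm_get(struct toy_engine_pm *pm)
{
	pm->wakeref++;
}

void toy_pm_put(struct toy_engine_pm *pm)
{
	assert(pm->wakeref > 0);
	/* The park path runs only when the last reference is dropped. */
	if (--pm->wakeref == 0)
		pm->park_calls++;
}

/* Assumed rule: enabling scheduling takes a PM reference ... */
void toy_context_enable_sched(struct toy_context *ce)
{
	if (ce->sched_enabled)
		return;
	toy_pm_get(ce->pm);
	ce->sched_enabled = true;
}

/* ... and disabling scheduling releases it. */
void toy_context_disable_sched(struct toy_context *ce)
{
	if (!ce->sched_enabled)
		return;
	ce->sched_enabled = false;
	toy_pm_put(ce->pm);
}
```

With two contexts sharing one `toy_engine_pm`, disabling scheduling on the first leaves `park_calls` at zero; only disabling the second drops the last reference and triggers the park path. By the time park runs, no user context has scheduling enabled, so in this model there is nothing left for a switch to the kernel context to guard against.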
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 1f07ac4e0672..11fee66daf60 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	unsigned long flags;
 	bool result = true;
 
+	/*
+	 * No need to switch_to_kernel_context if GuC submission
+	 *
+	 * FIXME: This execlists specific backend behavior in generic code, this
+	 * should be pushed to the backend.
+	 */
+	if (intel_engine_uses_guc(engine))
+		return true;
+
 	/* GPU is pointing to the void, as good as in the kernel context. */
 	if (intel_gt_is_wedged(engine->gt))
 		return true;