Message ID | 20210803222943.27686-12-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Parallel submission aka multi-bb execbuf | expand |
On Tue, Aug 03, 2021 at 03:29:08PM -0700, Matthew Brost wrote: > Calling switch_to_kernel_context isn't needed if the engine PM reference > is taken while all contexts are pinned. By not calling > switch_to_kernel_context we save on issuing a request to the engine. > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > index 1f07ac4e0672..58099de6bf07 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > @@ -162,6 +162,10 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) > unsigned long flags; > bool result = true; > > + /* No need to switch_to_kernel_context if GuC submission */ Maybe whack a big FIXME on here that we should unravel this properly. Currently the execlist backend assumptions are leaked all over the place, leading to stuff like this. Which means extremely fragile code. I currently don't have a great idea on how exactly we should do that, but oh well. btw just in case we ever want to make guc lrc properly evictable (which as the og use-case for this function, way, way back), would we need to fully unregister them from guc? At least I'm assuming there's no other trick like the below one. Another aside: How does the perf/OA patching work on GuC? Anyway, patch looks legit: Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > + if (intel_engine_uses_guc(engine)) > + return true; > + > /* GPU is pointing to the void, as good as in the kernel context. */ > if (intel_gt_is_wedged(engine->gt)) > return true; > -- > 2.28.0 >
On Mon, Aug 09, 2021 at 04:27:01PM +0200, Daniel Vetter wrote: > On Tue, Aug 03, 2021 at 03:29:08PM -0700, Matthew Brost wrote: > > Calling switch_to_kernel_context isn't needed if the engine PM reference > > is taken while all contexts are pinned. By not calling > > switch_to_kernel_context we save on issuing a request to the engine. > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > --- > > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > index 1f07ac4e0672..58099de6bf07 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > @@ -162,6 +162,10 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) > > unsigned long flags; > > bool result = true; > > > > + /* No need to switch_to_kernel_context if GuC submission */ > > Maybe whack a big FIXME on here that we should unravel this properly. Sure, can add a FIXME here. > Currently the execlist backend assumptions are leaked all over the place, > leading to stuff like this. Which means extremely fragile code. > Yes, this something required for execlists implemented in what should be generic code. > I currently don't have a great idea on how exactly we should do that, but > oh well. Me either, it will be a process. > > btw just in case we ever want to make guc lrc properly evictable (which as > the og use-case for this function, way, way back), would we need to fully Can you explain what you mean by fully evictable? Not getting what you mean in this context. > unregister them from guc? At least I'm assuming there's no other trick If scheduling is disabled on the context (currently done on unpin) you are free move anything around as the GuC is guaranteed not to touch the context state. If on re-pin something has moved (e.g. the LRC vaddr is different), you need to unregister and re-register the context with the GuC. > like the below one. > > Another aside: How does the perf/OA patching work on GuC? > Not my area of expertise but perf somewhat a WIP. The plan is for the GuC to write out some stats to HWSP I think? John Harrison is working to get this fully implemented. OA is working afaik, with Umesh Nerlige Ramappa being the expert here. Matt > Anyway, patch looks legit: > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > + if (intel_engine_uses_guc(engine)) > > + return true; > > + > > /* GPU is pointing to the void, as good as in the kernel context. */ > > if (intel_gt_is_wedged(engine->gt)) > > return true; > > -- > > 2.28.0 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Mon, Aug 09, 2021 at 06:20:51PM +0000, Matthew Brost wrote: > On Mon, Aug 09, 2021 at 04:27:01PM +0200, Daniel Vetter wrote: > > On Tue, Aug 03, 2021 at 03:29:08PM -0700, Matthew Brost wrote: > > > Calling switch_to_kernel_context isn't needed if the engine PM reference > > > is taken while all contexts are pinned. By not calling > > > switch_to_kernel_context we save on issuing a request to the engine. > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > --- > > > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 ++++ > > > 1 file changed, 4 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > > index 1f07ac4e0672..58099de6bf07 100644 > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > > @@ -162,6 +162,10 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) > > > unsigned long flags; > > > bool result = true; > > > > > > + /* No need to switch_to_kernel_context if GuC submission */ > > > > Maybe whack a big FIXME on here that we should unravel this properly. > > Sure, can add a FIXME here. > > > Currently the execlist backend assumptions are leaked all over the place, > > leading to stuff like this. Which means extremely fragile code. > > > > Yes, this something required for execlists implemented in what should be > generic code. > > > I currently don't have a great idea on how exactly we should do that, but > > oh well. > > Me either, it will be a process. > > > > > btw just in case we ever want to make guc lrc properly evictable (which as > > the og use-case for this function, way, way back), would we need to fully > > Can you explain what you mean by fully evictable? Not getting what you > mean in this context. > > > unregister them from guc? At least I'm assuming there's no other trick > > If scheduling is disabled on the context (currently done on unpin) you are > free move anything around as the GuC is guaranteed not to touch the > context state. If on re-pin something has moved (e.g. the LRC vaddr is > different), you need to unregister and re-register the context with the > GuC. So at that point GuC also guarantees that it's not left in the hw engine? Execlist has this barrier request to fully unload the ctx from the hw, and that's also why I cam on the topic of OA. > > like the below one. > > > > Another aside: How does the perf/OA patching work on GuC? > > > > Not my area of expertise but perf somewhat a WIP. The plan is for the > GuC to write out some stats to HWSP I think? John Harrison is working to > get this fully implemented. > > OA is working afaik, with Umesh Nerlige Ramappa being the expert here. I think it's OA that I'm thinking of here: We have code in i915_perf.c to patch all the ctx currently in the system, so that they have a consistent OA config. That's also relying on this barrier stuff, and I was wondering how that will work with GuC. -Daniel > > Matt > > > Anyway, patch looks legit: > > > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > > > + if (intel_engine_uses_guc(engine)) > > > + return true; > > > + > > > /* GPU is pointing to the void, as good as in the kernel context. */ > > > if (intel_gt_is_wedged(engine->gt)) > > > return true; > > > -- > > > 2.28.0 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch
On Tue, Aug 10, 2021 at 08:47:10AM +0200, Daniel Vetter wrote: > On Mon, Aug 09, 2021 at 06:20:51PM +0000, Matthew Brost wrote: > > On Mon, Aug 09, 2021 at 04:27:01PM +0200, Daniel Vetter wrote: > > > On Tue, Aug 03, 2021 at 03:29:08PM -0700, Matthew Brost wrote: > > > > Calling switch_to_kernel_context isn't needed if the engine PM reference > > > > is taken while all contexts are pinned. By not calling > > > > switch_to_kernel_context we save on issuing a request to the engine. > > > > > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > > --- > > > > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 ++++ > > > > 1 file changed, 4 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > > > index 1f07ac4e0672..58099de6bf07 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c > > > > @@ -162,6 +162,10 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) > > > > unsigned long flags; > > > > bool result = true; > > > > > > > > + /* No need to switch_to_kernel_context if GuC submission */ > > > > > > Maybe whack a big FIXME on here that we should unravel this properly. > > > > Sure, can add a FIXME here. > > > > > Currently the execlist backend assumptions are leaked all over the place, > > > leading to stuff like this. Which means extremely fragile code. > > > > > > > Yes, this something required for execlists implemented in what should be > > generic code. > > > > > I currently don't have a great idea on how exactly we should do that, but > > > oh well. > > > > Me either, it will be a process. > > > > > > > > btw just in case we ever want to make guc lrc properly evictable (which as > > > the og use-case for this function, way, way back), would we need to fully > > > > Can you explain what you mean by fully evictable? Not getting what you > > mean in this context. > > > > > unregister them from guc? At least I'm assuming there's no other trick > > > > If scheduling is disabled on the context (currently done on unpin) you are > > free move anything around as the GuC is guaranteed not to touch the > > context state. If on re-pin something has moved (e.g. the LRC vaddr is > > different), you need to unregister and re-register the context with the > > GuC. > > So at that point GuC also guarantees that it's not left in the hw engine? > Execlist has this barrier request to fully unload the ctx from the hw, and > that's also why I cam on the topic of OA. > > > > like the below one. > > > > > > Another aside: How does the perf/OA patching work on GuC? > > > > > > > Not my area of expertise but perf somewhat a WIP. The plan is for the > > GuC to write out some stats to HWSP I think? John Harrison is working to > > get this fully implemented. > > > > OA is working afaik, with Umesh Nerlige Ramappa being the expert here. > > I think it's OA that I'm thinking of here: We have code in i915_perf.c to > patch all the ctx currently in the system, so that they have a consistent > OA config. That's also relying on this barrier stuff, and I was wondering > how that will work with GuC. > -Daniel > Not an OA expert at all but glanced at the code I don't see anything in there that prevents it working with GuC submission. We certainly have this working internally. If you have questions about this I'd reach out to Umesh Nerlige Ramappa as he likely has the answers. Matt > > > > Matt > > > > > Anyway, patch looks legit: > > > > > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > > > > > > + if (intel_engine_uses_guc(engine)) > > > > + return true; > > > > + > > > > /* GPU is pointing to the void, as good as in the kernel context. */ > > > > if (intel_gt_is_wedged(engine->gt)) > > > > return true; > > > > -- > > > > 2.28.0 > > > > > > > > > > -- > > > Daniel Vetter > > > Software Engineer, Intel Corporation > > > http://blog.ffwll.ch > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 1f07ac4e0672..58099de6bf07 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -162,6 +162,10 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) unsigned long flags; bool result = true; + /* No need to switch_to_kernel_context if GuC submission */ + if (intel_engine_uses_guc(engine)) + return true; + /* GPU is pointing to the void, as good as in the kernel context. */ if (intel_gt_is_wedged(engine->gt)) return true;
Calling switch_to_kernel_context isn't needed if the engine PM reference is taken while all contexts are pinned. By not calling switch_to_kernel_context we save on issuing a request to the engine. Signed-off-by: Matthew Brost <matthew.brost@intel.com> --- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 ++++ 1 file changed, 4 insertions(+)