Message ID | 20170911131000.23446-1-michal.winiarski@intel.com (mailing list archive) |
---|---|
State | New, archived |
On 11/09/2017 14:09, Michał Winiarski wrote:
> There's no reason to hide those tracepoints.
> Let's also remove the DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option.

No numbers from (micro-)benchmarks showing how small the impact of doing
this is? I thought John was compiling this data. It will be just a no-op
on the fast path, but a bit more generated code.

Assuming that will be fine, the only potentially problematic aspect that
comes to mind is the fact that the meaning of these tracepoints is a bit
different between execlists and guc. But maybe that is thinking too low
level (!) - in fact they are in both cases at points where i915 is
passing/receiving requests to/from hardware, so not an issue?

Regards,

Tvrtko

> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: John Harrison <john.c.harrison@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/Kconfig.debug | 11 -----------
>  drivers/gpu/drm/i915/i915_trace.h  | 24 ------------------------
>  2 files changed, 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
> index aed7d207ea84..63bfac79a403 100644
> --- a/drivers/gpu/drm/i915/Kconfig.debug
> +++ b/drivers/gpu/drm/i915/Kconfig.debug
> @@ -90,17 +90,6 @@ config DRM_I915_SELFTEST
>
>            If in doubt, say "N".
>
> -config DRM_I915_LOW_LEVEL_TRACEPOINTS
> -        bool "Enable low level request tracing events"
> -        depends on DRM_I915
> -        default n
> -        help
> -          Choose this option to turn on low level request tracing events.
> -          This provides the ability to precisely monitor engine utilisation
> -          and also analyze the request dependency resolving timeline.
> -
> -          If in doubt, say "N".
> -
>  config DRM_I915_DEBUG_VBLANK_EVADE
>          bool "Enable extra debug warnings for vblank evasion"
>          depends on DRM_I915
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 92f4c5bb7aa7..b8f037986ae2 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -702,7 +702,6 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
>              TP_ARGS(req)
>  );
>
> -#if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS)
>  DEFINE_EVENT(i915_gem_request, i915_gem_request_submit,
>              TP_PROTO(struct drm_i915_gem_request *req),
>              TP_ARGS(req)
> @@ -751,29 +750,6 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_out,
>              TP_PROTO(struct drm_i915_gem_request *req),
>              TP_ARGS(req)
>  );
> -#else
> -#if !defined(TRACE_HEADER_MULTI_READ)
> -static inline void
> -trace_i915_gem_request_submit(struct drm_i915_gem_request *req)
> -{
> -}
> -
> -static inline void
> -trace_i915_gem_request_execute(struct drm_i915_gem_request *req)
> -{
> -}
> -
> -static inline void
> -trace_i915_gem_request_in(struct drm_i915_gem_request *req, unsigned int port)
> -{
> -}
> -
> -static inline void
> -trace_i915_gem_request_out(struct drm_i915_gem_request *req)
> -{
> -}
> -#endif
> -#endif
>
>  TRACE_EVENT(intel_engine_notify,
>              TP_PROTO(struct intel_engine_cs *engine, bool waiters),
>
Quoting Tvrtko Ursulin (2017-09-11 16:34:08)
>
> On 11/09/2017 14:09, Michał Winiarski wrote:
> > There's no reason to hide those tracepoints.
> > Let's also remove the DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option.
>
> No numbers from (micro-)benchmarks showing how small the impact of doing
> this is? I thought John was compiling this data. It will be just a no-op
> on the fast path, but a bit more generated code.
>
> Assuming that will be fine, the only potentially problematic aspect that
> comes to mind is the fact that the meaning of these tracepoints is a bit
> different between execlists and guc. But maybe that is thinking too low
> level (!) - in fact they are in both cases at points where i915 is
> passing/receiving requests to/from hardware, so not an issue?

Along the same lines is that this implies that these are important
enough to be ABI, and that means we need to make a long term decision on
the viability and meaning of such tracepoints.
-Chris
On Mon, 2017-09-11 at 20:52 +0100, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-11 16:34:08)
> >
> > On 11/09/2017 14:09, Michał Winiarski wrote:
> > > There's no reason to hide those tracepoints.
> > > Let's also remove the DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option.
> >
> > No numbers from (micro-)benchmarks showing how small the impact of doing
> > this is? I thought John was compiling this data. It will be just a no-op
> > on the fast path, but a bit more generated code.
> >
> > Assuming that will be fine, the only potentially problematic aspect that
> > comes to mind is the fact that the meaning of these tracepoints is a bit
> > different between execlists and guc. But maybe that is thinking too low
> > level (!) - in fact they are in both cases at points where i915 is
> > passing/receiving requests to/from hardware, so not an issue?
>
> Along the same lines is that this implies that these are important
> enough to be ABI, and that means we need to make a long term decision on
> the viability and meaning of such tracepoints.
> -Chris

There are a number of applications that use these tracepoints for task
profiling, placing the tasks on a timeline for visualization. VTune and
GPA are two examples; both are closed source. I am not sure whether there
are open source profiling tools with such support (at least for GPU).

Right now VTune and GPA are, at best, stuck with custom patches for i915,
which limits them to customers who are OK with custom patching. Thus,
making this ABI (if possible), or at least making sure it does not get
broken or disappear (for example with GuC enabling), would be highly
beneficial.

Dmitry.

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Hi!

I use GPUVis and now Intel VTune Profiler. These tools don't work out of
the box on all Linux-based systems with Intel integrated graphics: at
least the i915 module has to be rebuilt, and it has to be rebuilt again
every time the kernel is updated.

> No numbers from (micro-)benchmarks showing how small the impact of doing
> this is? I thought John was compiling this data. It will be just a no-op
> on the fast path, but a bit more generated code.

Have you collected the results? If not, I've done it for you:

Benchmark for Metro 2033 Last Light Redux:
w/o events:
  1st run avg. fps: 36.06
  2nd run avg. fps: 35.87
w/ events:
  1st run avg. fps: 36.05
  2nd run avg. fps: 35.92

There is no measurable difference. It was run on an Intel Core i9-9900K
CPU @ 3.60GHz with integrated graphics.

> Assuming that will be fine, the only potentially problematic aspect that
> comes to mind is the fact that the meaning of these tracepoints is a bit
> different between execlists and guc. But maybe that is thinking too low
> level (!) - in fact they are in both cases at points where i915 is
> passing/receiving requests to/from hardware, so not an issue?

In my view, it is not an issue. The real issue now is that you cannot
collect performance results for Intel GPUs on Linux systems without
rebuilding the i915 module. You cannot debug performance problems on the
system even if you use tools from Intel.

Do you have an ETA for accepting this patch?

Thanks,
Egor
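As a rough check of the fps figures quoted above, the two runs per configuration can be averaged and compared. The sketch below is only illustrative: the helper names `mean` and `relative_change_pct` are made up for this example, and two runs per configuration are far too few for any statistical claim. It only shows that the gap between the two means (about 0.02 fps, roughly 0.06%) is smaller than the run-to-run spread within either configuration (0.19 fps and 0.13 fps).

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples; n must be non-zero. */
static double mean(const double *samples, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* Change of 'with' relative to 'without', in percent. */
static double relative_change_pct(double without, double with)
{
	return (with - without) / without * 100.0;
}

/* Average fps per run, copied from the message above. */
static const double fps_without_events[] = { 36.06, 35.87 };
static const double fps_with_events[] = { 36.05, 35.92 };
```

Calling `relative_change_pct(mean(fps_without_events, 2), mean(fps_with_events, 2))` gives the relative difference between the two configurations.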
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index aed7d207ea84..63bfac79a403 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -90,17 +90,6 @@ config DRM_I915_SELFTEST
 
           If in doubt, say "N".
 
-config DRM_I915_LOW_LEVEL_TRACEPOINTS
-        bool "Enable low level request tracing events"
-        depends on DRM_I915
-        default n
-        help
-          Choose this option to turn on low level request tracing events.
-          This provides the ability to precisely monitor engine utilisation
-          and also analyze the request dependency resolving timeline.
-
-          If in doubt, say "N".
-
 config DRM_I915_DEBUG_VBLANK_EVADE
         bool "Enable extra debug warnings for vblank evasion"
         depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 92f4c5bb7aa7..b8f037986ae2 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -702,7 +702,6 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
             TP_ARGS(req)
 );
 
-#if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS)
 DEFINE_EVENT(i915_gem_request, i915_gem_request_submit,
             TP_PROTO(struct drm_i915_gem_request *req),
             TP_ARGS(req)
@@ -751,29 +750,6 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_out,
             TP_PROTO(struct drm_i915_gem_request *req),
             TP_ARGS(req)
 );
-#else
-#if !defined(TRACE_HEADER_MULTI_READ)
-static inline void
-trace_i915_gem_request_submit(struct drm_i915_gem_request *req)
-{
-}
-
-static inline void
-trace_i915_gem_request_execute(struct drm_i915_gem_request *req)
-{
-}
-
-static inline void
-trace_i915_gem_request_in(struct drm_i915_gem_request *req, unsigned int port)
-{
-}
-
-static inline void
-trace_i915_gem_request_out(struct drm_i915_gem_request *req)
-{
-}
-#endif
-#endif
 
 TRACE_EVENT(intel_engine_notify,
             TP_PROTO(struct intel_engine_cs *engine, bool waiters),
There's no reason to hide those tracepoints.
Let's also remove the DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.debug | 11 -----------
 drivers/gpu/drm/i915/i915_trace.h  | 24 ------------------------
 2 files changed, 35 deletions(-)
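The pattern this patch removes can be sketched in plain userspace C. This is an analogy under stated assumptions, not the i915 code: `LOW_LEVEL_TRACEPOINTS` stands in for `CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS`, and `struct request`, `tracing_enabled`, `events_emitted`, `trace_request_submit` and `submit_request` are hypothetical names. A plain `bool` stands in for the kernel's runtime enable check (real tracepoints use static keys where available), which is the "no-op on the fast path" cost mentioned in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS. */
#ifndef LOW_LEVEL_TRACEPOINTS
#define LOW_LEVEL_TRACEPOINTS 1
#endif

/* Hypothetical request type; not the driver's structure. */
struct request {
	unsigned int seqno;
};

static bool tracing_enabled;        /* runtime switch, off by default */
static unsigned int events_emitted; /* counts events actually recorded */

#if LOW_LEVEL_TRACEPOINTS
/* Compiled in: the call site pays one branch while tracing is off. */
static inline void trace_request_submit(const struct request *req)
{
	if (!tracing_enabled)
		return;
	events_emitted++;
	fprintf(stderr, "request_submit: seqno=%u\n", req->seqno);
}
#else
/* Compiled out: an empty stub, so no tool can ever enable the event. */
static inline void trace_request_submit(const struct request *req)
{
	(void)req;
}
#endif

/* Hypothetical caller on the submission path. */
static void submit_request(const struct request *req)
{
	assert(req != NULL);
	trace_request_submit(req);
	/* the real submission work would happen here */
}
```

With the stub variant, a profiler has nothing to attach to unless the module is rebuilt, which is the situation the replies above describe. With the compiled-in variant, the event exists in every build and costs only the disabled-path check until something enables it.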