Message ID | 20210204121121.2660-3-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC,1/3] proc: Show GPU runtimes |
Hi Chris,

On Thu, 4 Feb 2021 at 12:11, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Register with /proc/gpu to provide the client runtimes for generic
> top-like overview, e.g. gnome-system-monitor can use this information to
> show the per-process multi-GPU usage.
>
Exposing this information to userspace sounds great IMHO and like the
proposed "channels" for the device engines.
If it were me, I would have the channel names a) exposed to userspace
and b) be a "fixed set".

Whereby with a "fixed set" I mean, we should have these akin to the KMS
UAPI properties, where we have core helpers exposing prop X/Y and there
should be no driver specific ones. This would allow for consistent and
deterministic userspace handling, even if some hardware/drivers do not
have all engines - say no copy engine.

> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_proc.c
> @@ -0,0 +1,66 @@
> +// SPDX-License-Identifier: MIT

Thanks for making these available under MIT.

> +/*
> + * Copyright © 2020 Intel Corporation

Might want to make this 2021 in the next revision.

HTH
Emil
Quoting Emil Velikov (2021-02-12 14:57:56)
> Hi Chris,
>
> On Thu, 4 Feb 2021 at 12:11, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Register with /proc/gpu to provide the client runtimes for generic
> > top-like overview, e.g. gnome-system-monitor can use this information to
> > show the per-process multi-GPU usage.
> >
> Exposing this information to userspace sounds great IMHO and like the
> proposed "channels" for the device engines.
> If it were me, I would have the channel names a) exposed to userspace
> and b) be a "fixed set".

- Total
- Graphics
- Compute
- Unified
- Video
- Copy
- Display
- Other

Enough versatility for the foreseeable future?
But plan for extension.

The other aspect then is the capacity of each channel. We can keep it
simple as the union/average (whichever the driver has to hand) runtime in
nanoseconds over all IP blocks within a channel.
-Chris
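To make the fixed channel set and the "average over all IP blocks" idea concrete, here is a minimal userspace C sketch. It is only an illustration of the proposal in this mail: the enum, the name table, and `gpu_channel_avg_ns()` are hypothetical names invented here and are not part of the posted series or of any kernel header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed channel set, mirroring the list proposed above. */
enum gpu_channel {
	GPU_CHANNEL_TOTAL,
	GPU_CHANNEL_GRAPHICS,
	GPU_CHANNEL_COMPUTE,
	GPU_CHANNEL_UNIFIED,
	GPU_CHANNEL_VIDEO,
	GPU_CHANNEL_COPY,
	GPU_CHANNEL_DISPLAY,
	GPU_CHANNEL_OTHER,
	GPU_CHANNEL_COUNT /* number of channels, not a channel itself */
};

/* Names a core helper could expose to userspace (assumed spelling). */
static const char * const gpu_channel_names[GPU_CHANNEL_COUNT] = {
	[GPU_CHANNEL_TOTAL]    = "Total",
	[GPU_CHANNEL_GRAPHICS] = "Graphics",
	[GPU_CHANNEL_COMPUTE]  = "Compute",
	[GPU_CHANNEL_UNIFIED]  = "Unified",
	[GPU_CHANNEL_VIDEO]    = "Video",
	[GPU_CHANNEL_COPY]     = "Copy",
	[GPU_CHANNEL_DISPLAY]  = "Display",
	[GPU_CHANNEL_OTHER]    = "Other",
};

/*
 * Average runtime in nanoseconds over all engines in one channel.
 * Returns 0 when the channel has no engines (e.g. no copy engine).
 */
static uint64_t gpu_channel_avg_ns(const uint64_t *engine_ns, size_t count)
{
	uint64_t sum = 0;
	size_t i;

	if (!count)
		return 0;

	for (i = 0; i < count; i++)
		sum += engine_ns[i];

	return sum / count;
}
```

A driver with three copy engines would pass their three busy times and report the single averaged value under "Copy"; a driver without a copy engine would report 0 for that channel, which keeps the set of names identical across devices.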
On Fri, 12 Feb 2021 at 15:16, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Emil Velikov (2021-02-12 14:57:56)
> > Hi Chris,
> >
> > On Thu, 4 Feb 2021 at 12:11, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > Register with /proc/gpu to provide the client runtimes for generic
> > > top-like overview, e.g. gnome-system-monitor can use this information to
> > > show the per-process multi-GPU usage.
> > >
> > Exposing this information to userspace sounds great IMHO and like the
> > proposed "channels" for the device engines.
> > If it were me, I would have the channel names a) exposed to userspace
> > and b) be a "fixed set".
>
> - Total
> - Graphics
> - Compute
> - Unified
> - Video
> - Copy
> - Display
> - Other
>
> Enough versatility for the foreseeable future?
> But plan for extension.
>
With a bit of documentation about "unified" (is it a metric also
counted towards any of the rest) it would be perfect.

For future extension one might consider splitting video into
encoder/decoder/post-processing.

> The other aspect then is the capacity of each channel. We can keep it
> simple as the union/average (whichever the driver has to hand) runtime in
> nanoseconds over all IP blocks within a channel.

Not sure what you mean with capacity. Are you referring to having
multiple instances of the same engine (say 3 separate copy engines)?
Personally I'm inclined to keep these separate entries, since some
hardware can have multiple ones.

For example - before the latest changes nouveau had 8 copy engines,
3+3 video 'generic' video (enc,dec)oder engines, amongst others.

Thanks
Emil
Quoting Emil Velikov (2021-02-12 15:45:04)
> On Fri, 12 Feb 2021 at 15:16, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Quoting Emil Velikov (2021-02-12 14:57:56)
> > > Hi Chris,
> > >
> > > On Thu, 4 Feb 2021 at 12:11, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > >
> > > > Register with /proc/gpu to provide the client runtimes for generic
> > > > top-like overview, e.g. gnome-system-monitor can use this information to
> > > > show the per-process multi-GPU usage.
> > > >
> > > Exposing this information to userspace sounds great IMHO and like the
> > > proposed "channels" for the device engines.
> > > If it were me, I would have the channel names a) exposed to userspace
> > > and b) be a "fixed set".
> >
> > - Total
> > - Graphics
> > - Compute
> > - Unified
> > - Video
> > - Copy
> > - Display
> > - Other
> >
> > Enough versatility for the foreseeable future?
> > But plan for extension.
> >
> With a bit of documentation about "unified" (is it a metric also
> counted towards any of the rest) it would be perfect.

With unified I was trying to find a place for things that are neither
wholly graphics nor compute, as some may prefer not to categorise
themselves as one or the other. Also whether or not some cores are more
compute than others (so should there be an AI/RT/ALU?)

> For future extension one might consider splitting video into
> encoder/decoder/post-processing.

Ok, I wasn't sure how commonly those functions were split on different
HW.

> > The other aspect then is the capacity of each channel. We can keep it
> > simple as the union/average (whichever the driver has to hand) runtime in
> > nanoseconds over all IP blocks within a channel.
>
> Not sure what you mean with capacity. Are you referring to having
> multiple instances of the same engine (say 3 separate copy engines)?
> Personally I'm inclined to keep these separate entries, since some
> hardware can have multiple ones.
>
> For example - before the latest changes nouveau had 8 copy engines,
> 3+3 video 'generic' video (enc,dec)oder engines, amongst others.

Yes, most HW have multiple engines within a family. Trying to keep it
simple, I thought of presenting just one runtime metric for the whole
channel. Especially for the single-line per device format I had picked :)

If we switch to a more extensible format,

    -'$device0' :
        -$channel0 : {
            Total : $total # avg/union over all engines
            Engines : [ $0, $1, ... ]
        }
        ...

    -'$device1' :
        ...

Using the same fixed channel names, and dev_name(), pesky concerns such
as keeping it as a simple scanf can be forgotten.
-Chris
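As a rough illustration of the nested layout sketched in this mail, the following userspace C helper formats one channel entry with its total and per-engine runtimes. The concrete syntax (one line per channel, comma-separated engines) and the function name `format_channel()` are assumptions made here for the example; the mail only outlines the shape and does not fix a grammar.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one channel entry, e.g.
 *   "  Copy : { Total : 600, Engines : [ 300, 900 ] }\n"
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * buf is too small to hold the whole entry.
 */
static int format_channel(char *buf, size_t len, const char *channel,
			  uint64_t total_ns,
			  const uint64_t *engine_ns, size_t nengines)
{
	size_t off = 0;
	size_t i;
	int n;

	n = snprintf(buf, len, "  %s : { Total : %llu, Engines : [",
		     channel, (unsigned long long)total_ns);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	/* One value per engine instance, so multiple engines stay visible. */
	for (i = 0; i < nengines; i++) {
		n = snprintf(buf + off, len - off, "%s %llu",
			     i ? "," : "",
			     (unsigned long long)engine_ns[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, " ] }\n");
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	off += (size_t)n;

	return (int)off;
}
```

Keeping both the channel total and the per-engine array in one entry is one way to serve both positions in the thread: a simple monitor reads only `Total`, while a tool that cares about, say, eight separate copy engines can read `Engines`.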
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index ce01634d4ea7..16171f65f5d1 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -104,6 +104,7 @@ gt-y += \
 	gt/intel_gt_irq.o \
 	gt/intel_gt_pm.o \
 	gt/intel_gt_pm_irq.o \
+	gt/intel_gt_proc.o \
 	gt/intel_gt_requests.o \
 	gt/intel_gtt.o \
 	gt/intel_llc.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index ca76f93bc03d..72199c13330d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -12,6 +12,7 @@
 #include "intel_gt_buffer_pool.h"
 #include "intel_gt_clock_utils.h"
 #include "intel_gt_pm.h"
+#include "intel_gt_proc.h"
 #include "intel_gt_requests.h"
 #include "intel_mocs.h"
 #include "intel_rc6.h"
@@ -373,6 +374,8 @@ void intel_gt_driver_register(struct intel_gt *gt)
 	intel_rps_driver_register(&gt->rps);
 
 	debugfs_gt_register(gt);
+
+	intel_gt_driver_register__proc(gt);
 }
 
 static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)
@@ -656,6 +659,8 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 {
 	intel_wakeref_t wakeref;
 
+	intel_gt_driver_unregister__proc(gt);
+
 	intel_rps_driver_unregister(&gt->rps);
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_proc.c b/drivers/gpu/drm/i915/gt/intel_gt_proc.c
new file mode 100644
index 000000000000..42db22326c7c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_proc.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/proc_gpu.h>
+
+#include "i915_drm_client.h"
+#include "i915_drv.h"
+#include "intel_gt.h"
+#include "intel_gt_pm.h"
+#include "intel_gt_proc.h"
+
+static void proc_runtime_pid(struct intel_gt *gt,
+			     struct pid *pid,
+			     struct proc_gpu_runtime *rt)
+{
+	struct i915_drm_clients *clients = &gt->i915->clients;
+
+	BUILD_BUG_ON(MAX_ENGINE_CLASS >= ARRAY_SIZE(rt->channel));
+
+	rt->device = i915_drm_clients_get_runtime(clients, pid, rt->channel);
+	rt->nchannel = MAX_ENGINE_CLASS + 1;
+}
+
+static void proc_runtime_device(struct intel_gt *gt,
+				struct pid *pid,
+				struct proc_gpu_runtime *rt)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	ktime_t dummy;
+
+	rt->nchannel = 0;
+	for_each_engine(engine, gt, id) {
+		rt->channel[rt->nchannel++] =
+			intel_engine_get_busy_time(engine, &dummy);
+		if (rt->nchannel == ARRAY_SIZE(rt->channel))
+			break;
+	}
+	rt->device = intel_gt_get_awake_time(gt);
+}
+
+static void proc_runtime(struct proc_gpu *pg,
+			 struct pid *pid,
+			 struct proc_gpu_runtime *rt)
+{
+	struct intel_gt *gt = container_of(pg, typeof(*gt), proc);
+
+	strscpy(rt->name, dev_name(gt->i915->drm.dev), sizeof(rt->name));
+	if (pid)
+		proc_runtime_pid(gt, pid, rt);
+	else
+		proc_runtime_device(gt, pid, rt);
+}
+
+void intel_gt_driver_register__proc(struct intel_gt *gt)
+{
+	gt->proc.fn = proc_runtime;
+	proc_gpu_register(&gt->proc);
+}
+
+void intel_gt_driver_unregister__proc(struct intel_gt *gt)
+{
+	proc_gpu_unregister(&gt->proc);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_proc.h b/drivers/gpu/drm/i915/gt/intel_gt_proc.h
new file mode 100644
index 000000000000..7a9bff0fb020
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_proc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef INTEL_GT_PROC_H
+#define INTEL_GT_PROC_H
+
+struct intel_gt;
+
+void intel_gt_driver_register__proc(struct intel_gt *gt);
+void intel_gt_driver_unregister__proc(struct intel_gt *gt);
+
+#endif /* INTEL_GT_PROC_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 626af37c7790..3fc6d9741764 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -10,6 +10,7 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/notifier.h>
+#include <linux/proc_gpu.h>
 #include <linux/spinlock.h>
 #include <linux/types.h>
 
@@ -135,6 +136,8 @@ struct intel_gt {
 
 	struct i915_vma *scratch;
 
+	struct proc_gpu proc;
+
 	struct intel_gt_info {
 		intel_engine_mask_t engine_mask;
 		u8 num_engines;
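The patch embeds a `struct proc_gpu` in `struct intel_gt` and recovers the owning object in the callback with `container_of()`. The userspace sketch below mimics that pattern so it can be compiled and tried outside the kernel. Everything in it is a stand-in: `<linux/proc_gpu.h>` comes from another patch in the series and is not shown here, so the field layout of `struct proc_gpu_runtime`, the array sizes, the opaque `pid` argument, and the `mock_gt` type are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Assumed shape of the runtime report; the real header is not shown. */
struct proc_gpu_runtime {
	char name[32];
	uint64_t device;
	uint64_t channel[8];
	unsigned int nchannel;
};

/* Registration handle carrying only the callback (assumed). */
struct proc_gpu {
	void (*fn)(struct proc_gpu *pg, const void *pid,
		   struct proc_gpu_runtime *rt);
};

/* Hypothetical driver object that embeds the handle, like intel_gt. */
struct mock_gt {
	const char *name;
	uint64_t awake_ns;
	struct proc_gpu proc;
};

static void mock_runtime(struct proc_gpu *pg, const void *pid,
			 struct proc_gpu_runtime *rt)
{
	/* Walk back from the embedded member to the enclosing object. */
	struct mock_gt *gt = container_of(pg, struct mock_gt, proc);

	(void)pid; /* per-client lookup is omitted in this sketch */

	strncpy(rt->name, gt->name, sizeof(rt->name) - 1);
	rt->name[sizeof(rt->name) - 1] = '\0';
	rt->device = gt->awake_ns;
	rt->nchannel = 0;
}
```

The point of the pattern is that the core only ever sees the `struct proc_gpu` pointer it was given at registration, yet the driver callback can still reach its private state without a separate lookup table.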
Register with /proc/gpu to provide the client runtimes for generic
top-like overview, e.g. gnome-system-monitor can use this information to
show the per-process multi-GPU usage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile            |  1 +
 drivers/gpu/drm/i915/gt/intel_gt.c       |  5 ++
 drivers/gpu/drm/i915/gt/intel_gt_proc.c  | 66 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_proc.h  | 14 +++++
 drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 ++
 5 files changed, 89 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_proc.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_proc.h
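
For a top-like consumer of these runtimes, the usual approach is to sample the cumulative counter twice and divide the difference by the sampling interval. The helper below is a minimal sketch of that arithmetic, assuming the values are cumulative runtimes in nanoseconds as discussed in the thread; `busy_percent()` is a hypothetical name and is not taken from the series or from gnome-system-monitor.

```c
#include <stdint.h>

/*
 * Busy percentage over one sampling interval, computed from two
 * cumulative runtime readings in nanoseconds. Returns 0 for an empty
 * interval or if the counter appears to have gone backwards.
 */
static double busy_percent(uint64_t prev_ns, uint64_t now_ns,
			   uint64_t interval_ns)
{
	if (!interval_ns || now_ns < prev_ns)
		return 0.0;

	return 100.0 * (double)(now_ns - prev_ns) / (double)interval_ns;
}
```

For a channel reported as an average over several engines the result stays within 0 to 100; if a driver reported a sum instead, the caller would need to divide by the engine count first.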