diff mbox series

[v2,3/3] drm/i915/perf: enable filtering on multiple contexts

Message ID 20200331114821.81957-3-lionel.g.landwerlin@intel.com (mailing list archive)
State New, archived
Headers show
Series [v2,1/3] drm/i915/perf: break OA config buffer object in 2 | expand

Commit Message

Lionel Landwerlin March 31, 2020, 11:48 a.m. UTC
Add 2 new properties to the i915-perf open ioctl to specify an array
of GEM context handles as well as the length of the array.

This can be used by drivers using multiple GEM contexts to implement a
single GL context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 58 ++++++++++++++++++++++++++++++--
 include/uapi/drm/i915_drm.h      | 21 ++++++++++++
 2 files changed, 76 insertions(+), 3 deletions(-)

Comments

Chris Wilson March 31, 2020, 6:08 p.m. UTC | #1
Quoting Lionel Landwerlin (2020-03-31 12:48:21)
> Add 2 new properties to the i915-perf open ioctl to specify an array
> of GEM context handles as well as the length of the array.

Hmm. The other thought is ctx->engine[] where one context may have more
than one logical context instance that OA could track. The extension to
track multiple pinned contexts should equally work for multiple engines.

	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
	struct drm_i915_gem_context_param p = {
		.ctx_id = gem_context_create(i915),
		.param = I915_CONTEXT_PARAM_ENGINES,
		.value = to_user_pointer(&engines),
		.size = sizeof(struct i915_context_param_engines),
	};
	gem_context_set_param(i915, &p);

would do the trick in creating one context with 64 rcs0 that they may
want to track. And that also opens the door to virtual engines over top.
-Chris
Lionel Landwerlin April 6, 2020, 1:54 p.m. UTC | #2
On 31/03/2020 21:08, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2020-03-31 12:48:21)
>> Add 2 new properties to the i915-perf open ioctl to specify an array
>> of GEM context handles as well as the length of the array.
> Hmm. The other thought is ctx->engine[] where one context may have more
> than one logical context instance that OA could track. The extension to
> track multiple pinned contexts should equally work for multiple engines.
>
> 	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
> 	struct drm_i915_gem_context_param p = {
> 		.ctx_id = gem_context_create(i915),
> 		.param = I915_CONTEXT_PARAM_ENGINES,
> 		.value = to_user_pointer(&engines),
> 		.size = sizeof(struct i915_context_param_engines),
> 	};
> 	gem_context_set_param(i915, &p);
>
> would do the trick in creating one context with 64 rcs0 that they may
> want to track. And that also opens the door to virtual engines over top.
> -Chris


I rather punt this away for now :)

I can't see use cases for Iris/Vulkan.

This seems more of a media thing where we have multiple engines already.

And media doesn't do much perf queries because much of the available 
counters are 3D/compute and queries are not available on media engines.


We can always bump the perf revision later once we've figured how this 
would be used.


-Lionel
Chris Wilson April 6, 2020, 1:59 p.m. UTC | #3
Quoting Lionel Landwerlin (2020-04-06 14:54:38)
> On 31/03/2020 21:08, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2020-03-31 12:48:21)
> >> Add 2 new properties to the i915-perf open ioctl to specify an array
> >> of GEM context handles as well as the length of the array.
> > Hmm. The other thought is ctx->engine[] where one context may have more
> > than one logical context instance that OA could track. The extension to
> > track multiple pinned contexts should equally work for multiple engines.
> >
> >       I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
> >       struct drm_i915_gem_context_param p = {
> >               .ctx_id = gem_context_create(i915),
> >               .param = I915_CONTEXT_PARAM_ENGINES,
> >               .value = to_user_pointer(&engines),
> >               .size = sizeof(struct i915_context_param_engines),
> >       };
> >       gem_context_set_param(i915, &p);
> >
> > would do the trick in creating one context with 64 rcs0 that they may
> > want to track. And that also opens the door to virtual engines over top.
> > -Chris
> 
> 
> I rather punt this away for now :)
> 
> I can't see use cases for Iris/Vulkan.

There's immediate use cases for iris, since it uses 2 contexts instead
of 2 logical instances within one GL/GEM context.

The changes you have here trivially accommodate supporting user defined
engines[], it seems a waste not to.
-Chris
Lionel Landwerlin April 6, 2020, 2:07 p.m. UTC | #4
On 06/04/2020 16:59, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2020-04-06 14:54:38)
>> On 31/03/2020 21:08, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2020-03-31 12:48:21)
>>>> Add 2 new properties to the i915-perf open ioctl to specify an array
>>>> of GEM context handles as well as the length of the array.
>>> Hmm. The other thought is ctx->engine[] where one context may have more
>>> than one logical context instance that OA could track. The extension to
>>> track multiple pinned contexts should equally work for multiple engines.
>>>
>>>        I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
>>>        struct drm_i915_gem_context_param p = {
>>>                .ctx_id = gem_context_create(i915),
>>>                .param = I915_CONTEXT_PARAM_ENGINES,
>>>                .value = to_user_pointer(&engines),
>>>                .size = sizeof(struct i915_context_param_engines),
>>>        };
>>>        gem_context_set_param(i915, &p);
>>>
>>> would do the trick in creating one context with 64 rcs0 that they may
>>> want to track. And that also opens the door to virtual engines over top.
>>> -Chris
>>
>> I rather punt this away for now :)
>>
>> I can't see use cases for Iris/Vulkan.
> There's immediate use cases for iris, since it uses 2 contexts instead
> of 2 logical instances within one GL/GEM context.


I don't understand this. Are you saying Iris could use the stuff from 
above and still select what logical context to dispatch to?


It really wants to pin a given logical context to particular set of 
commands.


-Lionel


>
> The changes you have here trivially accommodate supporting user defined
> engines[], it seems a waste not to.
> -Chris
Chris Wilson April 6, 2020, 2:27 p.m. UTC | #5
Quoting Lionel Landwerlin (2020-04-06 15:07:30)
> On 06/04/2020 16:59, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2020-04-06 14:54:38)
> >> On 31/03/2020 21:08, Chris Wilson wrote:
> >>> Quoting Lionel Landwerlin (2020-03-31 12:48:21)
> >>>> Add 2 new properties to the i915-perf open ioctl to specify an array
> >>>> of GEM context handles as well as the length of the array.
> >>> Hmm. The other thought is ctx->engine[] where one context may have more
> >>> than one logical context instance that OA could track. The extension to
> >>> track multiple pinned contexts should equally work for multiple engines.
> >>>
> >>>        I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
> >>>        struct drm_i915_gem_context_param p = {
> >>>                .ctx_id = gem_context_create(i915),
> >>>                .param = I915_CONTEXT_PARAM_ENGINES,
> >>>                .value = to_user_pointer(&engines),
> >>>                .size = sizeof(struct i915_context_param_engines),
> >>>        };
> >>>        gem_context_set_param(i915, &p);
> >>>
> >>> would do the trick in creating one context with 64 rcs0 that they may
> >>> want to track. And that also opens the door to virtual engines over top.
> >>> -Chris
> >>
> >> I rather punt this away for now :)
> >>
> >> I can't see use cases for Iris/Vulkan.
> > There's immediate use cases for iris, since it uses 2 contexts instead
> > of 2 logical instances within one GL/GEM context.
> 
> 
> I don't understand this. Are you saying Iris could use the stuff from 
> above and still select what logical context to dispatch to?

Yes. And can be done so only by changing the context setup to create one
context with two independent rcs engines (different logical GPU state,
and rings, *only* sharing the same vm). And since legacy EXEC_DEFAULT[0]
and EXEC_RENDER[1] alias to the same engine, very little code needs to be
changed to support (1 ctx, 2 engines) vs (2 ctx, 1 engine).
-Chris
Lionel Landwerlin April 6, 2020, 2:30 p.m. UTC | #6
On 06/04/2020 17:27, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2020-04-06 15:07:30)
>> On 06/04/2020 16:59, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2020-04-06 14:54:38)
>>>> On 31/03/2020 21:08, Chris Wilson wrote:
>>>>> Quoting Lionel Landwerlin (2020-03-31 12:48:21)
>>>>>> Add 2 new properties to the i915-perf open ioctl to specify an array
>>>>>> of GEM context handles as well as the length of the array.
>>>>> Hmm. The other thought is ctx->engine[] where one context may have more
>>>>> than one logical context instance that OA could track. The extension to
>>>>> track multiple pinned contexts should equally work for multiple engines.
>>>>>
>>>>>         I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
>>>>>         struct drm_i915_gem_context_param p = {
>>>>>                 .ctx_id = gem_context_create(i915),
>>>>>                 .param = I915_CONTEXT_PARAM_ENGINES,
>>>>>                 .value = to_user_pointer(&engines),
>>>>>                 .size = sizeof(struct i915_context_param_engines),
>>>>>         };
>>>>>         gem_context_set_param(i915, &p);
>>>>>
>>>>> would do the trick in creating one context with 64 rcs0 that they may
>>>>> want to track. And that also opens the door to virtual engines over top.
>>>>> -Chris
>>>> I rather punt this away for now :)
>>>>
>>>> I can't see use cases for Iris/Vulkan.
>>> There's immediate use cases for iris, since it uses 2 contexts instead
>>> of 2 logical instances within one GL/GEM context.
>>
>> I don't understand this. Are you saying Iris could use the stuff from
>> above and still select what logical context to dispatch to?
> Yes. And can be done so only by changing the context setup to create one
> context with two independent rcs engines (different logical GPU state,
> and rings, *only* sharing the same vm). And since legacy EXEC_DEFAULT[0]
> and EXEC_RENDER[1] alias to the same engine, very little code needs to be
> changed to support (1 ctx, 2 engines) vs (2 ctx, 1 engine).
> -Chris

Sounds like there could a performance gain somewhere...

Will look into this.


-Lionel
Chris Wilson April 6, 2020, 2:37 p.m. UTC | #7
Quoting Lionel Landwerlin (2020-04-06 15:30:38)
> On 06/04/2020 17:27, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2020-04-06 15:07:30)
> >> On 06/04/2020 16:59, Chris Wilson wrote:
> >>> Quoting Lionel Landwerlin (2020-04-06 14:54:38)
> >>>> On 31/03/2020 21:08, Chris Wilson wrote:
> >>>>> Quoting Lionel Landwerlin (2020-03-31 12:48:21)
> >>>>>> Add 2 new properties to the i915-perf open ioctl to specify an array
> >>>>>> of GEM context handles as well as the length of the array.
> >>>>> Hmm. The other thought is ctx->engine[] where one context may have more
> >>>>> than one logical context instance that OA could track. The extension to
> >>>>> track multiple pinned contexts should equally work for multiple engines.
> >>>>>
> >>>>>         I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 64) = {};
> >>>>>         struct drm_i915_gem_context_param p = {
> >>>>>                 .ctx_id = gem_context_create(i915),
> >>>>>                 .param = I915_CONTEXT_PARAM_ENGINES,
> >>>>>                 .value = to_user_pointer(&engines),
> >>>>>                 .size = sizeof(struct i915_context_param_engines),
> >>>>>         };
> >>>>>         gem_context_set_param(i915, &p);
> >>>>>
> >>>>> would do the trick in creating one context with 64 rcs0 that they may
> >>>>> want to track. And that also opens the door to virtual engines over top.
> >>>>> -Chris
> >>>> I rather punt this away for now :)
> >>>>
> >>>> I can't see use cases for Iris/Vulkan.
> >>> There's immediate use cases for iris, since it uses 2 contexts instead
> >>> of 2 logical instances within one GL/GEM context.
> >>
> >> I don't understand this. Are you saying Iris could use the stuff from
> >> above and still select what logical context to dispatch to?
> > Yes. And can be done so only by changing the context setup to create one
> > context with two independent rcs engines (different logical GPU state,
> > and rings, *only* sharing the same vm). And since legacy EXEC_DEFAULT[0]
> > and EXEC_RENDER[1] alias to the same engine, very little code needs to be
> > changed to support (1 ctx, 2 engines) vs (2 ctx, 1 engine).
> > -Chris
> 
> Sounds like there could a performance gain somewhere...
> 
> Will look into this.

A year old, so a bit stale,
https://gitlab.freedesktop.org/ickle/mesa/-/commits/iris-composite-context

Note it doesn't setup a single shared timeline as Kenneth wanted to keep
explicit control over the fencing between the pipelines.

* fingers crossed that's the right version.
-Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 112bb5bd6665..a6d82329c178 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3696,7 +3696,8 @@  static int read_properties_unlocked(struct i915_perf *perf,
 				    struct perf_open_properties *props)
 {
 	u64 __user *uprop = uprops;
-	u32 i;
+	u32 __user *uctx_handles = NULL;
+	u32 i, n_uctx_handles = 0;
 	int err;
 
 	memset(props, 0, sizeof(struct perf_open_properties));
@@ -3747,7 +3748,7 @@  static int read_properties_unlocked(struct i915_perf *perf,
 
 		switch ((enum drm_i915_perf_property_id)id) {
 		case DRM_I915_PERF_PROP_CTX_HANDLE:
-			if (props->n_ctx_handles > 0) {
+			if (props->n_ctx_handles > 0 || n_uctx_handles > 0) {
 				DRM_DEBUG("Context handle specified multiple times\n");
 				err = -EINVAL;
 				goto error;
@@ -3859,6 +3860,38 @@  static int read_properties_unlocked(struct i915_perf *perf,
 			}
 			props->poll_oa_period = value;
 			break;
+		case DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY:
+			/* HSW can only filter in HW and only on a single
+			 * context.
+			 */
+			if (IS_HASWELL(perf->i915)) {
+				DRM_DEBUG("Multi context filter not supported on HSW\n");
+				err = -ENODEV;
+				goto error;
+			}
+			uctx_handles = u64_to_user_ptr(value);
+			break;
+		case DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY_LENGTH:
+			if (IS_HASWELL(perf->i915)) {
+				DRM_DEBUG("Multi context filter not supported on HSW\n");
+				err = -ENODEV;
+				goto error;
+			}
+			if (props->n_ctx_handles > 0 || n_uctx_handles > 0) {
+				DRM_DEBUG("Context handle specified multiple times\n");
+				err = -EINVAL;
+				goto error;
+			}
+			props->ctx_handles =
+				kmalloc_array(value,
+					      sizeof(*props->ctx_handles),
+					      GFP_KERNEL);
+			if (!props->ctx_handles) {
+				err = -ENOMEM;
+				goto error;
+			}
+			n_uctx_handles = value;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			MISSING_CASE(id);
 			err = -EINVAL;
@@ -3868,6 +3901,21 @@  static int read_properties_unlocked(struct i915_perf *perf,
 		uprop += 2;
 	}
 
+	if (n_uctx_handles > 0 && props->n_ctx_handles > 0) {
+		DRM_DEBUG("Context handle specified multiple times\n");
+		err = -EINVAL;
+		goto error;
+	}
+
+	for (i = 0; i < n_uctx_handles; i++) {
+		err = get_user(props->ctx_handles[i], uctx_handles);
+		if (err)
+			goto error;
+
+		uctx_handles++;
+		props->n_ctx_handles++;
+	}
+
 	return 0;
 
 error:
@@ -4653,8 +4701,12 @@  int i915_perf_ioctl_version(void)
 	 *    interval for the hrtimer used to check for OA data.
 	 *
 	 * 6. Add edge trigger report generation support.
+	 *
+	 * 7: Add DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY &
+	 *    DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY_LENGTH to allow an
+	 *    application monitor/pin multiple contexts.
 	 */
-	return 6;
+	return 7;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 14b67cd6b54b..d2ec2eb98aed 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1993,6 +1993,27 @@  enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_POLL_OA_PERIOD,
 
+	/**
+	 * Specifies an array of u32 GEM context handles to filter reports
+	 * with.
+	 *
+	 * Using this parameter is incompatible with using
+	 * DRM_I915_PERF_PROP_CTX_HANDLE.
+	 *
+	 * This property is available in perf revision 7.
+	 */
+	DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY,
+
+	/**
+	 * Specifies the length of the array specified with
+	 * DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY.
+	 *
+	 * The length must be in the range [1, 4].
+	 *
+	 * This property is available in perf revision 7.
+	 */
+	DRM_I915_PERF_PROP_CTX_HANDLE_ARRAY_LENGTH,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };