diff mbox series

[v12,5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

Message ID 20250324230931.63840-6-jonathan.cavitt@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/xe/xe_vm: Implement xe_vm_get_property_ioctl | expand

Commit Message

Cavitt, Jonathan March 24, 2025, 11:09 p.m. UTC
Add support for userspace to request a list of observed faults
from a specified VM.

v2:
- Only allow querying of failed pagefaults (Matt Brost)

v3:
- Remove unnecessary size parameter from helper function, as it
  is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jainxun)
- Set address_precision to 1 (Jainxun)
- Report max size instead of dynamic size for memory allocation
  purposes.  Total memory usage is reported separately.

v4:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)

v5:
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
- Update fill_property_pfs to eliminate need for kzalloc (Jianxun)

v6:
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
  (jcavitt)

v7:
- Revert back to using xe_vm_get_property_ioctl
- Apply better copy_to_user logic (jcavitt)

v8:
- Fix and clean up error value handling in ioctl (jcavitt)
- Reapply return mask for fill_faults (jcavitt)

v9:
- Future-proof size logic for zero-size properties (jcavitt)
- Add access and fault types (Jianxun)
- Remove address type (Jianxun)

v10:
- Remove unnecessary switch case logic (Raag)
- Compress size get, size validation, and property fill functions into a
  single helper function (jcavitt)
- Assert valid size (jcavitt)

v11:
- Remove unnecessary else condition
- Correct backwards helper function size logic (jcavitt)

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Jainxun Zhang <jianxun.zhang@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c |  3 ++
 drivers/gpu/drm/xe/xe_vm.c     | 96 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h     |  2 +
 3 files changed, 101 insertions(+)

Comments

Raag Jadav March 25, 2025, 7:39 a.m. UTC | #1
On Mon, Mar 24, 2025 at 11:09:28PM +0000, Jonathan Cavitt wrote:
> Add support for userspace to request a list of observed faults
> from a specified VM.

...

> v10:
> - Remove unnecessary switch case logic (Raag)

This is usually "changes present in version" and not "comments received
in version" but I guess this must be one of those drm things.

...

> +static int xe_vm_get_property_helper(struct xe_vm *vm,
> +				     struct drm_xe_vm_get_property *args)
> +{
> +	int size;
> +
> +	switch (args->property) {
> +	case DRM_XE_VM_GET_PROPERTY_FAULTS:
> +		spin_lock(&vm->faults.lock);
> +		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> +		spin_unlock(&vm->faults.lock);
> +
> +		if (args->size)
> +			/*
> +			 * Number of faults may increase between calls to
> +			 * xe_vm_get_property_ioctl, so just report the
> +			 * number of faults the user requests if it's less
> +			 * than or equal to the number of faults in the VM
> +			 * fault array.
> +			 */
> +			return args->size <= size ? fill_faults(vm, args) : -EINVAL;

You're comparing an int with u32 and I'm not sure how this plays out.
The usual practice is to use size_t (even in the struct) but I'm not aware
of its userspace counterpart.

Raag
Cavitt, Jonathan March 25, 2025, 2:45 p.m. UTC | #2
-----Original Message-----
From: Jadav, Raag <raag.jadav@intel.com> 
Sent: Tuesday, March 25, 2025 12:40 AM
To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
Cc: intel-xe@lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Wajdeczko, Michal <Michal.Wajdeczko@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>
Subject: Re: [PATCH v12 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
> 
> On Mon, Mar 24, 2025 at 11:09:28PM +0000, Jonathan Cavitt wrote:
> > Add support for userspace to request a list of observed faults
> > from a specified VM.
> 
> ...
> 
> > v10:
> > - Remove unnecessary switch case logic (Raag)
> 
> This is usually "changes present in version" and not "comments received
> in version" but I guess this must be one of those drm things.

I'm fairly certain change logs are supposed to be written in the future tense.
Or at the very least, I think I picked up the habit in college.  Is this not correct?

> 
> ...
> 
> > +static int xe_vm_get_property_helper(struct xe_vm *vm,
> > +				     struct drm_xe_vm_get_property *args)
> > +{
> > +	int size;
> > +
> > +	switch (args->property) {
> > +	case DRM_XE_VM_GET_PROPERTY_FAULTS:
> > +		spin_lock(&vm->faults.lock);
> > +		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> > +		spin_unlock(&vm->faults.lock);
> > +
> > +		if (args->size)
> > +			/*
> > +			 * Number of faults may increase between calls to
> > +			 * xe_vm_get_property_ioctl, so just report the
> > +			 * number of faults the user requests if it's less
> > +			 * than or equal to the number of faults in the VM
> > +			 * fault array.
> > +			 */
> > +			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
> 
> You're comparing an int with u32 and I'm not sure how this plays out.
> The usual practice is to use size_t (even in the struct) 

Size is a u32 in struct drm_xe_device_query.

> but I'm not aware of its userspace counterpart.

You mean the associated IGT tests?  The interface has been changing so much
that I've been putting off making the IGT tests until the interface has been
finalized.

Or do you mean the use case?  Because that's
https://registry.khronos.org/vulkan/specs/latest/man/html/VK_EXT_device_fault.html
-Jonathan Cavitt

> 
> Raag
>
Jianxun Zhang March 25, 2025, 6:09 p.m. UTC | #3
On 3/24/25 16:09, Jonathan Cavitt wrote:
> Add support for userspace to request a list of observed faults
> from a specified VM.
> 
> v2:
> - Only allow querying of failed pagefaults (Matt Brost)
> 
> v3:
> - Remove unnecessary size parameter from helper function, as it
>    is a property of the arguments. (jcavitt)
> - Remove unnecessary copy_from_user (Jainxun)
> - Set address_precision to 1 (Jainxun)
> - Report max size instead of dynamic size for memory allocation
>    purposes.  Total memory usage is reported separately.
> 
> v4:
> - Return int from xe_vm_get_property_size (Shuicheng)
> - Fix memory leak (Shuicheng)
> - Remove unnecessary size variable (jcavitt)
> 
> v5:
> - Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
> - Update fill_property_pfs to eliminate need for kzalloc (Jianxun)
> 
> v6:
> - Repair and move fill_faults break condition (Dan Carpenter)
> - Free vm after use (jcavitt)
> - Combine assertions (jcavitt)
> - Expand size check in xe_vm_get_faults_ioctl (jcavitt)
> - Remove return mask from fill_faults, as return is already -EFAULT or 0
>    (jcavitt)
> 
> v7:
> - Revert back to using xe_vm_get_property_ioctl
> - Apply better copy_to_user logic (jcavitt)
> 
> v8:
> - Fix and clean up error value handling in ioctl (jcavitt)
> - Reapply return mask for fill_faults (jcavitt)
> 
> v9:
> - Future-proof size logic for zero-size properties (jcavitt)
> - Add access and fault types (Jianxun)
> - Remove address type (Jianxun)
> 
> v10:
> - Remove unnecessary switch case logic (Raag)
> - Compress size get, size validation, and property fill functions into a
>    single helper function (jcavitt)
> - Assert valid size (jcavitt)
> 
> v11:
> - Remove unnecessary else condition
> - Correct backwards helper function size logic (jcavitt)
> 
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Cc: Jainxun Zhang <jianxun.zhang@intel.com>
> Cc: Shuicheng Lin <shuicheng.lin@intel.com>
> Cc: Raag Jadav <raag.jadav@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c |  3 ++
>   drivers/gpu/drm/xe/xe_vm.c     | 96 ++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_vm.h     |  2 +
>   3 files changed, 101 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 1ffb7d1f6be6..02f84a855502 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -195,6 +195,9 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
>   	DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl,
>   			  DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
> +			  DRM_RENDER_ALLOW),
> +
>   };
>   
>   static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 9a627ba17f55..3ed50d6f1f42 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -43,6 +43,14 @@
>   #include "xe_wa.h"
>   #include "xe_hmm.h"
>   
> +static const u16 xe_to_user_engine_class[] = {
> +	[XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER,
> +	[XE_ENGINE_CLASS_COPY] = DRM_XE_ENGINE_CLASS_COPY,
> +	[XE_ENGINE_CLASS_VIDEO_DECODE] = DRM_XE_ENGINE_CLASS_VIDEO_DECODE,
> +	[XE_ENGINE_CLASS_VIDEO_ENHANCE] = DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE,
> +	[XE_ENGINE_CLASS_COMPUTE] = DRM_XE_ENGINE_CLASS_COMPUTE,
Unless we can completely rule it out, page faults can come from other 
sources (engine class 0x4),so we should have it in the mapping.

But once it is added, the mapping would very likely be a completed 1:1 
translation, so it becomes unnecessary and can be replaced by just 
reporting the raw value to UMD. If we want to keep both KMD and UMD at 
the same page and aligned with the spec (we shoud), we can define an 
enum or a group of macros as identifiers used in both sides to make 
things clear.

Of course, KMD may want to have such mapping internally for some other 
places. But from the perspective of the interface, the above is my argument.
> +
>   static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
>   {
>   	return vm->gpuvm.r_obj;
> @@ -3553,6 +3561,94 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>   	return err;
>   }
>   
> +static int fill_faults(struct xe_vm *vm,
> +		       struct drm_xe_vm_get_property *args)
> +{
> +	struct xe_vm_fault __user *usr_ptr = u64_to_user_ptr(args->data);
> +	struct xe_vm_fault store = { 0 };
> +	struct xe_vm_fault_entry *entry;
> +	int ret = 0, i = 0, count, entry_size;
> +
> +	entry_size = sizeof(struct xe_vm_fault);
> +	count = args->size / entry_size;
> +
> +	spin_lock(&vm->faults.lock);
> +	list_for_each_entry(entry, &vm->faults.list, list) {
> +		if (i++ == count)
> +			break;
> +
> +		memset(&store, 0, entry_size);
> +
> +		store.address = entry->address;
> +		store.address_precision = entry->address_precision;
> +		store.access_type = entry->access_type;
> +		store.fault_type = entry->fault_type;
> +		store.fault_level = entry->fault_level;
> +		store.engine_class = xe_to_user_engine_class[entry->engine_class];
> +		store.engine_instance = entry->engine_instance;
> +
> +		ret = copy_to_user(usr_ptr, &store, entry_size);
> +		if (ret)
> +			break;
> +
> +		usr_ptr++;
> +	}
> +	spin_unlock(&vm->faults.lock);
> +
> +	return ret ? -EFAULT : 0;
> +}
> +
> +static int xe_vm_get_property_helper(struct xe_vm *vm,
> +				     struct drm_xe_vm_get_property *args)
> +{
> +	int size;
> +
> +	switch (args->property) {
> +	case DRM_XE_VM_GET_PROPERTY_FAULTS:
> +		spin_lock(&vm->faults.lock);
> +		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> +		spin_unlock(&vm->faults.lock);
> +
> +		if (args->size)
> +			/*
> +			 * Number of faults may increase between calls to
> +			 * xe_vm_get_property_ioctl, so just report the
> +			 * number of faults the user requests if it's less
> +			 * than or equal to the number of faults in the VM
> +			 * fault array.
> +			 */
> +			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
> +
> +		args->size = size;
> +		return 0;
> +	}
> +	return -EINVAL;
> +}
> +
> +int xe_vm_get_property_ioctl(struct drm_device *drm, void *data,
> +			     struct drm_file *file)
> +{
> +	struct xe_device *xe = to_xe_device(drm);
> +	struct xe_file *xef = to_xe_file(file);
> +	struct drm_xe_vm_get_property *args = data;
> +	struct xe_vm *vm;
> +	int ret = 0;
> +
> +	if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
> +		return -EINVAL;
> +	if (XE_IOCTL_DBG(xe, args->size < 0))
> +		return -EINVAL;
> +
> +	vm = xe_vm_lookup(xef, args->vm_id);
> +	if (XE_IOCTL_DBG(xe, !vm))
> +		return -ENOENT;
> +
> +	ret = xe_vm_get_property_helper(vm, args);
> +
> +	xe_vm_put(vm);
> +	return ret;
> +}
> +
>   /**
>    * xe_vm_bind_kernel_bo - bind a kernel BO to a VM
>    * @vm: VM to bind the BO to
> diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> index 9bd7e93824da..63ec22458e04 100644
> --- a/drivers/gpu/drm/xe/xe_vm.h
> +++ b/drivers/gpu/drm/xe/xe_vm.h
> @@ -196,6 +196,8 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data,
>   			struct drm_file *file);
>   int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
>   		     struct drm_file *file);
> +int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
> +			     struct drm_file *file);
>   
>   void xe_vm_close_and_put(struct xe_vm *vm);
>
Raag Jadav March 25, 2025, 11:45 p.m. UTC | #4
On Tue, Mar 25, 2025 at 08:15:59PM +0530, Cavitt, Jonathan wrote:
> From: Jadav, Raag <raag.jadav@intel.com> 
> > On Mon, Mar 24, 2025 at 11:09:28PM +0000, Jonathan Cavitt wrote:
> > > Add support for userspace to request a list of observed faults
> > > from a specified VM.
> > 
> > ...
> > 
> > > v10:
> > > - Remove unnecessary switch case logic (Raag)
> > 
> > This is usually "changes present in version" and not "comments received
> > in version" but I guess this must be one of those drm things.
> 
> I'm fairly certain change logs are supposed to be written in the future tense.
> Or at the very least, I think I picked up the habit in college.  Is this not correct?

I meant it belongs to v11.

> > > +static int xe_vm_get_property_helper(struct xe_vm *vm,
> > > +				     struct drm_xe_vm_get_property *args)
> > > +{
> > > +	int size;
> > > +
> > > +	switch (args->property) {
> > > +	case DRM_XE_VM_GET_PROPERTY_FAULTS:
> > > +		spin_lock(&vm->faults.lock);
> > > +		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> > > +		spin_unlock(&vm->faults.lock);
> > > +
> > > +		if (args->size)
> > > +			/*
> > > +			 * Number of faults may increase between calls to
> > > +			 * xe_vm_get_property_ioctl, so just report the
> > > +			 * number of faults the user requests if it's less
> > > +			 * than or equal to the number of faults in the VM
> > > +			 * fault array.
> > > +			 */
> > > +			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
> > 
> > You're comparing an int with u32 and I'm not sure how this plays out.
> > The usual practice is to use size_t (even in the struct) 
> 
> Size is a u32 in struct drm_xe_device_query.

And what about the local one?

Raag
Cavitt, Jonathan March 26, 2025, 2:21 p.m. UTC | #5
-----Original Message-----
> From: Jadav, Raag <raag.jadav@intel.com> 
> Sent: Tuesday, March 25, 2025 4:45 PM
> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
> Cc: intel-xe@lists.freedesktop.org; Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Zhang, Jianxun <jianxun.zhang@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; dri-devel@lists.freedesktop.org; Wajdeczko, Michal <Michal.Wajdeczko@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>
> Subject: Re: [PATCH v12 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
> 
> On Tue, Mar 25, 2025 at 08:15:59PM +0530, Cavitt, Jonathan wrote:
> > From: Jadav, Raag <raag.jadav@intel.com> 
> > > On Mon, Mar 24, 2025 at 11:09:28PM +0000, Jonathan Cavitt wrote:
> > > > Add support for userspace to request a list of observed faults
> > > > from a specified VM.
> > > 
> > > ...
> > > 
> > > > v10:
> > > > - Remove unnecessary switch case logic (Raag)
> > > 
> > > This is usually "changes present in version" and not "comments received
> > > in version" but I guess this must be one of those drm things.
> > 
> > I'm fairly certain change logs are supposed to be written in the future tense.
> > Or at the very least, I think I picked up the habit in college.  Is this not correct?
> 
> I meant it belongs to v11.

Apologies for the confusion.  Not every patch is modified each revision cycle, so
removing unnecessary switch case logic was the 10th actual revision of the patch,
whereas the series as a whole had been modified 11 times at that point.

The change is correctly labeled as a part of revision 11 in the cover letter.

> 
> > > > +static int xe_vm_get_property_helper(struct xe_vm *vm,
> > > > +				     struct drm_xe_vm_get_property *args)
> > > > +{
> > > > +	int size;
> > > > +
> > > > +	switch (args->property) {
> > > > +	case DRM_XE_VM_GET_PROPERTY_FAULTS:
> > > > +		spin_lock(&vm->faults.lock);
> > > > +		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> > > > +		spin_unlock(&vm->faults.lock);
> > > > +
> > > > +		if (args->size)
> > > > +			/*
> > > > +			 * Number of faults may increase between calls to
> > > > +			 * xe_vm_get_property_ioctl, so just report the
> > > > +			 * number of faults the user requests if it's less
> > > > +			 * than or equal to the number of faults in the VM
> > > > +			 * fault array.
> > > > +			 */
> > > > +			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
> > > 
> > > You're comparing an int with u32 and I'm not sure how this plays out.
> > > The usual practice is to use size_t (even in the struct) 
> > 
> > Size is a u32 in struct drm_xe_device_query.
> 
> And what about the local one?

Do you mean the size variable used in xe_query functions?  Because that's
size_t.  Though the initial question was about the size variables in the structs,
AFAICT.

I need to solve a rogue locking issue with the series, so I'll change the size variable
to be a size_t in xe_vm_get_property_helper while I'm taking care of that.

-Jonathan Cavitt

> 
> Raag
>
Raag Jadav March 26, 2025, 3:30 p.m. UTC | #6
On Wed, Mar 26, 2025 at 07:51:15PM +0530, Cavitt, Jonathan wrote:
> > From: Jadav, Raag <raag.jadav@intel.com> 
> > On Tue, Mar 25, 2025 at 08:15:59PM +0530, Cavitt, Jonathan wrote:
> > > From: Jadav, Raag <raag.jadav@intel.com> 
> > > > On Mon, Mar 24, 2025 at 11:09:28PM +0000, Jonathan Cavitt wrote:
> > > > > Add support for userspace to request a list of observed faults
> > > > > from a specified VM.
> > > > 
> > > > ...
> > > > 
> > > > > v10:
> > > > > - Remove unnecessary switch case logic (Raag)
> > > > 
> > > > This is usually "changes present in version" and not "comments received
> > > > in version" but I guess this must be one of those drm things.
> > > 
> > > I'm fairly certain change logs are supposed to be written in the future tense.
> > > Or at the very least, I think I picked up the habit in college.  Is this not correct?
> > 
> > I meant it belongs to v11.
> 
> Apologies for the confusion.  Not every patch is modified each revision cycle, so
> removing unnecessary switch case logic was the 10th actual revision of the patch,
> whereas the series as a whole had been modified 11 times at that point.

Yeah, that's one of the drm things which adds on to the confusion instead
of making things easier.

The usual practice (atleast in other subsystems) is to not differentiate
between patch and series version, which is quite easier to follow.

https://kernelnewbies.org/FirstKernelPatch -> "Versioning patchsets"

> The change is correctly labeled as a part of revision 11 in the cover letter.
> 
> > 
> > > > > +static int xe_vm_get_property_helper(struct xe_vm *vm,
> > > > > +				     struct drm_xe_vm_get_property *args)
> > > > > +{
> > > > > +	int size;
> > > > > +
> > > > > +	switch (args->property) {
> > > > > +	case DRM_XE_VM_GET_PROPERTY_FAULTS:
> > > > > +		spin_lock(&vm->faults.lock);
> > > > > +		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> > > > > +		spin_unlock(&vm->faults.lock);
> > > > > +
> > > > > +		if (args->size)
> > > > > +			/*
> > > > > +			 * Number of faults may increase between calls to
> > > > > +			 * xe_vm_get_property_ioctl, so just report the
> > > > > +			 * number of faults the user requests if it's less
> > > > > +			 * than or equal to the number of faults in the VM
> > > > > +			 * fault array.
> > > > > +			 */
> > > > > +			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
> > > > 
> > > > You're comparing an int with u32 and I'm not sure how this plays out.
> > > > The usual practice is to use size_t (even in the struct) 
> > > 
> > > Size is a u32 in struct drm_xe_device_query.
> > 
> > And what about the local one?
> 
> Do you mean the size variable used in xe_query functions?  Because that's
> size_t.  Though the initial question was about the size variables in the structs,
> AFAICT.

Yeah, mixing/matching types usual leads to bugs (atleast overtime), but
if you're trying to mimic xe_query then I think you should stick to it.

Raag
diff mbox series

Patch

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 1ffb7d1f6be6..02f84a855502 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -195,6 +195,9 @@  static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl,
 			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_OBSERVATION, xe_observation_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
+			  DRM_RENDER_ALLOW),
+
 };
 
 static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 9a627ba17f55..3ed50d6f1f42 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -43,6 +43,14 @@ 
 #include "xe_wa.h"
 #include "xe_hmm.h"
 
+static const u16 xe_to_user_engine_class[] = {
+	[XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER,
+	[XE_ENGINE_CLASS_COPY] = DRM_XE_ENGINE_CLASS_COPY,
+	[XE_ENGINE_CLASS_VIDEO_DECODE] = DRM_XE_ENGINE_CLASS_VIDEO_DECODE,
+	[XE_ENGINE_CLASS_VIDEO_ENHANCE] = DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE,
+	[XE_ENGINE_CLASS_COMPUTE] = DRM_XE_ENGINE_CLASS_COMPUTE,
+};
+
 static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
 {
 	return vm->gpuvm.r_obj;
@@ -3553,6 +3561,94 @@  int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	return err;
 }
 
+static int fill_faults(struct xe_vm *vm,
+		       struct drm_xe_vm_get_property *args)
+{
+	struct xe_vm_fault __user *usr_ptr = u64_to_user_ptr(args->data);
+	struct xe_vm_fault store = { 0 };
+	struct xe_vm_fault_entry *entry;
+	int ret = 0, i = 0, count, entry_size;
+
+	entry_size = sizeof(struct xe_vm_fault);
+	count = args->size / entry_size;
+
+	spin_lock(&vm->faults.lock);
+	list_for_each_entry(entry, &vm->faults.list, list) {
+		if (i++ == count)
+			break;
+
+		memset(&store, 0, entry_size);
+
+		store.address = entry->address;
+		store.address_precision = entry->address_precision;
+		store.access_type = entry->access_type;
+		store.fault_type = entry->fault_type;
+		store.fault_level = entry->fault_level;
+		store.engine_class = xe_to_user_engine_class[entry->engine_class];
+		store.engine_instance = entry->engine_instance;
+
+		ret = copy_to_user(usr_ptr, &store, entry_size);
+		if (ret)
+			break;
+
+		usr_ptr++;
+	}
+	spin_unlock(&vm->faults.lock);
+
+	return ret ? -EFAULT : 0;
+}
+
+static int xe_vm_get_property_helper(struct xe_vm *vm,
+				     struct drm_xe_vm_get_property *args)
+{
+	int size;
+
+	switch (args->property) {
+	case DRM_XE_VM_GET_PROPERTY_FAULTS:
+		spin_lock(&vm->faults.lock);
+		size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
+		spin_unlock(&vm->faults.lock);
+
+		if (args->size)
+			/*
+			 * Number of faults may increase between calls to
+			 * xe_vm_get_property_ioctl, so just report the
+			 * number of faults the user requests if it's less
+			 * than or equal to the number of faults in the VM
+			 * fault array.
+			 */
+			return args->size <= size ? fill_faults(vm, args) : -EINVAL;
+
+		args->size = size;
+		return 0;
+	}
+	return -EINVAL;
+}
+
+int xe_vm_get_property_ioctl(struct drm_device *drm, void *data,
+			     struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(drm);
+	struct xe_file *xef = to_xe_file(file);
+	struct drm_xe_vm_get_property *args = data;
+	struct xe_vm *vm;
+	int ret = 0;
+
+	if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
+		return -EINVAL;
+	if (XE_IOCTL_DBG(xe, args->size < 0))
+		return -EINVAL;
+
+	vm = xe_vm_lookup(xef, args->vm_id);
+	if (XE_IOCTL_DBG(xe, !vm))
+		return -ENOENT;
+
+	ret = xe_vm_get_property_helper(vm, args);
+
+	xe_vm_put(vm);
+	return ret;
+}
+
 /**
  * xe_vm_bind_kernel_bo - bind a kernel BO to a VM
  * @vm: VM to bind the BO to
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 9bd7e93824da..63ec22458e04 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -196,6 +196,8 @@  int xe_vm_destroy_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
 int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file);
+int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file);
 
 void xe_vm_close_and_put(struct xe_vm *vm);