Message ID | cover.1681976394.git.nicolinc@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | Add set_dev_data and unset_dev_data support | expand |
> From: Nicolin Chen <nicolinc@nvidia.com> > Sent: Thursday, April 20, 2023 3:48 PM > > This is a pair of new uAPI/ops for user space to set an iommu specific > device data for a passthrough device. This is primarily used by SMMUv3 > driver for now, to link the vSID and the pSID of a device that's behind > the SMMU. The link (lookup table) will be used to verify any ATC_INV > command from the user space for that device, and then replace the SID > field (virtual SID) with the corresponding physical SID. > > This series is available on Github: > https://github.com/nicolinc/iommufd/commits/set_dev_data-rfc-v2 > > Thanks! > Nicolin > there is no changelog compared to v1. Could you add some words why changing from passing the information in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward the latter option...
On Fri, Apr 21, 2023 at 07:35:52AM +0000, Tian, Kevin wrote: > External email: Use caution opening links or attachments > > > > From: Nicolin Chen <nicolinc@nvidia.com> > > Sent: Thursday, April 20, 2023 3:48 PM > > > > This is a pair of new uAPI/ops for user space to set an iommu specific > > device data for a passthrough device. This is primarily used by SMMUv3 > > driver for now, to link the vSID and the pSID of a device that's behind > > the SMMU. The link (lookup table) will be used to verify any ATC_INV > > command from the user space for that device, and then replace the SID > > field (virtual SID) with the corresponding physical SID. > > > > This series is available on Github: > > https://github.com/nicolinc/iommufd/commits/set_dev_data-rfc-v2 > > > > Thanks! > > Nicolin > > > > there is no changelog compared to v1. Weird! How could it be missed during copy-n-paste.. I recalled that I had it but seemingly lost it after an update. It is in the commit message of the cover-letter though: https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4bf58ae53eb7a8a55 -------------------------------------------------------- Changelog v2: * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call * Renamed the previous set_rid_user to set_dev_data, to decouple from the PCI regime. v1: https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/ -------------------------------------------------------- > Could you add some words why changing from passing the information > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward > the latter option... Yea. Jason told me to decouple it from PCI. And merge it into a general uAPI. So I picked the BIND ioctl. Thanks Nic
> From: Nicolin Chen <nicolinc@nvidia.com> > Sent: Friday, April 21, 2023 3:42 PM > > On Fri, Apr 21, 2023 at 07:35:52AM +0000, Tian, Kevin wrote: > > External email: Use caution opening links or attachments > > > > > > > From: Nicolin Chen <nicolinc@nvidia.com> > > > Sent: Thursday, April 20, 2023 3:48 PM > > > > > > This is a pair of new uAPI/ops for user space to set an iommu specific > > > device data for a passthrough device. This is primarily used by SMMUv3 > > > driver for now, to link the vSID and the pSID of a device that's behind > > > the SMMU. The link (lookup table) will be used to verify any ATC_INV > > > command from the user space for that device, and then replace the SID > > > field (virtual SID) with the corresponding physical SID. > > > > > > This series is available on Github: > > > https://github.com/nicolinc/iommufd/commits/set_dev_data-rfc-v2 > > > > > > Thanks! > > > Nicolin > > > > > > > there is no changelog compared to v1. > > Weird! How could it be missed during copy-n-paste.. > I recalled that I had it but seemingly lost it after an update. > > It is in the commit message of the cover-letter though: > https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4b > f58ae53eb7a8a55 > -------------------------------------------------------- > Changelog > v2: > * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call > * Renamed the previous set_rid_user to set_dev_data, to decouple from > the PCI regime. > v1: > https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/ > -------------------------------------------------------- > > > Could you add some words why changing from passing the information > > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward > > the latter option... > > Yea. Jason told me to decouple it from PCI. And merge it into > a general uAPI. So I picked the BIND ioctl. > 'decouple it from PCI' is kind of covered by renaming set_rid to set_data. but I didn't get why this has to be merged with another uAPI. Once iommufd_device is created we could have separate ioctls to poke its attributes individually. What'd be broken if this is not done at BIND time?
On Fri, Apr 21, 2023 at 07:47:13AM +0000, Tian, Kevin wrote: > > It is in the commit message of the cover-letter though: > > https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4b > > f58ae53eb7a8a55 > > -------------------------------------------------------- > > Changelog > > v2: > > * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call > > * Renamed the previous set_rid_user to set_dev_data, to decouple from > > the PCI regime. > > v1: > > https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/ > > -------------------------------------------------------- > > > > > Could you add some words why changing from passing the information > > > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward > > > the latter option... > > > > Yea. Jason told me to decouple it from PCI. And merge it into > > a general uAPI. So I picked the BIND ioctl. > > > > 'decouple it from PCI' is kind of covered by renaming set_rid > to set_data. but I didn't get why this has to be merged with another > uAPI. Once iommufd_device is created we could have separate > ioctls to poke its attributes individually. What'd be broken if this > is not done at BIND time? Oh, sorry. He didn't literally told me to merge, but commented "make sense" at my proposal of reusing BIND. So, I don't think adding to the BIND is a must here. The BIND is done in vfio_realize() where the RID (dev_data) is available also. And the new uAPI in my v1 actually gets called near the BIND. So, I feel we may just do it once? I am open to a better idea. Thanks Nic
> From: Nicolin Chen <nicolinc@nvidia.com> > Sent: Friday, April 21, 2023 3:56 PM > > On Fri, Apr 21, 2023 at 07:47:13AM +0000, Tian, Kevin wrote: > > > > It is in the commit message of the cover-letter though: > > > > https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4b > > > f58ae53eb7a8a55 > > > -------------------------------------------------------- > > > Changelog > > > v2: > > > * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call > > > * Renamed the previous set_rid_user to set_dev_data, to decouple from > > > the PCI regime. > > > v1: > > > https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/ > > > -------------------------------------------------------- > > > > > > > Could you add some words why changing from passing the information > > > > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward > > > > the latter option... > > > > > > Yea. Jason told me to decouple it from PCI. And merge it into > > > a general uAPI. So I picked the BIND ioctl. > > > > > > > 'decouple it from PCI' is kind of covered by renaming set_rid > > to set_data. but I didn't get why this has to be merged with another > > uAPI. Once iommufd_device is created we could have separate > > ioctls to poke its attributes individually. What'd be broken if this > > is not done at BIND time? > > Oh, sorry. He didn't literally told me to merge, but commented > "make sense" at my proposal of reusing BIND. So, I don't think > adding to the BIND is a must here. > > The BIND is done in vfio_realize() where the RID (dev_data) is > available also. And the new uAPI in my v1 actually gets called > near the BIND. So, I feel we may just do it once? I am open to > a better idea. > IMHO if this can be done within iommufd then that should be the choice. vfio doesn't need to know this data at all and doing so means vdpa or a 3rd driver also needs to implement similar logic in their uAPI...
On Fri, Apr 21, 2023 at 08:07:19AM +0000, Tian, Kevin wrote: > External email: Use caution opening links or attachments > > > > From: Nicolin Chen <nicolinc@nvidia.com> > > Sent: Friday, April 21, 2023 3:56 PM > > > > On Fri, Apr 21, 2023 at 07:47:13AM +0000, Tian, Kevin wrote: > > > > > > It is in the commit message of the cover-letter though: > > > > > > https://github.com/nicolinc/iommufd/commit/5e17d270bfca2a5e3e7401d4b > > > > f58ae53eb7a8a55 > > > > -------------------------------------------------------- > > > > Changelog > > > > v2: > > > > * Integrated the uAPI into VFIO_DEVICE_BIND_IOMMUFD call > > > > * Renamed the previous set_rid_user to set_dev_data, to decouple from > > > > the PCI regime. > > > > v1: > > > > https://lore.kernel.org/all/cover.1680762112.git.nicolinc@nvidia.com/ > > > > -------------------------------------------------------- > > > > > > > > > Could you add some words why changing from passing the information > > > > > in an iommufd ioctl to bind_iommufd? My gut-feeling leans toward > > > > > the latter option... > > > > > > > > Yea. Jason told me to decouple it from PCI. And merge it into > > > > a general uAPI. So I picked the BIND ioctl. > > > > > > > > > > 'decouple it from PCI' is kind of covered by renaming set_rid > > > to set_data. but I didn't get why this has to be merged with another > > > uAPI. Once iommufd_device is created we could have separate > > > ioctls to poke its attributes individually. What'd be broken if this > > > is not done at BIND time? > > > > Oh, sorry. He didn't literally told me to merge, but commented > > "make sense" at my proposal of reusing BIND. So, I don't think > > adding to the BIND is a must here. > > > > The BIND is done in vfio_realize() where the RID (dev_data) is > > available also. And the new uAPI in my v1 actually gets called > > near the BIND. So, I feel we may just do it once? I am open to > > a better idea. > > > > IMHO if this can be done within iommufd then that should be > the choice. vfio doesn't need to know this data at all and doing > so means vdpa or a 3rd driver also needs to implement similar > logic in their uAPI... Reusing the VFIO ioctl is because the device is a VFIO device. But doing it within iommufd could save us a lot of efforts, as you said. So... +/** + * struct iommufd_device_set_data - ioctl(IOMMU_DEVICE_SET_DATA) + * @size: sizeof(struct iommufd_device_set_data) + * @dev_id: The device to set a device data + * @data_uptr: User pointer of the device user data. + * @data_len: Length of the device user data. + */ +struct iommufd_device_set_data { + __u32 size; + __u32 dev_id; + __aligned_u64 data_uptr; + __u32 data_len; +}; +#define IOMMU_DEVICE_SET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_SET_DATA) + +/** + * struct iommufd_device_unset_data - ioctl(IOMMU_DEVICE_UNSET_DATA) + * @size: sizeof(struct iommufd_device_unset_data) + * @dev_id: The device to unset its device data + */ +struct iommufd_device_unset_data { + __u32 size; + __u32 dev_id; +}; +#define IOMMU_DEVICE_UNSET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_UNSET_DATA) Maybe just like this? Thanks Nic
On Fri, Apr 21, 2023 at 01:20:13AM -0700, Nicolin Chen wrote: > +/** > + * struct iommufd_device_set_data - ioctl(IOMMU_DEVICE_SET_DATA) > + * @size: sizeof(struct iommufd_device_set_data) > + * @dev_id: The device to set a device data > + * @data_uptr: User pointer of the device user data. > + * @data_len: Length of the device user data. > + */ > +struct iommufd_device_set_data { > + __u32 size; > + __u32 dev_id; > + __aligned_u64 data_uptr; > + __u32 data_len; > +}; > +#define IOMMU_DEVICE_SET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_SET_DATA) > + > +/** > + * struct iommufd_device_unset_data - ioctl(IOMMU_DEVICE_UNSET_DATA) > + * @size: sizeof(struct iommufd_device_unset_data) > + * @dev_id: The device to unset its device data > + */ > +struct iommufd_device_unset_data { > + __u32 size; > + __u32 dev_id; > +}; > +#define IOMMU_DEVICE_UNSET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_UNSET_DATA) > > Maybe just like this? How would the iommu_ops backing this work? Jason
On Fri, Apr 21, 2023 at 10:09:35AM -0300, Jason Gunthorpe wrote: > On Fri, Apr 21, 2023 at 01:20:13AM -0700, Nicolin Chen wrote: > > > +/** > > + * struct iommufd_device_set_data - ioctl(IOMMU_DEVICE_SET_DATA) > > + * @size: sizeof(struct iommufd_device_set_data) > > + * @dev_id: The device to set a device data > > + * @data_uptr: User pointer of the device user data. > > + * @data_len: Length of the device user data. > > + */ > > +struct iommufd_device_set_data { > > + __u32 size; > > + __u32 dev_id; > > + __aligned_u64 data_uptr; > > + __u32 data_len; > > +}; > > +#define IOMMU_DEVICE_SET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_SET_DATA) > > + > > +/** > > + * struct iommufd_device_unset_data - ioctl(IOMMU_DEVICE_UNSET_DATA) > > + * @size: sizeof(struct iommufd_device_unset_data) > > + * @dev_id: The device to unset its device data > > + */ > > +struct iommufd_device_unset_data { > > + __u32 size; > > + __u32 dev_id; > > +}; > > +#define IOMMU_DEVICE_UNSET_DATA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_UNSET_DATA) > > > > Maybe just like this? > > How would the iommu_ops backing this work? How about the following piece? Needs a test with QEMU though.. static const size_t iommufd_device_data_size[] = { [IOMMU_HW_INFO_TYPE_NONE] = 0, [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0, [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] = sizeof(struct iommu_device_data_arm_smmuv3), }; int iommufd_device_set_data(struct iommufd_ucmd *ucmd) { struct iommufd_device_set_data *cmd = ucmd->cmd; struct iommufd_device *idev; const struct iommu_ops *ops; void *data = NULL; u32 klen = 0; int rc; if (!cmd->data_uptr || !cmd->data_len) return -EINVAL; idev = iommufd_get_device(ucmd, cmd->dev_id); if (IS_ERR(idev)) return PTR_ERR(idev); ops = dev_iommu_ops(idev->dev); if (!ops || !ops->set_dev_data_user || !ops->unset_dev_data_user || ops->hw_info_type >= ARRAY_SIZE(iommufd_device_data_size)) { rc = -EOPNOTSUPP; goto out_put_idev; } klen = iommufd_device_data_size[ops->hw_info_type]; if (!klen) { rc = -EOPNOTSUPP; goto out_put_idev; } data = kzalloc(klen, GFP_KERNEL); if (!data) { rc = -ENOMEM; goto out_put_idev; } if (copy_struct_from_user(data, klen, u64_to_user_ptr(cmd->data_uptr), cmd->data_len)) { rc = -EFAULT; goto out_free_data; } rc = ops->set_dev_data_user(idev->dev, data); out_free_data: kfree(data); out_put_idev: iommufd_put_object(&idev->obj); return rc; }
On Fri, Apr 21, 2023 at 10:37:22AM -0700, Nicolin Chen wrote: > How about the following piece? Needs a test with QEMU though.. > > static const size_t iommufd_device_data_size[] = { > [IOMMU_HW_INFO_TYPE_NONE] = 0, > [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0, > [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] = > sizeof(struct iommu_device_data_arm_smmuv3), > }; If we need more than one of these things we'll need a better solution.. > rc = ops->set_dev_data_user(idev->dev, data); Where will the iommu driver store the vsid to sid xarray from these arguments? Jason
On Fri, Apr 21, 2023 at 02:59:37PM -0300, Jason Gunthorpe wrote: > On Fri, Apr 21, 2023 at 10:37:22AM -0700, Nicolin Chen wrote: > > > How about the following piece? Needs a test with QEMU though.. > > > > static const size_t iommufd_device_data_size[] = { > > [IOMMU_HW_INFO_TYPE_NONE] = 0, > > [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0, > > [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] = > > sizeof(struct iommu_device_data_arm_smmuv3), > > }; > > If we need more than one of these things we'll need a better > solution.. How about adding ops->device_data_size to store the value? And, since we have a few size arrays in hw_pagetable.c too, perhaps a new structure in ops packing all these sizes can clean up a bit things too? For example, static struct iommu_user_data_size arm_smmu_user_data_size = { .device_data_size = sizeof(iommu_device_data_arm_smmuv3), .hwpt_alloc_data_size = sizeof(iommu_hwpt_alloc_arm_smmuv3), .hwpt_invalidate_data_size = sizeof(iommu_hwpt_invalidate_arm_smmuv3), } The hwpt_xxx_data_size might be in form of arrays for multi- HWPT_TYPE support. > > rc = ops->set_dev_data_user(idev->dev, data); > > Where will the iommu driver store the vsid to sid xarray from these > arguments? The ARM structure packs a vsid. For example: static int arm_smmu_set_data(struct device *dev, const void *user_data) { const struct iommufd_device_data_arm_smmuv3 *data = user_data; struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct arm_smmu_stream *stream = &master->streams[0]; struct arm_smmu_device *smmu = master->smmu; u32 sid_user = data->sid; int ret = 0; if (!sid_user) return -EINVAL; ret = xa_alloc(&smmu->streams_user, &sid_user, stream, XA_LIMIT(sid_user, sid_user), GFP_KERNEL_ACCOUNT); if (ret) return ret; stream->id_user = sid_user; return 0; } Thanks Nic
On Fri, Apr 21, 2023 at 11:19:23AM -0700, Nicolin Chen wrote: > On Fri, Apr 21, 2023 at 02:59:37PM -0300, Jason Gunthorpe wrote: > > On Fri, Apr 21, 2023 at 10:37:22AM -0700, Nicolin Chen wrote: > > > > > How about the following piece? Needs a test with QEMU though.. > > > > > > static const size_t iommufd_device_data_size[] = { > > > [IOMMU_HW_INFO_TYPE_NONE] = 0, > > > [IOMMU_HW_INFO_TYPE_INTEL_VTD] = 0, > > > [IOMMU_HW_INFO_TYPE_ARM_SMMUV3] = > > > sizeof(struct iommu_device_data_arm_smmuv3), > > > }; > > > > If we need more than one of these things we'll need a better > > solution.. > > How about adding ops->device_data_size to store the value? https://lore.kernel.org/linux-iommu/cover.1682234302.git.nicolinc@nvidia.com/ I sent a v3 that includes this replacing the data_size array. If it looks good, we can drop the other two data_size arrays for hwpt in the nesting series too. Thanks Nic