Message ID | 20230401151833.124749-7-yi.l.liu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add vfio_device cdev for iommufd support | expand |
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. > Old userspace uses KVM_DEV_VFIO_GROUP* works as well. > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > Tested-by: Terrence Xu <terrence.xu@intel.com> > Tested-by: Nicolin Chen <nicolinc@nvidia.com> > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > --- > Documentation/virt/kvm/devices/vfio.rst | 53 +++++++++++++++++-------- > include/uapi/linux/kvm.h | 16 ++++++-- > virt/kvm/vfio.c | 16 ++++---- > 3 files changed, 56 insertions(+), 29 deletions(-) > > diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst > index 79b6811bb4f3..277d727ec1a2 100644 > --- a/Documentation/virt/kvm/devices/vfio.rst > +++ b/Documentation/virt/kvm/devices/vfio.rst > @@ -9,24 +9,38 @@ Device types supported: > - KVM_DEV_TYPE_VFIO > > Only one VFIO instance may be created per VM. The created device > -tracks VFIO groups in use by the VM and features of those groups > -important to the correctness and acceleration of the VM. As groups > -are enabled and disabled for use by the VM, KVM should be updated > -about their presence. When registered with KVM, a reference to the > -VFIO-group is held by KVM. > +tracks VFIO files (group or device) in use by the VM and features > +of those groups/devices important to the correctness and acceleration > +of the VM. As groups/devices are enabled and disabled for use by the > +VM, KVM should be updated about their presence. When registered with > +KVM, a reference to the VFIO file is held by KVM. > > Groups: > - KVM_DEV_VFIO_GROUP > - > -KVM_DEV_VFIO_GROUP attributes: > - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking > - kvm_device_attr.addr points to an int32_t file descriptor > - for the VFIO group. > - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking > - kvm_device_attr.addr points to an int32_t file descriptor > - for the VFIO group. > - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table > + KVM_DEV_VFIO_FILE > + alias: KVM_DEV_VFIO_GROUP > + > +KVM_DEV_VFIO_FILE attributes: > + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device > + tracking > + > + alias: KVM_DEV_VFIO_GROUP_ADD > + > + kvm_device_attr.addr points to an int32_t file descriptor for the > + VFIO file. > + > + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM > + device tracking > + > + alias: KVM_DEV_VFIO_GROUP_DEL > + > + kvm_device_attr.addr points to an int32_t file descriptor for the > + VFIO file. > + > + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table > allocated by sPAPR KVM. > + > + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE > + > kvm_device_attr.addr points to a struct:: > > struct kvm_vfio_spapr_tce { > @@ -40,9 +54,14 @@ KVM_DEV_VFIO_GROUP attributes: > - @tablefd is a file descriptor for a TCE table allocated via > KVM_CREATE_SPAPR_TCE. > > + only accepts vfio group file as SPAPR has no iommufd support So then what is the point of introducing KVM_DEV_VFIO_FILE_SET_SPAPR_TCE at this stage? I think would have separated the Groups: KVM_DEV_VFIO_FILE alias: KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE attributes: KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device tracking kvm_device_attr.addr points to an int32_t file descriptor for the VFIO file. KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM device tracking kvm_device_attr.addr points to an int32_t file descriptor for the VFIO file. KVM_DEV_VFIO_GROUP (legacy kvm device group restricted to the handling of VFIO group fd) KVM_DEV_VFIO_GROUP_ADD: same as KVM_DEV_VFIO_FILE_ADD for group fd only KVM_DEV_VFIO_GROUP_DEL: same as KVM_DEV_VFIO_FILE_DEL for group fd only KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table allocated by sPAPR KVM. kvm_device_attr.addr points to a struct:: struct kvm_vfio_spapr_tce { __s32 groupfd; __s32 tablefd; }; where: - @groupfd is a file descriptor for a VFIO group; - @tablefd is a file descriptor for a TCE table allocated via KVM_CREATE_SPAPR_TCE. You don't say anything about potential restriction, ie. what if the user calls KVM_DEV_VFIO_FILE with device fds while it has been using legacy container/group API? Thanks Eric > + > :: > > -The GROUP_ADD operation above should be invoked prior to accessing the > +The FILE/GROUP_ADD operation above should be invoked prior to accessing the > device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > drivers which require a kvm pointer to be set in their .open_device() > -callback. > +callback. It is the same for device file descriptor via character device > +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file > +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD > +to support the drivers mentioned in prior sentence as well. > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index d77aef872a0a..a8eeca70a498 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -1410,10 +1410,18 @@ struct kvm_device_attr { > __u64 addr; /* userspace address of attr data */ > }; > > -#define KVM_DEV_VFIO_GROUP 1 > -#define KVM_DEV_VFIO_GROUP_ADD 1 > -#define KVM_DEV_VFIO_GROUP_DEL 2 > -#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 > +#define KVM_DEV_VFIO_FILE 1 > + > +#define KVM_DEV_VFIO_FILE_ADD 1 > +#define KVM_DEV_VFIO_FILE_DEL 2 > +#define KVM_DEV_VFIO_FILE_SET_SPAPR_TCE 3 > + > +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ > +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE > + > +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD > +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL > +#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE KVM_DEV_VFIO_FILE_SET_SPAPR_TCE > > enum kvm_device_type { > KVM_DEV_TYPE_FSL_MPIC_20 = 1, > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > index 857d6ba349e1..d869913baafd 100644 > --- a/virt/kvm/vfio.c > +++ b/virt/kvm/vfio.c > @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr, > int32_t fd; > > switch (attr) { > - case KVM_DEV_VFIO_GROUP_ADD: > + case KVM_DEV_VFIO_FILE_ADD: > if (get_user(fd, argp)) > return -EFAULT; > return kvm_vfio_file_add(dev, fd); > > - case KVM_DEV_VFIO_GROUP_DEL: > + case KVM_DEV_VFIO_FILE_DEL: > if (get_user(fd, argp)) > return -EFAULT; > return kvm_vfio_file_del(dev, fd); > > #ifdef CONFIG_SPAPR_TCE_IOMMU > - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: > + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: > return kvm_vfio_file_set_spapr_tce(dev, arg); > #endif > } > @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, > struct kvm_device_attr *attr) > { > switch (attr->group) { > - case KVM_DEV_VFIO_GROUP: > + case KVM_DEV_VFIO_FILE: > return kvm_vfio_set_file(dev, attr->attr, > u64_to_user_ptr(attr->addr)); > } > @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, > struct kvm_device_attr *attr) > { > switch (attr->group) { > - case KVM_DEV_VFIO_GROUP: > + case KVM_DEV_VFIO_FILE: > switch (attr->attr) { > - case KVM_DEV_VFIO_GROUP_ADD: > - case KVM_DEV_VFIO_GROUP_DEL: > + case KVM_DEV_VFIO_FILE_ADD: > + case KVM_DEV_VFIO_FILE_DEL: > #ifdef CONFIG_SPAPR_TCE_IOMMU > - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: > + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: > #endif > return 0; > }
Hi Eric, > From: Eric Auger <eric.auger@redhat.com> > Sent: Thursday, April 6, 2023 5:47 PM > > Hi Yi, > > On 4/1/23 17:18, Yi Liu wrote: > > This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. > > Old userspace uses KVM_DEV_VFIO_GROUP* works as well. > > > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > > Tested-by: Terrence Xu <terrence.xu@intel.com> > > Tested-by: Nicolin Chen <nicolinc@nvidia.com> > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> > > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > Documentation/virt/kvm/devices/vfio.rst | 53 +++++++++++++++++-------- > > include/uapi/linux/kvm.h | 16 ++++++-- > > virt/kvm/vfio.c | 16 ++++---- > > 3 files changed, 56 insertions(+), 29 deletions(-) > > > > diff --git a/Documentation/virt/kvm/devices/vfio.rst > b/Documentation/virt/kvm/devices/vfio.rst > > index 79b6811bb4f3..277d727ec1a2 100644 > > --- a/Documentation/virt/kvm/devices/vfio.rst > > +++ b/Documentation/virt/kvm/devices/vfio.rst > > @@ -9,24 +9,38 @@ Device types supported: > > - KVM_DEV_TYPE_VFIO > > > > Only one VFIO instance may be created per VM. The created device > > -tracks VFIO groups in use by the VM and features of those groups > > -important to the correctness and acceleration of the VM. As groups > > -are enabled and disabled for use by the VM, KVM should be updated > > -about their presence. When registered with KVM, a reference to the > > -VFIO-group is held by KVM. > > +tracks VFIO files (group or device) in use by the VM and features > > +of those groups/devices important to the correctness and acceleration > > +of the VM. As groups/devices are enabled and disabled for use by the > > +VM, KVM should be updated about their presence. When registered with > > +KVM, a reference to the VFIO file is held by KVM. > > > > Groups: > > - KVM_DEV_VFIO_GROUP > > - > > -KVM_DEV_VFIO_GROUP attributes: > > - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking > > - kvm_device_attr.addr points to an int32_t file descriptor > > - for the VFIO group. > > - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device > tracking > > - kvm_device_attr.addr points to an int32_t file descriptor > > - for the VFIO group. > > - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table > > + KVM_DEV_VFIO_FILE > > + alias: KVM_DEV_VFIO_GROUP > > + > > +KVM_DEV_VFIO_FILE attributes: > > + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device > > + tracking > > + > > + alias: KVM_DEV_VFIO_GROUP_ADD > > + > > + kvm_device_attr.addr points to an int32_t file descriptor for the > > + VFIO file. > > + > > + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM > > + device tracking > > + > > + alias: KVM_DEV_VFIO_GROUP_DEL > > + > > + kvm_device_attr.addr points to an int32_t file descriptor for the > > + VFIO file. > > + > > + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table > > allocated by sPAPR KVM. > > + > > + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE > > + > > kvm_device_attr.addr points to a struct:: > > > > struct kvm_vfio_spapr_tce { > > @@ -40,9 +54,14 @@ KVM_DEV_VFIO_GROUP attributes: > > - @tablefd is a file descriptor for a TCE table allocated via > > KVM_CREATE_SPAPR_TCE. > > > > + only accepts vfio group file as SPAPR has no iommufd support > So then what is the point of introducing > > KVM_DEV_VFIO_FILE_SET_SPAPR_TCE at this stage? the major reason is to make the naming aligned since this patch names the groups as KVM_DEV_VFIO_FILE. > > I think would have separated the > > Groups: > KVM_DEV_VFIO_FILE > alias: KVM_DEV_VFIO_GROUP > > KVM_DEV_VFIO_FILE attributes: > KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device > tracking > > kvm_device_attr.addr points to an int32_t file descriptor for the > VFIO file. > > KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM > device tracking > > kvm_device_attr.addr points to an int32_t file descriptor for the > VFIO file. > > KVM_DEV_VFIO_GROUP (legacy kvm device group restricted to the handling of VFIO > group fd) > KVM_DEV_VFIO_GROUP_ADD: same as KVM_DEV_VFIO_FILE_ADD for group fd only > KVM_DEV_VFIO_GROUP_DEL: same as KVM_DEV_VFIO_FILE_DEL for group fd only > KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table > allocated by sPAPR KVM. > kvm_device_attr.addr points to a struct:: > > struct kvm_vfio_spapr_tce { > __s32 groupfd; > __s32 tablefd; > }; > > where: > > - @groupfd is a file descriptor for a VFIO group; > - @tablefd is a file descriptor for a TCE table allocated via > KVM_CREATE_SPAPR_TCE. hmmm, this way is clearer. I'd adopt it if it's acceptable. > > You don't say anything about potential restriction, ie. what if the user calls > KVM_DEV_VFIO_FILE with device fds while it has been using legacy container/group > API? legacy container/group path cannot do it as the below enhancement. User needs to call KVM_DEV_VFIO_FILE before open devices, so this should happen before _GET_DEVICE_FD. So the legacy path can never pass device fds in KVM_DEV_VFIO_FILE. https://lore.kernel.org/kvm/20230327102059.333d6976.alex.williamson@redhat.com/#t > > -The GROUP_ADD operation above should be invoked prior to accessing the > > +The FILE/GROUP_ADD operation above should be invoked prior to accessing the > > device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > > drivers which require a kvm pointer to be set in their .open_device() > > -callback. > > +callback. It is the same for device file descriptor via character device > > +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file > > +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD > > +to support the drivers mentioned in prior sentence as well. just as here. This means device fds can only be passed with KVM_DEV_VFIO_FILE in the cdev path. Regards, Yi Liu > > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > > index d77aef872a0a..a8eeca70a498 100644 > > --- a/include/uapi/linux/kvm.h > > +++ b/include/uapi/linux/kvm.h > > @@ -1410,10 +1410,18 @@ struct kvm_device_attr { > > __u64 addr; /* userspace address of attr data */ > > }; > > > > -#define KVM_DEV_VFIO_GROUP 1 > > -#define KVM_DEV_VFIO_GROUP_ADD 1 > > -#define KVM_DEV_VFIO_GROUP_DEL 2 > > -#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 > > +#define KVM_DEV_VFIO_FILE 1 > > + > > +#define KVM_DEV_VFIO_FILE_ADD 1 > > +#define KVM_DEV_VFIO_FILE_DEL 2 > > +#define KVM_DEV_VFIO_FILE_SET_SPAPR_TCE 3 > > + > > +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ > > +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE > > + > > +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD > > +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL > > +#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE > KVM_DEV_VFIO_FILE_SET_SPAPR_TCE > > > > enum kvm_device_type { > > KVM_DEV_TYPE_FSL_MPIC_20 = 1, > > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > > index 857d6ba349e1..d869913baafd 100644 > > --- a/virt/kvm/vfio.c > > +++ b/virt/kvm/vfio.c > > @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long > attr, > > int32_t fd; > > > > switch (attr) { > > - case KVM_DEV_VFIO_GROUP_ADD: > > + case KVM_DEV_VFIO_FILE_ADD: > > if (get_user(fd, argp)) > > return -EFAULT; > > return kvm_vfio_file_add(dev, fd); > > > > - case KVM_DEV_VFIO_GROUP_DEL: > > + case KVM_DEV_VFIO_FILE_DEL: > > if (get_user(fd, argp)) > > return -EFAULT; > > return kvm_vfio_file_del(dev, fd); > > > > #ifdef CONFIG_SPAPR_TCE_IOMMU > > - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: > > + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: > > return kvm_vfio_file_set_spapr_tce(dev, arg); > > #endif > > } > > @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, > > struct kvm_device_attr *attr) > > { > > switch (attr->group) { > > - case KVM_DEV_VFIO_GROUP: > > + case KVM_DEV_VFIO_FILE: > > return kvm_vfio_set_file(dev, attr->attr, > > u64_to_user_ptr(attr->addr)); > > } > > @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, > > struct kvm_device_attr *attr) > > { > > switch (attr->group) { > > - case KVM_DEV_VFIO_GROUP: > > + case KVM_DEV_VFIO_FILE: > > switch (attr->attr) { > > - case KVM_DEV_VFIO_GROUP_ADD: > > - case KVM_DEV_VFIO_GROUP_DEL: > > + case KVM_DEV_VFIO_FILE_ADD: > > + case KVM_DEV_VFIO_FILE_DEL: > > #ifdef CONFIG_SPAPR_TCE_IOMMU > > - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: > > + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: > > #endif > > return 0; > > }
On Thu, 6 Apr 2023 10:49:45 +0000 "Liu, Yi L" <yi.l.liu@intel.com> wrote: > Hi Eric, > > > From: Eric Auger <eric.auger@redhat.com> > > Sent: Thursday, April 6, 2023 5:47 PM > > > > Hi Yi, > > > > On 4/1/23 17:18, Yi Liu wrote: > > > This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. > > > Old userspace uses KVM_DEV_VFIO_GROUP* works as well. > > > > > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > > > Tested-by: Terrence Xu <terrence.xu@intel.com> > > > Tested-by: Nicolin Chen <nicolinc@nvidia.com> > > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> > > > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > --- > > > Documentation/virt/kvm/devices/vfio.rst | 53 +++++++++++++++++-------- > > > include/uapi/linux/kvm.h | 16 ++++++-- > > > virt/kvm/vfio.c | 16 ++++---- > > > 3 files changed, 56 insertions(+), 29 deletions(-) > > > > > > diff --git a/Documentation/virt/kvm/devices/vfio.rst > > b/Documentation/virt/kvm/devices/vfio.rst > > > index 79b6811bb4f3..277d727ec1a2 100644 > > > --- a/Documentation/virt/kvm/devices/vfio.rst > > > +++ b/Documentation/virt/kvm/devices/vfio.rst > > > @@ -9,24 +9,38 @@ Device types supported: > > > - KVM_DEV_TYPE_VFIO > > > > > > Only one VFIO instance may be created per VM. The created device > > > -tracks VFIO groups in use by the VM and features of those groups > > > -important to the correctness and acceleration of the VM. As groups > > > -are enabled and disabled for use by the VM, KVM should be updated > > > -about their presence. When registered with KVM, a reference to the > > > -VFIO-group is held by KVM. > > > +tracks VFIO files (group or device) in use by the VM and features > > > +of those groups/devices important to the correctness and acceleration > > > +of the VM. As groups/devices are enabled and disabled for use by the > > > +VM, KVM should be updated about their presence. When registered with > > > +KVM, a reference to the VFIO file is held by KVM. > > > > > > Groups: > > > - KVM_DEV_VFIO_GROUP > > > - > > > -KVM_DEV_VFIO_GROUP attributes: > > > - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking > > > - kvm_device_attr.addr points to an int32_t file descriptor > > > - for the VFIO group. > > > - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device > > tracking > > > - kvm_device_attr.addr points to an int32_t file descriptor > > > - for the VFIO group. > > > - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table > > > + KVM_DEV_VFIO_FILE > > > + alias: KVM_DEV_VFIO_GROUP > > > + > > > +KVM_DEV_VFIO_FILE attributes: > > > + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device > > > + tracking > > > + > > > + alias: KVM_DEV_VFIO_GROUP_ADD > > > + > > > + kvm_device_attr.addr points to an int32_t file descriptor for the > > > + VFIO file. > > > + > > > + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM > > > + device tracking > > > + > > > + alias: KVM_DEV_VFIO_GROUP_DEL > > > + > > > + kvm_device_attr.addr points to an int32_t file descriptor for the > > > + VFIO file. > > > + > > > + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table > > > allocated by sPAPR KVM. > > > + > > > + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE > > > + > > > kvm_device_attr.addr points to a struct:: > > > > > > struct kvm_vfio_spapr_tce { > > > @@ -40,9 +54,14 @@ KVM_DEV_VFIO_GROUP attributes: > > > - @tablefd is a file descriptor for a TCE table allocated via > > > KVM_CREATE_SPAPR_TCE. > > > > > > + only accepts vfio group file as SPAPR has no iommufd support > > So then what is the point of introducing > > > > KVM_DEV_VFIO_FILE_SET_SPAPR_TCE at this stage? > > the major reason is to make the naming aligned since this patch > names the groups as KVM_DEV_VFIO_FILE. > > > > > I think would have separated the > > > > Groups: > > KVM_DEV_VFIO_FILE > > alias: KVM_DEV_VFIO_GROUP > > > > KVM_DEV_VFIO_FILE attributes: > > KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device > > tracking > > > > kvm_device_attr.addr points to an int32_t file descriptor for the > > VFIO file. > > > > KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM > > device tracking > > > > kvm_device_attr.addr points to an int32_t file descriptor for the > > VFIO file. > > > > KVM_DEV_VFIO_GROUP (legacy kvm device group restricted to the handling of VFIO > > group fd) > > KVM_DEV_VFIO_GROUP_ADD: same as KVM_DEV_VFIO_FILE_ADD for group fd only > > KVM_DEV_VFIO_GROUP_DEL: same as KVM_DEV_VFIO_FILE_DEL for group fd only > > KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table > > allocated by sPAPR KVM. > > kvm_device_attr.addr points to a struct:: > > > > struct kvm_vfio_spapr_tce { > > __s32 groupfd; > > __s32 tablefd; > > }; > > > > where: > > > > - @groupfd is a file descriptor for a VFIO group; > > - @tablefd is a file descriptor for a TCE table allocated via > > KVM_CREATE_SPAPR_TCE. > > hmmm, this way is clearer. I'd adopt it if it's acceptable. > > > > > You don't say anything about potential restriction, ie. what if the user calls > > KVM_DEV_VFIO_FILE with device fds while it has been using legacy container/group > > API? > > legacy container/group path cannot do it as the below enhancement. > User needs to call KVM_DEV_VFIO_FILE before open devices, so this > should happen before _GET_DEVICE_FD. So the legacy path can never > pass device fds in KVM_DEV_VFIO_FILE. > > https://lore.kernel.org/kvm/20230327102059.333d6976.alex.williamson@redhat.com/#t Wait, are you suggesting that a comment in the documentation suggesting a usage policy somehow provides enforcement of that ordering?? That's not how this works. Thanks, Alex > > > -The GROUP_ADD operation above should be invoked prior to accessing the > > > +The FILE/GROUP_ADD operation above should be invoked prior to accessing the > > > device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > > > drivers which require a kvm pointer to be set in their .open_device() > > > -callback. > > > +callback. It is the same for device file descriptor via character device > > > +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file > > > +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD > > > +to support the drivers mentioned in prior sentence as well. > > just as here. This means device fds can only be passed with KVM_DEV_VFIO_FILE > in the cdev path. > > Regards, > Yi Liu > > > > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > > > index d77aef872a0a..a8eeca70a498 100644 > > > --- a/include/uapi/linux/kvm.h > > > +++ b/include/uapi/linux/kvm.h > > > @@ -1410,10 +1410,18 @@ struct kvm_device_attr { > > > __u64 addr; /* userspace address of attr data */ > > > }; > > > > > > -#define KVM_DEV_VFIO_GROUP 1 > > > -#define KVM_DEV_VFIO_GROUP_ADD 1 > > > -#define KVM_DEV_VFIO_GROUP_DEL 2 > > > -#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 > > > +#define KVM_DEV_VFIO_FILE 1 > > > + > > > +#define KVM_DEV_VFIO_FILE_ADD 1 > > > +#define KVM_DEV_VFIO_FILE_DEL 2 > > > +#define KVM_DEV_VFIO_FILE_SET_SPAPR_TCE 3 > > > + > > > +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ > > > +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE > > > + > > > +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD > > > +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL > > > +#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE > > KVM_DEV_VFIO_FILE_SET_SPAPR_TCE > > > > > > enum kvm_device_type { > > > KVM_DEV_TYPE_FSL_MPIC_20 = 1, > > > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > > > index 857d6ba349e1..d869913baafd 100644 > > > --- a/virt/kvm/vfio.c > > > +++ b/virt/kvm/vfio.c > > > @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long > > attr, > > > int32_t fd; > > > > > > switch (attr) { > > > - case KVM_DEV_VFIO_GROUP_ADD: > > > + case KVM_DEV_VFIO_FILE_ADD: > > > if (get_user(fd, argp)) > > > return -EFAULT; > > > return kvm_vfio_file_add(dev, fd); > > > > > > - case KVM_DEV_VFIO_GROUP_DEL: > > > + case KVM_DEV_VFIO_FILE_DEL: > > > if (get_user(fd, argp)) > > > return -EFAULT; > > > return kvm_vfio_file_del(dev, fd); > > > > > > #ifdef CONFIG_SPAPR_TCE_IOMMU > > > - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: > > > + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: > > > return kvm_vfio_file_set_spapr_tce(dev, arg); > > > #endif > > > } > > > @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, > > > struct kvm_device_attr *attr) > > > { > > > switch (attr->group) { > > > - case KVM_DEV_VFIO_GROUP: > > > + case KVM_DEV_VFIO_FILE: > > > return kvm_vfio_set_file(dev, attr->attr, > > > u64_to_user_ptr(attr->addr)); > > > } > > > @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, > > > struct kvm_device_attr *attr) > > > { > > > switch (attr->group) { > > > - case KVM_DEV_VFIO_GROUP: > > > + case KVM_DEV_VFIO_FILE: > > > switch (attr->attr) { > > > - case KVM_DEV_VFIO_GROUP_ADD: > > > - case KVM_DEV_VFIO_GROUP_DEL: > > > + case KVM_DEV_VFIO_FILE_ADD: > > > + case KVM_DEV_VFIO_FILE_DEL: > > > #ifdef CONFIG_SPAPR_TCE_IOMMU > > > - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: > > > + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: > > > #endif > > > return 0; > > > } >
> From: Alex Williamson <alex.williamson@redhat.com> > Sent: Friday, April 7, 2023 2:58 AM > > > > > > You don't say anything about potential restriction, ie. what if the user calls > > > KVM_DEV_VFIO_FILE with device fds while it has been using legacy > container/group > > > API? > > > > legacy container/group path cannot do it as the below enhancement. > > User needs to call KVM_DEV_VFIO_FILE before open devices, so this > > should happen before _GET_DEVICE_FD. So the legacy path can never > > pass device fds in KVM_DEV_VFIO_FILE. > > > > > https://lore.kernel.org/kvm/20230327102059.333d6976.alex.williamson@redhat.com > /#t > > Wait, are you suggesting that a comment in the documentation suggesting > a usage policy somehow provides enforcement of that ordering?? That's > not how this works. Thanks, I don't know if there is a good way to enforce this order in the code. The vfio_device->kvm pointer is optional. If it is NULL, vfio just ignores it. So vfio doesn't have a good way to tell if the order requirement is met or not. Perhaps just trigger NULL pointer dereference when kvm pointer is used in the device drivers like kvmgt if this order is not met. So that's why I come up to document it here. The applications uses kvm should know this and follow this otherwise it may encounter error. Do you have other suggestions for it? This order should be a generic requirement. is it? group path also needs to follow it to make the mdev driver that refers kvm pointer to be workable. Thanks, Yi Liu > > > > -The GROUP_ADD operation above should be invoked prior to accessing the > > > > +The FILE/GROUP_ADD operation above should be invoked prior to accessing the > > > > device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > > > > drivers which require a kvm pointer to be set in their .open_device() > > > > -callback. > > > > +callback. It is the same for device file descriptor via character device > > > > +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file > > > > +descriptors, FILE_ADD should be invoked before > VFIO_DEVICE_BIND_IOMMUFD > > > > +to support the drivers mentioned in prior sentence as well. > > > > just as here. This means device fds can only be passed with KVM_DEV_VFIO_FILE > > in the cdev path. > > > > Regards, > > Yi Liu
Hi Yi, On 4/7/23 05:42, Liu, Yi L wrote: >> From: Alex Williamson <alex.williamson@redhat.com> >> Sent: Friday, April 7, 2023 2:58 AM >>>> You don't say anything about potential restriction, ie. what if the user calls >>>> KVM_DEV_VFIO_FILE with device fds while it has been using legacy >> container/group >>>> API? >>> legacy container/group path cannot do it as the below enhancement. >>> User needs to call KVM_DEV_VFIO_FILE before open devices, so this >>> should happen before _GET_DEVICE_FD. So the legacy path can never >>> pass device fds in KVM_DEV_VFIO_FILE. >>> >>> >> https://lore.kernel.org/kvm/20230327102059.333d6976.alex.williamson@redhat.com >> /#t >> >> Wait, are you suggesting that a comment in the documentation suggesting >> a usage policy somehow provides enforcement of that ordering?? That's >> not how this works. Thanks, > I don't know if there is a good way to enforce this order in the code. The > vfio_device->kvm pointer is optional. If it is NULL, vfio just ignores it. > So vfio doesn't have a good way to tell if the order requirement is met or > not. Perhaps just trigger NULL pointer dereference when kvm pointer is used > in the device drivers like kvmgt if this order is not met. > > So that's why I come up to document it here. The applications uses kvm > should know this and follow this otherwise it may encounter error. > > Do you have other suggestions for it? This order should be a generic > requirement. is it? group path also needs to follow it to make the mdev > driver that refers kvm pointer to be workable. In the same way as kvm_vfio_file_is_valid() called in kvm_vfio_file_add() can't you have a kernel API that checks the fd consistence? Thanks Eric > > Thanks, > Yi Liu > >>>>> -The GROUP_ADD operation above should be invoked prior to accessing the >>>>> +The FILE/GROUP_ADD operation above should be invoked prior to accessing the >>>>> device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support >>>>> drivers which require a kvm pointer to be set in their .open_device() >>>>> -callback. >>>>> +callback. It is the same for device file descriptor via character device >>>>> +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file >>>>> +descriptors, FILE_ADD should be invoked before >> VFIO_DEVICE_BIND_IOMMUFD >>>>> +to support the drivers mentioned in prior sentence as well. >>> just as here. This means device fds can only be passed with KVM_DEV_VFIO_FILE >>> in the cdev path. >>> >>> Regards, >>> Yi Liu
Hi Eric, > From: Eric Auger <eric.auger@redhat.com> > Sent: Friday, April 7, 2023 4:57 PM > > Hi Yi, > > On 4/7/23 05:42, Liu, Yi L wrote: > >> From: Alex Williamson <alex.williamson@redhat.com> > >> Sent: Friday, April 7, 2023 2:58 AM > >>>> You don't say anything about potential restriction, ie. what if the user calls > >>>> KVM_DEV_VFIO_FILE with device fds while it has been using legacy > >> container/group > >>>> API? > >>> legacy container/group path cannot do it as the below enhancement. > >>> User needs to call KVM_DEV_VFIO_FILE before open devices, so this > >>> should happen before _GET_DEVICE_FD. So the legacy path can never > >>> pass device fds in KVM_DEV_VFIO_FILE. > >>> > >>> > >> > https://lore.kernel.org/kvm/20230327102059.333d6976.alex.williamson@redhat.com > >> /#t > >> > >> Wait, are you suggesting that a comment in the documentation suggesting > >> a usage policy somehow provides enforcement of that ordering?? That's > >> not how this works. Thanks, > > I don't know if there is a good way to enforce this order in the code. The > > vfio_device->kvm pointer is optional. If it is NULL, vfio just ignores it. > > So vfio doesn't have a good way to tell if the order requirement is met or > > not. Perhaps just trigger NULL pointer dereference when kvm pointer is used > > in the device drivers like kvmgt if this order is not met. > > > > So that's why I come up to document it here. The applications uses kvm > > should know this and follow this otherwise it may encounter error. > > > > Do you have other suggestions for it? This order should be a generic > > requirement. is it? group path also needs to follow it to make the mdev > > driver that refers kvm pointer to be workable. > > In the same way as kvm_vfio_file_is_valid() called in kvm_vfio_file_add() > can't you have a kernel API that checks the fd consistence? I think we are talking about how to check if the order between KVM_DEV_VFIO_FILE_ADD and the device open (e.g. invoked by VFIO_GROUP_GET_DEVICE_FD) is met in the code rather than document it here. Am I missing anything here? Maybe I've misunderstood Alex's question. ☹ Regards, Yi Liu > Thanks > > Eric > > > > Thanks, > > Yi Liu > > > >>>>> -The GROUP_ADD operation above should be invoked prior to accessing the > >>>>> +The FILE/GROUP_ADD operation above should be invoked prior to accessing > the > >>>>> device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > >>>>> drivers which require a kvm pointer to be set in their .open_device() > >>>>> -callback. > >>>>> +callback. It is the same for device file descriptor via character device > >>>>> +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such > file > >>>>> +descriptors, FILE_ADD should be invoked before > >> VFIO_DEVICE_BIND_IOMMUFD > >>>>> +to support the drivers mentioned in prior sentence as well. > >>> just as here. This means device fds can only be passed with KVM_DEV_VFIO_FILE > >>> in the cdev path. > >>> > >>> Regards, > >>> Yi Liu
diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst index 79b6811bb4f3..277d727ec1a2 100644 --- a/Documentation/virt/kvm/devices/vfio.rst +++ b/Documentation/virt/kvm/devices/vfio.rst @@ -9,24 +9,38 @@ Device types supported: - KVM_DEV_TYPE_VFIO Only one VFIO instance may be created per VM. The created device -tracks VFIO groups in use by the VM and features of those groups -important to the correctness and acceleration of the VM. As groups -are enabled and disabled for use by the VM, KVM should be updated -about their presence. When registered with KVM, a reference to the -VFIO-group is held by KVM. +tracks VFIO files (group or device) in use by the VM and features +of those groups/devices important to the correctness and acceleration +of the VM. As groups/devices are enabled and disabled for use by the +VM, KVM should be updated about their presence. When registered with +KVM, a reference to the VFIO file is held by KVM. Groups: - KVM_DEV_VFIO_GROUP - -KVM_DEV_VFIO_GROUP attributes: - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table + KVM_DEV_VFIO_FILE + alias: KVM_DEV_VFIO_GROUP + +KVM_DEV_VFIO_FILE attributes: + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device + tracking + + alias: KVM_DEV_VFIO_GROUP_ADD + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM + device tracking + + alias: KVM_DEV_VFIO_GROUP_DEL + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table allocated by sPAPR KVM. + + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE + kvm_device_attr.addr points to a struct:: struct kvm_vfio_spapr_tce { @@ -40,9 +54,14 @@ KVM_DEV_VFIO_GROUP attributes: - @tablefd is a file descriptor for a TCE table allocated via KVM_CREATE_SPAPR_TCE. + only accepts vfio group file as SPAPR has no iommufd support + :: -The GROUP_ADD operation above should be invoked prior to accessing the +The FILE/GROUP_ADD operation above should be invoked prior to accessing the device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support drivers which require a kvm pointer to be set in their .open_device() -callback. +callback. It is the same for device file descriptor via character device +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD +to support the drivers mentioned in prior sentence as well. diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index d77aef872a0a..a8eeca70a498 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1410,10 +1410,18 @@ struct kvm_device_attr { __u64 addr; /* userspace address of attr data */ }; -#define KVM_DEV_VFIO_GROUP 1 -#define KVM_DEV_VFIO_GROUP_ADD 1 -#define KVM_DEV_VFIO_GROUP_DEL 2 -#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 +#define KVM_DEV_VFIO_FILE 1 + +#define KVM_DEV_VFIO_FILE_ADD 1 +#define KVM_DEV_VFIO_FILE_DEL 2 +#define KVM_DEV_VFIO_FILE_SET_SPAPR_TCE 3 + +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE + +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL +#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE KVM_DEV_VFIO_FILE_SET_SPAPR_TCE enum kvm_device_type { KVM_DEV_TYPE_FSL_MPIC_20 = 1, diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 857d6ba349e1..d869913baafd 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr, int32_t fd; switch (attr) { - case KVM_DEV_VFIO_GROUP_ADD: + case KVM_DEV_VFIO_FILE_ADD: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_add(dev, fd); - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_DEL: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: return kvm_vfio_set_file(dev, attr->attr, u64_to_user_ptr(attr->addr)); } @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: switch (attr->attr) { - case KVM_DEV_VFIO_GROUP_ADD: - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_ADD: + case KVM_DEV_VFIO_FILE_DEL: #ifdef CONFIG_SPAPR_TCE_IOMMU - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: #endif return 0; }