Message ID | 580617BD.8000300@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 18 Oct 2016 20:38:21 +0800 Jike Song <jike.song@intel.com> wrote: > On 10/18/2016 12:02 AM, Alex Williamson wrote: > > On Fri, 14 Oct 2016 15:19:01 -0700 > > Neo Jia <cjia@nvidia.com> wrote: > > > >> On Fri, Oct 14, 2016 at 10:51:24AM -0600, Alex Williamson wrote: > >>> On Fri, 14 Oct 2016 09:35:45 -0700 > >>> Neo Jia <cjia@nvidia.com> wrote: > >>> > >>>> On Fri, Oct 14, 2016 at 08:46:01AM -0600, Alex Williamson wrote: > >>>>> On Fri, 14 Oct 2016 08:41:58 -0600 > >>>>> Alex Williamson <alex.williamson@redhat.com> wrote: > >>>>> > >>>>>> On Fri, 14 Oct 2016 18:37:45 +0800 > >>>>>> Jike Song <jike.song@intel.com> wrote: > >>>>>> > >>>>>>> On 10/11/2016 05:47 PM, Paolo Bonzini wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> On 11/10/2016 11:21, Xiao Guangrong wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 10/11/2016 04:54 PM, Paolo Bonzini wrote: > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 11/10/2016 04:39, Xiao Guangrong wrote: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 10/11/2016 02:32 AM, Paolo Bonzini wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 10/10/2016 20:01, Neo Jia wrote: > >>>>>>>>>>>>>> Hi Neo, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> AFAIK this is needed because KVMGT doesn't paravirtualize the PPGTT, > >>>>>>>>>>>>>> while nVidia does. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Hi Paolo and Xiaoguang, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I am just wondering how device driver can register a notifier so he > >>>>>>>>>>>>> can be > >>>>>>>>>>>>> notified for write-protected pages when writes are happening. > >>>>>>>>>>>> > >>>>>>>>>>>> It can't yet, but the API is ready for that. kvm_vfio_set_group is > >>>>>>>>>>>> currently where a struct kvm_device* and struct vfio_group* touch. > >>>>>>>>>>>> Given > >>>>>>>>>>>> a struct kvm_device*, dev->kvm provides the struct kvm to be passed to > >>>>>>>>>>>> kvm_page_track_register_notifier. So I guess you could add a callback > >>>>>>>>>>>> that passes the struct kvm_device* to the mdev device. > >>>>>>>>>>>> > >>>>>>>>>>>> Xiaoguang and Guangrong, what were your plans? We discussed it briefly > >>>>>>>>>>>> at KVM Forum but I don't remember the details. > >>>>>>>>>>> > >>>>>>>>>>> Your suggestion was that pass kvm fd to KVMGT via VFIO, so that we can > >>>>>>>>>>> figure out the kvm instance based on the fd. > >>>>>>>>>>> > >>>>>>>>>>> We got a new idea, how about search the kvm instance by mm_struct, it > >>>>>>>>>>> can work as KVMGT is running in the vcpu context and it is much more > >>>>>>>>>>> straightforward. > >>>>>>>>>> > >>>>>>>>>> Perhaps I didn't understand your suggestion, but the same mm_struct can > >>>>>>>>>> have more than 1 struct kvm so I'm not sure that it can work. > >>>>>>>>> > >>>>>>>>> vcpu->pid is valid during vcpu running so that it can be used to figure > >>>>>>>>> out which kvm instance owns the vcpu whose pid is the one as current > >>>>>>>>> thread, i think it can work. :) > >>>>>>>> > >>>>>>>> No, don't do that. There's no reason for a thread to run a single VCPU, > >>>>>>>> and if you can have multiple VCPUs you can also have multiple VCPUs from > >>>>>>>> multiple VMs. > >>>>>>>> > >>>>>>>> Passing file descriptors around are the right way to connect subsystems. > >>>>>>> > >>>>>>> [CC Alex, Kevin and Qemu-devel] > >>>>>>> > >>>>>>> Hi Paolo & Alex, > >>>>>>> > >>>>>>> IIUC, passing file descriptors means touching QEMU and the UAPI between > >>>>>>> QEMU and VFIO. Would you guys have a look at below draft patch? If it's > >>>>>>> on the correct direction, I'll send the split ones. Thanks! > >>>>>>> > >>>>>>> -- > >>>>>>> Thanks, > >>>>>>> Jike > >>>>>>> > >>>>>>> > >>>>>>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c > >>>>>>> index bec694c..f715d37 100644 > >>>>>>> --- a/hw/vfio/pci-quirks.c > >>>>>>> +++ b/hw/vfio/pci-quirks.c > >>>>>>> @@ -10,12 +10,14 @@ > >>>>>>> * the COPYING file in the top-level directory. > >>>>>>> */ > >>>>>>> > >>>>>>> +#include <sys/ioctl.h> > >>>>>>> #include "qemu/osdep.h" > >>>>>>> #include "qemu/error-report.h" > >>>>>>> #include "qemu/range.h" > >>>>>>> #include "qapi/error.h" > >>>>>>> #include "hw/nvram/fw_cfg.h" > >>>>>>> #include "pci.h" > >>>>>>> +#include "sysemu/kvm.h" > >>>>>>> #include "trace.h" > >>>>>>> > >>>>>>> /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ > >>>>>>> @@ -1844,3 +1846,15 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev) > >>>>>>> break; > >>>>>>> } > >>>>>>> } > >>>>>>> + > >>>>>>> +void vfio_quirk_kvmgt(VFIOPCIDevice *vdev) > >>>>>>> +{ > >>>>>>> + int vmfd; > >>>>>>> + > >>>>>>> + if (!kvm_enabled() || !vdev->kvmgt) > >>>>>>> + return; > >>>>>>> + > >>>>>>> + /* Tell the device what KVM it attached */ > >>>>>>> + vmfd = kvm_get_vmfd(kvm_state); > >>>>>>> + ioctl(vdev->vbasedev.fd, VFIO_SET_KVMFD, vmfd); > >>>>>>> +} > >>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > >>>>>>> index a5a620a..8732552 100644 > >>>>>>> --- a/hw/vfio/pci.c > >>>>>>> +++ b/hw/vfio/pci.c > >>>>>>> @@ -2561,6 +2561,8 @@ static int vfio_initfn(PCIDevice *pdev) > >>>>>>> return ret; > >>>>>>> } > >>>>>>> > >>>>>>> + vfio_quirk_kvmgt(vdev); > >>>>>>> + > >>>>>>> /* Get a copy of config space */ > >>>>>>> ret = pread(vdev->vbasedev.fd, vdev->pdev.config, > >>>>>>> MIN(pci_config_size(&vdev->pdev), vdev->config_size), > >>>>>>> @@ -2832,6 +2834,7 @@ static Property vfio_pci_dev_properties[] = { > >>>>>>> DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, > >>>>>>> sub_device_id, PCI_ANY_ID), > >>>>>>> DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0), > >>>>>>> + DEFINE_PROP_BOOL("kvmgt", VFIOPCIDevice, kvmgt, false), > >>>>>> > >>>>>> Just a side note, device options are a headache, users are prone to get > >>>>>> them wrong and minimally it requires an entire round to get libvirt > >>>>>> support. We should be able to detect from the device or vfio API > >>>>>> whether such a call is required. Obviously if we can use the existing > >>>>>> kvm-vfio device, that's the better option anyway. Thanks, > >>>>> > >>>>> Also, vfio devices currently have no hard dependencies on KVM, if kvmgt > >>>>> does, it needs to produce a device failure when unavailable. Thanks, > >>>> > >>>> Also, I would like to see this as an generic feature instead of > >>>> kvmgt specific interface, so we don't have to add new options to QEMU and it is > >>>> up to the vendor driver to proceed with or without it. > >>> > >>> In general this should be decided by lack of some required feature > >>> exclusively provided by KVM. I would not want to add a generic opt-out > >>> for mdev vendor drivers to decide that they arbitrarily want to disable > >>> that path. Thanks, > >> > >> IIUC, you are suggesting that this path should be controlled by KVM feature cap > >> and it will be accessible to VFIO users when such checking is satisfied. > > > > Maybe we're getting too loose with our pronouns here, I'm starting to > > lose track of what "this" is referring to. I agree that there's no > > reason for the ioctl, as proposed to be kvmgt specific. I would hope > > that going through the kvm-vfio device to create that linkage would > > eliminate that, but we'll need to see what Jike can come up with to > > plumb between KVM and vfio. Vendor drivers can implement their own > > ioctls, now that we pass them through the mdev layer, but someone needs > > to call those ioctls. Ideally we want something programmatic to > > trigger that, without requiring a user to pass an extra device > > parameter. Additionally, if there is any hope of making use of the > > device with userspace drivers other than QEMU, hard dependencies on KVM > > should be avoided. Thanks, > > > > Alex > > > > Thanks for the advice, so I cooked another patch for your comments. > Basically a 'void *usrdata' is added to vfio_group, external users > can set it (kvm) or get it (kvm or other users like kvmgt). > > BTW, in device-model, the open method will return failure to vfio-mdev > in case that such kvm information is not available. > > -- > Thanks, > Jike > > > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index d1d70e0..6b8d1d2 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -86,6 +86,7 @@ struct vfio_group { > struct mutex unbound_lock; > atomic_t opened; > bool noiommu; > + void *usrdata; > }; > > struct vfio_device { > @@ -447,14 +448,13 @@ static struct vfio_group *vfio_group_try_get(struct vfio_group *group) > } > > static > -struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) > +struct vfio_group *__vfio_group_get_from_iommu(struct iommu_group *iommu_group) > { > struct vfio_group *group; > > mutex_lock(&vfio.group_lock); > list_for_each_entry(group, &vfio.group_list, vfio_next) { > if (group->iommu_group == iommu_group) { > - vfio_group_get(group); This is wrong, we can't add our reference after we release the lock. > mutex_unlock(&vfio.group_lock); > return group; > } > @@ -464,6 +464,17 @@ struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) > return NULL; > } > > +static > +struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) > +{ > + struct vfio_group *group = __vfio_group_get_from_iommu(iommu_group); > + if (!group) > + return NULL; > + > + vfio_group_get(group); We have no basis to get a reference here. This function cannot exist separate from the existing function above. > + return group; > +} > + > static struct vfio_group *vfio_group_get_from_minor(int minor) > { > struct vfio_group *group; > @@ -1728,6 +1739,31 @@ long vfio_external_check_extension(struct vfio_group *group, unsigned long arg) > } > EXPORT_SYMBOL_GPL(vfio_external_check_extension); > > +void vfio_group_set_usrdata(struct vfio_group *group, void *data) > +{ > + group->usrdata = data; > +} > +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); > + > +void *vfio_group_get_usrdata(struct vfio_group *group) > +{ > + return group->usrdata; > +} > +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); > + > +void *vfio_group_get_usrdata_by_device(struct device *dev) > +{ > + struct vfio_group *vfio_group; > + > + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); We actually need to use iommu_group_get() here. Kirti adds a vfio_group_get_from_dev() in v9 03/12 that does this properly. > + if (!vfio_group) > + return NULL; > + > + return vfio_group_get_usrdata(vfio_group); This operates on a group for which we have no reference. > +} > +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata_by_device); > + > + > /** > * Sub-module support > */ > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index 0ecae0b..712588f 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -91,6 +91,10 @@ extern void vfio_unregister_iommu_driver( > extern int vfio_external_user_iommu_id(struct vfio_group *group); > extern long vfio_external_check_extension(struct vfio_group *group, > unsigned long arg); > +extern void vfio_group_set_usrdata(struct vfio_group *group, void *data); > +extern void *vfio_group_get_usrdata(struct vfio_group *group); > +extern void *vfio_group_get_usrdata_by_device(struct device *dev); > + > > /* > * Sub-module helpers > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > index 1dd087d..e00d401 100644 > --- a/virt/kvm/vfio.c > +++ b/virt/kvm/vfio.c > @@ -60,6 +60,20 @@ static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group) > symbol_put(vfio_group_put_external_user); > } > > +static void kvm_vfio_group_set_kvm(struct vfio_group *group, void *kvm) > +{ > + void (*fn)(struct vfio_group *, void *); > + > + fn = symbol_get(vfio_group_set_usrdata); > + if (!fn) > + return; > + > + fn(group, kvm); > + kvm_get_kvm(kvm); > + > + symbol_put(vfio_group_set_usrdata); > +} > + > static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group) > { > long (*fn)(struct vfio_group *, unsigned long); > @@ -161,6 +175,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) > > kvm_vfio_update_coherency(dev); > > + kvm_vfio_group_set_kvm(vfio_group, dev->kvm); > + > return 0; > > case KVM_DEV_VFIO_GROUP_DEL: > @@ -200,6 +216,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) > > kvm_vfio_update_coherency(dev); > > + kvm_put_kvm(dev->kvm); > + > return ret; > } How does anyone get'ing the usrdata know what it contains? Does the vendor driver compare it to a pointer it found elsewhere? How does the vendor driver generate an error back to the user if this linkage is necessary but unavailable? Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/18/2016 10:59 PM, Alex Williamson wrote: > On Tue, 18 Oct 2016 20:38:21 +0800 > Jike Song <jike.song@intel.com> wrote: >> On 10/18/2016 12:02 AM, Alex Williamson wrote: >>> On Fri, 14 Oct 2016 15:19:01 -0700 >>> Neo Jia <cjia@nvidia.com> wrote: >>> >>>> On Fri, Oct 14, 2016 at 10:51:24AM -0600, Alex Williamson wrote: >>>>> On Fri, 14 Oct 2016 09:35:45 -0700 >>>>> Neo Jia <cjia@nvidia.com> wrote: >>>>> >>>>>> On Fri, Oct 14, 2016 at 08:46:01AM -0600, Alex Williamson wrote: >>>>>>> On Fri, 14 Oct 2016 08:41:58 -0600 >>>>>>> Alex Williamson <alex.williamson@redhat.com> wrote: >>>>>>> >>>>>>>> On Fri, 14 Oct 2016 18:37:45 +0800 >>>>>>>> Jike Song <jike.song@intel.com> wrote: >>>>>>>> >>>>>>>>> On 10/11/2016 05:47 PM, Paolo Bonzini wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 11/10/2016 11:21, Xiao Guangrong wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 10/11/2016 04:54 PM, Paolo Bonzini wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 11/10/2016 04:39, Xiao Guangrong wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/11/2016 02:32 AM, Paolo Bonzini wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10/10/2016 20:01, Neo Jia wrote: >>>>>>>>>>>>>>>> Hi Neo, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> AFAIK this is needed because KVMGT doesn't paravirtualize the PPGTT, >>>>>>>>>>>>>>>> while nVidia does. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Paolo and Xiaoguang, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am just wondering how device driver can register a notifier so he >>>>>>>>>>>>>>> can be >>>>>>>>>>>>>>> notified for write-protected pages when writes are happening. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It can't yet, but the API is ready for that. kvm_vfio_set_group is >>>>>>>>>>>>>> currently where a struct kvm_device* and struct vfio_group* touch. >>>>>>>>>>>>>> Given >>>>>>>>>>>>>> a struct kvm_device*, dev->kvm provides the struct kvm to be passed to >>>>>>>>>>>>>> kvm_page_track_register_notifier. So I guess you could add a callback >>>>>>>>>>>>>> that passes the struct kvm_device* to the mdev device. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Xiaoguang and Guangrong, what were your plans? We discussed it briefly >>>>>>>>>>>>>> at KVM Forum but I don't remember the details. >>>>>>>>>>>>> >>>>>>>>>>>>> Your suggestion was that pass kvm fd to KVMGT via VFIO, so that we can >>>>>>>>>>>>> figure out the kvm instance based on the fd. >>>>>>>>>>>>> >>>>>>>>>>>>> We got a new idea, how about search the kvm instance by mm_struct, it >>>>>>>>>>>>> can work as KVMGT is running in the vcpu context and it is much more >>>>>>>>>>>>> straightforward. >>>>>>>>>>>> >>>>>>>>>>>> Perhaps I didn't understand your suggestion, but the same mm_struct can >>>>>>>>>>>> have more than 1 struct kvm so I'm not sure that it can work. >>>>>>>>>>> >>>>>>>>>>> vcpu->pid is valid during vcpu running so that it can be used to figure >>>>>>>>>>> out which kvm instance owns the vcpu whose pid is the one as current >>>>>>>>>>> thread, i think it can work. :) >>>>>>>>>> >>>>>>>>>> No, don't do that. There's no reason for a thread to run a single VCPU, >>>>>>>>>> and if you can have multiple VCPUs you can also have multiple VCPUs from >>>>>>>>>> multiple VMs. >>>>>>>>>> >>>>>>>>>> Passing file descriptors around are the right way to connect subsystems. >>>>>>>>> >>>>>>>>> [CC Alex, Kevin and Qemu-devel] >>>>>>>>> >>>>>>>>> Hi Paolo & Alex, >>>>>>>>> >>>>>>>>> IIUC, passing file descriptors means touching QEMU and the UAPI between >>>>>>>>> QEMU and VFIO. Would you guys have a look at below draft patch? If it's >>>>>>>>> on the correct direction, I'll send the split ones. Thanks! >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thanks, >>>>>>>>> Jike >>>>>>>>> >>>>>>>>> >>>>>>>>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c >>>>>>>>> index bec694c..f715d37 100644 >>>>>>>>> --- a/hw/vfio/pci-quirks.c >>>>>>>>> +++ b/hw/vfio/pci-quirks.c >>>>>>>>> @@ -10,12 +10,14 @@ >>>>>>>>> * the COPYING file in the top-level directory. >>>>>>>>> */ >>>>>>>>> >>>>>>>>> +#include <sys/ioctl.h> >>>>>>>>> #include "qemu/osdep.h" >>>>>>>>> #include "qemu/error-report.h" >>>>>>>>> #include "qemu/range.h" >>>>>>>>> #include "qapi/error.h" >>>>>>>>> #include "hw/nvram/fw_cfg.h" >>>>>>>>> #include "pci.h" >>>>>>>>> +#include "sysemu/kvm.h" >>>>>>>>> #include "trace.h" >>>>>>>>> >>>>>>>>> /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ >>>>>>>>> @@ -1844,3 +1846,15 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev) >>>>>>>>> break; >>>>>>>>> } >>>>>>>>> } >>>>>>>>> + >>>>>>>>> +void vfio_quirk_kvmgt(VFIOPCIDevice *vdev) >>>>>>>>> +{ >>>>>>>>> + int vmfd; >>>>>>>>> + >>>>>>>>> + if (!kvm_enabled() || !vdev->kvmgt) >>>>>>>>> + return; >>>>>>>>> + >>>>>>>>> + /* Tell the device what KVM it attached */ >>>>>>>>> + vmfd = kvm_get_vmfd(kvm_state); >>>>>>>>> + ioctl(vdev->vbasedev.fd, VFIO_SET_KVMFD, vmfd); >>>>>>>>> +} >>>>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>>>>>>>> index a5a620a..8732552 100644 >>>>>>>>> --- a/hw/vfio/pci.c >>>>>>>>> +++ b/hw/vfio/pci.c >>>>>>>>> @@ -2561,6 +2561,8 @@ static int vfio_initfn(PCIDevice *pdev) >>>>>>>>> return ret; >>>>>>>>> } >>>>>>>>> >>>>>>>>> + vfio_quirk_kvmgt(vdev); >>>>>>>>> + >>>>>>>>> /* Get a copy of config space */ >>>>>>>>> ret = pread(vdev->vbasedev.fd, vdev->pdev.config, >>>>>>>>> MIN(pci_config_size(&vdev->pdev), vdev->config_size), >>>>>>>>> @@ -2832,6 +2834,7 @@ static Property vfio_pci_dev_properties[] = { >>>>>>>>> DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, >>>>>>>>> sub_device_id, PCI_ANY_ID), >>>>>>>>> DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0), >>>>>>>>> + DEFINE_PROP_BOOL("kvmgt", VFIOPCIDevice, kvmgt, false), >>>>>>>> >>>>>>>> Just a side note, device options are a headache, users are prone to get >>>>>>>> them wrong and minimally it requires an entire round to get libvirt >>>>>>>> support. We should be able to detect from the device or vfio API >>>>>>>> whether such a call is required. Obviously if we can use the existing >>>>>>>> kvm-vfio device, that's the better option anyway. Thanks, >>>>>>> >>>>>>> Also, vfio devices currently have no hard dependencies on KVM, if kvmgt >>>>>>> does, it needs to produce a device failure when unavailable. Thanks, >>>>>> >>>>>> Also, I would like to see this as an generic feature instead of >>>>>> kvmgt specific interface, so we don't have to add new options to QEMU and it is >>>>>> up to the vendor driver to proceed with or without it. >>>>> >>>>> In general this should be decided by lack of some required feature >>>>> exclusively provided by KVM. I would not want to add a generic opt-out >>>>> for mdev vendor drivers to decide that they arbitrarily want to disable >>>>> that path. Thanks, >>>> >>>> IIUC, you are suggesting that this path should be controlled by KVM feature cap >>>> and it will be accessible to VFIO users when such checking is satisfied. >>> >>> Maybe we're getting too loose with our pronouns here, I'm starting to >>> lose track of what "this" is referring to. I agree that there's no >>> reason for the ioctl, as proposed to be kvmgt specific. I would hope >>> that going through the kvm-vfio device to create that linkage would >>> eliminate that, but we'll need to see what Jike can come up with to >>> plumb between KVM and vfio. Vendor drivers can implement their own >>> ioctls, now that we pass them through the mdev layer, but someone needs >>> to call those ioctls. Ideally we want something programmatic to >>> trigger that, without requiring a user to pass an extra device >>> parameter. Additionally, if there is any hope of making use of the >>> device with userspace drivers other than QEMU, hard dependencies on KVM >>> should be avoided. Thanks, >>> >>> Alex >>> >> >> Thanks for the advice, so I cooked another patch for your comments. >> Basically a 'void *usrdata' is added to vfio_group, external users >> can set it (kvm) or get it (kvm or other users like kvmgt). >> >> BTW, in device-model, the open method will return failure to vfio-mdev >> in case that such kvm information is not available. >> >> -- >> Thanks, >> Jike >> >> >> >> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c >> index d1d70e0..6b8d1d2 100644 >> --- a/drivers/vfio/vfio.c >> +++ b/drivers/vfio/vfio.c >> @@ -86,6 +86,7 @@ struct vfio_group { >> struct mutex unbound_lock; >> atomic_t opened; >> bool noiommu; >> + void *usrdata; >> }; >> >> struct vfio_device { >> @@ -447,14 +448,13 @@ static struct vfio_group *vfio_group_try_get(struct vfio_group *group) >> } >> >> static >> -struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) >> +struct vfio_group *__vfio_group_get_from_iommu(struct iommu_group *iommu_group) >> { >> struct vfio_group *group; >> >> mutex_lock(&vfio.group_lock); >> list_for_each_entry(group, &vfio.group_list, vfio_next) { >> if (group->iommu_group == iommu_group) { >> - vfio_group_get(group); > > This is wrong, we can't add our reference after we release the lock. > Thanks for pointing it out :) >> mutex_unlock(&vfio.group_lock); >> return group; >> } >> @@ -464,6 +464,17 @@ struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) >> return NULL; >> } >> >> +static >> +struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) >> +{ >> + struct vfio_group *group = __vfio_group_get_from_iommu(iommu_group); >> + if (!group) >> + return NULL; >> + >> + vfio_group_get(group); > > We have no basis to get a reference here. This function cannot exist > separate from the existing function above. > >> + return group; >> +} >> + >> static struct vfio_group *vfio_group_get_from_minor(int minor) >> { >> struct vfio_group *group; >> @@ -1728,6 +1739,31 @@ long vfio_external_check_extension(struct vfio_group *group, unsigned long arg) >> } >> EXPORT_SYMBOL_GPL(vfio_external_check_extension); >> >> +void vfio_group_set_usrdata(struct vfio_group *group, void *data) >> +{ >> + group->usrdata = data; >> +} >> +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >> + >> +void *vfio_group_get_usrdata(struct vfio_group *group) >> +{ >> + return group->usrdata; >> +} >> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >> + >> +void *vfio_group_get_usrdata_by_device(struct device *dev) >> +{ >> + struct vfio_group *vfio_group; >> + >> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); > > We actually need to use iommu_group_get() here. Kirti adds a > vfio_group_get_from_dev() in v9 03/12 that does this properly. > >> + if (!vfio_group) >> + return NULL; >> + >> + return vfio_group_get_usrdata(vfio_group); > > This operates on a group for which we have no reference. Great to know Kirti's work! BTW, this means user need to call vfio_group_put_external_user afterwards, right? >> +} >> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata_by_device); >> + >> + >> /** >> * Sub-module support >> */ >> diff --git a/include/linux/vfio.h b/include/linux/vfio.h >> index 0ecae0b..712588f 100644 >> --- a/include/linux/vfio.h >> +++ b/include/linux/vfio.h >> @@ -91,6 +91,10 @@ extern void vfio_unregister_iommu_driver( >> extern int vfio_external_user_iommu_id(struct vfio_group *group); >> extern long vfio_external_check_extension(struct vfio_group *group, >> unsigned long arg); >> +extern void vfio_group_set_usrdata(struct vfio_group *group, void *data); >> +extern void *vfio_group_get_usrdata(struct vfio_group *group); >> +extern void *vfio_group_get_usrdata_by_device(struct device *dev); >> + >> >> /* >> * Sub-module helpers >> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c >> index 1dd087d..e00d401 100644 >> --- a/virt/kvm/vfio.c >> +++ b/virt/kvm/vfio.c >> @@ -60,6 +60,20 @@ static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group) >> symbol_put(vfio_group_put_external_user); >> } >> >> +static void kvm_vfio_group_set_kvm(struct vfio_group *group, void *kvm) >> +{ >> + void (*fn)(struct vfio_group *, void *); >> + >> + fn = symbol_get(vfio_group_set_usrdata); >> + if (!fn) >> + return; >> + >> + fn(group, kvm); >> + kvm_get_kvm(kvm); >> + >> + symbol_put(vfio_group_set_usrdata); >> +} >> + >> static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group) >> { >> long (*fn)(struct vfio_group *, unsigned long); >> @@ -161,6 +175,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) >> >> kvm_vfio_update_coherency(dev); >> >> + kvm_vfio_group_set_kvm(vfio_group, dev->kvm); >> + >> return 0; >> >> case KVM_DEV_VFIO_GROUP_DEL: >> @@ -200,6 +216,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) >> >> kvm_vfio_update_coherency(dev); >> >> + kvm_put_kvm(dev->kvm); >> + >> return ret; >> } > > How does anyone get'ing the usrdata know what it contains? Currently only the KVM instance. Maybe we can add other data along with flags in the future? > Does the > vendor driver compare it to a pointer it found elsewhere? How does the > vendor driver generate an error back to the user if this linkage is > necessary but unavailable? For the data == kvm scenario, yes, I think it's only valid to use it inside the kvm thread context, IIUC, comparing kvm->mm with current->mm does the trick. If not equal, in our case, the parent_ops->open() will get an -ESRCH indicating that this mdev must be used along with KVM. -- Thanks, Jike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/19/2016 10:32 AM, Jike Song wrote: +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >>> + >>> +void *vfio_group_get_usrdata(struct vfio_group *group) >>> +{ >>> + return group->usrdata; >>> +} >>> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >>> + >>> +void *vfio_group_get_usrdata_by_device(struct device *dev) >>> +{ >>> + struct vfio_group *vfio_group; >>> + >>> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); >> >> We actually need to use iommu_group_get() here. Kirti adds a >> vfio_group_get_from_dev() in v9 03/12 that does this properly. >> >>> + if (!vfio_group) >>> + return NULL; >>> + >>> + return vfio_group_get_usrdata(vfio_group); I am worrying if the kvm instance got from group->usrdata is safe enough? What happens if you get the instance after kvm released kvm-vfio device? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19/10/2016 07:45, Xiao Guangrong wrote: > > > On 10/19/2016 10:32 AM, Jike Song wrote: > +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >>>> + >>>> +void *vfio_group_get_usrdata(struct vfio_group *group) >>>> +{ >>>> + return group->usrdata; >>>> +} >>>> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >>>> + >>>> +void *vfio_group_get_usrdata_by_device(struct device *dev) >>>> +{ >>>> + struct vfio_group *vfio_group; >>>> + >>>> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); >>> >>> We actually need to use iommu_group_get() here. Kirti adds a >>> vfio_group_get_from_dev() in v9 03/12 that does this properly. >>> >>>> + if (!vfio_group) >>>> + return NULL; >>>> + >>>> + return vfio_group_get_usrdata(vfio_group); > > I am worrying if the kvm instance got from group->usrdata is safe > enough? What happens if you get the instance after kvm released > kvm-vfio device? It shouldn't happen if you use kvm_get_kvm and kvm_put_kvm properly. It is almost okay in the patch, just: > @@ -200,6 +216,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) > > kvm_vfio_update_coherency(dev); > > + kvm_put_kvm(dev->kvm); > + > return ret; > } ... please add a new function kvm_vfio_group_clear_kvm(vfio_group) here, that does vfio_group_set_usrdata(vfio_group, NULL) and kvm_put_kvm. This should avoid use-after-free. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/19/2016 07:56 PM, Paolo Bonzini wrote: > > > On 19/10/2016 07:45, Xiao Guangrong wrote: >> >> >> On 10/19/2016 10:32 AM, Jike Song wrote: >> +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >>>>> + >>>>> +void *vfio_group_get_usrdata(struct vfio_group *group) >>>>> +{ >>>>> + return group->usrdata; >>>>> +} >>>>> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >>>>> + >>>>> +void *vfio_group_get_usrdata_by_device(struct device *dev) >>>>> +{ >>>>> + struct vfio_group *vfio_group; >>>>> + >>>>> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); >>>> >>>> We actually need to use iommu_group_get() here. Kirti adds a >>>> vfio_group_get_from_dev() in v9 03/12 that does this properly. >>>> >>>>> + if (!vfio_group) >>>>> + return NULL; >>>>> + >>>>> + return vfio_group_get_usrdata(vfio_group); >> >> I am worrying if the kvm instance got from group->usrdata is safe >> enough? What happens if you get the instance after kvm released >> kvm-vfio device? > > It shouldn't happen if you use kvm_get_kvm and kvm_put_kvm properly. It > is almost okay in the patch, just: > How about if KVM releases kvm-vfio device between vfio_group_get_usrdata() and get_kvm()? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[meta-comment] On 10/18/2016 09:32 PM, Jike Song wrote: > On 10/18/2016 10:59 PM, Alex Williamson wrote: ... >>>>>>>>>>>>>>> On 10/10/2016 20:01, Neo Jia wrote: >>>>>>>>>>>>>>>>> Hi Neo, 17 levels of quoting is rather over-the-top. It is OKAY (and in fact DESIRABLE) to trim your emails to relevant portions, when posting to a high-volume list. Readers shouldn't have to scroll through pages of deeply-nested quoting... >>> >>> mutex_lock(&vfio.group_lock); >>> list_for_each_entry(group, &vfio.group_list, vfio_next) { >>> if (group->iommu_group == iommu_group) { >>> - vfio_group_get(group); >> >> This is wrong, we can't add our reference after we release the lock. >> > > Thanks for pointing it out :) > ...to get to the much smaller meat of the message.
On 19/10/2016 15:39, Xiao Guangrong wrote: > > > On 10/19/2016 07:56 PM, Paolo Bonzini wrote: >> >> >> On 19/10/2016 07:45, Xiao Guangrong wrote: >>> >>> >>> On 10/19/2016 10:32 AM, Jike Song wrote: >>> +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >>>>>> + >>>>>> +void *vfio_group_get_usrdata(struct vfio_group *group) >>>>>> +{ >>>>>> + return group->usrdata; >>>>>> +} >>>>>> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >>>>>> + >>>>>> +void *vfio_group_get_usrdata_by_device(struct device *dev) >>>>>> +{ >>>>>> + struct vfio_group *vfio_group; >>>>>> + >>>>>> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); >>>>> >>>>> We actually need to use iommu_group_get() here. Kirti adds a >>>>> vfio_group_get_from_dev() in v9 03/12 that does this properly. >>>>> >>>>>> + if (!vfio_group) >>>>>> + return NULL; >>>>>> + >>>>>> + return vfio_group_get_usrdata(vfio_group); >>> >>> I am worrying if the kvm instance got from group->usrdata is safe >>> enough? What happens if you get the instance after kvm released >>> kvm-vfio device? >> >> It shouldn't happen if you use kvm_get_kvm and kvm_put_kvm properly. It >> is almost okay in the patch, just: > > How about if KVM releases kvm-vfio device between vfio_group_get_usrdata() > and get_kvm()? That cannot happen as long as there is a struct file* for the device (see kvm_ioctl_create_device and kvm_device_release). Since you're sending a ioctl to it, it's fine. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/19/2016 10:14 PM, Paolo Bonzini wrote: > > > On 19/10/2016 15:39, Xiao Guangrong wrote: >> >> >> On 10/19/2016 07:56 PM, Paolo Bonzini wrote: >>> >>> >>> On 19/10/2016 07:45, Xiao Guangrong wrote: >>>> >>>> >>>> On 10/19/2016 10:32 AM, Jike Song wrote: >>>> +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >>>>>>> + >>>>>>> +void *vfio_group_get_usrdata(struct vfio_group *group) >>>>>>> +{ >>>>>>> + return group->usrdata; >>>>>>> +} >>>>>>> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >>>>>>> + >>>>>>> +void *vfio_group_get_usrdata_by_device(struct device *dev) >>>>>>> +{ >>>>>>> + struct vfio_group *vfio_group; >>>>>>> + >>>>>>> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); >>>>>> >>>>>> We actually need to use iommu_group_get() here. Kirti adds a >>>>>> vfio_group_get_from_dev() in v9 03/12 that does this properly. >>>>>> >>>>>>> + if (!vfio_group) >>>>>>> + return NULL; >>>>>>> + >>>>>>> + return vfio_group_get_usrdata(vfio_group); >>>> >>>> I am worrying if the kvm instance got from group->usrdata is safe >>>> enough? What happens if you get the instance after kvm released >>>> kvm-vfio device? >>> >>> It shouldn't happen if you use kvm_get_kvm and kvm_put_kvm properly. It >>> is almost okay in the patch, just: >> >> How about if KVM releases kvm-vfio device between vfio_group_get_usrdata() >> and get_kvm()? > > That cannot happen as long as there is a struct file* for the device > (see kvm_ioctl_create_device and kvm_device_release). Since you're > sending a ioctl to it, it's fine. I understood that KVM side is safe, however, vfio side is independent with kvm and the user of usrdata can fetch kvm struct at any time, consider this scenario: CPU 0 CPU 1 KVM: VFIO/userdata user kvm_ioctl_create_device get_kvm() vfio_group_get_usrdata(vfio_group) kvm_device_release put_kvm() !!! kvm refcount has gone use KVM struct Then, the user of userdata have fetched kvm struct but the refcount has already gone. What i missed? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 20/10/2016 03:48, Xiao Guangrong wrote: > > > On 10/19/2016 10:14 PM, Paolo Bonzini wrote: >> >> >> On 19/10/2016 15:39, Xiao Guangrong wrote: >>> >>> >>> On 10/19/2016 07:56 PM, Paolo Bonzini wrote: >>>> >>>> >>>> On 19/10/2016 07:45, Xiao Guangrong wrote: >>>>> >>>>> >>>>> On 10/19/2016 10:32 AM, Jike Song wrote: >>>>> +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); >>>>>>>> + >>>>>>>> +void *vfio_group_get_usrdata(struct vfio_group *group) >>>>>>>> +{ >>>>>>>> + return group->usrdata; >>>>>>>> +} >>>>>>>> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); >>>>>>>> + >>>>>>>> +void *vfio_group_get_usrdata_by_device(struct device *dev) >>>>>>>> +{ >>>>>>>> + struct vfio_group *vfio_group; >>>>>>>> + >>>>>>>> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); >>>>>>> >>>>>>> We actually need to use iommu_group_get() here. Kirti adds a >>>>>>> vfio_group_get_from_dev() in v9 03/12 that does this properly. >>>>>>> >>>>>>>> + if (!vfio_group) >>>>>>>> + return NULL; >>>>>>>> + >>>>>>>> + return vfio_group_get_usrdata(vfio_group); >>>>> >>>>> I am worrying if the kvm instance got from group->usrdata is safe >>>>> enough? What happens if you get the instance after kvm released >>>>> kvm-vfio device? >>>> >>>> It shouldn't happen if you use kvm_get_kvm and kvm_put_kvm >>>> properly. It >>>> is almost okay in the patch, just: >>> >>> How about if KVM releases kvm-vfio device between >>> vfio_group_get_usrdata() >>> and get_kvm()? >> >> That cannot happen as long as there is a struct file* for the device >> (see kvm_ioctl_create_device and kvm_device_release). Since you're >> sending a ioctl to it, it's fine. > > I understood that KVM side is safe, however, vfio side is independent with > kvm and the user of usrdata can fetch kvm struct at any time, consider > this scenario: > > CPU 0 CPU 1 > KVM: VFIO/userdata user > kvm_ioctl_create_device > get_kvm() > vfio_group_get_usrdata(vfio_group) > kvm_device_release > put_kvm() > !!! kvm refcount has gone > use KVM struct > > Then, the user of userdata have fetched kvm struct but the refcount has > already gone. vfio_group_set_usrdata (actually) kvm_vfio_group_set_kvm has called kvm_get_kvm too, however. What you need is a mutex that is taken by vfio_group_set_usrdata and by the callers of vfio_group_get_usrdata. Paolo > What i missed? > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
DQoNCi0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQpGcm9tOiBQYW9sbyBCb256aW5pIFttYWls dG86cGFvbG8uYm9uemluaUBnbWFpbC5jb21dIE9uIEJlaGFsZiBPZiBQYW9sbyBCb256aW5pDQpT ZW50OiBGcmlkYXksIE9jdG9iZXIgMjEsIDIwMTYgMTowNyBBTQ0KVG86IFhpYW8gR3Vhbmdyb25n IDxndWFuZ3JvbmcueGlhb0BsaW51eC5pbnRlbC5jb20+OyBYaWFvLCBHdWFuZ3JvbmcgPGd1YW5n cm9uZy54aWFvQGludGVsLmNvbT47IFNvbmcsIEppa2UgPGppa2Uuc29uZ0BpbnRlbC5jb20+OyBB bGV4IFdpbGxpYW1zb24gPGFsZXgud2lsbGlhbXNvbkByZWRoYXQuY29tPg0KQ2M6IFRpYW4sIEtl dmluIDxrZXZpbi50aWFuQGludGVsLmNvbT47IE5lbyBKaWEgPGNqaWFAbnZpZGlhLmNvbT47IGt2 bUB2Z2VyLmtlcm5lbC5vcmc7IHFlbXUtZGV2ZWwgPHFlbXUtZGV2ZWxAbm9uZ251Lm9yZz47IENo ZW4sIFhpYW9ndWFuZyA8eGlhb2d1YW5nLmNoZW5AaW50ZWwuY29tPjsgS2lydGkgV2Fua2hlZGUg PGt3YW5raGVkZUBudmlkaWEuY29tPg0KU3ViamVjdDogUmU6IFtRZW11LWRldmVsXSBbUEFUQ0gg MS8yXSBLVk06IHBhZ2UgdHJhY2s6IGFkZCBhIG5ldyBub3RpZmllciB0eXBlOiB0cmFja19mbHVz aF9zbG90DQoNCg0KDQpPbiAyMC8xMC8yMDE2IDAzOjQ4LCBYaWFvIEd1YW5ncm9uZyB3cm90ZToN Cj4gDQo+IA0KPiBPbiAxMC8xOS8yMDE2IDEwOjE0IFBNLCBQYW9sbyBCb256aW5pIHdyb3RlOg0K Pj4NCj4+DQo+PiBPbiAxOS8xMC8yMDE2IDE1OjM5LCBYaWFvIEd1YW5ncm9uZyB3cm90ZToNCj4+ Pg0KPj4+DQo+Pj4gT24gMTAvMTkvMjAxNiAwNzo1NiBQTSwgUGFvbG8gQm9uemluaSB3cm90ZToN Cj4+Pj4NCj4+Pj4NCj4+Pj4gT24gMTkvMTAvMjAxNiAwNzo0NSwgWGlhbyBHdWFuZ3Jvbmcgd3Jv dGU6DQo+Pj4+Pg0KPj4+Pj4NCj4+Pj4+IE9uIDEwLzE5LzIwMTYgMTA6MzIgQU0sIEppa2UgU29u ZyB3cm90ZToNCj4+Pj4+ICtFWFBPUlRfU1lNQk9MX0dQTCh2ZmlvX2dyb3VwX3NldF91c3JkYXRh KTsNCj4+Pj4+Pj4+ICsNCj4+Pj4+Pj4+ICt2b2lkICp2ZmlvX2dyb3VwX2dldF91c3JkYXRhKHN0 cnVjdCB2ZmlvX2dyb3VwICpncm91cCkgew0KPj4+Pj4+Pj4gKyAgICByZXR1cm4gZ3JvdXAtPnVz cmRhdGE7DQo+Pj4+Pj4+PiArfQ0KPj4+Pj4+Pj4gK0VYUE9SVF9TWU1CT0xfR1BMKHZmaW9fZ3Jv dXBfZ2V0X3VzcmRhdGEpOw0KPj4+Pj4+Pj4gKw0KPj4+Pj4+Pj4gK3ZvaWQgKnZmaW9fZ3JvdXBf Z2V0X3VzcmRhdGFfYnlfZGV2aWNlKHN0cnVjdCBkZXZpY2UgKmRldikgew0KPj4+Pj4+Pj4gKyAg ICBzdHJ1Y3QgdmZpb19ncm91cCAqdmZpb19ncm91cDsNCj4+Pj4+Pj4+ICsNCj4+Pj4+Pj4+ICsg ICAgdmZpb19ncm91cCA9IA0KPj4+Pj4+Pj4gKyBfX3ZmaW9fZ3JvdXBfZ2V0X2Zyb21faW9tbXUo ZGV2LT5pb21tdV9ncm91cCk7DQo+Pj4+Pj4+DQo+Pj4+Pj4+IFdlIGFjdHVhbGx5IG5lZWQgdG8g dXNlIGlvbW11X2dyb3VwX2dldCgpIGhlcmUuICBLaXJ0aSBhZGRzIGENCj4+Pj4+Pj4gdmZpb19n cm91cF9nZXRfZnJvbV9kZXYoKSBpbiB2OSAwMy8xMiB0aGF0IGRvZXMgdGhpcyBwcm9wZXJseS4N Cj4+Pj4+Pj4NCj4+Pj4+Pj4+ICsgICAgaWYgKCF2ZmlvX2dyb3VwKQ0KPj4+Pj4+Pj4gKyAgICAg ICAgcmV0dXJuIE5VTEw7DQo+Pj4+Pj4+PiArDQo+Pj4+Pj4+PiArICAgIHJldHVybiB2ZmlvX2dy b3VwX2dldF91c3JkYXRhKHZmaW9fZ3JvdXApOw0KPj4+Pj4NCj4+Pj4+IEkgYW0gd29ycnlpbmcg aWYgdGhlIGt2bSBpbnN0YW5jZSBnb3QgZnJvbSBncm91cC0+dXNyZGF0YSBpcyBzYWZlIA0KPj4+ Pj4gZW5vdWdoPyBXaGF0IGhhcHBlbnMgaWYgeW91IGdldCB0aGUgaW5zdGFuY2UgYWZ0ZXIga3Zt IHJlbGVhc2VkIA0KPj4+Pj4ga3ZtLXZmaW8gZGV2aWNlPw0KPj4+Pg0KPj4+PiBJdCBzaG91bGRu J3QgaGFwcGVuIGlmIHlvdSB1c2Uga3ZtX2dldF9rdm0gYW5kIGt2bV9wdXRfa3ZtIA0KPj4+PiBw cm9wZXJseS4gIEl0IGlzIGFsbW9zdCBva2F5IGluIHRoZSBwYXRjaCwganVzdDoNCj4+Pg0KPj4+ IEhvdyBhYm91dCBpZiBLVk0gcmVsZWFzZXMga3ZtLXZmaW8gZGV2aWNlIGJldHdlZW4NCj4+PiB2 ZmlvX2dyb3VwX2dldF91c3JkYXRhKCkNCj4+PiBhbmQgZ2V0X2t2bSgpPw0KPj4NCj4+IFRoYXQg Y2Fubm90IGhhcHBlbiBhcyBsb25nIGFzIHRoZXJlIGlzIGEgc3RydWN0IGZpbGUqIGZvciB0aGUg ZGV2aWNlIA0KPj4gKHNlZSBrdm1faW9jdGxfY3JlYXRlX2RldmljZSBhbmQga3ZtX2RldmljZV9y ZWxlYXNlKS4gIFNpbmNlIHlvdSdyZSANCj4+IHNlbmRpbmcgYSBpb2N0bCB0byBpdCwgaXQncyBm aW5lLg0KPiANCj4gSSB1bmRlcnN0b29kIHRoYXQgS1ZNIHNpZGUgaXMgc2FmZSwgaG93ZXZlciwg dmZpbyBzaWRlIGlzIGluZGVwZW5kZW50IA0KPiB3aXRoIGt2bSBhbmQgdGhlIHVzZXIgb2YgdXNy ZGF0YSBjYW4gZmV0Y2gga3ZtIHN0cnVjdCBhdCBhbnkgdGltZSwgDQo+IGNvbnNpZGVyIHRoaXMg c2NlbmFyaW86DQo+IA0KPiBDUFUgMCAgICAgICAgICAgICAgICAgICAgICAgICBDUFUgMQ0KPiBL Vk06ICAgICAgICAgICAgICAgICAgICAgICAgIFZGSU8vdXNlcmRhdGEgdXNlcg0KPiAgIGt2bV9p b2N0bF9jcmVhdGVfZGV2aWNlDQo+ICAgICAgZ2V0X2t2bSgpDQo+ICAgICAgICAgICAgICAgICAg ICAgICAgICAgICB2ZmlvX2dyb3VwX2dldF91c3JkYXRhKHZmaW9fZ3JvdXApDQo+ICAga3ZtX2Rl dmljZV9yZWxlYXNlDQo+ICAgICBwdXRfa3ZtKCkNCj4gICAgICAgICAgICAgICAgICAgICAgICAg ICAgICEhISBrdm0gcmVmY291bnQgaGFzIGdvbmUNCj4gICAgICAgICAgICAgICAgICAgICAgICAg ICAgIHVzZSBLVk0gc3RydWN0DQo+IA0KPiBUaGVuLCB0aGUgdXNlciBvZiB1c2VyZGF0YSBoYXZl IGZldGNoZWQga3ZtIHN0cnVjdCBidXQgdGhlIHJlZmNvdW50IA0KPiBoYXMgYWxyZWFkeSBnb25l Lg0KDQp2ZmlvX2dyb3VwX3NldF91c3JkYXRhIChhY3R1YWxseSkga3ZtX3ZmaW9fZ3JvdXBfc2V0 X2t2bSBoYXMgY2FsbGVkIGt2bV9nZXRfa3ZtIHRvbywgaG93ZXZlci4gIFdoYXQgeW91IG5lZWQg aXMgYSBtdXRleCB0aGF0IGlzIHRha2VuIGJ5IHZmaW9fZ3JvdXBfc2V0X3VzcmRhdGEgYW5kIGJ5 IHRoZSBjYWxsZXJzIG9mIHZmaW9fZ3JvdXBfZ2V0X3VzcmRhdGEuDQoNClllcywgbXV0ZXggY2Fu IGZpeCBpdCBhbmQgaXMgZ29vZCB0byBtZS4gOikNCg0K -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/21/2016 01:19 AM, Xiao, Guangrong wrote: >> On 10/19/2016 10:14 PM, Paolo Bonzini wrote: >>> On 19/10/2016 15:39, Xiao Guangrong wrote: >>> >>> >>> I understood that KVM side is safe, however, vfio side is independent >>> with kvm and the user of usrdata can fetch kvm struct at any time, >>> consider this scenario: >>> >>> CPU 0 CPU 1 >>> KVM: VFIO/userdata user >>> kvm_ioctl_create_device >>> get_kvm() >>> vfio_group_get_usrdata(vfio_group) >>> kvm_device_release >>> put_kvm() >>> !!! kvm refcount has gone >>> use KVM struct >>> >>> Then, the user of userdata have fetched kvm struct but the refcount >>> has already gone. >> >> vfio_group_set_usrdata (actually) kvm_vfio_group_set_kvm has called >>kvm_get_kvm too, however. What you need is a mutex that is taken by >>vfio_group_set_usrdata and by the callers of vfio_group_get_usrdata. > > Yes, mutex can fix it and is good to me. :) Thanks everyone, I'll cook another patch according to your guidance. -- Thanks, Jike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/19/2016 09:56 PM, Eric Blake wrote: > 17 levels of quoting is rather over-the-top. It is OKAY (and in fact > DESIRABLE) to trim your emails to relevant portions, when posting to a > high-volume list. Readers shouldn't have to scroll through pages of > deeply-nested quoting... Hi Eric, Sorry for that, will trim the quotation next time. Thanks for reminding! -- Thanks, Jike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/21/2016 01:06 AM, Paolo Bonzini wrote: > On 20/10/2016 03:48, Xiao Guangrong wrote: >> I understood that KVM side is safe, however, vfio side is independent with >> kvm and the user of usrdata can fetch kvm struct at any time, consider >> this scenario: >> >> CPU 0 CPU 1 >> KVM: VFIO/userdata user >> kvm_ioctl_create_device >> get_kvm() >> vfio_group_get_usrdata(vfio_group) >> kvm_device_release >> put_kvm() >> !!! kvm refcount has gone >> use KVM struct >> >> Then, the user of userdata have fetched kvm struct but the refcount has >> already gone. > > vfio_group_set_usrdata (actually) kvm_vfio_group_set_kvm has called > kvm_get_kvm too, however. What you need is a mutex that is taken by > vfio_group_set_usrdata and by the callers of vfio_group_get_usrdata. Hi Paolo & Guangrong, I walked the whole thread and became a little nervous: I don't want to introduce a global mutex. The problem is, as I understand, vfio_group_get_usrdata() returns a KVM pointer but it may be stale. To make the pointer always valid, it can call kvm_get_kvm() *before* return the pointer. I would apologize in advance if this idea turns out totally nonsense, but hey, please kindly help fix my whim :-) [vfio.h] struct vfio_usrdata { void *data; void (*get)(void *data); void (*put)(void *data) }; vfio_group { ... vfio_usrdata *usrdata; [kvm.ko] struvt vfio_usrdata kvmdata = { .data = kvm, .get = kvm_get_kvm, .put = kvm_put_kvm, }; fn = symbol_get(vfio_group_set_usrdata) fn(vfio_group, &kvmdata) [vfio.ko] vfio_group_set_usrdata lock vfio_group->d = kvmdata unlock void *vfio_group_get_usrdata lock struct vfio_usrdata *d = vfio_group->usrdata; d->get(d->data); unlock return d->data; void vfio_group_put_usrdata lock struct vfio_usrdata *d = vfio_group->usrdata; d->put(d->data) unlock [kvmgt.ko] call vfio_group_get_usrdata to get kvm, call vfio_group_put_usrdata to release it *never* call kvm_get_kvm/kvm_put_kvm -- Thanks, Jike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 26/10/2016 15:44, Jike Song wrote: > On 10/21/2016 01:06 AM, Paolo Bonzini wrote: >> On 20/10/2016 03:48, Xiao Guangrong wrote: >>> I understood that KVM side is safe, however, vfio side is independent with >>> kvm and the user of usrdata can fetch kvm struct at any time, consider >>> this scenario: >>> >>> CPU 0 CPU 1 >>> KVM: VFIO/userdata user >>> kvm_ioctl_create_device >>> get_kvm() >>> vfio_group_get_usrdata(vfio_group) >>> kvm_device_release >>> put_kvm() >>> !!! kvm refcount has gone >>> use KVM struct >>> >>> Then, the user of userdata have fetched kvm struct but the refcount has >>> already gone. >> >> vfio_group_set_usrdata (actually) kvm_vfio_group_set_kvm has called >> kvm_get_kvm too, however. What you need is a mutex that is taken by >> vfio_group_set_usrdata and by the callers of vfio_group_get_usrdata. > > Hi Paolo & Guangrong, > > I walked the whole thread and became a little nervous: I don't want > to introduce a global mutex. > > The problem is, as I understand, vfio_group_get_usrdata() returns a > KVM pointer but it may be stale. To make the pointer always valid, > it can call kvm_get_kvm() *before* return the pointer. That doesn't work, you still have to protect get against concurrent set. But the mutex need not be global, it is specific to the vfio device. You probably have such a mutex anyway... Paolo > I would apologize in advance if this idea turns out totally > nonsense, but hey, please kindly help fix my whim :-) > > > [vfio.h] > > struct vfio_usrdata { > void *data; > void (*get)(void *data); > void (*put)(void *data) > }; > > vfio_group { > ... > vfio_usrdata *usrdata; > > [kvm.ko] > > struvt vfio_usrdata kvmdata = { > .data = kvm, > .get = kvm_get_kvm, > .put = kvm_put_kvm, > }; > > fn = symbol_get(vfio_group_set_usrdata) > fn(vfio_group, &kvmdata) > > > [vfio.ko] > > vfio_group_set_usrdata > lock > vfio_group->d = kvmdata > unlock > > void *vfio_group_get_usrdata > lock > struct vfio_usrdata *d = vfio_group->usrdata; > d->get(d->data); > unlock > return d->data; > > void vfio_group_put_usrdata > lock > struct vfio_usrdata *d = vfio_group->usrdata; > d->put(d->data) > unlock > > [kvmgt.ko] > > call vfio_group_get_usrdata to get kvm, > call vfio_group_put_usrdata to release it > *never* call kvm_get_kvm/kvm_put_kvm > > -- > Thanks, > Jike > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/26/2016 10:45 PM, Paolo Bonzini wrote: > On 26/10/2016 15:44, Jike Song wrote: >> On 10/21/2016 01:06 AM, Paolo Bonzini wrote: >>> On 20/10/2016 03:48, Xiao Guangrong wrote: >>>> I understood that KVM side is safe, however, vfio side is independent with >>>> kvm and the user of usrdata can fetch kvm struct at any time, consider >>>> this scenario: >>>> >>>> CPU 0 CPU 1 >>>> KVM: VFIO/userdata user >>>> kvm_ioctl_create_device >>>> get_kvm() >>>> vfio_group_get_usrdata(vfio_group) >>>> kvm_device_release >>>> put_kvm() >>>> !!! kvm refcount has gone >>>> use KVM struct >>>> >>>> Then, the user of userdata have fetched kvm struct but the refcount has >>>> already gone. >>> >>> vfio_group_set_usrdata (actually) kvm_vfio_group_set_kvm has called >>> kvm_get_kvm too, however. What you need is a mutex that is taken by >>> vfio_group_set_usrdata and by the callers of vfio_group_get_usrdata. >> >> Hi Paolo & Guangrong, >> >> I walked the whole thread and became a little nervous: I don't want >> to introduce a global mutex. >> >> The problem is, as I understand, vfio_group_get_usrdata() returns a >> KVM pointer but it may be stale. To make the pointer always valid, >> it can call kvm_get_kvm() *before* return the pointer. > > That doesn't work, you still have to protect get against concurrent set. > But the mutex need not be global, it is specific to the vfio device. > You probably have such a mutex anyway... Thanks Paolo, I agree whatsoever a mutex is necessary. I cooked a patch sent to you and Alex, please kindly have a look :-) -- Thanks, Jike >> I would apologize in advance if this idea turns out totally >> nonsense, but hey, please kindly help fix my whim :-) >> >> >> [vfio.h] >> >> struct vfio_usrdata { >> void *data; >> void (*get)(void *data); >> void (*put)(void *data) >> }; >> >> vfio_group { >> ... >> vfio_usrdata *usrdata; >> >> [kvm.ko] >> >> struvt vfio_usrdata kvmdata = { >> .data = kvm, >> .get = kvm_get_kvm, >> .put = kvm_put_kvm, >> }; >> >> fn = symbol_get(vfio_group_set_usrdata) >> fn(vfio_group, &kvmdata) >> >> >> [vfio.ko] >> >> vfio_group_set_usrdata >> lock >> vfio_group->d = kvmdata >> unlock >> >> void *vfio_group_get_usrdata >> lock >> struct vfio_usrdata *d = vfio_group->usrdata; >> d->get(d->data); >> unlock >> return d->data; >> >> void vfio_group_put_usrdata >> lock >> struct vfio_usrdata *d = vfio_group->usrdata; >> d->put(d->data) >> unlock >> >> [kvmgt.ko] >> >> call vfio_group_get_usrdata to get kvm, >> call vfio_group_put_usrdata to release it >> *never* call kvm_get_kvm/kvm_put_kvm -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index d1d70e0..6b8d1d2 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -86,6 +86,7 @@ struct vfio_group { struct mutex unbound_lock; atomic_t opened; bool noiommu; + void *usrdata; }; struct vfio_device { @@ -447,14 +448,13 @@ static struct vfio_group *vfio_group_try_get(struct vfio_group *group) } static -struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) +struct vfio_group *__vfio_group_get_from_iommu(struct iommu_group *iommu_group) { struct vfio_group *group; mutex_lock(&vfio.group_lock); list_for_each_entry(group, &vfio.group_list, vfio_next) { if (group->iommu_group == iommu_group) { - vfio_group_get(group); mutex_unlock(&vfio.group_lock); return group; } @@ -464,6 +464,17 @@ struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) return NULL; } +static +struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group) +{ + struct vfio_group *group = __vfio_group_get_from_iommu(iommu_group); + if (!group) + return NULL; + + vfio_group_get(group); + return group; +} + static struct vfio_group *vfio_group_get_from_minor(int minor) { struct vfio_group *group; @@ -1728,6 +1739,31 @@ long vfio_external_check_extension(struct vfio_group *group, unsigned long arg) } EXPORT_SYMBOL_GPL(vfio_external_check_extension); +void vfio_group_set_usrdata(struct vfio_group *group, void *data) +{ + group->usrdata = data; +} +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata); + +void *vfio_group_get_usrdata(struct vfio_group *group) +{ + return group->usrdata; +} +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata); + +void *vfio_group_get_usrdata_by_device(struct device *dev) +{ + struct vfio_group *vfio_group; + + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group); + if (!vfio_group) + return NULL; + + return vfio_group_get_usrdata(vfio_group); +} +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata_by_device); + + /** * Sub-module support */ diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 0ecae0b..712588f 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -91,6 +91,10 @@ extern void vfio_unregister_iommu_driver( extern int vfio_external_user_iommu_id(struct vfio_group *group); extern long vfio_external_check_extension(struct vfio_group *group, unsigned long arg); +extern void vfio_group_set_usrdata(struct vfio_group *group, void *data); +extern void *vfio_group_get_usrdata(struct vfio_group *group); +extern void *vfio_group_get_usrdata_by_device(struct device *dev); + /* * Sub-module helpers diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 1dd087d..e00d401 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -60,6 +60,20 @@ static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group) symbol_put(vfio_group_put_external_user); } +static void kvm_vfio_group_set_kvm(struct vfio_group *group, void *kvm) +{ + void (*fn)(struct vfio_group *, void *); + + fn = symbol_get(vfio_group_set_usrdata); + if (!fn) + return; + + fn(group, kvm); + kvm_get_kvm(kvm); + + symbol_put(vfio_group_set_usrdata); +} + static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group) { long (*fn)(struct vfio_group *, unsigned long); @@ -161,6 +175,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) kvm_vfio_update_coherency(dev); + kvm_vfio_group_set_kvm(vfio_group, dev->kvm); + return 0; case KVM_DEV_VFIO_GROUP_DEL: @@ -200,6 +216,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg) kvm_vfio_update_coherency(dev); + kvm_put_kvm(dev->kvm); + return ret; }