Message ID | 20240429065046.3688701-1-zhenzhong.duan@intel.com (mailing list archive) |
---|---|
Series | Add a host IOMMU device abstraction to check with vIOMMU |
On 4/29/24 08:50, Zhenzhong Duan wrote:
> Hi,
>
> The most important change in this version is introducing a common
> HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
> between vIOMMU and HostIOMMUDevice.
>
> HostIOMMUDeviceClass::realize() is introduced to initialize
> HostIOMMUDeviceCaps and other fields of the HostIOMMUDevice variants.
>
> HostIOMMUDeviceClass::check_cap() is introduced to query host IOMMU
> device capabilities.
>
> After the change, part2 is only 3 patches, so I merged it with part1 into
> a single prerequisite series, same for the changelog. If anyone doesn't
> like that, I can split it again.
>
> The class tree is as below:
>
> HostIOMMUDevice
>  | .caps
>  | .realize()
>  | .check_cap()
>  |
>  .---------------------------.----------------------------.
>  |                           |                            |
> HostIOMMUDeviceLegacyVFIO   {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>  | .vdev                     | {.vdev}                     | .iommufd
>                                                            | .devid
>                                                            | [.ioas_id]
>                                                            | [.attach_hwpt()]
>                                                            | [.detach_hwpt()]
>                                                            |
>                                           .----------------.----------------.
>                                           |                                 |
>                              HostIOMMUDeviceIOMMUFDVFIO   {HostIOMMUDeviceIOMMUFDVDPA}
>                                           | .vdev                           | {.vdev}
>
> * The attributes in [] will be implemented in the nesting series.
> * The classes in {} will be implemented in the future.
> * .vdev in each class points to a different agent device,
>   e.g., for VFIO it points to a VFIODevice.
>
> PATCH1-4:   Introduce HostIOMMUDevice and its sub-classes
> PATCH5-11:  Introduce HostIOMMUDeviceCaps, implement the .realize() and .check_cap() handlers
> PATCH12-16: Create a HostIOMMUDevice instance and pass it to the vIOMMU
> PATCH17-19: Implement the compatibility check between the host IOMMU and the vIOMMU (intel_iommu)
>
> QEMU code can be found at:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v3
>
> Besides the compatibility check in this series, in the nesting series this
> host IOMMU device is extended for much wider usage. For anyone interested
> in the nesting series, here is the link:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2

v4 should be a good candidate; we will need feedback from the vIOMMU
maintainers though.

However, have you considered another/complementary approach, which would be
to create a host IOMMU (iommufd) backend object and a vIOMMU device object
together for each vfio-pci device being plugged into the machine?

Something like,

  -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
  -object iommufd,id=iommufd1 \
  -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
  -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0

The vIOMMU device would be linked to the host IOMMU (iommufd) backend object
at realize time, and it would simplify the discovery of the host IOMMU
properties. The implementation would be more straightforward.

That said, I didn't study deeply what needs to be done. The vIOMMU
implementation is not ready yet to support multiple instances, and some
massaging is needed to change that first.

Thanks,

C.
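For readers skimming the thread, a rough sketch of the interface the cover
letter describes could look like the following. The structure and handler
names come from the cover letter; the field types and exact signatures are
illustrative assumptions, not the actual code from the series:

    /* Sketch only: names from the cover letter, types/signatures assumed. */
    #include "qemu/osdep.h"
    #include "qom/object.h"

    typedef struct HostIOMMUDeviceCaps {
        uint32_t type;      /* kind of host IOMMU backing the device */
        uint8_t  aw_bits;   /* e.g. host address width, for compat checks */
    } HostIOMMUDeviceCaps;

    typedef struct HostIOMMUDevice {
        Object parent_obj;
        HostIOMMUDeviceCaps caps;
    } HostIOMMUDevice;

    typedef struct HostIOMMUDeviceClass {
        ObjectClass parent_class;

        /* Fill in @caps and backend-specific fields from the agent device
         * (@opaque is e.g. a VFIODevice for the VFIO sub-classes). */
        int (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);

        /* Let the vIOMMU query one capability for its compatibility check. */
        int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
    } HostIOMMUDeviceClass;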
On Fri, May 03, 2024 at 04:04:25PM +0200, Cédric Le Goater wrote:
> However, have you considered another/complementary approach, which
> would be to create a host IOMMU (iommufd) backend object and a vIOMMU
> device object together for each vfio-pci device being plugged into the
> machine?
>
> Something like,
>   -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
>   -object iommufd,id=iommufd1 \
>   -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
>   -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0

? The main point of this is to have a single iommufd FD open in QEMU, not
multiple. Would you achieve this with an iommufd0 and an iommufd1?

Jason
On 5/3/24 16:10, Jason Gunthorpe wrote:
> On Fri, May 03, 2024 at 04:04:25PM +0200, Cédric Le Goater wrote:
>> However, have you considered another/complementary approach, which
>> would be to create a host IOMMU (iommufd) backend object and a vIOMMU
>> device object together for each vfio-pci device being plugged into the
>> machine?
>>
>> Something like,
>>   -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
>>   -object iommufd,id=iommufd1 \
>>   -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
>>   -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0
>
> ? The main point of this is to have a single iommufd FD open in
> QEMU, not multiple.

Oops. The above example should use the same IOMMUFD object instance,
iommufd0, throughout. This is a bogus copy-paste of a command line with
multiple vfio-pci devices, each using its own IOMMUFD object instance.
That's where the idea comes from. Sorry for the noise.

C.
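For reference, with the copy-paste fixed, the proposed command line would use
a single IOMMUFD backend instance throughout, along these lines:

  -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
  -object iommufd,id=iommufd0 \
  -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd0 \
  -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0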
Hi Cédric,

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Sent: Friday, May 3, 2024 10:04 PM
>To: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu-devel@nongnu.org
>Cc: alex.williamson@redhat.com; eric.auger@redhat.com; mst@redhat.com;
>peterx@redhat.com; jasowang@redhat.com; jgg@nvidia.com;
>nicolinc@nvidia.com; joao.m.martins@oracle.com; Tian, Kevin
><kevin.tian@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; Peng, Chao P
><chao.p.peng@intel.com>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Hi,
>>
>> The most important change in this version is introducing a common
>> HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
>> between vIOMMU and HostIOMMUDevice.
>>
>> HostIOMMUDeviceClass::realize() is introduced to initialize
>> HostIOMMUDeviceCaps and other fields of the HostIOMMUDevice variants.
>>
>> HostIOMMUDeviceClass::check_cap() is introduced to query host IOMMU
>> device capabilities.
>>
>> After the change, part2 is only 3 patches, so I merged it with part1 into
>> a single prerequisite series, same for the changelog. If anyone doesn't
>> like that, I can split it again.
>>
>> The class tree is as below:
>>
>> HostIOMMUDevice
>>  | .caps
>>  | .realize()
>>  | .check_cap()
>>  |
>>  .---------------------------.----------------------------.
>>  |                           |                            |
>> HostIOMMUDeviceLegacyVFIO   {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>>  | .vdev                     | {.vdev}                     | .iommufd
>>                                                            | .devid
>>                                                            | [.ioas_id]
>>                                                            | [.attach_hwpt()]
>>                                                            | [.detach_hwpt()]
>>                                                            |
>>                                           .----------------.----------------.
>>                                           |                                 |
>>                              HostIOMMUDeviceIOMMUFDVFIO   {HostIOMMUDeviceIOMMUFDVDPA}
>>                                           | .vdev                           | {.vdev}
>>
>> * The attributes in [] will be implemented in the nesting series.
>> * The classes in {} will be implemented in the future.
>> * .vdev in each class points to a different agent device,
>>   e.g., for VFIO it points to a VFIODevice.
>>
>> PATCH1-4:   Introduce HostIOMMUDevice and its sub-classes
>> PATCH5-11:  Introduce HostIOMMUDeviceCaps, implement the .realize() and .check_cap() handlers
>> PATCH12-16: Create a HostIOMMUDevice instance and pass it to the vIOMMU
>> PATCH17-19: Implement the compatibility check between the host IOMMU and the vIOMMU (intel_iommu)
>>
>> QEMU code can be found at:
>> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v3
>>
>> Besides the compatibility check in this series, in the nesting series this
>> host IOMMU device is extended for much wider usage. For anyone interested
>> in the nesting series, here is the link:
>> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
>
>
>v4 should be a good candidate; we will need feedback from the vIOMMU
>maintainers though.
>
>However, have you considered another/complementary approach, which
>would be to create a host IOMMU (iommufd) backend object and a vIOMMU
>device object together for each vfio-pci device being plugged into the
>machine?

I did consider a single iommufd instance for QEMU and finally chose to
support multiple iommufd instances, for the reasons below:

I was taking iommufd as a backend of the VFIO device, not a backend of the
vIOMMU. So there is an iommufd property linked to the iommufd instances.
We do support multiple iommufd instances in the nesting series, just as we
do in the cdev series, such as:

  -device intel-iommu,caching-mode=on,dma-drain=on,device-iotlb=on,x-scalable-mode=modern
  -object iommufd,id=iommufd0
  -device vfio-pci,host=6f:01.0,id=vfio0,bus=root0,iommufd=iommufd0
  -object iommufd,id=iommufd1
  -device vfio-pci,host=6f:01.1,id=vfio1,bus=root1,iommufd=iommufd1
  -device vfio-pci,host=6f:01.2,id=vfio2,bus=root2

Adding an iommufd property to the vIOMMU would limit the whole of QEMU to a
single iommufd instance; it is also confusing when there is also a vfio
device using the legacy backend.

I'm not clear how useful support for multiple iommufd instances is. One
possible benefit is security? It may bring slightly finer-grained isolation
in the kernel.

We can discuss this further; if it's unnecessary to support multiple iommufd
instances in the nesting series, I can change to single iommufd instance
support and add an iommufd property to the vIOMMU, just as you suggested.

Thanks
Zhenzhong

>
>Something like,
>
>  -device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0 \
>  -object iommufd,id=iommufd1 \
>  -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,iommufd=iommufd1 \
>  -device vfio-pci,host=0000:08:10.0,bus=pci.1,iommufd=iommufd0
>
>The vIOMMU device would be linked to the host IOMMU (iommufd) backend
>object at realize time, and it would simplify the discovery of the host
>IOMMU properties. The implementation would be more straightforward.
>
>That said, I didn't study deeply what needs to be done. The vIOMMU
>implementation is not ready yet to support multiple instances, and some
>massaging is needed to change that first.
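A rough sketch of what such a vIOMMU iommufd property could look like,
assuming QEMU's existing qdev link-property helper and the iommufd backend
object already used by vfio-pci; the intel-iommu property and the
IntelIOMMUState field shown here are hypothetical, not code from the series:

    #include "qemu/osdep.h"
    #include "hw/qdev-properties.h"
    #include "hw/i386/intel_iommu.h"
    #include "sysemu/iommufd.h"

    /* Hypothetical: link the vIOMMU to one iommufd backend object on the
     * command line (-device intel-iommu,...,iommufd=iommufd0), so host
     * IOMMU properties can be discovered at realize time.  DEFINE_PROP_LINK
     * and TYPE_IOMMUFD_BACKEND exist today; the "iommufd" field on
     * IntelIOMMUState does not. */
    static Property vtd_iommufd_properties[] = {
        DEFINE_PROP_LINK("iommufd", IntelIOMMUState, iommufd,
                         TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
        DEFINE_PROP_END_OF_LIST(),
    };

At realize time the vIOMMU could then consult the linked backend for the
host IOMMU properties, rather than receiving a per-device HostIOMMUDevice,
which is the simplification Cédric describes above.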
On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
> I'm not clear how useful support for multiple iommufd instances is. One
> possible benefit is security? It may bring slightly finer-grained isolation
> in the kernel.

No. I don't think there is any use case; it is only harmful.

Jason
>-----Original Message-----
>From: Jason Gunthorpe <jgg@nvidia.com>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
>
>> I'm not clear how useful support for multiple iommufd instances is. One
>> possible benefit is security? It may bring slightly finer-grained isolation
>> in the kernel.
>
>No. I don't think there is any use case; it is only harmful.

OK, so we need to limit QEMU to only one iommufd instance.

In the cdev series, we support a mix of the legacy and iommufd backends and
multiple iommufd backend instances, for flexibility. We need to decide
whether to have this limitation only in the nesting series or globally
(including cdev).

May I ask what harm multiple instances may cause?

Thanks
Zhenzhong
On Tue, May 07, 2024 at 02:24:30AM +0000, Duan, Zhenzhong wrote:
> >On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
> >
> >> I'm not clear how useful support for multiple iommufd instances is. One
> >> possible benefit is security? It may bring slightly finer-grained isolation
> >> in the kernel.
> >
> >No. I don't think there is any use case; it is only harmful.
>
> OK, so we need to limit QEMU to only one iommufd instance.

I don't know about a limit, but you don't need to do extra stuff to make it
work.

The main issue will be to get all the vIOMMU instances to share the same
iommufd IOAS for the guest-physical mapping. Otherwise, each vIOMMU should
be largely unaware of the others sharing (or not) an iommufd.

If you can structure things properly, it probably doesn't need a hard limit;
it will just work worse.

Jason
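Jason's point about sharing one IOAS can be made concrete with a minimal
user-space sketch, assuming the Linux iommufd uAPI from <linux/iommufd.h>;
this is not QEMU code, and error handling and the VFIO cdev device attach
are omitted. All vIOMMU/backend instances would reuse the one FD and the one
IOAS that holds the guest-physical mapping:

    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/iommufd.h>

    /* Open /dev/iommu once and allocate a single IOAS shared by everyone. */
    static int alloc_shared_ioas(int *out_fd, __u32 *out_ioas_id)
    {
        int fd = open("/dev/iommu", O_RDWR);
        struct iommu_ioas_alloc alloc = { .size = sizeof(alloc) };

        if (fd < 0 || ioctl(fd, IOMMU_IOAS_ALLOC, &alloc) < 0)
            return -1;

        *out_fd = fd;                      /* one iommufd FD for the VM   */
        *out_ioas_id = alloc.out_ioas_id;  /* one IOAS for guest memory   */
        return 0;
    }

    /* Map one chunk of guest RAM (host VA) into the shared IOAS at its GPA. */
    static int map_guest_ram(int fd, __u32 ioas_id, void *hva,
                             __u64 gpa, __u64 len)
    {
        struct iommu_ioas_map map = {
            .size = sizeof(map),
            .flags = IOMMU_IOAS_MAP_READABLE | IOMMU_IOAS_MAP_WRITEABLE |
                     IOMMU_IOAS_MAP_FIXED_IOVA,
            .ioas_id = ioas_id,
            .user_va = (__u64)(uintptr_t)hva,
            .length = len,
            .iova = gpa,
        };
        return ioctl(fd, IOMMU_IOAS_MAP, &map);
    }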
>-----Original Message-----
>From: Jason Gunthorpe <jgg@nvidia.com>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On Tue, May 07, 2024 at 02:24:30AM +0000, Duan, Zhenzhong wrote:
>> >On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
>> >
>> >> I'm not clear how useful support for multiple iommufd instances is. One
>> >> possible benefit is security? It may bring slightly finer-grained isolation
>> >> in the kernel.
>> >
>> >No. I don't think there is any use case; it is only harmful.
>>
>> OK, so we need to limit QEMU to only one iommufd instance.
>
>I don't know about a limit, but you don't need to do extra stuff to make
>it work.
>
>The main issue will be to get all the vIOMMU instances to share the same
>iommufd IOAS for the guest-physical mapping. Otherwise, each vIOMMU
>should be largely unaware of the others sharing (or not) an iommufd.

I see.

>
>If you can structure things properly, it probably doesn't need a hard
>limit; it will just work worse.

OK, thanks for clarifying. The extra code to support multiple instances in
intel_iommu is trivial, so I'd like to keep this flexibility for users, just
like cdev. Users can easily configure the QEMU command line to use a single
IOMMUFD instance whenever they want.

Thanks
Zhenzhong
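Adapting Zhenzhong's earlier example to such a single-instance configuration
would look something like this, with every vfio-pci device pointing at the
same iommufd0 object:

  -device intel-iommu,caching-mode=on,dma-drain=on,device-iotlb=on,x-scalable-mode=modern
  -object iommufd,id=iommufd0
  -device vfio-pci,host=6f:01.0,id=vfio0,bus=root0,iommufd=iommufd0
  -device vfio-pci,host=6f:01.1,id=vfio1,bus=root1,iommufd=iommufd0
  -device vfio-pci,host=6f:01.2,id=vfio2,bus=root2,iommufd=iommufd0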