VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

On Mon, Jan 25, 2016 at 09:45:14PM +0000, Tian, Kevin wrote:
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: Tuesday, January 26, 2016 5:30 AM
> > 
> > [cc +Neo @Nvidia]
> > 
> > Hi Jike,
> > 
> > On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
> > > On 01/20/2016 05:05 PM, Tian, Kevin wrote:
> > > > I would expect we can spell out next level tasks toward above
> > > > direction, upon which Alex can easily judge whether there are
> > > > some common VFIO framework changes that he can help :-)
> > >
> > > Hi Alex,
> > >
> > > Here is a draft task list after a short discussion w/ Kevin,
> > > would you please have a look?
> > >
> > > 	Bus Driver
> > >
> > > 		{ in i915/vgt/xxx.c }
> > >
> > > 		- define a subset of vfio_pci interfaces
> > > 		- selective pass-through (say aperture)
> > > 		- trap MMIO: interface w/ QEMU
> > 
> > What's included in the subset?  Certainly the bus reset ioctls really
> > don't apply, but you'll need to support the full device interface,
> > right?  That includes the region info ioctl and access through the vfio
> > device file descriptor as well as the interrupt info and setup ioctls.
> 
> That is the next level detail Jike will figure out and discuss soon.
> 
> yes, basic region info/access should be necessary. For interrupt, could
> you elaborate a bit what current interface is doing? If just about creating
> an eventfd for virtual interrupt injection, it applies to vgpu too.
> 
> > 
> > > 	IOMMU
> > >
> > > 		{ in a new vfio_xxx.c }
> > >
> > > 		- allocate: struct device & IOMMU group
> > 
> > It seems like the vgpu instance management would do this.
> > 
> > > 		- map/unmap functions for vgpu
> > > 		- rb-tree to maintain iova/hpa mappings
> > 
> > Yep, pretty much what type1 does now, but without mapping through the
> > IOMMU API.  Essentially just a database of the current userspace
> > mappings that can be accessed for page pinning and IOVA->HPA
> > translation.
> 
> The thought is to reuse iommu_type1.c, by abstracting several underlying
> operations and then put vgpu specific implementation in a vfio_vgpu.c (e.g.
> for map/unmap instead of using IOMMU API, an iova/hpa mapping is updated
> accordingly), etc.
> 
> This file will also connect between VFIO and vendor specific vgpu driver,
> e.g. exposing interfaces to allow the latter querying iova<->hpa and also 
> creating necessary VFIO structures like aforementioned device/IOMMUas...
> 
> > 
> > > 		- interacts with kvmgt.c
> > >
> > >
> > > 	vgpu instance management
> > >
> > > 		{ in i915 }
> > >
> > > 		- path, create/destroy
> > >
> > 
> > Yes, and since you're creating and destroying the vgpu here, this is
> > where I'd expect a struct device to be created and added to an IOMMU
> > group.  The lifecycle management should really include links between
> > the vGPU and physical GPU, which would be much, much easier to do with
> > struct devices create here rather than at the point where we start
> > doing vfio "stuff".
> 
> It's invoked here, but expecting the function exposed by vfio_vgpu.c. It's
> not good to touch vfio internal structures from another module (such as
> i915.ko)
> 
> > 
> > Nvidia has also been looking at this and has some ideas how we might
> > standardize on some of the interfaces and create a vgpu framework to
> > help share code between vendors and hopefully make a more consistent
> > userspace interface for libvirt as well.  I'll let Neo provide some
> > details.  Thanks,
> > 
> 
> Nice to know that. Neo, please share your thought here.

Hi Alex, Kevin and Jike,

(Seems I shouldn't use attachment, resend it again to the list, patches are
inline at the end)

Thanks for adding me to this technical discussion, a great opportunity
for us to design together which can bring both Intel and NVIDIA vGPU solution to
KVM platform.

Instead of directly jumping to the proposal that we have been working on
recently for NVIDIA vGPU on KVM, I think it is better for me to put out couple
quick comments / thoughts regarding the existing discussions on this thread as
fundamentally I think we are solving the same problem, DMA, interrupt and MMIO.

Then we can look at what we have, hopefully we can reach some consensus soon.

> Yes, and since you're creating and destroying the vgpu here, this is
> where I'd expect a struct device to be created and added to an IOMMU
> group.  The lifecycle management should really include links between
> the vGPU and physical GPU, which would be much, much easier to do with
> struct devices create here rather than at the point where we start
> doing vfio "stuff".

Infact to keep vfio-vgpu to be more generic, vgpu device creation and management
can be centralized and done in vfio-vgpu. That also include adding to IOMMU
group and VFIO group.

Graphics driver can register with vfio-vgpu to get management and emulation call
backs to graphics driver.   

We already have struct vgpu_device in our proposal that keeps pointer to
physical device.  

> - vfio_pci will inject an IRQ to guest only when physical IRQ
> generated; whereas vfio_vgpu may inject an IRQ for emulation
> purpose. Anyway they can share the same injection interface;

eventfd to inject the interrupt is known to vfio-vgpu, that fd should be
available to graphics driver so that graphics driver can inject interrupts
directly when physical device triggers interrupt. 

Here is the proposal we have, please review.

Please note the patches we have put out here is mainly for POC purpose to
verify our understanding also can serve the purpose to reduce confusions and speed up 
our design, although we are very happy to refine that to something eventually
can be used for both parties and upstreamed.

Linux vGPU kernel design

VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

Commit Message

Comments

Patch