Message ID | 20200519163234.226513-1-sashal@kernel.org (mailing list archive) |
---|---|
Series | DirectX on Linux |
On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> +/*
> + * Dxgkrnl Graphics Port Driver ioctl definitions
> + *
> + */
> +
> +#define LX_IOCTL_DIR_WRITE 0x1
> +#define LX_IOCTL_DIR_READ 0x2
> +
> +#define LX_IOCTL_DIR(_ioctl) (((_ioctl) >> 30) & 0x3)
> +#define LX_IOCTL_SIZE(_ioctl) (((_ioctl) >> 16) & 0x3FFF)
> +#define LX_IOCTL_TYPE(_ioctl) (((_ioctl) >> 8) & 0xFF)
> +#define LX_IOCTL_CODE(_ioctl) (((_ioctl) >> 0) & 0xFF)

Why create new ioctl macros, can't the "normal" kernel macros work
properly?

> +#define LX_IOCTL(_dir, _size, _type, _code) ( \
> +	(((uint)(_dir) & 0x3) << 30) | \
> +	(((uint)(_size) & 0x3FFF) << 16) | \
> +	(((uint)(_type) & 0xFF) << 8) | \
> +	(((uint)(_code) & 0xFF) << 0))
> +
> +#define LX_IO(_type, _code) LX_IOCTL(0, 0, (_type), (_code))
> +#define LX_IOR(_type, _code, _size) \
> +	LX_IOCTL(LX_IOCTL_DIR_READ, (_size), (_type), (_code))
> +#define LX_IOW(_type, _code, _size) \
> +	LX_IOCTL(LX_IOCTL_DIR_WRITE, (_size), (_type), (_code))
> +#define LX_IOWR(_type, _code, _size) \
> +	LX_IOCTL(LX_IOCTL_DIR_WRITE | \
> +	LX_IOCTL_DIR_READ, (_size), (_type), (_code))
> +
> +#define LX_DXOPENADAPTERFROMLUID \
> +	LX_IOWR(0x47, 0x01, sizeof(struct d3dkmt_openadapterfromluid))

<snip>

These structures do not seem to be all using the correct types for a
"real" ioctl in the kernel, so you will have to fix them all up before
this will work properly.

> +void ioctl_desc_init(void);

Very odd global name you are using here :)

Anyway, neat stuff, glad to see it posted, great work!

greg k-h
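To make the two review comments above concrete, here is a minimal userspace sketch; it is not taken from the patch. It assumes an architecture that uses the asm-generic ioctl layout (x86 and arm64 do; a few others, such as powerpc and mips, use different field widths). The struct `example_openadapter` and its fields are hypothetical stand-ins for `struct d3dkmt_openadapterfromluid`, whose real layout is not shown in this thread. Under those assumptions the stock `_IOWR()` macro should produce the same command number as `LX_IOWR()`, and an argument struct built only from fixed-width types with explicit padding keeps the same layout for 32-bit and 64-bit callers.

```c
#include <assert.h>
#include <linux/ioctl.h>   /* _IOWR, _IOC_SIZE, _IOC_TYPE */
#include <linux/types.h>   /* __u32, __u64 */

/*
 * Hypothetical argument struct (field names are assumptions): fixed-width
 * types only, explicit padding, no pointers or longs.
 */
struct example_openadapter {
	__u64 adapter_luid;    /* assumed field, for illustration only */
	__u32 adapter_handle;  /* assumed field, for illustration only */
	__u32 reserved;        /* explicit padding to a multiple of 8 bytes */
};

_Static_assert(sizeof(struct example_openadapter) == 16,
	       "layout must not depend on the compiler's implicit padding");

/* The macros from the patch, with "uint" spelled out as unsigned int. */
#define LX_IOCTL(_dir, _size, _type, _code) ( \
	(((unsigned int)(_dir) & 0x3) << 30) | \
	(((unsigned int)(_size) & 0x3FFF) << 16) | \
	(((unsigned int)(_type) & 0xFF) << 8) | \
	(((unsigned int)(_code) & 0xFF) << 0))
#define LX_IOWR(_type, _code, _size) \
	LX_IOCTL(0x1 | 0x2, (_size), (_type), (_code))

/* The same command number built with the stock kernel macro. */
#define EXAMPLE_OPENADAPTER _IOWR(0x47, 0x01, struct example_openadapter)

/* Returns 1 if both encodings agree on the architecture being compiled for. */
int example_encodings_match(void)
{
	unsigned int lx = LX_IOWR(0x47, 0x01, sizeof(struct example_openadapter));
	unsigned int std = EXAMPLE_OPENADAPTER;

	return lx == std;
}
```

If `example_encodings_match()` returns 1 on the target architecture, switching to the standard macros would not change any command number, which is one way to read the question about why separate macros are needed.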
On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> +
> +#define DXGK_MAX_LOCK_DEPTH 64
> +#define W_MAX_PATH 260

We already have a max path number, why use a different one?

> +#define d3dkmt_handle u32
> +#define d3dgpu_virtual_address u64
> +#define winwchar u16
> +#define winhandle u64
> +#define ntstatus int
> +#define winbool u32
> +#define d3dgpu_size_t u64

These are all ripe for a simple search/replace in your editor before you
do your next version :)

thanks,

greg k-h
On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> +static int dxgglobal_init_global_channel(struct hv_device *hdev)
> +{
> +	int ret = 0;
> +
> +	TRACE_DEBUG(1, "%s %x %x", __func__, hdev->vendor_id, hdev->device_id);
> +	{
> +		TRACE_DEBUG(1, "device type : %pUb\n", &hdev->dev_type);
> +		TRACE_DEBUG(1, "device channel: %pUb %p primary: %p\n",
> +			    &hdev->channel->offermsg.offer.if_type,
> +			    hdev->channel, hdev->channel->primary_channel);
> +	}
> +
> +	if (dxgglobal->hdev) {
> +		/* This device should appear only once */
> +		pr_err("dxgglobal already initialized\n");
> +		ret = -EBADE;
> +		goto error;
> +	}
> +
> +	dxgglobal->hdev = hdev;
> +
> +	ret = dxgvmbuschannel_init(&dxgglobal->channel, hdev);
> +	if (ret) {
> +		pr_err("dxgvmbuschannel_init failed: %d\n", ret);
> +		goto error;
> +	}
> +
> +	ret = dxgglobal_getiospace(dxgglobal);
> +	if (ret) {
> +		pr_err("getiospace failed: %d\n", ret);
> +		goto error;
> +	}
> +
> +	ret = dxgvmb_send_set_iospace_region(dxgglobal->mmiospace_base,
> +					     dxgglobal->mmiospace_size, 0);
> +	if (ret) {
> +		pr_err("send_set_iospace_region failed\n");
> +		goto error;
> +	}
> +
> +	hv_set_drvdata(hdev, dxgglobal);
> +
> +	if (alloc_chrdev_region(&dxgglobal->device_devt, 0, 1, "dxgkrnl") < 0) {
> +		pr_err("alloc_chrdev_region failed\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +	dxgglobal->devt_initialized = true;
> +	dxgglobal->device_class = class_create(THIS_MODULE, "dxgkdrv");
> +	if (dxgglobal->device_class == NULL) {
> +		pr_err("class_create failed\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +	dxgglobal->device_class->devnode = dxg_devnode;
> +	dxgglobal->device = device_create(dxgglobal->device_class, NULL,
> +					  dxgglobal->device_devt, NULL, "dxg");
> +	if (dxgglobal->device == NULL) {
> +		pr_err("device_create failed\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +	dxgglobaldev = dxgglobal->device;
> +	cdev_init(&dxgglobal->device_cdev, &dxgk_fops);
> +	ret = cdev_add(&dxgglobal->device_cdev, dxgglobal->device_devt, 1);
> +	if (ret < 0) {
> +		pr_err("cdev_add failed: %d\n", ret);
> +		goto error;
> +	}
> +	dxgglobal->cdev_initialized = true;
> +
> +error:
> +	return ret;
> +}

As you only are asking for a single char dev node, please just use the
misc device api instead of creating your own class and major number on
the fly. It's much simpler and easier overall to make sure you got all
of the above logic correct.

thanks,

greg k-h
On Tue, May 19, 2020 at 07:21:05PM +0200, Greg KH wrote:
>On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
>> +
>> +#define DXGK_MAX_LOCK_DEPTH 64
>> +#define W_MAX_PATH 260
>
>We already have a max path number, why use a different one?

It's max path for Windows, not Linux (thus the "W_" prefix) :)

Maybe changing it to WIN_MAX_PATH or such will make it better?

>> +#define d3dkmt_handle u32
>> +#define d3dgpu_virtual_address u64
>> +#define winwchar u16
>> +#define winhandle u64
>> +#define ntstatus int
>> +#define winbool u32
>> +#define d3dgpu_size_t u64
>
>These are all ripe for a simple search/replace in your editor before you
>do your next version :)

I've actually attempted that, and reverted that change, mostly because
the whole 'handle' thing became very confusing.

Note that we have a few 'handles', each with a different size, and thus
calling get_something_something_handle() type of functions became very
confusing since it's not clear what handle we're working with in that
case.

With regards to the rest, I wanted to leave stuff like 'winbool' to
document the expected ABI between the Windows and Linux side of things.
Ideally it would be 'bool' or 'u8', but as you see we had to use 'u32'
here which I feel lessens our ability to have the code document itself.

I don't feel too strongly against doing the conversion, and I won't
object to doing it if you do, but just be aware that I've tried it and
preferred to go back (even though our coding style doesn't like this) :)
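One possible middle ground for the handle confusion described above, offered only as a sketch and not something anyone proposed in this thread: wrap each handle width in its own single-member struct. The type names below are hypothetical, and the sketch uses `<stdint.h>` types so that it compiles in userspace (kernel code would use `u32`/`u64`). Treating 0 as an invalid handle is also an assumption made purely for illustration. The idea is that the compiler then rejects code that passes a 64-bit handle where a 32-bit one is expected, while the type name still documents the ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrappers: distinct types that the compiler will not mix up. */
struct example_kmt_handle { uint32_t v; };  /* role of d3dkmt_handle (u32) */
struct example_win_handle { uint64_t v; };  /* role of winhandle (u64) */

/* The wrappers add no padding, so the wire size is unchanged. */
_Static_assert(sizeof(struct example_kmt_handle) == 4, "32-bit handle");
_Static_assert(sizeof(struct example_win_handle) == 8, "64-bit handle");

/* Assumption for this sketch only: a zero value means "no handle". */
int example_kmt_handle_valid(struct example_kmt_handle h)
{
	return h.v != 0;
}

int example_win_handle_valid(struct example_win_handle h)
{
	return h.v != 0;
}
```

Passing a `struct example_win_handle` to `example_kmt_handle_valid()` is a compile error, whereas with `#define winhandle u64` the same mistake would at most produce a silent truncation.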
Hi Sasha

So obviously great that Microsoft is trying to upstream all this, and
very much welcome and all that.

But I guess there's a bunch of rather fundamental issues before we
look into any kind of code details. And that might make this quite a
hard sell for upstream to drivers/gpu subsystem:

- From the blog it sounds like the userspace is all closed. That
includes the hw specific part and compiler chunks, all stuff we've
generally expected to be able to look in the past for any kind of
other driver. It's even documented here:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements

What's your plan here?

btw since the main goal here (at least at first) seems to be get
compute and ML going the official work-around here is to relabel your
driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
entire thing or so) and then Olof and Greg will take it into
drivers/accel ...

- Next up (but that's not really a surprise for a fresh vendor driver)
at a more technical level, this seems to reinvent the world, from
device enumeration (why is this not exposed as /dev/dri/card0 so it
better integrates with existing linux desktop stuff, in case that
becomes a goal ever) down to reinvented kref_put_mutex (and please
look at drm_device->struct_mutex for an example of how bad of a
nightmare that locking pattern is and how many years it took us to
untangle that one).

- Why DX12 on linux? Looking at this feels like classic divide and
conquer (or well triple E from the 90s), we have vk, we have
drm_syncobj, we have an entire ecosystem of winsys layers that work
across vendors. Is the plan here that we get a dx12 driver for other
hw mesa drivers from you guys, so this is all consistent and we have a
nice linux platform? How does this integrate everywhere else with
linux winsys standards, like dma-buf for passing stuff around,
dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
some idea how it all meshes together?

- There's been a pile of hallway track/private discussions about moving on from the buffer-based memory managed model to something more modern. That relates to your DXLOCK2 question, but there's a lot more to userspace managed gpu memory residency than just that. monitored fences are another part. Also, to avoid a platform split we need to figure out how to tie this back into the dma-buf and dma-fence (including various uapi flavours) or it'll be made of fail. dx12 has all that in some form, except 0 integration with the linux stuff we have (no surprise, since linux isn't windows). Finally if we go to the trouble of a completely revamped I think ioctls aren't a great idea, something like iouring (the gossip name is drm_uring) would be a lot better. Also for easier paravirt we'd need 0 cpu pointers in any such new interface. Adding a few people who've been involved in these discussions thus far, mostly under a drm/hmm.ko heading iirc. I think the above are the really big ticket items around what's the plan here and are we solving even the right problem. Cheers, Daniel On Tue, May 19, 2020 at 6:33 PM Sasha Levin <sashal@kernel.org> wrote: > > There is a blog post that goes into more detail about the bigger > picture, and walks through all the required pieces to make this work. It > is available here: > https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of > this cover letter will focus on the Linux Kernel bits. > > Overview > ======== > > This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The > driver exposes a paravirtualized GPU to user mode applications running > in a virtual machine on a Windows host. This enables hardware > acceleration in environment such as WSL (Windows Subsystem for Linux) > where the Linux virtual machine is able to share the GPU with the > Windows host. > > The projection is accomplished by exposing the WDDM (Windows Display > Driver Model) interface as a set of IOCTL. 
This allows APIs and user > mode driver written against the WDDM GPU abstraction on Windows to be > ported to run within a Linux environment. This enables the port of the > D3D12 and DirectML APIs as well as their associated user mode driver to > Linux. This also enables third party APIs, such as the popular NVIDIA > Cuda compute API, to be hardware accelerated within a WSL environment. > > Only the rendering/compute aspect of the GPU are projected to the > virtual machine, no display functionality is exposed. Further, at this > time there are no presentation integration. So although the D3D12 API > can be use to render graphics offscreen, there is no path (yet) for > pixel to flow from the Linux environment back onto the Windows host > desktop. This GPU stack is effectively side-by-side with the native > Linux graphics stack. > > The driver creates the /dev/dxg device, which can be opened by user mode > application and handles their ioctls. The IOCTL interface to the driver > is defined in dxgkmthk.h (Dxgkrnl Graphics Port Driver ioctl > definitions). The interface matches the D3DKMT interface on Windows. > Ioctls are implemented in ioctl.c. > > When a VM starts, hyper-v on the host adds virtual GPU devices to the VM > via the hyper-v driver. The host offers several VM bus channels to the > VM: the global channel and one channel per virtual GPU, assigned to the > VM. > > The driver registers with the hyper-v driver (hv_driver) for the arrival > of VM bus channels. dxg_probe_device recognizes the vGPU channels and > creates the corresponding objects (dxgadapter for vGPUs and dxgglobal > for the global channel). > > The driver uses the hyper-V VM bus interface to communicate with the > host. dxgvmbus.c implements the communication interface. > > The global channel has 8GB of IO space assigned by the host. This space > is managed by the host and used to give the guest direct CPU access to > some allocations. 
Video memory is allocated on the host except in the > case of existing_sysmem allocations. The Windows host allocates memory > for the GPU on behalf of the guest. The Linux guest can access that > memory by mapping GPU virtual address to allocations and then > referencing those GPU virtual address from within GPU command buffers > submitted to the GPU. For allocations which require CPU access, the > allocation is mapped by the host into a location in the 8GB of IO space > reserved in the guest for that purpose. The Windows host uses the nested > CPU page table to ensure that this guest IO space always map to the > correct location for the allocation as it may migrate between dedicated > GPU memory (e.g. VRAM, firmware reserved DDR) and shared system memory > (regular DDR) over its lifetime. The Linux guest maps a user mode CPU > virtual address to an allocation IO space range for direct access by > user mode APIs and drivers. > > > > Implementation of LX_DXLOCK2 ioctl > ================================== > > We would appreciate your feedback on the implementation of the > LX_DXLOCK2 ioctl. > > This ioctl is used to get a CPU address to an allocation, which is > resident in video/system memory on the host. The way it works: > > 1. The driver sends the Lock message to the host > > 2. The host allocates space in the VM IO space and maps it to the > allocation memory > > 3. The host returns the address in IO space for the mapped allocation > > 4. The driver (in dxg_map_iospace) allocates a user mode virtual address > range using vm_mmap and maps it to the IO space using > io_remap_ofn_range) > > 5. The VA is returned to the application > > > > Internal objects > ================ > > The following objects are created by the driver (defined in dxgkrnl.h): > > - dxgadapter - represents a virtual GPU > > - dxgprocess - tracks per process state (handle table of created > objects, list of objects, etc.) 
> > - dxgdevice - a container for other objects (contexts, paging queues, > allocations, GPU synchronization objects) > > - dxgcontext - represents thread of GPU execution for packet > scheduling. > > - dxghwqueue - represents thread of GPU execution of hardware scheduling > > - dxgallocation - represents a GPU accessible allocation > > - dxgsyncobject - represents a GPU synchronization object > > - dxgresource - collection of dxgalloction objects > > - dxgsharedresource, dxgsharedsyncobj - helper objects to share objects > between different dxgdevice objects, which can belong to different > processes > > > > Object handles > ============== > > All GPU objects, created by the driver, are accessible by a handle > (d3dkmt_handle). Each process has its own handle table, which is > implemented in hmgr.c. For each API visible object, created by the > driver, there is an object, created on the host. For example, the is a > dxgprocess object on the host for each dxgprocess object in the VM, etc. > The object handles have the same value in the host and the VM, which is > done to avoid translation from the guest handles to the host handles. > > > > Signaling CPU events by the host > ================================ > > The WDDM interface provides a way to signal CPU event objects when > execution of a context reached certain point. The way it is implemented: > > - application sends an event_fd via ioctl to the driver > > - eventfd_ctx_get is used to get a pointer to the file object > (eventfd_ctx) > > - the pointer to sent the host via a VM bus message > > - when GPU execution reaches a certain point, the host sends a message > to the VM with the event pointer > > - signal_guest_event() handles the messages and eventually > eventfd_signal() is called. 
> > > Sasha Levin (4): > gpu: dxgkrnl: core code > gpu: dxgkrnl: hook up dxgkrnl > Drivers: hv: vmbus: hook up dxgkrnl > gpu: dxgkrnl: create a MAINTAINERS entry > > MAINTAINERS | 7 + > drivers/gpu/Makefile | 2 +- > drivers/gpu/dxgkrnl/Kconfig | 10 + > drivers/gpu/dxgkrnl/Makefile | 12 + > drivers/gpu/dxgkrnl/d3dkmthk.h | 1635 +++++++++ > drivers/gpu/dxgkrnl/dxgadapter.c | 1399 ++++++++ > drivers/gpu/dxgkrnl/dxgkrnl.h | 913 ++++++ > drivers/gpu/dxgkrnl/dxgmodule.c | 692 ++++ > drivers/gpu/dxgkrnl/dxgprocess.c | 355 ++ > drivers/gpu/dxgkrnl/dxgvmbus.c | 2955 +++++++++++++++++ > drivers/gpu/dxgkrnl/dxgvmbus.h | 859 +++++ > drivers/gpu/dxgkrnl/hmgr.c | 593 ++++ > drivers/gpu/dxgkrnl/hmgr.h | 107 + > drivers/gpu/dxgkrnl/ioctl.c | 5269 ++++++++++++++++++++++++++++++ > drivers/gpu/dxgkrnl/misc.c | 280 ++ > drivers/gpu/dxgkrnl/misc.h | 288 ++ > drivers/video/Kconfig | 2 + > include/linux/hyperv.h | 16 + > 18 files changed, 15393 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/dxgkrnl/Kconfig > create mode 100644 drivers/gpu/dxgkrnl/Makefile > create mode 100644 drivers/gpu/dxgkrnl/d3dkmthk.h > create mode 100644 drivers/gpu/dxgkrnl/dxgadapter.c > create mode 100644 drivers/gpu/dxgkrnl/dxgkrnl.h > create mode 100644 drivers/gpu/dxgkrnl/dxgmodule.c > create mode 100644 drivers/gpu/dxgkrnl/dxgprocess.c > create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.c > create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.h > create mode 100644 drivers/gpu/dxgkrnl/hmgr.c > create mode 100644 drivers/gpu/dxgkrnl/hmgr.h > create mode 100644 drivers/gpu/dxgkrnl/ioctl.c > create mode 100644 drivers/gpu/dxgkrnl/misc.c > create mode 100644 drivers/gpu/dxgkrnl/misc.h > > -- > 2.25.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
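For readers unfamiliar with eventfd, the "Signaling CPU events by the host" section of the quoted cover letter can be illustrated from the userspace side. This is only a sketch under stated assumptions: the function names are hypothetical, the ioctl that hands the descriptor to the driver is omitted because its definition is not shown in this thread, and the driver's `eventfd_signal()` call is simulated here by a plain `write()` from the same process.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the eventfd an application would pass to the driver via ioctl. */
int example_create_event(void)
{
	return eventfd(0, EFD_CLOEXEC);
}

/*
 * Stand-in for the kernel side signalling the event: add 1 to the
 * eventfd counter. Returns 0 on success, -1 on failure.
 */
int example_signal_event(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Block until the event has been signalled at least once. Returns the
 * number of signals accumulated since the last wait (the read resets the
 * counter), or 0 on error.
 */
uint64_t example_wait_event(int efd)
{
	uint64_t count = 0;

	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

An application would typically call `example_wait_event()` (or poll the descriptor) on one thread while GPU work is in flight; in the flow described in the cover letter, the wake-up would come from the driver after the host reports that execution reached the requested point.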
Hi Daniel,

On Tue, May 19, 2020 at 09:21:15PM +0200, Daniel Vetter wrote:
>Hi Sasha
>
>So obviously great that Microsoft is trying to upstream all this, and
>very much welcome and all that.
>
>But I guess there's a bunch of rather fundamental issues before we
>look into any kind of code details. And that might make this quite a
>hard sell for upstream to drivers/gpu subsystem:

Let me preface my answers by saying that speaking personally I very
much dislike that the userspace is closed and wish I could do something
about it.

>- From the blog it sounds like the userspace is all closed. That
>includes the hw specific part and compiler chunks, all stuff we've
>generally expected to be able to look in the past for any kind of
>other driver. It's even documented here:
>
>https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
>
>What's your plan here?

Let me answer with a (genuine) question: does this driver have anything
to do with DRM even after we enable graphics on it? I'm still trying to
figure it out.

There is an open source DX12 Gallium driver (that lives here:
https://gitlab.freedesktop.org/kusma/mesa/-/tree/msclc-d3d12) with open
source compiler and so on.

The plan is for Microsoft to provide shims to allow the existing Linux
userspace to interact with DX12; I'll explain below why we had to pipe
DX12 all the way into the Linux guest, but this is *not* to introduce
DX12 into the Linux world as competition. There is no intent for anyone
in the Linux world to start coding for the DX12 API.

This is why I'm not sure whether this touches DRM on the Linux side of
things. Nothing is actually rendered on Linux but rather piped to
Windows to be done there.

>btw since the main goal here (at least at first) seems to be get
>compute and ML going the official work-around here is to relabel your
>driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
>entire thing or so) and then Olof and Greg will take it into
>drivers/accel ...

This submission is not a case of "we want it upstream NOW" but rather
"let's work together to figure out how to do it right" :)

I thought about placing this driver in drivers/hyper-v/ given that it's
basically just a pipe between the host and the guest. There is no fancy
logic in this driver. Maybe the right place is indeed drivers/accel or
drivers/hyper-v but I'd love if we agree on that rather than doing that
as a workaround and 6 months down the road enabling graphics.

>- Next up (but that's not really a surprise for a fresh vendor driver)
>at a more technical level, this seems to reinvent the world, from
>device enumeration (why is this not exposed as /dev/dri/card0 so it
>better integrates with existing linux desktop stuff, in case that
>becomes a goal ever) down to reinvented kref_put_mutex (and please
>look at drm_device->struct_mutex for an example of how bad of a
>nightmare that locking pattern is and how many years it took us to
>untangle that one).

I'd maybe note that neither of us here at Microsoft is an expert in the
Linux DRM world. Stuff might have been done in a certain way because we
didn't know better.

>- Why DX12 on linux? Looking at this feels like classic divide and

There is a single usecase for this: WSL2 developer who wants to run
machine learning on his GPU. The developer is working on his laptop,
which is running Windows and that laptop has a single GPU that Windows
is using.

Since the GPU is being used by Windows, we can't assign it directly to
the Linux guest, but instead we can use GPU Partitioning to give the
guest access to the GPU. This means that the guest needs to be able to
"speak" DX12, which is why we pulled DX12 into Linux.

>conquer (or well triple E from the 90s), we have vk, we have
>drm_syncobj, we have an entire ecosystem of winsys layers that work
>across vendors. Is the plan here that we get a dx12 driver for other
>hw mesa drivers from you guys, so this is all consistent and we have a
>nice linux platform? How does this integrate everywhere else with
>linux winsys standards, like dma-buf for passing stuff around,
>dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
>some idea how it all meshes together?

Let me point you to this blog post that has more information about the
graphics side of things:
https://www.collabora.com/news-and-blog/news-and-events/introducing-opencl-and-opengl-on-directx.html .

The intent is to wrap DX12 with shims to work with the existing
ecosystem; DX12 isn't a new player on its own and thus isn't trying to
divide/conquer anything.

>- There's been a pile of hallway track/private discussions about
>moving on from the buffer-based memory managed model to something more
>modern. That relates to your DXLOCK2 question, but there's a lot more
>to userspace managed gpu memory residency than just that. monitored
>fences are another part. Also, to avoid a platform split we need to
>figure out how to tie this back into the dma-buf and dma-fence
>(including various uapi flavours) or it'll be made of fail. dx12 has
>all that in some form, except 0 integration with the linux stuff we
>have (no surprise, since linux isn't windows). Finally if we go to the
>trouble of a completely revamped I think ioctls aren't a great idea,
>something like iouring (the gossip name is drm_uring) would be a lot
>better. Also for easier paravirt we'd need 0 cpu pointers in any such
>new interface. Adding a few people who've been involved in these
>discussions thus far, mostly under a drm/hmm.ko heading iirc.
>
>I think the above are the really big ticket items around what's the
>plan here and are we solving even the right problem.

Part of the reason behind this implementation is simplicity. Again, no
objections around moving to uring and doing other improvements.
On Wed, 20 May 2020 at 02:33, Sasha Levin <sashal@kernel.org> wrote:
>
> There is a blog post that goes into more detail about the bigger
> picture, and walks through all the required pieces to make this work. It
> is available here:
> https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of
> this cover letter will focus on the Linux Kernel bits.
>
> Overview
> ========
>
> This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The
> driver exposes a paravirtualized GPU to user mode applications running
> in a virtual machine on a Windows host. This enables hardware
> acceleration in environment such as WSL (Windows Subsystem for Linux)
> where the Linux virtual machine is able to share the GPU with the
> Windows host.
>
> The projection is accomplished by exposing the WDDM (Windows Display
> Driver Model) interface as a set of IOCTL. This allows APIs and user
> mode driver written against the WDDM GPU abstraction on Windows to be
> ported to run within a Linux environment. This enables the port of the
> D3D12 and DirectML APIs as well as their associated user mode driver to
> Linux. This also enables third party APIs, such as the popular NVIDIA
> Cuda compute API, to be hardware accelerated within a WSL environment.
>
> Only the rendering/compute aspect of the GPU are projected to the
> virtual machine, no display functionality is exposed. Further, at this
> time there are no presentation integration. So although the D3D12 API
> can be use to render graphics offscreen, there is no path (yet) for
> pixel to flow from the Linux environment back onto the Windows host
> desktop. This GPU stack is effectively side-by-side with the native
> Linux graphics stack.

Okay I've had some caffeine and absorbed some more of this.

This is a driver that connects a binary blob interface in the Windows
kernel drivers to a binary blob that you run inside a Linux guest.
It's a binary transport between two binary pieces. Personally this
holds little of interest to me, I can see why it might be nice to have
this upstream, but I don't foresee any other Linux distributor ever
enabling it or having to ship it, it's purely a WSL2 pipe. I'm not
saying I'd be happy to see this in the tree, since I don't see the
value of maintaining it upstream, but it probably should just exist
in a drivers/hyperv type area.

Having said that, I hit one stumbling block:
"Further, at this time there are no presentation integration. "

If we upstream this driver as-is into some hyperv specific place, and
you decide to add presentation integration this is more than likely
going to mean you will want to interact with dma-bufs and dma-fences.
If the driver is hidden away in a hyperv place it's likely we won't
even notice that feature landing until it's too late.

I would like to see a coherent plan for presentation support (not
code, just an architectural diagram), because I think when you
contemplate how that works it will change the picture of how this
driver looks and integrates into the rest of the Linux graphics
ecosystem.

As-is I'd rather this didn't land under my purview, since I don't see
the value this adds to the Linux ecosystem at all, and I think it's
important when putting a burden on upstream that you provide some
value.

Dave.
On Wed, May 20, 2020 at 12:42 AM Dave Airlie <airlied@gmail.com> wrote: > > On Wed, 20 May 2020 at 02:33, Sasha Levin <sashal@kernel.org> wrote: > > > > There is a blog post that goes into more detail about the bigger > > picture, and walks through all the required pieces to make this work. It > > is available here: > > https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of > > this cover letter will focus on the Linux Kernel bits. > > > > Overview > > ======== > > > > This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The > > driver exposes a paravirtualized GPU to user mode applications running > > in a virtual machine on a Windows host. This enables hardware > > acceleration in environment such as WSL (Windows Subsystem for Linux) > > where the Linux virtual machine is able to share the GPU with the > > Windows host. > > > > The projection is accomplished by exposing the WDDM (Windows Display > > Driver Model) interface as a set of IOCTL. This allows APIs and user > > mode driver written against the WDDM GPU abstraction on Windows to be > > ported to run within a Linux environment. This enables the port of the > > D3D12 and DirectML APIs as well as their associated user mode driver to > > Linux. This also enables third party APIs, such as the popular NVIDIA > > Cuda compute API, to be hardware accelerated within a WSL environment. > > > > Only the rendering/compute aspect of the GPU are projected to the > > virtual machine, no display functionality is exposed. Further, at this > > time there are no presentation integration. So although the D3D12 API > > can be use to render graphics offscreen, there is no path (yet) for > > pixel to flow from the Linux environment back onto the Windows host > > desktop. This GPU stack is effectively side-by-side with the native > > Linux graphics stack. > > Okay I've had some caffiene and absorbed some more of this. 
> > This is a driver that connects a binary blob interface in the Windows > kernel drivers to a binary blob that you run inside a Linux guest. > It's a binary transport between two binary pieces. Personally this > holds little of interest to me, I can see why it might be nice to have > this upstream, but I don't forsee any other Linux distributor ever > enabling it or having to ship it, it's purely a WSL2 pipe. I'm not > saying I'd be happy to see this in the tree, since I don't see the > value of maintaining it upstream, but it probably should just exists > in a drivers/hyperv type area. Yup as-is (especially with the goal of this being aimed at ml/compute only) drivers/hyperv sounds a bunch more reasonable than drivers/gpu. > Having said that, I hit one stumbling block: > "Further, at this time there are no presentation integration. " > > If we upstream this driver as-is into some hyperv specific place, and > you decide to add presentation integration this is more than likely > going to mean you will want to interact with dma-bufs and dma-fences. > If the driver is hidden away in a hyperv place it's likely we won't > even notice that feature landing until it's too late. I've recently added regex matches to MAINTAINERS so we'll see dma_buf/fence/anything show up on dri-devel. So that part is solved hopefully. > I would like to see a coherent plan for presentation support (not > code, just an architectural diagram), because I think when you > contemplate how that works it will change the picture of how this > driver looks and intergrates into the rest of the Linux graphics > ecosystem. Yeah once we have the feature-creep to presentation support all the integration fun starts, with all the questions about "why does this not look like any other linux gpu driver". We have that already with nvidia insisting they just can't implement any of the upstream gpu uapi we have, but at least they're not in-tree, so not our problem from an upstream maintainership pov. 
But once this dx12 pipe is landed and then we want to extend it, it's
still going to have all the "we can't ever release the sources to any
of the parts we usually expect to be open for gpu drivers in upstream"
problems. Then we're stuck at a rather awkward point of why one vendor
gets an exception and all the others don't.

> As-is I'd rather this didn't land under my purview, since I don't see
> the value this adds to the Linux ecosystem at all, and I think it's
> important when putting a burden on upstream that you provide some
> value.

Well there is some in the form of "more hw/platform support". But gpus
evolve rather fast, including the entire integration ecosystem (it's by
far not just the hw drivers that move quickly), so that value
depreciates a lot faster than for other kernel subsystems. And all
that's left is the pain of not breaking anything without actually being
able to evolve the overall stack in any meaningful way.
-Daniel
On Wed, 20 May 2020 at 08:42, Dave Airlie <airlied@gmail.com> wrote: > > On Wed, 20 May 2020 at 02:33, Sasha Levin <sashal@kernel.org> wrote: > > > > There is a blog post that goes into more detail about the bigger > > picture, and walks through all the required pieces to make this work. It > > is available here: > > https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of > > this cover letter will focus on the Linux Kernel bits. > > > > Overview > > ======== > > > > This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The > > driver exposes a paravirtualized GPU to user mode applications running > > in a virtual machine on a Windows host. This enables hardware > > acceleration in environment such as WSL (Windows Subsystem for Linux) > > where the Linux virtual machine is able to share the GPU with the > > Windows host. > > > > The projection is accomplished by exposing the WDDM (Windows Display > > Driver Model) interface as a set of IOCTL. This allows APIs and user > > mode driver written against the WDDM GPU abstraction on Windows to be > > ported to run within a Linux environment. This enables the port of the > > D3D12 and DirectML APIs as well as their associated user mode driver to > > Linux. This also enables third party APIs, such as the popular NVIDIA > > Cuda compute API, to be hardware accelerated within a WSL environment. > > > > Only the rendering/compute aspect of the GPU are projected to the > > virtual machine, no display functionality is exposed. Further, at this > > time there are no presentation integration. So although the D3D12 API > > can be use to render graphics offscreen, there is no path (yet) for > > pixel to flow from the Linux environment back onto the Windows host > > desktop. This GPU stack is effectively side-by-side with the native > > Linux graphics stack. > > Okay I've had some caffeine and absorbed some more of this. 
> > This is a driver that connects a binary blob interface in the Windows > kernel drivers to a binary blob that you run inside a Linux guest. > It's a binary transport between two binary pieces. Personally this > holds little of interest to me, I can see why it might be nice to have > this upstream, but I don't foresee any other Linux distributor ever > enabling it or having to ship it, it's purely a WSL2 pipe. I'm not > saying I'd be happy to see this in the tree, since I don't see the > value of maintaining it upstream, but it probably should just exist > in a drivers/hyperv type area. > > Having said that, I hit one stumbling block: > "Further, at this time there are no presentation integration. " > > If we upstream this driver as-is into some hyperv specific place, and > you decide to add presentation integration this is more than likely > going to mean you will want to interact with dma-bufs and dma-fences. > If the driver is hidden away in a hyperv place it's likely we won't > even notice that feature landing until it's too late. > > I would like to see a coherent plan for presentation support (not > code, just an architectural diagram), because I think when you > contemplate how that works it will change the picture of how this > driver looks and integrates into the rest of the Linux graphics > ecosystem. > > As-is I'd rather this didn't land under my purview, since I don't see > the value this adds to the Linux ecosystem at all, and I think it's > important when putting a burden on upstream that you provide some > value. I also have another concern: from a legal standpoint I'd rather not review the ioctl part of this. I'd probably request that other DRI developers abstain as well. This is a Windows kernel API being smashed into a Linux driver. I don't want to be tainted by knowledge of an API when I've no idea of the legal status of derived works. (Is this all covered patent-wise under OIN?) 
I don't want to ever be accused of designing a Linux kernel API with ill-gotten D3DKMT knowledge; I feel tainting myself with knowledge of a proprietary API might cause derived work issues. Dave.
Hey guys, Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best
On Tue, May 19, 2020 at 01:45:53PM -0400, Sasha Levin wrote: > On Tue, May 19, 2020 at 07:21:05PM +0200, Greg KH wrote: > > On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote: > > > + > > > +#define DXGK_MAX_LOCK_DEPTH 64 > > > +#define W_MAX_PATH 260 > > > > We already have a max path number, why use a different one? > > It's max path for Windows, not Linux (thus the "W_" prefix) :) Ah, not obvious :) > Maybe changing it to WIN_MAX_PATH or such will make it better? Probably. > > > +#define d3dkmt_handle u32 > > > +#define d3dgpu_virtual_address u64 > > > +#define winwchar u16 > > > +#define winhandle u64 > > > +#define ntstatus int > > > +#define winbool u32 > > > +#define d3dgpu_size_t u64 > > > > These are all ripe for a simple search/replace in your editor before you > > do your next version :) > > I've actually attempted that, and reverted that change, mostly because > the whole 'handle' thing became very confusing. Yeah, "handles" in Windows can be a mess, with some being pointers and others just integers. Trying to make a specific typedef for it is usually the better way overall, that way you can get the compiler to check for mistakes. These #defines will not really help with that. But, 'ntstatus' should be ok to just make "int" everywhere, right? > Note that we have a few 'handles', each with a different size, and thus > calling get_something_something_handle() type of functions became very > confusing since it's not clear what handle we're working with in that > case. Yeah, typedefs can help there. > With regards to the rest, I wanted to leave stuff like 'winbool' to > document the expected ABI between the Windows and Linux side of things. > Ideally it would be 'bool' or 'u8', but as you see we had to use 'u32' > here which I feel lessens our ability to have the code document itself. 'bool' probably will not work as I think it's compiler dependent, __u8 is probably best. thanks, greg k-h
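To make the typedef-versus-#define point above concrete, here is a minimal sketch. It assumes hypothetical names (dxg_handle32, dxg_handle64 and the helper functions) that are not taken from the patch, and it assumes 0 is the null handle. Wrapping each handle width in its own one-member struct keeps the ABI size unchanged while letting the compiler reject mix-ups that a "#define d3dkmt_handle u32" alias would silently accept.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of these come from the patch. */
typedef struct { uint32_t v; } dxg_handle32;   /* stands in for d3dkmt_handle */
typedef struct { uint64_t v; } dxg_handle64;   /* stands in for winhandle */

/* The wrappers are exactly as wide as the integers they hold. */
static_assert(sizeof(dxg_handle32) == sizeof(uint32_t), "32-bit handle must stay 4 bytes");
static_assert(sizeof(dxg_handle64) == sizeof(uint64_t), "64-bit handle must stay 8 bytes");

static inline dxg_handle32 handle32_from_raw(uint32_t raw)
{
    dxg_handle32 h = { raw };
    return h;
}

static inline dxg_handle64 handle64_from_raw(uint64_t raw)
{
    dxg_handle64 h = { raw };
    return h;
}

/* Accepts only the 32-bit handle type. Passing a dxg_handle64 or a bare
 * integer is a compile error, which a #define alias cannot provide. */
static inline int handle32_is_valid(dxg_handle32 h)
{
    return h.v != 0;   /* assumption: 0 is treated as the null handle */
}

static inline int handle32_equal(dxg_handle32 a, dxg_handle32 b)
{
    return a.v == b.v;
}
```

With this shape, a call such as handle32_is_valid(handle64_from_raw(1)) fails to compile, so a get_..._handle() helper documents which handle it works with through its signature.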
Hi On 19.05.20 at 18:32, Sasha Levin wrote: > There is a blog post that goes into more detail about the bigger > picture, and walks through all the required pieces to make this work. It > is available here: > https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of > this cover letter will focus on the Linux Kernel bits. That's quite a surprise. Thanks for your efforts to contribute. > > Overview > ======== > > This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The > driver exposes a paravirtualized GPU to user mode applications running > in a virtual machine on a Windows host. This enables hardware > acceleration in environment such as WSL (Windows Subsystem for Linux) > where the Linux virtual machine is able to share the GPU with the > Windows host. > > The projection is accomplished by exposing the WDDM (Windows Display > Driver Model) interface as a set of IOCTL. This allows APIs and user > mode driver written against the WDDM GPU abstraction on Windows to be > ported to run within a Linux environment. This enables the port of the > D3D12 and DirectML APIs as well as their associated user mode driver to > Linux. This also enables third party APIs, such as the popular NVIDIA > Cuda compute API, to be hardware accelerated within a WSL environment. > > Only the rendering/compute aspect of the GPU are projected to the > virtual machine, no display functionality is exposed. Further, at this > time there are no presentation integration. So although the D3D12 API > can be use to render graphics offscreen, there is no path (yet) for > pixel to flow from the Linux environment back onto the Windows host > desktop. This GPU stack is effectively side-by-side with the native > Linux graphics stack. > > The driver creates the /dev/dxg device, which can be opened by user mode > application and handles their ioctls. The IOCTL interface to the driver > is defined in dxgkmthk.h (Dxgkrnl Graphics Port Driver ioctl > definitions). 
The interface matches the D3DKMT interface on Windows. > Ioctls are implemented in ioctl.c. Echoing what others said, you're not making a DRM driver. The driver should live outside of the DRM code. I have one question about the driver API: on Windows, DirectX versions are loosely tied to Windows releases. So I guess you can change the kernel interface among DirectX versions? If so, how would this work on Linux in the long term? If there ever is a DirectX 13 or 14 with incompatible kernel interfaces, how would you plan to update the Linux driver? Best regards Thomas > > When a VM starts, hyper-v on the host adds virtual GPU devices to the VM > via the hyper-v driver. The host offers several VM bus channels to the > VM: the global channel and one channel per virtual GPU, assigned to the > VM. > > The driver registers with the hyper-v driver (hv_driver) for the arrival > of VM bus channels. dxg_probe_device recognizes the vGPU channels and > creates the corresponding objects (dxgadapter for vGPUs and dxgglobal > for the global channel). > > The driver uses the hyper-V VM bus interface to communicate with the > host. dxgvmbus.c implements the communication interface. > > The global channel has 8GB of IO space assigned by the host. This space > is managed by the host and used to give the guest direct CPU access to > some allocations. Video memory is allocated on the host except in the > case of existing_sysmem allocations. The Windows host allocates memory > for the GPU on behalf of the guest. The Linux guest can access that > memory by mapping GPU virtual address to allocations and then > referencing those GPU virtual address from within GPU command buffers > submitted to the GPU. For allocations which require CPU access, the > allocation is mapped by the host into a location in the 8GB of IO space > reserved in the guest for that purpose. 
The Windows host uses the nested > CPU page table to ensure that this guest IO space always map to the > correct location for the allocation as it may migrate between dedicated > GPU memory (e.g. VRAM, firmware reserved DDR) and shared system memory > (regular DDR) over its lifetime. The Linux guest maps a user mode CPU > virtual address to an allocation IO space range for direct access by > user mode APIs and drivers. > > > > Implementation of LX_DXLOCK2 ioctl > ================================== > > We would appreciate your feedback on the implementation of the > LX_DXLOCK2 ioctl. > > This ioctl is used to get a CPU address to an allocation, which is > resident in video/system memory on the host. The way it works: > > 1. The driver sends the Lock message to the host > > 2. The host allocates space in the VM IO space and maps it to the > allocation memory > > 3. The host returns the address in IO space for the mapped allocation > > 4. The driver (in dxg_map_iospace) allocates a user mode virtual address > range using vm_mmap and maps it to the IO space using > io_remap_pfn_range() > > 5. The VA is returned to the application > > > > Internal objects > ================ > > The following objects are created by the driver (defined in dxgkrnl.h): > > - dxgadapter - represents a virtual GPU > > - dxgprocess - tracks per process state (handle table of created > objects, list of objects, etc.) > > - dxgdevice - a container for other objects (contexts, paging queues, > allocations, GPU synchronization objects) > > - dxgcontext - represents thread of GPU execution for packet > scheduling. 
> > - dxghwqueue - represents thread of GPU execution of hardware scheduling > > - dxgallocation - represents a GPU accessible allocation > > - dxgsyncobject - represents a GPU synchronization object > > - dxgresource - collection of dxgallocation objects > > - dxgsharedresource, dxgsharedsyncobj - helper objects to share objects > between different dxgdevice objects, which can belong to different > processes > > > > Object handles > ============== > > All GPU objects, created by the driver, are accessible by a handle > (d3dkmt_handle). Each process has its own handle table, which is > implemented in hmgr.c. For each API visible object, created by the > driver, there is an object, created on the host. For example, there is a > dxgprocess object on the host for each dxgprocess object in the VM, etc. > The object handles have the same value in the host and the VM, which is > done to avoid translation from the guest handles to the host handles. > > > > Signaling CPU events by the host > ================================ > > The WDDM interface provides a way to signal CPU event objects when > execution of a context reached certain point. The way it is implemented: > > - application sends an event_fd via ioctl to the driver > > - eventfd_ctx_get is used to get a pointer to the file object > (eventfd_ctx) > > - the pointer is sent to the host via a VM bus message > > - when GPU execution reaches a certain point, the host sends a message > to the VM with the event pointer > > - signal_guest_event() handles the messages and eventually > eventfd_signal() is called. 
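The event signaling steps quoted above can be approximated from user space with eventfd(2). This is only a sketch under stated assumptions: the function names are hypothetical, the ioctl and the VM bus round trip are not shown, and writing to the descriptor stands in for the kernel-side eventfd_signal() call that the driver would make when the host message arrives.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the event object an application would hand to the driver. */
static int event_create(void)
{
    return eventfd(0, EFD_CLOEXEC);   /* counter starts at 0 */
}

/* User-space stand-in for the kernel's eventfd_signal(): add 1 to the counter. */
static int event_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Block until the event has been signaled at least once. Returns the number
 * of signals accumulated since the previous wait, or 0 on error. */
static uint64_t event_wait(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

Because the default eventfd mode is a counter, several signals that arrive before the application waits are coalesced into a single wakeup carrying the total.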
> > > Sasha Levin (4): > gpu: dxgkrnl: core code > gpu: dxgkrnl: hook up dxgkrnl > Drivers: hv: vmbus: hook up dxgkrnl > gpu: dxgkrnl: create a MAINTAINERS entry > > MAINTAINERS | 7 + > drivers/gpu/Makefile | 2 +- > drivers/gpu/dxgkrnl/Kconfig | 10 + > drivers/gpu/dxgkrnl/Makefile | 12 + > drivers/gpu/dxgkrnl/d3dkmthk.h | 1635 +++++++++ > drivers/gpu/dxgkrnl/dxgadapter.c | 1399 ++++++++ > drivers/gpu/dxgkrnl/dxgkrnl.h | 913 ++++++ > drivers/gpu/dxgkrnl/dxgmodule.c | 692 ++++ > drivers/gpu/dxgkrnl/dxgprocess.c | 355 ++ > drivers/gpu/dxgkrnl/dxgvmbus.c | 2955 +++++++++++++++++ > drivers/gpu/dxgkrnl/dxgvmbus.h | 859 +++++ > drivers/gpu/dxgkrnl/hmgr.c | 593 ++++ > drivers/gpu/dxgkrnl/hmgr.h | 107 + > drivers/gpu/dxgkrnl/ioctl.c | 5269 ++++++++++++++++++++++++++++++ > drivers/gpu/dxgkrnl/misc.c | 280 ++ > drivers/gpu/dxgkrnl/misc.h | 288 ++ > drivers/video/Kconfig | 2 + > include/linux/hyperv.h | 16 + > 18 files changed, 15393 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/dxgkrnl/Kconfig > create mode 100644 drivers/gpu/dxgkrnl/Makefile > create mode 100644 drivers/gpu/dxgkrnl/d3dkmthk.h > create mode 100644 drivers/gpu/dxgkrnl/dxgadapter.c > create mode 100644 drivers/gpu/dxgkrnl/dxgkrnl.h > create mode 100644 drivers/gpu/dxgkrnl/dxgmodule.c > create mode 100644 drivers/gpu/dxgkrnl/dxgprocess.c > create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.c > create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.h > create mode 100644 drivers/gpu/dxgkrnl/hmgr.c > create mode 100644 drivers/gpu/dxgkrnl/hmgr.h > create mode 100644 drivers/gpu/dxgkrnl/ioctl.c > create mode 100644 drivers/gpu/dxgkrnl/misc.c > create mode 100644 drivers/gpu/dxgkrnl/misc.h >
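The "Object handles" section of the quoted cover letter says each process has its own handle table and that a handle has the same value in the host and the VM. A minimal sketch of what that implies for the guest side: the table has to store objects under a handle value it did not choose itself. All names here are hypothetical, and the fixed-size, collision-rejecting layout is an assumption for brevity, not how hmgr.c works.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HTABLE_SLOTS 64u   /* hypothetical fixed capacity */

struct htable_entry {
    uint32_t handle;   /* 0 marks an empty slot */
    void *object;
};

/* One table per process; zero-initialize before use. */
struct htable {
    struct htable_entry slots[HTABLE_SLOTS];
};

static struct htable_entry *htable_slot(struct htable *t, uint32_t handle)
{
    assert(t != NULL);
    return &t->slots[handle % HTABLE_SLOTS];
}

/* Record obj under a handle value chosen by the other side (the host). */
static int htable_assign(struct htable *t, uint32_t handle, void *obj)
{
    struct htable_entry *e = htable_slot(t, handle);

    if (handle == 0 || obj == NULL || e->handle != 0)
        return -1;   /* invalid input or slot already taken */
    e->handle = handle;
    e->object = obj;
    return 0;
}

/* Return the object stored under handle, or NULL if there is none. */
static void *htable_lookup(struct htable *t, uint32_t handle)
{
    struct htable_entry *e = htable_slot(t, handle);

    return (handle != 0 && e->handle == handle) ? e->object : NULL;
}

/* Forget the object stored under handle, if any. */
static void htable_release(struct htable *t, uint32_t handle)
{
    struct htable_entry *e = htable_slot(t, handle);

    if (handle != 0 && e->handle == handle) {
        e->handle = 0;
        e->object = NULL;
    }
}
```

Since the guest reuses the host's value, no guest-to-host translation step is needed when a handle is placed in a VM bus message.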
Hi Steve, Sounds all good, some more comments and details below. On Wed, May 20, 2020 at 5:47 AM Steve Pronovost <spronovo@microsoft.com> wrote: > Hey guys, > > Thanks for the discussion. I may not be able to immediately answer all of > your questions, but I'll do my best
> Echoing what others said, you're not making a DRM driver. The driver should live outside of the DRM code. Agreed, please see my earlier reply. We'll be moving the driver to the drivers/hyperv node or something similar. Apologies for the confusion here. > I have one question about the driver API: on Windows, DirectX versions are loosely tied to Windows releases. So I guess you can change the kernel interface among DirectX versions? > If so, how would this work on Linux in the long term? If there ever is a DirectX 13 or 14 with incompatible kernel interfaces, how would you plan to update the Linux driver? You should think of the communication over the VM Bus for the vGPU projection as a strongly versioned interface. We will keep compatibility with older versions of that interface as it evolves over time, so we can continue to run older guests (we already do). This protocol isn't actually tied to the DX API. It is a generic abstraction for the GPU that can be used by any API (for example, the NVIDIA CUDA driver that we announced goes over the same protocol to access the GPU). A new version of user mode DX can either take advantage of, or sometimes require, new services from this kernel abstraction. This means that pulling a new version of user mode DX can also mean having to pull a new version of this vGPU kernel driver. For WSL, these essentially ship together. The kernel driver ships as part of our WSL2 Linux Kernel integration. The user mode DX bits ship with Windows. 
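As an illustration of what a "strongly versioned interface" that keeps older guests running could look like, here is a minimal sketch. It assumes each side advertises a contiguous range of protocol versions and that the highest common version is chosen; the names and the negotiation rule are hypothetical and are not taken from the dxgkrnl VM bus protocol.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range of protocol versions one side can speak (hypothetical). */
struct proto_range {
    uint32_t min;
    uint32_t max;
};

/* Return the highest version both sides support, or 0 if the ranges
 * do not overlap. Version numbers are assumed to start at 1. */
static uint32_t proto_negotiate(struct proto_range guest, struct proto_range host)
{
    uint32_t lo, hi;

    assert(guest.min >= 1 && guest.min <= guest.max);
    assert(host.min >= 1 && host.min <= host.max);

    lo = guest.min > host.min ? guest.min : host.min;
    hi = guest.max < host.max ? guest.max : host.max;
    return lo <= hi ? hi : 0;
}
```

Under this rule an older guest that only speaks versions 1 to 3 still works against a host that speaks 1 to 5, while a user mode component that requires version 4 would need a newer guest driver, which matches the "pull a new version of this vGPU kernel driver" case described above.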
-----Original Message----- From: Thomas Zimmermann <tzimmermann@suse.de> Sent: Wednesday, May 20, 2020 12:11 AM To: Sasha Levin <sashal@kernel.org>; alexander.deucher@amd.com; chris@chris-wilson.co.uk; ville.syrjala@linux.intel.com; Hawking.Zhang@amd.com; tvrtko.ursulin@intel.com Cc: linux-kernel@vger.kernel.org; linux-hyperv@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; wei.liu@kernel.org; Steve Pronovost <spronovo@microsoft.com>; Iouri Tarassov <iourit@microsoft.com>; dri-devel@lists.freedesktop.org; linux-fbdev@vger.kernel.org; gregkh@linuxfoundation.org Subject: [EXTERNAL] Re: [RFC PATCH 0/4] DirectX on Linux <snip> -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Managing Director: Felix Imendörffer
Thanks Daniel, more below.
From: Daniel Vetter <daniel@ffwll.ch>
Sent: Wednesday, May 20, 2020 12:41 AM
To: Steve Pronovost <spronovo@microsoft.com>
Cc: Dave Airlie <airlied@gmail.com>; Sasha Levin <sashal@kernel.org>; linux-hyperv@vger.kernel.org; Stephen Hemminger <sthemmin@microsoft.com>; Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Greg Kroah-Hartman <gregkh@linuxfoundation.org>; Haiyang Zhang <haiyangz@microsoft.com>; LKML <linux-kernel@vger.kernel.org>; dri-devel <dri-devel@lists.freedesktop.org>; Chris Wilson <chris@chris-wilson.co.uk>; Linux Fbdev development list <linux-fbdev@vger.kernel.org>; Iouri Tarassov <iourit@microsoft.com>; Deucher, Alexander <alexander.deucher@amd.com>; KY Srinivasan <kys@microsoft.com>; Wei Liu <wei.liu@kernel.org>; Hawking Zhang <Hawking.Zhang@amd.com>
Subject: Re: [EXTERNAL] Re: [RFC PATCH 0/4] DirectX on Linux
Hi Steve,
Sounds all good, some more comments and details below.
On Wed, May 20, 2020 at 5:47 AM Steve Pronovost <spronovo@microsoft.com> wrote:
Hey guys,
Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best
On Tuesday 2020-05-19 22:36, Sasha Levin wrote: > >> - Why DX12 on linux? Looking at this feels like classic divide and > > There is a single usecase for this: WSL2 developer who wants to run > machine learning on his GPU. The developer is working on his laptop, > which is running Windows and that laptop has a single GPU that Windows > is using. It does not feel right conceptually. If the target is a Windows API (DX12/ML), why bother with Linux environments? Make it a Windows executable, thereby skipping the WSL translation layer and passthrough.
Hi Steve, thank you for the fast reply. On 20.05.20 at 09:42, Steve Pronovost wrote: >> Echoing what others said, you're not making a DRM driver. The driver should live outside of the DRM code. > > Agreed, please see my earlier reply. We'll be moving the driver to the drivers/hyperv node or something similar. Apologies for the confusion here. > >> I have one question about the driver API: on Windows, DirectX versions are loosely tied to Windows releases. So I guess you can change the kernel interface among DirectX versions? >> If so, how would this work on Linux in the long term? If there ever is a DirectX 13 or 14 with incompatible kernel interfaces, how would you plan to update the Linux driver? > > You should think of the communication over the VM Bus for the vGPU projection as a strongly versioned interface. We will keep compatibility with older versions of that interface as it evolves over time, so we can continue to run older guests (we already do). This protocol isn't actually tied to the DX API. It is a generic abstraction for the GPU that can be used by any API (for example, the NVIDIA CUDA driver that we announced goes over the same protocol to access the GPU). > > A new version of user mode DX can either take advantage of, or sometimes require, new services from this kernel abstraction. This means that pulling a new version of user mode DX can also mean having to pull a new version of this vGPU kernel driver. For WSL, these essentially ship together. The kernel driver ships as part of our WSL2 Linux Kernel integration. The user mode DX bits ship with Windows. Just a piece of friendly advice: maintaining a proprietary component within a Linux environment is tough. You will need a good plan for long-term interface stability and compatibility with the other components. 
Best regards Thomas
drivers/gpu/dxgkrnl/hmgr.h create mode 100644 >> drivers/gpu/dxgkrnl/ioctl.c create mode 100644 >> drivers/gpu/dxgkrnl/misc.c create mode 100644 >> drivers/gpu/dxgkrnl/misc.h >> > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Maxfeldstr. 5, 90409 Nürnberg, Germany > (HRB 36809, AG Nürnberg) > Geschäftsführer: Felix Imendörffer >
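The "Signaling CPU events by the host" flow quoted above ends with the kernel calling eventfd_signal() on an eventfd that the application handed to the driver. The user-space half of that pattern can be sketched as below. This is only an illustration under stated assumptions: the helper names (make_event, signal_event, consume_event) are hypothetical, and signal_event() is a user-space stand-in for the driver-side eventfd_signal() call, since the actual /dev/dxg ioctl that registers the eventfd is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the eventfd an application would pass to the driver via ioctl.
 * EFD_NONBLOCK makes read() fail with EAGAIN instead of blocking when
 * nothing has been signalled yet. */
int make_event(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Hypothetical stand-in for the kernel side: the real driver would call
 * eventfd_signal() when the host reports that the GPU reached the fence
 * point. From user space, the same counter bump is a write of 1. */
int signal_event(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consume all pending signals. Returns how many times the event was
 * signalled since the last read, or 0 if nothing is pending. */
uint64_t consume_event(int efd)
{
    uint64_t count = 0;

    assert(efd >= 0);
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0; /* counter was zero: no signal yet */
    return count;
}
```

An application would normally poll() or epoll on the descriptor rather than spin on consume_event(); in the default (non-semaphore) eventfd mode used here, a single read returns the accumulated count and resets it to zero.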
[resending as plain text, sorry about that]
Thanks Daniel, more below.
From: Daniel Vetter <daniel@ffwll.ch>
Sent: Wednesday, May 20, 2020 12:41 AM
To: Steve Pronovost <spronovo@microsoft.com>
Cc: Dave Airlie <airlied@gmail.com>; Sasha Levin <sashal@kernel.org>; linux-hyperv@vger.kernel.org; Stephen Hemminger <sthemmin@microsoft.com>; Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Greg Kroah-Hartman <gregkh@linuxfoundation.org>; Haiyang Zhang <haiyangz@microsoft.com>; LKML <linux-kernel@vger.kernel.org>; dri-devel <dri-devel@lists.freedesktop.org>; Chris Wilson <chris@chris-wilson.co.uk>; Linux Fbdev development list <linux-fbdev@vger.kernel.org>; Iouri Tarassov <iourit@microsoft.com>; Deucher, Alexander <alexander.deucher@amd.com>; KY Srinivasan <kys@microsoft.com>; Wei Liu <wei.liu@kernel.org>; Hawking Zhang <Hawking.Zhang@amd.com>
Subject: Re: [EXTERNAL] Re: [RFC PATCH 0/4] DirectX on Linux
Hi Steve,
Sounds all good, some more comments and details below.
On Wed, May 20, 2020 at 5:47 AM Steve Pronovost <spronovo@microsoft.com> wrote:
Hey guys,
Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best.
Hi!

> > The driver creates the /dev/dxg device, which can be opened by user mode
> > application and handles their ioctls. The IOCTL interface to the driver
> > is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
> > definitions). The interface matches the D3DKMT interface on Windows.
> > Ioctls are implemented in ioctl.c.
>
> Echoing what others said, you're not making a DRM driver. The driver should live outside
> of the DRM code.
>

Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
we already have DRM API on Linux. We don't want another one, do we?

And at the very least... this misses API docs for /dev/dxg. Code can't really
be reviewed without that.

Best regards,
Pavel
Hi!

> Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best.
>

Could you do something with your email settings? Because this is not how you should use
email on lkml. "[EXTERNAL]" in the subject, top-posting, unwrapped lines...

Thank you,
Pavel
> > Having said that, I hit one stumbling block:
> > "Further, at this time there are no presentation integration. "
> >
> > If we upstream this driver as-is into some hyperv specific place, and
> > you decide to add presentation integration this is more than likely
> > going to mean you will want to interact with dma-bufs and dma-fences.
> > If the driver is hidden away in a hyperv place it's likely we won't
> > even notice that feature landing until it's too late.
> >
> > I would like to see a coherent plan for presentation support (not
> > code, just an architectural diagram), because I think when you
> > contemplate how that works it will change the picture of how this
> > driver looks and integrates into the rest of the Linux graphics
> > ecosystem.
> >
> > As-is I'd rather this didn't land under my purview, since I don't see
> > the value this adds to the Linux ecosystem at all, and I think it's
> > important when putting a burden on upstream that you provide some
> > value.
>
> I also have another concern from a legal standpoint I'd rather not
> review the ioctl part of this. I'd probably request under DRI
> developers abstain as well.
>
> This is a Windows kernel API being smashed into a Linux driver. I don't want to be
> tainted by knowledge of an API that I've no idea of the legal status of derived works.
> (is this all covered patent wise under OIN?)

If you can't look onto it, perhaps it is not suitable to merge into kernel...?

What would be legal requirements so this is "safe to look at"? We should really
require submitter to meet them...

Pavel
On Tue, Jun 16, 2020 at 12:51:56PM +0200, Pavel Machek wrote:
>> > Having said that, I hit one stumbling block:
>> > "Further, at this time there are no presentation integration. "
>> >
>> > If we upstream this driver as-is into some hyperv specific place, and
>> > you decide to add presentation integration this is more than likely
>> > going to mean you will want to interact with dma-bufs and dma-fences.
>> > If the driver is hidden away in a hyperv place it's likely we won't
>> > even notice that feature landing until it's too late.
>> >
>> > I would like to see a coherent plan for presentation support (not
>> > code, just an architectural diagram), because I think when you
>> > contemplate how that works it will change the picture of how this
>> > driver looks and integrates into the rest of the Linux graphics
>> > ecosystem.
>> >
>> > As-is I'd rather this didn't land under my purview, since I don't see
>> > the value this adds to the Linux ecosystem at all, and I think it's
>> > important when putting a burden on upstream that you provide some
>> > value.
>>
>> I also have another concern from a legal standpoint I'd rather not
>> review the ioctl part of this. I'd probably request under DRI
>> developers abstain as well.
>>
>> This is a Windows kernel API being smashed into a Linux driver. I don't want to be
>> tainted by knowledge of an API that I've no idea of the legal status of derived works.
>> (is this all covered patent wise under OIN?)
>
>If you can't look onto it, perhaps it is not suitable to merge into kernel...?
>
>What would be legal requirements so this is "safe to look at"? We should really
>require submitter to meet them...

Could you walk me through your view on what the function of the
"Signed-off-by" tag is?
On Tue, Jun 16, 2020 at 12:51:13PM +0200, Pavel Machek wrote:
>Hi!
>
>> > The driver creates the /dev/dxg device, which can be opened by user mode
>> > application and handles their ioctls. The IOCTL interface to the driver
>> > is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
>> > definitions). The interface matches the D3DKMT interface on Windows.
>> > Ioctls are implemented in ioctl.c.
>>
>> Echoing what others said, you're not making a DRM driver. The driver should live outside
>> of the DRM code.
>>
>
>Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
>we already have DRM API on Linux. We don't want another one, do we?

This driver doesn't have any display functionality.

>And at the very least... this misses API docs for /dev/dxg. Code can't really
>be reviewed without that.

The docs live here: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/
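For readers following the ioctl-interface part of this discussion: the LX_IOCTL macro in the posted header packs direction, size, type and code into the same bit positions that the generic Linux _IOC encoding uses on common architectures such as x86 and arm64. The sketch below compares the two. The bit layout and the 0x47/0x01 values are taken from the posted patch; struct demo_args and the helper names are hypothetical and used only for illustration, and the equality is an assumption that does not hold on architectures with a different _IOC layout (for example powerpc or mips).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Field layout copied from the LX_IOCTL macro in the posted header:
 * dir in bits 31:30, size in bits 29:16, type in bits 15:8, code in 7:0. */
static inline uint32_t lx_ioctl(uint32_t dir, uint32_t size,
                                uint32_t type, uint32_t code)
{
    return ((dir & 0x3u) << 30) | ((size & 0x3FFFu) << 16) |
           ((type & 0xFFu) << 8) | (code & 0xFFu);
}

/* Hypothetical 16-byte argument block; NOT a structure from the driver. */
struct demo_args {
    uint64_t a;
    uint64_t b;
};

/* Number built the way LX_IOWR does it (WRITE = 0x1, READ = 0x2). */
uint32_t lx_iowr_demo(void)
{
    return lx_ioctl(0x1u | 0x2u, sizeof(struct demo_args), 0x47, 0x01);
}

/* The same request number built with the standard user-space macro. */
uint32_t linux_iowr_demo(void)
{
    return (uint32_t)_IOWR(0x47, 0x01, struct demo_args);
}
```

On the architectures named above both helpers produce the same request number, which is consistent with the review question of whether the normal kernel macros could be used instead of a private set.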
On Tue 2020-06-16 09:28:19, Sasha Levin wrote:
> On Tue, Jun 16, 2020 at 12:51:13PM +0200, Pavel Machek wrote:
> > Hi!
> >
> > > > The driver creates the /dev/dxg device, which can be opened by user mode
> > > > application and handles their ioctls. The IOCTL interface to the driver
> > > > is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
> > > > definitions). The interface matches the D3DKMT interface on Windows.
> > > > Ioctls are implemented in ioctl.c.
> > >
> > > Echoing what others said, you're not making a DRM driver. The driver should live outside
> > > of the DRM code.
> > >
> >
> > Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
> > we already have DRM API on Linux. We don't want another one, do we?
>
> This driver doesn't have any display functionality.

Graphics cards without displays connected are quite common. I may be
wrong, but I believe we normally handle them using DRM...

> > And at the very least... this misses API docs for /dev/dxg. Code can't really
> > be reviewed without that.
>
> The docs live here: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/

I don't see "/dev/dxg" being mentioned there. Plus, kernel API
documentation should really go to Documentation, and be suitably
licensed.

Pavel
On Tue, Jun 16, 2020 at 04:41:22PM +0200, Pavel Machek wrote:
>On Tue 2020-06-16 09:28:19, Sasha Levin wrote:
>> On Tue, Jun 16, 2020 at 12:51:13PM +0200, Pavel Machek wrote:
>> > Hi!
>> >
>> > > > The driver creates the /dev/dxg device, which can be opened by user mode
>> > > > application and handles their ioctls. The IOCTL interface to the driver
>> > > > is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
>> > > > definitions). The interface matches the D3DKMT interface on Windows.
>> > > > Ioctls are implemented in ioctl.c.
>> > >
>> > > Echoing what others said, you're not making a DRM driver. The driver should live outside
>> > > of the DRM code.
>> > >
>> >
>> > Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
>> > we already have DRM API on Linux. We don't want another one, do we?
>>
>> This driver doesn't have any display functionality.
>
>Graphics cards without displays connected are quite common. I may be
>wrong, but I believe we normally handle them using DRM...

This is more similar to the accelerators that live in drivers/misc/
right now.

>> > And at the very least... this misses API docs for /dev/dxg. Code can't really
>> > be reviewed without that.
>>
>> The docs live here: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/
>
>I don't see "/dev/dxg" being mentioned there. Plus, kernel API

Right, this is because this entire codebase is just a pipe to the API
I've linked, it doesn't implement anything new on its own.

>documentation should really go to Documentation, and be suitably
>licensed.

While I don't mind copying the docs into Documentation, I'm concerned
that over time they will diverge from the docs on the website. This is
similar to how other documentation (such as the virtio spec) lives out
of tree to avoid these issues.

w.r.t the licensing, again: this was sent under GPL2 (note the SPDX tags in each file), and the patches carry a S-O-B by someone who was a Microsoft employee at the time the patches were sent.
On Tue, May 19, 2020 at 2:36 PM Sasha Levin <sashal@kernel.org> wrote:
>
> Hi Daniel,
>
> On Tue, May 19, 2020 at 09:21:15PM +0200, Daniel Vetter wrote:
> >Hi Sasha
> >
> >So obviously great that Microsoft is trying to upstream all this, and
> >very much welcome and all that.
> >
> >But I guess there's a bunch of rather fundamental issues before we
> >look into any kind of code details. And that might make this quite a
> >hard sell for upstream to drivers/gpu subsystem:
>
> Let me preface my answers by saying that speaking personally I very much
> dislike that the userspace is closed and wish I could do something about
> it.
>
> >- From the blog it sounds like the userspace is all closed. That
> >includes the hw specific part and compiler chunks, all stuff we've
> >generally expected to be able to look in the past for any kind of
> >other driver. It's even documented here:
> >
> >https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
> >
> >What's your plan here?
>
> Let me answer with a (genuine) question: does this driver have anything
> to do with DRM even after we enable graphics on it? I'm still trying to
> figure it out.
>
> There is an open source DX12 Gallium driver (that lives here:
> https://gitlab.freedesktop.org/kusma/mesa/-/tree/msclc-d3d12) with open
> source compiler and so on.
>
> The plan is for Microsoft to provide shims to allow the existing Linux
> userspace interact with DX12; I'll explain below why we had to pipe DX12
> all the way into the Linux guest, but this is *not* to introduce DX12
> into the Linux world as competition. There is no intent for anyone in
> the Linux world to start coding for the DX12 API.

If that really is the case, why is Microsoft recommending developers to
break compatibility with native Linux and use the DX12 APIs here:
https://devblogs.microsoft.com/directx/in-the-works-opencl-and-opengl-mapping-layers-to-directx/

Quote:
"Make it easier for developers to port their apps to D3D12. For
developers looking to move from older OpenCL and OpenGL API versions to
D3D12, the open source mapping layers will provide helpful example code
on how to use the D3D12 Translation Layer library."

If developers of applications that use OpenCL and OpenGL APIs were to
follow this advice and transition to D3D12, their applications would no
longer work on Linux systems unless using WSL2. Is Microsoft planning on
creating a D3D12/DirectML frontend that doesn't depend on WSL2?

>
> This is why I'm not sure whether this touches DRM on the Linux side of
> things. Nothing is actually rendered on Linux but rather piped to
> Windows to be done there.
>
> >btw since the main goal here (at least at first) seems to be get
> >compute and ML going the official work-around here is to relabel your
> >driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
> >entire thing or so) and then Olof and Greg will take it into
> >drivers/accel ...
>
> This submission is not a case of "we want it upstream NOW" but rather
> "let's work together to figure out how to do it right" :)
>
> I thought about placing this driver in drivers/hyper-v/ given that it's
> basically just a pipe between the host and the guest. There is no fancy
> logic in this driver. Maybe the right place is indeed drivers/accel or
> drivers/hyper-v but I'd love if we agree on that rather than doing that
> as a workaround and 6 months down the road enabling graphics.
>
> >- Next up (but that's not really a surprise for a fresh vendor driver)
> >at a more technical level, this seems to reinvent the world, from
> >device enumeration (why is this not exposed as /dev/dri/card0 so it
> >better integrates with existing linux desktop stuff, in case that
> >becomes a goal ever) down to reinvented kref_put_mutex (and please
> >look at drm_device->struct_mutex for an example of how bad of a
> >nightmare that locking pattern is and how many years it took us to
> >untangle that one.
>
> I'd maybe note that neither of us here at Microsoft is an expert in the
> Linux DRM world. Stuff might have been done in a certain way because we
> didn't know better.
>
> >- Why DX12 on linux? Looking at this feels like classic divide and
>
> There is a single usecase for this: WSL2 developer who wants to run
> machine learning on his GPU. The developer is working on his laptop,
> which is running Windows and that laptop has a single GPU that Windows
> is using.
>
> Since the GPU is being used by Windows, we can't assign it directly to
> the Linux guest, but instead we can use GPU Partitioning to give the
> guest access to the GPU. This means that the guest needs to be able to
> "speak" DX12, which is why we pulled DX12 into Linux.
>
> >conquer (or well triple E from the 90s), we have vk, we have
> >drm_syncobj, we have an entire ecosystem of winsys layers that work
> >across vendors. Is the plan here that we get a dx12 driver for other
> >hw mesa drivers from you guys, so this is all consistent and we have a
> >nice linux platform? How does this integrate everywhere else with
> >linux winsys standards, like dma-buf for passing stuff around,
> >dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
> >some idea how it all meshes together?
>
> Let me point you to this blog post that has more information about the
> graphics side of things:
> https://www.collabora.com/news-and-blog/news-and-events/introducing-opencl-and-opengl-on-directx.html
> .
>
> The intent is to wrap DX12 with shims to work with the existing
> ecosystem; DX12 isn't a new player on its own and thus isn't trying to
> divide/conquer anything.

Shouldn't tensorflow/machine learning be going through the OpenCL
compatibility layer/shims instead of talking directly to DX12/DirectML?
If tensorflow or any other machine learning software uses DX12 APIs
directly then they won't be compatible with Linux unless running on top
of WSL2.

>
> >- There's been a pile of hallway track/private discussions about
> >moving on from the buffer-based memory managed model to something more
> >modern. That relates to your DXLOCK2 question, but there's a lot more
> >to userspace managed gpu memory residency than just that. monitored
> >fences are another part. Also, to avoid a platform split we need to
> >figure out how to tie this back into the dma-buf and dma-fence
> >(including various uapi flavours) or it'll be made of fail. dx12 has
> >all that in some form, except 0 integration with the linux stuff we
> >have (no surprise, since linux isn't windows). Finally if we go to the
> >trouble of a completely revamped I think ioctls aren't a great idea,
> >something like iouring (the gossip name is drm_uring) would be a lot
> >better. Also for easier paravirt we'd need 0 cpu pointers in any such
> >new interface. Adding a few people who've been involved in these
> >discussions thus far, mostly under a drm/hmm.ko heading iirc.
> >
> >I think the above are the really big ticket items around what's the
> >plan here and are we solving even the right problem.
>
> Part of the reason behind this implementation is simplicity. Again, no
> objections around moving to uring and doing other improvements.
>
> --
> Thanks,
> Sasha
>
>
>