[RFC,0/4] DirectX on Linux

Message ID 20200519163234.226513-1-sashal@kernel.org (mailing list archive)

Message

Sasha Levin May 19, 2020, 4:32 p.m. UTC
There is a blog post that goes into more detail about the bigger
picture, and walks through all the required pieces to make this work. It
is available here:
https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of
this cover letter will focus on the Linux Kernel bits.

Overview
========

This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The
driver exposes a paravirtualized GPU to user mode applications running
in a virtual machine on a Windows host. This enables hardware
acceleration in environments such as WSL (Windows Subsystem for Linux)
where the Linux virtual machine is able to share the GPU with the
Windows host.

The projection is accomplished by exposing the WDDM (Windows Display
Driver Model) interface as a set of IOCTLs. This allows APIs and user
mode drivers written against the WDDM GPU abstraction on Windows to be
ported to run within a Linux environment. This enables the port of the
D3D12 and DirectML APIs as well as their associated user mode drivers to
Linux. This also enables third party APIs, such as the popular NVIDIA
CUDA compute API, to be hardware accelerated within a WSL environment.

Only the rendering/compute aspects of the GPU are projected to the
virtual machine; no display functionality is exposed. Further, at this
time there is no presentation integration. So although the D3D12 API
can be used to render graphics offscreen, there is no path (yet) for
pixels to flow from the Linux environment back onto the Windows host
desktop. This GPU stack is effectively side-by-side with the native
Linux graphics stack.

The driver creates the /dev/dxg device, which can be opened by user mode
applications, and handles their ioctls. The IOCTL interface to the driver
is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
definitions). The interface matches the D3DKMT interface on Windows.
Ioctls are implemented in ioctl.c.

When a VM starts, Hyper-V on the host adds virtual GPU devices to the VM
via the Hyper-V driver. The host offers several VM bus channels to the
VM: the global channel and one channel per virtual GPU assigned to the
VM.

The driver registers with the Hyper-V driver (hv_driver) for the arrival
of VM bus channels. dxg_probe_device recognizes the vGPU channels and
creates the corresponding objects (dxgadapter for vGPUs and dxgglobal
for the global channel).

The driver uses the Hyper-V VM bus interface to communicate with the
host. dxgvmbus.c implements the communication interface.

The global channel has 8GB of IO space assigned by the host. This space
is managed by the host and used to give the guest direct CPU access to
some allocations. Video memory is allocated on the host except in the
case of existing_sysmem allocations. The Windows host allocates memory
for the GPU on behalf of the guest. The Linux guest can access that
memory by mapping GPU virtual addresses to allocations and then
referencing those GPU virtual addresses from within GPU command buffers
submitted to the GPU. For allocations which require CPU access, the
allocation is mapped by the host into a location in the 8GB of IO space
reserved in the guest for that purpose. The Windows host uses the nested
CPU page table to ensure that this guest IO space always maps to the
correct location for the allocation as it may migrate between dedicated
GPU memory (e.g. VRAM, firmware reserved DDR) and shared system memory
(regular DDR) over its lifetime. The Linux guest maps a user mode CPU
virtual address to an allocation IO space range for direct access by
user mode APIs and drivers.

 

Implementation of LX_DXLOCK2 ioctl
==================================

We would appreciate your feedback on the implementation of the
LX_DXLOCK2 ioctl.

This ioctl is used to get a CPU address to an allocation, which is
resident in video/system memory on the host. The way it works:

1. The driver sends the Lock message to the host

2. The host allocates space in the VM IO space and maps it to the
allocation memory

3. The host returns the address in IO space for the mapped allocation

4. The driver (in dxg_map_iospace) allocates a user mode virtual address
range using vm_mmap and maps it to the IO space using
io_remap_pfn_range

5. The VA is returned to the application

 

Internal objects
================

The following objects are created by the driver (defined in dxgkrnl.h):

- dxgadapter - represents a virtual GPU

- dxgprocess - tracks per process state (handle table of created
  objects, list of objects, etc.)

- dxgdevice - a container for other objects (contexts, paging queues,
  allocations, GPU synchronization objects)

- dxgcontext - represents a thread of GPU execution for packet
  scheduling

- dxghwqueue - represents a thread of GPU execution for hardware
  scheduling

- dxgallocation - represents a GPU accessible allocation

- dxgsyncobject - represents a GPU synchronization object

- dxgresource - collection of dxgallocation objects

- dxgsharedresource, dxgsharedsyncobj - helper objects to share objects
  between different dxgdevice objects, which can belong to different
  processes


 
Object handles
==============

All GPU objects created by the driver are accessible by a handle
(d3dkmt_handle). Each process has its own handle table, which is
implemented in hmgr.c. For each API visible object created by the
driver, there is an object created on the host. For example, there is a
dxgprocess object on the host for each dxgprocess object in the VM, etc.
The object handles have the same value in the host and the VM, which is
done to avoid translation from the guest handles to the host handles.
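
As a rough illustration of the per-process handle table idea, here is a
minimal user-space sketch. It is not the hmgr.c implementation: the
demo_* names, the fixed table size, and the "handle = slot index + 1"
scheme are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TABLE_SIZE 64u

/* Stands in for d3dkmt_handle, which the patch defines as u32. */
typedef uint32_t demo_handle;

/* Hypothetical per-process table: slot i holds the object for handle i + 1. */
struct demo_handle_table {
	void *objects[DEMO_TABLE_SIZE];
};

/* Store obj in the first free slot; return its handle, or 0 if the table is full. */
static demo_handle demo_handle_alloc(struct demo_handle_table *t, void *obj)
{
	assert(obj != NULL);
	for (uint32_t i = 0; i < DEMO_TABLE_SIZE; i++) {
		if (t->objects[i] == NULL) {
			t->objects[i] = obj;
			return i + 1;
		}
	}
	return 0;
}

/* Translate a handle back to its object; NULL for invalid or freed handles. */
static void *demo_handle_lookup(const struct demo_handle_table *t, demo_handle h)
{
	if (h == 0 || h > DEMO_TABLE_SIZE)
		return NULL;
	return t->objects[h - 1];
}

/* Release the slot so that later lookups of h fail. */
static void demo_handle_free(struct demo_handle_table *t, demo_handle h)
{
	if (h != 0 && h <= DEMO_TABLE_SIZE)
		t->objects[h - 1] = NULL;
}
```

Because the same numeric handle is valid on both sides, a message to the
host could carry the value returned by demo_handle_alloc() unchanged,
which is the translation-free property described above.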
 


Signaling CPU events by the host
================================

The WDDM interface provides a way to signal CPU event objects when
execution of a context reaches a certain point. The way it is implemented:

- application sends an event_fd via ioctl to the driver

- eventfd_ctx_get is used to get a pointer to the file object
  (eventfd_ctx)

- the pointer is sent to the host via a VM bus message

- when GPU execution reaches a certain point, the host sends a message
  to the VM with the event pointer

- signal_guest_event() handles the messages and eventually
  eventfd_signal() is called.
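
The application-visible half of this flow can be sketched in user space
with the standard eventfd API. This is only an analogy: the demo_*
helpers are hypothetical, and demo_event_signal() stands in for what the
driver would do in the kernel with eventfd_signal() once the host
message arrives.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the event object an application would pass to the driver via ioctl. */
static int demo_event_create(void)
{
	return eventfd(0, EFD_NONBLOCK);
}

/* User-space stand-in for the kernel-side signal: add 1 to the counter. */
static int demo_event_signal(int efd)
{
	return eventfd_write(efd, 1);
}

/*
 * Consume pending signals. Returns the number of signals accumulated
 * since the last read, or -1 if nothing is pending (EAGAIN) or on error.
 */
static int64_t demo_event_consume(int efd)
{
	eventfd_t count = 0;

	if (eventfd_read(efd, &count) != 0)
		return -1;
	return (int64_t)count;
}
```

In a real application the descriptor would typically be waited on with
poll() or epoll rather than read in non-blocking mode.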


Sasha Levin (4):
  gpu: dxgkrnl: core code
  gpu: dxgkrnl: hook up dxgkrnl
  Drivers: hv: vmbus: hook up dxgkrnl
  gpu: dxgkrnl: create a MAINTAINERS entry

 MAINTAINERS                      |    7 +
 drivers/gpu/Makefile             |    2 +-
 drivers/gpu/dxgkrnl/Kconfig      |   10 +
 drivers/gpu/dxgkrnl/Makefile     |   12 +
 drivers/gpu/dxgkrnl/d3dkmthk.h   | 1635 +++++++++
 drivers/gpu/dxgkrnl/dxgadapter.c | 1399 ++++++++
 drivers/gpu/dxgkrnl/dxgkrnl.h    |  913 ++++++
 drivers/gpu/dxgkrnl/dxgmodule.c  |  692 ++++
 drivers/gpu/dxgkrnl/dxgprocess.c |  355 ++
 drivers/gpu/dxgkrnl/dxgvmbus.c   | 2955 +++++++++++++++++
 drivers/gpu/dxgkrnl/dxgvmbus.h   |  859 +++++
 drivers/gpu/dxgkrnl/hmgr.c       |  593 ++++
 drivers/gpu/dxgkrnl/hmgr.h       |  107 +
 drivers/gpu/dxgkrnl/ioctl.c      | 5269 ++++++++++++++++++++++++++++++
 drivers/gpu/dxgkrnl/misc.c       |  280 ++
 drivers/gpu/dxgkrnl/misc.h       |  288 ++
 drivers/video/Kconfig            |    2 +
 include/linux/hyperv.h           |   16 +
 18 files changed, 15393 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/dxgkrnl/Kconfig
 create mode 100644 drivers/gpu/dxgkrnl/Makefile
 create mode 100644 drivers/gpu/dxgkrnl/d3dkmthk.h
 create mode 100644 drivers/gpu/dxgkrnl/dxgadapter.c
 create mode 100644 drivers/gpu/dxgkrnl/dxgkrnl.h
 create mode 100644 drivers/gpu/dxgkrnl/dxgmodule.c
 create mode 100644 drivers/gpu/dxgkrnl/dxgprocess.c
 create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.c
 create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.h
 create mode 100644 drivers/gpu/dxgkrnl/hmgr.c
 create mode 100644 drivers/gpu/dxgkrnl/hmgr.h
 create mode 100644 drivers/gpu/dxgkrnl/ioctl.c
 create mode 100644 drivers/gpu/dxgkrnl/misc.c
 create mode 100644 drivers/gpu/dxgkrnl/misc.h

Comments

Greg Kroah-Hartman May 19, 2020, 5:19 p.m. UTC | #1
On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> +/*
> + * Dxgkrnl Graphics Port Driver ioctl definitions
> + *
> + */
> +
> +#define LX_IOCTL_DIR_WRITE 0x1
> +#define LX_IOCTL_DIR_READ  0x2
> +
> +#define LX_IOCTL_DIR(_ioctl)	(((_ioctl) >> 30) & 0x3)
> +#define LX_IOCTL_SIZE(_ioctl)	(((_ioctl) >> 16) & 0x3FFF)
> +#define LX_IOCTL_TYPE(_ioctl)	(((_ioctl) >> 8) & 0xFF)
> +#define LX_IOCTL_CODE(_ioctl)	(((_ioctl) >> 0) & 0xFF)

Why create new ioctl macros, can't the "normal" kernel macros work
properly?
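
To make the question concrete, the sketch below compares the quoted
LX_* encoding with the standard _IOWR() macro. The argument struct is
hypothetical, and the comparison assumes an architecture that uses the
asm-generic ioctl layout (x86 and arm64 do; a few others use different
field widths and direction values).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>	/* _IOWR, _IOC_SIZE */

/* The LX_* encoding quoted above, with "uint" spelled out for user space. */
#define LX_IOCTL_DIR_WRITE 0x1
#define LX_IOCTL_DIR_READ  0x2
#define LX_IOCTL(_dir, _size, _type, _code) (		\
	(((unsigned int)(_dir) & 0x3) << 30) |		\
	(((unsigned int)(_size) & 0x3FFF) << 16) |	\
	(((unsigned int)(_type) & 0xFF) << 8) |		\
	(((unsigned int)(_code) & 0xFF) << 0))
#define LX_IOWR(_type, _code, _size)	\
	LX_IOCTL(LX_IOCTL_DIR_WRITE | LX_IOCTL_DIR_READ, (_size), (_type), (_code))

/* Hypothetical 16-byte argument struct, for illustration only. */
struct demo_openadapter {
	uint32_t luid_low;
	uint32_t luid_high;
	uint32_t adapter_handle;
	uint32_t reserved;
};

/* ioctl number built with the driver's private macros. */
static unsigned int demo_lx_number(void)
{
	return LX_IOWR(0x47, 0x01, sizeof(struct demo_openadapter));
}

/* The same ioctl number built with the standard macro. */
static unsigned int demo_std_number(void)
{
	return (unsigned int)_IOWR(0x47, 0x01, struct demo_openadapter);
}
```

Under that assumption both functions return 0xC0104701, so the private
macros add nothing over _IO(), _IOR(), _IOW() and _IOWR(), and the
standard ones also take care of the per-architecture differences.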

> +#define LX_IOCTL(_dir, _size, _type, _code) (	\
> +	(((uint)(_dir) & 0x3) << 30) |		\
> +	(((uint)(_size) & 0x3FFF) << 16) |	\
> +	(((uint)(_type) & 0xFF) << 8) |		\
> +	(((uint)(_code) & 0xFF) << 0))
> +
> +#define LX_IO(_type, _code) LX_IOCTL(0, 0, (_type), (_code))
> +#define LX_IOR(_type, _code, _size)	\
> +	LX_IOCTL(LX_IOCTL_DIR_READ, (_size), (_type), (_code))
> +#define LX_IOW(_type, _code, _size)	\
> +	LX_IOCTL(LX_IOCTL_DIR_WRITE, (_size), (_type), (_code))
> +#define LX_IOWR(_type, _code, _size)	\
> +	LX_IOCTL(LX_IOCTL_DIR_WRITE |	\
> +	LX_IOCTL_DIR_READ, (_size), (_type), (_code))
> +
> +#define LX_DXOPENADAPTERFROMLUID	\
> +	LX_IOWR(0x47, 0x01, sizeof(struct d3dkmt_openadapterfromluid))

<snip>

These structures do not seem to be all using the correct types for a
"real" ioctl in the kernel, so you will have to fix them all up before
this will work properly.
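
One common reading of "correct types" is fixed-width members, explicit
padding, and user pointers carried as 64-bit integers, so that the
layout is identical for 32-bit and 64-bit callers. A minimal sketch,
using a hypothetical structure that is not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/types.h>	/* __u32, __u64 */

/* Hypothetical ioctl argument, for illustration only. */
struct demo_lock_args {
	__u32 device;		/* object handle */
	__u32 allocation;	/* object handle */
	__u32 flags;
	__u32 reserved;		/* explicit padding keeps "data" 8-byte aligned */
	__u64 data;		/* user pointer carried as a 64-bit integer */
};

/* The layout must not depend on the compiler or on pointer size. */
_Static_assert(sizeof(struct demo_lock_args) == 24, "unexpected size");
_Static_assert(offsetof(struct demo_lock_args, data) == 16, "unexpected offset");

/* Store a user pointer in a __u64 member. */
static __u64 demo_ptr_to_u64(const void *p)
{
	return (__u64)(uintptr_t)p;
}

/* Recover the pointer on the same side of the ABI. */
static void *demo_u64_to_ptr(__u64 v)
{
	return (void *)(uintptr_t)v;
}
```

The static assertions turn an accidental layout change into a build
failure instead of a silent ABI break.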

> +void ioctl_desc_init(void);

Very odd global name you are using here :)

Anyway, neat stuff, glad to see it posted, great work!

greg k-h
Greg Kroah-Hartman May 19, 2020, 5:21 p.m. UTC | #2
On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> +
> +#define DXGK_MAX_LOCK_DEPTH	64
> +#define W_MAX_PATH		260

We already have a max path number, why use a different one?

> +#define d3dkmt_handle		u32
> +#define d3dgpu_virtual_address	u64
> +#define winwchar		u16
> +#define winhandle		u64
> +#define ntstatus		int
> +#define winbool			u32
> +#define d3dgpu_size_t		u64

These are all ripe for a simple search/replace in your editor before you
do your next version :)

thanks,

greg k-h
Greg Kroah-Hartman May 19, 2020, 5:27 p.m. UTC | #3
On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> +static int dxgglobal_init_global_channel(struct hv_device *hdev)
> +{
> +	int ret = 0;
> +
> +	TRACE_DEBUG(1, "%s %x  %x", __func__, hdev->vendor_id, hdev->device_id);
> +	{
> +		TRACE_DEBUG(1, "device type   : %pUb\n", &hdev->dev_type);
> +		TRACE_DEBUG(1, "device channel: %pUb %p primary: %p\n",
> +			    &hdev->channel->offermsg.offer.if_type,
> +			    hdev->channel, hdev->channel->primary_channel);
> +	}
> +
> +	if (dxgglobal->hdev) {
> +		/* This device should appear only once */
> +		pr_err("dxgglobal already initialized\n");
> +		ret = -EBADE;
> +		goto error;
> +	}
> +
> +	dxgglobal->hdev = hdev;
> +
> +	ret = dxgvmbuschannel_init(&dxgglobal->channel, hdev);
> +	if (ret) {
> +		pr_err("dxgvmbuschannel_init failed: %d\n", ret);
> +		goto error;
> +	}
> +
> +	ret = dxgglobal_getiospace(dxgglobal);
> +	if (ret) {
> +		pr_err("getiospace failed: %d\n", ret);
> +		goto error;
> +	}
> +
> +	ret = dxgvmb_send_set_iospace_region(dxgglobal->mmiospace_base,
> +					     dxgglobal->mmiospace_size, 0);
> +	if (ret) {
> +		pr_err("send_set_iospace_region failed\n");
> +		goto error;
> +	}
> +
> +	hv_set_drvdata(hdev, dxgglobal);
> +
> +	if (alloc_chrdev_region(&dxgglobal->device_devt, 0, 1, "dxgkrnl") < 0) {
> +		pr_err("alloc_chrdev_region failed\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +	dxgglobal->devt_initialized = true;
> +	dxgglobal->device_class = class_create(THIS_MODULE, "dxgkdrv");
> +	if (dxgglobal->device_class == NULL) {
> +		pr_err("class_create failed\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +	dxgglobal->device_class->devnode = dxg_devnode;
> +	dxgglobal->device = device_create(dxgglobal->device_class, NULL,
> +					  dxgglobal->device_devt, NULL, "dxg");
> +	if (dxgglobal->device == NULL) {
> +		pr_err("device_create failed\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +	dxgglobaldev = dxgglobal->device;
> +	cdev_init(&dxgglobal->device_cdev, &dxgk_fops);
> +	ret = cdev_add(&dxgglobal->device_cdev, dxgglobal->device_devt, 1);
> +	if (ret < 0) {
> +		pr_err("cdev_add failed: %d\n", ret);
> +		goto error;
> +	}
> +	dxgglobal->cdev_initialized = true;
> +
> +error:
> +	return ret;
> +}

As you only are asking for a single char dev node, please just use the
misc device api instead of creating your own class and major number on
the fly.  It's much simpler and easier overall to make sure you got all
of the above logic correct.

thanks,

greg k-h
Sasha Levin May 19, 2020, 5:45 p.m. UTC | #4
On Tue, May 19, 2020 at 07:21:05PM +0200, Greg KH wrote:
>On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
>> +
>> +#define DXGK_MAX_LOCK_DEPTH	64
>> +#define W_MAX_PATH		260
>
>We already have a max path number, why use a different one?

It's max path for Windows, not Linux (thus the "W_" prefix) :)

Maybe changing it to WIN_MAX_PATH or such will make it better?

>> +#define d3dkmt_handle		u32
>> +#define d3dgpu_virtual_address	u64
>> +#define winwchar		u16
>> +#define winhandle		u64
>> +#define ntstatus		int
>> +#define winbool			u32
>> +#define d3dgpu_size_t		u64
>
>These are all ripe for a simple search/replace in your editor before you
>do your next version :)

I've actually attempted that, and reverted that change, mostly because
the whole 'handle' thing became very confusing.

Note that we have a few 'handles', each with a different size, and thus
calling get_something_something_handle() type of functions became very
confusing since it's not clear what handle we're working with in that
case.

With regards to the rest, I wanted to leave stuff like 'winbool' to
document the expected ABI between the Windows and Linux side of things.
Ideally it would be 'bool' or 'u8', but as you see we had to use 'u32'
here which I feel lessens our ability to have the code document itself.

I don't feel too strongly against doing the conversion, and I won't
object to doing it if you do, but just be aware that I've tried it and
preferred to go back (even though our coding style doesn't like this) :)
Daniel Vetter May 19, 2020, 7:21 p.m. UTC | #5
Hi Sasha

So obviously great that Microsoft is trying to upstream all this, and
very much welcome and all that.

But I guess there's a bunch of rather fundamental issues before we
look into any kind of code details. And that might make this quite a
hard sell for upstream to drivers/gpu subsystem:

- From the blog it sounds like the userspace is all closed. That
includes the hw specific part and compiler chunks, all stuff we've
generally expected to be able to look in the past for any kind of
other driver. It's even documented here:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements

What's your plan here?

btw since the main goal here (at least at first) seems to be get
compute and ML going the official work-around here is to relabel your
driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
entire thing or so) and then Olof and Greg will take it into
drivers/accel ...

- Next up (but that's not really a surprise for a fresh vendor driver)
at a more technical level, this seems to reinvent the world, from
device enumeration (why is this not exposed as /dev/dri/card0 so it
better integrates with existing linux desktop stuff, in case that
becomes a goal ever) down to reinvented kref_put_mutex (and please
look at drm_device->struct_mutex for an example of how bad of a
nightmare that locking pattern is and how many years it took us to
untangle that one).

- Why DX12 on linux? Looking at this feels like classic divide and
conquer (or well triple E from the 90s), we have vk, we have
drm_syncobj, we have an entire ecosystem of winsys layers that work
across vendors. Is the plan here that we get a dx12 driver for other
hw mesa drivers from you guys, so this is all consistent and we have a
nice linux platform? How does this integrate everywhere else with
linux winsys standards, like dma-buf for passing stuff around,
dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
some idea how it all meshes together?

- There's been a pile of hallway track/private discussions about
moving on from the buffer-based memory managed model to something more
modern. That relates to your DXLOCK2 question, but there's a lot more
to userspace managed gpu memory residency than just that. monitored
fences are another part. Also, to avoid a platform split we need to
figure out how to tie this back into the dma-buf and dma-fence
(including various uapi flavours) or it'll be made of fail. dx12 has
all that in some form, except 0 integration with the linux stuff we
have (no surprise, since linux isn't windows). Finally if we go to the
trouble of a completely revamped I think ioctls aren't a great idea,
something like iouring (the gossip name is drm_uring) would be a lot
better. Also for easier paravirt we'd need 0 cpu pointers in any such
new interface. Adding a few people who've been involved in these
discussions thus far, mostly under a drm/hmm.ko heading iirc.

I think the above are the really big ticket items around what's the
plan here and are we solving even the right problem.

Cheers, Daniel





--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Sasha Levin May 19, 2020, 8:36 p.m. UTC | #6
Hi Daniel,

On Tue, May 19, 2020 at 09:21:15PM +0200, Daniel Vetter wrote:
>Hi Sasha
>
>So obviously great that Microsoft is trying to upstream all this, and
>very much welcome and all that.
>
>But I guess there's a bunch of rather fundamental issues before we
>look into any kind of code details. And that might make this quite a
>hard sell for upstream to drivers/gpu subsystem:

Let me preface my answers by saying that speaking personally I very much
dislike that the userspace is closed and wish I could do something about
it.

>- From the blog it sounds like the userspace is all closed. That
>includes the hw specific part and compiler chunks, all stuff we've
>generally expected to be able to look in the past for any kind of
>other driver. It's even documented here:
>
>https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
>
>What's your plan here?

Let me answer with a (genuine) question: does this driver have anything
to do with DRM even after we enable graphics on it? I'm still trying to
figure it out.

There is an open source DX12 Gallium driver (that lives here:
https://gitlab.freedesktop.org/kusma/mesa/-/tree/msclc-d3d12) with open
source compiler and so on.

The plan is for Microsoft to provide shims to allow the existing Linux
userspace to interact with DX12; I'll explain below why we had to pipe DX12
all the way into the Linux guest, but this is *not* to introduce DX12
into the Linux world as competition. There is no intent for anyone in
the Linux world to start coding for the DX12 API.

This is why I'm not sure whether this touches DRM on the Linux side of
things. Nothing is actually rendered on Linux but rather piped to
Windows to be done there.

>btw since the main goal here (at least at first) seems to be get
>compute and ML going the official work-around here is to relabel your
>driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
>entire thing or so) and then Olof and Greg will take it into
>drivers/accel ...

This submission is not a case of "we want it upstream NOW" but rather
"let's work together to figure out how to do it right" :)

I thought about placing this driver in drivers/hyper-v/ given that it's
basically just a pipe between the host and the guest. There is no fancy
logic in this driver. Maybe the right place is indeed drivers/accel or
drivers/hyper-v but I'd love if we agree on that rather than doing that
as a workaround and 6 months down the road enabling graphics.

>- Next up (but that's not really a surprise for a fresh vendor driver)
>at a more technical level, this seems to reinvent the world, from
>device enumeration (why is this not exposed as /dev/dri/card0 so it
>better integrates with existing linux desktop stuff, in case that
>becomes a goal ever) down to reinvented kref_put_mutex (and please
>look at drm_device->struct_mutex for an example of how bad of a
>nightmare that locking pattern is and how many years it took us to
>untangle that one).

I'd maybe note that none of us here at Microsoft is an expert in the
Linux DRM world. Stuff might have been done in a certain way because we
didn't know better.

>- Why DX12 on linux? Looking at this feels like classic divide and

There is a single usecase for this: a WSL2 developer who wants to run
machine learning on his GPU. The developer is working on his laptop,
which is running Windows and that laptop has a single GPU that Windows
is using.

Since the GPU is being used by Windows, we can't assign it directly to
the Linux guest, but instead we can use GPU Partitioning to give the
guest access to the GPU. This means that the guest needs to be able to
"speak" DX12, which is why we pulled DX12 into Linux.

>conquer (or well triple E from the 90s), we have vk, we have
>drm_syncobj, we have an entire ecosystem of winsys layers that work
>across vendors. Is the plan here that we get a dx12 driver for other
>hw mesa drivers from you guys, so this is all consistent and we have a
>nice linux platform? How does this integrate everywhere else with
>linux winsys standards, like dma-buf for passing stuff around,
>dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
>some idea how it all meshes together?

Let me point you to this blog post that has more information about the
graphics side of things:
https://www.collabora.com/news-and-blog/news-and-events/introducing-opencl-and-opengl-on-directx.html
.

The intent is to wrap DX12 with shims to work with the existing
ecosystem; DX12 isn't a new player on its own and thus isn't trying to
divide/conquer anything.

>- There's been a pile of hallway track/private discussions about
>moving on from the buffer-based memory managed model to something more
>modern. That relates to your DXLOCK2 question, but there's a lot more
>to userspace managed gpu memory residency than just that. monitored
>fences are another part. Also, to avoid a platform split we need to
>figure out how to tie this back into the dma-buf and dma-fence
>(including various uapi flavours) or it'll be made of fail. dx12 has
>all that in some form, except 0 integration with the linux stuff we
>have (no surprise, since linux isn't windows). Finally if we go to the
>trouble of a completely revamped I think ioctls aren't a great idea,
>something like iouring (the gossip name is drm_uring) would be a lot
>better. Also for easier paravirt we'd need 0 cpu pointers in any such
>new interface. Adding a few people who've been involved in these
>discussions thus far, mostly under a drm/hmm.ko heading iirc.
>
>I think the above are the really big ticket items around what's the
>plan here and are we solving even the right problem.

Part of the reason behind this implementation is simplicity. Again, no
objections around moving to uring and doing other improvements.
Dave Airlie May 19, 2020, 10:42 p.m. UTC | #7
On Wed, 20 May 2020 at 02:33, Sasha Levin <sashal@kernel.org> wrote:
>
> There is a blog post that goes into more detail about the bigger
> picture, and walks through all the required pieces to make this work. It
> is available here:
> https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of
> this cover letter will focus on the Linux Kernel bits.
>
> Overview
> ========
>
> This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The
> driver exposes a paravirtualized GPU to user mode applications running
> in a virtual machine on a Windows host. This enables hardware
> acceleration in environment such as WSL (Windows Subsystem for Linux)
> where the Linux virtual machine is able to share the GPU with the
> Windows host.
>
> The projection is accomplished by exposing the WDDM (Windows Display
> Driver Model) interface as a set of IOCTL. This allows APIs and user
> mode driver written against the WDDM GPU abstraction on Windows to be
> ported to run within a Linux environment. This enables the port of the
> D3D12 and DirectML APIs as well as their associated user mode driver to
> Linux. This also enables third party APIs, such as the popular NVIDIA
> Cuda compute API, to be hardware accelerated within a WSL environment.
>
> Only the rendering/compute aspect of the GPU are projected to the
> virtual machine, no display functionality is exposed. Further, at this
> time there are no presentation integration. So although the D3D12 API
> can be use to render graphics offscreen, there is no path (yet) for
> pixel to flow from the Linux environment back onto the Windows host
> desktop. This GPU stack is effectively side-by-side with the native
> Linux graphics stack.

Okay I've had some caffeine and absorbed some more of this.

This is a driver that connects a binary blob interface in the Windows
kernel drivers to a binary blob that you run inside a Linux guest.
It's a binary transport between two binary pieces. Personally this
holds little of interest to me, I can see why it might be nice to have
this upstream, but I don't foresee any other Linux distributor ever
enabling it or having to ship it, it's purely a WSL2 pipe. I'm not
saying I'd be happy to see this in the tree, since I don't see the
value of maintaining it upstream, but it probably should just exist
in a drivers/hyperv type area.

Having said that, I hit one stumbling block:
"Further, at this time there are no presentation integration. "

If we upstream this driver as-is into some hyperv specific place, and
you decide to add presentation integration this is more than likely
going to mean you will want to interact with dma-bufs and dma-fences.
If the driver is hidden away in a hyperv place it's likely we won't
even notice that feature landing until it's too late.

I would like to see a coherent plan for presentation support (not
code, just an architectural diagram), because I think when you
contemplate how that works it will change the picture of how this
driver looks and integrates into the rest of the Linux graphics
ecosystem.

As-is I'd rather this didn't land under my purview, since I don't see
the value this adds to the Linux ecosystem at all, and I think it's
important when putting a burden on upstream that you provide some
value.

Dave.
Daniel Vetter May 19, 2020, 11:01 p.m. UTC | #8
On Wed, May 20, 2020 at 12:42 AM Dave Airlie <airlied@gmail.com> wrote:
>
> On Wed, 20 May 2020 at 02:33, Sasha Levin <sashal@kernel.org> wrote:
> >
> > There is a blog post that goes into more detail about the bigger
> > picture, and walks through all the required pieces to make this work. It
> > is available here:
> > https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of
> > this cover letter will focus on the Linux Kernel bits.
> >
> > Overview
> > ========
> >
> > This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The
> > driver exposes a paravirtualized GPU to user mode applications running
> > in a virtual machine on a Windows host. This enables hardware
> > acceleration in environment such as WSL (Windows Subsystem for Linux)
> > where the Linux virtual machine is able to share the GPU with the
> > Windows host.
> >
> > The projection is accomplished by exposing the WDDM (Windows Display
> > Driver Model) interface as a set of IOCTL. This allows APIs and user
> > mode driver written against the WDDM GPU abstraction on Windows to be
> > ported to run within a Linux environment. This enables the port of the
> > D3D12 and DirectML APIs as well as their associated user mode driver to
> > Linux. This also enables third party APIs, such as the popular NVIDIA
> > Cuda compute API, to be hardware accelerated within a WSL environment.
> >
> > Only the rendering/compute aspect of the GPU are projected to the
> > virtual machine, no display functionality is exposed. Further, at this
> > time there are no presentation integration. So although the D3D12 API
> > can be use to render graphics offscreen, there is no path (yet) for
> > pixel to flow from the Linux environment back onto the Windows host
> > desktop. This GPU stack is effectively side-by-side with the native
> > Linux graphics stack.
>
> Okay I've had some caffeine and absorbed some more of this.
>
> This is a driver that connects a binary blob interface in the Windows
> kernel drivers to a binary blob that you run inside a Linux guest.
> It's a binary transport between two binary pieces. Personally this
> holds little of interest to me, I can see why it might be nice to have
> this upstream, but I don't foresee any other Linux distributor ever
> enabling it or having to ship it, it's purely a WSL2 pipe. I'm not
> saying I'd be happy to see this in the tree, since I don't see the
> value of maintaining it upstream, but it probably should just exist
> in a drivers/hyperv type area.

Yup as-is (especially with the goal of this being aimed at ml/compute
only) drivers/hyperv sounds a bunch more reasonable than drivers/gpu.

> Having said that, I hit one stumbling block:
> "Further, at this time there are no presentation integration. "
>
> If we upstream this driver as-is into some hyperv specific place, and
> you decide to add presentation integration this is more than likely
> going to mean you will want to interact with dma-bufs and dma-fences.
> If the driver is hidden away in a hyperv place it's likely we won't
> even notice that feature landing until it's too late.

I've recently added regex matches to MAINTAINERS so we'll see
dma_buf/fence/anything show up on dri-devel. So that part is solved
hopefully.

> I would like to see a coherent plan for presentation support (not
> code, just an architectural diagram), because I think when you
> contemplate how that works it will change the picture of how this
> driver looks and integrates into the rest of the Linux graphics
> ecosystem.

Yeah once we have the feature-creep to presentation support all the
integration fun starts, with all the questions about "why does this
not look like any other linux gpu driver". We have that already with
nvidia insisting they just can't implement any of the upstream gpu
uapi we have, but at least they're not in-tree, so not our problem
from an upstream maintainership pov.

But once this dx12 pipe is landed and then we want to extend it it's
still going to have all the "we can't ever release the sources to any
of the parts we usually expect to be open for gpu drivers in upstream"
problems. Then we're stuck at a rather awkward point of why one vendor
gets an exception and all the others don't.

> As-is I'd rather this didn't land under my purview, since I don't see
> the value this adds to the Linux ecosystem at all, and I think it's
> important when putting a burden on upstream that you provide some
> value.

Well there is some in the form of "more hw/platform support". But
gpus evolve rather fast, including the entire integration ecosystem
(it's by far not just the hw drivers that move quickly), so that value
depreciates a lot faster than for other kernel subsystems. And all
that's left is the pain of not breaking anything without actually
being able to evolve the overall stack in any meaningful way.
-Daniel
Dave Airlie May 19, 2020, 11:12 p.m. UTC | #9
On Wed, 20 May 2020 at 08:42, Dave Airlie <airlied@gmail.com> wrote:
>
> On Wed, 20 May 2020 at 02:33, Sasha Levin <sashal@kernel.org> wrote:
> >
> > There is a blog post that goes into more detail about the bigger
> > picture, and walks through all the required pieces to make this work. It
> > is available here:
> > https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of
> > this cover letter will focus on the Linux Kernel bits.
> >
> > Overview
> > ========
> >
> > This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The
> > driver exposes a paravirtualized GPU to user mode applications running
> > in a virtual machine on a Windows host. This enables hardware
> > acceleration in environment such as WSL (Windows Subsystem for Linux)
> > where the Linux virtual machine is able to share the GPU with the
> > Windows host.
> >
> > The projection is accomplished by exposing the WDDM (Windows Display
> > Driver Model) interface as a set of IOCTL. This allows APIs and user
> > mode driver written against the WDDM GPU abstraction on Windows to be
> > ported to run within a Linux environment. This enables the port of the
> > D3D12 and DirectML APIs as well as their associated user mode driver to
> > Linux. This also enables third party APIs, such as the popular NVIDIA
> > Cuda compute API, to be hardware accelerated within a WSL environment.
> >
> > Only the rendering/compute aspect of the GPU are projected to the
> > virtual machine, no display functionality is exposed. Further, at this
> > time there are no presentation integration. So although the D3D12 API
> > can be use to render graphics offscreen, there is no path (yet) for
> > pixel to flow from the Linux environment back onto the Windows host
> > desktop. This GPU stack is effectively side-by-side with the native
> > Linux graphics stack.
>
> Okay I've had some caffeine and absorbed some more of this.
>
> This is a driver that connects a binary blob interface in the Windows
> kernel drivers to a binary blob that you run inside a Linux guest.
> It's a binary transport between two binary pieces. Personally this
> holds little of interest to me, I can see why it might be nice to have
> this upstream, but I don't foresee any other Linux distributor ever
> enabling it or having to ship it, it's purely a WSL2 pipe. I'm not
> saying I'd be happy to see this in the tree, since I don't see the
> value of maintaining it upstream, but it probably should just exist
> in a drivers/hyperv type area.
>
> Having said that, I hit one stumbling block:
> "Further, at this time there are no presentation integration. "
>
> If we upstream this driver as-is into some hyperv specific place, and
> you decide to add presentation integration this is more than likely
> going to mean you will want to interact with dma-bufs and dma-fences.
> If the driver is hidden away in a hyperv place it's likely we won't
> even notice that feature landing until it's too late.
>
> I would like to see a coherent plan for presentation support (not
> code, just an architectural diagram), because I think when you
> contemplate how that works it will change the picture of how this
> driver looks and integrates into the rest of the Linux graphics
> ecosystem.
>
> As-is I'd rather this didn't land under my purview, since I don't see
> the value this adds to the Linux ecosystem at all, and I think it's
> important when putting a burden on upstream that you provide some
> value.

I also have another concern, from a legal standpoint: I'd rather not
review the ioctl part of this. I'd probably request that other DRI
developers abstain as well.

This is a Windows kernel API being smashed into a Linux driver. I
don't want to be tainted by knowledge of an API that I've no idea of
the legal status of derived works. (is this all covered patent wise
under OIN?)

I don't want to ever be accused of designing a Linux kernel API with
ill-gotten D3DKMT knowledge; I feel tainting myself with knowledge of a
proprietary API might cause derived work issues.

Dave.
Steve Pronovost May 20, 2020, 3:47 a.m. UTC | #10
Hey guys,

Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best 
Greg Kroah-Hartman May 20, 2020, 6:13 a.m. UTC | #11
On Tue, May 19, 2020 at 01:45:53PM -0400, Sasha Levin wrote:
> On Tue, May 19, 2020 at 07:21:05PM +0200, Greg KH wrote:
> > On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> > > +
> > > +#define DXGK_MAX_LOCK_DEPTH	64
> > > +#define W_MAX_PATH		260
> > 
> > We already have a max path number, why use a different one?
> 
> It's max path for Windows, not Linux (thus the "W_" prefix) :)

Ah, not obvious :)

> Maybe changing it to WIN_MAX_PATH or such will make it better?

Probably.

> > > +#define d3dkmt_handle		u32
> > > +#define d3dgpu_virtual_address	u64
> > > +#define winwchar		u16
> > > +#define winhandle		u64
> > > +#define ntstatus		int
> > > +#define winbool			u32
> > > +#define d3dgpu_size_t		u64
> > 
> > These are all ripe for a simple search/replace in your editor before you
> > do your next version :)
> 
> I've actually attempted that, and reverted that change, mostly because
> the whole 'handle' thing became very confusing.

Yeah, "handles" in windows can be a mess, with some being pointers and
others just integers.  Trying to make a specific typedef for it is
usually the better way overall, that way you can get the compiler to
check for mistakes.  These #defines will not really help with that.
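
A minimal sketch of the typedef approach described above. The type and
helper names are hypothetical (they are not taken from the patch), and
user-space uint32_t/uint64_t stand in for the kernel's u32/u64. Wrapping
each handle kind in its own single-member struct lets the compiler
reject a 64-bit handle passed where a 32-bit one is expected, which a
#define alias to a bare integer cannot do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: each Windows-side handle kind gets its own struct
 * type, so mixing them up is a compile error rather than a silent bug. */
typedef struct { uint32_t v; } d3dkmt_handle_t;   /* 32-bit object handle */
typedef struct { uint64_t v; } win_handle_t;      /* 64-bit handle */

/* The wrappers must not change the size seen on the wire. */
static_assert(sizeof(d3dkmt_handle_t) == sizeof(uint32_t), "handle grew");
static_assert(sizeof(win_handle_t) == sizeof(uint64_t), "handle grew");

static inline d3dkmt_handle_t d3dkmt_handle_from_u32(uint32_t raw)
{
	return (d3dkmt_handle_t){ .v = raw };
}

static inline int d3dkmt_handle_is_null(d3dkmt_handle_t h)
{
	return h.v == 0;
}

/* Accepts only the 32-bit kind; passing a win_handle_t does not compile. */
static inline uint32_t d3dkmt_handle_value(d3dkmt_handle_t h)
{
	return h.v;
}
```

With this layout, a call such as d3dkmt_handle_value(some_win_handle)
fails at build time instead of silently truncating.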

But, 'ntstatus' should be ok to just make "int" everywhere, right?

> Note that we have a few 'handles', each with a different size, and thus
> calling get_something_something_handle() type of functions became very
> confusing since it's not clear what handle we're working with in that
> case.

Yeah, typedefs can help there.

> With regards to the rest, I wanted to leave stuff like 'winbool' to
> document the expected ABI between the Windows and Linux side of things.
> Ideally it would be 'bool' or 'u8', but as you see we had to use 'u32'
> here which I feel lessens our ability to have the code document itself.

'bool' probably will not work as I think it's compiler dependent, __u8
is probably best.
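
A small sketch of why an explicit width matters for an ABI field. The
struct below is a made-up layout for illustration only, not one from the
patch, and uint32_t/uint64_t stand in for the kernel's __u32/__u64. The
static assertions pin the layout at build time, so a field whose size
depends on the compiler cannot shift the others unnoticed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message layout. Every field has an explicit width, so the
 * layout does not depend on how a given compiler sizes 'bool'. */
struct example_lock_msg {
	uint32_t handle;      /* 32-bit object handle */
	uint32_t read_only;   /* boolean carried as 32 bits, 0 or 1 */
	uint64_t gpu_va;      /* 64-bit GPU virtual address */
};

static_assert(sizeof(struct example_lock_msg) == 16, "unexpected padding");
static_assert(offsetof(struct example_lock_msg, read_only) == 4, "field moved");
static_assert(offsetof(struct example_lock_msg, gpu_va) == 8, "field moved");

/* Normalise any non-zero input to 1 before it crosses the ABI boundary. */
static inline uint32_t abi_bool(int value)
{
	return value ? 1u : 0u;
}
```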

thanks,

greg k-h
Thomas Zimmermann May 20, 2020, 7:10 a.m. UTC | #12
Hi

Am 19.05.20 um 18:32 schrieb Sasha Levin:
> There is a blog post that goes into more detail about the bigger
> picture, and walks through all the required pieces to make this work. It
> is available here:
> https://devblogs.microsoft.com/directx/directx-heart-linux . The rest of
> this cover letter will focus on the Linux Kernel bits.

That's quite a surprise. Thanks for your efforts to contribute.

> 
> Overview
> ========
> 
> This is the first draft of the Microsoft Virtual GPU (vGPU) driver. The
> driver exposes a paravirtualized GPU to user mode applications running
> in a virtual machine on a Windows host. This enables hardware
> acceleration in environment such as WSL (Windows Subsystem for Linux)
> where the Linux virtual machine is able to share the GPU with the
> Windows host.
> 
> The projection is accomplished by exposing the WDDM (Windows Display
> Driver Model) interface as a set of IOCTL. This allows APIs and user
> mode driver written against the WDDM GPU abstraction on Windows to be
> ported to run within a Linux environment. This enables the port of the
> D3D12 and DirectML APIs as well as their associated user mode driver to
> Linux. This also enables third party APIs, such as the popular NVIDIA
> Cuda compute API, to be hardware accelerated within a WSL environment.
> 
> Only the rendering/compute aspect of the GPU are projected to the
> virtual machine, no display functionality is exposed. Further, at this
> time there are no presentation integration. So although the D3D12 API
> can be use to render graphics offscreen, there is no path (yet) for
> pixel to flow from the Linux environment back onto the Windows host
> desktop. This GPU stack is effectively side-by-side with the native
> Linux graphics stack.
> 
> The driver creates the /dev/dxg device, which can be opened by user mode
> application and handles their ioctls. The IOCTL interface to the driver
> is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
> definitions). The interface matches the D3DKMT interface on Windows.
> Ioctls are implemented in ioctl.c.

Echoing what others said, you're not making a DRM driver. The driver
should live outside of the DRM code.

I have one question about the driver API: on Windows, DirectX versions
are loosely tied to Windows releases. So I guess you can change the
kernel interface among DirectX versions?

If so, how would this work on Linux in the long term? If there ever is a
DirectX 13 or 14 with incompatible kernel interfaces, how would you plan
to update the Linux driver?

Best regards
Thomas

> 
> When a VM starts, hyper-v on the host adds virtual GPU devices to the VM
> via the hyper-v driver. The host offers several VM bus channels to the
> VM: the global channel and one channel per virtual GPU, assigned to the
> VM.
> 
> The driver registers with the hyper-v driver (hv_driver) for the arrival
> of VM bus channels. dxg_probe_device recognizes the vGPU channels and
> creates the corresponding objects (dxgadapter for vGPUs and dxgglobal
> for the global channel).
> 
> The driver uses the hyper-V VM bus interface to communicate with the
> host. dxgvmbus.c implements the communication interface.
> 
> The global channel has 8GB of IO space assigned by the host. This space
> is managed by the host and used to give the guest direct CPU access to
> some allocations. Video memory is allocated on the host except in the
> case of existing_sysmem allocations. The Windows host allocates memory
> for the GPU on behalf of the guest. The Linux guest can access that
> memory by mapping GPU virtual address to allocations and then
> referencing those GPU virtual address from within GPU command buffers
> submitted to the GPU. For allocations which require CPU access, the
> allocation is mapped by the host into a location in the 8GB of IO space
> reserved in the guest for that purpose. The Windows host uses the nested
> CPU page table to ensure that this guest IO space always map to the
> correct location for the allocation as it may migrate between dedicated
> GPU memory (e.g. VRAM, firmware reserved DDR) and shared system memory
> (regular DDR) over its lifetime. The Linux guest maps a user mode CPU
> virtual address to an allocation IO space range for direct access by
> user mode APIs and drivers.
> 
>  
> 
> Implementation of LX_DXLOCK2 ioctl
> ==================================
> 
> We would appreciate your feedback on the implementation of the
> LX_DXLOCK2 ioctl.
> 
> This ioctl is used to get a CPU address to an allocation, which is
> resident in video/system memory on the host. The way it works:
> 
> 1. The driver sends the Lock message to the host
> 
> 2. The host allocates space in the VM IO space and maps it to the
> allocation memory
> 
> 3. The host returns the address in IO space for the mapped allocation
> 
> 4. The driver (in dxg_map_iospace) allocates a user mode virtual address
> range using vm_mmap and maps it to the IO space using
> io_remap_pfn_range
> 
> 5. The VA is returned to the application
> 
>  
> 
> Internal objects
> ================
> 
> The following objects are created by the driver (defined in dxgkrnl.h):
> 
> - dxgadapter - represents a virtual GPU
> 
> - dxgprocess - tracks per process state (handle table of created
>   objects, list of objects, etc.)
> 
> - dxgdevice - a container for other objects (contexts, paging queues,
>   allocations, GPU synchronization objects)
> 
> - dxgcontext - represents thread of GPU execution for packet
>   scheduling.
> 
> - dxghwqueue - represents thread of GPU execution of hardware scheduling
> 
> - dxgallocation - represents a GPU accessible allocation
> 
> - dxgsyncobject - represents a GPU synchronization object
> 
> - dxgresource - collection of dxgallocation objects
> 
> - dxgsharedresource, dxgsharedsyncobj - helper objects to share objects
>   between different dxgdevice objects, which can belong to different
> processes
> 
> 
>  
> Object handles
> ==============
> 
> All GPU objects, created by the driver, are accessible by a handle
> (d3dkmt_handle). Each process has its own handle table, which is
> implemented in hmgr.c. For each API visible object, created by the
> driver, there is an object, created on the host. For example, the is a
> dxgprocess object on the host for each dxgprocess object in the VM, etc.
> The object handles have the same value in the host and the VM, which is
> done to avoid translation from the guest handles to the host handles.
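
A minimal sketch of a per-process handle table in the spirit of the
description above. It is not the hmgr.c implementation: the fixed
capacity, the function names, and the assumption that the handle value
is chosen elsewhere (for example by the host) and merely installed in
the guest table are all hypothetical. Because the same value is used on
both sides, lookup is a plain index with no translation step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 64   /* hypothetical fixed capacity */

/* One table per process: slot i holds the object for handle value i.
 * Slot 0 is never used, so 0 can serve as the invalid handle. */
struct handle_table {
	void *objects[TABLE_SLOTS];   /* NULL marks a free slot */
};

/* Install obj under a handle value chosen elsewhere (assumed: by the host).
 * Returns 0 on success, -1 if the value is out of range or already taken. */
static int handle_assign(struct handle_table *t, uint32_t h, void *obj)
{
	assert(t != NULL && obj != NULL);
	if (h == 0 || h >= TABLE_SLOTS || t->objects[h] != NULL)
		return -1;
	t->objects[h] = obj;
	return 0;
}

/* Translate a handle back to its object; NULL for free or bad handles. */
static void *handle_lookup(const struct handle_table *t, uint32_t h)
{
	assert(t != NULL);
	if (h == 0 || h >= TABLE_SLOTS)
		return NULL;
	return t->objects[h];
}

/* Release a handle so that its slot can be reused. */
static void handle_free(struct handle_table *t, uint32_t h)
{
	assert(t != NULL);
	if (h != 0 && h < TABLE_SLOTS)
		t->objects[h] = NULL;
}
```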
>  
> 
> 
> Signaling CPU events by the host
> ================================
> 
> The WDDM interface provides a way to signal CPU event objects when
> execution of a context reached certain point. The way it is implemented:
> 
> - application sends an event_fd via ioctl to the driver
> 
> - eventfd_ctx_get is used to get a pointer to the file object
>   (eventfd_ctx)
> 
> - the pointer to sent the host via a VM bus message
> 
> - when GPU execution reaches a certain point, the host sends a message
>   to the VM with the event pointer
> 
> - signal_guest_event() handles the messages and eventually
>   eventfd_signal() is called.
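
For readers unfamiliar with eventfd, here is a user-space sketch of the
signal/wait pair that the list above relies on. It does not involve the
driver or the VM bus: the helper names are hypothetical, and the write()
call stands in for the kernel-side eventfd_signal(). It only shows the
counter semantics an application would observe on the fd it passed in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Signal side: from user space, signalling an eventfd means writing an
 * 8-byte value that is added to the fd's counter. */
static int event_signal(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Wait side: read() blocks until the counter is non-zero, then returns
 * the accumulated count and resets the counter to zero. */
static int event_wait(int efd, uint64_t *count)
{
	assert(count != NULL);
	return read(efd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

An application would create the fd with eventfd(0, 0), hand it to the
driver, and then call event_wait() (or poll the fd) to learn that the
GPU reached the requested point.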
> 
> 
> Sasha Levin (4):
>   gpu: dxgkrnl: core code
>   gpu: dxgkrnl: hook up dxgkrnl
>   Drivers: hv: vmbus: hook up dxgkrnl
>   gpu: dxgkrnl: create a MAINTAINERS entry
> 
>  MAINTAINERS                      |    7 +
>  drivers/gpu/Makefile             |    2 +-
>  drivers/gpu/dxgkrnl/Kconfig      |   10 +
>  drivers/gpu/dxgkrnl/Makefile     |   12 +
>  drivers/gpu/dxgkrnl/d3dkmthk.h   | 1635 +++++++++
>  drivers/gpu/dxgkrnl/dxgadapter.c | 1399 ++++++++
>  drivers/gpu/dxgkrnl/dxgkrnl.h    |  913 ++++++
>  drivers/gpu/dxgkrnl/dxgmodule.c  |  692 ++++
>  drivers/gpu/dxgkrnl/dxgprocess.c |  355 ++
>  drivers/gpu/dxgkrnl/dxgvmbus.c   | 2955 +++++++++++++++++
>  drivers/gpu/dxgkrnl/dxgvmbus.h   |  859 +++++
>  drivers/gpu/dxgkrnl/hmgr.c       |  593 ++++
>  drivers/gpu/dxgkrnl/hmgr.h       |  107 +
>  drivers/gpu/dxgkrnl/ioctl.c      | 5269 ++++++++++++++++++++++++++++++
>  drivers/gpu/dxgkrnl/misc.c       |  280 ++
>  drivers/gpu/dxgkrnl/misc.h       |  288 ++
>  drivers/video/Kconfig            |    2 +
>  include/linux/hyperv.h           |   16 +
>  18 files changed, 15393 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/dxgkrnl/Kconfig
>  create mode 100644 drivers/gpu/dxgkrnl/Makefile
>  create mode 100644 drivers/gpu/dxgkrnl/d3dkmthk.h
>  create mode 100644 drivers/gpu/dxgkrnl/dxgadapter.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgkrnl.h
>  create mode 100644 drivers/gpu/dxgkrnl/dxgmodule.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgprocess.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.h
>  create mode 100644 drivers/gpu/dxgkrnl/hmgr.c
>  create mode 100644 drivers/gpu/dxgkrnl/hmgr.h
>  create mode 100644 drivers/gpu/dxgkrnl/ioctl.c
>  create mode 100644 drivers/gpu/dxgkrnl/misc.c
>  create mode 100644 drivers/gpu/dxgkrnl/misc.h
>
Steve Pronovost May 20, 2020, 7:42 a.m. UTC | #13
>Echoing what others said, you're not making a DRM driver. The driver should live outside of the DRM code.

Agreed, please see my earlier reply. We'll be moving the driver to a drivers/hyperv node or something similar. Apologies for the confusion here.

> I have one question about the driver API: on Windows, DirectX versions are loosely tied to Windows releases. So I guess you can change the kernel interface among DirectX versions?
> If so, how would this work on Linux in the long term? If there ever is a DirectX 13 or 14 with incompatible kernel interfaces, how would you plan to update the Linux driver?

You should think of the communication over the VM Bus for the vGPU projection as a strongly versioned interface. We will be keeping compatibility with older versions of that interface as it evolves over time so we can continue to run older guests (we already do). This protocol isn't actually tied to the DX API. It is a generic abstraction for the GPU that can be used for any API (for example, the NVIDIA CUDA driver that we announced goes over the same protocol to access the GPU).

New versions of user mode DX can either take advantage of, or sometimes require, new services from this kernel abstraction. This means that pulling a new version of user mode DX can mean having to also pull a new version of this vGPU kernel driver. For WSL, these essentially ship together. The kernel driver ships as part of our WSL2 Linux Kernel integration. User mode DX bits ship with Windows.
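
To illustrate what a strongly versioned interface with backward compatibility can look like, here is a minimal sketch. It is not the actual VM bus protocol: the struct, the function name, and the convention that versions start at 1 (with 0 meaning "no common version") are assumptions made for the example. Each side advertises the range of protocol versions it supports, and the newest version common to both is selected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical version range supported by one side of the protocol. */
struct proto_range {
	uint32_t min;   /* oldest version still supported */
	uint32_t max;   /* newest version implemented */
};

/* Return the highest version both sides support, or 0 if the ranges do
 * not overlap (versions are assumed to start at 1). */
static uint32_t negotiate_version(struct proto_range host,
				  struct proto_range guest)
{
	assert(host.min <= host.max && guest.min <= guest.max);

	uint32_t lo = host.min > guest.min ? host.min : guest.min;
	uint32_t hi = host.max < guest.max ? host.max : guest.max;

	return lo <= hi ? hi : 0;
}
```

Under this scheme an older guest keeps working as long as the host's minimum supported version does not move past the guest's maximum.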

-----Original Message-----
From: Thomas Zimmermann <tzimmermann@suse.de> 
Sent: Wednesday, May 20, 2020 12:11 AM
To: Sasha Levin <sashal@kernel.org>; alexander.deucher@amd.com; chris@chris-wilson.co.uk; ville.syrjala@linux.intel.com; Hawking.Zhang@amd.com; tvrtko.ursulin@intel.com
Cc: linux-kernel@vger.kernel.org; linux-hyperv@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; wei.liu@kernel.org; Steve Pronovost <spronovo@microsoft.com>; Iouri Tarassov <iourit@microsoft.com>; dri-devel@lists.freedesktop.org; linux-fbdev@vger.kernel.org; gregkh@linuxfoundation.org
Subject: [EXTERNAL] Re: [RFC PATCH 0/4] DirectX on Linux

Hi

Am 19.05.20 um 18:32 schrieb Sasha Levin:
> There is a blog post that goes into more detail about the bigger 
> picture, and walks through all the required pieces to make this work. 
> It is available here:
> https://devblogs.microsoft.com/directx/directx-heart-linux . The rest 
> of this cover letter will focus on the Linux Kernel bits.

That's quite a surprise. Thanks for your efforts to contribute.

> 
> Overview
> ========
> 
> This is the first draft of the Microsoft Virtual GPU (vGPU) driver. 
> The driver exposes a paravirtualized GPU to user mode applications 
> running in a virtual machine on a Windows host. This enables hardware 
> acceleration in environment such as WSL (Windows Subsystem for Linux) 
> where the Linux virtual machine is able to share the GPU with the 
> Windows host.
> 
> The projection is accomplished by exposing the WDDM (Windows Display 
> Driver Model) interface as a set of IOCTL. This allows APIs and user 
> mode driver written against the WDDM GPU abstraction on Windows to be 
> ported to run within a Linux environment. This enables the port of the
> D3D12 and DirectML APIs as well as their associated user mode driver 
> to Linux. This also enables third party APIs, such as the popular 
> NVIDIA Cuda compute API, to be hardware accelerated within a WSL environment.
> 
> Only the rendering/compute aspect of the GPU are projected to the 
> virtual machine, no display functionality is exposed. Further, at this 
> time there are no presentation integration. So although the D3D12 API 
> can be use to render graphics offscreen, there is no path (yet) for 
> pixel to flow from the Linux environment back onto the Windows host 
> desktop. This GPU stack is effectively side-by-side with the native 
> Linux graphics stack.
> 
> The driver creates the /dev/dxg device, which can be opened by user 
> mode application and handles their ioctls. The IOCTL interface to the 
> driver is defined in d3dkmthk.h (Dxgkrnl Graphics Port Driver ioctl
> definitions). The interface matches the D3DKMT interface on Windows.
> Ioctls are implemented in ioctl.c.

Echoing what others said, you're not making a DRM driver. The driver should live outside of the DRM code.

I have one question about the driver API: on Windows, DirectX versions are loosely tied to Windows releases. So I guess you can change the kernel interface among DirectX versions?

If so, how would this work on Linux in the long term? If there ever is a DirectX 13 or 14 with incompatible kernel interfaces, how would you plan to update the Linux driver?

Best regards
Thomas

> 
> When a VM starts, hyper-v on the host adds virtual GPU devices to the
> VM via the hyper-v driver. The host offers several VM bus channels to
> the VM: the global channel and one channel per virtual GPU, assigned
> to the VM.
> 
> The driver registers with the hyper-v driver (hv_driver) for the 
> arrival of VM bus channels. dxg_probe_device recognizes the vGPU 
> channels and creates the corresponding objects (dxgadapter for vGPUs 
> and dxgglobal for the global channel).
> 
> The driver uses the hyper-V VM bus interface to communicate with the 
> host. dxgvmbus.c implements the communication interface.
> 
> The global channel has 8GB of IO space assigned by the host. This 
> space is managed by the host and used to give the guest direct CPU 
> access to some allocations. Video memory is allocated on the host 
> except in the case of existing_sysmem allocations. The Windows host 
> allocates memory for the GPU on behalf of the guest. The Linux guest 
> can access that memory by mapping GPU virtual address to allocations 
> and then referencing those GPU virtual address from within GPU command 
> buffers submitted to the GPU. For allocations which require CPU 
> access, the allocation is mapped by the host into a location in the 
> 8GB of IO space reserved in the guest for that purpose. The Windows 
> host uses the nested CPU page table to ensure that this guest IO space 
> always map to the correct location for the allocation as it may 
> migrate between dedicated GPU memory (e.g. VRAM, firmware reserved 
> DDR) and shared system memory (regular DDR) over its lifetime. The 
> Linux guest maps a user mode CPU virtual address to an allocation IO 
> space range for direct access by user mode APIs and drivers.
> 
>  
> 
> Implementation of LX_DXLOCK2 ioctl
> ==================================
> 
> We would appreciate your feedback on the implementation of the
> LX_DXLOCK2 ioctl.
> 
> This ioctl is used to get a CPU address to an allocation, which is 
> resident in video/system memory on the host. The way it works:
> 
> 1. The driver sends the Lock message to the host
> 
> 2. The host allocates space in the VM IO space and maps it to the 
> allocation memory
> 
> 3. The host returns the address in IO space for the mapped allocation
> 
> 4. The driver (in dxg_map_iospace) allocates a user mode virtual 
> address range using vm_mmap and maps it to the IO space using
> io_remap_pfn_range
> 
> 5. The VA is returned to the application
> 
>  
> 
> Internal objects
> ================
> 
> The following objects are created by the driver (defined in dxgkrnl.h):
> 
> - dxgadapter - represents a virtual GPU
> 
> - dxgprocess - tracks per process state (handle table of created
>   objects, list of objects, etc.)
> 
> - dxgdevice - a container for other objects (contexts, paging queues,
>   allocations, GPU synchronization objects)
> 
> - dxgcontext - represents thread of GPU execution for packet
>   scheduling.
> 
> - dxghwqueue - represents thread of GPU execution of hardware 
> scheduling
> 
> - dxgallocation - represents a GPU accessible allocation
> 
> - dxgsyncobject - represents a GPU synchronization object
> 
> - dxgresource - collection of dxgallocation objects
> 
> - dxgsharedresource, dxgsharedsyncobj - helper objects to share objects
>   between different dxgdevice objects, which can belong to different 
> processes
> 
> 
>  
> Object handles
> ==============
> 
> All GPU objects created by the driver are accessible by a handle
> (d3dkmt_handle). Each process has its own handle table, which is
> implemented in hmgr.c. For each API visible object created by the
> driver, there is a corresponding object created on the host. For
> example, there is a dxgprocess object on the host for each dxgprocess
> object in the VM. The object handles have the same value in the host
> and the VM, which avoids translating guest handles to host handles.
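
A per-process handle table of this kind can be sketched as a flat array indexed by handle value. This is a minimal illustration and not the code in hmgr.c: the fixed capacity, the handle_t typedef and all function names are assumptions. The point it shows is that the guest stores each object under the handle value chosen by the host, so no guest-to-host translation is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HANDLE_TABLE_SIZE 64u  /* hypothetical fixed capacity */

typedef uint32_t handle_t;     /* stand-in for d3dkmt_handle; 0 is invalid */

struct handle_table {
    void *objects[HANDLE_TABLE_SIZE];  /* slot h-1 holds the object for handle h */
};

/* Start with an empty table. */
static void handle_table_init(struct handle_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Store obj under the handle value supplied by the host. */
static int handle_table_assign(struct handle_table *t, handle_t h, void *obj)
{
    if (h == 0 || h > HANDLE_TABLE_SIZE || obj == NULL)
        return -1;
    if (t->objects[h - 1] != NULL)
        return -1;  /* handle already in use */
    t->objects[h - 1] = obj;
    return 0;
}

/* Return the object for a handle, or NULL if the handle is unknown. */
static void *handle_table_lookup(const struct handle_table *t, handle_t h)
{
    if (h == 0 || h > HANDLE_TABLE_SIZE)
        return NULL;
    return t->objects[h - 1];
}

/* Remove a handle and return the object it referred to, or NULL. */
static void *handle_table_free(struct handle_table *t, handle_t h)
{
    void *obj = handle_table_lookup(t, h);

    if (obj != NULL)
        t->objects[h - 1] = NULL;
    return obj;
}
```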
>  
> 
> 
> Signaling CPU events by the host
> ================================
> 
> The WDDM interface provides a way to signal CPU event objects when 
> execution of a context reaches a certain point. The way it is implemented:
> 
> - application sends an event_fd via ioctl to the driver
> 
> - eventfd_ctx_get is used to get a pointer to the file object
>   (eventfd_ctx)
> 
> - the pointer is sent to the host via a VM bus message
> 
> - when GPU execution reaches a certain point, the host sends a message
>   to the VM with the event pointer
> 
> - signal_guest_event() handles the messages and eventually
>   eventfd_signal() is called.
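
From the application's side, the event object in the list above is an ordinary Linux eventfd. The sketch below is Linux-only and shows creating one and waiting on it. The ioctl that hands the descriptor to the driver is not shown, and event_signal() is a user-space stand-in (an assumption, for illustration) for the kernel calling eventfd_signal() when the host message arrives. All three function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the event object the application would pass to the driver. */
static int event_create(void)
{
    return eventfd(0, EFD_CLOEXEC);
}

/* Stand-in for the driver signaling the event: adds 1 to the counter. */
static int event_signal(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Block until the event has been signaled at least once. Returns the
 * number of signals consumed (the counter resets to zero), or 0 on error.
 */
static uint64_t event_wait(int efd)
{
    uint64_t count = 0;

    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

Because the eventfd counter accumulates, several signals delivered before the application waits are reported by a single read.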
> 
> 
> Sasha Levin (4):
>   gpu: dxgkrnl: core code
>   gpu: dxgkrnl: hook up dxgkrnl
>   Drivers: hv: vmbus: hook up dxgkrnl
>   gpu: dxgkrnl: create a MAINTAINERS entry
> 
>  MAINTAINERS                      |    7 +
>  drivers/gpu/Makefile             |    2 +-
>  drivers/gpu/dxgkrnl/Kconfig      |   10 +
>  drivers/gpu/dxgkrnl/Makefile     |   12 +
>  drivers/gpu/dxgkrnl/d3dkmthk.h   | 1635 +++++++++
>  drivers/gpu/dxgkrnl/dxgadapter.c | 1399 ++++++++
>  drivers/gpu/dxgkrnl/dxgkrnl.h    |  913 ++++++
>  drivers/gpu/dxgkrnl/dxgmodule.c  |  692 ++++
>  drivers/gpu/dxgkrnl/dxgprocess.c |  355 ++
>  drivers/gpu/dxgkrnl/dxgvmbus.c   | 2955 +++++++++++++++++
>  drivers/gpu/dxgkrnl/dxgvmbus.h   |  859 +++++
>  drivers/gpu/dxgkrnl/hmgr.c       |  593 ++++
>  drivers/gpu/dxgkrnl/hmgr.h       |  107 +
>  drivers/gpu/dxgkrnl/ioctl.c      | 5269 ++++++++++++++++++++++++++++++
>  drivers/gpu/dxgkrnl/misc.c       |  280 ++
>  drivers/gpu/dxgkrnl/misc.h       |  288 ++
>  drivers/video/Kconfig            |    2 +
>  include/linux/hyperv.h           |   16 +
>  18 files changed, 15393 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/dxgkrnl/Kconfig
>  create mode 100644 drivers/gpu/dxgkrnl/Makefile
>  create mode 100644 drivers/gpu/dxgkrnl/d3dkmthk.h
>  create mode 100644 drivers/gpu/dxgkrnl/dxgadapter.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgkrnl.h
>  create mode 100644 drivers/gpu/dxgkrnl/dxgmodule.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgprocess.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.c
>  create mode 100644 drivers/gpu/dxgkrnl/dxgvmbus.h
>  create mode 100644 drivers/gpu/dxgkrnl/hmgr.c
>  create mode 100644 drivers/gpu/dxgkrnl/hmgr.h
>  create mode 100644 drivers/gpu/dxgkrnl/ioctl.c
>  create mode 100644 drivers/gpu/dxgkrnl/misc.c
>  create mode 100644 drivers/gpu/dxgkrnl/misc.h
> 

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Felix Imendörffer
Jan Engelhardt May 20, 2020, 10:37 a.m. UTC | #14
On Tuesday 2020-05-19 22:36, Sasha Levin wrote:
>
>> - Why DX12 on linux? Looking at this feels like classic divide and
>
> There is a single usecase for this: WSL2 developer who wants to run
> machine learning on his GPU. The developer is working on his laptop,
> which is running Windows and that laptop has a single GPU that Windows
> is using.

It does not feel right conceptually. If the target is a Windows API
(DX12/ML), why bother with Linux environments? Make it a Windows executable,
thereby skipping the WSL translation layer and passthrough.
Thomas Zimmermann May 20, 2020, 11:06 a.m. UTC | #15
Hi Steve,

thank you for the fast reply.

On 20.05.20 at 09:42, Steve Pronovost wrote:
>> Echoing what others said, you're not making a DRM driver. The driver should live outside of the DRM code.
> 
> Agreed, please see my earlier reply. We'll be moving the driver to drivers/hyperv node or something similar. Apologies for the confusion here.
> 
>> I have one question about the driver API: on Windows, DirectX versions are loosely tied to Windows releases. So I guess you can change the kernel interface among DirectX versions?
>> If so, how would this work on Linux in the long term? If there ever is a DirectX 13 or 14 with incompatible kernel interfaces, how would you plan to update the Linux driver?
> 
> You should think of the communication over the VM Bus for the vGPU projection as a strongly versioned interface. We will keep compatibility with older versions of that interface as it evolves over time so we can continue to run older guests (we already do). This protocol isn't actually tied to the DX API. It is a generic abstraction for the GPU that can be used for any API (for example, the NVIDIA CUDA driver that we announced goes over the same protocol to access the GPU).
> 
> New versions of user mode DX can either take advantage of, or sometimes require, new services from this kernel abstraction. This means that pulling a new version of user mode DX can mean having to also pull a new version of this vGPU kernel driver. For WSL, these essentially ship together. The kernel driver ships as part of our WSL2 Linux Kernel integration. User mode DX bits ship with Windows.
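
One common way to realize a strongly versioned interface like the one described above is for each side to advertise the range of protocol versions it supports and to settle on the highest version in the overlap. The sketch below illustrates that idea only. It is an assumption, not the actual VM bus protocol: struct version_range, negotiate_version() and the convention that version numbers start at 1 are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range of protocol versions one side can speak (versions >= 1). */
struct version_range {
    uint32_t min;
    uint32_t max;
};

/*
 * Return the highest version supported by both guest and host,
 * or 0 if the two ranges do not overlap.
 */
static uint32_t negotiate_version(struct version_range guest,
                                  struct version_range host)
{
    uint32_t lo, hi;

    assert(guest.min <= guest.max && host.min <= host.max);
    lo = guest.min > host.min ? guest.min : host.min;
    hi = guest.max < host.max ? guest.max : host.max;
    return lo <= hi ? hi : 0;
}
```

Under this scheme an older guest keeps working as long as the host's minimum supported version does not rise above the guest's maximum.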

Just a piece of friendly advice: maintaining a proprietary component within a
Linux environment is tough. You will need a good plan for long-term
interface stability and compatibility with the other components.

Best regards
Thomas

Steve Pronovost May 20, 2020, 3:34 p.m. UTC | #16
[resending as plain text, sorry about that]

Thanks Daniel, more below.

From: Daniel Vetter <daniel@ffwll.ch>
Sent: Wednesday, May 20, 2020 12:41 AM
To: Steve Pronovost <spronovo@microsoft.com>
Cc: Dave Airlie <airlied@gmail.com>; Sasha Levin <sashal@kernel.org>; linux-hyperv@vger.kernel.org; Stephen Hemminger <sthemmin@microsoft.com>; Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Greg Kroah-Hartman <gregkh@linuxfoundation.org>; Haiyang Zhang <haiyangz@microsoft.com>; LKML <linux-kernel@vger.kernel.org>; dri-devel <dri-devel@lists.freedesktop.org>; Chris Wilson <chris@chris-wilson.co.uk>; Linux Fbdev development list <linux-fbdev@vger.kernel.org>; Iouri Tarassov <iourit@microsoft.com>; Deucher, Alexander <alexander.deucher@amd.com>; KY Srinivasan <kys@microsoft.com>; Wei Liu <wei.liu@kernel.org>; Hawking Zhang <Hawking.Zhang@amd.com>
Subject: Re: [EXTERNAL] Re: [RFC PATCH 0/4] DirectX on Linux

Hi Steve,

Sounds all good, some more comments and details below.

On Wed, May 20, 2020 at 5:47 AM Steve Pronovost <spronovo@microsoft.com> wrote:
Hey guys,

Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best 
Pavel Machek June 16, 2020, 10:51 a.m. UTC | #17
Hi!

> > The driver creates the /dev/dxg device, which can be opened by user mode
> > application and handles their ioctls. The IOCTL interface to the driver
> > is defined in dxgkmthk.h (Dxgkrnl Graphics Port Driver ioctl
> > definitions). The interface matches the D3DKMT interface on Windows.
> > Ioctls are implemented in ioctl.c.
> 
> Echoing what others said, you're not making a DRM driver. The driver should live outside 
> of the DRM code.
> 

Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
we already have DRM API on Linux. We don't want another one, do we?

And at the very least... this misses API docs for /dev/dxg. Code can't really 
be reviewed without that.

Best regards,
										Pavel
Pavel Machek June 16, 2020, 10:51 a.m. UTC | #18
Hi!

> Thanks for the discussion. I may not be able to immediately answer all of your questions, but I'll do my best.
> 

Could you do something with your email settings? Because this is not how you should use
email on lkml. "[EXTERNAL]" in the subject, top-posting, unwrapped lines...

Thank you,
									Pavel
Pavel Machek June 16, 2020, 10:51 a.m. UTC | #19
> > Having said that, I hit one stumbling block:
> > "Further, at this time there are no presentation integration. "
> >
> > If we upstream this driver as-is into some hyperv specific place, and
> > you decide to add presentation integration this is more than likely
> > going to mean you will want to interact with dma-bufs and dma-fences.
> > If the driver is hidden away in a hyperv place it's likely we won't
> > even notice that feature landing until it's too late.
> >
> > I would like to see a coherent plan for presentation support (not
> > code, just an architectural diagram), because I think when you
> > contemplate how that works it will change the picture of how this
> > driver looks and integrates into the rest of the Linux graphics
> > ecosystem.
> >
> > As-is I'd rather this didn't land under my purview, since I don't see
> > the value this adds to the Linux ecosystem at all, and I think it's
> > important when putting a burden on upstream that you provide some
> > value.
> 
> I also have another concern: from a legal standpoint I'd rather not
> review the ioctl part of this. I'd probably request that other DRI
> developers abstain as well.
> 
> This is a Windows kernel API being smashed into a Linux driver. I don't want to be 
> tainted by knowledge of an API that I've no idea of the legal status of derived works. 
> (is this all covered patent wise under OIN?)

If you can't look at it, perhaps it is not suitable to merge into kernel...?

What would be legal requirements so this is "safe to look at"? We should really
require submitter to meet them...

									Pavel
Sasha Levin June 16, 2020, 1:21 p.m. UTC | #20
On Tue, Jun 16, 2020 at 12:51:56PM +0200, Pavel Machek wrote:
>> > Having said that, I hit one stumbling block:
>> > "Further, at this time there are no presentation integration. "
>> >
>> > If we upstream this driver as-is into some hyperv specific place, and
>> > you decide to add presentation integration this is more than likely
>> > going to mean you will want to interact with dma-bufs and dma-fences.
>> > If the driver is hidden away in a hyperv place it's likely we won't
>> > even notice that feature landing until it's too late.
>> >
>> > I would like to see a coherent plan for presentation support (not
>> > code, just an architectural diagram), because I think when you
>> > contemplate how that works it will change the picture of how this
>> > driver looks and integrates into the rest of the Linux graphics
>> > ecosystem.
>> >
>> > As-is I'd rather this didn't land under my purview, since I don't see
>> > the value this adds to the Linux ecosystem at all, and I think it's
>> > important when putting a burden on upstream that you provide some
>> > value.
>>
>> I also have another concern: from a legal standpoint I'd rather not
>> review the ioctl part of this. I'd probably request that other DRI
>> developers abstain as well.
>>
>> This is a Windows kernel API being smashed into a Linux driver. I don't want to be
>> tainted by knowledge of an API that I've no idea of the legal status of derived works.
>> (is this all covered patent wise under OIN?)
>
>If you can't look at it, perhaps it is not suitable to merge into kernel...?
>
>What would be legal requirements so this is "safe to look at"? We should really
>require submitter to meet them...

Could you walk me through your view on what the function of the
"Signed-off-by" tag is?
Sasha Levin June 16, 2020, 1:28 p.m. UTC | #21
On Tue, Jun 16, 2020 at 12:51:13PM +0200, Pavel Machek wrote:
>Hi!
>
>> > The driver creates the /dev/dxg device, which can be opened by user mode
>> > application and handles their ioctls. The IOCTL interface to the driver
>> > is defined in dxgkmthk.h (Dxgkrnl Graphics Port Driver ioctl
>> > definitions). The interface matches the D3DKMT interface on Windows.
>> > Ioctls are implemented in ioctl.c.
>>
>> Echoing what others said, you're not making a DRM driver. The driver should live outside
>> of the DRM code.
>>
>
>Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
>we already have DRM API on Linux. We don't want another one, do we?

This driver doesn't have any display functionality.

>And at the very least... this misses API docs for /dev/dxg. Code can't really
>be reviewed without that.

The docs live here: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/
Pavel Machek June 16, 2020, 2:41 p.m. UTC | #22
On Tue 2020-06-16 09:28:19, Sasha Levin wrote:
> On Tue, Jun 16, 2020 at 12:51:13PM +0200, Pavel Machek wrote:
> > Hi!
> > 
> > > > The driver creates the /dev/dxg device, which can be opened by user mode
> > > > application and handles their ioctls. The IOCTL interface to the driver
> > > > is defined in dxgkmthk.h (Dxgkrnl Graphics Port Driver ioctl
> > > > definitions). The interface matches the D3DKMT interface on Windows.
> > > > Ioctls are implemented in ioctl.c.
> > > 
> > > Echoing what others said, you're not making a DRM driver. The driver should live outside
> > > of the DRM code.
> > > 
> > 
> > Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
> > we already have DRM API on Linux. We don't want another one, do we?
> 
> This driver doesn't have any display functionality.

Graphics cards without displays connected are quite common. I may be
wrong, but I believe we normally handle them using DRM...

> > And at the very least... this misses API docs for /dev/dxg. Code can't really
> > be reviewed without that.
> 
> The docs live here: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/

I don't see "/dev/dxg" being mentioned there. Plus, kernel API
documentation should really go to Documentation, and be suitably
licensed.
									Pavel
Sasha Levin June 16, 2020, 4 p.m. UTC | #23
On Tue, Jun 16, 2020 at 04:41:22PM +0200, Pavel Machek wrote:
>On Tue 2020-06-16 09:28:19, Sasha Levin wrote:
>> On Tue, Jun 16, 2020 at 12:51:13PM +0200, Pavel Machek wrote:
>> > Hi!
>> >
>> > > > The driver creates the /dev/dxg device, which can be opened by user mode
>> > > > application and handles their ioctls. The IOCTL interface to the driver
>> > > > is defined in dxgkmthk.h (Dxgkrnl Graphics Port Driver ioctl
>> > > > definitions). The interface matches the D3DKMT interface on Windows.
>> > > > Ioctls are implemented in ioctl.c.
>> > >
>> > > Echoing what others said, you're not making a DRM driver. The driver should live outside
>> > > of the DRM code.
>> > >
>> >
>> > Actually, this sounds to me like "this should not be merged into linux kernel". I mean,
>> > we already have DRM API on Linux. We don't want another one, do we?
>>
>> This driver doesn't have any display functionality.
>
>Graphics cards without displays connected are quite common. I may be
>wrong, but I believe we normally handle them using DRM...

This is more similar to the accelerators that live in drivers/misc/
right now.

>> > And at the very least... this misses API docs for /dev/dxg. Code can't really
>> > be reviewed without that.
>>
>> The docs live here: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/
>
>I don't see "/dev/dxg" being mentioned there. Plus, kernel API

Right, this is because this entire codebase is just a pipe to the API
I've linked; it doesn't implement anything new on its own.

>documentation should really go to Documentation, and be suitably
>licensed.

While I don't mind copying the docs into Documentation, I'm concerned
that over time they will diverge from the docs on the website. This is
similar to how other documentation (such as the virtio spec) lives out of
tree to avoid these issues.

w.r.t the licensing, again: this was sent under GPL2 (note the SPDX tags
in each file), and the patches carry a S-O-B by someone who was a
Microsoft employee at the time the patches were sent.
James Hilliard June 28, 2020, 11:39 p.m. UTC | #24
On Tue, May 19, 2020 at 2:36 PM Sasha Levin <sashal@kernel.org> wrote:
>
> Hi Daniel,
>
> On Tue, May 19, 2020 at 09:21:15PM +0200, Daniel Vetter wrote:
> >Hi Sasha
> >
> >So obviously great that Microsoft is trying to upstream all this, and
> >very much welcome and all that.
> >
> >But I guess there's a bunch of rather fundamental issues before we
> >look into any kind of code details. And that might make this quite a
> >hard sell for upstream to drivers/gpu subsystem:
>
> Let me preface my answers by saying that speaking personally I very much
> dislike that the userspace is closed and wish I could do something about
> it.
>
> >- From the blog it sounds like the userspace is all closed. That
> >includes the hw specific part and compiler chunks, all stuff we've
> >generally expected to be able to look in the past for any kind of
> >other driver. It's even documented here:
> >
> >https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
> >
> >What's your plan here?
>
> Let me answer with a (genuine) question: does this driver have anything
> to do with DRM even after we enable graphics on it? I'm still trying to
> figure it out.
>
> There is an open source DX12 Gallium driver (that lives here:
> https://gitlab.freedesktop.org/kusma/mesa/-/tree/msclc-d3d12) with open
> source compiler and so on.
>
> The plan is for Microsoft to provide shims to allow the existing Linux
> userspace interact with DX12; I'll explain below why we had to pipe DX12
> all the way into the Linux guest, but this is *not* to introduce DX12
> into the Linux world as competition. There is no intent for anyone in
> the Linux world to start coding for the DX12 API.
If that really is the case, why is Microsoft recommending that developers
break compatibility with native Linux and use the DX12 APIs here:
https://devblogs.microsoft.com/directx/in-the-works-opencl-and-opengl-mapping-layers-to-directx/

Quote:
"Make it easier for developers to port their apps to D3D12. For developers
looking to move from older OpenCL and OpenGL API versions to D3D12,
the open source mapping layers will provide helpful example code on how
to use the D3D12 Translation Layer library."

If developers of applications that use OpenCL and OpenGL APIs were to
follow this advice and transition to D3D12 their applications would no longer
work on Linux systems unless using WSL2. Is Microsoft planning on creating
a D3D12/DirectML frontend that doesn't depend on WSL2?
>
> This is why I'm not sure whether this touches DRM on the Linux side of
> things. Nothing is actually rendered on Linux but rather piped to
> Windows to be done there.
>
> >btw since the main goal here (at least at first) seems to be get
> >compute and ML going the official work-around here is to relabel your
> >driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
> >entire thing or so) and then Olof and Greg will take it into
> >drivers/accel ...
>
> This submission is not a case of "we want it upstream NOW" but rather
> "let's work together to figure out how to do it right" :)
>
> I thought about placing this driver in drivers/hyper-v/ given that it's
> basically just a pipe between the host and the guest. There is no fancy
> logic in this driver. Maybe the right place is indeed drivers/accel or
> drivers/hyper-v but I'd love if we agree on that rather than doing that
> as a workaround and 6 months down the road enabling graphics.
>
> >- Next up (but that's not really a surprise for a fresh vendor driver)
> >at a more technical level, this seems to reinvent the world, from
> >device enumeration (why is this not exposed as /dev/dri/card0 so it
> >better integrates with existing linux desktop stuff, in case that
> >becomes a goal ever) down to reinvented kref_put_mutex (and please
> >look at drm_device->struct_mutex for an example of how bad of a
> >nightmare that locking pattern is and how many years it took us to
> >untangle that one.
>
> I'd maybe note that none of us here at Microsoft is an expert in the
> Linux DRM world. Stuff might have been done in a certain way because we
> didn't know better.
>
> >- Why DX12 on linux? Looking at this feels like classic divide and
>
> There is a single usecase for this: WSL2 developer who wants to run
> machine learning on his GPU. The developer is working on his laptop,
> which is running Windows and that laptop has a single GPU that Windows
> is using.
>
> Since the GPU is being used by Windows, we can't assign it directly to
> the Linux guest, but instead we can use GPU Partitioning to give the
> guest access to the GPU. This means that the guest needs to be able to
> "speak" DX12, which is why we pulled DX12 into Linux.
>
> >conquer (or well triple E from the 90s), we have vk, we have
> >drm_syncobj, we have an entire ecosystem of winsys layers that work
> >across vendors. Is the plan here that we get a dx12 driver for other
> >hw mesa drivers from you guys, so this is all consistent and we have a
> >nice linux platform? How does this integrate everywhere else with
> >linux winsys standards, like dma-buf for passing stuff around,
> >dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
> >some idea how it all meshes together?
>
> Let me point you to this blog post that has more information about the
> graphics side of things:
> https://www.collabora.com/news-and-blog/news-and-events/introducing-opencl-and-opengl-on-directx.html
> .
>
> The intent is to wrap DX12 with shims to work with the existing
> ecosystem; DX12 isn't a new player on its own and thus isn't trying to
> divide/conquer anything.
Shouldn't tensorflow/machine learning be going through the opencl
compatibility layer/shims instead of talking directly to DX12/DirectML?

If tensorflow or any other machine learning software uses DX12 APIs
directly then they won't be compatible with Linux unless running on top
of WSL2.
>
> >- There's been a pile of hallway track/private discussions about
> >moving on from the buffer-based memory managed model to something more
> >modern. That relates to your DXLOCK2 question, but there's a lot more
> >to userspace managed gpu memory residency than just that. monitored
> >fences are another part. Also, to avoid a platform split we need to
> >figure out how to tie this back into the dma-buf and dma-fence
> >(including various uapi flavours) or it'll be made of fail. dx12 has
> >all that in some form, except 0 integration with the linux stuff we
> >have (no surprise, since linux isn't windows). Finally if we go to the
> >trouble of a completely revamped I think ioctls aren't a great idea,
> >something like iouring (the gossip name is drm_uring) would be a lot
> >better. Also for easier paravirt we'd need 0 cpu pointers in any such
> >new interface. Adding a few people who've been involved in these
> >discussions thus far, mostly under a drm/hmm.ko heading iirc.
> >
> >I think the above are the really big ticket items around what's the
> >plan here and are we solving even the right problem.
>
> Part of the reason behind this implementation is simplicity. Again, no
> objections around moving to uring and doing other improvements.
>
> --
> Thanks,
> Sasha
>
>
>