Message ID | 20230621100435.54425-1-thomas.hellstrom@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | [v2] Documentation/gpu: Add a VM_BIND async draft document
On Wed, Jun 21, 2023 at 12:04:35PM +0200, Thomas Hellström wrote:
> Add a motivation for and description of asynchronous VM_BIND operation
>
> v2:
> - Fix typos (Nirmoy Das)
> - Improve the description of a memory fence (Oak Zeng)
> - Add a reference to the document in the Xe RFC.
> - Add pointers to sample uAPI suggestions
>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Acked-by: Nirmoy Das <nirmoy.das@intel.com>
> ---
>  Documentation/gpu/drm-vm-bind-async.rst | 145 ++++++++++++++++++++++++
>  Documentation/gpu/rfc/xe.rst            |   4 +-
>  2 files changed, 147 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/gpu/drm-vm-bind-async.rst
>
> diff --git a/Documentation/gpu/drm-vm-bind-async.rst b/Documentation/gpu/drm-vm-bind-async.rst
> new file mode 100644
> index 000000000000..69aff250b62f
> --- /dev/null
> +++ b/Documentation/gpu/drm-vm-bind-async.rst
> @@ -0,0 +1,145 @@
> +====================
> +Asynchronous VM_BIND
> +====================
> +
> +Nomenclature:
> +=============
> +
> +* VRAM: On-device memory. Sometimes referred to as device local memory.
> +
> +* vm: A GPU address space. Typically per process, but can be shared by
> +  multiple processes.

Rather obvious, but maybe specify as "GPU virtual address space" or just
"GPU VA space".

Personally, I don't like "vm" as an abbreviation for "GPU VA space" or
"virtual address space" in general, but it is commonly used and I fail to
find a better one to be honest.

> +
> +* VM_BIND: An operation or a list of operations to modify a vm using
> +  an IOCTL. The operations include mapping and unmapping system- or
> +  VRAM memory.
> +
> +* syncobj: A container that abstracts synchronization objects. The
> +  synchronization objects can be either generic, like dma-fences or
> +  driver specific. A syncobj typically indicates the type of the
> +  underlying synchronization object.
> +
> +* in-syncobj: Argument to a VM_BIND IOCTL, the VM_BIND operation waits
> +  for these before starting.
> +
> +* out-syncobj: Argument to a VM_BIND IOCTL, the VM_BIND operation
> +  signals these when the bind operation is complete.
> +
> +* memory fence: A synchronization object, different from a dma-fence.
> +  A memory fence uses the value of a specified memory location to determine
> +  signaled status. A memory fence can be awaited and signaled by both
> +  the GPU and CPU. Memory fences are sometimes referred to as
> +  user-fences, and do not necessarily obey the dma-fence rule of
> +  signalling within a "reasonable amount of time". The kernel should
> +  thus avoid waiting for memory fences with locks held.
> +
> +* long-running workload: A workload that may take more than the
> +  current stipulated dma-fence maximum signal delay to complete and
> +  which therefore needs to set the VM or the GPU execution context in
> +  a certain mode that disallows completion dma-fences.
> +
> +* UMD: User-mode driver.
> +
> +* KMD: Kernel-mode driver.
> +
> +
> +Synchronous / Asynchronous VM_BIND operation
> +============================================
> +
> +Synchronous VM_BIND
> +___________________
> +With Synchronous VM_BIND, the VM_BIND operations all complete before the
> +ioctl returns. A synchronous VM_BIND takes neither in-fences nor

Just some nit-picking: IOCTL is mostly written in caps, maybe you want to
stick to either of the two variants.

> +out-fences. Synchronous VM_BIND may block and wait for GPU operations;
> +for example swapin or clearing, or even previous binds.
> +
> +Asynchronous VM_BIND
> +____________________
> +Asynchronous VM_BIND accepts both in-syncobjs and out-syncobjs. While the
> +IOCTL may return immediately, the VM_BIND operations wait for the in-syncobjs
> +before modifying the GPU page-tables, and signal the out-syncobjs when
> +the modification is done in the sense that the next execbuf that

Maybe add "execbuf" to the nomenclature.

> +waits for the out-syncobjs will see the change. Errors are reported
> +synchronously, assuming that the asynchronous part of the job never errors.
> +In low-memory situations the implementation may block, performing the
> +VM_BIND synchronously, because there might not be enough memory
> +immediately available for preparing the asynchronous operation.
> +
> +If the VM_BIND IOCTL takes a list or an array of operations as an argument,
> +the in-syncobjs need to signal before the first operation starts to
> +execute, and the out-syncobjs signal after the last operation
> +completes. Operations in the operation list can be assumed, where it
> +matters, to complete in order.
> +
> +To aid in supporting user-space queues, the VM_BIND may take a bind context

I think "bind context" should also be explained in the nomenclature.

> +AKA bind engine identifier argument. All VM_BIND operations using the same
> +bind engine can then be assumed, where it matters, to complete in
> +order. No such assumptions can be made between VM_BIND operations
> +using separate bind contexts.
> +
> +The purpose of an Asynchronous VM_BIND operation is for user-mode
> +drivers to be able to pipeline interleaved vm modifications and
> +execbufs. For long-running workloads, such pipelining of a bind
> +operation is not allowed and any in-fences need to be awaited
> +synchronously.
> +
> +Also, for VM_BINDs targeting long-running VMs, the user-mode driver should
> +typically select memory fences as out-fences, since that gives greater
> +flexibility for the kernel mode driver to inject other operations into the
> +bind / unbind operations, like for example inserting breakpoints into batch
> +buffers. The workload execution can then easily be pipelined behind
> +the bind completion, using the memory out-fence as the signal condition
> +for a GPU semaphore embedded by UMD in the workload.
> +
> +Multi-operation VM_BIND IOCTL error handling and interrupts
> +============================================================
> +
> +The VM_BIND operations of the ioctl may error due to lack of resources
> +to complete and also due to interrupted waits. In both situations UMD
> +should preferably restart the IOCTL after taking suitable action. If
> +UMD has overcommitted a memory resource, an -ENOSPC error will be
> +returned, and UMD may then unbind resources that are not used at the
> +moment and restart the IOCTL. On -EINTR, UMD should simply restart the
> +IOCTL, and on -ENOMEM user-space may either attempt to free known
> +system memory resources or abort the operation. If aborting as a
> +result of a failed operation in a list of operations, some operations
> +may still have completed, and to get back to a known state, user-space
> +should therefore attempt to unbind all virtual memory regions touched
> +by the failing IOCTL.
> +Unbind operations are guaranteed not to cause any errors due to
> +resource constraints.
> +In between a failed VM_BIND ioctl and a successful restart there may
> +be implementation defined restrictions on the use of the VM. For a
> +description why, please see KMD implementation details under [error
> +state saving]_.
> +
> +Sample uAPI implementations
> +==============================
> +Suggested uAPI implementations at the moment of writing can be found for
> +the Nouveau driver `here <https://patchwork.freedesktop.org/patch/543260/?series=112994&rev=6>`__
> +and for the Xe driver `here <https://cgit.freedesktop.org/drm/drm-xe/diff/include/uapi/drm/xe_drm.h?h=drm-xe-next&id=9cb016ebbb6a275f57b1cb512b95d5a842391ad7>`__.
> +
> +KMD implementation details
> +==========================

Maybe we can mention the GPUVA manager as a helper for implementing such an
interface.

Anyway, I will surely add a link pointing to this document to the
documentation of the GPUVA manager.

- Danilo

> +
> +.. [error state saving] Open: When the VM_BIND ioctl returns an error, some
> +                        operations, or even parts of an operation, may have been
> +                        completed. If the ioctl is restarted, in order
> +                        to know where to restart, the KMD can
> +                        either put the VM in an error state and save
> +                        one instance of the needed restart state
> +                        internally. In this case, KMD needs to block
> +                        further modifications of the VM state that may
> +                        cause additional failures requiring a restart
> +                        state save, until the error has been fully resolved.
> +                        If the uAPI instead defines a pointer to a
> +                        UMD allocated cookie in the IOCTL struct, it
> +                        could also choose to store the restart state
> +                        in that cookie.
> +
> +                        The restart state may, for example, be the
> +                        number of successfully completed operations.
> +
> +                        Easiest for UMD would of course be if KMD did
> +                        a full unwind on error so that no error state
> +                        needs to be saved.
> diff --git a/Documentation/gpu/rfc/xe.rst b/Documentation/gpu/rfc/xe.rst
> index 2516fe141db6..0f062e1346d2 100644
> --- a/Documentation/gpu/rfc/xe.rst
> +++ b/Documentation/gpu/rfc/xe.rst
> @@ -138,8 +138,8 @@ memory fences. Ideally with helper support so people don't get it wrong in all
>  possible ways.
>  
>  As a key measurable result, the benefits of ASYNC VM_BIND and a discussion of
> -various flavors, error handling and a sample API should be documented here or in
> -a separate document pointed to by this document.
> +various flavors, error handling and sample API suggestions are documented in
> +Documentation/gpu/drm-vm-bind-async.rst
>  
>  Userptr integration and vm_bind
>  -------------------------------
> -- 
> 2.40.1
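
To make the interface shape under discussion easier to picture, here is a
minimal C sketch of what an asynchronous VM_BIND IOCTL argument could look
like. It is purely illustrative and is not the actual Xe or Nouveau uAPI;
every identifier below (struct hypo_vm_bind, exec_queue_id, and so on) is a
hypothetical name invented for this example.

```c
/* Hypothetical uAPI sketch -- not the actual Xe or Nouveau interface. */
#include <linux/types.h>

struct hypo_vm_bind_op {
	__u32 op;            /* HYPO_OP_MAP or HYPO_OP_UNMAP */
	__u32 handle;        /* GEM handle backing the mapping (map only) */
	__u64 obj_offset;    /* Offset into the GEM object */
	__u64 addr;          /* GPU virtual address of the mapping */
	__u64 range;         /* Size of the mapping in bytes */
};

struct hypo_vm_bind {
	__u32 vm_id;             /* The GPU VA space (vm) to modify */
	__u32 exec_queue_id;     /* Bind context / bind engine: operations
				  * on the same queue complete in order;
				  * no ordering across queues. */
	__u64 ops;               /* Userspace pointer to an array of
				  * struct hypo_vm_bind_op */
	__u32 num_ops;
	__u32 num_in_syncobjs;   /* Waited for before the first op starts */
	__u64 in_syncobjs;       /* Userspace pointer to syncobj handles */
	__u32 num_out_syncobjs;  /* Signaled after the last op completes */
	__u32 pad;
	__u64 out_syncobjs;      /* Userspace pointer to syncobj handles */
};
```

A synchronous bind in this sketch would simply pass zero in- and
out-syncobjs, matching the document's statement that synchronous VM_BIND
takes neither in-fences nor out-fences.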
Thanks for reviewing.

On Wed, 2023-06-21 at 13:23 +0200, Danilo Krummrich wrote:
> On Wed, Jun 21, 2023 at 12:04:35PM +0200, Thomas Hellström wrote:
> > Add a motivation for and description of asynchronous VM_BIND
> > operation
> > 
> > v2:
> > - Fix typos (Nirmoy Das)
> > - Improve the description of a memory fence (Oak Zeng)
> > - Add a reference to the document in the Xe RFC.
> > - Add pointers to sample uAPI suggestions
> > 
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Acked-by: Nirmoy Das <nirmoy.das@intel.com>
> > ---
> >  Documentation/gpu/drm-vm-bind-async.rst | 145
> > ++++++++++++++++++++++++
> >  Documentation/gpu/rfc/xe.rst            |   4 +-
> >  2 files changed, 147 insertions(+), 2 deletions(-)
> >  create mode 100644 Documentation/gpu/drm-vm-bind-async.rst
> > 
> > diff --git a/Documentation/gpu/drm-vm-bind-async.rst
> > b/Documentation/gpu/drm-vm-bind-async.rst
> > new file mode 100644
> > index 000000000000..69aff250b62f
> > --- /dev/null
> > +++ b/Documentation/gpu/drm-vm-bind-async.rst
> > @@ -0,0 +1,145 @@
> > +====================
> > +Asynchronous VM_BIND
> > +====================
> > +
> > +Nomenclature:
> > +=============
> > +
> > +* VRAM: On-device memory. Sometimes referred to as device local
> > memory.
> > +
> > +* vm: A GPU address space. Typically per process, but can be
> > shared by
> > +  multiple processes.
> 
> Rather obvious, but maybe specify as "GPU virtual address space" or
> just
> "GPU VA space".
> 
> Personally, I don't like "vm" as an abbreviation for "GPU VA space"
> or "virtual
> address space" in general, but it is commonly used and I fail to find
> a better
> one to be honest.

I have another upcoming document related to VM_BIND locking variants /
userptr integration and there I call it gpu_vm. It would make sense to
unify the naming in the documents anyway. I don't have a strong opinion
either as long as it's not too long.

> 
> > +
> > +* VM_BIND: An operation or a list of operations to modify a vm
> > using
> > +  an IOCTL. The operations include mapping and unmapping system-
> > or
> > +  VRAM memory.
> > +
> > +* syncobj: A container that abstracts synchronization objects. The
> > +  synchronization objects can be either generic, like dma-fences
> > or
> > +  driver specific. A syncobj typically indicates the type of the
> > +  underlying synchronization object.
> > +
> > +* in-syncobj: Argument to a VM_BIND IOCTL, the VM_BIND operation
> > waits
> > +  for these before starting.
> > +
> > +* out-syncobj: Argument to a VM_BIND IOCTL, the VM_BIND operation
> > +  signals these when the bind operation is complete.
> > +
> > +* memory fence: A synchronization object, different from a dma-
> > fence.
> > +  A memory fence uses the value of a specified memory location to
> > determine
> > +  signaled status. A memory fence can be awaited and signaled by
> > both
> > +  the GPU and CPU. Memory fences are sometimes referred to as
> > +  user-fences, and do not necessarily obey the dma-fence rule of
> > +  signalling within a "reasonable amount of time". The kernel
> > should
> > +  thus avoid waiting for memory fences with locks held.
> > +
> > +* long-running workload: A workload that may take more than the
> > +  current stipulated dma-fence maximum signal delay to complete
> > and
> > +  which therefore needs to set the VM or the GPU execution context
> > in
> > +  a certain mode that disallows completion dma-fences.
> > +
> > +* UMD: User-mode driver.
> > +
> > +* KMD: Kernel-mode driver.
> > +
> > +
> > +Synchronous / Asynchronous VM_BIND operation
> > +============================================
> > +
> > +Synchronous VM_BIND
> > +___________________
> > +With Synchronous VM_BIND, the VM_BIND operations all complete
> > before the
> > +ioctl returns. A synchronous VM_BIND takes neither in-fences nor
> 
> Just some nit-picking: IOCTL is mostly written in caps, maybe you
> want to stick
> to either of the two variants.

Sure,

> 
> > +out-fences. Synchronous VM_BIND may block and wait for GPU
> > operations;
> > +for example swapin or clearing, or even previous binds.
> > +
> > +Asynchronous VM_BIND
> > +____________________
> > +Asynchronous VM_BIND accepts both in-syncobjs and out-syncobjs.
> > While the
> > +IOCTL may return immediately, the VM_BIND operations wait for the
> > in-syncobjs
> > +before modifying the GPU page-tables, and signal the out-syncobjs
> > when
> > +the modification is done in the sense that the next execbuf that
> 
> Maybe add "execbuf" to the nomenclature.
> 
> > +waits for the out-syncobjs will see the change. Errors are
> > reported
> > +synchronously, assuming that the asynchronous part of the job never
> > errors.
> > +In low-memory situations the implementation may block, performing
> > the
> > +VM_BIND synchronously, because there might not be enough memory
> > +immediately available for preparing the asynchronous operation.
> > +
> > +If the VM_BIND IOCTL takes a list or an array of operations as an
> > argument,
> > +the in-syncobjs need to signal before the first operation starts
> > to
> > +execute, and the out-syncobjs signal after the last operation
> > +completes. Operations in the operation list can be assumed, where
> > it
> > +matters, to complete in order.
> > +
> > +To aid in supporting user-space queues, the VM_BIND may take a
> > bind context
> 
> I think "bind context" should also be explained in the nomenclature.
> 
> > +AKA bind engine identifier argument. All VM_BIND operations using
> > the same
> > +bind engine can then be assumed, where it matters, to complete in
> > +order. No such assumptions can be made between VM_BIND operations
> > +using separate bind contexts.
> > +
> > +The purpose of an Asynchronous VM_BIND operation is for user-mode
> > +drivers to be able to pipeline interleaved vm modifications and
> > +execbufs. For long-running workloads, such pipelining of a bind
> > +operation is not allowed and any in-fences need to be awaited
> > +synchronously.
> > +
> > +Also, for VM_BINDs targeting long-running VMs, the user-mode driver
> > should typically
> > +select memory fences as out-fences, since that gives greater
> > flexibility for
> > +the kernel mode driver to inject other operations into the bind /
> > +unbind operations, like for example inserting breakpoints into
> > batch
> > +buffers. The workload execution can then easily be pipelined
> > behind
> > +the bind completion, using the memory out-fence as the signal
> > condition
> > +for a GPU semaphore embedded by UMD in the workload.
> > +
> > +Multi-operation VM_BIND IOCTL error handling and interrupts
> > +============================================================
> > +
> > +The VM_BIND operations of the ioctl may error due to lack of
> > resources
> > +to complete and also due to interrupted waits. In both situations
> > UMD
> > +should preferably restart the IOCTL after taking suitable action.
> > If
> > +UMD has overcommitted a memory resource, an -ENOSPC error will be
> > +returned, and UMD may then unbind resources that are not used at
> > the
> > +moment and restart the IOCTL. On -EINTR, UMD should simply restart
> > the
> > +IOCTL, and on -ENOMEM user-space may either attempt to free known
> > +system memory resources or abort the operation. If aborting as a
> > +result of a failed operation in a list of operations, some
> > operations
> > +may still have completed, and to get back to a known state, user-
> > space
> > +should therefore attempt to unbind all virtual memory regions
> > touched
> > +by the failing IOCTL.
> > +Unbind operations are guaranteed not to cause any errors due to
> > +resource constraints.
> > +In between a failed VM_BIND ioctl and a successful restart there
> > may
> > +be implementation defined restrictions on the use of the VM. For a
> > +description why, please see KMD implementation details under
> > [error
> > +state saving]_.
> > +
> > +Sample uAPI implementations
> > +==============================
> > +Suggested uAPI implementations at the moment of writing can be
> > found for
> > +the Nouveau driver `here
> > <https://patchwork.freedesktop.org/patch/543260/?series=112994&rev=6>`__
> > +and for the Xe driver `here
> > <https://cgit.freedesktop.org/drm/drm-xe/diff/include/uapi/drm/xe_drm.h?h=drm-xe-next&id=9cb016ebbb6a275f57b1cb512b95d5a842391ad7>`__.
> > +
> > +KMD implementation details
> > +==========================
> 
> Maybe we can mention the GPUVA manager as a helper for implementing
> such an
> interface.

Sure, will add this as well.

Thanks,
Thomas

> 
> Anyway, I will surely add a link pointing to this document to the
> documentation
> of the GPUVA manager.
> 
> - Danilo
> 
> > +
> > +.. [error state saving] Open: When the VM_BIND ioctl returns an
> > error, some
> > +                        operations, or even parts of an operation,
> > may have been
> > +                        completed. If the ioctl is restarted, in
> > order
> > +                        to know where to restart, the KMD can
> > +                        either put the VM in an error state and
> > save
> > +                        one instance of the needed restart state
> > +                        internally. In this case, KMD needs to
> > block
> > +                        further modifications of the VM state that
> > may
> > +                        cause additional failures requiring a
> > restart
> > +                        state save, until the error has been fully
> > resolved.
> > +                        If the uAPI instead defines a pointer to a
> > +                        UMD allocated cookie in the IOCTL struct,
> > it
> > +                        could also choose to store the restart
> > state
> > +                        in that cookie.
> > +
> > +                        The restart state may, for example, be the
> > +                        number of successfully completed
> > operations.
> > +
> > +                        Easiest for UMD would of course be if KMD
> > did
> > +                        a full unwind on error so that no error
> > state
> > +                        needs to be saved.
> > diff --git a/Documentation/gpu/rfc/xe.rst
> > b/Documentation/gpu/rfc/xe.rst
> > index 2516fe141db6..0f062e1346d2 100644
> > --- a/Documentation/gpu/rfc/xe.rst
> > +++ b/Documentation/gpu/rfc/xe.rst
> > @@ -138,8 +138,8 @@ memory fences. Ideally with helper support so
> > people don't get it wrong in all
> >  possible ways.
> >  
> >  As a key measurable result, the benefits of ASYNC VM_BIND and a
> > discussion of
> > -various flavors, error handling and a sample API should be
> > documented here or in
> > -a separate document pointed to by this document.
> > +various flavors, error handling and sample API suggestions are
> > documented in
> > +Documentation/gpu/drm-vm-bind-async.rst
> > 
> >  Userptr integration and vm_bind
> >  -------------------------------
> > -- 
> > 2.40.1
> > 
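
The memory-fence semantics quoted in the exchange above can be made concrete
with a short sketch. This is an illustration under stated assumptions, not
kernel code: the fence is modeled as a 64-bit seqno in memory shared between
CPU and GPU, and the names mem_fence_signal/mem_fence_signaled are invented
for this example.

```c
/* Illustrative memory-fence (user-fence) sketch; names are hypothetical. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct mem_fence {
	_Atomic uint64_t seqno;	/* Lives in memory visible to CPU and GPU */
};

/* Signaling is a plain memory write; the GPU would perform the
 * equivalent write from its command stream. */
static void mem_fence_signal(struct mem_fence *f, uint64_t value)
{
	atomic_store_explicit(&f->seqno, value, memory_order_release);
}

/* A waiter polls the location. Nothing bounds how long this may take,
 * which is exactly why the kernel should not wait on such a fence with
 * locks held. */
static bool mem_fence_signaled(struct mem_fence *f, uint64_t wait_value)
{
	return atomic_load_explicit(&f->seqno, memory_order_acquire) >=
	       wait_value;
}
```

A GPU semaphore embedded by UMD in a workload would test the same memory
location, which is how workload execution can be pipelined behind an
asynchronous bind whose out-fence is such a memory fence.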
Hi Thomas,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm/drm-next drm-tip/drm-tip next-20230623]
[cannot apply to linus/master v6.4-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Thomas-Hellstr-m/Documentation-gpu-Add-a-VM_BIND-async-draft-document/20230621-180712
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:    https://lore.kernel.org/r/20230621100435.54425-1-thomas.hellstrom%40linux.intel.com
patch subject: [PATCH v2] Documentation/gpu: Add a VM_BIND async draft document
reproduce: (https://download.01.org/0day-ci/archive/20230625/202306251715.VhtKaEVg-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306251715.VhtKaEVg-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> Documentation/gpu/drm-vm-bind-async.rst:95: WARNING: Title underline too short.
>> Documentation/gpu/drm-vm-bind-async.rst: WARNING: document isn't included in any toctree

vim +95 Documentation/gpu/drm-vm-bind-async.rst

    93	
    94	Multi-operation VM_BIND IOCTL error handling and interrupts
  > 95	========================================
    96	
diff --git a/Documentation/gpu/drm-vm-bind-async.rst b/Documentation/gpu/drm-vm-bind-async.rst
new file mode 100644
index 000000000000..69aff250b62f
--- /dev/null
+++ b/Documentation/gpu/drm-vm-bind-async.rst
@@ -0,0 +1,145 @@
+====================
+Asynchronous VM_BIND
+====================
+
+Nomenclature:
+=============
+
+* VRAM: On-device memory. Sometimes referred to as device local memory.
+
+* vm: A GPU address space. Typically per process, but can be shared by
+  multiple processes.
+
+* VM_BIND: An operation or a list of operations to modify a vm using
+  an IOCTL. The operations include mapping and unmapping system- or
+  VRAM memory.
+
+* syncobj: A container that abstracts synchronization objects. The
+  synchronization objects can be either generic, like dma-fences or
+  driver specific. A syncobj typically indicates the type of the
+  underlying synchronization object.
+
+* in-syncobj: Argument to a VM_BIND IOCTL, the VM_BIND operation waits
+  for these before starting.
+
+* out-syncobj: Argument to a VM_BIND IOCTL, the VM_BIND operation
+  signals these when the bind operation is complete.
+
+* memory fence: A synchronization object, different from a dma-fence.
+  A memory fence uses the value of a specified memory location to determine
+  signaled status. A memory fence can be awaited and signaled by both
+  the GPU and CPU. Memory fences are sometimes referred to as
+  user-fences, and do not necessarily obey the dma-fence rule of
+  signalling within a "reasonable amount of time". The kernel should
+  thus avoid waiting for memory fences with locks held.
+
+* long-running workload: A workload that may take more than the
+  current stipulated dma-fence maximum signal delay to complete and
+  which therefore needs to set the VM or the GPU execution context in
+  a certain mode that disallows completion dma-fences.
+
+* UMD: User-mode driver.
+
+* KMD: Kernel-mode driver.
+
+
+Synchronous / Asynchronous VM_BIND operation
+============================================
+
+Synchronous VM_BIND
+___________________
+With Synchronous VM_BIND, the VM_BIND operations all complete before the
+ioctl returns. A synchronous VM_BIND takes neither in-fences nor
+out-fences. Synchronous VM_BIND may block and wait for GPU operations;
+for example swapin or clearing, or even previous binds.
+
+Asynchronous VM_BIND
+____________________
+Asynchronous VM_BIND accepts both in-syncobjs and out-syncobjs. While the
+IOCTL may return immediately, the VM_BIND operations wait for the in-syncobjs
+before modifying the GPU page-tables, and signal the out-syncobjs when
+the modification is done in the sense that the next execbuf that
+waits for the out-syncobjs will see the change. Errors are reported
+synchronously, assuming that the asynchronous part of the job never errors.
+In low-memory situations the implementation may block, performing the
+VM_BIND synchronously, because there might not be enough memory
+immediately available for preparing the asynchronous operation.
+
+If the VM_BIND IOCTL takes a list or an array of operations as an argument,
+the in-syncobjs need to signal before the first operation starts to
+execute, and the out-syncobjs signal after the last operation
+completes. Operations in the operation list can be assumed, where it
+matters, to complete in order.
+
+To aid in supporting user-space queues, the VM_BIND may take a bind context
+AKA bind engine identifier argument. All VM_BIND operations using the same
+bind engine can then be assumed, where it matters, to complete in
+order. No such assumptions can be made between VM_BIND operations
+using separate bind contexts.
+
+The purpose of an Asynchronous VM_BIND operation is for user-mode
+drivers to be able to pipeline interleaved vm modifications and
+execbufs. For long-running workloads, such pipelining of a bind
+operation is not allowed and any in-fences need to be awaited
+synchronously.
+
+Also, for VM_BINDs targeting long-running VMs, the user-mode driver should
+typically select memory fences as out-fences, since that gives greater
+flexibility for the kernel mode driver to inject other operations into the
+bind / unbind operations, like for example inserting breakpoints into batch
+buffers. The workload execution can then easily be pipelined behind
+the bind completion, using the memory out-fence as the signal condition
+for a GPU semaphore embedded by UMD in the workload.
+
+Multi-operation VM_BIND IOCTL error handling and interrupts
+============================================================
+
+The VM_BIND operations of the ioctl may error due to lack of resources
+to complete and also due to interrupted waits. In both situations UMD
+should preferably restart the IOCTL after taking suitable action. If
+UMD has overcommitted a memory resource, an -ENOSPC error will be
+returned, and UMD may then unbind resources that are not used at the
+moment and restart the IOCTL. On -EINTR, UMD should simply restart the
+IOCTL, and on -ENOMEM user-space may either attempt to free known
+system memory resources or abort the operation. If aborting as a
+result of a failed operation in a list of operations, some operations
+may still have completed, and to get back to a known state, user-space
+should therefore attempt to unbind all virtual memory regions touched
+by the failing IOCTL.
+Unbind operations are guaranteed not to cause any errors due to
+resource constraints.
+In between a failed VM_BIND ioctl and a successful restart there may
+be implementation defined restrictions on the use of the VM. For a
+description why, please see KMD implementation details under [error
+state saving]_.
+
+Sample uAPI implementations
+==============================
+Suggested uAPI implementations at the moment of writing can be found for
+the Nouveau driver `here <https://patchwork.freedesktop.org/patch/543260/?series=112994&rev=6>`__
+and for the Xe driver `here <https://cgit.freedesktop.org/drm/drm-xe/diff/include/uapi/drm/xe_drm.h?h=drm-xe-next&id=9cb016ebbb6a275f57b1cb512b95d5a842391ad7>`__.
+
+KMD implementation details
+==========================
+
+.. [error state saving] Open: When the VM_BIND ioctl returns an error, some
+                        operations, or even parts of an operation, may have been
+                        completed. If the ioctl is restarted, in order
+                        to know where to restart, the KMD can
+                        either put the VM in an error state and save
+                        one instance of the needed restart state
+                        internally. In this case, KMD needs to block
+                        further modifications of the VM state that may
+                        cause additional failures requiring a restart
+                        state save, until the error has been fully resolved.
+                        If the uAPI instead defines a pointer to a
+                        UMD allocated cookie in the IOCTL struct, it
+                        could also choose to store the restart state
+                        in that cookie.
+
+                        The restart state may, for example, be the
+                        number of successfully completed operations.
+
+                        Easiest for UMD would of course be if KMD did
+                        a full unwind on error so that no error state
+                        needs to be saved.
diff --git a/Documentation/gpu/rfc/xe.rst b/Documentation/gpu/rfc/xe.rst
index 2516fe141db6..0f062e1346d2 100644
--- a/Documentation/gpu/rfc/xe.rst
+++ b/Documentation/gpu/rfc/xe.rst
@@ -138,8 +138,8 @@ memory fences. Ideally with helper support so people don't get it wrong in all
 possible ways.
 
 As a key measurable result, the benefits of ASYNC VM_BIND and a discussion of
-various flavors, error handling and a sample API should be documented here or in
-a separate document pointed to by this document.
+various flavors, error handling and sample API suggestions are documented in
+Documentation/gpu/drm-vm-bind-async.rst
 
 Userptr integration and vm_bind
 -------------------------------
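
As a closing illustration of the error-handling rules in the document above,
here is a hedged user-space sketch of the prescribed restart policy. Only the
errno policy itself (-EINTR: restart, -ENOSPC: unbind unused resources and
restart, -ENOMEM: free known system memory or abort, and unbinding all touched
regions on abort) comes from the document; drm_hypo_vm_bind() and the other
helpers are invented names, reusing the hypothetical struct hypo_vm_bind from
the earlier sketch.

```c
/* Illustrative UMD restart loop; all helper names are hypothetical. */
#include <errno.h>
#include <stdbool.h>

int vm_bind_with_restart(int fd, struct hypo_vm_bind *args)
{
	for (;;) {
		if (drm_hypo_vm_bind(fd, args) == 0)
			return 0;

		switch (errno) {
		case EINTR:
			/* Interrupted wait: simply restart the IOCTL. */
			continue;
		case ENOSPC:
			/* Overcommitted memory resource: unbind resources
			 * not currently in use, then restart. */
			if (unbind_unused_resources(fd, args->vm_id))
				continue;
			break;
		case ENOMEM:
			/* Attempt to free known system memory resources,
			 * or abort if nothing can be freed. */
			if (free_cached_memory())
				continue;
			break;
		}

		/* Aborting: some operations in the list may already have
		 * completed, so unbind every VA region touched by the
		 * failed IOCTL to get back to a known state. */
		unbind_touched_regions(fd, args);
		return -1;
	}
}
```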
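
The UMD-allocated restart cookie mentioned in the document's [error state
saving] note could, under the same assumptions, look like the following
sketch. Using the number of completed operations as the restart state follows
the document's own suggestion; everything else is an invented name, not an
existing uAPI.

```c
#include <linux/types.h>

/* Hypothetical restart cookie for a multi-op VM_BIND. */
struct hypo_vm_bind_cookie {
	__u32 num_completed;	/* Operations fully applied before the error */
	__u32 pad;
};

/* UMD side: after a failed multi-op VM_BIND, skip the operations the
 * kernel reports as already completed, then restart the IOCTL. */
static void hypo_vm_bind_advance(struct hypo_vm_bind *args,
				 const struct hypo_vm_bind_cookie *cookie)
{
	args->ops += (__u64)cookie->num_completed *
		     sizeof(struct hypo_vm_bind_op);
	args->num_ops -= cookie->num_completed;
}
```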