[v4,09/10] userfaultfd: update documentation to describe minor fault handling

Message ID 20210204183433.1431202-10-axelrasmussen@google.com (mailing list archive)
State New, archived
Series userfaultfd: add minor fault handling

Commit Message

Axel Rasmussen Feb. 4, 2021, 6:34 p.m. UTC
Reword / reorganize things a little bit into "lists", so new features /
modes / ioctls can sort of just be appended.

Describe how UFFDIO_REGISTER_MODE_MINOR and UFFDIO_CONTINUE can be used
to intercept and resolve minor faults. Make it clear that COPY and
ZEROPAGE are used for MISSING faults, whereas CONTINUE is used for MINOR
faults.

Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
---
 Documentation/admin-guide/mm/userfaultfd.rst | 107 ++++++++++++-------
 1 file changed, 66 insertions(+), 41 deletions(-)
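
Reader's note, not part of the patch: the MISSING/MINOR split described
in the commit message can be sketched from the userspace side roughly as
below. This is a minimal sketch under stated assumptions, not the
series' own code. The enum and the names uffd_classify() and
uffd_continue() are hypothetical. The fallback value for
UFFD_PAGEFAULT_FLAG_MINOR and the layout of struct uffdio_continue
assume the UAPI header as it later appeared in mainline, which may
differ from this v4 posting.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Assumption: headers that predate minor fault support lack this flag.
 * The fallback value mirrors the mainline UAPI header.
 */
#ifndef UFFD_PAGEFAULT_FLAG_MINOR
#define UFFD_PAGEFAULT_FLAG_MINOR (1 << 2)
#endif

/* Hypothetical classification of a message read from the uffd. */
enum uffd_resolution {
	UFFD_RESOLVE_NONE,	/* not a page fault event */
	UFFD_RESOLVE_MISSING,	/* resolve with UFFDIO_COPY or UFFDIO_ZEROPAGE */
	UFFD_RESOLVE_MINOR,	/* resolve with UFFDIO_CONTINUE */
	UFFD_RESOLVE_OTHER,	/* e.g. write-protect faults, not covered here */
};

/* Decide how to resolve a fault by examining pagefault.flags. */
enum uffd_resolution uffd_classify(const struct uffd_msg *msg)
{
	assert(msg != NULL);

	if (msg->event != UFFD_EVENT_PAGEFAULT)
		return UFFD_RESOLVE_NONE;
	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_MINOR)
		return UFFD_RESOLVE_MINOR;
	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)
		return UFFD_RESOLVE_OTHER;
	return UFFD_RESOLVE_MISSING;
}

#ifdef UFFDIO_CONTINUE
/*
 * Ask the kernel to map the existing page cache pages in [start, start + len)
 * and wake the faulting thread (mode 0 means no DONTWAKE).
 * Returns the ioctl result: 0 on success, -1 with errno set on failure.
 */
int uffd_continue(int uffd, __u64 start, __u64 len)
{
	struct uffdio_continue cont = {
		.range = { .start = start, .len = len },
		.mode = 0,
	};

	return ioctl(uffd, UFFDIO_CONTINUE, &cont);
}
#endif
```

A handler thread would call uffd_classify() on each struct uffd_msg it
reads, fix up the page contents through a second mapping if needed, and
only then call uffd_continue() on the faulting range. Both start and len
have to be aligned to the (huge) page size of the registered area.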

Comments

Randy Dunlap Feb. 4, 2021, 7:57 p.m. UTC | #1
Hi Axel-

one typo found:

On 2/4/21 10:34 AM, Axel Rasmussen wrote:
> Reword / reorganize things a little bit into "lists", so new features /
> modes / ioctls can sort of just be appended.

Good plan.

> 
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
> ---
>  Documentation/admin-guide/mm/userfaultfd.rst | 107 ++++++++++++-------
>  1 file changed, 66 insertions(+), 41 deletions(-)
> 
> diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
> index 65eefa66c0ba..cfd3daf59d0e 100644
> --- a/Documentation/admin-guide/mm/userfaultfd.rst
> +++ b/Documentation/admin-guide/mm/userfaultfd.rst

[snip]

> -
> -Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
> -be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
> -register a memory range in the ``userfaultfd`` by setting the
> +events, except page fault notifications, may be generated:
> +
> +- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various other events
> +  other than page faults are supported. These events are described in more
> +  detail below in the `Non-cooperative userfaultfd`_ section.
> +
> +- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
> +  indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
> +  registrations for hugetlbfs and shared memory (covering all shmem APIs,
> +  i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
> +  etc) virtual memory areas, respectively.
> +
> +- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
> +  ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
> +  areas.
> +
> +The userland application should set the feature flags it intends to use

(ah, userspace has moved to userland temporarily. :)

> +when envoking the ``UFFDIO_API`` ioctl, to request that those features be

        invoking

> +enabled if supported.


thanks.

Axel Rasmussen Feb. 4, 2021, 9:04 p.m. UTC | #2
On Thu, Feb 4, 2021 at 11:57 AM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> Hi Axel-
>
> one typo found:
>
> On 2/4/21 10:34 AM, Axel Rasmussen wrote:
> > Reword / reorganize things a little bit into "lists", so new features /
> > modes / ioctls can sort of just be appended.
>
> Good plan.
>
> >
> > Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
> > ---
> >  Documentation/admin-guide/mm/userfaultfd.rst | 107 ++++++++++++-------
> >  1 file changed, 66 insertions(+), 41 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
> > index 65eefa66c0ba..cfd3daf59d0e 100644
> > --- a/Documentation/admin-guide/mm/userfaultfd.rst
> > +++ b/Documentation/admin-guide/mm/userfaultfd.rst
>
> [snip]
>
> > -
> > -Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
> > -be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
> > -register a memory range in the ``userfaultfd`` by setting the
> > +events, except page fault notifications, may be generated:
> > +
> > +- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various other events
> > +  other than page faults are supported. These events are described in more
> > +  detail below in the `Non-cooperative userfaultfd`_ section.
> > +
> > +- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
> > +  indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
> > +  registrations for hugetlbfs and shared memory (covering all shmem APIs,
> > +  i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
> > +  etc) virtual memory areas, respectively.
> > +
> > +- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
> > +  ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
> > +  areas.
> > +
> > +The userland application should set the feature flags it intends to use
>
> (ah, userspace has moved to userland temporarily. :)

For better or worse, other parts of the document I'm not touching also
use this wording. Maybe we should s/userland/userspace/g, but perhaps
better done as a separate commit to keep this diff focused?
Anecdotally, the use of "userland" doesn't seem to be completely
unprecedented (e.g. grep -r "userland" | wc -l yields 566 matches in
the kernel tree).

I don't have strong feelings, and I was amused by picturing some
Shire-esque countryside with a friendly sign that reads: ~userland
welcomes you~. :)

>
> > +when envoking the ``UFFDIO_API`` ioctl, to request that those features be
>
>         invoking

Whoops! Will send a new version with this fix. Thanks!

>
> > +enabled if supported.
>
>
> thanks.
> --
> ~Randy
>

Randy Dunlap Feb. 4, 2021, 9:07 p.m. UTC | #3
On 2/4/21 1:04 PM, Axel Rasmussen wrote:
> On Thu, Feb 4, 2021 at 11:57 AM Randy Dunlap <rdunlap@infradead.org> wrote:
>>
>> Hi Axel-
>>
>> one typo found:
>>
>> On 2/4/21 10:34 AM, Axel Rasmussen wrote:
>>> Reword / reorganize things a little bit into "lists", so new features /
>>> modes / ioctls can sort of just be appended.
>>
>> Good plan.
>>
>>>
>>> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
>>> ---
>>>  Documentation/admin-guide/mm/userfaultfd.rst | 107 ++++++++++++-------
>>>  1 file changed, 66 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
>>> index 65eefa66c0ba..cfd3daf59d0e 100644
>>> --- a/Documentation/admin-guide/mm/userfaultfd.rst
>>> +++ b/Documentation/admin-guide/mm/userfaultfd.rst
>>
>> [snip]
>>
>>> -
>>> -Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
>>> -be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
>>> -register a memory range in the ``userfaultfd`` by setting the
>>> +events, except page fault notifications, may be generated:
>>> +
>>> +- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various other events
>>> +  other than page faults are supported. These events are described in more
>>> +  detail below in the `Non-cooperative userfaultfd`_ section.
>>> +
>>> +- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
>>> +  indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
>>> +  registrations for hugetlbfs and shared memory (covering all shmem APIs,
>>> +  i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
>>> +  etc) virtual memory areas, respectively.
>>> +
>>> +- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
>>> +  ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
>>> +  areas.
>>> +
>>> +The userland application should set the feature flags it intends to use
>>
>> (ah, userspace has moved to userland temporarily. :)
> 
> For better or worse, other parts of the document I'm not touching also
> use this wording. Maybe we should s/userland/userspace/g, but perhaps
> better done as a separate commit to keep this diff focused?
> Anecdotally, the use of "userland" doesn't seem to be completely
> unprecedented (e.g. grep -r "userland" | wc -l yields 566 matches in
> the kernel tree).
> 
> I don't have strong feelings, and I was amused by picturing some
> Shire-esque countryside with a friendly sign that reads: ~userland
> welcomes you~. :)

I'm OK with not changing it. Up to you.

Patch

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 65eefa66c0ba..cfd3daf59d0e 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -63,36 +63,36 @@  the generic ioctl available.
 
 The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
 defines what memory types are supported by the ``userfaultfd`` and what
-events, except page fault notifications, may be generated.
-
-If the kernel supports registering ``userfaultfd`` ranges on hugetlbfs
-virtual memory areas, ``UFFD_FEATURE_MISSING_HUGETLBFS`` will be set in
-``uffdio_api.features``. Similarly, ``UFFD_FEATURE_MISSING_SHMEM`` will be
-set if the kernel supports registering ``userfaultfd`` ranges on shared
-memory (covering all shmem APIs, i.e. tmpfs, ``IPCSHM``, ``/dev/zero``,
-``MAP_SHARED``, ``memfd_create``, etc).
-
-The userland application that wants to use ``userfaultfd`` with hugetlbfs
-or shared memory need to set the corresponding flag in
-``uffdio_api.features`` to enable those features.
-
-If the userland desires to receive notifications for events other than
-page faults, it has to verify that ``uffdio_api.features`` has appropriate
-``UFFD_FEATURE_EVENT_*`` bits set. These events are described in more
-detail below in `Non-cooperative userfaultfd`_ section.
-
-Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
-be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
-register a memory range in the ``userfaultfd`` by setting the
+events, except page fault notifications, may be generated:
+
+- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various other events
+  other than page faults are supported. These events are described in more
+  detail below in the `Non-cooperative userfaultfd`_ section.
+
+- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
+  indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
+  registrations for hugetlbfs and shared memory (covering all shmem APIs,
+  i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
+  etc) virtual memory areas, respectively.
+
+- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
+  ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
+  areas.
+
+The userland application should set the feature flags it intends to use
+when envoking the ``UFFDIO_API`` ioctl, to request that those features be
+enabled if supported.
+
+Once the ``userfaultfd`` API has been enabled the ``UFFDIO_REGISTER``
+ioctl should be invoked (if present in the returned ``uffdio_api.ioctls``
+bitmask) to register a memory range in the ``userfaultfd`` by setting the
 uffdio_register structure accordingly. The ``uffdio_register.mode``
 bitmask will specify to the kernel which kind of faults to track for
-the range (``UFFDIO_REGISTER_MODE_MISSING`` would track missing
-pages). The ``UFFDIO_REGISTER`` ioctl will return the
+the range. The ``UFFDIO_REGISTER`` ioctl will return the
 ``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
 userfaults on the range registered. Not all ioctls will necessarily be
-supported for all memory types depending on the underlying virtual
-memory backend (anonymous memory vs tmpfs vs real filebacked
-mappings).
+supported for all memory types (e.g. anonymous memory vs. shmem vs.
+hugetlbfs), or all types of intercepted faults.
 
 Userland can use the ``uffdio_register.ioctls`` to manage the virtual
 address space in the background (to add or potentially also remove
@@ -100,21 +100,46 @@  memory from the ``userfaultfd`` registered range). This means a userfault
 could be triggering just before userland maps in the background the
 user-faulted page.
 
-The primary ioctl to resolve userfaults is ``UFFDIO_COPY``. That
-atomically copies a page into the userfault registered range and wakes
-up the blocked userfaults
-(unless ``uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE`` is set).
-Other ioctl works similarly to ``UFFDIO_COPY``. They're atomic as in
-guaranteeing that nothing can see an half copied page since it'll
-keep userfaulting until the copy has finished.
+Resolving Userfaults
+--------------------
+
+There are three basic ways to resolve userfaults:
+
+- ``UFFDIO_COPY`` atomically copies some existing page contents from
+  userspace.
+
+- ``UFFDIO_ZEROPAGE`` atomically zeros the new page.
+
+- ``UFFDIO_CONTINUE`` maps an existing, previously-populated page.
+
+These operations are atomic in the sense that they guarantee nothing can
+see a half-populated page, since readers will keep userfaulting until the
+operation has finished.
+
+By default, these wake up userfaults blocked on the range in question.
+They support a ``UFFDIO_*_MODE_DONTWAKE`` ``mode`` flag, which indicates
+that waking will be done separately at some later time.
+
+Which ioctl to choose depends on the kind of page fault, and what we'd
+like to do to resolve it:
+
+- For ``UFFDIO_REGISTER_MODE_MISSING`` faults, the fault needs to be
+  resolved by either providing a new page (``UFFDIO_COPY``), or mapping
+  the zero page (``UFFDIO_ZEROPAGE``). By default, the kernel would map
+  the zero page for a missing fault. With userfaultfd, userspace can
+  decide what content to provide before the faulting thread continues.
+
+- For ``UFFDIO_REGISTER_MODE_MINOR`` faults, there is an existing page (in
+  the page cache). Userspace has the option of modifying the page's
+  contents before resolving the fault. Once the contents are correct
+  (modified or not), userspace asks the kernel to map the page and let the
+  faulting thread continue with ``UFFDIO_CONTINUE``.
 
 Notes:
 
-- If you requested ``UFFDIO_REGISTER_MODE_MISSING`` when registering then
-  you must provide some kind of page in your thread after reading from
-  the uffd.  You must provide either ``UFFDIO_COPY`` or ``UFFDIO_ZEROPAGE``.
-  The normal behavior of the OS automatically providing a zero page on
-  an anonymous mmaping is not in place.
+- You can tell which kind of fault occurred by examining
+  ``pagefault.flags`` within the ``uffd_msg``, checking for the
+  ``UFFD_PAGEFAULT_FLAG_*`` flags.
 
 - None of the page-delivering ioctls default to the range that you
   registered with.  You must fill in all fields for the appropriate
@@ -122,9 +147,9 @@  Notes:
 
 - You get the address of the access that triggered the missing page
   event out of a struct uffd_msg that you read in the thread from the
-  uffd.  You can supply as many pages as you want with ``UFFDIO_COPY`` or
-  ``UFFDIO_ZEROPAGE``.  Keep in mind that unless you used DONTWAKE then
-  the first of any of those IOCTLs wakes up the faulting thread.
+  uffd.  You can supply as many pages as you want with these IOCTLs.
+  Keep in mind that unless you used DONTWAKE then the first of any of
+  those IOCTLs wakes up the faulting thread.
 
 - Be sure to test for all errors including
   (``pollfd[0].revents & POLLERR``).  This can happen, e.g. when ranges