Message ID | 20230919190206.388896-11-axelrasmussen@google.com (mailing list archive) |
---|---|
State | New |
Series | [01/10] userfaultfd.2: briefly mention two-step feature handshake process |
On Tue, Sep 19, 2023 at 12:02:06PM -0700, Axel Rasmussen wrote:
> This is a new feature recently added to the kernel. So, document the new
> ioctl the same way we do other UFFDIO_* ioctls.
>
> Also note the corresponding new ioctl flag we can return in reponse to a
> UFFDIO_REGISTER call.
>
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>

With a small correction

Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>

> ---
>  man2/ioctl_userfaultfd.2 | 112 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 112 insertions(+)
>
> diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
> index afe3caffc..1282f63e1 100644
> --- a/man2/ioctl_userfaultfd.2
> +++ b/man2/ioctl_userfaultfd.2
> @@ -405,6 +405,11 @@ operation is supported.
>  The
>  .B UFFDIO_CONTINUE
>  operation is supported.
> +.TP
> +.B 1 << _UFFDIO_POISON
> +The
> +.B UFFDIO_POISON
> +operation is supported.
>  .PP
>  This
>  .BR ioctl (2)
> @@ -916,6 +921,113 @@ The faulting process has exited at the time of a
>  .B UFFDIO_CONTINUE
>  operation.
>  .\"
> +.SS UFFDIO_POISON
> +(Since Linux 6.6.)
> +Mark an address range as "poisoned".
> +Future accesses to these addresses will raise a
> +.B SIGBUS
> +signal.
> +Unlike
> +.B MADV_HWPOISON
> +this works by installing page table entries,
> +rather than "really" poisoning the underlying physical pages.
> +This means it only affects this particular address space.
> +.PP
> +The
> +.I argp
> +argument is a pointer to a
> +.I uffdio_continue

Did you mean uffdio_poison?

> +structure as shown below:
> +.PP
> +.in +4n
> +.EX
> +struct uffdio_poison {
> +    struct uffdio_range range;
> +                    /* Range to install poison PTE markers in */
> +    __u64 mode;     /* Flags controlling the behavior of poison */
> +    __s64 updated;  /* Number of bytes poisoned, or negated error */
> +};
> +.EE
> +.in
> +.PP
> +The following value may be bitwise ORed in
> +.I mode
> +to change the behavior of the
> +.B UFFDIO_POISON
> +operation:
> +.TP
> +.B UFFDIO_POISON_MODE_DONTWAKE
> +Do not wake up the thread that waits for page-fault resolution.
> +.PP
> +The
> +.I updated
> +field is used by the kernel
> +to return the number of bytes that were actually poisoned,
> +or an error in the same manner as
> +.BR UFFDIO_COPY .
> +If the value returned in the
> +.I updated
> +field doesn't match the value that was specified in
> +.IR range.len ,
> +the operation fails with the error
> +.BR EAGAIN .
> +The
> +.I updated
> +field is output-only;
> +it is not read by the
> +.B UFFDIO_POISON
> +operation.
> +.PP
> +This
> +.BR ioctl (2)
> +operation returns 0 on success.
> +In this case,
> +the entire area was poisoned.
> +On error, \-1 is returned and
> +.I errno
> +is set to indicate the error.
> +Possible errors include:
> +.TP
> +.B EAGAIN
> +The number of bytes mapped
> +(i.e., the value returned in the
> +.I updated
> +field)
> +does not equal the value that was specified in the
> +.I range.len
> +field.
> +.TP
> +.B EINVAL
> +Either
> +.I range.start
> +or
> +.I range.len
> +was not a multiple of the system page size; or
> +.I range.len
> +was zero; or the range specified was invalid.
> +.TP
> +.B EINVAL
> +An invalid bit was specified in the
> +.I mode
> +field.
> +.TP
> +.B EEXIST
> +One or more pages were already mapped in the given range.
> +.TP
> +.B ENOENT
> +The faulting process has changed its virtual memory layout simultaneously with
> +an outstanding
> +.B UFFDIO_POISON
> +operation.
> +.TP
> +.B ENOMEM
> +Allocating memory for page table entries failed.
> +.TP
> +.B ESRCH
> +The faulting process has exited at the time of a
> +.B UFFDIO_POISON
> +operation.
> +.\"
>  .SH RETURN VALUE
>  See descriptions of the individual operations, above.
>  .SH ERRORS
> --
> 2.42.0.459.ge4e396fd5e-goog
>
>
On Mon, Oct 9, 2023 at 2:10 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Tue, Sep 19, 2023 at 12:02:06PM -0700, Axel Rasmussen wrote:
> > This is a new feature recently added to the kernel. So, document the new
> > ioctl the same way we do other UFFDIO_* ioctls.
> >
> > Also note the corresponding new ioctl flag we can return in reponse to a
> > UFFDIO_REGISTER call.
> >
> > Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
>
> With a small correction
>
> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
>
> > ---
> >  man2/ioctl_userfaultfd.2 | 112 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 112 insertions(+)
> >
> > diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
> > index afe3caffc..1282f63e1 100644
> > --- a/man2/ioctl_userfaultfd.2
> > +++ b/man2/ioctl_userfaultfd.2
> > @@ -405,6 +405,11 @@ operation is supported.
> >  The
> >  .B UFFDIO_CONTINUE
> >  operation is supported.
> > +.TP
> > +.B 1 << _UFFDIO_POISON
> > +The
> > +.B UFFDIO_POISON
> > +operation is supported.
> >  .PP
> >  This
> >  .BR ioctl (2)
> > @@ -916,6 +921,113 @@ The faulting process has exited at the time of a
> >  .B UFFDIO_CONTINUE
> >  operation.
> >  .\"
> > +.SS UFFDIO_POISON
> > +(Since Linux 6.6.)
> > +Mark an address range as "poisoned".
> > +Future accesses to these addresses will raise a
> > +.B SIGBUS
> > +signal.
> > +Unlike
> > +.B MADV_HWPOISON
> > +this works by installing page table entries,
> > +rather than "really" poisoning the underlying physical pages.
> > +This means it only affects this particular address space.
> > +.PP
> > +The
> > +.I argp
> > +argument is a pointer to a
> > +.I uffdio_continue
>
> Did you mean uffdio_poison?

Ah, yes. :) Should have copy/pasted more carefully. I can send a v3
with this small correction.

>
> > +structure as shown below:
> > +.PP
> > +.in +4n
> > +.EX
> > +struct uffdio_poison {
> > +    struct uffdio_range range;
> > +                    /* Range to install poison PTE markers in */
> > +    __u64 mode;     /* Flags controlling the behavior of poison */
> > +    __s64 updated;  /* Number of bytes poisoned, or negated error */
> > +};
> > +.EE
> > +.in
> > +.PP
> > +The following value may be bitwise ORed in
> > +.I mode
> > +to change the behavior of the
> > +.B UFFDIO_POISON
> > +operation:
> > +.TP
> > +.B UFFDIO_POISON_MODE_DONTWAKE
> > +Do not wake up the thread that waits for page-fault resolution.
> > +.PP
> > +The
> > +.I updated
> > +field is used by the kernel
> > +to return the number of bytes that were actually poisoned,
> > +or an error in the same manner as
> > +.BR UFFDIO_COPY .
> > +If the value returned in the
> > +.I updated
> > +field doesn't match the value that was specified in
> > +.IR range.len ,
> > +the operation fails with the error
> > +.BR EAGAIN .
> > +The
> > +.I updated
> > +field is output-only;
> > +it is not read by the
> > +.B UFFDIO_POISON
> > +operation.
> > +.PP
> > +This
> > +.BR ioctl (2)
> > +operation returns 0 on success.
> > +In this case,
> > +the entire area was poisoned.
> > +On error, \-1 is returned and
> > +.I errno
> > +is set to indicate the error.
> > +Possible errors include:
> > +.TP
> > +.B EAGAIN
> > +The number of bytes mapped
> > +(i.e., the value returned in the
> > +.I updated
> > +field)
> > +does not equal the value that was specified in the
> > +.I range.len
> > +field.
> > +.TP
> > +.B EINVAL
> > +Either
> > +.I range.start
> > +or
> > +.I range.len
> > +was not a multiple of the system page size; or
> > +.I range.len
> > +was zero; or the range specified was invalid.
> > +.TP
> > +.B EINVAL
> > +An invalid bit was specified in the
> > +.I mode
> > +field.
> > +.TP
> > +.B EEXIST
> > +One or more pages were already mapped in the given range.
> > +.TP
> > +.B ENOENT
> > +The faulting process has changed its virtual memory layout simultaneously with
> > +an outstanding
> > +.B UFFDIO_POISON
> > +operation.
> > +.TP
> > +.B ENOMEM
> > +Allocating memory for page table entries failed.
> > +.TP
> > +.B ESRCH
> > +The faulting process has exited at the time of a
> > +.B UFFDIO_POISON
> > +operation.
> > +.\"
> >  .SH RETURN VALUE
> >  See descriptions of the individual operations, above.
> >  .SH ERRORS
> > --
> > 2.42.0.459.ge4e396fd5e-goog
> >
> >
>
> --
> Sincerely yours,
> Mike.
diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
index afe3caffc..1282f63e1 100644
--- a/man2/ioctl_userfaultfd.2
+++ b/man2/ioctl_userfaultfd.2
@@ -405,6 +405,11 @@ operation is supported.
 The
 .B UFFDIO_CONTINUE
 operation is supported.
+.TP
+.B 1 << _UFFDIO_POISON
+The
+.B UFFDIO_POISON
+operation is supported.
 .PP
 This
 .BR ioctl (2)
@@ -916,6 +921,113 @@ The faulting process has exited at the time of a
 .B UFFDIO_CONTINUE
 operation.
 .\"
+.SS UFFDIO_POISON
+(Since Linux 6.6.)
+Mark an address range as "poisoned".
+Future accesses to these addresses will raise a
+.B SIGBUS
+signal.
+Unlike
+.B MADV_HWPOISON
+this works by installing page table entries,
+rather than "really" poisoning the underlying physical pages.
+This means it only affects this particular address space.
+.PP
+The
+.I argp
+argument is a pointer to a
+.I uffdio_continue
+structure as shown below:
+.PP
+.in +4n
+.EX
+struct uffdio_poison {
+    struct uffdio_range range;
+                    /* Range to install poison PTE markers in */
+    __u64 mode;     /* Flags controlling the behavior of poison */
+    __s64 updated;  /* Number of bytes poisoned, or negated error */
+};
+.EE
+.in
+.PP
+The following value may be bitwise ORed in
+.I mode
+to change the behavior of the
+.B UFFDIO_POISON
+operation:
+.TP
+.B UFFDIO_POISON_MODE_DONTWAKE
+Do not wake up the thread that waits for page-fault resolution.
+.PP
+The
+.I updated
+field is used by the kernel
+to return the number of bytes that were actually poisoned,
+or an error in the same manner as
+.BR UFFDIO_COPY .
+If the value returned in the
+.I updated
+field doesn't match the value that was specified in
+.IR range.len ,
+the operation fails with the error
+.BR EAGAIN .
+The
+.I updated
+field is output-only;
+it is not read by the
+.B UFFDIO_POISON
+operation.
+.PP
+This
+.BR ioctl (2)
+operation returns 0 on success.
+In this case,
+the entire area was poisoned.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EAGAIN
+The number of bytes mapped
+(i.e., the value returned in the
+.I updated
+field)
+does not equal the value that was specified in the
+.I range.len
+field.
+.TP
+.B EINVAL
+Either
+.I range.start
+or
+.I range.len
+was not a multiple of the system page size; or
+.I range.len
+was zero; or the range specified was invalid.
+.TP
+.B EINVAL
+An invalid bit was specified in the
+.I mode
+field.
+.TP
+.B EEXIST
+One or more pages were already mapped in the given range.
+.TP
+.B ENOENT
+The faulting process has changed its virtual memory layout simultaneously with
+an outstanding
+.B UFFDIO_POISON
+operation.
+.TP
+.B ENOMEM
+Allocating memory for page table entries failed.
+.TP
+.B ESRCH
+The faulting process has exited at the time of a
+.B UFFDIO_POISON
+operation.
+.\"
 .SH RETURN VALUE
 See descriptions of the individual operations, above.
 .SH ERRORS
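
For readers of the patch, the EINVAL and EAGAIN rules documented above can be sketched as caller-side helpers. This is only an illustration, not part of the patch: the names poison_range_ok() and poison_next_step() are hypothetical, the wrap-around check is an assumed reading of "the range specified was invalid", and the retry handling assumes that UFFDIO_POISON reports partial progress in the updated field the same way the text says UFFDIO_COPY does. The helpers take the ioctl(2) return value, errno, and updated as plain arguments so they compile without a 6.6 kernel header.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum poison_step { POISON_DONE, POISON_RETRY, POISON_FAILED };

/*
 * Pre-check mirroring the documented EINVAL conditions: start and len
 * must be multiples of the page size, and len must be nonzero.
 * Rejecting address wrap-around is an assumption about what an
 * "invalid" range means.
 */
static bool poison_range_ok(uint64_t start, uint64_t len, uint64_t page_size)
{
    if (page_size == 0 || len == 0)
        return false;
    if (start % page_size != 0 || len % page_size != 0)
        return false;
    return start + len > start;
}

/*
 * Decide what to do after one UFFDIO_POISON attempt.
 * ret and err are the ioctl(2) return value and errno; updated is the
 * value the kernel wrote back (bytes poisoned, or a negated error).
 * On EAGAIN with partial progress, advance the range past the bytes
 * already poisoned so the caller can retry the remainder.
 */
static enum poison_step poison_next_step(int ret, int err, int64_t updated,
                                         uint64_t *start, uint64_t *len)
{
    if (ret == 0)
        return POISON_DONE;          /* the entire area was poisoned */
    if (err != EAGAIN)
        return POISON_FAILED;        /* EINVAL, EEXIST, ENOENT, ... */
    if (updated > 0) {
        if ((uint64_t)updated > *len)
            return POISON_FAILED;    /* inconsistent report, give up */
        *start += (uint64_t)updated;
        *len -= (uint64_t)updated;
    }
    return *len == 0 ? POISON_DONE : POISON_RETRY;
}
```

A caller would fill struct uffdio_poison from start and len, issue the ioctl, and loop while poison_next_step() returns POISON_RETRY.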
This is a new feature recently added to the kernel. So, document the new
ioctl the same way we do other UFFDIO_* ioctls.

Also note the corresponding new ioctl flag we can return in response to a
UFFDIO_REGISTER call.

Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
---
 man2/ioctl_userfaultfd.2 | 112 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)
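
As a rough illustration of the UFFDIO_REGISTER flag mentioned in the commit message, a caller might test the returned ioctls mask as sketched below. The helper name uffd_range_supports_poison() is hypothetical, and EXAMPLE_UFFDIO_POISON_NR is a local stand-in assumed to match _UFFDIO_POISON from <linux/userfaultfd.h> on Linux 6.6 and later; real code would include that header and use the kernel's constant instead.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in for _UFFDIO_POISON so the sketch compiles against older
 * headers. Assumed value; prefer the definition in
 * <linux/userfaultfd.h> when it is available.
 */
#define EXAMPLE_UFFDIO_POISON_NR 0x08

/*
 * ioctls is the mask the kernel fills in on a successful
 * UFFDIO_REGISTER (the ioctls field of struct uffdio_register).
 * Returns true if the 1 << _UFFDIO_POISON bit is set, i.e. the
 * registered range supports UFFDIO_POISON.
 */
static bool uffd_range_supports_poison(uint64_t ioctls)
{
    return (ioctls & (UINT64_C(1) << EXAMPLE_UFFDIO_POISON_NR)) != 0;
}
```

Checking the bit before calling UFFDIO_POISON lets a program fall back cleanly on kernels older than 6.6, where the bit is never set.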