diff mbox series

[2/3] rust: add typed accessors for userspace pointers

Message ID 20240124-alice-mm-v1-2-d1abcec83c44@google.com (mailing list archive)
State New
Headers show
Series Memory management patches needed by Rust Binder | expand

Commit Message

Alice Ryhl Jan. 24, 2024, 11:20 a.m. UTC
Add safe methods for reading and writing Rust values to and from
userspace pointers.

The C methods for copying to/from userspace use a function called
`check_object_size` to verify that the kernel pointer is not dangling.
However, this check is skipped when the length is a compile-time
constant, with the assumption that such cases trivially have a correct
kernel pointer.

In this patch, we apply the same optimization to the typed accessors.
For both methods, the size of the operation is known at compile time to
be size_of of the type being read or written. Since the C side doesn't
provide a variant that skips only this check, we create custom helpers
for this purpose.

The majority of reads and writes to userspace pointers in the Rust
Binder driver uses these accessor methods. Benchmarking has found that
skipping the `check_object_size` check makes a big difference for the
cases being skipped here. (And that the check doesn't make a difference
for the cases that use the raw read/write methods.)

This code is based on something that was originally written by Wedson on
the old rust branch. It was modified by Alice to skip the
`check_object_size` check, and to update various comments, including the
notes about kernel pointers in `WritableToBytes`.

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers.c          |  34 +++++++++++++
 rust/kernel/user_ptr.rs | 125 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 159 insertions(+)

Comments

Valentin Obst Jan. 24, 2024, 11:46 p.m. UTC | #1
> +/*

nit: this would be the first comment in the kernel crate to use this
style, not sure if there is a rule about that though. Maybe still
preferable to keep it consistent.

> + * These methods skip the `check_object_size` check that `copy_[to|from]_user`
> + * normally performs.

nit: They skip the (stronger, and also present without usercopy
hardening) `check_copy_size` wrapping that one.

>                        In C, these checks are skipped whenever the length is a
> + * compile-time constant, since when that is the case, the kernel pointer
> + * usually points at a local variable that is being initialized

Question: I thought that this exemption is about dynamic size
calculations being more susceptible to bugs than hard-coded ones. Does
someone recall the original rationale for that?

>                                                                  and the kernel
> + * pointer is trivially non-dangling.

As far as I know the hardened usercopy checks are not meant to catch
UAFs but rather about OOB accesses (and some info leaks). For example,
if the object is on the heap they check if the copy size exceeds the
allocation size, or, if the object is on the stack, they verify the copy
size does not leave the stack frame.

> + *
> + * These helpers serve the same purpose in Rust. Whenever the length is known at
> + * compile-time, we call this helper to skip the check.
> + */
> +unsigned long rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, const void __user *from, unsigned long n)
> +{
> +	unsigned long res;
> +
> +	might_fault();
> +	instrument_copy_from_user_before(to, from, n);
> +	if (should_fail_usercopy())
> +		return n;
> +	res = raw_copy_from_user(to, from, n);
> +	instrument_copy_from_user_after(to, from, n, res);
> +	return res;
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> +
> +unsigned long rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, const void *from, unsigned long n)
> +{
> +	might_fault();
> +	if (should_fail_usercopy())
> +		return n;
> +	instrument_copy_to_user(to, from, n);
> +	return raw_copy_to_user(to, from, n);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);

Could those be wrapping `_copy_[to|from]_user` instead?

	- Valentin
Arnd Bergmann Jan. 25, 2024, 12:26 p.m. UTC | #2
On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:
> +unsigned long 
> rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, 
> const void __user *from, unsigned long n)
> +{
> +	unsigned long res;
> +
> +	might_fault();
> +	instrument_copy_from_user_before(to, from, n);
> +	if (should_fail_usercopy())
> +		return n;
> +	res = raw_copy_from_user(to, from, n);
> +	instrument_copy_from_user_after(to, from, n, res);
> +	return res;
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> +
> +unsigned long 
> rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, 
> const void *from, unsigned long n)
> +{
> +	might_fault();
> +	if (should_fail_usercopy())
> +		return n;
> +	instrument_copy_to_user(to, from, n);
> +	return raw_copy_to_user(to, from, n);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);

These functions are almost identical to the ones in
lib/usercopy.c for !defined(INLINE_COPY_TO_USER).

That version has an extra memset() after a partial
copy_from_user(), and you probably want to have the
same thing here for consistency.

I think ideally we should only have one out-of-line copy
of these two functions and have that one shared between
rust and architectures that want the C version out of line
as well.

     Arnd
Alice Ryhl Jan. 25, 2024, 12:37 p.m. UTC | #3
On Thu, Jan 25, 2024 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:
> > +unsigned long
> > rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to,
> > const void __user *from, unsigned long n)
> > +{
> > +     unsigned long res;
> > +
> > +     might_fault();
> > +     instrument_copy_from_user_before(to, from, n);
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     res = raw_copy_from_user(to, from, n);
> > +     instrument_copy_from_user_after(to, from, n, res);
> > +     return res;
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> > +
> > +unsigned long
> > rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to,
> > const void *from, unsigned long n)
> > +{
> > +     might_fault();
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     instrument_copy_to_user(to, from, n);
> > +     return raw_copy_to_user(to, from, n);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
>
> These functions are almost identical to the ones in
> lib/usercopy.c for !defined(INLINE_COPY_TO_USER).
>
> That version has an extra memset() after a partial
> copy_from_user(), and you probably want to have the
> same thing here for consistency.
>
> I think ideally we should only have one out-of-line copy
> of these two functions and have that one shared between
> rust and architectures that want the C version out of line
> as well.

I had a bit of trouble figuring out all of the copy_[to/from]_user
methods that are available. I was hoping that a better solution would
be available, and it sounds like one is. Is _copy_from_user always
available as an exported symbol? If it's always available and skips
the check, then I can just use that. I don't think the memset matters
for my case.

Otherwise, I can add a helper in rust/helpers.c that wraps
_copy_from_user only when INLINE_COPY_FROM_USER is defined, and call
the helper in those cases, and otherwise call the exported symbol
directly. (I need an exported symbol to call into C from Rust.)

Would that make sense?

Alice
Alice Ryhl Jan. 25, 2024, 12:40 p.m. UTC | #4
On Thu, Jan 25, 2024 at 12:47 AM Valentin Obst <kernel@valentinobst.de> wrote:
>
> > +/*
>
> nit: this would be the first comment in the kernel crate to use this
> style, not sure if there is a rule about that though. Maybe still
> preferable to keep it consistent.
>
> > + * These methods skip the `check_object_size` check that `copy_[to|from]_user`
> > + * normally performs.
>
> nit: They skip the (stronger, and also present without usercopy
> hardening) `check_copy_size` wrapping that one.

The only difference between check_object_size and check_copy_size is
the extra check with __builtin_object_size, but that doesn't work
across the C/Rust boundary, and Rust doesn't have a direct equivalent.

> >                        In C, these checks are skipped whenever the length is a
> > + * compile-time constant, since when that is the case, the kernel pointer
> > + * usually points at a local variable that is being initialized
>
> Question: I thought that this exemption is about dynamic size
> calculations being more susceptible to bugs than hard-coded ones. Does
> someone recall the original rationale for that?
>
> >                                                                  and the kernel
> > + * pointer is trivially non-dangling.
>
> As far as I know the hardened usercopy checks are not meant to catch
> UAFs but rather about OOB accesses (and some info leaks). For example,
> if the object is on the heap they check if the copy size exceeds the
> allocation size, or, if the object is on the stack, they verify the copy
> size does not leave the stack frame.

Right, I can reword to say OOB instead of UAF.

> > + *
> > + * These helpers serve the same purpose in Rust. Whenever the length is known at
> > + * compile-time, we call this helper to skip the check.
> > + */
> > +unsigned long rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, const void __user *from, unsigned long n)
> > +{
> > +     unsigned long res;
> > +
> > +     might_fault();
> > +     instrument_copy_from_user_before(to, from, n);
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     res = raw_copy_from_user(to, from, n);
> > +     instrument_copy_from_user_after(to, from, n, res);
> > +     return res;
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> > +
> > +unsigned long rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, const void *from, unsigned long n)
> > +{
> > +     might_fault();
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     instrument_copy_to_user(to, from, n);
> > +     return raw_copy_to_user(to, from, n);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
>
> Could those be wrapping `_copy_[to|from]_user` instead?

Yeah maybe, see the other thread with Arnd Bergmann.

Alice
Arnd Bergmann Jan. 25, 2024, 3:59 p.m. UTC | #5
On Thu, Jan 25, 2024, at 13:37, Alice Ryhl wrote:
> On Thu, Jan 25, 2024 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
>> On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:

>> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
>>
>> These functions are almost identical to the ones in
>> lib/usercopy.c for !defined(INLINE_COPY_TO_USER).
>>
>> That version has an extra memset() after a partial
>> copy_from_user(), and you probably want to have the
>> same thing here for consistency.
>>
>> I think ideally we should only have one out-of-line copy
>> of these two functions and have that one shared between
>> rust and architectures that want the C version out of line
>> as well.
>
> I had a bit of trouble figuring out all of the copy_[to/from]_user
> methods that are available. I was hoping that a better solution would
> be available, and it sounds like one is. Is _copy_from_user always
> available as an exported symbol? If it's always available and skips
> the check, then I can just use that. I don't think the memset matters
> for my case.

At the moment, it appears that it's available on the few architectures
that don't #define INLINE_COPY_FROM_USER: alpha, csky, powerpc,
riscv and x86. On the other architectures it is always an inline
function.

> Otherwise, I can add a helper in rust/helpers.c that wraps
> _copy_from_user only when INLINE_COPY_FROM_USER is defined, and call
> the helper in those cases, and otherwise call the exported symbol
> directly. (I need an exported symbol to call into C from Rust.)
>
> Would that make sense?

I don't think we can have a perfect abstraction here, but rather
than putting knowledge of INLINE_COPY_FROM_USER into the rust
wrapper, I would suggest putting a bit of information about
rust into lib/usercopy.c.

I've tried to come up with an idea below, see if that works
for you.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 3064314f4832..835aa175d0ee 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -138,13 +138,18 @@ __copy_to_user(void __user *to, const void *from, unsigned long n)
 	return raw_copy_to_user(to, from, n);
 }
 
-#ifdef INLINE_COPY_FROM_USER
 static inline __must_check unsigned long
-_copy_from_user(void *to, const void __user *from, unsigned long n)
+_inline_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
 	unsigned long res = n;
 	might_fault();
 	if (!should_fail_usercopy() && likely(access_ok(from, n))) {
+		/*
+		 * Ensure that bad access_ok() speculation will not
+		 * lead to nasty side effects *after* the copy is
+		 * finished:
+		 */
+		barrier_nospec();
 		instrument_copy_from_user_before(to, from, n);
 		res = raw_copy_from_user(to, from, n);
 		instrument_copy_from_user_after(to, from, n, res);
@@ -153,14 +158,11 @@ _copy_from_user(void *to, const void __user *from, unsigned long n)
 		memset(to + (n - res), 0, res);
 	return res;
 }
-#else
 extern __must_check unsigned long
 _copy_from_user(void *, const void __user *, unsigned long);
-#endif
 
-#ifdef INLINE_COPY_TO_USER
 static inline __must_check unsigned long
-_copy_to_user(void __user *to, const void *from, unsigned long n)
+_inline_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
 	might_fault();
 	if (should_fail_usercopy())
@@ -171,25 +173,32 @@ _copy_to_user(void __user *to, const void *from, unsigned long n)
 	}
 	return n;
 }
-#else
 extern __must_check unsigned long
 _copy_to_user(void __user *, const void *, unsigned long);
-#endif
 
 static __always_inline unsigned long __must_check
 copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	if (check_copy_size(to, n, false))
-		n = _copy_from_user(to, from, n);
-	return n;
+	if (!check_copy_size(to, n, false))
+		return n;
+#ifdef INLINE_COPY_FROM_USER
+	return _inline_copy_from_user(to, from, n);
+#else
+	return _copy_from_user(to, from, n);
+#endif
 }
 
 static __always_inline unsigned long __must_check
 copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	if (check_copy_size(from, n, true))
-		n = _copy_to_user(to, from, n);
-	return n;
+	if (!check_copy_size(from, n, true))
+		return n;
+
+#ifdef INLINE_COPY_TO_USER
+	return _inline_copy_to_user(to, from, n);
+#else
+	return _copy_to_user(to, from, n);
+#endif
 }
 
 #ifndef copy_mc_to_kernel
diff --git a/lib/usercopy.c b/lib/usercopy.c
index d29fe29c6849..503a064d79e2 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -7,40 +7,18 @@
 
 /* out-of-line parts */
 
-#ifndef INLINE_COPY_FROM_USER
+#if !defined(INLINE_COPY_FROM_USER) || defined(CONFIG_RUST)
 unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	unsigned long res = n;
-	might_fault();
-	if (!should_fail_usercopy() && likely(access_ok(from, n))) {
-		/*
-		 * Ensure that bad access_ok() speculation will not
-		 * lead to nasty side effects *after* the copy is
-		 * finished:
-		 */
-		barrier_nospec();
-		instrument_copy_from_user_before(to, from, n);
-		res = raw_copy_from_user(to, from, n);
-		instrument_copy_from_user_after(to, from, n, res);
-	}
-	if (unlikely(res))
-		memset(to + (n - res), 0, res);
-	return res;
+	return _inline_copy_from_user(to, from, n);
 }
 EXPORT_SYMBOL(_copy_from_user);
 #endif
 
-#ifndef INLINE_COPY_TO_USER
+#if !defined(INLINE_COPY_TO_USER) || defined(CONFIG_RUST)
 unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	might_fault();
-	if (should_fail_usercopy())
-		return n;
-	if (likely(access_ok(to, n))) {
-		instrument_copy_to_user(to, from, n);
-		n = raw_copy_to_user(to, from, n);
-	}
-	return n;
+	return _inline_copy_to_user(to, from, n);
 }
 EXPORT_SYMBOL(_copy_to_user);
 #endif
Alice Ryhl Jan. 25, 2024, 4:15 p.m. UTC | #6
On Thu, Jan 25, 2024 at 5:00 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Jan 25, 2024, at 13:37, Alice Ryhl wrote:
> > On Thu, Jan 25, 2024 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:
>
> >> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
> >>
> >> These functions are almost identical to the ones in
> >> lib/usercopy.c for !defined(INLINE_COPY_TO_USER).
> >>
> >> That version has an extra memset() after a partial
> >> copy_from_user(), and you probably want to have the
> >> same thing here for consistency.
> >>
> >> I think ideally we should only have one out-of-line copy
> >> of these two functions and have that one shared between
> >> rust and architectures that want the C version out of line
> >> as well.
> >
> > I had a bit of trouble figuring out all of the copy_[to/from]_user
> > methods that are available. I was hoping that a better solution would
> > be available, and it sounds like one is. Is _copy_from_user always
> > available as an exported symbol? If it's always available and skips
> > the check, then I can just use that. I don't think the memset matters
> > for my case.
>
> At the moment, it appears that it's available on the few architectures
> that don't #define INLINE_COPY_FROM_USER: alpha, csky, powerpc,
> riscv and x86. On the other architectures it is always an inline
> function.
>
> > Otherwise, I can add a helper in rust/helpers.c that wraps
> > _copy_from_user only when INLINE_COPY_FROM_USER is defined, and call
> > the helper in those cases, and otherwise call the exported symbol
> > directly. (I need an exported symbol to call into C from Rust.)
> >
> > Would that make sense?
>
> I don't think we can have a perfect abstraction here, but rather
> than putting knowledge of INLINE_COPY_FROM_USER into the rust
> wrapper, I would suggest putting a bit of information about
> rust into lib/usercopy.c.
>
> I've tried to come up with an idea below, see if that works
> for you.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 3064314f4832..835aa175d0ee 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -138,13 +138,18 @@ __copy_to_user(void __user *to, const void *from, unsigned long n)
>         return raw_copy_to_user(to, from, n);
>  }
>
> -#ifdef INLINE_COPY_FROM_USER
>  static inline __must_check unsigned long
> -_copy_from_user(void *to, const void __user *from, unsigned long n)
> +_inline_copy_from_user(void *to, const void __user *from, unsigned long n)
>  {
>         unsigned long res = n;
>         might_fault();
>         if (!should_fail_usercopy() && likely(access_ok(from, n))) {
> +               /*
> +                * Ensure that bad access_ok() speculation will not
> +                * lead to nasty side effects *after* the copy is
> +                * finished:
> +                */
> +               barrier_nospec();
>                 instrument_copy_from_user_before(to, from, n);
>                 res = raw_copy_from_user(to, from, n);
>                 instrument_copy_from_user_after(to, from, n, res);
> @@ -153,14 +158,11 @@ _copy_from_user(void *to, const void __user *from, unsigned long n)
>                 memset(to + (n - res), 0, res);
>         return res;
>  }
> -#else
>  extern __must_check unsigned long
>  _copy_from_user(void *, const void __user *, unsigned long);
> -#endif
>
> -#ifdef INLINE_COPY_TO_USER
>  static inline __must_check unsigned long
> -_copy_to_user(void __user *to, const void *from, unsigned long n)
> +_inline_copy_to_user(void __user *to, const void *from, unsigned long n)
>  {
>         might_fault();
>         if (should_fail_usercopy())
> @@ -171,25 +173,32 @@ _copy_to_user(void __user *to, const void *from, unsigned long n)
>         }
>         return n;
>  }
> -#else
>  extern __must_check unsigned long
>  _copy_to_user(void __user *, const void *, unsigned long);
> -#endif
>
>  static __always_inline unsigned long __must_check
>  copy_from_user(void *to, const void __user *from, unsigned long n)
>  {
> -       if (check_copy_size(to, n, false))
> -               n = _copy_from_user(to, from, n);
> -       return n;
> +       if (!check_copy_size(to, n, false))
> +               return n;
> +#ifdef INLINE_COPY_FROM_USER
> +       return _inline_copy_from_user(to, from, n);
> +#else
> +       return _copy_from_user(to, from, n);
> +#endif
>  }
>
>  static __always_inline unsigned long __must_check
>  copy_to_user(void __user *to, const void *from, unsigned long n)
>  {
> -       if (check_copy_size(from, n, true))
> -               n = _copy_to_user(to, from, n);
> -       return n;
> +       if (!check_copy_size(from, n, true))
> +               return n;
> +
> +#ifdef INLINE_COPY_TO_USER
> +       return _inline_copy_to_user(to, from, n);
> +#else
> +       return _copy_to_user(to, from, n);
> +#endif
>  }
>
>  #ifndef copy_mc_to_kernel
> diff --git a/lib/usercopy.c b/lib/usercopy.c
> index d29fe29c6849..503a064d79e2 100644
> --- a/lib/usercopy.c
> +++ b/lib/usercopy.c
> @@ -7,40 +7,18 @@
>
>  /* out-of-line parts */
>
> -#ifndef INLINE_COPY_FROM_USER
> +#if !defined(INLINE_COPY_FROM_USER) || defined(CONFIG_RUST)
>  unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n)
>  {
> -       unsigned long res = n;
> -       might_fault();
> -       if (!should_fail_usercopy() && likely(access_ok(from, n))) {
> -               /*
> -                * Ensure that bad access_ok() speculation will not
> -                * lead to nasty side effects *after* the copy is
> -                * finished:
> -                */
> -               barrier_nospec();
> -               instrument_copy_from_user_before(to, from, n);
> -               res = raw_copy_from_user(to, from, n);
> -               instrument_copy_from_user_after(to, from, n, res);
> -       }
> -       if (unlikely(res))
> -               memset(to + (n - res), 0, res);
> -       return res;
> +       return _inline_copy_from_user(to, from, n);
>  }
>  EXPORT_SYMBOL(_copy_from_user);
>  #endif
>
> -#ifndef INLINE_COPY_TO_USER
> +#if !defined(INLINE_COPY_TO_USER) || defined(CONFIG_RUST)
>  unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
>  {
> -       might_fault();
> -       if (should_fail_usercopy())
> -               return n;
> -       if (likely(access_ok(to, n))) {
> -               instrument_copy_to_user(to, from, n);
> -               n = raw_copy_to_user(to, from, n);
> -       }
> -       return n;
> +       return _inline_copy_to_user(to, from, n);
>  }
>  EXPORT_SYMBOL(_copy_to_user);
>  #endif

Sure, if that's okay with you, then I'm happy to do it that way in v2.

Thank you!
Alice
Trevor Gross Feb. 1, 2024, 5:03 a.m. UTC | #7
On Wed, Jan 24, 2024 at 6:21 AM Alice Ryhl <aliceryhl@google.com> wrote:

I see this patch answers some of my naming questions from 1/3, sorry
for not reading all the way through.

> diff --git a/rust/kernel/user_ptr.rs b/rust/kernel/user_ptr.rs
> index 00aa26aa6a83..daa46abe5525 100644
> --- a/rust/kernel/user_ptr.rs
> +++ b/rust/kernel/user_ptr.rs
> @@ -11,6 +11,7 @@
>  use crate::{bindings, error::code::*, error::Result};
>  use alloc::vec::Vec;
>  use core::ffi::{c_ulong, c_void};
> +use core::mem::{size_of, MaybeUninit};
>
>  /// The maximum length of a operation using `copy_[from|to]_user`.
>  ///
> @@ -151,6 +152,36 @@ pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {
>          Ok(())
>      }
>
> +    /// Reads a value of the specified type.
> +    ///
> +    /// Fails with `EFAULT` if the read encounters a page fault.
> +    pub fn read<T: ReadableFromBytes>(&mut self) -> Result<T> {

I think that `T: Copy` is required here, or for Copy to be a
supertrait of ReadableBytes, since the data in the buffer is being
duplicated from a reference.

Send is probably also a reasonable bound to have .

> +        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
> +            return Err(EFAULT);
> +        }
> +        let mut out: MaybeUninit<T> = MaybeUninit::uninit();
> +        // SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
> +        let res = unsafe {
> +            bindings::copy_from_user_unsafe_skip_check_object_size(
> +                out.as_mut_ptr().cast::<c_void>(),
> +                self.0,
> +                size_of::<T>() as c_ulong,

As with the other patch, I think it would be more clear to use
`c_ulong::try_from(...)` rather than comparing against
`MAX_USER_OP_LEN ` and later casting. Possibly just in a helper
function.

> +            )
> +        };
> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(size_of::<T>());

There are now methods `wrapping_byte_add` (since 1.75). Doesn't make
much of a difference since the pointer is c_void anyway, but it does
make the unit more clear.

> +        self.1 -= size_of::<T>();
> +
> +        // SAFETY: The read above has initialized all bytes in `out`, and since
> +        // `T` implements `ReadableFromBytes`, any bit-pattern is a valid value
> +        // for this type.
> +        Ok(unsafe { out.assume_init() })
> +    }
> +
>      /// Reads all remaining data in the buffer into a vector.
>      ///
>      /// Fails with `EFAULT` if the read encounters a page fault.
> @@ -219,4 +250,98 @@ pub fn write_slice(&mut self, data: &[u8]) -> Result {
>          // `len`, so the pointer is valid for reading `len` bytes.
>          unsafe { self.write_raw(ptr, len) }
>      }
> +
> +    /// Writes the provided Rust value to this userspace pointer.
> +    ///
> +    /// Fails with `EFAULT` if the write encounters a page fault.
> +    pub fn write<T: WritableToBytes>(&mut self, value: &T) -> Result {

Send + Copy are also needed here, or supertraits of WritableToBytes.

> +        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
> +            return Err(EFAULT);
> +        }
> +        // SAFETY: The reference points to a value of type `T`, so it is valid
> +        // for reading `size_of::<T>()` bytes.
> +        let res = unsafe {
> +            bindings::copy_to_user_unsafe_skip_check_object_size(
> +                self.0,
> +                (value as *const T).cast::<c_void>(),
> +                size_of::<T>() as c_ulong,
> +            )
> +        };
> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(size_of::<T>());
> +        self.1 -= size_of::<T>();
> +        Ok(())
> +    }
>  }
> +
> +/// Specifies that a type is safely readable from bytes.
> +///
> +/// Not all types are valid for all values. For example, a `bool` must be either
> +/// zero or one, so reading arbitrary bytes into something that contains a
> +/// `bool` is not okay.
> +///
> +/// It's okay for the type to have padding, as initializing those bytes has no
> +/// effect.
> +///
> +/// # Safety
> +///
> +/// All bit-patterns must be valid for this type.
> +pub unsafe trait ReadableFromBytes {}
> +
> +// SAFETY: All bit patterns are acceptable values of the types below.
> +unsafe impl ReadableFromBytes for u8 {}
> +unsafe impl ReadableFromBytes for u16 {}
> +unsafe impl ReadableFromBytes for u32 {}
> +unsafe impl ReadableFromBytes for u64 {}
> +unsafe impl ReadableFromBytes for usize {}
> +unsafe impl ReadableFromBytes for i8 {}
> +unsafe impl ReadableFromBytes for i16 {}
> +unsafe impl ReadableFromBytes for i32 {}
> +unsafe impl ReadableFromBytes for i64 {}
> +unsafe impl ReadableFromBytes for isize {}
> +// SAFETY: If all bit patterns are acceptable for individual values in an array,
> +// then all bit patterns are also acceptable for arrays of that type.
> +unsafe impl<T: ReadableFromBytes> ReadableFromBytes for [T] {}
> +unsafe impl<T: ReadableFromBytes, const N: usize> ReadableFromBytes for [T; N] {}
> +
> +/// Specifies that a type is safely writable to bytes.
> +///
> +/// If a struct implements this trait, then it is okay to copy it byte-for-byte
> +/// to userspace. This means that it should not have any padding, as padding
> +/// bytes are uninitialized. Reading uninitialized memory is not just undefined
> +/// behavior, it may even lead to leaking sensitive information on the stack to
> +/// userspace.
> +///
> +/// The struct should also not hold kernel pointers, as kernel pointer addresses
> +/// are also considered sensitive. However, leaking kernel pointers is not
> +/// considered undefined behavior by Rust, so this is a correctness requirement,
> +/// but not a safety requirement.
> +///
> +/// # Safety
> +///
> +/// Values of this type may not contain any uninitialized bytes.
> +pub unsafe trait WritableToBytes {}
> +
> +// SAFETY: Instances of the following types have no uninitialized portions.
> +unsafe impl WritableToBytes for u8 {}
> +unsafe impl WritableToBytes for u16 {}
> +unsafe impl WritableToBytes for u32 {}
> +unsafe impl WritableToBytes for u64 {}
> +unsafe impl WritableToBytes for usize {}
> +unsafe impl WritableToBytes for i8 {}
> +unsafe impl WritableToBytes for i16 {}
> +unsafe impl WritableToBytes for i32 {}
> +unsafe impl WritableToBytes for i64 {}
> +unsafe impl WritableToBytes for isize {}
> +unsafe impl WritableToBytes for bool {}
> +unsafe impl WritableToBytes for char {}
> +unsafe impl WritableToBytes for str {}
> +// SAFETY: If individual values in an array have no uninitialized portions, then
> +// the the array itself does not have any uninitialized portions either.
> +unsafe impl<T: WritableToBytes> WritableToBytes for [T] {}
> +unsafe impl<T: WritableToBytes, const N: usize> WritableToBytes for [T; N] {}
>
> --
> 2.43.0.429.g432eaa2c6b-goog
>
>

These traits are probably usable in a lot of other places (e.g.
packets, GPU), so could you put them in a separate module?

The patterns here are pretty similar to what the bytemuck crate does
[1]. Since that crate is well established and open licensed, I think
it makes sense to keep their naming or possibly even vendor a portion
in.

In particular, this would likely include the traits:

- AnyBitPattern, which is roughly ReadableFromBytes here
- NoUninit, which is roughly WritableToBytes here
- Optionally Pod (plain old data), a supertrait of both AnyBitPattern
and NoUninit just used to simplify trait implementation (impl Pod and
you get the other two).

And the functions:

- from_bytes to turn &[u8] into &T for use in `read`. Needs `T: Copy`
to return an owned value, as noted above.
- bytes_of to turn &T into &[u8], for use in `write`

The derive macros would also be nice to have down the line, though
bytemuck's unfortunately relies on syn.

- Trevor

[1]: https://docs.rs/bytemuck/latest/bytemuck/
Alice Ryhl Feb. 8, 2024, 1:14 p.m. UTC | #8
On Thu, Feb 1, 2024 at 6:04 AM Trevor Gross <tmgross@umich.edu> wrote:
>
> On Wed, Jan 24, 2024 at 6:21 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > +    /// Reads a value of the specified type.
> > +    ///
> > +    /// Fails with `EFAULT` if the read encounters a page fault.
> > +    pub fn read<T: ReadableFromBytes>(&mut self) -> Result<T> {
>
> I think that `T: Copy` is required here, or for Copy to be a
> supertrait of ReadableBytes, since the data in the buffer is being
> duplicated from a reference.
>
> Send is probably also a reasonable bound to have .

We're not moving a value of type `T`. We're creating a new value of
type `T` from a byte array. The trait says that we can do that, so I
think that is enough here.

Besides, we'll often want to use this for wrappers around bindgen
types. If we add a Copy bound, then we need to get bindgen to generate
a #[derive(Copy)] for them, which I don't think it does right now.

> > +        // SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
> > +        let res = unsafe {
> > +            bindings::copy_from_user_unsafe_skip_check_object_size(
> > +                out.as_mut_ptr().cast::<c_void>(),
> > +                self.0,
> > +                size_of::<T>() as c_ulong,
>
> As with the other patch, I think it would be more clear to use
> `c_ulong::try_from(...)` rather than comparing against
> `MAX_USER_OP_LEN ` and later casting. Possibly just in a helper
> function.

Done.

> > +        // Since this is not a pointer to a valid object in our program,
> > +        // we cannot use `add`, which has C-style rules for defined
> > +        // behavior.
> > +        self.0 = self.0.wrapping_add(size_of::<T>());
>
> There are now methods `wrapping_byte_add` (since 1.75). Doesn't make
> much of a difference since the pointer is c_void anyway, but it does
> make the unit more clear.

Sure, I can use those methods.

> > +    /// Writes the provided Rust value to this userspace pointer.
> > +    ///
> > +    /// Fails with `EFAULT` if the write encounters a page fault.
> > +    pub fn write<T: WritableToBytes>(&mut self, value: &T) -> Result {
>
> Send + Copy are also needed here, or supertraits of WritableToBytes.

I also disagree here. We're not moving a value of type T. We're
creating a byte array from a value of type T, and the trait says that
we can do that.

> > +/// Specifies that a type is safely readable from bytes.
> > +///
> > +/// Not all types are valid for all values. For example, a `bool` must be either
> > +/// zero or one, so reading arbitrary bytes into something that contains a
> > +/// `bool` is not okay.
> > +///
> > +/// It's okay for the type to have padding, as initializing those bytes has no
> > +/// effect.
> > +///
> > +/// # Safety
> > +///
> > +/// All bit-patterns must be valid for this type.
> > +pub unsafe trait ReadableFromBytes {}
> > +
> > +// SAFETY: All bit patterns are acceptable values of the types below.
> > +unsafe impl ReadableFromBytes for u8 {}
> > +unsafe impl ReadableFromBytes for u16 {}
> > +unsafe impl ReadableFromBytes for u32 {}
> > +unsafe impl ReadableFromBytes for u64 {}
> > +unsafe impl ReadableFromBytes for usize {}
> > +unsafe impl ReadableFromBytes for i8 {}
> > +unsafe impl ReadableFromBytes for i16 {}
> > +unsafe impl ReadableFromBytes for i32 {}
> > +unsafe impl ReadableFromBytes for i64 {}
> > +unsafe impl ReadableFromBytes for isize {}
> > +// SAFETY: If all bit patterns are acceptable for individual values in an array,
> > +// then all bit patterns are also acceptable for arrays of that type.
> > +unsafe impl<T: ReadableFromBytes> ReadableFromBytes for [T] {}
> > +unsafe impl<T: ReadableFromBytes, const N: usize> ReadableFromBytes for [T; N] {}
> > +
> > +/// Specifies that a type is safely writable to bytes.
> > +///
> > +/// If a struct implements this trait, then it is okay to copy it byte-for-byte
> > +/// to userspace. This means that it should not have any padding, as padding
> > +/// bytes are uninitialized. Reading uninitialized memory is not just undefined
> > +/// behavior, it may even lead to leaking sensitive information on the stack to
> > +/// userspace.
> > +///
> > +/// The struct should also not hold kernel pointers, as kernel pointer addresses
> > +/// are also considered sensitive. However, leaking kernel pointers is not
> > +/// considered undefined behavior by Rust, so this is a correctness requirement,
> > +/// but not a safety requirement.
> > +///
> > +/// # Safety
> > +///
> > +/// Values of this type may not contain any uninitialized bytes.
> > +pub unsafe trait WritableToBytes {}
> > +
> > +// SAFETY: Instances of the following types have no uninitialized portions.
> > +unsafe impl WritableToBytes for u8 {}
> > +unsafe impl WritableToBytes for u16 {}
> > +unsafe impl WritableToBytes for u32 {}
> > +unsafe impl WritableToBytes for u64 {}
> > +unsafe impl WritableToBytes for usize {}
> > +unsafe impl WritableToBytes for i8 {}
> > +unsafe impl WritableToBytes for i16 {}
> > +unsafe impl WritableToBytes for i32 {}
> > +unsafe impl WritableToBytes for i64 {}
> > +unsafe impl WritableToBytes for isize {}
> > +unsafe impl WritableToBytes for bool {}
> > +unsafe impl WritableToBytes for char {}
> > +unsafe impl WritableToBytes for str {}
> > +// SAFETY: If individual values in an array have no uninitialized portions, then
> > +// the the array itself does not have any uninitialized portions either.
> > +unsafe impl<T: WritableToBytes> WritableToBytes for [T] {}
> > +unsafe impl<T: WritableToBytes, const N: usize> WritableToBytes for [T; N] {}
>
> These traits are probably usable in a lot of other places (e.g.
> packets, GPU), so could you put them in a separate module?

I can move them to the types module.

> The patterns here are pretty similar to what the bytemuck crate does
> [1]. Since that crate is well established and open licensed, I think
> it makes sense to keep their naming or possibly even vendor a portion
> in.

Vendoring bytemuck is out of scope for this patchset.

If we *are* going to vendor one of them, I would suggest zerocopy over bytemuck.

> In particular, this would likely include the traits:
>
> - AnyBitPattern, which is roughly ReadableFromBytes here
> - NoUninit, which is roughly WritableToBytes here
> - Optionally Pod (plain old data), a supertrait of both AnyBitPattern
> and NoUninit just used to simplify trait implementation (impl Pod and
> you get the other two).

I can rename the two traits, but I'm not going to introduce Pod. It's
over engineered for my purposes. Also, I prefer the trait names from
zerocopy. They emphasize that it's really about conversions to/from
byte arrays, and not about moving values around.

Note that WritableToBytes has an extra comment about pointer addresses
that bytemuck/zerocopy doesn't have.

> And the functions:
>
> - from_bytes to turn &[u8] into &T for use in `read`. Needs `T: Copy`
> to return an owned value, as noted above.
> - bytes_of to turn &T into &[u8], for use in `write`
>
> The derive macros would also be nice to have down the line, though
> bytemuck's unfortunately relies on syn.
>
> - Trevor
>
> [1]: https://docs.rs/bytemuck/latest/bytemuck/
diff mbox series

Patch

diff --git a/rust/helpers.c b/rust/helpers.c
index 312b6fcb49d5..187f445fbf19 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -52,6 +52,40 @@  unsigned long rust_helper_copy_to_user(void __user *to, const void *from,
 }
 EXPORT_SYMBOL_GPL(rust_helper_copy_to_user);
 
+/*
+ * These methods skip the `check_object_size` check that `copy_[to|from]_user`
+ * normally performs. In C, these checks are skipped whenever the length is a
+ * compile-time constant, since when that is the case, the kernel pointer
+ * usually points at a local variable that is being initialized and the kernel
+ * pointer is trivially non-dangling.
+ *
+ * These helpers serve the same purpose in Rust. Whenever the length is known at
+ * compile-time, we call this helper to skip the check.
+ */
+unsigned long rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, const void __user *from, unsigned long n)
+{
+	unsigned long res;
+
+	might_fault();
+	instrument_copy_from_user_before(to, from, n);
+	if (should_fail_usercopy())
+		return n;
+	res = raw_copy_from_user(to, from, n);
+	instrument_copy_from_user_after(to, from, n, res);
+	return res;
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
+
+unsigned long rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, const void *from, unsigned long n)
+{
+	might_fault();
+	if (should_fail_usercopy())
+		return n;
+	instrument_copy_to_user(to, from, n);
+	return raw_copy_to_user(to, from, n);
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
+
 void rust_helper_mutex_lock(struct mutex *lock)
 {
 	mutex_lock(lock);
diff --git a/rust/kernel/user_ptr.rs b/rust/kernel/user_ptr.rs
index 00aa26aa6a83..daa46abe5525 100644
--- a/rust/kernel/user_ptr.rs
+++ b/rust/kernel/user_ptr.rs
@@ -11,6 +11,7 @@ 
 use crate::{bindings, error::code::*, error::Result};
 use alloc::vec::Vec;
 use core::ffi::{c_ulong, c_void};
+use core::mem::{size_of, MaybeUninit};
 
 /// The maximum length of a operation using `copy_[from|to]_user`.
 ///
@@ -151,6 +152,36 @@  pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {
         Ok(())
     }
 
+    /// Reads a value of the specified type.
+    ///
+    /// Fails with `EFAULT` if the read encounters a page fault.
+    pub fn read<T: ReadableFromBytes>(&mut self) -> Result<T> {
+        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
+            return Err(EFAULT);
+        }
+        let mut out: MaybeUninit<T> = MaybeUninit::uninit();
+        // SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
+        let res = unsafe {
+            bindings::copy_from_user_unsafe_skip_check_object_size(
+                out.as_mut_ptr().cast::<c_void>(),
+                self.0,
+                size_of::<T>() as c_ulong,
+            )
+        };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Since this is not a pointer to a valid object in our program,
+        // we cannot use `add`, which has C-style rules for defined
+        // behavior.
+        self.0 = self.0.wrapping_add(size_of::<T>());
+        self.1 -= size_of::<T>();
+        // SAFETY: The read above has initialized all bytes in `out`, and since
+        // `T` implements `ReadableFromBytes`, any bit-pattern is a valid value
+        // for this type.
+        Ok(unsafe { out.assume_init() })
+    }
+
     /// Reads all remaining data in the buffer into a vector.
     ///
     /// Fails with `EFAULT` if the read encounters a page fault.
@@ -219,4 +250,98 @@  pub fn write_slice(&mut self, data: &[u8]) -> Result {
         // `len`, so the pointer is valid for reading `len` bytes.
         unsafe { self.write_raw(ptr, len) }
     }
+
+    /// Writes the provided Rust value to this userspace pointer.
+    ///
+    /// Fails with `EFAULT` if the write encounters a page fault.
+    pub fn write<T: WritableToBytes>(&mut self, value: &T) -> Result {
+        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
+            return Err(EFAULT);
+        }
+        // SAFETY: The reference points to a value of type `T`, so it is valid
+        // for reading `size_of::<T>()` bytes.
+        let res = unsafe {
+            bindings::copy_to_user_unsafe_skip_check_object_size(
+                self.0,
+                (value as *const T).cast::<c_void>(),
+                size_of::<T>() as c_ulong,
+            )
+        };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Since this is not a pointer to a valid object in our program,
+        // we cannot use `add`, which has C-style rules for defined
+        // behavior.
+        self.0 = self.0.wrapping_add(size_of::<T>());
+        self.1 -= size_of::<T>();
+        Ok(())
+    }
 }
+
+/// Specifies that a type is safely readable from bytes.
+///
+/// Not all types are valid for all values. For example, a `bool` must be either
+/// zero or one, so reading arbitrary bytes into something that contains a
+/// `bool` is not okay.
+///
+/// It's okay for the type to have padding, as initializing those bytes has no
+/// effect.
+///
+/// # Safety
+///
+/// All bit-patterns must be valid for this type.
+pub unsafe trait ReadableFromBytes {}
+
+// SAFETY: All bit patterns are acceptable values of the types below.
+unsafe impl ReadableFromBytes for u8 {}
+unsafe impl ReadableFromBytes for u16 {}
+unsafe impl ReadableFromBytes for u32 {}
+unsafe impl ReadableFromBytes for u64 {}
+unsafe impl ReadableFromBytes for usize {}
+unsafe impl ReadableFromBytes for i8 {}
+unsafe impl ReadableFromBytes for i16 {}
+unsafe impl ReadableFromBytes for i32 {}
+unsafe impl ReadableFromBytes for i64 {}
+unsafe impl ReadableFromBytes for isize {}
+// SAFETY: If all bit patterns are acceptable for individual values in an array,
+// then all bit patterns are also acceptable for arrays of that type.
+unsafe impl<T: ReadableFromBytes> ReadableFromBytes for [T] {}
+unsafe impl<T: ReadableFromBytes, const N: usize> ReadableFromBytes for [T; N] {}
+
+/// Specifies that a type is safely writable to bytes.
+///
+/// If a struct implements this trait, then it is okay to copy it byte-for-byte
+/// to userspace. This means that it should not have any padding, as padding
+/// bytes are uninitialized. Reading uninitialized memory is not just undefined
+/// behavior, it may even lead to leaking sensitive information on the stack to
+/// userspace.
+///
+/// The struct should also not hold kernel pointers, as kernel pointer addresses
+/// are also considered sensitive. However, leaking kernel pointers is not
+/// considered undefined behavior by Rust, so this is a correctness requirement,
+/// but not a safety requirement.
+///
+/// # Safety
+///
+/// Values of this type may not contain any uninitialized bytes.
+pub unsafe trait WritableToBytes {}
+
+// SAFETY: Instances of the following types have no uninitialized portions.
+unsafe impl WritableToBytes for u8 {}
+unsafe impl WritableToBytes for u16 {}
+unsafe impl WritableToBytes for u32 {}
+unsafe impl WritableToBytes for u64 {}
+unsafe impl WritableToBytes for usize {}
+unsafe impl WritableToBytes for i8 {}
+unsafe impl WritableToBytes for i16 {}
+unsafe impl WritableToBytes for i32 {}
+unsafe impl WritableToBytes for i64 {}
+unsafe impl WritableToBytes for isize {}
+unsafe impl WritableToBytes for bool {}
+unsafe impl WritableToBytes for char {}
+unsafe impl WritableToBytes for str {}
+// SAFETY: If individual values in an array have no uninitialized portions, then
+// the the array itself does not have any uninitialized portions either.
+unsafe impl<T: WritableToBytes> WritableToBytes for [T] {}
+unsafe impl<T: WritableToBytes, const N: usize> WritableToBytes for [T; N] {}