
[v8,2/7] mm: rust: add vm_area_struct methods that require read access

Message ID: 20241120-vma-v8-2-eb31425da66b@google.com
State: New
Series: Rust support for mm_struct, vm_area_struct, and mmap

Commit Message

Alice Ryhl Nov. 20, 2024, 2:49 p.m. UTC
This adds a type called VmAreaRef which is used when referencing a vma
that you have read access to. Here, read access means that you hold
either the mmap read lock or the vma read lock (or stronger).

Additionally, a vma_lookup method is added to the mmap read guard, which
enables you to obtain a &VmAreaRef in safe Rust code.
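
For example, a lookup under the guard might look like this (a sketch
only; `mmap_read_lock()` is assumed to be the guard constructor added
earlier in this series, and `mm` and `addr` are assumed to be in
scope):

    let guard = mm.mmap_read_lock();
    if let Some(vma) = guard.vma_lookup(addr) {
        // `vma` borrows from `guard`, so it cannot be used once the
        // mmap read lock has been released.
        pr_info!("vma covers {} bytes\n", vma.end() - vma.start());
    }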

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers/mm.c      |   6 ++
 rust/kernel/mm.rs      |  21 ++++++
 rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 198 insertions(+)

Comments

Lorenzo Stoakes Nov. 20, 2024, 7:07 p.m. UTC | #1
On Wed, Nov 20, 2024 at 02:49:56PM +0000, Alice Ryhl wrote:
> This adds a type called VmAreaRef which is used when referencing a vma
> that you have read access to. Here, read access means that you hold
> either the mmap read lock or the vma read lock (or stronger).

This is pure nit and not important but...

I know we screwed up naming in mm, with the god-awful vm_area_struct (we
abbreviate for 2 letters, get bored, stop abbreviating, then throw in a
struct for a laugh) or 'VMA', but I wonder if this would be better as
VmaRef? Then again, that's kinda terrible too.

Sorry about that. I mean this isn't hugely important + ultimately _our
fault_.

VirtualMemoryAreaRef is worse... VirtMemAreaRef? OK. Maybe VMAreaRef is the
least bad...

>
> Additionally, a vma_lookup method is added to the mmap read guard, which
> enables you to obtain a &VmAreaRef in safe Rust code.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/helpers/mm.c      |   6 ++
>  rust/kernel/mm.rs      |  21 ++++++
>  rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 198 insertions(+)
>
> diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> index 7201747a5d31..7b72eb065a3e 100644
> --- a/rust/helpers/mm.c
> +++ b/rust/helpers/mm.c
> @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
>  {
>  	mmap_read_unlock(mm);
>  }
> +
> +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> +					      unsigned long addr)
> +{
> +	return vma_lookup(mm, addr);
> +}

I wonder whether we want to abstract some of the VMA iterator stuff,
because it's inefficient to look up VMAs by doing this each time if you are
looking at, for instance, adjacent VMAs.

But perhaps reasonable to defer doing that to later work?
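
Something along these lines, purely hypothetical (none of these names
exist yet; a real version would wrap the C vma_iterator rather than
calling vma_lookup() per step, and this one stops at the first gap
between VMAs):

    pub struct VmaIterator<'a> {
        guard: &'a MmapReadGuard<'a>,
        next_addr: usize,
    }

    impl<'a> Iterator for VmaIterator<'a> {
        type Item = &'a virt::VmAreaRef;

        fn next(&mut self) -> Option<Self::Item> {
            // Inefficient: re-walks the tree on every call, which is
            // exactly what the C vma_iterator API exists to avoid.
            let vma = self.guard.vma_lookup(self.next_addr)?;
            self.next_addr = vma.end();
            Some(vma)
        }
    }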

> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> index 84cba581edaa..ace8e7d57afe 100644
> --- a/rust/kernel/mm.rs
> +++ b/rust/kernel/mm.rs
> @@ -12,6 +12,8 @@
>  };
>  use core::{ops::Deref, ptr::NonNull};
>
> +pub mod virt;
> +
>  /// A wrapper for the kernel's `struct mm_struct`.
>  ///
>  /// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> @@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
>      _nts: NotThreadSafe,
>  }
>
> +impl<'a> MmapReadGuard<'a> {
> +    /// Look up a vma at the given address.
> +    #[inline]
> +    pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {

Kind of lovely to have functions under the read guard and then knowing
that, hey, we must hold the read lock to be able to do this.

Imagine, a programming language with actual types... :P
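
E.g. this won't compile, because the returned reference borrows from
the guard (again assuming an `mmap_read_lock()` constructor):

    let vma;
    {
        let guard = mm.mmap_read_lock();
        vma = guard.vma_lookup(addr);
    } // guard dropped here: mmap read lock released
    let _ = vma; // ERROR: `guard` does not live long enough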

> +        // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> +        // for `vma_addr`.
> +        let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };

Why do we say 'as _' here?

> +
> +        if vma.is_null() {
> +            None
> +        } else {
> +            // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> +            // the returned area will borrow from this read lock guard, so it can only be used
> +            // while the mmap read lock is still held.

Lovely!

> +            unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
> +        }
> +    }
> +}
> +
>  impl Drop for MmapReadGuard<'_> {
>      #[inline]
>      fn drop(&mut self) {
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> new file mode 100644
> index 000000000000..1e755dca46dd
> --- /dev/null
> +++ b/rust/kernel/mm/virt.rs
> @@ -0,0 +1,171 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Virtual memory.

Trivial rust q again but does this result in a heading in generated docs or
something?

> +
> +use crate::{bindings, types::Opaque};
> +
> +/// A wrapper for the kernel's `struct vm_area_struct` with read access.
> +///
> +/// It represents an area of virtual memory.
> +///
> +/// # Invariants
> +///
> +/// The caller must hold the mmap read lock or the vma read lock.

Might be worth mentioning here that you have yet to abstract the vma lock?

> +#[repr(transparent)]
> +pub struct VmAreaRef {
> +    vma: Opaque<bindings::vm_area_struct>,
> +}
> +
> +// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
> +// matter what the vma flags are.

typo: 'or strong' -> 'or stronger'.

Nitty, but perhaps say 'Methods must be usable' rather than 'they' to be
totally clear.
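
i.e.:

    // Methods you can call when holding the mmap or vma read lock (or
    // stronger). Methods must be usable no matter what the vma flags
    // are.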

> +impl VmAreaRef {
> +    /// Access a virtual memory area given a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> +    /// (or stronger) is held for at least the duration of 'a.

Well, VMA locks make this more complicated, in that we can just hold a VMA
lock. But again, perhaps worth doing this incrementally and just talking
about
mmap locks to start with.

However since we reference VMA locks elsewhere, we should reference them
here (and probably worth mentioning that they are not yet abstracted).
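
So perhaps something like:

    /// # Safety
    ///
    /// Callers must ensure that `vma` is valid for the duration of 'a,
    /// and that the mmap read lock, or the vma read lock (not yet
    /// abstracted in Rust), or stronger, is held for at least the
    /// duration of 'a.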

> +    #[inline]
> +    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> +        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> +        unsafe { &*vma.cast() }
> +    }
> +
> +    /// Returns a raw pointer to this area.
> +    #[inline]
> +    pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> +        self.vma.get()
> +    }
> +
> +    /// Returns the flags associated with the virtual memory area.
> +    ///
> +    /// The possible flags are a combination of the constants in [`flags`].
> +    #[inline]
> +    pub fn flags(&self) -> vm_flags_t {
> +        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> +        // access is not a data race.
> +        unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }

Hm what's up with this __bindgen_anon_N stuff?

> +    }
> +
> +    /// Returns the start address of the virtual memory area.
> +    #[inline]
> +    pub fn start(&self) -> usize {

Is usize guaranteed to be equivalent to unsigned long?
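
For what it's worth, unsigned long is pointer-sized on every
architecture Linux supports, so it always matches usize in the kernel.
If we wanted to encode that assumption, a compile-time check would do,
e.g.:

    const _: () = assert!(
        core::mem::size_of::<usize>()
            == core::mem::size_of::<core::ffi::c_ulong>()
    );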

> +        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> +        // access is not a data race.
> +        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
> +    }
> +
> +    /// Returns the end address of the virtual memory area.

Worth mentioning that it's an _exclusive_ end.
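
i.e.:

    /// Returns the end address (exclusive) of the virtual memory area.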

> +    #[inline]
> +    pub fn end(&self) -> usize {
> +        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> +        // access is not a data race.
> +        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> +    }
> +
> +    /// Unmap pages in the given page range.

This needs some more description, as 'unmapping' pages is unfortunately an
overloaded term in the kernel and this very much might confuse people as
opposed to e.g. munmap()'ing a range.

I'd say something like 'clear page table mappings for the range at the leaf
level, leaving all other page tables intact, freeing any memory referenced
by the VMA in this range (anonymous memory is completely freed, file-backed
memory has its reference count on page cache folios dropped, any dirty
data will still be written back to disk as usual)'.
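
As a doc comment, roughly:

    /// Zap pages in the given page range.
    ///
    /// This clears page table mappings for the range at the leaf level,
    /// leaving all other page tables intact, and frees any memory
    /// referenced by the VMA in this range. Anonymous memory is
    /// completely freed; file-backed memory has its reference count on
    /// the page cache folios dropped, and any dirty data will still be
    /// written back to disk as usual.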

> +    #[inline]
> +    pub fn zap_page_range_single(&self, address: usize, size: usize) {
> +        // SAFETY: By the type invariants, the caller has read access to this VMA, which is
> +        // sufficient for this method call. This method has no requirements on the vma flags. Any
> +        // value of `address` and `size` is allowed.
> +        unsafe {
> +            bindings::zap_page_range_single(

Hm, weirdly I see this in rust/bindings/bindings_generated.rs but not in
rust/helpers/mm.c. Is this intended?

Is this meant to be generated _from_ somewhere? Is something missing for
that?

> +                self.as_ptr(),
> +                address as _,
> +                size as _,
> +                core::ptr::null_mut(),
> +            )
> +        };
> +    }
> +}
> +
> +/// The integer type used for vma flags.
> +#[doc(inline)]
> +pub use bindings::vm_flags_t;

Where do you declare this type?

> +
> +/// All possible flags for [`VmAreaRef`].

Well, you've not specified all of them, as there are some specialist
ones and ones that depend on config flags, so maybe worth just saying
'core' flags?

> +pub mod flags {

Pure rust noobie question (again...) but what is a 'mod'?
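
A 'mod' is a module, Rust's namespacing unit; here it just groups the
flag constants so that callers write e.g. `virt::flags::READ`. Minimal
sketch:

    mod flags {
        pub const READ: u64 = 1 << 0;
    }

    fn readable(vm_flags: u64) -> bool {
        vm_flags & flags::READ != 0
    }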

> +    use super::vm_flags_t;
> +    use crate::bindings;
> +
> +    /// No flags are set.
> +    pub const NONE: vm_flags_t = bindings::VM_NONE as _;

This is strictly not a flag, as it is the 0 value (and it is used
'cleverly' in kernel code when we have flags that are defined under some
circumstances but not others: we can then assign those flags to VM_NONE
when not available, ensuring all |= ... operations are no-ops and all &
... expressions evaluate to false), but I guess it doesn't matter all
too much, right?
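
In Rust terms the same trick would look something like this (a sketch;
VM_SOFTDIRTY really is config-dependent in C, defined as VM_NONE when
CONFIG_MEM_SOFT_DIRTY is off):

    #[cfg(CONFIG_MEM_SOFT_DIRTY)]
    pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
    // When compiled out, fall back to NONE so that `f | SOFTDIRTY` is
    // a no-op and `f & SOFTDIRTY != 0` is always false.
    #[cfg(not(CONFIG_MEM_SOFT_DIRTY))]
    pub const SOFTDIRTY: vm_flags_t = NONE;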

> +
> +    /// Mapping allows reads.
> +    pub const READ: vm_flags_t = bindings::VM_READ as _;
> +
> +    /// Mapping allows writes.
> +    pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
> +
> +    /// Mapping allows execution.
> +    pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
> +
> +    /// Mapping is shared.
> +    pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
> +
> +    /// Mapping may be updated to allow reads.
> +    pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
> +
> +    /// Mapping may be updated to allow writes.
> +    pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
> +
> +    /// Mapping may be updated to allow execution.
> +    pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
> +
> +    /// Mapping may be updated to be shared.
> +    pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
> +
> +    /// Page-ranges managed without `struct page`, just pure PFN.
> +    pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
> +
> +    /// Memory mapped I/O or similar.
> +    pub const IO: vm_flags_t = bindings::VM_IO as _;
> +
> +    /// Do not copy this vma on fork.
> +    pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
> +
> +    /// Cannot expand with mremap().
> +    pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
> +
> +    /// Lock the pages covered when they are faulted in.
> +    pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
> +
> +    /// Is a VM accounted object.
> +    pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
> +
> +    /// should the VM suppress accounting.
> +    pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
> +
> +    /// Huge TLB Page VM.
> +    pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
> +
> +    /// Synchronous page faults.

Might be worth mentioning that this is DAX-specific only.
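
e.g.:

    /// Synchronous page faults (DAX-specific).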

> +    pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
> +
> +    /// Architecture-specific flag.
> +    pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
> +
> +    /// Wipe VMA contents in child..

Typo: '..'. Also probably worth saying 'wipe VMA contents in child on
fork'.
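
i.e.:

    /// Wipe VMA contents in child on fork.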

> +    pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
> +
> +    /// Do not include in the core dump.
> +    pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
> +
> +    /// Not soft dirty clean area.
> +    pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
> +
> +    /// Can contain `struct page` and pure PFN pages.
> +    pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
> +
> +    /// MADV_HUGEPAGE marked this vma.
> +    pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
> +
> +    /// MADV_NOHUGEPAGE marked this vma.
> +    pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
> +
> +    /// KSM may merge identical pages.
> +    pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
> +}
>

I'd say these comments are a bit sparse and need more detail, but they
are literally the comments from include/linux/mm.h and therefore, again,
our fault, so this is fine :)

> --
> 2.47.0.371.ga323438b13-goog
>