Message ID | 20241120-vma-v8-6-eb31425da66b@google.com (mailing list archive) |
---|---|
State | New |
Series | Rust support for mm_struct, vm_area_struct, and mmap |
On Wed, Nov 20, 2024 at 02:50:00PM +0000, Alice Ryhl wrote:
> When setting up a new vma in an mmap call, you have a bunch of extra
> permissions that you normally don't have. This introduces a new
> VmAreaNew type that is used for this case.

Hm I'm confused by what you mean here. What permissions do you mean?

Is this to abstract a VMA as passed by f_op->mmap()? I think it would be
better to explicitly say this if so.

>
> To avoid setting invalid flag values, the methods for clearing
> VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
> if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
> the return value results in a compilation error because the `Result`
> type is marked #[must_use].

This is nice.

Though note that it is explicitly not permitted to permit writability for a
VMA that previously had it disallowed, and we explicitly WARN_ON() this now.
Concretely that means for a VMA where !(vma->vm_flags & VM_MAYWRITE), you
must not vma->vm_flags |= VM_MAYWRITE.

>
> For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
> we add a VM_PFNMAP method, we will need some way to prevent you from
> setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.

Yes this would be unwise.

An aside here - really you should _only_ change flags in this hook (perhaps
of course also initialising vma->vm_private_data state), trying to change
anything _core_ here would be deeply dangerous.

We are far too permissive with this right now, and it's something we want to
try ideally to limit in the future.
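
To make the VM_WRITE / VM_MAYWRITE rule above concrete, here is a minimal user-space sketch, not the kernel code. `MockVma`, `Einval` and `demo_clear_maywrite` are hypothetical names invented for illustration; only the two flag bit values are taken from the kernel headers. It models a `try_clear_maywrite` that refuses to act while VM_WRITE is set, and it deliberately offers no method for setting VM_MAYWRITE again once it has been cleared.

```rust
use std::cell::Cell;

/// Stand-in for the kernel's `vm_flags_t` (assumption: a plain integer is enough here).
pub type VmFlags = u64;

/// Bit values as defined for VM_WRITE and VM_MAYWRITE in the kernel headers.
pub const VM_WRITE: VmFlags = 0x0000_0002;
pub const VM_MAYWRITE: VmFlags = 0x0000_0020;

/// Hypothetical error type standing in for the kernel's `EINVAL`.
#[derive(Debug, PartialEq)]
pub struct Einval;

/// Hypothetical mock of a vma that is still being set up.
pub struct MockVma {
    flags: Cell<VmFlags>,
}

impl MockVma {
    pub fn new(flags: VmFlags) -> Self {
        Self { flags: Cell::new(flags) }
    }

    pub fn flags(&self) -> VmFlags {
        self.flags.get()
    }

    /// Clears VM_MAYWRITE, failing if VM_WRITE is currently set.
    ///
    /// `Result` is `#[must_use]`, so ignoring the return value is flagged
    /// by the compiler. There is intentionally no `set_maywrite`: once the
    /// bit is gone, this interface cannot bring it back.
    pub fn try_clear_maywrite(&self) -> Result<(), Einval> {
        if self.flags() & VM_WRITE != 0 {
            return Err(Einval);
        }
        self.flags.set(self.flags() & !VM_MAYWRITE);
        Ok(())
    }
}

/// Runs the check on a non-writable and on a writable mapping.
pub fn demo_clear_maywrite() -> (Result<(), Einval>, Result<(), Einval>) {
    let read_only = MockVma::new(VM_MAYWRITE);
    let writable = MockVma::new(VM_WRITE | VM_MAYWRITE);
    (read_only.try_clear_maywrite(), writable.try_clear_maywrite())
}
```

In this model the first call succeeds and the second fails, leaving the writable mapping's flags untouched.
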
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/mm/virt.rs | 169 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 168 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> index de7f2338810a..22aff8e77854 100644
> --- a/rust/kernel/mm/virt.rs
> +++ b/rust/kernel/mm/virt.rs
> @@ -6,7 +6,7 @@
>
>  use crate::{
>      bindings,
> -    error::{to_result, Result},
> +    error::{code::EINVAL, to_result, Result},
>      page::Page,
>      types::Opaque,
>  };
> @@ -148,6 +148,173 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
>      }
>  }
>
> +/// A builder for setting up a vma in an `mmap` call.

Would be better to explicitly reference the struct file_operations->mmap()
hook and to say that we should only be updating flags and vm_private_data
here (though perhaps not worth mentioning _that_ if not explicitly exposed by
your interface).

I'm guessing fields are, unless a setter/builder is established, immutable?

> +///
> +/// # Invariants
> +///
> +/// For the duration of 'a, the referenced vma must be undergoing initialization.
> +pub struct VmAreaNew {
> +    vma: VmAreaRef,
> +}
> +
> +// Make all `VmAreaRef` methods available on `VmAreaNew`.
> +impl Deref for VmAreaNew {
> +    type Target = VmAreaRef;
> +
> +    #[inline]
> +    fn deref(&self) -> &VmAreaRef {
> +        &self.vma
> +    }
> +}
> +
> +impl VmAreaNew {
> +    /// Access a virtual memory area given a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that `vma` is undergoing initial vma setup for the duration of 'a.
> +    #[inline]
> +    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> +        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> +        unsafe { &*vma.cast() }
> +    }
> +
> +    /// Internal method for updating the vma flags.
> +    ///
> +    /// # Safety
> +    ///
> +    /// This must not be used to set the flags to an invalid value.
> +    #[inline]
> +    unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
> +        let mut flags = self.flags();
> +        flags |= set;
> +        flags &= !unset;
> +
> +        // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
> +        // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
> +        // The caller promises that this does not set the flags to an invalid value.
> +        unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags = flags };

Hm not sure if this is correct. We explicitly maintain a union in struct
vm_area_struct as:

	union {
		const vm_flags_t vm_flags;
		vm_flags_t __private __vm_flags;
	};

Where vma->vm_flags is const, and then use helpers like vm_flags_init() to
set them, which also do things like assert locks (though not in the init
case, of course).

So really we should at least be updating __vm_flags here, though I'm not sure
how bindgen treats it?

> +    }
> +
> +    /// Set the `VM_MIXEDMAP` flag on this vma.
> +    ///
> +    /// This enables the vma to contain both `struct page` and pure PFN pages. Returns a reference
> +    /// that can be used to call `vm_insert_page` on the vma.
> +    #[inline]
> +    pub fn set_mixedmap(&self) -> &VmAreaMixedMap {
> +        // SAFETY: We don't yet provide a way to set VM_PFNMAP, so this cannot put the flags in an
> +        // invalid state.
> +        unsafe { self.update_flags(flags::MIXEDMAP, 0) };
> +
> +        // SAFETY: We just set `VM_MIXEDMAP` on the vma.
> +        unsafe { VmAreaMixedMap::from_raw(self.vma.as_ptr()) }
> +    }
> +
> +    /// Set the `VM_IO` flag on this vma.
> +    ///
> +    /// This marks the vma as being a memory-mapped I/O region.
> +    #[inline]
> +    pub fn set_io(&self) {
> +        // SAFETY: Setting the VM_IO flag is always okay.
> +        unsafe { self.update_flags(flags::IO, 0) };
> +    }
> +
> +    /// Set the `VM_DONTEXPAND` flag on this vma.
> +    ///
> +    /// This prevents the vma from being expanded with `mremap()`.
> +    #[inline]
> +    pub fn set_dontexpand(&self) {
> +        // SAFETY: Setting the VM_DONTEXPAND flag is always okay.
> +        unsafe { self.update_flags(flags::DONTEXPAND, 0) };
> +    }
> +
> +    /// Set the `VM_DONTCOPY` flag on this vma.
> +    ///
> +    /// This prevents the vma from being copied on fork. This option is only permanent if `VM_IO`
> +    /// is set.
> +    #[inline]
> +    pub fn set_dontcopy(&self) {
> +        // SAFETY: Setting the VM_DONTCOPY flag is always okay.
> +        unsafe { self.update_flags(flags::DONTCOPY, 0) };
> +    }
> +
> +    /// Set the `VM_DONTDUMP` flag on this vma.
> +    ///
> +    /// This prevents the vma from being included in core dumps. This option is only permanent if
> +    /// `VM_IO` is set.
> +    #[inline]
> +    pub fn set_dontdump(&self) {
> +        // SAFETY: Setting the VM_DONTDUMP flag is always okay.
> +        unsafe { self.update_flags(flags::DONTDUMP, 0) };
> +    }
> +
> +    /// Returns whether `VM_READ` is set.
> +    ///
> +    /// This flag indicates whether userspace is mapping this vma as readable.
> +    #[inline]
> +    pub fn get_read(&self) -> bool {
> +        (self.flags() & flags::READ) != 0
> +    }
> +
> +    /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
> +    ///
> +    /// This flag indicates whether userspace is allowed to make this vma readable with
> +    /// `mprotect()`.
> +    #[inline]
> +    pub fn try_clear_mayread(&self) -> Result {
> +        if self.get_read() {
> +            return Err(EINVAL);
> +        }

This is quite nice! Strong(er) typing for the win, again :>)

> +        // SAFETY: Clearing `VM_MAYREAD` is okay when `VM_READ` is not set.
> +        unsafe { self.update_flags(0, flags::MAYREAD) };
> +        Ok(())
> +    }
> +
> +    /// Returns whether `VM_WRITE` is set.
> +    ///
> +    /// This flag indicates whether userspace is mapping this vma as writable.
> +    #[inline]
> +    pub fn get_write(&self) -> bool {
> +        (self.flags() & flags::WRITE) != 0
> +    }
> +
> +    /// Try to clear the `VM_MAYWRITE` flag, failing if `VM_WRITE` is set.
> +    ///
> +    /// This flag indicates whether userspace is allowed to make this vma writable with
> +    /// `mprotect()`.
> +    #[inline]
> +    pub fn try_clear_maywrite(&self) -> Result {
> +        if self.get_write() {
> +            return Err(EINVAL);
> +        }
> +        // SAFETY: Clearing `VM_MAYWRITE` is okay when `VM_WRITE` is not set.
> +        unsafe { self.update_flags(0, flags::MAYWRITE) };
> +        Ok(())
> +    }
> +
> +    /// Returns whether `VM_EXEC` is set.
> +    ///
> +    /// This flag indicates whether userspace is mapping this vma as executable.
> +    #[inline]
> +    pub fn get_exec(&self) -> bool {
> +        (self.flags() & flags::EXEC) != 0
> +    }
> +
> +    /// Try to clear the `VM_MAYEXEC` flag, failing if `VM_EXEC` is set.
> +    ///
> +    /// This flag indicates whether userspace is allowed to make this vma executable with
> +    /// `mprotect()`.
> +    #[inline]
> +    pub fn try_clear_mayexec(&self) -> Result {
> +        if self.get_exec() {
> +            return Err(EINVAL);
> +        }
> +        // SAFETY: Clearing `VM_MAYEXEC` is okay when `VM_EXEC` is not set.
> +        unsafe { self.update_flags(0, flags::MAYEXEC) };
> +        Ok(())
> +    }
> +}
> +
>  /// The integer type used for vma flags.
>  #[doc(inline)]
>  pub use bindings::vm_flags_t;
>
> --
> 2.47.0.371.ga323438b13-goog
>
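
On the open question of how to stop a driver from setting both VM_MIXEDMAP and VM_PFNMAP once a VM_PFNMAP method exists: one possible direction, sketched here in user space only, is to have each setter consume the builder and return a distinct type. This is not what the patch does (the patch hands out `&self` references), and `NewVma`, `MixedMapVma` and `PfnMapVma` are hypothetical names; only the two flag bit values come from the kernel headers.

```rust
/// Stand-in for the kernel's `vm_flags_t` (assumption: a plain integer is enough here).
pub type VmFlags = u64;

/// Bit values as defined for VM_PFNMAP and VM_MIXEDMAP in the kernel headers.
pub const VM_PFNMAP: VmFlags = 0x0000_0400;
pub const VM_MIXEDMAP: VmFlags = 0x1000_0000;

/// Hypothetical builder for a vma that has neither mapping kind chosen yet.
pub struct NewVma {
    flags: VmFlags,
}

/// Hypothetical handle for a vma that has VM_MIXEDMAP set.
pub struct MixedMapVma {
    flags: VmFlags,
}

/// Hypothetical handle for a vma that has VM_PFNMAP set.
pub struct PfnMapVma {
    flags: VmFlags,
}

impl NewVma {
    pub fn new(flags: VmFlags) -> Self {
        Self { flags }
    }

    /// Consumes the builder, so `set_pfnmap` can no longer be called on it.
    pub fn set_mixedmap(self) -> MixedMapVma {
        MixedMapVma { flags: self.flags | VM_MIXEDMAP }
    }

    /// Consumes the builder, so `set_mixedmap` can no longer be called on it.
    pub fn set_pfnmap(self) -> PfnMapVma {
        PfnMapVma { flags: self.flags | VM_PFNMAP }
    }
}

impl MixedMapVma {
    pub fn flags(&self) -> VmFlags {
        self.flags
    }
}

impl PfnMapVma {
    pub fn flags(&self) -> VmFlags {
        self.flags
    }
}
```

With this shape, calling both setters on the same `NewVma` is rejected at compile time as a use after move. Whether by-value consumption fits the `from_raw` reference model used in the series is a separate question that this sketch does not answer.
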