From patchwork Thu Feb 13 11:04:00 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973102
From: Alice Ryhl <aliceryhl@google.com>
Date: Thu, 13 Feb 2025 11:04:00 +0000
Subject: [PATCH v14 1/8] mm: rust: add abstraction for struct mm_struct
Message-ID: <20250213-vma-v14-1-b29c47ab21f5@google.com>
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
    John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
    Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
    Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl,
    Balbir Singh

These abstractions allow you to reference a `struct mm_struct` using
both mmgrab and mmget refcounts. This is done using two Rust types:

 * Mm - represents an mm_struct where you don't know anything about the
   value of mm_users.
 * MmWithUser - represents an mm_struct where you know at compile time
   that mm_users is non-zero.

This allows us to encode in the type system whether a method requires
that mm_users is non-zero or not. For instance, you can always call
`mmget_not_zero`, but you can only call `mmap_read_lock` when mm_users
is non-zero.

The struct is called Mm to keep consistency with the C side.

The ability to obtain `current->mm` is added later in this series.
Acked-by: Lorenzo Stoakes
Acked-by: Balbir Singh
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers/helpers.c |   1 +
 rust/helpers/mm.c      |  39 +++++++++
 rust/kernel/lib.rs     |   1 +
 rust/kernel/mm.rs      | 209 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 250 insertions(+)

diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index 0640b7e115be..97cfc2d29f5e 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -18,6 +18,7 @@
 #include "io.c"
 #include "jump_label.c"
 #include "kunit.c"
+#include "mm.c"
 #include "mutex.c"
 #include "page.c"
 #include "platform.c"
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
new file mode 100644
index 000000000000..7201747a5d31
--- /dev/null
+++ b/rust/helpers/mm.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+
+void rust_helper_mmgrab(struct mm_struct *mm)
+{
+	mmgrab(mm);
+}
+
+void rust_helper_mmdrop(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
+void rust_helper_mmget(struct mm_struct *mm)
+{
+	mmget(mm);
+}
+
+bool rust_helper_mmget_not_zero(struct mm_struct *mm)
+{
+	return mmget_not_zero(mm);
+}
+
+void rust_helper_mmap_read_lock(struct mm_struct *mm)
+{
+	mmap_read_lock(mm);
+}
+
+bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
+{
+	return mmap_read_trylock(mm);
+}
+
+void rust_helper_mmap_read_unlock(struct mm_struct *mm)
+{
+	mmap_read_unlock(mm);
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 496ed32b0911..9cf35fbff356 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -57,6 +57,7 @@
 pub mod kunit;
 pub mod list;
 pub mod miscdevice;
+pub mod mm;
 #[cfg(CONFIG_NET)]
 pub mod net;
 pub mod of;
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
new file mode 100644
index 000000000000..2fb5f440af60
--- /dev/null
+++ b/rust/kernel/mm.rs
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Memory management.
+//!
+//! This module deals with managing the address space of userspace processes. Each process has an
+//! instance of [`Mm`], which keeps track of multiple VMAs (virtual memory areas). Each VMA
+//! corresponds to a region of memory that the userspace process can access, and the VMA lets you
+//! control what happens when userspace reads or writes to that region of memory.
+//!
+//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
+
+use crate::{
+    bindings,
+    types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
+};
+use core::{ops::Deref, ptr::NonNull};
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This represents the address space of a userspace process, so each process has one `Mm`
+/// instance. It may hold many VMAs internally.
+///
+/// There is a counter called `mm_users` that counts the users of the address space; this includes
+/// the userspace process itself, but can also include kernel threads accessing the address space.
+/// Once `mm_users` reaches zero, this indicates that the address space can be destroyed. To access
+/// the address space, you must prevent `mm_users` from reaching zero while you are accessing it.
+/// The [`MmWithUser`] type represents an address space where this is guaranteed, and you can
+/// create one using [`mmget_not_zero`].
+///
+/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmgrab`.
+///
+/// [`mmget_not_zero`]: Mm::mmget_not_zero
+#[repr(transparent)]
+pub struct Mm {
+    mm: Opaque<bindings::mm_struct>,
+}
+
+// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
+unsafe impl Send for Mm {}
+// SAFETY: All methods on `Mm` can be called in parallel from several threads.
+unsafe impl Sync for Mm {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for Mm {
+    #[inline]
+    fn inc_ref(&self) {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmgrab(self.as_raw()) };
+    }
+
+    #[inline]
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The caller is giving up their refcount.
+        unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
+    }
+}
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
+/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
+/// refcount. It can be used to access the associated address space.
+///
+/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
+#[repr(transparent)]
+pub struct MmWithUser {
+    mm: Mm,
+}
+
+// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
+unsafe impl Send for MmWithUser {}
+// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
+unsafe impl Sync for MmWithUser {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for MmWithUser {
+    #[inline]
+    fn inc_ref(&self) {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmget(self.as_raw()) };
+    }
+
+    #[inline]
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The caller is giving up their refcount.
+        unsafe { bindings::mmput(obj.cast().as_ptr()) };
+    }
+}
+
+// Make all `Mm` methods available on `MmWithUser`.
+impl Deref for MmWithUser {
+    type Target = Mm;
+
+    #[inline]
+    fn deref(&self) -> &Mm {
+        &self.mm
+    }
+}
+
+// These methods are safe to call even if `mm_users` is zero.
+impl Mm {
+    /// Returns a raw pointer to the inner `mm_struct`.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::mm_struct {
+        self.mm.get()
+    }
+
+    /// Obtain a reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
+    /// during the lifetime 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
+        // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
+        // repr(transparent).
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Calls `mmget_not_zero` and returns a handle if it succeeds.
+    #[inline]
+    pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
+        // SAFETY: The pointer is valid since self is a reference.
+        let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
+
+        if success {
+            // SAFETY: We just created an `mmget` refcount.
+            Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
+        } else {
+            None
+        }
+    }
+}
+
+// These methods require `mm_users` to be non-zero.
+impl MmWithUser {
+    /// Obtain a reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
+    /// non-zero for the duration of the lifetime 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
+        // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
+        // to repr(transparent).
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Lock the mmap read lock.
+    #[inline]
+    pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmap_read_lock(self.as_raw()) };
+
+        // INVARIANT: We just acquired the read lock.
+        MmapReadGuard {
+            mm: self,
+            _nts: NotThreadSafe,
+        }
+    }
+
+    /// Try to lock the mmap read lock.
+    #[inline]
+    pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
+        // SAFETY: The pointer is valid since self is a reference.
+        let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
+
+        if success {
+            // INVARIANT: We just acquired the read lock.
+            Some(MmapReadGuard {
+                mm: self,
+                _nts: NotThreadSafe,
+            })
+        } else {
+            None
+        }
+    }
+}
+
+/// A guard for the mmap read lock.
+///
+/// # Invariants
+///
+/// This `MmapReadGuard` guard owns the mmap read lock.
+pub struct MmapReadGuard<'a> {
+    mm: &'a MmWithUser,
+    // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
+    _nts: NotThreadSafe,
+}
+
+impl Drop for MmapReadGuard<'_> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the read lock by the type invariants.
+        unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
+    }
+}

From patchwork Thu Feb 13 11:04:01 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973103
From: Alice Ryhl <aliceryhl@google.com>
Date: Thu, 13 Feb 2025 11:04:01 +0000
Subject: [PATCH v14 2/8] mm: rust: add vm_area_struct methods that require read access
Message-ID: <20250213-vma-v14-2-b29c47ab21f5@google.com>
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
    John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
    Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
    Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

This adds a type called VmaRef which is used when referencing a vma
that you have read access to. Here, read access means that you hold
either the mmap read lock or the vma read lock (or stronger).

Additionally, a vma_lookup method is added to the mmap read guard,
which enables you to obtain a &VmaRef in safe Rust code.

This patch only provides a way to lock the mmap read lock, but a
follow-up patch also provides a way to just lock the vma read lock.
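A short sketch of how the new guard method is meant to be used
(hypothetical caller code, not taken from the series; `mm` and `addr`
are caller-provided placeholders):

  use kernel::mm::MmWithUser;

  /// Sketch only: returns the bounds of the vma containing `addr`, if any.
  fn vma_bounds(mm: &MmWithUser, addr: usize) -> Option<(usize, usize)> {
      // Take the mmap read lock; it is released when `guard` goes out of scope.
      let guard = mm.mmap_read_lock();
      let vma = guard.vma_lookup(addr)?;
      // `vma` borrows from `guard`, so it cannot outlive the lock.
      Some((vma.start(), vma.end()))
  }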
Acked-by: Lorenzo Stoakes
Reviewed-by: Jann Horn
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers/mm.c      |   6 ++
 rust/kernel/mm.rs      |  23 ++++++
 rust/kernel/mm/virt.rs | 210 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 239 insertions(+)

diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7201747a5d31..7b72eb065a3e 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
 {
 	mmap_read_unlock(mm);
 }
+
+struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
+					      unsigned long addr)
+{
+	return vma_lookup(mm, addr);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index 2fb5f440af60..8b19dde24978 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -17,6 +17,8 @@
 };
 use core::{ops::Deref, ptr::NonNull};
 
+pub mod virt;
+
 /// A wrapper for the kernel's `struct mm_struct`.
 ///
 /// This represents the address space of a userspace process, so each process has one `Mm`
@@ -200,6 +202,27 @@ pub struct MmapReadGuard<'a> {
     _nts: NotThreadSafe,
 }
 
+impl<'a> MmapReadGuard<'a> {
+    /// Look up a vma at the given address.
+    #[inline]
+    pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmaRef> {
+        // SAFETY: By the type invariants we hold the mmap read guard, so we can safely call this
+        // method. Any value is okay for `vma_addr`.
+        let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr) };
+
+        if vma.is_null() {
+            None
+        } else {
+            // SAFETY: We just checked that a vma was found, so the pointer references a valid vma.
+            //
+            // Furthermore, the returned vma is still under the protection of the read lock guard
+            // and can be used while the mmap read lock is still held. That the vma is not used
+            // after the MmapReadGuard gets dropped is enforced by the borrow-checker.
+            unsafe { Some(virt::VmaRef::from_raw(vma)) }
+        }
+    }
+}
+
 impl Drop for MmapReadGuard<'_> {
     #[inline]
     fn drop(&mut self) {
diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
new file mode 100644
index 000000000000..a66be649f0b8
--- /dev/null
+++ b/rust/kernel/mm/virt.rs
@@ -0,0 +1,210 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Virtual memory.
+//!
+//! This module deals with managing a single VMA in the address space of a userspace process. Each
+//! VMA corresponds to a region of memory that the userspace process can access, and the VMA lets
+//! you control what happens when userspace reads or writes to that region of memory.
+//!
+//! The module has several different Rust types that all correspond to the C type called
+//! `vm_area_struct`. The different structs represent what kind of access you have to the VMA, e.g.
+//! [`VmaRef`] is used when you hold the mmap or vma read lock. Using the appropriate struct
+//! ensures that you can't, for example, accidentally call a function that requires holding the
+//! write lock when you only hold the read lock.
+
+use crate::{bindings, mm::MmWithUser, types::Opaque};
+
+/// A wrapper for the kernel's `struct vm_area_struct` with read access.
+///
+/// It represents an area of virtual memory.
+///
+/// # Invariants
+///
+/// The caller must hold the mmap read lock or the vma read lock.
+#[repr(transparent)]
+pub struct VmaRef {
+    vma: Opaque<bindings::vm_area_struct>,
+}
+
+// Methods you can call when holding the mmap or vma read lock (or stronger). They must be usable
+// no matter what the vma flags are.
+impl VmaRef {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap or vma
+    /// read lock (or stronger) is held for at least the duration of 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Returns a raw pointer to this area.
+    #[inline]
+    pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
+        self.vma.get()
+    }
+
+    /// Access the underlying `mm_struct`.
+    #[inline]
+    pub fn mm(&self) -> &MmWithUser {
+        // SAFETY: By the type invariants, this `vm_area_struct` is valid and we hold the mmap/vma
+        // read lock or stronger. This implies that the underlying mm has a non-zero value of
+        // `mm_users`.
+        unsafe { MmWithUser::from_raw((*self.as_ptr()).vm_mm) }
+    }
+
+    /// Returns the flags associated with the virtual memory area.
+    ///
+    /// The possible flags are a combination of the constants in [`flags`].
+    #[inline]
+    pub fn flags(&self) -> vm_flags_t {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags }
+    }
+
+    /// Returns the (inclusive) start address of the virtual memory area.
+    #[inline]
+    pub fn start(&self) -> usize {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start }
+    }
+
+    /// Returns the (exclusive) end address of the virtual memory area.
+    #[inline]
+    pub fn end(&self) -> usize {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end }
+    }
+
+    /// Zap pages in the given page range.
+    ///
+    /// This clears page table mappings for the range at the leaf level, leaving all other page
+    /// tables intact, and freeing any memory referenced by the VMA in this range. That is,
+    /// anonymous memory is completely freed, file-backed memory has its reference count on page
+    /// cache folios dropped, and any dirty data will still be written back to disk as usual.
+    ///
+    /// It may seem odd that we clear at the leaf level; this is however a product of the page
+    /// table structure used to map physical memory into a virtual address space - each virtual
+    /// address actually consists of a bitmap of array indices into page tables, which form a
+    /// hierarchical page table level structure.
+    ///
+    /// As a result, each page table level maps a multiple of page table levels below, and thus
+    /// spans ever increasing ranges of pages. At the leaf or PTE level, we map the actual physical
+    /// memory.
+    ///
+    /// It is here where a zap operates, as it is the only place we can be certain of clearing
+    /// without impacting any other virtual mappings. It is an implementation detail as to whether
+    /// the kernel goes further in freeing unused page tables, but for the purposes of this
+    /// operation we must only assume that the leaf level is cleared.
+    #[inline]
+    pub fn zap_page_range_single(&self, address: usize, size: usize) {
+        let (end, did_overflow) = address.overflowing_add(size);
+        if did_overflow || address < self.start() || self.end() < end {
+            // TODO: call WARN_ONCE once Rust version of it is added
+            return;
+        }
+
+        // SAFETY: By the type invariants, the caller has read access to this VMA, which is
+        // sufficient for this method call. This method has no requirements on the vma flags. The
+        // address range is checked to be within the vma.
+        unsafe {
+            bindings::zap_page_range_single(self.as_ptr(), address, size, core::ptr::null_mut())
+        };
+    }
+}
+
+/// The integer type used for vma flags.
+#[doc(inline)]
+pub use bindings::vm_flags_t;
+
+/// All possible flags for [`VmaRef`].
+pub mod flags {
+    use super::vm_flags_t;
+    use crate::bindings;
+
+    /// No flags are set.
+    pub const NONE: vm_flags_t = bindings::VM_NONE as _;
+
+    /// Mapping allows reads.
+    pub const READ: vm_flags_t = bindings::VM_READ as _;
+
+    /// Mapping allows writes.
+    pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
+
+    /// Mapping allows execution.
+    pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
+
+    /// Mapping is shared.
+    pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
+
+    /// Mapping may be updated to allow reads.
+    pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
+
+    /// Mapping may be updated to allow writes.
+    pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
+
+    /// Mapping may be updated to allow execution.
+    pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
+
+    /// Mapping may be updated to be shared.
+    pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
+
+    /// Page-ranges managed without `struct page`, just pure PFN.
+    pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
+
+    /// Memory mapped I/O or similar.
+    pub const IO: vm_flags_t = bindings::VM_IO as _;
+
+    /// Do not copy this vma on fork.
+    pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
+
+    /// Cannot expand with mremap().
+    pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
+
+    /// Lock the pages covered when they are faulted in.
+    pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
+
+    /// Is a VM accounted object.
+    pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
+
+    /// Should the VM suppress accounting.
+    pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
+
+    /// Huge TLB Page VM.
+    pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
+
+    /// Synchronous page faults. (DAX-specific)
+    pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
+
+    /// Architecture-specific flag.
+    pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
+
+    /// Wipe VMA contents in child on fork.
+    pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
+
+    /// Do not include in the core dump.
+    pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
+
+    /// Not soft dirty clean area.
+    pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
+
+    /// Can contain `struct page` and pure PFN pages.
+    pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
+
+    /// MADV_HUGEPAGE marked this vma.
+    pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
+
+    /// MADV_NOHUGEPAGE marked this vma.
+    pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
+
+    /// KSM may merge identical pages.
+    pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
+}

From patchwork Thu Feb 13 11:04:02 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973104
From: Alice Ryhl <aliceryhl@google.com>
Date: Thu, 13 Feb 2025 11:04:02 +0000
Subject: [PATCH v14 3/8] mm: rust: add vm_insert_page
Message-ID: <20250213-vma-v14-3-b29c47ab21f5@google.com>
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
    John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
    Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
    Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
flag, so we introduce a new type to keep track of such vmas.

The approach used in this patch assumes that we will not need to encode
many flag combinations in the type. I don't think we need to encode more
than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
becomes necessary, using generic parameters in a single type would scale
better as the number of flags increases.
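As a usage sketch (hypothetical caller code, not part of the patch;
returning EINVAL here is an arbitrary illustrative choice), the
VM_MIXEDMAP-only method is only reachable after a successful
`as_mixedmap_vma` check:

  use kernel::mm::virt::VmaRef;
  use kernel::page::Page;
  use kernel::prelude::*;

  /// Sketch only: insert `page` at `addr` if this vma was created with
  /// VM_MIXEDMAP, and fail otherwise.
  fn insert_if_mixedmap(vma: &VmaRef, addr: usize, page: &Page) -> Result {
      match vma.as_mixedmap_vma() {
          // `vm_insert_page` is only reachable through `VmaMixedMap`.
          Some(mixedmap) => mixedmap.vm_insert_page(addr, page),
          None => Err(EINVAL),
      }
  }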
+ to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address, page.as_ptr()) }) + } } /// The integer type used for vma flags. From patchwork Thu Feb 13 11:04:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 13973105 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5FBCC021A0 for ; Thu, 13 Feb 2025 11:04:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1BB8D6B0095; Thu, 13 Feb 2025 06:04:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 16AC86B0096; Thu, 13 Feb 2025 06:04:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E8C706B0098; Thu, 13 Feb 2025 06:04:49 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id C975E6B0095 for ; Thu, 13 Feb 2025 06:04:49 -0500 (EST) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 69A421205EF for ; Thu, 13 Feb 2025 11:04:49 +0000 (UTC) X-FDA: 83114638698.10.9FA1948 Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) by imf08.hostedemail.com (Postfix) with ESMTP id 72260160012 for ; Thu, 13 Feb 2025 11:04:47 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=xYrBHyg0; spf=pass (imf08.hostedemail.com: domain of 3ztGtZwkKCG0LWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com designates 209.85.221.73 as permitted sender) smtp.mailfrom=3ztGtZwkKCG0LWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739444687; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=/uyBvvUIBqhEGg3k1MSKl3P/NCSf2A+x8xfYmCFIPos=; b=pqRVbxQ+LtIbl/VyqIwkguEbjHsi763RcnDRpEtBjxEqMLlo7+TcbpqR6+iqRU72xIK4yx XT2qWs7UOfb4mO58CzH61RplXm9WgL8w1+rf8tW3tdv7N+osiZamUumrrQxtPHpKdtVf8J gUxZPVRR25Wu/1uNloKOOu1hZcTiZ5Q= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=xYrBHyg0; spf=pass (imf08.hostedemail.com: domain of 3ztGtZwkKCG0LWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com designates 209.85.221.73 as permitted sender) smtp.mailfrom=3ztGtZwkKCG0LWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739444687; a=rsa-sha256; cv=none; b=pkar0qIJbhdEiE0iSB9UgIDcjNhYf7fKtGgw02p3XdjCLS39ivZXjvZYJi8CqZIKEuVd+/ Jv2tIegyfKKdrlTmK7AhSeJKj0/K4NdK5yVNWu5J6TtuCdS53Pdp8A05LhX9ykQ4Ptx8a8 FLo8XENL6LWerdLmlI5HrrZEZm0Oc8o= Received: by mail-wr1-f73.google.com with SMTP id ffacd0b85a97d-38dcb65c717so434356f8f.2 for ; Thu, 13 Feb 2025 03:04:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1739444686; x=1740049486; 
darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=/uyBvvUIBqhEGg3k1MSKl3P/NCSf2A+x8xfYmCFIPos=; b=xYrBHyg0tjvbE7kd7HyeIrlsbFNL5167WpKiVOcCWc4NNbGEWYXgUc67Fz768mjdfu HoznvFVcIyA30c2F5k8B8mbE1kR249thMpuf1UvhVEywb/9HE+Fv4DB0ppb6EdK+Q4iQ 0WOynN14vB3GYTMpdvPZ08/Uxr3I8w4REgS4ty6wAO96oXEWhJ9AMhnhsfqgpeENIAKN wkkwL1uN0JqgS1ClZEEz8zLcaCiI7OmsaI5gt67WSt/11ciGhiHck3+HEQPWPTxreL5w O+TTbn7UB2h6ZcOvrfKi2RKBSCc6zU+z8/Asj+Fh4RmAramit3og1cf7tQw9aVaq1qAn nKeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1739444686; x=1740049486; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/uyBvvUIBqhEGg3k1MSKl3P/NCSf2A+x8xfYmCFIPos=; b=vXl1JSa6yFeBBKEcOCUHHTy3HRTjZ7TXsuDNA5+X2n9OAV8yS2QdBLrZHrRzDiNhtR XMf0ubSQiYtq5NVZlNpOuVYdWHKSPzfKWQ9hAsIzfcxTWt7KOk5C5Xv5J2uSXyj7oLxA MIKJaLppzWaryis8VfSkPs6TbQ3+oZW1t4sJqa/F7fzHu5zCcQURAFW/kfPwvlGUybjy 2sBWRL9r5yr7oUajISdrGRUbd5bSkT/pj1lCKmS2fiyce6sGy1dUUiofiNhRHGnBwE3W q2r0OXRGz/I6VmGb2xX4p7WNiUSUhocTd3CZP77fzVJBAUmvxcueSRGD7PZLZmdN75R8 3tgg== X-Forwarded-Encrypted: i=1; AJvYcCVLCNUrSioFlMXGYF3ic56JpzYOt1gH2TcAqBMv962XCpAEdAsCS6bTExwIwgAcbA6tbuq/tPamNg==@kvack.org X-Gm-Message-State: AOJu0YwLuHBY1zE8hyXXg/J73ZDpKrzomC7kbqjgkzBUQYxQggjgRDFD byqlF/3NPdWIv4whBSnSJmfpUfcla5QwIeiywRwOpa9v0IJ9PVdquWs2+TJKjEEie89zZcZazjU BZiOzEB8UcrVrZQ== X-Google-Smtp-Source: AGHT+IHfpu4LwUir1xs0IVZCM1IC0uXcvmusDtyLHlflipwlekCf6G6k2n3LWvqP3SiylhkkpqKNBH+Ir9bJCqU= X-Received: from wmbay19.prod.google.com ([2002:a05:600c:1e13:b0:439:5636:735f]) (user=aliceryhl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:3b9e:b0:439:3dc0:29b6 with SMTP id 5b1f17b1804b1-43960169525mr31940045e9.2.1739444686140; Thu, 13 Feb 2025 03:04:46 -0800 (PST) Date: Thu, 13 Feb 2025 11:04:03 +0000 In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com> Mime-Version: 1.0 References: <20250213-vma-v14-0-b29c47ab21f5@google.com> X-Developer-Key: i=aliceryhl@google.com; a=openpgp; fpr=49F6C1FAA74960F43A5B86A1EE7A392FDE96209F X-Developer-Signature: v=1; a=openpgp-sha256; l=4092; i=aliceryhl@google.com; h=from:subject:message-id; bh=7YH1vGH4J5NrwgKxb07UMrGiCmpjxEls6xw94mytwWs=; b=owEBbQKS/ZANAwAKAQRYvu5YxjlGAcsmYgBnrdHAisosHMy/OP18+ZFSNrfkNK5udMfSbyYBd /UYV8VUP0+JAjMEAAEKAB0WIQSDkqKUTWQHCvFIvbIEWL7uWMY5RgUCZ63RwAAKCRAEWL7uWMY5 Rto2D/9a3RX0gRbzHxMqA63smy2e5oFZ4X2tYbSXIkyIwN6VWd8YZtH1PcFwUa5wK0pU8XdojFd AYyeERKnFDljzjTqnvwqBfphg0liowGbNq23niy2uCffaJxJo4UFB/Drit3FPKggtxzHYqlSPSs 7luyMeiT/YiEiq1s08asYDvnPjbigsK7Vl5UaX+i6F1pHlNeaJ750Ohu+Z8Ts3OEu/MyCC69Oso 9buHg7+vnmGaOGJHFS4eDB16eYOTTcvgOIZ2ChRuao5jPRoBo2GjNWXxjy4vV7QfGrE4Ji5AqUn IyBvpRKcct/CWYF5YcWkNEDGIvx5sqofRxhE3uhD2+JdwrCefrkQYizdTzFxvFfh7VAAue0FLUa GNmYd2K0SvmP1Oikt07eoNMllm+se3DMiQdMea/I/HQYLf4vH/zEd09fKFYz8O4vsAl/nsIGGyw ulMOzd64acCq2XDNQH9O4ocWZM0V905RHEID9LA86bKDSVZiyp+fgDpbmDkwTMfYhKgTi3hUlVM O+oKYcf9ikY3RTB2rW+zZRdXGN+6G39veH40gcvbcFon8yw4J7Q/gd61g5QGdnKCz+a4skqrcrQ rW24zBO2XYUM9mStTg2QYQbtyTURioRrsA18aFyqz3iWziTvzshu0DHWgtkaeah3hggl6BIqBfb DtcnZcK0amgnBGA== X-Mailer: b4 0.13.0 Message-ID: <20250213-vma-v14-4-b29c47ab21f5@google.com> Subject: [PATCH v14 4/8] mm: rust: add lock_vma_under_rcu From: Alice Ryhl To: Miguel Ojeda , Matthew Wilcox , Lorenzo Stoakes , Vlastimil Babka , John Hubbard , "Liam R. 
Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspam-User: X-Stat-Signature: ojtbfm811r8nznzskghqsdwjsitetceh X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 72260160012 X-HE-Tag: 1739444687-393942 X-HE-Meta: U2FsdGVkX1/RMAZAiyMWb/3cDyLWdJB/0cEwULW8u8dBBy4iB0aw6OswMM9S3d8tI5U/FHXVA93Z+9bM9NnxHgNspk5hTJ7UCbQ8Uda+ZCCOqvUx2hvydvb3RjCce1i5DFQsntONPsnDZFDrz18JXmkwki1O7n7Yguoiva7f791gPSlhQhmwvbi63zkdElS6JbmOB+bfBjkWoh3UESFDNyKN0k8ZCgVhWVOdSaLDEaT66r93kmqjS/JvBZDrktJap90yQGgLB+vWDTfgjFW5aoxaZCUelfLvFQ2iSdT8b89xyQF5L9fcJ3Yp4rShEEzF/4PwL+m8BP6+j0bxxl0WiO1/lcBJtmaDhaYKtp8vIcSCt2nuK/7E24zGSqKwIaXHX6E+TvX+gu3CDlsULy+BcYoQiRHy7as3jhPR+1ZB5fuuwbrXIo4SzsYQHF3Z235QLkWKoQ46erUvE9GQ6LPh3eW1R6Lk0AhWeVQp5jIr3NYw3SHYolfBbzlNQa0xItanxIeTpu76exJJXHrmcwdaM7Tbie2QD6kI09dMbFrBXum6kp8cG51WmUgXtBF/VR5QFPdt/bgHmg2Ke80t83M7sAHHGh8QL2qzgMIAf/fgGLuGXxMB298eKs5mb+CzSAWK684+JH878Re7LeURh6awaCgH6Ie01nahByUHDciPq4SxOkg0tdw/eC44eR4pBDUdVfQx0UgF2T0GerBEAU7FxYoTv3zHSALpY/0jMpdf0CaWSnHvp1Ifhl6NbKU54oMZHMUS9VFkHWrJOzdPJ0CLisf3YbSLb/9nBjJu5VBY+PtZ0+iFy4D5qBVqzHGqxx03ogsjxyNqylHXdo/j273yxUnWxLtBFDsYJSbonAxCIN8ToFWKnImpaIxeRxoM1ZuTBEfJA684vgbPd0aMeNGdenKhSojUPko7v5xt7GkQ3XM6Wp0wzSQBfIXSGZN72YevKq6HEMOSWwoiXAgJkrk 5hhLmewy GnpTYsfurUuhAoIYZx5VYo5WFqU55CbXPH34nQDSj87psIVp9a/pA4tMzMq0f1SFroQh+PPRjZwtnPBF6vyQqcItJC8Lpwl37v/JyR+LpXuYoptE1s68j6yD/sVtsgYj1WKtMVwxS4etivFpEjkmU8ckEtP4s4fkaREKC+0Erd3pAWctlCRHU5QxzSE0biVdOXkWGUSqiRD5anNI65145VeDcWFwonKLu3yB//G11R2la5ETe9YtoeUNFlhl4cL59UZayc1btjKDXQFD5QB3C8T3WkcDrFpaGwTOgwaR8+3cVM13SxI9W8k4tKTYKupr/QZoopEELRBzeTzXtGzSkjPk1PKf6+UxY4w1e5142PwfFZl9x3JpnEOQOSwPukj1Pet4HCtHF9wdpBWZLe0LarbARTSqXBj3aA8czlpdBxoC1h12Zs9i74jy8Jw3vDLlX5MZeXSlRks6zsWqRM+A1vbzMJnWeWymNfG8RUHHKJw2/BYRWygQAY9LN6GJ/7zZI5HuCQH9f60SF4NpL/OBH8SO4mrT6kSQDC1MnFui5oRL6FQFq8aM/2yElNlhm+ZEXi4K5L0s9/AvTt/v/P6cYAvPtUwNQftWT40t/CkoBurpanIOFTqGAbG7R/fzqdn+9EpZprMgb1xCp7W0rdvlZXWgeoQGvWmsxZ54FMbXonMPSKNRTI53AIVzZ+aMXSfVuWkpV3dtgrMGP7xBfJ+x5/v3y20dSIyqd74kf X-Bogosity: Unsure, tests=bogofilter, spamicity=0.475864, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Currently, the binder driver always uses the mmap lock to make changes to its vma. Because the mmap lock is global to the process, this can involve significant contention. However, the kernel has a feature called per-vma locks, which can significantly reduce contention. For example, you can take a vma lock in parallel with an mmap write lock. This is important because contention on the mmap lock has been a long-term recurring challenge for the Binder driver. This patch introduces support for using `lock_vma_under_rcu` from Rust. The Rust Binder driver will be able to use this to reduce contention on the mmap lock. 
Acked-by: Lorenzo Stoakes
Reviewed-by: Jann Horn
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/helpers/mm.c |  5 +++++
 rust/kernel/mm.rs | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7b72eb065a3e..81b510c96fd2 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
 {
 	return vma_lookup(mm, addr);
 }
+
+void rust_helper_vma_end_read(struct vm_area_struct *vma)
+{
+	vma_end_read(vma);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index 8b19dde24978..618aa48e00a4 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -18,6 +18,7 @@ use core::{ops::Deref, ptr::NonNull};
 
 pub mod virt;
+use virt::VmaRef;
 
 /// A wrapper for the kernel's `struct mm_struct`.
 ///
@@ -160,6 +161,36 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
         unsafe { &*ptr.cast() }
     }
 
+    /// Attempt to access a vma using the vma read lock.
+    ///
+    /// This is an optimistic trylock operation, so it may fail if there is contention. In that
+    /// case, you should fall back to taking the mmap read lock.
+    ///
+    /// When per-vma locks are disabled, this always returns `None`.
+    #[inline]
+    pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
+        #[cfg(CONFIG_PER_VMA_LOCK)]
+        {
+            // SAFETY: Calling `bindings::lock_vma_under_rcu` is always okay given an mm where
+            // `mm_users` is non-zero.
+            let vma = unsafe { bindings::lock_vma_under_rcu(self.as_raw(), vma_addr) };
+            if !vma.is_null() {
+                return Some(VmaReadGuard {
+                    // SAFETY: If `lock_vma_under_rcu` returns a non-null ptr, then it points at a
+                    // valid vma. The vma is stable for as long as the vma read lock is held.
+                    vma: unsafe { VmaRef::from_raw(vma) },
+                    _nts: NotThreadSafe,
+                });
+            }
+        }
+
+        // Silence warnings about unused variables.
+        #[cfg(not(CONFIG_PER_VMA_LOCK))]
+        let _ = vma_addr;
+
+        None
+    }
+
     /// Lock the mmap read lock.
     #[inline]
     pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
@@ -230,3 +261,32 @@ fn drop(&mut self) {
         unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
     }
 }
+
+/// A guard for the vma read lock.
+///
+/// # Invariants
+///
+/// This `VmaReadGuard` guard owns the vma read lock.
+pub struct VmaReadGuard<'a> {
+    vma: &'a VmaRef,
+    // `vma_end_read` must be called on the same thread as where the lock was taken
+    _nts: NotThreadSafe,
+}
+
+// Make all `VmaRef` methods available on `VmaReadGuard`.
+impl Deref for VmaReadGuard<'_> {
+    type Target = VmaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmaRef {
+        self.vma
+    }
+}
+
+impl Drop for VmaReadGuard<'_> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the read lock by the type invariants.
+        unsafe { bindings::vma_end_read(self.vma.as_ptr()) };
+    }
+}
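
As a rough illustration of the fallback pattern described above (this sketch is
not part of the patch), a caller might look as follows. `handle_fault` is a
hypothetical driver callback, and the `vma_lookup` accessor on the mmap read
guard is assumed from earlier patches in this series:

    use kernel::error::code::EFAULT;
    use kernel::mm::{virt::VmaRef, MmWithUser};
    use kernel::prelude::*;

    fn handle_fault(_vma: &VmaRef, _addr: usize) -> Result {
        // Hypothetical per-vma work goes here.
        Ok(())
    }

    fn with_vma_locked(mm: &MmWithUser, addr: usize) -> Result {
        // Optimistic path: only the per-vma read lock is taken.
        if let Some(vma) = mm.lock_vma_under_rcu(addr) {
            return handle_fault(&vma, addr);
        }

        // Contention, or per-vma locks disabled: fall back to the mmap read lock.
        let mmap = mm.mmap_read_lock();
        match mmap.vma_lookup(addr) {
            Some(vma) => handle_fault(vma, addr),
            None => Err(EFAULT),
        }
    }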

From patchwork Thu Feb 13 11:04:04 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973106
Date: Thu, 13 Feb 2025 11:04:04 +0000
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
Message-ID: <20250213-vma-v14-5-b29c47ab21f5@google.com>
Subject: [PATCH v14 5/8] mm: rust: add mmput_async support
From: Alice Ryhl <aliceryhl@google.com>
Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspamd-Queue-Id: 8A2E940014 X-Stat-Signature: wspu3d8iuicnhckhnreckcckkkjy5caq X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1739444689-650641 X-HE-Meta: U2FsdGVkX1+wbp51k/N7ZtdiQ0T1JFXEHHtLXxW8gyc1bARKkq88GDBAomrC5G3sOXE5rLiZci/mT9uFFujuH5rOfFEMGgb8MX6AP7EA40Tby1L1rWmJRmokV/3JELh+ixOmQfsrgkAqVgaPrwWG0umupNw5qcr+eyY/1C5YnmKVNIor06wJJCawgbm2a5n+1JM+cY3ObCfWscoRanxl6ZE7c7hX9Gu/c8cNu9STDnxj8IXDPmaNwCBwXSa0xWNfJoDxQPoJZP+DQVapL3ruXGqHaEPB04Y9BGvN+xpZGnztQCbXa0Jx00G52Cotw3u1TNvpYbWrqhCED064SeVdpcG19ogEHGclnK0ELmh+pQNCchZU70UsGEGstqMXjEgz/7oE2jk0Zsm0WZTEY6GA0DSMFxw0YbK2O2yCzkLXhasKqE/vLrtHP80522Xa3ZIHkUmwttGnzM2wm1uaF+02zy3fQLoVFh9Pho0O7o3Ytb+zc6wXuoGvPn7tJ1mXsZvv3mAxj4HoN2C8gR8fG96ORpWeMfrfQSk+r6YWp4HoxiClpRXhSRmJDk3V5u5BfWmNUfn9L7L3PPkc8r7t1kZo11iWG90LWBT7RryEj0BM3zDj8L+MW7/xZDkhm1L/70zbvS0yproyyR880LCdd6OY5WfvTx/xkCrSmffpM+5H4SdnoswfyDkL1l69KoSvLt4AQap3A3mdRrhpDaildq+VXfaFv9FFaEpHfecGylK57juC6kTjwOKcubpAxYJKH9Giq6kFoHdLbuw7mC31Ph44LfZ6E3KVWrg7hyJG0j5rR14sJ2f6O56tpcn3YUZ1z8FdwXqED0kqKMXX22vRjk8vLtObnxsF7K47vFQdNB+Um/4tQVqUh8r6dMWEn92ocKyzsNh4Ti+5o0CAFPMyz1OZuOodf6R9ngrM4vFKI5YjRCN1CjV+RPfaSXPMp0YvZ8KYMsJ+73vq8Vp6GqEfXcO VorANykk v1FS+EJV4SwIEgxzAJFrIekgluWTG/k55vrzRjwasAHaWtmx7lx3pZbzgGvcJZx9ZFfZrPnIuXpiTDYWrlg6z8Y5dKyx/ov7C/xCEkmMDpZHOAgxSyrzVmWElWLePZHHjq1XWzovQzP43YAW5eWvFGaDGnv2Dm9//ddd0sjEXLXs91IkFDWX3t/Ig9zSSFylQgFJnmc28Vtx2dNf8BIkGZtJsr4ixhExAauCB37UVlzRUElF9dy4LC9OeaAi1XLtTyahawEiBBoqC63jtPZ5lM/ep8nocTqilP+OvoBc1fTaq68UfuzYu85xhjw0kcuab0vpY+w8vt1TssgU0rAXGSCKvcXtc3g3zSXIny2P5be71Fsk511fa85YvzQDbDWHIcntyX8CSgNIx98HUXHNyrgOQ00wU1PVgM7Cni03Nmm6sd2pXGaGcRiVlO67lVQod0cZNURAApRoSsEt8Ji9eSNnQ+WdfMO9gWE/aDmxWvjRE6rs8BeTM1x24EfObvXNXWfBMFCm8CfLwSlTdTkkxvAlDYiDHIX93JxrIenl6kaqjya9ZhmTf0hasgRcuB61mVLHLjFX0Uj2KxzohSWldkZZQJGXi9cPcn3l1jouG4WFMYaI1njTUCa3pvgcnistpqatrOHR7a9UJTy7cCxkMSU0YY4405jJix5H+08as5C/1VEORhIqtSWAdn5Ld6HonfmuSLVbgdCyLXbpsCFeqAQErYPvzpXL2rVsT X-Bogosity: Unsure, tests=bogofilter, spamicity=0.458426, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Adds an MmWithUserAsync type that uses mmput_async when dropped but is otherwise identical to MmWithUser. This has to be done using a separate type because the thing we are changing is the destructor. Rust Binder needs this to avoid a certain deadlock. See commit 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for details. It's also needed in the shrinker to avoid cleaning up the mm in the shrinker's context. Reviewed-by: Andreas Hindborg Acked-by: Lorenzo Stoakes (for mm bits) Signed-off-by: Alice Ryhl --- rust/kernel/mm.rs | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs index 618aa48e00a4..42decd311740 100644 --- a/rust/kernel/mm.rs +++ b/rust/kernel/mm.rs @@ -110,6 +110,48 @@ fn deref(&self) -> &Mm { } } +/// A wrapper for the kernel's `struct mm_struct`. +/// +/// This type is identical to `MmWithUser` except that it uses `mmput_async` when dropping a +/// refcount. This means that the destructor of `ARef` is safe to call in atomic +/// context. 

From patchwork Thu Feb 13 11:04:05 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973107
Date: Thu, 13 Feb 2025 11:04:05 +0000
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
Message-ID: <20250213-vma-v14-6-b29c47ab21f5@google.com>
Subject: [PATCH v14 6/8] mm: rust: add VmaNew for f_ops->mmap()
From: Alice Ryhl <aliceryhl@google.com>

This type will be used when setting up a new vma in an f_ops->mmap()
hook. Using a separate type from VmaRef allows us to have a separate
set of operations that you are only able to use during the mmap() hook.
For example, the VM_MIXEDMAP flag must not be changed after the initial
setup that happens during the f_ops->mmap() hook.

To avoid setting invalid flag values, the methods for clearing
VM_MAYWRITE and similar involve a check of VM_WRITE, and return an
error if VM_WRITE is set. Trying to use `try_clear_maywrite` without
checking the return value results in a compilation error because the
`Result` type is marked #[must_use].

For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
we add a VM_PFNMAP method, we will need some way to prevent you from
setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.

Acked-by: Lorenzo Stoakes
Reviewed-by: Jann Horn
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/kernel/mm/virt.rs | 186 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 185 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index 3e2eabcc2145..31803674aecc 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -16,7 +16,7 @@
 use crate::{
     bindings,
-    error::{to_result, Result},
+    error::{code::EINVAL, to_result, Result},
     mm::MmWithUser,
     page::Page,
     types::Opaque,
@@ -198,6 +198,190 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
     }
 }
 
+/// A configuration object for setting up a VMA in an `f_ops->mmap()` hook.
+///
+/// The `f_ops->mmap()` hook is called when a new VMA is being created, and the hook is able to
+/// configure the VMA in various ways to fit the driver that owns it. Using `VmaNew` indicates that
+/// you are allowed to perform operations on the VMA that can only be performed before the VMA is
+/// fully initialized.
+///
+/// # Invariants
+///
+/// For the duration of 'a, the referenced vma must be undergoing initialization in an
+/// `f_ops->mmap()` hook.
+pub struct VmaNew {
+    vma: VmaRef,
+}
+
+// Make all `VmaRef` methods available on `VmaNew`.
+impl Deref for VmaNew {
+    type Target = VmaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmaRef {
+        &self.vma
+    }
+}
+
+impl VmaNew {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is undergoing initial vma setup for the duration of 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *mut bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Internal method for updating the vma flags.
+    ///
+    /// # Safety
+    ///
+    /// This must not be used to set the flags to an invalid value.
+    #[inline]
+    unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
+        let mut flags = self.flags();
+        flags |= set;
+        flags &= !unset;
+
+        // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
+        // shared. Additionally, `VmaNew` is `!Sync`, so it cannot be used to write in parallel.
+        // The caller promises that this does not set the flags to an invalid value.
+        unsafe { (*self.as_ptr()).__bindgen_anon_2.__vm_flags = flags };
+    }
+
+    /// Set the `VM_MIXEDMAP` flag on this vma.
+    ///
+    /// This enables the vma to contain both `struct page` and pure PFN pages. Returns a reference
+    /// that can be used to call `vm_insert_page` on the vma.
+    #[inline]
+    pub fn set_mixedmap(&self) -> &VmaMixedMap {
+        // SAFETY: We don't yet provide a way to set VM_PFNMAP, so this cannot put the flags in an
+        // invalid state.
+        unsafe { self.update_flags(flags::MIXEDMAP, 0) };
+
+        // SAFETY: We just set `VM_MIXEDMAP` on the vma.
+        unsafe { VmaMixedMap::from_raw(self.vma.as_ptr()) }
+    }
+
+    /// Set the `VM_IO` flag on this vma.
+    ///
+    /// This is used for memory mapped IO and similar. The flag tells other parts of the kernel to
+    /// avoid looking at the pages. For memory mapped IO this is useful as accesses to the pages
+    /// could have side effects.
+    #[inline]
+    pub fn set_io(&self) {
+        // SAFETY: Setting the VM_IO flag is always okay.
+        unsafe { self.update_flags(flags::IO, 0) };
+    }
+
+    /// Set the `VM_DONTEXPAND` flag on this vma.
+    ///
+    /// This prevents the vma from being expanded with `mremap()`.
+    #[inline]
+    pub fn set_dontexpand(&self) {
+        // SAFETY: Setting the VM_DONTEXPAND flag is always okay.
+        unsafe { self.update_flags(flags::DONTEXPAND, 0) };
+    }
+
+    /// Set the `VM_DONTCOPY` flag on this vma.
+    ///
+    /// This prevents the vma from being copied on fork. This option is only permanent if `VM_IO`
+    /// is set.
+    #[inline]
+    pub fn set_dontcopy(&self) {
+        // SAFETY: Setting the VM_DONTCOPY flag is always okay.
+        unsafe { self.update_flags(flags::DONTCOPY, 0) };
+    }
+
+    /// Set the `VM_DONTDUMP` flag on this vma.
+    ///
+    /// This prevents the vma from being included in core dumps. This option is only permanent if
+    /// `VM_IO` is set.
+    #[inline]
+    pub fn set_dontdump(&self) {
+        // SAFETY: Setting the VM_DONTDUMP flag is always okay.
+        unsafe { self.update_flags(flags::DONTDUMP, 0) };
+    }
+
+    /// Returns whether `VM_READ` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as readable.
+    #[inline]
+    pub fn readable(&self) -> bool {
+        (self.flags() & flags::READ) != 0
+    }
+
+    /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma readable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYREAD` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_mayread(&self) -> Result {
+        if self.readable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYREAD` is okay when `VM_READ` is not set.
+        unsafe { self.update_flags(0, flags::MAYREAD) };
+        Ok(())
+    }
+
+    /// Returns whether `VM_WRITE` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as writable.
+    #[inline]
+    pub fn writable(&self) -> bool {
+        (self.flags() & flags::WRITE) != 0
+    }
+
+    /// Try to clear the `VM_MAYWRITE` flag, failing if `VM_WRITE` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma writable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYWRITE` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_maywrite(&self) -> Result {
+        if self.writable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYWRITE` is okay when `VM_WRITE` is not set.
+        unsafe { self.update_flags(0, flags::MAYWRITE) };
+        Ok(())
+    }
+
+    /// Returns whether `VM_EXEC` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as executable.
+    #[inline]
+    pub fn executable(&self) -> bool {
+        (self.flags() & flags::EXEC) != 0
+    }
+
+    /// Try to clear the `VM_MAYEXEC` flag, failing if `VM_EXEC` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma executable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYEXEC` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_mayexec(&self) -> Result {
+        if self.executable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYEXEC` is okay when `VM_EXEC` is not set.
+        unsafe { self.update_flags(0, flags::MAYEXEC) };
+        Ok(())
+    }
+}
+
 /// The integer type used for vma flags.
 #[doc(inline)]
 pub use bindings::vm_flags_t;
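
To illustrate the must-use behaviour described in the commit message (this
sketch is not part of the patch), a hypothetical policy check run from an
f_ops->mmap() hook could look like this; every method used here is introduced
by this patch:

    use kernel::mm::virt::VmaNew;
    use kernel::prelude::*;

    fn enforce_readonly(vma: &VmaNew) -> Result {
        // Fails with EINVAL if userspace asked for a writable or executable
        // mapping; on success the mapping can never gain those permissions
        // later via mprotect(). Ignoring the Result is rejected at build time.
        vma.try_clear_maywrite()?;
        vma.try_clear_mayexec()?;

        // The mapping must also stay put: no mremap() expansion, no copy on
        // fork, and keep it out of core dumps.
        vma.set_dontexpand();
        vma.set_dontcopy();
        vma.set_dontdump();
        Ok(())
    }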

From patchwork Thu Feb 13 11:04:06 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973108
Date: Thu, 13 Feb 2025 11:04:06 +0000
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
Message-ID: <20250213-vma-v14-7-b29c47ab21f5@google.com>
Subject: [PATCH v14 7/8] rust: miscdevice: add mmap support
From: Alice Ryhl <aliceryhl@google.com>
version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Add the ability to write a file_operations->mmap hook in Rust when using the miscdevice abstraction. The `vma` argument to the `mmap` hook uses the `VmaNew` type from the previous commit; this type provides the correct set of operations for a file_operations->mmap hook. Acked-by: Greg Kroah-Hartman Acked-by: Lorenzo Stoakes Reviewed-by: Andreas Hindborg Signed-off-by: Alice Ryhl --- rust/kernel/miscdevice.rs | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs index e14433b2ab9d..e4a2c5996832 100644 --- a/rust/kernel/miscdevice.rs +++ b/rust/kernel/miscdevice.rs @@ -14,6 +14,7 @@ error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR}, ffi::{c_int, c_long, c_uint, c_ulong}, fs::File, + mm::virt::VmaNew, prelude::*, seq_file::SeqFile, str::CStr, @@ -119,6 +120,22 @@ fn release(device: Self::Ptr, _file: &File) { drop(device); } + /// Handle for mmap. + /// + /// This function is invoked when a user space process invokes the `mmap` system call on + /// `file`. The function is a callback that is part of the VMA initializer. The kernel will do + /// initial setup of the VMA before calling this function. The function can then interact with + /// the VMA initialization by calling methods of `vma`. If the function does not return an + /// error, the kernel will complete initialization of the VMA according to the properties of + /// `vma`. + fn mmap( + _device: ::Borrowed<'_>, + _file: &File, + _vma: &VmaNew, + ) -> Result { + kernel::build_error!(VTABLE_DEFAULT_ERROR) + } + /// Handler for ioctls. /// /// The `cmd` argument is usually manipulated using the utilties in [`kernel::ioctl`]. @@ -176,6 +193,7 @@ impl VtableHelper { const VTABLE: bindings::file_operations = bindings::file_operations { open: Some(fops_open::), release: Some(fops_release::), + mmap: maybe_fn(T::HAS_MMAP, fops_mmap::), unlocked_ioctl: maybe_fn(T::HAS_IOCTL, fops_ioctl::), #[cfg(CONFIG_COMPAT)] compat_ioctl: if T::HAS_COMPAT_IOCTL { @@ -257,6 +275,32 @@ impl VtableHelper { 0 } +/// # Safety +/// +/// `file` must be a valid file that is associated with a `MiscDeviceRegistration`. +/// `vma` must be a vma that is currently being mmap'ed with this file. +unsafe extern "C" fn fops_mmap( + file: *mut bindings::file, + vma: *mut bindings::vm_area_struct, +) -> c_int { + // SAFETY: The mmap call of a file can access the private data. + let private = unsafe { (*file).private_data }; + // SAFETY: This is a Rust Miscdevice, so we call `into_foreign` in `open` and `from_foreign` in + // `release`, and `fops_mmap` is guaranteed to be called between those two operations. + let device = unsafe { ::borrow(private) }; + // SAFETY: The caller provides a vma that is undergoing initial VMA setup. + let area = unsafe { VmaNew::from_raw(vma) }; + // SAFETY: + // * The file is valid for the duration of this call. + // * There is no active fdget_pos region on the file on this thread. + let file = unsafe { File::from_raw_file(file) }; + + match T::mmap(device, file, area) { + Ok(()) => 0, + Err(err) => err.to_errno(), + } +} + /// # Safety /// /// `file` must be a valid file that is associated with a `MiscDeviceRegistration`. 

From patchwork Thu Feb 13 11:04:07 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13973109
Date: Thu, 13 Feb 2025 11:04:07 +0000
In-Reply-To: <20250213-vma-v14-0-b29c47ab21f5@google.com>
References: <20250213-vma-v14-0-b29c47ab21f5@google.com>
Message-ID: <20250213-vma-v14-8-b29c47ab21f5@google.com>
Subject: [PATCH v14 8/8] task: rust: rework how current is accessed
From: Alice Ryhl <aliceryhl@google.com>
Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: F04EA4000C X-Stat-Signature: ut8f9y7tgedbd7efwrg3ujjpcsq7dj6j X-HE-Tag: 1739444695-199029 X-HE-Meta: U2FsdGVkX1/5nzLIH1fivD09ZLs+ZidvYyQbb1cNTEM5S2DdSID0C5u/XXpFQvbA7QCHzAETQxwcWfg7P7zejeIzfRIxDTe6+YBWgAayivLSm/PUSwmdRU32e1Yo0c9OCWqNuqoCiXsmDo70uYKqPaVh2YfkeEivjghWQ/xPAxgO9ybGJd4WDZr+hNh946Vaw3GZengmW6CEfLjrI9e13WBeROYAdZcAoBXnup91JmP7YNBLlMJ6pXTp5TgxjiTgT3BiNA+gYBA5MbxO6nMZ+Qp3J8iyis2iTCMqtEETVU/P5/+77UGHxNejwDUhnXxrlZAWmPEP3rLXTqf4nWMos5PiFx3IIRV5WH80HYasgPfUbifo9oINq5uVt2MBFMErmk6v9utIt8PSaINAvF4uuM5zCJsGG5QpZ+QIQzhb0A7u/s44J43hyf/0Pp4Qedml53u/9SNh1SvIYxdEc7WNcB2oarqC6MCMwvjY30LSAarmHc/U+9J3e6TjihBoBmls4arxJ3MsOD7/jNBOsNUqBfWOexzt1+RYv/ELYGGdML5dkRwMdTb5UYUqmnOw1JzyzMVcidnfVUs+5bQzu3glPAJrNUGvCnHQz4jMEmIchwvfOJF6FJlNdUUB/XDKUwH2mwMW2MaEb1dxgYNBpWyEAOYbdkbG3+SAyXZWj8p4SnuwpqhwLGpvCt2j0HOYH4UK4a4NI8FiPOrB56waaTh/ACgo9Q6yN4epDklF8VT1H0JG3GSROkA3zjA8rIH0mukS7/4KcmTeQtWGXSOXjszx+VBVB8snCRdVe15SFbKFdtol4W0KT6vH8geCEpYjrp0XnGEO+wLc7bQTmSkVOfBwoitP5LN+omYwFfD4uEO059G46ryxRW/VkvWHCxnBHblujZlSIDA3vFqD0wPIFUmuTTBM/WTLXf2D5mhN8HUCl3cPkeyZ63xhRE1RhuevP5Matw4tdmJ8qFtdkbiL9+f uFo4be9E GmUVLzo3yPsOBZplIvEHG30ErIN6/QgZZqMIp9xk1t00Ftwe9pIHwQvAxXqVFXqX6DPRWlsHP6cXvnRQL85lck+JzoMVc5ckBctezZzHwXNK7a3CifhAt2UIMJ610f95VuF/c18lFJb2BelNi/NaIOt/QmWK+mDPV1tY2OX3zakbcfrG2waU4VBSYRJvdsCIsJjx5wS5r4+58lwxrFLf1L5tFi5/vMRICM7N3s7MaCexUDhco2oIJ9tnq6FFgde0KuMGhtGNxnzsRZuNw62mFfzwapSkIZfAxzcRIcMs9gTYivx+gNAUpY3LUAjxWyeTWfxLZJvz0N8jdIDwjgmfP05Qs7UvHysfEc7eFVGeQqRBTwrWwjH7YGR+WRNIbG3cN89lzSf/VWovj/nyglsYqwlu5yaiJ7l/l6257tL5iZrEEOSNyBk5MN8LPv8n2+pYic65JnMwuCqNJfpXkiAbU75DGrTsDiAazRwPEjHUhh6JyITamXuEHaGHkm036p6Poq7opBerE9CaqJq309D9y5NENJ4i3pAByc3hTe3zaEYdLYBESMCdgyaFXmMNJsvN6NEwL8VmhYfPIh5QUzQE7sCHEaDnK6xuGapM2LnURvhTGmHDeb0GyhFkIWX4Pw9oUJR54KLJ+0K/BRvfXGqg0W2jaJyejfVXGQMpg/qgyuQIcJwA= X-Bogosity: Ham, tests=bogofilter, spamicity=0.358304, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Introduce a new type called `CurrentTask` that lets you perform various operations that are only safe on the `current` task. Use the new type to provide a way to access the current mm without incrementing its refcount. With this change, you can write stuff such as let vma = current!().mm().lock_vma_under_rcu(addr); without incrementing any refcounts. This replaces the existing abstractions for accessing the current pid namespace. With the old approach, every field access to current involves both a macro and a unsafe helper function. The new approach simplifies that to a single safe function on the `CurrentTask` type. This makes it less heavy-weight to add additional current accessors in the future. That said, creating a `CurrentTask` type like the one in this patch requires that we are careful to ensure that it cannot escape the current task or otherwise access things after they are freed. To do this, I declared that it cannot escape the current "task context" where I defined a "task context" as essentially the region in which `current` remains unchanged. So e.g., release_task() or begin_new_exec() would leave the task context. 
Ensuring correctness of `CurrentTask` is slightly tricky if we also want the ability to have a safe `kthread_use_mm()` implementation in Rust. To support that safely, there are two patterns we need to ensure are safe:

    // Case 1: current!() called inside the scope.
    let mm;
    kthread_use_mm(some_mm, || {
        mm = current!().mm();
    });
    drop(some_mm);
    mm.do_something(); // UAF

and:

    // Case 2: current!() called before the scope.
    let mm;
    let task = current!();
    kthread_use_mm(some_mm, || {
        mm = task.mm();
    });
    drop(some_mm);
    mm.do_something(); // UAF

The existing `current!()` abstraction already natively prevents the first case: The `&CurrentTask` would be tied to the inner scope, so the borrow-checker ensures that no reference derived from it can escape the scope.

Fixing the second case is a bit trickier. The solution is to essentially pretend that the contents of the scope execute on a different thread, which means that only thread-safe types can cross the boundary. Since `CurrentTask` is marked `NotThreadSafe`, attempts to move it to another thread will fail, and this includes our pretend thread boundary.

This has the disadvantage that other types that aren't thread-safe for reasons unrelated to `current` also cannot be moved across the `kthread_use_mm()` boundary. I consider this an acceptable tradeoff.
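As an illustration of that boundary (this wrapper is not part of this patch; the `bindings::kthread_use_mm()`/`bindings::kthread_unuse_mm()` calls and `MmWithUser::as_raw()` below are assumptions about what such an abstraction could be built on), a scope-based helper could look roughly like:

    // Hypothetical sketch only. The `Send` bounds implement the pretend thread boundary:
    // `&CurrentTask` is not `Send` (CurrentTask contains `NotThreadSafe`), so such a value
    // can neither be captured by the closure nor returned from it. A real abstraction would
    // also have to enforce the kthread requirements of kthread_use_mm().
    pub fn kthread_use_mm<R: Send>(mm: &MmWithUser, f: impl FnOnce() -> R + Send) -> R {
        // Assumed binding names for the C helpers kthread_use_mm()/kthread_unuse_mm().
        unsafe { bindings::kthread_use_mm(mm.as_raw()) };
        let ret = f();
        unsafe { bindings::kthread_unuse_mm(mm.as_raw()) };
        ret
    }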
Reviewed-by: Boqun Feng
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/kernel/task.rs | 247 +++++++++++++++++++++++++++-------------------------
 1 file changed, 129 insertions(+), 118 deletions(-)

diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 07bc22a7645c..0b6cb9a83a2e 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -7,6 +7,7 @@
 use crate::{
     bindings,
     ffi::{c_int, c_long, c_uint},
+    mm::MmWithUser,
     pid_namespace::PidNamespace,
     types::{ARef, NotThreadSafe, Opaque},
 };
@@ -31,22 +32,20 @@
 #[macro_export]
 macro_rules! current {
     () => {
-        // SAFETY: Deref + addr-of below create a temporary `TaskRef` that cannot outlive the
-        // caller.
+        // SAFETY: This expression creates a temporary value that is dropped at the end of the
+        // caller's scope. The following mechanisms ensure that the resulting `&CurrentTask` cannot
+        // leave current task context:
+        //
+        // * To return to userspace, the caller must leave the current scope.
+        // * Operations such as `begin_new_exec()` are necessarily unsafe and the caller of
+        //   `begin_new_exec()` is responsible for safety.
+        // * Rust abstractions for things such as a `kthread_use_mm()` scope must require the
+        //   closure to be `Send`, so the `NotThreadSafe` field of `CurrentTask` ensures that the
+        //   `&CurrentTask` cannot cross the scope in either direction.
         unsafe { &*$crate::task::Task::current() }
     };
 }

-/// Returns the currently running task's pid namespace.
-#[macro_export]
-macro_rules! current_pid_ns {
-    () => {
-        // SAFETY: Deref + addr-of below create a temporary `PidNamespaceRef` that cannot outlive
-        // the caller.
-        unsafe { &*$crate::task::Task::current_pid_ns() }
-    };
-}
-
 /// Wraps the kernel's `struct task_struct`.
 ///
 /// # Invariants
@@ -85,7 +84,7 @@ macro_rules! current_pid_ns {
 /// impl State {
 ///     fn new() -> Self {
 ///         Self {
-///             creator: current!().into(),
+///             creator: ARef::from(&**current!()),
 ///             index: 0,
 ///         }
 ///     }
@@ -105,6 +104,44 @@ unsafe impl Send for Task {}
 // synchronised by C code (e.g., `signal_pending`).
 unsafe impl Sync for Task {}

+/// Represents the [`Task`] in the `current` global.
+///
+/// This type exists to provide more efficient operations that are only valid on the current task.
+/// For example, to retrieve the pid-namespace of a task, you must use rcu protection unless it is
+/// the current task.
+///
+/// # Invariants
+///
+/// Each value of this type must only be accessed from the task context it was created within.
+///
+/// Of course, every thread is in a different task context, but for the purposes of this invariant,
+/// these operations also permanently leave the task context:
+///
+/// * Returning to userspace from system call context.
+/// * Calling `release_task()`.
+/// * Calling `begin_new_exec()` in a binary format loader.
+///
+/// Other operations temporarily create a new sub-context:
+///
+/// * Calling `kthread_use_mm()` creates a new context, and `kthread_unuse_mm()` returns to the
+///   old context.
+///
+/// This means that a `CurrentTask` obtained before a `kthread_use_mm()` call may be used again
+/// once `kthread_unuse_mm()` is called, but it must not be used between these two calls.
+/// Conversely, a `CurrentTask` obtained between a `kthread_use_mm()`/`kthread_unuse_mm()` pair
+/// must not be used after `kthread_unuse_mm()`.
+#[repr(transparent)]
+pub struct CurrentTask(Task, NotThreadSafe);
+
+// Make all `Task` methods available on `CurrentTask`.
+impl Deref for CurrentTask {
+    type Target = Task;
+    #[inline]
+    fn deref(&self) -> &Task {
+        &self.0
+    }
+}
+
 /// The type of process identifiers (PIDs).
 type Pid = bindings::pid_t;

@@ -131,119 +168,29 @@ pub fn current_raw() -> *mut bindings::task_struct {
     ///
     /// # Safety
     ///
-    /// Callers must ensure that the returned object doesn't outlive the current task/thread.
-    pub unsafe fn current() -> impl Deref<Target = Task> {
-        struct TaskRef<'a> {
-            task: &'a Task,
-            _not_send: NotThreadSafe,
+    /// Callers must ensure that the returned object is only used to access a [`CurrentTask`]
+    /// within the task context that was active when this function was called. For more details,
+    /// see the invariants section for [`CurrentTask`].
+    pub unsafe fn current() -> impl Deref<Target = CurrentTask> {
+        struct TaskRef {
+            task: *const CurrentTask,
         }

-        impl Deref for TaskRef<'_> {
-            type Target = Task;
+        impl Deref for TaskRef {
+            type Target = CurrentTask;

             fn deref(&self) -> &Self::Target {
-                self.task
+                // SAFETY: The returned reference borrows from this `TaskRef`, so it cannot outlive
+                // the `TaskRef`, which the caller of `Task::current()` has promised will not
+                // outlive the task/thread for which `self.task` is the `current` pointer. Thus, it
+                // is okay to return a `CurrentTask` reference here.
+                unsafe { &*self.task }
             }
         }

-        let current = Task::current_raw();
         TaskRef {
-            // SAFETY: If the current thread is still running, the current task is valid. Given
-            // that `TaskRef` is not `Send`, we know it cannot be transferred to another thread
-            // (where it could potentially outlive the caller).
-            task: unsafe { &*current.cast() },
-            _not_send: NotThreadSafe,
-        }
-    }
-
-    /// Returns a PidNamespace reference for the currently executing task's/thread's pid namespace.
-    ///
-    /// This function can be used to create an unbounded lifetime by e.g., storing the returned
-    /// PidNamespace in a global variable which would be a bug. So the recommended way to get the
-    /// current task's/thread's pid namespace is to use the [`current_pid_ns`] macro because it is
-    /// safe.
-    ///
-    /// # Safety
-    ///
-    /// Callers must ensure that the returned object doesn't outlive the current task/thread.
-    pub unsafe fn current_pid_ns() -> impl Deref<Target = PidNamespace> {
-        struct PidNamespaceRef<'a> {
-            task: &'a PidNamespace,
-            _not_send: NotThreadSafe,
-        }
-
-        impl Deref for PidNamespaceRef<'_> {
-            type Target = PidNamespace;
-
-            fn deref(&self) -> &Self::Target {
-                self.task
-            }
-        }
-
-        // The lifetime of `PidNamespace` is bound to `Task` and `struct pid`.
-        //
-        // The `PidNamespace` of a `Task` doesn't ever change once the `Task` is alive. A
-        // `unshare(CLONE_NEWPID)` or `setns(fd_pidns/pidfd, CLONE_NEWPID)` will not have an effect
-        // on the calling `Task`'s pid namespace. It will only effect the pid namespace of children
-        // created by the calling `Task`. This invariant guarantees that after having acquired a
-        // reference to a `Task`'s pid namespace it will remain unchanged.
-        //
-        // When a task has exited and been reaped `release_task()` will be called. This will set
-        // the `PidNamespace` of the task to `NULL`. So retrieving the `PidNamespace` of a task
-        // that is dead will return `NULL`. Note, that neither holding the RCU lock nor holding a
-        // referencing count to
-        // the `Task` will prevent `release_task()` being called.
-        //
-        // In order to retrieve the `PidNamespace` of a `Task` the `task_active_pid_ns()` function
-        // can be used. There are two cases to consider:
-        //
-        // (1) retrieving the `PidNamespace` of the `current` task
-        // (2) retrieving the `PidNamespace` of a non-`current` task
-        //
-        // From system call context retrieving the `PidNamespace` for case (1) is always safe and
-        // requires neither RCU locking nor a reference count to be held. Retrieving the
-        // `PidNamespace` after `release_task()` for current will return `NULL` but no codepath
-        // like that is exposed to Rust.
-        //
-        // Retrieving the `PidNamespace` from system call context for (2) requires RCU protection.
-        // Accessing `PidNamespace` outside of RCU protection requires a reference count that
-        // must've been acquired while holding the RCU lock. Note that accessing a non-`current`
-        // task means `NULL` can be returned as the non-`current` task could have already passed
-        // through `release_task()`.
-        //
-        // To retrieve (1) the `current_pid_ns!()` macro should be used which ensure that the
-        // returned `PidNamespace` cannot outlive the calling scope. The associated
-        // `current_pid_ns()` function should not be called directly as it could be abused to
-        // created an unbounded lifetime for `PidNamespace`. The `current_pid_ns!()` macro allows
-        // Rust to handle the common case of accessing `current`'s `PidNamespace` without RCU
-        // protection and without having to acquire a reference count.
-        //
-        // For (2) the `task_get_pid_ns()` method must be used. This will always acquire a
-        // reference on `PidNamespace` and will return an `Option` to force the caller to
-        // explicitly handle the case where `PidNamespace` is `None`, something that tends to be
-        // forgotten when doing the equivalent operation in `C`. Missing RCU primitives make it
-        // difficult to perform operations that are otherwise safe without holding a reference
-        // count as long as RCU protection is guaranteed. But it is not important currently. But we
-        // do want it in the future.
-        //
-        // Note for (2) the required RCU protection around calling `task_active_pid_ns()`
-        // synchronizes against putting the last reference of the associated `struct pid` of
-        // `task->thread_pid`. The `struct pid` stored in that field is used to retrieve the
-        // `PidNamespace` of the caller. When `release_task()` is called `task->thread_pid` will be
-        // `NULL`ed and `put_pid()` on said `struct pid` will be delayed in `free_pid()` via
-        // `call_rcu()` allowing everyone with an RCU protected access to the `struct pid` acquired
-        // from `task->thread_pid` to finish.
-        //
-        // SAFETY: The current task's pid namespace is valid as long as the current task is running.
-        let pidns = unsafe { bindings::task_active_pid_ns(Task::current_raw()) };
-        PidNamespaceRef {
-            // SAFETY: If the current thread is still running, the current task and its associated
-            // pid namespace are valid. `PidNamespaceRef` is not `Send`, so we know it cannot be
-            // transferred to another thread (where it could potentially outlive the current
-            // `Task`). The caller needs to ensure that the PidNamespaceRef doesn't outlive the
-            // current task/thread.
-            task: unsafe { PidNamespace::from_ptr(pidns) },
-            _not_send: NotThreadSafe,
+            // CAST: The layout of `struct task_struct` and `CurrentTask` is identical.
+            task: Task::current_raw().cast(),
         }
     }

@@ -326,6 +273,70 @@ pub fn wake_up(&self) {
     }
 }

+impl CurrentTask {
+    /// Access the address space of the current task.
+    ///
+    /// This function does not touch the refcount of the mm.
+    #[inline]
+    pub fn mm(&self) -> Option<&MmWithUser> {
+        // SAFETY: The `mm` field of `current` is not modified from other threads, so reading it is
+        // not a data race.
+        let mm = unsafe { (*self.as_ptr()).mm };
+
+        if mm.is_null() {
+            return None;
+        }
+
+        // SAFETY: If `current->mm` is non-null, then it references a valid mm with a non-zero
+        // value of `mm_users`. Furthermore, the returned `&MmWithUser` borrows from this
+        // `CurrentTask`, so it cannot escape the scope in which the current pointer was obtained.
+        //
+        // This is safe even if `kthread_use_mm()`/`kthread_unuse_mm()` are used. There are two
+        // relevant cases:
+        // * If the `&CurrentTask` was created before `kthread_use_mm()`, then it cannot be
+        //   accessed during the `kthread_use_mm()`/`kthread_unuse_mm()` scope due to the
+        //   `NotThreadSafe` field of `CurrentTask`.
+        // * If the `&CurrentTask` was created within a `kthread_use_mm()`/`kthread_unuse_mm()`
+        //   scope, then the `&CurrentTask` cannot escape that scope, so the returned `&MmWithUser`
+        //   also cannot escape that scope.
+        // In either case, it's not possible to read `current->mm` and keep using it after the
+        // scope is ended with `kthread_unuse_mm()`.
+        Some(unsafe { MmWithUser::from_raw(mm) })
+    }
+
+    /// Access the pid namespace of the current task.
+    ///
+    /// This function does not touch the refcount of the namespace or use RCU protection.
+    ///
+    /// To access the pid namespace of another task, see [`Task::get_pid_ns`].
+    #[doc(alias = "task_active_pid_ns")]
+    #[inline]
+    pub fn active_pid_ns(&self) -> Option<&PidNamespace> {
+        // SAFETY: It is safe to call `task_active_pid_ns` without RCU protection when calling it
+        // on the current task.
+        let active_ns = unsafe { bindings::task_active_pid_ns(self.as_ptr()) };
+
+        if active_ns.is_null() {
+            return None;
+        }
+
+        // The lifetime of `PidNamespace` is bound to `Task` and `struct pid`.
+        //
+        // The `PidNamespace` of a `Task` doesn't ever change once the `Task` is alive.
+        //
+        // From system call context retrieving the `PidNamespace` for the current task is always
+        // safe and requires neither RCU locking nor a reference count to be held. Retrieving the
+        // `PidNamespace` after `release_task()` for current will return `NULL` but no codepath
+        // like that is exposed to Rust.
+        //
+        // SAFETY: If `current`'s pid ns is non-null, then it references a valid pid ns.
+        // Furthermore, the returned `&PidNamespace` borrows from this `CurrentTask`, so it cannot
+        // escape the scope in which the current pointer was obtained, e.g. it cannot live past a
+        // `release_task()` call.
+        Some(unsafe { PidNamespace::from_ptr(active_ns) })
+    }
+}
+
 // SAFETY: The type invariants guarantee that `Task` is always refcounted.
 unsafe impl crate::types::AlwaysRefCounted for Task {
     fn inc_ref(&self) {