From patchwork Wed Mar 8 22:19:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 13166512
Date: Wed, 8 Mar 2023 14:19:32 -0800
In-Reply-To: <20230308221932.1548827-1-axelrasmussen@google.com>
References: <20230308221932.1548827-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.40.0.rc1.284.g88254d51c5-goog
Message-ID: <20230308221932.1548827-5-axelrasmussen@google.com>
Subject: [PATCH v4 4/4] mm: userfaultfd: add UFFDIO_CONTINUE_MODE_WP to install WP PTEs
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Hugh Dickins, Jan Kara,
 "Liam R. Howlett", Matthew Wilcox, Mike Kravetz, Mike Rapoport,
 Muchun Song, Nadav Amit, Peter Xu, Shuah Khan
Cc: James Houghton, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, Axel Rasmussen

UFFDIO_COPY already has UFFDIO_COPY_MODE_WP, so when installing a new
PTE to resolve a missing fault, one can install a write-protected one.
This is useful when using UFFDIO_REGISTER_MODE_{MISSING,WP} in
combination.

This was motivated by testing HugeTLB HGM [1], and in particular its
interaction with userfaultfd features. Existing userfaultfd code
supports using WP and MINOR modes together (i.e. you can register an
area with both enabled), but without this CONTINUE flag the combination
is in practice unusable.

So, add an analogous UFFDIO_CONTINUE_MODE_WP, which does the same thing
as UFFDIO_COPY_MODE_WP, but for *minor* faults.

Update the selftest to do some very basic exercising of the new flag.

[1]: https://patchwork.kernel.org/project/linux-mm/cover/20230218002819.1486479-1-jthoughton@google.com/

Acked-by: Peter Xu
Signed-off-by: Axel Rasmussen
Acked-by: Mike Rapoport (IBM)
---
 fs/userfaultfd.c                         | 8 ++++++--
 include/linux/userfaultfd_k.h            | 3 ++-
 include/uapi/linux/userfaultfd.h         | 7 +++++++
 mm/userfaultfd.c                         | 5 +++--
 tools/testing/selftests/mm/userfaultfd.c | 4 ++++
 5 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 56e54e50414e..664019381e04 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1878,6 +1878,7 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
 	struct uffdio_continue uffdio_continue;
 	struct uffdio_continue __user *user_uffdio_continue;
 	struct userfaultfd_wake_range range;
+	uffd_flags_t flags = 0;
 
 	user_uffdio_continue = (struct uffdio_continue __user *)arg;
 
@@ -1902,13 +1903,16 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
 	    uffdio_continue.range.start) {
 		goto out;
 	}
-	if (uffdio_continue.mode & ~UFFDIO_CONTINUE_MODE_DONTWAKE)
+	if (uffdio_continue.mode & ~(UFFDIO_CONTINUE_MODE_DONTWAKE |
+				     UFFDIO_CONTINUE_MODE_WP))
 		goto out;
+	if (uffdio_continue.mode & UFFDIO_CONTINUE_MODE_WP)
+		flags |= MFILL_ATOMIC_WP;
 
 	if (mmget_not_zero(ctx->mm)) {
 		ret = mfill_atomic_continue(ctx->mm, uffdio_continue.range.start,
 					    uffdio_continue.range.len,
-					    &ctx->mmap_changing);
+					    &ctx->mmap_changing, flags);
 		mmput(ctx->mm);
 	} else {
 		return -ESRCH;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 4d7425684171..9499cfcf83fa 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -82,7 +82,8 @@ extern ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm,
 				     unsigned long len,
 				     atomic_t *mmap_changing);
 extern ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long dst_start,
-				     unsigned long len, atomic_t *mmap_changing);
+				     unsigned long len, atomic_t *mmap_changing,
+				     uffd_flags_t flags);
 extern int mwriteprotect_range(struct mm_struct *dst_mm,
 			       unsigned long start, unsigned long len,
 			       bool enable_wp, atomic_t *mmap_changing);
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 005e5e306266..14059a0861bf 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -297,6 +297,13 @@ struct uffdio_writeprotect {
 struct uffdio_continue {
 	struct uffdio_range range;
 #define UFFDIO_CONTINUE_MODE_DONTWAKE		((__u64)1<<0)
+	/*
+	 * UFFDIO_CONTINUE_MODE_WP will map the page write protected on
+	 * the fly.  UFFDIO_CONTINUE_MODE_WP is available only if the
+	 * write protected ioctl is implemented for the range
+	 * according to the uffdio_register.ioctls.
+	 */
+#define UFFDIO_CONTINUE_MODE_WP			((__u64)1<<1)
 	__u64 mode;
 
 	/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index dd807924446f..2f64e0a9b234 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -693,10 +693,11 @@ ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm, unsigned long start,
 }
 
 ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long start,
-			      unsigned long len, atomic_t *mmap_changing)
+			      unsigned long len, atomic_t *mmap_changing,
+			      uffd_flags_t flags)
 {
 	return mfill_atomic(dst_mm, start, 0, len, mmap_changing,
-			    uffd_flags_set_mode(0, MFILL_ATOMIC_CONTINUE));
+			    uffd_flags_set_mode(flags, MFILL_ATOMIC_CONTINUE));
 }
 
 long uffd_wp_range(struct vm_area_struct *dst_vma,
diff --git a/tools/testing/selftests/mm/userfaultfd.c b/tools/testing/selftests/mm/userfaultfd.c
index 7f22844ed704..41c1f9abc481 100644
--- a/tools/testing/selftests/mm/userfaultfd.c
+++ b/tools/testing/selftests/mm/userfaultfd.c
@@ -585,6 +585,8 @@ static void continue_range(int ufd, __u64 start, __u64 len)
 	req.range.start = start;
 	req.range.len = len;
 	req.mode = 0;
+	if (test_uffdio_wp)
+		req.mode |= UFFDIO_CONTINUE_MODE_WP;
 
 	if (ioctl(ufd, UFFDIO_CONTINUE, &req))
 		err("UFFDIO_CONTINUE failed for address 0x%" PRIx64,
@@ -1332,6 +1334,8 @@ static int userfaultfd_minor_test(void)
 	uffdio_register.range.start = (unsigned long)area_dst_alias;
 	uffdio_register.range.len = nr_pages * page_size;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_MINOR;
+	if (test_uffdio_wp)
+		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");
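
For reviewers who want to poke at the mode handling in isolation, here
is a rough userspace sketch of the check this patch adds to
userfaultfd_continue(). It is not kernel code. The SKETCH_* names and
sketch_continue_flags() are hypothetical; the two mode bits are copied
from the uapi hunk, while SKETCH_MFILL_WP is only a placeholder for the
kernel-internal MFILL_ATOMIC_WP (its real value is not shown in this
patch). The -EINVAL return is an assumption about what the "goto out"
path yields, since the surrounding context is not part of the diff.

```c
#include <errno.h>
#include <stdint.h>

/* Mode bits as defined in the include/uapi/linux/userfaultfd.h hunk. */
#define SKETCH_CONTINUE_MODE_DONTWAKE	((uint64_t)1 << 0)
#define SKETCH_CONTINUE_MODE_WP		((uint64_t)1 << 1)

/* Placeholder for MFILL_ATOMIC_WP; the value here is made up. */
#define SKETCH_MFILL_WP			(1u << 0)

/*
 * Hypothetical helper mirroring the patched mode check:
 * reject any bit other than DONTWAKE and WP, then translate the WP
 * mode bit into the internal write-protect flag.
 *
 * Returns 0 on success with *flags filled in, or -EINVAL (assumed)
 * when an unknown mode bit is set.
 */
static int sketch_continue_flags(uint64_t mode, unsigned int *flags)
{
	*flags = 0;

	if (mode & ~(SKETCH_CONTINUE_MODE_DONTWAKE |
		     SKETCH_CONTINUE_MODE_WP))
		return -EINVAL;

	if (mode & SKETCH_CONTINUE_MODE_WP)
		*flags |= SKETCH_MFILL_WP;

	return 0;
}
```

A mode of SKETCH_CONTINUE_MODE_WP alone, or combined with DONTWAKE,
passes the check and sets the WP flag. Any higher bit is rejected,
which is what the old single-flag mask did to the WP bit before this
patch.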