From patchwork Thu Mar 9 22:37:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13168421
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Axel Rasmussen, Paul Gofman, Muhammad Usama Anjum,
 David Hildenbrand, Mike Rapoport, Andrea Arcangeli, peterx@redhat.com,
 Andrew Morton
Subject: [PATCH v4 1/2] mm/uffd: UFFD_FEATURE_WP_UNPOPULATED
Date: Thu, 9 Mar 2023 17:37:10 -0500
Message-Id: <20230309223711.823547-2-peterx@redhat.com>
In-Reply-To: <20230309223711.823547-1-peterx@redhat.com>
References: <20230309223711.823547-1-peterx@redhat.com>

This is a new feature that controls how uffd-wp handles none ptes. When
it's set, the kernel will handle anonymous memory the same way as file
memory, by allowing the user to wr-protect unpopulated ptes.

File memory handles none ptes consistently by allowing them to be
wr-protected, because the kernel cannot know whether the page cache
exists or not. For anonymous memory this was not as consistent, because
we used to assume that we don't need protection on none ptes or known
zero pages.

One use case of such a feature bit is VM live snapshot: without
wr-protecting empty ptes, the snapshot can contain random rubbish in the
holes of the anonymous memory, which can cause the guest to misbehave
when the guest OS assumes the pages should be all zeros. QEMU worked
around it by pre-populating the section with reads to fill in zero page
entries before starting the whole snapshot process [1].

Recently another need was raised for using userfaultfd wr-protect to
detect dirty pages (to replace soft-dirty in some cases) [2].
In that case, without being able to wr-protect none ptes by default, the
dirty info can get lost, since we cannot treat every none pte as dirty
(the current design identifies a page as dirty based on the uffd-wp bit
being cleared).

In general, we want to be able to wr-protect empty ptes too, even for
anonymous memory.

This patch implements UFFD_FEATURE_WP_UNPOPULATED so that uffd-wp
handling of none ptes is consistent no matter what the underlying memory
type is. It doesn't have any impact on file memory so far because we
already have pte markers taking care of that. So it only affects
anonymous memory.

The feature bit is off by default, so the old behavior is maintained.
Sometimes the old behavior may be wanted, because wr-protecting none
ptes adds overhead not only during UFFDIO_WRITEPROTECT (by applying pte
markers to anonymous memory), but also when creating the pgtables to
store the pte markers. So there's potentially less chance of using thp
on the first fault for a none pmd or larger than a pmd.

The major implementation part is teaching the whole kernel to understand
pte markers even for anonymously mapped ranges, while allowing the
UFFDIO_WRITEPROTECT ioctl to apply pte markers to anonymous memory too
when the new feature bit is set.

Note that even though the patch subject starts with mm/uffd, there are a
few small refactors to the major mm path of handling anonymous page
faults. But they should be straightforward.

With WP_UNPOPULATED, applications like QEMU can avoid pre-read faulting
all the memory before wr-protecting it when taking a live snapshot.
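
As a rough userspace illustration of the point above (a minimal sketch,
not taken from QEMU or from the selftests): the helper name
wp_unpopulated_anon() and its return convention are made up for this
example, and the fallback #define assumes the (1<<13) value from this
patch for UAPI headers that predate it. The helper requests the feature
through UFFDIO_API, registers an untouched anonymous mapping in
wr-protect mode, and wr-protects it without any pre-read pass.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* UAPI headers older than this patch lack the bit; value taken from the diff. */
#ifndef UFFD_FEATURE_WP_UNPOPULATED
#define UFFD_FEATURE_WP_UNPOPULATED (1 << 13)
#endif

/*
 * Hypothetical helper (not part of the patch): wr-protect `len` bytes of
 * never-touched anonymous memory without pre-faulting it.
 * Returns 0 on success, 1 if userfaultfd, uffd-wp or the feature bit is
 * unavailable, -1 on any other failure.
 */
static int wp_unpopulated_anon(size_t len)
{
	struct uffdio_api api = {
		.api = UFFD_API,
		.features = UFFD_FEATURE_WP_UNPOPULATED,
	};
	struct uffdio_register reg;
	struct uffdio_writeprotect wp;
	void *area;
	int uffd, ret = 1;

	/* userfaultfd ranges must be page aligned */
	assert(len > 0 && len % (size_t)sysconf(_SC_PAGESIZE) == 0);

	uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (uffd < 0)
		return 1;	/* no userfaultfd, or not permitted */

	/* Kernels that don't know the bit fail UFFDIO_API with EINVAL */
	if (ioctl(uffd, UFFDIO_API, &api) ||
	    !(api.features & UFFD_FEATURE_WP_UNPOPULATED))
		goto out_close;

	area = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED) {
		ret = -1;
		goto out_close;
	}

	reg = (struct uffdio_register) {
		.range = { .start = (uintptr_t)area, .len = len },
		.mode = UFFDIO_REGISTER_MODE_WP,
	};
	if (ioctl(uffd, UFFDIO_REGISTER, &reg))
		goto out_unmap;	/* uffd-wp not supported here */

	/* No pre-read or MADV_POPULATE_READ pass: the ptes are still none */
	wp = (struct uffdio_writeprotect) {
		.range = { .start = (uintptr_t)area, .len = len },
		.mode = UFFDIO_WRITEPROTECT_MODE_WP,
	};
	ret = ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) ? -1 : 0;

out_unmap:
	munmap(area, len);
out_close:
	close(uffd);
	return ret;
}
```

A caller would pass a page-aligned length, for example
wp_unpopulated_anon(16 * sysconf(_SC_PAGESIZE)), and fall back to
pre-populating the range (e.g. with MADV_POPULATE_READ) when it
returns 1.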

Quoting from Muhammad's test result here [3] based on a simple program
[4]:

  (1) With huge page disabled
  echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
  ./uffd_wp_perf
  Test DEFAULT: 4
  Test PRE-READ: 1111453 (pre-fault 1101011)
  Test MADVISE: 278276 (pre-fault 266378)
  Test WP-UNPOPULATE: 11712

  (2) With Huge page enabled
  echo always > /sys/kernel/mm/transparent_hugepage/enabled
  ./uffd_wp_perf
  Test DEFAULT: 4
  Test PRE-READ: 22521 (pre-fault 22348)
  Test MADVISE: 4909 (pre-fault 4743)
  Test WP-UNPOPULATE: 14448

There is a great perf boost for the no-thp case. With thp enabled, in
the extreme case of all-thp-zero, WP_UNPOPULATED can be slower than
MADVISE, but that is unlikely in reality. Also, in that case the
overhead is not removed but postponed until a follow-up write on any
huge zero thp, so it is potentially faster only by making the follow-up
writes slower.

[1] https://lore.kernel.org/all/20210401092226.102804-4-andrey.gruzdev@virtuozzo.com/
[2] https://lore.kernel.org/all/Y+v2HJ8+3i%2FKzDBu@x1n/
[3] https://lore.kernel.org/all/d0eb0a13-16dc-1ac1-653a-78b7273781e3@collabora.com/
[4] https://github.com/xzpeter/clibs/blob/master/uffd-test/uffd-wp-perf.c

Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
---
 Documentation/admin-guide/mm/userfaultfd.rst | 17 ++++++
 fs/userfaultfd.c                             | 16 ++++++
 include/linux/mm_inline.h                    |  6 +++
 include/linux/userfaultfd_k.h                | 23 ++++++++
 include/uapi/linux/userfaultfd.h             | 10 +++-
 mm/memory.c                                  | 56 +++++++++++++++-----
 mm/mprotect.c                                | 51 ++++++++++++++----
 7 files changed, 154 insertions(+), 25 deletions(-)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 7dc823b56ca4..c86b56c95ea6 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -219,6 +219,23 @@ former will have ``UFFD_PAGEFAULT_FLAG_WP`` set, the latter
   you still need to supply a page when ``UFFDIO_REGISTER_MODE_MISSING`` was
   used.
 
+Userfaultfd write-protect mode currently behaves differently on none ptes
+(when e.g. page is missing) over different types of memories.
+
+For anonymous memory, ``ioctl(UFFDIO_WRITEPROTECT)`` will ignore none ptes
+(e.g. when pages are missing and not populated). For file-backed memories
+like shmem and hugetlbfs, none ptes will be write protected just like a
+present pte. In other words, there will be a userfaultfd write fault
+message generated when writing to a missing page on file typed memories,
+as long as the page range was write-protected before. Such a message will
+not be generated on anonymous memories by default.
+
+If the application wants to be able to write protect none ptes on anonymous
+memory, one can pre-populate the memory with e.g. MADV_POPULATE_READ. On
+newer kernels, one can also detect the feature UFFD_FEATURE_WP_UNPOPULATED
+and set the feature bit in advance to make sure none ptes will also be
+write protected even upon anonymous memory.
+
 QEMU/KVM
 ========
 
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 44d1ee429eb0..881e9c82b9d1 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -108,6 +108,21 @@ static bool userfaultfd_is_initialized(struct userfaultfd_ctx *ctx)
 	return ctx->features & UFFD_FEATURE_INITIALIZED;
 }
 
+/*
+ * Whether WP_UNPOPULATED is enabled on the uffd context. It is only
+ * meaningful when userfaultfd_wp()==true on the vma and when it's
+ * anonymous.
+ */
+bool userfaultfd_wp_unpopulated(struct vm_area_struct *vma)
+{
+	struct userfaultfd_ctx *ctx = vma->vm_userfaultfd_ctx.ctx;
+
+	if (!ctx)
+		return false;
+
+	return ctx->features & UFFD_FEATURE_WP_UNPOPULATED;
+}
+
 static void userfaultfd_set_vm_flags(struct vm_area_struct *vma,
 				     vm_flags_t flags)
 {
@@ -1971,6 +1986,7 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
 #endif
 #ifndef CONFIG_PTE_MARKER_UFFD_WP
 	uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM;
+	uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED;
 #endif
 	uffdio_api.ioctls = UFFD_API_IOCTLS;
 	ret = -EFAULT;
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index de1e622dd366..0e1d239a882c 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -557,6 +557,12 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 	/* The current status of the pte should be "cleared" before calling */
 	WARN_ON_ONCE(!pte_none(*pte));
 
+	/*
+	 * NOTE: userfaultfd_wp_unpopulated() doesn't need this whole
+	 * thing, because when zapping either it means it's dropping the
+	 * page, or in TTU where the present pte will be quickly replaced
+	 * with a swap pte. There's no way of leaking the bit.
+	 */
 	if (vma_is_anonymous(vma) || !userfaultfd_wp(vma))
 		return;
 
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 3767f18114ef..0cf8880219da 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -179,6 +179,7 @@ extern int userfaultfd_unmap_prep(struct mm_struct *mm, unsigned long start,
 				  unsigned long end, struct list_head *uf);
 extern void userfaultfd_unmap_complete(struct mm_struct *mm,
 				       struct list_head *uf);
+extern bool userfaultfd_wp_unpopulated(struct vm_area_struct *vma);
 
 #else /* CONFIG_USERFAULTFD */
 
@@ -274,8 +275,30 @@ static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline bool userfaultfd_wp_unpopulated(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 #endif /* CONFIG_USERFAULTFD */
 
+static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
+{
+	/* Only wr-protect mode uses pte markers */
+	if (!userfaultfd_wp(vma))
+		return false;
+
+	/* File-based uffd-wp always need markers */
+	if (!vma_is_anonymous(vma))
+		return true;
+
+	/*
+	 * Anonymous uffd-wp only needs the markers if WP_UNPOPULATED
+	 * enabled (to apply markers on zero pages).
+	 */
+	return userfaultfd_wp_unpopulated(vma);
+}
+
 static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
 {
 #ifdef CONFIG_PTE_MARKER_UFFD_WP
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 005e5e306266..90c958952bfc 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -38,7 +38,8 @@
 			   UFFD_FEATURE_MINOR_HUGETLBFS |	\
 			   UFFD_FEATURE_MINOR_SHMEM |		\
 			   UFFD_FEATURE_EXACT_ADDRESS |		\
-			   UFFD_FEATURE_WP_HUGETLBFS_SHMEM)
+			   UFFD_FEATURE_WP_HUGETLBFS_SHMEM |	\
+			   UFFD_FEATURE_WP_UNPOPULATED)
 #define UFFD_API_IOCTLS				\
 	((__u64)1 << _UFFDIO_REGISTER |		\
 	 (__u64)1 << _UFFDIO_UNREGISTER |	\
@@ -203,6 +204,12 @@ struct uffdio_api {
 	 *
 	 * UFFD_FEATURE_WP_HUGETLBFS_SHMEM indicates that userfaultfd
 	 * write-protection mode is supported on both shmem and hugetlbfs.
+	 *
+	 * UFFD_FEATURE_WP_UNPOPULATED indicates that userfaultfd
+	 * write-protection mode will always apply to unpopulated pages
+	 * (i.e. empty ptes). This will be the default behavior for shmem
+	 * & hugetlbfs, so this flag only affects anonymous memory behavior
+	 * when userfault write-protection mode is registered.
 	 */
 #define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
 #define UFFD_FEATURE_EVENT_FORK			(1<<1)
@@ -217,6 +224,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_MINOR_SHMEM		(1<<10)
 #define UFFD_FEATURE_EXACT_ADDRESS		(1<<11)
 #define UFFD_FEATURE_WP_HUGETLBFS_SHMEM		(1<<12)
+#define UFFD_FEATURE_WP_UNPOPULATED		(1<<13)
 	__u64 features;
 
 	__u64 ioctls;
diff --git a/mm/memory.c b/mm/memory.c
index 0adf23ea5416..8d73d3056348 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -104,6 +104,20 @@ EXPORT_SYMBOL(mem_map);
 #endif
 
 static vm_fault_t do_fault(struct vm_fault *vmf);
+static vm_fault_t do_anonymous_page(struct vm_fault *vmf);
+static bool vmf_pte_changed(struct vm_fault *vmf);
+
+/*
+ * Return true if the original pte was a uffd-wp pte marker (so the pte was
+ * wr-protected).
+ */
+static bool vmf_orig_pte_uffd_wp(struct vm_fault *vmf)
+{
+	if (!(vmf->flags & FAULT_FLAG_ORIG_PTE_VALID))
+		return false;
+
+	return pte_marker_uffd_wp(vmf->orig_pte);
+}
 
 /*
  * A number of key systems in x86 including ioremap() rely on the assumption
@@ -1350,6 +1364,10 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
 			      unsigned long addr, pte_t *pte,
 			      struct zap_details *details, pte_t pteval)
 {
+	/* Zap on anonymous always means dropping everything */
+	if (vma_is_anonymous(vma))
+		return;
+
 	if (zap_drop_file_uffd_wp(details))
 		return;
 
@@ -1456,8 +1474,12 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				continue;
 			rss[mm_counter(page)]--;
 		} else if (pte_marker_entry_uffd_wp(entry)) {
-			/* Only drop the uffd-wp marker if explicitly requested */
-			if (!zap_drop_file_uffd_wp(details))
+			/*
+			 * For anon: always drop the marker; for file: only
+			 * drop the marker if explicitly requested.
+			 */
+			if (!vma_is_anonymous(vma) &&
+			    !zap_drop_file_uffd_wp(details))
 				continue;
 		} else if (is_hwpoison_entry(entry) ||
 			   is_swapin_error_entry(entry)) {
@@ -3624,6 +3646,14 @@ static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
 	return 0;
 }
 
+static vm_fault_t do_pte_missing(struct vm_fault *vmf)
+{
+	if (vma_is_anonymous(vmf->vma))
+		return do_anonymous_page(vmf);
+	else
+		return do_fault(vmf);
+}
+
 /*
  * This is actually a page-missing access, but with uffd-wp special pte
  * installed. It means this pte was wr-protected before being unmapped.
@@ -3634,11 +3664,10 @@ static vm_fault_t pte_marker_handle_uffd_wp(struct vm_fault *vmf)
 	 * Just in case there're leftover special ptes even after the region
 	 * got unregistered - we can simply clear them.
 	 */
-	if (unlikely(!userfaultfd_wp(vmf->vma) || vma_is_anonymous(vmf->vma)))
+	if (unlikely(!userfaultfd_wp(vmf->vma)))
 		return pte_marker_clear(vmf);
 
-	/* do_fault() can handle pte markers too like none pte */
-	return do_fault(vmf);
+	return do_pte_missing(vmf);
 }
 
 static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
@@ -4008,6 +4037,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
  */
 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 {
+	bool uffd_wp = vmf_orig_pte_uffd_wp(vmf);
 	struct vm_area_struct *vma = vmf->vma;
 	struct folio *folio;
 	vm_fault_t ret = 0;
@@ -4041,7 +4071,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 						vma->vm_page_prot));
 		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 				vmf->address, &vmf->ptl);
-		if (!pte_none(*vmf->pte)) {
+		if (vmf_pte_changed(vmf)) {
 			update_mmu_tlb(vma, vmf->address, vmf->pte);
 			goto unlock;
 		}
@@ -4081,7 +4111,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
 			&vmf->ptl);
-	if (!pte_none(*vmf->pte)) {
+	if (vmf_pte_changed(vmf)) {
 		update_mmu_tlb(vma, vmf->address, vmf->pte);
 		goto release;
 	}
@@ -4101,6 +4131,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	folio_add_new_anon_rmap(folio, vma, vmf->address);
 	folio_add_lru_vma(folio, vma);
 setpte:
+	if (uffd_wp)
+		entry = pte_mkuffd_wp(entry);
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
 
 	/* No need to invalidate - it was non-present before */
@@ -4268,7 +4300,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
 {
 	struct vm_area_struct *vma = vmf->vma;
-	bool uffd_wp = pte_marker_uffd_wp(vmf->orig_pte);
+	bool uffd_wp = vmf_orig_pte_uffd_wp(vmf);
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
 	bool prefault = vmf->address != addr;
 	pte_t entry;
@@ -4915,12 +4947,8 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		}
 	}
 
-	if (!vmf->pte) {
-		if (vma_is_anonymous(vmf->vma))
-			return do_anonymous_page(vmf);
-		else
-			return do_fault(vmf);
-	}
+	if (!vmf->pte)
+		return do_pte_missing(vmf);
 
 	if (!pte_present(vmf->orig_pte))
 		return do_swap_page(vmf);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 231929f119d9..455f7051098f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -276,7 +276,15 @@ static long change_pte_range(struct mmu_gather *tlb,
 		} else {
 			/* It must be an none page, or what else?.. */
 			WARN_ON_ONCE(!pte_none(oldpte));
-			if (unlikely(uffd_wp && !vma_is_anonymous(vma))) {
+
+			/*
+			 * Nobody plays with any none ptes besides
+			 * userfaultfd when applying the protections.
+			 */
+			if (likely(!uffd_wp))
+				continue;
+
+			if (userfaultfd_wp_use_markers(vma)) {
 				/*
 				 * For file-backed mem, we need to be able to
 				 * wr-protect a none pte, because even if the
@@ -320,23 +328,46 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
 	return 0;
 }
 
-/* Return true if we're uffd wr-protecting file-backed memory, or false */
+/*
+ * Return true if we want to split huge thps in change protection
+ * procedure, false otherwise.
+ */
 static inline bool
-uffd_wp_protect_file(struct vm_area_struct *vma, unsigned long cp_flags)
+pgtable_split_needed(struct vm_area_struct *vma, unsigned long cp_flags)
 {
+	/*
+	 * pte markers only resides in pte level, if we need pte markers,
+	 * we need to split. We cannot wr-protect shmem thp because file
+	 * thp is handled differently when split by erasing the pmd so far.
+	 */
 	return (cp_flags & MM_CP_UFFD_WP) && !vma_is_anonymous(vma);
 }
 
 /*
- * If wr-protecting the range for file-backed, populate pgtable for the case
- * when pgtable is empty but page cache exists. When {pte|pmd|...}_alloc()
- * failed we treat it the same way as pgtable allocation failures during
- * page faults by kicking OOM and returning error.
+ * Return true if we want to populate pgtables in change protection + * procedure, false otherwise + */ +static inline bool +pgtable_populate_needed(struct vm_area_struct *vma, unsigned long cp_flags) +{ + /* If not within ioctl(UFFDIO_WRITEPROTECT), then don't bother */ + if (!(cp_flags & MM_CP_UFFD_WP)) + return false; + + /* Populate if the userfaultfd mode requires pte markers */ + return userfaultfd_wp_use_markers(vma); +} + +/* + * Populate the pgtable underneath for whatever reason if requested. + * When {pte|pmd|...}_alloc() failed we treat it the same way as pgtable + * allocation failures during page faults by kicking OOM and returning + * error. */ #define change_pmd_prepare(vma, pmd, cp_flags) \ ({ \ long err = 0; \ - if (unlikely(uffd_wp_protect_file(vma, cp_flags))) { \ + if (unlikely(pgtable_populate_needed(vma, cp_flags))) { \ if (pte_alloc(vma->vm_mm, pmd)) \ err = -ENOMEM; \ } \ @@ -351,7 +382,7 @@ uffd_wp_protect_file(struct vm_area_struct *vma, unsigned long cp_flags) #define change_prepare(vma, high, low, addr, cp_flags) \ ({ \ long err = 0; \ - if (unlikely(uffd_wp_protect_file(vma, cp_flags))) { \ + if (unlikely(pgtable_populate_needed(vma, cp_flags))) { \ low##_t *p = low##_alloc(vma->vm_mm, high, addr); \ if (p == NULL) \ err = -ENOMEM; \ @@ -404,7 +435,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb, if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { if ((next - addr != HPAGE_PMD_SIZE) || - uffd_wp_protect_file(vma, cp_flags)) { + pgtable_split_needed(vma, cp_flags)) { __split_huge_pmd(vma, pmd, addr, false, NULL); /* * For file-backed, the pmd could have been From patchwork Thu Mar 9 22:37:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13168422 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org 
(kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 734A0C61DA4 for ; Thu, 9 Mar 2023 22:37:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1582F280002; Thu, 9 Mar 2023 17:37:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 106636B007B; Thu, 9 Mar 2023 17:37:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E9C33280002; Thu, 9 Mar 2023 17:37:23 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id DAD786B0078 for ; Thu, 9 Mar 2023 17:37:23 -0500 (EST) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id B07B81C69A5 for ; Thu, 9 Mar 2023 22:37:23 +0000 (UTC) X-FDA: 80550822366.11.EB4D9F8 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf10.hostedemail.com (Postfix) with ESMTP id A7EE8C000D for ; Thu, 9 Mar 2023 22:37:21 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=BRUDcZKV; spf=pass (imf10.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1678401441; a=rsa-sha256; cv=none; b=yx6BMDftXM7F7yCqUo8ao++3aEXj/Kl/W4WFSSIzbVdUoBvhrSz688DO6wRqR60etigsu9 XMSG3K+F9t7FIiaCslzoh+T3pa280PVtdbpFywhYcNTHLOhf0T0p8d1S/txy6lMQQQJ0nA 245EjOPZ0gnE6Yu34oVZBS0VY/aYi3E= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=BRUDcZKV; spf=pass (imf10.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass 
(policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1678401441; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=dh3vnUO1CryjqGh5pf4N+hTFZfi7U8elDbn6yAru0xw=; b=16T/OeYDhKBCd7ofnWvGEa3d8pN7BfzUFvDlTE9Ml0K3rdJQpOWamKASBRw2xXMOnJnR4/ ns+sUEXmRKFdqBsHGFdMNCTaiaXPT/8TUEGtHuWoqwij0s8g5KRO0PE6Lzta5fQPOxA88d UHp9HrHAEW8tqlGbmUgh08HHeiCcqQo= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678401441; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dh3vnUO1CryjqGh5pf4N+hTFZfi7U8elDbn6yAru0xw=; b=BRUDcZKVwdvXXFJL0sJS0RfFmdSd0Vgdjputs3nGxjEJ0Id2lJPFiXxvryQQY6HbxxweEJ qC9HM8IwdasPnetzBCJAF+p1jdaMunjN1glYRhhhV/Zn2FmeriI7WPOBxY4Tprhgix5xzz agb05AcMEUhw2BSrdoUajGIhzit6buk= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-90-pbnZdw2tNQm0QMbJzutyFw-1; Thu, 09 Mar 2023 17:37:20 -0500 X-MC-Unique: pbnZdw2tNQm0QMbJzutyFw-1 Received: by mail-qk1-f199.google.com with SMTP id x5-20020a05620a01e500b007428997e800so2059856qkn.10 for ; Thu, 09 Mar 2023 14:37:20 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678401438; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dh3vnUO1CryjqGh5pf4N+hTFZfi7U8elDbn6yAru0xw=; 
b=gzrUXlAwTdthBTuUWIzYB4qJJ0PEjBbh4N9FJOVlLX2ZAIXv8B3csOl+46U99l9Hnf Qnh4lpXuPZCrIUiJmuEWCqz4upmA1SvBJYy95SspFSraB9qR4SqfolAn3I8+x6cLtIT4 GuukJLqycp3/eBf2GIM9LCqu4uj1aMltVGUT0MdPI5MBRqn21PiCguz8d51yXFFbyNFf 9VRVInyNMNbubY57HIKAP9ayTHXaZ0uaTGuuOUfdLXmWuinDHio1KC28qrXbPWyz+bpo hTa8lKhFsqVKvK3RfZhZZ5EM6+84afRWn8uK9PpT493RwRccJ+PX2ixceMDLK6BmIwq2 wdDg== X-Gm-Message-State: AO0yUKUwKGGbvh6AtkGvs6m/0Nf6iDl7dYIlFCTz4HyNflyh2bf6rZFK UQXHolM/vQXF95KlAkLwVgolPc7SEj1iiGrYC/VOAyEwctOnLLXDuknY9yZSCScaTy7jivs+O1n G+Pgt1EV/SCo+FF7Rtib/slbShZkpOrwII/7Ad+ILdzJmqUwDoD1ocTWNY76NEvj4dTeA X-Received: by 2002:a05:622a:1a09:b0:3bf:c458:5bac with SMTP id f9-20020a05622a1a0900b003bfc4585bacmr1271821qtb.0.1678401438568; Thu, 09 Mar 2023 14:37:18 -0800 (PST) X-Google-Smtp-Source: AK7set/zs42/Izdu6QlfJ7NnX3Mn0ZoSSqNN1L96LiNcygyJm4jXsLB3B/nRmD9vB/SX58+KtB6MDA== X-Received: by 2002:a05:622a:1a09:b0:3bf:c458:5bac with SMTP id f9-20020a05622a1a0900b003bfc4585bacmr1271786qtb.0.1678401438212; Thu, 09 Mar 2023 14:37:18 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. 
[70.30.145.63]) by smtp.gmail.com with ESMTPSA id c26-20020ac84e1a000000b003b8484fdfccsm172215qtw.42.2023.03.09.14.37.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 09 Mar 2023 14:37:16 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Axel Rasmussen , Paul Gofman , Muhammad Usama Anjum , David Hildenbrand , Mike Rapoport , Andrea Arcangeli , peterx@redhat.com, Andrew Morton Subject: [PATCH v4 2/2] selftests/mm: Smoke test UFFD_FEATURE_WP_UNPOPULATED Date: Thu, 9 Mar 2023 17:37:11 -0500 Message-Id: <20230309223711.823547-3-peterx@redhat.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230309223711.823547-1-peterx@redhat.com> References: <20230309223711.823547-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspam-User: X-Rspamd-Queue-Id: A7EE8C000D X-Rspamd-Server: rspam01 X-Stat-Signature: m8b3mm3xrii697ko4uudbas1n7jsq13t X-HE-Tag: 1678401441-563234 X-HE-Meta: U2FsdGVkX18PhsYbUANCir2mqOFeqjNC3OuhwVTk+E6OWQdZvYPuSKb2Ro2PkEPZiBOn7Nb5MAoBNBI3KwKgunUsnlYxJtvvKf4mgWYHgNBwQp0g453Mi3GcQGRCWYLeV2pT81N80p0OBBCwxmpwMuoBib+BOk9WuzJVKhC44tCGKgroYQI/IZ2fwqrt7Xlk/fXJh0Sei+G207wmfIhXkrV2ezxrhqq8lTopXD9eotGFZkKUA44/gnxOiXMvB6qEHRTeAwDAc9IsRGR/Ei4tRrTrXV0AmX0juGhXrV/cFx7+hgL9fn72BFQgh+Wloja41f6vBreNa4ROsKpv/V0UQaETIywFzAnAmhSkphBBD9qiKMoZ/eAKok8SSz+LdYmoe+B5wt2Dixhn4VedDk7ObRhW1nOIgOgWXWTkg9RdT9532TItv+TPGtIKEe02ZSTLcQ5JWr2W5C/KzqPaHOx2iFqTAiisMJiGGnUhHx28Bdd3w02PnrbtLO78aookjxBA0nCwJ9isaNrNz7pPcLrJ1I6+Yj2wdHGi4ONuw+jmRYXTLCXJA6xkw2rOB8MdzrMi+ofa8ubSGUG9cQXGH6fa6tg9EpvEwjAhLyE6aIJIHn5xHHUwzVvr8PU7VuqsL3Qc1X2IeJRc/dfydmyBTkKyNwRJSjqLbZSFqvyKI1iOfp3LnDOOvVpg7SfXk7oq2lSxzRNMoWqNvJwe2X0ka055g5JUWinvs5OAkp7PE1mljm6qMe4wTz1ae3LoIg9Hm8mXOlM+h7XtvQN8CC871QCQyNUsyT/5xLBfD9ds5mSpzKkBivvkFoZXRf8B35ey9wFMHFdxG2tMUNCqEMHJUITCvPDAoYiirMIS+Kdb39Fj/l6uajoqrBe3lEOkVwbik2Ax1vUUeovflhl1a7ZPjRtqGmon2F5bgfbJ3UoPqKRlLIRG3DOX4AxBsFhhbAEfGTznB3LKR5IHgUiUmQPRxzs FB9i5xEy 

Enable UFFD_FEATURE_WP_UNPOPULATED by default in the stress test, and add
some smoke tests for the pte markers on anonymous memory.

Signed-off-by: Peter Xu
---
 tools/testing/selftests/mm/userfaultfd.c | 45 ++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/mm/userfaultfd.c b/tools/testing/selftests/mm/userfaultfd.c
index 7f22844ed704..e030d63c031a 100644
--- a/tools/testing/selftests/mm/userfaultfd.c
+++ b/tools/testing/selftests/mm/userfaultfd.c
@@ -1444,6 +1444,43 @@ static int pagemap_test_fork(bool present)
 	return result;
 }
 
+static void userfaultfd_wp_unpopulated_test(int pagemap_fd)
+{
+	uint64_t value;
+
+	/* Test applying pte marker to anon unpopulated */
+	wp_range(uffd, (uint64_t)area_dst, page_size, true);
+	value = pagemap_read_vaddr(pagemap_fd, area_dst);
+	pagemap_check_wp(value, true);
+
+	/* Test unprotect on anon pte marker */
+	wp_range(uffd, (uint64_t)area_dst, page_size, false);
+	value = pagemap_read_vaddr(pagemap_fd, area_dst);
+	pagemap_check_wp(value, false);
+
+	/* Test zap on anon marker */
+	wp_range(uffd, (uint64_t)area_dst, page_size, true);
+	if (madvise(area_dst, page_size, MADV_DONTNEED))
+		err("madvise(MADV_DONTNEED) failed");
+	value = pagemap_read_vaddr(pagemap_fd, area_dst);
+	pagemap_check_wp(value, false);
+
+	/* Test fault in after marker removed */
+	*area_dst = 1;
+	value = pagemap_read_vaddr(pagemap_fd, area_dst);
+	pagemap_check_wp(value, false);
+	/* Drop it to make pte none again */
+	if (madvise(area_dst, page_size, MADV_DONTNEED))
+		err("madvise(MADV_DONTNEED) failed");
+
+	/* Test read-zero-page upon pte marker */
+	wp_range(uffd, (uint64_t)area_dst, page_size, true);
+	*(volatile char *)area_dst;
+	/* Drop it to make pte none again */
+	if (madvise(area_dst, page_size, MADV_DONTNEED))
+		err("madvise(MADV_DONTNEED) failed");
+}
+
 static void userfaultfd_pagemap_test(unsigned int test_pgsize)
 {
 	struct uffdio_register uffdio_register;
@@ -1462,7 +1499,7 @@ static void userfaultfd_pagemap_test(unsigned int test_pgsize)
 	/* Flush so it doesn't flush twice in parent/child later */
 	fflush(stdout);
 
-	uffd_test_ctx_init(0);
+	uffd_test_ctx_init(UFFD_FEATURE_WP_UNPOPULATED);
 
 	if (test_pgsize > page_size) {
 		/* This is a thp test */
@@ -1482,6 +1519,10 @@ static void userfaultfd_pagemap_test(unsigned int test_pgsize)
 
 	pagemap_fd = pagemap_open();
 
+	/* Smoke test WP_UNPOPULATED first when it's still empty */
+	if (test_pgsize == page_size)
+		userfaultfd_wp_unpopulated_test(pagemap_fd);
+
 	/* Touch the page */
 	*area_dst = 1;
 	wp_range(uffd, (uint64_t)area_dst, test_pgsize, true);
@@ -1526,7 +1567,7 @@ static int userfaultfd_stress(void)
 	struct uffdio_register uffdio_register;
 	struct uffd_stats uffd_stats[nr_cpus];
 
-	uffd_test_ctx_init(0);
+	uffd_test_ctx_init(UFFD_FEATURE_WP_UNPOPULATED);
 
 	if (posix_memalign(&area, page_size, page_size))
 		err("out of memory");
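
Note for readers: the smoke test above leans on pagemap_read_vaddr() and pagemap_check_wp(), which are existing selftest helpers and are not part of this diff. What follows is only a rough sketch of what such helpers might look like, not the selftest's actual code. It assumes the layout described in the kernel's pagemap documentation, where each page has one 64-bit entry in /proc/self/pagemap and bit 57 reports uffd-wp. The names PAGEMAP_UFFD_WP, pagemap_entry_uffd_wp(), pagemap_offset() and pagemap_read_entry() are hypothetical and chosen for illustration.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: bit 57 of a pagemap entry means "pte is uffd-wp protected". */
#define PAGEMAP_UFFD_WP (1ULL << 57)

/* Return true when the pagemap entry carries the uffd-wp bit. */
static inline bool pagemap_entry_uffd_wp(uint64_t entry)
{
	return (entry & PAGEMAP_UFFD_WP) != 0;
}

/* Byte offset of the 64-bit pagemap entry that describes vaddr. */
static inline off_t pagemap_offset(uintptr_t vaddr, long page_size)
{
	return (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/*
 * Read the entry for vaddr from an already opened /proc/self/pagemap fd.
 * Returns 0 on success and -1 on a failed or short read.
 */
static inline int pagemap_read_entry(int pagemap_fd, const void *vaddr,
				     uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	ssize_t n = pread(pagemap_fd, entry, sizeof(*entry),
			  pagemap_offset((uintptr_t)vaddr, page_size));

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Read against the test above: with UFFD_FEATURE_WP_UNPOPULATED enabled, the entry for area_dst is expected to report the bit right after wp_range(..., true) even though the page was never touched, and to stop reporting it after wp_range(..., false) or madvise(MADV_DONTNEED).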