From patchwork Thu Jul 15 20:16:26 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12380959
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Axel Rasmussen, Nadav Amit, Jerome Glisse, "Kirill A . Shutemov",
 Jason Gunthorpe, Alistair Popple, Andrew Morton, David Hildenbrand,
 peterx@redhat.com, Andrea Arcangeli, Matthew Wilcox, Mike Kravetz,
 Tiberiu Georgescu, Hugh Dickins, Miaohe Lin, Mike Rapoport
Subject: [PATCH v5 18/26] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
Date: Thu, 15 Jul 2021 16:16:26 -0400
Message-Id: <20210715201626.211813-1-peterx@redhat.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210715201422.211004-1-peterx@redhat.com>
References: <20210715201422.211004-1-peterx@redhat.com>

Firstly, pass the wp_copy variable into hugetlb_mcopy_atomic_pte() throughout
the stack.
Then, apply the UFFD_WP bit if UFFDIO_COPY_MODE_WP is specified with
UFFDIO_COPY. Introduce huge_pte_mkuffd_wp() for it.

Hugetlb pages are only managed by hugetlbfs, so we're safe even without
setting the dirty bit in the huge pte if the page is installed as read-only.
However, we'd better still keep the dirty bit set for a read-only UFFDIO_COPY
pte (when the UFFDIO_COPY_MODE_WP bit is set), not only to match what we do
with shmem, but also because the page does contain dirty data that the kernel
just copied from userspace.

Signed-off-by: Peter Xu
Reported-by: kernel test robot
---
 include/linux/hugetlb.h |  6 ++++--
 mm/hugetlb.c            | 22 +++++++++++++++++-----
 mm/userfaultfd.c        | 12 ++++++++----
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c30f39815e13..fcdbf9f46d85 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -155,7 +155,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte,
 				unsigned long dst_addr,
 				unsigned long src_addr,
 				enum mcopy_atomic_mode mode,
-				struct page **pagep);
+				struct page **pagep,
+				bool wp_copy);
 #endif /* CONFIG_USERFAULTFD */
 bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
 						struct vm_area_struct *vma,
@@ -336,7 +337,8 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 						unsigned long dst_addr,
 						unsigned long src_addr,
 						enum mcopy_atomic_mode mode,
-						struct page **pagep)
+						struct page **pagep,
+						bool wp_copy)
 {
 	BUG();
 	return 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d34636085eaf..880cb2137d04 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5141,7 +5141,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 			    unsigned long dst_addr,
 			    unsigned long src_addr,
 			    enum mcopy_atomic_mode mode,
-			    struct page **pagep)
+			    struct page **pagep,
+			    bool wp_copy)
 {
 	bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
 	struct hstate *h = hstate_vma(dst_vma);
@@ -5277,17 +5278,28 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		hugepage_add_new_anon_rmap(page, dst_vma, dst_addr);
 	}
 
-	/* For CONTINUE on a non-shared VMA, don't set VM_WRITE for CoW. */
-	if (is_continue && !vm_shared)
+	/*
+	 * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY
+	 * with wp flag set, don't set pte write bit.
+	 */
+	if (wp_copy || (is_continue && !vm_shared))
 		writable = 0;
 	else
 		writable = dst_vma->vm_flags & VM_WRITE;
 
 	_dst_pte = make_huge_pte(dst_vma, page, writable);
-	if (writable)
-		_dst_pte = huge_pte_mkdirty(_dst_pte);
+	/*
+	 * Always mark UFFDIO_COPY page dirty; note that this may not be
+	 * extremely important for hugetlbfs for now since swapping is not
+	 * supported, but we should still be clear in that this page cannot be
+	 * thrown away at will, even if write bit not set.
+	 */
+	_dst_pte = huge_pte_mkdirty(_dst_pte);
 	_dst_pte = pte_mkyoung(_dst_pte);
 
+	if (wp_copy)
+		_dst_pte = huge_pte_mkuffd_wp(_dst_pte);
+
 	set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
 
 	(void)huge_ptep_set_access_flags(dst_vma, dst_addr, dst_pte, _dst_pte,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 0c7212dfb95d..501d6b9f7a5a 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -297,7 +297,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 					      unsigned long dst_start,
 					      unsigned long src_start,
 					      unsigned long len,
-					      enum mcopy_atomic_mode mode)
+					      enum mcopy_atomic_mode mode,
+					      bool wp_copy)
 {
 	int vm_shared = dst_vma->vm_flags & VM_SHARED;
 	ssize_t err;
@@ -393,7 +394,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		}
 
 		err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma,
-					       dst_addr, src_addr, mode, &page);
+					       dst_addr, src_addr, mode, &page,
+					       wp_copy);
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		i_mmap_unlock_read(mapping);
@@ -448,7 +450,8 @@ extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 				      unsigned long dst_start,
 				      unsigned long src_start,
 				      unsigned long len,
-				      enum mcopy_atomic_mode mode);
+				      enum mcopy_atomic_mode mode,
+				      bool wp_copy);
 #endif /* CONFIG_HUGETLB_PAGE */
 
 static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
@@ -568,7 +571,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 	 */
 	if (is_vm_hugetlb_page(dst_vma))
 		return __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start,
-						src_start, len, mcopy_mode);
+						src_start, len, mcopy_mode,
+						wp_copy);
 
 	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
 		goto out_unlock;
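
For readers following the mm/hugetlb.c hunk above, the resulting pte bit decisions can be modeled in plain userspace C. This is only a sketch under stated assumptions: struct pte_model and model_mcopy_pte() are hypothetical names made up for illustration, not kernel types or functions, and vma_write stands in for (dst_vma->vm_flags & VM_WRITE). It ignores everything else make_huge_pte() does.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the huge pte bits touched by the
 * patch. This is NOT a kernel type; it exists only for illustration.
 */
struct pte_model {
	bool write;	/* pte write bit */
	bool dirty;	/* set via huge_pte_mkdirty() in the patch */
	bool young;	/* set via pte_mkyoung() in the patch */
	bool uffd_wp;	/* set via huge_pte_mkuffd_wp() in the patch */
};

/*
 * Model of the decisions at the end of hugetlb_mcopy_atomic_pte() after
 * this patch. vma_write is an assumed stand-in for
 * (dst_vma->vm_flags & VM_WRITE).
 */
static struct pte_model model_mcopy_pte(bool wp_copy, bool is_continue,
					bool vm_shared, bool vma_write)
{
	struct pte_model pte = { false, false, false, false };

	/* (1) CONTINUE on a non-shared VMA, or (2) wp_copy: no write bit. */
	if (wp_copy || (is_continue && !vm_shared))
		pte.write = false;
	else
		pte.write = vma_write;

	/* The page is always marked dirty and young, writable or not. */
	pte.dirty = true;
	pte.young = true;

	/* The uffd-wp bit is applied only when wp_copy is requested. */
	pte.uffd_wp = wp_copy;

	return pte;
}
```

With wp_copy set on a writable shared VMA, the model yields a read-only, dirty pte with the uffd-wp bit set, which is the case the commit message argues for. Without wp_copy the write bit simply follows the VMA, and the pte is still dirty.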