From patchwork Thu Feb 4 18:34:25 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12070409
Date: Thu, 4 Feb 2021 10:34:25 -0800
In-Reply-To: <20210204183433.1431202-1-axelrasmussen@google.com>
Message-Id: <20210204183433.1431202-3-axelrasmussen@google.com>
References: <20210204183433.1431202-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog
Subject: [PATCH v4 02/10] hugetlb/userfaultfd: Forbid huge pmd sharing when
 uffd enabled
From: Axel Rasmussen
To: Alexander Viro, Alexey Dobriyan, Andrea Arcangeli, Andrew Morton,
 Anshuman Khandual, Catalin Marinas, Chinwen Chang, Huang Ying,
 Ingo Molnar, Jann Horn, Jerome Glisse, Lokesh Gidra,
 "Matthew Wilcox (Oracle)", Michael Ellerman, Michal Koutný,
 Michel Lespinasse, Mike Kravetz, Mike Rapoport, Nicholas Piggin,
 Peter Xu, Shaohua Li, Shawn Anastasio, Steven Rostedt, Steven Price,
 Vlastimil Babka
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, Adam Ruprecht, Axel Rasmussen, Cannon Matthews,
 "Dr . David Alan Gilbert", David Rientjes, Mina Almasry, Oliver Upton

From: Peter Xu

Huge pmd sharing can cause problems for userfaultfd.  Userfaultfd bases
its logic on special bits in page table entries, but huge pmd sharing can
share page table entries across different address ranges.
That can cause issues in either of these cases:

  - When sharing huge pmd page tables for a uffd write-protected range,
    the newly mapped huge pmd range will unexpectedly be write-protected
    as well, or,

  - When we try to write-protect a range that is huge pmd shared, we first
    do huge_pmd_unshare() in hugetlb_change_protection().  However, that
    also means UFFDIO_WRITEPROTECT could be silently skipped for the
    shared region, which could lead to data loss.

While at it, a few other things are done together:

  - Move want_pmd_share() from mm/hugetlb.c into linux/hugetlb.h, because
    arch code needs to use it too.

  - ARM64 currently checks CONFIG_ARCH_WANT_HUGE_PMD_SHARE directly when
    trying to share a huge pmd.  Switch it to the want_pmd_share() helper.

Also move the vma_shareable() check from huge_pmd_share() into
want_pmd_share().

Signed-off-by: Peter Xu
Signed-off-by: Axel Rasmussen
---
 arch/arm64/mm/hugetlbpage.c   |  3 +--
 include/linux/hugetlb.h       |  2 ++
 include/linux/userfaultfd_k.h |  9 +++++++++
 mm/hugetlb.c                  | 20 ++++++++++++++------
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 6e3bcffe2837..58987a98e179 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -284,8 +284,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		 */
 		ptep = pte_alloc_map(mm, pmdp, addr);
 	} else if (sz == PMD_SIZE) {
-		if (IS_ENABLED(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) &&
-		    pud_none(READ_ONCE(*pudp)))
+		if (want_pmd_share(vma, addr) && pud_none(READ_ONCE(*pudp)))
 			ptep = huge_pmd_share(mm, vma, addr, pudp);
 		else
 			ptep = (pte_t *)pmd_alloc(mm, pudp, addr);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 92e8799edffd..f6d5939a6eb0 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -948,4 +948,6 @@ static inline __init void hugetlb_cma_check(void)
 }
 #endif
 
+bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
+
 #endif /* _LINUX_HUGETLB_H */
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index a8e5f3ea9bb2..c63ccdae3eab 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -52,6 +52,15 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
 	return vma->vm_userfaultfd_ctx.ctx == vm_ctx.ctx;
 }
 
+/*
+ * Never enable huge pmd sharing on uffd-wp registered vmas, because uffd-wp
+ * protect information is per pgtable entry.
+ */
+static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & VM_UFFD_MISSING;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3185631f61bc..588c4c28c44d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5251,6 +5251,18 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
 	return false;
 }
 
+bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
+{
+#ifndef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+	return false;
+#endif
+#ifdef CONFIG_USERFAULTFD
+	if (uffd_disable_huge_pmd_share(vma))
+		return false;
+#endif
+	return vma_shareable(vma, addr);
+}
+
 /*
  * Determine if start,end range within vma could be mapped by shared pmd.
  * If yes, adjust start and end to cover range associated with possible
@@ -5305,9 +5317,6 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	if (!vma_shareable(vma, addr))
-		return (pte_t *)pmd_alloc(mm, pud, addr);
-
 	i_mmap_assert_locked(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
@@ -5371,7 +5380,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 	*addr = ALIGN(*addr, HPAGE_SIZE * PTRS_PER_PTE) - HPAGE_SIZE;
 	return 1;
 }
-#define want_pmd_share()	(1)
+
 #else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct vma,
 		      unsigned long addr, pud_t *pud)
@@ -5389,7 +5398,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 				unsigned long *start, unsigned long *end)
 {
 }
-#define want_pmd_share()	(0)
 #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
 
 #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
@@ -5411,7 +5419,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 			pte = (pte_t *)pud;
 		} else {
 			BUG_ON(sz != PMD_SIZE);
-			if (want_pmd_share() && pud_none(*pud))
+			if (want_pmd_share(vma, addr) && pud_none(*pud))
 				pte = huge_pmd_share(mm, vma, addr, pud);
 			else
 				pte = (pte_t *)pmd_alloc(mm, pud, addr);
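
For readers who want to follow the new decision order outside the kernel
tree, here is a minimal userspace sketch of the check that
want_pmd_share() performs after this patch.  It is only a model under
stated assumptions: struct model_vma, struct model_config,
MODEL_VM_UFFD_WP and the model_* functions are hypothetical stand-ins
invented for illustration, the two CONFIG_ options are modelled as runtime
booleans rather than preprocessor symbols, and the result of
vma_shareable() is reduced to a precomputed flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for VM_UFFD_WP; the bit value is arbitrary. */
#define MODEL_VM_UFFD_WP 0x1UL

/* Hypothetical, heavily reduced stand-in for struct vm_area_struct. */
struct model_vma {
	unsigned long vm_flags;
	bool shareable;	/* models the result of vma_shareable(vma, addr) */
};

/* Models the two build-time options as runtime switches (assumption). */
struct model_config {
	bool arch_want_huge_pmd_share;	/* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
	bool userfaultfd;		/* CONFIG_USERFAULTFD */
};

/* Mirrors uffd_disable_huge_pmd_share(): uffd-wp vmas must not share. */
static bool model_uffd_disable_huge_pmd_share(const struct model_vma *vma)
{
	return (vma->vm_flags & MODEL_VM_UFFD_WP) != 0;
}

/* Mirrors the order of checks in the patched want_pmd_share(). */
static bool model_want_pmd_share(const struct model_config *cfg,
				 const struct model_vma *vma)
{
	/* 1. The architecture must opt in to huge pmd sharing at all. */
	if (!cfg->arch_want_huge_pmd_share)
		return false;

	/* 2. uffd-wp state is per page table entry, so never share. */
	if (cfg->userfaultfd && model_uffd_disable_huge_pmd_share(vma))
		return false;

	/* 3. Otherwise fall back to the existing shareability test. */
	return vma->shareable;
}
```

With both options modelled as enabled, a shareable vma without the
write-protect flag yields true, while the same vma with MODEL_VM_UFFD_WP
set yields false, which is the behaviour the patch is after.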