From patchwork Sat Feb 18 00:27:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145380
Date: Sat, 18 Feb 2023 00:27:47 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-15-jthoughton@google.com>
Subject: [PATCH v2 14/46] hugetlb: split PTE markers when doing HGM walks
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi,
 "Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka,
 Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

Fix how UFFDIO_CONTINUE and UFFDIO_WRITEPROTECT interact in these two
ways:
 - UFFDIO_WRITEPROTECT no longer prevents a high-granularity
   UFFDIO_CONTINUE.
 - UFFD-WP PTE markers installed with UFFDIO_WRITEPROTECT will be
   properly propagated when high-granularity UFFDIO_CONTINUEs are
   performed.

Note: UFFDIO_WRITEPROTECT is not yet permitted at PAGE_SIZE granularity.

Signed-off-by: James Houghton
Acked-by: Mike Kravetz

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 810c05feb41f..f74183acc521 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -506,6 +506,30 @@ static bool has_same_uncharge_info(struct file_region *rg,
 #endif
 }

+static void hugetlb_install_markers_pmd(pmd_t *pmdp, pte_marker marker)
+{
+	int i;
+
+	for (i = 0; i < PTRS_PER_PMD; ++i)
+		/*
+		 * WRITE_ONCE not needed because the pud hasn't been
+		 * installed yet.
+		 */
+		pmdp[i] = __pmd(pte_val(make_pte_marker(marker)));
+}
+
+static void hugetlb_install_markers_pte(pte_t *ptep, pte_marker marker)
+{
+	int i;
+
+	for (i = 0; i < PTRS_PER_PTE; ++i)
+		/*
+		 * WRITE_ONCE not needed because the pmd hasn't been
+		 * installed yet.
+		 */
+		ptep[i] = make_pte_marker(marker);
+}
+
 /*
  * hugetlb_alloc_pmd -- Allocate or find a PMD beneath a PUD-level hpte.
  *
@@ -528,23 +552,32 @@ pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte,
 	pmd_t *new;
 	pud_t *pudp;
 	pud_t pud;
+	bool is_marker;
+	pte_marker marker;

 	if (hpte->level != HUGETLB_LEVEL_PUD)
 		return ERR_PTR(-EINVAL);

 	pudp = (pud_t *)hpte->ptep;
 retry:
+	is_marker = false;
 	pud = READ_ONCE(*pudp);
 	if (likely(pud_present(pud)))
 		return unlikely(pud_leaf(pud))
 			? ERR_PTR(-EEXIST)
 			: pmd_offset(pudp, addr);
-	else if (!pud_none(pud))
+	else if (!pud_none(pud)) {
 		/*
-		 * Not present and not none means that a swap entry lives here,
-		 * and we can't get rid of it.
+		 * Not present and not none means that a swap entry lives here.
+		 * If it's a PTE marker, we can deal with it. If it's another
+		 * swap entry, we don't attempt to split it.
 		 */
-		return ERR_PTR(-EEXIST);
+		is_marker = is_pte_marker(__pte(pud_val(pud)));
+		if (!is_marker)
+			return ERR_PTR(-EEXIST);
+
+		marker = pte_marker_get(pte_to_swp_entry(__pte(pud_val(pud))));
+	}

 	new = pmd_alloc_one(mm, addr);
 	if (!new)
@@ -557,6 +590,13 @@ pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte,
 		goto retry;
 	}

+	/*
+	 * Install markers before PUD to avoid races with other
+	 * page table walks.
+	 */
+	if (is_marker)
+		hugetlb_install_markers_pmd(new, marker);
+
 	mm_inc_nr_pmds(mm);
 	smp_wmb(); /* See comment in pmd_install() */
 	pud_populate(mm, pudp, new);
@@ -576,23 +616,32 @@ pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte,
 	pgtable_t new;
 	pmd_t *pmdp;
 	pmd_t pmd;
+	bool is_marker;
+	pte_marker marker;

 	if (hpte->level != HUGETLB_LEVEL_PMD)
 		return ERR_PTR(-EINVAL);

 	pmdp = (pmd_t *)hpte->ptep;
 retry:
+	is_marker = false;
 	pmd = READ_ONCE(*pmdp);
 	if (likely(pmd_present(pmd)))
 		return unlikely(pmd_leaf(pmd))
 			? ERR_PTR(-EEXIST)
 			: pte_offset_kernel(pmdp, addr);
-	else if (!pmd_none(pmd))
+	else if (!pmd_none(pmd)) {
 		/*
-		 * Not present and not none means that a swap entry lives here,
-		 * and we can't get rid of it.
+		 * Not present and not none means that a swap entry lives here.
+		 * If it's a PTE marker, we can deal with it. If it's another
+		 * swap entry, we don't attempt to split it.
 		 */
-		return ERR_PTR(-EEXIST);
+		is_marker = is_pte_marker(__pte(pmd_val(pmd)));
+		if (!is_marker)
+			return ERR_PTR(-EEXIST);
+
+		marker = pte_marker_get(pte_to_swp_entry(__pte(pmd_val(pmd))));
+	}

 	/*
 	 * With CONFIG_HIGHPTE, calling `pte_alloc_one` directly may result
@@ -613,6 +662,9 @@ pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte,
 		goto retry;
 	}

+	if (is_marker)
+		hugetlb_install_markers_pte(page_address(new), marker);
+
 	mm_inc_nr_ptes(mm);
 	smp_wmb(); /* See comment in pmd_install() */
 	pmd_populate(mm, pmdp, new);
@@ -7384,7 +7436,12 @@ static int __hugetlb_hgm_walk(struct mm_struct *mm, struct vm_area_struct *vma,
 		if (!pte_present(pte)) {
 			if (!alloc)
 				return 0;
-			if (unlikely(!huge_pte_none(pte)))
+			/*
+			 * In hugetlb_alloc_pmd and hugetlb_alloc_pte,
+			 * we split PTE markers, so we can tolerate
+			 * PTE markers here.
+			 */
+			if (unlikely(!huge_pte_none_mostly(pte)))
 				return -EEXIST;
 		} else if (hugetlb_pte_present_leaf(hpte, pte))
 			return 0;
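
For readers who want to see the splitting idea in isolation, here is a
rough userspace model of what hugetlb_alloc_pmd() and hugetlb_alloc_pte()
do with a marker: read the old entry, fill the freshly allocated child
table with copies of the marker, and only then publish the child in the
parent slot. Everything prefixed toy_/TOY_ (the entry encoding, the table
size, the function names) is made up for this illustration and is not
kernel API; locking, the retry loop, and memory barriers are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical toy encoding (NOT the kernel's): 0 means "none",
 * bit 0 means "present" (points at a child table), bit 1 means
 * "marker", and the marker payload sits in the bits above.
 */
#define TOY_ENTRIES 8
#define TOY_PRESENT 0x1u
#define TOY_MARKER  0x2u

typedef uint64_t toy_entry;

static toy_entry toy_make_marker(unsigned int marker)
{
	return ((toy_entry)marker << 2) | TOY_MARKER;
}

static bool toy_is_marker(toy_entry e)
{
	return !(e & TOY_PRESENT) && (e & TOY_MARKER);
}

static unsigned int toy_marker_get(toy_entry e)
{
	return (unsigned int)(e >> 2);
}

/*
 * Replace the entry at *slot with a child table. If the old entry was
 * a marker, every child entry inherits it. Returns the child table, or
 * NULL if the slot is already present, holds some other non-present
 * entry, or allocation fails.
 */
static toy_entry *toy_split(toy_entry *slot)
{
	toy_entry old, *child;
	bool is_marker = false;
	unsigned int marker = 0;
	int i;

	assert(slot);
	old = *slot;

	if (old & TOY_PRESENT)
		return NULL;
	if (old != 0) {
		/* Not present and not none: only markers are split. */
		is_marker = toy_is_marker(old);
		if (!is_marker)
			return NULL;
		marker = toy_marker_get(old);
	}

	child = calloc(TOY_ENTRIES, sizeof(*child));
	if (!child)
		return NULL;

	/* Fill the child before it becomes reachable from the parent. */
	if (is_marker)
		for (i = 0; i < TOY_ENTRIES; ++i)
			child[i] = toy_make_marker(marker);

	/* Publish: calloc() alignment leaves the low bits free. */
	*slot = (toy_entry)(uintptr_t)child | TOY_PRESENT;
	return child;
}
```

Calling toy_split() on a slot holding toy_make_marker(1) should yield a
child table whose entries all decode to marker 1, while a slot holding
any other non-present value is left untouched, which mirrors the -EEXIST
path in the patch.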