From patchwork Tue Dec 21 15:46:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pasha Tatashin X-Patchwork-Id: 12690005 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFD3FC43217 for ; Tue, 21 Dec 2021 15:46:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B309C6B0093; Tue, 21 Dec 2021 10:46:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B07336B0096; Tue, 21 Dec 2021 10:46:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9808B6B0098; Tue, 21 Dec 2021 10:46:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0141.hostedemail.com [216.40.44.141]) by kanga.kvack.org (Postfix) with ESMTP id 8A0C96B0093 for ; Tue, 21 Dec 2021 10:46:55 -0500 (EST) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 4D6917BE67 for ; Tue, 21 Dec 2021 15:46:55 +0000 (UTC) X-FDA: 78942229590.02.D549482 Received: from mail-qk1-f182.google.com (mail-qk1-f182.google.com [209.85.222.182]) by imf28.hostedemail.com (Postfix) with ESMTP id 1266FC000C for ; Tue, 21 Dec 2021 15:46:54 +0000 (UTC) Received: by mail-qk1-f182.google.com with SMTP id d21so12933596qkl.3 for ; Tue, 21 Dec 2021 07:46:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=RKJqaZULeG1QSA7YB831SR6HERaPIQOIwY733yUqU5M=; b=FKON5HAyn15g62AGhOlT1/tBz+VEw1J67ZHg3wwFLM8ZgF8fXf2LcxUQI365+XnEvp 7WyWvfd+32btAKnG7C3/aIuB3xfZm55cFb5ugAQ8ST1/145P5LenlINlmo3DZmrzo8zb uIo242st9q8BikF5hgObrPFY+SIaDvInrNRImnfXXsqF6KH2Q00gueQ8N0Y+q3y7Fu91 hRhDr+Gco87JRzd8fkqTAtbUrfUNbv7rPdS8FvOkYHKPnyd8jw3J6SFAmeVfo5OTXxwb w4wiwz8v1zVH4sD4khwU/oJlbp9yOeIjT5IipirRF9BJmr1K1xtPleoa1ZgMnGJH0JLT 3BZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RKJqaZULeG1QSA7YB831SR6HERaPIQOIwY733yUqU5M=; b=Ac4cfVnF0n5PBOZ8FLJaRdU5AUU4s04AWhZ3noVfibm2g1jKjqTS+1jvZybAiUFo1w DyW0cp1FPwsnHmkR3WAoJjh/63bQkZLlC9YBmOGAGwVbTEFQ6A7/O4ftSOUNI0IcOIoF pcgzT1wwijFLt1yfnGmmyPsylE7j3ecXct8VKdlH+RzrkF65qXP8F0qkgZ3YQACBUtTy w39d3urv3w5LJZ4zqlMoFti7ds5iEI57HqOplJ2NH6CJGJPhSzPD8k3I//PQH6T6rBbI N5oLE7gLMn8dbJgWhWpOJAuMRxWx/IqfMD1d/G+ZBTz0ycxYHeG2ib7ptDBm9KtrbZxg odwA== X-Gm-Message-State: AOAM532iFfDJOfWUSMiEOmTTVUGmTMeCLth5IHvnCLG6Kk7fO6E+eTsV 6EOkxiizLs2n7TaJBOjX3QqipA== X-Google-Smtp-Source: ABdhPJzcVYX4pffnC7B1BEW7SxoZDfhSLwDnPuVSyGmvG/Ya68GAB7xCw58hA1bjyg8lE09NC2cp3g== X-Received: by 2002:a05:620a:4096:: with SMTP id f22mr2473224qko.600.1640101614024; Tue, 21 Dec 2021 07:46:54 -0800 (PST) Received: from soleen.c.googlers.com.com (189.216.85.34.bc.googleusercontent.com. [34.85.216.189]) by smtp.gmail.com with ESMTPSA id d20sm224588qtg.73.2021.12.21.07.46.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Dec 2021 07:46:53 -0800 (PST) From: Pasha Tatashin To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, pjt@google.com, weixugc@google.com, gthelen@google.com, mingo@redhat.com, corbet@lwn.net, will@kernel.org, rppt@kernel.org, keescook@chromium.org, tglx@linutronix.de, peterz@infradead.org, masahiroy@kernel.org, samitolvanen@google.com, dave.hansen@linux.intel.com, x86@kernel.org, frederic@kernel.org, hpa@zytor.com, aneesh.kumar@linux.ibm.com, jirislaby@kernel.org, songmuchun@bytedance.com, qydwhotmail@gmail.com, hughd@google.com Subject: [PATCH v3 1/4] mm: change page type prior to adding page table entry Date: Tue, 21 Dec 2021 15:46:47 +0000 Message-Id: <20211221154650.1047963-2-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.34.1.307.g9b7440fafd-goog In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com> References: <20211221154650.1047963-1-pasha.tatashin@soleen.com> MIME-Version: 1.0 X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 1266FC000C X-Stat-Signature: qec5ohrnudn9wpsdynhx8uh44of8hj6i Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=soleen.com header.s=google header.b=FKON5HAy; spf=pass (imf28.hostedemail.com: domain of pasha.tatashin@soleen.com designates 209.85.222.182 as permitted sender) smtp.mailfrom=pasha.tatashin@soleen.com; dmarc=none X-HE-Tag: 1640101614-171198 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There are a few places where we first update the entry in the user page table, and later change the struct page to indicate that this is anonymous or file page. In most places, however, we first configure the page metadata and then insert entries into the page table. Page table check, will use the information from struct page to verify the type of entry is inserted. Change the order in all places to first update struct page, and later to update page table. This means that we first do calls that may change the type of page (anon or file): page_move_anon_rmap page_add_anon_rmap do_page_add_anon_rmap page_add_new_anon_rmap page_add_file_rmap hugepage_add_anon_rmap hugepage_add_new_anon_rmap And after that do calls that add entries to the page table: set_huge_pte_at set_pte_at Signed-off-by: Pasha Tatashin --- mm/hugetlb.c | 6 +++--- mm/memory.c | 9 +++++---- mm/migrate.c | 5 ++--- mm/swapfile.c | 4 ++-- 4 files changed, 12 insertions(+), 12 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a1baa198519a..61895cc01d09 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4684,8 +4684,8 @@ hugetlb_install_page(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr struct page *new_page) { __SetPageUptodate(new_page); - set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1)); hugepage_add_new_anon_rmap(new_page, vma, addr); + set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1)); hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm); ClearHPageRestoreReserve(new_page); SetHPageMigratable(new_page); @@ -5259,10 +5259,10 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, /* Break COW */ huge_ptep_clear_flush(vma, haddr, ptep); mmu_notifier_invalidate_range(mm, range.start, range.end); - set_huge_pte_at(mm, haddr, ptep, - make_huge_pte(vma, new_page, 1)); page_remove_rmap(old_page, true); hugepage_add_new_anon_rmap(new_page, vma, haddr); + set_huge_pte_at(mm, haddr, ptep, + make_huge_pte(vma, new_page, 1)); SetHPageMigratable(new_page); /* Make the old page be freed below */ new_page = old_page; diff --git a/mm/memory.c b/mm/memory.c index 71e475d440b0..11cb28a2ca54 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -720,8 +720,6 @@ static void restore_exclusive_pte(struct vm_area_struct *vma, else if (is_writable_device_exclusive_entry(entry)) pte = maybe_mkwrite(pte_mkdirty(pte), vma); - set_pte_at(vma->vm_mm, address, ptep, pte); - /* * No need to take a page reference as one was already * created when the swap entry was made. @@ -735,6 +733,8 @@ static void restore_exclusive_pte(struct vm_area_struct *vma, */ WARN_ON_ONCE(!PageAnon(page)); + set_pte_at(vma->vm_mm, address, ptep, pte); + if (vma->vm_flags & VM_LOCKED) mlock_vma_page(page); @@ -3635,8 +3635,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) pte = pte_mkuffd_wp(pte); pte = pte_wrprotect(pte); } - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte); - arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte); vmf->orig_pte = pte; /* ksm created a completely new copy */ @@ -3647,6 +3645,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) do_page_add_anon_rmap(page, vma, vmf->address, exclusive); } + set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte); + arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte); + swap_free(entry); if (mem_cgroup_swap_full(page) || (vma->vm_flags & VM_LOCKED) || PageMlocked(page)) diff --git a/mm/migrate.c b/mm/migrate.c index c9296d63878d..f943a2d99de7 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -237,20 +237,19 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma, pte = pte_mkhuge(pte); pte = arch_make_huge_pte(pte, shift, vma->vm_flags); - set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); if (PageAnon(new)) hugepage_add_anon_rmap(new, vma, pvmw.address); else page_dup_rmap(new, true); + set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); } else #endif { - set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); - if (PageAnon(new)) page_add_anon_rmap(new, vma, pvmw.address, false); else page_add_file_rmap(new, false); + set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); } if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new)) mlock_vma_page(new); diff --git a/mm/swapfile.c b/mm/swapfile.c index e59e08ef46e1..e64207e2ef1d 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1917,14 +1917,14 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, dec_mm_counter(vma->vm_mm, MM_SWAPENTS); inc_mm_counter(vma->vm_mm, MM_ANONPAGES); get_page(page); - set_pte_at(vma->vm_mm, addr, pte, - pte_mkold(mk_pte(page, vma->vm_page_prot))); if (page == swapcache) { page_add_anon_rmap(page, vma, addr, false); } else { /* ksm created a completely new copy */ page_add_new_anon_rmap(page, vma, addr, false); lru_cache_add_inactive_or_unevictable(page, vma); } + set_pte_at(vma->vm_mm, addr, pte, + pte_mkold(mk_pte(page, vma->vm_page_prot))); swap_free(entry); out: pte_unmap_unlock(pte, ptl); From patchwork Tue Dec 21 15:46:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pasha Tatashin X-Patchwork-Id: 12690007 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C77BAC433EF for ; Tue, 21 Dec 2021 15:46:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6EB306B0096; Tue, 21 Dec 2021 10:46:56 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 69A406B0099; Tue, 21 Dec 2021 10:46:56 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 562816B009A; Tue, 21 Dec 2021 10:46:56 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0070.hostedemail.com [216.40.44.70]) by kanga.kvack.org (Postfix) with ESMTP id 4470E6B0096 for ; Tue, 21 Dec 2021 10:46:56 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0D7667CB2A for ; Tue, 21 Dec 2021 15:46:56 +0000 (UTC) X-FDA: 78942229632.26.FB0229C Received: from mail-qt1-f175.google.com (mail-qt1-f175.google.com [209.85.160.175]) by imf24.hostedemail.com (Postfix) with ESMTP id 7898B180027 for ; Tue, 21 Dec 2021 15:46:51 +0000 (UTC) Received: by mail-qt1-f175.google.com with SMTP id 8so13219076qtx.5 for ; Tue, 21 Dec 2021 07:46:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=8lYwPz/4m3tDnSwW80VP9nsgiF1z9D2++nKEW6hl0KA=; b=kO8yGAwFB3DggVZ93fPld7nNchSHxNZW2NmK+E+Zhg/kPc9IUvAJr7Riv6SFkBYvSS Ch+8U2tozsmQnKCF1kRrpen+bCce4/fnW/fvR6gpy60FmL+f1PH5MT+LPk9Itli4RXPQ mFPsXKYT5fAaH2slrqgjPYEDdMPEcAyYLarVP3uEYLgNajmhicBZUCE/9uOvl3+uK/d+ tX57f3ADUC538hwUiPUZx3mVZMKQ1pGZPZ3fqhvsetg9pNFsUQ9if3L2RPeoefEVJdju WPrj0vvglEUYoiXd5zdxvY1LhKIeTyAJIg2sWzBWRAI+aIxGmqe/Nyl/LLRtmZ0xihpt G+Fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8lYwPz/4m3tDnSwW80VP9nsgiF1z9D2++nKEW6hl0KA=; b=KP7MOxpLSpWlfd0vBv2VqaYhEbsSY2tHuD/jtCUs2+c8z/dcgRydYjytzNQSwOujJT U/5PcWKb6Gtt7dBqWaQEcSfJ2e2RnfFGNR4DDWp6114eOLmabtOkuU/73kdVlooR4YnC ff0Z1w0wWH3zeGjLVpnv4zq3LZAGByWvElxrv2wzgkLBMFAorMkz/2WhiMD6BBvzaSiY fubyfegSGCwDRbjfaUU/o3NsiWEw8GzbShWZvgopzef80X25sP5Mu+BbF0DGi2RR4Smk eLz2tpVkkMXtnY22K5DVFgfXc7ImAPliBfmVEY5BEIx59gTWy34UTvXPCh3VDR29TpbR YNfQ== X-Gm-Message-State: AOAM53312au8p5IeL/06RD3/QGhlv5/WJlwLj4IDFeR8OyUKMICcOUtZ lR2TP7+ge5e5datAPYqFIxZ0lw== X-Google-Smtp-Source: ABdhPJyeJtPKa7tGYCh0gZkqHjPcdfCZESI0/adSt3JTZTkT3vCpCDexDTbFfki3nLwOQ4rZwOtgMQ== X-Received: by 2002:ac8:7774:: with SMTP id h20mr2730148qtu.236.1640101614863; Tue, 21 Dec 2021 07:46:54 -0800 (PST) Received: from soleen.c.googlers.com.com (189.216.85.34.bc.googleusercontent.com. [34.85.216.189]) by smtp.gmail.com with ESMTPSA id d20sm224588qtg.73.2021.12.21.07.46.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Dec 2021 07:46:54 -0800 (PST) From: Pasha Tatashin To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, pjt@google.com, weixugc@google.com, gthelen@google.com, mingo@redhat.com, corbet@lwn.net, will@kernel.org, rppt@kernel.org, keescook@chromium.org, tglx@linutronix.de, peterz@infradead.org, masahiroy@kernel.org, samitolvanen@google.com, dave.hansen@linux.intel.com, x86@kernel.org, frederic@kernel.org, hpa@zytor.com, aneesh.kumar@linux.ibm.com, jirislaby@kernel.org, songmuchun@bytedance.com, qydwhotmail@gmail.com, hughd@google.com Subject: [PATCH v3 2/4] mm: ptep_clear() page table helper Date: Tue, 21 Dec 2021 15:46:48 +0000 Message-Id: <20211221154650.1047963-3-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.34.1.307.g9b7440fafd-goog In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com> References: <20211221154650.1047963-1-pasha.tatashin@soleen.com> MIME-Version: 1.0 Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=soleen.com header.s=google header.b=kO8yGAwF; spf=pass (imf24.hostedemail.com: domain of pasha.tatashin@soleen.com designates 209.85.160.175 as permitted sender) smtp.mailfrom=pasha.tatashin@soleen.com; dmarc=none X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 7898B180027 X-Stat-Signature: eowu1tm9nn5ubi8ekozqm8tfnf1xxomo X-HE-Tag: 1640101611-606874 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We have ptep_get_and_clear() and ptep_get_and_clear_full() helpers to clear PTE from user page tables, but there is no variant for simple clear of a present PTE from user page tables without using a low level pte_clear() which can be either native or para-virtualised. Add a new ptep_clear() that can be used in common code to clear PTEs from page table. We will need this call later in order to add a hook for page table check. Signed-off-by: Pasha Tatashin --- Documentation/vm/arch_pgtable_helpers.rst | 6 ++++-- include/linux/pgtable.h | 8 ++++++++ mm/debug_vm_pgtable.c | 2 +- mm/khugepaged.c | 12 ++---------- 4 files changed, 15 insertions(+), 13 deletions(-) diff --git a/Documentation/vm/arch_pgtable_helpers.rst b/Documentation/vm/arch_pgtable_helpers.rst index 552567d863b8..fbe06ec75370 100644 --- a/Documentation/vm/arch_pgtable_helpers.rst +++ b/Documentation/vm/arch_pgtable_helpers.rst @@ -66,9 +66,11 @@ PTE Page Table Helpers +---------------------------+--------------------------------------------------+ | pte_mknotpresent | Invalidates a mapped PTE | +---------------------------+--------------------------------------------------+ -| ptep_get_and_clear | Clears a PTE | +| ptep_clear | Clears a PTE | +---------------------------+--------------------------------------------------+ -| ptep_get_and_clear_full | Clears a PTE | +| ptep_get_and_clear | Clears and returns PTE | ++---------------------------+--------------------------------------------------+ +| ptep_get_and_clear_full | Clears and returns PTE (batched PTE unmap) | +---------------------------+--------------------------------------------------+ | ptep_test_and_clear_young | Clears young from a PTE | +---------------------------+--------------------------------------------------+ diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index e24d2c992b11..bc8713a76e03 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -258,6 +258,14 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma, #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif +#ifndef __HAVE_ARCH_PTEP_CLEAR +static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + pte_clear(mm, addr, ptep); +} +#endif + #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 228e3954b90c..cd48a34c87a4 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -652,7 +652,7 @@ static void __init pte_clear_tests(struct pgtable_debug_args *args) set_pte_at(args->mm, args->vaddr, args->ptep, pte); flush_dcache_page(page); barrier(); - pte_clear(args->mm, args->vaddr, args->ptep); + ptep_clear(args->mm, args->vaddr, args->ptep); pte = ptep_get(args->ptep); WARN_ON(!pte_none(pte)); } diff --git a/mm/khugepaged.c b/mm/khugepaged.c index ed0fa6368706..7720189a2da7 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -756,11 +756,7 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page, * ptl mostly unnecessary. */ spin_lock(ptl); - /* - * paravirt calls inside pte_clear here are - * superfluous. - */ - pte_clear(vma->vm_mm, address, _pte); + ptep_clear(vma->vm_mm, address, _pte); spin_unlock(ptl); } } else { @@ -774,11 +770,7 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page, * inside page_remove_rmap(). */ spin_lock(ptl); - /* - * paravirt calls inside pte_clear here are - * superfluous. - */ - pte_clear(vma->vm_mm, address, _pte); + ptep_clear(vma->vm_mm, address, _pte); page_remove_rmap(src_page, false); spin_unlock(ptl); free_page_and_swap_cache(src_page); From patchwork Tue Dec 21 15:46:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pasha Tatashin X-Patchwork-Id: 12690011 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67E23C4321E for ; Tue, 21 Dec 2021 15:47:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C1D2E6B0099; Tue, 21 Dec 2021 10:46:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BCB816B009B; Tue, 21 Dec 2021 10:46:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A43E16B009D; Tue, 21 Dec 2021 10:46:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0093.hostedemail.com [216.40.44.93]) by kanga.kvack.org (Postfix) with ESMTP id 947F76B0099 for ; Tue, 21 Dec 2021 10:46:57 -0500 (EST) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 4E09B181AC9C6 for ; Tue, 21 Dec 2021 15:46:57 +0000 (UTC) X-FDA: 78942229674.27.CCF3D18 Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) by imf04.hostedemail.com (Postfix) with ESMTP id CD4294003B for ; Tue, 21 Dec 2021 15:46:56 +0000 (UTC) Received: by mail-qt1-f176.google.com with SMTP id j17so13239442qtx.2 for ; Tue, 21 Dec 2021 07:46:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=cI7YVBo+Cohe6VKMVjfceS0qoE/CwWGaUs+lrS/IFhw=; b=j9YnSfPBq3Uj1s+FVmAJaDYoGxPz4B7+SyaV0NM2+GjRboMUIbJXQqvyzkDSGOukFa G13jNSIoE5JSsqf5nBU0FqbHa9KrtdU6umw3dBV9vcEhWI7LLFlZJIogNi02qvXShmyb BrJbgiabYSHPmwtNSqZptFzHSSMUXQ0K43QGHpgwwzjHnfGsw6ivu0gwMnXPKMgpegyQ 3st+WrYpN6vJwHS2IjdBsqwM3Q2yaIXybCHMdgSkVbheFO/d7NA+CPpp1EUJRxby+oI2 8vUqw9UWOubI2AVpG5Zd9hNotCKDe5nyKukLEyTc5fCxQpEjeSb+IkFTOZ2fsFMdeB+y kPtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cI7YVBo+Cohe6VKMVjfceS0qoE/CwWGaUs+lrS/IFhw=; b=ayx0PhKxw/qTVQXNAXM3KC/MvRrjOej9ONdyIwzM3gHBdg2u67YmfQ4j3QtrfkGlhB GQXOPvRSgqz+sHLpzroo+iNLYQH4QSeA79jGFL3U1UaYeB/L0/Fi18d8EpwDS4j7ATjH ShBeQyp8QW+03hSWyHc0UNoL2bhQkzAjZWQuLA1oH0kWIHXIS5Ei75nhVcXI716EUwi7 3K/PQvtMpq8ZgohMJqQySJnTSn1b4uqCf0PgZMRrRC+it7WLoINOf3JpjTVPxpsVsz28 baiitnkC7ahDPg9w8z1pLrZZSmPSszw00z0/brjuwql1uuFwgOhZz7kjXwW4nY5l6QjH 0AMQ== X-Gm-Message-State: AOAM532Ez4ROiITKtwVyzMWyodQYFgGp4aob7zzzHueqk7CwJdBcWVNg L+RPobvWkyMzEc6RXyd4L2dAEw== X-Google-Smtp-Source: ABdhPJwlMJHxT8pjC4T4DgGsxuBW0b2SGWQjuWAGtDd4XhfZroIVquyM1GZwaRX9tJWbgpmt03OfTQ== X-Received: by 2002:a05:622a:54d:: with SMTP id m13mr2685987qtx.33.1640101616027; Tue, 21 Dec 2021 07:46:56 -0800 (PST) Received: from soleen.c.googlers.com.com (189.216.85.34.bc.googleusercontent.com. [34.85.216.189]) by smtp.gmail.com with ESMTPSA id d20sm224588qtg.73.2021.12.21.07.46.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Dec 2021 07:46:55 -0800 (PST) From: Pasha Tatashin To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, pjt@google.com, weixugc@google.com, gthelen@google.com, mingo@redhat.com, corbet@lwn.net, will@kernel.org, rppt@kernel.org, keescook@chromium.org, tglx@linutronix.de, peterz@infradead.org, masahiroy@kernel.org, samitolvanen@google.com, dave.hansen@linux.intel.com, x86@kernel.org, frederic@kernel.org, hpa@zytor.com, aneesh.kumar@linux.ibm.com, jirislaby@kernel.org, songmuchun@bytedance.com, qydwhotmail@gmail.com, hughd@google.com Subject: [PATCH v3 3/4] mm: page table check Date: Tue, 21 Dec 2021 15:46:49 +0000 Message-Id: <20211221154650.1047963-4-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.34.1.307.g9b7440fafd-goog In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com> References: <20211221154650.1047963-1-pasha.tatashin@soleen.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: CD4294003B X-Stat-Signature: 3boryajgsoesct7g31tzgrzrzbwcc83t Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=soleen.com header.s=google header.b=j9YnSfPB; spf=pass (imf04.hostedemail.com: domain of pasha.tatashin@soleen.com designates 209.85.160.176 as permitted sender) smtp.mailfrom=pasha.tatashin@soleen.com; dmarc=none X-Rspamd-Server: rspam10 X-HE-Tag: 1640101616-718703 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Check user page table entries at the time they are added and removed. Allows to synchronously catch memory corruption issues related to double mapping. When a pte for an anonymous page is added into page table, we verify that this pte does not already point to a file backed page, and vice versa if this is a file backed page that is being added we verify that this page does not have an anonymous mapping We also enforce that read-only sharing for anonymous pages is allowed (i.e. cow after fork). All other sharing must be for file pages. Page table check allows to protect and debug cases where "struct page" metadata became corrupted for some reason. For example, when refcnt or mapcount become invalid. Signed-off-by: Pasha Tatashin --- Documentation/vm/index.rst | 1 + Documentation/vm/page_table_check.rst | 56 ++++++ MAINTAINERS | 9 + arch/Kconfig | 3 + include/linux/page_table_check.h | 147 ++++++++++++++ mm/Kconfig.debug | 24 +++ mm/Makefile | 1 + mm/page_alloc.c | 4 + mm/page_ext.c | 4 + mm/page_table_check.c | 270 ++++++++++++++++++++++++++ 10 files changed, 519 insertions(+) create mode 100644 Documentation/vm/page_table_check.rst create mode 100644 include/linux/page_table_check.h create mode 100644 mm/page_table_check.c diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst index 6f5ffef4b716..43bb54d897d9 100644 --- a/Documentation/vm/index.rst +++ b/Documentation/vm/index.rst @@ -31,6 +31,7 @@ algorithms. If you are looking for advice on simply allocating memory, see the page_migration page_frags page_owner + page_table_check remap_file_pages slub split_page_table_lock diff --git a/Documentation/vm/page_table_check.rst b/Documentation/vm/page_table_check.rst new file mode 100644 index 000000000000..81f521ff7ea7 --- /dev/null +++ b/Documentation/vm/page_table_check.rst @@ -0,0 +1,56 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _page_table_check: + +================ +Page Table Check +================ + +Introduction +============ + +Page table check allows to hardern the kernel by ensuring that some types of +the memory corruptions are prevented. + +Page table check performs extra verifications at the time when new pages become +accessible from the userspace by getting their page table entries (PTEs PMDs +etc.) added into the table. + +In case of detected corruption, the kernel is crashed. There is a small +performance and memory overhead associated with the page table check. Therefore, +it is disabled by default, but can be optionally enabled on systems where the +extra hardening outweighs the performance costs. Also, because page table check +is synchronous, it can help with debugging double map memory corruption issues, +by crashing kernel at the time wrong mapping occurs instead of later which is +often the case with memory corruptions bugs. + +Double mapping detection logic +============================== + ++-------------------+-------------------+-------------------+------------------+ +| Current Mapping | New mapping | Permissions | Rule | ++===================+===================+===================+==================+ +| Anonymous | Anonymous | Read | Allow | ++-------------------+-------------------+-------------------+------------------+ +| Anonymous | Anonymous | Read / Write | Prohibit | ++-------------------+-------------------+-------------------+------------------+ +| Anonymous | Named | Any | Prohibit | ++-------------------+-------------------+-------------------+------------------+ +| Named | Anonymous | Any | Prohibit | ++-------------------+-------------------+-------------------+------------------+ +| Named | Named | Any | Allow | ++-------------------+-------------------+-------------------+------------------+ + +Enabling Page Table Check +========================= + +Build kernel with: + +- PAGE_TABLE_CHECK=y + Note, it can only be enabled on platforms where ARCH_SUPPORTS_PAGE_TABLE_CHECK + is available. + +- Boot with 'page_table_check=on' kernel parameter. + +Optionally, build kernel with PAGE_TABLE_CHECK_ENFORCED in order to have page +table support without extra kernel parameter. diff --git a/MAINTAINERS b/MAINTAINERS index 4403b348851d..16bc8cdc1492 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14485,6 +14485,15 @@ F: include/net/page_pool.h F: include/trace/events/page_pool.h F: net/core/page_pool.c +PAGE TABLE CHECK +M: Pasha Tatashin +M: Andrew Morton +L: linux-mm@kvack.org +S: Maintained +F: Documentation/vm/page_table_check.rst +F: include/linux/page_table_check.h +F: mm/page_table_check.c + PANASONIC LAPTOP ACPI EXTRAS DRIVER M: Kenneth Chan L: platform-driver-x86@vger.kernel.org diff --git a/arch/Kconfig b/arch/Kconfig index 75ad877c5c48..fdba59052abc 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -1307,6 +1307,9 @@ config HAVE_ARCH_PFN_VALID config ARCH_SUPPORTS_DEBUG_PAGEALLOC bool +config ARCH_SUPPORTS_PAGE_TABLE_CHECK + bool + config ARCH_SPLIT_ARG64 bool help diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h new file mode 100644 index 000000000000..38cace1da7b6 --- /dev/null +++ b/include/linux/page_table_check.h @@ -0,0 +1,147 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2021, Google LLC. + * Pasha Tatashin + */ +#ifndef __LINUX_PAGE_TABLE_CHECK_H +#define __LINUX_PAGE_TABLE_CHECK_H + +#ifdef CONFIG_PAGE_TABLE_CHECK +#include + +extern struct static_key_true page_table_check_disabled; +extern struct page_ext_operations page_table_check_ops; + +void __page_table_check_zero(struct page *page, unsigned int order); +void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t pte); +void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, + pmd_t pmd); +void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, + pud_t pud); +void __page_table_check_pte_set(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); +void __page_table_check_pmd_set(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd); +void __page_table_check_pud_set(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud); + +static inline void page_table_check_alloc(struct page *page, unsigned int order) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_zero(page, order); +} + +static inline void page_table_check_free(struct page *page, unsigned int order) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_zero(page, order); +} + +static inline void page_table_check_pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t pte) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pte_clear(mm, addr, pte); +} + +static inline void page_table_check_pmd_clear(struct mm_struct *mm, + unsigned long addr, pmd_t pmd) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pmd_clear(mm, addr, pmd); +} + +static inline void page_table_check_pud_clear(struct mm_struct *mm, + unsigned long addr, pud_t pud) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pud_clear(mm, addr, pud); +} + +static inline void page_table_check_pte_set(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + pte_t pte) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pte_set(mm, addr, ptep, pte); +} + +static inline void page_table_check_pmd_set(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp, + pmd_t pmd) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pmd_set(mm, addr, pmdp, pmd); +} + +static inline void page_table_check_pud_set(struct mm_struct *mm, + unsigned long addr, pud_t *pudp, + pud_t pud) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pud_set(mm, addr, pudp, pud); +} + +#else + +static inline void page_table_check_alloc(struct page *page, unsigned int order) +{ +} + +static inline void page_table_check_free(struct page *page, unsigned int order) +{ +} + +static inline void page_table_check_pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t pte) +{ +} + +static inline void page_table_check_pmd_clear(struct mm_struct *mm, + unsigned long addr, pmd_t pmd) +{ +} + +static inline void page_table_check_pud_clear(struct mm_struct *mm, + unsigned long addr, pud_t pud) +{ +} + +static inline void page_table_check_pte_set(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + pte_t pte) +{ +} + +static inline void page_table_check_pmd_set(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp, + pmd_t pmd) +{ +} + +static inline void page_table_check_pud_set(struct mm_struct *mm, + unsigned long addr, pud_t *pudp, + pud_t pud) +{ +} + +#endif /* CONFIG_PAGE_TABLE_CHECK */ +#endif /* __LINUX_PAGE_TABLE_CHECK_H */ diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug index 1e73717802f8..5bd5bb097252 100644 --- a/mm/Kconfig.debug +++ b/mm/Kconfig.debug @@ -62,6 +62,30 @@ config PAGE_OWNER If unsure, say N. +config PAGE_TABLE_CHECK + bool "Check for invalid mappings in user page tables" + depends on ARCH_SUPPORTS_PAGE_TABLE_CHECK + select PAGE_EXTENSION + help + Check that anonymous page is not being mapped twice with read write + permissions. Check that anonymous and file pages are not being + erroneously shared. Since the checking is performed at the time + entries are added and removed to user page tables, leaking, corruption + and double mapping problems are detected synchronously. + + If unsure say "n". + +config PAGE_TABLE_CHECK_ENFORCED + bool "Enforce the page table checking by default" + depends on PAGE_TABLE_CHECK + help + Always enable page table checking. By default the page table checking + is disabled, and can be optionally enabled via page_table_check=on + kernel parameter. This config enforces that page table check is always + enabled. + + If unsure say "n". + config PAGE_POISONING bool "Poison pages after freeing" help diff --git a/mm/Makefile b/mm/Makefile index 7919cd7f13f2..588d3113f3b0 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -114,6 +114,7 @@ obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o obj-$(CONFIG_CMA) += cma.o obj-$(CONFIG_MEMORY_BALLOON) += balloon_compaction.o obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o +obj-$(CONFIG_PAGE_TABLE_CHECK) += page_table_check.o obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o obj-$(CONFIG_SECRETMEM) += secretmem.o obj-$(CONFIG_CMA_SYSFS) += cma_sysfs.o diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 77253ea6031e..edfd6c81af82 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -64,6 +64,7 @@ #include #include #include +#include #include #include #include @@ -1308,6 +1309,7 @@ static __always_inline bool free_pages_prepare(struct page *page, if (memcg_kmem_enabled() && PageMemcgKmem(page)) __memcg_kmem_uncharge_page(page, order); reset_page_owner(page, order); + page_table_check_free(page, order); return false; } @@ -1347,6 +1349,7 @@ static __always_inline bool free_pages_prepare(struct page *page, page_cpupid_reset_last(page); page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; reset_page_owner(page, order); + page_table_check_free(page, order); if (!PageHighMem(page)) { debug_check_no_locks_freed(page_address(page), @@ -2421,6 +2424,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order, } set_page_owner(page, order, gfp_flags); + page_table_check_alloc(page, order); } static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags, diff --git a/mm/page_ext.c b/mm/page_ext.c index 6242afb24d84..bee3240604dc 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -8,6 +8,7 @@ #include #include #include +#include /* * struct page extension @@ -75,6 +76,9 @@ static struct page_ext_operations *page_ext_ops[] = { #if defined(CONFIG_PAGE_IDLE_FLAG) && !defined(CONFIG_64BIT) &page_idle_ops, #endif +#ifdef CONFIG_PAGE_TABLE_CHECK + &page_table_check_ops, +#endif }; unsigned long page_ext_size = sizeof(struct page_ext); diff --git a/mm/page_table_check.c b/mm/page_table_check.c new file mode 100644 index 000000000000..7504e7caa2a1 --- /dev/null +++ b/mm/page_table_check.c @@ -0,0 +1,270 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2021, Google LLC. + * Pasha Tatashin + */ +#include +#include + +#undef pr_fmt +#define pr_fmt(fmt) "page_table_check: " fmt + +struct page_table_check { + atomic_t anon_map_count; + atomic_t file_map_count; +}; + +static bool __page_table_check_enabled __initdata = + IS_ENABLED(CONFIG_PAGE_TABLE_CHECK_ENFORCED); + +DEFINE_STATIC_KEY_TRUE(page_table_check_disabled); +EXPORT_SYMBOL(page_table_check_disabled); + +static int __init early_page_table_check_param(char *buf) +{ + if (!buf) + return -EINVAL; + + if (strcmp(buf, "on") == 0) + __page_table_check_enabled = true; + else if (strcmp(buf, "off") == 0) + __page_table_check_enabled = false; + + return 0; +} + +early_param("page_table_check", early_page_table_check_param); + +static bool __init need_page_table_check(void) +{ + return __page_table_check_enabled; +} + +static void __init init_page_table_check(void) +{ + if (!__page_table_check_enabled) + return; + static_branch_disable(&page_table_check_disabled); +} + +struct page_ext_operations page_table_check_ops = { + .size = sizeof(struct page_table_check), + .need = need_page_table_check, + .init = init_page_table_check, +}; + +static struct page_table_check *get_page_table_check(struct page_ext *page_ext) +{ + BUG_ON(!page_ext); + return (void *)(page_ext) + page_table_check_ops.offset; +} + +static inline bool pte_user_accessible_page(pte_t pte) +{ + return (pte_val(pte) & _PAGE_PRESENT) && (pte_val(pte) & _PAGE_USER); +} + +static inline bool pmd_user_accessible_page(pmd_t pmd) +{ + return pmd_leaf(pmd) && (pmd_val(pmd) & _PAGE_PRESENT) && + (pmd_val(pmd) & _PAGE_USER); +} + +static inline bool pud_user_accessible_page(pud_t pud) +{ + return pud_leaf(pud) && (pud_val(pud) & _PAGE_PRESENT) && + (pud_val(pud) & _PAGE_USER); +} + +/* + * An enty is removed from the page table, decrement the counters for that page + * verify that it is of correct type and counters do not become negative. + */ +static void page_table_check_clear(struct mm_struct *mm, unsigned long addr, + unsigned long pfn, unsigned long pgcnt) +{ + struct page_ext *page_ext; + struct page *page; + bool anon; + int i; + + if (!pfn_valid(pfn)) + return; + + page = pfn_to_page(pfn); + page_ext = lookup_page_ext(page); + anon = PageAnon(page); + + for (i = 0; i < pgcnt; i++) { + struct page_table_check *ptc = get_page_table_check(page_ext); + + if (anon) { + BUG_ON(atomic_read(&ptc->file_map_count)); + BUG_ON(atomic_dec_return(&ptc->anon_map_count) < 0); + } else { + BUG_ON(atomic_read(&ptc->anon_map_count)); + BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0); + } + page_ext = page_ext_next(page_ext); + } +} + +/* + * A new enty is added to the page table, increment the counters for that page + * verify that it is of correct type and is not being mapped with a different + * type to a different process. + */ +static void page_table_check_set(struct mm_struct *mm, unsigned long addr, + unsigned long pfn, unsigned long pgcnt, + bool rw) +{ + struct page_ext *page_ext; + struct page *page; + bool anon; + int i; + + if (!pfn_valid(pfn)) + return; + + page = pfn_to_page(pfn); + page_ext = lookup_page_ext(page); + anon = PageAnon(page); + + for (i = 0; i < pgcnt; i++) { + struct page_table_check *ptc = get_page_table_check(page_ext); + + if (anon) { + BUG_ON(atomic_read(&ptc->file_map_count)); + BUG_ON(atomic_inc_return(&ptc->anon_map_count) > 1 && rw); + } else { + BUG_ON(atomic_read(&ptc->anon_map_count)); + BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0); + } + page_ext = page_ext_next(page_ext); + } +} + +/* + * page is on free list, or is being allocated, verify that counters are zeroes + * crash if they are not. + */ +void __page_table_check_zero(struct page *page, unsigned int order) +{ + struct page_ext *page_ext = lookup_page_ext(page); + int i; + + BUG_ON(!page_ext); + for (i = 0; i < (1 << order); i++) { + struct page_table_check *ptc = get_page_table_check(page_ext); + + BUG_ON(atomic_read(&ptc->anon_map_count)); + BUG_ON(atomic_read(&ptc->file_map_count)); + page_ext = page_ext_next(page_ext); + } +} + +void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t pte) +{ + if (&init_mm == mm) + return; + + if (pte_user_accessible_page(pte)) { + page_table_check_clear(mm, addr, pte_pfn(pte), + PAGE_SIZE >> PAGE_SHIFT); + } +} +EXPORT_SYMBOL(__page_table_check_pte_clear); + +void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, + pmd_t pmd) +{ + if (&init_mm == mm) + return; + + if (pmd_user_accessible_page(pmd)) { + page_table_check_clear(mm, addr, pmd_pfn(pmd), + PMD_PAGE_SIZE >> PAGE_SHIFT); + } +} +EXPORT_SYMBOL(__page_table_check_pmd_clear); + +void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, + pud_t pud) +{ + if (&init_mm == mm) + return; + + if (pud_user_accessible_page(pud)) { + page_table_check_clear(mm, addr, pud_pfn(pud), + PUD_PAGE_SIZE >> PAGE_SHIFT); + } +} +EXPORT_SYMBOL(__page_table_check_pud_clear); + +void __page_table_check_pte_set(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + pte_t old_pte; + + if (&init_mm == mm) + return; + + old_pte = *ptep; + if (pte_user_accessible_page(old_pte)) { + page_table_check_clear(mm, addr, pte_pfn(old_pte), + PAGE_SIZE >> PAGE_SHIFT); + } + + if (pte_user_accessible_page(pte)) { + page_table_check_set(mm, addr, pte_pfn(pte), + PAGE_SIZE >> PAGE_SHIFT, + pte_write(pte)); + } +} +EXPORT_SYMBOL(__page_table_check_pte_set); + +void __page_table_check_pmd_set(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd) +{ + pmd_t old_pmd; + + if (&init_mm == mm) + return; + + old_pmd = *pmdp; + if (pmd_user_accessible_page(old_pmd)) { + page_table_check_clear(mm, addr, pmd_pfn(old_pmd), + PMD_PAGE_SIZE >> PAGE_SHIFT); + } + + if (pmd_user_accessible_page(pmd)) { + page_table_check_set(mm, addr, pmd_pfn(pmd), + PMD_PAGE_SIZE >> PAGE_SHIFT, + pmd_write(pmd)); + } +} +EXPORT_SYMBOL(__page_table_check_pmd_set); + +void __page_table_check_pud_set(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud) +{ + pud_t old_pud; + + if (&init_mm == mm) + return; + + old_pud = *pudp; + if (pud_user_accessible_page(old_pud)) { + page_table_check_clear(mm, addr, pud_pfn(old_pud), + PUD_PAGE_SIZE >> PAGE_SHIFT); + } + + if (pud_user_accessible_page(pud)) { + page_table_check_set(mm, addr, pud_pfn(pud), + PUD_PAGE_SIZE >> PAGE_SHIFT, + pud_write(pud)); + } +} +EXPORT_SYMBOL(__page_table_check_pud_set); From patchwork Tue Dec 21 15:46:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pasha Tatashin X-Patchwork-Id: 12690009 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC38EC433F5 for ; Tue, 21 Dec 2021 15:47:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E162F6B009B; Tue, 21 Dec 2021 10:46:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DC7156B009F; Tue, 21 Dec 2021 10:46:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C8EC06B00A0; Tue, 21 Dec 2021 10:46:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0084.hostedemail.com [216.40.44.84]) by kanga.kvack.org (Postfix) with ESMTP id AD81A6B009B for ; Tue, 21 Dec 2021 10:46:58 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 6BA657BE67 for ; Tue, 21 Dec 2021 15:46:58 +0000 (UTC) X-FDA: 78942229716.08.FE1DD42 Received: from mail-qv1-f48.google.com (mail-qv1-f48.google.com [209.85.219.48]) by imf07.hostedemail.com (Postfix) with ESMTP id B4C5540024 for ; Tue, 21 Dec 2021 15:46:57 +0000 (UTC) Received: by mail-qv1-f48.google.com with SMTP id fq10so4271575qvb.10 for ; Tue, 21 Dec 2021 07:46:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Xmyi2zCZ0qq0KJ7l9kYGd9/NNLTFkYCru9KHZL7RlW0=; b=Zvi8Rg4g886r2e8/bLoT2y0NWG6X3S56t/9kCh4NczClQm07VEJTUryxHSMYWFbEAo p8PMXJ28kOioXPBq0ix99dx2hczfOcjhb56wg27CwaCByKCyeYEsIq/mLoS45sgCBGeM fDMAZ8kDg3haHxKo30vmLRsG3wmmMiyAUkWAsdmlE9K4ZhS/Clmk7D2ouNpwtysNOU+N vt2wiNJ7VNXFzv5pcyJDqBeNFyaPTB6vpEDWgvRVRZtqjd7GkeD75fe5919W+lA7HGg4 lMLeSnvReUyHifMEDe2ULS43fvlV/KZTOV2GeuXkTxivarBuppz470cyXZmtg/B6rE5q cwjg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Xmyi2zCZ0qq0KJ7l9kYGd9/NNLTFkYCru9KHZL7RlW0=; b=JRz6UIaBpvnpyDHdicQOO0A3eE3tN9b3jfiUOUVByPOomTNgOZqrKaSk2pne+ZXu21 dAqI/NhiByE5GdJKhc6HagONyoTiHTVRH8OqrbE1aVgkvkJOjftmAPjzQ8pQyDodvogC y4cM6RuzDXpHMXLq6VhJXkIMpsM1dlZuqlmAlZSN4O8yzLH5AqQDdh7ozFNNoa7j/Zro /yyJamuZjddMv7GcUpWoLVB7+PlJflXbRShD/UJIwaYgqsrmdb6A+WPIMgToHz436MF4 FnTRmVkql/OZXWlBw8vitEPJfsw8PchT363CkZLMZhzeDCgV7O++M9GxWYtyenBIp5qP nYxA== X-Gm-Message-State: AOAM533GVbWFV23RA+CVeYZnkx90gBiftsjXTWJnuzV8WdjKgdUdajox md2T6Kt5XZXhMM62qKk3QFA3aQ== X-Google-Smtp-Source: ABdhPJwInCo1DVCqL5Hwf+MENbgmDZwkZnQaFs2cfcc4ye0/E2LpBXPTI45RluoaIQIBeUt6O6joMA== X-Received: by 2002:ad4:4ee3:: with SMTP id dv3mr3100222qvb.8.1640101617069; Tue, 21 Dec 2021 07:46:57 -0800 (PST) Received: from soleen.c.googlers.com.com (189.216.85.34.bc.googleusercontent.com. [34.85.216.189]) by smtp.gmail.com with ESMTPSA id d20sm224588qtg.73.2021.12.21.07.46.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Dec 2021 07:46:56 -0800 (PST) From: Pasha Tatashin To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, pjt@google.com, weixugc@google.com, gthelen@google.com, mingo@redhat.com, corbet@lwn.net, will@kernel.org, rppt@kernel.org, keescook@chromium.org, tglx@linutronix.de, peterz@infradead.org, masahiroy@kernel.org, samitolvanen@google.com, dave.hansen@linux.intel.com, x86@kernel.org, frederic@kernel.org, hpa@zytor.com, aneesh.kumar@linux.ibm.com, jirislaby@kernel.org, songmuchun@bytedance.com, qydwhotmail@gmail.com, hughd@google.com Subject: [PATCH v3 4/4] x86: mm: add x86_64 support for page table check Date: Tue, 21 Dec 2021 15:46:50 +0000 Message-Id: <20211221154650.1047963-5-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.34.1.307.g9b7440fafd-goog In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com> References: <20211221154650.1047963-1-pasha.tatashin@soleen.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: B4C5540024 X-Stat-Signature: 63ickdwr9k8pnggfzwxe71armohgifj3 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=soleen.com header.s=google header.b=Zvi8Rg4g; spf=pass (imf07.hostedemail.com: domain of pasha.tatashin@soleen.com designates 209.85.219.48 as permitted sender) smtp.mailfrom=pasha.tatashin@soleen.com; dmarc=none X-Rspamd-Server: rspam10 X-HE-Tag: 1640101617-932492 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add page table check hooks into routines that modify user page tables. Signed-off-by: Pasha Tatashin --- arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 29 +++++++++++++++++++++++++++-- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5eac1e3610e9..cc91c639acfb 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -104,6 +104,7 @@ config X86 select ARCH_SUPPORTS_ACPI select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_DEBUG_PAGEALLOC + select ARCH_SUPPORTS_PAGE_TABLE_CHECK if X86_64 select ARCH_SUPPORTS_NUMA_BALANCING if X86_64 select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096 select ARCH_SUPPORTS_LTO_CLANG diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index dea9fe8a56cc..8a9432fb3802 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -27,6 +27,7 @@ #include #include #include +#include extern pgd_t early_top_pgt[PTRS_PER_PGD]; bool __init __early_make_pgtable(unsigned long address, pmdval_t pmd); @@ -1007,18 +1008,21 @@ static inline pud_t native_local_pudp_get_and_clear(pud_t *pudp) static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { + page_table_check_pte_set(mm, addr, ptep, pte); set_pte(ptep, pte); } static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { + page_table_check_pmd_set(mm, addr, pmdp, pmd); set_pmd(pmdp, pmd); } static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud) { + page_table_check_pud_set(mm, addr, pudp, pud); native_set_pud(pudp, pud); } @@ -1049,6 +1053,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { pte_t pte = native_ptep_get_and_clear(ptep); + page_table_check_pte_clear(mm, addr, pte); return pte; } @@ -1064,12 +1069,23 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, * care about updates and native needs no locking */ pte = native_local_ptep_get_and_clear(ptep); + page_table_check_pte_clear(mm, addr, pte); } else { pte = ptep_get_and_clear(mm, addr, ptep); } return pte; } +#define __HAVE_ARCH_PTEP_CLEAR +static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + if (IS_ENABLED(CONFIG_PAGE_TABLE_CHECK)) + ptep_get_and_clear(mm, addr, ptep); + else + pte_clear(mm, addr, ptep); +} + #define __HAVE_ARCH_PTEP_SET_WRPROTECT static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) @@ -1110,14 +1126,22 @@ static inline int pmd_write(pmd_t pmd) static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { - return native_pmdp_get_and_clear(pmdp); + pmd_t pmd = native_pmdp_get_and_clear(pmdp); + + page_table_check_pmd_clear(mm, addr, pmd); + + return pmd; } #define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pud_t *pudp) { - return native_pudp_get_and_clear(pudp); + pud_t pud = native_pudp_get_and_clear(pudp); + + page_table_check_pud_clear(mm, addr, pud); + + return pud; } #define __HAVE_ARCH_PMDP_SET_WRPROTECT @@ -1138,6 +1162,7 @@ static inline int pud_write(pud_t pud) static inline pmd_t pmdp_establish(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t pmd) { + page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd); if (IS_ENABLED(CONFIG_SMP)) { return xchg(pmdp, pmd); } else {