From patchwork Fri Jan 10 18:40:37 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935240
Date: Fri, 10 Jan 2025 18:40:37 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-11-8419288bc805@google.com>
Subject: [PATCH RFC v2 11/29] mm: asi: Functions to map/unmap a memory range into ASI page tables
From: Brendan Jackman
To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Richard Henderson , Matt Turner , Vineet Gupta , Russell King , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Huacai Chen , WANG Xuerui , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Jonas Bonn , Stefan Kristiansson , Stafford Horne , "James E.J. Bottomley" , Helge Deller , Michael Ellerman , Nicholas Piggin , Christophe Leroy , Naveen N Rao , Madhavan Srinivasan , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Yoshinori Sato , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Andreas Larsson , Richard Weinberger , Anton Ivanov , Johannes Berg , Chris Zankel , Max Filippov , Arnd Bergmann , Andrew Morton , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Uladzislau Rezki , Christoph Hellwig , Masami Hiramatsu , Mathieu Desnoyers , Mike Rapoport , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Dennis Zhou , Tejun Heo , Christoph Lameter , Sean Christopherson , Paolo Bonzini , Ard Biesheuvel , Josh Poimboeuf , Pawan Gupta Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman , Junaid Shahid , 
Kevin Cheng

From: Junaid Shahid

Two functions, asi_map() and asi_map_gfp(), are
added to allow mapping memory into ASI page tables. The mapping will be
identical to the one for the same virtual address in the unrestricted
page tables. This is necessary to allow switching between the page
tables at any arbitrary point in the kernel.

Another function, asi_unmap(), is added to allow unmapping memory mapped
via asi_map*.

RFC Notes: Don't read too much into the implementation of this; lots of
it should probably be rewritten. It also needs to gain support for
partial unmappings.

Checkpatch-args: --ignore=MACRO_ARG_UNUSED
Signed-off-by: Junaid Shahid
Signed-off-by: Brendan Jackman
Signed-off-by: Kevin Cheng
---
 arch/x86/include/asm/asi.h |   5 +
 arch/x86/mm/asi.c          | 236 ++++++++++++++++++++++++++++++++++++++++++++-
 arch/x86/mm/tlb.c          |   5 +
 include/asm-generic/asi.h  |  11 +++
 include/linux/pgtable.h    |   3 +
 mm/internal.h              |   2 +
 mm/vmalloc.c               |  32 +++---
 7 files changed, 280 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h
index a55e73f1b2bc84c41b9ab25f642a4d5f1aa6ba90..33f18be0e268b3a6725196619cbb8d847c21e197 100644
--- a/arch/x86/include/asm/asi.h
+++ b/arch/x86/include/asm/asi.h
@@ -157,6 +157,11 @@ void asi_relax(void);
 /* Immediately exit the restricted address space if in it */
 void asi_exit(void);
 
+int asi_map_gfp(struct asi *asi, void *addr, size_t len, gfp_t gfp_flags);
+int asi_map(struct asi *asi, void *addr, size_t len);
+void asi_unmap(struct asi *asi, void *addr, size_t len);
+void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len);
+
 static inline void asi_init_thread_state(struct thread_struct *thread)
 {
 	thread->asi_state.intr_nest_depth = 0;
diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index b15d043acedc9f459f17e86564a15061650afc3a..f2d8fbc0366c289891903e1c2ac6c59b9476d95f 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -11,6 +11,9 @@
 #include
 #include
 #include
+#include
+
+#include "../../../mm/internal.h"
 
 static struct asi_taint_policy *taint_policies[ASI_MAX_NUM_CLASSES];
 
@@ -100,7 +103,6 @@ const char *asi_class_name(enum asi_class_id class_id)
  */
 static_assert(!IS_ENABLED(CONFIG_PARAVIRT));
 #define DEFINE_ASI_PGTBL_ALLOC(base, level)				\
-__maybe_unused								\
 static level##_t * asi_##level##_alloc(struct asi *asi,			\
 					base##_t *base, ulong addr,	\
 					gfp_t flags)			\
@@ -455,3 +457,235 @@ void asi_handle_switch_mm(void)
 	this_cpu_or(asi_taints, new_taints);
 	this_cpu_and(asi_taints, ~(ASI_TAINTS_GUEST_MASK | ASI_TAINTS_USER_MASK));
 }
+
+static bool is_page_within_range(unsigned long addr, unsigned long page_size,
+				 unsigned long range_start, unsigned long range_end)
+{
+	unsigned long page_start = ALIGN_DOWN(addr, page_size);
+	unsigned long page_end = page_start + page_size;
+
+	return page_start >= range_start && page_end <= range_end;
+}
+
+static bool follow_physaddr(
+	pgd_t *pgd_table, unsigned long virt,
+	phys_addr_t *phys, unsigned long *page_size, ulong *flags)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	/* RFC: This should be rewritten with lookup_address_in_*. */
+
+	*page_size = PGDIR_SIZE;
+	pgd = pgd_offset_pgd(pgd_table, virt);
+	if (!pgd_present(*pgd))
+		return false;
+	if (pgd_leaf(*pgd)) {
+		*phys = PFN_PHYS(pgd_pfn(*pgd)) | (virt & ~PGDIR_MASK);
+		*flags = pgd_flags(*pgd);
+		return true;
+	}
+
+	*page_size = P4D_SIZE;
+	p4d = p4d_offset(pgd, virt);
+	if (!p4d_present(*p4d))
+		return false;
+	if (p4d_leaf(*p4d)) {
+		*phys = PFN_PHYS(p4d_pfn(*p4d)) | (virt & ~P4D_MASK);
+		*flags = p4d_flags(*p4d);
+		return true;
+	}
+
+	*page_size = PUD_SIZE;
+	pud = pud_offset(p4d, virt);
+	if (!pud_present(*pud))
+		return false;
+	if (pud_leaf(*pud)) {
+		*phys = PFN_PHYS(pud_pfn(*pud)) | (virt & ~PUD_MASK);
+		*flags = pud_flags(*pud);
+		return true;
+	}
+
+	*page_size = PMD_SIZE;
+	pmd = pmd_offset(pud, virt);
+	if (!pmd_present(*pmd))
+		return false;
+	if (pmd_leaf(*pmd)) {
+		*phys = PFN_PHYS(pmd_pfn(*pmd)) | (virt & ~PMD_MASK);
+		*flags = pmd_flags(*pmd);
+		return true;
+	}
+
+	*page_size = PAGE_SIZE;
+	pte = pte_offset_map(pmd, virt);
+	if (!pte)
+		return false;
+
+	if (!pte_present(*pte)) {
+		pte_unmap(pte);
+		return false;
+	}
+
+	*phys = PFN_PHYS(pte_pfn(*pte)) | (virt & ~PAGE_MASK);
+	*flags = pte_flags(*pte);
+
+	pte_unmap(pte);
+	return true;
+}
+
+/*
+ * Map the given range into the ASI page tables. The source of the mapping is
+ * the regular unrestricted page tables. Can be used to map any kernel memory.
+ *
+ * The caller MUST ensure that the source mapping will not change during this
+ * function. For dynamic kernel memory, this is generally ensured by mapping the
+ * memory within the allocator.
+ *
+ * If this fails, it may leave partial mappings behind. You must asi_unmap them,
+ * bearing in mind asi_unmap's requirements on the calling context. Part of the
+ * reason for this is that we don't want to unexpectedly undo mappings that
+ * weren't created by the present caller.
+ *
+ * If the source mapping is a large page and the range being mapped spans the
+ * entire large page, then it will be mapped as a large page in the ASI page
+ * tables too. If the range does not span the entire huge page, then it will be
+ * mapped as smaller pages. In that case, the implementation is slightly
+ * inefficient, as it will walk the source page tables again for each small
+ * destination page, but that should be ok for now, as usually in such cases,
+ * the range would consist of a small-ish number of pages.
+ *
+ * RFC: vmap_p4d_range supports huge mappings, we can probably use that now.
+ */
+int __must_check asi_map_gfp(struct asi *asi, void *addr, unsigned long len, gfp_t gfp_flags)
+{
+	unsigned long virt;
+	unsigned long start = (size_t)addr;
+	unsigned long end = start + len;
+	unsigned long page_size;
+
+	if (!static_asi_enabled())
+		return 0;
+
+	VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+	VM_BUG_ON(!IS_ALIGNED(len, PAGE_SIZE));
+	/* RFC: fault_in_kernel_space should be renamed. */
+	VM_BUG_ON(!fault_in_kernel_space(start));
+
+	gfp_flags &= GFP_RECLAIM_MASK;
+
+	if (asi->mm != &init_mm)
+		gfp_flags |= __GFP_ACCOUNT;
+
+	for (virt = start; virt < end; virt = ALIGN(virt + 1, page_size)) {
+		pgd_t *pgd;
+		p4d_t *p4d;
+		pud_t *pud;
+		pmd_t *pmd;
+		pte_t *pte;
+		phys_addr_t phys;
+		ulong flags;
+
+		if (!follow_physaddr(asi->mm->pgd, virt, &phys, &page_size, &flags))
+			continue;
+
+#define MAP_AT_LEVEL(base, BASE, level, LEVEL) {				\
+			if (base##_leaf(*base)) {				\
+				if (WARN_ON_ONCE(PHYS_PFN(phys & BASE##_MASK) !=\
+						 base##_pfn(*base)))		\
+					return -EBUSY;				\
+				continue;					\
+			}							\
+										\
+			level = asi_##level##_alloc(asi, base, virt, gfp_flags);\
+			if (!level)						\
+				return -ENOMEM;					\
+										\
+			if (page_size >= LEVEL##_SIZE &&			\
+			    (level##_none(*level) || level##_leaf(*level)) &&	\
+			    is_page_within_range(virt, LEVEL##_SIZE,		\
+						 start, end)) {			\
+				page_size = LEVEL##_SIZE;			\
+				phys &= LEVEL##_MASK;				\
+										\
+				if (!level##_none(*level)) {			\
+					if (WARN_ON_ONCE(level##_pfn(*level) != \
+							 PHYS_PFN(phys))) {	\
+						return -EBUSY;			\
+					}					\
+				} else {					\
+					set_##level(level,			\
+						    __##level(phys | flags));	\
+				}						\
+				continue;					\
+			}							\
+		}
+
+		pgd = pgd_offset_pgd(asi->pgd, virt);
+
+		MAP_AT_LEVEL(pgd, PGDIR, p4d, P4D);
+		MAP_AT_LEVEL(p4d, P4D, pud, PUD);
+		MAP_AT_LEVEL(pud, PUD, pmd, PMD);
+		/*
+		 * If a large page is going to be partially mapped
+		 * in 4k pages, convert the PSE/PAT bits.
+		 */
+		if (page_size >= PMD_SIZE)
+			flags = protval_large_2_4k(flags);
+		MAP_AT_LEVEL(pmd, PMD, pte, PAGE);
+
+		VM_BUG_ON(true); /* Should never reach here. */
+	}
+
+	return 0;
+#undef MAP_AT_LEVEL
+}
+
+int __must_check asi_map(struct asi *asi, void *addr, unsigned long len)
+{
+	return asi_map_gfp(asi, addr, len, GFP_KERNEL);
+}
+
+/*
+ * Unmap a kernel address range previously mapped into the ASI page tables.
+ *
+ * The area being unmapped must be a whole previously mapped region (or regions).
+ * Unmapping a partial subset of a previously mapped region is not supported.
+ * That will work, but may end up unmapping more than what was asked for, if
+ * the mapping contained huge pages. A later patch will remove this limitation
+ * by splitting the huge mapping in the ASI page table in such a case. For now,
+ * vunmap_pgd_range() will just emit a warning if this situation is detected.
+ *
+ * This might sleep, and cannot be called with interrupts disabled.
+ */
+void asi_unmap(struct asi *asi, void *addr, size_t len)
+{
+	size_t start = (size_t)addr;
+	size_t end = start + len;
+	pgtbl_mod_mask mask = 0;
+
+	if (!static_asi_enabled() || !len)
+		return;
+
+	VM_BUG_ON(start & ~PAGE_MASK);
+	VM_BUG_ON(len & ~PAGE_MASK);
+	VM_BUG_ON(!fault_in_kernel_space(start)); /* Misnamed, ignore "fault_" */
+
+	vunmap_pgd_range(asi->pgd, start, end, &mask);
+
+	/* We don't support partial unmappings. */
+	if (mask & PGTBL_P4D_MODIFIED) {
+		VM_WARN_ON(!IS_ALIGNED((ulong)addr, P4D_SIZE));
+		VM_WARN_ON(!IS_ALIGNED((ulong)len, P4D_SIZE));
+	} else if (mask & PGTBL_PUD_MODIFIED) {
+		VM_WARN_ON(!IS_ALIGNED((ulong)addr, PUD_SIZE));
+		VM_WARN_ON(!IS_ALIGNED((ulong)len, PUD_SIZE));
+	} else if (mask & PGTBL_PMD_MODIFIED) {
+		VM_WARN_ON(!IS_ALIGNED((ulong)addr, PMD_SIZE));
+		VM_WARN_ON(!IS_ALIGNED((ulong)len, PMD_SIZE));
+	}
+
+	asi_flush_tlb_range(asi, addr, len);
+}
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index c41e083c5b5281684be79ad0391c1a5fc7b0c493..c55733e144c7538ce7f97b74ea2b1b9c22497c32 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1040,6 +1040,11 @@ noinstr u16 asi_pcid(struct asi *asi, u16 asid)
 	// return kern_pcid(asid) | ((asi->index + 1) << X86_CR3_ASI_PCID_BITS_SHIFT);
 }
 
+void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len)
+{
+	flush_tlb_kernel_range((ulong)addr, (ulong)addr + len);
+}
+
 #else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
 
 u16 asi_pcid(struct asi *asi, u16 asid) { return kern_pcid(asid); }
diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h
index f777a6cf604b0656fb39087f6eba08f980b2cb6f..5be8f7d657ba0bc2196e333f62b084d0c9eef7b6 100644
--- a/include/asm-generic/asi.h
+++ b/include/asm-generic/asi.h
@@ -77,6 +77,17 @@ static inline int asi_intr_nest_depth(void) { return 0; }
 
 static inline void asi_intr_exit(void) { }
 
+static inline int asi_map(struct asi *asi, void *addr, size_t len)
+{
+	return 0;
+}
+
+static inline
+void asi_unmap(struct asi *asi, void *addr, size_t len) { }
+
+static inline
+void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len) { }
+
 #define static_asi_enabled() false
 
 static inline void asi_check_boottime_disable(void) { }
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index e8b2ac6bd2ae3b0a768734c8411f45a7d162e12d..492a9cdee7ff3d4e562c4bf508dc14fd7fa67e36 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1900,6 +1900,9 @@ typedef unsigned int pgtbl_mod_mask;
 #ifndef pmd_leaf
 #define pmd_leaf(x)	false
 #endif
+#ifndef pte_leaf
+#define pte_leaf(x)	1
+#endif
 
 #ifndef pgd_leaf_size
 #define pgd_leaf_size(x) (1ULL << PGDIR_SHIFT)
diff --git a/mm/internal.h b/mm/internal.h
index 64c2eb0b160e169ab9134e3ab618d8a1d552d92c..c0454fe019b9078a963b1ab3685bf31ccfd768b7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -395,6 +395,8 @@ void unmap_page_range(struct mmu_gather *tlb,
 void page_cache_ra_order(struct readahead_control *, struct file_ra_state *,
 		unsigned int order);
 void force_page_cache_ra(struct readahead_control *, unsigned long nr);
+void vunmap_pgd_range(pgd_t *pgd_table, unsigned long addr, unsigned long end,
+		      pgtbl_mod_mask *mask);
 static inline void force_page_cache_readahead(struct address_space *mapping,
 		struct file *file, pgoff_t index, unsigned long nr_to_read)
 {
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 634162271c0045965eabd9bfe8b64f4a1135576c..8d260f2174fe664b54dcda054cb9759ae282bf03 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -427,6 +427,24 @@ static void vunmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 	} while (p4d++, addr = next, addr != end);
 }
 
+void vunmap_pgd_range(pgd_t *pgd_table, unsigned long addr, unsigned long end,
+		      pgtbl_mod_mask *mask)
+{
+	unsigned long next;
+	pgd_t *pgd = pgd_offset_pgd(pgd_table, addr);
+
+	BUG_ON(addr >= end);
+
+	do {
+		next = pgd_addr_end(addr, end);
+		if (pgd_bad(*pgd))
+			*mask |= PGTBL_PGD_MODIFIED;
+		if (pgd_none_or_clear_bad(pgd))
+			continue;
+		vunmap_p4d_range(pgd, addr, next, mask);
+	} while (pgd++, addr = next, addr != end);
+}
+
 /*
  * vunmap_range_noflush is similar to vunmap_range, but does not
  * flush caches or TLBs.
@@ -441,21 +459,9 @@ static void vunmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
  */
 void __vunmap_range_noflush(unsigned long start, unsigned long end)
 {
-	unsigned long next;
-	pgd_t *pgd;
-	unsigned long addr = start;
 	pgtbl_mod_mask mask = 0;
 
-	BUG_ON(addr >= end);
-	pgd = pgd_offset_k(addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_bad(*pgd))
-			mask |= PGTBL_PGD_MODIFIED;
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		vunmap_p4d_range(pgd, addr, next, &mask);
-	} while (pgd++, addr = next, addr != end);
+	vunmap_pgd_range(init_mm.pgd, start, end, &mask);
 
 	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
 		arch_sync_kernel_mappings(start, end);
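
Reader's note (not part of the patch): the asi_map_gfp() comment says a large
source page is mapped as a large page only when the requested range covers all
of it, and as small pages otherwise. The following is a minimal user-space
sketch of that decision, assuming x86-64 page sizes and a source mapping with
one uniform page size. The names SKETCH_*, pick_page_size(), count_mappings()
and sketch_self_check() are hypothetical and exist only for illustration;
page_within_range() mirrors is_page_within_range() from the patch, and the
loop advance mirrors ALIGN(virt + 1, page_size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed x86-64 values; hypothetical stand-ins for PAGE_SIZE, PMD_SIZE, PUD_SIZE. */
#define SKETCH_PAGE_SIZE (1UL << 12)
#define SKETCH_PMD_SIZE  (1UL << 21)
#define SKETCH_PUD_SIZE  (1UL << 30)

static unsigned long align_down(unsigned long addr, unsigned long size)
{
	return addr & ~(size - 1);
}

/* Same check as is_page_within_range() in the patch. */
static bool page_within_range(unsigned long addr, unsigned long page_size,
			      unsigned long range_start, unsigned long range_end)
{
	unsigned long page_start = align_down(addr, page_size);
	unsigned long page_end = page_start + page_size;

	return page_start >= range_start && page_end <= range_end;
}

/*
 * Pick the largest page size, no bigger than the source mapping's page size,
 * whose page around virt lies entirely inside [start, end). Returns 0 if not
 * even a small page fits (only possible for an unaligned range).
 */
static unsigned long pick_page_size(unsigned long virt, unsigned long src_size,
				    unsigned long start, unsigned long end)
{
	static const unsigned long sizes[] = {
		SKETCH_PUD_SIZE, SKETCH_PMD_SIZE, SKETCH_PAGE_SIZE,
	};
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		if (src_size >= sizes[i] &&
		    page_within_range(virt, sizes[i], start, end))
			return sizes[i];
	}
	return 0;
}

/* Count destination mappings needed for a page-aligned [start, end). */
static unsigned long count_mappings(unsigned long start, unsigned long end,
				    unsigned long src_size)
{
	unsigned long virt = start;
	unsigned long n = 0;

	assert(start % SKETCH_PAGE_SIZE == 0 && end % SKETCH_PAGE_SIZE == 0);

	while (virt < end) {
		unsigned long size = pick_page_size(virt, src_size, start, end);

		n++;
		/* Equivalent to ALIGN(virt + 1, size) in the patch's loop. */
		virt = align_down(virt, size) + size;
	}
	return n;
}

void sketch_self_check(void)
{
	unsigned long base = 8UL << 20; /* 8 MiB, so 2 MiB aligned */

	/* Whole 2 MiB page requested: one large mapping. */
	assert(count_mappings(base, base + SKETCH_PMD_SIZE, SKETCH_PMD_SIZE) == 1);
	/* Only 16 KiB of the large page requested: four 4 KiB mappings. */
	assert(count_mappings(base, base + 4 * SKETCH_PAGE_SIZE, SKETCH_PMD_SIZE) == 4);
	/* One full large page followed by two small pages. */
	assert(count_mappings(base, base + SKETCH_PMD_SIZE + 2 * SKETCH_PAGE_SIZE,
			      SKETCH_PMD_SIZE) == 3);
}
```

The sketch leaves out everything the real function does with page tables
(allocation, the -EBUSY consistency checks, the PSE/PAT flag conversion); it
only illustrates why a partially covered huge page costs one mapping per small
page, which is the inefficiency the comment above asi_map_gfp() mentions.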