From patchwork Sat Feb 18 21:14:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 13145650
From: Rick Edgecombe
To: x86@kernel.org, "H . Peter Anvin", Thomas Gleixner, Ingo Molnar,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org,
 linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
 Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
 Eugene Syromiatnikov, Florian Weimer, "H . J . Lu", Jann Horn,
 Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
 Pavel Machek, Peter Zijlstra, Randy Dunlap, Weijiang Yang,
 "Kirill A . Shutemov", John Allen, kcc@google.com, eranian@google.com,
 rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com,
 akpm@linux-foundation.org, Andrew.Cooper3@citrix.com,
 christina.schimpe@intel.com, david@redhat.com, debug@rivosinc.com
Cc: rick.p.edgecombe@intel.com, linux-arm-kernel@lists.infradead.org,
 linux-s390@vger.kernel.org, xen-devel@lists.xenproject.org
Subject: [PATCH v6 11/41] mm: Introduce pte_mkwrite_kernel()
Date: Sat, 18 Feb 2023 13:14:03 -0800
Message-Id: <20230218211433.26859-12-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230218211433.26859-1-rick.p.edgecombe@intel.com>
References: <20230218211433.26859-1-rick.p.edgecombe@intel.com>

The x86 Control-flow Enforcement Technology (CET) feature includes a new
type of memory called shadow stack. This shadow stack memory has some
unusual properties, which require some core mm changes to function
properly.

One of these changes is to allow for pte_mkwrite() to create different
types of writable memory (the existing conventionally writable type and
also the new shadow stack type). Future patches will convert pte_mkwrite()
to take a VMA in order to facilitate this. However, there are places in
the kernel where pte_mkwrite() is called outside of the context of a VMA.
These are for kernel memory. So create a new variant called
pte_mkwrite_kernel() and switch the kernel users over to it. Have
pte_mkwrite() and pte_mkwrite_kernel() be the same for now. Future patches
will introduce changes to make pte_mkwrite() take a VMA.

Only do this for architectures that need it because they call
pte_mkwrite() in arch code without an associated VMA. Since it will
currently only be used in arch code, do not include it in
arch_pgtable_helpers.rst.

Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Tested-by: Pengfei Xu
Suggested-by: David Hildenbrand
Signed-off-by: Rick Edgecombe
Reviewed-by: Kees Cook
Acked-by: David Hildenbrand
Acked-by: Deepak Gupta
---
Hi Non-x86 Archs,

x86 has a feature that allows for the creation of a special type of
writable memory (shadow stack) that is only writable in limited specific
ways. Previously, changes were proposed to core MM code to teach it to
decide when to create normally writable memory or the special shadow stack
writable memory, but David Hildenbrand suggested[0] changing
pXX_mkwrite() to take a VMA, so awareness of shadow stack memory can be
moved into x86 code.

Since pXX_mkwrite() is defined in every arch, it requires some tree-wide
changes. That is why you are seeing some patches out of a big x86 series
pop up in your arch mailing list. There is no functional change. After
this refactor, the shadow stack series goes on to use the arch helpers to
push shadow stack memory details inside arch/x86.

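For readers outside arch code, the split described above can be pictured
with a small userspace sketch. This is a toy model under stated
assumptions, not kernel code: pte_t is reduced to a bare 64-bit value, the
TOY_* bit positions and struct toy_vma are invented for illustration, and
toy_pte_mkwrite_vma() is a hypothetical stand-in for the VMA-taking
pte_mkwrite() that later patches are expected to introduce. The shadow
stack encoding shown (write bit clear, dirty bit set) is likewise an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration; these are NOT the kernel's definitions. */
typedef struct { uint64_t val; } pte_t;

#define TOY_PAGE_RW         (UINT64_C(1) << 1) /* assumed "writable" bit */
#define TOY_PAGE_DIRTY      (UINT64_C(1) << 6) /* assumed "dirty" bit */
#define TOY_VM_SHADOW_STACK (1UL << 0)         /* hypothetical VMA flag */

struct toy_vma {
	unsigned long vm_flags;
};

static inline pte_t pte_set_flags(pte_t pte, uint64_t set)
{
	pte.val |= set;
	return pte;
}

static inline pte_t pte_clear_flags(pte_t pte, uint64_t clear)
{
	pte.val &= ~clear;
	return pte;
}

/* Kernel mappings have no VMA, so the result is always conventionally writable. */
static inline pte_t pte_mkwrite_kernel(pte_t pte)
{
	return pte_set_flags(pte, TOY_PAGE_RW);
}

/* State after this patch: same behavior as the kernel variant. */
static inline pte_t pte_mkwrite(pte_t pte)
{
	return pte_mkwrite_kernel(pte);
}

/* Hypothetical shadow stack encoding, assumed only for this sketch. */
static inline pte_t toy_pte_mkwrite_shstk(pte_t pte)
{
	pte = pte_clear_flags(pte, TOY_PAGE_RW);
	return pte_set_flags(pte, TOY_PAGE_DIRTY);
}

/* Hypothetical VMA-aware variant: the VMA selects the kind of writable memory. */
static inline pte_t toy_pte_mkwrite_vma(pte_t pte, const struct toy_vma *vma)
{
	assert(vma != NULL);
	if (vma->vm_flags & TOY_VM_SHADOW_STACK)
		return toy_pte_mkwrite_shstk(pte);
	return pte_mkwrite_kernel(pte);
}
```

In this sketch pte_mkwrite() and pte_mkwrite_kernel() return the same
value for any input, which mirrors the "no functional change" state of
this patch. Only the hypothetical VMA-aware helper distinguishes the two
kinds of writable memory, and callers without a VMA never need it.
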
Testing was just 0-day build testing.

Hopefully that is enough context. Thanks!

[0] https://lore.kernel.org/lkml/0e29a2d0-08d8-bcd6-ff26-4bea0e4037b0@redhat.com/#t

v6:
 - New patch
---
 arch/arm64/include/asm/pgtable.h | 7 ++++++-
 arch/arm64/mm/trans_pgd.c        | 4 ++--
 arch/s390/include/asm/pgtable.h  | 7 ++++++-
 arch/s390/mm/pageattr.c          | 2 +-
 arch/x86/include/asm/pgtable.h   | 7 ++++++-
 arch/x86/xen/mmu_pv.c            | 2 +-
 6 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 65e78999c75d..ed555f947697 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -180,13 +180,18 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot)
 	return pmd;
 }
 
-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_kernel(pte_t pte)
 {
 	pte = set_pte_bit(pte, __pgprot(PTE_WRITE));
 	pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
 	return pte;
 }
 
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return pte_mkwrite_kernel(pte);
+}
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	pte = clear_pte_bit(pte, __pgprot(PTE_DIRTY));
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 4ea2eefbc053..5c07e68d80ea 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -40,7 +40,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
 		 * read only (code, rodata). Clear the RDONLY bit from
 		 * the temporary mappings we use during restore.
 		 */
-		set_pte(dst_ptep, pte_mkwrite(pte));
+		set_pte(dst_ptep, pte_mkwrite_kernel(pte));
 	} else if (debug_pagealloc_enabled() && !pte_none(pte)) {
 		/*
 		 * debug_pagealloc will removed the PTE_VALID bit if
@@ -53,7 +53,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
 		 */
 		BUG_ON(!pfn_valid(pte_pfn(pte)));
 
-		set_pte(dst_ptep, pte_mkpresent(pte_mkwrite(pte)));
+		set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_kernel(pte)));
 	}
 }
 
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index b26cbf1c533c..29522418b5f4 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -991,7 +991,7 @@ static inline pte_t pte_wrprotect(pte_t pte)
 	return set_pte_bit(pte, __pgprot(_PAGE_PROTECT));
 }
 
-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_kernel(pte_t pte)
 {
 	pte = set_pte_bit(pte, __pgprot(_PAGE_WRITE));
 	if (pte_val(pte) & _PAGE_DIRTY)
@@ -999,6 +999,11 @@ static inline pte_t pte_mkwrite(pte_t pte)
 	return pte;
 }
 
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return pte_mkwrite_kernel(pte);
+}
+
 static inline pte_t pte_mkclean(pte_t pte)
 {
 	pte = clear_pte_bit(pte, __pgprot(_PAGE_DIRTY));
diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c
index 85195c18b2e8..4ee5fe5caa23 100644
--- a/arch/s390/mm/pageattr.c
+++ b/arch/s390/mm/pageattr.c
@@ -96,7 +96,7 @@ static int walk_pte_level(pmd_t *pmdp, unsigned long addr, unsigned long end,
 		if (flags & SET_MEMORY_RO)
 			new = pte_wrprotect(new);
 		else if (flags & SET_MEMORY_RW)
-			new = pte_mkwrite(pte_mkdirty(new));
+			new = pte_mkwrite_kernel(pte_mkdirty(new));
 		if (flags & SET_MEMORY_NX)
 			new = set_pte_bit(new, __pgprot(_PAGE_NOEXEC));
 		else if (flags & SET_MEMORY_X)
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index b39f16c0d507..4f9fddcff2b9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -364,11 +364,16 @@ static inline pte_t pte_mkyoung(pte_t pte)
 	return pte_set_flags(pte, _PAGE_ACCESSED);
 }
 
-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_kernel(pte_t pte)
 {
 	return pte_set_flags(pte, _PAGE_RW);
 }
 
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return pte_mkwrite_kernel(pte);
+}
+
 static inline pte_t pte_mkhuge(pte_t pte)
 {
 	return pte_set_flags(pte, _PAGE_PSE);
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index ee29fb558f2e..a23f04243c19 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -150,7 +150,7 @@ void make_lowmem_page_readwrite(void *vaddr)
 	if (pte == NULL)
 		return;		/* vaddr missing */
 
-	ptev = pte_mkwrite(*pte);
+	ptev = pte_mkwrite_kernel(*pte);
 
 	if (HYPERVISOR_update_va_mapping(address, ptev, 0))
 		BUG();