From patchwork Tue Jun 13 00:10:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Edgecombe, Rick P" X-Patchwork-Id: 13277774 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C4CF6C88CB7 for ; Tue, 13 Jun 2023 00:16:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=IAm3oN9y2JykCQWW8owiQ1RAzvdwLh3Zj4oqz6AwBN4=; b=0nNgap1sfwGvyS ByFmmI9Mw79hKuA2Chh7S4dO4p8CX1tTqVV3+zQm0gSKzMxLt/PPT6IydGmiNB4NvlTXmPlVCfIXd Xu7WpOnakRTdp60yr0yvzaD1ijIiarj8zryryjnMTDobJ4nVoIihUeTQ7sENHtYdB+5SWac/Y1TPZ aK2AvQ0VQ6YzDAlzmBbzdqaFBhA8ELbiuVAOJb5FOgaVfoLjkJ8ekz4dz4A5vr9JU7mMgPCpXM7ge GPCRi6yLfh9Z/H5UtjWcBJc5RI33JMAxHrQaOWwMJIbGZ9JUGdTuVFeYPX1TMh2zqFsJ4Rjfkj9Jx NYelEvolBdtLkC25/rrA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1q8rh9-006K8h-2Z; Tue, 13 Jun 2023 00:15:35 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1q8rh8-006K7U-1l for linux-arm-kernel@bombadil.infradead.org; Tue, 13 Jun 2023 00:15:34 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:Content-Type :MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Sender:Reply-To:Content-ID:Content-Description; bh=bhEM/cU7s2w8AGabEsZ/KJXI1WiWuC9Y86R+7+JHEQk=; b=S8NYpS03BzpQ/FNFJIIej0qDzk 41333ATnxWLDUDwZCJ0md9cmXmdTCm5TjK2u2ucOGFQv2olvfNyU5ZIwZZilCzpyK2TZCRuw8FZXu l/jw6lPtc7qPRUD2Vh99P5EKqu+Z4jWkeOb5rv3puTa9Tpp6cxbx6Ct3KA/zbG3aJLtSzKsFjsXs6 6FVPQO2QtxGU7MBDSFZaWUfh8kZqwSX0n+CdibvQBa49aqPt1CkBBR1YARDeH0RyFNxvUkU9tz+qn 9UKaBG5Js8GYXAXqsROgz5ZRgM/AsTmJX6IL/ZuM1PZee3kr9OrTb5Y0JTe3sl11qPCNpwdaOtnij OA0esjsA==; Received: from mga03.intel.com ([134.134.136.65]) by desiato.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1q8re5-00960l-0L for linux-arm-kernel@lists.infradead.org; Tue, 13 Jun 2023 00:14:36 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1686615145; x=1718151145; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3c77rqX6YUy6Cyz5r8ePlQOYiEkqAYIhDQVjx5oVoZw=; b=ehhcAkqCtY4xDIRxcfGoR37l7nGrVdlDTF5P1L7un4maAXfl3rGcqnMl zxVPblJAe/v7IYGFEK98ZuyphVt211NCl46R84i0ZsAdL31bR70EtFkOA rKC4xqnxceT9YvJXK1rC//aZyO2f0IIGpiRYV2lshYNoz/gY4aXAco6dQ GTcp3ckJA3eof1wtdCwL61yH1rsy5uUw3kcv3t5Y6tk5PTU/+TanqyZFv sSbRJxIDSlzR5/osT5i6r+ebTFfIMNa5vuYb+lpYFzF3LS66iMQgAxjLJ NhdMradz0kp6khAjWiUepYJNAZVqZqxC0gSpEysuKVJehBtSoP0IcpfeJ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10739"; a="361556698" X-IronPort-AV: E=Sophos;i="6.00,238,1681196400"; d="scan'208";a="361556698" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jun 2023 17:12:08 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10739"; a="835670972" X-IronPort-AV: E=Sophos;i="6.00,238,1681196400"; d="scan'208";a="835670972" Received: from almeisch-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.amr.corp.intel.com) ([10.209.42.242]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jun 2023 17:12:07 -0700 From: Rick Edgecombe To: x86@kernel.org, "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H . J . Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , Weijiang Yang , "Kirill A . Shutemov" , John Allen , kcc@google.com, eranian@google.com, rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com, akpm@linux-foundation.org, Andrew.Cooper3@citrix.com, christina.schimpe@intel.com, david@redhat.com, debug@rivosinc.com, szabolcs.nagy@arm.com, torvalds@linux-foundation.org, broonie@kernel.org Cc: rick.p.edgecombe@intel.com, linux-arm-kernel@lists.infradead.org, linux-s390@vger.kernel.org, xen-devel@lists.xenproject.org Subject: [PATCH v9 02/42] mm: Move pte/pmd_mkwrite() callers with no VMA to _novma() Date: Mon, 12 Jun 2023 17:10:28 -0700 Message-Id: <20230613001108.3040476-3-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230613001108.3040476-1-rick.p.edgecombe@intel.com> References: <20230613001108.3040476-1-rick.p.edgecombe@intel.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230613_011236_948403_C0C9AAC3 X-CRM114-Status: GOOD ( 27.45 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The x86 Shadow stack feature includes a new type of memory called shadow stack. This shadow stack memory has some unusual properties, which requires some core mm changes to function properly. One of these unusual properties is that shadow stack memory is writable, but only in limited ways. These limits are applied via a specific PTE bit combination. Nevertheless, the memory is writable, and core mm code will need to apply the writable permissions in the typical paths that call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so that the x86 implementation of it can know whether to create regular writable memory or shadow stack memory. But there are a couple of challenges to this. Modifying the signatures of each arch pte_mkwrite() implementation would be error prone because some are generated with macros and would need to be re-implemented. Also, some pte_mkwrite() callers operate on kernel memory without a VMA. So this can be done in a three step process. First pte_mkwrite() can be renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite() added that just calls pte_mkwrite_novma(). Next callers without a VMA can be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers can be changed to take/pass a VMA. Previous patches have done the first step, so next move the callers that don't have a VMA to pte_mkwrite_novma(). Also do the same for pmd_mkwrite(). This will be ok for the shadow stack feature, as these callers are on kernel memory which will not need to be made shadow stack, and the other architectures only currently support one type of memory in pte_mkwrite() Cc: linux-doc@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: Rick Edgecombe Reviewed-by: Mike Rapoport (IBM) Acked-by: David Hildenbrand --- Hi Non-x86 Arch’s, x86 has a feature that allows for the creation of a special type of writable memory (shadow stack) that is only writable in limited specific ways. Previously, changes were proposed to core MM code to teach it to decide when to create normally writable memory or the special shadow stack writable memory, but David Hildenbrand suggested[0] to change pXX_mkwrite() to take a VMA, so awareness of shadow stack memory can be moved into x86 code. Later Linus suggested a less error-prone way[1] to go about this after the first attempt had a bug. Since pXX_mkwrite() is defined in every arch, it requires some tree-wide changes. So that is why you are seeing some patches out of a big x86 series pop up in your arch mailing list. There is no functional change. After this refactor, the shadow stack series goes on to use the arch helpers to push arch memory details inside arch/x86 and other arch's with upcoming shadow stack features. Testing was just 0-day build testing. Hopefully that is enough context. Thanks! [0] https://lore.kernel.org/lkml/0e29a2d0-08d8-bcd6-ff26-4bea0e4037b0@redhat.com/ [1] https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/ --- arch/arm64/mm/trans_pgd.c | 4 ++-- arch/s390/mm/pageattr.c | 4 ++-- arch/x86/xen/mmu_pv.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c index 4ea2eefbc053..a01493f3a06f 100644 --- a/arch/arm64/mm/trans_pgd.c +++ b/arch/arm64/mm/trans_pgd.c @@ -40,7 +40,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) * read only (code, rodata). Clear the RDONLY bit from * the temporary mappings we use during restore. */ - set_pte(dst_ptep, pte_mkwrite(pte)); + set_pte(dst_ptep, pte_mkwrite_novma(pte)); } else if (debug_pagealloc_enabled() && !pte_none(pte)) { /* * debug_pagealloc will removed the PTE_VALID bit if @@ -53,7 +53,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) */ BUG_ON(!pfn_valid(pte_pfn(pte))); - set_pte(dst_ptep, pte_mkpresent(pte_mkwrite(pte))); + set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_novma(pte))); } } diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c index 5ba3bd8a7b12..6931d484d8a7 100644 --- a/arch/s390/mm/pageattr.c +++ b/arch/s390/mm/pageattr.c @@ -97,7 +97,7 @@ static int walk_pte_level(pmd_t *pmdp, unsigned long addr, unsigned long end, if (flags & SET_MEMORY_RO) new = pte_wrprotect(new); else if (flags & SET_MEMORY_RW) - new = pte_mkwrite(pte_mkdirty(new)); + new = pte_mkwrite_novma(pte_mkdirty(new)); if (flags & SET_MEMORY_NX) new = set_pte_bit(new, __pgprot(_PAGE_NOEXEC)); else if (flags & SET_MEMORY_X) @@ -155,7 +155,7 @@ static void modify_pmd_page(pmd_t *pmdp, unsigned long addr, if (flags & SET_MEMORY_RO) new = pmd_wrprotect(new); else if (flags & SET_MEMORY_RW) - new = pmd_mkwrite(pmd_mkdirty(new)); + new = pmd_mkwrite_novma(pmd_mkdirty(new)); if (flags & SET_MEMORY_NX) new = set_pmd_bit(new, __pgprot(_SEGMENT_ENTRY_NOEXEC)); else if (flags & SET_MEMORY_X) diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index b3b8d289b9ab..63fced067057 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -150,7 +150,7 @@ void make_lowmem_page_readwrite(void *vaddr) if (pte == NULL) return; /* vaddr missing */ - ptev = pte_mkwrite(*pte); + ptev = pte_mkwrite_novma(*pte); if (HYPERVISOR_update_va_mapping(address, ptev, 0)) BUG();