From patchwork Fri Nov 4 22:35:43 2022
X-Patchwork-Submitter: Rick Edgecombe
X-Patchwork-Id: 13032653
From: Rick Edgecombe
To: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
    Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
    Cyrill Gorcunov, Dave Hansen, Eugene Syromiatnikov, Florian Weimer,
    "H. J. Lu", Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz,
    Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra, Randy Dunlap,
    "Ravi V. Shankar", Weijiang Yang, "Kirill A. Shutemov", John Allen,
    kcc@google.com, eranian@google.com, rppt@kernel.org,
    jamorris@linux.microsoft.com, dethoma@microsoft.com,
    akpm@linux-foundation.org
Cc: rick.p.edgecombe@intel.com, Yu-cheng Yu
Subject: [PATCH v3 16/37] x86/mm: Update maybe_mkwrite() for shadow stack
Date: Fri, 4 Nov 2022 15:35:43 -0700
Message-Id: <20221104223604.29615-17-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221104223604.29615-1-rick.p.edgecombe@intel.com>
References: <20221104223604.29615-1-rick.p.edgecombe@intel.com>

From: Yu-cheng Yu

When serving a page fault, maybe_mkwrite() makes a PTE writable if the
fault was for a write access and the vma has VM_WRITE. Shadow stack
accesses to shadow stack vma's are also treated as write accesses by the
fault handler. This is because marking memory as shadow stack makes it
writable via certain instructions, so COW has to happen even for shadow
stack reads.

So maybe_mkwrite() should continue to make PTEs in plain VM_WRITE vma's
conventionally writable, but make PTEs in VM_WRITE|VM_SHADOW_STACK vma's
shadow stack. Do this by adding a pte_mkwrite_shstk() and a cross-arch
no-op stub. Check for VM_SHADOW_STACK in maybe_mkwrite() and call
pte_mkwrite_shstk() accordingly. Apply the same changes to
maybe_pmd_mkwrite().

Tested-by: Pengfei Xu
Tested-by: John Allen
Signed-off-by: Yu-cheng Yu
Co-developed-by: Rick Edgecombe
Signed-off-by: Rick Edgecombe
Cc: Kees Cook
---
v3:
 - Remove unneeded define for maybe_mkwrite (Peterz)
 - Switch to cleaner version of maybe_mkwrite() (Peterz)

v2:
 - Change to handle shadow stacks that are VM_WRITE|VM_SHADOW_STACK
 - Ditch arch specific maybe_mkwrite(), and make the code generic
 - Move do_anonymous_page() to next patch (Kirill)

Yu-cheng v29:
 - Remove likely()'s.
 arch/x86/include/asm/pgtable.h |  2 ++
 include/linux/mm.h             | 13 ++++++++++---
 include/linux/pgtable.h        | 14 ++++++++++++++
 mm/huge_memory.c               | 10 +++++++---
 4 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f252c42f3ca1..df67bcf9f69e 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -420,6 +420,7 @@ static inline pte_t pte_mkdirty(pte_t pte)
 	return pte_set_flags(pte, dirty | _PAGE_SOFT_DIRTY);
 }
 
+#define pte_mkwrite_shstk pte_mkwrite_shstk
 static inline pte_t pte_mkwrite_shstk(pte_t pte)
 {
 	/* pte_clear_cow() also sets Dirty=1 */
@@ -556,6 +557,7 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
 	return pmd_set_flags(pmd, dirty | _PAGE_SOFT_DIRTY);
 }
 
+#define pmd_mkwrite_shstk pmd_mkwrite_shstk
 static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd)
 {
 	return pmd_clear_cow(pmd);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 42c4e4bc972d..5d9536fa860a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1015,12 +1015,19 @@ void free_compound_page(struct page *page);
  * servicing faults for write access. In the normal case, do always want
  * pte_mkwrite. But get_user_pages can cause write faults for mappings
  * that do not have writing enabled, when used by access_process_vm.
+ *
+ * If a vma is shadow stack (a type of writable memory), mark the pte shadow
+ * stack.
  */
 static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 {
-	if (likely(vma->vm_flags & VM_WRITE))
-		pte = pte_mkwrite(pte);
-	return pte;
+	if (!(vma->vm_flags & VM_WRITE))
+		return pte;
+
+	if (vma->vm_flags & VM_SHADOW_STACK)
+		return pte_mkwrite_shstk(pte);
+
+	return pte_mkwrite(pte);
 }
 
 vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page);
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index a108b60a6962..5ce6732a6b65 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -493,6 +493,13 @@ static inline pte_t pte_sw_mkyoung(pte_t pte)
 #define pte_mk_savedwrite pte_mkwrite
 #endif
 
+#ifndef pte_mkwrite_shstk
+static inline pte_t pte_mkwrite_shstk(pte_t pte)
+{
+	return pte;
+}
+#endif
+
 #ifndef pte_clear_savedwrite
 #define pte_clear_savedwrite pte_wrprotect
 #endif
@@ -501,6 +508,13 @@ static inline pte_t pte_sw_mkyoung(pte_t pte)
 #define pmd_savedwrite pmd_write
 #endif
 
+#ifndef pmd_mkwrite_shstk
+static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd)
+{
+	return pmd;
+}
+#endif
+
 #ifndef pmd_mk_savedwrite
 #define pmd_mk_savedwrite pmd_mkwrite
 #endif
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 561a42567477..73b9b78f8cf4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -553,9 +553,13 @@ __setup("transparent_hugepage=", setup_transparent_hugepage);
 
 pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 {
-	if (likely(vma->vm_flags & VM_WRITE))
-		pmd = pmd_mkwrite(pmd);
-	return pmd;
+	if (!(vma->vm_flags & VM_WRITE))
+		return pmd;
+
+	if (vma->vm_flags & VM_SHADOW_STACK)
+		return pmd_mkwrite_shstk(pmd);
+
+	return pmd_mkwrite(pmd);
 }
 
 #ifdef CONFIG_MEMCG
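A note on the "#define pte_mkwrite_shstk pte_mkwrite_shstk" lines above, for
anyone new to the idiom: defining a macro with the same name as the function
lets the generic header's #ifndef block detect that an arch supplied a real
implementation and skip the no-op stub. A minimal standalone demonstration
follows (illustration only, not kernel code; the names mirror the patch but
the bit manipulation is invented):

/*
 * Illustration of the "#define f f" override pattern used by this patch.
 */
#include <stdio.h>

typedef struct { unsigned long val; } pte_t;

/* "arch header": real implementation, announced via the macro */
#define pte_mkwrite_shstk pte_mkwrite_shstk
static inline pte_t pte_mkwrite_shstk(pte_t pte)
{
	pte.val |= 0x1;	/* stand-in for setting the shadow stack bits */
	return pte;
}

/* "generic header": no-op fallback, compiled only if no arch version */
#ifndef pte_mkwrite_shstk
static inline pte_t pte_mkwrite_shstk(pte_t pte)
{
	return pte;
}
#endif

int main(void)
{
	pte_t pte = { .val = 0 };
	/* prints 1: the "arch" version ran, the stub was skipped */
	printf("%lu\n", pte_mkwrite_shstk(pte).val);
	return 0;
}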