From patchwork Sat Dec 3 00:35:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Edgecombe, Rick P" X-Patchwork-Id: 13063353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 345A5C4321E for ; Sat, 3 Dec 2022 00:37:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D040A8E0009; Fri, 2 Dec 2022 19:37:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C8DB28E0002; Fri, 2 Dec 2022 19:37:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B2E388E0009; Fri, 2 Dec 2022 19:37:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 995568E0002 for ; Fri, 2 Dec 2022 19:37:01 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 6E4341C5D51 for ; Sat, 3 Dec 2022 00:37:01 +0000 (UTC) X-FDA: 80199130242.15.534E378 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by imf14.hostedemail.com (Postfix) with ESMTP id C3ABF100006 for ; Sat, 3 Dec 2022 00:37:00 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=U6IKydM0; spf=pass (imf14.hostedemail.com: domain of rick.p.edgecombe@intel.com designates 192.55.52.93 as permitted sender) smtp.mailfrom=rick.p.edgecombe@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670027821; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references:dkim-signature; bh=aR2IwIcBM4q2qnqTcEoPI9s+IKsCSRfLILwEqGZ+4FY=; b=HsFbIJPuYjkC0CJJUYJ9WUhX47C94GWmJlIy80oHkCUkiuX8GgtItdiqeuZexntyrwqMS6 PTksGVnVknPk8Z/t89YH8H4jyoBGMfMkfIH9/ujO/KbEgvIz4EIyHDpC+gg+S6Z5q09eff QHMKRi99Z/YWZ8Xqiozkrctd3CibafU= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=U6IKydM0; spf=pass (imf14.hostedemail.com: domain of rick.p.edgecombe@intel.com designates 192.55.52.93 as permitted sender) smtp.mailfrom=rick.p.edgecombe@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670027821; a=rsa-sha256; cv=none; b=7rgeMxtePngJEDQSrtA44v2pDYO1AVswewOYdKuD2FZfbG+FYP0XOHkwSvTbsYPvNdCDb0 Sw6UyqLIWrfEKDwROGBH6RrdjNlYltl2IUTFLxyn1FKZgqbKgrqWnz2dZxcwNyu5ifjDYw neheD4meTq+DcQx2d5FCHtuTifvoBQs= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1670027820; x=1701563820; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=mMUmCMQ2UpuIfMiP4n71ImLyKciupdTDYlrsPhVTKDs=; b=U6IKydM0kMONtDRlGCFYOzjo8Dz5AT7QXwiT0RbXp2qatDCPicw4HO8O q0wjJN1FCAQ2+yrJh7+69vrxURb/2gcj++0NjUn8gEDw/AbihOsQceAdC j8Mcz1OO98eFplTam2BEdXqUnSS+jE9ar8wSdRrzegMpvfhedaaDgZFf9 7NdMtV1DqqhTQxWPS1n+/Wx4oS+FcY3f+JBO58pAAxOL4CFXzmP2uZUep bofl7Zng42VzDN1a4lOgJDQ9dJE823DKMGx4pvGvG2pa6+09C7VW8lff1 0EHWXbDESgJqYyfrbf6++gOl/9VmSV9s+Tx8vGLlz7zscyTpiu1A7zdzG g==; X-IronPort-AV: E=McAfee;i="6500,9779,10549"; a="313710942" X-IronPort-AV: E=Sophos;i="5.96,213,1665471600"; d="scan'208";a="313710942" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Dec 2022 16:37:00 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10549"; a="787479830" X-IronPort-AV: E=Sophos;i="5.96,213,1665471600"; d="scan'208";a="787479830" Received: from bgordon1-mobl1.amr.corp.intel.com (HELO rpedgeco-desk.amr.corp.intel.com) ([10.212.211.211]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Dec 2022 16:36:57 -0800 From: Rick Edgecombe To: x86@kernel.org, "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H . J . Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , Weijiang Yang , "Kirill A . Shutemov" , John Allen , kcc@google.com, eranian@google.com, rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com, akpm@linux-foundation.org, Andrew.Cooper3@citrix.com, christina.schimpe@intel.com Cc: rick.p.edgecombe@intel.com, Yu-cheng Yu Subject: [PATCH v4 12/39] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW Date: Fri, 2 Dec 2022 16:35:39 -0800 Message-Id: <20221203003606.6838-13-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20221203003606.6838-1-rick.p.edgecombe@intel.com> References: <20221203003606.6838-1-rick.p.edgecombe@intel.com> X-Stat-Signature: u3jf14r48pbozo4wx3fbohe11wo7z4so X-Rspam-User: X-Spamd-Result: default: False [-3.40 / 9.00]; BAYES_HAM(-6.00)[100.00%]; SUSPICIOUS_RECIPS(1.50)[]; SUBJECT_HAS_UNDERSCORES(1.00)[]; MID_CONTAINS_FROM(1.00)[]; DMARC_POLICY_ALLOW(-0.50)[intel.com,none]; R_SPF_ALLOW(-0.20)[+ip4:192.55.52.93/32:c]; R_DKIM_ALLOW(-0.20)[intel.com:s=Intel]; MIME_GOOD(-0.10)[text/plain]; RCVD_NO_TLS_LAST(0.10)[]; MIME_TRACE(0.00)[0:+]; FROM_EQ_ENVFROM(0.00)[]; RCPT_COUNT_TWELVE(0.00)[40]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; FROM_HAS_DN(0.00)[]; DKIM_TRACE(0.00)[intel.com:+]; TO_DN_SOME(0.00)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; TAGGED_RCPT(0.00)[]; ARC_NA(0.00)[] X-Rspamd-Queue-Id: C3ABF100006 X-Rspamd-Server: rspam06 X-HE-Tag: 1670027820-478514 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yu-cheng Yu When Shadow Stack is in use, Write=0,Dirty=1 PTE are reserved for shadow stack. Copy-on-write PTes then have Write=0,Cow=1. When a PTE goes from Write=1,Dirty=1 to Write=0,Cow=1, it could become a transient shadow stack PTE in two cases: The first case is that some processors can start a write but end up seeing a Write=0 PTE by the time they get to the Dirty bit, creating a transient shadow stack PTE. However, this will not occur on processors supporting Shadow Stack, and a TLB flush is not necessary. The second case is that when _PAGE_DIRTY is replaced with _PAGE_COW non- atomically, a transient shadow stack PTE can be created as a result. Thus, prevent that with cmpxchg. In the case of pmdp_set_wrprotect(), for nopmd configs the ->pmd operated on does not exist and the logic would need to be different. Although the extra functionality will normally be optimized out when user shadow stacks are not configured, also exclude it in the preprocessor stage so that it will still compile. User shadow stack is not supported there by Linux anyway. Leave the cpu_feature_enabled() check so that the functionality also disables based on runtime detection of the feature. Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many insights to the issue. Jann Horn provided the cmpxchg solution. Tested-by: Pengfei Xu Tested-by: John Allen Signed-off-by: Yu-cheng Yu Co-developed-by: Rick Edgecombe Signed-off-by: Rick Edgecombe Reviewed-by: Kees Cook --- v3: - Remove unnecessary #ifdef (Dave Hansen) v2: - Compile out some code due to clang build error - Clarify commit log (dhansen) - Normalize PTE bit descriptions between patches (dhansen) - Update comment with text from (dhansen) Yu-cheng v30: - Replace (pmdval_t) cast with CONFIG_PGTABLE_LEVELES > 2 (Borislav Petkov). arch/x86/include/asm/pgtable.h | 35 ++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 67bd2627c293..b68428099932 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1195,6 +1195,21 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { + /* + * Avoid accidentally creating shadow stack PTEs + * (Write=0,Dirty=1). Use cmpxchg() to prevent races with + * the hardware setting Dirty=1. + */ + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) { + pte_t old_pte, new_pte; + + old_pte = READ_ONCE(*ptep); + do { + new_pte = pte_wrprotect(old_pte); + } while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte)); + + return; + } clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte); } @@ -1247,6 +1262,26 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { +#ifdef CONFIG_X86_USER_SHADOW_STACK + /* + * If Shadow Stack is enabled, pmd_wrprotect() moves _PAGE_DIRTY + * to _PAGE_COW (see comments at pmd_wrprotect()). + * When a thread reads a RW=1, Dirty=0 PMD and before changing it + * to RW=0, Dirty=0, another thread could have written to the page + * and the PMD is RW=1, Dirty=1 now. + */ + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) { + pmd_t old_pmd, new_pmd; + + old_pmd = READ_ONCE(*pmdp); + do { + new_pmd = pmd_wrprotect(old_pmd); + } while (!try_cmpxchg(&pmdp->pmd, &old_pmd.pmd, new_pmd.pmd)); + + return; + } +#endif + clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp); }