From patchwork Thu Sep 29 22:29:09 2022
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 12994666

From: Rick Edgecombe
To: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
    Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
    Cyrill Gorcunov, Dave Hansen, Eugene Syromiatnikov, Florian Weimer,
    "H.J. Lu", Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz,
    Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra, Randy Dunlap,
    "Ravi V. Shankar", Weijiang Yang, "Kirill A. Shutemov",
    joao.moreira@intel.com, John Allen, kcc@google.com, eranian@google.com,
    rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com
Cc: rick.p.edgecombe@intel.com, Yu-cheng Yu
Subject: [PATCH v2 12/39] x86/mm: Update ptep_set_wrprotect() and
 pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW
Date: Thu, 29 Sep 2022 15:29:09 -0700
Message-Id: <20220929222936.14584-13-rick.p.edgecombe@intel.com>
In-Reply-To: <20220929222936.14584-1-rick.p.edgecombe@intel.com>
References: <20220929222936.14584-1-rick.p.edgecombe@intel.com>

From: Yu-cheng Yu

When Shadow Stack is in use, Write=0,Dirty=1 PTEs are reserved for shadow
stack. Copy-on-write PTEs then have Write=0,Cow=1.

When a PTE goes from Write=1,Dirty=1 to Write=0,Cow=1, it could become a
transient shadow stack PTE in two cases:

The first case is that some processors can start a write but end up seeing
a Write=0 PTE by the time they get to the Dirty bit, creating a transient
shadow stack PTE. However, this will not occur on processors supporting
Shadow Stack, and a TLB flush is not necessary.

The second case is that when _PAGE_DIRTY is replaced with _PAGE_COW
non-atomically, a transient shadow stack PTE can be created as a result.
Thus, prevent that with cmpxchg.

Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many
insights into the issue. Jann Horn provided the cmpxchg solution.
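As an aside, the second case and its fix can be sketched in plain
user-space C with the GCC/Clang __atomic builtins, whose compare-exchange
mirrors the kernel's try_cmpxchg() semantics. This is only an
illustration: the bit positions and helper names below are stand-ins, not
the kernel's definitions.

#include <stdint.h>

#define _PAGE_RW	(1ULL << 1)	/* illustrative bit positions only */
#define _PAGE_DIRTY	(1ULL << 6)
#define _PAGE_COW	(1ULL << 58)

/* Fold Dirty into Cow so Write=0,Dirty=1 is never produced. */
static uint64_t pte_wrprotect(uint64_t pte)
{
	if (pte & _PAGE_DIRTY)
		pte = (pte & ~_PAGE_DIRTY) | _PAGE_COW;
	return pte & ~_PAGE_RW;
}

static void ptep_set_wrprotect(uint64_t *ptep)
{
	uint64_t old_pte, new_pte;

	old_pte = __atomic_load_n(ptep, __ATOMIC_RELAXED);
	do {
		new_pte = pte_wrprotect(old_pte);
	} while (!__atomic_compare_exchange_n(ptep, &old_pte, new_pte, 0,
					      __ATOMIC_RELAXED,
					      __ATOMIC_RELAXED));
}

__atomic_compare_exchange_n() stores new_pte and returns true only if
*ptep still equals old_pte; on failure it refreshes old_pte with the
current contents. The loop therefore always recomputes the write-protected
value from a fresh PTE, so a Dirty=1 set by a racing writer is folded into
Cow rather than left behind on a Write=0 entry.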
Signed-off-by: Yu-cheng Yu
Co-developed-by: Rick Edgecombe
Signed-off-by: Rick Edgecombe
---
v2:
 - Compile out some code due to clang build error
 - Clarify commit log (dhansen)
 - Normalize PTE bit descriptions between patches (dhansen)
 - Update comment with text from (dhansen)

Yu-cheng v30:
 - Replace (pmdval_t) cast with CONFIG_PGTABLE_LEVELS > 2 (Borislav Petkov).

 arch/x86/include/asm/pgtable.h | 36 ++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2f2963429f48..58c7bf9d7392 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1287,6 +1287,23 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pte_t *ptep)
 {
+#ifdef CONFIG_X86_SHADOW_STACK
+	/*
+	 * Avoid accidentally creating shadow stack PTEs
+	 * (Write=0,Dirty=1). Use cmpxchg() to prevent races with
+	 * the hardware setting Dirty=1.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+		pte_t old_pte, new_pte;
+
+		old_pte = READ_ONCE(*ptep);
+		do {
+			new_pte = pte_wrprotect(old_pte);
+		} while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte));
+
+		return;
+	}
+#endif
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
 }
 
@@ -1339,6 +1356,25 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pmd_t *pmdp)
 {
+#ifdef CONFIG_X86_SHADOW_STACK
+	/*
+	 * If Shadow Stack is enabled, pmd_wrprotect() moves _PAGE_DIRTY
+	 * to _PAGE_COW (see comments at pmd_wrprotect()).
+	 * When a thread reads a RW=1, Dirty=0 PMD and before changing it
+	 * to RW=0, Dirty=0, another thread could have written to the page
+	 * and the PMD is RW=1, Dirty=1 now.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+		pmd_t old_pmd, new_pmd;
+
+		old_pmd = READ_ONCE(*pmdp);
+		do {
+			new_pmd = pmd_wrprotect(old_pmd);
+		} while (!try_cmpxchg(&pmdp->pmd, &old_pmd.pmd, new_pmd.pmd));
+
+		return;
+	}
+#endif
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
 }
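For completeness, a hypothetical user-space stress test of the sketch
above (build with cc -O2 -pthread, together with the definitions from
that sketch). One thread models a processor supporting Shadow Stack,
which only sets Dirty=1 on a Write=1 entry; an observer asserts that the
transient shadow stack encoding (Write=0,Dirty=1) never becomes visible.

#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static uint64_t pte;	/* shared stand-in for a page table entry */

/* Model of SHSTK hardware: set Dirty only while Write=1. */
static void *dirty_setter(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		uint64_t old = __atomic_load_n(&pte, __ATOMIC_RELAXED);

		/* retry only while the entry is still writable */
		while ((old & _PAGE_RW) &&
		       !__atomic_compare_exchange_n(&pte, &old,
						    old | _PAGE_DIRTY, 0,
						    __ATOMIC_RELAXED,
						    __ATOMIC_RELAXED))
			;
	}
	return NULL;
}

static void *observer(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		uint64_t v = __atomic_load_n(&pte, __ATOMIC_RELAXED);

		/* Write=0,Dirty=1 would read as a shadow stack PTE. */
		assert((v & _PAGE_RW) || !(v & _PAGE_DIRTY));
	}
	return NULL;
}

int main(void)
{
	pthread_t setter, checker;

	pte = _PAGE_RW;	/* start as an ordinary Write=1,Dirty=0 entry */
	pthread_create(&setter, NULL, dirty_setter, NULL);
	pthread_create(&checker, NULL, observer, NULL);

	for (int i = 0; i < 100000; i++) {
		ptep_set_wrprotect(&pte);
		/* re-arm as writable for the next round */
		__atomic_store_n(&pte, _PAGE_RW, __ATOMIC_RELAXED);
	}

	pthread_join(setter, NULL);
	pthread_join(checker, NULL);
	return 0;
}

Replacing the cmpxchg loop in ptep_set_wrprotect() with a plain
load/modify/store should make the assertion fire, since the Dirty bit set
between the load and the store would survive the Write=0 transition.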